Cloud Computing, IT, and more: The Blinding Light of Big Data

I have held back from publishing this article because I'm not sure I'm using the right terminology. I am not a data wonk. However, over the past six months my interactions with heavyweight experts have convinced me I'm on the right path and need to start the dialog. So my apologies if my point isn't crisp, and please ask me to explain whatever isn't clear.

I feel the bright light of interest in Big Data is casting a long shadow over the convergence of two significant challenges: the depth of executives' experience with analytics and their understanding of the limits on predicting the future. Good executives want data to drive their decisions, and big data promises to expand the range of leaders using data to draw conclusions. However, I see nothing being done to protect us from a group of leaders thrust into dependence on data analytics without insight into how to compose analytic groups, an appreciation for the limits of predictive models, or knowledge of the law of unintended consequences.

One glaring reality today is the lack of available analytics talent; we simply don't have enough people with the requisite skills. The number of people with the background and experience to turn the torrent of new data into value is significantly smaller than the need. Equally important, that group is not exclusively statisticians and modelers. We need diverse teams to protect us from weighting conclusions too heavily toward the natural biases of one group of professionals. Picking on one of those groups, my favorite quote, "A statistician is a person who can draw a straight line from an unwarranted assumption to a foregone conclusion," reminds me that every professional has their own set of biases. A diverse team will improve the quality of the data used, the models developed, and the conclusions drawn.

Second, leaders need to understand the limits of predicting the future. We often find data which purports to predict the future through the application of some algorithm, one likely developed using historical data. But how well do the algorithm's predictions hold up? It depends on how much the future repeats the past. As long as things stay within tight bounds, the future is reasonably predictable. Yet few leaders understand the stress points of predictive models: the data elements where small changes can generate a wide variance in results. Our financial meltdown in 2008 was precipitated by an untested boundary condition: what happens to the value of Mortgage Backed Securities (MBS) when the default rate goes outside of historical trends? Unfortunately we all experienced the result. Although statistics teaches that such testing is imperative, for some reason it was entirely ignored.
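
To make the boundary-condition point concrete, here is a minimal Python sketch of a stress test. It is a toy, not a real MBS pricing model: the function names, the 50% loss-given-default, and the 5% and 15% loss thresholds are all assumptions chosen purely for illustration.

```python
def tranche_value(default_rate, loss_given_default=0.5,
                  attachment=0.05, detachment=0.15):
    """Fraction of face value kept by a hypothetical tranche.

    Assumption: the tranche absorbs pool losses only between the
    attachment and detachment points (illustrative numbers).
    """
    pool_loss = default_rate * loss_given_default
    width = detachment - attachment
    # Losses below the attachment point do not touch this tranche;
    # losses above the detachment point wipe it out entirely.
    tranche_loss = min(max(pool_loss - attachment, 0.0), width)
    return 1.0 - tranche_loss / width


def sensitivity(model, default_rate, step=0.01):
    """Finite-difference estimate of how fast the model output moves."""
    return (model(default_rate + step) - model(default_rate)) / step


def stress_test(model, default_rates):
    """Evaluate value and sensitivity at each default rate."""
    return [(r, model(r), sensitivity(model, r)) for r in default_rates]


if __name__ == "__main__":
    historical = [0.01, 0.02, 0.04, 0.06]   # assumed "normal" range
    stressed = [0.10, 0.15, 0.20, 0.30]     # outside historical trends
    for rate, value, slope in stress_test(tranche_value, historical + stressed):
        print(f"default rate {rate:.0%}: value {value:.2f}, sensitivity {slope:+.1f}")
```

In this toy, every default rate up to 10% leaves the value untouched, so a model checked only against that range sees no risk at all. Beyond it, each additional percentage point of defaults removes roughly 5% of the value until nothing is left. The stress point only shows up if someone deliberately tests outside the historical range.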

Predictability is the byproduct of repetition without variance. However, today we live in a world rife with innovation. When innovation happens, analytics are of little help in comprehending the impact until the population size is significant enough, by which time the door of opportunity is closing fast. Once a model seems to predict the future, all consideration for its continued accuracy seems to evaporate, all the more so when the model directly correlates to making money.

Finally, we cannot overlook the impact of unintended consequences. There is a well-known story of a retailer using analytics on purchases to identify that a woman was pregnant, only to inadvertently reveal her secret to her family by sending her targeted marketing. This is one of many unintended consequences of making decisions purely based on data. I am convinced concern about unintended consequences needs to be part of the thinking. As William Gibson said, "the future is already here but unevenly distributed". Having a diverse team is a key element of limiting this negative side of analytics; however, executives need to be sensitive to it as well. One angle I use is to compare the motivation of the participants with the actions being considered based on the conclusion. If the motives don't match the actions, we open the door to an unintended consequence. Returning to the retail example, the action of sending marketing information did not align with the motivation of the buyer; the conclusion that all buyers want to save money was false. The retailer failed by not considering the buyer's motivation, which it could not have known given its limited data set.

My greatest fear is the evolving, widespread movement to replace intuition with modelling. I've already seen the seeds of this being sown in books and articles from highly respected authors casting doubt on the value of intuition. Intuition helps one to see the future, whereas data can only help one to project the past into the future. The difference is significant.

Wednesday, April 2, 2014
