Avinash's magnificent zoo, and the unfulfilled promise of Big Data
Junk Charts 2014-01-08
Avinash (Web Analytics 2.0) is fond of animal metaphors. I think he's the one who coined HIPPOs (Highest Paid Person's Opinion). Now he's come up with Reporting Squirrels and Analysis Ninjas. See his recent post here.
In short, he is shouting about "return on analytics", a really, really important thing. What is being lost in the hype of Big Data is that all the investment in analytics has to generate value for businesses. All too often, the "data scientists" have no idea what tangible value they are creating.
For example, I saw presentations from two different tech firms that have data scientists working on the following problem: when you "check in" to a location on your phone, and start typing the name of the location, the app will use GPS and other data to guess where you are, and pop up a list of guesses so that you can save a few keystrokes.
This is an interesting research question that can showcase how to use GPS and other data. It is an impressive feat to process such data in near real time. But how does it create value for the respective businesses? When pressed, I expect the data scientist to make the following claim: the check-in location predictor will reduce the amount of time it takes for users to check in, which means they will check in more frequently, which means our advertisers will have more opportunities to present them with offers.
If you come to me with this answer, I will press you some more. How will these offers benefit the business? So you continue the argument, and now tell me that when there are more offers, there will be more sales, and when the advertisers sell more, we will earn more (that is, if you are paid per action; if you are paid per impression, then showing more offers directly generates more revenue).
You see the problem here? The path from the check-in prediction to incremental revenues has many steps. Each of these steps is argued by logic alone; there is not a shred of data to support any of them. Many factors other than the ones mentioned here affect revenues, so what is crucial to understand is the magnitude of the impact that the predictive technology has generated. You also want to understand incremental revenues: counting revenues that would have been earned whether or not the check-in predictor exists is a sleight of hand that produces no value.
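To make the distinction concrete, here is a minimal sketch in Python of what measuring incremental revenue could look like, assuming the firm had randomly held back the predictor from some users. Nothing here comes from the presentations I saw: the function name and the per-user revenue figures are made up purely for illustration.

```python
from statistics import mean


def incremental_revenue_per_user(with_feature, without_feature):
    """Difference in mean revenue per user between users who got the
    check-in predictor and a randomly held-out group who did not."""
    return mean(with_feature) - mean(without_feature)


# Hypothetical revenue per user (in dollars); invented for illustration only.
with_feature = [1.20, 0.00, 2.50, 0.80, 1.10]
without_feature = [1.00, 0.00, 2.40, 0.90, 1.00]

# The sleight of hand: crediting the predictor with all revenue from its users.
naive_credit = mean(with_feature)

# The honest number: only the part the held-out group did not also earn.
lift = incremental_revenue_per_user(with_feature, without_feature)

print(f"naive credit per user:        {naive_credit:.2f}")
print(f"incremental revenue per user: {lift:.2f}")
```

With a real holdout you would also want enough users to tell the lift apart from noise. The point of the sketch is only that the second number, not the first, is what the predictor can claim.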
***
In theory, the availability of data should improve our ability to measure performance. In reality, the measurement revolution has not taken place. It turns out that measuring performance requires careful design and deliberate collection of the right types of data, whereas Big Data is the processing and analysis of whatever data drops into our laps. Ergo, we are far from fulfilling the promise.