Nassim Taleb’s opinion piece in Wired discusses the pitfalls of big data, especially the potential for “spurious” results as researchers cherry-pick “whatever statistics confirm their belief”.
Taleb says we used to have protections against this kind of thing (We did?), but big data presents an even bigger temptation. Researchers can now sample and re-sample until they get the answer they want. (Though how that’s new is beyond me.)
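For anyone who wants to see the sample-until-it-works problem in action, here is a minimal sketch, not anything taken from Taleb's piece. It correlates one column of pure noise against many other columns of pure noise and counts how many clear the usual significance bar. The function names, the sample sizes, and the cutoff of 0.279 (roughly the two-tailed 5% critical value of Pearson's r for 50 observations) are my own assumptions for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def count_spurious(n_samples=50, n_predictors=200, r_crit=0.279, seed=0):
    """Count noise predictors that look 'significantly' correlated with a noise outcome.

    r_crit is an assumed cutoff: approximately the two-tailed p < 0.05
    critical value of r when n_samples is 50.
    """
    rng = random.Random(seed)
    # The "outcome" is random noise, so no predictor has any real relationship to it.
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(predictor, outcome)) > r_crit:
            hits += 1
    return hits


if __name__ == "__main__":
    print("Spurious 'significant' correlations:", count_spurious())
```

With a 5% threshold, you would expect about one in twenty of the noise predictors to pass, so on the order of ten out of 200 here. Report only those and you have a "finding". Nothing about this requires big data, though; more columns just give you more chances to try.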
Taleb’s not wrong, but he is engaging in a little cherry-picking of his own. The fact that researchers publish positive results more often than negative ones is nothing new. The problems caused by a lack of replication are also well known, and they are not limited to observational studies, or even to studies with large samples. The possibility that someone might make important decisions based on someone else’s bad analysis has been keeping me awake since I first started my Ph.D., and it has probably kept wiser folks than me sleepless for ages.
The growing number of studies, and organizations’ growing reliance on data, has increased the odds that some data-driven decision will fail catastrophically, but that’s just a result of large numbers. The overall number of catastrophic failures probably won’t change. It’s just that some of the failures once caused by bad intuition will instead be caused by bad data analysis. Nothing new about that, either.
Taleb’s made a career out of pointing out flaws in other people’s predictions, but big data presents an opportunity for the best kind of disinfectant. Better access to data means more opportunity to re-create (and therefore debunk) poor studies–as Taleb points out himself. Making data available across an organization also provides more opportunities for competing analysis. The various open government projects across the country are an amazing first step to helping people understand how they’re governed and what they can do to improve their own lives.
Taleb seems to be warning mostly against overenthusiasm, i.e., against the people who think big data will solve all of their problems while making them rich, thin, and happy. He’s not wrong, but I wonder if the right people will listen to that warning.
Or will the wrong people go back to the old ways of doing business instead?
