I should probably feel good that everyone (no matter how ill-informed) suddenly has an opinion on big data and data science, and what it’s good for, and how it will change society. I suppose I should be pleased that some of the biggest software vendors are seeing big money in what I’ve always done for a living. These should be heady days for a data scientist, particularly a Ph.D. data scientist with a 15-year resume and a long, long history of successful projects. I should be rubbing my hands with glee at the avalanche of opportunities coming my way.
In fact, the mounting pile of opinion makes me queasy. For every thoughtful, well-informed piece, there’s a slapdash, haphazard one, written by someone who really just needs to have an opinion on everything. It’s the latter kind that gets the most attention, mostly because those pieces are written by popular pundits and exposed to large audiences through mainstream media outlets. They are also the least informative and the most destructive. They make data science look like magic. They turn the computer into an all-seeing eye, and they promote funding for projects that are never going to work.
Today, Jim Harris posts an interview with me where I “De-Mystify Data Science”, and Thursday, I’m giving a talk to some of the smartest people I know on how “Data Science Isn’t a Fad”. I’m trying to get in front of the conversation because there’s a handful of people who are fit to lead it, and if I’m not the most qualified person for the job, at least I’m not trying to shout down every dissenting opinion.
And it’s not just me. There’s Phil Simon, and Cathy O’Neil, and Robert Klopp, and DJ Patil. There’s David Smith and Jim Harris. There’s a long list of people who work in data science (and not all of them curated by me) and a lot of them are communicating what we can do and how we can do it.
Let’s face it: We’re nerds. We’re used to speaking to other nerds. We are not given to sexing up the message, and if I’ve learned anything from 15 years of living between the science and commerce of data, it’s that sexy sells faster than accurate and pretty still beats smart. Twenty years from now, people will still be asking me whether I can predict that someone’s pregnant “like that Target model”, and I will still be smiling and telling them “If I have the right data, sure…” And crap like this will still get published as an “interesting new study” in articles that delicately leave out that no one who did that math has any reason to tell the truth.
But there is so much we can do to help people live better, be smarter, and attain whatever they see as happiness. There are so many ways we can have better government, better businesses, and better homes just by adopting the main idea of data science: Use data to verify your answers and ensure that your process is repeatable.
So, maybe it’s not all doom and gloom, but I know this: The interest in data science has already outstripped the supply of people who can do the work. As a former fraud analyst, I could probably tell you what must logically happen after that, but I’ll let you guess.
In the meantime, I will take comfort from the fact that there are some really bright people making awesome contributions to this field, and I will encourage everyone to listen to them—even if it’s the last piece of advice you take from me.