What Data Scientists Can Learn From Goldman Sachs

I had a long, long post about “What is a Data Scientist?” that was supposed to be my write-up of my session at North Carolina’s Analytics Camp on February 25. I’ll get back to that, and this blog will soon have a series on data science and what it means. It turns out, however, that my first data science writings weren’t about data science. Instead, I was writing about my old passion, the one that still keeps me up at night. I was writing about fraud: how to detect it, and how to fight it.

Then someone resigned from Goldman Sachs publicly and noisily, and I realized I knew what needed to be said. After all, there are plenty of people talking about data science and what it means to be a data scientist. There’s no point saying less than your predecessors.

I have spent my entire career turning data into information and using that information to solve problems. Whether it was improving welfare-to-work programs, fighting fraud, or designing smart systems, I have always believed in the power of statistical analysis (sound, carefully vetted and utilized statistical analysis) to help people do their jobs better. A good analysis can answer questions you never thought to ask. It can give you insights that decades of business experience can’t. It can literally transform your thinking and your organization.

Data, harnessed and utilized by the right people in the right way, can be a powerful force for good. Whether your version of good is making a lot of money for your customers or catching bad guys or developing efficient technologies that improve the way we live, information is the first step in getting there, and collecting data is the first step in developing information.

But data, harnessed by the wrong people in the wrong way, can be a powerful force for evil. The complicated mortgage-backed securities that wrecked the economy are a shining example of data analysis gone wrong. Whether you think the Wall Street traders selling those financial instruments were delusional, misinformed, or unlucky, the fact is that those instruments were designed using sophisticated mathematics that looks a lot like the models we’re using to run train systems and target advertising. In the early days, those financial instruments paid off beyond any investor’s expectations. They were based on the then-sound principle that people will almost surely pay their mortgages.

Mortgage-backed securities were profitable, and so Wall Street needed more. Eventually, they started running low on mortgages, and so they needed more mortgages. Banks had enormous incentives to lend money in any way that they could justify as a “mortgage”, which meant they had incentives to relax those rules that traditionally guaranteed a mortgage holder would pay. The securities Wall Street was selling still looked the same, but their foundations were shakier and shakier—and then they collapsed.

Data science is currently at the same stage as those early derivatives. The technology is new; there are plenty of customers who can benefit, and the upsides are huge. Growth like this, however, does not last. It cannot last, and now is the time for us to think hard about where we’re headed. What we do is difficult. It’s hard to explain, and it’s easy to fake. It’s our responsibility to make sure our customers are informed about what they’re buying and their expected return on investment. Sure, the first project might pay off (or be made to look like it did), but what about the second one? Or the third? How many projects can we run, for how long, and when do we stop earning those big checks?

There are already some disturbing trends in our little community, and I am not the only person who’s noticed. From the schools that promise quick degrees followed by lucrative careers to the big data advocates who promise to transform a business overnight, we’re already in danger of selling too much too fast. When one of these projects fails publicly and expensively (and one will; the laws of probability dictate that one must), will we be prepared to handle that failure gracefully?

Greg Smith, formerly of Goldman Sachs, is right: If people don’t trust you, they’ll stop doing business with you. Right now, we are making some grand, grand promises that demand a lot of investment from the people who believe in them. The question is, do we really plan to deliver?

Or are we just making as much profit as we can, as fast as we can, and letting the customer come last?

About Melinda Thielbar

Melinda Thielbar is a co-founder of Research Triangle Analysts, Ph.D. statistician, spinner of fine yarn, martial artist, fraud analyst, and fiction writer. In other words, she's a polymath. Follow Melinda on Twitter @mthielbar, or join the Research Triangle Analysts group on G+ to join the conversation about data science.