Which Businesses Really Need A 'Data Scientist'

This article is more than 5 years old.

Yes, the quotes are intentional. I’m a marketer, so I understand that rebranding can drive up the price, which is why some programmers and analysts are calling themselves data scientists. That means businesspeople, from executives to HR to line managers, need a better understanding of both the claim and the reality.

The entire purpose of business software systems has always been to provide business decision makers the KPIs they need to make informed decisions. That was the goal with mainframes and reports on ribbon paper, and it remains the goal on today’s smartphones. Yet “data scientist” is a fairly new term. What changed?

It’s Not the Algorithms but the Computer Power

While the speed of individual computers has grown massively since I was a computer operator in the 1980s, that is not what created the large jump in computing power in the last decade. Clustered computing, the ability to share the workload among large numbers of computers, is what has made it possible to perform far more complex calculations much faster.

The algorithms used have, therefore, become more complex in concert with those advances. Forget the “what took six weeks now takes six minutes” aspect. What wasn’t even reasonable to consider running on a computer is now reasonable.

Most of the algorithms running on computers these days aren’t new, and they weren’t originally developed for practical computing. They existed as mathematical exercises or computer science theory. Now that theory is coming into practice.

Companies Don't Create Their Own Accounting Standards

Okay, Enron and other companies showed that, yes, they sometimes do. However, in the U.S. we have GAAP, and other standards exist in other nations. A team of very experienced people sat down and figured out something that everyone can apply without understanding the details that were studied in the creation of the standards.

The same is true with algorithms. A few mathematicians and computer scientists can think up complex new ways to analyze information. A larger but still focused group can implement those algorithms in software systems, and the far larger business world can leverage the analysis. However, those algorithms must meet expected standards, both in business and in other arenas. For instance, medical imaging algorithms must pass rigorous FDA approval processes.

The analogy is that the core algorithms, whether they be for accounting or data analysis, can be created by a few, while a far larger group can use those algorithms within their own organizations. Neither usage demands absolute rigidity, and interpretations can vary, but the basic tool remains consistent. Most people who leverage those tools need not understand the complexities behind how the algorithms were created or why a particular one was chosen.

What is a Data Scientist?

Programmers have always worked to implement algorithms. That has meant a need to understand mathematics. That does not mean that programmers need to know as much as a mathematician. What’s needed to create the Frankenstein monster I view as the data scientist is someone good at math, someone else who understands programming, another person who comprehends what business use the algorithm might have, and a person who can mediate among all those other people.

Sure, if you find one person who can do all that, fine, but there’s no need. The “data scientist” function is really a team function. A programmer who has a minor in math might label him- or herself as a data scientist, but unless a candidate is able to fulfill all the aforementioned requirements, it’s a label useful only for salary negotiations. To me, what’s being called a data scientist is just a programmer who knows more math than others, but who fits into the same position as other programmers. We don’t call someone who focuses on programs that dynamically analyze engine performance a “car scientist,” and a programmer focusing on coding software for the biotech industry would be laughed out of the room if his or her business card read “biologist,” so don’t overemphasize what’s going on.

Analyzing data is important, and the analysis has become ever more complex, but things haven’t changed so much that a data scientist is a different species. A data scientist is a person or a team who tries to adapt complex modeling to the business world.

Who Needs a Data Scientist?

The focus of a mainline business, large or small, is: “How can I know more about my processes and my ecosystem in order to make better decisions in a more timely fashion?” That business isn’t going to build its own phone system, because it gets a better ROI from using another firm that knows how to do that. In the same way, there’s no need for a business to sit back and think, “Gee, what algorithms do I need to perform better?”

The business intelligence (BI) industry exists purely to answer that question for its customers. Just as a vendor will tell the company why VoIP is a better business solution, then deal with the technical issues itself, the BI vendor’s job is to figure out the tools that will best help the market, provide those tools and show the customers how to access and leverage them. A business doesn’t need to know how VoIP works, but its people need to be trained to use a new system. In the same way, a business doesn’t need to know how an algorithm works. Its people only need to know how to use that tool to gain insight.

If the BI firm does its job, training will occur so the business customers know what data to feed the algorithm and then understand how to read the results. There’s a black box that does the processing, just like the black boxes running the lights, phones and other fundamentals in the office.

In a similar way, we’re seeing the artificial intelligence (AI) industry, and its deep-learning offshoot, leverage the same term to describe the people building the deep-learning networks. The same issues that apply to BI apply to AI.

That means the market for the “data scientist” is instantly restricted. The firms incorporating the algorithms into software and providing the wrappers to make the algorithm accessible can pay a person or a team to provide that function, but the majority of IT organizations don’t need to do so.

The concept of focusing on how the massive amounts of modern business data can better be analyzed to improve business performance is great, but it isn’t new. It’s a regular part of business software.

The individual or team, often called the “data scientist,” providing the deep technical skills to convert modern mathematical models into useful business software is critical to the BI and AI industries, not to the IT world in general.
