by Karthik Guruswamy, Teradata
Which is more relevant in the marketplace? What do customers ask for? First of all, do the terms Analytics and Data Science refer to two different things or the same thing? This article was inspired by a discussion on the topic with some of my team members during a recent conference.
A lot of millennials think of Analytics as an old-fashioned way of describing Data Science. On the other hand, a lot of traditional data mining and data warehouse people think of Data Science as old wine in a newly designed big data bottle. This debate exists not only in traditional companies that have been solving customer problems for years, but also in new startups that are trying to find a range of customers for their products. It also rages among customers who have used different products over the years and are now wondering: “What is the new data science shiny object thingy? We’ve been using scoring algorithms with SAS and SPSS for years. Is this just a new marketing term?”
In this article I want to demystify the two terms and call out the differences. As a practitioner, I have to tell you that the terms Analytics and Data Science are distinctly different. They are two different animals. Here’s why…
Analytics is often tied to a technology or a relatively canned approach, usually applied to a few hypotheses with well-defined methods. Mature analytic processes often create useful solutions.
Data Science, however, is free-form. It often drives which Analytics should be run for a problem, and it falls into the discovery realm. In many cases Data Science helps to formulate the hypothesis itself…
That being said, the definitions have morphed quite a bit over the last few years. Just as BI and Analytics are used in the same breath in many places, the word Analytics is now tied to Data Science and is often used to mean the same thing!
How to use BOTH terms in the same paragraph
A churn problem comes with poly-structured data: transactions, emails, call center notes. We need to decide what ‘Analytics’ needs to be performed to generate a rank-ordered list of customers. We can decide to use Aster’s Multi-Genre Advanced Analytics or Spark MLlib to dig through the data and create models for prediction. The process of finding out what “Analytics” needs to be performed requires Data Science knowledge. Should I use a Hidden Markov model or logistic regression, or a combination of both? What if I throw in Personalized PageRank features and use XGBoost to increase my precision and recall? Which “Analytic method(s)” are worth a shot given the descriptive statistics of the data?
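To make the "which method is worth a shot" question concrete, here is a minimal sketch of how two candidate scoring methods might be compared on a rank-ordered customer list using precision and recall. Everything in it is an assumption for illustration: the customer IDs, the scores, the labels `method_a` and `method_b`, and the cutoff `k` are made up, and the helper functions are hypothetical rather than part of Aster, Spark MLlib, or XGBoost.

```python
from typing import Dict, List, Set, Tuple


def rank_customers(scores: Dict[str, float]) -> List[str]:
    """Return customer IDs ordered from highest to lowest churn score."""
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


def precision_recall_at_k(
    ranked: List[str], churned: Set[str], k: int
) -> Tuple[float, float]:
    """Precision and recall when the top-k ranked customers are flagged."""
    flagged = ranked[:k]
    hits = sum(1 for cid in flagged if cid in churned)
    precision = hits / k if k else 0.0
    recall = hits / len(churned) if churned else 0.0
    return precision, recall


# Hypothetical ground truth: customers who actually churned.
churned = {"c1", "c4", "c6"}

# Hypothetical churn scores from two candidate methods (illustrative only).
method_a = {"c1": 0.91, "c2": 0.80, "c3": 0.15, "c4": 0.72, "c5": 0.30, "c6": 0.41}
method_b = {"c1": 0.88, "c2": 0.35, "c3": 0.10, "c4": 0.79, "c5": 0.22, "c6": 0.67}

results = {}
for name, scores in (("method_a", method_a), ("method_b", method_b)):
    ranked = rank_customers(scores)
    results[name] = precision_recall_at_k(ranked, churned, k=3)
    print(name, ranked[:3], results[name])
```

In practice the scores would come from whichever models are being evaluated, and the comparison would be run on held-out data rather than a handful of hand-written values.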
Analytics methods are also more prescriptive and are used where the work is close to operationalization. Analytics folks are more tied to the code and the underlying technology, and they tend to create controls (KPIs) over the process. Data Science sits more in the discovery realm. Data Science folks tend to get more caught up in how algorithms work, even under the hood, along with their limitations, visualizations, insights, finding needles in the haystack, smoking guns, etc.
In layman’s terms, if Analytics is like law enforcement on the streets, Data Science is the equivalent of the forensic lab that provides the methods and evidence. We need both to solve the problem not just once, but repeatedly in the future.
That’s just my observation. There could be other interpretations of the same topic. But at the end of the day, I think we should use the terms that our customers are more familiar with and call out the differences as succinctly as possible.