by Randy Lea, Teradata
The word easy is not often associated with big data, but there is at least one easy answer to the question of why companies embark on big data deployments: money, whether making it or saving it. What may be less obvious is why, despite their best intentions, organizations sometimes get lost along the big data road. Instead of improving the bottom line through the power of deep insights, big data becomes more of an expense than a boost.
Why does this happen? One reason is that big data initiatives require substantial resources to get the infrastructure in place. It is all too easy for companies to get lost in the IT weeds: the systems, the processes, and the logistics that support big data management and loading come to dominate their landscape. Resources are poured into scaling the organization’s capabilities to store big data, manage big data, and do various kinds of processing on big data. None of those things actually delivers the answers that move the business. Instead they represent an intense focus on how to best and most cost-effectively amass data rather than on how to generate value from it.
How can an organization tell if it has fallen into this trap, collecting big data without deriving its true value? One powerful test is to ask whether the organization has built an environment that can perform not just analytics, but what can be thought of as big analytics. At the most basic level, big analytics is the ability to perform multi-genre analytics using SQL, statistical modeling, machine learning, path and pattern analysis, time series, text analysis, and graph analytics (among others), and to do so at scale.
For example, a common analytic we see maps the relationships between people and products using graph engines that let you explore the links between them. This type of analytic is most commonly performed in memory, so the total number of connections it can examine and process is by definition limited by the amount of memory available. That is not to say that no analytics is taking place, merely that it is not big analytics. To achieve big analytics, the same attention to scalability needs to be applied to the analytics platform as IT departments have previously given to the data warehouse.
Another scenario to consider is the so-called customer journey, the path that leads a buyer from interest to purchase. This type of analytic is frequently cited in conversations about big data. But is it actually big analytics? Following a transaction that begins with a visit to a store and continues with a series of clicks on a website may very well produce a large amount of data. But even a few billion events of this type pale in comparison to the volume of data created when, for example, sensor data becomes part of the equation for understanding behavior. If the analytics platform cannot keep up with the variety and volume of data available to the organization, it won’t deliver the insight into behavior that analytics of big data can support.
That’s where the notion of big analytics emerges. Big analytics is the natural partner for big data. It is the product of a balancing act, re-evaluating the application of resources so that there is better distribution between the means (that is, the data warehouse and the data lake) and the ends: incisive multi-genre analysis of all that data. SQL analysis might tell a retailer about the current state of sales, but it does not provide insight into the relationship between sales and customer behavior. Sentiment analysis alone might reveal customer dissatisfaction, but it may not quantify the impact of those sentiments on customer behavior. Path analysis can indicate bottlenecks in the journey to a conversion, but it does not indicate whether a poor customer experience is affecting overall brand sentiment. Each technique answers only part of the question; it is when they are applied together, and at scale, that they connect what is happening to why.
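To make the point concrete, here is a minimal Python sketch of two of those genres working together on a toy scale: a simple path analysis finds the step where non-converting journeys most often end, and sentiment scores are then grouped by that outcome. Everything in it is an assumption made for illustration, including the customer IDs, the step names, the sentiment scores, and the helper functions `last_step_counts` and `sentiment_by_outcome`. It is not any particular product's API, and it runs in memory on five customers, which is exactly the kind of analysis that would need a scalable platform to qualify as big analytics.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: each customer's ordered journey steps.
journeys = {
    "c1": ["store_visit", "site_browse", "cart", "purchase"],
    "c2": ["store_visit", "site_browse", "cart"],
    "c3": ["site_browse", "cart"],
    "c4": ["site_browse", "cart", "purchase"],
    "c5": ["store_visit", "site_browse"],
}

# Hypothetical sentiment scores in [-1, 1], e.g. derived from review text.
sentiment = {"c1": 0.8, "c2": -0.6, "c3": -0.4, "c4": 0.5, "c5": 0.1}


def last_step_counts(journeys):
    """Path analysis: count the final step of journeys that did not convert."""
    return Counter(
        path[-1] for path in journeys.values() if path[-1] != "purchase"
    )


def sentiment_by_outcome(journeys, sentiment):
    """Combine genres: mean sentiment grouped by each journey's final step."""
    groups = {}
    for customer, path in journeys.items():
        groups.setdefault(path[-1], []).append(sentiment[customer])
    return {step: mean(scores) for step, scores in groups.items()}


drop_offs = last_step_counts(journeys)
bottleneck, _ = drop_offs.most_common(1)[0]  # most frequent drop-off step
by_outcome = sentiment_by_outcome(journeys, sentiment)

print("Most common drop-off step:", bottleneck)
print("Mean sentiment at that step:", round(by_outcome[bottleneck], 2))
print("Mean sentiment of converters:", round(by_outcome["purchase"], 2))
```

Neither result is very informative alone. The path analysis only says where journeys stall, and the sentiment scores only say who is unhappy. Grouping one by the other is what links a bottleneck to how the affected customers feel.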
As you review your investment in big data infrastructure, don’t lose sight of the goal: making money from business insights. If you’re not getting those insights now, look at whether you can apply multiple analytics at scale and achieve ROI on big data. And remember: data is worthless unless you analyze it.