by Paul Barsch, Teradata
Evaluating risk vs. return of a big data initiative can be tricky, especially because the open source market is so active and fluid. Financial risk aside, business risk is often the bigger spoiler when estimating the future cash flows and profitability of a big data project. Accounting for risks such as competition, emerging technologies, rising costs and regulatory challenges could make the difference between an accurate financial big data ROI forecast and one that’s woefully wrong.
In addition to project risk (which has its own set of challenges), there are four business risks for big data that might prevent you from realizing the financial value you expect.
How confident are you that revenues associated with your big data project will come in as anticipated? In terms of competition, are there emerging firms that might take a bite from your revenues? If your cost-benefit analysis for a particular big data project requires you to sell 10% more of widget X in the next three years to reach breakeven—and new competitors emerge—then you could be at risk of missing revenue forecasts.
What’s hot in open source today might not be tomorrow. For example, Apache Spark is attracting hundreds of contributors. But will the open source community move on to the next big thing? If the past is any evidence, the answer is yes—eventually. Consider whether the open source project you’re adopting has staying power. Keep in mind that if the community moves on from the open source project you’re using, it’s possible that costs could rise as you scramble for development skills and support.
Think that data scientist you hired for $120K annually is going to stick around when companies down the street are now paying $150K? This article shows that some data scientists with a master’s degree can earn $200K or more! Moreover, analytic development and solution costs are rising faster than the rate of inflation, with demand outstripping supply. And we haven’t even mentioned rising training costs for change management, the additional compute and storage needed as users pile onto your new big data solution, new BI tools for Hadoop, or additional Hadoop administrators.
In short, it pays to test assumptions and perform a sensitivity analysis to determine what happens if salaries, utilities or other costs rise 10-20%. With rising costs it’s entirely possible to find your Net Present Value (NPV) analysis upside down.
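That sensitivity analysis can be sketched in a few lines. The figures below are purely illustrative assumptions (a $500K initial outlay, five years of benefits, a 10% discount rate), not numbers from any real project; the point is how a 10-20% cost increase can flip the NPV from positive to negative.

```python
def npv(rate, cash_flows):
    """Net Present Value, where cash_flows[0] occurs at t=0 (undiscounted)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical project assumptions (illustrative only)
initial_investment = -500_000   # year-0 outlay
annual_revenue = 330_000        # expected yearly benefit
base_annual_cost = 180_000      # yearly salaries, compute, support
years = 5
discount_rate = 0.10

# Re-run the NPV with operating costs inflated by 0%, 10% and 20%
for cost_increase in (0.0, 0.10, 0.20):
    annual_cost = base_annual_cost * (1 + cost_increase)
    flows = [initial_investment] + [annual_revenue - annual_cost] * years
    print(f"costs +{cost_increase:.0%}: NPV = ${npv(discount_rate, flows):,.0f}")
```

With these assumed numbers, the project clears breakeven at today’s costs but goes underwater at a 20% cost rise, which is exactly the “upside down” scenario worth testing for before committing.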
Remember that data lake full of raw data for your data scientists to browse and discover new insights? Now someone from your HR group wants to add a large healthcare data set in order to view disease trends and provide employees with health coaching. Uh-oh.
While securing the data lake at the outset is a best practice, some data lakes have grown up with less governance than National Lampoon’s Animal House. Securing sensitive data sets and restricting access to authenticated users may add costs you haven’t anticipated based on today’s use cases. Be sure to account for the additional security and privacy costs of bringing sensitive corporate data into your big data solution.