Information Week has published an interesting interview with David McJannet, VP of marketing for Hortonworks, in which he discusses the limitations of existing definitions of ‘big data’ and proposes a more practical, business-oriented definition.
…Big data is about “building new analytic applications based on new types of data, in order to better serve your customers and drive a better competitive advantage,” said McJannet.
This approach is in keeping with my recent reassessment of our communication in relation to big data, as I explained during our recent webinar “Big Data Reconsidered” (a replay of which is available here).
While volume, variety and velocity are common characteristics of the ‘big data’ projects we talk to clients about, those clients do not define their problems and initiatives in terms of volume, variety and velocity (or any other Vs).
Instead, they are focused on business problems (e.g. churn analysis, fraud analysis) or on functional challenges (e.g. processing and analysis of very large data sets in their entirety, stream processing of sensor and machine-generated data) that must be overcome to deal with those business problems.
Based on these ongoing conversations, I would agree with McJannet that enterprises are looking for a more business-oriented definition of big data that focuses less on the nature of the data and more on the business outcomes.
For what it’s worth, and in the spirit of taking the conversation forward, below is the description of big data that we seem to have settled on:
‘Big Data’ is the realization of competitive advantage by storing, processing and analyzing data that was previously ignored due to the cost and functional limitations of traditional data management technologies.
Our description and McJannet’s are variations on a theme, both focusing on a key factor that is missing from most V-driven definitions: competitive advantage. However, since the baseline for conversation has been set by the 3Vs, here’s the longer version, which puts our description in context:
‘Big Data’ is the realization of competitive advantage by storing, processing and analyzing data that was previously ignored due to the cost and functional limitations of traditional data management technologies to handle its volume, velocity and variety.