The Data Day, A few days: February 15-21 2014

Informatica eyes $1bn in sales. And more

And that’s the data day, today.

The Data Day, A few days: February 8-14 2014

Hortonworks and Red Hat expand Hadoop partnership. And more.

And that’s the data day, today.

The Data Day, A few days: February 1-7 2014

Orchestrate launches. Domo raises $125m. And more

And that’s the data day, today.

The Data Day, A few days: January 25-31 2014

VoltDB targets in-memory analytics. Garantia Data becomes Redis Labs. And more.

And that’s the data day, today.

The Data Day, A few days: January 18-24 2014

Is Hadoop a planet? MariaDB Enterprise. And more

And that’s the data day, today.

Hadoop: why enterprises need something to aspire to

Merv Adrian wrote a blog post recently bemoaning the “aspirational marketing” that surrounds Hadoop, in particular the fact that current deployments are a long way from delivering on the vision.

While I completely agree that many enterprises are struggling to translate tactical use-cases into the business use-cases required to drive more strategic adoption beyond the proof of concept stage, I don’t think that aspirational marketing around Hadoop is necessarily a bad thing.

It is certainly true that part of the problem lies in clearly understanding how Hadoop can be used as a complement to traditional relational database technologies deployed as an enterprise data warehouse.

That is why we recently asked Is Hadoop a planet? – comparing confusion around Hadoop’s classification to that of Pluto – while also describing Hadoop as a framework in search of a metaphor.

Given the confusion, however, I believe it is incumbent on Hadoop providers to describe not just the functional use-cases that are driving tactical adoption, but also the bigger vision that will drive more strategic adoption.

The data management industry has become accustomed to thinking about the storage, processing and analysis of data in analytical databases as akin to warehousing, to the extent that the phrase ‘data warehouse’ no longer requires an explanation.

We believe that a good understanding of the potential strategic role of Hadoop, even if it is only aspirational at this stage, will be important in encouraging broader and deeper adoption of Hadoop.

In addition, it is not as if there are no enterprises deploying Hadoop more strategically. Cloudera estimates that about 20% of its 300 subscription customers are already deploying Hadoop as what it calls an Enterprise Data Hub.

I’m not personally convinced that Enterprise Data Hub is really the right term (not least since we previously used the term Data Hub in a slightly different context). Other potential terms include data lake and data refinery.

Although the latter better describes Hadoop’s role in aggregating and processing data and the industrial-scale processes used to make data more acceptable for different analytic use-cases, it appears to have quickly passed out of fashion compared to the former.

I have begun using the term ‘data treatment plant’ as a combination of the two concepts to describe how Hadoop can be used as a single ‘logical’ unified data platform into which you simply pour data, while industrial-scale processes – the multiple data processing and analytic engines that will be supported by Hadoop 2.0, such as MapReduce, stream processing, SQL and NoSQL – are used to make data more acceptable for a desired end-use.

451 clients can get more detail on the ‘data treatment plant’, and on why we believe a bit of aspirational marketing may not be a bad thing for Hadoop, in our recent report, Hadoop: a framework in search of a metaphor.

The Data Day, A few days: January 11-17 2014

Neo updates Neo4j. Cloudera sharpens focus on Accumulo. And more

And that’s the data day, today.

The Data Day, A few days: January 6-10 2014

IBM forms Watson Group. And more

And that’s the data day, today.

The Data Day, A few days: December 13-10 2013

VC funding for Hadoop and NoSQL tops $1bn. And more

And that’s the data day, today.

NoSQL LinkedIn Skills Index – December 2013

There’s an early end to the quarter for our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, just as there was in 2012.

We predicted in Q3 that Couchbase would overtake MarkLogic this quarter, which came to pass, but were somewhat surprised to see Couchbase also leapfrog Riak to claim 7th place. It’s almost too close to call between the three, and we wouldn’t be surprised to see those places change hands in the coming quarters.

There were no other changes of position outside the top ten, although Titan is bearing down on Hypertable having recorded the fastest growth in Q4 (49.5%) and can be expected to gain a place in Q1 2014. The second fastest climber, in terms of mentions, was FoundationDB, followed by ArangoDB, RethinkDB and Apache Cassandra (the latter being particularly notable since it was the only one of the five fastest growers to also be one of the top ten most mentioned in LinkedIn member profiles).

That growth was of course not enough to close the gap on MongoDB as the most mentioned NoSQL database in LinkedIn member profiles, although for the first time MongoDB’s proportion of the overall total actually declined – from 49% in Q3 to 48%, upsetting our prediction that MongoDB would pass the 50% threshold in Q4.

It will be interesting to see whether MongoDB’s dominance declines again in Q1, although either way it retains a monumental lead over all the other NoSQL databases in terms of mentions in LinkedIn profiles.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.