NoSQL LinkedIn Skills Index – An Interesting Occasional Update

I was recently prompted by OrientDB CEO Luca Garulli to take another look at the NoSQL LinkedIn Skills Index, which we previously updated on a regular basis between September 2012 and 2015.

I wouldn’t read too much into the results since there’s been such a long period between updates, and this is – as ever – just a snapshot of one particular data source. However, they are definitely interesting, especially when you consider that we retired the NoSQL LinkedIn Skills Index primarily because the results had become so boringly predictable.

As such I’d make the following observations without any additional comment:

  • It is interesting to note that MongoDB’s share of mentions of NoSQL databases in LinkedIn member profiles has declined since September 2015, from 51% to 48%. Of course, MongoDB remains the number one by a considerable margin.
  • It is also interesting to note that Redis has climbed above Cassandra to claim second spot.
  • Similarly it is interesting that Neo4j has climbed above CouchDB for fifth place.
  • And it is also interesting that DynamoDB has overtaken Couchbase for eighth place.
  • It is also interesting that the two fastest growing NoSQL databases, in terms of mentions in LinkedIn profiles, are Google Cloud Bigtable (up 557%) and Azure DocumentDB (up 254%).
  • And it is also interesting that the third fastest growth came from RethinkDB, despite the recent demise of the company of the same name.
  • Those growth rates saw Google Cloud Bigtable climb above Voldemort, ArangoDB, Hypertable and AllegroGraph, while Azure DocumentDB climbed above Titan and Voldemort, and RethinkDB climbed above Titan and Accumulo.

Since Luca prompted another look at the results, I should also probably point out that mentions of OrientDB grew at a healthy 83% as OrientDB held on to 11th place in the Index.

Interesting…

NoSQL LinkedIn Skills Index – December 2014

As usual there’s an early finish to the quarter for our NoSQL LinkedIn Skills Index, which tracks mentions of NoSQL databases in LinkedIn member profiles, but as usual that has little impact on the results as MongoDB continues to account for 49% of all LinkedIn member profiles mentioning a NoSQL project.

[Figure: Q4 donut chart]

There are a few changes further down the list of NoSQL projects with both Aerospike and OrientDB overtaking Voldemort, as predicted, and RethinkDB overtaking Hypertable.

As noted last quarter, there was a chance that Aerospike might get overtaken by OrientDB and MarkLogic might get overtaken by DynamoDB. As it happens both held off their respective challengers but their places remain under threat.

ArangoDB had the fastest rate of growth in the quarter (21.57%), followed by RethinkDB (21.28%), FoundationDB (19.74%), OrientDB (18.02%) and Aerospike (17.62%). DynamoDB was next, and the fastest growing inside the top ten, with 14.37%.

[Figure: Q4 chart]

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

NoSQL LinkedIn Skills Index – September 2014

Time for a new look for our NoSQL LinkedIn Skills Index, which tracks mentions of NoSQL databases in LinkedIn member profiles, as it enters its third year. We’ve switched from a bar chart to a line chart to reduce clutter – at least on the horizontal plane.

Unfortunately the dominance of MongoDB means that the chart is inevitably cluttered on the low end of the vertical plane, but the line chart at least provides a clear illustration of that dominance.

[Figure: NoSQL LinkedIn Skills Index line chart]

There are a few other changes of note further down the list, with FoundationDB gaining a place on Sparksee (as predicted) thanks to it having the fastest rate of growth (40.74%) in Q3. ArangoDB also gained a place on InfiniteGraph thanks to recording the second fastest growth rate (37.84%).

We noted last time that Q3 could see OrientDB overtake Aerospike, unless the release of Aerospike as open source had an immediate impact on interest levels. That seems to have occurred, with Aerospike recording 23.80% growth to not only hold off OrientDB but gain ground on Voldemort, which looks likely to be overtaken by both Aerospike and OrientDB in Q4. Inside the top 10 there is also a chance that DynamoDB could overtake MarkLogic in Q4.

Titan (25.97%), RethinkDB (22.88%) and DynamoDB (22.85%) also deserve a mention in terms of growth in Q3, while Neo4j was the fastest growing of the top 10 with 17.99%. MongoDB was of course the most popular NoSQL database by a considerable margin, once again accounting for 49% of all LinkedIn member profiles mentioning a NoSQL project.


Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

NoSQL LinkedIn Skills Index – September 2013

With our rebooted NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, now into its second year, I thought it was a good time to add some newer projects to the list; specifically: ArangoDB, FoundationDB, RethinkDB, and Titan.

It shouldn’t surprise anyone to find that those four new additions failed to make a dent in the top ten list of the NoSQL databases most often cited in LinkedIn profiles. However, there is still some interesting activity this quarter, with Riak leapfrogging MarkLogic (as predicted).

[Figure: LinkedIn profile mentions by NoSQL database, Q3]

Outside the top ten, Apache Accumulo overtook Voldemort, and saw the second fastest growth in mentions in Q3, behind only DynamoDB and ahead of Neo4j, MongoDB, and Cassandra.

That growth saw MongoDB extend its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 49% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 47% in June.

[Figure: share of all NoSQL mentions in LinkedIn profiles, Q3]

Incidentally, adding the four new NoSQL databases to the analysis did not have a significant impact on MongoDB’s share. Without them it still registered 49%. Expect MongoDB to pass the 50% threshold in Q4, however, and Couchbase to overtake MarkLogic.
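For anyone wondering how the share and growth figures relate to the raw profile counts, the arithmetic can be sketched roughly as follows. This is only an illustration: the function names and every count in it are hypothetical, chosen to mirror the percentages above, and are not the Index's actual data.

```python
def shares(counts):
    """Return each project's percentage share of all profile mentions."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


def growth(previous, current):
    """Return the percentage growth in mentions between two snapshots."""
    return 100 * (current - previous) / previous


# Hypothetical profile counts (NOT real Index data), picked so that
# MongoDB's share moves from 47% to 49% between the two snapshots.
june = {"MongoDB": 4700, "Cassandra": 2000, "Others": 3300}
september = {"MongoDB": 5880, "Cassandra": 2300, "Others": 3820}

# Adding a handful of newer projects with few mentions each
# barely moves the leader's share.
new_projects = {"ArangoDB": 10, "FoundationDB": 10, "RethinkDB": 10, "Titan": 10}
september_with_new = {**september, **new_projects}

print(round(shares(june)["MongoDB"]))                # 47
print(round(shares(september)["MongoDB"]))           # 49
print(round(shares(september_with_new)["MongoDB"]))  # 49
print(round(growth(june["MongoDB"], september["MongoDB"]), 1))  # 25.1
```

Because the four additions contribute only a small number of mentions relative to the total, the rounded share is unchanged whether or not they are included.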

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Saying yes to NoSQL

As a company, The 451 Group has built its reputation on taking a lead in covering disruptive technologies and vendors. Even so, with a movement as hyped as NoSQL databases, it sometimes pays to be cautious.

In my role covering data management technologies for The 451 Group’s Information Management practice I have been keeping an eye on the NoSQL database movement for some time, taking the time to understand the nuances of the various technologies involved and their potential enterprise applicability.

That watching brief has now spilled over into official coverage, following our recent assessment of 10gen. I also recently had the chance to meet up with Couchio’s VP of business development, Nitin Borwankar (see coverage initiation of Couchio). I’ve also caught up with Basho Technologies sooner rather than later. A report on that is now imminent.

There are a couple of reasons why I have formally begun covering the NoSQL databases. The first is the maturing of the technologies, and the vendors behind them, to the point where they can be considered for enterprise-level adoption. The second is the demand we are getting from our clients to provide our view of the NoSQL space and its players.

This is coming both from the investment community and from existing vendors, either looking for potential partnerships or fearing potential competition. The number of queries we have been getting related to NoSQL and big data has encouraged me to articulate my thoughts, so look out for a two-part spotlight on the implications for the operational and analytical database markets in the coming weeks.

The biggest reason, however, is the recognition that the NoSQL movement is a user-led phenomenon. There is an enormous amount of hype surrounding NoSQL but for the most part it is not coming from vendors like 10gen, Couchio and Basho (although they may not be actively discouraging it) but from technology users.

A quick look at the most prominent key-value and column-table NoSQL data stores highlights this. Many of these have been created by user organizations themselves in order to fill a void and overcome the limitations of traditional relational databases – for example Google (BigTable), Yahoo (HBase), Zvents (Hypertable), LinkedIn (Voldemort), Amazon (Dynamo), and Facebook (Cassandra).

It has become clear that traditional database technologies do not meet the scalability and performance requirements of dealing with big data workloads, particularly at the scale experienced by social networking services.

That does raise the question of how applicable these technologies will be to enterprises that do not share the architecture of the likes of Google, Facebook and LinkedIn – at least in the short-term. There are users, though: Cassandra users include Rackspace, Digg, Facebook, and Twitter, for example.

What there isn’t – for the likes of Cassandra and Voldemort, at least – is vendor-based support. That inevitably raises questions about the general applicability of the key-value/column table stores. As Dave Kellogg notes, “unless you’ve got Google’s business model and talent pool, you probably shouldn’t copy their development tendencies”.

Given the levels of adoption it seems inevitable that vendors will emerge around some of these projects, not least since, as Dave puts it, “one day management will say: ‘Holy Cow folks, why in the world are we paying programmers to write and support software at this low a level?'”

In the meantime, it would appear that the document-oriented data stores (Couchio’s CouchDB, 10gen’s MongoDB, Basho’s Riak) are much more generally applicable, both technologically and from a business perspective. (UPDATE – You can also add Neo Technology and its graph database technology to that list.)

In our forthcoming two-part spotlight on this space I’ll articulate in more detail our view on the differentiation of the various NoSQL databases and other big data technologies and their potential enterprise applicability. The first part, on NoSQL and operational databases, is here.