The Data Day, A few days: October 11 – October 17 2014

Insanely large Strata-Hadoop World edition

And that’s the data day, today.

NoSQL LinkedIn Skills Index – September 2014

Time for a new look for our NoSQL LinkedIn Skills Index, which tracks mentions of NoSQL databases in LinkedIn member profiles, as it enters its third year. We’ve switched from a bar chart to a line chart to reduce clutter – at least on the horizontal plane.

Unfortunately the dominance of MongoDB means that the chart is inevitably cluttered on the low end of the vertical plane, but the line chart at least provides a clear illustration of that dominance.

nosql

There are a few other changes of note further down the list, with FoundationDB gaining a place on Sparksee (as predicted) thanks to it having the fastest rate of growth (40.74%) in Q3. ArangoDB also gained a place on InfiniteGraph thanks to recording the second fastest growth rate (37.84%).

We noted last time that Q3 could see OrientDB overtake Aerospike, unless the release of Aerospike as open source had an immediate impact on interest levels. That seems to have occurred, with Aerospike recording 23.80% growth to not only hold off OrientDB but gain ground on Voldemort, which looks likely to be overtaken by both Aerospike and OrientDB in Q4. Inside the top 10 there is also a chance that DynamoDB could overtake MarkLogic in Q4.

Titan (25.97%), RethinkDB (22.88%) and DynamoDB (22.85%) also deserve a mention in terms of growth in Q3, while Neo4j was the fastest growing of the top 10 with 17.99%. MongoDB was of course the most popular NoSQL database by a considerable margin, once again accounting for 49% of all LinkedIn member profiles mentioning a NoSQL project.

nosql2

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

NoSQL LinkedIn Skills Index – June 2014

There isn’t a great deal of movement in the June update to our NoSQL LinkedIn Skills Index, which tracks mentions of NoSQL databases in LinkedIn member profiles. At the tail-end of the list FoundationDB jumped a place above InfiniteGraph and can be expected to gain another place on Sparksee in the next quarter, but otherwise it’s very much ‘as you were’.

Q3 could also see OrientDB overtake Aerospike, unless the recent release of Aerospike as open source has an immediate impact on interest levels. FoundationDB was among those with the fastest growth rates in Q2 at 35.0%, although the fastest growth came from ArangoDB (48.0%), followed by RethinkDB (36.6%), Titan (27.1%) and Couchbase (18.9%).

nosql-Jun

Once again MongoDB was the most popular NoSQL database by a considerable margin, representing 49% of all LinkedIn member profiles mentioning a NoSQL project.

jun-donut

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

NoSQL LinkedIn Skills Index – March 2014

The latest version of our NoSQL LinkedIn Skills Index shows the continued strength of MongoDB, as the document database increased its share back to 49% of all mentions of NoSQL databases in Q1.

q1_donut

We were surprised to find MongoDB’s proportion of the Index (based on the number of LinkedIn member profiles mentioning each of the NoSQL projects) actually declined in the previous quarter: from 49% to 48%. The Q1 results suggest that was just a blip.

We had wondered whether Couchbase’s leap of two places in our previous update might also be a blip, but in fact Couchbase retained seventh spot in Q1 and there were no changes of position within the top ten this quarter.

q1-chart

Outside the top ten, Titan gained a place on Hypertable, as expected, while RethinkDB leapfrogged AllegroGraph, thanks to recording the third fastest growth (34.94%) in the quarter. The fastest climber, in terms of mentions, was FoundationDB (42.86%), followed by ArangoDB (38.89%). DynamoDB (28.03%) and Titan (26.88%) complete the list of the top five fastest climbers.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

The Data Day, A few days: December 13-10 2013

VC funding for Hadoop and NoSQL tops $1bn. And more

And that’s the data day, today.

NoSQL LinkedIn Skills Index – December 2013

There’s an early end to the quarter for our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, just as there was in 2012.

We predicted in Q3 that Couchbase would overtake MarkLogic this quarter, which came to pass, but were somewhat surprised to see Couchbase also leapfrog Riak to claim 7th place. It’s almost too close to call between the three, so we wouldn’t be surprised to see those places change hands in the coming quarters.

december2013

There were no other changes of position outside the top ten, although Titan is bearing down on Hypertable having recorded the fastest growth in Q4 (49.5%) and can be expected to gain a place in Q1 2014. The second fastest climber, in terms of mentions, was FoundationDB, followed by ArangoDB, RethinkDB and Apache Cassandra (the latter being particularly notable since it was the only one of the five fastest growers to also be one of the top ten most mentioned in LinkedIn member profiles).

That growth was of course not enough to close the gap on MongoDB as the most mentioned NoSQL database in LinkedIn member profiles, although for the first time MongoDB’s proportion of the overall total actually declined – from 49% in Q3 to 48%, upsetting our prediction that MongoDB would pass the 50% threshold in Q4.

Q42013

It will be interesting to see whether MongoDB’s dominance declines again in Q1, although either way it retains a monumental lead over all the other NoSQL databases in terms of mentions in LinkedIn profiles.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Visualizing the $1bn+ VC investment in Hadoop and NoSQL

Cumulative VC funding for Hadoop and NoSQL vendors broke through the $1bn barrier in 2013, according to a Spotlight report published by 451 Research, based on data provided by The 451 M&A KnowledgeBase.

The data indicates that there was a substantial increase in funding in 2013 ($530.5m, not including RethinkDB’s $8m announced yesterday) compared to 2012 ($190.9m), thanks to major rounds for the likes of MongoDB, Pivotal, Hortonworks and DataStax.
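As a quick sanity check on the scale of that increase, the two annual totals quoted above can be compared directly. The snippet below is a minimal sketch that uses only those two figures; the function and variable names are our own, chosen for illustration.

```python
def pct_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0


# Annual VC funding totals quoted in the text, in millions of US dollars.
# The 2013 figure excludes RethinkDB's $8m round, as the text notes.
funding_2012_musd = 190.9
funding_2013_musd = 530.5

increase = pct_change(funding_2012_musd, funding_2013_musd)
multiple = funding_2013_musd / funding_2012_musd

print(f"2013 vs 2012: +{increase:.1f}% ({multiple:.2f}x)")
```

In other words, 2013 funding was a little under three times the 2012 total.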

The report includes a visualization created by 451’s Director of Data Strategy and Solutions, Barbara Peng, that illustrates the connections between the various investors and the NoSQL and Hadoop vendors in which they have invested.

A snapshot of the visualization is shown below but the original is interactive, enabling 451 Research clients to drag the various elements around for greater emphasis, as well as isolate the NoSQL or Hadoop categories.

vc-firms

451 Research clients can also scroll over the blue circles to see the total amount of funding raised by the individual Hadoop and NoSQL vendors, and scroll over the smaller orange circles to see which investors have backed which companies.

The sample set was limited to 16 vendors for visual clarity, but the six Hadoop and 10 NoSQL providers cited account for more than 87% of funding to date (with Pivotal representing the vast majority of the remaining 13%).

This visualization illustrates that investment in Hadoop and NoSQL providers comes from a relatively small group of VC firms (52 to be specific, excluding individual seed investors), resulting in a relatively tightly clustered graph.

However, the visualization also enables us to put to the test the recent blog post by MarkLogic’s Adam Fowler in which he stated:

“Just look at the number of investors who are investing in multiple NoSQL companies. They’re hedging their bets because they’re not sure themselves which businesses will survive.”

In fact investment in multiple Hadoop and NoSQL vendors is relatively rare. Only 11 out of the 52 VC firms have invested in more than one Hadoop and/or NoSQL vendor, with seven of those picking one Hadoop vendor and one NoSQL provider. That is less hedging their bets than picking a winner in each category.
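To put those counts in proportion, the arithmetic can be sketched as follows. This uses only the figures quoted above (52 firms, 11 multiple investors, seven of them with one Hadoop and one NoSQL bet); the variable names are our own.

```python
# Counts quoted in the text (individual seed investors excluded).
total_firms = 52       # VC firms in the sample
multi_investors = 11   # firms backing more than one Hadoop and/or NoSQL vendor
one_of_each = 7        # of those, firms with exactly one Hadoop and one NoSQL bet

single_investors = total_firms - multi_investors
multi_share = multi_investors / total_firms * 100
remaining_multi = multi_investors - one_of_each

print(f"Firms backing a single vendor: {single_investors}")
print(f"Share backing more than one vendor: {multi_share:.1f}%")
print(f"Multiple investors not simply picking one of each: {remaining_multi}")
```

Roughly one firm in five has backed more than one vendor, and most of those have simply chosen one company in each category.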

Of the remaining four investment shops, two have invested in one Hadoop distributor, one NoSQL specialist and one Hadoop-as-a-service provider (MapR, DataStax and Qubole for Lightspeed Venture Partners; Cloudera, Couchbase and Altiscale for Accel Partners), while In-Q-Tel has invested in one Hadoop supplier, one NoSQL vendor and one NoSQL-as-a-service provider (Cloudera, MongoDB and Cloudant).

Only Sequoia Capital has invested in multiple NoSQL vendors (as well as Hadoop-as-a-service provider Altiscale) having invested in MongoDB, DataStax and – hold onto your hats, irony fans – MarkLogic. It should be noted however that Sequoia has not invested in DataStax since its series A round in late 2010.

The full report, Venture funding for Hadoop and NoSQL vendors tops $1bn, is available now to 451 Research clients and also includes our perspective on when combined Hadoop and NoSQL revenue might begin to exceed combined Hadoop and NoSQL VC funding, as well as the potential for M&A and IPO activity in 2014.

NoSQL LinkedIn Skills Index – September 2013

With our rebooted NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, now into its second year, I thought it was a good time to add some newer projects to the list; specifically: ArangoDB, FoundationDB, RethinkDB, and Titan.

It shouldn’t surprise anyone to find that those four new additions failed to make a dent in the top ten list of the NoSQL databases most often cited in LinkedIn profiles. However, there is still some interesting activity this quarter, with Riak leapfrogging MarkLogic (as predicted).

linkedinq31

Outside the top ten, Apache Accumulo overtook Voldemort, and saw the second fastest growth in mentions in Q3, behind only DynamoDB and ahead of Neo4j, MongoDB, and Cassandra.

That growth saw MongoDB extend its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 49% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 47% in June.

allNoSQLq3

Incidentally, adding the four new NoSQL databases to the analysis did not have a significant impact on MongoDB’s share. Without them it still registered 49%. Expect MongoDB to pass the 50% threshold in Q4, however, as well as Couchbase to overtake MarkLogic.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

The Data Day, Two days: December 10/11 2012

451’s perspective on ScaleArc, RethinkDB, Tableau. And more.

And that’s the Data Day, today.

The Data Day, Two days: November 12/13 2012

Platfora raises $20m. IBM trumpets ‘integration anywhere’. And more

And that’s the Data Day, today.