The Data Day, A few days: February 15-21 2014

Informatica eyes $1bn in sales. And more

And that’s the data day, today.

NoSQL LinkedIn Skills Index – December 2013

There’s an early end to the quarter for our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, just as there was in 2012.

We predicted in Q3 that Couchbase would overtake MarkLogic this quarter, which came to pass, but were somewhat surprised to see Couchbase also leapfrog Riak to claim 7th place. It’s almost too close to call between the three, so we wouldn’t be surprised to see those places change hands in the coming quarters.

december2013

There were no other changes of position outside the top ten, although Titan is bearing down on Hypertable having recorded the fastest growth in Q4 (49.5%) and can be expected to gain a place in Q1 2014. The second fastest climber, in terms of mentions, was FoundationDB, followed by ArangoDB, RethinkDB and Apache Cassandra (the latter being particularly notable since it was the only one of the five fastest growers to also be one of the top ten most mentioned in LinkedIn member profiles).

That growth was of course not enough to close the gap on MongoDB as the most mentioned NoSQL database in LinkedIn member profiles, although for the first time MongoDB’s proportion of the overall total actually declined – from 49% in Q3 to 48%, upsetting our prediction that MongoDB would pass the 50% threshold in Q4.

Q42013

It will be interesting to see whether MongoDB’s dominance declines again in Q1, although either way it retains a monumental lead over all the other NoSQL databases in terms of mentions in LinkedIn profiles.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Visualizing the $1bn+ VC investment in Hadoop and NoSQL

Cumulative VC funding for Hadoop and NoSQL vendors broke through the $1bn barrier in 2013, according to a Spotlight report published by 451 Research, based on data provided by The 451 M&A KnowledgeBase.

The data indicates that there was a substantial increase in funding in 2013 ($530.5m, not including RethinkDB’s $8m announced yesterday) compared to 2012 ($190.9m), thanks to major rounds for the likes of MongoDB, Pivotal, Hortonworks and DataStax.

The report includes a visualization created by 451’s Director of Data Strategy and Solutions, Barbara Peng, that illustrates the connections between the various investors and the NoSQL and Hadoop vendors in which they have invested.

A snapshot of the visualization is shown below, but the original is interactive, enabling 451 Research clients to drag the various elements around for greater emphasis, as well as isolate the NoSQL or Hadoop categories.

vc-firms

451 Research clients can also scroll over the blue circles to see the total amount of funding raised by the individual Hadoop and NoSQL vendors, and scroll over the smaller orange circles to see which investors have backed which companies.

The sample set was limited to 16 vendors for visual clarity, but the six Hadoop and 10 NoSQL providers cited account for more than 87% of funding to date (with Pivotal representing the vast majority of the remaining 13%).

This visualization illustrates that investment in Hadoop and NoSQL providers comes from a relatively small group of VC firms (52 to be specific, excluding individual seed investors), resulting in a relatively tightly clustered graph.

However, the visualization also enables us to put to the test the recent blog post by MarkLogic’s Adam Fowler in which he stated:

“Just look at the number of investors who are investing in multiple NoSQL companies. They’re hedging their bets because they’re not sure themselves which businesses will survive.”

In fact, investment in multiple Hadoop and NoSQL vendors is relatively rare. Only 11 out of the 52 VC firms have invested in more than one Hadoop and/or NoSQL vendor, with seven of those picking one Hadoop vendor and one NoSQL provider. That is less hedging their bets than picking a winner in each category.

Of the remaining four investment shops, two have invested in one Hadoop distributor, one NoSQL specialist and one Hadoop-as-a-service provider (MapR, DataStax and Qubole for Lightspeed Venture Partners; Cloudera, Couchbase and Altiscale for Accel Partners), while In-Q-Tel has invested in one Hadoop supplier, one NoSQL vendor and one NoSQL-as-a-service provider (Cloudera, MongoDB and Cloudant).

Only Sequoia Capital has invested in multiple NoSQL vendors (as well as Hadoop-as-a-service provider Altiscale), having invested in MongoDB, DataStax and – hold onto your hats, irony fans – MarkLogic. It should be noted, however, that Sequoia has not invested in DataStax since its series A round in late 2010.
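
For readers who want to check the arithmetic behind "relatively rare", here is a minimal Python sketch that tallies the breakdown quoted above. The counts come straight from the post; the category labels and the `share` helper are our own shorthand for illustration, not anything taken from the report itself.

```python
# Counts quoted in the post; the category labels are illustrative shorthand.
TOTAL_VC_FIRMS = 52

multi_vendor_firms = {
    "one Hadoop + one NoSQL": 7,
    "Hadoop + NoSQL + Hadoop-as-a-service": 2,  # Lightspeed, Accel
    "Hadoop + NoSQL + NoSQL-as-a-service": 1,   # In-Q-Tel
    "multiple NoSQL vendors": 1,                # Sequoia Capital
}


def share(count, total=TOTAL_VC_FIRMS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Firms that backed more than one Hadoop and/or NoSQL vendor.
multi_total = sum(multi_vendor_firms.values())
multi_share = share(multi_total)

# Firms that backed more than one NoSQL vendor specifically.
multi_nosql_share = share(multi_vendor_firms["multiple NoSQL vendors"])

print(f"{multi_total} of {TOTAL_VC_FIRMS} firms ({multi_share}%) backed more than one vendor")
print(f"{multi_nosql_share}% backed more than one NoSQL vendor")
```

On these figures, roughly a fifth of the firms backed more than one vendor at all, and only one of the 52 backed more than one NoSQL vendor.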

The full report, Venture funding for Hadoop and NoSQL vendors tops $1bn, is available now to 451 Research clients and also includes our perspective on when combined Hadoop and NoSQL revenue might begin to exceed combined Hadoop and NoSQL VC funding, as well as the potential for M&A and IPO activity in 2014.

The Data Day, A few days: December 6-12 2013

Talend raises $40m. GridGain names new CEO. And more

And that’s the data day, today.

The Data Day, A few days: October 5-11 2013

TransLattice acquires StormDB. Funding for Cirro and TempoDB. And more.

And that’s the data day, today.

NoSQL LinkedIn Skills Index – September 2013

With our rebooted NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, now into its second year, I thought it was a good time to add some newer projects to the list; specifically: ArangoDB, FoundationDB, RethinkDB, and Titan.

It shouldn’t surprise anyone to find that those four new additions failed to make a dent in the top ten list of the NoSQL databases most often cited in LinkedIn profiles. However, there is still some interesting activity this quarter, with Riak leapfrogging MarkLogic (as predicted).

linkedinq31

Outside the top ten, Apache Accumulo overtook Voldemort, and saw the second fastest growth in mentions in Q3, behind only DynamoDB and ahead of Neo4j, MongoDB, and Cassandra.

That growth saw MongoDB extend its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 49% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 47% in June.

allNoSQLq3

Incidentally, adding the four new NoSQL databases to the analysis did not have a significant impact on MongoDB’s share. Without them it still registered 49%. Expect MongoDB to pass the 50% threshold in Q4, however, as well as Couchbase to overtake MarkLogic.

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Forthcoming Webinar: Get Down to Serious Business with Hadoop

On Wednesday, July 17, at 11:00am ET / 8:00am PT, I’ll be taking part in a webinar in association with MarkLogic on the subject of Hadoop.

As we’ve stated a few times, we believe that the flexibility of Apache Hadoop is one of its biggest assets – enabling organizations to generate value from data that was previously considered too expensive to be stored and processed in traditional databases – but it also results in “Hadoop” meaning different things to different people.

The result is that organizations still struggle over which Hadoop ecosystem components to adopt in order to obtain the greatest value, which application workloads might be suitable for deployment on Hadoop, and how to deploy Hadoop in conjunction with existing relational and non-relational databases.

On the webinar I’ll be providing an overview of the current state of the Hadoop ecosystem, geographic adoption, and use cases, while MarkLogic’s Director of Product Management Justin Makeig will provide an introduction to complementary technology from MarkLogic that can help your organization achieve real-time analysis, transactional data updates, integrity, granular security, and full-text search.

For full details, and to register, click here.

The Data Day, A few days: April 9-12 2013

Funding for MarkLogic and ParElastic. And more

And that’s the data day, today.

The Data Day, Two days: February 11/12 2013

ClearStory sheds light on data analysis service. Illuminating ‘dark data’. More.

And that’s the data day, today.

NoSQL LinkedIn Skills Index – December 2012

Time again to take a look at our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects. This is the first update since we rebooted the analysis in September to account for more products and refine our search terms.

NoSQL_Dec

On the face of it, not a lot has changed in the last quarter, although there are a few interesting statistics to pick out. For instance, Neo4j is now practically tied for sixth place with MarkLogic and can be expected to overtake it in Q1 2013. Outside the top ten shown above, Apache Accumulo has gained two places – overtaking Aerospike and Hypertable.

In fact, Apache Accumulo showed the fastest rate of growth in mentions between September and December, just ahead of DynamoDB and OrientDB, followed by Couchbase and MongoDB.

MongoDB’s growth means that it has cemented its place as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 45% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 43% in September.

nosql_all_dec