NoSQL LinkedIn Skills Index – September 2013

With our rebooted NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, now into its second year, I thought it was a good time to add some newer projects to the list, specifically ArangoDB, FoundationDB, RethinkDB, and Titan.

It shouldn’t surprise anyone to find that those four new additions failed to make a dent in the top ten list of the NoSQL databases most often cited in LinkedIn profiles. However, there is still some interesting activity this quarter, with Riak leapfrogging MarkLogic (as predicted).

[Chart: top ten NoSQL databases by LinkedIn profile mentions, Q3 2013]

Outside the top ten, Apache Accumulo overtook Voldemort, and saw the second fastest growth in mentions in Q3, behind only DynamoDB and ahead of Neo4j, MongoDB, and Cassandra.

That growth saw MongoDB extend its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 49% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 47% in June.

[Chart: share of all NoSQL mentions in LinkedIn profiles, Q3 2013]

Incidentally, adding the four new NoSQL databases to the analysis did not have a significant impact on MongoDB’s share. Without them it still registered 49%. Expect MongoDB to pass the 50% threshold in Q4, however, and Couchbase to overtake MarkLogic.
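For readers who want to see the arithmetic behind a share figure, here is a minimal sketch in Python. The counts are invented placeholders chosen only to make the sums easy to follow; they are not the index's actual data, and the function name `mention_shares` is hypothetical. It shows how a handful of rarely mentioned newcomers barely moves the leader's rounded share.

```python
def mention_shares(counts):
    """Return each project's share of all mentions, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical profile-mention counts, NOT the real index data.
established = {"MongoDB": 4900, "Cassandra": 1500, "Redis": 1400, "Other": 2200}

# Hypothetical (deliberately small) counts for four newly added projects.
newcomers = {"ArangoDB": 5, "FoundationDB": 5, "RethinkDB": 10, "Titan": 20}

before = mention_shares(established)["MongoDB"]
after = mention_shares({**established, **newcomers})["MongoDB"]

# Both values round to the same whole-number percentage.
print(f"{before:.1f}% without newcomers, {after:.1f}% with them")
```

With these made-up numbers the newcomers add only 40 mentions to a base of 10,000, so the leader's share slips by a fraction of a point and still rounds to the same whole number.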

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Forthcoming webinar: NoSQL Technology and Real-time, Accurate Predictive Analytics

On August 29 at 10:00 am PT I’ll be taking part in a webinar in association with Objectivity entitled “Big Data: NoSQL Technology and Real-time, Accurate Predictive Analytics”.

The webinar will provide an overview of NoSQL database technology and, in particular, the role that graph databases have in the expanding analytics market.

I’ll be joined by Leon Guzenda, Founder, Objectivity, who will provide a brief overview of Objectivity, Inc and its products Objectivity/DB and InfiniteGraph, as well as J.C. Smart, Director Global Insight Laboratory, Georgetown University, who will explain how Georgetown University is taking advantage of Objectivity’s products to develop one of the most interconnected databases today – examining information from all types of sources worldwide in real-time.

For full details, and to register, click here.

The Data Day, A few days: August 1-7 2013

MySQL, NoSQL, NewSQL, DBaaS market sizing. And more

Sizing the opportunities for MySQL, NoSQL, NewSQL and DBaaS

451 Research has recently published an update to our market sizing estimates for the MySQL ecosystem, NoSQL and NewSQL sectors, adding coverage of the database-as-a-service market.

The report, Next-Generation Operational Databases: 2012-2016, can be found here and provides estimates for the size of the aggregate market and each market sector, as well as competitive landscape maps. It also includes a growth forecast for each sector, and highlights the opportunities and threats facing participating vendors.


The key findings are also available in a short, free presentation (registration required), which can be found here, and provides details of how the MySQL, NoSQL and DBaaS sectors are each expected to grow to generate revenue in excess of $1bn by 2016.

The Data Day, A few days: July 24-31 2013

Next-Gen DB market sizing. Total Data Integration. And more.

And that’s the data day, today.

NoSQL LinkedIn Skills Index – June 2013

Four quarters have now passed since we rebooted our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, giving us a good view of the relative growth of the various NoSQL databases in the past year.

[Chart: top ten NoSQL databases by LinkedIn profile mentions, June 2013]

A few interesting statistics to pick out: Cassandra has jumped ahead of Redis for second place, while outside the top ten, shown here, OrientDB climbed above Hypertable and DEX climbed above InfiniteGraph. Looking ahead, expect Riak to overtake MarkLogic in the next three months.

DynamoDB saw the greatest increase in terms of the number of mentions in LinkedIn profiles in the past three months, although it remains in 10th position. In terms of growth, DynamoDB was followed by OrientDB, Neo4j, Apache Accumulo and DEX.

However, MongoDB once again extended its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 47% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 46% in March.

[Chart: share of all NoSQL mentions in LinkedIn profiles, June 2013]

Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.

Another significant data source that can provide a different perspective on the NoSQL market is our market-sizing revenue estimate. Stand by for an update on our sizing estimates for the NoSQL, NewSQL, MySQL and DBaaS sectors in the coming weeks.

The Data Day, A few days: May 13-May 17 2013

Tableau IPOs. Funding for EdgeSpring, Cloudant, LucidWorks, and GraphLab

And that’s the data day, today.

NoSQL LinkedIn Skills Index – March 2013

As Q1 comes to a close it’s time to take another look at our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects. This is the second update since we rebooted the analysis in September 2012 to account for more products and refine our search terms.

[Chart: top ten NoSQL databases by LinkedIn profile mentions, March 2013]

A few interesting statistics to pick out: Neo4j has, as predicted, jumped ahead of MarkLogic for sixth place. No other changes of position, but outside the top ten, shown here, Apache Accumulo continues to grow well.

In fact, Apache Accumulo had the fastest rate of growth for the second quarter in succession, just ahead of DynamoDB and, once again, OrientDB, followed by Apache Cassandra and MongoDB.

MongoDB’s growth means that it once again extended its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 46% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 45% in December.

[Chart: share of all NoSQL mentions in LinkedIn profiles, March 2013]

The Data Day, Two days: February 11/12 2013

ClearStory sheds light on data analysis service. Illuminating ‘dark data’. More.

And that’s the data day, today.

Neither fish nor fowl: the rise of multi-model databases

One of the most complicated aspects of putting together our database landscape map was dealing with the growing number of (particularly NoSQL) databases that refuse to be pigeon-holed in any of the primary database categories.

I have begun to refer to these as “multi-model databases” in recognition of the fact that they are able to take on the characteristics of multiple databases. In truth, though, there are probably two different groups of products that could be considered “multi-model”:

True multi-model databases that have been designed specifically to serve multiple data models and use-cases

Examples include:
FoundationDB, which is being designed to support ACID and NoSQL, but more to the point in this instance, multiple layers including key-value, document, and object layers

Aerospike, which is planning to combine SQL, key-value, document and graph database technologies in a single database by bringing together its Citrusleaf NoSQL database with the acquired AlchemyDB NewSQL project

OrientDB, which is, at heart, a document database, but can also be used as a graph database; as an object database, making use of the Java persistence API; and as a hybrid database, taking advantage of multiple models to serve different application requirements

ArangoDB, which promises to deliver the benefits of key-value, document and graph stores in a single database

Other products that could be considered true multi-model databases are:
Couchbase Server 2.0, which can be used as both a document store and a key value store, as well as a distributed cache

Riak, which is a key-value store, although it can be used as a document store since the value can be a JSON document

NuoDB, which will provide compatibility with other databases by taking on multiple ‘personalities’ – an Oracle personality via PL/SQL compatibility is in the development roadmap, as is a document store personality via JSON support.

General-purpose databases with multi-model options
What’s the difference between multi-model databases and existing general-purpose databases that have optional capabilities for serving multiple models? In my book it’s about being designed for purpose, but I’m sure that will be a debating point for the future. In the meantime, examples include:

Oracle MySQL 5.6, which can support both SQL-based access and key-value access via the Memcached API.

Oracle MySQL Cluster 7.2, which similarly supports concurrent NoSQL and SQL access to the database.

IBM DB2 10, which extends DB2’s hybrid relational and XML engine to enable the storage and management of graph triples, as well as support for the SPARQL 1.0 query language.

Akiban Server, which has the ability to treat groups of tables as objects and access them as JSON documents via SQL.

PostgreSQL hstore, which can be used for storing key-value pairs within a PostgreSQL data field, thereby enabling schema-less queries against data stored in PostgreSQL

We are also aware of other NewSQL databases that plan to adopt support for popular NoSQL data models, while IBM has also talked about plans to integrate key-value store NoSQL access capabilities with DB2 and Informix database software.

Other products that could be considered multi-model options include:
Oracle Spatial and Graph, an option for Oracle Database 11g.

One of the drivers of NoSQL database adoption has been polyglot persistence – using multiple databases depending on the specific requirements of individual applications. Multi-model databases contradict this trend, to some extent, so it will be interesting to see whether they begin to gain traction.

While we see the wisdom of selecting the best database for the job, we also recognise that it could sometimes be a matter of choosing the best data model for the job, while relying on a single storage back-end.
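To make the idea of several data models over a single storage back-end a little more concrete, here is a toy sketch in Python. It is an illustration only, not how any of the products named above are implemented: the class names (`KeyValueStore`, `DocumentLayer`, `GraphLayer`) and the key prefixes are hypothetical, and an in-memory dictionary stands in for a real storage engine.

```python
import json


class KeyValueStore:
    """Single storage back-end: an in-memory dict standing in for a real engine."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class DocumentLayer:
    """Document model: each JSON document is serialised into one key-value pair."""

    def __init__(self, store):
        self.store = store

    def save(self, doc_id, doc):
        self.store.put(f"doc:{doc_id}", json.dumps(doc))

    def load(self, doc_id):
        raw = self.store.get(f"doc:{doc_id}")
        return None if raw is None else json.loads(raw)


class GraphLayer:
    """Graph model: a node's outgoing edges are kept as a JSON list under one key."""

    def __init__(self, store):
        self.store = store

    def neighbours(self, node):
        raw = self.store.get(f"edges:{node}")
        return [] if raw is None else json.loads(raw)

    def add_edge(self, src, dst):
        edges = self.neighbours(src)
        if dst not in edges:
            edges.append(dst)
        self.store.put(f"edges:{src}", json.dumps(edges))


# Both layers share the same underlying store.
store = KeyValueStore()
docs = DocumentLayer(store)
graph = GraphLayer(store)

docs.save("alice", {"name": "Alice", "skills": ["Riak", "Neo4j"]})
docs.save("bob", {"name": "Bob", "skills": ["MongoDB"]})
graph.add_edge("alice", "bob")

# Traverse with the graph model, then read the results with the document model.
friend_names = [docs.load(n)["name"] for n in graph.neighbours("alice")]
print(friend_names)
```

The application picks whichever model suits the question (a traversal, then a document lookup) while everything lands in one store. A polyglot approach would instead run a separate graph database and document database side by side.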