NoSQL enters the multi-model age. And more
Last week I tweeted that this week was shaping up to be a watershed week in the history of NoSQL. I was referring, of course, to MongoDB launching 3.0 and DataStax acquiring Aurelius – although more specifically what the context of these two announcements tells us about the future of NoSQL.
While each of these announcements could be considered significant in its own right, in combination they suggest a new stage in the evolution of NoSQL and send a clear signal that the future of NoSQL will be driven by database products that support multiple data models.
When we formally started covering NoSQL in 2010 it made sense to divide the various projects into four groups: key value stores, distributed (wide) column stores (or BigTable clones), graph databases, and document-oriented databases.
By early 2013 it had become obvious that there was another emerging category: multi-model databases.
Multi-model NoSQL databases have therefore been around for several years. But while we have seen growing interest in them, they have still lagged behind the early specialist NoSQL databases in terms of widespread adoption. That’s what makes the recent announcements by MongoDB and DataStax so significant.
Along with releasing version 3.0 of its document database, MongoDB also began to share (at least with us) its long-term multi-model vision for MongoDB, explaining how the pluggable storage engine architecture could enable the database to support multiple data models – such as key value, graph and relational.
Meanwhile DataStax described how its acquisition of Aurelius will see it developing a graph database to complement Apache Cassandra’s wide column key value model, and explained its multi-model strategy.
Multi-model momentum may have been growing for years but the fact that the commercial providers behind the two most popular NoSQL databases have detailed their plans to go multi-model confirms that the multi-model approach is the future of NoSQL.
Indeed, since we expect to see similar moves from other NoSQL players it will become increasingly difficult to divide the NoSQL space in terms of key value stores, wide column stores, graph databases, and document-oriented databases. Instead it makes sense to divide the NoSQL projects in terms of whether they are single-model or multi-model.
451 Research clients can read more about our perspectives on MongoDB’s strategic direction, as well as DataStax’s acquisition of Aurelius, and the wider implications for the NoSQL sector.
One of the most complicated aspects of putting together our database landscape map was dealing with the growing number of (particularly NoSQL) databases that refuse to be pigeon-holed in any of the primary database categories.
I have begun to refer to these as “multi-model databases” in recognition of the fact that they are able to take on the characteristics of multiple databases. In truth, though, there are probably two different groups of products that could be considered “multi-model”:
True multi-model databases that have been designed specifically to serve multiple data models and use-cases
Examples include:
FoundationDB, which is being designed to support ACID and NoSQL, but more to the point in this instance, multiple layers including key-value, document, and object layers
Aerospike, which is planning to combine SQL, key value, document and graph database technologies in a single database by bringing together its Citrusleaf NoSQL database with the acquired AlchemyDB NewSQL project
OrientDB, which is, at heart, a document database, but can also be used as a graph database; as an object database, making use of the Java persistence API; and as a hybrid database, taking advantage of multiple models to serve different application requirements
ArangoDB, which promises to deliver the benefits of key value, document and graph stores in a single database
Other products that could be considered true multi-model databases are:
Couchbase Server 2.0, which can be used as both a document store and a key value store, as well as a distributed cache
Riak, which is a key-value store, although it can be used as a document store since the value can be a JSON document
NuoDB, which will provide compatibility with other databases by taking on multiple ‘personalities’ – an Oracle personality via PL/SQL compatibility is in the development roadmap, as is a document store personality via JSON support.
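To make the Riak example above concrete, here is a minimal sketch of how a plain key-value store can double as a document store when its values are JSON. The `KeyValueStore` class and the `put_document` and `find_by_field` helpers are hypothetical names invented for illustration, not any product's API, and an in-memory dictionary stands in for the real storage engine.

```python
import json


class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store treats every value as an opaque string.
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def keys(self):
        return list(self._data)


def put_document(store, key, document):
    """Document-style write: serialise a dict to JSON and store it as the value."""
    store.put(key, json.dumps(document))


def find_by_field(store, field, expected):
    """Document-style read: scan every value and match on a JSON field.

    A real document database would normally use an index, not a full scan.
    """
    matches = []
    for key in store.keys():
        document = json.loads(store.get(key))
        if document.get(field) == expected:
            matches.append(key)
    return sorted(matches)


store = KeyValueStore()
put_document(store, "user:1", {"name": "Ada", "city": "London"})
put_document(store, "user:2", {"name": "Linus", "city": "Helsinki"})
put_document(store, "user:3", {"name": "Alan", "city": "London"})

# Key-value access and document-style access hit the same stored data.
london_users = find_by_field(store, "city", "London")
```

The store itself never learns what a document is. The document behaviour lives entirely in the helper functions, which is roughly why such products sit on the boundary between the key-value and document categories.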
General-purpose databases with multi-model options
What’s the difference between multi-model databases and existing general-purpose databases that have optional capabilities for serving multiple models? In my book it’s about being designed for purpose, but I’m sure that will be a debating point for the future. In the meantime, examples include:
Oracle MySQL 5.6, which can support both SQL-based access and key-value access via the Memcached API.
Oracle MySQL Cluster 7.2, which similarly supports concurrent NoSQL and SQL access to the database.
IBM DB2 10, which extends DB2’s hybrid relational and XML engine to enable the storage and management of graph triples, as well as support for the SPARQL 1.0 query language.
Akiban Server, which has the ability to treat groups of tables as objects and access them as JSON documents via SQL.
PostgreSQL hstore, which can be used for storing key-value pairs within a PostgreSQL data field, thereby enabling schema-less queries against data stored in PostgreSQL
We are also aware of other NewSQL databases that plan to adopt support for popular NoSQL data models, while IBM has also talked about plans to integrate key value store NoSQL access capabilities with DB2 and Informix database software.
Other products that could be considered multi-model options include:
Oracle Spatial and Graph, an option for Oracle Database 11g.
One of the drivers of NoSQL database adoption has been polyglot persistence – using multiple databases depending on the specific requirements of individual applications. Multi-model databases contradict this trend, to some extent, so it will be interesting to see whether they begin to gain traction.
While we see the wisdom of selecting the best database for the job, we also recognise that it could sometimes be a matter of choosing the best data model for the job, while relying on a single storage back-end.
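As a rough illustration of choosing the data model while relying on a single storage back-end, the sketch below puts a document view and a graph view on top of one shared key space. Everything here is an assumption made for illustration: the `Backend`, `DocumentModel` and `GraphModel` classes and the tuple key layout are hypothetical and do not describe how any of the products named above is implemented.

```python
class Backend:
    """Hypothetical single storage back end: a map from tuple keys to values."""

    def __init__(self):
        self._rows = {}

    def set(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)

    def scan(self, prefix):
        # Return (key, value) pairs whose key starts with the given tuple prefix,
        # ordered by key.
        n = len(prefix)
        rows = [(k, v) for k, v in self._rows.items() if k[:n] == prefix]
        return sorted(rows, key=lambda row: row[0])


class DocumentModel:
    """Document view: each field is stored under ("doc", doc_id, field)."""

    def __init__(self, backend):
        self._backend = backend

    def save(self, doc_id, fields):
        for name, value in fields.items():
            self._backend.set(("doc", doc_id, name), value)

    def load(self, doc_id):
        return {key[2]: value for key, value in self._backend.scan(("doc", doc_id))}


class GraphModel:
    """Graph view: each edge is stored under ("edge", source, target)."""

    def __init__(self, backend):
        self._backend = backend

    def add_edge(self, source, target, label):
        self._backend.set(("edge", source, target), label)

    def neighbours(self, source):
        return [key[2] for key, _ in self._backend.scan(("edge", source))]


backend = Backend()
documents = DocumentModel(backend)
graph = GraphModel(backend)

documents.save("alice", {"role": "analyst"})
documents.save("bob", {"role": "engineer"})
graph.add_edge("alice", "bob", "follows")
graph.add_edge("alice", "carol", "follows")

alice_doc = documents.load("alice")
alice_follows = graph.neighbours("alice")
```

Both models read and write the same `Backend` instance, so the application picks the model per task without running a second database. The trade-off is that each model is only as fast as the key layout allows.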