NoSQL/NewSQL/MySQL is not a zero sum game

It has been fascinating to watch how the industry has responded to ‘NewSQL’ since we published our first report using the term.

From day one the term has taken on a life of its own as the vendors such as ScaleBase, VoltDB, NimbusDB and Xeround have picked it up and run with it , while the likes of Marten Mickos and Michael Stonebraker have also adopted the term.

The reaction hasn’t been all positive, of course, although much of the criticism has been of the “are you kidding?” or “this is getting silly” variety rather than constructive debate about either the term or the associated technologies.

Another popular response is along the lines of “does this mean the end of NoSQL?”. I think it is important to address this question because it depends on a common misunderstanding about technology: that in order for the latest technology to succeed it is necessary for the technology that immediately preceded it to fail.

While our report into NoSQL, NewSQL and Beyond identified common drivers for interest in NoSQL and NewSQL databases, as well as data caching/grid technologies, in truth there is a significant difference between the requirements for databases that provide relaxed consistency and/or schema dependency and those that retain the ACID properties of transactional database systems.

Although there will be isolated examples, it is going to be rare, therefore, that any potential adopter would be directly comparing NoSQL and NewSQL technologies unless they are still at the stage trying to figure out the level of consistency required for an individual application.

The other option they would have is to use an existing SQL database, particularly Oracle’s MySQL, which provides the middle ground that overlaps both NoSQL and NewSQL. A significant number of the NoSQL deployments we have identified have migrated from MySQL, while existing MySQL deployments (although probably not the same ones) are also targets for the numerous NewSQL vendors.

VoltDB is a primary example, as last’s week’s GigaOm article covering CTO Michael Stonebraker’s view on Facebook’s MySQL ‘fate worse than death’ illustrated.

Much debate (125 comments at last count) has followed Stonebraker’s assertion that Facebook would be better off migrating to a NewSQL offering like VoltDB, most of which has not supported his view.

There’s a good reason for that. There is a good argument to be made that if you were trying to create Facebook from scratch today you probably wouldn’t choose the shard management overhead involved in MySQL. In that regard, Stonebraker has a point.

However, the fact is that MySQL was pretty much the only logical choice when Facebook began and its commitment to MySQL has grown over the years. The company is now probably one of the world’s experts in scaling and managing MySQL – to the extent that Facebook engineer Domas Mituzas argues that the operational overhead in handling sharding and availability of MySQL has become a constant cost.

Under those circumstances it would take something significant for a company like Facebook to even consider migrating to a MySQL alternative. Database migration projects are costly and complex and extremely rare – even at non-Facebook scale.

And it is not as if the company hasn’t experimented with other database technologies – having created Apache Cassandra and adopted Apache HBase for its Messages update.

This is exactly the polyglot persistence strategy we are seeing from NoSQL and NewSQL adopters: retaining MySQL (or another SQL database) where is makes sense to do so, while adding NoSQL and perhaps NewSQL for new projects and applications for which it is appropriate.

One other point to note, however, is that adopting a NewSQL technology might not require migrating away from MySQL. While the NewSQL category includes new database products such as VoltDB, it also includes alternative MySQL storage engines and database load balancing and clustering products such as ScaleBase and ScalArc, which are specifically designed to improve the scalability of MySQL (with other SQL databases to come) in order to avoid migration to an alternative database.

Adoption of these technologies does not require the complete abandonment of ‘standard MySQL’ any more than the adoption of NoSQL for non-ACID application requirements does, and it certainly doesn’t require the abandonment of NoSQL.

MySQL NoSQL survey highlights role of polyglot persistence

The MySQL developer website is currently running a poll to gauge the adoption of NoSQL database projects by MySQL developers.

The results are interesting, particularly in relation to our research report on the emergence and adoption of NoSQL and NewSQL databases, which I am completing this week.

Our research has shown that one of the drivers of NoSQL has been performance, and in particular the failure of MySQL to provide predictable performance at scale. We do see NoSQL being deployed for applications that previously ran on MySQL, or for which MySQL would previously have been the natural choice.

For example, while Facebook continues to run its core applications on MySQL running the InnoDB storage engine and memcached it also created what became Apache Cassandra to power its inbox search, and selected Apache HBase for its Messages application, which was updated in late 2010 to combine chat, email, and SMS, having found that MySQL was unable to deliver the performance required for large data sets.

Similarly, content discovery service StumbleUpon adopted HBase following problems with MySQL failover, Digg replaced its MySQL cluster with Apache Cassandra, and Wordnik replaced MySQL with MongoDB.

Clearly, however, not every MySQL application is suitable for a NoSQL database. Just because almost 80% of the MySQL survey respondents are adopting NoSQL database, does not mean they are replacing MySQL with NoSQL.

Like Facebook, many major NoSQL users also continue to use MySQL, including Twitter which back-tracked on a planned migration of its core status table to Apache Cassandra in 2010. It continues to use MySQL, but is adopting Cassandra for newer projects.

The adoption of multiple database products depending on the nature of the application is another of the six major drivers for NoSQL and NewSQL adoption highlighted by our research.

The theory of polyglot persistence has developed based on the fact that different data storage models have their own strengths and the acceptance that while the relational model is suitable for a large proportion of data storage requirements, there are times when a document, graph, or object database might be more suitable, or even a distributed file system.

Facebook and Twitter are prime examples of polyglot persistence in action, and the survey of MySQL developers shows that the practice is widespread. At the time of writing 205 people have responded to the survey, providing 421 responses.

If we exclude the 42 that indicate they are not using a NoSQL database, that means that the remaining 163 people are using 379 NoSQL databases, which equates to 2.33 databases per respondent, not including their existing use of MySQL or other traditional or NewSQL databases.

I’ll provide more details of the research report, including the other four adoption drivers, once the report is published. The report contains analysis of the drivers behind the development and adoption of NoSQL and NewSQL databases, as well as the evolving role of data grid technologies, as well as the associated use cases. It will be available soon for clients of our Information Management and CAOS practices.