Entries from August 2010 ↓

Sizing the data warehousing opportunity

The data warehousing market will see a compound annual growth rate of 11.5% from 2009 through 2013 to reach a total of $13.2bn in revenues.

That is the main finding highlighted by the latest report from The 451 Group’s Information Management practice, which provides market-sizing information for the data-warehousing sector from 2009 to 2013.

The report includes revenue estimates and growth projections, and examines the business and technology trends driving the market.

It was put together with the assistance of Market Monitor – the new market-sizing service from The 451 Group and Tier1 Research. Props to Greg Zwakman and Elizabeth Nelson for their number-crunching.
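As a rough sanity check on the headline figures above (an 11.5% CAGR from 2009 through 2013, reaching $13.2bn), the compound-growth arithmetic can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes four annual compounding periods between 2009 and 2013. The implied 2009 base is back-calculated here and is not a figure taken from the report.

```python
def implied_base(final_value: float, cagr: float, periods: int) -> float:
    """Back out the starting value implied by a final value and a CAGR."""
    return final_value / (1.0 + cagr) ** periods


def project(base: float, cagr: float, periods: int) -> float:
    """Compound a starting value forward at a constant annual rate."""
    return base * (1.0 + cagr) ** periods


# Figures quoted in the post: $13.2bn in 2013 at an 11.5% CAGR from 2009.
# Assumption: 2009 -> 2013 is treated as four compounding periods.
REVENUE_2013_BN = 13.2
CAGR = 0.115
PERIODS = 4

base_2009_bn = implied_base(REVENUE_2013_BN, CAGR, PERIODS)

if __name__ == "__main__":
    print(f"Implied 2009 base: ${base_2009_bn:.2f}bn")
    for year in range(2009, 2014):
        # Constant-rate path; actual year-by-year figures may differ.
        print(year, f"${project(base_2009_bn, CAGR, year - 2009):.2f}bn")
```

The constant-rate path is only a smoothing device: a CAGR says nothing about how growth is distributed across individual years.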

Among the key findings, available via the executive summary (PDF), are:

  • Four vendors dominate the data-warehouse market, with 93.6% of total revenue in 2010. These vendors are expected to retain their advantage and generate 92.2% of revenue in 2013.
  • Analytic databases are now able to take advantage of greater processor performance at a lower cost, improving price/performance and lowering barriers to entry.
  • With the application of cloud capabilities, users now have the promise of pools of enterprise data that marry central management with distributed use and control.
  • Products that take advantage of improved hardware performance will drive revenue growth for all vendors, and will protect the market share of incumbents.
  • As a result of systems performance improvements, data-warehousing vendors are also taking advantage of the opportunity to bring more advanced analytic capabilities to the DB engine.
  • Although we expect many smaller vendors to grow at a much faster rate between now and 2013, it will not be at the expense of the market’s dominant vendors.
  • While the Hadoop Core is not a direct alternative to traditional analytic DBs, the increased maturity of associated projects means that use cases for Hadoop- and MapReduce-enabled analytic DBs will overlap.
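The first and sixth bullets can be read together: a dominant group can lose a little share and still add revenue if the overall market grows. A back-of-the-envelope sketch in Python follows. It assumes the market grows at a constant 11.5% each year, so the 2010 total is an estimate derived from the quoted figures and not a number from the report, and all names are illustrative.

```python
# Figures quoted in the post: 11.5% CAGR, $13.2bn total in 2013,
# top-four share of 93.6% in 2010 and 92.2% in 2013.
CAGR = 0.115
TOTAL_2013_BN = 13.2
TOP4_SHARE_2010 = 0.936
TOP4_SHARE_2013 = 0.922

# Assumption: constant annual growth, so 2010 sits three periods before 2013.
total_2010_bn = TOTAL_2013_BN / (1.0 + CAGR) ** 3


def annual_growth(start: float, end: float, periods: int) -> float:
    """Constant annual rate that takes `start` to `end` over `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


# Split each year's total into the top four vendors and everyone else.
top4_2010_bn = TOP4_SHARE_2010 * total_2010_bn
top4_2013_bn = TOP4_SHARE_2013 * TOTAL_2013_BN
rest_2010_bn = total_2010_bn - top4_2010_bn
rest_2013_bn = TOTAL_2013_BN - top4_2013_bn

top4_growth = annual_growth(top4_2010_bn, top4_2013_bn, 3)
rest_growth = annual_growth(rest_2010_bn, rest_2013_bn, 3)

if __name__ == "__main__":
    print(f"Top four: ${top4_2010_bn:.2f}bn -> ${top4_2013_bn:.2f}bn "
          f"({top4_growth:.1%} a year)")
    print(f"Others:   ${rest_2010_bn:.2f}bn -> ${rest_2013_bn:.2f}bn "
          f"({rest_growth:.1%} a year)")
```

Under these assumptions the smaller vendors grow faster in percentage terms while the top four still add more revenue in absolute terms, which is consistent with the findings listed above.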

There is, of course, much more detail in the full report. 451 Group clients can download the report here, while non-clients can also use the same link to purchase the report, or request more information.

Scalable SQL: more than the mullet of the database world?

In the first part of our coverage on emerging database products and vendors we examined the new NoSQL databases and suggested that the incumbent database vendors would likely respond to the growing threat with a mix of in-memory and distributed caching technologies.

That has yet to happen, although it has only been a few months and the NoSQL databases have generated more noise than revenue at this stage. In the meantime, a new set of database vendors and products has emerged that could pose a more direct threat to the database incumbents while thwarting the potential of the NoSQL upstarts.

For want of a better phrase we have taken to referring to these products collectively as scalable SQL databases, and have just published a new spotlight report pulling together our various reports on the runners and riders.

Some of the vendors aim to deliver the scalability and flexibility promised by NoSQL while retaining support for SQL queries and/or ACID (atomicity, consistency, isolation, durability) transactions. That is no small boast, and it will be tough to offer the best of both worlds.

“SQL For Business, NoSQL For Partay!” is the explanation offered by MulletDB, a project that promises scalability and SQL queries. The danger is that scalable SQL ends up being the database equivalent of the celebrated mullet hairstyle or its business attire equivalent: the jacket and jeans.

One of the companies trying to avoid that problem is GenieDB (coverage). The London-based company’s GenieDB Engine is a fully replicated distributed database that combines a key-value store with a ‘sharded’ memcached layer. Another example is Clustrix, which was founded in December 2006 to develop a new database appliance that would offer both scalability and durability in a single product.

Meanwhile VoltDB emerged earlier this summer with a transactional database management system that is designed to scale across clusters of industry-standard servers while retaining transactional integrity.

Additionally, Xeround has recently confirmed its intention to reposition its Intelligent Data Grid (IDG) technology as Xeround Data Service, a scalable SQL database with support for ACID-compliant transactional capabilities for cloud computing environments. New Technology/enterprise’s CloudTran is designed to bring enterprise-level transaction management to GigaSpaces’ XAP in-memory data grid for on-premises deployment, and eventually any PaaS offering.

Meanwhile we are intrigued by VMware’s acquisition of distributed data management vendor GemStone and its positioning of GemFire as a next-generation data management layer for cloud applications, as well as the forthcoming introduction of SQL querying in GigaSpaces’ eXtreme Application Platform (XAP), which will enable in-memory management of relational data and initiatives.

It is very early stages for all these vendors, and they have yet to prove that they have truly solved the problem of consistency and partition tolerance. In the meantime there are plenty of other contenders waiting in line.

Akiban is promising that it has the secret to SQL scalability with an approach that pre-groups data in order to overcome latency, caching and data distribution issues. Another company currently in stealth mode is JustOne Database, which is working on perfecting a new storage model in order to deliver the performance and scalability required to support transactions and analytics on the same data simultaneously.

That is also the goal of Tokutek, which offers the TokuDB MySQL storage engine, based on Fractal Tree indexing technology designed to reduce data-insertion times and improve the performance of MySQL for both read and write applications.

JustOne and Tokutek are part of a slightly different set of vendors we are viewing under the scalable SQL umbrella: those that promise to improve performance for appropriate workloads to the extent that the advanced scale-out capabilities promised by some NoSQL databases become irrelevant.

While we’re on the subject of existing database vendors that could be considered part of the scalable SQL set, it is also worth mentioning MarkLogic. The company has recently been associating itself with NoSQL, and while the fact that it does not support SQL makes it a better literal fit with NoSQL, the company’s support for ACID means that we would see it as an option for customers looking to improve performance without losing consistency, especially for unstructured or semi-structured data.*

As we previously noted, the rise of NoSQL has to some degree resulted from the inability of the MySQL database to scale consistently. It is therefore no surprise to see many of the scalable SQL vendors promising to improve the performance and scalability of MySQL, while others promote a clean-slate approach to address new big data management problems.

We have more details on each of the products and projects mentioned above (as well as some not mentioned), their potential use cases, how they relate to MySQL, and what potential impact they may have on the adoption of NoSQL technologies, in the full report.

This is very much the start of our coverage of these vendors, however. Expect more coverage in the near future, as well as a wider perspective on the potential for alternatives to the incumbent database suppliers, into 2011.

*Additionally, since the absence of SQL is only really tangential to many of the projects and products referred to as NoSQL, it seems to me to be appropriate to have a database that does not support SQL in the scalable SQL category.