Two data management webinars this week

In addition to the 451 Group’s own data warehousing webinar on Thursday, I will also be taking part in a webinar on Wednesday with EnterpriseDB on the subject of open source database adoption in the enterprise.

During the webinar we will provide recommendations for how organizations can effectively leverage open source software. Attendees will learn about open source software trends for 2010, top considerations when using open source databases, and best practices for successful deployments of open source software.

I’ll be providing some data points from our recent surveys on database adoption and open source adoption, while EnterpriseDB’s Larry Alston will showcase successful enterprise deployments of Postgres Plus.

The open source database webinar is Wednesday, December 16, at 1 pm ET. To register, visit this link.

The data warehousing webinar is Thursday, December 17, at 1 pm ET. To register, visit this link.

Because 20+ data warehousing vendors is never enough

In our recent report on the data warehousing market we speculated that there would soon be a change in the number of vendors operating in what is a crowded market. We were anticipating that the number of vendors would go down, rather than up, but – in the short term at least – we have been proved wrong, as two new open source analytical databases emerged this week.

First came the formation of Dynamo Business Intelligence Corp (aka Dynamo BI), which will sponsor LucidDB and offer a new commercially supported distribution of it. Then came the launch of InfiniDB Community Edition, a new open source analytic database based on MySQL from Calpont.

We actually included Calpont in our report, but its product plans at that time looked precarious, to say the least, as the company found that its plans to launch a data warehousing platform based on MySQL were overshadowed by Oracle’s acquisition of Sun.

We were somewhat sceptical about whether Calpont – which has had a couple of false starts in the past – would find a way to bring something to market and we are impressed that the company has reached a licensing agreement with Sun that supports its open source and commercial aims.

Specifically the company has arranged an OEM agreement with Sun for the MySQL Community Server version that enables it to be used with both Calpont’s open source and commercially licensed products. The first of those is InfiniDB Community Edition, a column-oriented, multi-threaded data warehouse platform which acts as a storage engine for MySQL.

The GPLv2 Community Edition will only be available for deployment on a single server, without any formal support from Calpont, and is primarily aimed at raising interest among MySQL developers. A fully certified and supported commercial version will follow, although Calpont is reticent about providing details at the moment, other than that it will make use of Calpont’s massively parallel processing capabilities and modular architecture to scale out as well as up.

Calpont faces some competition in the MySQL segment from Kickfire and Infobright, particularly the latter given their similar open source software strategies (Kickfire is a MySQL appliance). Infobright has grown rapidly since going open source and now boasts more than 100 customers, although Calpont maintains that leaves plenty of opportunities amongst MySQL users.

We would agree with that, and also with the company’s claim to offer something different from Infobright technologically. Infobright also offers column-based storage but not massively parallel processing (although it is working on a shared-everything, peer-to-peer architecture). We should note that InfiniDB Community Edition is also restricted to a single server but this is the result of a strategic decision, rather than a technical limitation. The commercial version will be fully MPP.

We recently noted that LucidDB is another open source database that is often overlooked since the LucidDB code is not commercially supported.

Any concern over the future of LucidDB following the demise of LucidEra should be put to bed by the formation of Dynamo BI with the intention to provide a commercially supported distribution of LucidDB.

As LucidDB project lead John Sichi wrote:

“This is an offering which has been completely missing up until now, and which I and others such as Julian Hyde believe to be essential for accelerating adoption of LucidDB. LucidEra provided much of the critical development effort, but never offered commercial support on LucidDB since that was not part of its software-as-a-service business model. Eigenbase provides community infrastructure and development coordination, but a commercial offering is not part of its non-profit charter. So in the past, when individuals and companies have asked me whom they should talk to in order to purchase support for LucidDB, I have never had a good answer.”

Meanwhile, Nicholas Goodman revealed that Dynamo BI has acquired the commercial rights to LucidDB and plans to offer DynamoDB as a prepackaged, assembled distribution. The distribution will also be fully open source, and all new features will be contributed to LucidDB.

It is very early days for Dynamo BI, which doesn’t even have a website as yet, so it’s difficult to judge the company’s plans, but with some of the lead LucidDB developers involved and a solid starting project – “the best database no one ever told you about” – it has every chance. We’ll be looking to catch up with the company just as soon as it gets up and running.

The data warehousing sector is extremely crowded and we continue to believe that there will be a shakeout in the near future, but there are opportunities for companies that are able to differentiate themselves from the pack. Starting a data warehousing company is generally not something that we would recommend right now, but both Calpont and Dynamo BI have opportunities to establish themselves.

Ten considerations for choosing/building a data warehouse

There is healthy competition in data warehousing, with more than 20 vendors competing for the attention of would-be customers with a variety of technologies, architectures and implementation methodologies.

With choice comes potential confusion, since users have to identify and compare different products and features, as well as vendor viability, to ensure they are investing their IT budgets wisely – especially in the current economic climate.

Our latest special report, Warehouse Optimization – Ten considerations for choosing/building a data warehouse, is designed to help reduce that confusion and is now available for existing 451 Group clients to download and non-clients to purchase. An executive summary is also available.

The report provides an overview of the data-warehousing vendor landscape, as tracked by The 451 Group, and examines the business and technology trends driving this market. It identifies 10 key technology trends in data warehousing and assesses how they can be used to choose the technologies and vendors that are best suited to a would-be customer and its specific application.

The report is not designed to make recommendations on particular vendors or technologies, but to provide an independent overview of the sector, which could be used by customers as part of a vendor-evaluation process. The report also examines the potential for consolidation and identifies some potential merger and acquisition drivers, as well as providing profiles of the data-warehousing vendors being tracked by The 451 Group as part of its ongoing coverage of this sector.

Look out also for a forthcoming webinar in which we will present the key findings and implications. We’ll keep you posted on the details.

Ingres launches project for in-memory, columnar, vectorized database engine

Interesting news from Ingres today that it is teaming up with VectorWise, a database engine spin-off from Amsterdam’s Centrum Wiskunde & Informatica (CWI) scientific research establishment, to collaborate on a new database kernel project.

The Ingres VectorWise project will create a new open source storage engine for the Ingres Database that will better enable it to be positioned as a platform for data warehouse and analytic workloads, although Ingres does not have detailed plans for the productization of the technology at this stage. The starting point for the project is the theory that modern multi-core parallel processors now look like, and behave like, symmetric multiprocessing (SMP) servers, and that on-chip memory is taking the place of RAM, but that database software has not been updated to take advantage of these processor developments.

In order to do so, Ingres and VectorWise will be collaborating on vectorized execution, which sees multiple instructions processed simultaneously, and in-cache processing, through which the execution occurs within the CPU cache and main memory is effectively treated like disk. The result, according to Ingres, is to reduce the I/O bottleneck for query processing. Additionally, the VectorWise engine includes a compressed column store and enables on-the-fly decompression and operation handling in memory.
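For readers who want a feel for what vector-at-a-time processing over a compressed column might look like, here is a minimal Python sketch. It is not the Ingres or VectorWise implementation, and every name in it (rle_decode, vectors, sum_where_greater, VECTOR_SIZE) is hypothetical. It also assumes run-length encoding as the compression scheme, which is purely an illustrative choice. Pure Python cannot demonstrate CPU cache behaviour, so the sketch only shows the shape of the idea: the column is decompressed on the fly and handed to the operator in small batches rather than one value at a time or all at once.

```python
from itertools import islice

# Hypothetical batch size; a real engine would size vectors to fit the CPU cache.
VECTOR_SIZE = 1024


def rle_decode(runs):
    """Lazily expand (value, count) runs, so the full column is never materialized."""
    for value, count in runs:
        for _ in range(count):
            yield value


def vectors(values, size=VECTOR_SIZE):
    """Group a stream of values into lists (vectors) of at most `size` items."""
    it = iter(values)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def sum_where_greater(runs, threshold, size=VECTOR_SIZE):
    """Sum the column values above `threshold`, one vector at a time."""
    total = 0
    for batch in vectors(rle_decode(runs), size):
        # Each operator step handles a whole vector, not a single row.
        selected = [v for v in batch if v > threshold]
        total += sum(selected)
    return total


# A compressed column holding 5, 5, 5, 10, 10, 1, 1, 1, 1
column = [(5, 3), (10, 2), (1, 4)]
print(sum_where_greater(column, threshold=4, size=4))  # prints 35
```

The small vector size in the usage line is only there to force several batches; the batch size is the knob that, in a real engine, would be tuned so that each vector stays within the CPU cache.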

It is claimed that the Ingres VectorWise project will deliver 10x performance increases over the current Ingres database.

VectorWise spun off from CWI in 2008 to commercialize the X100 system previously created by its database architecture research group. Development of X100, now also known as VectorWise, has been led by respected research scientists Peter Boncz and Marcin Zukowski.

Ingres maintains that by working with the CWI research scientists it has proven that their theories are technically feasible in a commercial product. Bringing such a commercial product to general availability is the next step, and history has proven that can be easier said than done. With that caveat we are impressed with the vision and ambition that Ingres is demonstrating.