The Data Day: April 21, 2017

We’re going to use American data, we’re going to use American analytics, we are going to come first in all deals.

And that’s the data day, today.

The Data Day, A few days: May 21-27, 2016

Updated 451 data platforms and analytics market and vendor estimates. And more.

And that’s the data day, today.

The Data Day, A few days: February 20-26, 2016

Salesforce acquires PredictionIO. And more.

And that’s the data day, today.

The Data Day, A few days: February 15-21 2014

Informatica eyes $1bn in sales. And more.

And that’s the data day, today.

The Data Day, A few days: October 26-November 1 2013

Cloudera launches Enterprise Data Hub. And more.

And that’s the data day, today.

The Data Day, Two days: September 21/24 2012

Alpine Data bags EMC. Infobright delivers appliance. And more.

And that’s the Data Day, today.

The Data Day, Today: May 18 2012

SAP expands HANA. Informatica embraces big data. Gary Bloom joins MarkLogic. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* For 451 Research clients

# Informatica 9.5: ‘big data’ runs through the integration platform makeover Impact Report

# Lucid Imagination launches search-based ‘big data’ platform Impact Report

# Datameer updates Hadoop-based BI stack with an eye to more complex analysis Impact Report

# MarkLogic searches for operational analytics role with plans for SQL, MapReduce support Impact Report

# Infobright shines following shift to machine-generated data Impact Report

# Starcounter focuses on performance with in-memory database update Impact Report

# Guavus bears fruit with data-processing platform for communications operators Impact Report

# InsightSquared bags $4.5m series A funding and salesforce.com as an investor Impact Report

# MarkLogic names veteran exec Gary Bloom as new president and CEO Analyst note

* SAP Continues to Expand Capabilities and Scale of SAP HANA Platform and Ease Developer Adoption

* SAP HANA Offers Multi-Node Capabilities to Help Customers Scale Out

* Gary Bloom Joins MarkLogic as Chief Executive Officer

* Amazon RDS for SQL Server and .NET support for AWS Elastic Beanstalk

* Informatica 9.5 Unleashes the Power of Hadoop

* Informatica Brings Master Data Management to Big Data, Social, Cloud and Mobile Computing

* Talend Announces New Release of Enterprise Open Source Integration Platform

* Lucid Imagination Combines Search, Analytics and Big Data to Tackle the Problem of Dark Data

* Big Data Refinery Fuels Next-Generation Data Architecture

* 7 Key Drivers for the Big Data Market

* Google puts a price tag on Cloud SQL services

* Actuate and Hortonworks Collaborate to Visualize Big Data

* Hadapt and Cloudera Deliver Big Data Analytics with Apache Hadoop

* Cloudera Partners With Hadoop Managed Services Provider MetaScale to Help Large Traditional Enterprises Adopt Apache Hadoop

* Opera Solutions’ Big Analytics Tailor Made for SAP HANA: Signal Hub Technology

* Cloudant to Contribute Big Data Capabilities to Apache CouchDB Project

* Hortonworks and Kognitio Announce Technical Partnership

* Starcounter Unveils World’s Fastest Consistent Database

* XAP 9.0 – Geared for Real-Time Big Data Stream Processing

* How long before R overtakes SAS and SPSS?

* Betting big on live sports data, Perform lays €120 million on RunningBall

And that’s the Data Day, today.

Because 20+ data warehousing vendors is never enough

In our recent report on the data warehousing market we speculated that there would soon be a change in the number of vendors operating in what is a crowded market. We were anticipating that the number of vendors would go down, rather than up, but – in the short term at least – we have been proved wrong, as two new open source analytical databases emerged this week.

First came the formation of Dynamo Business Intelligence Corp (aka Dynamo BI), which will offer a new commercially supported distribution of LucidDB and act as a sponsor of the project. Then came the launch of InfiniDB Community Edition, a new open source analytic database based on MySQL from Calpont.

We actually included Calpont in our report, but its product plans at that time looked precarious, to say the least, as the company found that its plans to launch a data warehousing platform based on MySQL were overshadowed by Oracle’s acquisition of Sun.

We were somewhat sceptical about whether Calpont – which has had a couple of false starts in the past – would find a way to bring something to market and we are impressed that the company has reached a licensing agreement with Sun that supports its open source and commercial aims.

Specifically the company has arranged an OEM agreement with Sun for the MySQL Community Server version that enables it to be used with both Calpont’s open source and commercially licensed products. The first of those is InfiniDB Community Edition, a column-oriented, multi-threaded data warehouse platform which acts as a storage engine for MySQL.

The GPLv2 Community Edition will only be available for deployment on a single server, without any formal support from Calpont, and is primarily aimed at raising interest among MySQL developers. A fully certified and supported commercial version will follow, although Calpont is reticent about providing details on that at the moment, other than that it will make use of Calpont’s massively parallel processing capabilities and modular architecture to scale out as well as up.

Calpont faces some competition in the MySQL segment from Kickfire and Infobright, particularly the latter given their similar open source software strategies (Kickfire is a MySQL appliance). Infobright has grown rapidly since going open source and now boasts more than 100 customers, although Calpont maintains that leaves plenty of opportunities amongst MySQL users.

We would agree with that, and also with the company’s claim to offer something different from Infobright technologically. Infobright also offers column-based storage but not massively parallel processing (although it is working on a shared-everything, peer-to-peer architecture). We should note that InfiniDB Community Edition is also restricted to a single server but this is the result of a strategic decision, rather than a technical limitation. The commercial version will be fully MPP.

We recently noted that LucidDB is another open source database that is often overlooked since the LucidDB code is not commercially supported.

Any concern over the future of LucidDB following the demise of LucidEra should be put to bed by the formation of Dynamo BI with the intention to provide a commercially supported distribution of LucidDB.

As LucidDB project lead John Sichi wrote:

“This is an offering which has been completely missing up until now, and which I and others such as Julian Hyde believe to be essential for accelerating adoption of LucidDB. LucidEra provided much of the critical development effort, but never offered commercial support on LucidDB since that was not part of its software-as-a-service business model. Eigenbase provides community infrastructure and development coordination, but a commercial offering is not part of its non-profit charter. So in the past, when individuals and companies have asked me whom they should talk to in order to purchase support for LucidDB, I have never had a good answer.”

Meanwhile Nicholas Goodman revealed that the company has acquired the commercial rights to LucidDB and plans to offer DynamoDB as a prepackaged, assembled distribution. It will also be fully open source and all new features will be contributed to LucidDB.

It is very early days for Dynamo BI, which doesn’t even have a website as yet, so it’s difficult to judge the company’s plans, but with some of the lead LucidDB developers involved and a solid starting project – “the best database no one ever told you about” – it has every chance. We’ll be looking to catch up with the company just as soon as it gets up and running.

The data warehousing sector is extremely crowded and we continue to believe that there will be a shakeout in the near future, but there are opportunities for companies that are able to differentiate themselves from the pack. Starting a data warehousing company is generally not something that we would recommend right now, but both Calpont and Dynamo BI have opportunities to establish themselves.