The Data Day, Two days: February 7/8 2013

Teradata results. Funding for DataXu. The chemistry of data. And more.

And that’s the data day, today.

The Data Day, Today: Jan 24 2012

Thoughts on Splunk’s IPO and DynamoDB. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Thoughts on the Splunk IPO and S-1 By Dave Kellogg.

* Thoughts on SimpleDB, DynamoDB and Cassandra By Adrian Cockcroft.

* Recommind’s Revenue Leaps 95% in Record-Setting 2011 Predictable.

* Hewlett-Packard Expands to Cambridge via Vertica’s “Big Data” Center Moving.

* Announcing SkySQL Enterprise HA for the MariaDB & MySQL databases

* Membase Server is Now Couchbase Server But not *the* Couchbase Server.

* Cloudera Teams With O’Reilly Media to Merge Hadoop World and Strata Conferences

* Survey results: How businesses are adopting and dealing with data 100 Strata Online Conference attendees.

* Big data market survey: Hadoop solutions

* LinkedIn released SenseiDB, an open source distributed, realtime, semi-structured database.

* For 451 Research clients

# VMware: not your father’s database company Impact Report

# Sparsity Technologies draws up plans for graph database adoption Impact Report

# Amazon launches DynamoDB, an auto-configuring database as a service Market Development report

# NuoDB targets Q2 release for elastic relational database Market Development report

# ADVIZOR illuminates growth strategy, roadmap in data discovery and analysis Market Development report

# Birst adds own analytic engine for BI, OEM agreement with ParAccel Market Development report

* Google News Search outlier of the day: RentAGrandma.com Recruiting Wonderful Grandmas

And that’s the Data Day, today.

The Data Day, Today: Jan 19 2012

Amazon launches DynamoDB. Red Hat virtually supports JasperReports. And more.

* Amazon Web Services Launches Amazon DynamoDB See also blog posts from Werner Vogels and Jeff Barr, as well as reaction from DataStax and Basho.

* Jaspersoft Delivers Analytics for Red Hat Enterprise Virtualization Customers JasperReports Server is embedded in Red Hat Enterprise Virtualization 3.0.

* Tableau 7.0 Brings Simplicity to Business Intelligence Including new Data Server for data sharing and management.

* Hortonworks to Deliver Next-Generation of Apache Hadoop Pre-announcement (emphasis on the pre).

* RainStor Announces First Enterprise Database Running Natively on Hadoop as well as partnerships with Cloudera, Hortonworks, and MapR, and support from Composite Software.

* Talend Platform for Data Services Operationalizes Information and Data A common development, deployment and monitoring environment for both data management and application integration.

* Fujitsu Launches Cloud Services as a Platform for Big Data Data Utilization Platform Services.

* All you wanted to know about Hadoop, but were too afraid to ask A graphic illustration of the various versions of Apache Hadoop.

* Oracle Database or Hadoop? Another good post from Pythian’s Gwen Shapira. See also Aaron Cordova’s Do I need SQL or Hadoop?

* Meet Code 42, Accel’s first Big Data Fund investment GigaOM has the details.

* MapR CEO Sees Big Changes in Big Data in 2012 Predictive.

* Introducing DataFu: an open source collection of useful Apache Pig UDFs LinkedIn launches open source user-defined functions.

* Big Data Needs Data Scientists, Or Quants, Or Excel Jockeys … or something.

* Career of the Future: Data Scientist [INFOGRAPHIC] Infotaining.

* Knives out for Oracle. SAP and IBM offer some perspectives on Exalytics and Big Data Appliance respectively.

* For 451 Research clients

# Information Builders uses Infobright to take BI in-memory, expands SMB reach Market development report

# RainStor launches database complement to Apache Hadoop Market development report

# Heroku’s Postgres is poised for growing interest in database as a service Market development report

* Google News Search outlier of the day: This Spud’s For All of You: “2012 Is the Year of the Potato”

And that’s the Data Day, today.

Because 20+ data warehousing vendors is never enough

In our recent report on the data warehousing market we speculated that there would soon be a change in the number of vendors operating in what is a crowded market. We were anticipating that the number of vendors would go down, rather than up, but – in the short term at least – we have been proved wrong, as two new open source analytical databases emerged this week.

First came the formation of Dynamo Business Intelligence Corp (aka Dynamo BI), which will sponsor LucidDB and offer a new commercially supported distribution of it. Then came the launch of InfiniDB Community Edition, a new open source analytic database based on MySQL from Calpont.

We actually included Calpont in our report, but its product plans at that time looked precarious, to say the least, as the company found that its plans to launch a data warehousing platform based on MySQL were overshadowed by Oracle’s acquisition of Sun.

We were somewhat sceptical about whether Calpont – which has had a couple of false starts in the past – would find a way to bring something to market and we are impressed that the company has reached a licensing agreement with Sun that supports its open source and commercial aims.

Specifically the company has arranged an OEM agreement with Sun for the MySQL Community Server version that enables it to be used with both Calpont’s open source and commercially licensed products. The first of those is InfiniDB Community Edition, a column-oriented, multi-threaded data warehouse platform which acts as a storage engine for MySQL.

The GPLv2 Community Edition will be available only for deployment on a single server, without any formal support from Calpont, and is primarily aimed at raising interest among MySQL developers. A fully certified and supported commercial version will follow, although Calpont is reticent about providing details at the moment, other than that it will make use of Calpont’s massively parallel processing capabilities and modular architecture to scale out as well as up.

Calpont faces some competition in the MySQL segment from Kickfire and Infobright, particularly the latter given their similar open source software strategies (Kickfire is a MySQL appliance). Infobright has grown rapidly since going open source and now boasts more than 100 customers, although Calpont maintains that leaves plenty of opportunities amongst MySQL users.

We would agree with that, and also with the company’s claim to offer something different from Infobright technologically. Infobright also offers column-based storage but not massively parallel processing (although it is working on a shared-everything, peer-to-peer architecture). We should note that InfiniDB Community Edition is also restricted to a single server but this is the result of a strategic decision, rather than a technical limitation. The commercial version will be fully MPP.

We recently noted that LucidDB is another open source database that is often overlooked since the LucidDB code is not commercially supported.

Any concern over the future of LucidDB following the demise of LucidEra should be put to bed by the formation of Dynamo BI with the intention to provide a commercially supported distribution of LucidDB.

As LucidDB project lead John Sichi wrote:

“This is an offering which has been completely missing up until now, and which I and others such as Julian Hyde believe to be essential for accelerating adoption of LucidDB. LucidEra provided much of the critical development effort, but never offered commercial support on LucidDB since that was not part of its software-as-a-service business model. Eigenbase provides community infrastructure and development coordination, but a commercial offering is not part of its non-profit charter. So in the past, when individuals and companies have asked me whom they should talk to in order to purchase support for LucidDB, I have never had a good answer.”

Meanwhile Nicholas Goodman revealed that the company has acquired the commercial rights to LucidDB and plans to offer DynamoDB as a prepackaged, assembled distribution. It will also be fully open source and all new features will be contributed to LucidDB.

It is very early days for Dynamo BI, which doesn’t even have a website as yet, so it’s difficult to judge the company’s plans, but with some of the lead LucidDB developers involved and a solid starting project – “the best database no one ever told you about” – it has every chance. We’ll be looking to catch up with the company just as soon as it gets up and running.

The data warehousing sector is extremely crowded and we continue to believe that there will be a shakeout in the near future, but there are opportunities for companies that are able to differentiate themselves from the pack. Starting a data warehousing company is generally not something that we would recommend right now, but both Calpont and Dynamo BI have opportunities to establish themselves.