The Data Day, A few days: February 8-14 2014

Hortonworks and Red Hat expand Hadoop partnership. And more.

And that’s the data day, today.

The Data Day, A few days: October 26-November 1 2013

Cloudera launches Enterprise Data Hub. And more.

And that’s the data day, today.

The Data Day, A few days: October 12-18 2013

Apache Hadoop 2 goes GA. Teradata cuts guidance. And more.

And that’s the data day, today.

The Data Day, The week that was: October 22-26 2012

Cloudera launches Impala. Actuate snags Quiterian. Microsoft previews HDInsight.

And the rest:
– Microsoft previewed its Windows Azure HDInsight Service and Microsoft HDInsight Server for Windows.

– SAP launched a new “big data” bundle and go-to-market strategy.

– Informatica introduced Informatica PowerCenter Big Data Edition and reported its third quarter results.

– Also announcing financial results last week were QlikTech and Pervasive.

– Teradata updated its Unity suite with the addition of Unity Loader, and introduced its Unified Data Environment and the Unified Data Architecture.

– Splunk confirmed the release of Splunk Hadoop Connect and the Splunk App for HadoopOps.

– 10gen added five vice presidents to its management team.

– Rackspace partnered with Hortonworks to create OpenStack and Hadoop-based offerings for public and private cloud.

– Talend added support for Cassandra, HBase and MongoDB, and introduced big data profiling for Apache Hadoop to its integration platform.

– MarkLogic announced support for HDFS and expanded its relationship with Hortonworks.

– Kognitio adopted a free licensing model.

– Calpont launched InfiniDB 3.5.

– MetaMarkets announced that it is open sourcing its Druid streaming, real-time data store.

– YarcData updated its uRiKA Big Data appliance for graph analytics.

– Alpine Data Labs announced a global OEM partnership with QlikTech.

– Actian and Attunity announced Attunity Replicate for Actian Vectorwise.

And that’s the Data Day, today.

Forthcoming Webinar: Analytic Platforms in the Real World, with Calpont

On Wednesday July 18 at 10am PT I’ll be taking part in a webinar in association with Calpont discussing the rise of analytic platforms. Here are the details:

The data management landscape is changing rapidly as users adopt new data management technologies and data-analysis approaches to cope with and exploit the increasing volume, variety and velocity of ‘big data.’ While traditional data warehouses are ideal for predictable queries against structured data, and Hadoop and its peers are optimized for new and unexpected queries against unstructured data, a third class of technologies has emerged that aims to split the difference.

Join 451 Group Research Manager Matt Aslett and Calpont VP of Engineering Bob Wilkinson for this 45-minute webinar to learn about:

  • How the analytic platform emerged and its place in the data management ecosystem
  • Trends in analytic platforms and what to look for when considering one
  • Real world use cases of InfiniDB, Calpont’s analytic platform, in telecommunications and online advertising

You can register for the event here.

The Data Day, Today: Mar 2 2012

Hortonworks partners with Talend. Teradata and Greenplum updates. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Talend Empowers Apache Hadoop Community with Talend Open Studio for Big Data

* Hortonworks Announces Strategic Partnership With Talend to Bring World’s Most Popular Open Source Data Integration Platform to Apache Community. Talend Open Studio for Big Data will be bundled as part of Hortonworks Data Platform.

* Teradata Transforms Global Database Technology

* New EMC Greenplum Database Enhancements Boost Big Data Analytics

* Cisco’s servers now tuned for Hadoop

* Amplidata Closes $8M Funding Round with Big Bang Ventures, Endeavour Vision, Intel Capital and Swisscom

* Got Big Data? Jaspersoft CEO Brian Gentile outlines three approaches to connecting to ‘big data’ for business intelligence reporting and analysis.

* Cray’s YarcData Division Launches New Big Data Graph Appliance

* Introducing Spring Hadoop. Developing applications for Hadoop technologies based on Spring technologies.

* MarkLogic and Hortonworks Partner to Enhance Real-Time Big Data Applications with Apache Hadoop

* Continuent and SkySQL Join Forces to Better Serve the Global MySQL Community

* Data Entrepreneurship

* For 451 Research clients

# Anaplan bags $11.4m in VC, looks beyond budgeting and planning to business operations Impact Report

# XtremeData seeks to differentiate analytic database for extreme data workloads Impact Report

# Calpont adds parallel loading to columnar database for online analytics Market Development Report

# MarkLogic formalizes Hadoop support with Hortonworks partnership Analyst note

And that’s the Data Day, today.

Because 20+ data warehousing vendors is never enough

In our recent report on the data warehousing market we speculated that there would soon be a change in the number of vendors operating in what is a crowded market. We were anticipating that the number of vendors would go down, rather than up, but – in the short term at least – we have been proved wrong, as two new open source analytical databases emerged this week.

First came the formation of Dynamo Business Intelligence Corp (aka Dynamo BI), which will sponsor LucidDB and offer a new commercially supported distribution of it. Then came the launch of InfiniDB Community Edition, a new open source analytic database based on MySQL from Calpont.

We actually included Calpont in our report, but its product plans at that time looked precarious, to say the least, as the company found that its plans to launch a data warehousing platform based on MySQL were overshadowed by Oracle’s acquisition of Sun.

We were somewhat sceptical about whether Calpont – which has had a couple of false starts in the past – would find a way to bring something to market and we are impressed that the company has reached a licensing agreement with Sun that supports its open source and commercial aims.

Specifically the company has arranged an OEM agreement with Sun for the MySQL Community Server version that enables it to be used with both Calpont’s open source and commercially licensed products. The first of those is InfiniDB Community Edition, a column-oriented, multi-threaded data warehouse platform which acts as a storage engine for MySQL.

The GPLv2 Community Edition will only be available for deployment on a single server, without any formal support from Calpont, and is primarily aimed at raising interest among MySQL developers. A fully certified and supported commercial version will follow, although Calpont is reticent about providing details on that at the moment other than that it will make use of Calpont’s massively parallel processing capabilities and modular architecture to scale out as well as up.

Calpont faces some competition in the MySQL segment from Kickfire and Infobright, particularly the latter given their similar open source software strategies (Kickfire is a MySQL appliance). Infobright has grown rapidly since going open source and now boasts more than 100 customers, although Calpont maintains that leaves plenty of opportunities amongst MySQL users.

We would agree with that, and also with the company’s claim to offer something different from Infobright technologically. Infobright also offers column-based storage but not massively parallel processing (although it is working on a shared-everything, peer-to-peer architecture). We should note that InfiniDB Community Edition is also restricted to a single server but this is the result of a strategic decision, rather than a technical limitation. The commercial version will be fully MPP.

We recently noted that LucidDB is another open source database that is often overlooked since the LucidDB code is not commercially supported.

Any concern over the future of LucidDB following the demise of LucidEra should be put to bed by the formation of Dynamo BI with the intention to provide a commercially supported distribution of LucidDB.

As LucidDB project lead John Sichi wrote:

“This is an offering which has been completely missing up until now, and which I and others such as Julian Hyde believe to be essential for accelerating adoption of LucidDB. LucidEra provided much of the critical development effort, but never offered commercial support on LucidDB since that was not part of its software-as-a-service business model. Eigenbase provides community infrastructure and development coordination, but a commercial offering is not part of its non-profit charter. So in the past, when individuals and companies have asked me whom they should talk to in order to purchase support for LucidDB, I have never had a good answer.”

Meanwhile Nicholas Goodman revealed that the company has acquired the commercial rights to LucidDB and plans to offer DynamoDB as a prepackaged, assembled distribution. It will also be fully open source and all new features will be contributed to LucidDB.

It is very early days for Dynamo BI, which doesn’t even have a website as yet, so it’s difficult to judge the company’s plans, but with some of the lead LucidDB developers involved and a solid starting project – “the best database no one ever told you about” – it has every chance. We’ll be looking to catch up with the company just as soon as it gets up and running.

The data warehousing sector is extremely crowded and we continue to believe that there will be a shakeout in the near future, but there are opportunities for companies that are able to differentiate themselves from the pack. Starting a data warehousing company is generally not something that we would recommend right now, but both Calpont and Dynamo BI have opportunities to establish themselves.