The Data Day, A few days: February 13-19, 2016

ASF launches Apache Arrow. And more

And that’s the data day, today.

The Data Day, A few days: June 10-16 2014

Oracle launches Oracle Database In-Memory. And more

And that’s the data day, today.

The Data Day, A few days: April 26-May 2 2014

TIBCO acquires Jaspersoft. And more

And that’s the data day, today.

The Data Day, A few days: March 22-April 4 2014

Cloudera raises $900m. Pivotal launches Big Data Suite. And more.

And that’s the data day, today.

The Data Day, A few days: November 2-8 2013

Garantia Data almost becomes RedisDB, raises $9m. And more

And that’s the data day, today.

The Data Day, A few days: June 11-25 2013

A bumper round-up of the past 14 days’ data-related news

* Cisco announced its intention to acquire Composite Software.

* Software AG acquired Apama.

* TIBCO Software acquired StreamBase Systems.

* Cloudera appointed Tom Reilly as Chief Executive Officer and Mike Olson as Chief Strategy Officer and Chairman of the Board.

* Sears Holdings named Jeff Balagna Chief Executive Officer of MetaScale.

* Ex-Yahoo CTO launched Altiscale, hardcore Hadoop as a service.

* SpaceCurve raised a $10M Series B round of financing.

* Sqrrl announced general availability of Sqrrl Enterprise.

* GE launched Predictivity services, supported by Proficy Historian HD.

* Datameer announced Datameer 3.0.

* Oracle announced the general availability of MySQL Cluster 7.3.

* MemSQL announced the upcoming availability of MemSQL 2.1.

* Continuuity announced the release of Weave, a new open source project that enables Java developers to rapidly build scalable, distributed applications on YARN.

* RainStor adds security, text search features to database complement for Hadoop.

* Composite Software introduced version 6.2 SP3 of its Composite Data Virtualization Platform.

* TokuDB launched TokuMX.

* Terracotta announced the immediate availability of Terracotta Universal Messaging.

* HP united its data management assets under HAVEn brand.

* Hortonworks and Red Hat announced an engineering collaboration around Hadoop.

* Rackspace Hosting’s ObjectRocket Database as a Service entered into a strategic agreement with 10gen.

* Simon Phipps posted State Of The Sea Lion – June 2013.

* Netflix announced that its Genie Hadoop-aaS management software is now open source.

* Storm-YARN released as open source.

* Big Data arrived at the Oxford English Dictionary.

And that’s the data day, today.

The Data Day, A few days: March 20-22 2013

MongoDB goes Enterprise. Riak CS goes open source. And more.

And that’s the data day, today.

The Data Day, Two days: November 12/13 2012

Platfora raises $20m. IBM trumpets ‘integration anywhere’. And more

And that’s the Data Day, today.

The Data Day, Three days: August 15/16/17 2012

Symantec teams CFS with Hadoop. Informatica Cloud. And more

And that’s the Data Day, today.

What’s ‘big’ got to do with it?

Jaspersoft has released the results of its latest Big Data Survey and was good enough to share with us a few additional details. It makes for interesting reading.

The first thing to take into account is the sample bias. The survey was conducted with over 600 Jaspersoft community members. 63% of respondents are application developers, and 37% are in the software and Internet industry.

This already speaks volumes about the sectors with interest in big data, and it is interesting to compare the state of big data adoption with the recent results of 451 Research’s TheInfoPro storage study, which is conducted with storage professionals.

According to that study, 24% of storage respondents had already implemented solutions for big data, while 56% had no plans. As you might expect, Jaspersoft’s sample was keener, with 36% having a project already deployed or in development, and 38% having no plans.

That still leaves a sizeable proportion of respondents with no plans to adopt a big data analytics project, however. The biggest reasons not to adopt are a reliance on structured data (37%) and no clear understanding of what ‘big data’ is (35%).

Sceptics might suggest that the respondents to Jaspersoft’s survey that do have plans for big data are also somewhat confused about what constitutes a big data project.

Certainly they are using some fairly traditional technologies and approaches. Looking at the most popular answers to a range of questions we find that those with big data plans are:

  • creating reports (76%)
  • to analyze customer experience (48%)
  • based on data from enterprise applications (79%)
  • stored on relational databases (60%)
  • processed using ETL (59%)
  • running on-premises (60%)

So far, so what. The characteristics above could be used to describe many existing business intelligence projects.

It’s not even as if respondents are looking at huge volumes of data, with 38% expecting total data volume to be in the gigabytes, 40% expecting terabytes, and just 10% expecting petabytes and above.

So what makes these big data projects? It’s not until you look at the source of the data that you get any sense that the respondents with ongoing big data projects are doing anything different from those without: 68% are using machine-generated content (web logs, sensor data) as a source for their big data projects, and 46% are using human-generated text (social media, blogs).

The results do suggest that some non-traditional analytics and data processing approaches are gaining ground, with 64% citing the importance of data visualization, 54% statistical/predictive analytics, 50% search, and 45% text analytics. However, just 18% are using Hadoop HDFS at this point (behind MongoDB with 19%).