The Data Day, Today: Feb 29 2012

Microsoft and Hortonworks expand Hadoop partnership. Oracle ships Exalytics. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Hortonworks to Bring Apache Hadoop to Millions of New Users Hortonworks and Microsoft expanded their relationship around Apache Hadoop.

See also:
# Big Data for Everyone: Using Microsoft’s Familiar BI Tools with Hadoop
# Microsoft’s Hadoop roadmap reveals new big data deliverables
# Karmasphere Expands Big Data Analytics on Hadoop in the Enterprise
# Datameer to Bring Hadoop Analytics to Windows Azure
# HStreaming Brings Real-Time Analytics to Microsoft’s Hadoop-based Services for Windows Server and Windows Azure

* Oracle Announces Availability of Oracle Exalytics In-Memory Machine

* Fujitsu Releases “Interstage Big Data Parallel Processing Server V1.0” to Help Enterprises Utilize Big Data

* Pentaho and DataStax announce strategic partnership delivering the first complete Apache Cassandra-based big data analytics solution to the market

* Cloudant Names Andy Palmer to its Board of Directors

* R integrated throughout the enterprise analytics stack

* Jaspersoft Announces Big Data Index to Track Demand for Big Data Analytics

* 1010data Enables Companies to Rapidly Model and Predict Individual Consumer Behavior and Social Network Relationships

* Tableau Software Teams with Attivio to Tap Unstructured Content and Deliver Deeper Insight to Business Users

* Infochimps and the Future of Data Marketplaces “This is the clearest indication yet that data marketplaces may be the latest ‘Application Service Provider’ cycle, as in right idea, wrong time.”

* HStreaming and RainStor Partner to Lower the Cost of Big-Data Analytics on Hadoop

* JustOne Database Sets the Stage for Accelerated Growth in 2012 and Beyond

* Big Data investment map

* A group of Google Engineers released “vitess” – a project to help scale MySQL databases.

* For 451 Research clients

# Reassessing the M&A potential of NoSQL and NewSQL Sector IQ report

# Sears Holdings creates Hadoop managed service provider MetaScale Impact Report

# Datawatch turns the corner with focus on report analytics suite Impact Report

# arcplan details growth plan, as it expands into front end for SAP HANA and social BI Impact Report

# Objectivity adds reusable queries to InfiniteGraph NoSQL database Market Development Report

# Host Analytics illuminates cloud performance management growth strategy and roadmap Market Development Report

And that’s the Data Day, today.

The Data Day, Today: Feb 14 2012

Teradata closes best year ever. NetApp and EMC propose big data forum. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Teradata Announces 2011 Fourth Quarter and Full-Year Results (PDF)

* Hell Has Not Frozen Over: NetApp and EMC Combine to Educate for Big Data Standards

* Cray Forms New Big Data Division, Hires New General Manager

* Privacy in the Age of Big Data

* ScaleBase Unveils New Elastic Load Balancing Feature at Cloud Connect

* Introducing CDH4

* Lucid Imagination “Search-as-a-Service” Powers Flexible, Cost-Effective Enterprise-Wide Data Discovery

* Couchbase Survey Shows Accelerated Adoption of NoSQL in 2012

* Open Source OData Tools for MySQL and PHP Developers

* New Release of WhereScape’s Data Warehouse Development Environment Enables Cross-Platform Database Appliance Support

* On MongoDB, SQL and ACID

* For 451 Research clients

# IxReveal seeks opportunities as a hub for data fusion Impact Report

# 5000fish sets out to swim beyond an IT services management pond in BI Impact Report

# Zimory boosts Scale cloud database with pickup of sones development team Deal Analysis Report

# Alpine Data outlines strategy as it follows the workflow for advanced analytics Market Development report

# 10gen targets agility and flexibility for increased document database adoption Market Development report

# ScaleArc expands its database-clustering and load-balancing focus beyond MySQL Market Development report

And that’s the Data Day, today.

Last chance to take part in our MySQL/NoSQL/NewSQL survey

Thanks to everyone who has already taken part in our survey exploring changing attitudes to MySQL following its acquisition by Oracle and examining the competitive dynamic between MySQL and other database technologies, including NoSQL and NewSQL.

The response has been great and even a quick look at the results makes for interesting reading, particularly in the light of our previous findings which indicated declining MySQL usage.

I am really looking forward to diving deep into the results and breaking out the figures to get a better understanding of the potential impact of alternative MySQL distribution and support providers, as well as NoSQL and NewSQL, on continued usage of MySQL.

The survey results will be made freely available on our blogs, and will also be included in a long-format report containing our additional analysis and research related to the MySQL ecosystem and competitive dynamic.

Right now, however, is your last chance to contribute to the survey and get your voice heard. There are just 12 questions to answer, spread over four pages, and the entire survey should take no longer than five minutes to complete. All individual responses are of course confidential.

The survey will close in 24 hours.

Membase + CouchOne =

I put this slide together for my own benefit as I was trying to keep track of the various incarnations of Couchbase’s brands. Looks like I wasn’t the only one, so I thought I’d also make our perspective available.

There are a couple of differences between our slide and Koji Kawamura’s:

Ours contains an extra layer of names (e.g. “Elastic Couchbase”) that were briefly used by Couchbase in discussion and I believe in marketing, although never for shipping product.

Also ours doesn’t mention memcached. It could be on there given that Membase is based on it, and Couchbase Server can still be deployed in “memcached only mode”, but in that sense it is a feature of Membase/Couchbase Server. And anyway, I couldn’t fit it on 🙂

The Data Day, Today: Jan 27 2012

Amazon launches AWS Storage Gateway. Postgres Plus Cloud Server. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Amazon Web Services Announces AWS Storage Gateway to Connect Enterprise Data with the Cloud

* EnterpriseDB Announces Availability of Postgres Plus Cloud Database

* Big VCs Invest In Big Data Startup Continuuity

* At Davos, Discussions of a Global Data Deluge

* Zimory Names New Head of zimory®scale; the Cloud Database Elasticity Division

* Jaspersoft’s Java Reporting Engine Integrated with Cloud Foundry

* IBM Debuts New Analytics Appliance to Help Retailers Transform Big Data Into Business Opportunities

* The Mass Technology Leadership Council published its report on big data and analytics.

* Apache HBase 0.92.0 has been released

* Is Security An Afterthought For NoSQL?

* What’s the big deal about Big Data?

* Hadoop Summit 2012 Announced to Showcase Apache Hadoop as Next Generation Enterprise Data Platform

* Announcing BigCouch 0.4

* Microsoft’s plan for Hadoop and big data

* Google Goes MoreSQL With Tenzing – SQL Over MapReduce

* Seismic Data Science: Reflection Seismology and Hadoop

* GoodData Posts Record-Breaking 600% Year-Over-Year Revenue Growth In 2011

* For 451 Research clients

# 2012 M&A Outlook – Software Assessing the runners and riders for M&A and IPOs in 2012

# RJMetrics scores $1.2m debt funding, sets out SaaS BI stall Impact report

* Google News Search outlier of the day: Pork Tenderloin: A Healthy Eating Hero

And that’s the Data Day, today.

Is MySQL usage really declining?

If you’re a MySQL user, tell us about your adoption plans by taking our current survey.

Back in late 2009, at the height of the concern about Oracle’s imminent acquisition of Sun Microsystems and MySQL, 451 Research conducted a survey of open source software users to assess their database usage and attitudes towards Oracle.

The results provided an interesting snapshot of the potential implications of the acquisition and the concerns of MySQL users and even, so I am told, became part of the European Commission’s hearing into the proposed acquisition (used by both sides, apparently, which says something about both our independence and the malleability of data).

One of the most interesting aspects concerned the apparently imminent decline in the usage of MySQL. Of the 285 MySQL users in our 2009 survey, only 90.2% still expected to be using it two years later, and only 81.8% in 2014.

Other non-MySQL users expected to adopt the open source database after 2009, but the overall prediction was decline. While 82.1% of our sample of 347 open source users were using MySQL in 2009, only 78.7% expected to be using it in 2011, declining to 72.3% in 2014.
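To make the arithmetic behind those percentages explicit, here is a rough sketch that reconstructs approximate head counts from the figures quoted above. The survey reported percentages only, so the counts are back-calculated from rounded values and may be off by a respondent or two; the names `implied_counts`, `RETAINED_SHARE` and `OVERALL_SHARE` are purely illustrative.

```python
# Figures quoted in the post. Head counts are reconstructed from rounded
# percentages, so treat them as approximations, not survey data.
SAMPLE_SIZE = 347        # open source users surveyed in 2009
MYSQL_USERS_2009 = 285   # of those, the number using MySQL in 2009

# Share of the 2009 MySQL users who expected to still be using it.
RETAINED_SHARE = {2011: 0.902, 2014: 0.818}
# Share of the whole sample using (2009) or expecting to use MySQL.
OVERALL_SHARE = {2009: 0.821, 2011: 0.787, 2014: 0.723}


def implied_counts(year):
    """Return (retained, total, new_adopters) head counts implied for a year."""
    retained = round(MYSQL_USERS_2009 * RETAINED_SHARE[year])
    total = round(SAMPLE_SIZE * OVERALL_SHARE[year])
    return retained, total, total - retained


if __name__ == "__main__":
    print(f"2009 share: {MYSQL_USERS_2009 / SAMPLE_SIZE:.1%}")
    for year in (2011, 2014):
        retained, total, new = implied_counts(year)
        print(f"{year}: {retained} retained + {new} new = {total} expected users")
```

Under those assumptions, roughly 257 of the original 285 MySQL users plus about 16 new adopters give the 273 or so expected users in 2011, and about 233 plus 18 give roughly 251 in 2014. That is how new adoption and an overall decline can coexist in the same results.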

This represented an interesting snapshot of sentiment towards MySQL, but the result also had to be taken with a pinch of salt given the significant level of concern regarding MySQL’s future at the time the survey was conducted.

The survey also showed that only 17% of MySQL users thought that Oracle should be allowed to keep MySQL, while 14% of MySQL users were less likely to use MySQL if Oracle completed the acquisition.

That is why we are asking similar questions again, in our recently launched MySQL/NoSQL/NewSQL survey.

More than two years later Oracle has demonstrated that it did not have nefarious plans for MySQL. While its stewardship has not been without controversial moments, Oracle has also invested in the MySQL development process and improved the performance of the core product significantly. There are undoubtedly users that have turned away from MySQL because of Oracle but we also hear of others that have adopted the open source database specifically because of Oracle’s backing.

That is why we are now asking MySQL users to again tell us about their database usage, as well as attitudes to MySQL following its acquisition by Oracle. Because the database landscape has changed considerably since late 2009, we are now also asking about NoSQL and NewSQL adoption plans.

Is MySQL usage really in decline, or was the dip suggested by our 2009 survey the result of a frenzy of uncertainty and doubt given the imminent acquisition? Will our current survey confirm or contradict that result? If you’re a MySQL user, tell us about your adoption plans by taking our current survey.

451 Research MySQL/NoSQL/NewSQL survey

I’ve just launched a new survey that should be of interest if you are currently using or actively considering MySQL or any of the NoSQL or NewSQL offerings.

The aim of the survey is threefold:

– identify trends in database usage over time
– explore changing attitudes to MySQL following its acquisition by Oracle
– examine the competitive dynamic between MySQL and other database technologies, including NoSQL and NewSQL

There are just 12 questions to answer, spread over four pages, and the entire survey should take no longer than five minutes to complete.

All individual responses are of course confidential. The results will be published as part of a major research report due at the end of Q1. Thanks in advance for your participation.

The survey can be found at: http://www.surveymonkey.com/s/MySQLNoSQLNewSQL

The Data Day, Today: Jan 13 2012

Splunk files for IPO. Oracle updates its price list. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Splunk Inc. Files Registration Statement for an Initial Public Offering And here it is.

* Oracle updated its Engineered System price list.

* Comparing Hadoop Appliances Great post from Pythian’s Gwen Shapira.

* What is big data? Edd Dumbill provides an introduction to the big data landscape.

* Why Couchbase? Damien Katz clarifies the reasons behind his preference for Couchbase over Apache CouchDB.

* Jaspersoft First to Develop Business Intelligence for Platform-as-a-Service BI suite now available with Red Hat OpenShift.

* Birst and ParAccel Partner to Deliver Scalable and Agile Big Data Analytics in the Cloud. Leverage.

* Recommind Names 451 Research Cofounder Nick Patience Director of Product Marketing and Strategy Our loss is Recommind’s gain.

* Oracle Unveils Oracle TimesTen In-Memory Database 11g Release 2 Performance and scalability improvements.

* Walkie Talkie App Voxer Soars Past a Billion Operations per Day powered by Basho Riak 10-4 good buddy.

* ISYS Search to Provide Enhanced Text Data Extraction Capabilities for New Generation of SAP Solutions OEM deal.

* Using SQLFire as a read-only cache for MySQL. VMware explains why and how.

* Announcing MySQL Enterprise Backup 3.7.0 Self-explanatory.

* Tableau Software Doubles Sales in 2011, Announces Massive Growth in Customer Roster Worldwide Customer base up by 40 percent in 2011.

* VoltDB Completes 2011 With Significant Market Growth and Company Expansion Including growth in new customer accounts of more than 300%.

* Clarabridge Wins Record Number of New Clients in 2011 More than 60 new Clarabridge Enterprise customers and more than 700 new Clarabridge Professional customers.

* For 451 Research clients

# Oracle selects Cloudera for Hadoop-based Big Data Appliance Market development report

# Microsoft may offer ‘big security data’ for free Analyst note

# Zimory considering virtual independence for cloud database business Market development report

# Jitterbit sheds light on growth strategy, integration business under new CEO Market development report

# SnapLogic snaps into the enterprise, shifts gaze away from midmarket integration Market development report

* Google News Search outlier of the day: My Best Friend’s Hair Launches Nationwide Website to Help You Find the Perfect Hairstylist

And that’s the Data Day, today.

NoSQL ≠ open source

I thought we had finished trying to define NoSQL in 2010, but Martin Fowler has raised the question again with his recent post – although he has a good reason to do so, since he is collaborating on a book on the subject.

Fowler’s list of common characteristics (which he acknowledges is not definitional) is as follows:

  • Not using the relational model (nor the SQL language)
  • Open source
  • Designed to run on large clusters
  • Based on the needs of 21st century web properties
  • No schema, allowing fields to be added to any record without controls

You could argue about whether all NoSQL databases are designed to run on large clusters, but the characteristic from the list above that I would dispute is open source.

While it is undoubtedly true to say that most NoSQL databases are open source, I don’t believe it defines them in the same way that other common characteristics do.

The main argument for making open source licensing a requirement of NoSQL seems to me to be historical. The first NoSQL meeting, cited by Fowler, specified that it was about “open source, distributed, non-relational databases”.

However, making open source licensing a defining characteristic of NoSQL would also exclude a number of products that would otherwise clearly fit the definition of NoSQL, as well as projects such as Google’s BigTable and Amazon’s Dynamo which were the genesis of much – although by no means all – of the momentum behind the NoSQL database movement.

For the sake of argument let’s assume Amazon decided to release a version of Dynamo that could be deployed on-premise and for whatever reason decided not to release “Dynamo-on-premise” under an open source license.

Is anyone seriously going to argue that a closed source “Dynamo-on-premise” wouldn’t be a NoSQL database?

For what it’s worth, since our NoSQL, NewSQL and Beyond report, the description of NoSQL I have been using is:

  • A new breed of non-relational database products
  • sharing a rejection of fixed table schema and join operations
  • designed to meet scalability requirements of distributed architectures
  • and/or schema-less data management requirements

Although, like Fowler, I would not claim this to be a definition.

The Data Day, Today: Jan 10 2012

Oracle OEMs Cloudera. The future of Apache CouchDB. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Oracle announced the general availability of Big Data Appliance, and an OEM agreement with Cloudera for CDH and Cloudera Manager.

* The Future of Apache CouchDB Cloudant confirms intention to integrate the core capabilities of BigCouch into Apache CouchDB.

* Reinforcing Couchbase’s Commitment to Open Source and CouchDB Couchbase CEO Bob Wiederhold attempts to clear up any confusion.

* Hortonworks Appoints Shaun Connolly to Vice President of Corporate Strategy Former vice president of product strategy at VMware.

* Splunk even more data with 4.3 Introducing the latest Splunk release.

* Announcement of Percona XtraDB Cluster (alpha release) Based on Galera.

* Bringing Value of Big Data to Business: SAP’s Integrated Strategy Forbes interview with Sanjay Poonen, President and corporate officer of SAP Global Solutions.

* New Release of Oracle Database Firewall Extends Support to MySQL and Enhances Reporting Capabilities Self-explanatory.

* Big data and the disruption curve “Many efforts are being funded by business units and not the IT department and money is increasingly being diverted from large enterprise vendors.”

* Get your SQL Server database ready for SQL Azure Microsoft “codename” SQL Azure Compatibility Assessment.

* An update on Apache Hadoop 1.0 Cloudera’s Charles Zedlewski helpfully explains Apache Hadoop branch numbering.

* Xeround and the CAP Theorem So where does Xeround fit in the CAP Theorem?

* Can Yahoo’s new CEO Thompson harness big data, analytics? Larry Dignan thinks Scott Thompson might just be the right guy for the job.

* US Companies Face Big Hurdles in ‘Big Data’ Use “21% of respondents were unsure how to best define Big Data”

* Schedule Your Agenda for 2012 NoSQL Events Alex Popescu updates his list of the year’s key NoSQL events.

* DataStax Takes Apache Cassandra Mainstream in 2011; Poised for Growth and Innovation in 2012 The usual momentum round-up from DataStax.

* Objectivity claimed significant growth in adoption of its graph database, InfiniteGraph, and flagship object database, Objectivity/DB.

* Cloudera Connector for Teradata 1.0.0 Self-explanatory.

* For 451 Research clients

# SAS delivers in-memory analytics for Teradata and Greenplum Market Development report

# With $84m in funding, Opera sets out predictive-analytics plans Market Development report

* Google News Search outlier of the day: First Dagger Fencing Competition in the World Scheduled for January 14, 2012

And that’s the Data Day, today.