Entries from March 2013 ↓

The Data Day, A few days: March 25-28 2013

March 28th, 2013 — Data management

Google pledges patent support for OSS. Basho open sources Riak CS. And more

For 451 Research clients: Basho takes Riak cloud storage platform down open source path bit.ly/YChz61 By @simonrob451, and me

— Matt Aslett (@maslett) March 25, 2013

For 451 Research clients: Continuent expands its database vision beyond clustering and replication bit.ly/YCcgJT

— Matt Aslett (@maslett) March 27, 2013

For 451 clients: Glassbeam plots course for its SaaS analysis service with unstructured data twist bit.ly/XDo3S1 By Krishna Roy

— Matt Aslett (@maslett) March 26, 2013

For 451 clients: Our take on what the Pivotal Initiative means for EMC, VMware and software-defined everything bit.ly/XDnW95

— Matt Aslett (@maslett) March 26, 2013

Google pledges not to assert MapReduce patents against users, distributors or developers of open source software. bit.ly/WYRW1W

— Matt Aslett (@maslett) March 28, 2013

MapR teams with Canonical bit.ly/13B3Sfk releases source code for Hadoop modifications on GitHub. bit.ly/13B42DC

— Matt Aslett (@maslett) March 28, 2013

Nice intro from Hortonworks for anyone looking for help understanding Hadoop 2.0. bit.ly/XDodZA

— Matt Aslett (@maslett) March 26, 2013

Platfora’s native in-memory business intelligence platform for Hadoop is now generally available. bit.ly/XDofk1

— Matt Aslett (@maslett) March 26, 2013

Akiban Server is OSS bit.ly/YChK12

— Matt Aslett (@maslett) March 25, 2013

(Apache) Tajo (incubating) is a relational and distributed data warehouse system for Hadoop. bit.ly/YChNdr

— Matt Aslett (@maslett) March 25, 2013

And that’s the data day, today.


NoSQL LinkedIn Skills Index – March 2013

March 26th, 2013 — Data management

As Q1 comes to a close, it’s time to take another look at our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects. This is the second update since we rebooted the analysis in September 2012 to account for more products and refine our search terms.

A few interesting statistics to pick out: Neo4j has, as predicted, jumped ahead of MarkLogic for sixth place. No other changes of position, but outside the top ten, shown here, Apache Accumulo continues to grow well.

In fact, Apache Accumulo had the fastest rate of growth for the second quarter in succession, once again just ahead of DynamoDB and OrientDB, followed by Apache Cassandra and MongoDB.

MongoDB’s growth means that it once again extended its lead as the most popular NoSQL database, according to LinkedIn profile mentions. As the chart below illustrates, it now accounts for 46% of all mentions of NoSQL technologies in LinkedIn profiles, according to our sample, compared with 45% in December.


Big Data and the Cloud: A Perfect Storm? – expanded edition

March 25th, 2013 — Data management

451 Research will be hosting its annual HCTS EU event in London on April 9/10. The event includes presentations from 451 Research, Uptime Institute and Yankee Group analysts, as well as representatives from vendors and enterprises – such as HCBS, The BBC, Google, Morgan Stanley, Greenpeace, News International, BNP Paribas, and ING.

We also have a guest speaker in Tim Harford, author of “The Undercover Economist”.

As if that wasn’t enough, I’ll be presenting the expanded version of my “Big Data and the Cloud: A Perfect Storm?” presentation.

As I previously wrote ahead of presenting the shortened version at Cloud Expo Europe, many people seem to believe that cloud computing and big data have the potential to create a perfect storm of disruption.

However, 451 Research has been tracking the adoption of data management technologies on the cloud – and the lack of it – since relational databases became available on AWS in 2008, and the effect of the confluence of big data and the cloud would perhaps better be described as dead calm, rather than a perfect storm. Other than development and test environments, adoption has been limited.

In my presentation I’ll take a look at the factors that have restricted adoption of databases in the cloud to date – including some exclusive results from our recent database survey – explain why we see the potential for cloud database growth in the coming years, and examine how the strategies of emerging Hadoop- and database-as-a-service providers are evolving to ensure that big data and the cloud combine to fulfil their potential to disrupt the IT landscape as we know it.

For full details of the event, and to register, click here.


The Data Day, A few days: March 20-22 2013

March 22nd, 2013 — Data management

MongoDB goes Enterprise. Riak CS goes open source. And more.

For 451 Research clients: 10gen accelerates NoSQL commercial plans with MongoDB Enterprise bit.ly/ZJoXhl

— Matt Aslett (@maslett) March 20, 2013

For 451 Research clients: JethroData raises funding to develop Hadoop-based analytic database bit.ly/11j5wgF

— Matt Aslett (@maslett) March 21, 2013

For 451 clients: The rise of the ‘predictive business’ – a machine-learning future for analytics M&A? bit.ly/11j5BRe By Krishna Roy

— Matt Aslett (@maslett) March 21, 2013

Concurrent Closes $4 Million in Series A Funding, Appoints Gary Nakamura as CEO mwne.ws/Yr669s

— Matt Aslett (@maslett) March 20, 2013

Riak CS – simple, available cloud storage built on Riak – is now open source. basho.com/riak-cs-is-now…

— Basho Technologies (@basho) March 20, 2013

Cloudera and T-Systems announce strategic partnership to deliver cloud-based data analytics based on Hadoop. bit.ly/16IZGck

— Matt Aslett (@maslett) March 20, 2013

Actian announced the launch of Vectorwise 3.0 analytic database with Hadoop integration. bit.ly/ZRnvZ0

— Matt Aslett (@maslett) March 19, 2013

Jaspersoft added more than 400 new customer deals in 2012. bit.ly/Y3lCfa

— Matt Aslett (@maslett) March 22, 2013

And that’s the data day, today.


The Data Day, A few days: March 18-19 2013

March 19th, 2013 — Data management

Splunk adds structure. MapR raises $30m. And more.

For 451 Research clients: Splunk adds structured data integration with DB Connect bit.ly/ZPzI0f

— Matt Aslett (@maslett) March 19, 2013

For 451 clients: Armed with fresh VC and a new focus, FeedZai takes fraud-detection wares to US bit.ly/ZPzJkO By Krishna Roy

— Matt Aslett (@maslett) March 19, 2013

10gen Releases MongoDB 2.4 with Hash-based Sharding, Capped Arrays, Geo Enhancements and more bit.ly/11a4eV0

— 10gen (@10gen) March 19, 2013

MapR Technologies Closes $30 Million in New Funding bit.ly/XWqd2B

— Matt Aslett (@maslett) March 19, 2013

F5 Networks exec named CEO of database startup @sqrrl_inc (by @kylealspach) bizjournals.com/boston/blog/st…

— The BBJ Newsroom (@BostonBizNews) March 19, 2013

Splunk delivers relational database integration with Splunk DB Connect. bit.ly/WRcP12

— Matt Aslett (@maslett) March 18, 2013

Drawn to Scale Announces Spire for Mongo, a distributed data platform for MongoDB. bit.ly/Zs4Gh2

— Matt Aslett (@maslett) March 19, 2013

TIBCO Software has announced the launch of TIBCO Spotfire. mwne.ws/WRd4t7

— Matt Aslett (@maslett) March 18, 2013

Kognitio announces massively parallel ‘R’ via external scripts. prn.to/XZXZX7

— Matt Aslett (@maslett) March 19, 2013

Big Data Broadens Its Range on.wsj.com/Wgbek9

— Matt Aslett (@maslett) March 14, 2013

And that’s the data day, today.


Forthcoming webinar: The New Path to Performance. No Sharding!

March 19th, 2013 — Data management

On Tuesday March 26th at 10am PT I’ll be taking part in a webinar with NuoDB on the subject of The New Path to Performance. No Sharding!

As part of the webinar I’ll be explaining the various strategies used by enterprises to attempt to achieve scalability of relational databases, why they fail to meet modern distributed processing requirements, and why companies are increasingly open to looking at alternatives to the traditional relational database.

Wiqar Chaudry from NuoDB will also be discussing how to eliminate technical acrobatics, including:

Sharding
Clustering
Performance tuning
Replication
And other kinds of 20th century database tricks.

To register, click http://go.nuodb.com/no-sharding-webinar-register-s.html


The Data Day, A few days: March 11-14 2013

March 14th, 2013 — Data management

SAP’s predictive analytics plans. Dell’s Boomi MDM. And more

For 451 Research clients: SAP sheds light on predictive-analytics business bit.ly/ZEc5eu By Krishna Roy

— Matt Aslett (@maslett) March 12, 2013

For 451 Research clients: Dell Boomi launches into MDM with cloud service for the midmarket bit.ly/ZEcjSK By Krishna Roy

— Matt Aslett (@maslett) March 12, 2013

Teradata delivers Teradata Data Warehouse Appliance 2700. prn.to/Yre3ug

— Matt Aslett (@maslett) March 13, 2013

MIT researchers are developing a system called DBSeer to improve the efficiency of databases in cloud environments. bit.ly/10GAVcx

— Matt Aslett (@maslett) March 13, 2013

Twitter and Cloudera are open-sourcing Parquet: columnar storage format for Hadoop. Take a look! parquet.github.com

— Dmitriy Ryaboy (@squarecog) March 12, 2013

Scality integrates its RING storage software with Hadoop. bit.ly/YbfUaB

— Matt Aslett (@maslett) March 14, 2013

Hadapt adds Netezza co-founder Jit Saxena to Board of Directors. prn.to/Xardwd

— Matt Aslett (@maslett) March 14, 2013

What it means to be “all in” on Hadoop bit.ly/Y4Svre This isn’t about which vendor has the most Hadoop committers.

— Matt Aslett (@maslett) March 11, 2013

And that’s the data day, today.


What it means to be “all in” on Hadoop

March 11th, 2013 — Data management

Pivotal HD is not Hadoop.
Neither is Cloudera’s Distribution, including Apache Hadoop.
Nor the Hortonworks Data Platform.
Nor the MapR Distribution.
Nor IBM’s InfoSphere BigInsights.
Nor the WANdisco Distro.
Nor Intel’s Distribution for Apache Hadoop.

Apache Hadoop is Hadoop. And Hadoop is Apache Hadoop.

I don’t write that to be pedantic, or controversial, but because it is the only logical conclusion you can reach after reading Defining Apache Hadoop from the Apache Hadoop Wiki.

“The key point is that the only products that may be called Apache Hadoop or Hadoop are the official releases by the Apache Hadoop project as managed by that Project Management Committee (PMC)… Products that are derivative works of Apache Hadoop are not Apache Hadoop, and may not call themselves versions of Apache Hadoop, nor Distributions of Apache Hadoop.”

It is with this in mind that one should view the reaction to EMC Greenplum’s recent launch of Pivotal HD, and in particular this statement from Scott Yara, EMC Greenplum Senior Vice President, Products and Co-Founder:

“We’re all in on Hadoop, period.”

What does it mean to be “all in on Hadoop”? Based on a strict reading of Defining Apache Hadoop (a document that demands by its own words to be read strictly), being “all in” on Hadoop means only one thing: being “all in” on Apache Hadoop.

I have no doubt that EMC Greenplum is “all in” on Pivotal HD, but that’s not the same thing at all.

Not a purity debate

There is nothing wrong with offering additional functionality beyond the scope of Apache Hadoop – the licensing terms clearly encourage it.

As my fellow analyst Merv Adrian notes:

“Having some components of your solution stack provided by the open source community is a fact of life and a benefit for all. So are roads, but nobody accuses Fedex or your pizza delivery guy of being evil for using them without contributing some asphalt.”

That is true. However, to continue the analogy, you would expect any company that claimed to be “all in on roads” to be getting involved in laying and maintaining them, rather than just driving on top of them.

Despite what some people may think, this isn’t a matter of arguing about which vendor has the most Hadoop committers. It is a matter of defining what users understand Hadoop to be, and what they understand it not to be. It is a matter of drawing a line between Hadoop – Apache Hadoop – and additional, proprietary, functionality beyond the scope of the project.

User preference

Whether users will choose to go with a pure approach to Hadoop-based products and services is another matter. Dan Woods, for one, clearly believes that products like Pivotal HD will drive further mainstream adoption beyond “the limits of open source.”

The idea is that most enterprises don’t care if it meets the Apache definition of Hadoop or not, as long as it works.

While I have no doubt that some companies will be drawn to the additional features and confidence that vendors such as EMC and Intel can provide, I have also spoken to multiple enterprises – including one very large enterprise just last week – for which the preference is to default to open in order to avoid any potential for lock-in and vendor-specific architecture choices.

There are many enterprises that care very much whether what they are adopting meets the Apache definition of Hadoop.

Which of these attitudes will dominate? I’m not going to pretend I know the answer to that question at this point, but our previous coverage of open source adoption suggests that once the door to openness has been unlocked, it’s very hard to force it shut again.

Dan Woods responded to my (sarcastic) comment about this as follows:

@maslett Linux is an enterprise product. The use-value players (IBM, HP, Intel) took it over, invested, and adapted it to enterprise needs.

— Dan Woods (@danwoodscito) March 5, 2013

I would dispute that players like IBM, HP, and Intel “took Linux over”, but in any case it is undeniable that they had a significant role to play – alongside Red Hat, Novell et al., and individual developers – in turning Linux into an enterprise-grade operating system.

The point, though, is that they did so by engaging with the Linux project, not by launching their own differentiated versions of Linux.


The Data Day, A few days: March 6-8 2013

March 8th, 2013 — Data management

Ayasdi emerges. Amazon slashes DynamoDB prices. And more

For 451 Research clients: Ayasdi applies TDA, machine learning and Hadoop twist to advanced analytics bit.ly/XYG0fZ By Krishna Roy

— Matt Aslett (@maslett) March 8, 2013

For 451 Research clients: Software AG launches In-Genius take on in-memory operational intelligence bit.ly/XYFI8G

— Matt Aslett (@maslett) March 8, 2013

For 451 clients: Anaplan lands $30m to accelerate growth, sets out expansion strategy under new CEO bit.ly/YMQ6gF By Krishna Roy

— Matt Aslett (@maslett) March 6, 2013

NGDATA has acquired ENQIO, a data management, business analytics and campaign management consultancy. bit.ly/Zq7ryb

— Matt Aslett (@maslett) March 6, 2013

#DynamoDB One Year Later: Bigger, Better, and 85% Cheaper… – #AllThingsDistributed #aws wv.ly/W9sCWy

— Werner Vogels (@Werner) March 8, 2013

Big SQL is new technology from IBM that provides SQL access to data in Hadoop. bit.ly/YHqEwh

— Matt Aslett (@maslett) March 8, 2013

Percona launches remote DBA services for MySQL. bit.ly/ZqaXsy

— Matt Aslett (@maslett) March 6, 2013

And that’s the data day, today.


The Data Day, A few days: March 1-5 2013

March 5th, 2013 — Data management

SQL and Hadoop: ascloseasthis. Splunk revenue up 64%. And more.

For 451 Research clients: SQL and Hadoop: a marriage of convenience bit.ly/162RfZ9 Just what do they see in each other?

— Matt Aslett (@maslett) March 4, 2013

For 451 clients: Skytree reaches into front end for machine learning, prepares for Series A funding bit.ly/XJQzDi By Krishna Roy

— Matt Aslett (@maslett) March 5, 2013

For 451 Research clients: Kalido illuminates MDM business and product roadmap bit.ly/VgPLaq By Krishna Roy

— Matt Aslett (@maslett) March 1, 2013

Splunk reports FY revenue of $198.9m, up 64%. bit.ly/VgVqx2

— Matt Aslett (@maslett) March 1, 2013

Terracotta launches BigMemory 4.0 bit.ly/Vv8TkX and In-Genius in-memory intelligence software. bit.ly/Vv8YFs

— Matt Aslett (@maslett) March 4, 2013

Acunu has announced the availability of Acunu Analytics for Cassandra. bit.ly/162Rrrb (PDF)

— Matt Aslett (@maslett) March 4, 2013

SkySQL, Codership and Monty Program team up to release Galera Cluster for MariaDB. mwne.ws/YaZivl

— Matt Aslett (@maslett) March 5, 2013

And that’s the data day, today.

