February 21st, 2014 — Matthew Aslett
Data management
Informatica eyes $1bn in sales. And more
And that’s the data day, today.
Tags: 28msec, BeyondCore, cloudant, couchbase, Datawatch, Excel, infobright, Informatica, information governance, MapR, marklogic, Microsoft, MySQL, percona, Qlik, TokuMX, Tokutek, vertica, YARN
February 14th, 2014 — Matthew Aslett
Data management
Hortonworks and Red Hat expand Hadoop partnership. And more.
And that’s the data day, today.
Tags: Alpine Data, Altibase, Altiscale, Attivio, calpont, cloudera, couchbase, Google Cloud SQL, hortonworks, IBM, infinidb, Intel, MapR, Parelastic, pentaho, Pivotal, red hat, SkySQL, Splice Machine, SQLstream, storm, Tesora, TokuDB, Tokutek, zettaset
February 7th, 2014 — Matthew Aslett
Data management
Orchestrate launches. Domo raises $125m. And more
And that’s the data day, today.
Tags: Actian, Autonomy, birst, cloudera, ClusterControl, concurrent, Domo, hadoop, HANA, HP, Intel, MemSQL, MetaScale, Orchestrate, percona, SAP, SeveralNines, Spotfire, Tableau, teradata, ThoughtSpot, tibco, trifacta, Xplenty, XtraDB
January 31st, 2014 — Matthew Aslett
Data management
VoltDB targets in-memory analytics. Garantia Data becomes Redis Labs. And more.
And that’s the data day, today.
Tags: Actian, Altiscale, amazon, BigDataLite, elasticsearch, ELK, Garantia Data, Global IDs, Kinesis, microstrategy, opera, Oracle, PRIME, Redis Labs, storm, teradata, VoltDB
January 24th, 2014 — Matthew Aslett
Data management
Is Hadoop a planet? MariaDB Enterprise. And more
And that’s the data day, today.
Tags: amazon, BitYota, FatCloud, GigaSpaces, Greg Luck, hadoop, hazelcast, hortonworks, Informatica, MariaDB Enterprise, MemSQL, Redshift, SAS Institute, SkySQL, trifacta
January 22nd, 2014 — Matthew Aslett
Data management
Merv Adrian wrote a blog post recently bemoaning the “aspirational marketing” that surrounds Hadoop, in particular the fact that current deployments are a long way from delivering on the vision.
While I completely agree that many enterprises are struggling to translate tactical use-cases into the business use-cases required to drive more strategic adoption beyond the proof of concept stage, I don’t think that aspirational marketing around Hadoop is necessarily a bad thing.
It is certainly true that part of the problem lies in clearly understanding how Hadoop can be used as a complement to traditional relational database technologies deployed as an enterprise data warehouse.
That is why we recently asked Is Hadoop a planet? – comparing confusion around Hadoop’s classification to that of Pluto – while also describing Hadoop as a framework in search of a metaphor.
Given the confusion, however, I believe it is incumbent on Hadoop providers to describe not just the functional use-cases that are driving tactical adoption, but also the bigger vision that will drive more strategic adoption.
The data management industry has become accustomed to thinking about the storage, processing and analysis of data in analytical databases as akin to warehousing, to the extent that the phrase ‘data warehouse’ no longer requires an explanation.
We believe that a good understanding of the potential strategic role of Hadoop, even if it is only aspirational at this stage, will be important in encouraging broader and deeper adoption of Hadoop.
In addition, it is not as if there are no enterprises deploying Hadoop more strategically. Cloudera estimates that about 20% of its 300 subscription customers are already deploying Hadoop as what it calls an Enterprise Data Hub.
I’m not personally convinced that Enterprise Data Hub is really the right term (not least since we previously used the term Data Hub in a slightly different context). Other potential terms include data lake and data refinery.
Although the latter better describes Hadoop’s role in aggregating and processing data and the industrial-scale processes used to make data more acceptable for different analytic use-cases, it appears to have quickly passed out of fashion compared to the former.
I have begun using the term ‘data treatment plant’ as a combination of the two concepts to describe how Hadoop can be used as a single ‘logical’ unified data platform into which you simply pour data, while industrial-scale processes – the multiple data processing and analytic engines that will be supported by Hadoop 2.0, such as MapReduce, stream processing, SQL and NoSQL – are used to make data more acceptable for a desired end-use.
451 clients can get more detail on the ‘data treatment plant’ and why we believe a bit of aspirational marketing may not be a bad thing for Hadoop, from our recent report, Hadoop: a framework in search of a metaphor.
Tags: aspirational marketing, data hub, data lake, data refinery, data treatment plant, enterprise data hub, hadoop
January 17th, 2014 — Matthew Aslett
Data management
Neo updates Neo4j. Cloudera sharpens focus on Accumulo. And more
And that’s the data day, today.
Tags: accumulo, basho, cloudera, Diyotta, Google, hadoop, IBM, Koverse, neo, Neo4J, Oracle, riak, SGI, Veristorm, Verizon, vStorm, watson
January 10th, 2014 — Matthew Aslett
Data management
IBM forms Watson Group. And more
And that’s the data day, today.
Tags: Aspera, cloudera, HANA, Host Analytics, IBM, KXEN, Logi Analytics, Presto, Qubole, SAP, Spark, tibco, watson
December 19th, 2013 — Matthew Aslett
Data management
VC funding for Hadoop and NoSQL tops $1bn. And more
And that’s the data day, today.
Tags: aws, couchdb, datameer, dynamoDB, Elastic MapReduce, enterprisedb, fedora, Graph Builder, hadoop, impala, Intel, Kinesis, MapR, noSQL, Nutonian, Qubole, RethinkDB, Salesforce.com, Splice Machine
December 18th, 2013 — Matthew Aslett
Data management
There’s an early end to the quarter for our NoSQL LinkedIn Skills Index, based on the number of LinkedIn member profiles mentioning each of the NoSQL projects, just as there was in 2012.
We predicted in Q3 that Couchbase would overtake MarkLogic this quarter, which came to pass, but were somewhat surprised to see Couchbase also leapfrog Riak to claim 7th place. It’s almost too close to call between the three, and we wouldn’t be surprised to see those places change hands in the coming quarters.
There were no other changes of position outside the top ten, although Titan is bearing down on Hypertable, having recorded the fastest growth in Q4 (49.5%), and can be expected to gain a place in Q1 2014. The second fastest climber, in terms of mentions, was FoundationDB, followed by ArangoDB, RethinkDB and Apache Cassandra (the latter being particularly notable since it was the only one of the five fastest growers to also be one of the top ten most mentioned in LinkedIn member profiles).
That growth was of course not enough to close the gap on MongoDB as the most mentioned NoSQL database in LinkedIn member profiles, although for the first time MongoDB’s proportion of the overall total actually declined – from 49% in Q3 to 48%, upsetting our prediction that MongoDB would pass the 50% threshold in Q4.
It will be interesting to see whether MongoDB’s dominance declines again in Q1, although either way it retains a monumental lead over all the other NoSQL databases in terms of mentions in LinkedIn profiles.
Of course, we would also note that this is not meant to be a comprehensive analysis, but rather a snapshot of one particular data source.
Tags: arangodb, cassandra, couchbase, foundationdb, Hypertable, LinkedIn, marklogic, mongodb, noSQL, RethinkDB, riak, Titan