March 5th, 2013 — Data management
SQL and Hadoop: ascloseasthis. Splunk revenue up 64%. And more.
And that’s the data day, today.
February 28th, 2013 — Data management
Rackspace buys ObjectRocket. Intel delivers Hadoop distro. And more.
And that’s the data day, today.
February 22nd, 2013 — Data management
Aster Discovery. Delphix and SAP. Hadoop use-cases. And more.
And that’s the data day, today.
February 21st, 2013 — Data management
I am very pleased and honoured to have been asked to provide a keynote presentation at the inaugural Hadoop Summit Europe, which will be held in Amsterdam on March 20-21.
The title of my talk is “What is the point of Hadoop?” which isn’t as derogatory as it sounds. Our research suggests there are hundreds of potential workloads that are suitable for Hadoop, but three core roles:
- Big data storage: Hadoop as a system for storing large, unstructured data sets
- Big data processing/integration: Hadoop as a data ingestion/ETL layer (see the sketch after this list)
- Big data analytics: Hadoop as a platform for new exploratory analytic applications
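To make the middle role a little more concrete, here is a minimal sketch – my own illustration, not anything from the Summit material – of Hadoop acting as a data ingestion/ETL layer via Hadoop Streaming. The input layout, paths and field names are hypothetical assumptions; the point is simply that raw, semi-structured records go in one end and cleansed, aggregated records come out the other, ready for downstream analysis.

```python
# Hypothetical sketch of Hadoop in its processing/integration (ETL) role,
# using Hadoop Streaming. Paths, delimiters and field positions are
# illustrative assumptions only.
#
# Submitted with something like:
#   hadoop jar hadoop-streaming-*.jar \
#     -files etl.py \
#     -mapper "python etl.py map" -reducer "python etl.py reduce" \
#     -input /raw/clickstream -output /cleansed/clickstream
import sys


def mapper():
    """Parse raw, tab-delimited log lines and emit user_id<TAB>url pairs."""
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # drop malformed records during ingestion
        user_id, _timestamp, url = fields[0], fields[1], fields[2]
        print(f"{user_id}\t{url}")


def reducer():
    """Count page views per user; output could feed a downstream warehouse."""
    current_user, count = None, 0
    for line in sys.stdin:
        user_id, _url = line.rstrip("\n").split("\t", 1)
        if user_id != current_user:
            if current_user is not None:
                print(f"{current_user}\t{count}")
            current_user, count = user_id, 0
        count += 1
    if current_user is not None:
        print(f"{current_user}\t{count}")


if __name__ == "__main__":
    # Run with "map" or "reduce" as the first argument.
    role = sys.argv[1] if len(sys.argv) > 1 else "map"
    mapper() if role == "map" else reducer()
```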
The flexibility of Apache Hadoop is one of its biggest assets – enabling businesses to generate value from data that was previously considered too expensive to be stored and processed in traditional databases – but also results in Hadoop meaning different things to different people.
As early adopters press ahead with innovative new analytic applications, many mainstream enterprises are still scratching their heads trying to demonstrate Hadoop’s value. It is tempting to try to run before you can walk when you see others demonstrating the potential of Hadoop-based analytics, but in my view jumping ahead to Hadoop-based analytics without first understanding Hadoop’s storage and integration roles runs the risk of confusion and, potentially, disillusionment.
My keynote presentation at Hadoop Summit Europe will explore the impact that Hadoop is having on the traditional data processing landscape, examining the expanding ecosystem of vendors and their relationships with Apache Hadoop, exploring adoption trends around the world, and highlighting how an understanding of the roles Hadoop can play will be essential to helping Hadoop cross the chasm from early adopters to mainstream adoption.
Anyone interested in attending the event can get a 20% discount, using the registration code 13aslett20.
February 12th, 2013 — Data management
ClearStory sheds light on data analysis service. Illuminating ‘dark data’. More.
And that’s the data day, today.
February 8th, 2013 — Data management
Teradata results. Funding for DataXu. The chemistry of data. And more.
And that’s the data day, today.
December 13th, 2012 — Data management
451 Research’s Information Management practice has published its latest long-format report: Total Data Analytics. Written by Krishna Roy, Analyst, BI and Analytics, along with myself, it examines the impact of ‘big data’ on business intelligence and analytics.
The growing emphasis on ‘big data’ has focused unprecedented attention on the potential of enterprises to gain competitive advantage from their data, helping to drive adoption of BI/analytics beyond the retail, financial services, insurance and telecom sectors.
In 2011 we introduced the concept of ‘Total Data’ to reflect the path from the volume, velocity and variety of big data to the all-important endgame of deriving maximum value from that data. Analytics plays a key role in deriving meaningful insight – and therefore, real-world business benefits – from Total Data.
In short, big data and Total Data are changing the face of the analytics market. Advanced analytics technologies are no longer the preserve of MBAs and ‘stats geeks,’ as line-of-business managers and others increasingly require this type of analysis to do their jobs.
Total Data Analytics outlines the key drivers in the analytics sector today and in the coming years, highlighting the technologies and vendors poised to shape a future of increased reliance on offerings that deliver on the promise of analyzing structured, semi-structured and unstructured data.
The report also takes a look at M&A activity in the analytics sector in 2012, as well as the history of investment funding involving Hadoop, NoSQL and Hadoop-based analytics specialists. It also contains a list of 40 vendors we believe have the greatest potential to shape the market in the coming years.
The report is available now to 451 Research clients, here. Non-clients can get more information and download an executive summary from the same link.
December 7th, 2012 — Data management
Cloudera raises $65m. HP launches Hadoop AppSystem. And more.
And that’s the Data Day, today.
November 30th, 2012 — Data management
Dan Woods recently opined that Apache Hadoop has had a weird beginning thanks to its “Three Headed Open Core” model and warned that there is a danger that it will fragment – à la Unix – thanks to competing commercial forces.
There are a couple of points to address here. The first is the assumption that the vendor community developing Hadoop is in some way ‘weird’. It isn’t, at least not to those of us who have studied the evolution of open source-related business strategies.
In fact, Hadoop’s multi-vendor community is a prime example of the corporate-dominated development communities we saw emerging as the fourth stage of commercial open source back in 2010.
Some people still have trouble understanding, as I wrote two years ago, that
being successful is about sharing your code development with the competition via multi-vendor open source projects in order to benefit from improved code quality and lower research and development costs for non-differentiating features AND beating your competition with proprietary complementary technologies.
This isn’t weird. I firmly believe in the not-too-distant future this will be seen as entirely normal.
Another issue to address is the suggestion that these competing vendors pose a danger to the core project. In the blog post linked above I argued that the contrary is true: the various competing players in collaborative communities have a similar impact on the development of a project as competing factors – climate, habitat, the existence or dearth of predators, and so on – do in Darwin’s evolutionary process: they make it stronger.
I would be much more concerned about the potential fragmentation of Hadoop if we were looking at four or five different competing implementations of Google’s MapReduce and file system research. Instead, the differentiating features that Cloudera, Hortonworks, MapR, IBM and EMC have introduced can be compared to the result of natural selection driven by the need to adapt to particular conditions.
So long as there remains a single core Apache Hadoop project upon which these differentiating features are based I believe Hadoop will not only survive, but will thrive. If I may quote myself again: “As long as they continue to collaborate on the non-differentiating code, the project should benefit from being stretched in multiple directions.”
I believe that, as with Linux, the vendors involved have learned the lessons of the Unix wars and understand that it is in their best interests – let alone everyone else’s – not to repeat them.
Another key point when we look at the Hadoop ecosystem is that we see multiple vendors building on others’ differentiating features and often supporting multiple distributions. It’s not a case of a herd of individually differentiated Hadoops, but more like a stack of Russian Hadoop dolls.
To my mind there are (currently) eight main Hadoop business strategies, each of which has the potential to build on those before it:
- Hadoop distributors (e.g. Cloudera, Hortonworks, MapR, EMC, IBM)
- Hadoop cloud services (e.g. Amazon EMR, Google Compute Engine)
- Hadoop-based deployment services (e.g. Infochimps, Metascale)
- Hadoop-based deployment stacks/appliances (e.g. Zettaset, Oracle BDA, Dell)
- Hadoop-based development services (e.g. Continuuity, Mortar Data)
- Hadoop-based application stacks (e.g. NGDATA, Guavus)
- Hadoop-based database stacks (e.g. Drawn to Scale, Splice Machine)
- Hadoop-based analytic services (e.g. Treasure Data, Qubole)
November 9th, 2012 — Data management
Funding for Neo, Elasticsearch and Hadapt. And more.
And that’s the Data Day, today.