Matthew Aslett — Too much information

The Data Day, Two days: February 25/26 2013

February 26th, 2013 — Data management

EMC Pivotal HD. Hortonworks Hadoop for Windows. And more.

For 451 Research clients: EMC Greenplum adds SQL to Hadoop with Pivotal new distribution bit.ly/YUwKsa

— Matt Aslett (@maslett) February 26, 2013

For 451 clients: Hortonworks launches Hadoop for Windows, the foundation for Microsoft HDInsight bit.ly/YUwMjC Analyst note

— Matt Aslett (@maslett) February 26, 2013

EMC launches Pivotal HD, integrating EMC Greenplum MPP database technology with Apache Hadoop. prn.to/YUwHwk

— Matt Aslett (@maslett) February 26, 2013

New Products and Releases: Cloudera Navigator, Cloudera Enterprise BDR, and More bit.ly/15NeDJY

— Cloudera (@cloudera) February 26, 2013

Hortonworks Brings Apache Hadoop to Windows bit.ly/125OpDr

— Matt Aslett (@maslett) February 25, 2013

Press release: New #Hadoop/#rstats support for @hortonworks, developing “in-Hadoop” predictive analytics bit.ly/XymIyw

— Revolution Analytics (@RevolutionR) February 26, 2013

DataDirect Networks unveils hScaler Hadoop appliance with integrated ETL engine and shared storage. bit.ly/13faZcp

— Matt Aslett (@maslett) February 26, 2013

And that’s the data day, today.

The Data Day, Two days: February 21/22 2013

February 22nd, 2013 — Data management

Aster Discovery. Delphix and SAP. Hadoop use-cases. And more.

For 451 Research clients: Teradata updates Aster Discovery Platform with visual analytic functions bit.ly/WYiLni

— Matt Aslett (@maslett) February 21, 2013

For 451 Research clients: Delphix lands SAP endorsement for expanded database-virtualization software bit.ly/W7uOQp

— Matt Aslett (@maslett) February 22, 2013

For 451 clients: Yottamine launches R-based machine-learning service in the cloud for data scientists bit.ly/WYiIIa By Krishna Roy

— Matt Aslett (@maslett) February 21, 2013

Informatica moves into process automation with Active Endpoints acquisition bit.ly/W7uSji by @carllehmann1 and Krishna Roy

— Matt Aslett (@maslett) February 22, 2013

For 451 clients: Red Hat plans Hadoop cooperation with open source Hadoop plug-in for Red Hat Storage bit.ly/WYiHni Analyst note.

— Matt Aslett (@maslett) February 21, 2013

Big Data’s New Use Cases: Transformation, Active Archive, and Exploration bit.ly/UMcQBr Great post on Hadoop’s role by @awadallah

— Matt Aslett (@maslett) February 21, 2013

SAP has announced the availability of SAP Sybase IQ 16. prn.to/WYiZuv

— Matt Aslett (@maslett) February 21, 2013

Basho introduces Riak 1.3 with active anti-entropy and faster multi-datacenter replication. bit.ly/UMbreh

— Matt Aslett (@maslett) February 21, 2013

The State of CouchDB bit.ly/WYkZDb (via @al3xandru)

— Matt Aslett (@maslett) February 21, 2013

Introducing The Couch Firm bit.ly/WYl7CE

— Matt Aslett (@maslett) February 21, 2013

VCE has introduced the Vblock Specialized System SAP HANA s.tt/1A2Dv

— Matt Aslett (@maslett) February 21, 2013

Understanding Unicorn: A deep dive into Facebook’s Graph Search zd.net/W7vdCH

— Matt Aslett (@maslett) February 22, 2013

Red Hat will open source its Hadoop plug-in for Red Hat Storage (GlusterFS) later this year. red.ht/134Rsem

— Matt Aslett (@maslett) February 20, 2013

And that’s the data day, today.

Hadoop Summit keynote preview: What is the point of Hadoop?

February 21st, 2013 — Data management

I am very pleased and honoured to have been asked to provide a keynote presentation at the inaugural Hadoop Summit Europe, which will be held in Amsterdam on March 20-21.

The title of my talk is “What is the point of Hadoop?” which isn’t as derogatory as it sounds. Our research suggests there are hundreds of potential workloads that are suitable for Hadoop, but three core roles:

Big data storage: Hadoop as a system for storing large, unstructured, data sets
Big data processing/integration: Hadoop as a data ingestion/ETL layer
Big data analytics: Hadoop as a platform for new exploratory analytic applications

The flexibility of Apache Hadoop is one of its biggest assets – enabling businesses to generate value from data that was previously considered too expensive to be stored and processed in traditional databases – but also results in Hadoop meaning different things to different people.

As early adopters press ahead with innovative new analytic applications, many mainstream enterprises are still scratching their heads trying to demonstrate Hadoop’s value. It is very tempting to try to run before you can walk when you see others demonstrating the potential for Hadoop-based analytics. However, it is my view that jumping ahead to Hadoop-based analytics without first understanding Hadoop’s storage and integration roles runs the risk of confusion and, potentially, disillusionment.

My keynote presentation at Hadoop Summit Europe will explore the impact that Hadoop is having on the traditional data processing landscape, examining the expanding ecosystem of vendors and their relationships with Apache Hadoop, exploring adoption trends around the world, and highlighting how an understanding of the roles Hadoop can play will be essential to helping Hadoop cross the chasm from early adopters to mainstream adoption.

Anyone interested in attending the event can get a 20% discount, using the registration code 13aslett20.

The Data Day, Two days: February 19/20 2013

February 20th, 2013 — Data management

Tableau IPO rumour. Funding for Elasticsearch. And more.

For 451 Research clients: TransLattice updates Elastic Database for multi-cloud environments bit.ly/W386co

— Matt Aslett (@maslett) February 20, 2013

For 451 Research clients: Rapid-I sets out to tackle US, moves into trading analytics via marketplace bit.ly/W38984 By Krishna Roy

— Matt Aslett (@maslett) February 20, 2013

Rumor has it Tableau Software has put in its prospectus (quietly). It could be the next billion-dollar tech #IPO. blogs.the451group.com/techdeals/web-…

— brenon (@brenondaly) February 19, 2013

Elasticsearch closes $24M series B round bit.ly/Zdz9yR

— Matt Aslett (@maslett) February 19, 2013

Three announcements from Hortonworks today: Stinger Initiative, and Tez and Knox Gateway projects. Overview: bit.ly/XvoAXD

— Matt Aslett (@maslett) February 20, 2013

Teradata updates renamed Aster Discovery Platform with new Visual SQL-MapReduce functions. bit.ly/15uS9gK

— Matt Aslett (@maslett) February 20, 2013

TransLattice launches version 3.0 of the TransLattice Elastic Database, with multi-cloud support. prn.to/131ugO7

— Matt Aslett (@maslett) February 19, 2013

Concurrent has announced Lingual, an open source project enabling SQL application development on Apache Hadoop. bit.ly/YAhi46

— Matt Aslett (@maslett) February 20, 2013

WANdisco announces Hadoop Console for enterprise Hadoop deployment and management. mwne.ws/131tArU

— Matt Aslett (@maslett) February 19, 2013

Infosys launches BigDataEdge real-time data discovery and analysis platform. prn.to/YAeQdX

— Matt Aslett (@maslett) February 20, 2013

LucidWorks integrates LucidWorks Search with MapR’s Hadoop distribution. prn.to/15uSkIF

— Matt Aslett (@maslett) February 20, 2013

AWS is building an exabyte-scale globally distributed data service. bit.ly/W38xDs Or at least looking for engineers to build one.

— Matt Aslett (@maslett) February 20, 2013

Apache #Accumulo is now available on Amazon’s Elastic MapReduce aws.amazon.com/articles/20651…

— AccumuloData (@AccumuloData) February 20, 2013

Interested in using your data skills for good? The Parkinsons Data Challenge: bit.ly/W001Fn #bigdata

— Datameer (@datameer) February 18, 2013

And that’s the data day, today.

Forthcoming webinar: Strategies for scaling MySQL

February 19th, 2013 — Data management

On February 28 at 1pm EST I’ll be taking part in a webinar, sponsored by ScaleBase, on strategies for scaling MySQL.

Scalability is one of the primary drivers we’ve seen for database users considering alternatives to traditional relational databases. That could mean adopting an entirely new database for new projects or – more likely for existing applications – looking at various strategies for improving the scalability of an existing database.

During the webinar I will be joined by Doron Levari and Paul Campaniello, both from ScaleBase, which enables applications to scale without disruption to the existing infrastructure. We’ll be discussing, amongst other things:

Scaling-out your MySQL databases
New high availability strategies
Centrally managing a distributed MySQL environment

For further details, and to register, click here.

The Data Day, Two days: February 15/18 2013

February 18th, 2013 — Data management

Redshift goes GA. Pivotal’s Google in a box. And more.

For 451 Research clients: Garantia Data goes GA with Redis and memcached cloud services bit.ly/12oy4Jg

— Matt Aslett (@maslett) February 15, 2013

For 451 clients: Context Relevant brings app focus to the business of machine learning on ‘big data’ bit.ly/12oyaAG By Krishna Roy

— Matt Aslett (@maslett) February 15, 2013

Amazon’s Redshift is now generally available. bit.ly/12oM6KU

— Matt Aslett (@maslett) February 15, 2013

QlikTech announces FY net inc of $14.6m on rev up 21% to $388.5m, Q4 net inc of $26.5m on rev up 27% to $137.5m bit.ly/12oAaZy

— Matt Aslett (@maslett) February 15, 2013

Paul Maritz Wants to Sell You ‘Google in a Box’ bit.ly/12ozwex

— Matt Aslett (@maslett) February 15, 2013

Informatica announces the Informatica Cloud Connector for Amazon Redshift. bit.ly/12oBaNn

— Matt Aslett (@maslett) February 15, 2013

Amazon Redshift and Designing for Resilience bit.ly/12oAuHM

— Matt Aslett (@maslett) February 15, 2013

HStreaming Funded to Turn Big Data Intelligence into Action – Yahoo! Finance finance.yahoo.com/news/hstreamin… via @yahoofinance

— HStreaming (@HStreaming) February 15, 2013

And that’s the data day, today.

The Data Day, Two days: February 13/14 2013

February 14th, 2013 — Data management

TempoDB’s timely DBaaS for the Internet of Things. ScaleBase 2.0. And more.

For 451 Research clients: TempoDB has timely database service for the Internet of Things bit.ly/YcQuqA

— Matt Aslett (@maslett) February 13, 2013

For 451 Research clients: ScaleBase provides centralized management of distributed MySQL databases bit.ly/YcQTcs

— Matt Aslett (@maslett) February 13, 2013

For 451 Research clients: XtremeData turns its attention to cloud-based data warehousing bit.ly/XB7MLY

— Matt Aslett (@maslett) February 14, 2013

Garantia Data’s Redis and Memcached hosting services go GA. bit.ly/X7557B

— Matt Aslett (@maslett) February 14, 2013

Talend Appoints Jim Foy as Executive Chairman bit.ly/WqwlOn

— Matt Aslett (@maslett) February 14, 2013

Amazon RDS Reduces Price of Multi-AZ Deployments amzn.to/XB7Lb6

— Matt Aslett (@maslett) February 14, 2013

Attivio has announced version 3.5 of its Active Intelligence Engine. prn.to/YcRStb

— Matt Aslett (@maslett) February 13, 2013

ParAccel introduces Right to Deploy pricing model. mwne.ws/YcSA9Z

— Matt Aslett (@maslett) February 13, 2013

VMware explains how to run DBaaS metering and chargebacks with Data Director. bit.ly/YcSaAi

— Matt Aslett (@maslett) February 13, 2013

You now have just two weeks left to participate in our 2013 451 Research Database survey bit.ly/451db13

— Matt Aslett (@maslett) February 14, 2013

And that’s the data day, today.

NoSQL on MySQL: stating the obvious

February 13th, 2013 — Data management

Some of the NoSQL vendors seem to have stirred up a mild controversy with their reactions to the launch of NoSQL access to InnoDB in MySQL 5.6 and their suggestions that NoSQL access is only a part of the NoSQL story.

Mark Leith, software development senior manager at Oracle, has described the criticism as laughable and Oracle’s director of MySQL product marketing, Mat Keep, accused the NoSQL vendors of “trying to stand on the shoulders of giants” (which is pretty ironic given we are talking about Oracle adding NoSQL capabilities to one of its databases).

In any case I don’t see what the fuss is all about.

Sure, Couchbase and DataStax laid it on a bit thick, but these are corporate blog posts – it goes with the territory.

Besides, while it might seem churlish to criticise NoSQL access to InnoDB in MySQL 5.6 for not being a document database or for not enabling masterless multi-datacenter replication, the responses are valid in the context of hyperbolic claims that “MySQL can provide the best of both worlds… You don’t have to split your data and manage two databases.”

The caveat to all these claims, and indeed probably any claim ever made in a corporate blog, is “if it suits your particular application requirement.”

Back in early 2011 when we first considered the momentum behind NoSQL development and adoption we highlighted six key drivers:

Scalability
Performance
Relaxed consistency
Agility
Intricacy
Necessity

How many of those are addressed by key value access to the InnoDB storage engine? Query performance and agility, certainly. Necessity, perhaps – but only if your application workload requires both SQL and key value access.

As we stated when Oracle first began previewing key value access to the InnoDB storage engine:

“Support for data access using the memcached API by no means alleviates the need for NoSQL alternatives, but it will provide additional flexibility and agility for existing MySQL adopters.”

I also have to agree with Couchbase that this is a point that is illustrated by the existence of Oracle’s own NoSQL Database. As we stated at the time of its launch:

“The launch of Oracle NoSQL is… a clear indication that there are trends at work here that cannot be solved by adding non-SQL querying to existing relational databases.”

And that’s really all Couchbase and DataStax are pointing out.

If you’re looking for an offering that provides direct, key value insertion and querying of data in addition to SQL-based access to relational database tables, then MySQL 5.6 is clearly a leading contender. If that’s all you’re looking for, then you could arguably forget the need to manage two databases.

That doesn’t necessarily make MySQL 5.6 suitable for use as a pure key value store, let alone a document database, or wide-column store, or graph database. If those are your requirements, MySQL 5.6 isn’t the best of any world, let alone both.
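
To make the "key value access plus SQL access to the same tables" idea concrete, here is a minimal sketch of the access pattern only. It uses Python's built-in sqlite3 module as a stand-in, not MySQL 5.6 or its memcached interface, and the KeyValueView class, the kv table and its column names are all hypothetical names chosen for illustration. Note that this facade still issues SQL internally, so it says nothing about how InnoDB's key value access is implemented or how it performs.

```python
import sqlite3


class KeyValueView:
    """Hypothetical key-value facade over a single relational table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # One table backs both access paths (assumed schema, for illustration).
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def set(self, key: str, value: str) -> None:
        # Insert or overwrite by primary key; the caller never writes SQL.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

    def get(self, key: str):
        # Look up a single value by key; returns None when the key is absent.
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")
store = KeyValueView(conn)

# Key-value style writes, including an overwrite of an existing key.
store.set("user:1", "alice")
store.set("user:2", "bob")
store.set("user:1", "alicia")

# Ordinary SQL access to the very same rows.
sql_rows = conn.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
```

If simple get/set by key plus SQL over the same rows is all an application needs, one store can serve both. The sketch also shows what the pattern does not give you: there is no document model, no wide-column layout and no graph traversal here.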

The Data Day, Two days: February 11/12 2013

February 12th, 2013 — Data management

ClearStory sheds light on data analysis service. Illuminating ‘dark data’. More.

For 451 clients: ClearStory bags $9m in series A funding, sheds light on its data analysis service bit.ly/Y6v8sV By Krishna Roy

— Matt Aslett (@maslett) February 12, 2013

For 451 clients: Global IDs makes ‘big data’ MDM play via cloud and Hadoop, touts profitable growth bit.ly/Y6v6kL By Krishna Roy

— Matt Aslett (@maslett) February 12, 2013

ScaleBase releases version 2.0 of its MySQL database scalability software bit.ly/WGtEtN

— Matt Aslett (@maslett) February 12, 2013

MarkLogic introduces free Developer License for MarkLogic Enterprise Edition and Mongo2MarkLogic converter. mwne.ws/14QiwMH

— Matt Aslett (@maslett) February 12, 2013

Illuminating “Dark Data” bit.ly/XqIwvB

— Matt Aslett (@maslett) February 11, 2013

Can Anyone Use the Name Hadoop? bit.ly/YR0hn5

— Matt Aslett (@maslett) February 11, 2013

Extremely comprehensive presentation on the considerations for deploying NoSQL, by @akmalchaudhri slidesha.re/YR1MSl

— Matt Aslett (@maslett) February 11, 2013

And that’s the data day, today.

Forthcoming webinar: How to Take Advantage of NewSQL in the Cloud

February 11th, 2013 — Data management

On February 21, at 10:00am PST / 1:00pm EST, I’ll be taking part in a webinar – How to Take Advantage of NewSQL in the Cloud – in conjunction with Clustrix.

In this free webinar I, along with Mark Sarbiewski, Clustrix CMO, will discuss:

The current cloud database inflection point – and how that affects you and your company
How to migrate your SQL database to the cloud
How to get effortless scale from your database in public or private clouds
How to ensure database availability in the cloud for business critical applications

For full details and registration, click here.
