The Data Day: January 25, 2019

The meaning and importance of #automation in data-centric environments. And more.

And that’s the Data Day, today.

The Data Day: January 11, 2019

Cloudera completes Hortonworks acquisition. And more.

And that’s the Data Day, today.

The Data Day: June 17, 2016

What happened in data and analytics this week has to be seen to be believed.

And that’s the data day, today.

The Data Day, A few days: May 14-20, 2016

Funding for ThoughtSpot and AtScale. And more.

And that’s the data day, today.

The Data Day, A few days: May 9-May 15, 2015

MarkLogic raises $102m. And more.

And that’s the data day, today.

It’s the end of NoSQL as we know it (and I feel fine)

Last week I tweeted that this week was shaping up to be a watershed week in the history of NoSQL. I was referring, of course, to MongoDB launching 3.0 and DataStax acquiring Aurelius or, more specifically, to what the context of these two announcements tells us about the future of NoSQL.

While each of these announcements could be considered significant in its own right, in combination they suggest a new stage in the evolution of NoSQL, and send a clear signal that the future of NoSQL will be driven by database products that support multiple data models.

When we formally started covering NoSQL in 2010 it made sense to divide the various projects into four groups: key value stores, distributed (wide) column stores (or BigTable clones), graph databases, and document-oriented databases.

By early 2013 it had become obvious that there was another emerging category: multi-model databases.

Multi-model NoSQL databases have therefore been around for several years. While we have seen growing interest in them, in terms of widespread adoption they have lagged behind the early specialist NoSQL databases. That’s what makes the recent announcements by MongoDB and DataStax so significant.

    1. Along with releasing version 3.0 of its document database, MongoDB also began to share (at least with us) its long-term multi-model vision for MongoDB, explaining how the pluggable storage engine architecture could enable the database to support multiple data models – such as key value, graph and relational.
    2. Meanwhile DataStax described how its acquisition of Aurelius will see it developing a graph database to complement Apache Cassandra’s wide column key value model, and explained its multi-model strategy.

    Multi-model momentum may have been growing for years but the fact that the commercial providers behind the two most popular NoSQL databases have detailed their plans to go multi-model confirms that the multi-model approach is the future of NoSQL.

    Indeed, since we expect to see similar moves from other NoSQL players, it will become increasingly difficult to divide the NoSQL space into key value stores, wide column stores, graph databases, and document-oriented databases. Instead, it makes sense to divide NoSQL projects according to whether they are single-model or multi-model.

    451 Research clients can read more about our perspectives on MongoDB’s strategic direction, as well as DataStax’s acquisition of Aurelius, and the wider implications for the NoSQL sector.

    7 Hadoop questions. Q6: Hadoop’s shortcomings

    What are the major shortcomings of Hadoop? The answer to that question looks set to shape the future development roadmap for the open source data processing framework, which is why it is one of the major questions being asked as part of our 451 Research 2013 Hadoop survey.


    The limitations of Hadoop have been widely reported over the years. As the Apache Hadoop community and related vendors have responded to issues such as reliability and high availability (not least via the now generally available Apache Hadoop 2), attention has turned to other areas such as security, administration and performance, as well as to more advanced functionality requirements, including graph processing, stream processing, improved SQL support and virtualization support.


    The list of potential improvements is therefore fairly long, and as we near the end of our survey it is interesting to see that the key advances respondents are looking for in order to increase adoption of Hadoop are spread fairly widely across that list.

    So far the responses to our Hadoop survey suggest that administration tooling and performance top the list, followed by reliability, SQL support, and backup and recovery, but development tools and authentication and access control are not far behind.

    To give your view on this and other questions related to the adoption of Hadoop, please take our 451 Research 2013 Hadoop survey.

    Forthcoming webinar: NoSQL Technology and Real-time, Accurate Predictive Analytics

    On August 29 at 10:00 am PT I’ll be taking part in a webinar in association with Objectivity entitled “Big Data: NoSQL Technology and Real-time, Accurate Predictive Analytics”.

    The webinar will provide an overview of NoSQL database technology and, in particular, the role that graph databases have in the expanding analytics market.

    I’ll be joined by Leon Guzenda, Founder, Objectivity, who will provide a brief overview of Objectivity, Inc and its products Objectivity/DB and InfiniteGraph, as well as J.C. Smart, Director Global Insight Laboratory, Georgetown University, who will explain how Georgetown University is taking advantage of Objectivity’s products to develop one of the most interconnected databases today – examining information from all types of sources worldwide in real-time.

    For full details, and to register, click here.

    Forthcoming webinar: data discovery accelerating advances in cancer treatment

    I’ll be taking part in a webinar on Thursday May 30th at 11:00am PST, in association with YarcData, on the subject of how rapid data discovery can help deal with big data challenges.

    I’ll be explaining how large scale graph analytics can harness the power of ‘big data’ and bring business-focused solutions to enterprises by exploring patterns in data to prompt new questions and enable new business insight.

    I’ll be joined by Ilya Shmulevich, who is a professor at the Institute for Systems Biology (ISB) and directs a Genome Data Analysis Center as part of The Cancer Genome Atlas (TCGA).

    Ilya will explain how ISB is using graph analytics and data discovery to accelerate advances in cancer treatment by identifying existing drugs that are candidates to be repurposed to treat cancer.

    For full details, and to register, click here.

    The Data Day, Two days: February 21/22 2013

    Aster Discovery. Delphix and SAP. Hadoop use-cases. And more.

    And that’s the data day, today.