One of the essential problems with covering the NoSQL movement is that the term describes not what the associated databases are, but what they are not (and doesn’t even do that very well, since SQL itself is in many cases orthogonal to the problem the databases are designed to solve).
It is interesting to see fellow analyst Curt Monash facing the same problem. As he notes, while there seems to be a common theme that “NoSQL is Foo without joins and transactions,” no one has adequately defined what “Foo” is.
Curt has proposed HVSP (High-Volume Simple Processing) as an alternative to NoSQL, and while I’m not jumping on the bandwagon just yet, it does pass the Ronseal test (it does what it says on the tin), and it also matches my view of what defines these distributed data store technologies.
Some observations:
There are numerous categorizations of the various NoSQL technologies available on the Internet. Without wishing to add yet another to the mix, I have created another one – more for my benefit than anything else.
It includes a list of users for the various projects (where available), and also some sense of where each project fits into CAP Theorem, an understanding of which is, to my mind, essential for understanding how and why the NoSQL/HVSP movement has emerged (look out for more on CAP Theorem in a follow-up post on alternatives to NoSQL).
Here’s my take, for those that are interested. As you can see, there’s a graph database-shaped hole in my knowledge. I’m hoping to fill that sooner rather than later.
By the way, our Spotlight report introducing The 451 Group’s formal coverage of NoSQL databases will be available here imminently.
Update: VMware has announced that it has hired Redis creator Salvatore Sanfilippo, and is taking on the Redis key value store project. The image below has been updated to reflect that, as well as the launch of NorthScale’s Membase.
9 comments ↓
>> I agree with Curt’s view that object-oriented and
>> XML databases should not be considered part of
>> this new breed of distributed data
>> store technologies.
>> There is a danger that NoSQL simply comes
>> to mean non-relational.
Well, sorry to tell you, but what the hell does XML have to do with anything? So JSON is OK but XML is not? Or… is it ’cause it’s hip???
I thought it was about CAP? If it is, then some XML databases freakin’ rock. They produce websites with petabytes of information, with ingestion rates that would make many websites blush. Aside from that, they exhibit the features (sharding, replication, high availability, strict consistency) that most NoSQL systems do, but at a completely different maturity level, as they have been doing this for years now!
Having said that, your point is?
Cool your jets there Nuno. Nothing wrong with XML databases, which can indeed provide the levels of scalability and performance you describe. But if people are trying to figure out what these new NoSQL/HVSP databases are, then in my view including everything that is not a relational database in the category just confuses the situation even more than it already is – which is a lot. I’m not altogether clear on how it benefits the XML database providers to lump themselves in with NoSQL, either, as it obscures their strengths rather than playing to them.
To me it’s fairly obvious that JSON maps quite closely to objects (Object notation) and XML to documents – real world entities that people want to digitize. I would go a little further and say that JSON is closer to the relational model than XML, but that’s a personal opinion.
So this really has nothing to do with being NoSQL or not, and that was my point!
So if any XML database tries to lump in, that’s just wrong.
I did comment on NoSQLdatabase.org that they should be aware that not all XML databases are NoSQL — just like not all JSON datastores are NoSQL.
But some databases that store XML are NoSQL as the community sees it, because they have been using the same techniques to solve the same problems. And MarkLogic (which is the one that Curt used as an example) has been doing that consistently for some years now.
There are lots of people (myself included) trying to wrap their heads around this. I quite like Nathan Hurst’s attempt to position NoSQL systems in terms of Availability, Partition Tolerance and Consistency.
It’s just a different way of diagramming the contents of your penultimate column, really, but perhaps worth a peek.
http://blog.nahurst.com/visual-guide-to-nosql-systems
Chris
Thanks Chris, I saw Nathan’s post this morning and agree it is very useful for explaining the role of CAP Theorem.
Matt,
It strikes me that excluding XML databases from a category (or should I say un-category) called “NoSQL” is illogical.
I agree that NoSQL needs definition. Perhaps unfairly, I’m OK with excluding object databases, most probably on grounds of irrelevance / asked-and-answered.
However, XQuery is most definitely not SQL and unless you want to change the name of NoSQL itself — as some do — then it should be included.
Check out the Wikipedia “structured storage” page, which includes both the “traditional” NoSQL suspects and other options, currently including XML systems.
http://en.wikipedia.org/wiki/Structured_storage
Best,
Dave
Hi Dave,
My view is that the term NoSQL is so broad that it is essentially meaningless, and that another term is required to describe the likes of Cassandra, Redis, Voldemort, Riak et al in a meaningful way. Whatever that term is – I’ll use Curt’s HVSP here for simplicity’s sake – existing XML and object databases are not HVSP databases.
[…] Mark Atwood is less than impressed with my recent statement: “Memcached is not a key value store. It is a cache. Hence the […]
[…] and Cassandra are being called column-stores with increasing frequency (e.g.here, here, and here), due to their ability to store and access column families separately. This makes them appear to be […]