How to to provide a strongly consistent distributed database and not break CAP Theorem

In the months since we coined the term NewSQL we have come to define it as referring to a new breed of relational database products designed to meet scalability requirements of distributed architectures, or improve performance so horizontal scalability is no longer a necessity, while maintaining support for SQL and ACID.

During the recent round of NoSQL Road Show events it has emerged that this description could be taken to suggest that NewSQL products are able to provide consistency, availability and partition tolerance and therefore contravene the common understanding of CAP Theorem that “a distributed system can satisfy any two of these guarantees at the same time, but not all three.”

How is possible to provide strongly consistent distributed systems and not break CAP Theorem?

For a start, CAP Theorem is not that simple. As others have pointed out – Cloudera’s Henry Robinson for example – CAP Theorem isn’t simply a case of “consistency, availability, partition tolerance. Pick two.”

In fact the father of CAP Theorem, Dr Eric Brewer, has clarified that the “2 of 3” explanation is misleading: “First, because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned. Second, the choice between C and A can occur many times within the same system at very fine granularity; not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved. Finally, all three properties are more continuous than binary. Availability is obviously continuous from 0 to 100 percent, but there are also many levels of consistency, and even partitions have nuances, including disagreement within the system about whether a partition exists.”

We know that CAP is not simply a case of “pick two”, since while Amazon’s Dynamo (and the many NoSQL databases it has inspired) sacrifices consistency for availability, it does so with eventual consistency, not the total absence of consistency.

Clearly is possible to have systems that are partition tolerant, highly available and offer *a degree of consistency* (although as Fred Holahan points out, whether that degree is suitable for you particular workload is another matter).

Partition tolerance is not necessarily something that can be relaxed in the same manner – in fact the proof of CAP Theorem relies on an assumption of partition tolerance. As Yammer engineer Coda Hale explains: “Partition Tolerance is mandatory in distributed systems. You cannot not choose it.”

Daniel Abadi has previously explained how CAP is not really about choosing two of three states, but about answering the question “if there is a partition, does the system give up availability or consistency?”

Just as systems that sacrifice consistency retain a degree of consistency, Daniel also makes the point that systems that give up availability also do not do so in totality, noting that “availability is only sacrificed when there is a network partition.”

As such, Daniel makes the point that the roles of consistency and availability in CAP are asymmetric, and that latency is the forgotten factor that re-balances the equation.

Daniel has also returned to the issue of the tradeoff between latency and consistency in a more recent post, noting that, unlike availability vs consistency, “the latency vs. consistency tradeoff is present even during normal operations of the system.”

The Apache Cassandra wiki actually makes this point very well:

“The CAP theorem… states that you have to pick two of Consistency, Availability, Partition tolerance: You can’t have the three at the same time and get an acceptable latency. Cassandra values Availability and Partitioning tolerance (AP). Tradeoffs between consistency and latency are tunable in Cassandra. You can get strong consistency with Cassandra (with an increased latency).”

This suggests that you can, in fact, have consistency, partition tolerance and availability at the same time, but that latency will suffer. ScaleDB’s Mike Hogan made that argument earlier this year in describing the ‘CAP event horizon’ – “the point at which latency for a clustered system exceeds that which is acceptable and then you must decide what concessions you are willing to make”.

See also Brian Bulkowski’s explanation of how Citrusleaf can claim to deliver immediate consistency by relaxing availability in the event of partition failure: “During this period, Citrusleaf will seem less highly available – that is, latencies will be higher – until the reconfiguration completes. Transactions still flow during this period – they are queued and forwarded at different places in the client and in the servers – but the cluster has, in theoretical terms, lower availability.”

Like Citrusleaf’s ACID-compliant NoSQL database, NewSQL databases are not designed to avoid the CAP event horizon by being as available as eventually consistent systems – that *would* break CAP Theorem – but arguably they are designed to delay that CAP event horizon as much as possible by delivering systems that, in the event of a partition, are highly consistent and offer *a degree of availability*.

Whether that degree of availability is suitable for your application will depend on your tolerance – not for partitions but for latency.

Tags: , , , , , ,

3 comments ↓

#1 Fred Holahan on 12.14.11 at 3:23 pm

Matt, excellent essay, as usual.

I’ll offer a clarification to one of your comments. You said:

“Clearly is possible to have systems that are partition tolerant, highly available and offer *a degree of consistency*.”

Eventual consistency models often work adequately when data operations are commutative. When operations are non-commutative, eventual consistency is generally unhelpful.

If Amazon (or other vendor using an eventually consistent model) has a single item in inventory, and you and I order that item while on opposite sides of a partition, we will both incorrectly receive an order confirmation. Repair of that error often takes the form of a compensating transaction (e.g., your order gets filled, I get an email saying I’ve been back ordered). In this non-commutable scenario, there is NO consistency – there is a broken transaction that needs to be fixed, and that fix may have real implications (I may get pissed off for getting that email after I’d been told I successfully bought something).

So if your data operations are commutative (e.g., you’re counting things), eventual consistency may well suffice. If your operations are non-commutative (as are many business transactions), eventual consistency is essentially equivalent to no consistency.

Thus, I would modify your original statement as follows:

“Clearly is possible to have systems that are partition tolerant, highly available and MAY offer a degree of consistency WHEN DATA OPERATIONS ARE COMMUTABLE. WHEN OPERATIONS ARE NON-COMMUTABLE, THEN THERE IS NO GUARANTEED DEGREE OF CONSISTENCY*.”

To be clear, I’m in no way taking a swipe at Amazon or the NoSQL guys. My goal is only to add precision to what is already a high quality essay.

Fred Holahan
VoltDB, Inc.

#2 Fred Holahan on 12.14.11 at 3:25 pm

correction – “non-commutable” should be “non-commutative”

#3 How to to provide a strongly consistent... | Database | Syngu on 12.18.11 at 1:24 am

[…] In the months since we coined the term NewSQL we have come to define it as referring to a new breed of relational database products designed to meet scalability requirements of distributed architectures, or improve performance so horizontal scalability is no longer a necessity, while maintaining support for SQL and ACID.    Database Read the original post on DZone… […]