DataStax Blog

Let’s be consistent about consistency – a post for the relational mind

By Billy Bosworth -  November 4, 2011 | 3 Comments

Fans of The Princess Bride will be quite familiar with this exchange:

Vizzini: He didn’t fall? INCONCEIVABLE.
Inigo Montoya: You keep using that word. I do not think it means what you think it means.

Such is our plight with the word “consistency” as it relates to Cassandra (as well as other so-called NoSQL databases).  The problem arises when you are talking to someone with a background in relational databases — someone like me.  There are lots of different types of consistency, but when I first heard the term used in a NoSQL database context, my RDBMS gene kicked in.  I immediately thought they were talking about “consistency” as defined in the ACID acronym: Atomicity Consistency Isolation Durability.

Consistency, in the ACID sense, means that the database will guarantee the data to be in compliance with whatever set of rules and constraints were created for it. Some examples would be:

  • A NOT NULL constraint that rejects missing values for a given field.
  • An INTEGER field that is guaranteed to reject string data.
  • A referential integrity constraint that says you cannot delete a row in Table A if Table B has records that refer to that row.
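
For the relational mind, here is a minimal sketch of those three rules using Python's built-in sqlite3 module. The station and reading tables, their columns, and the violates helper are hypothetical names made up for illustration. SQLite is loosely typed by default, so the sketch assumes a CHECK on typeof() to stand in for the strict INTEGER typing most relational databases apply on their own.

```python
import sqlite3

def violates(conn, sql):
    """Return True if the database rejects the statement with a constraint error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE reading (
        station_id INTEGER NOT NULL REFERENCES station(id),
        temp_f     INTEGER NOT NULL CHECK (typeof(temp_f) = 'integer')
    )
""")
conn.execute("INSERT INTO station (id, name) VALUES (1, 'Austin')")
conn.execute("INSERT INTO reading (station_id, temp_f) VALUES (1, 60)")

# Rule 1: NOT NULL refuses a missing name.
null_rejected = violates(conn, "INSERT INTO station (id, name) VALUES (2, NULL)")
# Rule 2: the integer column refuses string data.
string_rejected = violates(conn, "INSERT INTO reading (station_id, temp_f) VALUES (1, 'warm')")
# Rule 3: a station cannot be deleted while a reading still refers to it.
delete_rejected = violates(conn, "DELETE FROM station WHERE id = 1")

print(null_rejected, string_rejected, delete_rejected)
```

In each case the database itself refuses the change, which is the sense of the "C" in ACID: the rules hold no matter which application writes the data.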

So when I would read something about eventual consistency or tunable consistency, I would think: “That’s an oxymoron.  Consistency for the constraints and rules must be consistent, all the time, else it is meaningless!”

Turns out that when you hear NoSQL people talking about consistency, they are most likely not talking about the “C” in ACID.  What they are really talking about is “data consistency”, which largely has to do with concurrent reads: whether the data itself is going to be consistent for everyone reading from the database.  With data consistency, each user sees a consistent view of the data, including visible changes made by the user’s own transactions and transactions of other users.

With that understood, let’s see what a NoSQL person is talking about regarding “eventual” and “tunable” consistency by assuming the following happened in the database:

  1. A developer writes a record to the database that says “Average Temperature = 60F on 11/3/2011”
  2. The next day, the developer updates the database so that “Average Temperature = 58F on 11/3/2011”

If someone queries the NoSQL database immediately after step 2 happens, and eventual consistency is in effect, then there is a chance that the user will see that the average temperature on 11/3/2011 was 60F, not 58F, which represents the latest data.  However, the data will “eventually” propagate to every copy, so that everyone sees 58F as their answer.
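
To make the timing concrete, here is a toy model of that scenario in Python. It is only a sketch under stated assumptions: the Replica class, the three-copy setup, the integer timestamps, and the "newest timestamp wins" rule are all invented for illustration and are not how any particular database is implemented.

```python
class Replica:
    """One copy of the data. Hypothetical toy class, not a real database node."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        # Keep whichever value carries the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        return self.data[key][1]


replicas = [Replica(), Replica(), Replica()]
key = "avg_temp:2011-11-03"

# Step 1: the original write has had a full day to reach every copy.
for replica in replicas:
    replica.apply(key, 1, "60F")

# Step 2: the update lands on one copy; delivery to the others is still pending.
replicas[0].apply(key, 2, "58F")
pending = [(replica, key, 2, "58F") for replica in replicas[1:]]

fresh_read = replicas[0].read(key)  # this reader happens to hit the updated copy
stale_read = replicas[2].read(key)  # this reader hits a copy that is lagging

# Eventually: background propagation delivers the update everywhere.
for replica, k, timestamp, value in pending:
    replica.apply(k, timestamp, value)
converged = [replica.read(key) for replica in replicas]

print(fresh_read, stale_read, converged)
```

Both readers asked the same question at the same moment and got different answers. Nothing was lost or corrupted; one copy simply had not heard the news yet.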

If you are a relational person, your first reaction might be “WHAT!?  That’s crazy!”  But if you think about it, there are lots of application scenarios where that type of thing is not a problem.

However, because it might be a problem in certain scenarios, Cassandra’s ability to offer “tunable consistency” is really the best solution to the problem.  “Tunable consistency” gives the developer the choice, per operation, of whether they want their data to be “eventually consistent” or “strongly consistent”, and the choice can be made for both writes and reads.  Strong consistency would guarantee that everyone sees 58F as the answer to their query after step 2.
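
One way to picture the tuning knob is as a count of how many copies must take part in each write and each read. The sketch below is a toy model, not Cassandra's driver API: the level names ONE, QUORUM, and ALL mirror Cassandra's consistency level names, but the write, read, and scenario functions, the three-copy setup, and the integer timestamps are assumptions made for illustration. The reader is deliberately routed to the copies least likely to have seen the update, and the toy has no background propagation, so it shows the worst case.

```python
N = 3  # copies kept of each record in this toy model
LEVELS = {"ONE": 1, "QUORUM": N // 2 + 1, "ALL": N}
KEY = "avg_temp:2011-11-03"

def write(replicas, key, timestamp, value, level):
    # Update only as many copies as the chosen level requires before returning.
    for replica in replicas[:LEVELS[level]]:
        replica[key] = (timestamp, value)

def read(replicas, key, level):
    # Worst case: consult the copies at the far end of the list,
    # then return the value with the newest timestamp among them.
    consulted = replicas[-LEVELS[level]:]
    return max(replica[key] for replica in consulted)[1]

def scenario(write_level, read_level):
    replicas = [{} for _ in range(N)]
    write(replicas, KEY, 1, "60F", "ALL")        # step 1, long since on every copy
    write(replicas, KEY, 2, "58F", write_level)  # step 2, the update
    return read(replicas, KEY, read_level)       # a query right after step 2

eventual = scenario("ONE", "ONE")            # 1 + 1 <= 3: the sets need not overlap
strong = scenario("QUORUM", "QUORUM")        # 2 + 2 > 3: the sets must overlap
write_heavy = scenario("ALL", "ONE")         # 3 + 1 > 3
read_heavy = scenario("ONE", "ALL")          # 1 + 3 > 3

print(eventual, strong, write_heavy, read_heavy)
```

The pattern to notice is that the read is guaranteed to see the latest write whenever the number of copies written plus the number of copies read is greater than the total number of copies, because at least one copy must then be in both sets. Lower levels trade that guarantee for less waiting.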

Tunable consistency really is a great thing, but it is quite different from the “C” in ACID, and that’s worth understanding.

The next question is usually: “OK, so are NoSQL databases ACID compliant?”  But we’ll leave that one for another post.



Comments

  1. Amrith Kumar says:

    Billy,

    Great point(s). Yes, there is little point in calling a system eventually consistent and providing no way to quantify when consistency will be achieved.

    For that reason, systems define consistency windows and attempt to achieve consistency in a windowed manner. For more details, see my blog post on the subject.

    In that context, a NoSQL database is not ACID compliant because ACID defines “Consistency” to be “Perfectly Consistent” or in the terms of my earlier referenced post, Tc = 0, where all changes are totally ordered and meet the criteria of linearizability (M. Herlihy and J. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, Volume 12, (1990) pg 463-492.)

    Thx,

    -amrith

  2. Any time you can work a Princess Bride quotation into a blog post, it’s a winner for me. I must admit I wasn’t very clear on ‘tunable consistency’, and I’d like to read more posts with some more use cases. The use case I go to with eventual consistency is ‘is anyone going to die if my new Facebook status doesn’t roll out at the same time to everyone in my network?’, but maybe that’s a case for ‘strong consistency’ and not ‘eventual consistency’?

  3. Jeff Darcy says:

    Yep. The C in NoSQL is usually the C in Brewer’s CAP Conjecture (I no longer consider Gilbert and Lynch’s Theorem relevant), which is mostly about the A and the I in ACID. Knowing that helps clear up a lot of the confusion.
