Let’s be consistent about consistency – a post for the relational mind
Fans of The Princess Bride will be quite familiar with this exchange:
Vizzini: He didn’t fall? INCONCEIVABLE.
Inigo Montoya: You keep using that word. I do not think it means what you think it means.
Such is our plight with the word “consistency” as it relates to Cassandra (as well as other so-called NoSQL databases). The problem arises when you are talking to someone with a background in relational databases, someone like me. There are lots of different types of consistency, but when I first heard the term used in a NoSQL database context, my RDBMS gene kicked in. I immediately thought they were talking about “consistency” as defined in the ACID acronym: Atomicity, Consistency, Isolation, Durability.
Consistency, in the ACID sense, means that the database will guarantee the data to be in compliance with whatever set of rules and constraints were created for it. Some examples would be:
- Setting NOT NULL for a given field.
- An INTEGER field is guaranteed to not allow string data.
- A referential integrity constraint that says you cannot delete a row in Table A if Table B has records that refer to that row.
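To make the ACID sense of the word concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (table_a, table_b) and the helper function violates are made up for illustration; the point is only that the database itself refuses writes that break a declared rule. The INTEGER example is left out because SQLite's column typing is looser than that of most relational databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE table_b (id INTEGER PRIMARY KEY, "
    "a_id INTEGER NOT NULL REFERENCES table_a(id))"
)
conn.execute("INSERT INTO table_a (id, name) VALUES (1, 'parent')")
conn.execute("INSERT INTO table_b (id, a_id) VALUES (1, 1)")


def violates(sql):
    """Return True if the database rejects the statement as a rule violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# NOT NULL: a row with a missing name is refused.
null_rejected = violates("INSERT INTO table_a (id, name) VALUES (2, NULL)")
# Referential integrity: the row in table_a is still referenced by table_b.
delete_rejected = violates("DELETE FROM table_a WHERE id = 1")
remaining_rows = conn.execute("SELECT COUNT(*) FROM table_a").fetchone()[0]
print(null_rejected, delete_rejected, remaining_rows)
```

Both statements are rejected and table_a still holds its single row. That is the “C” in ACID at work: the rules hold after every statement, no matter who issued it.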
So when I would read something about eventual consistency or tunable consistency, I would think: “That’s an oxymoron. Constraints and rules must be enforced all the time, or else they are meaningless!”
Turns out that when you hear NoSQL people talking about consistency, they are most likely not talking about the “C” in ACID. What they are really talking about is “data consistency”, which largely has to do with concurrent reads. It refers to whether the data itself is going to be consistent for everyone reading from the database. With data consistency, each user sees a consistent view of the data, including visible changes made by the user’s own transactions and transactions of other users.
With that understood, let’s see what a NoSQL person is talking about regarding “eventual” and “tunable” consistency by assuming the following happened in the database:
1. A developer writes a record to the database that says “Average Temperature = 60F on 11/3/2011”
2. The next day, the developer updates the database so that “Average Temperature = 58F on 11/3/2011”
If someone queries the NoSQL database immediately after step 2 happens, and eventual consistency is in effect, then there is a chance that the user will see that the average temperature on 11/3/2011 was 60F instead of the latest value, 58F. Given enough time, however, the update will propagate to every copy of the data, and everyone will see 58F as their answer.
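The scenario above can be sketched with a toy model in Python. This is not how Cassandra is implemented; the Replica class, the three-replica setup, and the integer timestamps are all assumptions made for illustration. Each replica keeps the value with the newest timestamp, and the update from step 2 reaches the replicas at different times.

```python
class Replica:
    """One copy of the data; stores a (value, timestamp) pair per key."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value, ts):
        # Last write wins: keep whichever value has the newest timestamp.
        current = self.data.get(key)
        if current is None or ts > current[1]:
            self.data[key] = (value, ts)

    def read(self, key):
        return self.data[key][0]


replicas = [Replica(), Replica(), Replica()]
key = "avg_temp:11/3/2011"

# Step 1: the original write has had a full day to reach every replica.
for replica in replicas:
    replica.apply(key, "60F", ts=1)

# Step 2: the update lands on one replica; the other two are still in flight.
replicas[0].apply(key, "58F", ts=2)
in_flight = replicas[1:]

# A read served by any single replica right now may return the old value.
before = [replica.read(key) for replica in replicas]

# Later, the update finishes propagating to the remaining replicas.
for replica in in_flight:
    replica.apply(key, "58F", ts=2)
after = [replica.read(key) for replica in replicas]

print(before)
print(after)
```

Immediately after step 2, two of the three replicas still answer 60F, so a reader's result depends on which replica serves the request. Once propagation completes, every replica answers 58F.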
If you are a relational person, your first reaction might be “WHAT!? That’s crazy!” But if you think about it, there are lots of application scenarios where that type of thing is not a problem.
However, because it might be a problem in certain scenarios, Cassandra’s ability to offer “tunable consistency” is really the best solution to the problem. “Tunable consistency” gives the developer the choice, per operation, of whether the data should be “eventually consistent” or “strongly consistent”, and that choice can be made for both writes and reads. Strong consistency would guarantee that everyone sees 58F as the answer to their query after step 2.
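One common way to reason about this choice is the overlap rule: with N replicas, a write acknowledged by W of them and a read that consults R of them must share at least one replica whenever W + R > N. Cassandra exposes the choice through consistency levels such as ONE, QUORUM, and ALL. The sketch below is a toy model of that rule in Python, not the Cassandra driver API; the function name run and its parameters are hypothetical, and it deliberately picks the worst case for the reader.

```python
N = 3  # replicas holding the row (an assumed replication factor)


def run(write_acks, read_count):
    """Update `write_acks` replicas, then read from `read_count` replicas."""
    # Every replica starts with the value written in step 1.
    replicas = [("60F", 1) for _ in range(N)]
    # Step 2: only the first `write_acks` replicas have the update so far.
    for i in range(write_acks):
        replicas[i] = ("58F", 2)
    # Worst case for the reader: consult the replicas updated last.
    consulted = replicas[N - read_count:]
    # The reader keeps the reply that carries the newest timestamp.
    return max(consulted, key=lambda pair: pair[1])[0]


eventual = run(write_acks=1, read_count=1)  # W + R = 2, not greater than N
strong = run(write_acks=2, read_count=2)    # W + R = 4, greater than N
print(eventual, strong)
```

With one acknowledgement and one replica read, the unlucky reader still gets 60F. With two of each (a majority of three), the read set always overlaps the write set, so the reader gets 58F. The trade-off is that each operation waits on more replicas.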
Tunable consistency really is a great thing, but it is quite different from the “C” in ACID, and that’s worth understanding.
The next question is usually: “OK, so are NoSQL databases ACID compliant?” But we’ll leave that one for another post.