In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replicas. Cassandra extends the concept of eventual consistency by offering tunable consistency. For any given read or write operation, the client application decides how consistent the requested data should be.
In addition to tunable consistency, Cassandra has a number of built-in repair mechanisms to ensure that data remains consistent across replicas.
Consistency levels in Cassandra can be set to manage the trade-off between response time and data accuracy. You can set consistency on a cluster, data center, or individual I/O operation basis. Very strong or eventual data consistency among all participating nodes can be set globally and also controlled on a per-operation basis (for example, insert or update) using Cassandra’s drivers and client libraries.
When you do a write in Cassandra, the consistency level specifies on how many replicas the write must succeed before returning an acknowledgment to the client application.
The following consistency levels are available, with ANY being the lowest consistency (but highest availability), and ALL being the highest consistency (but lowest availability). QUORUM is a good middle-ground ensuring strong consistency, yet still tolerating some level of failure.
A quorum is calculated as (rounded down to a whole number):
(replication_factor / 2) + 1
For example, with a replication factor of 3, a quorum is 2 (can tolerate 1 replica down). With a replication factor of 6, a quorum is 4 (can tolerate 2 replicas down).
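The quorum arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula; the function names `quorum` and `tolerated_failures` are hypothetical and are not part of Cassandra or any driver.

```python
def quorum(replication_factor: int) -> int:
    """Return (replication_factor / 2) + 1, rounded down to a whole number."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # Integer division performs the rounding down.
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum can still respond."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (3, 6):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerates {tolerated_failures(rf)} replica(s) down")
```

For replication factors of 3 and 6 this gives quorums of 2 and 4, matching the examples above.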
| Level | Description |
|---|---|
| ANY | A write must be written to at least one node. If all replica nodes for the given row key are down, the write can still succeed once a hinted handoff has been written. Note that if all replica nodes are down at write time, an ANY write will not be readable until the replica nodes for that row key have recovered. |
| ONE | A write must be written to the commit log and memory table of at least one replica node. |
| TWO | A write must be written to the commit log and memory table of at least two replica nodes. |
| THREE | A write must be written to the commit log and memory table of at least three replica nodes. |
| QUORUM | A write must be written to the commit log and memory table on a quorum of replica nodes. |
| LOCAL_QUORUM | A write must be written to the commit log and memory table on a quorum of replica nodes in the same data center as the coordinator node. Avoids latency of inter-data center communication. |
| EACH_QUORUM | A write must be written to the commit log and memory table on a quorum of replica nodes in all data centers. |
| ALL | A write must be written to the commit log and memory table on all replica nodes in the cluster for that row key. |
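To make the table concrete, the following Python sketch maps each write consistency level to the number of acknowledgments a coordinator would wait for. It is a simplified model, not Cassandra's implementation: the function `required_acks`, the `rf_by_dc` dictionary, and the data center names are assumptions made for illustration, and the sketch assumes QUORUM is computed over the total replication factor across data centers.

```python
def quorum(n: int) -> int:
    """(n / 2) + 1, rounded down."""
    return n // 2 + 1


def required_acks(level: str, rf_by_dc: dict, coordinator_dc: str) -> int:
    """Acknowledgments needed for a write at the given consistency level.

    rf_by_dc maps a (hypothetical) data center name to its replication factor.
    """
    total_rf = sum(rf_by_dc.values())
    # For ANY, the single acknowledgment may be a stored hint
    # rather than a write to a replica.
    fixed = {"ANY": 1, "ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return quorum(total_rf)
    if level == "LOCAL_QUORUM":
        # Only replicas in the coordinator's data center count.
        return quorum(rf_by_dc[coordinator_dc])
    if level == "EACH_QUORUM":
        # A quorum is required in every data center.
        return sum(quorum(rf) for rf in rf_by_dc.values())
    if level == "ALL":
        return total_rf
    raise ValueError(f"unknown consistency level: {level}")


if __name__ == "__main__":
    topology = {"dc1": 3, "dc2": 3}  # illustrative two data center cluster
    for lvl in ("ANY", "ONE", "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL"):
        print(lvl, required_acks(lvl, topology, "dc1"))
```

With three replicas in each of two data centers, LOCAL_QUORUM waits for 2 acknowledgments while QUORUM and EACH_QUORUM both wait for 4. The difference is that EACH_QUORUM requires 2 from each data center.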
When you do a read in Cassandra, the consistency level specifies how many replicas must respond before a result is returned to the client application.
Cassandra checks the specified number of replicas for the most recent data to satisfy the read request (based on the timestamp).
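As a rough illustration of timestamp-based resolution, the Python sketch below picks the most recent value among the replica responses collected for a read. The `ReplicaResponse` type and `resolve_read` function are hypothetical names for this example, and the model ignores details such as per-column timestamps and read repair.

```python
from typing import List, NamedTuple


class ReplicaResponse(NamedTuple):
    replica: str    # illustrative replica name
    value: str      # the data returned by that replica
    timestamp: int  # write timestamp attached to the data


def resolve_read(responses: List[ReplicaResponse], required: int) -> ReplicaResponse:
    """Return the response with the most recent timestamp.

    Raises if fewer replicas responded than the consistency level requires.
    """
    if len(responses) < required:
        raise RuntimeError(
            f"only {len(responses)} of {required} required replicas responded"
        )
    return max(responses, key=lambda r: r.timestamp)


if __name__ == "__main__":
    responses = [
        ReplicaResponse("replica_a", "old value", timestamp=100),
        ReplicaResponse("replica_b", "new value", timestamp=200),
    ]
    # A QUORUM read with a replication factor of 3 needs 2 responses.
    print(resolve_read(responses, required=2).value)
```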
The following consistency levels are available, with ONE being the lowest consistency (but highest availability), and ALL being the highest consistency (but lowest availability). QUORUM is a good middle-ground ensuring strong consistency, yet still tolerating some level of failure.
A quorum is calculated as (rounded down to a whole number):
(replication_factor / 2) + 1
For example, with a replication factor of 3, a quorum is 2 (can tolerate 1 replica down). With a replication factor of 6, a quorum is 4 (can tolerate 2 replicas down).
| Level | Description |
|---|---|
| ONE | Returns a response from the closest replica (as determined by the snitch). By default, a read repair runs in the background to make the other replicas consistent. |
| TWO | Returns the most recent data from two of the closest replicas. |
| THREE | Returns the most recent data from three of the closest replicas. |
| QUORUM | Returns the record with the most recent timestamp once a quorum of replicas has responded. |
| LOCAL_QUORUM | Returns the record with the most recent timestamp once a quorum of replicas in the same data center as the coordinator node has responded. Avoids latency of inter-data center communication. |
| EACH_QUORUM | Returns the record with the most recent timestamp once a quorum of replicas in each data center of the cluster has responded. |
| ALL | Returns the record with the most recent timestamp once all replicas have responded. The read operation will fail if a replica does not respond. |
Note
LOCAL_QUORUM and EACH_QUORUM are designed for use in multiple data center clusters using a rack-aware replica placement strategy (such as NetworkTopologyStrategy) and a properly configured snitch. These consistency levels will fail when using SimpleStrategy.
Choosing a consistency level involves determining your requirements for consistent results (always reading the most recently written data) versus read or write latency (the time it takes for the requested data to be returned or for the write to succeed).
If latency is a top priority, consider a consistency level of ONE (only one replica node must successfully respond to the read or write request). There is a higher probability of stale data being read with this consistency level (as the replicas contacted for reads may not always have the most recent write). For some applications, this may be an acceptable trade-off. If it is an absolute requirement that a write never fail, you may also consider a write consistency level of ANY. This consistency level has the highest probability of a read not returning the latest written values (see hinted handoff).
If consistency is a top priority, you can ensure that a read always reflects the most recent write by choosing read and write consistency levels that satisfy the following formula:
(nodes_written + nodes_read) > replication_factor
For example, if your application is using the QUORUM consistency level for both write and read operations and you are using a replication factor of 3, then this ensures that 2 nodes are always written and 2 nodes are always read. The combination of nodes written and read (4) being greater than the replication factor (3) ensures strong read consistency.
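The overlap rule can be checked with a short Python sketch. The function name `is_strongly_consistent` is made up for this example. It only evaluates the inequality above for a given pair of read and write replica counts.

```python
def is_strongly_consistent(nodes_written: int, nodes_read: int,
                           replication_factor: int) -> bool:
    """True when (nodes_written + nodes_read) > replication_factor.

    When this holds, every read contacts at least one replica
    that acknowledged the most recent write.
    """
    return nodes_written + nodes_read > replication_factor


if __name__ == "__main__":
    rf = 3
    quorum = rf // 2 + 1  # 2 when rf is 3
    # QUORUM writes and QUORUM reads: 2 + 2 > 3
    print("QUORUM/QUORUM:", is_strongly_consistent(quorum, quorum, rf))
    # ONE writes and ONE reads: 1 + 1 is not greater than 3
    print("ONE/ONE:", is_strongly_consistent(1, 1, rf))
    # ONE writes and ALL reads: 1 + 3 > 3
    print("ONE/ALL:", is_strongly_consistent(1, rf, rf))
```

With a replication factor of 3, writing and reading at ONE does not satisfy the rule, whereas pairing ONE writes with ALL reads does, at the cost of reads failing whenever any replica is down.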
Ideally, you want a client request to be served by replicas in the same data center in order to avoid latency. Contacting multiple data centers for a read or write request can slow down the response. The consistency level LOCAL_QUORUM is specifically designed for doing quorum reads and writes in multi data center clusters.
A consistency level of ONE is also fine for applications with less stringent consistency requirements. A majority of Cassandra users do writes at consistency level ONE. With this consistency, the request will always be served by the replica node closest to the coordinator node that received the request (unless the dynamic snitch determines that the node is performing poorly and routes it elsewhere).
Keep in mind that even at consistency level ONE or LOCAL_QUORUM, the write is still sent to all replicas for the written key, even replicas in other data centers. The consistency level just determines how many replicas are required to respond that they received the write.
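The distinction between replicas contacted and acknowledgments required can be modeled with a small Python sketch. This is a toy simulation under simplifying assumptions (a replica either acknowledges or is down, and there are no timeouts or hints); `coordinate_write` and the replica names are hypothetical.

```python
from typing import Dict


def coordinate_write(replica_up: Dict[str, bool], required_acks: int) -> dict:
    """Model a coordinator handling one write.

    replica_up maps each replica for the key to whether it is reachable.
    The write is sent to every replica; the consistency level only sets
    how many acknowledgments are needed before reporting success.
    """
    contacted = list(replica_up)  # all replicas are sent the write
    acked = [name for name, up in replica_up.items() if up]
    return {
        "contacted": contacted,
        "acked": acked,
        "success": len(acked) >= required_acks,
    }


if __name__ == "__main__":
    replicas = {"dc1_a": True, "dc1_b": False, "dc2_a": True}
    # Consistency level ONE: a single acknowledgment is enough.
    print(coordinate_write(replicas, required_acks=1))
    # Consistency level ALL: fails because one replica is down.
    print(coordinate_write(replicas, required_acks=3))
```

In both calls all three replicas are contacted. Only the success condition changes with the consistency level.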
You can use a new cqlsh command, CONSISTENCY, to set the consistency level for subsequent requests in the cqlsh session. The WITH CONSISTENCY clause has been removed from CQL 3 commands in the release version of CQL 3. Programmatically, set the consistency level at the driver level. For example, call execute_cql3_query with the required binary query, the compression settings, and the consistency level.
Cassandra has a number of built-in repair features to ensure that data remains consistent across replicas. These features are: