SPOF 0: Why Every Node in a Cassandra Cluster is the Same
date: November 11, 2010
In a distributed system, component failure should be expected, particularly as the size of the infrastructure grows. Single points of failure in systems marketing themselves as distributed are unfortunately quite common. Systems using master-slave replication and/or mixed-master fail-over schemes sometimes fall into the "distributed" category, but these techniques are fragile and impose unnecessary limits on scalability (usually in the form of hardware interconnection complexity and geographic proximity).
Cassandra, as a horizontally scalable distributed database, is designed in such a way that all nodes serve the same role - that of a primary replica for a portion of the data. This design is discussed in detail in the Amazon Dynamo paper (which is well written and quite accessible considering the subject matter) but we'll highlight a few of the benefits related to this subject below.
First, let's explore the term primary replica mentioned above. Though the node is the "primary" for a portion of the data in the cluster, the number of copies of the data kept on other nodes in the cluster is configurable. When a node goes down, the other nodes containing copies, referred to as "replicas", continue to service read requests and will even accept writes for the down node. When the node returns, these queued up writes are sent from the replicas to bring the node back up to date (you can find more detail on this process, known as hinted handoff, and Cassandra's implementation of such here: http://wiki.apache.org/cassandra/HintedHandoff).
Another benefit of this design is the ease of which new nodes can be added. When a new node is brought in to the cluster, it can take over a portion of the data from existing nodes, relieving them of the responsibility for that range of data. Because all nodes are the same, this communication can happen seamlessly in a running cluster with the nodes exchanging messages to one another and the rest of the cluster as needed.
Having all nodes share the same role also streamlines operations and systems administrations tasks as well. Because Cassandra has a single node type, it has only a single set of requirements for hardware, for monitoring, and deployment.
By having all nodes share the same role, Cassandra facilitates true distributed systems behavior by removing any single point of failure. As a result of designing a system from the ground up to account for this, management, reliability and scalability all benefit.