Replication is the process of storing copies of data on multiple nodes to ensure reliability and fault tolerance. When you create a keyspace in Cassandra, you must decide the replica placement strategy: the number of replicas and how those replicas are distributed across nodes in the cluster. The replication strategy relies on the cluster-configured snitch to help it determine the physical location of nodes and their proximity to each other.
The total number of replicas across the cluster is often referred to as the replication factor. A replication factor of 1 means that there is only one copy of each row. A replication factor of 2 means two copies of each row. All replicas are equally important; there is no primary or master replica in terms of how read and write requests are handled.
As a general rule, the replication factor should not exceed the number of nodes in the cluster. However, you can increase the replication factor first and add the desired number of nodes afterwards. While the replication factor exceeds the number of nodes, writes are rejected, but reads are served as long as the desired consistency level can be met.
The replica placement strategy determines how replicas for a keyspace are distributed across the cluster. The replica placement strategy is set when you create a keyspace.
There are a number of strategies to choose from based on your goals and the information you have about where nodes are located.
SimpleStrategy is the default replica placement strategy when creating a keyspace using the Cassandra CLI. Other interfaces, such as the CQL utility, require you to explicitly specify a strategy.
SimpleStrategy places the first replica on a node determined by the partitioner. Additional replicas are placed on the next nodes clockwise in the ring without considering rack or data center location.
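The placement rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Cassandra's implementation: the ring is modeled as a sorted list of (token, node) pairs, and the function name, node names, and token values are all hypothetical.

```python
from bisect import bisect_left


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Return the nodes that would hold replicas for key_token.

    ring: list of (token, node_name) tuples, sorted by token (hypothetical model).
    """
    tokens = [token for token, _ in ring]
    # First replica: the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, key_token) % len(ring)
    # A node can hold at most one replica, so cap the count at the ring size.
    count = min(replication_factor, len(ring))
    # Additional replicas: the next nodes clockwise, ignoring rack and data center.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
print(simple_strategy_replicas(ring, key_token=30, replication_factor=2))
# ['node-c', 'node-d']
print(simple_strategy_replicas(ring, key_token=80, replication_factor=3))
# ['node-a', 'node-b', 'node-c']
```

The second call shows the wrap-around: a token past the last node in the ring lands on the first node.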
NetworkTopologyStrategy is the preferred replica placement strategy when you have information about how nodes are grouped in your data center, or you have (or plan to have) your cluster deployed across multiple data centers. This strategy allows you to specify how many replicas you want in each data center.
When deciding how many replicas to configure in each data center, the primary considerations are (1) being able to satisfy reads locally, without incurring cross-datacenter latency, and (2) failure scenarios.
The two most common ways to configure multiple data center clusters are:
Asymmetrical replication groupings are also possible depending on your use case. For example, you may want to have three replicas per data center to serve real-time application requests, and then have a single replica in a separate data center designated for running analytics. In Cassandra, the term data center is synonymous with replication group: it is a grouping of nodes configured together for replication purposes. It does not have to be a physical data center.
With NetworkTopologyStrategy, replica placement is determined independently within each data center (or replication group). The first replica per data center is placed according to the partitioner (same as with SimpleStrategy). Additional replicas in the same data center are then determined by walking the ring clockwise until a node in a different rack from the previous replica is found. If there is no such node, additional replicas will be placed in the same rack. NetworkTopologyStrategy prefers to place replicas on distinct racks if possible. Nodes in the same rack (or similar physical grouping) can easily fail at the same time due to power, cooling, or network issues.
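The per-data-center walk described above can be sketched as follows. This is a simplified model under stated assumptions, not Cassandra's actual code: the ring is a sorted list of (token, node, data center, rack) tuples, each additional replica is the next remaining node clockwise from the key's position whose rack differs from the previous replica's rack, and all names and tokens are hypothetical.

```python
from bisect import bisect_left


def nts_replicas(ring, key_token, replicas_per_dc):
    """ring: list of (token, node, dc, rack) tuples sorted by token.
    replicas_per_dc: dict mapping data center name -> desired replica count."""
    tokens = [entry[0] for entry in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    # All nodes in clockwise order, starting at the node that owns the key's token.
    clockwise = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    placement = {}
    for dc, wanted in replicas_per_dc.items():
        # Placement is decided independently within each data center.
        candidates = [(node, rack) for _, node, node_dc, rack in clockwise
                      if node_dc == dc]
        chosen = []
        while candidates and len(chosen) < wanted:
            if not chosen:
                # First replica in this data center: first node clockwise.
                pick = candidates[0]
            else:
                previous_rack = chosen[-1][1]
                # Prefer the next node on a different rack; if none is left,
                # fall back to the next node on the same rack.
                pick = next((c for c in candidates if c[1] != previous_rack),
                            candidates[0])
            chosen.append(pick)
            candidates.remove(pick)
        placement[dc] = [node for node, _ in chosen]
    return placement


ring = [
    (0, "n1", "DC1", "RAC1"), (10, "n2", "DC2", "RAC1"),
    (25, "n3", "DC1", "RAC2"), (35, "n4", "DC2", "RAC2"),
    (50, "n5", "DC1", "RAC1"), (60, "n6", "DC2", "RAC1"),
    (75, "n7", "DC1", "RAC2"), (85, "n8", "DC2", "RAC2"),
]
print(nts_replicas(ring, key_token=30, replicas_per_dc={"DC1": 2, "DC2": 2}))
# {'DC1': ['n5', 'n7'], 'DC2': ['n4', 'n6']}
```

Because the sample ring alternates racks within each data center, the two replicas in each data center end up on different racks without skipping any node.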
Here is an example of how NetworkTopologyStrategy would place replicas spanning two data centers with a total replication factor of 4 (two replicas in Data Center 1 and two replicas in Data Center 2):
Notice how tokens are assigned to alternating racks.
NetworkTopologyStrategy relies on a properly configured snitch to place replicas correctly across data centers and racks, so it is important to configure your cluster to use a snitch that can correctly determine the locations of nodes in your network.
Note
NetworkTopologyStrategy should be used in place of the OldNetworkTopologyStrategy, which only supports a limited configuration of exactly 3 replicas across 2 data centers, with no control over which data center gets two replicas for any given row key. Some rows will have two replicas in the first and one in the second, while others will have two in the second and one in the first.
The snitch is a configurable component of a Cassandra cluster used to define how the nodes are grouped together within the overall network topology (such as rack and data center groupings). Cassandra uses this information to route inter-node requests as efficiently as possible within the confines of the replica placement strategy. The snitch does not affect requests between the client application and Cassandra (it does not control which node a client connects to).
Snitches are configured for a Cassandra cluster in the cassandra.yaml configuration file. All nodes in a cluster should use the same snitch configuration. When assigning tokens, assign them to alternating racks. For example: rack1, rack2, rack3, rack1, rack2, rack3, and so on.
The following snitches are available:
The SimpleSnitch (the default) is appropriate if you have no rack or data center information available. Single-data center deployments (or single-zone in public clouds) usually fall into this category.
If using this snitch, use replication_factor=<#> when defining your keyspace strategy_options. This snitch does not recognize data center or rack information.
DseSimpleSnitch is used in DataStax Enterprise (DSE) deployments only. It logically configures Hadoop analytics nodes in a separate data center from pure Cassandra nodes in order to segregate analytic and real-time workloads. It can be used for mixed-workload DSE clusters located in one physical data center. It can also be used for multi-data center DSE clusters that have exactly 2 data centers, with all analytic nodes in one data center and all Cassandra real-time nodes in another data center.
If using this snitch, use Analytics or Cassandra as your data center names when defining your keyspace strategy_options.
RackInferringSnitch infers the topology of the network by analyzing the node IP addresses. This snitch assumes that the second octet identifies the data center where a node is located, and the third octet identifies the rack.
If using this snitch, use the second octet number of your node IPs as your data center names when defining your keyspace strategy_options. For example, 100 would be the data center name.
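The octet convention can be illustrated with a short Python sketch. The function name and the sample address are hypothetical; the code only mirrors the rule stated above (second octet is the data center, third octet is the rack) and is not the snitch's own implementation.

```python
def infer_location(ip_address):
    """Derive data center and rack names from an IPv4 address string."""
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip_address!r}")
    # Second octet names the data center, third octet names the rack.
    return {"data_center": octets[1], "rack": octets[2]}


print(infer_location("110.100.200.105"))
# {'data_center': '100', 'rack': '200'}
```

For this sample address, 100 is the name you would use for the data center in strategy_options.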
PropertyFileSnitch determines the location of nodes by referring to a user-defined description of the network details located in the property file cassandra-topology.properties. This snitch is best when your node IPs are not uniform or you have complex replication grouping requirements. See Configuring the PropertyFileSnitch for more information.
If using this snitch, you can define your data center names to be whatever you want. Just make sure the data center names you define in the cassandra-topology.properties file match the names you give your data centers in your keyspace strategy_options.
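A quick way to catch a naming mismatch is to compare the two sets of names before creating the keyspace. The Python sketch below assumes each line of the properties file has the form address=data_center:rack, with an optional default entry. The helper names, sample addresses, and data center names are hypothetical, and the parsing is deliberately minimal.

```python
def parse_topology(text):
    """Map each node address (or 'default') to a (data_center, rack) pair."""
    topology = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        address, _, location = line.partition("=")
        data_center, _, rack = location.partition(":")
        topology[address.strip()] = (data_center.strip(), rack.strip())
    return topology


def unknown_data_centers(topology, strategy_options):
    """Return data center names used in strategy_options but absent from the file."""
    known = {data_center for data_center, _ in topology.values()}
    return sorted(name for name in strategy_options if name not in known)


SAMPLE = """\
# Illustrative addresses only
175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC2
120.53.24.101=DC2:RAC1
default=DC1:RAC1
"""

topology = parse_topology(SAMPLE)
print(topology["120.53.24.101"])
# ('DC2', 'RAC1')
print(unknown_data_centers(topology, {"DC1": 3, "DC3": 2}))
# ['DC3']
```

An empty list from unknown_data_centers means every data center named in strategy_options is defined in the topology file.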
EC2Snitch is for simple cluster deployments on Amazon EC2 where all nodes in the cluster are within the same region. Instead of using the node’s IP address to infer node location, this snitch uses the AWS API to request the region and availability zone of a node. The region is treated as the data center and the availability zones are treated as racks within the data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location.
If using this snitch, use the EC2 region name (for example, us-east) as your data center name when defining your keyspace strategy_options.
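The naming convention in the example above amounts to splitting the availability zone at its last hyphen. The Python sketch below shows only that string split; the function name is hypothetical, and it does not call the AWS API or reproduce the snitch's own code.

```python
def split_availability_zone(availability_zone):
    """Split an availability zone such as 'us-east-1a' into (data_center, rack)."""
    data_center, separator, rack = availability_zone.rpartition("-")
    if not separator or not data_center or not rack:
        raise ValueError(f"unexpected availability zone: {availability_zone!r}")
    return data_center, rack


print(split_availability_zone("us-east-1a"))
# ('us-east', '1a')
```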
EC2MultiRegionSnitch is for cluster deployments on Amazon EC2 where the cluster spans multiple regions. Instead of using the node’s IP address to infer node location, this snitch uses the AWS API to request the region and availability zone of a node. Regions are treated as data centers and availability zones are treated as racks within a data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location.
If using this snitch, you must configure each Cassandra node so that the listen_address is set to the private IP address of the node, and the broadcast_address is set to the public IP address of the node. This allows Cassandra nodes in one EC2 region to bind to nodes in another region, thus enabling multi-data center support. Additionally, you must set the addresses of the seed nodes in the cassandra.yaml file to their public IPs, because private IPs are not routable between networks. For example: seeds: 50.34.16.33, 60.247.70.52. To find the public IP address, run this command from each of the seed nodes in EC2:
curl http://instance-data/latest/meta-data/public-ipv4
If using this snitch, use the EC2 region names (for example, us-east) as your data center names when defining your keyspace strategy_options.
All snitches also use a dynamic snitch layer that monitors read latency and, when possible, routes requests away from poorly performing nodes. The dynamic snitch is enabled by default and is recommended for most deployments.
Dynamic snitch thresholds can be configured in the cassandra.yaml configuration file for a node.
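The general idea of latency-aware routing can be illustrated with a toy Python model. This is not Cassandra's scoring algorithm and does not model its configurable thresholds: the class name, window size, and the choice of a plain mean over recent samples are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean


class LatencyTracker:
    """Keeps a sliding window of read latencies per node (hypothetical model)."""

    def __init__(self, window=100):
        # Each node keeps only its most recent `window` samples.
        self._samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, node, latency_ms):
        self._samples[node].append(latency_ms)

    def rank(self, replicas):
        """Order replicas from lowest to highest mean recent latency."""
        def score(node):
            samples = self._samples.get(node)
            # Nodes without samples score 0.0 so they are tried and measured.
            return mean(samples) if samples else 0.0
        return sorted(replicas, key=score)


tracker = LatencyTracker()
for node, latency in [("n1", 5), ("n1", 7), ("n2", 50), ("n2", 60), ("n3", 2), ("n3", 3)]:
    tracker.record(node, latency)

print(tracker.rank(["n1", "n2", "n3"]))
# ['n3', 'n1', 'n2']
```

A coordinator using a ranking like this would send reads to the replicas at the front of the list, which steers traffic away from the slow node n2.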