Cassandra performs replication to store multiple copies of data on multiple nodes for reliability and fault tolerance. You need to choose a data partitioner and replica placement strategy to configure replication. Data partitioning determines how to place the data across the nodes in the cluster. Choosing a partitioner determines on which node to place the first copy of data. Choosing the replica placement strategy determines which nodes get additional copies of data.
Nodes exchange replication and other cluster state information with each other using the gossip protocol. This section covers the gossip settings and other node configuration information. It also covers how to change a replication setting.
When you deploy a Cassandra cluster, make sure that each node is responsible for roughly an equal amount of data. To accomplish this load balancing, configure the partitioner for each node, and assign the node an initial_token value.
DataStax strongly recommends using the RandomPartitioner (the default) for all cluster deployments. Assuming use of this partitioner, each node in the cluster is assigned a token that represents a hash value within the range of 0 to 2^127 - 1.
You can calculate tokens for a cluster having nodes in a single data center by dividing the range by the total number of nodes in the cluster. In multiple data center deployments, tokens should be calculated such that each data center is individually load balanced as well. Partition each data center as if it were its own distinct ring. See Generating Tokens for the different approaches to generating tokens for nodes in single and multiple data center clusters.
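The division described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool referred to in Generating Tokens: the function name `generate_tokens` and its `offset` parameter are hypothetical. The offset is one assumed way to keep tokens unique when each data center is partitioned as its own ring, since two rings of equal size would otherwise produce identical values.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner tokens fall in the range 0 to 2**127 - 1


def generate_tokens(node_count, offset=0):
    """Return evenly spaced initial_token values for one ring.

    offset is a hypothetical parameter that shifts every token, so a second
    data center partitioned as its own ring does not reuse the same values.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]


if __name__ == "__main__":
    # Two data centers of four nodes each; the offset of 100 is arbitrary.
    for dc_name, nodes, offset in [("DC1", 4, 0), ("DC2", 4, 100)]:
        for index, token in enumerate(generate_tokens(nodes, offset)):
            print(f"{dc_name} node {index}: initial_token: {token}")
```

Each printed value would go into the initial_token property of the corresponding node.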
The snitch is responsible for knowing the location of nodes within your network topology. The location of nodes affects where replicas are placed and how requests are routed between replicas. All nodes must have exactly the same snitch configuration. The endpoint_snitch property configures the snitch for a node.
In cassandra.yaml, the snitch is set to the DSE Delegated Snitch (endpoint_snitch: com.datastax.bdp.snitch.DseDelegateSnitch). The Delegated Snitch is used to implement Elastic Workload Re-provisioning. The following sections describe a few commonly-used snitches. All snitches are described in the Apache Cassandra documentation.
In DataStax Enterprise, the default delegated snitch is the DseSimpleSnitch (org.apache.cassandra.locator.DseSimpleSnitch), located in:
DseSimpleSnitch is used only for DataStax Enterprise (DSE) deployments. To segregate analytics and real-time workloads, this snitch logically configures Hadoop analytics nodes in a separate data center from Cassandra real-time nodes. Use DseSimpleSnitch for mixed-workload DSE clusters located in one physical data center or for multiple data center DSE clusters that have exactly two data centers: one with all Analytics nodes and the other with all Cassandra real-time nodes.
When defining your keyspace strategy_options, use Analytics or Cassandra for your data center names. You use the delegated snitch for re-provisioning the workload of a cluster.
For a single data center (or single node) cluster, using SimpleSnitch is usually sufficient. However, if you plan to expand your cluster later to multiple racks and data centers, it is easier to choose a rack-aware and data center-aware snitch from the start, such as the RackInferringSnitch. All snitches are compatible with all placement strategies.
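The RackInferringSnitch derives location from the node's IP address: the second octet identifies the data center and the third octet identifies the rack. The sketch below only illustrates that rule; `infer_location` is a hypothetical helper, not part of Cassandra.

```python
def infer_location(ip_address):
    """Return (data_center, rack) the way RackInferringSnitch reads an IPv4 address."""
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip_address!r}")
    # Second octet names the data center, third octet names the rack.
    return octets[1], octets[2]


if __name__ == "__main__":
    for address in ("110.56.12.120", "110.56.13.201", "110.57.12.184"):
        data_center, rack = infer_location(address)
        print(f"{address}: data center {data_center}, rack {rack}")
```

This is why the snitch suits clusters whose addressing scheme already mirrors the physical layout. With non-uniform addresses, a snitch that reads explicit names is a better fit.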
The PropertyFileSnitch lets you choose your own data center and rack names. Using this snitch requires that you define network details for each node in the cluster in a cassandra-topology.properties configuration file. This file is located in /etc/dse/cassandra/cassandra-topology.properties in packaged installations or <install_location>/resources/cassandra/conf/cassandra-topology.properties in binary installations.
Every node in the cluster should be described in this file, and specified exactly the same on every node in the cluster.
For example, supposing you had non-uniform IPs and two physical data centers with two racks in each, and a third logical data center for replicating analytics data:
# Data Center One
175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1
120.53.24.101=DC1:RAC2
120.55.16.200=DC1:RAC2
120.57.102.103=DC1:RAC2
# Data Center Two
110.56.12.120=DC2:RAC1
110.50.13.201=DC2:RAC1
110.54.35.184=DC2:RAC1
50.33.23.120=DC2:RAC2
50.45.14.220=DC2:RAC2
50.17.10.203=DC2:RAC2
# Analytics Replication Group
172.106.12.120=DC3:RAC1
172.106.12.121=DC3:RAC1
172.106.12.122=DC3:RAC1
# default for unknown nodes
default=DC3:RAC1
Make sure the data center names defined in the /etc/dse/cassandra/cassandra-topology.properties file match the data center names you use in your keyspace strategy_options.
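A mismatch between the two sets of names is easy to miss by eye. The following sketch, assuming the simple `address=DC:RACK` format shown above, parses topology text and reports any strategy_options name that no node uses. The helper names `parse_topology` and `unknown_data_centers` are hypothetical.

```python
def parse_topology(text):
    """Map each node address (or 'default') to a (data_center, rack) pair."""
    topology = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        address, _, location = line.partition("=")
        data_center, _, rack = location.partition(":")
        topology[address.strip()] = (data_center.strip(), rack.strip())
    return topology


def unknown_data_centers(topology, strategy_options):
    """Return strategy_options names that no entry in the topology uses."""
    known = {data_center for data_center, _rack in topology.values()}
    return sorted(set(strategy_options) - known)


SAMPLE = """
# Data Center One
175.56.12.105=DC1:RAC1
120.53.24.101=DC1:RAC2
# Data Center Two
110.56.12.120=DC2:RAC1
# default for unknown nodes
default=DC3:RAC1
"""

if __name__ == "__main__":
    topology = parse_topology(SAMPLE)
    print(unknown_data_centers(topology, {"DC1": 3, "DC2": 3}))      # []
    print(unknown_data_centers(topology, {"DC1": 3, "us-east": 3}))  # ['us-east']
```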
When you create a keyspace, you must define the replica placement strategy and the number of replicas you want. DataStax recommends choosing NetworkTopologyStrategy for single and multiple data center clusters. It is as easy to use as SimpleStrategy and allows for expansion to multiple data centers in the future, should that become useful. It is much easier to configure the most flexible replication strategy up front than to reconfigure replication after you have already loaded data into your cluster.
NetworkTopologyStrategy takes as options the number of replicas you want per data center. Even for single data center (or single node) clusters, you can use this replica placement strategy and just define the number of replicas for one data center. For example (using CQL):
CREATE KEYSPACE test
WITH strategy_class = 'NetworkTopologyStrategy'
AND strategy_options:us-east = 6;
Or for a multiple data center cluster:
CREATE KEYSPACE test2 WITH strategy_class = 'NetworkTopologyStrategy'
AND strategy_options:DC1 = 3 AND strategy_options:DC2 = 3;
When declaring the keyspace strategy_options, the data center names you use depend on the snitch you have chosen for your cluster. The names must match those the snitch reports so that replicas are placed in the correct location.
As a general rule, the number of replicas should not exceed the number of nodes in a replication group. However, it is possible to increase the number of replicas, and then add the desired number of nodes afterwards. When the replication factor exceeds the number of nodes, writes will be rejected, but reads will still be served as long as the desired consistency level can be met.
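Before creating or updating a keyspace, the general rule above can be checked with a short script. This is a sketch under assumed inputs: `oversized_replication` is a hypothetical helper, and the node counts per data center would have to come from your own inventory or snitch configuration.

```python
def oversized_replication(nodes_per_data_center, strategy_options):
    """Return {data_center: (replicas, nodes)} where replicas exceed nodes."""
    problems = {}
    for data_center, replicas in strategy_options.items():
        nodes = nodes_per_data_center.get(data_center, 0)
        if replicas > nodes:
            problems[data_center] = (replicas, nodes)
    return problems


if __name__ == "__main__":
    nodes = {"DC1": 6, "DC2": 2}       # assumed cluster layout
    options = {"DC1": 3, "DC2": 3}     # replicas requested per data center
    for data_center, (replicas, count) in oversized_replication(nodes, options).items():
        print(f"{data_center}: {replicas} replicas requested, only {count} nodes")
```

A non-empty result is not necessarily an error, since nodes can be added afterwards, but it flags data centers where writes would be rejected in the meantime.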
The default replication for system keyspaces, such as the HiveMetaStore keyspace, is 1. The default replication factor is suitable for development and testing of a single node. If you use Hive in a production environment, you should increase the cfs and HiveMetaStore keyspace replication factors to at least 2 to be resilient to single-node failures.
The procedure for changing the replication of data in the Cassandra File System (CFS) involves these tasks:
To change the replication of data in the CFS
Check the name of the data center of the node.
bin/nodetool -h localhost ring
The output tells you that the name of the data center for your node is, for example, datacenter1.
Change the replication factor of the cfs and cfs_archive keyspaces:
[default@unknown] UPDATE KEYSPACE cfs WITH strategy_options = {datacenter1:3};
[default@unknown] UPDATE KEYSPACE cfs_archive WITH strategy_options = {datacenter1:3};
If you use Hive, update the HiveMetaStore keyspace accordingly:
[default@unknown] UPDATE KEYSPACE HiveMetaStore WITH strategy_options = {datacenter1:3};
If you change the replication factor of a keyspace that contains data, run nodetool repair afterwards to avoid missing data or data unavailable exceptions.
A major part of planning your Cassandra cluster deployment is understanding and setting the various node configuration properties. This section explains the various configuration decisions that need to be made before deploying a Cassandra cluster, be it a single-node, multi-node, or multiple data center cluster.
The properties mentioned in this section are set in the cassandra.yaml configuration file. Each node should be correctly configured before starting it for the first time.
By default, a node is configured to store the data it manages in /var/lib/cassandra. In a production cluster deployment, you should change the commitlog_directory so it is on a different disk device than the data_file_directories.
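One rough way to confirm the separation is to compare the device IDs that the operating system reports for the two directories. The sketch below uses only the Python standard library; `on_same_device` is a hypothetical helper and the paths in the usage comment are assumptions. Note the limitation: a differing device ID shows the directories are on different filesystems, which does not by itself prove they are on different physical disks.

```python
import os


def on_same_device(first_path, second_path):
    """Return True when both existing paths are on the same filesystem device."""
    return os.stat(first_path).st_dev == os.stat(second_path).st_dev


if __name__ == "__main__":
    # Replace these with your commitlog_directory and one of your
    # data_file_directories, for example paths under /var/lib/cassandra.
    commitlog_directory = os.getcwd()
    data_directory = os.getcwd()
    if on_same_device(commitlog_directory, data_directory):
        print("Commit log and data directories share a device")
    else:
        print("Commit log and data directories are on different devices")
```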
The gossip settings control a node's participation in a cluster and how the node is known to the cluster.
| Property | Description |
|---|---|
| cluster_name | Name of the cluster that this node is joining. Should be the same for every node in the cluster. |
| listen_address | The IP address or hostname that other Cassandra nodes will use to connect to this node. Should be changed from localhost to the public address for the host. |
| seeds | A comma-delimited list of node IP addresses used to bootstrap the gossip process. Every node should have the same list of seeds. In multi data center clusters of Cassandra or analytics nodes, the seed list should include a node from each data center. |
| storage_port | The inter-node communication port (default is 7000). Should be the same for every node in the cluster. |
| initial_token | The initial token is used to determine the range of data this node is responsible for. |
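Several of the properties in this table must be identical on every node, which lends itself to an automated check. The sketch below assumes you have already loaded each node's cassandra.yaml values into a dictionary; how you collect them is left open. The helper names are hypothetical.

```python
# Properties the table above says should match on every node.
SHARED_SETTINGS = ("cluster_name", "seeds", "storage_port")


def inconsistent_settings(node_configs):
    """Return {setting: set of distinct values} for settings that differ.

    node_configs maps a node name to a dict of its cassandra.yaml properties.
    """
    differing = {}
    for setting in SHARED_SETTINGS:
        values = {str(config.get(setting)) for config in node_configs.values()}
        if len(values) > 1:
            differing[setting] = values
    return differing


def nodes_on_localhost(node_configs):
    """Return names of nodes whose listen_address is still a loopback value."""
    return sorted(name for name, config in node_configs.items()
                  if config.get("listen_address") in ("localhost", "127.0.0.1"))


if __name__ == "__main__":
    configs = {
        "node1": {"cluster_name": "Prod", "seeds": "10.0.0.1,10.1.0.1",
                  "storage_port": 7000, "listen_address": "10.0.0.1"},
        "node2": {"cluster_name": "Prod", "seeds": "10.0.0.1",
                  "storage_port": 7000, "listen_address": "localhost"},
    }
    print(inconsistent_settings(configs))
    print(nodes_on_localhost(configs))
```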
Each node also persists gossip information locally so that it can use it immediately on the next restart without having to wait for gossip. To clear gossip history on node restart (for example, if node IP addresses have changed), add the following line to the cassandra-env.sh file. This file is located in /usr/share/cassandra or $CASSANDRA_HOME/conf in Cassandra installations.
-Dcassandra.load_ring_state=false