DataStax Enterprise 3.0 Documentation

Configuring replication

Cassandra replicates data by storing multiple copies on multiple nodes for reliability and fault tolerance. To configure replication, you choose a data partitioner and a replica placement strategy. The partitioner determines how data is distributed across the nodes in the cluster and, for each piece of data, on which node the first copy is placed. The replica placement strategy determines which nodes get the additional copies.

Nodes exchange information about replication and cluster state using the gossip protocol. This section covers the gossip settings and other node configuration information, and explains how to change replication settings.

Partitioner settings

When you deploy a Cassandra cluster, make sure that each node is responsible for roughly an equal amount of data. To accomplish this load balancing, configure the partitioner for each node and assign each node an initial_token value.

DataStax strongly recommends using the RandomPartitioner (the default) for all cluster deployments. Assuming use of this partitioner, each node in the cluster is assigned a token that represents a hash value within the range of 0 to 2^127 - 1.

You can calculate tokens for a cluster having nodes in a single data center by dividing the range by the total number of nodes in the cluster. In multiple data center deployments, tokens should be calculated such that each data center is individually load balanced as well. Partition each data center as if it were its own distinct ring. See Generating Tokens for the different approaches to generating tokens for nodes in single and multiple data center clusters.
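
For example, the following Python sketch computes evenly spaced tokens for a single data center cluster (the four-node count is an illustrative value); assign each result to the corresponding node's initial_token property:

# Evenly spaced RandomPartitioner tokens for one data center.
# node_count is an illustrative value; use your own cluster size.
node_count = 4
ring_range = 2 ** 127  # RandomPartitioner tokens range from 0 to 2**127 - 1

for i in range(node_count):
    print("node %d initial_token: %d" % (i, (i * ring_range) // node_count))

For multiple data centers, run the same calculation once per data center and offset the results slightly so that no two nodes in the cluster share a token.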

Snitch settings

The snitch is responsible for knowing the location of nodes within your network topology. The location of nodes affects where replicas are placed and how requests are routed between replicas. All nodes must have exactly the same snitch configuration.

The endpoint_snitch property in cassandra.yaml configures the snitch for a node. In DataStax Enterprise it is set to the DSE Delegated Snitch (endpoint_snitch: com.datastax.bdp.snitch.DseDelegateSnitch), which by default delegates to the DseSimpleSnitch (org.apache.cassandra.locator.DseSimpleSnitch). The following sections describe a few commonly used snitches; all snitches are described in the Apache Cassandra documentation. You set the snitch used by the DseDelegateSnitch in the dse.yaml file:

  • Packaged installations: /etc/dse/dse.yaml
  • Tarball installations: <install_location>/resources/dse/conf/dse.yaml
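
For example, to delegate to the PropertyFileSnitch described later in this section instead of the default, you would change the delegated snitch entry in dse.yaml along these lines (a sketch that assumes the delegated_snitch property name used by DSE 3.0):

# dse.yaml: the snitch that the DseDelegateSnitch delegates to
delegated_snitch: org.apache.cassandra.locator.PropertyFileSnitch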

DseSimpleSnitch

DseSimpleSnitch is used only for DataStax Enterprise (DSE) deployments. To segregate analytics and real-time workloads, this snitch logically configures Hadoop analytics nodes in a separate data center from Cassandra real-time nodes. Use DseSimpleSnitch for mixed-workload DSE clusters located in one physical data center or for multiple data center DSE clusters that have exactly two data centers: one with all Analytics nodes and the other with all Cassandra real-time nodes.

When defining your keyspace strategy_options, use Analytics or Cassandra for your data center names.
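
For example, a keyspace definition sketch that uses these names (following the Beta CQL 3 syntax shown later in this document; the keyspace name and replica counts are illustrative), placing two replicas in the Analytics data center and three in the Cassandra data center:

CREATE KEYSPACE mykeyspace
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:Analytics = 2
  AND strategy_options:Cassandra = 3;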

SimpleSnitch

For a single data center (or single node) cluster, using SimpleSnitch is usually sufficient. However, if you plan to expand your cluster later to multiple racks and data centers, it is easier to choose a rack- and data center-aware snitch from the start, such as the RackInferringSnitch (which infers the data center and rack from the second and third octets of each node's IP address). All snitches are compatible with all placement strategies.

Configuring the PropertyFileSnitch

The PropertyFileSnitch allows you to define your data center and rack names to be whatever you want. Using this snitch requires that you define network details for each node in the cluster in a cassandra-topology.properties configuration file. This file is located in /etc/dse/cassandra/cassandra-topology.properties in packaged installations or <install_location>/resources/cassandra/conf/cassandra-topology.properties in tarball installations.

Every node in the cluster should be described in this file, and the file should be identical on every node in the cluster.

For example, supposing you had non-uniform IPs and two physical data centers with two racks in each, and a third logical data center for replicating analytics data:

# Data Center One

175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1

120.53.24.101=DC1:RAC2
120.55.16.200=DC1:RAC2
120.57.102.103=DC1:RAC2

# Data Center Two

110.56.12.120=DC2:RAC1
110.50.13.201=DC2:RAC1
110.54.35.184=DC2:RAC1

50.33.23.120=DC2:RAC2
50.45.14.220=DC2:RAC2
50.17.10.203=DC2:RAC2

# Analytics Replication Group

172.106.12.120=DC3:RAC1
172.106.12.121=DC3:RAC1
172.106.12.122=DC3:RAC1

# default for unknown nodes
default=DC3:RAC1

Make sure the data center names defined in the /etc/dse/cassandra/cassandra-topology.properties file match the data center names you use in your keyspace strategy_options.

Choosing keyspace replication options

When you create a keyspace, you must define the replica placement strategy and the number of replicas you want. DataStax recommends choosing NetworkTopologyStrategy for both single and multiple data center clusters. It is as easy to use as SimpleStrategy and allows for expansion to multiple data centers in the future, should that become useful. It is much easier to configure the most flexible replication strategy up front than to reconfigure replication after you have already loaded data into your cluster.

NetworkTopologyStrategy takes as options the number of replicas you want per data center. Even for single data center (or single node) clusters, you can use this replica placement strategy and just define the number of replicas for one data center. For example (using the Beta CQL 3):

CREATE KEYSPACE test
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:"us-east" = 6;

Or for a multiple data center cluster:

CREATE KEYSPACE test2 WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:DC1 = 3 AND strategy_options:DC2 = 3;

When declaring the keyspace strategy_options, the data center names you use depend on the snitch you have chosen for your cluster. The data center names must match those used by the snitch in order for replicas to be placed in the correct locations.

As a general rule, the number of replicas should not exceed the number of nodes in a replication group. However, it is possible to increase the number of replicas, and then add the desired number of nodes afterwards. When the replication factor exceeds the number of nodes, writes will be rejected, but reads will still be served as long as the desired consistency level can be met.

In DataStax Enterprise 3.0.1, the default consistency level has changed from ONE to QUORUM for reads and writes to resolve a problem finding a CassandraFS block when using consistency level ONE on a Hadoop node.

Changing replication settings

The default replication factor of 1 for keyspaces is suitable only for development and testing on a single node. For production environments, it is important to increase the replication factor of keyspaces from 1 to a higher number. To avoid operational problems, changing the replication of these system keyspaces is especially important:

  • HiveMetaStore, cfs, and cfs_archive keyspaces

    If the node is an Analytics node that uses Hive, increase the HiveMetaStore and cfs keyspace replication factors to 2 or higher to be resilient to single-node failures. If you use cfs_archive, increase it accordingly.

  • dse_system keyspace

    On an Analytics/Hadoop node, this keyspace contains information about the location of the job tracker. If only a single node contains the job tracker replica, other nodes cannot find the job tracker when that node is unavailable for some reason.

To change the replication of these keyspaces:

  1. Check the name of the data center of the node.

    bin/nodetool -h localhost ring
    

    The output tells you the name of the data center for the node, for example, datacenter1.

  2. Change the replication of the cfs and cfs_archive keyspaces from 1 to 3, for example:

    Using the Beta CQL 3:

    ALTER KEYSPACE cfs
     WITH strategy_class = 'NetworkTopologyStrategy'
     AND strategy_options:datacenter1=3;

    ALTER KEYSPACE cfs_archive
     WITH strategy_class = 'NetworkTopologyStrategy'
     AND strategy_options:datacenter1=3;
    

    Using CLI:

    [default@unknown] UPDATE KEYSPACE cfs
       WITH strategy_options = {datacenter1:3};
    
    [default@unknown] UPDATE KEYSPACE cfs_archive
       WITH strategy_options = {datacenter1:3};
    

    How high you increase the replication depends on the number of nodes in the cluster, as discussed in the previous section.

  3. If you use Hive, update the HiveMetaStore keyspace to increase the replication from 1 to 3, for example (see the sketch after these steps).

  4. Update the dse_system keyspace to increase the replication from 1 to 3, for example (also shown in the sketch after these steps).

  5. If the keyspaces you changed contain any data, run nodetool repair to avoid problems with missing data or data unavailable exceptions.
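
Following the same pattern as step 2, steps 3 and 4 can be carried out with statements such as the following sketch (using the Beta CQL 3; datacenter1 and the replication factor of 3 are example values, and the mixed-case HiveMetaStore keyspace name is quoted so that it is not lowercased):

ALTER KEYSPACE "HiveMetaStore"
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:datacenter1 = 3;

ALTER KEYSPACE dse_system
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:datacenter1 = 3;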

Choosing node configuration options

A major part of planning your Cassandra cluster deployment is understanding and setting the various node configuration properties. This section explains the various configuration decisions that need to be made before deploying a Cassandra cluster, be it a single-node, multi-node, or multiple data center cluster.

The properties mentioned in this section are set in the cassandra.yaml configuration file. Each node should be correctly configured before starting it for the first time.

Storage settings

By default, a node is configured to store the data it manages in /var/lib/cassandra. In a production cluster deployment, you should change the commitlog_directory so it is on a different disk device than the data_file_directories.
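
For example, a minimal cassandra.yaml sketch that separates the two (the mount points are illustrative):

# cassandra.yaml: keep the commit log on a different disk device than the data files
data_file_directories:
    - /data/cassandra/data                    # data disk (example mount point)
commitlog_directory: /commitlog/cassandra     # separate disk (example mount point)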

Gossip settings

The gossip settings control a node's participation in a cluster and how the node is known to the cluster.

cluster_name
  Name of the cluster that this node is joining. Should be the same for every node in the cluster.
listen_address
  The IP address or hostname that other Cassandra nodes will use to connect to this node. Should be changed from localhost to the public address for the host.
seeds
  A comma-delimited list of node IP addresses used to bootstrap the gossip process. Every node should have the same list of seeds. In multi data center clusters of Cassandra or analytics nodes, the seed list should include a node from each data center.
storage_port
  The intra-node communication port (default is 7000). Should be the same for every node in the cluster.
initial_token
  The initial token is used to determine the range of data this node is responsible for.
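
A minimal cassandra.yaml sketch of these settings (the cluster name, addresses, and token are illustrative); note that in cassandra.yaml the seed list is supplied through the seed_provider entry:

cluster_name: 'MyDemoCluster'
listen_address: 110.82.155.0        # public address of this node
storage_port: 7000                  # intra-node communication port
initial_token: 0                    # token calculated for this node
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "110.82.155.0,110.82.156.1"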

Purging gossip state on a node

Gossip information is also persisted locally by each node so that it can be used immediately on the next restart, without waiting for gossip to complete. To clear gossip history on node restart (for example, if node IP addresses have changed), add the following line to the JVM options in the cassandra-env.sh file. This file is located in /usr/share/cassandra or $CASSANDRA_HOME/conf in Cassandra installations.

JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"