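A snitch has two functions: it tells Cassandra about the network topology so that requests are routed efficiently, and it lets Cassandra distribute replicas by grouping nodes into data centers and racks.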
Note
If you change the snitch after data is inserted into the cluster, you must run a full repair, since the snitch affects where replicas are placed.
The following snitches are available:
The SimpleSnitch (the default) does not recognize data center or rack information. Use it for single-data center deployments (or single-zone in public clouds).
Using a SimpleSnitch, the only keyspace strategy option you specify is a replication factor.
The RackInferringSnitch determines the location of nodes by rack and data center, which are assumed to correspond to the 3rd and 2nd octet of the node's IP address, respectively. Use this snitch as an example of writing a custom Snitch class.
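The octet rule described above can be sketched in a few lines of Python. This is an illustration of the convention only, not the snitch's actual implementation; the function name infer_location is hypothetical, and it assumes IPv4 addresses.

```python
import ipaddress
from typing import Tuple


def infer_location(ip: str) -> Tuple[str, str]:
    """Return (data_center, rack) inferred from an IPv4 address.

    Hypothetical helper: the 2nd octet names the data center and the
    3rd octet names the rack, as described for the RackInferringSnitch.
    """
    octets = ipaddress.IPv4Address(ip).packed  # the four raw address bytes
    data_center = str(octets[1])  # 2nd octet
    rack = str(octets[2])         # 3rd octet
    return data_center, rack


print(infer_location("110.56.12.120"))  # ('56', '12')
```

Under this rule, two nodes land in the same rack only when their second and third octets both match, which is why the snitch suits networks with a uniform addressing scheme.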
The PropertyFileSnitch determines the location of nodes by rack and data center. This snitch uses a user-defined description of the network details located in the cassandra-topology.properties file. Use this snitch when your node IPs are not uniform or if you have complex replication grouping requirements as shown in Configuring the PropertyFileSnitch.
When using this snitch, you can name your data centers whatever you want. Make sure that the data center names you define in the cassandra-topology.properties file match the data center names in your keyspace strategy_options. Every node in the cluster should be described in the cassandra-topology.properties file, and this file should be exactly the same on every node in the cluster.
The location of the cassandra-topology.properties file depends on the type of installation; see Cassandra Configuration Files Locations or DataStax Enterprise Configuration Files Locations.
If you had non-uniform IPs and two physical data centers with two racks in each, and a third logical data center for replicating analytics data, the cassandra-topology.properties file might look like this:
# Data Center One
175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1
120.53.24.101=DC1:RAC2
120.55.16.200=DC1:RAC2
120.57.102.103=DC1:RAC2
# Data Center Two
110.56.12.120=DC2:RAC1
110.50.13.201=DC2:RAC1
110.54.35.184=DC2:RAC1
50.33.23.120=DC2:RAC2
50.45.14.220=DC2:RAC2
50.17.10.203=DC2:RAC2
# Analytics Replication Group
172.106.12.120=DC3:RAC1
172.106.12.121=DC3:RAC1
172.106.12.122=DC3:RAC1
# default for unknown nodes
default=DC3:RAC1
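To make the file format concrete, the following Python sketch reads entries of the form shown above, resolves a node to its data center and rack (falling back to the default entry), and reports data center names in a keyspace's strategy_options that do not appear in the file. It is a simplified illustration, not Cassandra's parser: the function names are hypothetical, and it assumes plain IPv4 keys with no escaped characters.

```python
from typing import Dict, List, Tuple

Topology = Dict[str, Tuple[str, str]]


def parse_topology(text: str) -> Topology:
    """Map each key (an IP address or 'default') to (data_center, rack)."""
    topology: Topology = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        dc, _, rack = value.partition(":")
        topology[key.strip()] = (dc.strip(), rack.strip())
    return topology


def locate(topology: Topology, ip: str) -> Tuple[str, str]:
    """Look up a node; unknown nodes use the 'default' entry.

    Raises KeyError if the node is unknown and no default is defined.
    """
    if ip in topology:
        return topology[ip]
    return topology["default"]


def unknown_data_centers(topology: Topology, strategy_options: Dict[str, int]) -> List[str]:
    """Return strategy_options data center names missing from the file."""
    known = {dc for dc, _ in topology.values()}
    return sorted(set(strategy_options) - known)


SAMPLE = """\
# Data Center One
175.56.12.105=DC1:RAC1
120.53.24.101=DC1:RAC2
# Data Center Two
110.56.12.120=DC2:RAC1
# default for unknown nodes
default=DC3:RAC1
"""

topology = parse_topology(SAMPLE)
print(locate(topology, "120.53.24.101"))  # ('DC1', 'RAC2')
print(locate(topology, "10.0.0.1"))       # ('DC3', 'RAC1')
print(unknown_data_centers(topology, {"DC1": 3, "DC4": 2}))  # ['DC4']
```

A check like unknown_data_centers can catch a mismatch between the names in the properties file and the names in a keyspace definition before the keyspace is created.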
The GossipingPropertyFileSnitch defines a local node's data center and rack; it uses gossip for propagating this information to other nodes. The conf/cassandra-rackdc.properties file defines the default data center and rack used by this snitch:
dc=DC1
rack=RAC1
The location of the conf directory depends on the type of installation; see Cassandra Configuration Files Locations or DataStax Enterprise Configuration Files Locations.
To migrate from the PropertyFileSnitch to the GossipingPropertyFileSnitch, update one node at a time to allow gossip time to propagate. The PropertyFileSnitch is used as a fallback when cassandra-topology.properties is present.
Use the EC2Snitch for simple cluster deployments on Amazon EC2 where all nodes in the cluster are within a single region. The region is treated as the data center and the availability zones are treated as racks within the data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location. Because this snitch uses private IPs, it does not work across multiple regions.
When defining your keyspace strategy_options, use the EC2 region name (for example, us-east) as your data center name.
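The mapping from availability zone to data center and rack can be sketched as follows. This Python snippet only mirrors the naming example given above (us-east-1a becomes us-east and 1a); the function name is hypothetical, and the exact names a given Cassandra version reports may differ.

```python
from typing import Tuple


def split_availability_zone(zone: str) -> Tuple[str, str]:
    """Split an EC2 availability zone into (data_center, rack).

    Hypothetical helper: everything before the last hyphen is treated as
    the data center name, and the remainder as the rack.
    """
    data_center, _, rack = zone.rpartition("-")
    if not data_center or not rack:
        raise ValueError(f"unexpected availability zone format: {zone!r}")
    return data_center, rack


print(split_availability_zone("us-east-1a"))  # ('us-east', '1a')
```

The first element of the result is the name to use as the data center in strategy_options.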
Use the EC2MultiRegionSnitch for deployments on Amazon EC2 where the cluster spans multiple regions. As with the EC2Snitch, regions are treated as data centers and availability zones are treated as racks within a data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location.
This snitch uses public IPs as the broadcast_address to allow cross-region connectivity. This means that you must configure each Cassandra node so that the listen_address is set to the private IP address of the node, and the broadcast_address is set to the public IP address of the node. This allows Cassandra nodes in one EC2 region to connect to nodes in another region, thus enabling multiple data center support. (For intra-region traffic, Cassandra switches to the private IP after establishing a connection.)
Additionally, you must set the addresses of the seed nodes in the cassandra.yaml file to their public IPs because private IPs are not routable between networks. For example:
seeds: 50.34.16.33, 60.247.70.52
To find the public IP address, run this command from each of the seed nodes in EC2:
curl http://instance-data/latest/meta-data/public-ipv4
Finally, be sure that the storage_port or ssl_storage_port is open on the public IP firewall.
When defining your keyspace strategy_options, use the EC2 region name, such as us-east, as your data center name.
All snitches also use a dynamic snitch layer that monitors read latency and, when possible, routes requests away from poorly performing nodes. The dynamic snitch is enabled by default and is recommended for use in most deployments. For information on how this works, see Dynamic snitching in Cassandra: past, present, and future.
Configure dynamic snitch thresholds for each node in the cassandra.yaml configuration file. For more information, see the properties listed under Fault detection properties.
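The idea of routing reads away from slow nodes can be illustrated with a toy Python model. This is not Cassandra's algorithm: the scoring (a plain mean of recent read latencies), the function name order_replicas, and the badness_threshold parameter are all assumptions made for the sake of the example.

```python
from statistics import mean
from typing import Dict, List


def order_replicas(
    preferred: List[str],
    latencies_ms: Dict[str, List[float]],
    badness_threshold: float = 0.1,
) -> List[str]:
    """Return replicas in the order a read should try them.

    Toy model: keep the preferred order unless the first replica's mean
    latency exceeds the best replica's by more than badness_threshold
    (a fraction); in that case, sort replicas by mean latency.
    """
    scores = {node: mean(latencies_ms[node]) for node in preferred}
    best = min(scores.values())
    if scores[preferred[0]] <= best * (1 + badness_threshold):
        return list(preferred)  # first choice is close enough to the best
    return sorted(preferred, key=lambda node: scores[node])


samples = {"a": [50.0, 60.0], "b": [10.0, 12.0], "c": [11.0, 13.0]}
print(order_replicas(["a", "b", "c"], samples))  # ['b', 'c', 'a']
```

A threshold of this kind trades some latency for stability: small differences between nodes do not reshuffle the read order, while a clearly slow node is demoted.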