A snitch determines which data centers and racks are written to and read from.
Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into data centers and racks. All nodes must have exactly the same snitch configuration. Cassandra does its best not to place more than one replica on the same rack (which is not necessarily a physical location).
The dynamic snitch monitors the performance of reads from the various replicas and chooses the best replica based on this history.
By default, all snitches also use a dynamic snitch layer that monitors read latency and, when possible, routes requests away from poorly performing nodes. This layer is recommended for most deployments. For information on how this works, see Dynamic snitching in Cassandra: past, present, and future. Configure dynamic snitch thresholds for each node in the cassandra.yaml configuration file.
For more information, see the properties listed under Failure detection and recovery.
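The idea of routing reads away from slow replicas can be illustrated with a small Python sketch. This is a toy model, not Cassandra's dynamic snitch: the class name `LatencyTracker`, the moving-average score, and the way `badness_threshold` is applied are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class LatencyTracker:
    """Toy latency-aware replica ordering (not Cassandra's implementation)."""

    def __init__(self, window=100, badness_threshold=0.1):
        # Keep only the most recent `window` read latencies per node.
        self._samples = defaultdict(lambda: deque(maxlen=window))
        self.badness_threshold = badness_threshold

    def record(self, node, latency_ms):
        """Store one observed read latency for a node."""
        self._samples[node].append(latency_ms)

    def score(self, node):
        """Mean recent latency; a node with no history scores 0.0."""
        samples = self._samples[node]
        return sum(samples) / len(samples) if samples else 0.0

    def order(self, replicas):
        """Return replicas in the order they should be tried."""
        if not replicas:
            return []
        preferred = replicas[0]
        best = min(self.score(r) for r in replicas)
        # Keep the underlying snitch's order unless the preferred replica
        # is worse than the best one by more than the threshold.
        if self.score(preferred) <= best * (1 + self.badness_threshold):
            return list(replicas)
        return sorted(replicas, key=self.score)
```

In this sketch the incoming list is the order the underlying snitch would pick, and it is only reordered when the first replica has become noticeably slower than the best alternative.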
The SimpleSnitch (the default) does not recognize data center or rack information. Use it for single-data center deployments (or single-zone in public clouds).
Using a SimpleSnitch, the only keyspace strategy option you specify is a replication factor.
The RackInferringSnitch determines the location of nodes by rack and data center, which are assumed to correspond to the third and second octets of the node's IP address, respectively. Use this snitch as an example of writing a custom Snitch class.
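As a rough illustration of the octet rule, the following Python sketch derives a data center and rack label from an IPv4 address. The function name `infer_location` is hypothetical, and returning the octets as strings is an assumption; this is not the snitch's actual code.

```python
import ipaddress


def infer_location(ip):
    """Map an IPv4 address to (data_center, rack) from its 2nd and 3rd octets."""
    octets = ipaddress.IPv4Address(ip).packed  # four bytes, one per octet
    data_center = str(octets[1])  # second octet
    rack = str(octets[2])         # third octet
    return data_center, rack
```

For example, under this rule a node at 10.20.30.40 would be placed in data center 20, rack 30.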
The PropertyFileSnitch determines the location of nodes by rack and data center.
This snitch uses a user-defined description of the network details located in the cassandra-topology.properties file. Use this snitch when your node IPs are not uniform or if you have complex replication grouping requirements. When using this snitch, you can name your data centers whatever you want. Make sure that the data center names you define match the data center names in your keyspace strategy_options. Every node in the cluster should be described in the cassandra-topology.properties file, and this file should be exactly the same on every node in the cluster.
# Data Center One
188.8.131.52=DC1:RAC1
184.108.40.206=DC1:RAC1
220.127.116.11=DC1:RAC1
18.104.22.168=DC1:RAC2
22.214.171.124=DC1:RAC2
126.96.36.199=DC1:RAC2

# Data Center Two
188.8.131.52=DC2:RAC1
184.108.40.206=DC2:RAC1
220.127.116.11=DC2:RAC1
18.104.22.168=DC2:RAC2
22.214.171.124=DC2:RAC2
126.96.36.199=DC2:RAC2

# Analytics Replication Group
188.8.131.52=DC3:RAC1
184.108.40.206=DC3:RAC1
220.127.116.11=DC3:RAC1

# default for unknown nodes
default=DC3:RAC1
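The lookup that such a file implies, including the fallback to the default entry for unknown nodes, can be sketched in Python as follows. The function names `parse_topology` and `locate` are hypothetical; this only mirrors the file format shown above and is not Cassandra's parser.

```python
def parse_topology(text):
    """Parse 'ip=DC:RACK' lines into {key: (data_center, rack)}."""
    topology = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        dc, _, rack = value.strip().partition(":")
        topology[key.strip()] = (dc.strip(), rack.strip())
    return topology


def locate(topology, ip):
    """Return (data_center, rack) for ip, falling back to the 'default' entry."""
    return topology.get(ip, topology.get("default"))
```

A node whose IP is not listed resolves to whatever the default line names, which is why every node should appear in the file explicitly.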
The GossipingPropertyFileSnitch defines a local node's data center and rack; it uses gossip for propagating this information to other nodes. The conf/cassandra-rackdc.properties file defines the default data center and rack used by this snitch:
dc=DC1
rack=RAC1
To migrate from the PropertyFileSnitch to the GossipingPropertyFileSnitch, update one node at a time to allow gossip time to propagate. The PropertyFileSnitch is used as a fallback when cassandra-topology.properties is present.
Use the EC2Snitch for simple cluster deployments on Amazon EC2 where all nodes in the cluster are within a single region. The region is treated as the data center and the availability zones are treated as racks within the data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location. Because private IPs are used, this snitch does not work across multiple regions.
When defining your keyspace strategy option, use the EC2 region name (for example, ``us-east``) as your data center name.
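The naming rule described here (us-east-1a becomes data center us-east and rack 1a) amounts to splitting the availability zone at its last hyphen. A minimal Python sketch, with the hypothetical function name `split_availability_zone`, might look like this:

```python
def split_availability_zone(az):
    """Split an availability zone such as 'us-east-1a' into (data_center, rack)."""
    data_center, sep, rack = az.rpartition("-")
    if not sep or not data_center or not rack:
        raise ValueError(f"unexpected availability zone format: {az!r}")
    return data_center, rack
```

The data center half of the result is the name to use in the keyspace strategy options.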
Use the EC2MultiRegionSnitch for deployments on Amazon EC2 where the cluster spans multiple regions. As with the EC2Snitch, regions are treated as data centers and availability zones are treated as racks within a data center. For example, if a node is in us-east-1a, us-east is the data center name and 1a is the rack location.
This snitch uses public IPs as broadcast_address to allow cross-region connectivity. This means that you must configure each Cassandra node so that the listen_address is set to the private IP address of the node, and the broadcast_address is set to the public IP address of the node. This allows Cassandra nodes in one EC2 region to connect to nodes in another region, thus enabling multiple data center support. (For intra-region traffic, Cassandra switches to the private IP after establishing a connection.)
Additionally, you must set the addresses of the seed nodes in the cassandra.yaml file to their public IPs because private IPs are not routable between networks. For example:
seeds: 18.104.22.168, 22.214.171.124
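The address rules above (private listen_address, public broadcast_address, public seeds) lend themselves to a quick sanity check. The Python sketch below is one hedged way to express them; the function names are hypothetical, and it assumes that a node's private IP falls in an RFC 1918 range, which may not hold for every network layout.

```python
import ipaddress

# RFC 1918 private IPv4 ranges (assumed here to cover the nodes' private IPs).
_PRIVATE_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(ip):
    """Return True if ip lies in one of the RFC 1918 private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in _PRIVATE_NETS)


def check_multi_region_addresses(listen_address, broadcast_address, seeds):
    """Return a list of problems with a node's multi-region address settings."""
    problems = []
    if not is_rfc1918(listen_address):
        problems.append("listen_address should be the node's private IP")
    if is_rfc1918(broadcast_address):
        problems.append("broadcast_address should be the node's public IP")
    for seed in seeds:
        if is_rfc1918(seed):
            problems.append(f"seed {seed} is a private IP")
    return problems
```

An empty result means the three settings are consistent with the rules described above; it does not verify that the addresses are actually reachable.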
To find the public IP address, run this command from each of the seed nodes in EC2:
When defining your keyspace strategy option, use the EC2 region name, such as ``us-east``, as your data center name.