This section discusses running routine node repair.
The nodetool repair command repairs inconsistencies across all of the replicas for a given range of data. Run repair in these situations:
- During normal operation, as part of regular, scheduled cluster maintenance, unless your Cassandra applications never delete data
- During node recovery, for example, when bringing a node back into the cluster after a failure
- On nodes containing data that is not read frequently, because read repair only corrects data that is actually read
- To update data on a node that has been down
The guidelines for running node repair include:
- The hard requirement for repair frequency is the value of gc_grace_seconds: run a repair operation at least once on each node within this time period. Following this guideline ensures that a replica that missed a delete receives the tombstone before it is garbage-collected, so deletes are handled properly across the cluster.
- Repair is disk- and CPU-intensive. Use caution when running repair on more than one node at a time, and schedule regular repair operations for low-usage hours.
- In systems that seldom delete or overwrite data, you can raise the value of gc_grace_seconds with minimal impact on disk space. This allows wider intervals between scheduled repair operations with the nodetool utility.
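
The first two guidelines can be sketched as a small planning helper. This is only an illustration, not part of Cassandra or nodetool. The function names, the node names, and the one-day spacing are assumptions made for the example. It assigns each node its own repair window so that only one node is repaired at a time, and it checks that a full pass over the cluster fits inside gc_grace_seconds (864000 seconds, or 10 days, by default).

```python
from datetime import datetime, timedelta

# Cassandra's default gc_grace_seconds is 864000 (10 days); a table may override it.
DEFAULT_GC_GRACE_SECONDS = 864_000


def repair_is_overdue(last_repair, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if more than gc_grace_seconds have passed since last_repair."""
    return (now - last_repair) > timedelta(seconds=gc_grace_seconds)


def staggered_schedule(nodes, first_window,
                       gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS,
                       spacing=timedelta(days=1)):
    """Give each node its own repair window, one node per window.

    Hypothetical helper: `spacing` is the gap between consecutive windows.
    A full pass takes spacing * len(nodes); if that exceeds gc_grace_seconds,
    some node would go too long between repairs, so raise ValueError.
    """
    cycle = spacing * len(nodes)
    if cycle > timedelta(seconds=gc_grace_seconds):
        raise ValueError("repair cycle is longer than gc_grace_seconds")
    return {node: first_window + i * spacing for i, node in enumerate(nodes)}


if __name__ == "__main__":
    # Placeholder node names; 02:00 stands in for a low-usage hour.
    plan = staggered_schedule(["node1", "node2", "node3"],
                              datetime(2024, 1, 1, 2, 0))
    for node, start in plan.items():
        print(node, start.isoformat())
```

A real schedule also has to allow for how long each repair takes to finish, which this sketch ignores. If gc_grace_seconds is raised for a table, pass the larger value to allow a longer cycle.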