Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory. You can take a snapshot of all keyspaces, a single keyspace, or a single column family while the system is online. However, to restore a snapshot, you must take the nodes offline.
Using a parallel ssh tool (such as pssh), you can snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot resumes consistency using Cassandra's built-in consistency mechanisms.
After a system-wide snapshot is performed, you can enable incremental backups on each node to back up data that has changed since the last snapshot: each time an SSTable is flushed, a hard link is created in a /backups subdirectory of the data directory (provided JNA is enabled).
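As a minimal sketch, incremental backups are controlled by the incremental_backups setting in cassandra.yaml. The helper below sets it to true in a given configuration file. The function name is hypothetical, the location of cassandra.yaml depends on the installation, and a change made in the file takes effect only after the node is restarted.

```sh
# Hypothetical helper: set incremental_backups to true in a cassandra.yaml file.
# Usage: enable_incremental_backups /path/to/cassandra.yaml
enable_incremental_backups() {
    conf="$1"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    tmp="$conf.tmp.$$"
    # Rewrite an existing "incremental_backups:" line; all other lines pass through unchanged.
    sed 's/^incremental_backups:.*/incremental_backups: true/' "$conf" > "$tmp" || return 1
    # Append the setting if the file did not define it at all.
    grep -q '^incremental_backups:' "$tmp" || echo 'incremental_backups: true' >> "$tmp"
    # Keep a copy of the original, then overwrite in place to preserve file permissions.
    cp "$conf" "$conf.bak" && cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

The original file is kept next to the edited one with a .bak suffix, so the change can be reverted by copying it back.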
If JNA is enabled, snapshots are created using hard links. If JNA is not enabled, the files are copied from one location to another, which increases I/O activity and significantly reduces efficiency.
Snapshots are taken per node using the nodetool snapshot command. To take a global snapshot, run the nodetool snapshot command using a parallel ssh utility, such as pssh.
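A global snapshot with pssh might look like the following sketch. The function name and the hosts file (one hostname per line) are assumptions for illustration; the nodetool invocation matches the per-node form used in this section, and pssh is assumed to be installed under that name.

```sh
# Hypothetical wrapper: take a snapshot on every host listed in a file.
# Usage: snapshot_cluster <hosts_file> [keyspace]
snapshot_cluster() {
    hosts_file="$1"
    keyspace="$2"    # optional; when empty, all keyspaces are snapshotted
    # -h names the hosts file; -i prints each node's output as it completes.
    pssh -h "$hosts_file" -i "nodetool -h localhost -p 7199 snapshot $keyspace"
}
```

For example, snapshot_cluster cassandra_hosts.txt demdb would run the snapshot command for the demdb keyspace on each listed node.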
A snapshot first flushes all in-memory writes to disk, then creates a hard link to each SSTable file for each keyspace. By default, the snapshot files are stored in the /var/lib/cassandra/data/<keyspace_name>/<column_family_name>/snapshots directory.
You must have enough free disk space on the node to accommodate making snapshots of your data files. A single snapshot requires little disk space. However, snapshots can cause your disk usage to grow more quickly over time because a snapshot prevents obsolete data files from being deleted. After the snapshot is complete, you can move the backup files to another location if needed, or you can leave them in place.
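To keep an eye on how much space snapshots are holding, a sketch like the following can help. Both function names are hypothetical, and the code assumes the <data_directory_location>/<keyspace_name>/<column_family_name>/snapshots layout described in this section. Because snapshot files are hard links, du reports the size of the linked files, which overstates the extra space used while the live SSTables still exist.

```sh
# Hypothetical helper: print every snapshots directory under a data directory.
# Assumes the layout <data_dir>/<keyspace_name>/<column_family_name>/snapshots.
list_snapshot_dirs() {
    find "$1" -mindepth 3 -maxdepth 3 -type d -name snapshots | sort
}

# Hypothetical helper: report the size attributed to each snapshots directory.
# Usage: snapshot_disk_usage /var/lib/cassandra/data
snapshot_disk_usage() {
    list_snapshot_dirs "$1" | while IFS= read -r dir; do
        du -sh "$dir"
    done
}
```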
To create a snapshot of a node
Run the nodetool snapshot command, specifying the hostname, JMX port, and keyspace. For example:
$ nodetool -h localhost -p 7199 snapshot demdb
The snapshot is created in <data_directory_location>/<keyspace_name>/<column_family_name>/snapshots/<snapshot_name>. Each snapshot folder contains numerous .db files that contain the data at the time of the snapshot.
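If you want a predictable <snapshot_name>, one option is to pass a name when taking the snapshot. The sketch below assumes a nodetool version that accepts the -t option for naming a snapshot; the function name and the date-based default are illustrative only.

```sh
# Hypothetical helper: snapshot one keyspace under a chosen or date-based name.
# Assumes a nodetool version that accepts -t <snapshot_name>.
# Usage: snapshot_with_tag <keyspace> [snapshot_name]
snapshot_with_tag() {
    keyspace="$1"
    tag="${2:-$(date +%Y%m%d-%H%M%S)}"
    # Send nodetool's own messages to stderr so stdout carries only the name.
    nodetool -h localhost -p 7199 snapshot -t "$tag" "$keyspace" >&2 || return 1
    echo "$tag"
}
```

The function prints the snapshot name on success, so a calling script can record which folder under snapshots holds the new files.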
When taking a snapshot, previous snapshot files are not automatically deleted. You should remove old snapshots that are no longer needed.
The nodetool clearsnapshot command removes all existing snapshot files from the snapshot directory of each keyspace. You should make it part of your backup process to clear old snapshots before taking a new one.
To delete all snapshots for a node, run the nodetool clearsnapshot command. For example:
$ nodetool -h localhost -p 7199 clearsnapshot
To delete snapshots on all nodes at once, run the nodetool clearsnapshot command using a parallel ssh utility.
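Clearing old snapshots and taking a new one can be combined into a single step, as in this sketch. The function name is hypothetical; the two nodetool commands are the ones shown in this section. If old snapshots still need to be kept, move them to another location before running it.

```sh
# Hypothetical rotation step: remove existing snapshots, then take a fresh one.
# Usage: rotate_snapshot [keyspace]
rotate_snapshot() {
    keyspace="$1"   # optional; when empty, all keyspaces are snapshotted
    # Stop if clearing fails, so a failed run does not stack snapshots unnoticed.
    nodetool -h localhost -p 7199 clearsnapshot || return 1
    # $keyspace is left unquoted on purpose: an empty value adds no argument.
    nodetool -h localhost -p 7199 snapshot $keyspace
}
```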
When incremental backups are enabled (disabled by default), Cassandra hard-links each flushed SSTable to a backups directory under the keyspace data directory. This allows storing backups offsite without transferring entire snapshots. Also, incremental backups combine with snapshots to provide a dependable, up-to-date backup mechanism.
As with snapshots, Cassandra does not automatically clear incremental backup files. DataStax recommends setting up a process to clear incremental backup hard links each time a new snapshot is created.
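One possible shape for such a cleanup process is sketched below. The function name is hypothetical, and the code assumes that incremental backup files sit in <data_directory_location>/<keyspace_name>/<column_family_name>/backups; adjust the depth if your layout differs.

```sh
# Hypothetical helper: delete the files in every backups directory under a data directory.
# Assumes the layout <data_dir>/<keyspace_name>/<column_family_name>/backups.
# Usage: clear_incremental_backups /var/lib/cassandra/data
clear_incremental_backups() {
    find "$1" -mindepth 3 -maxdepth 3 -type d -name backups | while IFS= read -r dir; do
        # Remove only regular files directly inside backups; the directory itself stays.
        find "$dir" -maxdepth 1 -type f -exec rm -f {} +
    done
}
```

Removing a hard link in backups does not touch the live SSTable it points to, so the node's data files are unaffected.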
Restoring a keyspace from a snapshot requires all snapshot files for the column family and, if you use incremental backups, any incremental backup files created after the snapshot was taken. You can restore a snapshot in several ways; the procedure described here restores a node by shutting it down and copying the snapshot files back into place.
If restoring a single node, you must first shut down the node. If restoring an entire cluster, you must shut down all nodes, restore the snapshot data, and then start all nodes again.
Restoring from snapshots and incremental backups temporarily causes intensive CPU and I/O activity on the node being restored.
To restore a node from a snapshot and incremental backups:
Shut down the node.
Clear all files in /var/lib/cassandra/commitlog.
Delete all *.db files in the <data_directory_location>/<keyspace_name>/<column_family_name> directory, but DO NOT delete the /snapshots and /backups subdirectories.
Locate the most recent snapshot folder in <data_directory_location>/<keyspace_name>/<column_family_name>/snapshots/<snapshot_name>, and copy its contents into the <data_directory_location>/<keyspace_name>/<column_family_name> directory.
If using incremental backups, copy all contents of <data_directory_location>/<keyspace_name>/<column_family_name>/backups into <data_directory_location>/<keyspace_name>/<column_family_name>.
Restart the node.
Restarting causes a temporary burst of I/O activity and consumes a large amount of CPU resources.
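The file-handling steps of the restore procedure can be sketched as shell functions. Both function names are hypothetical, shutting down and restarting the node are left to your own service tooling, and the functions must be run only while the node is stopped.

```sh
# Hypothetical helper: remove all files from the commit log directory.
# Usage: clear_commitlog /var/lib/cassandra/commitlog
clear_commitlog() {
    find "$1" -maxdepth 1 -type f -exec rm -f {} +
}

# Hypothetical helper: restore one column family from a named snapshot.
# Usage: restore_column_family <column_family_dir> <snapshot_name>
restore_column_family() {
    cf_dir="$1"
    snap_dir="$cf_dir/snapshots/$2"
    [ -d "$snap_dir" ] || { echo "no such snapshot: $snap_dir" >&2; return 1; }
    # Delete live *.db files; -maxdepth 1 leaves snapshots/ and backups/ untouched.
    find "$cf_dir" -maxdepth 1 -type f -name '*.db' -exec rm -f {} +
    # Copy the snapshot contents into the column family directory.
    cp -p "$snap_dir"/* "$cf_dir"/ || return 1
    # Copy incremental backup files as well, if any exist.
    if [ -d "$cf_dir/backups" ] && [ -n "$(ls -A "$cf_dir/backups")" ]; then
        cp -p "$cf_dir/backups"/* "$cf_dir"/ || return 1
    fi
}
```

The snapshot and backup files are copied rather than moved, so they remain available if the restore has to be repeated.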