Apache Cassandra™ 2.0

About deletes

Cassandra deletes data differently from a relational database. A relational database might spend time scanning for expired data and discarding it, or an administrator might have to partition expired data, by month for example, to clear it out faster. Data in a Cassandra column can instead have an optional expiration period called TTL (time to live). Use CQL to set the TTL, in seconds, for data. Cassandra marks TTL data with a tombstone after the requested amount of time has elapsed. After data is marked with a tombstone, it is removed automatically during the normal compaction and repair processes, once the grace period defined by the gc_grace_seconds table property has passed.
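The lifecycle described above can be sketched as a small toy model. This is only an illustration of the paragraph, not Cassandra's storage engine: the `Cell` class and `cell_state` function are hypothetical names, the timestamps are plain integers, and real purge timing depends on when compaction actually runs. The grace period of 864000 seconds (ten days) is used as an example value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Toy stand-in for one column value; not Cassandra's real storage format."""
    value: str
    written_at: int             # seconds since an arbitrary epoch
    ttl: Optional[int] = None   # seconds, as set with CQL; None means no expiry


def cell_state(cell: Cell, now: int, gc_grace_seconds: int) -> str:
    """Classify a cell as 'live', 'tombstone', or 'purgeable' at time `now`."""
    if cell.ttl is None:
        return "live"
    expires_at = cell.written_at + cell.ttl
    if now < expires_at:
        return "live"
    # Expired: treated as a tombstone until the grace period has passed.
    if now < expires_at + gc_grace_seconds:
        return "tombstone"
    # Eligible for removal the next time compaction runs.
    return "purgeable"


if __name__ == "__main__":
    cell = Cell(value="session-token", written_at=0, ttl=3600)
    for now in (1800, 3600, 3600 + 864000):
        print(now, cell_state(cell, now, gc_grace_seconds=864000))
```

In this sketch, "purgeable" only means that the data may be dropped; nothing leaves disk until a compaction processes it.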

Keep the following facts about deleted data in mind:
  • Cassandra does not immediately remove data marked for deletion from disk. The deletion occurs during compaction.
  • You can remove data that is eligible for deletion without waiting for the next automatic compaction by starting the compaction process manually.
  • A deleted column can reappear if you do not run node repair routinely.

Marking data with a tombstone signals Cassandra to retry sending a delete request to a replica that was down at the time of the delete. If the replica comes back up within the grace period, it eventually receives the delete request. However, if a node is down longer than the grace period, it can miss the delete altogether because the tombstone disappears after gc_grace_seconds. When that happens, the node's stale copy of the deleted data can reappear. Cassandra always attempts to replay missed updates when the node comes back up. After bringing a node back into the cluster following a failure, run node repair to fix inconsistencies across all of the replicas.
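To make the grace-period reasoning concrete, here is a minimal simulation with two replicas. Everything in it is an assumption made for illustration: `compact`, `repair`, and `simulate` are hypothetical helpers, a replica is just a dictionary, and "newest timestamp wins" is a simplified stand-in for how replicas are reconciled. It does not call any Cassandra API.

```python
from typing import Dict, List, Optional, Tuple

# A replica maps key -> (timestamp, value); a value of None marks a tombstone.
Replica = Dict[str, Tuple[int, Optional[str]]]


def compact(replica: Replica, now: int, gc_grace_seconds: int) -> None:
    """Toy compaction: drop tombstones older than the grace period."""
    for key in list(replica):
        ts, value = replica[key]
        if value is None and now - ts >= gc_grace_seconds:
            del replica[key]


def repair(replicas: List[Replica]) -> None:
    """Toy repair: every replica ends up with the newest entry for each key."""
    keys = set().union(*(r.keys() for r in replicas))
    for key in keys:
        newest = max((r[key] for r in replicas if key in r), key=lambda e: e[0])
        for r in replicas:
            r[key] = newest


def simulate(downtime: int, gc_grace_seconds: int) -> Optional[str]:
    """Return the value of 'k' after a replica misses a delete and rejoins."""
    a: Replica = {"k": (1, "v")}
    b: Replica = {"k": (1, "v")}   # replica b goes down right after this write
    a["k"] = (2, None)             # the delete at t=2 reaches only replica a
    now = 2 + downtime             # replica b comes back up
    compact(a, now, gc_grace_seconds)
    repair([a, b])
    entry = a.get("k")
    return entry[1] if entry else None


if __name__ == "__main__":
    print(simulate(downtime=5, gc_grace_seconds=10))    # back within grace period
    print(simulate(downtime=20, gc_grace_seconds=10))   # back after grace period
```

In the first run the tombstone still exists when the replica returns, so the delete propagates and the key stays deleted. In the second run the tombstone has already been compacted away, so repair copies the stale value back and the deleted column reappears.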