As updates come in, Cassandra does not overwrite the rows in place, but instead groups updates in the memtable.
Any number of columns may be inserted at the same time. When inserting or updating columns in a table, the client application specifies the row key to identify which column records to update. The row key is similar to a primary key in that it must be unique for each row within a table. However, unlike inserting a duplicate primary key in a relational database, inserting a duplicate row key does not result in a primary key constraint violation.
Inserting a duplicate row key is treated as an upsert. Eventually, the contents of the memtable are flushed to disk using sequential I/O and stored in a new SSTable.
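The write path described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Cassandra's implementation: the class `ToyTable` and its methods `upsert` and `flush` are hypothetical names, the memtable is a plain dictionary, and an "SSTable" is an immutable tuple of rows sorted by row key.

```python
class ToyTable:
    """Toy model of the write path: a mutable memtable plus immutable flushed SSTables."""

    def __init__(self):
        self.memtable = {}   # row key -> {column name: value}
        self.sstables = []   # flushed, read-only snapshots (oldest first)

    def upsert(self, row_key, columns):
        # A duplicate row key is not an error: the new columns merge into the row.
        self.memtable.setdefault(row_key, {}).update(columns)

    def flush(self):
        # Write all rows in sorted key order as one new "SSTable",
        # then start over with an empty memtable.
        snapshot = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(snapshot)
        self.memtable = {}


table = ToyTable()
table.upsert("user1", {"name": "Ann"})
table.upsert("user1", {"email": "ann@example.com"})  # same row key: upsert, no violation
table.flush()
```

After the flush, both columns of `user1` sit in a single immutable snapshot and the memtable is empty. Later writes to `user1` would land in a fresh memtable rather than modify that snapshot.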
A column is overwritten only if the timestamp of the new version is more recent than the timestamp of the existing version, so precise timestamps are necessary if updates (overwrites) are frequent. The timestamp is provided by the client, so the clocks of all client machines should be synchronized, for example with NTP (Network Time Protocol).
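A minimal sketch of this timestamp rule follows. It assumes each column is stored as a `(value, timestamp)` pair with client-supplied integer timestamps. The helper names `reconcile` and `write` are hypothetical, and the handling of equal timestamps is deliberately not modeled.

```python
def reconcile(existing, incoming):
    """Return the winning (value, timestamp) pair: the incoming write wins only if newer."""
    if existing is None or incoming[1] > existing[1]:
        return incoming
    return existing


row = {}  # column name -> (value, timestamp)


def write(column, value, timestamp):
    # The timestamp comes from the client, as described above.
    row[column] = reconcile(row.get(column), (value, timestamp))


write("email", "old@example.com", 1000)
write("email", "new@example.com", 2000)
write("email", "stale@example.com", 1500)  # client with a lagging clock: write is ignored
```

The third write arrives last but carries an older timestamp, so it does not replace the value written at timestamp 2000. This is why unsynchronized client clocks can silently discard updates.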
Cassandra deletes data differently from a traditional relational database. A relational database might spend time scanning through data looking for expired data and throwing it away, or an administrator might have to partition expired data by month, for example, to clear it out faster. In Cassandra, you do not have to manually remove expired data. Keep two facts about deleted Cassandra data in mind:
After an SSTable is written, it is immutable (the file is not updated by further DML operations). Consequently, a deleted column is not removed immediately. Instead, a tombstone is written to indicate the new column status. Columns marked with a tombstone exist for a configured time period (defined by the gc_grace_seconds value set on the table). When the grace period expires, the compaction process permanently deletes the column.
Marking a deleted column with a tombstone signals Cassandra to retry sending a delete request to a replica that was down at the time of the delete. If the replica comes back up within the grace period, it eventually receives the delete request. However, if a node is down longer than the grace period, it can miss the delete altogether and replicate the deleted data once it comes back up. To prevent deleted data from reappearing, administrators must run node repair regularly on every node in the cluster, at least once within each grace period (gc_grace_seconds, which is 10 days by default).
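The tombstone life cycle can be sketched as follows. This is a simplified model, not Cassandra's compaction code: the names `TOMBSTONE`, `delete`, and `compact` are hypothetical, time is measured in plain seconds, and the grace period is assumed to be 864000 seconds to match the 10-day figure above.

```python
GC_GRACE_SECONDS = 864000  # 10 days; assumed gc_grace_seconds value for this sketch

TOMBSTONE = object()  # marker meaning "this column was deleted"


def delete(row, column, now):
    # A delete writes a tombstone with the deletion time instead of removing data.
    row[column] = (TOMBSTONE, now)


def compact(row, now, gc_grace_seconds=GC_GRACE_SECONDS):
    # Keep live columns, plus tombstones that are still inside the grace period.
    return {
        name: (value, written_at)
        for name, (value, written_at) in row.items()
        if value is not TOMBSTONE or now - written_at < gc_grace_seconds
    }


row = {"name": ("Ann", 0), "email": ("ann@example.com", 0)}
delete(row, "email", now=100)

during_grace = compact(row, now=100 + GC_GRACE_SECONDS - 1)  # tombstone is kept
after_grace = compact(row, now=100 + GC_GRACE_SECONDS)       # tombstone is purged
```

While the tombstone survives, it can still be delivered to a replica that missed the delete. Once compaction purges it, nothing records the deletion, which is why repair has to run before the grace period ends.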