CQL for Cassandra 1.2

BATCH

Write multiple DML statements.

Synopsis

BEGIN BATCH
  | BEGIN UNLOGGED BATCH
  | BEGIN COUNTER BATCH
  USING TIMESTAMP timestamp
  dml_statement
  dml_statement
  ...
APPLY BATCH

dml_statement is:

  • INSERT
  • UPDATE
  • DELETE

Synopsis legend

  • Uppercase means literal
  • Lowercase means not literal
  • Italics mean optional
  • The pipe (|) symbol means OR or AND/OR
  • Ellipsis (...) means repeatable
  • « means a non-literal, open parenthesis used to indicate scope
  • » means a non-literal, close parenthesis used to indicate scope

A semicolon that terminates CQL statements is not included in the synopsis.

Description

A BATCH statement combines multiple data modification language (DML) statements (INSERT, UPDATE, DELETE) into a single logical operation, and can set a client-supplied timestamp for all columns written by the statements in the batch. Batching multiple statements saves network exchanges between the client and the server, and between the server coordinator and the replicas.

In Cassandra 1.2 and later, batches are atomic by default. In the context of a Cassandra batch operation, atomic means that if any part of the batch succeeds, all of it will. To achieve atomicity, Cassandra first writes the serialized batch, as blob data, to the batchlog system table. When the rows in the batch have been successfully written and persisted (or hinted), the batchlog data is removed. Atomicity carries a performance penalty. If you do not want to incur this penalty, prevent Cassandra from writing to the batchlog system table by using the UNLOGGED option: BEGIN UNLOGGED BATCH
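
As a minimal sketch of the UNLOGGED option, the batch below skips the batchlog and so gives up the atomicity guarantee. It assumes the users table from the Example section with text columns; the user IDs and passwords are made-up values.

```cql
-- Skips the batchlog: avoids the atomicity penalty, but if the batch
-- fails partway, only some of these writes may have been applied.
BEGIN UNLOGGED BATCH
  INSERT INTO users (userID, password) VALUES ('user5', 'ch@ngem3d');
  INSERT INTO users (userID, password) VALUES ('user6', 'ch@ngem3e');
APPLY BATCH;
```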

Although an atomic batch guarantees that if any part of the batch succeeds, all of it will, no other transactional enforcement is done at the batch level. For example, there is no batch isolation: other clients can read the first updated rows from the batch while updates to other rows are still in progress. However, updates within a single row are isolated: a partial row update cannot be read.

Using a timestamp

BATCH supports a client-supplied timestamp, an integer, in the USING clause. The timestamp applies to all batched operations. If not specified, the current time of the insertion (in microseconds) is used.

When a batch-level timestamp is specified, individual DML statements inside the BATCH cannot specify their own timestamp. If you do not specify a batch-level timestamp, you can specify a timestamp in the individual DML statements.
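
The two timestamp placements described above might look like the following sketch. It assumes the users table from the Example section; the timestamp values are arbitrary integers in microseconds, and the inserted values are invented for illustration.

```cql
-- Batch-level timestamp: both writes carry timestamp 1364000000000000.
BEGIN BATCH USING TIMESTAMP 1364000000000000
  INSERT INTO users (userID, password) VALUES ('user5', 'ch@ngem3d');
  UPDATE users SET name = 'fifth user' WHERE userID = 'user5';
APPLY BATCH;

-- No batch-level timestamp: each statement supplies its own.
BEGIN BATCH
  INSERT INTO users (userID, password) VALUES ('user6', 'ch@ngem3e')
    USING TIMESTAMP 1364000000000000;
  UPDATE users USING TIMESTAMP 1364000000000001
    SET name = 'sixth user' WHERE userID = 'user6';
APPLY BATCH;
```

Combining both forms in one batch, a batch-level timestamp plus a per-statement timestamp, is the case the rule above disallows.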

Batching counter updates

Use BEGIN COUNTER BATCH for batched counter updates. Unlike other writes in Cassandra, counter updates are not idempotent, so replaying a counter batch that may already have been applied can count the same increment twice.
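
A counter batch could be sketched as follows. The page_views table and its column names are hypothetical, introduced only for this illustration; they do not appear elsewhere in this documentation.

```cql
-- Hypothetical counter table used only for this illustration.
CREATE TABLE page_views (
  page text PRIMARY KEY,
  views counter
);

-- Increment two counters in one round trip.
BEGIN COUNTER BATCH
  UPDATE page_views SET views = views + 1 WHERE page = 'home';
  UPDATE page_views SET views = views + 1 WHERE page = 'about';
APPLY BATCH;
```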

Example

BEGIN BATCH
  INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user')
  UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2'
  INSERT INTO users (userID, password) VALUES ('user3', 'ch@ngem3c')
  DELETE name FROM users WHERE userID = 'user2'
  INSERT INTO users (userID, password, name) VALUES ('user4', 'ch@ngem3c', 'Andrew')
APPLY BATCH;