Apache Cassandra 1.2 Documentation

Anatomy of a table

The Cassandra 1.2 documentation is transitioning to a new format!
Please use the new Cassandra 1.2 documentation instead.
Back to Table of Contents
All Documents List     

At one level, Cassandra tables, rows, and columns can be thought of much the same way as those in a relational database. In both SQL and CQL you define tables, which have defined columns and associated data types, and you can create indexes to allow efficient querying by column values.

However, an important difference is that since Cassandra is designed from the ground up as a distributed system, it emphasizes denormalization instead of normalization and joins, and provides tools like collections to support this.

Example -- a music service

This example of a social music service requires a songs table having a title, album, and artist column, plus a column called data for the actual audio file itself. The table uses a UUID as a primary key.

CREATE TABLE songs (
  id uuid PRIMARY KEY,
  title text,
  album text,
  artist text,
  data blob
);

In a relational database, you would create a playlists table with a foreign key to the songs, but in Cassandra, you denormalize the data. To represent the playlist data, you can create a table like this:

CREATE TABLE playlists (
  id uuid,
  song_order int,
  song_id uuid,
  title text,
  album text,
  artist text,
  PRIMARY KEY (id, song_order)
);

The combination of the id and song_order in the playlists table uniquely identifies a row in the playlists table. You can have more than one row with the same id as long as the rows contain different song_order values.

Note

UUIDs are handy for sequencing the data or automatically incrementing synchronization across multiple machines. For simplicity, an int song_order is used in this example.

After inserting the example data into playlists, the output of selecting all the data looks like this:

SELECT * FROM playlists;

../../_images/select_playlists.png

The next example illustrates how you can create a query that uses the artist as a filter. First, add a little more data to the playlist table to make things interesting for the collections examples later:

INSERT INTO playlists (id, song_order, song_id, title, artist, album)
  VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 4,
  7db1a490-5878-11e2-bcfd-0800200c9a66,
  'Ojo Rojo', 'Fu Manchu', 'No One Rides for Free');

With the schema as given so far, a query that includes the artist filter would require a sequential scan across the entire playlists dataset. Cassandra will reject such a query. If you first create a secondary index on artist, Cassandra can efficiently pull out the records in question.

CREATE INDEX ON playlists(artist);

Now, you can query the playlists for songs by Fu Manchu, for example:

SELECT * FROM playlists WHERE artist = 'Fu Manchu';

The output looks something like this:


../../_images/artist_fumanchu1.png

Compound keys and clustering

A compound primary key gives you clustering guarantees per partition as well as to the normal SQL semantics of the compound primary key.

About the partition key and clustering columns

Cassandra uses the first column name in the primary key definition as the partition key. For example, in the playlists table, id is the partition key. The remaining column, or columns that are not partition keys in the primary key definition are the clustering columns. In the case of the playlists table, the song_order is the clustering column. The data for each partition is clustered by the remaining column or columns of the primary key definition. On a physical node, when rows for a partition key are stored in order based on the clustering columns, retrieval of rows is very efficient. For example, because the id in the playlists table is the partition key, all the songs for a playlist are clustered in the order of the remaining song_order column.

Insertion, update, and deletion operations on rows sharing the same partition key for a table are performed atomically and in isolation.

You can query a single sequential set of data on disk to get the songs for a playlist.

SELECT * FROM playlists WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204
  ORDER BY song_order DESC LIMIT 50;

The output looks something like this:


../../_images/select_desc1.png

Collection columns

CQL 3 introduces these collection types:

In a relational database, to allow users to have multiple email addresses, you create an email_addresses table having a many-to-one (joined) relationship to a users table. CQL 3 handles the classic multiple email addresses use case, and other use cases, by defining columns as collections. Using the set collection type to solve the multiple email addresses problem is convenient and intuitive.

Another use of a collection type can be demonstrated using the music service example.

Adding a collection to a table

The music service needs to include the capability to tag the songs. From a relational standpoint, you can think of storage engine rows as partitions, within which (object) rows are clustered. To tag songs, use a collection set. Declare the collection set using the CREATE TABLE or ALTER TABLE statements. Because the songs table already exists from the earlier example, just alter that table to add a collection set, tags:

ALTER TABLE songs ADD tags set<text>;

Updating a collection

Update the songs table to insert the tags data:

UPDATE songs SET tags = tags + {'2007'}
  WHERE id = 8a172618-b121-4136-bb10-f665cfc469eb;
UPDATE songs SET tags = tags + {'covers'}
  WHERE id = 8a172618-b121-4136-bb10-f665cfc469eb;
UPDATE songs SET tags = tags + {'1973'}
  WHERE id = a3e64f8f-bd44-4f28-b8d9-6938726e34d4;
UPDATE songs SET tags = tags + {'blues'}
  WHERE id = a3e64f8f-bd44-4f28-b8d9-6938726e34d4;
UPDATE songs SET tags = tags + {'rock'}
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

A music reviews list and a schedule (map collection) of live appearances can be added to the table:

ALTER TABLE songs ADD reviews list<text>;
ALTER TABLE songs ADD venue map<timestamp, text>;

Each element of a map, list, or map is internally stored as one Cassandra column. To update a set, use the UPDATE command and the addition (+) operator to add an element or the subtraction (-) operator to remove an element. For example, to update a set:

UPDATE songs
  SET tags = tags + {'rock'}
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

To update a list, a similar syntax using square brackets instead of curly brackets is used.

UPDATE songs
  SET reviews = reviews + [ 'hot dance music' ]
  WHERE id = 7db1a490-5878-11e2-bcfd-0800200c9a66;

To update a map, use INSERT to specify the data in a map collection.

INSERT INTO songs (id, venue)
  VALUES (7db1a490-5878-11e2-bcfd-0800200c9a66,
  { '2013-9-22 12:01'  : 'The Fillmore',
  '2013-10-1 18:00' : 'The Apple Barrel'});

Inserting data into the map replaces the entire map.

Querying a collection

To query a collection, include the name of the collection column in the select expression. For example, selecting the tags set returns the set of tags, sorted alphabetically in this case because the tags set is of the text data type:

SELECT id, tags FROM songs;

../../_images/select_tags.png
SELECT id, venue FROM songs;

../../_images/select_venue.png

The collection types are described in more detail in Using collections: set, list, and map.

Expiring columns

Data in a column can have an optional expiration date called TTL (time to live). Whenever a column is inserted, the client request can specify an optional TTL value, defined in seconds, for the data in the column. TTL columns are marked as having the data deleted (with a tombstone) after the requested amount of time has expired. After columns are marked with a tombstone, they are automatically removed during the normal compaction (defined by the gc_grace_seconds) and repair processes.

Use CQL to set the TTL for a column.

If you want to change the TTL of an expiring column, you have to re-insert the column with a new TTL. In Cassandra, the insertion of a column is actually an insertion or update operation, depending on whether or not a previous version of the column exists. This means that to update the TTL for a column with an unknown value, you have to read the column and then re-insert it with the new TTL value.

TTL columns have a precision of one second, as calculated on the server. Therefore, a very small TTL probably does not make much sense. Moreover, the clocks on the servers should be synchronized; otherwise reduced precision could be observed because the expiration time is computed on the primary host that receives the initial insertion but is then interpreted by other hosts on the cluster.

An expiring column has an additional overhead of 8 bytes in memory and on disk (to record the TTL and expiration time) compared to standard columns.

Counter columns

A counter is a special kind of column used to store a number that incrementally counts the occurrences of a particular event or process. For example, you might use a counter column to count the number of times a page is viewed.

Counter column tables must use Counter data type. Counters may only be stored in dedicated tables.

After a counter is defined, the client application then updates the counter column value by incrementing (or decrementing) it. A client update to a counter column passes the name of the counter and the increment (or decrement) value; no timestamp is required.


../../_images/counter_column_12.png

Internally, the structure of a counter column is a bit more complex. Cassandra tracks the distributed state of the counter as well as a server-generated timestamp upon deletion of a counter column. For this reason, it is important that all nodes in your cluster have their clocks synchronized using a source such as network time protocol (NTP).

Unlike normal columns, a write to a counter requires a read in the background to ensure that distributed counter values remain consistent across replicas. Typically, you use a consistency level of ONE with counters because during a write operation, the implicit read does not impact write latency.