Apache Cassandra 1.1 Documentation

Data Modeling

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.

When comparing Cassandra to a relational database, the column family is similar to a table in that it is a container for columns and rows. However, a column family requires a major shift in thinking for those coming from the relational world.

In a relational database, you define tables, which have defined columns. The table defines the column names and their data types, and the client application then supplies rows conforming to that schema: each row contains the same fixed set of columns.

In Cassandra, you define column families. Column families can (and should) define metadata about the columns, but the actual columns that make up a row are determined by the client application. Each row can have a different set of columns. There are two types of column families, static and dynamic, which are described in Designing Column Families.

Column families consist of these kinds of columns:

  • Standard: Has one primary key.
  • Composite: Has more than one primary key, recommended for managing wide rows.
  • Expiring: Gets deleted during compaction.
  • Counter: Counts occurrences of an event.
  • Super: Used to manage wide rows, inferior to using composite columns.

Although column families are very flexible, in practice a column family is not entirely schema-less.

Designing Column Families

Each row of a column family is uniquely identified by its row key, similar to the primary key in a relational table. A column family is partitioned on its row key, and the row key is implicitly indexed.

Static Column Families

A static column family uses a relatively static set of column names and is similar to a relational database table. For example, a column family storing user data might have columns for the user name, address, email, phone number and so on. Although the rows generally have the same set of columns, they are not required to have all of the columns defined. Static column families typically have column metadata pre-defined for each column.

Figure: Static column family

Dynamic Column Families

A dynamic column family takes advantage of Cassandra's ability to use arbitrary application-supplied column names to store data. A dynamic column family allows you to pre-compute result sets and store them in a single row for efficient data retrieval. Each row is a snapshot of data meant to satisfy a given query, sort of like a materialized view. For example, a column family that tracks the users that subscribe to a particular user's blog is dynamic.

Figure: Dynamic column family

Instead of defining metadata for individual columns, a dynamic column family defines the type information for column names and values (comparators and validators), but the actual column names and values are set by the application when a column is inserted.
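As a rough illustration of the difference between the two designs, the following Python sketch models each row as a dictionary that maps column names to values. The column family names, row keys, and data are invented for the example, and the dictionaries only mimic the shape of the data, not Cassandra's storage format.

```python
# Static column family: rows share a small, mostly fixed set of column names.
users = {
    "jsmith": {"name": "John Smith", "email": "jsmith@example.com", "phone": "555-0100"},
    "bjones": {"name": "Bob Jones", "email": "bjones@example.com"},  # no phone column
}

# Dynamic column family: column names are application data (here, subscriber
# names) and each row is a precomputed answer to one query.
blog_subscribers = {
    "jsmith_blog": {"bjones": "", "ccarter": "", "dlee": ""},
    "bjones_blog": {"jsmith": ""},
}


def subscribers_of(blog_key):
    """Return the subscribers of one blog by reading a single row."""
    return sorted(blog_subscribers[blog_key])


# The static design uses only a handful of distinct column names overall.
static_names = set().union(*(row.keys() for row in users.values()))
```

In the static case the set of column names is known up front, so per-column metadata is practical. In the dynamic case the names are unbounded, which is why only the type of the names and values is declared.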

Standard Columns

The column is the smallest increment of data in Cassandra. It is a tuple containing a name, a value and a timestamp.

Figure: Column

A column must have a name, and the name can be a static label (such as "name" or "email") or it can be dynamically set when the column is created by your application.

Columns can be indexed on their name (see secondary indexes). However, one limitation of column indexes is that they do not support queries that require access to ordered data, such as time series data. In this case a secondary index on a timestamp column would not be sufficient because you cannot control column sort order with a secondary index. For cases where sort order is important, manually maintaining a column family as an 'index' is another way to look up column data in sorted order.

It is not required for a column to have a value. Sometimes all the information your application needs to satisfy a given query can be stored in the column name itself. For example, if you are using a column family as a materialized view to look up rows from other column families, all you need to store is the row key that you are looking up; the value can be empty.
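A minimal Python sketch of such a manually maintained index row follows. It assumes a hypothetical design in which each column name is a (timestamp, row key) pair and the column value is left empty; the class and method names are invented for illustration and are not part of any Cassandra client API.

```python
import bisect


class TimeIndexRow:
    """One row of a hypothetical index column family with valueless columns."""

    def __init__(self):
        self._names = []  # sorted list of (timestamp, row_key) column names

    def insert(self, timestamp, row_key):
        name = (timestamp, row_key)
        pos = bisect.bisect_left(self._names, name)
        if pos == len(self._names) or self._names[pos] != name:
            self._names.insert(pos, name)  # the column value would be empty

    def row_keys_between(self, start_ts, end_ts):
        """Slice the sorted column names and return the row keys they carry."""
        lo = bisect.bisect_left(self._names, (start_ts, ""))
        hi = bisect.bisect_left(self._names, (end_ts + 1, ""))
        return [row_key for _, row_key in self._names[lo:hi]]


index = TimeIndexRow()
for ts, key in [(300, "event-c"), (100, "event-a"), (200, "event-b")]:
    index.insert(ts, key)

recent = index.row_keys_between(150, 300)
```

Because the names are kept in sorted order, a time-range lookup is a contiguous slice, which is the property a secondary index cannot provide.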

Cassandra uses the column timestamp to determine the most recent update to a column. The timestamp is provided by the client application. The latest timestamp always wins when requesting data, so if multiple client sessions update the same columns in a row concurrently, the most recent update is the one that will eventually persist. See About Transactions and Concurrency Control for more information about how Cassandra handles conflict resolution.
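The "latest timestamp wins" rule can be sketched in a few lines of Python. This is a simplified model, not Cassandra's implementation: the Column type and function names are assumptions made for the example, and the handling of equal timestamps is not modeled.

```python
from typing import NamedTuple, Optional


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # supplied by the client application


def reconcile(current: Optional[Column], incoming: Column) -> Column:
    """Keep whichever version of the column has the later timestamp."""
    # Simplification: on equal timestamps this sketch keeps the current version.
    if current is None or incoming.timestamp > current.timestamp:
        return incoming
    return current


def apply_writes(writes):
    """Apply writes in arrival order; arrival order does not decide the winner."""
    row = {}
    for col in writes:
        row[col.name] = reconcile(row.get(col.name), col)
    return row


writes = [
    Column("email", "new@example.com", timestamp=200),
    Column("email", "old@example.com", timestamp=100),  # arrives late
]
row = apply_writes(writes)
```

Even though the older write arrives last, the column keeps the value carrying the higher timestamp, which is why client clocks matter.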

Composite Columns

Cassandra’s storage engine uses composite columns under the hood to store clustered rows. All the logical rows with the same partition key get stored as a single, physical wide row. Using this design, Cassandra supports up to 2 billion columns per (physical) row.

Composite columns comprise fully denormalized wide rows by using composite primary keys. You create and query composite columns using CQL 3.

Tweets Example

For example, in the database you store the tweet, user, and follower data, and you want to use one query to return all the tweets of the users that a given user follows.

First, set up a tweets table and a timeline table.

  • The tweets table is the data table where the tweets live. The table has an author column, a body column, and a surrogate UUID key.

    Note

    UUIDs are handy for sequencing the data or for generating unique identifiers across multiple machines without coordination.

  • The timeline table denormalizes the tweets, setting up composite columns by virtue of the composite primary key.

    CREATE TABLE tweets (
      tweet_id uuid PRIMARY KEY,
      author varchar,
      body varchar
    );
    
    CREATE TABLE timeline (
      user_id varchar,
      tweet_id uuid,
      author varchar,
      body varchar,
      PRIMARY KEY (user_id, tweet_id)
    );
    

The combination of the user_id and tweet_id in the timeline table uniquely identifies a row in the timeline table. You can have more than one row with the same user_id as long as the rows contain different tweet_id values. The following figure shows three sample tweets of Patrick Henry, George Washington, and George Mason from different years. The tweet_ids are unique.

Figure: Tweets Table

The next figure shows how the tweets are denormalized for two of the users who are following these men. George Mason follows Patrick Henry and George Washington, and Alexander Hamilton follows John Adams and George Washington.

Figure: Timeline Table

Cassandra uses the first column name in the primary key definition as the partition key, which is the same as the row key to the underlying storage engine. For example, in the timeline table, user_id is the partition key. The data for each partition key will be clustered by the remaining columns of the primary key definition. Clustering means that the storage engine creates an index and always keeps the data clustered by that index. Because the user_id is the partition key, all the tweets for gmason's friends are clustered in the order of the remaining tweet_id column.

The storage engine guarantees that the columns are clustered according to the tweet_id. The next figure shows explicitly how the data maps to the storage engine: the gmason partition key designates a single storage engine row in which the rows of the logical view of the data share the same tweet_id part of a composite column name.

Figure: Timeline Physical Layout

The gmason columns are next to each other, as are the ahamilton columns. All the gmason or ahamilton columns are stored sequentially, ordered by the tweet_id columns within the respective gmason or ahamilton partition key. In the gmason row, the first field is the tweet_id, 1765, which is the composite column name component shared by the row data. Likewise, the 1742 row data share the 1742 component. The second field, named author in one column and body in another, contains the literal data that Cassandra stores. The physical representation of the row achieves the same sparseness using a compound primary key column as a standard Cassandra column.
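The mapping from logical rows to a physical wide row can be approximated with the Python sketch below. It is a simplified model under several assumptions: small integers stand in for the uuid tweet_id values, the tweet bodies are placeholders, the 1797 tweet_id is invented, and storage engine details beyond the composite (tweet_id, field) column name are omitted.

```python
from collections import defaultdict

# Logical CQL 3 rows of the timeline table: (user_id, tweet_id, author, body).
logical_rows = [
    ("gmason", 1765, "phenry", "placeholder body A"),
    ("gmason", 1742, "gwashington", "placeholder body B"),
    ("ahamilton", 1797, "jadams", "placeholder body C"),
    ("ahamilton", 1742, "gwashington", "placeholder body B"),
]


def to_physical(rows):
    """Group logical rows by partition key and build composite column names."""
    physical = defaultdict(dict)
    for user_id, tweet_id, author, body in rows:
        # Each non-key field becomes one column named (tweet_id, field name).
        physical[user_id][(tweet_id, "author")] = author
        physical[user_id][(tweet_id, "body")] = body
    # Columns within a physical row are kept sorted by composite name.
    return {key: dict(sorted(cols.items())) for key, cols in physical.items()}


physical_rows = to_physical(logical_rows)
gmason_column_names = list(physical_rows["gmason"])
```

Each partition key ends up with a single wide row whose columns are ordered first by tweet_id and then by field name, so all columns of one logical row sit next to each other.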

Using the CQL 3 model, you can query a single sequential set of data on disk to get the tweets of the users that a given user follows.

SELECT * FROM timeline WHERE user_id = 'gmason'
ORDER BY tweet_id DESC LIMIT 50;
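To make the semantics of this query concrete, here is a small Python emulation over one partition. It is only a sketch: the select_timeline function and the sample data are hypothetical, integers stand in for uuid values, and the sorted() call stands in for the order that the storage engine already maintains on disk.

```python
def select_timeline(partition, limit=50):
    """Emulate: SELECT * ... WHERE user_id = ? ORDER BY tweet_id DESC LIMIT ?"""
    # partition maps tweet_id -> {"author": ..., "body": ...} for one user_id.
    rows = []
    for tweet_id in sorted(partition, reverse=True)[:limit]:
        rows.append({"tweet_id": tweet_id, **partition[tweet_id]})
    return rows


# Invented sample partition for the gmason row key.
gmason_partition = {
    1742: {"author": "gwashington", "body": "placeholder body B"},
    1787: {"author": "gmason", "body": "placeholder body D"},
    1765: {"author": "phenry", "body": "placeholder body A"},
}

latest_two = select_timeline(gmason_partition, limit=2)
```

The result is the newest rows first, read from one partition only, which mirrors the single sequential read described above.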

Compatibility with Older Applications

The query, expressed in SQL-like CQL 3, replaces the CQL 2 query that uses a range and the REVERSE keyword to slice 50 tweets out of the timeline material as viewed in the gmason row. The custom comparator or default_validation class that you had to set when dealing with wide rows in CQL 2 is no longer necessary in CQL 3.

The WITH COMPACT STORAGE directive is provided for backward compatibility with older Cassandra applications; new applications should avoid it. Using compact storage prevents you from adding new columns that are not part of the PRIMARY KEY. With compact storage, each logical row corresponds to exactly one physical column:

Figure: Compact storage

Expiring Columns

A column can also have an optional expiration date called TTL (time to live). Whenever a column is inserted, the client request can specify an optional TTL value, defined in seconds, for the column. TTL columns are marked as deleted (with a tombstone) after the requested amount of time has expired. Once they are marked with a tombstone, they are automatically removed during the normal compaction (defined by the gc_grace_seconds) and repair processes.

Use CQL to set the TTL for a column.

If you want to change the TTL of an expiring column, you have to re-insert the column with a new TTL. In Cassandra the insertion of a column is actually an insertion or update operation, depending on whether or not a previous version of the column exists. This means that to update the TTL for a column with an unknown value, you have to read the column and then re-insert it with the new TTL value.

TTL columns have a precision of one second, as calculated on the server. Therefore, a very small TTL probably does not make much sense. Moreover, the clocks on the servers should be synchronized; otherwise reduced precision could be observed because the expiration time is computed on the primary host that receives the initial insertion but is then interpreted by other hosts on the cluster.
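The expiration behavior described in this section can be modeled with the following Python sketch. The ExpiringColumn class and change_ttl function are invented for illustration; tombstones, compaction, and clock skew between hosts are not modeled.

```python
import time


class ExpiringColumn:
    """Simplified model of a column inserted with a TTL (in seconds)."""

    def __init__(self, name, value, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self.name = name
        self.value = value
        self.ttl = ttl_seconds
        # Expiration is fixed at insertion time, with one-second precision.
        self.expires_at = int(now) + ttl_seconds

    def is_expired(self, now=None):
        now = time.time() if now is None else now
        return int(now) >= self.expires_at


def change_ttl(column, new_ttl, now=None):
    """Changing a TTL means re-inserting the same name and value with a new TTL."""
    return ExpiringColumn(column.name, column.value, new_ttl, now=now)


session = ExpiringColumn("session", "abc", ttl_seconds=10, now=1000)
renewed = change_ttl(session, 60, now=1005)
```

Note that change_ttl needs the current value, which is why updating the TTL of a column with an unknown value requires reading it first.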

An expiring column has an additional overhead of 8 bytes in memory and on disk (to record the TTL and expiration time) compared to standard columns.

Counter Columns

A counter is a special kind of column used to store a number that incrementally counts the occurrences of a particular event or process. For example, you might use a counter column to count the number of times a page is viewed.

Counter column families must use CounterColumnType as the validator (the column value type). This means that currently, counters may only be stored in dedicated column families.

Counter columns are different from regular columns in that once a counter is defined, the client application then updates the column value by incrementing (or decrementing) it. A client update to a counter column passes the name of the counter and the increment (or decrement) value; no timestamp is required.


Figure: Counter column

Internally, the structure of a counter column is a bit more complex. Cassandra tracks the distributed state of the counter as well as a server-generated timestamp upon deletion of a counter column. For this reason, it is important that all nodes in your cluster have their clocks synchronized using network time protocol (NTP).

A counter can be read or written at any of the available consistency levels. However, it's important to understand that unlike normal columns, a write to a counter requires a read in the background to ensure that distributed counter values remain consistent across replicas. If you write at a consistency level of ONE, the implicit read will not impact write latency; hence, ONE is the most common consistency level to use with counters.
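From the client's point of view, a counter update carries only a counter name and a signed delta. The Python sketch below models that interface on a single node; the CounterColumnFamily class is a hypothetical stand-in, and the distributed state, background read, and consistency levels described above are deliberately left out.

```python
from collections import defaultdict


class CounterColumnFamily:
    """Hypothetical stand-in for a dedicated counter column family."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(int))

    def add(self, row_key, counter_name, delta):
        # The client sends a name and a signed delta, never an absolute
        # value or a timestamp.
        self._rows[row_key][counter_name] += delta

    def get(self, row_key, counter_name):
        return self._rows[row_key][counter_name]


page_views = CounterColumnFamily()
for _ in range(3):
    page_views.add("www.example.com", "home", 1)   # increment
page_views.add("www.example.com", "home", -1)      # decrement

home_views = page_views.get("www.example.com", "home")
```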

Super Columns

Do not use super columns. They are a legacy design from a pre-open source release. This design was structured for a specific use case and does not fit most use cases. Reading any sub-column loads the entire super column and all of its sub-columns into memory for each read request, which results in severe performance issues. Additionally, super columns are not supported in CQL 3.

Use composite columns instead. Composite columns provide most of the same benefits as super columns without the performance issues.

About Data Types (Comparators and Validators)

In a relational database, you must specify a data type for each column when you define a table. The data type constrains the values that can be inserted into that column. For example, if you have a column defined as an integer datatype, you would not be allowed to insert character data into that column. Column names in a relational database are typically fixed labels (strings) that are assigned when you define the table schema.

In Cassandra, the data type for a column (or row key) value is called a validator. The data type for a column name is called a comparator. You can define data types when you create your column family schemas (which is recommended), but Cassandra does not require it. Internally, Cassandra stores column names and values as hex byte arrays (BytesType). This is the default client encoding used if data types are not defined in the column family schema (or if not specified by the client request).
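To see what a hex byte representation looks like when no data type is defined, the following Python sketch converts a UTF-8 string to hex and back. The helper names are invented for the example, and the sketch only illustrates the encoding, not how any particular client displays it.

```python
def to_hex(text: str) -> str:
    """Encode a string as UTF-8 bytes and render the bytes as hex digits."""
    return text.encode("utf-8").hex()


def from_hex(hex_string: str) -> str:
    """Reverse of to_hex: parse hex digits and decode the bytes as UTF-8."""
    return bytes.fromhex(hex_string).decode("utf-8")


encoded = to_hex("email")      # '656d61696c'
decoded = from_hex(encoded)    # 'email'
```

Defining a comparator and validators in the schema spares the application from doing this kind of conversion by hand.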

Cassandra comes with the following built-in data types, which can be used as either validators (row key and column value data types) or comparators (column name data types). One exception is CounterColumnType, which is only allowed as a column value (not allowed for row keys or column names).

Internal Type      CQL Name       Description
BytesType          blob           Arbitrary hexadecimal bytes (no validation)
AsciiType          ascii          US-ASCII character string
UTF8Type           text, varchar  UTF-8 encoded string
IntegerType        varint         Arbitrary-precision integer
Int32Type          int            4-byte integer
LongType           bigint         8-byte long
UUIDType           uuid           Type 1 or type 4 UUID
TimeUUIDType       timeuuid       Type 1 UUID only (CQL3)
DateType           timestamp      Date plus time, encoded as 8 bytes since epoch
BooleanType        boolean        true or false
FloatType          float          4-byte floating point
DoubleType         double         8-byte floating point
DecimalType        decimal        Variable-precision decimal
CounterColumnType  counter        Distributed counter value (8-byte long)

Composite Types

Additional new composite types exist for indirect use through CQL. Using these types through an API client is not recommended. Composite types used through CQL 3 support Cassandra wide rows using composite column names to create tables.

About Validators

Using the CLI you can define a default row key validator for a column family using the key_validation_class property. Using CQL, you use built-in key validators to validate row key values. For static column families, define each column and its associated type when you define the column family using the column_metadata property.

For dynamic column families (where column names are not known ahead of time), you should specify a default_validation_class instead of defining the per-column data types.

Key and column validators may be added or changed in a column family definition at any time. If you specify an invalid validator on your column family, client requests that rely on that metadata may misinterpret the data, and data inserts or updates that do not conform to the specified validator will be rejected.
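The interplay between per-column metadata and a default validation class can be sketched as follows. This Python model is an assumption-laden illustration: the VALIDATORS table covers only three of the built-in types, the checks are approximations of what each type accepts, and the function and exception names are hypothetical.

```python
import uuid

# Approximate acceptance checks for a few built-in types (illustrative only).
VALIDATORS = {
    "UTF8Type": lambda v: isinstance(v, str),
    "LongType": lambda v: isinstance(v, int) and not isinstance(v, bool)
    and -2**63 <= v < 2**63,
    "UUIDType": lambda v: isinstance(v, uuid.UUID),
}


class InvalidValue(Exception):
    """Raised when a value does not conform to the column's validator."""


def validate_insert(column_metadata, default_validation_class, name, value):
    """Pick the per-column validator if defined, else the default, and check."""
    type_name = column_metadata.get(name, default_validation_class)
    if not VALIDATORS[type_name](value):
        raise InvalidValue(f"{name!r}: {value!r} is not a valid {type_name}")
    return type_name


# Static-style metadata: each known column has its own validator.
user_metadata = {"name": "UTF8Type", "age": "LongType"}
used_for_age = validate_insert(user_metadata, "UTF8Type", "age", 42)
used_for_other = validate_insert(user_metadata, "UTF8Type", "nickname", "jd")
```

A column that has no metadata entry, as is typical in a dynamic column family, falls back to the default validation class.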

About the Comparator

Within a row, columns are always stored in sorted order by their column name. The comparator specifies the data type for the column name, as well as the sort order in which columns are stored within a row. Unlike validators, the comparator may not be changed after the column family is defined, so this is an important consideration when defining a column family in Cassandra.

Typically, the column names in a static column family will be strings, and the sort order of columns is not important in that case. For dynamic column families, however, sort order is important. For example, in a column family that stores time series data (the column names are timestamps), having the data in sorted order is required for slicing result sets out of a row of columns.
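The effect of the comparator on slicing can be demonstrated with a short Python sketch that sorts the same column names numerically and as text. Python's sorted() is used here as a stand-in for a long-style and a string-style comparator; the sample names and the slice_names helper are invented for the example.

```python
column_names = [10, 9, 100, 2]

# Long-style comparator: names are compared as numbers.
as_long = sorted(column_names)

# Text-style comparator: the same names are compared as strings.
as_text = sorted(str(n) for n in column_names)


def slice_names(sorted_names, start, end):
    """Return the names between start and end (inclusive) in stored order."""
    return [n for n in sorted_names if start <= n <= end]


numeric_slice = slice_names(as_long, 9, 100)
text_slice = slice_names(as_text, "9", "99")
```

With the numeric ordering, a slice from 9 to 100 returns 9, 10, and 100. With the text ordering, "10" and "100" sort before "2" and "9", so a range that looks similar returns a different set of columns. Because the comparator cannot be changed later, it needs to match how the application intends to slice.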

Compressing Column Family Data

Cassandra's application-transparent compression maximizes the storage capacity of your Cassandra nodes by reducing the volume of data on disk. In addition to the space-saving benefits, compression also reduces disk I/O, particularly for read-dominated workloads. To compress column family data, use the CLI or CQL.