Apache Cassandra 1.0 Documentation

About Column Families

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.


When comparing Cassandra to a relational database, the column family is similar to a table in that it is a container for columns and rows. However, a column family requires a major shift in thinking for those coming from the relational world.

In a relational database, you define tables, which have defined columns. The table defines the column names and their data types, and the client application then supplies rows conforming to that schema: each row contains the same fixed set of columns.

In Cassandra, you define column families. Column families can (and should) define metadata about the columns, but the actual columns that make up a row are determined by the client application. Each row can have a different set of columns.

Although column families are very flexible, in practice a column family is not entirely schema-less. Each column family should be designed to contain a single type of data. There are two typical column family design patterns in Cassandra: static and dynamic column families.

A static column family uses a relatively static set of column names and is more similar to a relational database table. For example, a column family storing user data might have columns for the user name, address, email, phone number and so on. Although the rows will generally have the same set of columns, they are not required to have all of the columns defined. Static column families typically have column metadata pre-defined for each column.


[Figure: static column family]

A dynamic column family takes advantage of Cassandra's ability to use arbitrary application-supplied column names to store data. A dynamic column family allows you to pre-compute result sets and store them in a single row for efficient data retrieval. Each row is a snapshot of data meant to satisfy a given query, somewhat like a materialized view. For example, a dynamic column family might track the users who subscribe to a particular user's blog.


[Figure: dynamic column family]

Instead of defining metadata for individual columns, a dynamic column family defines the type information for column names and values (comparators and validators), but the actual column names and values are set by the application when a column is inserted.

For all column families, each row is uniquely identified by its row key, similar to the primary key in a relational table. A column family is always partitioned on its row key, and the row key is always implicitly indexed. Empty row keys are not allowed.
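As a rough mental model only (this is not Cassandra's storage format or client API), a column family can be pictured as a map from row key to a map of columns. The sketch below uses plain Python; the names users, blog_subscribers and get_row, and all sample data, are invented for illustration.

```python
# Toy model: a column family is a map of row key -> {column name: value}.

# Static column family: rows share a mostly fixed set of column names,
# but a row is not required to contain every column.
users = {
    "jsmith": {"name": "John Smith", "email": "jsmith@example.com", "phone": "555-0100"},
    "bjones": {"name": "Bob Jones", "email": "bjones@example.com"},  # no phone column
}

# Dynamic column family: the column names themselves are application data.
# Here the row key is a blogger and each column name is a subscriber.
blog_subscribers = {
    "jsmith": {"alice": "", "bjones": "", "carol": ""},
    "bjones": {"dave": ""},
}


def get_row(column_family, row_key):
    """Look up a row by its row key; empty row keys are rejected."""
    if row_key == "":
        raise ValueError("empty row keys are not allowed")
    return column_family.get(row_key, {})


print(sorted(get_row(users, "bjones")))          # column names present in this row
print(len(get_row(blog_subscribers, "jsmith")))  # number of subscribers stored
```

In both cases the row key is the only lookup path shown, which mirrors the point above that every column family is partitioned and implicitly indexed on its row key.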

About Columns

The column is the smallest increment of data in Cassandra. It is a tuple containing a name, a value and a timestamp.


[Figure: column]

A column must have a name, and the name can be a static label (such as "name" or "email") or it can be dynamically set when the column is created by your application.

Columns can be indexed on their name (see secondary indexes). However, one limitation of column indexes is that they do not support queries that require access to ordered data, such as time series data. In this case, a secondary index on a timestamp column would not be sufficient because you cannot control column sort order with a secondary index. For cases where sort order is important, manually maintaining a column family as an 'index' is another way to look up column data in sorted order.

A column is not required to have a value. Sometimes all the information your application needs to satisfy a given query can be stored in the column name itself. For example, if you are using a column family as a materialized view to look up rows from other column families, all you need to store is the row key that you are looking up; the value can be empty.

Cassandra uses the column timestamp to determine the most recent update to a column. The timestamp is provided by the client application. The latest timestamp always wins when requesting data, so if multiple client sessions update the same columns in a row concurrently, the most recent update is the one that will eventually persist. See About Transactions and Concurrency Control for more information about how Cassandra handles conflict resolution.
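The "latest timestamp wins" rule can be sketched as follows. This is a minimal illustration in Python, not Cassandra's implementation; the Column class and reconcile function are hypothetical names, and the sketch does not model how Cassandra handles two versions with identical timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: bytes
    timestamp: int  # supplied by the client, e.g. microseconds since the epoch


def reconcile(a: Column, b: Column) -> Column:
    """Return whichever version of a column carries the later timestamp.

    On equal timestamps this sketch simply returns b; tie-breaking in
    Cassandra itself is not modeled here.
    """
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    return a if a.timestamp > b.timestamp else b


# Two client sessions update the same column concurrently.
older = Column("email", b"old@example.com", timestamp=1000)
newer = Column("email", b"new@example.com", timestamp=2000)

# The result does not depend on the order in which the updates arrive.
print(reconcile(older, newer).value)
print(reconcile(newer, older).value)
```

Because the timestamp comes from the client, clients with skewed clocks can cause an update that happened later in real time to lose.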

About Special Columns (Counter, Expiring, Super)

Cassandra has three special types of columns, described below:

About Expiring Columns

A column can also have an optional expiration period called TTL (time to live). Whenever a column is inserted, the client request can specify an optional TTL value, defined in seconds, for the column. TTL columns are marked as deleted (with a tombstone) after the requested amount of time has expired. Once they are marked with a tombstone, they are automatically removed during the normal compaction (governed by gc_grace_seconds) and repair processes.

You can use either CLI or CQL to set the TTL for a column. See Setting an Expiring Column and Specifying Column Expiration with TTL.

If you want to change the TTL of an expiring column, you have to re-insert the column with a new TTL. In Cassandra the insertion of a column is actually an insertion or update operation, depending on whether or not a previous version of the column exists. This means that to update the TTL for a column with an unknown value, you have to read the column and then re-insert it with the new TTL value.

TTL columns have a precision of one second, as calculated on the server. Therefore, a very small TTL probably does not make much sense. Moreover, the clocks on the servers should be synchronized; otherwise reduced precision could be observed because the expiration time is computed on the primary host that receives the initial insertion but is then interpreted by other hosts on the cluster.

An expiring column has an additional overhead of 8 bytes in memory and on disk (to record the TTL and expiration time) compared to standard columns.
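The TTL behavior described above can be sketched in a few lines. This is a simplified Python model under stated assumptions: the names ExpiringColumn, insert_with_ttl, is_expired and change_ttl are hypothetical, the clock is passed in explicitly, and the exact boundary at which a column counts as expired is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpiringColumn:
    name: str
    value: bytes
    ttl_seconds: int
    expires_at: int  # whole seconds since the epoch, fixed when the column is inserted


def insert_with_ttl(name, value, ttl_seconds, now):
    """Compute the expiration time once, on the host that receives the insert."""
    return ExpiringColumn(name, value, ttl_seconds, int(now) + ttl_seconds)


def is_expired(column, now):
    """Any host later compares its own clock against the stored expiration time."""
    return int(now) >= column.expires_at


def change_ttl(column, new_ttl_seconds, now):
    """Changing the TTL means reading the column and re-inserting it with a new TTL."""
    return insert_with_ttl(column.name, column.value, new_ttl_seconds, now)


session = insert_with_ttl("session_token", b"abc123", ttl_seconds=30, now=1000)
print(is_expired(session, now=1029))   # still live one second before expiry
print(is_expired(session, now=1030))   # expired once 30 seconds have passed

renewed = change_ttl(session, new_ttl_seconds=60, now=1020)
print(renewed.expires_at)
```

The sketch also shows why clock synchronization matters: expires_at is computed with one host's clock but checked against another's.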

About Counter Columns

A counter is a special kind of column used to store a number that incrementally counts the occurrences of a particular event or process. For example, you might use a counter column to count the number of times a page is viewed.

Counter column families must use CounterColumnType as the validator (the column value type). This means that currently, counters may only be stored in dedicated column families; they will be allowed to mix with normal columns in a future release.

Counter columns are different from regular columns in that once a counter is defined, the client application then updates the column value by incrementing (or decrementing) it. A client update to a counter column passes the name of the counter and the increment (or decrement) value; no timestamp is required.


[Figure: counter column]

Internally, the structure of a counter column is a bit more complex. Cassandra tracks the distributed state of the counter as well as a server-generated timestamp upon deletion of a counter column. For this reason, it is important that all nodes in your cluster have their clocks synchronized using the Network Time Protocol (NTP).

A counter can be read or written at any of the available consistency levels. However, it's important to understand that unlike normal columns, a write to a counter requires a read in the background to ensure that distributed counter values remain consistent across replicas. If you write at a consistency level of ONE, the implicit read will not impact write latency; hence, ONE is the most common consistency level to use with counters.
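From the client's point of view, a counter update carries only a counter name and a delta. The sketch below models just that interface in Python on a single in-memory map; it deliberately ignores replication, consistency levels and the internal distributed state mentioned above. CounterColumnFamily and its methods are hypothetical names, not a Cassandra client API.

```python
from collections import defaultdict


class CounterColumnFamily:
    """Toy, single-process stand-in for a dedicated counter column family."""

    def __init__(self):
        # row key -> counter name -> current count
        self._rows = defaultdict(lambda: defaultdict(int))

    def add(self, row_key, counter_name, delta=1):
        """Apply an increment (positive delta) or decrement (negative delta).

        Note that the caller passes no timestamp and no absolute value.
        """
        self._rows[row_key][counter_name] += delta

    def get(self, row_key, counter_name):
        """Return the current count (0 if the counter was never updated)."""
        return self._rows[row_key][counter_name]


page_views = CounterColumnFamily()
page_views.add("/index.html", "views")        # one page view
page_views.add("/index.html", "views", 5)     # a batch of five more
page_views.add("/index.html", "views", -2)    # a correction
print(page_views.get("/index.html", "views"))
```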

About Super Columns

A Cassandra column family can contain either regular columns or super columns, which add another level of nesting to the regular column family structure. Super columns are composed of a (super) column name and an ordered map of sub-columns. A super column can specify a comparator on both the super column name and the sub-column names.


[Figure: super column]

A super column is a way to group multiple columns based on a common lookup value. The primary use case for super columns is to denormalize multiple rows from other column families into a single row, allowing for materialized view data retrieval. For example, suppose you wanted to create a materialized view of blog entries for the bloggers that a user follows.

[Figure: super column example]

One limitation of super columns is that all sub-columns of a super column must be deserialized in order to read a single sub-column value, and you cannot create secondary indexes on the sub-columns of a super column. Therefore, super columns are best suited for use cases where the number of sub-columns is relatively small.
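The extra level of nesting, and the cost of reading one sub-column, can be illustrated with a small Python sketch. It is a toy model: each super column is stored as one serialized unit, with JSON standing in for Cassandra's actual on-disk encoding, and the function names and sample data are invented.

```python
import json


def write_super_column(row, super_column_name, sub_columns):
    """Store the sub-columns, sorted by name, as a single serialized unit."""
    ordered = dict(sorted(sub_columns.items()))
    row[super_column_name] = json.dumps(ordered)


def read_sub_column(row, super_column_name, sub_column_name):
    """Reading one sub-column deserializes every sub-column of the super column."""
    sub_columns = json.loads(row[super_column_name])
    return sub_columns[sub_column_name]


# One row per user; one super column per followed blogger's entry.
followed_entries_for_alice = {}
write_super_column(
    followed_entries_for_alice,
    "bjones",
    {"title": "Hello", "posted": "2012-01-05", "body": "First post"},
)

print(read_sub_column(followed_entries_for_alice, "bjones", "title"))
```

The cost of read_sub_column grows with the number of sub-columns, which is why super columns fit best when that number stays small.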

About Data Types (Comparators and Validators)

In a relational database, you must specify a data type for each column when you define a table. The data type constrains the values that can be inserted into that column. For example, if you have a column defined as an integer datatype, you would not be allowed to insert character data into that column. Column names in a relational database are typically fixed labels (strings) that are assigned when you define the table schema.

In Cassandra, the data type for a column (or row key) value is called a validator. The data type for a column name is called a comparator. You can define data types when you create your column family schemas (which is recommended), but Cassandra does not require it. Internally, Cassandra stores column names and values as hex byte arrays (BytesType). This is the default client encoding used if data types are not defined in the column family schema (or if not specified by the client request).

Cassandra comes with the following built-in data types, which can be used as either validators (row key and column value data types) or comparators (column name data types). One exception is CounterColumnType, which is only allowed as a column value (not allowed for row keys or column names).

Internal Type        CQL Name         Description
BytesType            blob             Arbitrary hexadecimal bytes (no validation)
AsciiType            ascii            US-ASCII character string
UTF8Type             text, varchar    UTF-8 encoded string
IntegerType          varint           Arbitrary-precision integer
LongType             int, bigint      8-byte long
UUIDType             uuid             Type 1 or type 4 UUID
DateType             timestamp        Date plus time, encoded as 8 bytes since epoch
BooleanType          boolean          true or false
FloatType            float            4-byte floating point
DoubleType           double           8-byte floating point
DecimalType          decimal          Variable-precision decimal
CounterColumnType    counter          Distributed counter value (8-byte long)

About Validators

For all column families, it is best practice to define a default row key validator using the key_validation_class property.

For static column families, you should define each column and its associated type when you define the column family using the column_metadata property.

For dynamic column families (where column names are not known ahead of time), you should specify a default_validation_class instead of defining the per-column data types.

Key and column validators may be added or changed in a column family definition at any time. If you specify an incorrect validator on your column family, client requests that rely on that metadata may misinterpret existing data, and data inserts or updates that do not conform to the specified validator will be rejected.
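The interplay of per-column metadata and a default validator can be sketched as below. This is a simplified Python model, not Cassandra's validation code: the checks for UTF8Type and LongType are approximations (well-formed UTF-8, exactly 8 bytes), and the names VALIDATORS, column_metadata, default_validation_class and check_insert are used here only for illustration.

```python
import struct


def validate_utf8(raw: bytes) -> None:
    # Raises UnicodeDecodeError (a subclass of ValueError) on malformed input.
    raw.decode("utf-8")


def validate_long(raw: bytes) -> None:
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes for a long, got {len(raw)}")


def validate_bytes(raw: bytes) -> None:
    # BytesType performs no validation.
    return None


VALIDATORS = {
    "UTF8Type": validate_utf8,
    "LongType": validate_long,
    "BytesType": validate_bytes,
}

# Static columns carry their own validator; anything else uses the default.
column_metadata = {"name": "UTF8Type", "birth_year": "LongType"}
default_validation_class = "BytesType"


def check_insert(column_name: str, raw_value: bytes) -> None:
    """Raise ValueError if raw_value does not conform to the column's validator."""
    type_name = column_metadata.get(column_name, default_validation_class)
    VALIDATORS[type_name](raw_value)


check_insert("birth_year", struct.pack(">q", 1975))   # accepted: 8-byte long
check_insert("nickname", b"\xff\xfe")                  # accepted: default is BytesType
try:
    check_insert("birth_year", b"1975")                # rejected: 4 bytes of text
except ValueError as error:
    print("rejected:", error)
```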

About Comparators

Within a row, columns are always stored in sorted order by their column name. The comparator specifies the data type for the column name, as well as the sort order in which columns are stored within a row. Unlike validators, the comparator may not be changed after the column family is defined, so this is an important consideration when defining a column family in Cassandra.

Typically, the column names in a static column family will be strings, and the sort order of columns is not important in that case. For dynamic column families, however, sort order is important. For example, in a column family that stores time series data (the column names are timestamps), having the data in sorted order is required for slicing result sets out of a row of columns.
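To see why the choice of comparator matters for slicing, consider the same numeric column names sorted as text and as numbers. The Python sketch below is illustrative only: it uses ordinary Python sorting to stand in for a string comparator and a numeric comparator, and slice_columns is a hypothetical helper, not a Cassandra call.

```python
from bisect import bisect_left, bisect_right

# Column names that are really numbers (for example, timestamps).
timestamps = [1000, 9, 250, 10]

# String-style comparator: names are compared character by character.
as_text = sorted(str(t) for t in timestamps)

# Numeric-style comparator: names are compared by numeric value.
as_long = sorted(timestamps)


def slice_columns(sorted_names, start, finish):
    """Return the contiguous run of names in [start, finish] from a sorted list."""
    low = bisect_left(sorted_names, start)
    high = bisect_right(sorted_names, finish)
    return sorted_names[low:high]


print(as_text)                          # ['10', '1000', '250', '9']
print(as_long)                          # [9, 10, 250, 1000]
print(slice_columns(as_long, 10, 500))  # [10, 250]
```

With the text ordering, the names between 10 and 500 are not contiguous, so a range slice cannot return them as a single run. Since the comparator cannot be changed later, this has to be decided when the column family is defined.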

About Column Family Compression

Data compression can be configured on a per-column family basis. Compression maximizes the storage capacity of your Cassandra nodes by reducing the volume of data on disk. In addition to the space-saving benefits, compression also reduces disk I/O, particularly for read-dominated workloads.

Besides reducing data size, compression typically improves both read and write performance. Cassandra is able to quickly find the location of rows in the SSTable index, and decompresses only the relevant row chunks. As a result, compression improves read performance not only by allowing a larger data set to fit in memory, but also for workloads where the hot data set does not fit into memory.

Unlike in traditional databases, write performance is not negatively impacted by compression in Cassandra. Writes on compressed tables can in fact show up to a 10 percent performance improvement. In traditional relational databases, writes require overwrites to existing data files on disk. This means that the database has to locate the relevant pages on disk, decompress them, overwrite the relevant data, and then compress them again (an expensive operation in both CPU cycles and disk I/O).

Because Cassandra SSTable data files are immutable (they are not written to again after they have been flushed to disk), there is no recompression cycle necessary in order to process writes. SSTables are only compressed once, when they are written to disk.
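The idea of compressing once in fixed-size chunks and decompressing only the chunks a read touches can be sketched as follows. This Python example is a toy model, not the SSTable format: zlib from the standard library stands in for Snappy, the 64 KB chunk size is an arbitrary choice for the sketch, and write_compressed and read_range are hypothetical names.

```python
import zlib

CHUNK_LENGTH = 64 * 1024  # bytes per uncompressed chunk (assumed for this sketch)


def write_compressed(data: bytes):
    """Compress the data once, chunk by chunk, as it is written out."""
    return [
        zlib.compress(data[start:start + CHUNK_LENGTH])
        for start in range(0, len(data), CHUNK_LENGTH)
    ]


def read_range(chunks, offset: int, length: int) -> bytes:
    """Read length bytes (length > 0) at offset, decompressing only overlapping chunks."""
    first = offset // CHUNK_LENGTH
    last = (offset + length - 1) // CHUNK_LENGTH
    window = b"".join(zlib.decompress(chunk) for chunk in chunks[first:last + 1])
    start = offset - first * CHUNK_LENGTH
    return window[start:start + length]


data = bytes(range(256)) * 1024          # 256 KB of sample data, i.e. four chunks
chunks = write_compressed(data)          # written once, never rewritten

# A read inside the second chunk decompresses that chunk only.
print(read_range(chunks, 70000, 100) == data[70000:70100])
```

Nothing in this model ever recompresses a chunk after it is written, which is the property that keeps writes cheap when data files are immutable.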

Enabling compression can yield the following benefits, depending on the data characteristics of the column family:

  • 2x-4x reduction in data size
  • 25-35% performance improvement on reads
  • 5-10% performance improvement on writes

When to Use Compression

Compression is best suited for column families where there are many rows, with each row having the same columns, or at least many columns in common. For example, a column family containing user data such as username, email, etc., would be a good candidate for compression. The more similar the data across rows, the greater the compression ratio will be, and the larger the gain in read performance.

Compression is not as good a fit for column families where each row has a different set of columns, or where there are just a few very wide rows. Dynamic column families such as these will not yield good compression ratios.
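The effect of row similarity on compression ratio is easy to demonstrate outside Cassandra. The Python sketch below compares rows that repeat the same column names against unrelated random bytes of the same size. It uses zlib as a stand-in compressor and invented sample data, so the ratios it produces say nothing about the figures you would see in a real cluster.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Uncompressed size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Many rows that share the same column names and similar values.
similar_rows = "".join(
    f"name=user{i};email=user{i}@example.com;state=TX;gender=F;"
    for i in range(2000)
).encode("utf-8")

# Content of the same total size with nothing in common between rows.
rng = random.Random(42)  # fixed seed so the example is repeatable
dissimilar_rows = bytes(rng.randrange(256) for _ in range(len(similar_rows)))

print(f"similar rows:    {compression_ratio(similar_rows):.1f}x")
print(f"dissimilar rows: {compression_ratio(dissimilar_rows):.1f}x")
```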

Configuring Compression on a Column Family

When you create or update a column family, you can choose to make it a compressed column family by setting the compression_options attributes.

You can enable compression when you create a new column family, or update an existing column family to add compression later on. When you add compression to an existing column family, existing SSTables on disk are not compressed immediately. Any new SSTables that are created will be compressed, and any existing SSTables will be compressed during the normal Cassandra compaction process. If necessary, you can force existing SSTables to be rewritten and compressed by using nodetool upgradesstables (Cassandra 1.0.4 or later) or nodetool scrub.

For example, to create a new column family with compression enabled using the Cassandra CLI, you would do the following:

[default@demo] CREATE COLUMN FAMILY users
  WITH key_validation_class=UTF8Type
  AND column_metadata = [
    {column_name: name, validation_class: UTF8Type}
    {column_name: email, validation_class: UTF8Type}
    {column_name: state, validation_class: UTF8Type}
    {column_name: gender, validation_class: UTF8Type}
    {column_name: birth_year, validation_class: LongType}
  ]
  AND compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};