Apache Cassandra 0.7 Documentation

Column Families

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.

A column family resembles a table in an RDBMS. Column families contain rows and columns. Each row is uniquely identified by a row key and holds multiple columns, each of which has a name, a value, and a timestamp.

Unlike a table in an RDBMS, different rows in the same column family do not have to share the same set of columns, and a column may be added to one or more rows at any time. It can be useful to distinguish between “static” column families, which contain values such as user data or other object data, and “dynamic” column families, which contain data such as precalculated query results.
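
As a rough illustration of this model, the sketch below represents one column family as a plain Python dictionary. It is not a Cassandra client API; the users data, the add_column helper, and the timestamp values are all hypothetical and exist only to show that rows in the same column family can carry different columns.

```python
# Hypothetical in-memory model of one column family; not a Cassandra API.
# Layout: row key -> {column name -> (value, timestamp)}
users = {
    "jsmith": {
        "email": ("jsmith@example.com", 1300000000000000),
        "state": ("TX", 1300000000000000),
    },
    "bjones": {
        "email": ("bjones@example.com", 1300000005000000),
    },
}


def add_column(column_family, row_key, name, value, timestamp):
    """Add (or overwrite) one column in one row, creating the row if needed."""
    column_family.setdefault(row_key, {})[name] = (value, timestamp)


# A column can be added to a single row at any time; other rows are unaffected.
add_column(users, "bjones", "phone", "555-0100", 1300000009000000)

for row_key, columns in users.items():
    print(row_key, sorted(columns))
```

After the call, the row bjones has a phone column while jsmith does not, which a fixed-schema table could not express without a nullable column.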

Column Sorting & Comparator Types

Columns within a row are stored sorted by column name, in the order defined by the comparator configured for the column family. Note that super column families may use one comparator for super column names and a different one for column names.
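
To see why the choice of comparator matters, the following sketch sorts the same three column names in two ways: by raw byte values, and by the signed 64-bit integers those bytes encode. It is a standalone Python illustration rather than Cassandra code; encode_long, compare_bytes, and compare_longs are hypothetical helpers that only roughly mirror what a byte-order comparator and a long comparator would do.

```python
import struct
from functools import cmp_to_key


def encode_long(n):
    """Pack an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", n)


def decode_long(raw):
    """Inverse of encode_long."""
    return struct.unpack(">q", raw)[0]


def compare_bytes(a, b):
    """Order names by their raw, unsigned byte values."""
    return (a > b) - (a < b)


def compare_longs(a, b):
    """Order names by the signed 64-bit integers they encode."""
    x, y = decode_long(a), decode_long(b)
    return (x > y) - (x < y)


names = [encode_long(n) for n in (10, -1, 2)]

by_bytes = sorted(names, key=cmp_to_key(compare_bytes))
by_longs = sorted(names, key=cmp_to_key(compare_longs))

print([decode_long(n) for n in by_bytes])  # [2, 10, -1]
print([decode_long(n) for n in by_longs])  # [-1, 2, 10]
```

The byte-order sort places -1 last because its encoding begins with 0xFF, while the numeric sort places it first. The stored bytes are identical in both cases; only the comparator changes the order in which columns appear in a row.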

Writing a Custom Comparator

If none of the built-in comparator types meets your needs, you can write a custom comparator. The easiest approach is to start with src/java/org/apache/cassandra/db/marshal/LongType.java to get an idea of the functionality needed, then modify a copy of the class to support your own comparator type. Specifically, the class should have a static instance member (a singleton for the class) and implement or inherit two methods: compare() and toString().
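
The shape described above (one shared instance plus a comparison method and a string-rendering method) can be sketched outside of Cassandra. The Python class below is only a language-neutral illustration: a real custom comparator must be a Java class modeled on LongType.java, and ReversedLongComparator, its to_string method, and the sample names are hypothetical.

```python
import struct
from functools import cmp_to_key


class ReversedLongComparator:
    """Hypothetical comparator: orders 8-byte long names from high to low."""

    instance = None  # the single shared instance, assigned once below

    def compare(self, a, b):
        """Return a negative, zero, or positive int, like a Java compare()."""
        x = struct.unpack(">q", a)[0]
        y = struct.unpack(">q", b)[0]
        # Arguments are swapped relative to a natural ordering to reverse it.
        return (y > x) - (y < x)

    def to_string(self, raw):
        """Render a stored column name in human-readable form."""
        return str(struct.unpack(">q", raw)[0])


ReversedLongComparator.instance = ReversedLongComparator()

comparator = ReversedLongComparator.instance
names = [struct.pack(">q", n) for n in (3, 42, -7)]
ordered = sorted(names, key=cmp_to_key(comparator.compare))
print([comparator.to_string(n) for n in ordered])  # ['42', '3', '-7']
```

The comparison method defines the stored column order, and the string method exists so that tools can display names that are otherwise opaque bytes.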

Data Validation

For each column family, you can specify a default validation class to use for validating the column values. Valid values are the same types listed for the comparator. It is possible to implement additional validators by creating custom validation classes.

Cassandra can also validate data on a per-column basis. You can specify a validator class at the column level using the column metadata field, validator. Validators at the column level take precedence over the default validator specified at the column family level.

Note

The validator metadata field is required for all indexed columns. If you want to build a secondary index on a column, you must specify a column-level validator for that column.
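
The precedence rule for validators can be sketched as a simple lookup: use the column-level validator when one is defined, otherwise fall back to the column family default. This is a minimal Python model, not Cassandra's implementation; the dictionary keys, the validate_long and validate_utf8 functions, and the profiles definition are assumptions made for illustration.

```python
def validate_long(value):
    """Accept only values that are exactly 8 bytes long."""
    if not (isinstance(value, bytes) and len(value) == 8):
        raise ValueError("expected an 8-byte long")


def validate_utf8(value):
    """Accept only byte strings that decode as UTF-8."""
    value.decode("utf-8")  # raises UnicodeDecodeError, a ValueError subclass


# Hypothetical column family definition, modeled as a plain dictionary.
profiles = {
    "default_validator": validate_utf8,
    "column_metadata": {
        "age": {"validator": validate_long},
    },
}


def validator_for(column_family, column_name):
    """Column-level validator wins; otherwise use the family default."""
    metadata = column_family["column_metadata"].get(column_name, {})
    return metadata.get("validator", column_family["default_validator"])


def validate_column(column_family, column_name, value):
    """Raise ValueError if the value is rejected by the applicable validator."""
    validator_for(column_family, column_name)(value)


validate_column(profiles, "nickname", b"jsmith")      # checked by the default
validate_column(profiles, "age", b"\x00" * 7 + b"*")  # checked by validate_long
```

In this model, a value written to age must be 8 bytes even though the family default would accept any UTF-8 text, because the column-level entry overrides the default.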