Apache Cassandra 1.2 Documentation

CQL data types

CQL provides the following built-in data types for columns. The counter type is an exception: it is allowed only as a column value, not as a row key value.

CQL Type    Constants          Description
ascii       strings            US-ASCII character string
bigint      integers           64-bit signed long
blob        blobs              Arbitrary bytes (no validation), expressed as hexadecimal
boolean     booleans           true or false
counter     integers           Distributed counter value (64-bit long)
decimal     integers, floats   Variable-precision decimal
double      integers, floats   64-bit IEEE-754 floating point
float       integers, floats   32-bit IEEE-754 floating point
inet        strings            IP address string in IPv4 or IPv6 format [1]
int         integers           32-bit signed integer
list        n/a                A collection of one or more ordered elements
map         n/a                A collection of one or more key, value pairs
set         n/a                A collection of one or more elements
text        strings            UTF-8 encoded string
timestamp   integers, strings  Date plus time, encoded as 8 bytes since epoch
uuid        uuids              A UUID in standard UUID format
timeuuid    uuids              Type 1 UUID only (CQL 3)
varchar     strings            UTF-8 encoded string
varint      integers           Arbitrary-precision integer

[1] Used by the python-cql driver and binary protocols.

In addition to the CQL types listed in this table, you can use a string containing the name of a Java class (a subclass of AbstractType loadable by Cassandra) as a CQL type. The class name should be either fully qualified or relative to the org.apache.cassandra.db.marshal package.

Enclose ASCII text, timestamp, and inet values in single quotation marks. Enclose names of a keyspace, table, or column in double quotation marks.

Blob

Cassandra 1.2.3 still accepts blobs as string constants for input, to allow a smoother transition to blob constants. Blobs as strings are deprecated, and support for them will be removed in the near future. If you were using strings as blobs, update your client code to use blob constants.

A blob constant is a hexadecimal number defined by 0[xX](hex)+ where hex is a hexadecimal character, that is, [0-9a-fA-F]. For example, 0xcafe.
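
The blob constant pattern above can be checked on the client side before a statement is sent. The following is a minimal Python sketch, not part of any Cassandra driver; the helper name is_blob_constant is hypothetical and the regular expression is transcribed from the definition above.

```python
import re

# Pattern transcribed from the definition 0[xX](hex)+ with hex = [0-9a-fA-F].
BLOB_CONSTANT = re.compile(r"0[xX][0-9a-fA-F]+")


def is_blob_constant(text: str) -> bool:
    """Return True if the whole string is a well-formed blob constant.

    Hypothetical helper for illustration only.
    """
    return BLOB_CONSTANT.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("0xcafe", "0XCAFE", "cafe", "0x"):
        print(candidate, is_blob_constant(candidate))
```

A quoted string such as 'cafe' does not match the pattern, which is the case the deprecation notice above asks client code to stop relying on.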

Blob conversion functions

A number of functions convert the native types into binary data (blob). For every nonblob native type supported by CQL3, the typeAsBlob function takes an argument of type type and returns it as a blob. Conversely, the blobAsType function takes a blob argument and converts it to a value of type type. For example, blobAsBigint takes a 64-bit blob argument and converts it to a bigint value: bigintAsBlob(3) is 0x0000000000000003 and blobAsBigint(0x0000000000000003) is 3.
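
To make the bigint example concrete, the following Python sketch reproduces the same byte layout outside Cassandra. It assumes a bigint is serialized as an 8-byte, big-endian, two's-complement integer, which is consistent with the 0x0000000000000003 example above. The function names are hypothetical stand-ins for the CQL functions and do not call Cassandra.

```python
import struct


def bigint_as_blob(value: int) -> str:
    """Model of bigintAsBlob: 8 bytes, big-endian, signed (an assumption)."""
    return "0x" + struct.pack(">q", value).hex()


def blob_as_bigint(blob: str) -> int:
    """Model of blobAsBigint: accept a 64-bit blob constant, return an int."""
    raw = bytes.fromhex(blob[2:])  # drop the leading 0x / 0X
    if len(raw) != 8:
        raise ValueError("a bigint blob must be exactly 64 bits long")
    return struct.unpack(">q", raw)[0]


if __name__ == "__main__":
    print(bigint_as_blob(3))
    print(blob_as_bigint("0x0000000000000003"))
```

The length check mirrors the requirement that the blob passed to blobAsBigint be 64 bits wide.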

The map, set, and list collection types

A collection column is declared using the collection type, followed by another type, such as int or text, in angle brackets. For example, you can create a table having a list of textual elements, a list of integers, or a list of some other element types.

list<text>
list<int>

Collection types cannot currently be nested. For example, you cannot define a list within a list:

list<list<int>>     // not allowed

Currently, you cannot create a secondary index on a column of type map, set, or list.

UUID types for column names

The UUID (universally unique identifier) comparator type is used to avoid collisions in column names. Alternatively, you can use the timeuuid type.

Timeuuid type

A value of the timeuuid type is a Type 1 UUID. Type 1 UUIDs include the time of their generation and are sorted by timestamp, making them ideal for applications that require conflict-free timestamps. For example, you can use this type to identify a column (such as a blog entry) by its timestamp and allow multiple clients to write to the same row key simultaneously. Collisions that would overwrite data unintentionally cannot occur.

A valid timeuuid conforms to the timeuuid format shown in valid expressions.

Timeuuid functions

You can use these functions with the timeuuid type:

  • dateOf()

    Used in a SELECT clause, this function extracts the timestamp of a timeuuid column in a resultset. This function returns the extracted timestamp as a date. Use unixTimestampOf() to get a raw timestamp.

  • now()

    Generates a new unique timeuuid when the statement is executed. This method is useful for inserting values. The value returned by now() is guaranteed to be unique.

  • minTimeuuid() and maxTimeuuid()

    Returns a UUID-like result given a conditional time component as an argument. For example:

    SELECT * FROM myTable
       WHERE t > maxTimeuuid('2013-01-01 00:05+0000')
       AND t < minTimeuuid('2013-02-02 10:00+0000')
    

    This example selects all rows where the timeuuid column, t, is strictly later than 2013-01-01 00:05+0000 but strictly earlier than 2013-02-02 10:00+0000. The condition t >= maxTimeuuid('2013-01-01 00:05+0000') still does not select a timeuuid generated exactly at 2013-01-01 00:05+0000, so it is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').

    The values returned by the minTimeuuid and maxTimeuuid functions are not true UUIDs: they do not conform to the time-based UUID generation process specified by RFC 4122.

    Warning

    The values returned by these methods are not unique. Use these methods for querying only. Inserting the result of these methods in the database is not recommended.

  • unixTimestampOf()

    Used in a SELECT clause, this function extracts the timestamp of a timeuuid column in a resultset. Returns the value as a raw, 64-bit integer timestamp.
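
The relationship between a Type 1 UUID and the timestamp that dateOf() and unixTimestampOf() extract can be sketched in Python with the standard uuid module. This is an illustration of the RFC 4122 time field, not Cassandra's implementation. The helper names are hypothetical, and timeuuid_from_millis builds a UUID with a fixed clock sequence and node purely for demonstration, so its results are not unique.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond intervals between the UUID epoch (1582-10-15) and the
# Unix epoch (1970-01-01), as defined for RFC 4122 time-based UUIDs.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_timestamp_of(u: uuid.UUID) -> int:
    """Model of unixTimestampOf: milliseconds since the Unix epoch."""
    if u.version != 1:
        raise ValueError("not a Type 1 (time-based) UUID")
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


def date_of(u: uuid.UUID) -> datetime:
    """Model of dateOf: the same instant as a timezone-aware datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=unix_timestamp_of(u))


def timeuuid_from_millis(ms: int, clock_seq: int = 0, node: int = 0) -> uuid.UUID:
    """Build a demonstration Type 1 UUID for a given millisecond timestamp."""
    ticks = ms * 10_000 + UUID_EPOCH_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


if __name__ == "__main__":
    u = timeuuid_from_millis(1356998700000)  # 2013-01-01 00:05 UTC
    print(u, unix_timestamp_of(u), date_of(u))
```

Because the clock sequence and node are fixed here, two calls with the same timestamp return the same value, which is the same reason the warning above advises against inserting non-unique generated values.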

Timestamp type

Values for the timestamp type are encoded as 64-bit signed integers representing the number of milliseconds since the standard base time known as the epoch: January 1, 1970 at 00:00:00 GMT. Timestamp values can be entered as integers for CQL input.

Timestamp types can also be input as string literals in any of the following ISO 8601 formats:

yyyy-mm-dd HH:mm
yyyy-mm-dd HH:mm:ss
yyyy-mm-dd HH:mmZ
yyyy-mm-dd HH:mm:ssZ
yyyy-mm-dd'T'HH:mm
yyyy-mm-dd'T'HH:mmZ
yyyy-mm-dd'T'HH:mm:ss
yyyy-mm-dd'T'HH:mm:ssZ
yyyy-mm-dd
yyyy-mm-ddZ

where Z is the RFC 822 4-digit time zone, expressing the time zone's difference from UTC. For example, for the date and time of Feb 3, 2011, at 04:05:00 AM, GMT:

2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000

The +0000 is the RFC 822 4-digit time zone specification for GMT. US Pacific Standard Time is -0800. The time zone may be omitted. For example:

2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03T04:05
2011-02-03T04:05:00

If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.

If you only want to capture date values, the time of day can also be omitted. For example:

2011-02-03
2011-02-03+0000

In this case, the time of day defaults to 00:00:00 in the specified or default time zone.
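
The mapping from these string literals to the stored millisecond value can be sketched in Python. This is an approximation for illustration, not Cassandra's parser: the format list is a transcription of the patterns above into strptime directives, the function name to_epoch_millis is hypothetical, and the default_tz parameter stands in for the coordinator node's time zone when the literal omits one.

```python
from datetime import datetime, timedelta, timezone

# strptime equivalents of the documented formats (%z is the 4-digit zone).
FORMATS = [
    "%Y-%m-%d %H:%M", "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M%z", "%Y-%m-%d %H:%M:%S%z",
    "%Y-%m-%dT%H:%M", "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%d", "%Y-%m-%d%z",
]
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(literal: str, default_tz: timezone = timezone.utc) -> int:
    """Convert a timestamp literal to milliseconds since the epoch."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(literal, fmt)
        except ValueError:
            continue  # literal does not match this format; try the next one
        if parsed.tzinfo is None:
            # No zone in the literal: fall back to the assumed default zone.
            parsed = parsed.replace(tzinfo=default_tz)
        return (parsed - UNIX_EPOCH) // timedelta(milliseconds=1)
    raise ValueError("unsupported timestamp literal: " + literal)


if __name__ == "__main__":
    print(to_epoch_millis("2011-02-03 04:05+0000"))
    print(to_epoch_millis("2011-02-03 04:05", timezone(timedelta(hours=-8))))
```

Passing a different default_tz changes the result for zone-less literals only, which illustrates why specifying the time zone explicitly is recommended.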

Timestamp output appears in the following format by default:

yyyy-mm-dd HH:mm:ssZ

You can change the format by setting the time_format property in the [ui] section of the .cqlshrc file.

Counter type

To use counter types, see the DataStax blog about counters. Do not assign this type to a column that serves as the primary key. Also, do not use the counter type in a table that contains anything other than counter columns and the primary key. To generate sequential numbers for surrogate keys, use the timeuuid type instead of the counter type.