Apache Cassandra 1.1 Documentation

CQL Data Types

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.

Cassandra has a schema-optional data model. You can define data types when you create your column family schemas. Creating the schema is recommended, but not required. Column names, column values, and row key values can be typed in Cassandra.

CQL comes with the following built-in data types, which can be used for column names and column/row key values. One exception is counter, which is allowed only as a column value (not allowed for row key values or column names).

CQL Type    Description
ascii       US-ASCII character string
bigint      64-bit signed long
blob        Arbitrary bytes (no validation), expressed as hexadecimal
boolean     true or false
counter     Distributed counter value (64-bit long)
decimal     Variable-precision decimal
double      64-bit IEEE-754 floating point
float       32-bit IEEE-754 floating point
int         32-bit signed integer
text        UTF-8 encoded string
timestamp   Date plus time, encoded as 8 bytes (milliseconds since the epoch)
uuid        Type 1 or type 4 UUID
timeuuid    Type 1 UUID only (CQL3)
varchar     UTF-8 encoded string
varint      Arbitrary-precision integer
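
As a rough illustration of the integer widths and the hexadecimal blob form listed in the table, the following Python sketch picks the narrowest CQL integer type that can hold a value and renders raw bytes as hexadecimal digits. The helper names smallest_integer_type and blob_to_hex are hypothetical; they are not part of Cassandra or of any client driver.

```python
# Value ranges implied by the table: int is 32-bit signed, bigint is 64-bit signed.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
BIGINT_MIN, BIGINT_MAX = -2**63, 2**63 - 1


def smallest_integer_type(value):
    """Return the narrowest CQL integer type name that can hold value (hypothetical helper)."""
    if INT_MIN <= value <= INT_MAX:
        return "int"
    if BIGINT_MIN <= value <= BIGINT_MAX:
        return "bigint"
    # varint is arbitrary-precision, so it accepts anything larger.
    return "varint"


def blob_to_hex(data):
    """Render arbitrary bytes as the hexadecimal digits used to express a blob."""
    return data.hex()
```

For instance, 2**31 no longer fits in an int and falls through to bigint, while the two bytes 0xCA 0xFE are rendered as the digits cafe.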

In addition to the CQL types listed in this table, you can use a string containing the name of a class (a sub-class of AbstractType loadable by Cassandra) as a CQL type. The class name should either be fully qualified or relative to the org.apache.cassandra.db.marshal package.

CQL3 timeuuid Type

The timeuuid type in CQL3 uses the Thrift comparator TimeUUIDType underneath. This type accepts type 1 UUIDs only. Consequently, using this type prevents mistakenly inserting UUIDs that do not represent a time, and it allows the CQL date syntax, such as 2012-06-24 11:04:42, to be used to input the UUID.
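
To make the difference between type 1 and type 4 UUIDs concrete, here is a minimal Python sketch that uses only the standard uuid module. It checks the UUID version and recovers the time embedded in a type 1 UUID. The function names is_type1 and type1_unix_millis are assumptions made for this illustration; Cassandra performs its own validation on the server.

```python
import uuid

# Number of 100-nanosecond intervals between the UUID epoch
# (1582-10-15 00:00:00 UTC) and the Unix epoch (1970-01-01 00:00:00 UTC).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def is_type1(value):
    """Return True if value is a type 1 (time-based) UUID."""
    return value.version == 1


def type1_unix_millis(value):
    """Return the time stored in a type 1 UUID as milliseconds since the Unix epoch."""
    if not is_type1(value):
        raise ValueError("only type 1 UUIDs represent a time")
    # value.time counts 100-nanosecond intervals; 10,000 of them make one millisecond.
    return (value.time - UUID_EPOCH_OFFSET) // 10_000


time_based = uuid.uuid1()    # type 1: would be acceptable as a timeuuid
random_based = uuid.uuid4()  # type 4: carries no time, so not a timeuuid
```

A type 4 UUID such as random_based is fine for the uuid type, but type1_unix_millis rejects it because it holds no timestamp.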

Working with Dates and Times

Values serialized with the timestamp type are encoded as 64-bit signed integers representing the number of milliseconds since the standard base time known as the epoch: January 1, 1970, at 00:00:00 GMT.

Timestamp types can be input in CQL as simple long integers, giving the number of milliseconds since the epoch.

Timestamp types can also be input as string literals in any of the following ISO 8601 formats:

yyyy-MM-dd HH:mm
yyyy-MM-dd HH:mm:ss
yyyy-MM-dd HH:mmZ
yyyy-MM-dd HH:mm:ssZ
yyyy-MM-dd'T'HH:mm
yyyy-MM-dd'T'HH:mmZ
yyyy-MM-dd'T'HH:mm:ss
yyyy-MM-dd'T'HH:mm:ssZ
yyyy-MM-dd
yyyy-MM-ddZ

For example, for the date and time of Feb 3, 2011, at 04:05:00 AM, GMT:

2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000

The +0000 is the RFC 822 4-digit time zone specification for GMT. US Pacific Standard Time is -0800. The time zone may be omitted. For example:

2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03T04:05
2011-02-03T04:05:00

If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.

If you only want to capture date values, the time of day can also be omitted. For example:

2011-02-03
2011-02-03+0000

In this case, the time of day defaults to 00:00:00 in the specified or default time zone.
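
The relationship between these string literals and the stored millisecond value can be sketched in Python. The example below is an approximation of the rules described in this section, not Cassandra's own parser. The function name to_epoch_millis and its default_tz parameter are hypothetical; default_tz stands in for the coordinator node's time zone and is set to UTC here only for the sake of the example. Python's %z directive is also slightly more lenient than the four-digit form shown above.

```python
from datetime import datetime, timedelta, timezone

# strptime equivalents of the literal formats listed in this section.
FORMATS = (
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M%z",
    "%Y-%m-%d %H:%M:%S%z",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%d",
    "%Y-%m-%d%z",
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(literal, default_tz=timezone.utc):
    """Convert a timestamp literal to milliseconds since the epoch (illustrative only)."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(literal, fmt)
        except ValueError:
            continue  # the literal does not match this format; try the next one
        if parsed.tzinfo is None:
            # No zone in the literal: fall back to the assumed default zone.
            parsed = parsed.replace(tzinfo=default_tz)
        # Dividing two timedeltas with // yields an exact integer.
        return (parsed - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp literal: {literal!r}")


print(to_epoch_millis("2011-02-03 04:05+0000"))  # 1296705900000
```

Passing default_tz=timezone(timedelta(hours=-8)) shows why an explicit zone is safer: the same literal 2011-02-03 04:05 then maps to a value eight hours later than it does under UTC.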

CQL Comments

Comments can be used to document CQL statements in your application code. Single-line comments begin with a double dash (--) or a double slash (//) and extend to the end of the line. Multi-line comments are enclosed in /* and */ characters.
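
The three comment styles can be illustrated with a small Python sketch that removes them from a statement held in application code. This is a simplified approximation and not the CQL grammar itself: it assumes string literals are single-quoted, with a doubled quote as the escape, and the name strip_cql_comments is hypothetical.

```python
import re

# One pattern with four alternatives. String literals are matched first so that
# comment markers inside quotes are left alone.
_TOKEN = re.compile(
    r"""
    ('(?:[^']|'')*')   # group 1: single-quoted string literal, kept as-is
    | --[^\n]*         # single-line comment starting with a double dash
    | //[^\n]*         # single-line comment starting with a double slash
    | /\*.*?\*/        # multi-line comment between /* and */
    """,
    re.VERBOSE | re.DOTALL,
)


def strip_cql_comments(statement):
    """Return statement with comments removed and string literals untouched."""
    return _TOKEN.sub(lambda match: match.group(1) or "", statement)


statement = (
    "SELECT * FROM users -- single-line comment\n"
    "WHERE state='TX'; // another single-line comment\n"
    "/* a comment that\n"
    "   spans two lines */"
)
cleaned = strip_cql_comments(statement)
```

After the call, cleaned still contains the SELECT and WHERE text, and all three comments are gone.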

Specifying Consistency Level

In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replica nodes. For any given read or write operation, the client request specifies a consistency level, which determines how many replica nodes must successfully respond to the request.

In CQL, the default consistency level is ONE. You can set the consistency level for any read (SELECT) or write (INSERT, UPDATE, DELETE, BATCH) operation. For example:

SELECT * FROM users USING CONSISTENCY QUORUM WHERE state='TX';

Consistency level specifications are made up of the keywords USING CONSISTENCY, followed by a consistency level identifier. Valid consistency level identifiers are:

  • ANY (applicable to writes only)
  • ONE (default)
  • TWO
  • THREE
  • QUORUM
  • LOCAL_QUORUM (applicable to multi-data center clusters only)
  • EACH_QUORUM (applicable to multi-data center clusters only)
  • ALL
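
As a sketch of how application code might attach one of these identifiers to a read, the Python helper below validates the level against the list above and places the USING CONSISTENCY clause where the earlier SELECT example puts it. The function select_with_consistency is hypothetical and builds the statement by plain string formatting, so it does no escaping and is meant for illustration only.

```python
# The identifiers listed above, in the same order.
CONSISTENCY_LEVELS = (
    "ANY", "ONE", "TWO", "THREE",
    "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL",
)


def select_with_consistency(table, where, level="ONE"):
    """Build a SELECT statement with a USING CONSISTENCY clause (illustrative only)."""
    level = level.upper()
    if level not in CONSISTENCY_LEVELS:
        raise ValueError(f"unknown consistency level: {level}")
    if level == "ANY":
        # ANY is applicable to writes only, so reject it for a read.
        raise ValueError("ANY is applicable to writes only")
    return f"SELECT * FROM {table} USING CONSISTENCY {level} WHERE {where};"


query = select_with_consistency("users", "state='TX'", "QUORUM")
```

Here query holds the same statement as the earlier example, and omitting the level argument falls back to ONE, matching the CQL default.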

See tunable consistency for more information about the different consistency levels.