Cassandra has a schema-optional data model. You can define data types when you create your column family schemas. Creating the schema is recommended, but not required. Column names, column values, and row key values can be typed in Cassandra.
CQL comes with the following built-in data types, which can be used for column names and column/row key values. One exception is counter, which is allowed only as a column value (not allowed for row key values or column names).
|CQL Type|Description|
|ascii|US-ASCII character string|
|bigint|64-bit signed long|
|blob|Arbitrary bytes (no validation), expressed as hexadecimal|
|boolean|true or false|
|counter|Distributed counter value (64-bit long)|
|double|64-bit IEEE-754 floating point|
|float|32-bit IEEE-754 floating point|
|int|32-bit signed integer|
|text|UTF-8 encoded string|
|timestamp|Date plus time, encoded as 8 bytes since epoch|
|uuid|Type 1 or type 4 UUID|
|timeuuid|Type 1 UUID only (CQL3)|
|varchar|UTF-8 encoded string|
In addition to the CQL types listed in this table, you can use a string containing the name of a class (a sub-class of AbstractType loadable by Cassandra) as a CQL type. The class name should either be fully qualified or relative to the org.apache.cassandra.db.marshal package.
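As a rough sketch of how several of the built-in types might be declared together, the following CQL3 statement defines a hypothetical table. The table name user_events and every column name are assumptions made for illustration; they do not come from any real schema.

```cql
-- Hypothetical table; every name below is illustrative only.
CREATE TABLE user_events (
  user_id uuid,          -- type 1 or type 4 UUID, used as the row key
  event_time timestamp,  -- date plus time, milliseconds since the epoch
  event_name text,       -- UTF-8 encoded string
  source_host ascii,     -- US-ASCII character string
  payload blob,          -- arbitrary bytes, no validation
  bytes_sent bigint,     -- 64-bit signed long
  response_ms int,       -- 32-bit signed integer
  success boolean,       -- true or false
  score double,          -- 64-bit IEEE-754 floating point
  PRIMARY KEY (user_id, event_time)
);
```

The counter type is left out of this sketch because it is allowed only as a column value and cannot be mixed with ordinary columns in the same table.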
The timeuuid type in CQL3 uses the Thrift comparator TimeUUIDType underneath. This type accepts only type 1 UUIDs. Consequently, it prevents mistakenly inserting UUIDs that do not represent a time, and it allows the CQL date syntax, such as 2012-06-24 11:04:42, to be used to input the UUID.
Values serialized with the timestamp type are encoded as 64-bit signed integers representing a number of milliseconds since the standard base time known as the epoch: January 1 1970 at 00:00:00 GMT.
Timestamp types can be input in CQL as simple long integers, giving the number of milliseconds since the epoch.
Timestamp types can also be input as string literals in any of the following ISO 8601 formats:
yyyy-mm-dd HH:mm
yyyy-mm-dd HH:mm:ss
yyyy-mm-dd HH:mmZ
yyyy-mm-dd HH:mm:ssZ
yyyy-mm-dd'T'HH:mm
yyyy-mm-dd'T'HH:mmZ
yyyy-mm-dd'T'HH:mm:ss
yyyy-mm-dd'T'HH:mm:ssZ
yyyy-mm-dd
yyyy-mm-ddZ
For example, for the date and time of February 3, 2011, at 04:05:00 AM, GMT:
2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000
The +0000 is the RFC 822 4-digit time zone specification for GMT. US Pacific Standard Time is -0800. The time zone may be omitted. For example:
2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03T04:05
2011-02-03T04:05:00
If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.
If you only want to capture date values, the time of day can also be omitted. For example:
2011-02-03
2011-02-03+0000
In this case, the time of day defaults to 00:00:00 in the specified or default time zone.
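The input forms described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical CQL3 table user_events with a uuid key column user_id, a timestamp column event_time, and a text column event_name; none of these names come from the original text. The integer 1296705900000 is the number of milliseconds from the epoch to 2011-02-03 04:05:00 GMT.

```cql
-- Hypothetical table and column names, for illustration only.

-- String literal with an explicit time zone (the recommended form).
INSERT INTO user_events (user_id, event_time, event_name)
VALUES (550e8400-e29b-41d4-a716-446655440000, '2011-02-03 04:05:00+0000', 'login');

-- The same instant given as a long integer: milliseconds since the epoch.
INSERT INTO user_events (user_id, event_time, event_name)
VALUES (550e8400-e29b-41d4-a716-446655440000, 1296705900000, 'login');

-- Date only: the time of day defaults to 00:00:00 in the given time zone.
INSERT INTO user_events (user_id, event_time, event_name)
VALUES (550e8400-e29b-41d4-a716-446655440000, '2011-02-03+0000', 'signup');
```

Because the first two statements name the same instant, they address the same row if user_id and event_time together form the primary key.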
Comments can be used to document CQL statements in your application code. Single line comments can begin with a double dash (--) or a double slash (//) and extend to the end of the line. Multi-line comments can be enclosed in /* and */ characters.
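The three comment styles might look like this in practice. The query reuses the users table and state column that appear elsewhere on this page; the comment text itself is illustrative.

```cql
-- A single-line comment introduced by a double dash.
// A single-line comment introduced by a double slash.
/* A multi-line comment,
   which continues until the closing marker. */
SELECT * FROM users WHERE state='TX';  -- a comment may also follow a statement
```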
In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replica nodes. For any given read or write operation, the client request specifies a consistency level, which determines how many replica nodes must successfully respond to the request.
In CQL, the default consistency level is ONE. You can set the consistency level for any read (SELECT) or write (INSERT, UPDATE, DELETE, BATCH) operation. For example:
SELECT * FROM users USING CONSISTENCY QUORUM WHERE state='TX';
Consistency level specifications are made up of the keywords USING CONSISTENCY, followed by a consistency level identifier. Valid consistency level identifiers are:
See tunable consistency for more information about the different consistency levels.
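For write operations, the clause might be applied as in the sketch below. It assumes the version of CQL described in this section and a users table whose key column is named user_name; that column name is hypothetical. The exact position of the USING clause differs between statement types, so check the reference for each statement in your version before relying on this form.

```cql
-- Hypothetical key column user_name; the state column follows the SELECT example.

-- INSERT: the USING clause follows the VALUES list.
INSERT INTO users (user_name, state) VALUES ('jsmith', 'TX') USING CONSISTENCY QUORUM;

-- UPDATE: the USING clause follows the table name and precedes SET.
UPDATE users USING CONSISTENCY ONE SET state = 'CA' WHERE user_name = 'jsmith';
```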