Cassandra Query Language (CQL) is based on SQL (Structured Query Language), the standard for relational database manipulation. Although CQL has many similarities to SQL, there are some fundamental differences. For example, because CQL is adapted to the Cassandra data model and architecture, it does not allow SQL-like operations such as JOINs, or range queries over rows on clusters that use the random partitioner. This reference describes CQL 2.0.0.
CQL input consists of statements. Like SQL, statements change data, look up data, store data, or change the way data is stored. All statements end in a semicolon (;).
For example, the following is valid CQL syntax:
SELECT * FROM MyColumnFamily;
UPDATE MyColumnFamily SET 'SomeColumn' = 'SomeValue' WHERE KEY = B70DE1D0-9908-4AE3-BE34-5573E5B09F14;
This is a sequence of two CQL statements. This example shows one statement per line, although a statement can usefully be split across lines as well.
String literals and identifiers, such as keyspace and column family names, are case-sensitive. For example, the identifiers MyColumnFamily and mycolumnfamily are not equivalent. CQL keywords are case-insensitive. For example, the keywords SELECT and select are equivalent, although this document shows keywords in uppercase.
Valid expressions consist of these kinds of values:
Cassandra has a schema-optional data model. You can define data types when you create your column family schemas. Creating the schema is recommended, but not required. Column names, column values, and row key values can be typed in Cassandra.
CQL comes with the following built-in data types, which can be used for column names and column/row key values. One exception is counter, which is allowed only as a column value (not allowed for row key values or column names).
|CQL Type||Description|
|ascii||US-ASCII character string|
|bigint||64-bit signed long|
|blob||Arbitrary bytes (no validation), expressed as hexadecimal|
|boolean||true or false|
|counter||Distributed counter value (64-bit long)|
|double||64-bit IEEE-754 floating point|
|float||32-bit IEEE-754 floating point|
|int||32-bit signed integer|
|text||UTF-8 encoded string|
|timestamp||Date plus time, encoded as an 8-byte integer (milliseconds since the epoch)|
|uuid||Type 1 or type 4 UUID|
|varchar||UTF-8 encoded string|
In addition to the CQL types listed in the previous table, you can use a string containing the name of a class (a sub-class of AbstractType loadable by Cassandra) as a CQL type. The class name should either be fully qualified or relative to the org.apache.cassandra.db.marshal package.
Values serialized with the timestamp type are encoded as 64-bit signed integers representing a number of milliseconds since the standard base time known as the epoch: January 1 1970 at 00:00:00 GMT.
Timestamp types can be input in CQL as simple long integers, giving the number of milliseconds since the epoch.
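The encoding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Cassandra's own code; the function names `to_epoch_millis` and `from_epoch_millis` are hypothetical helpers introduced here.

```python
from datetime import datetime, timedelta, timezone

# The epoch used by the timestamp type: January 1 1970 at 00:00:00 GMT.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Return the number of milliseconds between the epoch and `moment`.

    `moment` must be timezone-aware so that the result is unambiguous.
    """
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = moment - EPOCH
    # Integer arithmetic avoids the rounding error of float seconds.
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


def from_epoch_millis(millis: int) -> datetime:
    """Return the GMT datetime that a long integer timestamp represents."""
    return EPOCH + timedelta(milliseconds=millis)
```

For instance, `to_epoch_millis(datetime(2011, 2, 3, 4, 5, tzinfo=timezone.utc))` gives the long integer that could be supplied in place of a date string for that moment.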
Timestamp types can also be input as string literals in any of the following ISO 8601 formats:
yyyy-MM-DD HH:mm
yyyy-MM-DD HH:mm:ss
yyyy-MM-DD HH:mmZ
yyyy-MM-DD HH:mm:ssZ
yyyy-MM-DD'T'HH:mm
yyyy-MM-DD'T'HH:mmZ
yyyy-MM-DD'T'HH:mm:ss
yyyy-MM-DD'T'HH:mm:ssZ
yyyy-MM-DD
yyyy-MM-DDZ
For example, for the date and time of Feb 3, 2011, at 04:05:00 AM, GMT:
2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000
The +0000 is the RFC 822 4-digit time zone specification for GMT. US Pacific Standard Time is -0800. The time zone may be omitted. For example:
2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03T04:05
2011-02-03T04:05:00
If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.
If you only want to capture date values, the time of day can also be omitted. For example:
2011-02-03
2011-02-03+0000
In this case, the time of day defaults to 00:00:00 in the specified or default time zone.
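The parsing rules above (optional seconds, optional time zone, optional time of day) can be illustrated with a small Python sketch. It is not Cassandra's parser: the function name `parse_timestamp`, the `default_tz` parameter (standing in for the coordinator node's time zone), and the mapping of the listed formats onto `strptime` directives are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# strptime equivalents of the string formats listed above (assumed mapping).
# %z accepts a 4-digit offset such as +0000 or -0800.
_FORMATS = (
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M%z",
    "%Y-%m-%d %H:%M:%S%z",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%d",
    "%Y-%m-%d%z",
)


def parse_timestamp(text: str, default_tz: timezone = timezone.utc) -> datetime:
    """Parse a timestamp string, applying `default_tz` when no zone is given."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not match; try the next one
        if parsed.tzinfo is None:
            # No zone in the input: fall back to the default time zone,
            # as the coordinator node would with its own configured zone.
            parsed = parsed.replace(tzinfo=default_tz)
        return parsed  # a date-only input keeps strptime's 00:00:00 default
    raise ValueError(f"unrecognized timestamp format: {text!r}")
```

With `default_tz=timezone(timedelta(hours=-8))`, the input `2011-02-03 04:05` resolves to a different instant than `2011-02-03 04:05+0000`, which is why specifying the time zone explicitly is the safer choice.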
Comments can be used to document CQL statements in your application code. Single line comments can begin with a double dash (--) or a double slash (//) and extend to the end of the line. Multi-line comments can be enclosed in /* and */ characters.
In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replica nodes. For any given read or write operation, the client request specifies a consistency level, which determines how many replica nodes must successfully respond to the request.
In CQL, the default consistency level is ONE. You can set the consistency level for any read (SELECT) or write (INSERT, UPDATE, DELETE, BATCH) operation. For example:
SELECT * FROM users WHERE state='TX' USING CONSISTENCY QUORUM;
Consistency level specifications are made up of the keywords USING CONSISTENCY, followed by a consistency level identifier. Valid consistency level identifiers are:
See tunable consistency for more information about the different consistency levels.
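To make "how many replica nodes must successfully respond" concrete, the following Python sketch computes that number for a few levels. It is a simplified illustration under stated assumptions: `required_replicas` and its `replication_factor` parameter are hypothetical names, ALL is a Cassandra level not shown in the example above, and QUORUM uses the usual majority rule (replication factor divided by two, rounded down, plus one).

```python
def required_replicas(level: str, replication_factor: int) -> int:
    """Return how many replicas must respond for the given consistency level.

    Only ONE, QUORUM, and ALL are covered in this sketch.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    name = level.upper()  # accept any casing as a convenience in this sketch
    if name == "ONE":
        return 1
    if name == "QUORUM":
        # A majority of the replicas: floor(RF / 2) + 1.
        return replication_factor // 2 + 1
    if name == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level!r}")
```

With a replication factor of 3, a QUORUM request needs 2 replicas to respond, so it can tolerate one unavailable replica, whereas ALL tolerates none.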
Certain CQL commands allow a WITH clause for setting certain properties on a keyspace or column family. CQL does not currently offer support for defining all of the possible properties, just a subset.
CQL supports setting the following keyspace properties.
CQL supports setting the following column family properties, which in a few cases have slightly different names than their corresponding column family attributes.
|CQL Parameter Name||Default Value|
|comment||''(an empty string)|
compaction_strategy_class in CQL corresponds to the compaction_strategy attribute. default_validation in CQL corresponds to the default_validation_class attribute.
The CQL language consists of the following commands:
Using the CQL client, cqlsh, you can query the Cassandra database from the command line. All of the commands included in CQL are available on the cqlsh command line, plus the following commands: