The legacy Cassandra CLI client utility can be used to do limited Thrift data definition (DDL) and data manipulation (DML) within a Cassandra cluster. CQL 3 is the recommended API for Cassandra, but you can also access CQL 3 tables using the CLI. The CLI utility is located in /usr/bin/cassandra-cli in packaged installations or <install_location>/bin/cassandra-cli in binary installations.
To start the CLI and connect to a particular Cassandra instance, launch the script together with -host and -port options. Cassandra connects to the cluster named in the cassandra.yaml file. Test Cluster is the default cluster name. For example, if you have a single-node cluster on localhost:
$ cassandra-cli -host localhost -port 9160
Or to connect to a node in a multi-node cluster, give the IP address of the node:
$ cassandra-cli -host 110.123.4.5 -port 9160
To see help on the various commands available:
[default@unknown] help;
For detailed help on a specific command, use help <command>;. For example:
[default@unknown] help SET;
A command must be terminated by a semicolon (;). Pressing the return key without a semicolon at the end of the line echoes an ellipsis (...), which indicates that the CLI expects more input.
You can use the Cassandra CLI commands described in this section to create a keyspace. This example creates a keyspace called demo, with a replication factor of 1 and using the SimpleStrategy replica placement strategy.
Note the single quotes around the string value of placement_strategy:
[default@unknown] CREATE KEYSPACE demo
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = {replication_factor:1};
You can verify the creation of a keyspace with the SHOW KEYSPACES command. The new keyspace is listed along with the system keyspace and any other existing keyspaces.
In Cassandra 1.2 and later, you can use the CLI GET command to query tables created with or without the COMPACT STORAGE directive in CQL 3. The CLI SET command can also be used with CQL 3 tables. For more examples of querying CQL 3 tables, see Querying a legacy table.
In a relational database, you must specify a data type for each column when you define a table. The data type constrains the values that can be inserted into that column. For example, if you have a column defined as an integer datatype, you would not be allowed to insert character data into that column. Column names in a relational database are typically fixed labels (strings) that are assigned when you define the table schema.
In Cassandra CLI and Thrift, the data type for a column (or row key) value is called a validator. The data type for a column name is called a comparator. Cassandra validates the data type of row keys. Columns are sorted, and stored in sorted order on disk, so you have to specify a comparator for column names. You can define the validator and comparator when you create your table schema (which is recommended), but Cassandra does not require it. Internally, Cassandra stores column names and values as hex byte arrays (BytesType). This is the default client encoding used if data types are not defined in the table schema (or if not specified by the client request).
Cassandra comes with the following built-in data types, which can be used as either validators (row key and column value data types) or comparators (column name data types). One exception is CounterColumnType, which is only allowed as a column value (not allowed for row keys or column names).
| Internal Type | CQL Name | Description |
|---|---|---|
| BytesType | blob | Arbitrary hexadecimal bytes (no validation) |
| AsciiType | ascii | US-ASCII character string |
| UTF8Type | text, varchar | UTF-8 encoded string |
| IntegerType | varint | Arbitrary-precision integer |
| Int32Type | int | 4-byte integer |
| InetAddressType | inet | IP address string in xxx.xxx.xxx.xxx form |
| LongType | bigint | 8-byte long |
| UUIDType | uuid | Type 1 or type 4 UUID |
| TimeUUIDType | timeuuid | Type 1 UUID only (CQL3) |
| DateType | timestamp | Date plus time, encoded as 8 bytes since epoch |
| BooleanType | boolean | true or false |
| FloatType | float | 4-byte floating point |
| DoubleType | double | 8-byte floating point |
| DecimalType | decimal | Variable-precision decimal |
| CounterColumnType | counter | Distributed counter value (8-byte long) |
Using the CLI you can define a default row key validator for a table using the key_validation_class property. Using CQL, you use built-in key validators to validate row key values. For static tables, define each column and its associated type when you define the table using the column_metadata property.
Key and column validators may be added or changed in a table definition at any time. If you specify an invalid validator on your table, client requests that rely on that metadata may behave unexpectedly, and data inserts or updates that do not conform to the specified validator are rejected.
You cannot know the column names of dynamic tables ahead of time, so specify a default_validation_class instead of defining the per-column data types.
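The interplay between per-column validators and a default validator can be illustrated with a small Python sketch. This is a conceptual model only, not Cassandra's implementation: the function names (`validate_long`, `validate_utf8`, `check_insert`) and the `nickname` column are hypothetical, and the lookup table stands in loosely for column_metadata with a default_validation_class fallback.

```python
# Conceptual model of validators; all names here are hypothetical.

def validate_long(value):
    """Accept values that parse as a signed 8-byte integer (LongType-like)."""
    number = int(value)  # raises ValueError for non-numeric strings
    if not -2**63 <= number < 2**63:
        raise ValueError(f"{value!r} does not fit in 8 bytes")
    return number

def validate_utf8(value):
    """Accept any Python str (UTF8Type-like)."""
    if not isinstance(value, str):
        raise ValueError(f"{value!r} is not a string")
    return value

# Stand-in for column_metadata: a validator per known column name.
COLUMN_VALIDATORS = {"full_name": validate_utf8, "birth_year": validate_long}

# Stand-in for default_validation_class: used for any other column.
DEFAULT_VALIDATOR = validate_utf8

def check_insert(column_name, value):
    """Return True if the value conforms to the column's validator."""
    validator = COLUMN_VALIDATORS.get(column_name, DEFAULT_VALIDATOR)
    try:
        validator(value)
        return True
    except ValueError:
        return False

print(check_insert("birth_year", "1975"))      # conforms to the long validator
print(check_insert("birth_year", "nineteen"))  # rejected: not an integer
print(check_insert("nickname", "bob"))         # falls back to the default
```

In this model, a column with no explicit entry is checked against the default validator, which mirrors why a dynamic table only needs a default_validation_class.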
Within a row, columns are always stored in sorted order by their column name. The comparator specifies the data type for the column name, as well as the sort order in which columns are stored within a row. Unlike validators, the comparator may not be changed after the table is defined, so this is an important consideration when defining a table in Cassandra.
Typically, column names in static tables are strings, and the sort order of columns is not important in that case. For dynamic tables, however, sort order is important. For example, in a table that stores time series data (the column names are timestamps), having the data in sorted order is required for slicing result sets out of a row of columns.
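To see why the choice of comparator matters, the following Python sketch sorts the same column names under a text ordering and a numeric ordering, then slices a sorted row of timestamps. It is an illustration under simplifying assumptions, not Cassandra code: `sort_as_text`, `sort_as_long`, and `slice_row` are hypothetical helpers, and plain integers stand in for timestamp column names.

```python
import bisect

def sort_as_text(names):
    """Order column names the way a string comparator would (lexically)."""
    return sorted(names)

def sort_as_long(names):
    """Order column names the way a numeric comparator would."""
    return sorted(names, key=int)

def slice_row(sorted_names, start, finish):
    """Return the contiguous run of column names in [start, finish].

    This only works because the names are already stored in sorted order.
    """
    low = bisect.bisect_left(sorted_names, start)
    high = bisect.bisect_right(sorted_names, finish)
    return sorted_names[low:high]

names = ["9", "10", "100"]
print(sort_as_text(names))  # lexical order
print(sort_as_long(names))  # numeric order

timestamps = [100, 200, 300, 400]
print(slice_row(timestamps, 150, 300))
```

The two orderings disagree for the same names, which suggests why a comparator that does not match the data would make range slices return unexpected columns.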
First, connect to the keyspace where you want to define the table with the USE command.
[default@unknown] USE demo;
In this example, we create a users table in the demo keyspace. This table defines a few columns: full_name, email, state, gender, and birth_year. This is considered a static table because the column names are specified and most rows are expected to have more-or-less the same columns.
Notice the settings of comparator, key_validation_class and validation_class. These values set the default encoding used for column names, row key values and column values. In the case of column names, the comparator also determines the sort order. To create a table using the CLI, you use the column family keyword.
[default@unknown] USE demo;
[default@demo] CREATE COLUMN FAMILY users
WITH comparator = UTF8Type
AND key_validation_class=UTF8Type
AND column_metadata = [
{column_name: full_name, validation_class: UTF8Type}
{column_name: email, validation_class: UTF8Type}
{column_name: state, validation_class: UTF8Type}
{column_name: gender, validation_class: UTF8Type}
{column_name: birth_year, validation_class: LongType}
];
Next, create a dynamic table called blog_entry. Notice that here we do not specify column definitions as the column names are expected to be supplied later by the client application.
[default@demo] CREATE COLUMN FAMILY blog_entry
WITH comparator = TimeUUIDType
AND key_validation_class=UTF8Type
AND default_validation_class = UTF8Type;
A counter table contains counter columns. A counter column is a specific kind of column whose user-visible value is a 64-bit signed integer that can be incremented (or decremented) by a client application. The counter column tracks the most recent value (or count) of all updates made to it. A counter column cannot be mixed in with regular columns of a table; you must create a table specifically to hold counters.
To create a table that holds counter columns, set the default_validation_class of the table to CounterColumnType. For example:
[default@demo] CREATE COLUMN FAMILY page_view_counts
WITH default_validation_class=CounterColumnType
AND key_validation_class=UTF8Type AND comparator=UTF8Type;
To insert a row and counter column into the table (with the initial counter value set to 0):
[default@demo] INCR page_view_counts['www.datastax.com'][home] BY 0;
To increment the counter:
[default@demo] INCR page_view_counts['www.datastax.com'][home] BY 1;
The following examples illustrate using the SET command to insert columns for a particular row key into the users table. In this example, the row key is bobbyjo and we are setting each of the columns for this user. Notice that you can only set one column at a time in a SET command.
[default@demo] SET users['bobbyjo']['full_name']='Robert Jones';
[default@demo] SET users['bobbyjo']['email']='bobjones@gmail.com';
[default@demo] SET users['bobbyjo']['state']='TX';
[default@demo] SET users['bobbyjo']['gender']='M';
[default@demo] SET users['bobbyjo']['birth_year']='1975';
In this example, the row key is yomama and we are just setting some of the columns for this user.
[default@demo] SET users['yomama']['full_name']='Cathy Smith';
[default@demo] SET users['yomama']['state']='CA';
[default@demo] SET users['yomama']['gender']='F';
[default@demo] SET users['yomama']['birth_year']='1969';
In this example, we are creating an entry in the blog_entry table for row key yomama:
[default@demo] SET blog_entry['yomama'][timeuuid()] = 'I love my new shoes!';
Note: The Cassandra CLI sets the consistency level for the client. The level defaults to ONE for all write and read operations. For more information, see About data consistency.
Use the GET command within Cassandra CLI to retrieve a particular row from a table. Use the LIST command to return a batch of rows and their associated columns (default limit of rows returned is 100).
For example, to return the first 100 rows (and all associated columns) from the users table:
[default@demo] LIST users;
Cassandra stores all data internally as hex byte arrays by default. If you do not specify a default row key validation class, column comparator and column validation class when you define the table, Cassandra CLI will expect input data for row keys, column names, and column values to be in hex format (and data will be returned in hex format).
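As a rough illustration of what "hex format" means here, the Python sketch below converts a readable string to the hexadecimal form of its UTF-8 bytes and back. The helper names `to_hex` and `from_hex` are hypothetical and are not part of the CLI; the sketch only shows the encoding, not how the CLI itself performs it.

```python
def to_hex(text):
    """Encode a string as the hex representation of its UTF-8 bytes."""
    return text.encode("utf-8").hex()

def from_hex(hex_string):
    """Decode a hex string back into readable UTF-8 text."""
    return bytes.fromhex(hex_string).decode("utf-8")

row_key_hex = to_hex("bobbyjo")
print(row_key_hex)            # 626f6262796a6f
print(from_hex(row_key_hex))  # bobbyjo
```

Without declared types, a row key such as bobbyjo would have to be supplied and read back in the first form rather than the second.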
To pass and return data in human-readable format, you can pass a value through an encoding function such as utf8(). For example, to return a particular row key and column in UTF8 format:
[default@demo] GET users[utf8('bobbyjo')][utf8('full_name')];
You can also use the ASSUME command to specify the encoding in which table data should be returned for the entire client session. For example, to return row keys, column names, and column values in ASCII-encoded format:
[default@demo] ASSUME users KEYS AS ascii;
[default@demo] ASSUME users COMPARATOR AS ascii;
[default@demo] ASSUME users VALIDATOR AS ascii;
When you set a column in Cassandra, you can optionally set an expiration time, or time-to-live (TTL) attribute for it.
For example, suppose we are tracking coupon codes for our users that expire after 10 days. We can define a coupon_code column and set an expiration time on that column:
[default@demo] SET users['bobbyjo']
[utf8('coupon_code')] = utf8('SAVE20') WITH ttl=864000;
After ten days (864,000 seconds) have elapsed since the column was set, its value is marked as deleted and is no longer returned by read operations. Note, however, that the value is not actually deleted from disk until normal Cassandra compaction processes are completed.
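The TTL value is expressed in seconds, so the figure 864000 comes from a simple conversion. The Python sketch below shows that arithmetic and a simplified expiry check; `ttl_seconds` and `is_expired` are hypothetical helpers, and the check models only read visibility, not compaction or on-disk removal.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def ttl_seconds(days):
    """Convert a lifetime in days to a TTL value in seconds."""
    return days * SECONDS_PER_DAY

def is_expired(written_at, ttl, now):
    """Return True once the TTL has fully elapsed since the write time.

    All arguments are in seconds; this models read visibility only.
    """
    return now >= written_at + ttl

ttl = ttl_seconds(10)
print(ttl)  # 864000

# A column written at time 0 is still readable one second before expiry.
print(is_expired(written_at=0, ttl=ttl, now=863_999))  # False
print(is_expired(written_at=0, ttl=ttl, now=864_000))  # True
```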
The CLI can be used to create secondary indexes (indexes on column values). You can add a secondary index when you create a table or add it later using the UPDATE COLUMN FAMILY command.
For example, to add a secondary index to the birth_year column of the users column family:
[default@demo] UPDATE COLUMN FAMILY users WITH comparator = UTF8Type
AND column_metadata =
[{column_name: birth_year,
validation_class: LongType,
index_type: KEYS
}
];
Because of the secondary index created for the column birth_year, its values can be queried directly for users born in a given year as follows:
[default@demo] GET users WHERE birth_year = 1969;
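Conceptually, a secondary index maps a column value back to the row keys that contain it, so a query by value does not need to scan every row. The Python sketch below models that idea with an in-memory dictionary. It is an assumption-laden illustration rather than a description of how Cassandra stores its indexes, and `build_index` and `get_where` are hypothetical names.

```python
from collections import defaultdict

# Sample rows mirroring the users examples: row key -> {column: value}.
rows = {
    "bobbyjo": {"full_name": "Robert Jones", "state": "TX", "birth_year": 1975},
    "yomama": {"full_name": "Cathy Smith", "state": "CA", "birth_year": 1969},
}

def build_index(rows, column):
    """Map each value of the indexed column to the row keys that hold it."""
    index = defaultdict(list)
    for row_key, columns in rows.items():
        if column in columns:  # rows lacking the column are not indexed
            index[columns[column]].append(row_key)
    return index

def get_where(index, value):
    """Look up the row keys whose indexed column equals the given value."""
    return index.get(value, [])

birth_year_index = build_index(rows, "birth_year")
print(get_where(birth_year_index, 1969))  # ['yomama']
print(get_where(birth_year_index, 2000))  # []
```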
The Cassandra CLI provides the DEL command to delete a row or column (or subcolumn).
For example, to delete the coupon_code column for the yomama row key in the users table and then check the row with GET:
[default@demo] DEL users ['yomama']['coupon_code'];
[default@demo] GET users ['yomama'];
Or to delete an entire row:
[default@demo] DEL users ['yomama'];
With Cassandra CLI commands you can drop tables and keyspaces in much the same way that tables and databases are dropped in a relational database. This example shows the commands to drop our example users table and then drop the demo keyspace altogether:
[default@demo] DROP COLUMN FAMILY users;
[default@demo] DROP KEYSPACE demo;