Apache Cassandra 1.0 Documentation

SELECT

This document corresponds to an earlier product version. Make sure you are using the version that corresponds to your version.

Latest Cassandra documentation | Earlier Cassandra documentation

Retrieves data from a Cassandra column family.

Synopsis

SELECT [FIRST <n>] [REVERSED] <select expression>
FROM <column family>
[USING <consistency>]
[WHERE (<clause>)] [LIMIT <n>]

<select expression> syntax is:

{ <start_of_range> .. <end_of_range> | * }
| COUNT(* | 1)

<clause> syntax is:

KEY | <key_alias> { = | < | > | <= | >= } <key_value>
KEY | <key_alias> IN (<key_value> [,...])

Description

A SELECT expression reads one or more records from a Cassandra column family and returns a result-set of rows. Each row consists of a row key and a collection of columns corresponding to the query.

Unlike the projection in a SQL SELECT, there is no guarantee that the results will contain all of the columns specified because Cassandra is schema-optional. An error does not occur if you request non-existent columns.

Examples

Specifying Columns

The SELECT expression determines which columns, if any, appear in the result:

SELECT * from People;

Select two columns, Name and Occupation, from three rows having keys 199, 200, or 207:

SELECT Name, Occupation FROM People WHERE key IN (199, 200, 207);

A simple form is a comma-separated list of column names. The list can consist of a range of column names.

Specifying a Range of Columns

To specify a range of columns, specify the start and end column names separated by two periods (..). The set of columns returned for a range is start and end inclusive. Use the asterisk (*) to specify a range representing all columns.

Select a range of columns from all rows, but limit the number of columns to 3 per row starting with the end of the range:

SELECT FIRST 3 REVERSED 'time199'..'time100' FROM Events;

When requesting a range of columns, it may be useful to limit the number of columns that can be returned from each row using the FIRST clause. This sets an upper limit on the number of columns returned per row. The default is 10,000.

The REVERSED keyword sorts columns in reverse order. Using a FIRST clause, the query returns the columns at the end instead of at the beginning of the range.

Counting Returned Rows

A SELECT expression using COUNT(*) returns the number of rows that matched the query. Alternatively, you can use COUNT(1) to get the same result.

Count the number of rows in the users column family:

SELECT COUNT(*) FROM users;

If you do not specify a limit, a maximum of 10,000 rows are returned by default. Using the LIMIT option, you can specify that the query return a greater or fewer number of rows.

SELECT COUNT(*) FROM big_columnfamily;

 count
-------
 10000

SELECT COUNT(*) FROM big_columnfamily LIMIT 50000;

 count
-------
 50000

SELECT COUNT(*) FROM big_columnfamily LIMIT 200000;

 count
--------
 105291

Specifying the Column Family, FROM, Clause

The FROM clause specifies the column family to query. Optionally, specify a keyspace for the column family followed by a period, (.), then the column family name. If a keyspace is not specified, the current keyspace will be used.

Count the number of rows in the Migrations column family in the system keyspace:

SELECT COUNT(*) FROM system.Migrations;

Specifying a Consistency Level

You can optionally specify a consistency level, such as QUORUM:

SELECT * from People USING CONSISTENCY QUORUM;

See tunable consistency for more information about the consistency levels.

Filtering Data Using the WHERE Clause

The WHERE clause filters the rows that appear in the results. You can filter on a key name, a range of keys, or on column values if columns have a secondary index. Row keys are specified using the KEY keyword or key alias defined on the column family, followed by a relational operator, and then a value.

Relational operators are: =, >, >=, <, or <=.

To filter a indexed column, the term on the left of the operator must be the name of the column, and the term on the right must be the value to filter on.

Note: The greater-than and less-than operators (> and <) result in key ranges that are inclusive of the terms. There is no supported notion of strictly greater-than or less-than; these operators are merely supported as aliases to >= and <=.

Specifying Rows Returned Using LIMIT

By default, a query returns 10,000 rows maximum. Using the LIMIT clause, you can change the default limit of 10,000 rows to a lesser or greater number of rows. The default is 10,000 rows.

SELECT * from Artists WHERE favoriteArtist = 'Adele' LIMIT 90000;