CQL for Cassandra 2.x

Creating and updating a keyspace

Creating a keyspace is the CQL counterpart to creating an SQL database, with one key difference: a Cassandra keyspace is a namespace that defines how data is replicated on nodes. Typically, a cluster has one keyspace per application. Replication is controlled on a per-keyspace basis, so data that has different replication requirements typically resides in different keyspaces. Keyspaces are not designed to be used as a significant map layer within the data model; they are designed to control data replication for a set of tables.

When you create a keyspace, you specify a replication strategy class for it. The SimpleStrategy class is fine for evaluating Cassandra. For production use or for mixed workloads, use the NetworkTopologyStrategy class.

To use NetworkTopologyStrategy for evaluation purposes, for example on a single-node cluster, specify the default data center name. To determine the default data center name, use the nodetool status command. On Linux, for example, run the following from the installation directory:

$ bin/nodetool status

The output is:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID      Rack
UN  127.0.0.1  41.62 KB   256     100.0%            75dcca8f...  rack1

To use NetworkTopologyStrategy in production, change the default snitch, SimpleSnitch, to a network-aware snitch, define one or more data center names in the snitch properties file, and use those data center names to define the keyspace. Otherwise, Cassandra fails to complete any write request, such as inserting data into a table, and logs this error message:

Unable to complete request: one or more nodes were unavailable.

You cannot insert data into a table in a keyspace that uses NetworkTopologyStrategy unless you define the data center names in the snitch properties file or you use a single data center named datacenter1.
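Because the keyspace definition must use data center names that the cluster actually reports, it can help to read them from the nodetool status output rather than typing them by hand. The following is a minimal sketch, assuming the output has the "Datacenter: <name>" header lines shown above; the function name data_center_names and the SAMPLE constant are hypothetical, introduced only for illustration.

```python
import re


def data_center_names(status_output):
    """Return the data center names found in `nodetool status` text, in order of appearance."""
    names = []
    for line in status_output.splitlines():
        # Each data center section starts with a line such as "Datacenter: datacenter1".
        match = re.match(r"\s*Datacenter:\s*(\S+)", line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


# Sample text copied from the nodetool status output shown earlier.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID      Rack
UN  127.0.0.1  41.62 KB   256     100.0%            75dcca8f...  rack1
"""

print(data_center_names(SAMPLE))  # prints ['datacenter1']
```

In practice the text would come from running bin/nodetool status; the sketch parses a captured string so that it does not depend on a running node.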

Example of creating a keyspace

To query Cassandra, you first create and use a keyspace. You can choose an arbitrary data center name and register the name in the properties file of the snitch. Alternatively, if you use a cluster in a single data center, simply use the default data center name in open source Cassandra, for example datacenter1, and skip registering the name in the properties file.

Procedure

  1. Create a keyspace.
    cqlsh> CREATE KEYSPACE demodb WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 };
  2. Use the keyspace.
    USE demodb;
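When keyspaces are created from a script, the replication map is a common place for typos. The following is a rough sketch of a helper that assembles the CREATE KEYSPACE statement from the procedure above for either strategy class. The function name create_keyspace_cql and its parameters are hypothetical; it only builds a string and does not talk to Cassandra.

```python
def create_keyspace_cql(name, strategy, options):
    """Build a CREATE KEYSPACE statement.

    options is {'replication_factor': n} for SimpleStrategy, or
    {data_center_name: n, ...} for NetworkTopologyStrategy.
    """
    if strategy not in ("SimpleStrategy", "NetworkTopologyStrategy"):
        raise ValueError("unsupported strategy class: %r" % strategy)
    if strategy == "SimpleStrategy" and set(options) != {"replication_factor"}:
        raise ValueError("SimpleStrategy takes exactly one option, replication_factor")
    if strategy == "NetworkTopologyStrategy" and not options:
        raise ValueError("NetworkTopologyStrategy needs at least one data center")

    pairs = ["'class' : '%s'" % strategy]
    for key, count in options.items():
        if count < 1:
            raise ValueError("replica count for %r must be at least 1" % key)
        pairs.append("'%s' : %d" % (key, count))
    return "CREATE KEYSPACE %s WITH REPLICATION = { %s };" % (name, ", ".join(pairs))


# Reproduces the statement used in step 1 of the procedure.
print(create_keyspace_cql("demodb", "NetworkTopologyStrategy", {"datacenter1": 3}))
```

The generated text can be pasted into cqlsh or passed to whichever driver you use; the validation here covers only the cases discussed in this section.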

Updating the replication factor

Increasing the replication factor increases the total number of copies of keyspace data stored in a Cassandra cluster. If you are using security features, it is particularly important to increase the replication factor of the system_auth keyspace from the default (1), because you cannot log into the cluster if the node holding the lone replica goes down. The recommended setting is a replication factor for the system_auth keyspace equal to the number of nodes in each data center.
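The system_auth recommendation can be expressed as a small calculation: one replica per node in every data center. The following is a sketch under that assumption. The function name system_auth_alter_cql and the node counts passed to it are hypothetical, and the data center names dc1 and dc2 are placeholders that must match the names your snitch reports.

```python
def system_auth_alter_cql(nodes_per_dc):
    """Build an ALTER KEYSPACE statement that gives system_auth one replica per node.

    nodes_per_dc maps each data center name to its node count.
    """
    if not nodes_per_dc:
        raise ValueError("at least one data center is required")

    pairs = ["'class' : 'NetworkTopologyStrategy'"]
    # Sort by name so the generated statement is stable across runs.
    for dc, node_count in sorted(nodes_per_dc.items()):
        if node_count < 1:
            raise ValueError("data center %r has no nodes" % dc)
        pairs.append("'%s' : %d" % (dc, node_count))
    return "ALTER KEYSPACE system_auth WITH REPLICATION = {%s};" % ", ".join(pairs)


# Hypothetical cluster: three nodes in dc1 and two nodes in dc2.
print(system_auth_alter_cql({"dc1": 3, "dc2": 2}))
```

After running the generated statement, the new replicas still have to be populated by repairing each affected node.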

Procedure

  1. Update a keyspace in the cluster and change its replication strategy options.
    ALTER KEYSPACE system_auth WITH REPLICATION =
      {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};

    Or if using SimpleStrategy:

    ALTER KEYSPACE "Excalibur" WITH REPLICATION =
      { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
  2. On each affected node, run the nodetool repair command.
  3. Wait until repair completes on a node, then move to the next node.
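Steps 2 and 3 describe a sequential loop: repair one node, wait for it to finish, then move on. The following is a minimal sketch of that loop, assuming nodetool is on the PATH and that each node accepts remote nodetool connections through the -h option. The host addresses and the names repair_commands and rolling_repair are hypothetical.

```python
import subprocess


def repair_commands(hosts, keyspace):
    """Return one `nodetool repair` command (as an argument list) per host."""
    return [["nodetool", "-h", host, "repair", keyspace] for host in hosts]


def rolling_repair(hosts, keyspace, run=subprocess.check_call):
    """Repair the keyspace on one node at a time.

    The default runner, subprocess.check_call, blocks until the command exits
    and raises CalledProcessError on a non-zero exit status, so the loop does
    not start the next node until the current repair has finished.
    """
    for command in repair_commands(hosts, keyspace):
        run(command)


# Print the commands without executing them; the addresses are placeholders.
for command in repair_commands(["10.0.0.1", "10.0.0.2"], "system_auth"):
    print(" ".join(command))
```

Calling rolling_repair with the same arguments would actually run the repairs. The run parameter exists so the sequence can be checked with a stand-in function before it is pointed at a real cluster.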