Jonathan Ellis

Cassandra 1.2 exposes almost everything that each server knows about the cluster in tables in the&nbsp;<tt>system</tt>&nbsp;keyspace. We started this process with the introduction of CQL3 in Cassandra 1.1, but introducing the&nbsp;<a href="https://www.datastax.com/dev/blog/binary-protocol">native protocol</a>&nbsp;motivated us to finish it, so native protocol drivers can introspect everything they need without falling back to the old Thrift calls.

Here's what the system keyspace contains.

<h3>Schema</h3>


<pre>
<code>CREATE TABLE schema_keyspaces (
 keyspace_name text PRIMARY KEY,
 durable_writes boolean,
 strategy_class text,
 strategy_options text
);

CREATE TABLE schema_columnfamilies (
 keyspace_name text,
 columnfamily_name text,
 bloom_filter_fp_chance double,
 caching text,
 column_aliases text,
 comment text,
 compaction_strategy_class text,
 compaction_strategy_options text,
 comparator text,
 compression_parameters text,
 default_read_consistency text,
 default_validator text,
 default_write_consistency text,
 gc_grace_seconds int,
 id int,
 key_alias text,
 key_aliases text,
 key_validator text,
 local_read_repair_chance double,
 max_compaction_threshold int,
 min_compaction_threshold int,
 read_repair_chance double,
 replicate_on_write boolean,
 subcomparator text,
 type text,
 value_alias text,
 PRIMARY KEY (keyspace_name, columnfamily_name)
);

CREATE TABLE schema_columns (
 keyspace_name text,
 columnfamily_name text,
 column_name text,
 component_index int,
 index_name text,
 index_options text,
 index_type text,
 validator text,
 PRIMARY KEY (keyspace_name, columnfamily_name, column_name)
);
</code></pre>


This all corresponds directly to what you specify in CREATE TABLE, so it is mostly straightforward. A few columns that might bear additional explanation:

<ul>
	<li><tt>durable_writes</tt>: allows disabling the commitlog for tables in this keyspace. Generally not recommended, but occasionally useful for temporary data or when you're confident that replication will be adequate to keep your data safe.</li>
	<li><tt>subcomparator</tt>: used by obsolete SuperColumns.</li>
	<li><tt>component_index</tt>: used internally by Cassandra with&nbsp;<a href="https://www.datastax.com/dev/blog/schema-in-cassandra-1-1">compound primary keys</a>.</li>
</ul>
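
Because the schema lives in ordinary tables, it can be read with plain CQL. The queries below are a minimal sketch; the keyspace <tt>demo</tt> and the table <tt>users</tt> are hypothetical names used only for illustration.

<pre>
<code>-- List every keyspace with its replication settings.
SELECT keyspace_name, durable_writes, strategy_class, strategy_options
  FROM system.schema_keyspaces;

-- List the tables in one keyspace ('demo' is a placeholder).
SELECT columnfamily_name, comparator, key_validator
  FROM system.schema_columnfamilies
 WHERE keyspace_name = 'demo';

-- List the columns of one table ('users' is a placeholder).
SELECT column_name, validator, index_name
  FROM system.schema_columns
 WHERE keyspace_name = 'demo' AND columnfamily_name = 'users';
</code></pre>

The <tt>durable_writes</tt> value shown in <tt>schema_keyspaces</tt> is whatever was set on the keyspace itself. For example, a scratch keyspace (again a hypothetical name) could be created with the commitlog disabled like this:

<pre>
<code>CREATE KEYSPACE scratch
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
  AND durable_writes = false;
</code></pre>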

<h3>Cluster information</h3>

Each node records what other nodes tell it about themselves over gossip:


<pre>
<code>CREATE TABLE peers (
 peer inet PRIMARY KEY,
 data_center text,
 host_id uuid,
 rack text,
 release_version text,
 rpc_address inet,
 schema_version uuid,
 tokens set&lt;text&gt;
);
</code></pre>
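
A driver or an operator can read this view of the cluster directly. A minimal sketch follows; since the <tt>system</tt> keyspace is unreplicated, the result reflects only what the node you are connected to has recorded.

<pre>
<code>-- One row per other node that this node has heard about.
SELECT peer, data_center, rack, release_version, rpc_address
  FROM system.peers;
</code></pre>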


And what it knows about itself, which is a superset of what it gossips:


<pre>
<code>CREATE TABLE local (
 key text PRIMARY KEY,
 bootstrapped text,
 cluster_name text,
 cql_version text,
 data_center text,
 gossip_generation int,
 host_id uuid,
 partitioner text,
 rack text,
 release_version text,
 schema_version uuid,
 thrift_version text,
 tokens set&lt;text&gt;,
 truncated_at map&lt;uuid, blob&gt;
);
</code></pre>


Remember that starting with 1.2, each node can be assigned multiple tokens through&nbsp;<a href="https://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2">virtual nodes</a>, which is why <tt>tokens</tt> is a set rather than a single value.

The&nbsp;<tt>local</tt>&nbsp;table contains only a single row, whose key is also "local".
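
Reading that single row is an ordinary primary key lookup. The sketch below also shows one possible use, offered as an illustration rather than a prescribed procedure: comparing the connected node's <tt>schema_version</tt> with the versions it has recorded for its peers, to see whether a schema change appears to have propagated.

<pre>
<code>-- What the connected node knows about itself.
SELECT cluster_name, partitioner, release_version, tokens
  FROM system.local
 WHERE key = 'local';

-- Schema version of this node ...
SELECT schema_version FROM system.local WHERE key = 'local';

-- ... and the versions it has recorded for every other node.
-- If all values match, this node believes the schema is in agreement.
SELECT peer, schema_version FROM system.peers;
</code></pre>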

<h3>Other</h3>

The&nbsp;<tt>batchlog</tt>&nbsp;table contains data for&nbsp;<a href="https://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2">atomic batches</a>.

<a href="https://www.datastax.com/dev/blog/modern-hinted-handoff">Hinted handoff</a>&nbsp;records mutations to replay in the&nbsp;<tt>hints</tt>&nbsp;table.

<tt>IndexInfo</tt>&nbsp;stores information about index creation status and will probably be moved into&nbsp;<tt>schema_columnfamilies</tt>&nbsp;in the future.

<tt>NodeIdInfo</tt>&nbsp;stores&nbsp;<a href="https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java">counter "node ids"</a>.
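
Note that <tt>IndexInfo</tt> and <tt>NodeIdInfo</tt> are mixed-case names. CQL3 treats unquoted identifiers as case-insensitive, so these names have to be double-quoted in a query. A sketch, assuming you only want to inspect the raw rows:

<pre>
<code>-- Double quotes preserve the mixed-case table names.
SELECT * FROM system."IndexInfo";
SELECT * FROM system."NodeIdInfo";
</code></pre>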

<tt>range_xfers</tt>&nbsp;is used to store range transfer status when upgrading a non-vnode cluster to use vnodes.

Request traces are stored not in the&nbsp;<tt>system</tt>&nbsp;keyspace, which is unreplicated, but in&nbsp;<a href="https://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2"><tt>system_traces</tt></a>.
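
Trace data can be queried like any other table. The sketch below assumes tracing has already been enabled for some requests; the session id in the second query is a placeholder, not a real value.

<pre>
<code>-- Recently traced requests, one row per session.
SELECT session_id, started_at, duration, request
  FROM system_traces.sessions
 LIMIT 10;

-- Individual steps of one traced request (placeholder session id).
SELECT activity, source, source_elapsed
  FROM system_traces.events
 WHERE session_id = 00000000-0000-0000-0000-000000000000;
</code></pre>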

<h3>Drive responsibly</h3>

Cassandra does allow you to update data in the&nbsp;<tt>system</tt>&nbsp;keyspace, but it goes without saying that you should only do so if you know what you are doing.

Schema changes to&nbsp;<tt>system</tt>&nbsp;are not allowed.

The data dictionary in Cassandra 1.2
