Primary Key
A Cassandra primary key uniquely identifies a row within a Cassandra table. The primary key consists of two parts: a partition key and optional clustering columns. Each part serves a different, specific purpose.
Want to use Cassandra successfully? Your data model may be the most important factor! While Cassandra Query Language (CQL) looks like SQL, there are some key differences. Become aware of these differences so you can build a scalable data model.
You’re using Cassandra because you want your data access to be fast and scalable. The secret to Cassandra’s fast data access is an optimized storage mechanism, which you control with the primary key. The primary key and its components tell Cassandra how to find your data quickly.
The partition key portion of the primary key consists of one or more columns. Cassandra combines the values of all partition key columns, hashes the result into a token, and uses that token to quickly locate the partition within the cluster.
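As a rough illustration of how partition key values lead to a location in the cluster, the sketch below combines the values, hashes them into a token, and picks the owning node on a toy ring. Everything in it is an assumption made for illustration: the node names, the `token_for` and `node_for` helpers, and the use of MD5 from Python's standard library as a stand-in for Cassandra's default Murmur3 partitioner. A real cluster also uses its own key serialization, virtual nodes, and replication, none of which are modeled here.

```python
import bisect
import hashlib

# Hypothetical ring: (token, node) pairs sorted by token. Each node owns the
# tokens up to and including its own; larger tokens wrap around to the first node.
MAX_TOKEN = 2**128  # MD5 digests are 128 bits wide
RING = [
    (MAX_TOKEN // 4, "node-a"),
    (MAX_TOKEN // 2, "node-b"),
    (3 * MAX_TOKEN // 4, "node-c"),
]
RING_TOKENS = [token for token, _ in RING]


def token_for(partition_key):
    """Hash a tuple of partition key values into an integer token."""
    # Length-prefix each value so ("ab", "c") and ("a", "bc") combine differently.
    encoded = (str(value).encode("utf-8") for value in partition_key)
    combined = b"".join(len(part).to_bytes(2, "big") + part for part in encoded)
    return int.from_bytes(hashlib.md5(combined).digest(), "big")


def node_for(partition_key):
    """Return the (hypothetical) node that owns the partition for this key."""
    index = bisect.bisect_left(RING_TOKENS, token_for(partition_key))
    return RING[index % len(RING)][1]
```

Calling `node_for(("sensor-17", "2024-05-01"))` returns the same node every time, which is the property that lets a coordinator go straight to the right place without searching the cluster.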
A partition is a set of rows (a relatively small subset of the table) that share the same partition key. The partition is a physical unit of access, which means Cassandra can fetch all rows in a partition together, and very quickly. You can think of partitions as the results of pre-computed queries.
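To make the idea of a partition concrete, here is a toy in-memory model in which rows are grouped by their partition key values, so a single lookup returns every row in the partition. It is only a sketch and does not reflect how Cassandra stores data on disk; the sensor table, its column names, and the `insert` and `read_partition` helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical table of sensor readings, partitioned by (sensor_id, day).
PARTITION_KEY = ("sensor_id", "day")

partitions = defaultdict(list)  # partition key values -> rows in that partition


def insert(row):
    """Store a row in the partition selected by its partition key columns."""
    key = tuple(row[column] for column in PARTITION_KEY)
    partitions[key].append(row)


def read_partition(sensor_id, day):
    """Return every row of one partition with a single lookup."""
    return list(partitions.get((sensor_id, day), []))


insert({"sensor_id": "s1", "day": "2024-05-01", "time": "08:00", "temp": 19.5})
insert({"sensor_id": "s1", "day": "2024-05-01", "time": "09:00", "temp": 20.1})
insert({"sensor_id": "s2", "day": "2024-05-01", "time": "08:00", "temp": 17.8})
```

Here `read_partition("s1", "2024-05-01")` returns both readings for that sensor and day in one step, which is the sense in which a partition behaves like the stored answer to a query.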
Within a partition, Cassandra sorts the rows by the values of the clustering columns. During a query, Cassandra can therefore use the clustering column values to quickly find a specific row, or a contiguous range of rows, within the partition.
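As an illustrative sketch of what clustering columns buy you, the toy model below keeps the rows of a single partition ordered by one clustering column, so a specific row or a range of rows can be found with a binary search instead of a scan. The `Partition` class and the `reading_time` column are hypothetical, and the binary search merely stands in for Cassandra's actual storage structures. In CQL, a key of this shape would be declared along the lines of `PRIMARY KEY ((sensor_id, day), reading_time)`.

```python
import bisect


class Partition:
    """Toy partition whose rows stay sorted by one clustering column."""

    def __init__(self):
        self._keys = []  # clustering values, kept in ascending order
        self._rows = []  # rows, in the same order as self._keys

    def insert(self, clustering_value, row):
        """Insert a row at its sorted position; an equal key overwrites the row."""
        index = bisect.bisect_left(self._keys, clustering_value)
        if index < len(self._keys) and self._keys[index] == clustering_value:
            self._rows[index] = row  # same primary key: replace the existing row
        else:
            self._keys.insert(index, clustering_value)
            self._rows.insert(index, row)

    def find(self, clustering_value):
        """Binary-search for the row with this clustering value, or return None."""
        index = bisect.bisect_left(self._keys, clustering_value)
        if index < len(self._keys) and self._keys[index] == clustering_value:
            return self._rows[index]
        return None

    def slice(self, start, end):
        """Return rows whose clustering value lies in [start, end], in order."""
        low = bisect.bisect_left(self._keys, start)
        high = bisect.bisect_right(self._keys, end)
        return self._rows[low:high]


partition = Partition()
for reading_time, temp in [("09:00", 20.1), ("08:00", 19.5), ("10:00", 21.3)]:
    partition.insert(reading_time, {"reading_time": reading_time, "temp": temp})
```

Rows arrive out of order but are stored sorted, so `partition.find("08:00")` and `partition.slice("08:30", "10:00")` both avoid touching rows outside the requested key or range.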