Tuning the row cache in Cassandra 2.1

By Ryan McGuire -  May 16, 2014 | 1 Comment

Cassandra performs best when the data you need is already in memory. Disks are comparatively slow, so when data must be read from disk, it is best read in a single sequential operation. To design an effective data model in Cassandra, keep these best practices in mind:

  • Use clustering columns in your tables so that your rows are ordered on disk in the same order you want them in when read.
  • Use the built-in caching mechanisms to limit the number of reads from disk.

Row Clustering

To demonstrate these principles, consider the following example:

CREATE TABLE status (
    user text,
    status_id timeuuid,
    status text,
    PRIMARY KEY (user, status_id))
    WITH CLUSTERING ORDER BY (status_id DESC);

This table is designed to hold a time series of status updates for users. It can help us answer this question: “What are the 10 most recent status updates for Bill?” Because this type of query retrieves multiple rows, the table has been designed so that Cassandra writes rows to disk in the same order we want to read them. Most of the magic that helps us do that is contained in the PRIMARY KEY, which is composed of two parts:

  • user, the partition key, stores all status updates for a given user together on the same replica node(s).
  • status_id, the clustering column, sorts the status updates within the partition according to the column type. Here the type is timeuuid, which sorts chronologically.

The CLUSTERING ORDER BY clause sets the direction in which rows are sorted within a partition. In this example, rows are stored on disk in reverse chronological order by status_id, so the newest update comes first.
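To make the ordering concrete, here is a minimal Python sketch of what "newest first" means for timeuuid values. It is not how Cassandra sorts internally. It only assumes that a timeuuid is a version-1 UUID ordered primarily by its embedded timestamp, which Python's standard uuid module exposes as UUID.time. The three identifiers are the status_id values from the cqlsh output later in this post, and newest_first is a hypothetical helper name.

```python
import uuid

# The three status_id values shown in the cqlsh output in this post,
# deliberately listed out of order.
status_ids = [
    uuid.UUID("21b4d380-d9df-11e3-a61e-b1673f322ed1"),
    uuid.UUID("3a9b7d90-d9df-11e3-a61e-b1673f322ed1"),
    uuid.UUID("39929910-d9df-11e3-a61e-b1673f322ed1"),
]


def newest_first(ids, limit=10):
    """Order version-1 UUIDs by embedded timestamp, newest first, keep `limit`."""
    # UUID.time is the 60-bit timestamp (in 100 ns ticks) of a version-1 UUID.
    return sorted(ids, key=lambda u: u.time, reverse=True)[:limit]


# Mimics reading the head of a partition clustered in descending order.
for status_id in newest_first(status_ids):
    print(status_id)
```

Sorting by the timestamp field rather than by the string form matters because the low bits of the timestamp come first in the textual representation of a version-1 UUID.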

Because we’ve carefully designed our table to store our rows in the order we want them, it’s a very simple query to retrieve the last 10 status updates for Bill:

SELECT * FROM status WHERE user = 'bill' LIMIT 10;

The data required to satisfy this query is guaranteed to always be at the beginning of the partition for that user, and in the order we want. Both of these conditions make it possible to utilize the row caching ability of Cassandra.

Row caching

With row caching enabled, Cassandra will detect frequently accessed partitions and store rows of data in RAM to limit the cases where it needs to read from disk. This is a long-standing feature of Cassandra, but it receives some great optimizations in the upcoming 2.1 release. In previous releases, the row cache stored entire partitions in memory, so a partition larger than the cache size could never be served from the cache. Cassandra 2.1 introduces extra CQL syntax to specify the number of rows to cache per partition. Consider this modification to our table:

CREATE TABLE status (
    user text,
    status_id timeuuid,
    status text,
    PRIMARY KEY (user, status_id))
    WITH CLUSTERING ORDER BY (status_id DESC)
    AND caching = '{"keys":"ALL", "rows_per_partition":"10"}';

This new table specifies two types of caching:

  • A key cache, which helps Cassandra know where the partition is located on disk, decreasing seek times. This cache was already present in our first example as this is a feature that Cassandra turns on by default, but this setting makes it explicit.
  • A row cache, which in the above example has been set to only cache the first 10 rows of the partition. You can also set this to “ALL” to behave like the row cache of prior releases, which stores the entire contents of the partition.

To use the row cache, you must also tell Cassandra how much memory to dedicate to it using the row_cache_size_in_mb setting in the cassandra.yaml config file (the default of 0 disables the row cache). Cassandra will use that much space in memory to store rows from the most frequently read partitions of the table.

Testing the row cache

When you design a data model with row caching, it can be useful to test that it is truly getting data from the cache rather than from disk. Suppose you have inserted the following data into the status table:

INSERT INTO status (user, status_id, status) VALUES 
      ('bill', now(), 'Sorry to disappoint you...') ;
INSERT INTO status (user, status_id, status) VALUES 
      ('bill', now(), '.. but I''m real') ;
INSERT INTO status (user, status_id, status) VALUES 
      ('bill', now(), 'All my best, bill') ;

You can test your query in cqlsh with tracing enabled to see if the cache is hit or not:

cqlsh> tracing on;
Now tracing requests.
cqlsh> SELECT * FROM status WHERE user = 'bill' limit 10;

 user | status_id                            | status
------+--------------------------------------+----------------------------
 bill | 3a9b7d90-d9df-11e3-a61e-b1673f322ed1 |          All my best, bill
 bill | 39929910-d9df-11e3-a61e-b1673f322ed1 |            .. but I'm real
 bill | 21b4d380-d9df-11e3-a61e-b1673f322ed1 | Sorry to disappoint you...

In the trace that prints directly after that query you will see the following line:

Row cache miss [ReadStage:41]

This will always happen the first time you read data from a partition, as the cache has not been populated yet. Subsequent queries for the same partition (‘bill’) will have this line in the trace:

Row cache hit [ReadStage:55]

This tells you that the data you requested was found in the cache and no disk read was necessary. If the row is updated later, its cache entry will be invalidated (though it may become a write-through cache someday). If your query includes any rows that are not part of the cache, either because you requested more rows than you told it to cache or because you skipped the beginning of the partition, you may see this message in the trace:

Ignoring row cache as cached value could not satisfy query [ReadStage:89]

This shows that the cache was insufficient to complete the request, so a disk read was necessary. This most often occurs in one of two cases: you added constraints to your query that skip the rows at the beginning of the partition, or you did not place a limit on your query and it returns more rows than are cached. To ensure this query hits the cache, you can try increasing the number of rows cached per partition (rows_per_partition), or you may wish to restructure your table so that your frequently accessed rows sit at the head of the partition.
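The three trace messages above can be summarized with a small toy model. This is only a sketch of the behavior described in this post, not Cassandra's implementation. ToyRowCache and its methods are hypothetical names, partitions are plain Python lists, and "disk" is just a dictionary.

```python
from typing import Dict, List, Optional, Tuple


class ToyRowCache:
    """Toy model of a head-of-partition row cache (not Cassandra's code)."""

    def __init__(self, rows_per_partition: int) -> None:
        self.rows_per_partition = rows_per_partition
        self.disk: Dict[str, List[str]] = {}   # partition key -> rows, newest first
        self.cache: Dict[str, List[str]] = {}  # partition key -> cached head rows

    def write(self, key: str, row: str) -> None:
        # The newest row goes to the head, like a descending clustering order.
        self.disk.setdefault(key, []).insert(0, row)
        # Any write to the partition invalidates its cached entry.
        self.cache.pop(key, None)

    def read(self, key: str, limit: Optional[int] = None,
             skip: int = 0) -> Tuple[str, List[str]]:
        """Return (trace message, rows) for a query on one partition."""
        rows = self.disk.get(key, [])
        end = None if limit is None else skip + limit
        from_disk = rows[skip:end]
        cached = self.cache.get(key)
        if cached is None:
            # The first read populates the cache with the head of the partition.
            self.cache[key] = rows[: self.rows_per_partition]
            return "Row cache miss", from_disk
        # Fewer cached rows than the cap means the whole partition is cached.
        whole_partition_cached = len(cached) < self.rows_per_partition
        head_query_within_cache = (
            skip == 0 and limit is not None and limit <= self.rows_per_partition
        )
        if whole_partition_cached or head_query_within_cache:
            return "Row cache hit", cached[skip:end]
        return "Ignoring row cache as cached value could not satisfy query", from_disk


cache = ToyRowCache(rows_per_partition=10)
for text in ["Sorry to disappoint you...", ".. but I'm real", "All my best, bill"]:
    cache.write("bill", text)
print(cache.read("bill", limit=10)[0])  # first read of the partition
print(cache.read("bill", limit=10)[0])  # repeated read of the same partition
```

With the three rows for bill, the first read reports a miss and the repeated read reports a hit, mirroring the cqlsh session above. Lowering rows_per_partition below the query's LIMIT, or passing a non-zero skip on a partition larger than the cap, produces the "Ignoring row cache" message in this model.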

You can read about the implementation of the rows_per_partition setting in CASSANDRA-5357. This setting gives you a lot of flexibility to zero in on the exact data you want cached. If you study your application’s query model, and tune it to store your most frequently accessed data according to these best practices, you can get great response times without any need for an external caching layer.
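If you want to run the trace check from the testing section in a script rather than in cqlsh, something like the following sketch may help. It assumes a session object that behaves like the Session class of the DataStax Python driver 3.x, where execute(..., trace=True) returns a result whose get_query_trace() exposes events with a description attribute. Check this against the driver version you actually use. row_cache_trace_lines is a hypothetical helper name, and the contact point and keyspace in the commented usage are placeholders.

```python
def row_cache_trace_lines(session, query):
    """Run `query` with tracing on and return trace lines mentioning the row cache.

    `session` is assumed to follow the DataStax Python driver 3.x Session API.
    """
    result = session.execute(query, trace=True)
    trace = result.get_query_trace()
    if trace is None:
        # No trace was recorded for this request.
        return []
    return [
        event.description
        for event in trace.events
        if "row cache" in event.description.lower()
    ]


# Hypothetical usage against a local node (address and keyspace are assumptions):
# from cassandra.cluster import Cluster
# session = Cluster(["127.0.0.1"]).connect("my_keyspace")
# print(row_cache_trace_lines(
#     session, "SELECT * FROM status WHERE user = 'bill' LIMIT 10"))
```

Running the helper twice for the same partition should show the miss followed by the hit, as long as no write to that partition happens in between.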


  1. Joe says:

    hello there,
    If it’s possible to call TRACING instruction by using python driver?

