DataStax Developer Blog

Index Summary and 1.2 Startup time improvements.

By Pavel Yaskevich - June 6, 2012

Database performance is very important, and startup time in particular, because a node that starts faster makes the system easier to maintain. Due to the log-structured nature of the storage format, startup time grows with the amount of data stored in the system, which makes it challenging to improve.

First, let’s review how a ColumnFamily is loaded at database startup:

  1. After schema information is loaded, the ColumnFamilyStore object is initialized.
  2. During ColumnFamilyStore initialization, the Key/Row caches are loaded from persistent storage (the location is set by 'saved_caches_directory' in cassandra.yaml).
  3. All SSTable files that belong to the given ColumnFamily are identified and loaded in parallel.
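
Step 3 can be pictured with a small sketch. This is only an illustration of opening several SSTables concurrently, not Cassandra's actual code: `open_sstable`, `load_column_family` and the worker count are hypothetical names and values chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def open_sstable(name: str) -> Dict[str, str]:
    """Hypothetical stand-in for opening one SSTable and its components."""
    return {"name": name, "state": "open"}


def load_column_family(sstable_names: List[str], workers: int = 4) -> List[Dict[str, str]]:
    """Open every SSTable of one ColumnFamily in parallel (step 3 of the list)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even though the work runs concurrently
        return list(pool.map(open_sstable, sstable_names))


tables = load_column_family(["hd-312", "hd-323", "hd-346"])
```

Because each SSTable is opened independently, the total wall-clock time is bounded by the slowest files rather than by the sum of all of them, which is why the per-SSTable cost discussed next matters so much.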

The most interesting part so far is how the Key/Row caches are loaded. The saved caches do not store a file position for each row, because a saved position could point to outdated data. Instead they store only the key (the row identifier). When the ColumnFamily’s SSTables are loaded, the set of saved keys is passed to each SSTable so that the freshest location of each row can be determined (for more information about global caches see CASSANDRA-3143).
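
To make this concrete, here is a minimal sketch of resolving saved keys against freshly opened SSTables. It assumes the cache is keyed by an (SSTable name, row key) pair and models each primary index as a plain list of (key, position) tuples; both are simplifications for illustration, and `preload_key_cache` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for a primary index: (key, position) pairs sorted by key.
IndexEntries = List[Tuple[str, int]]


def preload_key_cache(saved_keys: Iterable[str],
                      sstables: Dict[str, IndexEntries]) -> Dict[Tuple[str, str], int]:
    """Resolve each saved key to its current position in every SSTable that holds it."""
    wanted = set(saved_keys)
    cache: Dict[Tuple[str, str], int] = {}
    for name, entries in sstables.items():
        for key, pos in entries:
            if key in wanted:
                # the position comes from the index as it is now, never from the saved file
                cache[(name, key)] = pos
    return cache


sstables = {
    "hd-1": [("a", 0), ("c", 120)],
    "hd-2": [("a", 0), ("b", 64)],
}
cache = preload_key_cache(["a", "b"], sstables)
```

Only the keys survive a restart; every position is recomputed, so a compaction that rewrote the files between shutdown and startup cannot leave the cache pointing at stale offsets.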

Let’s move on to how an individual SSTable is loaded. The process itself is simple: a) open the primary index of the SSTable; b) read the index, build the IndexSummary and BloomFilter, and pre-load the key cache (if needed).

The process is illustrated by the following pseudo-code:


primary_index := io.Open(sstable.PRIMARY_INDEX)
summary := new IndexSummary(primary_index)

// walk every entry of the primary index
while !eof(primary_index) {
  pos, key := read_key(primary_index)

  // pre-load the key cache for keys that were saved before shutdown
  if key_is_in_cache(saved_keys, key) {
    cache.add(key, pos)
  }

  summary.addEntry(key, pos)
}

As you can see, the bigger the primary index gets, the more work we have to do on startup for each SSTable. The optimization is to remove the loop that goes through the whole primary index. This takes two steps: the IndexSummary should be persisted separately, and the key cache format should be changed so that it can be loaded independently of any other part of the system.

The first step in that direction was taken by CASSANDRA-2392, where the IndexSummary was moved to an independent SSTable component. This eliminates the use of the primary index at startup for users running without caches, and gives a dramatic improvement in startup speed compared to older versions at the cost of a little disk space. The final change was introduced by CASSANDRA-3762, which allows caches to be loaded separately from the SSTables, using the already loaded BloomFilter and IndexSummary to minimize processing time.
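
The idea behind both changes can be sketched in a few lines. This is an illustration under stated assumptions, not the actual implementation: the summary is assumed to sample every 128th index entry, it is persisted as JSON in a hypothetical file named Summary.json, and a Python set stands in for the BloomFilter (a real Bloom filter can return false positives, a set cannot).

```python
import bisect
import json
import os
import tempfile
from typing import List, Optional, Set, Tuple

Entry = Tuple[str, int]  # (row key, position), sorted by key


def build_summary(index: List[Entry], interval: int) -> List[Tuple[str, int]]:
    """Sample every `interval`-th key, remembering its slot in the index."""
    return [(index[i][0], i) for i in range(0, len(index), interval)]


def save_summary(path: str, summary: List[Tuple[str, int]]) -> None:
    with open(path, "w") as f:
        json.dump(summary, f)


def load_summary(path: str) -> List[Tuple[str, int]]:
    with open(path) as f:
        return [(key, slot) for key, slot in json.load(f)]


def locate(key: str, index: List[Entry], summary: List[Tuple[str, int]],
           bloom: Set[str], interval: int) -> Optional[int]:
    """Find a key's position by reading at most `interval` index entries."""
    if key not in bloom:  # cheap rejection of keys this SSTable cannot contain
        return None
    sampled_keys = [k for k, _ in summary]
    slot = bisect.bisect_right(sampled_keys, key) - 1  # last sample <= key
    if slot < 0:
        return None
    start = summary[slot][1]
    for k, pos in index[start:start + interval]:
        if k == key:
            return pos
    return None


INTERVAL = 128  # assumed sampling interval
index = [(f"key{i:04d}", i * 50) for i in range(1000)]
bloom = {k for k, _ in index}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "Summary.json")
    save_summary(path, build_summary(index, INTERVAL))  # done once, when the SSTable is written
    summary = load_summary(path)                        # startup: no full index scan

key_cache = {}
for saved_key in ["key0007", "key0900", "missing"]:
    pos = locate(saved_key, index, summary, bloom, INTERVAL)
    if pos is not None:
        key_cache[saved_key] = pos
```

With this shape, startup reads a small summary file per SSTable, and each saved cache key costs one filter check, one binary search and a short bounded read of the index, instead of a pass over every index entry.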

Here are the results showing the difference in startup time between versions 1.1.x and 1.2 after inserting 10,000,000 rows (-n 10000000 -S 128) with the stress tool (DEBUG enabled in conf/log4j-server.properties):

1.1 restart (after dropping page cache)

Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-328 (1402630182 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-229 (1902221766 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-326 (1663815492 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-346 (428903127 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-323 (205597008 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-312 (136536408 bytes)

INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-328: 7946 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-229: 11299 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-326: 8794 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-346: 2749 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-323: 333 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-312: 725 ms.

Total index load: 31846 ms. Total load time: 40.87 s.

1.2 restart (after dropping page cache)

Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-328 (1402630182 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-229 (1902221766 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-326 (1663815492 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-346 (428903127 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-323 (205597008 bytes)
Opening /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-312 (136536408 bytes)

INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-328: 665 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-229: 832 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-326: 463 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-346: 202 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-323: 209 ms.
INDEX LOAD TIME for /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hd-312: 111 ms.

Total index load: 2482 ms. Total load time: 7.95 s.

Conclusion

SSTable index load time improved dramatically (~13x, and ~5x in total system load time) by eliminating the need to go through the whole primary index, which resulted in faster system startup as a whole. Cassandra is under constant performance evaluation, both by developers and by community members and users, and this results in constant improvement of the product.
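
For readers who want to check the ratios, they can be recomputed directly from the log lines above. The numbers below are copied from the two restarts; the variable names are just for this example.

```python
# Per-SSTable "INDEX LOAD TIME" values in ms, in the order they appear in the logs.
index_11 = [7946, 11299, 8794, 2749, 333, 725]
index_12 = [665, 832, 463, 202, 209, 111]

# "Total load time" values in seconds.
total_11_s, total_12_s = 40.87, 7.95

index_speedup = sum(index_11) / sum(index_12)
total_speedup = total_11_s / total_12_s

print(f"index load: {sum(index_11)} ms -> {sum(index_12)} ms ({index_speedup:.1f}x)")
print(f"total load: {total_11_s} s -> {total_12_s} s ({total_speedup:.1f}x)")
# index load: 31846 ms -> 2482 ms (12.8x)
# total load: 40.87 s -> 7.95 s (5.1x)
```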


