In this episode, Jeff Carpenter talks with Dikang Gu about the origins of Cassandra at Instagram, progress on adopting RocksDB as a storage engine for Cassandra, geographic data partitioning, and how his team is providing Cassandra as a Service inside Instagram.
Highlights!
0:00 - Welcoming Dikang back to the show and recapping how Cassandra came back into Facebook via the Instagram acquisition.
1:09 - After starting to use Cassandra at a startup in China in 2010, Dikang now leads the Cassandra team at Instagram and is a committer on the project.
3:14 - Dikang joined Facebook in 2012, switched over to Instagram after the acquisition, in 2014, and joined the team working on Cassandra full time. That team is now up to 7 engineers.
4:45 - The Cassandra team at Instagram has been working on performance and reliability improvements such as the RocksDB integration. This year they're working on efficiency in compute and storage to help limit hardware growth.
5:55 - Andrew Whang's talk is about a service Instagram has built to partition data by locality, for example to keep data within a particular country or region. This helps with privacy, efficiency, and performance.
9:18 - The team is also working to provide Cassandra as a Service within Instagram. Michael Figuiere's talk is about a layer they are providing on top of Cassandra to simplify access and provide traffic management. The API is based on Thrift.
13:43 - Most of the clusters at Instagram have been migrated onto the RocksDB-based engine. They're continuing to work on updating Cassandra APIs to make the storage engine pluggable; this will be a post-4.0 feature.
16:01 - Features in 4.0 that Dikang is excited about include virtual tables and Netty-based internode communication.
18:08 - Another feature for the future is stronger membership management around the gossip protocol and handling failure.
22:00 - Looking forward to talks at Accelerate about repair and sidecars.