As a developer, I've found that looking for new ways to effectively test software has a dramatic payoff. We've written before about some of the techniques we use at DataStax. These writings include:
- tips on using system-level tools to observe Cassandra
- details on how we test Cassandra at scale
- steps to enable CI for all changes
During my summer internship at DataStax, I looked at how we could use the Jepsen framework for distributed systems analysis in testing Cassandra. In the following sections, I'll give overviews of why Jepsen interests us, how we used Jepsen, what we learned, and our plans for the future.
If you're familiar with Jepsen, feel free to skip ahead. If not, we can help clear things up. Jepsen is often used as an umbrella term to refer to a few things. These include:
- A Clojure framework for testing distributed systems created by Kyle Kingsbury
- Tests in this framework for many distributed systems, written by Kyle
- Analyses written by Kyle based on these tests
- Kyle's talks and consultancy around distributed system analysis
In this post, we'll be talking about tests and analysis we independently performed using the Jepsen framework.
At first glance, it might not be clear how Apache Cassandra could benefit from Jepsen in comparison to the testing approaches we already use: unit tests and distributed tests (dtests, for short). To highlight the differences, we can evaluate these styles of testing with respect to the criteria of state space coverage, controllability, and observability.
Unit tests excel at controllability and observability of Cassandra; since they are written and maintained as part of Cassandra's source, they are most effective at manipulating a single Cassandra process. On the other hand, they aren't very good at extensively exploring the state space of a Cassandra process or cluster. They exercise specific state space traces of a single process; they do not effectively model the operation of a real cluster and do not reflect the variety of inputs in a real deployment. Unit tests enable only limited observability outside of the Cassandra process.
Dtests offer a powerful balance of the three categories. Since they use CCM to start and observe a real Cassandra cluster at the client boundaries, coverage of the state space of a Cassandra cluster increases. At the same time, given that the whole cluster runs on a single machine, the environmental realities of a Cassandra cluster do not get adequately explored. System-level tools power observability, and CCM offers effective controllability of the Cassandra processes, but targeted exploration of specific state space traces is not easy. Dtests are conventionally written as a sequence of specific, deterministic steps; for example, one might write a test that performs writes, bootstraps a node, and then confirms that the writes are still present.
Because it uses SSH for configuration and manipulation, Jepsen allows system-level controllability and observability. For us, its strength lies in its ability to better explore the state space of both a single node and a whole cluster. As opposed to the targeted explorations of a typical dtest, a Jepsen test embraces concurrency and nondeterminism as the default mode of operation. Its powerful generator abstraction enables the test author to rapidly design and implement sequences of random instructions dictating how multiple processes should manipulate the database or environment. For example, several processes might be reading from or writing to the database while another process periodically partitions the network. After the completion of such a randomized run, Jepsen checks whether this test's execution trace preserves certain properties. As a result of this more accurate environmental modeling and improved state space coverage, Jepsen tests necessarily take longer to run and require more system resources. This means they work well as a complementary step to find more nuanced paths to failure.
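To make the generator idea concrete, here is a minimal sketch in Python rather than Jepsen's own Clojure. It does not use Jepsen's API; the function names (`client_ops`, `nemesis_ops`, `mix`) and the operation dictionaries are hypothetical, chosen only to illustrate how random client operations can be interleaved with periodic network partitions.

```python
import random
from itertools import islice


def client_ops(rng):
    """Endless stream of random client operations against a set (hypothetical shape)."""
    next_value = 0
    while True:
        if rng.random() < 0.8:
            # Each add uses a fresh value so a checker can track it later.
            yield {"process": "client", "f": "add", "value": next_value}
            next_value += 1
        else:
            yield {"process": "client", "f": "read", "value": None}


def nemesis_ops():
    """Endless stream that alternates between starting and healing a partition."""
    while True:
        yield {"process": "nemesis", "f": "start", "value": None}
        yield {"process": "nemesis", "f": "stop", "value": None}


def mix(rng, clients, nemesis, nemesis_every=10):
    """Interleave the streams: on average one nemesis op per `nemesis_every` ops."""
    while True:
        if rng.randrange(nemesis_every) == 0:
            yield next(nemesis)
        else:
            yield next(clients)


# A fixed seed makes a randomized schedule reproducible when a failure needs replaying.
rng = random.Random(42)
schedule = list(islice(mix(rng, client_ops(rng), nemesis_ops()), 50))
```

The point of the design is that the schedule is described, not enumerated: changing the mix of reads, writes, and faults is a matter of adjusting a probability rather than rewriting a scripted test.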
Jepsen in Practice
Jepsen tests work by checking invariants against a history produced by a test run. The test progresses through several high-level phases:
- Set up the OS on the nodes under test
- Set up Cassandra on the nodes under test
- Run the operations provided by the generator in client and nemesis processes
- Verify invariants against the resulting history with checkers
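The last phase, checking invariants against a history, can be sketched as follows. This is a simplified Python illustration in the spirit of a set checker, not the code from our tests; the history format (dictionaries with `type`, `f`, and `value` keys) and the function name `check_set` are assumptions made for the example.

```python
def check_set(history):
    """Check that every acknowledged add is visible in the final successful read."""
    attempted, acked = set(), set()
    final_read = None
    for op in history:
        if op["f"] == "add":
            if op["type"] == "invoke":
                attempted.add(op["value"])
            elif op["type"] == "ok":
                acked.add(op["value"])
        elif op["f"] == "read" and op["type"] == "ok":
            # Later reads overwrite earlier ones, so this ends up as the final read.
            final_read = set(op["value"])
    if final_read is None:
        return {"valid": False, "error": "no successful read"}
    lost = acked - final_read            # acknowledged but missing
    unexpected = final_read - attempted  # present but never written
    return {"valid": not lost and not unexpected,
            "lost": lost, "unexpected": unexpected}


history = [
    {"type": "invoke", "f": "add", "value": 1},
    {"type": "ok", "f": "add", "value": 1},
    {"type": "invoke", "f": "add", "value": 2},
    {"type": "info", "f": "add", "value": 2},   # timed out: outcome unknown
    {"type": "invoke", "f": "add", "value": 3},
    {"type": "ok", "f": "add", "value": 3},
    {"type": "invoke", "f": "read", "value": None},
    {"type": "ok", "f": "read", "value": [1, 2]},
]
result = check_set(history)
```

In this example the add of 2 timed out, so its presence in the read is acceptable, while the add of 3 was acknowledged and is missing, so the checker reports it as lost.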
Our Jepsen tests implement several variations on each of the steps above. They permit the installation of any of the Cassandra 2.1, 2.2, and 3.x releases. Environment variables allow easy configuration of the compaction strategy, hints, commitlog compression, and various other parameters at test runtime. Clients exist for reading and writing CQL sets, CQL maps, counters, batches, lightweight transactions, and materialized views. We use checkers to ensure that data structures function as expected at a variety of read and write consistency levels, that lightweight transactions are linearizable, and that materialized views accurately reflect the base table. We verify all these properties under failure conditions such as network partitions, process crashes, and clock drift. Because Cassandra seeks to maintain these safety properties while undergoing cluster membership changes, we implemented a conductor abstraction that allows multiple nemeses to run concurrently. This allowed us to execute the above tests while adding or removing nodes from the cluster. In particular, we paid great attention to lightweight transactions, as linearizability is a more challenging consistency model to provide.
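The conductor idea mentioned above can be illustrated with a small dispatch sketch. This is a Python approximation under stated assumptions, not our Clojure implementation: the class names, the `conductor` key on each operation, and the `RecordingNemesis` stand-in are all hypothetical, and the sketch dispatches operations one at a time rather than modeling real concurrency.

```python
class RecordingNemesis:
    """Stand-in nemesis that only records which operations it was asked to perform."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def invoke(self, op):
        self.log.append(op["f"])
        return {**op, "type": "info", "handled_by": self.name}


class Conductor:
    """Routes each operation to the nemesis named in its 'conductor' field."""

    def __init__(self, nemeses):
        self.nemeses = dict(nemeses)

    def invoke(self, op):
        name = op["conductor"]
        if name not in self.nemeses:
            raise KeyError(f"no nemesis registered as {name!r}")
        return self.nemeses[name].invoke(op)


partitioner = RecordingNemesis("partitioner")
bootstrapper = RecordingNemesis("bootstrapper")
conductor = Conductor({"partitioner": partitioner, "bootstrapper": bootstrapper})

# A partition is still in effect when the bootstrap is requested.
nemesis_schedule = [
    {"conductor": "partitioner", "f": "start"},
    {"conductor": "bootstrapper", "f": "bootstrap"},
    {"conductor": "partitioner", "f": "stop"},
]
results = [conductor.invoke(op) for op in nemesis_schedule]
```

Because each nemesis keeps its own schedule, a membership change can land in the middle of a partition, which is the kind of overlap that a single nemesis cannot express.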
These tests helped to identify and reproduce issues in existing subsystems leading up to the release of 3.0: CASSANDRA-10231, CASSANDRA-10001 and CASSANDRA-9851. CASSANDRA-10231 in particular is a powerful example of how randomization can produce a hard-to-predict interleaving of changes in cluster composition and failure conditions.
Our Jepsen tests helped to stabilize the new materialized views feature before its release in Cassandra 3.0. Materialized views offer a nice case study for the value of these tests. Modeling materialized views is simple; we want eventual consistency between the contents of the base table and the view table. In our clients, we write to the base table and read from the view table. It is particularly important that the cluster undergoes changes in composition during the test since the pairing from base replica to view replica will change. The types of issues detected during testing reinforce this priority, as they mostly stem from interactions between environmental failures, materialized views, and changes in cluster membership. Issues identified during targeted testing of materialized views include CASSANDRA-10413, CASSANDRA-10674, and CASSANDRA-10068. In addition to these reported issues, our tests also helped prevent mistakes earlier in development.
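As a rough illustration of the materialized view model, the Python sketch below compares a final read of the base table with a final read of the view once the cluster has quiesced. This is a simplification: the clients described above write to the base table and read from the view, whereas this sketch assumes both tables can be read at the end. The function names and row shapes are hypothetical.

```python
def expected_view(base_rows):
    """Re-key base rows (key -> value) by value, as a view keyed on the value column would.

    Rows whose value is None are skipped, since a view excludes rows where its
    primary key column is null.
    """
    return {(value, key) for key, value in base_rows.items() if value is not None}


def check_view(base_rows, view_rows):
    """Compare the view's contents with what the base table implies."""
    expected = expected_view(base_rows)
    actual = set(view_rows)
    return {
        "valid": expected == actual,
        "missing": expected - actual,  # in the base table but absent from the view
        "stale": actual - expected,    # in the view but no longer implied by the base
    }


base = {1: "a", 2: "b", 3: None}
view = [("a", 1), ("b", 2), ("c", 2)]  # ("c", 2) is a leftover from an older value
view_result = check_view(base, view)
```

Separating missing entries from stale ones is useful in practice, because they point at different failure modes: an update that never reached the view versus an old view row that was never cleaned up.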
We did not identify any new issues in the 2.1 or 2.2 versions of Cassandra as a result of Jepsen testing. We still found great value in this test coverage, as it further reinforced our confidence in the quality of these releases.
Work We Shared
We hope our work with Jepsen can help you with testing Cassandra and other distributed systems. The tests themselves are available on our GitHub. As part of our infrastructure for running tests, we've made available our tools for running Jepsen tests in Docker and multi-box Vagrant: Jepsen Docker and Jepsen Vagrant. Multi-box Vagrant support allows testing of clock drift on JVM databases: because libfaketime doesn't work well with most JVMs and Linux containers do not provide separate clocks, we need separate kernels for these tests. We greatly enjoyed working with Jepsen; we contributed fixes in PR #59 and PR #62 for the very minor issues we encountered. We've also suggested a fix for an issue in an upstream library.
Plans for the Future
I found that Jepsen's design promotes several important ideas. First, it highlights the importance of well-defined models and invariants in testing. This encourages unambiguous design and communication, and it also makes the purpose of the test clear. Second, Jepsen enables the tester to easily test these invariants in realistic conditions through test composability; the same core test can be run concurrently with node crashes, network partitions, or other pathological failures. Lastly, by embracing generative testing, it acknowledges the difficulty of thoroughly testing distributed systems using deterministic, hand-selected examples.
We're actively working on embracing these philosophies in our testing tools at DataStax. In particular, we aim to integrate these ideas with more flexible provisioning and orchestration, allowing us to easily test more cluster configurations and scales. We believe this will help us to ensure that Cassandra remains stable and correct throughout the course of further improvements.