How to Write a Dtest

By Philip Thompson -  February 19, 2016 | 0 Comments

What are Dtests?

Apache Cassandra’s functional test suite, cassandra-dtest, short for “distributed tests”, is an open-source Python project on GitHub where much of the Apache Cassandra test automation effort takes place. Unlike Cassandra’s unit tests, the dtests are end-to-end, black box tests that run against Cassandra clusters via CCM. The Cassandra Cluster Manager, or CCM, is a Python library that runs local C* clusters by hosting multiple JVMs on the same box. Each test’s runtime is anywhere from thirty seconds to several minutes. Many are general purpose functional tests, while others are regression tests for specific tickets from the Apache Cassandra JIRA.

Where are Dtests used?

Continuous integration for dtests runs on a publicly accessible Jenkins server at cassci.datastax.com. As patches are written, contributors can use CassCI to run the C* unit tests and the dtest suite against their new code, as discussed here.

Writing a Dtest

Adding a new dtest is quite simple. You’ll want to choose the appropriate module and/or test suite for your new test, or add one if necessary. Add a new test method to the file you’ve chosen; make sure that “test” is in the method name, or nosetests won’t pick it up. Now is a good time to add your test’s docstring. The docstring should include a description of what your test is trying to verify and how, as well as some doxygen markup. See dtest’s contributing.md for more on the appropriate doxygen annotations to use.

Now that the boilerplate is taken care of, you’re ready to begin writing your test. The first step is to launch a C* cluster, like so:

cluster = self.cluster
cluster.populate(3).start(wait_for_binary_proto=True)

You can modify the number of nodes in the cluster, the number of datacenters, or any of the cassandra.yaml options.

cluster = self.cluster
cluster.set_configuration_options(values={'hinted_handoff_enabled': False}) # Set a cassandra.yaml option
cluster.populate([2, 2]).start(wait_for_binary_proto=True) # A four node cluster. Two nodes in each of two datacenters

Remember that this is using CCM, so all of these processes are running on your laptop. Thus, it’s best not to launch more than five nodes. Most tests run against three nodes.

To create an object representing a connection to your C* cluster, you’ll want to use one of the following methods from dtest.py:

def cql_connection(self, node)
def exclusive_cql_connection(self, node)
def patient_cql_connection(self, node)
def patient_exclusive_cql_connection(self, node)

Use patient_cql_connection, unless you have a specific need for one of the others.

cluster = self.cluster
cluster.populate(3).start(wait_for_binary_proto=True)
node1, node2, node3 = cluster.nodelist()

session = self.patient_cql_connection(node1)

From here out will be the actual testing logic. You can use the Python driver to interact with C*, mostly via CQL, or the ccmlib API to run cassandra-stress, nodetool, or any other tool that ships in the C* source.

session.execute("CREATE KEYSPACE ks WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1} AND DURABLE_WRITES = true")
session.execute("USE ks")
session.execute("CREATE TABLE t (id int PRIMARY KEY, v int)")
session.execute("INSERT INTO t (id, v) VALUES (1, 2)")
rows = session.execute("SELECT * FROM t")
node1, node2, node3 = cluster.nodelist()

node1.stress(['write', 'n=1M', '-rate', 'threads=10'])
node2.decommission()
node3.repair()

You can use assertions.py and Python unittest’s built-in assertions to assert C*’s correctness.

from assertions import assert_one

session.execute("CREATE KEYSPACE ks WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1} AND DURABLE_WRITES = true")
session.execute("USE ks")
session.execute("CREATE TABLE t (id int PRIMARY KEY, v int)")
session.execute("INSERT INTO t (id, v) VALUES (1, 2)")
assert_one(session, "SELECT * FROM t", [1, 2])
rows = list(session.execute("SELECT * FROM t"))
self.assertEqual(rows[0], 1)
self.assertEqual(rows[1], 2)

Make sure you only use these, and not the Python assert keyword, as they offer significantly improved debug output on failures.

rows = list(session.execute("SELECT * FROM t"))
assert rows[0] == 1 # Do not do this.
assert rows[1] == 2

There’s no need to check for errors in C* logs, as that is automatically handled for you by dtest’s teardown.

Once you have finished with your test, make sure your new code is compliant with PEP8. See contributing.md for how to do so, along with further style guidelines. Now just open a pull request against the riptano/cassandra-dtest repository, and we’ll be happy to review and merge it.



Comments

Your email address will not be published. Required fields are marked *




Subscribe for newsletter:

© 2017 DataStax, All rights reserved. Tel. +1 (408) 933-3120 sales@datastax.com Offices

DataStax is a registered trademark of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache Cassandra, Apache, Tomcat, Lucene, Solr, Hadoop, Spark, TinkerPop, and Cassandra are trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.