New in Java driver 2.1.2: native protocol v3 support
The latest version of the DataStax Java driver brings support for version 3 of the native CQL protocol. This post explores what that means from an end-user perspective.
Native protocol overview
The native protocol defines the format of the messages exchanged between the driver and Cassandra over TCP. It was introduced in Cassandra 1.2 as an alternative to Thrift, and has since undergone three revisions:
|Cassandra version||Supported protocol versions||Java driver version|
|2.0||1, 2||2.0.x, 2.1.0, 2.1.1|
|2.1||1, 2, 3||2.1.2|
Both Cassandra and the Java driver are backward-compatible with older versions of the protocol; for example, you can connect to Cassandra 1.2 from the driver 2.1.2 over protocol v1. If not specified at configuration time (see below), the best version for a given client and server will be negotiated on the first connection.
For interested readers, the full specification of the native protocol can be found in the Cassandra codebase.
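As a rough illustration of what "the best version for a given client and server" means, the sketch below picks the highest version that both sides support. It is a simplified model, not the driver's negotiation code: the class and method names are hypothetical, and the version sets in `main` are read off the table and the Cassandra 1.2 example above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustrative model only; not the driver's actual negotiation code.
class ProtocolNegotiationSketch {

    /** Convenience helper to build a set of protocol version numbers. */
    static Set<Integer> versions(Integer... supported) {
        return new TreeSet<>(Arrays.asList(supported));
    }

    /** Returns the highest version both sides support, or -1 if they share none. */
    static int negotiate(Set<Integer> driverVersions, Set<Integer> serverVersions) {
        TreeSet<Integer> common = new TreeSet<>(driverVersions);
        common.retainAll(serverVersions);
        return common.isEmpty() ? -1 : common.last();
    }

    public static void main(String[] args) {
        Set<Integer> driver212 = versions(1, 2, 3);
        System.out.println("Cassandra 1.2: v" + negotiate(driver212, versions(1)));
        System.out.println("Cassandra 2.0: v" + negotiate(driver212, versions(1, 2)));
        System.out.println("Cassandra 2.1: v" + negotiate(driver212, versions(1, 2, 3)));
    }
}
```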
Protocol version as a Java enum
The first change introduced by 2.1.2 is to model protocol versions as a Java enum. Typical use cases include forcing a specific version at startup, or manually deserializing a ByteBuffer to a given datatype:
Cluster.builder().addContactPoint("127.0.0.1")
    .withProtocolVersion(ProtocolVersion.V2)
    .build();

String text = (String) DataType.text().deserialize(buffer, ProtocolVersion.V3);
An enum brings obvious type safety benefits: you can’t reference an unsupported version. For backward compatibility, we’ve kept the methods that take an int, but you should use the newer ones whenever possible.
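To make the type-safety argument concrete, here is a minimal sketch of the general pattern: an enum-based method next to a legacy int overload that validates and delegates. `WireVersion` and `VersionedApiSketch` are hypothetical stand-ins written for this example, not the driver's own classes.

```java
// Hypothetical stand-in for a protocol version enum; names are illustrative.
enum WireVersion {
    V1(1), V2(2), V3(3);

    private final int code;

    WireVersion(int code) {
        this.code = code;
    }

    int toInt() {
        return code;
    }

    /** Maps a raw int to a known version, rejecting anything else at runtime. */
    static WireVersion fromInt(int code) {
        for (WireVersion version : values()) {
            if (version.code == code) {
                return version;
            }
        }
        throw new IllegalArgumentException("Unsupported protocol version: " + code);
    }
}

class VersionedApiSketch {

    // Preferred form: only known versions can be passed, checked at compile time.
    static String describe(WireVersion version) {
        return "protocol v" + version.toInt();
    }

    // Legacy form kept for compatibility: a bad value is only caught at runtime.
    @Deprecated
    static String describe(int version) {
        return describe(WireVersion.fromInt(version));
    }

    public static void main(String[] args) {
        System.out.println(describe(WireVersion.V3));
        System.out.println(describe(2));
    }
}
```

With the int overload, a call such as `describe(4)` compiles and then fails at runtime; with the enum, the equivalent mistake cannot be written at all.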
More streams per connection
The native protocol is asynchronous, in that each connection handles more than one request at the same time. Requests and responses are matched by a common stream id. Here’s how three requests might get interleaved on a single connection:
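As a rough sketch of that matching, the toy model below tags each request with a free stream id and uses the id on each incoming response to complete the right request, whatever order the responses arrive in. It is not the driver's implementation; the class names and the lowest-free-id allocation strategy are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Toy model of one multiplexed connection; not the driver's implementation.
class StreamMultiplexSketch {

    /** A request in flight, tagged with the stream id it was sent on. */
    static final class Request {
        final int streamId;
        final String query;
        final CompletableFuture<String> response = new CompletableFuture<>();

        Request(int streamId, String query) {
            this.streamId = streamId;
            this.query = query;
        }
    }

    private final int maxStreams; // upper bound on simultaneous requests
    private final Map<Integer, Request> inFlight = new HashMap<>();

    StreamMultiplexSketch(int maxStreams) {
        this.maxStreams = maxStreams;
    }

    /** Reserves the lowest free stream id for a new request. */
    synchronized Request send(String query) {
        for (int id = 0; id < maxStreams; id++) {
            if (!inFlight.containsKey(id)) {
                Request request = new Request(id, query);
                inFlight.put(id, request);
                return request;
            }
        }
        throw new IllegalStateException("All " + maxStreams + " stream ids are in use");
    }

    /** Completes the request that carries this stream id, then frees the id. */
    synchronized void onResponse(int streamId, String payload) {
        Request request = inFlight.remove(streamId);
        if (request != null) {
            request.response.complete(payload);
        }
    }

    public static void main(String[] args) {
        StreamMultiplexSketch connection = new StreamMultiplexSketch(128);
        Request a = connection.send("query A");
        Request b = connection.send("query B");
        Request c = connection.send("query C");
        // Responses may come back in any order; the stream id pairs them up.
        connection.onResponse(c.streamId, "rows for C");
        connection.onResponse(a.streamId, "rows for A");
        connection.onResponse(b.streamId, "rows for B");
        System.out.println(a.response.join() + ", " + b.response.join() + ", " + c.response.join());
    }
}
```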
In version 2 of the native protocol, there were at most 128 stream ids per connection. The driver maintained a pool of connections to each node to handle higher throughputs.
In version 3, this number gets bumped to 32,768. The driver now only opens a single connection to each host. Most pooling options are no longer relevant and will be ignored if v3 is in use. However, we’ve added new options to limit the total number of requests per host. This is useful to enforce client-side throttling:
Cluster.builder().addContactPoint("127.0.0.1")
    .withPoolingOptions(new PoolingOptions()
        .setMaxSimultaneousRequestsPerHostThreshold(HostDistance.LOCAL, 16384)
        .setMaxSimultaneousRequestsPerHostThreshold(HostDistance.REMOTE, 2048))
    .build();
These options default to 1024 for local hosts and 256 for remote hosts, which gives you roughly the same load as the v2 pool with the default options.
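For readers wondering what a per-host cap amounts to, the sketch below shows one generic way to bound in-flight requests with a counting semaphore. How the driver tracks and enforces the threshold internally is not described in this post, so treat the class and its methods as hypothetical.

```java
import java.util.concurrent.Semaphore;

// Generic client-side throttle for one host; not the driver's own bookkeeping.
class PerHostThrottleSketch {
    private final Semaphore permits;

    PerHostThrottleSketch(int maxSimultaneousRequests) {
        this.permits = new Semaphore(maxSimultaneousRequests);
    }

    /** Returns true if a request may be sent now, false if the host is at its limit. */
    boolean tryStart() {
        return permits.tryAcquire();
    }

    /** Call once for each successful tryStart(), when the response has arrived. */
    void finish() {
        permits.release();
    }

    public static void main(String[] args) {
        // 1024 mirrors the default for local hosts quoted above.
        PerHostThrottleSketch local = new PerHostThrottleSketch(1024);
        int accepted = 0;
        for (int i = 0; i < 2000; i++) {
            if (local.tryStart()) {
                accepted++;
            }
        }
        System.out.println("accepted " + accepted + " of 2000 requests before any response");
    }
}
```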
Client-side timestamps
In Cassandra, each write has a microsecond-precision timestamp associated with it. Until now, there were two ways to assign it:
- automatically on the server side. This can sometimes be a problem when the order of the writes matters: with unlucky timing (different coordinators, network latency, etc.), two successive requests from the same client might be processed in a different order server-side, and end up with out-of-order timestamps;
- explicitly in the CQL query string (with USING TIMESTAMP). This solves the previous problem, but puts the burden of generating timestamps on client code.
With the native protocol version 3, a default timestamp can now be sent with each query. The driver will do it automatically if it’s configured with an instance of TimestampGenerator:
Cluster.builder().addContactPoint("127.0.0.1")
    .withTimestampGenerator(new AtomicMonotonicTimestampGenerator())
    .build();
In 2.1.2, the default is still server-side generation. So unless you explicitly provide a generator, you get the same behavior as previous driver versions.
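To give an idea of what a client-side generator has to guarantee, here is a minimal sketch of a monotonic microsecond generator built on an atomic counter. It is not the source of AtomicMonotonicTimestampGenerator; the class name is hypothetical, and it assumes "monotonic" means strictly increasing even when the clock stalls or goes backwards.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch; not the driver's AtomicMonotonicTimestampGenerator.
class MonotonicMicrosSketch {
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    /** Returns a microsecond timestamp strictly greater than any returned before. */
    long next() {
        // The system clock only has millisecond resolution here, hence the scaling.
        return next(System.currentTimeMillis() * 1000);
    }

    /** Same as next(), with the clock reading passed in so it can be tested. */
    long next(long nowMicros) {
        // If the clock has not advanced past the last value, bump by one microsecond.
        return last.updateAndGet(previous -> Math.max(nowMicros, previous + 1));
    }

    public static void main(String[] args) {
        MonotonicMicrosSketch generator = new MonotonicMicrosSketch();
        long first = generator.next();
        long second = generator.next();
        System.out.println("strictly increasing: " + (second > first));
    }
}
```

Two calls within the same millisecond still produce distinct, ordered values, which is the property that avoids the out-of-order writes described earlier.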
You can also override the default timestamp on a per-statement basis:
Statement statement = new SimpleStatement("UPDATE users SET email = 'email@example.com' where id = 1");
statement.setDefaultTimestamp(1234567890);
session.execute(statement);
As you can see, there are multiple ways to provide a timestamp, some of which overlap. The order of precedence is the following:
- if there is a “USING TIMESTAMP” clause in the CQL string, use that over anything else;
- otherwise, if a default timestamp was set on the statement and is different from Long.MIN_VALUE, use it;
- otherwise, if a generator is specified, invoke it and use its result if it is different from Long.MIN_VALUE;
- otherwise, let the server assign the timestamp.
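The precedence rules above can be sketched as a single decision function. This is only a model of the rules as listed: the class, enum, and method names are hypothetical, and the textual check for the USING TIMESTAMP clause is a deliberately naive stand-in for real CQL parsing.

```java
import java.util.Locale;
import java.util.function.LongSupplier;

// Models the precedence list above; not code from the driver or the server.
class TimestampPrecedenceSketch {

    enum Source { QUERY_STRING, STATEMENT_DEFAULT, GENERATOR, SERVER }

    /**
     * Decides where a write's timestamp comes from.
     * Long.MIN_VALUE means "not set", as in the rules above; generator may be null.
     */
    static Source resolve(String cql, long statementDefault, LongSupplier generator) {
        // Naive substring check, for illustration only.
        if (cql.toUpperCase(Locale.ROOT).contains("USING TIMESTAMP")) {
            return Source.QUERY_STRING;
        }
        if (statementDefault != Long.MIN_VALUE) {
            return Source.STATEMENT_DEFAULT;
        }
        if (generator != null && generator.getAsLong() != Long.MIN_VALUE) {
            return Source.GENERATOR;
        }
        return Source.SERVER;
    }

    public static void main(String[] args) {
        String plain = "UPDATE users SET email = 'email@example.com' WHERE id = 1";
        System.out.println(resolve(plain, Long.MIN_VALUE, null));
        System.out.println(resolve(plain, 1234567890L, () -> 42L));
    }
}
```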
Serial consistency level on batch statements
The serial consistency level is used in lightweight transactions. It applies to the “Paxos” phase, where the nodes reach a consensus on the proposal to proceed with, and can take one of two values: SERIAL and LOCAL_SERIAL (to understand the motivation for LOCAL_SERIAL, see CASSANDRA-5797). In contrast, the “regular” consistency level in a lightweight transaction applies to the standard write that happens once Paxos has unfolded.
In the Java driver, you set the serial consistency level through the aptly-named Statement#setSerialConsistencyLevel method. But in earlier versions, this method would throw an exception when called on a BatchStatement, because the operation was not supported at the protocol level. This is now possible in 2.1.2 with protocol v3. Here it is, with an example from an earlier blog post:
BatchStatement batch = new BatchStatement();
batch.add(new SimpleStatement("UPDATE bills SET balance=-200 WHERE user='user1' IF balance=-208"));
// Trailing spaces keep the concatenated fragments from running together.
batch.add(new SimpleStatement("UPDATE bills SET paid=true "
    + "WHERE user='user1' AND expense_id=1 "
    + "IF paid=false"));
batch.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
session.execute(batch);
Other improvements in 2.1.2
2.1.2's main focus was protocol v3 support, but it also comes with a handful of other improvements and fixes. As always, refer to the changelog for the full list.