The Java Driver team is pleased to announce that version 2.0.10 has been released! This new release comes with a lot of new features, improvements and bugfixes.
Below are some highlights that we think will be of interest to most of our users. Please refer to the README file for a global overview of the current set of features, or to the complete changelog for the full list of changes.
- Speculative Executions
- Query Logger
- Per-Host Latency Histograms
- Netty 4
  - Advanced customizations
- Manual Query Paging
- Improvements to BoundStatement
- Exposing Token Ranges
- Improvements to Connection Handling
  - New pool resizing algorithm
  - Asynchronous initialization
  - Connection heartbeats
  - Revert of JAVA-425
- Schema Agreement API
- Better Naming of Threads
Speculative Executions
Since version 2.0.2, Cassandra offers a mechanism to protect against bad read latencies: rapid read protection.
JAVA-561 now introduces a similar protection mechanism that we named Speculative Executions (not to be confused with retries): the driver is now able to pre-emptively start a second execution of the same query against another node, before the first node has replied or errored out. The driver passes whichever response comes back first to the client and cancels the other executions.
The driver currently ships with two speculative execution policies:
- NoSpeculativeExecutionPolicy, the default, which disables speculative executions;
- ConstantSpeculativeExecutionPolicy, which spawns speculative executions at a constant rate.
As usual, you can also provide your own policy by implementing the SpeculativeExecutionPolicy interface.
Since speculative executions are disabled by default, to switch them on and use e.g. ConstantSpeculativeExecutionPolicy, all you need to do is register your policy with your Cluster.
With a constant delay of 500 milliseconds and at most two speculative executions, for example, executions would be spawned according to the following scenario:
- start the initial execution at t0;
- if no response has been received at t0 + 500 milliseconds, start a speculative execution on another node;
- if no response has been received at t0 + 1000 milliseconds, start another speculative execution on a third node.
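The timeline above can be sketched with a small, driver-independent simulation. This is only an illustration under stated assumptions: the class SpeculativeTimeline, its simulate method and the per-node latencies are hypothetical, and no driver API is used. The code simply computes which execution would answer first when a new execution is started every delayMillis until a response has arrived.

```java
import java.util.Arrays;

/** Hypothetical, driver-independent model of constant-rate speculative executions. */
class SpeculativeTimeline {

    /**
     * Returns {winner index, completion time in ms after t0, executions started}.
     * latenciesMillis[i] is the assumed response time of the node used by execution i.
     */
    static long[] simulate(long delayMillis, int maxSpeculativeExecutions, long[] latenciesMillis) {
        int winner = 0;
        long firstResponseAt = latenciesMillis[0]; // the initial execution starts at t0
        int started = 1;
        for (int i = 1; i <= maxSpeculativeExecutions && i < latenciesMillis.length; i++) {
            long startAt = i * delayMillis;
            if (firstResponseAt <= startAt) {
                break; // a response has already arrived: no further execution is needed
            }
            started++;
            long completesAt = startAt + latenciesMillis[i];
            if (completesAt < firstResponseAt) {
                firstResponseAt = completesAt;
                winner = i;
            }
        }
        return new long[] {winner, firstResponseAt, started};
    }

    public static void main(String[] args) {
        // Assumed latencies: node 0 is slow (1200 ms), node 1 answers in 300 ms, node 2 in 100 ms.
        long[] result = simulate(500, 2, new long[] {1200, 300, 100});
        System.out.println(Arrays.toString(result)); // prints [1, 800, 2]
    }
}
```

In this run the second node wins at t0 + 800 milliseconds, so the third execution planned for t0 + 1000 milliseconds is never started.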
One important aspect to consider when using speculative executions is whether queries are idempotent or not, i.e. whether they can be applied multiple times on a given initial state while always producing the same resulting state. If a query is not idempotent, then speculative executions should not be attempted for it, because there is no way to guarantee that the mutation will be applied only once.
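To make the idempotence criterion concrete, here is a minimal sketch that replays two hypothetical mutations against an in-memory map standing in for a row. No driver or CQL is involved, and all names are illustrative only. Overwriting a value leaves the same state however many times it is applied, whereas incrementing a counter does not.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of idempotent vs. non-idempotent mutations. */
class IdempotenceSketch {

    /** Idempotent: comparable to overwriting a column with a fixed value. */
    static void setValue(Map<String, Integer> row, int value) {
        row.put("v", value);
    }

    /** Not idempotent: comparable to incrementing a counter. */
    static void increment(Map<String, Integer> row) {
        row.merge("v", 1, Integer::sum);
    }

    /** Applies the chosen mutation `times` times to a fresh row whose v is 0 and returns the final v. */
    static int replay(boolean idempotent, int times) {
        Map<String, Integer> row = new HashMap<>();
        row.put("v", 0);
        for (int i = 0; i < times; i++) {
            if (idempotent) {
                setValue(row, 42);
            } else {
                increment(row);
            }
        }
        return row.get("v");
    }

    public static void main(String[] args) {
        System.out.println(replay(true, 1) + " " + replay(true, 2));   // prints 42 42
        System.out.println(replay(false, 1) + " " + replay(false, 2)); // prints 1 2
    }
}
```

If a speculative execution caused the increment to be applied twice, the final state would differ from the single-execution case, which is exactly the situation to avoid.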
Query Logger
Java Driver users have long asked for a convenient way to log queries executed by the driver, and for a tool to track slow queries. Both are now possible thanks to JAVA-646, which introduces a new API class named QueryLogger.
Let's suppose that we want to track queries that take more than 300 milliseconds to complete; this can be achieved in two steps:
1) Create one (singleton) QueryLogger instance at application startup and register it with the Cluster.
2) Set the com.datastax.driver.core.QueryLogger.SLOW logger level to DEBUG, e.g. with Logback.
The driver then prints a log message for every query that takes more than 300 milliseconds to complete, including useful information such as the queried host and the query string.
QueryLogger's behavior can be fully customized to your needs. For more information, read the online documentation, or the API docs for QueryLogger.
Per-Host Latency Histograms
This version includes a beta preview of a set of new components that focus on recording latency histograms.
The core component is PerHostPercentileTracker. It is a LatencyTracker that records latencies for each host over a sliding time interval, and exposes an API to retrieve the current latency at a given percentile. This class uses HdrHistogram to record histograms behind the scenes. See JAVA-723 for more details, or the API docs for the PerHostPercentileTracker class.
We are also including a more elaborate, percentile-based speculative execution policy called PercentileSpeculativeExecutionPolicy. We're very excited about this policy, and we expect very good results for speculative executions triggered at higher latency percentiles (95th and above), so we decided to let users experiment with it. See the online documentation or the API docs for PercentileSpeculativeExecutionPolicy for more details and usage examples. A separate blog post will be published soon and will focus on performance benchmarks for different kinds of speculative executions, including this one.
The QueryLogger described above can also be configured to use dynamic, percentile-based thresholds instead of a constant threshold, although used this way it should be considered beta too. Find out more about dynamic thresholds for the QueryLogger in the online documentation or in the API docs.
Again, we should stress that the above features are currently marked "beta" and are included in this version for evaluation purposes only: they haven't been thoroughly tested yet, and their API is still subject to change.
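For readers unfamiliar with percentile-based thresholds, the following sketch shows the underlying idea using only the JDK. It is not how PerHostPercentileTracker is implemented (the driver relies on HdrHistogram and a sliding time interval): the PercentileSketch class is hypothetical and applies a naive nearest-rank computation to a fixed set of made-up latency samples.

```java
import java.util.Arrays;

/** Hypothetical nearest-rank percentile over a fixed set of latency samples. */
class PercentileSketch {

    /** Returns the smallest sample such that at least p percent of the samples are <= it. */
    static long percentile(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need at least one sample and 0 < p <= 100");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up samples: mostly fast responses with two slow outliers.
        long[] samples = {12, 15, 11, 14, 13, 250, 16, 12, 13, 900};
        System.out.println(percentile(samples, 50)); // prints 13
        System.out.println(percentile(samples, 95)); // prints 900
    }
}
```

A threshold set at a high percentile adapts to what is normal for a given host: only requests slower than almost all recent ones would trigger a speculative execution or a slow-query log entry.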
Netty 4
Although most users won't notice it, another significant improvement under the hood in version 2.0.10 is the upgrade from Netty 3 to Netty 4 (JAVA-622).
One important thing to note is that Netty 4.0 sets the TCP_NODELAY flag to true by default. We are also now defaulting the corresponding driver option to true. Set this option explicitly to false if you want to enable Nagle's algorithm.
Another important change is Netty shading. Since versions 2.0.9 and 2.1.4, the Netty library has been shaded by default. Based on feedback we have received since, we are now providing the driver artifacts in two different flavors: with and without shaded Netty classes. Please refer to the online documentation to find out how to use the shaded driver jar.
Advanced customizations
But there's even more: thanks to JAVA-640 and JAVA-676, it is now possible for client applications to customize the driver's underlying Netty layer. Clients that need such flexibility can now subclass the newly-created NettyOptions class and provide the necessary customization by overriding its methods. Contrary to other driver options, however, the options available in this class should be considered advanced features and should only be modified by expert users. Moreover, given that the NettyOptions API exposes Netty classes, it should only be extended and used by clients using the non-shaded version of the driver. Check the API docs for NettyOptions for more information about this feature and how to use it.
Manual Query Paging
Until now, the paging state was kept internally by the driver, and clients had no direct access to it. This was a serious limitation for applications trying to achieve "manual" paging, e.g. when displaying query results in a stateless web application.
With JAVA-550, manual paging is finally possible. Check the online documentation to find out how.
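The pattern that an exposed paging state enables can be sketched without the driver. In this hypothetical model, fetchPage plays the role of a stateless request handler: it returns one page of rows plus an opaque token, and the client sends that token back to obtain the next page. The token here is just a Base64-encoded offset into an in-memory list; the driver's actual paging state is a different, opaque value, and none of the names below belong to the driver API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

/** Hypothetical model of stateless paging driven by a client-held token. */
class ManualPagingSketch {

    /** One page of results plus the token needed to resume, or null when exhausted. */
    static final class Page {
        final List<String> rows;
        final String nextToken;

        Page(List<String> rows, String nextToken) {
            this.rows = rows;
            this.nextToken = nextToken;
        }
    }

    /** Stand-in for a table; a real application would query Cassandra instead. */
    static final List<String> TABLE = Arrays.asList("a", "b", "c", "d", "e");

    /** Stateless: everything needed to resume travels in the token, nothing is kept between calls. */
    static Page fetchPage(String token, int pageSize) {
        int offset = token == null
                ? 0
                : Integer.parseInt(new String(Base64.getDecoder().decode(token), StandardCharsets.UTF_8));
        int end = Math.min(offset + pageSize, TABLE.size());
        List<String> rows = new ArrayList<>(TABLE.subList(offset, end));
        String next = end < TABLE.size()
                ? Base64.getEncoder().encodeToString(Integer.toString(end).getBytes(StandardCharsets.UTF_8))
                : null;
        return new Page(rows, next);
    }

    public static void main(String[] args) {
        String token = null;
        do {
            Page page = fetchPage(token, 2);
            System.out.println(page.rows);
            token = page.nextToken; // in a web application this would travel in the next request
        } while (token != null);
    }
}
```

Because the token is the only thing carried from one request to the next, any server instance can serve the following page, which is what a stateless web application needs.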
Improvements to BoundStatement
The BoundStatement class has been enriched with two long-awaited improvements: a set of get*() methods to retrieve typed CQL values either by index (starting at 0) or by name, as well as a more generic getObject() method (see JAVA-547 and JAVA-584). These have been grouped in a new interface, GettableData, which is implemented by both BoundStatement and Row. This means that it is now possible to retrieve bound values from a BoundStatement.
We are also introducing a new method, DataType.format(Object), that formats a Java object as a String, again for pretty-printing CQL values.
Combined, these additions make it possible, for example, to iterate over the values bound to a statement and pretty-print them to the console for logging.
Note however that, because bound values are stored internally in a serialized form, retrieving them this way may have a non-negligible impact on performance, because they need to be deserialized again. These methods are thus provided mainly for debugging purposes and should not be used in normal application code.
Exposing Token Ranges
Making the driver able to report information about token distribution across the ring is also a long-awaited feature for people building Hadoop and Spark applications that interact with Cassandra tables. So far, such applications relied on the Thrift protocol, because the Java driver could not provide enough information for these clients to correctly compute InputSplits for Cassandra tables and evenly dispatch jobs across the Hadoop/Spark cluster. One example of such an application is the DataStax Spark Cassandra Connector.
Thanks to JAVA-312, which has been backported from version 2.1.5, the Java driver is now able to report enough information to such clients, contributing to the progressive abandonment of the deprecated Thrift protocol in the recalcitrant Hadoop/Spark area.
JAVA-312 introduces a new class: TokenRange. Its most important method is splitEvenly(int numberOfSplits), which splits the current token range into a number of smaller ranges of equal size. "Size" here refers to the number of tokens in each range; if you want to split according to the actual amount of data, sizing information is now exposed in a system table (see CASSANDRA-7688, fixed in Cassandra 2.1.5).
Check the online documentation on Metadata for further information and guidelines about how to use the new TokenRange class to compute splits for Cassandra tables.
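As a rough illustration of what splitting a range "by number of tokens" means, the sketch below divides a range of 64-bit tokens into equal parts using only the JDK. It is an assumption-laden model, not the driver's implementation: the TokenSplitSketch class is hypothetical, tokens are plain long values, and ranges that wrap around the ring are deliberately not handled.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical even split of a non-wrapping range of 64-bit tokens. */
class TokenSplitSketch {

    /**
     * Returns numberOfSplits + 1 boundaries; consecutive boundaries delimit the sub-ranges
     * (start, b1], (b1, b2], ..., (bn-1, end]. Sub-range sizes differ by at most one token.
     */
    static List<Long> splitEvenly(long start, long end, int numberOfSplits) {
        if (numberOfSplits < 1) {
            throw new IllegalArgumentException("numberOfSplits must be at least 1");
        }
        if (end <= start) {
            throw new IllegalArgumentException("wrap-around and empty ranges are not handled in this sketch");
        }
        // BigInteger avoids overflow: the full ring is wider than Long.MAX_VALUE.
        BigInteger lo = BigInteger.valueOf(start);
        BigInteger width = BigInteger.valueOf(end).subtract(lo);
        BigInteger n = BigInteger.valueOf(numberOfSplits);
        List<Long> boundaries = new ArrayList<>();
        for (int i = 0; i <= numberOfSplits; i++) {
            BigInteger offset = width.multiply(BigInteger.valueOf(i)).divide(n);
            boundaries.add(lo.add(offset).longValue());
        }
        return boundaries;
    }

    public static void main(String[] args) {
        System.out.println(splitEvenly(-100, 100, 4)); // prints [-100, -50, 0, 50, 100]
    }
}
```

Equal token counts do not imply equal amounts of data, which is why the text above points to the sizing information exposed by Cassandra for data-aware splits.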
Improvements to Connection Handling
New pool resizing algorithm
Let's start with a nice improvement: JAVA-419 brings a brand new algorithm for connection pool resizing that finally fixes a well-known bug affecting variable-sized pools (core connections != max connections).
Asynchronous initialization
Connection pools also benefit from asynchronous initialization. Until now, when the driver created a connection, it blocked until the connection was established and its initialization queries were performed. Moreover, a connection pool created its connections sequentially; with large clusters and/or a large number of core connections per host, creating all connection pools at session startup could take a very long time. JAVA-692 addresses this by introducing asynchronous, parallel connection pool initialization. This improvement should be noticeable for most users: our tests showed that the new asynchronous initialization outperforms previous versions of the driver in every case, especially with large clusters and clusters requiring authentication. For example, a 40-node cluster with authentication enabled initialized 8 times faster than before.
Connection heartbeats
Also, the SUSPECT state has disappeared. Handling of misbehaving hosts is now done via connection heartbeats, introduced by JAVA-533 and backported from version 2.1.5. You can find out more about connection heartbeats in the online documentation on connection pooling.
Revert of JAVA-425
And last, but not least: the Java driver community spoke, and we heard you! JAVA-425 has been completely reverted. For those of you who remember, JAVA-425 introduced a major behavior shift regarding driver read timeouts: in the event of such a timeout (as determined by SocketOptions.getReadTimeoutMillis()), the driver would defunct the connection and mark the node DOWN.
As good as our intentions were, this behavior turned out to be too aggressive for most of our users. Among the reasons our users gave for not defuncting the connection:
- The driver cannot reason about the state of the server based on a single request timeout;
- Gossip protocol and connection heartbeats are better indicators of a host's health than a single read timeout;
- It can create sudden hotspots by restraining the number of live nodes the driver can talk to, risking a domino effect and a complete cluster outage;
- It could conceal server-side issues, notably insufficient cluster capacity.
Schema Agreement API
JAVA-669 introduces a new API to check schema agreement between peers.
After a DDL query, one can use resultSet.getExecutionInfo().isSchemaInAgreement() to check whether peers agreed. Users can also perform a one-time check at any time.
Check the online documentation for further information.
Better Naming of Threads
JAVA-583 has introduced changes in the way the driver names its threads to clearly mark them as belonging to the Java driver; all names are now prefixed with the cluster name.
And finally, JAVA-626 adds 4 new Gauges to the Metrics class:
getExecutorQueueDepth(): The number of queued up tasks in the non-blocking executor (threads named
getBlockingExecutorQueueDepth(): The number of queued up tasks in the blocking executor (threads named
getReconnectionSchedulerQueueSize(): The size of the work queue for the reconnection scheduler (threads named <cluster>-reconnection). A queue size > 0 does not necessarily indicate a backlog as some tasks may not have been scheduled to execute yet.
getTaskSchedulerQueueSize(): The size of the work queue for the task scheduler (threads named <cluster>-scheduled-task-workers). A queue size > 0 does not necessarily indicate a backlog as some tasks may not have been scheduled to execute yet.
These can be used for monitoring whether or not a Cluster's executors are becoming backlogged, which could help understand abnormal behavior of the driver.
How To Contribute
Have comments, feedback, questions? We would be glad to hear from you! Please use any of the following:
- Source Code: https://github.com/datastax/java-driver
- Mailing List: https://groups.google.com/a/lists.datastax.com/forum/#!forum/java-driver-user
- JIRA: https://datastax-oss.atlassian.net/browse/JAVA
- Documentation (on GitHub): http://datastax.github.io/java-driver/2.0.10
- Documentation (on DataStax website): https://www.datastax.com/documentation/developer/java-driver/2.0
- API docs: https://www.datastax.com/drivers/java/2.0
- IRC: #datastax-drivers on irc.freenode.net