Jorge Bay Gondra

<p>Version 1.4.0 of the&nbsp;<a href="http://docs.datastax.com/en/developer/nodejs-driver-dse/latest/">DataStax Enterprise Node.js Driver</a>&nbsp;and version 3.3.0 of the&nbsp;<a href="https://github.com/datastax/nodejs-driver">DataStax Node.js Driver for Apache Cassandra</a>&nbsp;are now available.</p>

<p>The main focus of these releases was to add support for speculative query executions. Additionally, we improved the performance of Murmur3 hashing and changed the query preparation logic along with other enhancements.</p>

<h2>Speculative query executions</h2>

<p>Speculative execution is a way to limit latency at high percentiles by preemptively starting one or more additional executions of the query against different nodes. The driver yields the first response it receives and discards the rest.</p>

<p>Speculative executions are disabled by default. They are controlled by an instance of&nbsp;<code>SpeculativeExecutionPolicy</code>&nbsp;provided when initializing the&nbsp;<code>Client</code>. This policy defines the threshold after which a new speculative execution is triggered.</p>

<p>The driver provides a&nbsp;<code>ConstantSpeculativeExecutionPolicy</code>&nbsp;that schedules a given number of speculative executions, separated by a fixed delay. The policy is exported under the&nbsp;<code>{root}.policies.speculativeExecution</code>&nbsp;submodule.</p>

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<p><code>const</code> <code>client = </code><code>new</code> <code>Client({</code></p>

			<p><code>&nbsp;&nbsp;</code><code>contactPoints,</code></p>

			<p><code>&nbsp;&nbsp;</code><code>policies: {</code></p>

			<p><code>&nbsp;&nbsp;&nbsp;&nbsp;</code><code>speculativeExecution: </code><code>new</code> <code>ConstantSpeculativeExecutionPolicy(</code></p>

			<p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code>200</code><code>, </code><code>// delay before a new execution is launched</code></p>

			<p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code><code>2</code><code>) </code><code>// maximum number of additional executions</code></p>

			<p><code>&nbsp;&nbsp;</code><code>}</code></p>

			<p><code>});</code></p>
			</td>
		</tr>
	</tbody>
</table>

<p>Given the configuration above, an idempotent query would be handled this way:</p>

<ul>
	<li>Start the initial execution at t0</li>
	<li>If no response has been received at t0 + 200 milliseconds, start a speculative execution on another node</li>
	<li>If no response has been received at t0 + 400 milliseconds, start another speculative execution on a third node</li>
</ul>
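<p>The timeline above can be sketched as a small calculation. This is only an illustration of the schedule, not driver code: <code>executionStartTimes</code> is a hypothetical helper, and it assumes that no response arrives before the last execution starts.</p>

```javascript
// Hypothetical helper (not part of the driver): returns the offsets, in
// milliseconds from t0, at which executions start when no response arrives.
function executionStartTimes(delay, maxSpeculativeExecutions) {
  const offsets = [0]; // the initial execution starts at t0
  for (let i = 1; i <= maxSpeculativeExecutions; i++) {
    offsets.push(i * delay); // each speculative execution waits one more delay
  }
  return offsets;
}

console.log(executionStartTimes(200, 2)); // [ 0, 200, 400 ]
```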

<p>As with the other policies in the driver, you can provide your own implementation by extending the&nbsp;<code>SpeculativeExecutionPolicy</code>&nbsp;prototype.</p>
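<p>A rough sketch of what a custom policy might look like follows. It assumes the policy contract is a <code>newPlan()</code> method returning an object whose <code>nextExecution()</code> yields the delay in milliseconds before the next execution, or a negative number when no more executions should be started; check the API docs before relying on this. The base class here is a local stand-in so that the snippet runs without the driver installed, and <code>LinearBackoffSpeculativeExecutionPolicy</code> is a hypothetical name.</p>

```javascript
// Local stand-in for the driver's base type so the sketch runs on its own.
// With the driver installed, the real type would come from the
// {root}.policies.speculativeExecution submodule instead.
class SpeculativeExecutionPolicy {}

// Hypothetical policy: each speculative execution waits longer than the previous one.
class LinearBackoffSpeculativeExecutionPolicy extends SpeculativeExecutionPolicy {
  constructor(baseDelay, maxSpeculativeExecutions) {
    super();
    this.baseDelay = baseDelay;
    this.maxSpeculativeExecutions = maxSpeculativeExecutions;
  }

  // Assumed contract: one plan per query; nextExecution() returns the delay
  // in milliseconds before the next execution, or a negative number to stop.
  newPlan(keyspace, queryInfo) {
    let executions = 0;
    return {
      nextExecution: () => {
        if (executions >= this.maxSpeculativeExecutions) {
          return -1;
        }
        executions++;
        return this.baseDelay * executions;
      }
    };
  }
}

const policy = new LinearBackoffSpeculativeExecutionPolicy(100, 2);
const plan = policy.newPlan('ks1', 'SELECT * FROM users WHERE key = ?');
console.log(plan.nextExecution(), plan.nextExecution(), plan.nextExecution()); // 100 200 -1
```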

<p>One important aspect to consider is whether queries are idempotent (that is, whether they can be applied multiple times without changing the result beyond the initial application). If a query is not idempotent, the driver never schedules speculative executions for it, because there is no way to guarantee that only one node will apply the mutation. Examples of non-idempotent operations include counter increments and decrements, adding items to a list column, and using non-idempotent CQL functions like&nbsp;<code>now()</code>&nbsp;or&nbsp;<code>uuid()</code>.</p>
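<p>To make the distinction concrete, here is a minimal illustration using plain JavaScript objects in place of rows. No driver or CQL is involved; <code>setEmail</code> and <code>appendTag</code> are hypothetical stand-ins for a column assignment and a list append.</p>

```javascript
// Plain JavaScript stand-ins for two kinds of mutations; no driver involved.
// Setting a column to a fixed value is idempotent.
const setEmail = (row) => ({ ...row, email: 'a@example.com' });
// Appending to a list is not: every application adds another element.
const appendTag = (row) => ({ ...row, tags: [...row.tags, 'new'] });

const row = { email: null, tags: [] };

// Applied twice, the idempotent mutation gives the same state as applied once.
console.log(setEmail(setEmail(row)).email); // a@example.com
// Applied twice, the list append leaves a duplicated element.
console.log(appendTag(appendTag(row)).tags); // [ 'new', 'new' ]
```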

<p>In the driver, query idempotence is determined by the&nbsp;<code>isIdempotent</code>&nbsp;flag in the&nbsp;<code>QueryOptions</code>, which defaults to&nbsp;<code>false</code>. You can set the default when initializing the&nbsp;<code>Client</code>&nbsp;or you can set it manually for each query, for example:</p>

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<p><code>const</code> <code>query = </code><code>'SELECT * FROM users WHERE key = ?'</code><code>;</code></p>

			<p><code>client.execute(query, [ </code><code>'usr1'</code> <code>], { prepare: </code><code>true</code><code>, isIdempotent: </code><code>true</code> <code>});</code></p>
			</td>
		</tr>
	</tbody>
</table>

<p>Note that enabling speculative executions causes the driver to send more individual requests, so throughput does not necessarily improve. You can read&nbsp;<a href="http://docs.datastax.com/en/developer/nodejs-driver/latest/features/speculative-executions/">how speculative executions affect retries and other practical details in the documentation</a>.</p>

<h2>Improved Murmur3 hashing performance</h2>

<p><a href="https://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archPartitionerAbout.html">Apache Cassandra uses Murmur3Partitioner</a>&nbsp;to determine the distribution of the data across cluster partitions. The adapted version of the&nbsp;<a href="https://en.wikipedia.org/wiki/MurmurHash">Murmur3 hashing algorithm</a>&nbsp;used by Cassandra performs several 64-bit integer operations. As there isn't a native int64 representation in ECMAScript, we previously used&nbsp;<a href="https://google.github.io/closure-library/api/goog.math.Long.html">Google Closure's Long</a>&nbsp;to support those operations.</p>

<p>Performing int64 add and multiply operations with int32 types requires splitting the operands into smaller int16 chunks to handle overflows. Google Closure's Long does this by creating 4 uint16 chunks of each operand, performing the operation and creating a new int64 value (composed of 2 int32 values), as Long is immutable.</p>

<p>To improve the performance of the partitioner on Node.js, we created a custom type,&nbsp;<code>MutableLong</code>, that maintains 4 uint16 fields. Each operation is applied to those fields in place, modifying the internal state and avoiding additional allocations per operation.</p>
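<p>The chunked, in-place approach can be sketched as follows. This is a simplified illustration, not the driver's actual <code>MutableLong</code>: the class name <code>MutableLongSketch</code> and its methods are hypothetical, values are treated as unsigned and wrap modulo 2<sup>64</sup>, and the <code>BigInt</code> helpers exist only to check the results.</p>

```javascript
// Hypothetical, simplified type; the driver's actual MutableLong may differ.
class MutableLongSketch {
  // Four unsigned 16-bit chunks, least significant first.
  constructor(b00 = 0, b16 = 0, b32 = 0, b48 = 0) {
    this.chunks = [b00 & 0xffff, b16 & 0xffff, b32 & 0xffff, b48 & 0xffff];
  }

  // Verification helper: builds an instance from the low 64 bits of a BigInt.
  static fromBigInt(value) {
    const mask = 0xffffn;
    return new MutableLongSketch(
      Number(value & mask),
      Number((value >> 16n) & mask),
      Number((value >> 32n) & mask),
      Number((value >> 48n) & mask)
    );
  }

  // Adds `other` into this instance (modulo 2^64) without creating a new object.
  add(other) {
    const a = this.chunks;
    const b = other.chunks;
    let carry = 0;
    for (let i = 0; i < 4; i++) {
      const sum = a[i] + b[i] + carry; // at most 0x1ffff, safe for bitwise ops
      a[i] = sum & 0xffff;             // keep the low 16 bits
      carry = sum >>> 16;              // propagate the overflow to the next chunk
    }
    return this;
  }

  // Multiplies this instance by `other` in place (modulo 2^64).
  // Partial sums can exceed 32 bits, so division and remainder are used
  // instead of bitwise operators; all values stay exact as doubles.
  multiply(other) {
    const a = this.chunks;
    const b = other.chunks;
    const c0 = a[0] * b[0];
    const c1 = a[0] * b[1] + a[1] * b[0] + Math.floor(c0 / 0x10000);
    const c2 = a[0] * b[2] + a[1] * b[1] + a[2] * b[0] + Math.floor(c1 / 0x10000);
    const c3 = a[0] * b[3] + a[1] * b[2] + a[2] * b[1] + a[3] * b[0] +
      Math.floor(c2 / 0x10000);
    a[0] = c0 % 0x10000;
    a[1] = c1 % 0x10000;
    a[2] = c2 % 0x10000;
    a[3] = c3 % 0x10000;
    return this;
  }

  // Verification helper: returns the unsigned 64-bit value as a BigInt.
  toBigInt() {
    return this.chunks.reduceRight((acc, c) => (acc << 16n) | BigInt(c), 0n);
  }
}

const value = new MutableLongSketch(0xffff); // 65535
value.add(new MutableLongSketch(1));         // the carry moves into the second chunk
console.log(value.toBigInt());               // 65536n
```

<p>Because <code>add</code> and <code>multiply</code> write their results back into the same four fields, a hashing loop can reuse a handful of instances instead of allocating a new object per operation.</p>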

<h2>Query preparation enhancements</h2>

<p>Previously, the driver prepared the query only on the first node selected by the load-balancing policy, taking a lazy approach.</p>

<p>In this release, we added two options to fine-tune how the driver handles query preparation:</p>

<ul>
	<li><code>prepareOnAllHosts</code>: Determines whether the driver should prepare the query on all hosts.</li>
	<li><code>rePrepareOnUp</code>: Determines whether the driver should re-prepare all queries that have been prepared on other nodes when a node that has been down (unreachable) is considered back up.</li>
</ul>

<p>Both properties are set to <code>true</code> by default. You can change them when creating the <code>Client</code> instance:</p>

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<p><code>const</code> <code>client = </code><code>new</code> <code>Client({</code></p>

			<p><code>&nbsp;&nbsp;</code><code>contactPoints,</code></p>

			<p><code>&nbsp;&nbsp;</code><code>prepareOnAllHosts: </code><code>false</code><code>,</code></p>

			<p><code>&nbsp;&nbsp;</code><code>rePrepareOnUp: </code><code>false</code></p>

			<p><code>});</code></p>
			</td>
		</tr>
	</tbody>
</table>

<h2>Expose connection pool state</h2>

<p>The driver now provides a method to obtain a snapshot of the state of the pool per host. It provides information about all hosts of the cluster, the open connections per host and the number of queries that are currently being executed (in-flight) through a given host.</p>

<p><a href="http://docs.datastax.com/en/developer/nodejs-driver/latest/api/module.metadata/class.ClientState/">You can check out the&nbsp;<code>ClientState</code>&nbsp;API docs for more information</a>.</p>

<p>You can also use the string representation, which provides the information condensed in a readable format, useful for debugging or periodic logging in production.</p>

<table border="0" cellpadding="0" cellspacing="0">
	<tbody>
		<tr>
			<td>
			<p><code>console.log(</code><code>'Pool state: %s'</code><code>, client.getState());</code></p>
			</td>
		</tr>
	</tbody>
</table>
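<p>If you need the individual values rather than the string form, something along these lines may work. It is a sketch under assumptions: it presumes the state object exposes <code>getConnectedHosts()</code>, <code>getOpenConnections(host)</code> and <code>getInFlightQueries(host)</code> and that each host has an <code>address</code> property, so verify the names against the <code>ClientState</code> API docs. <code>summarizeState</code> is a hypothetical helper, and a stub replaces <code>client.getState()</code> so that the snippet runs without a cluster.</p>

```javascript
// Builds one log line per host from a ClientState-like object.
// Assumes the state exposes getConnectedHosts(), getOpenConnections(host)
// and getInFlightQueries(host), and that each host has an `address` property.
function summarizeState(state) {
  return state.getConnectedHosts().map((host) =>
    `${host.address}: ${state.getOpenConnections(host)} connections, ` +
    `${state.getInFlightQueries(host)} in-flight queries`);
}

// Stub standing in for client.getState(), so the sketch runs without a cluster.
const stubState = {
  getConnectedHosts: () => [{ address: '10.0.0.1:9042' }],
  getOpenConnections: () => 2,
  getInFlightQueries: () => 5
};

console.log(summarizeState(stubState).join('\n'));
// 10.0.0.1:9042: 2 connections, 5 in-flight queries
```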

<h2>Wrapping up</h2>

<p>More detailed information about all the features, improvements and fixes included in this release can be found in the changelogs:&nbsp;<a href="http://docs.datastax.com/en/developer/nodejs-driver-dse/1.4/changelog/">DSE driver changelog</a>&nbsp;and&nbsp;<a href="https://github.com/datastax/nodejs-driver/blob/master/CHANGELOG.md">Apache Cassandra driver changelog</a>.</p>

<p>New versions of the drivers are available on npm:</p>

<ul>
	<li><a href="https://www.npmjs.com/package/dse-driver">dse-driver</a></li>
	<li><a href="https://www.npmjs.com/package/cassandra-driver">cassandra-driver</a></li>
</ul>

<p>Your feedback is important to us and it influences what features we prioritize. To provide feedback, use any of the following channels:</p>

<ul>
	<li>Mailing List:&nbsp;<a href="https://groups.google.com/a/lists.datastax.com/forum/#!forum/nodejs-driver-user">https://groups.google.com/a/lists.datastax.com/forum/#!forum/nodejs-driver-user</a></li>
	<li>Report issues on JIRA:&nbsp;<a href="https://datastax-oss.atlassian.net/browse/NODEJS/issues">https://datastax-oss.atlassian.net/browse/NODEJS/issues</a></li>
	<li>DataStax Academy Slack:&nbsp;<a href="https://academy.datastax.com/slack">https://academy.datastax.com/slack</a></li>
	<li>Review and contribute source code:
	<ul>
		<li><a href="https://github.com/datastax/nodejs-dse-driver">https://github.com/datastax/nodejs-dse-driver</a></li>
		<li><a href="https://github.com/datastax/nodejs-driver">https://github.com/datastax/nodejs-driver</a></li>
	</ul>
	</li>
</ul>


New Features in the DataStax Node.js Drivers
