Adam Holmberg

<p>Today we are happy to announce the release of the DataStax Python Driver 2.6.0 for Apache Cassandra, which includes support for the new features in Cassandra 2.2 and native protocol v4, and also some other general improvements. A full list of changes can be found in the&nbsp;<a href="https://github.com/datastax/python-driver/blob/2.6.0/CHANGELOG.rst">CHANGELOG</a>.</p>

<p>In this post I'll start by describing the general improvements, then move on to features specific to the recent Cassandra 2.2 release.</p>

<h2>Token-Aware Default Load Balancing Policy</h2>

<p>This release includes a change to the default load balancing policy used by the driver. Load balancing policies are used to plan the order in which nodes are attempted for each query. The new default uses a nested policy which is both token- and data-center-aware. Token awareness allows it to route requests directly to nodes holding a replica of the data (possibly avoiding an extra hop). Data-center-awareness makes the driver consider nodes from a local DC before any others.</p>

<p>By default, the 'local' DC is chosen from the Cluster&nbsp;<tt>contact_points</tt>, so contact points should be for nodes local to the client instance. If you specify contact points from more than one DC, you will need to specify the local DC by initializing the policy explicitly via your Cluster&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.load_balancing_policy">load_balancing_policy</a>.</p>

<p>These are not new policies, just new defaults. The change was made to make use of more advanced features out-of-the-box, and to bring this default in line with our other drivers. If you already specify a load balancing policy explicitly, this change will have no effect.</p>
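<p>As a minimal sketch of naming the local DC explicitly, the same nested policy can be built by hand from <tt>TokenAwarePolicy</tt> and <tt>DCAwareRoundRobinPolicy</tt> in <tt>cassandra.policies</tt>. The helper name <tt>build_cluster</tt>, the addresses, and the DC name are placeholders for illustration, not part of the driver. The imports sit inside the function only so the snippet can be loaded without the driver installed.</p>

```python
def build_cluster(contact_points, local_dc):
    """Return a Cluster whose load balancing policy is pinned to local_dc.

    Hypothetical helper for illustration; requires the cassandra-driver package.
    """
    from cassandra.cluster import Cluster
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    # Token-aware routing wrapped around DC-aware round robin,
    # with the local data center named explicitly.
    policy = TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc=local_dc))
    return Cluster(contact_points=contact_points, load_balancing_policy=policy)


# Example (placeholder addresses and DC name):
# cluster = build_cluster(['10.0.0.1', '10.1.0.1'], local_dc='dc1')
# session = cluster.connect()
```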

<h2>New Default Protocol Version, Automatic Downgrading</h2>

<p>The default protocol version is now 4 (previously 2). This is done to avoid confusion when using the default with newer protocol features.</p>

<p>Along with this, the driver now supports downgrading protocol versions when connecting to older versions of Cassandra. This protocol downgrade only happens during the initial cluster connect, when the control connection is being established. This is mostly a convenience feature to allow using driver defaults for any Cassandra version. Production applications should set the protocol version explicitly to the version supported by their cluster. This is more efficient (avoiding protocol downgrades), and will also avoid degraded states if the client ever connects to a partially-upgraded cluster supporting mixed versions.</p>
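<p>One way to set the version explicitly is to derive it from the Cassandra release the cluster runs. The sketch below assumes the usual pairing of release lines and native protocol versions (1.2 with v1, 2.0 with v2, 2.1 with v3, 2.2 with v4). The names <tt>MAX_PROTOCOL_VERSION</tt> and <tt>protocol_version_for</tt> are made up for this example. For a partially-upgraded cluster, pass the release of the oldest node.</p>

```python
# Highest native protocol version supported by each Cassandra release line.
MAX_PROTOCOL_VERSION = {
    (1, 2): 1,
    (2, 0): 2,
    (2, 1): 3,
    (2, 2): 4,
}


def protocol_version_for(release):
    """Map a release string such as '2.1.8' to a native protocol version."""
    major, minor = (int(part) for part in release.split('.')[:2])
    try:
        return MAX_PROTOCOL_VERSION[(major, minor)]
    except KeyError:
        raise ValueError('no protocol version known for Cassandra %s' % release)


# Usage (requires the cassandra-driver package and a reachable node):
# from cassandra.cluster import Cluster
# cluster = Cluster(['10.0.0.1'], protocol_version=protocol_version_for('2.1.8'))
```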

<h2>Connect Timeout Configuration</h2>

<p>There was previously no easy way to set the timeout for making new connections. This release includes a new Cluster configuration parameter&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.connect_timeout">connect_timeout</a>.</p>

<p>The default timeout is 5 seconds. It covers not only TCP establishment, but also startup negotiation like options exchange, protocol negotiation, and authentication.</p>
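<p>A short sketch of raising the timeout for a slow network follows. The helper name <tt>open_session</tt> and the 10-second value are arbitrary choices for illustration; only the <tt>connect_timeout</tt> parameter itself comes from the driver.</p>

```python
def open_session(contact_points, connect_timeout=10.0):
    """Connect with a longer timeout than the 5-second default.

    Hypothetical helper for illustration; requires the cassandra-driver package.
    """
    from cassandra.cluster import Cluster

    # The timeout applies to each new connection, including startup
    # negotiation and authentication, not just the TCP handshake.
    cluster = Cluster(contact_points=contact_points,
                      connect_timeout=connect_timeout)
    return cluster, cluster.connect()


# Example (placeholder address):
# cluster, session = open_session(['10.0.0.1'], connect_timeout=15.0)
```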

<h2>Cluster Schema Refresh API Updates</h2>

<p><tt>Cluster.refresh_schema</tt>&nbsp;and&nbsp;<tt>Cluster.submit_schema_refresh</tt>&nbsp;are now deprecated. As new schema elements were added beyond keyspace and table (user type, function, aggregate), the API was becoming unwieldy. Rather than continue to explain and enforce various combinations of optional parameters, this API was deprecated in favor of dedicated calls for each schema entity.</p>

<p>Now,&nbsp;<tt>Cluster.refresh_schema_metadata</tt>&nbsp;is used to refresh everything from the database. Individual entities are refreshed using one of the following methods:</p>

<ul>
	<li>refresh_keyspace_metadata</li>
	<li>refresh_table_metadata</li>
	<li>refresh_user_type_metadata</li>
	<li>refresh_user_function_metadata</li>
	<li>refresh_user_aggregate_metadata</li>
</ul>

<p>The driver still refreshes these entities automatically based on schema change events from the server. These functions are useful when the driver is configured to ignore those events, and refresh is done ad hoc by the application.</p>
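<p>An ad hoc refresh might look like the sketch below. The wrapper <tt>refresh_metadata</tt> is a made-up convenience, not a driver function; it assumes <tt>refresh_table_metadata</tt> takes the keyspace and table names as its first two arguments.</p>

```python
def refresh_metadata(cluster, keyspace=None, tables=()):
    """Refresh the named tables, or all schema metadata when no keyspace is given.

    `cluster` is expected to be a connected cassandra.cluster.Cluster.
    """
    if keyspace is None:
        # Full refresh: every keyspace, table, type, function and aggregate.
        cluster.refresh_schema_metadata()
        return
    for table in tables:
        # Targeted refresh of a single table.
        cluster.refresh_table_metadata(keyspace, table)


# Example, after the application has altered two tables itself:
# refresh_metadata(cluster, 'shop', tables=['orders', 'customers'])
```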

<h2>Distinguish Between NULL and UNSET Values</h2>

<p>Cassandra 2.2 adds the ability to distinguish between null and unset parameters in native protocol v4. This represents a major improvement as it allows binding any combination of parameters in a prepared statement (as you'd expect, partition key columns are still required).</p>

<p>With previous versions of the protocol, when using a prepared statement you had to bind all its parameters or get an error. Combined with the fact that inserting null values results in the creation of tombstones, this could increase the number of prepared statements an application needed.</p>

<p>When using protocol v4+, the driver will now implicitly set missing values to unset (as long as missing values are not part of the partition key). Applications can also explicitly provide unset values using&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/query.html#cassandra.query.UNSET_VALUE">cassandra.query.UNSET_VALUE</a>.</p>

<p>For example, using positional binding:</p>

<pre><code>from cassandra.query import UNSET_VALUE

ps = session.prepare('INSERT INTO test (key, v0, v1) VALUES (?, ?, ?)')
session.execute(ps, (0, 1))  # v1 implicitly unset
session.execute(ps, (0, UNSET_VALUE, 2))  # v0 explicitly unset</code></pre>

<p>Please note that when using an earlier protocol version, the driver falls back to the previous behavior, and unspecified parameters result in an error.</p>
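<p>Named binding can be sketched the same way. This assumes a session connected with protocol v4 and that values are bound from a dict keyed by column name, with columns absent from the dict treated as unset. The function name <tt>insert_partial_rows</tt> is only for illustration.</p>

```python
def insert_partial_rows(session):
    """Write rows that each leave one regular column untouched.

    `session` is expected to be a cassandra.cluster.Session using protocol v4.
    """
    ps = session.prepare('INSERT INTO test (key, v0, v1) VALUES (?, ?, ?)')
    session.execute(ps, {'key': 0, 'v0': 1})  # v1 missing from the dict, so unset
    session.execute(ps, {'key': 1, 'v1': 2})  # v0 missing from the dict, so unset
    return ps
```

<p>Because unset columns are not written at all, neither call creates a tombstone for the column it skips.</p>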

<h2>Client Warnings from the Server</h2>

<p>Cassandra 2.2 adds client warnings to native protocol v4, as a way to surface warnings to clients that may not have access to server logs. Examples include hitting thresholds like&nbsp;<tt>batch_size_warn_threshold_in_kb</tt>&nbsp;and&nbsp;<tt>tombstone_warn_threshold</tt>&nbsp;while executing a client request.</p>

<p>Any warnings received by the driver are unconditionally logged via&nbsp;<tt>cassandra.protocol</tt>. Warnings are also attached to the&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.ResponseFuture.warnings">request response future for programmatic access</a>.</p>
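<p>Both routes can be sketched as follows. The helper names are invented for this example. The sketch assumes <tt>cassandra.protocol</tt> is the name of a standard <tt>logging</tt> logger, and that <tt>warnings</tt> on the future is empty or <tt>None</tt> when the server sent nothing.</p>

```python
import logging


def enable_warning_logging(level=logging.WARNING):
    """Make sure records from the cassandra.protocol logger reach a handler."""
    logger = logging.getLogger('cassandra.protocol')
    logger.setLevel(level)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
    return logger


def execute_with_warnings(session, query):
    """Run a query and return (rows, list of server warnings)."""
    future = session.execute_async(query)
    rows = future.result()  # block until the response arrives
    return rows, future.warnings or []


# Example (requires a connected session using protocol v4):
# rows, warnings = execute_with_warnings(session, 'SELECT * FROM test')
# for message in warnings:
#     print('server warning:', message)
```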

<h2>New&nbsp;<tt>smallint</tt>,&nbsp;<tt>tinyint</tt>&nbsp;CQL Types</h2>

<p>Cassandra 2.2 introduced two new integer types:&nbsp;<tt>smallint</tt>&nbsp;and&nbsp;<tt>tinyint</tt>. This driver release includes core support for these types (semantics the same as other signed integers, but with different ranges). The types are also supported in the cqlengine mapper column models&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cqlengine/columns.html#cassandra.cqlengine.columns.SmallInt">SmallInt</a>&nbsp;and&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cqlengine/columns.html#cassandra.cqlengine.columns.TinyInt">TinyInt</a>.</p>
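<p>To make the ranges concrete, here is a small standalone sketch that uses only the standard library. It assumes <tt>tinyint</tt> is a 1-byte and <tt>smallint</tt> a 2-byte signed integer; <tt>FORMATS</tt>, <tt>value_range</tt>, and <tt>fits</tt> are names made up for the example and are not driver APIs.</p>

```python
import struct

# Signed big-endian struct formats: 1 byte for tinyint, 2 bytes for smallint.
FORMATS = {'tinyint': '>b', 'smallint': '>h'}


def value_range(cql_type):
    """Return the (lowest, highest) value the given CQL integer type can hold."""
    bits = 8 * struct.calcsize(FORMATS[cql_type])
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(cql_type, value):
    """Check a Python int against the range before binding it."""
    low, high = value_range(cql_type)
    return low <= value <= high
```

<p>A check like <tt>fits('tinyint', 200)</tt> returns <tt>False</tt>, which can catch an out-of-range value before it reaches the driver.</p>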

<h2>New&nbsp;<tt>date</tt>,&nbsp;<tt>time</tt>&nbsp;CQL Types</h2>

<p>Cassandra 2.2 also introduced new simple&nbsp;<tt>date</tt>&nbsp;and&nbsp;<tt>time</tt>&nbsp;types. These were already supported by the core driver, as discussed under the heading "New Date and Time Cassandra Types" in the&nbsp;last release blog post.</p>

<p>This driver release adds support for these types to cqlengine in the form of&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cqlengine/columns.html#cassandra.cqlengine.columns.Date">Date</a>&nbsp;and&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cqlengine/columns.html#cassandra.cqlengine.columns.Time">Time</a>&nbsp;column types. Doing this also removed the previously-deprecated overload of&nbsp;<tt>Date</tt>, which used&nbsp;<tt>timestamp</tt>&nbsp;CQL under the covers and simply truncated the time component on input. Users of this can change their models to use&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/cqlengine/columns.html#cassandra.cqlengine.columns.DateTime">DateTime</a>&nbsp;and use&nbsp;<tt>datetime.date</tt>&nbsp;as input.</p>

<h2>User Defined Function and Aggregate Metadata Model</h2>

<p>Cassandra 2.2 adds User Defined Functions and Aggregates to the server. Working with these in CQL is transparent to the driver. The one driver change related to functions and aggregates was to add these entities to the metadata queries and model. The models can be accessed via the&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/metadata.html#cassandra.metadata.KeyspaceMetadata.functions">functions</a>&nbsp;and&nbsp;<a href="http://datastax.github.io/python-driver/api/cassandra/metadata.html#cassandra.metadata.KeyspaceMetadata.aggregates">aggregates</a>&nbsp;attributes of the keyspace metadata.</p>
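<p>A quick way to inspect these from a connected cluster is sketched below. It assumes <tt>functions</tt> and <tt>aggregates</tt> are dict-like and keyed by signature; the helper name <tt>describe_udfs</tt> is hypothetical.</p>

```python
def describe_udfs(cluster, keyspace):
    """List the function and aggregate signatures known for one keyspace.

    `cluster` is expected to be a connected cassandra.cluster.Cluster.
    """
    ks_meta = cluster.metadata.keyspaces[keyspace]
    return {
        'functions': sorted(ks_meta.functions),
        'aggregates': sorted(ks_meta.aggregates),
    }


# Example (placeholder keyspace name):
# print(describe_udfs(cluster, 'analytics'))
```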

<h2>Platform and Runtime Survey</h2>

<p>We solicit input from our users regarding platform and runtime environments in which the driver is being used. If you haven't already (or if your environment has changed), we would appreciate your input on our&nbsp;<a href="https://www.datastax.com/dev/blog/datastax-driver-platform-and-runtimecompiler-surveys">platform and runtime surveys</a>.</p>

<h2>Wrapping Up</h2>

<p>As always, thanks to all who provided contributions and bug reports. The continued involvement of the community is appreciated:</p>

<ul>
	<li>Mailing List:&nbsp;<a href="https://groups.google.com/a/lists.datastax.com/forum/#!forum/python-driver-user">https://groups.google.com/a/lists.datastax.com/forum/#!forum/python-driver-user</a></li>
	<li>IRC: #datastax-drivers on irc.freenode.net</li>
	<li>Review and contribute source code:&nbsp;<a href="https://github.com/datastax/python-driver">https://github.com/datastax/python-driver</a></li>
	<li>Report issues on JIRA:&nbsp;<a href="https://datastax-oss.atlassian.net/browse/PYTHON">https://datastax-oss.atlassian.net/browse/PYTHON</a></li>
</ul>


Python Driver 2.6.0 with Cassandra 2.2 Features
