Python Driver 3.7.0 Released
The DataStax Python Driver 3.7.0 for Apache Cassandra has been released. This release has no specific area of focus, but it brings a number of new features and improvements. A complete list of issues is available in the CHANGELOG. Here I will mention some of the new features.
Session request listener and query request size information
In addition to cluster metrics, you can now register a session request listener and use it to track other metrics about requests (e.g. the request size). See this request analyzer as an example.
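As a rough illustration of the listener idea, here is a minimal sketch of a tracker that records the encoded size of each request. It assumes the session exposes add_request_init_listener and that each response future provides add_callbacks and a request_encoded_size attribute; the RequestSizeTracker class and its attributes are hypothetical names made up for this example.

```python
class RequestSizeTracker(object):
    """Hypothetical helper: records the encoded size of every request."""

    def __init__(self, session):
        self.sizes = []   # encoded request sizes, in bytes
        self.errors = 0   # number of requests that ended in an error
        # Ask the session to call on_request each time a request is initiated.
        session.add_request_init_listener(self.on_request)

    def on_request(self, response_future):
        # Hook into the future so we are notified when the request completes.
        response_future.add_callbacks(
            self.on_success, self.on_error,
            callback_args=(response_future,),
            errback_args=(response_future,))

    def on_success(self, _rows, response_future):
        self.sizes.append(response_future.request_encoded_size)

    def on_error(self, _exc, response_future):
        self.errors += 1
        self.sizes.append(response_future.request_encoded_size)
```

With a connected session, creating RequestSizeTracker(session) once should be enough; the collected sizes can then be inspected or forwarded to whatever metrics library you use.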
Speculative query retries
The driver now implements speculative query retries to keep latencies smooth even when some nodes are slow to respond. Idempotent statements can benefit from this mechanism. The policy interface is extensible, but we have also added a ConstantSpeculativeExecutionPolicy implementation. To enable this feature, you need to set a speculative_execution_policy and mark your statement as idempotent.
from cassandra.cluster import Cluster, ExecutionProfile
from cassandra.policies import ConstantSpeculativeExecutionPolicy
from cassandra.query import SimpleStatement

cluster = Cluster()

# send a new request every 100ms for a maximum of 10 attempts
ep = ExecutionProfile(speculative_execution_policy=ConstantSpeculativeExecutionPolicy(.1, 10))
cluster.add_execution_profile('my_app_ep', ep)
session = cluster.connect('test')

statement = SimpleStatement("SELECT i FROM d WHERE k = 0", is_idempotent=True)
result = session.execute(statement, execution_profile='my_app_ep')
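To make the timing of a constant policy concrete, here is a small standalone sketch of the schedule implied by "a new request every 100ms for a maximum of 10 attempts". It does not use the driver: speculative_schedule is a hypothetical helper written for illustration, and it assumes no response arrives in the meantime (in practice, speculative executions stop once a response is received).

```python
def speculative_schedule(delay, max_attempts):
    """Return the offsets (in seconds, from the initial request) at which
    extra speculative requests would be sent under a constant delay.

    Illustration only: assumes no response arrives before the attempts run out.
    """
    offsets = []
    elapsed = 0.0
    for _ in range(max_attempts):
        elapsed += delay
        offsets.append(round(elapsed, 3))  # round away float accumulation noise
    return offsets


schedule = speculative_schedule(0.1, 3)
print(schedule)
```

With a delay of 0.1 and 10 attempts, the last speculative request would go out one second after the initial one, which is a useful upper bound to compare against your request timeout.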
Expose paging state
The ResultSet class exposes a new attribute: the paging_state. It can be useful if you have to resume pagination through stateless requests from your application. To use it, pass the paging_state parameter when executing a new query (session.execute).
from cassandra.query import SimpleStatement

query = "SELECT * FROM users"
statement = SimpleStatement(query, fetch_size=10)
results = session.execute(statement)

# save the paging_state somewhere...
# (web_session stands for your web framework's session store,
# not the Cassandra session used above)
web_session['paging_state'] = results.paging_state

# and use it later to resume the pagination
query = "SELECT * FROM users"
statement = SimpleStatement(query, fetch_size=10)
paging_state = web_session['paging_state']
results = session.execute(statement, paging_state=paging_state)
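If the paging state has to travel through a URL or a JSON payload between stateless requests, it needs a text-safe encoding. The following is a minimal sketch under the assumption that the paging state is an opaque byte string; encode_paging_state and decode_paging_state are hypothetical helper names, not part of the driver.

```python
import base64


def encode_paging_state(paging_state):
    """Turn an opaque paging state (bytes) into a URL-safe text token."""
    if paging_state is None:
        return None  # no more pages to fetch
    return base64.urlsafe_b64encode(paging_state).decode('ascii')


def decode_paging_state(token):
    """Turn a token produced by encode_paging_state back into bytes."""
    if not token:
        return None
    return base64.urlsafe_b64decode(token.encode('ascii'))
```

The token returned by encode_paging_state(results.paging_state) can be handed to the client, and decode_paging_state(token) gives back the value to pass as paging_state on the next call to session.execute.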
EC2 address resolver
In the 3.3.0 release, we introduced a new AddressTranslator interface that allows you to implement your own IP address translation depending on your environment (e.g. public IPs versus private IPs). Since Amazon EC2 is heavily used, we now add an official translator for it: the EC2MultiRegionTranslator.
from cassandra.cluster import Cluster
from cassandra.policies import EC2MultiRegionTranslator

cluster = Cluster(['127.0.0.1'], address_translator=EC2MultiRegionTranslator())
session = cluster.connect()

# do stuff...
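For environments other than EC2, the same interface can be implemented by hand. Here is a minimal sketch of a translator backed by a fixed lookup table, assuming the interface consists of a single translate(addr) method; StaticMapTranslator and the addresses are hypothetical, and the import fallback exists only so the sketch runs without the driver installed.

```python
try:
    from cassandra.policies import AddressTranslator
except ImportError:
    # Fallback so the sketch can be run without the driver installed.
    AddressTranslator = object


class StaticMapTranslator(AddressTranslator):
    """Hypothetical translator: maps private addresses to public ones
    using a fixed dictionary, leaving unknown addresses unchanged."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def translate(self, addr):
        return self.mapping.get(addr, addr)


translator = StaticMapTranslator({'10.0.0.1': '203.0.113.10'})
```

An instance like this would be passed to Cluster through the address_translator argument, in the same way as the EC2MultiRegionTranslator.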
CQLEngine: support of multiple keyspaces and sessions
Prior to this release, using multiple keyspaces and sessions was a common problem. We now introduce a new experimental feature to accommodate this use case: connections. You can now register multiple connections and switch the context on the fly in your application. Here is an example of the cqlengine connection capabilities:
from cassandra.cqlengine import columns, connection
from cassandra.cqlengine.management import create_keyspace_simple, sync_table
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.query import BatchQuery, ContextQuery

CONNS = ['cluster1', 'cluster2']
KEYSPACES = ('client1', 'client2', 'client3', 'client4')

connection.register_connection('cluster1', ['127.0.0.1'], default=True)
connection.register_connection('cluster2', ['127.0.0.50'], lazy_connect=True)

for keyspace in KEYSPACES:
    create_keyspace_simple(keyspace, 3, connections=CONNS)

class Automobile(Model):
    __connection__ = 'cluster2'  # default connection per model
    manufacturer = columns.Text(primary_key=True)
    year = columns.Integer(primary_key=True)
    model = columns.Text()

# sync the table for all connections and keyspaces
sync_table(Automobile, KEYSPACES, CONNS)

# Select the connection and keyspace via the ContextQuery
with ContextQuery(Automobile, connection='cluster1', keyspace='client2') as A:
    A.objects.create(manufacturer='honda', year=2004, model='civic')

# Read from the default model connection 'cluster2'
print(len(Automobile.objects.using(keyspace='client2').filter(manufacturer='honda', year=2004)))  # 0

# Select the connection and keyspace on the fly
print(len(Automobile.objects.using(connection='cluster1', keyspace='client2').all()))  # 1

# Select on the model instance
a = Automobile.objects.using(connection='cluster1', keyspace='client2').get(manufacturer='honda', year=2004)
a.using('cluster2').save()  # save on cluster2 rather than cluster1

# Connection select with a BatchQuery (the keyspace comes from the ContextQuery)
with ContextQuery(Automobile, connection='cluster1', keyspace='client4') as A:
    with BatchQuery(connection='cluster1') as b:
        A.objects.batch(b).create(manufacturer='honda', year=2004, model='civic')
        A.objects.batch(b).create(manufacturer='honda', year=2005, model='civic')
        A.objects.batch(b).create(manufacturer='honda', year=2006, model='civic')
See the documentation here for more details.
Wrap
As always, thanks to all who provided contributions and bug reports. The continued involvement of the community is appreciated:
- Mailing List: https://groups.google.com/a/lists.datastax.com/forum/#!forum/python-driver-user
- IRC: #datastax-drivers on irc.freenode.net
- Review and contribute source code: https://github.com/datastax/python-driver
- Report issues on JIRA: https://datastax-oss.atlassian.net/browse/PYTHON