January 4, 2019

C/C++ Driver Performance Enhancements

Michael Fero

Over the past couple of months, we’ve released new versions of our C/C++ drivers for DataStax Enterprise (v1.6.0+) and Apache Cassandra (v2.10.0+). These releases fundamentally changed the internals of the C/C++ drivers and are the biggest releases for the driver since the initial 1.0.0 release. They included a complete refactor of the internal components and increased test coverage. The refactor has led to substantially improved performance and decreased resource utilization, and it has enabled us to better unit test internal components.

Performance

This release of the driver has substantially improved throughput.

[Figure: Final Rate]

This was accomplished in two ways.

First, we’ve eliminated the need for requests to be processed by a single shared session thread. Instead, requests are queued and directly processed on the thread that processes I/O for the request. This change also decreases CPU utilization because only a single thread is required to process a request instead of two.
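To make the threading change concrete, here is a minimal sketch of the idea, not the driver's actual internals. `IoWorker` and `run_requests` are hypothetical names invented for this illustration: each worker stands in for one I/O thread that owns its own queue, and the application hands a request straight to a worker rather than to a shared session thread.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one I/O thread. It owns its own request queue and
// runs each request itself, so no second (session) thread touches the request.
class IoWorker {
public:
  IoWorker() : thread_([this] { run(); }) {}

  ~IoWorker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    ready_.notify_one();
    thread_.join();  // run() drains the queue before it returns
  }

  // Called from application threads: queue the request on this worker.
  void enqueue(std::function<void()> request) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(request));
    }
    ready_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> request;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
        if (queue_.empty()) return;  // only reachable once stopping_ is set
        request = std::move(queue_.front());
        queue_.pop();
      }
      request();  // processed on the same thread that would do the I/O
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::function<void()>> queue_;
  bool stopping_ = false;
  std::thread thread_;  // declared last so the members above exist before run() starts
};

// Routes `total` no-op requests round-robin across `num_workers` workers and
// returns how many were processed. Round-robin is an assumption of this sketch.
std::size_t run_requests(std::size_t num_workers, std::size_t total) {
  if (num_workers == 0) return 0;
  std::atomic<std::size_t> processed{0};
  {
    std::vector<std::unique_ptr<IoWorker>> workers;
    for (std::size_t i = 0; i < num_workers; ++i) {
      workers.push_back(std::make_unique<IoWorker>());
    }
    for (std::size_t i = 0; i < total; ++i) {
      workers[i % num_workers]->enqueue([&processed] { processed.fetch_add(1); });
    }
  }  // worker destructors drain their queues and join here
  return processed.load();
}
```

Calling `run_requests(4, 1000)` spreads 1,000 requests over four workers. Each request crosses exactly one thread boundary (application thread to I/O worker), which is the property the paragraph above attributes to the new design.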

The second way we’ve increased the driver’s throughput is by upgrading its I/O handling. More specifically, the driver now coalesces requests into fewer system calls and better balances the processing of new requests against requests that are already in progress.
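Write coalescing can be illustrated with a small sketch. This is not the driver's implementation: `CountingSink` is a hypothetical stand-in for a socket that only counts write calls, and `Coalescer` with its `max_batch_bytes` limit is an assumed, simplified batching policy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical socket stand-in: records how many write calls it receives.
struct CountingSink {
  std::size_t write_calls = 0;
  std::size_t bytes_written = 0;

  void write(const std::string& data) {
    ++write_calls;
    bytes_written += data.size();
  }
};

// Buffers encoded requests and hands them to the sink one batch at a time,
// so many small requests share a single write call.
class Coalescer {
public:
  Coalescer(CountingSink& sink, std::size_t max_batch_bytes)
      : sink_(sink), max_batch_bytes_(max_batch_bytes) {}

  void submit(const std::string& encoded_request) {
    // Flush first if adding this request would exceed the batch limit.
    if (!buffer_.empty() &&
        buffer_.size() + encoded_request.size() > max_batch_bytes_) {
      flush();
    }
    buffer_ += encoded_request;
  }

  void flush() {
    if (buffer_.empty()) return;
    sink_.write(buffer_);  // one write call for the whole batch
    buffer_.clear();
  }

private:
  CountingSink& sink_;
  std::size_t max_batch_bytes_;
  std::string buffer_;
};
```

With a 1,024-byte batch limit, submitting eight 256-byte requests and then flushing results in two write calls instead of eight. A real driver would also flush on a timer or at the end of an event-loop iteration so that small batches are not delayed indefinitely.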

[Figure: 95th Percentile]

Workload Configuration

A simple client/server setup was used with the HPE Moonshot System and the ProLiant m510 Server Cartridge:

  • 1 CPU with 8 cores (2.0 GHz)
  • Hyperthreading enabled, resulting in 16 HT cores
  • 64 GB RAM
  • 2x1 TB SSD
  • Ubuntu 18.04 LTS

To keep things easily repeatable, DataStax Enterprise v6.0.3 was configured as a 3-node cluster using the default configuration. The client performance application used v1.6.0 of the DataStax C/C++ driver for DataStax Enterprise and was configured to use 13 I/O threads with 2 threads at the application level. Each performance run used prepared statements with a 256-byte payload, keeping 5,000 requests concurrent for a total of 5,000,000 requests.
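The post does not describe how the benchmark application keeps its requests in flight, so the following is only one plausible shape for such a loop, simulated without a database. `simulate_run` and `RunStats` are hypothetical names; completing the oldest request first is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>

struct RunStats {
  std::uint64_t issued = 0;         // requests sent so far
  std::uint64_t completed = 0;      // requests whose response has arrived
  std::size_t peak_in_flight = 0;   // largest number outstanding at once
  std::uint64_t payload_bytes = 0;  // total payload sent
};

// Simulates a run that keeps at most `max_in_flight` requests outstanding
// until `total_requests` have completed.
RunStats simulate_run(std::uint64_t total_requests, std::size_t max_in_flight,
                      std::uint64_t payload_size) {
  assert(max_in_flight > 0);
  RunStats stats;
  std::queue<std::uint64_t> in_flight;  // ids of outstanding requests

  while (stats.completed < total_requests) {
    // Top the window back up to the concurrency limit.
    while (stats.issued < total_requests && in_flight.size() < max_in_flight) {
      in_flight.push(stats.issued++);
      stats.payload_bytes += payload_size;
    }
    stats.peak_in_flight = std::max(stats.peak_in_flight, in_flight.size());

    // Pretend the oldest outstanding request just completed.
    in_flight.pop();
    ++stats.completed;
  }
  return stats;
}
```

For the workload described above, `simulate_run(5000000, 5000, 256)` never exceeds 5,000 outstanding requests and accounts for 1,280,000,000 bytes (about 1.28 GB) of payload over the run.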

The schema used in this workload is designed to benchmark the driver. To test the driver's maximum request throughput and reduce server-side load, we used a keyspace with a replication factor of 1, a simple key/value pair table schema, and a consistency level of LOCAL_ONE for all request types.

CREATE KEYSPACE IF NOT EXISTS keyspace1
  WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' };
CREATE TABLE IF NOT EXISTS keyspace1.table1 (key uuid PRIMARY KEY, value varchar);

Testing

In this version of the C/C++ driver, internal components have been explicitly designed for testability. We’ve increased the number of unit tests by more than 60% since the previous GA release and plan to continue increasing coverage in future releases.

What's Next

The improvements in this release are a solid foundation for future features and improvements. Here are some of the things we are considering for upcoming releases:

  • Improve connection utilization for large clusters
  • Improve performance and reduce allocations
  • Design a first class, object-based C++ API
  • Improve DSE graph integration utilizing the object-based C++ API

Get involved in the next release: we use your feedback to prioritize specific improvements and features. To get involved, use the following resources:
