DataStax Change Agent for Apache Cassandra® (CAC) is a local agent that runs on your Cassandra servers, captures data changes from the commit log, and publishes them to a Pulsar topic.
Effortlessly connect your Apache Cassandra databases to the real-time data ecosystem.
DataStax CDC for Apache Cassandra Advantages
With DataStax CDC for Apache Cassandra (CDC for Cassandra), you can derive substantially more value from your Cassandra data stores, including DataStax Enterprise (DSE). From strengthening data recovery solutions to building real-time data pipelines, CDC for Cassandra can help enterprises in many ways.
CDC for Cassandra automatically captures changes in real time, deduplicates them and streams the clean set of changed data into Pulsar where it can be processed by client applications or sent to downstream systems.
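To make the deduplication step concrete: when a table is replicated, each replica typically records the same write in its own commit log, so the same change can reach Pulsar more than once. The following is a minimal sketch of one way to drop the repeats, assuming each event carries a table name, a primary key, and the mutation bytes. The `ChangeDeduplicator` class, the event field names, and the bounded digest cache are hypothetical illustrations, not the product's actual implementation.

```python
import hashlib
from collections import OrderedDict


class ChangeDeduplicator:
    """Remembers digests of recently seen changes and flags repeats.

    Hypothetical sketch: the real product's cache and digest may differ.
    """

    def __init__(self, max_entries=10_000):
        self._seen = OrderedDict()       # digest -> True, oldest first
        self._max_entries = max_entries

    @staticmethod
    def digest(table, primary_key, mutation_bytes):
        # Hash everything that identifies "the same change".
        h = hashlib.sha256()
        h.update(table.encode("utf-8"))
        h.update(b"\x00")
        h.update(repr(primary_key).encode("utf-8"))
        h.update(b"\x00")
        h.update(mutation_bytes)
        return h.hexdigest()

    def is_new(self, table, primary_key, mutation_bytes):
        key = self.digest(table, primary_key, mutation_bytes)
        if key in self._seen:
            self._seen.move_to_end(key)  # keep recently seen digests longer
            return False
        self._seen[key] = True
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)  # evict the oldest digest
        return True


def deduplicate(events, dedup=None):
    """Return the events whose change has not been seen before, in order."""
    if dedup is None:
        dedup = ChangeDeduplicator()
    return [e for e in events if dedup.is_new(e["table"], e["pk"], e["mutation"])]


if __name__ == "__main__":
    # Three replicas report the same write, then one reports a later write.
    events = [
        {"node": "n1", "table": "ks.users", "pk": (42,), "mutation": b"name=Ada"},
        {"node": "n2", "table": "ks.users", "pk": (42,), "mutation": b"name=Ada"},
        {"node": "n3", "table": "ks.users", "pk": (42,), "mutation": b"name=Ada"},
        {"node": "n1", "table": "ks.users", "pk": (42,), "mutation": b"name=Grace"},
    ]
    for event in deduplicate(events):
        print(event["node"], event["mutation"])
```

The cache is bounded, so a copy that arrives after its digest has been evicted would pass through again. Downstream consumers in a design like this should therefore tolerate an occasional repeat.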
Built-in data transformation capability
Data that is piped into Pulsar can be scrubbed, transformed and enriched as specified by the use case. This feature enables CDC for Cassandra to simplify operations and improve developer productivity without dependencies on external processing systems.
Infinite scalability at high speeds
CDC for Cassandra is purpose-built for streaming data pipelines. Pulsar provides a highly scalable platform for building real-time data applications: it can ingest millions of messages with very low latency, and it can process and stream high-volume data flows in a distributed environment.
Fault tolerant operations
Pulsar’s multi-layered architecture also enables decoupling of compute and storage, allowing for elastic scalability and support for unbounded message retention on low-cost storage. This feature gives CDC for Cassandra the ability to withstand localized network interruptions for an indeterminate amount of time and resume replication once connectivity is restored.
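The resume-after-interruption behavior described above can be illustrated with a toy model: messages are retained in a log, and each subscription keeps a cursor that only advances when delivery succeeds. This is a simplified sketch of the general idea, not Pulsar's implementation. `RetainedTopic` and `FlakyLink` are hypothetical names, and the in-memory list stands in for durable storage.

```python
class RetainedTopic:
    """Append-only log that retains every message and tracks one cursor
    per subscription (toy model, in memory only)."""

    def __init__(self):
        self._messages = []
        self._cursors = {}  # subscription name -> index of next message to deliver

    def publish(self, message):
        self._messages.append(message)

    def backlog(self, subscription):
        """Number of retained messages not yet delivered to this subscription."""
        return len(self._messages) - self._cursors.get(subscription, 0)

    def deliver(self, subscription, send):
        """Push pending messages through `send`, stopping at the first
        connection failure. Returns how many messages were delivered."""
        position = self._cursors.get(subscription, 0)
        delivered = 0
        while position < len(self._messages):
            try:
                send(self._messages[position])
            except ConnectionError:
                break  # leave the cursor where it is; retry later
            position += 1
            delivered += 1
            self._cursors[subscription] = position
        return delivered


class FlakyLink:
    """Stand-in for a downstream system behind an unreliable network."""

    def __init__(self):
        self.up = True
        self.received = []

    def send(self, message):
        if not self.up:
            raise ConnectionError("link down")
        self.received.append(message)


if __name__ == "__main__":
    topic, link = RetainedTopic(), FlakyLink()
    topic.publish("change-1")
    topic.deliver("replica", link.send)

    link.up = False                      # network interruption begins
    topic.publish("change-2")
    topic.publish("change-3")
    topic.deliver("replica", link.send)  # nothing delivered, nothing lost
    print("backlog during outage:", topic.backlog("replica"))

    link.up = True                       # connectivity restored
    topic.deliver("replica", link.send)
    print("received:", link.received)
```

Because the cursor moves only after a successful send, the length of the outage does not matter in this model. The cost of a long outage is a larger backlog, which is why low-cost retention is relevant.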
What it Includes
CDC for Cassandra brings the expertise of DataStax engineers to Cassandra and DataStax Enterprise (DSE) users, in the form of enterprise support for enabling CDC use cases. It provides mission-critical support for the following components that make up your CDC implementation.
DataStax Cassandra Source Connector for Apache Pulsar™ (CSC) is a Pulsar IO source connector that publishes a deduplicated stream of changes to a Pulsar topic where they can be consumed.
You can leverage an optional support subscription that brings in the world-class expertise of DataStax engineers for enabling CDC use cases.
CDC for Cassandra supports a wide variety of connectors in Pulsar, enabling you to easily stream data to other platforms, or process it programmatically with client libraries for many popular programming languages.
CDC for Cassandra is compatible with DataStax Enterprise (DSE) and with open source Cassandra versions 4.0.x and 3.11.x.
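As an illustration of processing change events programmatically with a client library, here is a minimal sketch using the Pulsar Python client (`pulsar-client`). The service URL, topic name, and subscription name are placeholders you would supply. The JSON event shape with `table`, `op`, and `key` fields is an assumption made for the example; the actual payload format is determined by the connector and its schema, so the decoding step would need to be adapted.

```python
import json


def summarize_event(payload: bytes) -> str:
    """Turn one change event into a one-line summary.

    Assumes a hypothetical JSON payload with "table", "op" and "key" fields.
    """
    event = json.loads(payload.decode("utf-8"))
    return f'{event["table"]} {event["op"]} key={event["key"]}'


def consume(service_url, topic, subscription, max_messages=10):
    """Read up to `max_messages` events from a Pulsar topic and print them."""
    import pulsar  # provided by the `pulsar-client` package

    client = pulsar.Client(service_url)
    consumer = client.subscribe(topic, subscription)
    try:
        for _ in range(max_messages):
            msg = consumer.receive()  # blocks until a message arrives
            try:
                print(summarize_event(msg.data()))
                consumer.acknowledge(msg)
            except Exception:
                # Ask the broker to redeliver events we failed to process.
                consumer.negative_acknowledge(msg)
    finally:
        client.close()


if __name__ == "__main__":
    sample = b'{"table": "ks.users", "op": "UPDATE", "key": [42]}'
    print(summarize_event(sample))
    # Placeholder values; uncomment with a reachable broker and a real topic.
    # consume("pulsar://localhost:6650", "my-change-topic", "my-subscription")
```

Keeping the decoding logic in a separate function from the broker loop makes it possible to test event handling without a running Pulsar cluster.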
Build Real-time Data Pipelines using DataStax CDC for Cassandra
Microservice architectures create loose couplings between data and applications and offer greater development agility. In such architectures, service-to-service communication can lead to brittleness that can result in cascading failures, which in turn impact the reliability of your business operations. Using CDC for Cassandra to invoke other microservices in response to a data change event in one microservice domain can strengthen the overall reliability of your microservice architecture.
Updating Search Index or Analytical Workloads
When you need advanced searching and indexing for the data stored in Cassandra, you can use CDC for Cassandra to stream the change events to a search engine like Elastic and provide a delightful discovery experience for your customers, or to a real-time analytics system and enable insightful decisions at the speed of data.
Fronting Slower Data Stores with Cassandra as a Write Buffer
When your OLTP databases or operational data stores can’t keep up with the write throughput demands of a modern digital enterprise, using Cassandra as a write buffer can be an absolute lifesaver. Cassandra can handle even the most demanding internet-scale write volumes, but often you’ll still want this data to make its way to those operational stores. With CDC for Cassandra, you can easily capture changes that occur on your Cassandra instances and buffer them into other operational data stores or event streaming systems within your organization.
When you want to replicate data across different downstream systems and clients, you can configure your Cassandra instances as a source to send a change event stream into Pulsar. From there, Pulsar can pipe those changes to the relevant systems in the enterprise ecosystem to handle more advanced, fine-tuned replication use cases. CDC for Cassandra gives you the resilience to withstand localized network interruptions for any amount of time, and resume replication once connectivity is restored.