Cassandra and Kafka
Cassandra and Kafka are frequently used together in microservice architectures. These architectures combine a diverse set of technologies, each serving its own purpose within the data ecosystem. Apache Kafka fits naturally as a distributed queue for event-driven architectures, acting as a buffer layer that transports messages to the database and surrounding technologies. Cassandra scales linearly as nodes are added, making it an excellent persistent data store for microservices applications. This Skill Page will teach you common patterns for integrating Kafka and Cassandra.
Using Kafka With Cassandra
If your development organization has embraced a microservices architecture, you are likely aware of Kafka’s durable logs of immutable events, which allow your microservices to function independently and asynchronously. Sometimes these microservices need to access a system of record such as Apache Cassandra™. Apache Kafka embodies many of the same distributed-systems values as Cassandra, such as scalability and high availability, so the two technologies complement each other well.
Kafka As An Event Fabric
Think of Kafka as an event fabric between microservices. A service consumes events from a Kafka stream and performs computations on them. As a result, it produces new Kafka events, writes data to Cassandra, or both. The service may also read data from Cassandra as part of the event processing.
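The pattern above can be sketched as a single event handler. This is a minimal illustration, not a reference implementation: the Kafka and Cassandra clients are replaced by plain callables so the logic runs without a broker or cluster, and every name in it (the `shop.order_totals` table, the `orders.enriched` topic, the event fields) is hypothetical. The `%s` placeholders assume the parameter style of the DataStax Python driver for simple statements.

```python
import json
from typing import Callable, Optional


def handle_order_event(
    raw_event: bytes,
    lookup_customer: Callable[[str], Optional[dict]],
    write_row: Callable[[str, tuple], None],
    produce: Callable[[str, bytes], None],
) -> None:
    """Process one event consumed from Kafka.

    The three callables stand in for real clients (all hypothetical here):
      lookup_customer: reads reference data from Cassandra
      write_row:       executes a CQL statement with bound parameters
      produce:         publishes a new event to a Kafka topic
    """
    order = json.loads(raw_event)

    # Use data from Cassandra as part of the event processing.
    customer = lookup_customer(order["customer_id"])
    tier = customer["tier"] if customer else "unknown"

    # Perform a computation on the event.
    total = sum(item["price"] * item["qty"] for item in order["items"])

    # Write the result to Cassandra (the system of record).
    write_row(
        "INSERT INTO shop.order_totals (order_id, customer_id, total) "
        "VALUES (%s, %s, %s)",
        (order["order_id"], order["customer_id"], total),
    )

    # Produce a new Kafka event for downstream services.
    enriched = {"order_id": order["order_id"], "total": total, "tier": tier}
    produce("orders.enriched", json.dumps(enriched).encode("utf-8"))
```

In a real service, the callables would wrap a Kafka consumer/producer and a Cassandra session. Passing them in as arguments keeps the processing logic testable on its own.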
Cassandra As A Sink For Kafka
Cassandra is often used with Kafka for long-term storage and for serving application APIs. Using the DataStax Kafka Connector, data can be automatically ingested from Kafka topics into Cassandra tables.
DataStax’s Kafka Connector
The DataStax Apache Kafka Connector is installed in the Kafka Connect framework and synchronizes records from a Kafka topic with table rows in Cassandra/DSE. Running the connector in this framework enables multiple DataStax connector instances to share the load and to scale horizontally when run in Distributed Mode.
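In Distributed Mode, connectors are typically registered by sending a JSON document with `name` and `config` fields to the Kafka Connect REST API (`POST /connectors`). The sketch below only builds that document. The connector property names (`connector.class`, `contactPoints`, `loadBalancing.localDc`, and the `topic.<topic>.<keyspace>.<table>.mapping` key) reflect our understanding of the connector's settings and should be checked against the connector documentation for your version; the topic, keyspace, table, and addresses used with it are hypothetical.

```python
import json
from typing import Dict, List


def build_sink_connector_request(
    name: str,
    topic: str,
    keyspace: str,
    table: str,
    mapping: Dict[str, str],
    contact_points: List[str],
    local_dc: str,
    tasks_max: int = 1,
) -> str:
    """Return the JSON body for registering a sink connector instance.

    `mapping` maps Cassandra column names to Kafka record fields,
    e.g. {"order_id": "key", "total": "value.total"}.
    """
    config = {
        # Property names are assumptions; verify them for your connector version.
        "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
        "tasks.max": str(tasks_max),  # number of tasks sharing the load
        "topics": topic,
        "contactPoints": ",".join(contact_points),
        "loadBalancing.localDc": local_dc,
        f"topic.{topic}.{keyspace}.{table}.mapping": ", ".join(
            f"{column}={source}" for column, source in mapping.items()
        ),
    }
    return json.dumps({"name": name, "config": config}, indent=2)
```

The returned string can be posted to a Connect worker with any HTTP client. Raising `tasks_max` is one way to let the framework spread the work across workers.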
Cassandra And CDC
The reverse is also possible: enabling CDC (Change Data Capture) on your cluster allows you to stream data out of Cassandra. Use the Kafka Connect framework to perform CDC from Cassandra via plugins. Currently we are working on a way to make this easier.
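Enabling CDC in Cassandra involves two switches: the node-level `cdc_enabled: true` setting in `cassandra.yaml`, and the `cdc` property on each table whose changes should be captured. The helper below is a small sketch that generates the table-level CQL statement; the function name and the keyspace and table used with it are hypothetical, and it accepts only simple unquoted identifiers.

```python
import re

# Unquoted CQL identifiers: a letter followed by letters, digits, or underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def enable_cdc_statement(keyspace: str, table: str) -> str:
    """Build the CQL statement that turns on CDC for one table."""
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple CQL identifier: {name!r}")
    return f"ALTER TABLE {keyspace}.{table} WITH cdc = true;"
```

For example, `enable_cdc_statement("shop", "order_totals")` yields a statement that can be run through `cqlsh` or a driver session. A CDC plugin is still needed to read the captured changes and publish them to Kafka.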
Material related to Kafka
DataStax Apache Kafka Connector Documentation
Synchronize records from a Kafka topic with table rows in supported databases.
DSA: DataStax Apache Kafka™ Connector
Learn how to use the DataStax Apache Kafka™ Connector