July 13, 2022

Migrate to Modern Streaming using Starlight for RabbitMQ

Chris Latimer, VP of Product Management
Christophe Bornet

Starlight for RabbitMQ allows existing RabbitMQ applications to use Apache Pulsar as the native message processing provider with minimal changes. This post introduces you to Starlight for RabbitMQ and shows how you can use it to easily migrate your RabbitMQ application to Apache Pulsar for modern streaming.

Apache Pulsar’s open-source and cloud-native nature allows enterprises to move their RabbitMQ applications to run in any cloud environment as well as on-premises. Now with Starlight for RabbitMQ, you can tap into the open-source power and horizontal scalability of Apache Pulsar for all your RabbitMQ applications without rewriting them first.

To name some of its main highlights, Starlight for RabbitMQ is:

  • Fully open source
  • Blazing fast with independent scalability of compute and storage
  • RabbitMQ-compatible for drop-in replacement
  • The quickest path to adding message retention and replay capabilities to RabbitMQ applications
  • A component of a modern messaging strategy that reduces the total cost of ownership (TCO)

We detailed the “what” and “why” of this exciting API in our Starlight announcement. In this blog post, we’ll walk you through the thinking behind Starlight for RabbitMQ, and show you how you can easily migrate a RabbitMQ application to Apache Pulsar.

Under the hood of Starlight for RabbitMQ

Depending on the use case, Starlight for RabbitMQ can be deployed in a broker as a protocol handler or launched as a standalone Java application.

AMQP 0.9.1 (the protocol RabbitMQ uses) employs the concepts of exchanges, queues, and bindings to provide basic routing capabilities inside the message broker. These concepts need to be mapped to Pulsar topics and features.

One important architectural decision is that, unlike other RabbitMQ integrations for Pulsar, Starlight for RabbitMQ doesn’t interact directly with the managed ledger. Interacting with the ledger directly is performant, but it requires the broker (which interacts with the ledger) to own the topic.

Because AMQP 0.9.1 allows a many-to-many relationship between exchanges and queues, that approach requires all exchanges, queues, and related topics to be owned by the same broker. There are techniques to do this using topic bundles, but the result is that a full AMQP virtual host can be handled by only one broker at a time, which limits scalability.

Instead, Starlight for RabbitMQ acts as a proxy and uses the Pulsar binary protocol to communicate with the brokers. This means it can leverage Pulsar features like load balancing of the topics on the brokers, batching of messages, partitioning of topics, and load balancing of the data on the consumers.

On the publish side, an AMQP exchange is mapped to a topic. Depending on the type of exchange, the publish routing key is also included in the topic name.

Figure 1: Diagram showing how Starlight for RabbitMQ for Apache Pulsar publishes messages.
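
The exchange-to-topic mapping can be sketched in a few lines. This is only an illustration: the naming scheme, the `topic_for_publish` helper, and the `public/default` namespace are assumptions made for the example, not the names the proxy actually generates. It also assumes the routing key is appended for “direct” exchanges and omitted for “fanout” exchanges.

```python
from urllib.parse import quote


def topic_for_publish(namespace: str, exchange: str,
                      exchange_type: str, routing_key: str) -> str:
    """Derive a Pulsar topic name for a publish (hypothetical naming scheme)."""
    # Percent-encode names so characters such as "/" cannot break the topic URL.
    name = quote(exchange, safe="")
    if exchange_type == "direct":
        # Assumption: a direct exchange gets one topic per routing key.
        name = f"{name}.__{quote(routing_key, safe='')}"
    elif exchange_type != "fanout":
        raise ValueError(f"unsupported exchange type: {exchange_type}")
    # Pulsar topic names have the form persistent://tenant/namespace/topic.
    return f"persistent://{namespace}/{name}"


print(topic_for_publish("public/default", "orders", "direct", "eu"))
print(topic_for_publish("public/default", "orders", "fanout", "eu"))
```

Under this scheme, a fanout exchange needs a single topic because every bound queue receives every message, while a direct exchange can use the topic name itself to separate routing keys.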

On the consumer side, Pulsar shared subscriptions are used to represent the AMQP bindings from an exchange to a queue. When creating an AMQP queue consumer, the proxy creates Pulsar consumers for all the bindings/subscriptions of the queue.

When you unbind the queue, the subscription isn’t deleted right away because consumers may be lagging. Messages are still received from the subscription, and any message whose position is past the end of the binding is filtered out. Starlight for RabbitMQ periodically checks whether all messages from the closed binding have been acknowledged. Once they have, Starlight for RabbitMQ deletes the corresponding subscription.

Figure 2: Diagram showing how Starlight for RabbitMQ for Apache Pulsar consumes messages.

Pulsar shared subscriptions are used to represent bindings. This is because they enable the use of individual acknowledgments, which is a key feature that both RabbitMQ and Pulsar have in common to support message queue use cases.
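
The binding lifecycle described above can be modeled with a small in-memory sketch. It is a simplification under stated assumptions: the `Binding` class and its method names are hypothetical, and message positions are plain integers starting at 0, whereas real Pulsar message IDs are more complex.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Binding:
    """Toy model of a binding backed by a shared subscription (hypothetical)."""
    subscription: str
    end_position: Optional[int] = None          # None while the binding is active
    acked: Set[int] = field(default_factory=set)
    deleted: bool = False

    def unbind(self, last_published: int) -> None:
        # Close the binding but keep the subscription for lagging consumers.
        self.end_position = last_published

    def should_deliver(self, position: int) -> bool:
        # Messages past the end of a closed binding are filtered out.
        return self.end_position is None or position <= self.end_position

    def ack(self, position: int) -> None:
        # Individual acknowledgment of a single message.
        self.acked.add(position)

    def cleanup_if_drained(self) -> bool:
        # Periodic check: delete the subscription once everything up to the
        # end of the closed binding has been acknowledged.
        if self.end_position is None or self.deleted:
            return self.deleted
        pending = set(range(self.end_position + 1)) - self.acked
        if not pending:
            self.deleted = True
        return self.deleted


binding = Binding("queue1-exchange1")
binding.ack(0)
binding.unbind(last_published=2)
print(binding.should_deliver(3))      # message 3 arrived after the unbind
print(binding.cleanup_if_drained())   # messages 1 and 2 are still pending
```

Tracking acknowledgments per message rather than as a single cursor is what lets the check in `cleanup_if_drained` work even when consumers acknowledge out of order.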

Consistent metadata store

Starlight for RabbitMQ uses Apache ZooKeeper to store the metadata for AMQP entities consistently. The existing ZooKeeper configuration store can be reused for this, and Starlight for RabbitMQ writes its entries in ZooKeeper under the /pulsar-rabbitmq-gw prefix.

Security and authentication

Starlight for RabbitMQ supports connections using TLS/mTLS to ensure privacy and security of the communication. It also supports the PLAIN and EXTERNAL mechanisms used by RabbitMQ. Internally, it will use the same AuthenticationService as Pulsar and map these mechanisms to existing Pulsar authentication modes.

PLAIN mechanism

The PLAIN mechanism is mapped to the AuthenticationProviderToken mode of authentication. The username is ignored and the password is used as the JSON Web Token (JWT).
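
To make this concrete, here is a minimal sketch of extracting the token from a SASL PLAIN response, which has the form authorization identity, NUL, username, NUL, password. The `token_from_plain_response` helper is hypothetical and only illustrates the mapping; it is not the proxy’s actual code, and it does not validate the JWT.

```python
def token_from_plain_response(response: bytes) -> str:
    """Return the password field of a SASL PLAIN response as the token."""
    # SASL PLAIN: [authzid] NUL authcid NUL passwd
    parts = response.split(b"\x00")
    if len(parts) != 3:
        raise ValueError("malformed SASL PLAIN response")
    _authzid, _username, password = parts   # the username is ignored
    if not password:
        raise ValueError("empty password: no token supplied")
    # The token would then be passed to Pulsar's token authentication.
    return password.decode("utf-8")


# A RabbitMQ client configured with any username and the JWT as its password:
print(token_from_plain_response(b"\x00any-user\x00header.payload.signature"))
```

In practice this means an existing RabbitMQ client only needs its password setting changed to a valid Pulsar token.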

EXTERNAL mechanism

The EXTERNAL mechanism is mapped to the AuthenticationProviderTls mode of authentication. This is the equivalent of the rabbitmq-auth-mechanism-ssl plugin.

Starlight for RabbitMQ can connect to brokers that have TLS, authentication, and/or authorization enabled. To perform its operations, the Starlight for RabbitMQ proxy currently needs to use an “admin role”. Future versions will relay the principal authenticated to the proxy and use a “proxy role” so operations on the broker will have permissions from the originating application.

At the moment there’s no authorization built into the proxy, but this is on the roadmap.

Multiple proxies support

Multiple proxies can be launched at the same time for scalability and high availability. The proxies are stateless and can be started and stopped at will. They share their configuration in ZooKeeper, so you can create, delete, bind, or unbind exchanges and queues on any proxy, and the change is replicated to the other proxies.

Publishing messages can be done on any proxy. On the receiving side, messages will be dispatched evenly to all connected AMQP consumers since the Pulsar subscriptions are shared.

Figure 3: Diagram showing how Starlight for RabbitMQ for Apache Pulsar handles proxy clustering.
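
The even dispatch across consumers can be illustrated with a simple round-robin simulation. This is a sketch, not Pulsar’s dispatcher: the `dispatch_round_robin` function and the consumer names are made up for the example, and a real shared subscription also takes into account how many messages each consumer is ready to receive.

```python
from collections import defaultdict
from itertools import cycle
from typing import Dict, List


def dispatch_round_robin(messages: List[str],
                         consumers: List[str]) -> Dict[str, List[str]]:
    """Assign each message to exactly one consumer, rotating through them."""
    assignment: Dict[str, List[str]] = defaultdict(list)
    # zip() stops when the messages run out; cycle() repeats the consumers.
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return dict(assignment)


# Consumers may be connected to different proxies; the subscription is shared.
print(dispatch_round_robin(["m1", "m2", "m3", "m4"],
                           ["proxy1-consumer", "proxy2-consumer"]))
```

Because each message goes to exactly one consumer of the subscription, adding proxies or consumers spreads the load without duplicating deliveries.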

Note that the current version of Starlight for RabbitMQ only supports authentication as a security mechanism. Future versions will implement authorization in two places: access controls on the vhosts, exchanges, and queues in the proxy, and relaying the original principal to the brokers so that authorization is also enforced on the Pulsar topics.

Lastly, the current version supports the “direct” and “fanout” types of exchanges. In future releases, we intend to add the “topic” and “headers” types of exchanges.
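
For readers less familiar with the two supported exchange types, the following sketch shows their routing semantics as defined by AMQP 0.9.1. The `route` function and the queue names are illustrative only and say nothing about how the proxy implements routing internally.

```python
from typing import List, Tuple


def route(exchange_type: str, routing_key: str,
          bindings: List[Tuple[str, str]]) -> List[str]:
    """Return the queues that receive a message; bindings are (queue, key) pairs."""
    if exchange_type == "fanout":
        # Fanout ignores the routing key: every bound queue gets the message.
        return sorted({queue for queue, _ in bindings})
    if exchange_type == "direct":
        # Direct delivers only where the binding key equals the routing key.
        return sorted({queue for queue, key in bindings if key == routing_key})
    raise NotImplementedError(f"exchange type not supported: {exchange_type}")


bindings = [("billing", "eu"), ("shipping", "eu"), ("audit", "us")]
print(route("direct", "eu", bindings))
print(route("fanout", "eu", bindings))
```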

Start streaming with Starlight for RabbitMQ

Starlight for RabbitMQ is now included in DataStax’s Luna Streaming Enterprise support for Apache Pulsar. You can also dig into the Starlight for RabbitMQ source code on GitHub (available under the Apache license) to learn more about how you can easily leverage the power of Apache Pulsar for your RabbitMQ applications.

Follow DataStax on Medium to keep up with our latest announcements and developer resources on all things Apache, Cassandra, streaming and more. To join our growing developer community, follow DataStaxDevs on Twitter.

Resources

  1. GitHub repo for Starlight for RabbitMQ for Apache Pulsar
  2. Certificate Authentication Mechanism for RabbitMQ
  3. RabbitMQ: AMQP 0-9-1 Complete Reference Guide
  4. Apache Pulsar Documentation: Pulsar binary protocol specification
  5. GitHub repo for Pluggable Protocol Handler
  6. DataStax Luna Streaming
  7. Apache Pulsar
  8. RabbitMQ
  9. Apache ZooKeeper