In this post, we'll go through how to process RabbitMQ messages asynchronously and save them to MongoDB over an asynchronous connection.
Real Case Scenario
We had to build a distributed service that could scale quickly and process messages fast. Its primary purpose was to migrate data between platforms and to listen continuously for new updates.
The service must therefore consume from multiple channels (platforms), pre-process messages with specific rules, and publish them back to other related channels as new updates. All of this has to be fast enough to keep the queues from backing up.
RabbitMQ covers most of these requirements, since it supports asynchronous messaging for the internal operations. However, our own code also needs to run concurrently: connecting channels, declaring queues, processing messages, and performing database operations.
Environment Configurations
First, let's install the required dependencies, then configure docker-compose to run the RabbitMQ and MongoDB containers.
requirements.txt
aio-pika==8.1.1
uvloop==0.16.0
ujson
urllib3
motor
There are two main packages that we need to consider:
aio-pika — an asyncio wrapper around aiormq. It allows concurrent operations against RabbitMQ, which increases performance.
motor — an asynchronous Python driver for MongoDB.
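As a quick smoke test, here is a minimal sketch that opens both asynchronous connections. The URLs and credentials are assumptions matching the docker-compose defaults below:

import asyncio

import aio_pika
from motor.motor_asyncio import AsyncIOMotorClient


async def smoke_test() -> None:
    # Both connections are non-blocking; other coroutines can run while they open
    rabbitmq = await aio_pika.connect_robust("amqp://guest:guest@localhost/")
    mongo = AsyncIOMotorClient("mongodb://localhost:27017")
    await mongo.admin.command("ping")  # verifies MongoDB is reachable
    await rabbitmq.close()


asyncio.run(smoke_test())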
Now, create a docker-compose file to run the services:
docker-compose.yml
version: '3'
services:
  mongodb:
    image: mongo
    ports:
      - "27017:27017"
    command: mongod --bind_ip 0.0.0.0
  rabbitmq:
    container_name: rabbitmq
    image: rabbitmq:3-management-alpine
    ports:
      - 5672:5672
      - 15672:15672
    networks:
      - rabbitmq_net
networks:
  rabbitmq_net:
We will not dockerize the actual service, since it's much easier to debug it through VS Code (or another editor) and see the steps in detail. You can create a virtual environment and install the requirements above to run the project locally.
launch.json (for VS Code)
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Consumer",
            "type": "python",
            "request": "launch",
            "program": "${workspaceRoot}/src/ingest.py",
            "env": { "PYTHONPATH": "${workspaceRoot}" },
            "console": "integratedTerminal",
            "justMyCode": false
        },
        {
            "name": "Producer",
            "type": "python",
            "request": "launch",
            "program": "${workspaceRoot}/src/publisher_manual.py",
            "env": { "PYTHONPATH": "${workspaceRoot}" },
            "console": "integratedTerminal",
            "justMyCode": false
        }
    ]
}
Implementation of the Consumer
In this part, we will implement the consumer service along with helper functions and abstract classes. First, let's start by adding an abstract class for the RabbitMQ consumer:
src/utils/abstract_rabbitmq_consumer.py
import asyncio
from abc import ABCMeta, abstractmethod

from aio_pika.message import IncomingMessage
from aio_pika.queue import Queue


async def mark_message_processed(orig_message: IncomingMessage):
    """Notify the message broker that the message has been processed."""
    try:
        await orig_message.ack()
    except Exception:
        await orig_message.nack()
        raise


class RabbitMQConsumer(metaclass=ABCMeta):
    """RabbitMQ consumer abstract class responsible for consuming data from the queue."""

    def __init__(
        self,
        queue: Queue,
        iterator_timeout: int = 5,
        iterator_timeout_sleep: float = 5.0,
        *args,
        **kwargs,
    ):
        """
        Args:
            queue (Queue): aio_pika queue object
            iterator_timeout (int): In seconds. The queue iterator raises
                TimeoutError if no message arrives within this time, and
                iterating starts again.
            iterator_timeout_sleep (float): In seconds. Time to sleep
                between iterating attempts.
        """
        self.queue = queue
        self.iterator_timeout = iterator_timeout
        self.iterator_timeout_sleep = iterator_timeout_sleep
        self.consuming_flag = True

    async def consume(self):
        """Consumes data from the RabbitMQ queue until `stop_consuming()` is called."""
        async with self.queue.iterator(timeout=self.iterator_timeout) as queue_iterator:
            while self.consuming_flag:
                try:
                    async for orig_message in queue_iterator:
                        await self.process_message(orig_message)
                        if not self.consuming_flag:
                            break  # Breaks the queue iterator
                except asyncio.exceptions.TimeoutError:
                    # No message arrived within iterator_timeout; back off briefly
                    if self.consuming_flag:
                        await asyncio.sleep(self.iterator_timeout_sleep)
                finally:
                    # Runs both after a timeout and after the iterator is broken
                    await self.on_finish()

    @abstractmethod
    async def process_message(self, orig_message: IncomingMessage):
        raise NotImplementedError()

    def stop_consuming(self):
        """Stops the consuming gracefully."""
        self.consuming_flag = False

    async def on_finish(self):
        """Called after the message consuming finished."""
The abstract class implements the main loop for consuming messages from the queue and processing them asynchronously. You can take a look at the simple consumer example in the aio-pika documentation.
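The class also exposes stop_consuming() for a graceful shutdown. One way to trigger it (a sketch of my own, not from the original service, and Unix-only) is to wire it to OS signals from within the running event loop:

import asyncio
import signal


def install_signal_handlers(consumer: "RabbitMQConsumer") -> None:
    # Must be called from inside a running event loop (Unix only)
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGINT, signal.SIGTERM):
        # stop_consuming() only flips consuming_flag, so the consumer
        # exits after the current message or the next iterator timeout
        loop.add_signal_handler(sig, consumer.stop_consuming)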
Next, we need to add the actual consumer that will inherit from the abstract class.
src/consumer.py
import logging

from aio_pika.message import IncomingMessage
from aio_pika.queue import Queue
from motor.motor_asyncio import AsyncIOMotorClient

from src.utils.abstract_rabbitmq_consumer import RabbitMQConsumer

logger = logging.getLogger(__name__)


class Consumer(RabbitMQConsumer):
    def __init__(
        self,
        queue: Queue,
        db_client: AsyncIOMotorClient,
    ):
        super().__init__(queue=queue)
        # Keep a reference to the async MongoDB client for later CRUD operations
        self.db_client = db_client

    async def process_message(self, orig_message: IncomingMessage):
        logger.info(orig_message)
The naming conventions here depend on what the consumer will be used for; you can rename the file and the class as you like. The consumer class is initialized with two main parameters:
The subscribed queue where the consuming will happen.
An asynchronous MongoDB client for CRUD operations after processing messages.
The process_message function from the abstract class must be overridden in each consumer service.
As a best practice, it's recommended to separate DB services from the actual consuming structure: messages are processed in the consuming part, while the database functions live in a separate service that only handles DB operations, as sketched below.
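As an illustration only (the service name, database, and collection here are hypothetical), a separate DB service plus an overridden process_message could look like this, reusing mark_message_processed from the abstract module to acknowledge messages:

import ujson
from aio_pika.message import IncomingMessage
from motor.motor_asyncio import AsyncIOMotorClient

from src.consumer import Consumer
from src.utils.abstract_rabbitmq_consumer import mark_message_processed


class PostDBService:
    """Keeps all MongoDB operations out of the consuming logic."""

    def __init__(self, db_client: AsyncIOMotorClient):
        self.collection = db_client["blog"]["posts"]

    async def save_post(self, post: dict) -> None:
        # motor's insert_one is a coroutine, so it must be awaited
        await self.collection.insert_one(post)


class PostConsumer(Consumer):
    def __init__(self, queue, db_client):
        super().__init__(queue=queue, db_client=db_client)
        self.db_service = PostDBService(db_client)

    async def process_message(self, orig_message: IncomingMessage):
        post = ujson.loads(orig_message.body)
        await self.db_service.save_post(post)
        await mark_message_processed(orig_message)  # ack, or nack and re-raise on failure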
Main Ingest Structure
In this section, we'll add the starting point of the service: the base ingesting structure where the consumers and the database client are initialized.
src/ingest.py
import os
import asyncio
import logging

import aio_pika
from aio_pika.channel import Channel
from aio_pika.queue import Queue
from motor.motor_asyncio import AsyncIOMotorClient

from src.consumer import Consumer

DEFAULT_QUEUE_PARAMETERS = {
    "durable": True,
    "arguments": {
        "x-queue-type": "quorum",
    },
}

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


async def _prepare_consumed_queue(channel: Channel) -> Queue:
    if os.environ.get('RABBITMQ_DEAD_LETTER_EXCHANGE_NAME'):
        # Auto-reroute nacked messages to the DLX
        DEFAULT_QUEUE_PARAMETERS["arguments"]["x-dead-letter-exchange"] = os.environ.get('RABBITMQ_DEAD_LETTER_EXCHANGE_NAME')

    queue = await channel.declare_queue(
        os.environ.get('RABBITMQ_QUEUE_NAME'),
        **DEFAULT_QUEUE_PARAMETERS,
    )
    await queue.bind(os.environ.get('RABBITMQ_EXCHANGE_NAME'), "blog.posts.create")
    return queue


async def _prepare_dead_letter_queue(channel: Channel) -> Queue:
    dead_letter_queue: Queue = await channel.declare_queue(
        os.environ.get("RABBITMQ_DEAD_LETTER_QUEUE_NAME"),
        **DEFAULT_QUEUE_PARAMETERS,
    )
    # RABBITMQ_QUEUE_BINDINGS is expected to hold a comma-separated list of routing keys
    bindings = os.environ.get('RABBITMQ_QUEUE_BINDINGS', '').split(',')
    for routing_key in bindings:
        await dead_letter_queue.bind(os.environ.get('RABBITMQ_DEAD_LETTER_EXCHANGE_NAME'), routing_key)
    return dead_letter_queue


async def main(consumer_class) -> None:
    mongo_db_client = AsyncIOMotorClient("mongodb://localhost:27017")
    # asyncio.run() below creates and manages the event loop for us
    rabbitmq_connection = await aio_pika.connect_robust(
        url="amqp://guest:guest@localhost/",
    )
    try:
        async with rabbitmq_connection.channel() as channel:
            await channel.set_qos(prefetch_count=100)

            if os.environ.get('RABBITMQ_DEAD_LETTER_ENABLED'):
                await _prepare_dead_letter_queue(channel)

            queue = await _prepare_consumed_queue(channel)
            consumer = consumer_class(
                queue=queue,
                db_client=mongo_db_client,
            )
            await consumer.consume()
    finally:
        await rabbitmq_connection.close()


if __name__ == '__main__':
    try:
        asyncio.run(
            main(Consumer)
        )
    except asyncio.CancelledError:
        logger.info('Main task cancelled')
    except Exception:
        logger.exception('Something unexpected happened')
    finally:
        logger.info("Shutdown complete")
Basically, we're combining asyncio, aio-pika, and motor to start the ingesting process by initializing the RabbitMQ assets, the database client, and the consumers.
Sometimes messages can be unprocessable for different reasons, such as an invalid body, excessive queue length, or some internal logic. However, these messages must be processed again at a certain point to prevent data loss.
As a solution, create an exchange (the type can be topic) that will be used to redirect unprocessable messages to related queues, and bind one or more queues to it. Then declare the consumed queue with {"x-dead-letter-exchange": "my-dlq-exchange"} as an extra argument; this tells RabbitMQ where to forward that queue's dead-lettered messages.
Once a message is rejected or nacked without requeueing, RabbitMQ will automatically look up the dead letter exchange and forward the message to the corresponding queues.
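A minimal sketch of this wiring with aio-pika (the exchange, queue, and routing-key names here are hypothetical examples, not taken from the service above):

import aio_pika


async def setup_dead_lettering(channel) -> None:
    # Topic exchange that receives dead-lettered messages
    dlx = await channel.declare_exchange(
        "my-dlq-exchange", aio_pika.ExchangeType.TOPIC, durable=True,
    )

    # Queue that collects them, bound to the DLX with a wildcard key
    dlq = await channel.declare_queue("blog.posts.dead-letter", durable=True)
    await dlq.bind(dlx, "blog.posts.#")

    # The consumed queue dead-letters its rejected/nacked messages to the DLX
    await channel.declare_queue(
        "blog.posts.create",
        durable=True,
        arguments={"x-dead-letter-exchange": "my-dlq-exchange"},
    )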
Manual Publisher
Finally, we just need a simple script to push some messages to queues.
src/publisher_manual.py
import asyncio
import json

import aio_pika
from bson import json_util


async def main() -> None:
    connection = await aio_pika.connect_robust(
        "amqp://guest:guest@localhost/",
    )
    async with connection:
        routing_key = "blog.posts.create"
        channel = await connection.channel()

        # The default exchange routes by queue name, so the message lands
        # in the queue named after the routing key
        await channel.default_exchange.publish(
            aio_pika.Message(body=json.dumps({"ping": "pong"}, default=json_util.default).encode()),
            routing_key=routing_key,
        )


if __name__ == "__main__":
    asyncio.run(main())
We can run it through the VS Code debug configuration provided at the beginning of this post.
Conclusion
Now, we're ready to run the project with the following steps:
1. Run the docker containers (docker-compose up -d starts rabbitmq and mongodb).
2. Run the Consumer configuration through the VS Code debugger; the consumer will start listening for incoming messages.
3. Run the Producer configuration to start the manual publisher script, which sends messages to the queue.
You can use this initial structure for real distributed projects; however, you'll have to extend its functionality as your project requires.
Support
If you feel like you unlocked new skills, please share with your friends and stay connected :)