In today’s fast-paced and scalable application development landscape, event-driven architecture (EDA) has emerged as a powerful pattern for building systems that are decoupled, reactive, and highly extensible. Instead of relying on tightly coupled and synchronous workflows, EDA enables services to communicate through events, promoting flexibility, resilience, and parallel processing. One of the key advantages of EDA is that messages can seamlessly travel between different modules and even across domain boundaries, while their flow can be monitored and traced much more effectively than traditional logs—providing deeper visibility into system behavior and data flow.

In this blog post, we’ll explore how to implement an event-driven system in Python and Flask using Kafka as a message broker. Kafka allows us to produce and consume messages efficiently, and it enables scalable, parallel processing across multiple machines, overcoming the limitations of threading, which confines execution to a single machine and its CPU cores. While Kafka is a robust and popular choice, we’ll also touch on alternative queueing mechanisms like RabbitMQ and Celery, demonstrating how EDA can be implemented flexibly depending on your stack and use case. Whether you’re building microservices, data pipelines, or real-time applications, adopting EDA can bring significant gains in scalability, observability, and maintainability.
Events in Real Life
Have you ever thought about how Starbucks manages queues so efficiently? You walk in, and one barista takes your name and your order, then writes it on a paper cup. That’s their job — fast and focused. Then that cup moves on to another barista who makes your coffee. Meanwhile, someone else might be restocking coffee beans from the supply chain to make sure nothing runs out.
Each person is doing a different part of the job — at their own pace — but everything flows together smoothly. That’s not a coincidence. It’s a system where each step is triggered by an event: a new customer, a new order, a low inventory alert. That’s basically how event-driven architecture works in software.
If you’re preparing for system design interviews, you’ll hear about this model a lot — and for good reason. It’s a smart way to build systems that scale easily, respond quickly, and handle real-world complexity with elegance.
Key Concepts in Event-Driven Architecture
There are four main parts in event-driven systems:
- Event: This is a message that says something happened — for example, “user registered.”
- Producer: The part of the system that creates and sends the event.
- Consumer: The part that listens for events and reacts to them.
- Broker: Usually, there’s a middleman like Kafka or RabbitMQ that holds these events and delivers them to the right consumers.
One cool thing about events is that they can be designed as chains — where one event triggers another, which triggers the next, and so on.
Or sometimes, when you get an event, you can split it into multiple smaller tasks. Each task can be processed independently and in parallel. This helps break down complex work into manageable pieces and makes scaling easier.
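As a toy illustration of this fan-out idea (plain Python, with an in-memory queue standing in for a broker topic), one incoming event can be split into independent task events:

```python
from queue import Queue

# In-memory queues standing in for broker topics (illustration only).
images_topic: Queue = Queue()
faces_topic: Queue = Queue()

def handle_image_event(event: dict) -> None:
    """Split one 'image captured' event into one task per detected face."""
    for idx, face in enumerate(event["faces"]):
        faces_topic.put({"face_index": idx, "face": face})

# One event in ...
images_topic.put({"image_id": "cam-42", "faces": ["f0", "f1", "f2"]})
handle_image_event(images_topic.get())

# ... three independent task events out, each processable in parallel.
print(faces_topic.qsize())
```

Each of the three face events can now be picked up by a different worker, which is exactly what makes scaling easier.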
Why It Makes More Sense in Python
In Python, long-running for loops are undesirable because they run serially, one iteration after another, which slows things down. Threading or multiprocessing can help, but their benefit is capped by the number of CPU cores on the machine. Event-driven systems take a different approach: you can design a system with many servers and many workers running in parallel.
Scalability is simple — if you add a new server that consumes messages from a topic, your app can process more events at the same time without changing your code.
Decoupling Request Handling from Processing
Another big advantage is how your app handles requests. In a traditional system, when you send a request, you often have to wait until the server finishes all the processing before getting a response.
In event-driven architecture, your app can respond immediately, saying, “Request received,” and give you a unique ID. Behind the scenes, the request is stored as an event in a topic.
Another job or worker listens for these events and does the actual processing, like sending emails or updating a database. To keep users informed, you can expose another endpoint where they can check the status of their request using the ID you provided. This way, your main app stays responsive all the time because it’s not doing the heavy lifting immediately.
Even if the worker goes down temporarily, events stay safely in the topic and get processed later when the worker is back up. This makes your system much more reliable.
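A stripped-down sketch of this pattern, using an in-memory queue as the topic, a dict as the status store, and a background thread as the worker (a real system would use a broker and a database):

```python
import threading
import time
import uuid
from queue import Queue

events: Queue = Queue()       # stands in for a broker topic
status: dict = {}             # stands in for a status database

def submit(payload: dict) -> str:
    """Respond immediately with an ID; the real work happens later."""
    request_id = str(uuid.uuid4())
    status[request_id] = "pending"
    events.put((request_id, payload))
    return request_id

def worker() -> None:
    """Consumes events and does the heavy lifting asynchronously."""
    while True:
        request_id, payload = events.get()
        time.sleep(0.05)                 # simulate slow processing
        status[request_id] = "done"
        events.task_done()

threading.Thread(target=worker, daemon=True).start()

rid = submit({"email": "user@example.com"})
print(status[rid])   # "pending" -- the caller was not blocked
events.join()        # wait for the worker (only for this demo)
print(status[rid])   # "done"
```

The `status` lookup is what a "check my request" endpoint would return, keyed by the ID handed back at submission time.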
Common Tools and Technologies
Back in the 90s, enterprise applications used heavy and complex message queues, called MQs.
Once, I had to implement an event-driven-like system using a temporary database table and a trigger. Whenever a new record was added to the main table, the trigger would insert its metadata into the temporary table. I was continuously polling this temporary table to detect new records. After processing a record, I would delete its corresponding entry from the temporary table.
Today, things are much lighter and easier to use. Popular tools include Kafka, RabbitMQ, and Celery.
Personally, I like consuming messages with Flask. When you build a message bus this way, it feeds incoming events to web service methods exposed in Flask. This approach is great because it makes it easy to monitor, debug, and test your event consumers using familiar HTTP endpoints.
Real-Life Example Scenario
Let’s look at a real-life example involving CCTV cameras and facial recognition. Imagine a busy public center with hundreds of people walking in. The CCTV records images continuously — each new image is fed as an event into the system.
A job consumes the image event and detects faces in the image. For each detected face, it creates a new event and puts it into another topic. Once done, that job’s task is complete.
Another job consumes these face events, runs facial recognition, and converts the faces into numerical representations called embeddings. These embeddings are sent as new events to yet another topic.
The final job listens for these embeddings and searches them against a database of wanted people. If it finds a match, it triggers an event that alerts the authorities.
Finally, criminals are reported to the police with their latest location — all done asynchronously and efficiently through a chain of events.
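The chain above can be sketched as a series of handlers, each consuming from one topic and producing to the next. In-memory queues replace Kafka here, and the detection, embedding, and watchlist logic are stubs:

```python
from queue import Queue

topics = {name: Queue() for name in
          ("images", "faces.extracted", "faces.embedded", "alerts")}

WANTED = {"emb-face-1"}  # stub watchlist of known embeddings

def detect_faces(event):
    # one image event fans out into one event per detected face
    for face in event["faces"]:
        topics["faces.extracted"].put({"face": face})

def embed_face(event):
    # stub embedding; a real system would run a recognition model
    topics["faces.embedded"].put({"embedding": "emb-" + event["face"]})

def search_watchlist(event):
    if event["embedding"] in WANTED:
        topics["alerts"].put({"match": event["embedding"]})

handlers = {"images": detect_faces,
            "faces.extracted": embed_face,
            "faces.embedded": search_watchlist}

topics["images"].put({"faces": ["face-0", "face-1"]})
for topic in ("images", "faces.extracted", "faces.embedded"):
    while not topics[topic].empty():
        handlers[topic](topics[topic].get())

print(topics["alerts"].qsize())  # one watchlist match -> one alert
```

In production each handler would be a separate worker process, so every stage scales independently.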
Traditional Approach
Consider the following snippet. When the analyze method receives an image, it first calls DeepFace’s extract_faces function, which returns a list of faces. Then it calls DeepFace’s analyze function for each pre-detected, pre-extracted face.
def analyze(self, image: str):
    faces = self.deepface_service.extract_faces(image)
    self.logger.info(f"extracted {len(faces)} faces")
    for idx, face in enumerate(faces):
        demography = self.deepface_service.analyze(face)
        self.logger.info(
            f"{idx+1}-th face analyzed: {demography['age']} years old "
            f"{demography['dominant_gender']} "
        )
Python’s standard for loops are synchronous and blocking, meaning each iteration completes before the next one starts. In contrast, JavaScript for loops can start asynchronous operations (like Promises) in parallel, allowing multiple tasks to run concurrently.
In Python, achieving parallel or asynchronous execution requires asyncio, threading, or multiprocessing, because a plain for loop will not run tasks in parallel. And even with such parallelism, execution is still bound to the cores of a single machine, so it is not suitable for scaling across multiple machines.
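For completeness, here is what single-machine concurrency with asyncio looks like: three simulated 0.1-second tasks finish in roughly 0.1 seconds instead of 0.3, but still on one machine's cores.

```python
import asyncio
import time

async def analyze_face(idx: int) -> str:
    await asyncio.sleep(0.1)          # simulate I/O-bound work
    return f"face {idx} analyzed"

async def main() -> list:
    # start all three coroutines concurrently instead of one by one
    return await asyncio.gather(*(analyze_face(i) for i in range(3)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results)
print(f"took {elapsed:.2f}s")         # ~0.10s, not 0.30s
```

Kafka removes this single-machine ceiling: the same fan-out happens across as many consumer machines as the topic has partitions.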
Event-Driven Approach
Instead, we will publish each item from the for loop to a Kafka topic, and a separate job will consume and process them.
def analyze(self, image: str):
    faces = self.deepface_service.extract_faces(image)
    self.logger.info(f"extracted {len(faces)} faces")
    for idx, face in enumerate(faces):
        encoded_face = base64.b64encode(face.tobytes()).decode("utf-8")
        self.event_service.produce(
            topic_name="faces.extracted",
            key="extracted_face",
            value={
                "face_index": idx,
                "encoded_face": encoded_face,
                "shape": face.shape,
            },
        )
Producing to Kafka will be very fast because it is an asynchronous operation — the message is queued, but it is not verified whether it has actually been written to the topic (unless explicitly requested).
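To make the fire-and-forget semantics concrete, here is a toy in-memory producer (not the real Kafka client, though confluent-kafka's Producer exposes a similar produce-with-callback/flush shape): produce returns immediately, and confirmation arrives later via a delivery callback or an explicit flush.

```python
import threading
from queue import Queue

class SketchProducer:
    """Mimics Kafka's fire-and-forget produce: enqueue now, write later.
    A delivery callback fires once the 'broker' (here, a thread) confirms."""

    def __init__(self):
        self._buffer: Queue = Queue()
        self._log: list = []                       # stands in for the topic
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def produce(self, value, callback=None):
        self._buffer.put((value, callback))        # returns immediately

    def _flush_loop(self):
        while True:
            value, callback = self._buffer.get()
            self._log.append(value)                # "written to the topic"
            if callback:
                callback(None, value)              # err=None on success
            self._buffer.task_done()

    def flush(self):
        """Block until every queued message is confirmed written."""
        self._buffer.join()

confirmed = []
producer = SketchProducer()
producer.produce({"face_index": 0},
                 callback=lambda err, msg: confirmed.append(msg))
producer.flush()   # only now do we know the message was written
print(confirmed)
```

Skipping the flush keeps the hot path fast; calling it (or checking the callback) is the "explicitly requested" verification mentioned above.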
With the Flask-Kafka package, we can listen to a Kafka topic as if it were a web service. You can also test the service using an HTTP POST with the payload of a message placed on the topic. In other words, it’s the same whether you put a message on the faces.extracted Kafka topic or send an HTTP POST to the localhost:5000/analyze/extracted/face endpoint with the same payload.
@bus.handle("faces.extracted")
@blueprint.route("/analyze/extracted/face", methods=["POST"])
def analyze_extracted_face(input_args):
    event = json.loads(input_args.value)
    container: Container = blueprint.container
    try:
        container.core_service.analyze_extracted_face(
            face_index=event["face_index"],
            encoded_face=event["encoded_face"],
            shape=event["shape"],
        )
        return {"status": "ok", "message": "analyzing face asyncly"}, 200
    except Exception as err:  # pylint: disable=broad-exception-caught
        container.logger.error(
            f"Exception while analyzing single face - {err}"
        )
        container.logger.error(traceback.format_exc())
        return {"status": "error", "detail": str(err)}, 500
This way, the analyze_extracted_face function will be triggered whenever a message is placed on the faces.extracted topic, and it will perform the analysis for that individual face.
def analyze_extracted_face(
    self,
    face_index: int,
    encoded_face: str,
    shape: Tuple[int, int, int],
):
    decoded_face = base64.b64decode(encoded_face)
    face = np.frombuffer(decoded_face, dtype=np.float64).reshape(shape)
    demography = self.deepface_service.analyze(face)
    self.logger.info(
        f"{face_index+1}-th face analyzed: {demography['age']} years old "
        f"{demography['dominant_gender']} "
    )
When starting the service with Gunicorn, faces will be analyzed in parallel according to the number of workers specified in the command (in my experiments, I used 2 workers). If we run this service on multiple machines, each machine will serve with the same number of workers. In other words, for scaling, it will be sufficient to adjust the number of workers and the partition count of the Kafka topic through configuration.
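As a sketch, assuming the Flask application object is exposed as app:app (the actual module path depends on the repo), the Gunicorn invocation would look like:

```shell
# Two worker processes consume and analyze faces in parallel;
# raise --workers and the topic's partition count to scale further.
gunicorn --workers 2 --bind 0.0.0.0:5000 app:app
```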
Conclusion
Event-driven architecture represents a shift from rigid, tightly coupled systems toward flexible, scalable, and resilient designs. By leveraging tools like Kafka, RabbitMQ, or Celery, developers can decouple services, enable parallel processing, and gain better visibility into the flow of data across their applications. While implementing EDA may introduce new concepts such as brokers, producers, and consumers, the long-term benefits in terms of scalability, maintainability, and fault tolerance make it a valuable investment for modern software systems. Whether you’re orchestrating microservices, handling high-throughput data streams, or building responsive user experiences, adopting an event-driven mindset can help future-proof your architecture and keep your systems ready for growth.
I pushed the source code of this study to GitHub. I strongly recommend pulling the repo and running the service locally; the repo’s README explains the steps to get the service up. Finally, you can support this work by starring the repo.
If you liked this post, you can also support the blog financially!
