Apache Kafka Workflow | Kafka Pub-Sub Messaging
In our last Kafka tutorial, we discussed Kafka Docker. Today, we will discuss the Kafka Workflow. We will cover the workflow of Pub-Sub Messaging and the workflow of Queue Messaging / Consumer Group in detail, and we will also see the role of ZooKeeper in Apache Kafka.
So, let’s start with Kafka Workflow.
What is Kafka Workflow?
In the Kafka Workflow, Kafka is a collection of topics. Each topic is split into one or more partitions, and a partition is an ordered sequence of messages in which each message is identified by an index, called its offset. All the data in a Kafka cluster is therefore the disjoint union of its partitions.
Incoming messages are appended to the end of a partition, and consumers read them in order. Kafka also maintains durability by replicating the messages to different brokers.
Kafka offers both a Pub-Sub and a queue-based messaging system in a fast, reliable, persistent, fault-tolerant, zero-downtime manner. Producers send messages to a topic, and consumers can choose whichever of the two messaging systems suits them.
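To make the topic, partition, and offset vocabulary concrete, here is a minimal in-memory sketch in Python. It is not the Kafka API: the `Topic` class, its `append` and `read` methods, and the topic name `orders` are hypothetical names chosen for illustration, and the sketch ignores replication and persistence entirely.

```python
class Topic:
    """Toy in-memory model of a topic: a fixed set of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an ordered list; a message's list index is its offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append a message to the end of one partition and return its offset."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition, offset):
        """Return every message in the partition from `offset` onward."""
        return self.partitions[partition][offset:]


topic = Topic("orders", num_partitions=2)
first = topic.append(0, "order-a")
second = topic.append(0, "order-b")
other = topic.append(1, "order-c")

print(first, second, other)  # 0 1 0
print(topic.read(0, 1))      # ['order-b']
```

Note that offsets are per partition: the first message in partition 1 gets offset 0 even though partition 0 already holds two messages.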
Workflow of Pub-Sub Messaging
In Apache Kafka, the stepwise workflow of the Pub-Sub Messaging is:
- Kafka producers send messages to a topic at regular intervals.
- Kafka brokers store the messages in the partitions configured for that topic. Messages sent without a key are spread roughly evenly across the partitions. For example, if the producer sends two messages and there are two partitions, Kafka stores the first message in the first partition and the second message in the second partition.
- A Kafka consumer subscribes to a specific topic.
- Once the consumer subscribes, Kafka provides the topic's current offset to the consumer and saves the offset in the ZooKeeper ensemble. (Kafka releases from 0.9 onward store consumer offsets in an internal Kafka topic instead of ZooKeeper.)
- The consumer requests new messages from Kafka at a regular interval (for example, every 100 ms).
- Kafka returns the messages to the consumer as soon as they have been received from producers.
- The consumer receives the messages and processes them.
- Once the messages are processed, the consumer sends an acknowledgment to the Kafka broker.
- As soon as Kafka receives the acknowledgment, it updates the offset to the new value. Because the offsets are stored durably (in ZooKeeper in this design), the consumer can read the next message correctly even after a server outage.
- This flow repeats until the consumer stops requesting messages.
- As a benefit, the consumer can rewind or skip to any offset of a topic at any time and read all the subsequent messages, as desired.
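The steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not Kafka client code: the `Broker` class and its `send`, `poll`, `commit`, and `seek` methods are hypothetical names, messages are assumed to have no key and are placed strictly round-robin, and the committed offsets live in a plain dictionary rather than in ZooKeeper or Kafka.

```python
import itertools


class Broker:
    """Toy broker: one topic, several append-only partitions, committed offsets."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self._next_partition = itertools.cycle(range(num_partitions))
        # (consumer_id, partition) -> next offset that consumer should read
        self.committed = {}

    def send(self, message):
        """Store a keyless message in the next partition, round-robin."""
        partition = next(self._next_partition)
        self.partitions[partition].append(message)
        return partition

    def poll(self, consumer_id, partition, max_records=10):
        """Return messages starting at the consumer's committed offset."""
        start = self.committed.get((consumer_id, partition), 0)
        return self.partitions[partition][start:start + max_records]

    def commit(self, consumer_id, partition, offset):
        """Acknowledge processing by recording the next offset to read."""
        self.committed[(consumer_id, partition)] = offset

    def seek(self, consumer_id, partition, offset):
        """Rewind or skip: in this toy model it is the same bookkeeping as commit."""
        self.commit(consumer_id, partition, offset)


broker = Broker(num_partitions=2)
placed = [broker.send(m) for m in ["m1", "m2", "m3", "m4"]]
print(placed)                        # [0, 1, 0, 1]

batch = broker.poll("c1", 0)         # consumer c1 starts at offset 0
print(batch)                         # ['m1', 'm3']
broker.commit("c1", 0, len(batch))   # acknowledge: next offset to read is 2
after_commit = broker.poll("c1", 0)
print(after_commit)                  # []

broker.seek("c1", 0, 0)              # rewind to the beginning
replayed = broker.poll("c1", 0)
print(replayed)                      # ['m1', 'm3']
```

Because the offset is stored on the broker side of this sketch and not in the consumer, a consumer that restarts can pick up from the last committed position, which is the property the workflow relies on during outages.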
Workflow of Kafka Queue Messaging/Consumer Group
In a queue messaging system, a group of Kafka consumers that share the same Group ID subscribes to a topic, instead of a single consumer.
All consumers that subscribe to a topic with the same Group ID are treated as a single group, and the messages are shared among them. This system’s workflow is:
- Kafka producers send messages to a Kafka topic at regular intervals.
- As in the earlier scenario, Kafka stores all messages in the partitions configured for that topic.
- A single Kafka consumer subscribes to a specific topic.
- Kafka interacts with that consumer in the same way as in Pub-Sub Messaging until a new consumer subscribes to the same topic.
- When a new consumer arrives, Kafka switches to share mode and divides the data between the two consumers. This sharing continues until the number of consumers equals the number of partitions configured for the topic.
- Once the number of consumers exceeds the number of partitions, a new consumer receives no messages until one of the existing consumers unsubscribes. This happens because, within a group, each partition is assigned to only one consumer; if no partition is free, new consumers have to wait.
- This arrangement is called a Kafka Consumer Group. In this way, Apache Kafka offers the best of both systems in a simple and efficient manner.
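The sharing rule in the list above can be illustrated with a short sketch of how partitions might be divided among the members of one group. The function `assign_partitions` is a hypothetical helper that uses a simple round-robin split; Kafka's real assignment strategies differ in detail, but they preserve the same constraint that each partition has a single owner within a group.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over group members; each partition gets exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# One consumer reads every partition, as in the Pub-Sub workflow.
print(assign_partitions(3, ["c1"]))
# {'c1': [0, 1, 2]}

# A second consumer joins and the partitions are shared.
print(assign_partitions(3, ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}

# With more consumers than partitions, the extra consumer stays idle.
print(assign_partitions(3, ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

In the last call, `c4` holds an empty list: it remains in the group but receives nothing until another member leaves and the partitions are divided again.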
Role of ZooKeeper in Apache Kafka
Apache ZooKeeper serves as the coordination interface between the Kafka brokers and consumers. It is a distributed configuration and synchronization service.
The ZooKeeper cluster shares information with the Kafka servers. Kafka stores basic metadata in ZooKeeper, such as information about topics, brokers, consumer offsets (queue readers) and so on.
In addition, the failure of a single ZooKeeper node or Kafka broker does not bring down the Kafka cluster, because the critical information stored in ZooKeeper is replicated across its ensemble. Kafka restores its state once ZooKeeper restarts, which leads to zero downtime for Kafka.
ZooKeeper is also used for leader election among the Kafka brokers when a leader fails.
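As a rough illustration of what leader election achieves, the sketch below picks a new leader for a partition from the brokers that hold a copy of it. This is a deliberately simplified model and not how ZooKeeper or Kafka implement it: `elect_leader`, the broker IDs, and the "first live replica wins" rule are assumptions made for the example, and real Kafka applies additional rules that this sketch leaves out.

```python
def elect_leader(replicas, live_brokers, current_leader):
    """Keep the current leader if it is alive, else pick the first live replica."""
    if current_leader in live_brokers:
        return current_leader
    for broker in replicas:
        if broker in live_brokers:
            return broker
    return None  # no live replica: the partition is unavailable


replicas = [1, 2, 3]  # hypothetical broker IDs holding copies of one partition

print(elect_leader(replicas, {1, 2, 3}, current_leader=1))  # 1
print(elect_leader(replicas, {2, 3}, current_leader=1))     # 2
print(elect_leader(replicas, set(), current_leader=1))      # None
```

The second call shows the failure case described above: broker 1 is gone, so leadership moves to another broker that already has the data, and clients can keep reading and writing.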
Hence, this was all about the Apache Kafka Workflow. We hope you liked our explanation. In this Kafka Workflow tutorial, we discussed the workflow of the Pub-Sub Messaging system as well as the workflow of the Kafka Queue Messaging system.
Finally, we saw the role of ZooKeeper in Apache Kafka. If you have any doubt regarding the Kafka Workflow, feel free to ask in the comment section.