Exploiting Functional Parallelism with Kafka

Abhi
3 min read · Apr 4, 2021


Kafka is just an overgrown message queue. How do you expect it to handle parallelism?

Kafka is an asynchronous messaging queue. It also happens to be horizontally scalable, highly resilient, and capable of handling very large data throughput. Kafka writes each message into a very simple append-only log: records cannot be manipulated in place, they can only be appended to the end of the log. Within a partition, records are stored, and read back, in exactly the order they were produced.

Consumer Group

It is a mechanism for running multiple consumers as one group. The partitions of a topic are divided among all consumers of the group, so no two consumers in the same group receive the same record.

A Kafka consumer group is a set of related consumers that share a common task. Kafka delivers messages from the partitions of a topic to the consumers of the group, and each partition is read by only a single consumer within the group. A consumer group has a unique group id and can run multiple processes or instances at once. Independent consumer groups can each read from the same partition, each receiving its own copy of every message. If the number of consumers within a group is greater than the number of partitions, the extra consumers will sit idle. Let's dive into Kafka pub-sub.
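To make the group mechanics concrete, here is a minimal sketch (not from the original post) of a consumer joining a group with the kafka-python client; the broker address and group id are assumptions. Two copies of this process started with the same group_id split the topic's partitions between them, while a copy started with a different group_id independently receives every message.

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "my-function",                        # topic created in Step 2 below
    bootstrap_servers="localhost:9092",   # assumed broker address
    group_id="analytics-group",           # consumers sharing this id split the partitions
    auto_offset_reset="earliest",
)

for message in consumer:
    print(f"partition={message.partition} offset={message.offset} value={message.value}")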

Step 1: Deploy Kafka Cluster

Step 2: Create a topic inside the broker

docker exec -it kafka-tutorial_kafka1_1 kafka-topics --zookeeper zookeeper:2181 --create --topic my-function --partitions 1 --replication-factor 3

While visiting http://localhost:9000/ you should be able to access the Kafdrop GUI to visualize the Kafka brokers and topics.

Step 3: Define producers and consumers to interact with the Kafka service

The producer is written in Python (v3.6); we need to install a Kafka client library with the pip package manager. Here, “radom_gen()” generates random values and sends each one to the queue as a separate message, to replicate user inputs.
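The producer code itself isn't reproduced here, so the following is only a minimal sketch of what it could look like, assuming the kafka-python package, a broker reachable at localhost:9092, and the “my-function” topic from Step 2; the body of radom_gen() is a guess that simply emits random integers.

import json
import random
import time

from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def radom_gen(count=10):
    # Yield random integers to stand in for user inputs (name kept from the article).
    for _ in range(count):
        yield random.randint(0, 100)

for value in radom_gen():
    producer.send("my-function", value)  # one message per generated value
    time.sleep(0.5)                      # small gap between "user inputs"

producer.flush()  # make sure everything reaches the broker before exiting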

Step 4: Create consumers with different time complexities so that they finish execution at different times.
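The consumer code isn't shown either, so here is one possible sketch with kafka-python. The per-message sleep stands in for each consumer's time complexity, and each instance is given its own group id (an assumption, but it matches the behaviour described in Step 5) so that every consumer receives the full stream from the single-partition topic.

import json
import sys
import time

from kafka import KafkaConsumer  # pip install kafka-python

name = sys.argv[1] if len(sys.argv) > 1 else "consumer1"   # e.g. consumer2
delay = float(sys.argv[2]) if len(sys.argv) > 2 else 1.0   # seconds of simulated work per message

consumer = KafkaConsumer(
    "my-function",
    bootstrap_servers="localhost:9092",  # assumed broker address
    group_id=f"{name}-group",            # a separate group per consumer, so each gets every message
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    time.sleep(delay)  # simulate this consumer's time complexity
    print(f"{name} processed value={message.value} at offset {message.offset}")

Running this script three times, for example as consumer1 with a 1-second delay, consumer2 with a 5-second delay, and consumer3 with a 2-second delay, gives three functions of different cost reading the same stream.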

Step 5: Run all 3 consumers first, then run the producer.

Here, all 3 individual processes start execution at the same time by consuming messages from a common topic. Due to the different time complexities of the 3 consumers, “consumer2” finishes last. Apache Kafka offers a uniquely versatile and powerful architecture for streaming workloads, with extreme scalability, reliability, and performance.
