Breaking the Scale Barrier: 1 Million Messages with NodeJS and Kafka
Apache Kafka is a distributed event streaming platform that can be used as a pub-sub broker between NodeJS microservices. I recently worked on a project where a publisher needed to send messages to 1 million users and report the result back. A message queue is a natural fit here: we can push 1 million messages into a queue and have them delivered to the users one by one. A pub-sub broker, by contrast, broadcasts each message to every consumer instance subscribed to a given topic.
The Magic of Message Queues vs. Pub-Sub in Kafka
When faced with such a massive scale of messaging, two approaches often come to mind:
Message Queue: Send 1 million messages to a queue, where consumers process them one by one.
Pub-Sub: Broadcast a single message to all instances of consumers subscribed to a specific topic.
For our use case, Kafka’s flexibility in handling both patterns made it the ideal choice. Let’s dive into how to implement a Producer-Consumer microservice system with Kafka.
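In Kafka, which pattern you get comes down to consumer groups: consumers sharing a groupId split the topic's partitions between them, so each message is handled once per group (queue-like), while consumers in different groups each receive every message (pub-sub-like). A small sketch with kafkajs (the group names are illustrative, not from the demo repo):
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'demo', brokers: [process.env.KAFKA_BROKER] });

// Queue-like: these two instances share a groupId, so Kafka splits the
// topic's partitions between them and each message is processed only once per group.
const worker1 = kafka.consumer({ groupId: 'workers' });
const worker2 = kafka.consumer({ groupId: 'workers' });

// Pub-sub-like: a consumer in a different group receives its own copy
// of every message published to the topic.
const auditLog = kafka.consumer({ groupId: 'audit' });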
Check out the demo code: https://github.com/fahadfahim13/producer-consumer-kafka.git
Architecture
Producer Microservice:
Let's create a Producer microservice and a Consumer microservice.
The code in producer/index.js will look like this:
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-producer',
  brokers: [process.env.KAFKA_BROKER],
});

const producer = kafka.producer();

(async () => {
  await producer.connect();

  // Send 1,000,000 messages to the topic, one send() call per message.
  for (let i = 0; i < 1000000; i++) {
    await producer.send({
      topic: process.env.TOPIC_NAME,
      messages: [{ value: `Message ${i}` }],
    });
    console.log(`Sent message ${i}`);
  }

  await producer.disconnect();
})();
Here, the producer generates 1 million messages and sends each one to the Kafka topic in its own send() call.
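Awaiting a separate send() for every message costs a network round trip per record. Since kafkajs accepts an array of messages in a single send() call, batching is an easy way to raise throughput. A rough sketch of the same producer with batches of 1,000 (the batch size is an illustrative choice, not something from the demo repo):
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-producer',
  brokers: [process.env.KAFKA_BROKER],
});

const producer = kafka.producer();

(async () => {
  await producer.connect();

  const TOTAL = 1000000;
  const BATCH_SIZE = 1000; // illustrative value; tune against broker and message size limits

  for (let start = 0; start < TOTAL; start += BATCH_SIZE) {
    // Build one array of messages and ship it in a single send() call.
    const messages = [];
    for (let i = start; i < start + BATCH_SIZE; i++) {
      messages.push({ value: `Message ${i}` });
    }
    await producer.send({ topic: process.env.TOPIC_NAME, messages });
    console.log(`Sent messages ${start}..${start + BATCH_SIZE - 1}`);
  }

  await producer.disconnect();
})();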
Consumer Microservice
The code in the consumer microservice's index.js file will look like this:
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-consumer',
  brokers: [process.env.KAFKA_BROKER],
});

const consumer = kafka.consumer({ groupId: 'test-group' });

(async () => {
  await consumer.connect();
  await consumer.subscribe({ topic: process.env.TOPIC_NAME, fromBeginning: true });

  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      // message.value is a Buffer, so convert it to a string before logging.
      console.log(`Received message: ${message.value.toString()} in topic: ${topic} at partition: ${partition}`);
    },
  });
})();
The consumer subscribes to the Kafka topic, receives messages asynchronously, and processes them as they arrive.
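Processing records one at a time in eachMessage can become the bottleneck at 1 million messages. kafkajs also provides an eachBatch handler that hands over a whole batch per partition at once; as a minimal sketch, the consumer.run call above could be swapped for something like this (real processing and error handling omitted):
await consumer.run({
  eachBatch: async ({ batch, resolveOffset, heartbeat }) => {
    for (const message of batch.messages) {
      // Process the record, then mark its offset as done.
      console.log(`Batch message at offset ${message.offset} from partition ${batch.partition}`);
      resolveOffset(message.offset);
    }
    // Keep the group session alive while working through large batches.
    await heartbeat();
  },
});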
Docker Compose File
The docker-compose.yml file below runs Zookeeper, Kafka, and the producer and consumer services in separate containers.
version: '3.8'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "22181:2181"
    networks:
      - kafka-network
    volumes:
      - kafka_data:/var/lib/kafka/data

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    networks:
      - kafka-network

  producer:
    build:
      context: ./producer
      dockerfile: Dockerfile
    container_name: kafka-producer
    depends_on:
      - kafka
    networks:
      - kafka-network
    volumes:
      - ./producer:/app
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_BROKER: kafka:9092
      TOPIC_NAME: my-topic

  consumer:
    build:
      context: ./consumer
      dockerfile: Dockerfile
    container_name: kafka-consumer
    depends_on:
      - kafka
    networks:
      - kafka-network
    volumes:
      - ./consumer:/app
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_BROKER: kafka:9092
      TOPIC_NAME: my-topic

networks:
  kafka-network:
    driver: bridge

volumes:
  kafka_data:
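Each service is built from a Dockerfile in its own directory (./producer and ./consumer). The demo repository ships its own Dockerfiles; as a rough idea only, a minimal one for the producer might look like this (assumed, not copied from the repo; the consumer's would be analogous):
FROM node:18-alpine

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY package*.json ./
RUN npm install

# Copy the service code and start it.
COPY . .
CMD ["node", "index.js"]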
Optimizing Kafka Performance with Partitions
To handle high throughput, it’s essential to increase the number of partitions on the Kafka topic. More partitions allow parallel processing, letting producers and consumers scale horizontally. For example (on Kafka 3.0+ the kafka-topics tool no longer accepts --zookeeper, so point it at the broker through the host-mapped port with --bootstrap-server instead):
kafka-topics --create \
  --bootstrap-server localhost:29092 \
  --replication-factor 1 \
  --partitions 10 \
  --topic my-topic
Each partition acts as a separate queue, distributing the load among consumers.
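The topic can also be created from NodeJS with the kafkajs admin client, which is convenient if it should exist before the producer starts sending. A minimal sketch, mirroring the partition count from the CLI example above:
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'admin', brokers: [process.env.KAFKA_BROKER] });
const admin = kafka.admin();

(async () => {
  await admin.connect();

  // Create the topic with 10 partitions so up to 10 consumers in one group
  // can read from it in parallel. Resolves to false if the topic already exists.
  await admin.createTopics({
    topics: [{ topic: 'my-topic', numPartitions: 10, replicationFactor: 1 }],
  });

  await admin.disconnect();
})();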
Check out the demo code: https://github.com/fahadfahim13/producer-consumer-kafka.git