Breaking the Scale Barrier: 1 Million Messages with NodeJS and Kafka

Apache Kafka is a distributed event streaming platform that can be used as a pub-sub broker in NodeJS microservices. Recently, I worked on a project where a publisher needed to send messages to 1 million users and return a response. A message queue handles this nicely: we can push 1 million messages into a queue and have them delivered to the users one by one. A pub-sub model, on the other hand, broadcasts each message to every consumer instance subscribed to a given topic.

The Magic of Message Queues vs. Pub-Sub in Kafka

When faced with such a massive scale of messaging, two approaches often come to mind:

  1. Message Queue: Send 1 million messages to a queue, where consumers process them one at a time.

  2. Pub-Sub: Broadcast a single message to all instances of consumers subscribed to a specific topic.

For our use case, Kafka’s flexibility in handling both patterns made it the ideal choice. Let’s dive into how to implement a Producer-Consumer microservice system with Kafka.
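In Kafka, the choice between the two patterns comes down to consumer groups: consumers that share a groupId split a topic's partitions and act like workers pulling from a queue, while consumers in different groups each receive every message, which gives broadcast semantics. A minimal sketch (the group names here are illustrative, not from the demo repo):

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'pattern-demo',
  brokers: [process.env.KAFKA_BROKER],
});

// Queue-style: these two consumers share a group, so each message on the
// topic is handled by only one of them.
const workerA = kafka.consumer({ groupId: 'workers' });
const workerB = kafka.consumer({ groupId: 'workers' });

// Pub-sub-style: a consumer in a different group receives its own copy
// of every message on the same topic.
const auditor = kafka.consumer({ groupId: 'audit' });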

Check out the demo code: https://github.com/fahadfahim13/producer-consumer-kafka.git

Architecture

[Architecture diagram: NodeJS microservices with Kafka]

Producer Microservice:

Let's create a Producer microservice and a Consumer microservice.
The producer code in producer/index.js looks like this:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-producer',
  brokers: [process.env.KAFKA_BROKER],
});

const producer = kafka.producer();

(async () => {
  await producer.connect();
  // Send 1,000,000 messages to the configured topic, one request per message.
  for (let i = 0; i < 1000000; i++) {
    await producer.send({
      topic: process.env.TOPIC_NAME,
      messages: [{ value: `Message ${i}` }],
    });
    console.log(`Sent message ${i}`);
  }
  await producer.disconnect();
})();

Here, the producer produces 1 million messages and sends them into a Kafka topic.
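Awaiting one send call per message costs a broker round trip per message, which quickly becomes the bottleneck at this scale. A batched variant of the same producer, sketched below with an illustrative BATCH_SIZE (this is not the demo repo's code), packs thousands of messages into each request:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-producer',
  brokers: [process.env.KAFKA_BROKER],
});

const producer = kafka.producer();

// Illustrative tuning knobs, not part of the original demo.
const BATCH_SIZE = 10000;
const TOTAL_MESSAGES = 1000000;

(async () => {
  await producer.connect();

  for (let start = 0; start < TOTAL_MESSAGES; start += BATCH_SIZE) {
    const end = Math.min(start + BATCH_SIZE, TOTAL_MESSAGES);

    // Pack BATCH_SIZE messages into a single send call instead of one per call.
    const messages = [];
    for (let i = start; i < end; i++) {
      messages.push({ value: `Message ${i}` });
    }

    await producer.send({
      topic: process.env.TOPIC_NAME,
      messages,
    });
    console.log(`Sent messages ${start}-${end - 1}`);
  }

  await producer.disconnect();
})();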

Consumer Microservice

The code in the consumer microservice's index.js file looks like this:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-consumer',
  brokers: [process.env.KAFKA_BROKER],
});

const consumer = kafka.consumer({ groupId: 'test-group' });

(async () => {
  await consumer.connect();
  await consumer.subscribe({ topic: process.env.TOPIC_NAME, fromBeginning: true });

  // Process each message as it arrives; message.value is a Buffer.
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      console.log(`Received message: ${message.value} in topic: ${topic} at partition: ${partition}`);
    },
  });
})();

The consumers subscribe to a Kafka topic, receive messages asynchronously, and process them.
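For higher consumer throughput, kafkajs also offers an eachBatch handler, which hands the callback a whole batch of messages at once. A minimal sketch that could replace the consumer.run call above (error handling omitted):

await consumer.run({
  eachBatch: async ({ batch, resolveOffset, heartbeat }) => {
    for (const message of batch.messages) {
      // Handle one message from the batch; batch.partition identifies the partition.
      console.log(`Received ${message.value} from partition ${batch.partition}`);

      // Mark this message as processed so its offset can be committed.
      resolveOffset(message.offset);
    }

    // Keep the group session alive while working through a large batch.
    await heartbeat();
  },
});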

Docker Compose File

The docker-compose.yml file below runs Zookeeper, Kafka, and the producer and consumer services in separate containers.

version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "22181:2181"
    networks:
      - kafka-network

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    networks:
      - kafka-network
    volumes:
      - kafka_data:/var/lib/kafka/data

  producer:
    build: 
      context: ./producer
      dockerfile: Dockerfile
    container_name: kafka-producer
    depends_on:
      - kafka
    networks:
      - kafka-network
    volumes:
      - ./producer:/app
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_BROKER: kafka:9092
      TOPIC_NAME: my-topic

  consumer:
    build: 
      context: ./consumer
      dockerfile: Dockerfile
    container_name: kafka-consumer
    depends_on:
      - kafka
    networks:
      - kafka-network
    volumes:
      - ./consumer:/app
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_BROKER: kafka:9092
      TOPIC_NAME: my-topic

networks:
  kafka-network:
    driver: bridge


volumes:
  kafka_data:
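With the file in place, the stack can be started with Docker Compose (the commands below assume the Compose v2 CLI):

# Build the producer and consumer images and start all four containers.
docker compose up --build -d

# Follow the consumer's logs to watch the messages arrive.
docker compose logs -f consumer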

Optimizing Kafka Performance with Partitions

To handle high throughput, it’s essential to increase the number of partitions in our Kafka topic. More partitions allow parallel processing, enabling producers and consumers to scale horizontally. For example, creating the topic with 10 partitions using the kafka-topics CLI (pointed at the broker, e.g. via the host-mapped port 29092):

kafka-topics --create \
  --bootstrap-server localhost:29092 \
  --replication-factor 1 \
  --partitions 10 \
  --topic my-topic

Each partition acts as a separate queue, distributing the load among consumers.
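The same topic can also be created from NodeJS with the kafkajs admin client, which is handy if the producer service should create it at startup. A sketch, assuming the same environment variables as the demo:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'admin',
  brokers: [process.env.KAFKA_BROKER],
});

const admin = kafka.admin();

(async () => {
  await admin.connect();

  // Create the topic with 10 partitions so the load can be spread across consumers.
  await admin.createTopics({
    topics: [
      {
        topic: process.env.TOPIC_NAME,
        numPartitions: 10,
        replicationFactor: 1,
      },
    ],
  });

  await admin.disconnect();
})();

createTopics resolves to false if the topic already exists, so it is safe to run on every startup.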

Check out the demo code: https://github.com/fahadfahim13/producer-consumer-kafka.git