Kafka-Docker: Steps To Run Apache Kafka Using Docker


In this Kafka tutorial, we will learn the concept of Kafka-Docker. Moreover, we will see how to uninstall Docker Compose. This includes all the steps to run Apache Kafka using Docker.

Along with this, to run Kafka using Docker, we are going to learn about its usage, broker ids, the advertised hostname, the advertised port, etc.

So, let’s begin Kafka-docker tutorial.

What is Kafka-docker?

Here are the steps to run Apache Kafka using Docker, i.e. Kafka-docker.

i. Pre-Requisites for using Docker

  • First of all, install docker-compose

a. Install Docker Compose
We can run Compose on macOS, Windows, as well as 64-bit Linux.
Now, to install Kafka-Docker, the steps are:
1. For any meaningful work, Docker Compose relies on Docker Engine. Hence, we have to ensure that we have Docker Engine installed, either locally or remotely, depending on our setup.
Basically, on desktop systems like Docker for Mac and Windows, Docker Compose is included as part of those desktop installs.

2. Then, install Compose on macOS.
Docker for Mac and Docker Toolbox already include Compose along with other Docker apps, so Mac users do not need to install Compose separately. On 64-bit Linux, Compose can be installed with curl or pip; a sketch of the curl-based install follows.
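
If Compose was installed via curl on Linux, it was likely done along these lines (a sketch; the pinned version, here the final Compose v1 release, is an assumption, so substitute the release you need):

sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version    # verify the installation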

3. Uninstallation of Docker Compose

  • If we installed it using curl, then to uninstall Docker Compose:

sudo rm /usr/local/bin/docker-compose

  • If we installed it using pip, then to uninstall Docker Compose:

pip uninstall docker-compose

  • After installing Compose, modify the KAFKA_ADVERTISED_HOST_NAME in docker-compose.yml to match our Docker host IP (a minimal compose file is sketched after the notes below)


Note: Do not use localhost or 127.0.0.1 as the host IP to run multiple brokers.

  • If we want to customize any Kafka parameters, we need to add them as environment variables in docker-compose.yml. 
  • By adding environment variables prefixed with LOG4J_, Kafka's log4j usage can be customized. These will be mapped into log4j.properties. For example: LOG4J_LOGGER_KAFKA_AUTHORIZER_LOGGER=DEBUG, authorizerAppender

NOTE: There are various ‘gotchas’ with configuring networking. 
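
For reference, a minimal docker-compose.yml might look like the sketch below (assuming the wurstmeister/kafka and wurstmeister/zookeeper images; replace 192.168.99.100 with your own Docker host IP):

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092"                                       # random host port, so brokers can scale
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100     # replace with your Docker host IP
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock    # lets the container query the Docker daemon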

ii. Usage

Start a Kafka cluster:

  • docker-compose up -d

To add more Kafka brokers:

  • docker-compose scale kafka=3

To destroy a cluster:

  • docker-compose stop

Note: The default docker-compose.yml should be seen as a starting point. By default, each Kafka broker will get a new port number and broker id on a restart. Depending on our use case, this might not be desirable.

Also, we can modify the docker-compose configuration accordingly, to use specific ports and broker ids, e.g. docker-compose-single-broker.yml:
docker-compose -f docker-compose-single-broker.yml up

iii. Broker IDs

It is possible to configure the broker id in different ways:

  • explicitly, using KAFKA_BROKER_ID
  • via a command, using BROKER_ID_COMMAND, e.g. BROKER_ID_COMMAND: "hostname | awk -F'-' '{print $2}'"

However, if we don't specify a broker id in our docker-compose file, it will automatically be generated. This permits scaling up and down. In this case, use the --no-recreate option of docker-compose to ensure that containers are not re-created, and thus keep their names and ids.
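
For instance, to pin a single broker's id in docker-compose.yml (a sketch; the id value is arbitrary):

  environment:
    KAFKA_BROKER_ID: 1

Note that with a fixed KAFKA_BROKER_ID, scaling with docker-compose scale kafka=3 is no longer meaningful, since every replica would claim the same id.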

iv. Automatically create topics

If we want to have Kafka-docker automatically create topics in Kafka during creation, a KAFKA_CREATE_TOPICS environment variable can be added in docker-compose.yml.
Here is an example snippet from docker-compose.yml:
  environment:
    KAFKA_CREATE_TOPICS: "Topic1:1:3,Topic2:1:1:compact"
Here, Topic1 has 1 partition and 3 replicas, while Topic2 has 1 partition, 1 replica, and a cleanup.policy set to compact.

Moreover, we can override the default separator by specifying the KAFKA_CREATE_TOPICS_SEPARATOR environment variable, in order to use multi-line YAML or some other delimiter between our topic definitions.

For example, KAFKA_CREATE_TOPICS_SEPARATOR: "$$'\n'" would split the topic definitions on newlines. The syntax must follow docker-compose escaping rules and ANSI-C quoting.
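
As a sketch, multi-line topic definitions split on newlines (the topic names are the same illustrative ones as above):

  environment:
    KAFKA_CREATE_TOPICS_SEPARATOR: "$$'\n'"
    KAFKA_CREATE_TOPICS: |
      Topic1:1:3
      Topic2:1:1:compact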

v. Advertised Hostname

We can configure the advertised hostname in different ways:

  • explicitly, using KAFKA_ADVERTISED_HOST_NAME
  • via a command, using HOSTNAME_COMMAND, e.g. HOSTNAME_COMMAND: "route -n | awk '/UG[ \t]/{print $$2}'"

However, if KAFKA_ADVERTISED_HOST_NAME is already specified, it takes priority over HOSTNAME_COMMAND.
For AWS deployments, we can use the metadata service to get the container host's IP:
HOSTNAME_COMMAND=wget -t3 -T2 -qO- http://169.254.169.254/latest/meta-data/local-ipv4

  • Injecting HOSTNAME_COMMAND into the configuration

Use the _{HOSTNAME_COMMAND} string in our variable value if we require the value of HOSTNAME_COMMAND in any of our other KAFKA_XXX variables, e.g.:
KAFKA_ADVERTISED_LISTENERS=SSL://_{HOSTNAME_COMMAND}:9093,PLAINTEXT://9092

vi. Advertised Port

If the required advertised port is not static, we can determine it programmatically with the PORT_COMMAND environment variable:
PORT_COMMAND: "docker port $$(hostname) 9092/tcp | cut -d: -f2"
By using the _{PORT_COMMAND} string, we can interpolate it in any other KAFKA_XXX config, e.g.:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://1.2.3.4:_{PORT_COMMAND}
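
Putting both together in docker-compose.yml (a sketch; the advertised IP 1.2.3.4 is illustrative, and mounting the Docker socket is an assumption so the container can run docker port against itself):

  environment:
    PORT_COMMAND: "docker port $$(hostname) 9092/tcp | cut -d: -f2"
    KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://1.2.3.4:_{PORT_COMMAND}"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock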

vii. Listener Configuration

It helps to keep the Kafka documentation open while working through this section, in order to understand the various broker listener configuration options.
As of version 0.9.0, Kafka supports multiple listener configurations for brokers, to help support different protocols and to discriminate between internal and external traffic.

NOTE: advertised.host.name and advertised.port still work as expected, but should not be used if configuring the listeners.

viii. Example

The example environment below:

HOSTNAME_COMMAND: curl http://169.254.169.254/latest/meta-data/public-hostname
KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://_{HOSTNAME_COMMAND}:9094
KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE

This will result in the following broker config:

advertised.listeners = OUTSIDE://ec2-xx-xx-xxx-xx.us-west-2.compute.amazonaws.com:9094,INSIDE://:9092
listeners = OUTSIDE://:9094,INSIDE://:9092
inter.broker.listener.name = INSIDE

ix. Rules

  • No listeners may share a port number.
  • An advertised.listener must be present, by protocol name and port number, in the list of listeners (see the sketch below).
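
A sketch that satisfies both rules (the hostname is illustrative): each listener uses a distinct port, and each advertised listener matches an entry in the listeners list by name and port.

  environment:
    KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
    KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://kafka.example.com:9094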

x. Broker Rack

We can configure the broker rack affinity in different ways:

  • explicitly, using KAFKA_BROKER_RACK
  • via a command, using RACK_COMMAND, e.g. RACK_COMMAND: "curl http://169.254.169.254/latest/meta-data/placement/availability-zone"

In the above example, the AWS metadata service is used to put the instance’s availability zone in the broker.rack property.
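
In docker-compose.yml this could look as follows (a sketch; the zone name is illustrative, and only one of the two variables would normally be set):

  environment:
    KAFKA_BROKER_RACK: "us-west-2a"
    # or, derived at start-up on AWS:
    # RACK_COMMAND: "curl http://169.254.169.254/latest/meta-data/placement/availability-zone"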

xi. JMX

For monitoring purposes, we may need to configure JMX. In addition to the standard JMX parameters, problems may arise from the underlying RMI protocol used to connect:

java.rmi.server.hostname - the interface to bind the listening port to
com.sun.management.jmxremote.rmi.port - the port used to service RMI requests

For example, to connect to a Kafka broker running locally (assuming it exposes port 1099):
KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.rmi.port=1099"
JMX_PORT: 1099
JConsole can now connect with: jconsole 192.168.99.100:1099
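
As a compose sketch of the same setup (publishing the JMX/RMI port to the host is an assumption, needed so that JConsole can actually reach it):

  environment:
    KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.rmi.port=1099"
    JMX_PORT: 1099
  ports:
    - "1099:1099"   # expose the JMX/RMI port to the host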

xii. Docker Swarm Mode

The listener configuration above is necessary when deploying Kafka in a Docker Swarm using an overlay network. By separating the OUTSIDE and INSIDE listeners, a host can communicate with clients outside the overlay network while still benefiting from it within the swarm.
More good practices for operating Kafka in a Docker Swarm include:

  1. To launch one and only one Kafka broker per swarm node, use "deploy: global" in a compose file.
  2. Instead of the default "ingress" load-balanced port binding, use compose file version '3.2' and the "long" port definition, with the port in "host" mode. For example:

ports:
  - target: 9094
    published: 9094
    protocol: tcp
    mode: host
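
Putting both practices together, a minimal swarm-mode service sketch (compose file version '3.2'; the image name is assumed from context):

version: '3.2'
services:
  kafka:
    image: wurstmeister/kafka
    deploy:
      mode: global              # exactly one broker per swarm node
    ports:
      - target: 9094            # container port
        published: 9094         # host port, bound on each node
        protocol: tcp
        mode: host              # bypass the ingress load balancer
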
So, this was all about Kafka-docker. Hope you like our explanation of Kafka Docker.

Conclusion

Hence, we have seen the whole Kafka-Docker tutorial. Moreover, we saw how to uninstall Docker Compose. However, if you have any doubts regarding Kafka-Docker, feel free to ask through the comment section.

