Kafka-Docker: Steps To Run Apache Kafka Using Docker
In this Kafka tutorial, we will learn the concept of Kafka-Docker and cover all the steps to run Apache Kafka using Docker, including how to uninstall Docker Compose.
Along with this, we are going to look at usage, broker IDs, advertised hostnames, advertised ports, and more.
So, let's begin the Kafka-Docker tutorial.
What is Kafka-docker?
Here are the steps to run Apache Kafka using Docker, i.e., Kafka-Docker.
i. Prerequisites for Using Docker
- First of all, install docker-compose.
a. Install Docker Compose
We can run Compose on macOS, Windows, and 64-bit Linux.
Now, the steps to install Kafka-Docker are:
1. Docker Compose relies on Docker Engine for any meaningful work. Hence, we have to ensure that Docker Engine is installed, either locally or on a remote machine, depending on our setup.
On desktop systems such as Docker for Mac and Windows, Docker Compose is included as part of those desktop installs.
2. Then, install Compose on macOS.
Docker for Mac and Docker Toolbox already include Compose along with other Docker apps, so Mac users do not need to install Compose separately. On 64-bit Linux, Compose can be installed with curl, as sketched below.
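For reference, a minimal sketch of the curl-based install on 64-bit Linux (this is the curl install that the uninstallation step below refers to; the release number 1.29.2 is only an example, substitute the version you need):
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version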
3. Uninstallation of Docker Compose
- If we installed it using curl, then to uninstall Docker Compose:
sudo rm /usr/local/bin/docker-compose
- If we installed using pip, then to uninstall Docker Compose:
pip uninstall docker-compose
- After installing Compose, modify the KAFKA_ADVERTISED_HOST_NAME in docker-compose.yml to match our Docker host IP (see the sketch below).
Note: Do not use localhost or 127.0.0.1 as the host IP to run multiple brokers.
- If we want to customize any Kafka parameters, we need to add them as environment variables in docker-compose.yml.
- By adding environment variables prefixed with LOG4J_, Kafka’s log4j usage can be customized. These will be mapped to log4j.properties. For example LOG4J_LOGGER_KAFKA_AUTHORIZER_LOGGER=DEBUG, authorizerAppender
NOTE: There are various ‘gotchas’ with configuring networking.
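Tying these points together, here is a minimal sketch of the environment section of docker-compose.yml (the host IP 192.168.99.100, the extra Kafka parameter, and the log4j override are all illustrative assumptions):
environment:
  KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100        # must match the docker host IP, not localhost
  KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  KAFKA_MESSAGE_MAX_BYTES: 2000000                  # any KAFKA_XXX variable maps to a broker parameter
  LOG4J_LOGGER_KAFKA_AUTHORIZER_LOGGER: "DEBUG, authorizerAppender"   # mapped into log4j.properties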
ii. Usage
Start a Kafka cluster:
- docker-compose up -d
To add more Kafka brokers:
- docker-compose scale kafka=3
To destroy a cluster:
- docker-compose stop
Note: The default docker-compose.yml should be seen as a starting point. By default, each Kafka broker will get a new port number and broker ID on a restart. Depending on our use case, this might not be desirable.
Also, we can modify the docker-compose configuration accordingly, to use specific ports and broker ids, e.g. docker-compose-single-broker.yml:
docker-compose -f docker-compose-single-broker.yml up
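For reference, a minimal docker-compose.yml along the lines of the kafka-docker project's default (a sketch; the image names follow that project, and the host IP is an assumption to adapt):
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092"                                      # a random host port per broker, so brokers can scale
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100    # replace with your docker host IP
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock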
iii. Broker IDs
It is possible to configure the broker ID in different ways:
- explicitly, using KAFKA_BROKER_ID
- via a command, using BROKER_ID_COMMAND, e.g. BROKER_ID_COMMAND: "hostname | awk -F'-' '{print $$2}'"
However, if we don't specify a broker ID in our docker-compose file, it will be generated automatically, which permits scaling up and down. In this case, use the --no-recreate option of docker-compose to ensure that containers are not re-created and thus keep their names and IDs. A sketch of the explicit form follows below.
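The explicit form is just another entry in the kafka service's environment (the ID value 1 is only an example):
environment:
  KAFKA_BROKER_ID: 1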
iv. Automatically create topics
If we want Kafka-Docker to automatically create topics in Kafka during startup, a KAFKA_CREATE_TOPICS environment variable can be added in docker-compose.yml.
Here is an example snippet from docker-compose.yml:
environment:
  KAFKA_CREATE_TOPICS: "Topic1:1:3,Topic2:1:1:compact"
Here, Topic1 has 1 partition and 3 replicas, whereas Topic2 has 1 partition, 1 replica, and a cleanup.policy set to compact.
Moreover, we can override the default separator (a comma) by specifying the KAFKA_CREATE_TOPICS_SEPARATOR environment variable, in order to use multi-line YAML or some other delimiter between topic definitions.
For example, KAFKA_CREATE_TOPICS_SEPARATOR: "$$'\n'" would use a newline to split the topic definitions. Make sure the syntax follows docker-compose escaping rules and ANSI-C quoting.
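A sketch of the multi-line form this enables (same topics as above):
environment:
  KAFKA_CREATE_TOPICS_SEPARATOR: "$$'\n'"
  KAFKA_CREATE_TOPICS: |
    Topic1:1:3
    Topic2:1:1:compact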
v. Advertised Hostname
We can configure the advertised hostname in different ways:
- explicitly, using KAFKA_ADVERTISED_HOST_NAME
- via a command, using HOSTNAME_COMMAND, e.g. HOSTNAME_COMMAND: "route -n | awk '/UG[ \t]/{print $$2}'"
However, if KAFKA_ADVERTISED_HOST_NAME is already specified, it takes priority over HOSTNAME_COMMAND.
For AWS deployments, we can use the metadata service to get the container host's IP:
HOSTNAME_COMMAND=wget -t3 -T2 -qO- http://169.254.169.254/latest/meta-data/local-ipv4
- Injecting HOSTNAME_COMMAND into the configuration
If we require the value of HOSTNAME_COMMAND in any of our other KAFKA_XXX variables, use the _{HOSTNAME_COMMAND} string in the variable value, e.g.:
KAFKA_ADVERTISED_LISTENERS=SSL://_{HOSTNAME_COMMAND}:9093,PLAINTEXT://9092
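Putting the two together in compose-file form, a sketch for an AWS deployment (listener protocols and ports as in the line above):
environment:
  HOSTNAME_COMMAND: "wget -t3 -T2 -qO- http://169.254.169.254/latest/meta-data/local-ipv4"
  KAFKA_ADVERTISED_LISTENERS: SSL://_{HOSTNAME_COMMAND}:9093,PLAINTEXT://9092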
vi. Advertised Port
If the required advertised port is not static, it may be necessary to determine it programmatically. This is possible with the PORT_COMMAND environment variable:
PORT_COMMAND: "docker port $$(hostname) 9092/tcp | cut -d: -f2"
We can then interpolate it into any other KAFKA_XXX config by using the _{PORT_COMMAND} string, i.e.
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://1.2.3.4:_{PORT_COMMAND}
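In compose-file form, a sketch (1.2.3.4 stands in for the advertised host IP; PORT_COMMAND assumes the docker client and socket are available inside the container):
environment:
  PORT_COMMAND: "docker port $$(hostname) 9092/tcp | cut -d: -f2"
  KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://1.2.3.4:_{PORT_COMMAND}
volumes:
  - /var/run/docker.sock:/var/run/docker.sock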
vii. Listener Configuration
It is very useful to keep the Kafka documentation open while working through this section, in order to understand the various broker listener configuration options.
Since version 0.9.0, Kafka has supported multiple listener configurations for brokers, to help support different protocols and to discriminate between internal and external traffic.
NOTE: advertised.host.name and advertised.port still work as expected, but should not be used if configuring the listeners.
viii. Example
The example environment below:
HOSTNAME_COMMAND: curl http://169.254.169.254/latest/meta-data/public-hostname
KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://_{HOSTNAME_COMMAND}:9094
KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
Will result in the following broker config:
advertised.listeners = OUTSIDE://ec2-xx-xx-xxx-xx.us-west-2.compute.amazonaws.com:9094,INSIDE://:9092
listeners = OUTSIDE://:9094,INSIDE://:9092
inter.broker.listener.name = INSIDE
ix. Rules
- No listeners may share a port number.
- An advertised.listener must be present, by protocol name and port number, in the listeners list.
x. Broker Rack
We can configure the broker rack affinity in different ways:
- explicitly, using KAFKA_BROKER_RACK
- via a command, using RACK_COMMAND, e.g. RACK_COMMAND: "curl http://169.254.169.254/latest/meta-data/placement/availability-zone"
In the above example, the AWS metadata service is used to put the instance’s availability zone in the broker.rack property.
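A sketch of both forms in the kafka service's environment (use one or the other; the explicit value us-west-2a is only an example):
environment:
  KAFKA_BROKER_RACK: us-west-2a
  # or, derived at startup:
  # RACK_COMMAND: "curl http://169.254.169.254/latest/meta-data/placement/availability-zone"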
xi. JMX
We may need to configure JMX for monitoring purposes. In addition to the standard JMX parameters, problems may arise from the underlying RMI protocol used to connect:
- java.rmi.server.hostname - the interface to bind the listening port to
- com.sun.management.jmxremote.rmi.port - the port used to service RMI requests
For example, if we want to connect to a Kafka instance running locally (suppose it exposes port 1099):
KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.rmi.port=1099"
JMX_PORT: 1099
JConsole can now connect with: jconsole 192.168.99.100:1099
xii. Docker Swarm Mode
The above listener configuration is necessary when deploying Kafka in a Docker Swarm using an overlay network. By separating the OUTSIDE and INSIDE listeners, a host can communicate with clients outside the overlay network while still benefiting from it within the swarm.
More good practices for operating Kafka in a Docker Swarm include:
- To launch one and only one Kafka broker per swarm node, use “deploy: global” in a compose file.
- Rather than the default "ingress" load-balanced port binding, use compose file version "3.2" and the "long" port definition, with the port in "host" mode. For example:
ports:
  - target: 9094
    published: 9094
    protocol: tcp
    mode: host
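Combining both practices, a sketch of a swarm-mode kafka service (the image name follows the kafka-docker project; the overlay network name is an assumption):
version: '3.2'
services:
  kafka:
    image: wurstmeister/kafka
    deploy:
      mode: global                 # exactly one broker per swarm node
    ports:
      - target: 9094
        published: 9094
        protocol: tcp
        mode: host                 # bypass the ingress load balancer
    networks:
      - kafka-net
networks:
  kafka-net:
    driver: overlay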
So, this was all about Kafka-Docker. We hope you liked our explanation of running Kafka with Docker.
Conclusion
Hence, we have seen the whole Kafka-Docker tutorial, including how to uninstall Docker Compose. If you have any doubts regarding Kafka-Docker, feel free to ask through the comment section.