Kafka Installation
This lab will guide you through the steps of setting up Apache Kafka and using it
to facilitate centralized logging and real-time monitoring in a DevOps environment.
By the end of this lab, you’ll understand how Kafka can be used to enhance
communication between microservices, centralize logs, and provide a robust event
streaming solution for your infrastructure.
Prerequisites:
● Basic understanding of Kafka concepts (topics, producers, consumers, brokers)
● Installed and configured:
   ○ Java 8+ (required to run Kafka)
   ○ Apache Zookeeper (used by Kafka for coordination)
   ○ Docker (optional, but useful for containerized deployment)
Tools:
● Apache Kafka (version 2.8.0 or later)
● Apache Zookeeper (to manage the Kafka cluster)
● Docker (if using containerized deployment)
● Linux or macOS (or a Linux-compatible environment such as WSL on Windows)
Step 1: Install Apache Kafka and Zookeeper
You can run Kafka on Docker or natively. Below are steps for both methods.
Method 1: Native Installation
1. Download and extract Kafka (the archive also bundles the Zookeeper scripts):
   wget https://downloads.apache.org/kafka/2.8.0/kafka_2.13-2.8.0.tgz
   tar -xzf kafka_2.13-2.8.0.tgz
   cd kafka_2.13-2.8.0
2. Start Zookeeper: Kafka depends on Zookeeper to manage its clusters, so start Zookeeper first.
   bin/zookeeper-server-start.sh config/zookeeper.properties
3. Start Kafka: In a separate terminal, start Kafka once Zookeeper is up and running.
   bin/kafka-server-start.sh config/server.properties
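Before moving on, you can optionally sanity-check that the broker is reachable (a minimal check, assuming the default localhost:9092 listener); either command should return without connection errors:
   # Query the broker's supported API versions
   bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092
   # List topics (empty output is expected on a fresh broker)
   bin/kafka-topics.sh --list --bootstrap-server localhost:9092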
Method 2: Using Docker (Recommended for Simplicity)
If you prefer to run Kafka in Docker containers, use the following steps:
1. Run Zookeeper and Kafka using Docker:
   docker network create kafka-net
   docker run -d --name zookeeper --network kafka-net \
     -e ZOOKEEPER_CLIENT_PORT=2181 \
     confluentinc/cp-zookeeper:latest
   docker run -d --name kafka --network kafka-net -p 9092:9092 \
     -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
     -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
     -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
     confluentinc/cp-kafka:latest
   The -p 9092:9092 mapping publishes the broker port on the host so the command-line tools used in the later steps can reach localhost:9092.
2. Verify Kafka and Zookeeper are running:
   docker ps
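For a closer look than docker ps provides, you can also tail each container's logs and confirm the broker finished starting up (assuming the container names used above):
   # Follow the broker's startup output; press Ctrl+C to stop following
   docker logs -f kafka
   # Check Zookeeper the same way if needed
   docker logs zookeeper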
Step 2: Create a Kafka Topic
In Kafka, a topic is where data (messages) is stored. Producers send messages to
topics, and consumers read from them.
1. Create a topic called logs:
   bin/kafka-topics.sh --create --topic logs --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
2. Verify the topic was created:
   bin/kafka-topics.sh --list --bootstrap-server localhost:9092
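You can also describe the topic to see its partition count, leader, and replica assignment; this is the same command used again in Step 7:
   bin/kafka-topics.sh --describe --topic logs --bootstrap-server localhost:9092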
Step 3: Create a Kafka Producer (Send Logs to Kafka)
We’ll simulate a microservice or application sending logs to Kafka.
1. Start a producer that sends messages to the logs topic:
   bin/kafka-console-producer.sh --topic logs --bootstrap-server localhost:9092
   After running the above command, type any message (e.g., log data like "INFO: User login successful") and hit Enter. This will send the message to Kafka. You can also pipe an existing file into the producer; see the sketch at the end of this step.
2. Automate Log Sending with a Bash Script:
   Create a script to simulate sending application logs to Kafka periodically.
   #!/bin/bash
   # log-sender.sh: publish a simulated log line to the logs topic every 5 seconds
   while true; do
     log="INFO: User login at $(date)"
     echo "$log" | bin/kafka-console-producer.sh --topic logs --bootstrap-server localhost:9092 > /dev/null
     sleep 5
   done
   Run this script in the background:
   chmod +x log-sender.sh
   ./log-sender.sh &
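Because the console producer reads from standard input, you can also replay an existing file into the topic, one message per line (the path below is only an illustrative placeholder):
   # Send each line of an existing log file as a separate Kafka message
   cat /path/to/app.log | bin/kafka-console-producer.sh --topic logs --bootstrap-server localhost:9092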
Step 4: Create a Kafka Consumer (Read Logs from Kafka)
Now that logs are being sent to Kafka, we’ll create a consumer that reads and
processes those logs.
1. Start a Kafka consumer:
   bin/kafka-console-consumer.sh --topic logs --bootstrap-server localhost:9092 --from-beginning
   This will print out all the messages that were sent to the logs topic, simulating the way a real-time log monitoring system works.
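In practice, monitoring consumers usually run as part of a consumer group so that Kafka tracks their offsets and the work can be shared across instances. A minimal variation of the same command (the group name log-monitor is just an example):
   # Consume as part of a consumer group; offsets are committed per group
   bin/kafka-console-consumer.sh --topic logs --bootstrap-server localhost:9092 --group log-monitor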
Step 5: Integrate Kafka with Monitoring Tools (Optional)
In a real DevOps scenario, you may want to forward Kafka logs to monitoring tools
like ELK Stack or Grafana for visualization and alerting.
1. Kafka to ELK: Use Logstash to pull logs from Kafka and store them in Elasticsearch, then visualize them using Kibana.
2. Kafka to Prometheus and Grafana: Use a Kafka Exporter to extract Kafka metrics, and visualize them in Grafana for real-time monitoring and alerts.
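As one possible starting point for the Prometheus and Grafana path, a Kafka exporter can be run on the same Docker network as the broker. The image, port, and flag below are assumptions based on the widely used danielqsj/kafka-exporter project, so check that project's documentation and adjust the broker address if your advertised listeners differ:
   # Expose broker metrics for Prometheus to scrape on port 9308
   docker run -d --name kafka-exporter --network kafka-net -p 9308:9308 \
     danielqsj/kafka-exporter --kafka.server=kafka:9092
Point a Prometheus scrape job at localhost:9308/metrics, then build Grafana dashboards and alerts on top of that data source.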
Step 6: Automating Real-Time Alerts
Kafka can be used to trigger automated alerts when logs indicate certain events
(e.g., high server load, errors).
1. Create an alert consumer: Write a consumer script that checks for error logs and triggers an alert (e.g., an email or Slack notification) based on log content; a webhook sketch follows this step.
   #!/bin/bash
   # log-alert.sh: watch the logs topic and flag any message containing "ERROR"
   bin/kafka-console-consumer.sh --topic logs --bootstrap-server localhost:9092 --from-beginning | \
   while read -r log; do
     if [[ $log == *"ERROR"* ]]; then
       echo "ALERT: Error found in logs - $log"
       # Trigger an alert here, e.g., via a mail or Slack API.
     fi
   done
2. Run the alert consumer:
   chmod +x log-alert.sh
   ./log-alert.sh
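To make the alert actionable, the echo inside the if block can be followed by a call to a Slack incoming webhook. A minimal sketch; the webhook URL is a placeholder you would generate in your own Slack workspace:
   # Post the offending log line to Slack (assumes $log contains no double quotes)
   SLACK_WEBHOOK_URL="https://hooks.slack.com/services/XXX/YYY/ZZZ"   # placeholder
   curl -s -X POST -H 'Content-type: application/json' \
     --data "{\"text\": \"ALERT: Error found in logs - $log\"}" \
     "$SLACK_WEBHOOK_URL"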
Step 7: Kafka Cluster Management (Scaling & Replication)
In a production DevOps environment, Kafka brokers are deployed across multiple
nodes for fault tolerance and scalability.
1. Scale Kafka Brokers: To scale Kafka, you can add more brokers to the cluster by running additional instances and connecting them to Zookeeper.
   Example of running an additional Kafka broker:
   docker run -d --name kafka2 --network kafka-net \
     -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
     -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9093 \
     -e KAFKA_BROKER_ID=2 \
     -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=2 \
     confluentinc/cp-kafka:latest
2. Replicate Topics Across Brokers: Kafka replicates each topic's partitions across brokers according to the topic's replication factor, which provides fault tolerance.
   Check the replication status:
   bin/kafka-topics.sh --describe --topic logs --bootstrap-server localhost:9092
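Once the second broker has joined the cluster, new topics can be created with a replication factor of 2 so that every partition has a copy on both brokers. A short sketch; the topic name logs-replicated is just an example:
   # Each of the 3 partitions gets a leader plus one follower replica
   bin/kafka-topics.sh --create --topic logs-replicated --partitions 3 --replication-factor 2 --bootstrap-server localhost:9092
   bin/kafka-topics.sh --describe --topic logs-replicated --bootstrap-server localhost:9092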
Step 8: Clean Up
After completing the lab, you may want to stop Kafka, Zookeeper, and any background
processes:
1. Stop Kafka and Zookeeper:
   Native:
   bin/kafka-server-stop.sh
   bin/zookeeper-server-stop.sh
   Docker:
   docker stop kafka zookeeper
2. Stop background log-sending and alert scripts:
   killall log-sender.sh
   killall log-alert.sh
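If you used the Docker method and also want to remove the stopped containers and the network, you can additionally run the following (add kafka2 and kafka-exporter to the rm command if you created them):
   docker rm kafka zookeeper
   docker network rm kafka-net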
Conclusion:
In this lab, you learned how to set up Kafka, send logs to Kafka from a simulated
application, consume logs in real-time, and create alerting mechanisms based on log
events. Kafka plays a vital role in DevOps for managing event streams, centralizing
logging, and automating processes, especially in large-scale, distributed systems.
By mastering Kafka, you can optimize communication between microservices, enhance
real-time monitoring, and implement scalable event-driven architectures for your
infrastructure.