Introduction To Apache Kafka and Its Setup
This document is licensed with a Creative Commons Attribution 4.0 International License ©2017
Learning Outlines
What is a Messaging System?
• Distributed messaging:
• Point to point
• Publish-subscribe (pub-sub)
Point to Point Messaging System
[Figure: a Sender passes a Message through a Queue to a Receiver.]
Publish-Subscribe Messaging System
[Figure: a Producer publishes messages; a Consumer subscribes to receive them.]
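The difference between the two messaging models can be illustrated with a small in-memory sketch. This is a toy model only, not how Kafka or any real broker is implemented; the class names PointToPointQueue and PubSubTopic and the subscriber names are hypothetical, chosen for illustration.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message is delivered to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Taking a message removes it, so no other receiver can see it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Toy topic: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        # Deliver the message to every current subscriber.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def inbox(self, name):
        return list(self._inboxes[name])


# Point to point: the second receive finds the queue empty.
queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()
second = queue.receive()

# Publish-subscribe: both subscribers see the same message.
topic = PubSubTopic()
topic.subscribe("billing")
topic.subscribe("shipping")
topic.publish("order-1")

print(first, second)
print(topic.inbox("billing"), topic.inbox("shipping"))
```

In the queue, a message is consumed once and then gone; in the topic, each subscriber receives every published message independently.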
Apache Kafka
● Kafka is used as an enterprise messaging system that decouples source
and target systems so that they can exchange data.
● Kafka provides high throughput through partitions and fault tolerance
through replication.
Source: https://dzone.com/articles/introduction-to-apache-kafka-1
Apache Kafka
[Figure: two Producers send data to a Kafka Cluster; three Consumers receive data from it.]
Apache Kafka
[Figure: a Producer writes (pushes) data to Partition 1 and Partition 2, hosted on Server 1 and Server 2 with replicas (follower); Consumers read (pull) data; within a partition, messages are ordered from old to new.]
Apache Kafka: Terminology
Source: https://dzone.com/articles/introduction-to-apache-kafka-1
Apache Kafka: Terminology
Below are some points we need to remember when working with partitions.
● Topics are identified by name. We can have many named topics in a
cluster.
● The order of messages is maintained within each partition, not across
the topic as a whole.
● Once data is written to a partition, it is not overwritten. This is
called immutability.
● The messages in partitions are stored with keys, values, and
timestamps. With the default partitioner, Kafka publishes messages
that have the same key to the same partition.
● Within the Kafka cluster, each partition has a leader that handles
read and write operations for that partition.
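The points above can be sketched with a small in-memory model. This is a toy illustration under stated assumptions, not Kafka's implementation: the class ToyTopic is hypothetical, and CRC32 is used only because it is deterministic (Kafka's default partitioner uses a different hash function).

```python
import time
import zlib


class ToyTopic:
    """Hypothetical in-memory stand-in for a topic with several partitions."""

    def __init__(self, name, num_partitions):
        self.name = name  # topics are identified by name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 modulo the partition count. The same key
        # always maps to the same partition while the count is unchanged.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        partition = self.partition_for(key)
        log = self.partitions[partition]
        offset = len(log)  # next position in this partition's log
        # Records are only ever appended, never overwritten.
        log.append((key, value, time.time()))
        return partition, offset


topic = ToyTopic("test", num_partitions=3)
a = topic.append("user-1", "login")
b = topic.append("user-2", "login")
c = topic.append("user-1", "logout")

# Values for "user-1", read back in partition order.
user1_values = [v for k, v, _ in topic.partitions[a[0]] if k == "user-1"]
print(a, b, c, user1_values)
```

Both records for user-1 land in the same partition, and the later record gets the higher offset, so a reader of that partition sees them in the order they were written. No such ordering is implied between records in different partitions.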
Apache Kafka: Example
Source: https://dzone.com/articles/introduction-to-apache-kafka-1
Kafka Consumer Group
Source: http://cloudurable.com/blog/kafka-architecture-consumers/index.html
Two-Server Kafka Cluster Hosting Six Partitions (P0-P5)
Source: http://cloudurable.com/blog/kafka-architecture-consumers/index.html
Zookeeper
http://zookeeper.apache.org/
Kafka Architecture
[Figure: Producers write (push) data to a Kafka Cluster of Broker 1, Broker 2, and Broker 3; Consumers read (pull) data; Zookeeper is labeled with "get Kafka broker id" and "Update Offset".]
Why Kafka?
Kafka Setup
Ref: https://kafka.apache.org/quickstart
Kafka Setup (Cont’d)
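Running the following command lists the existing topics, so you will see the topic name (on Kafka 3.0 and later, pass --bootstrap-server localhost:9092 instead of --zookeeper localhost:2181):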
> bin/kafka-topics.sh --list --zookeeper localhost:2181
test
Kafka Setup (Cont’d)
Kafka also has a command-line consumer that dumps messages to
standard output.
Next Lessons
Reference
Apache Kafka:
• https://kafka.apache.org/
• https://dev.to/de_maric/what-is-a-consumer-group-in-kafka-49il
• https://blog.cloudera.com/scalability-of-kafka-messaging-using-consumer-groups/