12/5/23, 1:01 PM Kafka Basics and Core concepts.

d Core concepts. In this article we will cover the core… | by Aritra Das | inspiringbrilliance | Medium


Kafka Basics and Core concepts

Aritra Das · Follow
Published in inspiringbrilliance · 11 min read · Jan 17, 2021


Source: https://www.confluent.io/

Introduction
Let’s start by answering the question “What is Kafka?”.
https://medium.com/inspiredbrilliance/kafka-basics-and-core-concepts-5fd7a68c3193 1/20

Kafka is a Distributed Streaming Platform, or a Distributed Commit Log.

Let’s unpack that jargon.

Distributed
Kafka runs as a cluster of one or more nodes, which can live in different
data centers. Data and load are distributed across the nodes in the Kafka
cluster, which makes it inherently scalable, available, and fault-tolerant.

Streaming Platform

Kafka stores data as a continuous stream of records, which can be processed
in different ways.

Commit Log

This one is my favorite. When you push data to Kafka, it appends the data
to a stream of records, like appending lines to a log file or, if you come
from a database background, like a write-ahead log (WAL). This stream of
data can be “replayed” or read from any point in time.

Is Kafka a message queue?

It certainly can act as a message queue, but it’s not limited to that. It can
act as a FIFO queue, as a Pub/Sub messaging system, or as a real-time
streaming platform. And because of the durable storage capability of Kafka, it
can even be used as a database (read about it here).

Having said all of that, Kafka is commonly used for real-time streaming data
pipelines, i.e. to transfer data between systems, to build systems that
transform continuously flowing data, and to build event-driven systems.

We will jump into core Kafka concepts now.

Message
A message is the atomic unit of data for Kafka. Let’s say that you are building
a log monitoring system and you push each log record into Kafka. Your log
message is a JSON document with two keys, “level” and “message”.

When you push this JSON into Kafka you are actually pushing one message.
Kafka saves this JSON as a byte array, and that byte array is a message for
Kafka. This is the atomic unit. That does not mean you can’t push anything
else into Kafka: you can push a String, an Integer, a JSON of a different
schema, or anything else that can be serialized to bytes, but we generally
push different types of messages into different topics (we will get to
topics soon).
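As a minimal sketch of the idea above, the snippet below turns a log record with the two keys into the byte array that would be handed to Kafka, and back again. The field values and the helper names `serialize` and `deserialize` are made up for illustration; no Kafka client is involved.

```python
import json

# A hypothetical log record with the two keys mentioned in the text.
log_record = {"level": "ERROR", "message": "disk usage above 90%"}


def serialize(record: dict) -> bytes:
    """Encode a record as UTF-8 JSON bytes, the form in which it would be sent."""
    return json.dumps(record).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Decode bytes read back from a topic into a record again."""
    return json.loads(payload.decode("utf-8"))


payload = serialize(log_record)
print(type(payload).__name__, len(payload))
print(deserialize(payload) == log_record)
```

Whatever the original type was, the broker only ever sees the bytes; turning them back into something meaningful is the job of the reader.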

Messages might have an associated “Key”, which is a piece of metadata used to
determine the destination partition (we will get to partitions soon as well)
for a message.

Topic
Topics, as the name suggests, are the logical categories of messages in Kafka:
each topic is a stream of the same type of data. Going back to our logging
system, let’s say it generates application logs, ingress logs, and database
logs and pushes them to Kafka for other services to consume. These three
types of logs can be logically divided into three topics: appLogs,
ingressLogs, and dbLogs. We create these three topics in Kafka; whenever
there’s an app log message, we push it to the appLogs topic, and we push
database logs to the dbLogs topic. This way we have logical segregation
between messages, sort of like having different tables for holding different
types of data.

Partitions
A partition is analogous to a shard in a database and is the core concept
behind Kafka’s scaling capabilities. Let’s say that our system becomes really
popular and there are now millions of log messages per second, so the node
that holds the appLogs topic can no longer hold all the incoming data. We
initially solve this by adding more storage to the node, i.e. vertical
scaling. But vertical scaling has its limits; once that threshold is reached
we need to scale horizontally, which means adding more nodes and splitting
the data between them. When we split the data of a topic into multiple
smaller streams, we call each of those smaller streams a “partition” of that
topic.

Source: Kafka The Definitive Guide

This image depicts the idea of partitions: a single topic has 4
partitions, and each of them holds a different set of data. The blocks you see
here are the different messages in each partition. Imagine the topic as an
array that, due to memory constraints, we have split into 4 smaller arrays.
When we write a new message to a topic, the relevant partition is selected and
the message is appended at the end of that array.

The offset of a message is its index in that array. The numbers on the blocks
in this picture denote the offset: the first block is at offset 0 and the
last block is at offset n-1, where n is the number of messages in the
partition. The performance of the system also depends on how you set up
partitions; we will look into that later in the article. (Please note that in
Kafka it is not an actual array but a symbolic one.)
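The array analogy can be sketched in a few lines. The `Partition` class below is a toy written for this illustration, not part of any Kafka client: it only shows that appending returns the next offset and that an offset is enough to find a message again.

```python
class Partition:
    """A toy append-only log: a message's offset is its position in the list."""

    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        """Add a record at the end and return the offset it was assigned."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int) -> bytes:
        """Return the record stored at the given offset."""
        return self._records[offset]


partition = Partition()
offsets = [partition.append(m) for m in (b"first", b"second", b"third")]
print(offsets)              # offsets are handed out in arrival order
print(partition.read(1))    # look a message up by its offset
```

Records are only ever added at the end, which is why existing offsets never change.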

Producer
A producer is the Kafka client that publishes messages to a Kafka topic. One
of the core responsibilities of the producer is to decide which partition
to send each message to. The producer decides the destination partition based
on various configuration parameters; let’s look a bit more into this.

1. No Key specified => When no key is specified in the message, the
producer picks the partition itself and tries to balance the total number
of messages across all partitions (round-robin in older clients, a
batch-wise “sticky” choice in newer ones).

2. Key Specified => When a key is specified with the message, the producer
hashes the key and maps the hash to a partition (by default, the hash
modulo the number of partitions). The same key always produces the same
hash, so it always lands on the same partition as long as the number of
partitions does not change. Note that this is plain hashing rather than
consistent hashing: adding partitions to a topic changes where keys map.
So let’s say in our logging system we use the source node ID as the key;
then the logs for the same node will always go to the same partition. This
is very relevant for the order guarantees of messages in Kafka, as we will
shortly see.

3. Partition Specified => You can hardcode the destination partition as well.

4. Custom Partitioning logic => We can write our own rules that decide the
partition.
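The first three cases listed above can be sketched as one small function. This is an illustration under stated assumptions, not the client’s real partitioner: `choose_partition` and its parameters are hypothetical names, `zlib.crc32` stands in for the hash a real client uses (the Java client’s default is a murmur2 hash), and a simple counter stands in for the load-balancing logic.

```python
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int,
                     explicit_partition: Optional[int] = None,
                     counter: int = 0) -> int:
    """Pick a destination partition for one message."""
    if explicit_partition is not None:
        # Case 3: the caller hardcoded the partition.
        return explicit_partition
    if key is None:
        # Case 1: no key, so spread messages out (round-robin stand-in).
        return counter % num_partitions
    # Case 2: same key -> same hash -> same partition.
    # crc32 is only a stand-in for the real hash function.
    return zlib.crc32(key) % num_partitions


# The same key maps to the same partition every time.
print(choose_partition(b"node-1", 3) == choose_partition(b"node-1", 3))
# Without a key, successive messages cycle through the partitions.
print([choose_partition(None, 3, counter=i) for i in range(6)])
```

Because the result depends on `num_partitions`, changing the partition count of a topic can move a key to a different partition.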

Consumer
So far we have produced messages; to read those messages we use a Kafka
consumer. A consumer reads messages from a partition in order. So if 1, 2, 3, 4
were written to a partition, the consumer will read them in the same order.
Since every message has an offset, the consumer records the offset of the last
message it has read by committing it to Kafka (older clients stored it in
Zookeeper). So if a consumer node goes down, it can come back and resume from
the last committed position. Also, if at any point a consumer needs to go back
in time and read older messages, it can do so by resetting the offset
position.
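Resuming and replaying can be illustrated with a toy consumer over a plain list. Everything here is hypothetical: `OffsetTrackingConsumer` is not a Kafka class, and the `store` dictionary merely stands in for wherever committed offsets are kept.

```python
class OffsetTrackingConsumer:
    """Toy consumer that reads a list-backed partition and remembers its position."""

    def __init__(self, log, committed):
        self.log = log
        self.committed = committed          # shared store of committed offsets
        self.position = committed["next"]   # start where the last commit left off

    def poll(self):
        """Return the next unread message, or None when caught up."""
        if self.position >= len(self.log):
            return None
        message = self.log[self.position]
        self.position += 1
        return message

    def commit(self):
        """Record the position so a restarted consumer can resume from it."""
        self.committed["next"] = self.position

    def seek(self, offset):
        """Move the read position, for example back to 0 to replay old messages."""
        self.position = offset


log = ["m0", "m1", "m2", "m3"]
store = {"next": 0}

first = OffsetTrackingConsumer(log, store)
first.poll()
first.poll()
first.commit()                                   # m0 and m1 are done

restarted = OffsetTrackingConsumer(log, store)   # simulates a restart
print(restarted.poll())                          # resumes at "m2"

restarted.seek(0)
print(restarted.poll())                          # replays "m0"
```

Anything read but not yet committed would be read again after a restart, which is why the timing of the commit matters.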

Consumer Group
A consumer group is a collection of consumers that work together to read
messages from a topic. There are some very interesting concepts here; let’s
go through them.

1. Fan out exchange => A single topic can be subscribed to by multiple
consumer groups. Let’s say that you are building an OTP service and need to
send the OTP by both text and email. Your OTP service can put the OTP in
Kafka, and then the SMS Service consumer group and the Email Service consumer
group can both receive the message and send out the SMS and the email.

2. Order guarantee => We have seen that a topic can be partitioned and that
multiple consumers can consume from the same topic, so one might ask how the
order of messages is maintained on the consumer end. Good question. Within a
consumer group, a partition is never read by more than one consumer: the
group ensures that only one of its consumers reads from any given partition.
Let me explain.

So your producer produces 6 messages. Each message is a key-value pair: for
key “A” the value is “1”, for “C” the value is “1”, for “B” the value is “1”,
for “C” the value is “2” ….. for “B” the value is “2”. (Please note that by key
I mean the message key that we discussed earlier and not a JSON or Map key.)
Our topic has 3 partitions, and because the partition is chosen by hashing
the key, messages with the same key always go to the same partition, so all
the messages with “A” as the key end up together, and the same holds for B
and C. As each partition is read by only one consumer in the group, that
consumer gets its messages in order. So the consumer will receive A1 before A2
and B1 before B2, and thus the order is maintained, tada 🎉. Going back to
our logging system example, where the keys are the source node IDs, all the
logs for node1 will always go to the same partition. And since the messages
always go to the same partition, their order is maintained.

This would not be possible if the same partition had multiple consumers in the
same group. If the same partition is read by consumers that belong to
different groups, the messages still arrive in order within each consumer
group.
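The six-message example can be replayed in a short simulation. It makes two assumptions: the fifth message, which the text does not spell out, is taken to be A2 (A2 is mentioned afterwards), and `zlib.crc32` is a stand-in for the real key hash. Which partition each key lands on is therefore arbitrary; the point is only that per-key order survives.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3


def partition_for(key: str) -> int:
    # crc32 is a stand-in hash; any deterministic hash shows the same effect.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


# Messages in the order the producer sends them: (key, value).
produced = [("A", 1), ("C", 1), ("B", 1), ("C", 2), ("A", 2), ("B", 2)]

partitions = defaultdict(list)
for key, value in produced:
    partitions[partition_for(key)].append((key, value))

# Each partition is read front to back by a single consumer, so the values
# for any one key come out in the order they went in.
for number, records in sorted(partitions.items()):
    print(number, records)
```

Note that order is only guaranteed within a partition; messages with different keys may be seen in a different relative order than they were produced.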

So for 3 partitions, you can have at most 3 active consumers in a group; if you
had 4 consumers, one of them would sit idle. With 3 partitions and 2
consumers, one consumer reads from one partition and the other reads from two.
If one consumer goes down in this case, the surviving consumer ends up reading
from all three partitions, and when new consumers are added back, the
partitions are split between the consumers again. This is called
re-balancing.
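The scenarios in this paragraph can be walked through with a toy group that re-divides partitions whenever a member joins or leaves. `ConsumerGroup` is a hypothetical class for illustration, and its way of dividing partitions is a simplification rather than the strategy a real group coordinator uses.

```python
class ConsumerGroup:
    """Toy group that re-divides partitions whenever membership changes."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.members = []
        self.assignment = {}

    def _rebalance(self):
        count = len(self.members)
        # Member i takes every count-th partition starting at i; members
        # beyond the number of partitions end up with an empty list (idle).
        self.assignment = {
            member: self.partitions[i::count]
            for i, member in enumerate(self.members)
        }

    def join(self, member):
        self.members.append(member)
        self._rebalance()

    def leave(self, member):
        self.members.remove(member)
        self._rebalance()


group = ConsumerGroup(partitions=[0, 1, 2])
for name in ("c1", "c2", "c3", "c4"):
    group.join(name)
print(group.assignment)   # c4 is idle: there are only 3 partitions

group.leave("c4")
group.leave("c3")
print(group.assignment)   # two consumers share three partitions

group.leave("c2")
print(group.assignment)   # the last consumer reads all three
```

In every state each partition has exactly one owner, which is what preserves the ordering guarantee within the group.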

Source: Kafka The Definitive Guide

Broker
A broker is a single Kafka server. A broker receives messages from producers,
assigns offsets to them, and then commits them to the partition log, which
basically means writing the data to disk. This gives Kafka its durable nature.

Cluster
A Kafka cluster is a group of broker nodes working together to provide
scalability, availability, and fault tolerance. One of the brokers in a cluster
works as the Controller, which assigns partitions to brokers, monitors for
broker failures, and performs other administrative tasks.

In a cluster, partitions are replicated on multiple brokers, depending on the
replication factor of the topic, to provide failover capability. What I mean
is, for a topic with replication factor 3, each partition of that topic will
live on 3 different brokers. When a partition is replicated onto 3 brokers,
one of the brokers acts as the leader for that partition and the other two
are followers. Data is always written to the leader broker and then
replicated to the followers. This way we lose neither data nor availability
of the cluster, and if the leader goes down, another leader is elected.

Let’s look at a practical example. I am running a 5 node Kafka cluster locally
and I run this command:

kafka-topics --create --zookeeper zookeeper:2181 --topic applog --partitions 5 --replication-factor 3

If we break down the command, it becomes

1. Create a topic

2. Create 5 partitions of that topic

3. And replicate the data of each of the 5 partitions onto 3 nodes

This screenshot describes the topic we just created.

Let’s take Partition 0: the leader node for this partition is node 2. The data
for this partition is replicated on nodes 2, 5, and 1. So one partition is
replicated on 3 nodes, and this behavior is repeated for all 5 partitions.
Also, if you look closely, the leader nodes for the partitions are all
different. To utilize the nodes properly, the Kafka controller broker has
distributed the partitions evenly across all nodes. You can also observe that
the replicas are evenly distributed and no node is overloaded. All of this is
done by the controller broker with the help of Zookeeper.
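To get a feel for how 5 partitions with replication factor 3 can be spread evenly over 5 brokers, here is a simplified placement sketch. `place_replicas` is a hypothetical helper using plain round-robin; it is not the algorithm the controller actually runs, so its output will not match the screenshot node for node.

```python
def place_replicas(num_partitions, replication_factor, brokers):
    """Spread replicas over brokers; the first replica in each list is the leader.

    Simplified round-robin placement for illustration only.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    layout = {}
    for partition in range(num_partitions):
        layout[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return layout


layout = place_replicas(num_partitions=5, replication_factor=3,
                        brokers=[1, 2, 3, 4, 5])
for partition, replicas in layout.items():
    print(f"partition {partition}: leader={replicas[0]} replicas={replicas}")
```

With this placement every broker leads exactly one partition and holds three replicas in total, which mirrors the even distribution described above.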

Now that you understand clustering, you can see how to scale: we could
partition a topic even further and add a dedicated consumer node for each
partition, and that way we can scale horizontally.

Zookeeper
At the time of writing, Kafka does not function without Zookeeper (there are
plans to remove the Zookeeper dependency in the near future). Zookeeper works
as the central configuration and consensus management system for Kafka. It
tracks the brokers, topics, partition assignments, and leader election:
basically all the metadata about the cluster.

And these, my friend, were the basic and core concepts of Kafka.

Beyond basics
There are a few slightly more advanced things that you should know about. I
will not go into details and will just touch upon them, because I don’t want
to overload you with too much information in one shot.

Producer

You can send messages to Kafka in 3 ways:

Fire and forget

Synchronous send

Asynchronous send

Each of them has its own performance vs consistency trade-offs.
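The three styles differ only in what the caller does with the result of a send. The sketch below imitates that with Python’s standard `concurrent.futures`; `FakeProducer` is a made-up stand-in whose `send()` returns a future, similar in spirit to real clients, and no broker is involved.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class FakeProducer:
    """Stand-in for a producer whose send() returns a future for the result."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._next_offset = 0

    def _deliver(self, message):
        # Pretend the broker stored the message and assigned it an offset.
        offset = self._next_offset
        self._next_offset += 1
        return offset

    def send(self, message) -> Future:
        return self._pool.submit(self._deliver, message)

    def close(self):
        self._pool.shutdown(wait=True)


producer = FakeProducer()
acked = []

# 1. Fire and forget: ignore the returned future entirely.
producer.send("a")

# 2. Synchronous send: block until the result (or an exception) is available.
offset = producer.send("b").result()

# 3. Asynchronous send: continue immediately, handle the result in a callback.
producer.send("c").add_done_callback(lambda f: acked.append(f.result()))

producer.close()   # waits for outstanding sends
print(offset, acked)
```

Fire and forget never learns about failures, the synchronous style pays one round trip per message, and the asynchronous style keeps throughput high while still observing the outcome.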

You can configure the acknowledgment behavior on the producer as well.

ACK 0: Don’t wait for an ack | FASTEST

ACK 1: Consider the message sent when the leader broker has received it | FASTER

ACK All: Consider the message sent when all in-sync replicas have received it | FAST
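The difference between the three settings can be expressed as a small decision function. `is_acknowledged` and its parameters are hypothetical names for this illustration; the function models the rule only, not the network protocol.

```python
def is_acknowledged(acks: str, leader_has_it: bool,
                    in_sync_followers_have_it: list) -> bool:
    """Decide whether a send counts as successful under a given acks setting."""
    if acks == "0":
        return True                      # never waits for the broker
    if acks == "1":
        return leader_has_it             # the leader's write is enough
    if acks == "all":
        return leader_has_it and all(in_sync_followers_have_it)
    raise ValueError(f"unsupported acks value: {acks!r}")


# The leader has written the message; one of two followers is still catching up.
state = dict(leader_has_it=True, in_sync_followers_have_it=[True, False])
for setting in ("0", "1", "all"):
    print(setting, is_acknowledged(setting, **state))
```

In this state the first two settings already report success, while “all” keeps waiting, which is exactly the speed versus safety trade-off the labels describe.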

You can compress and batch messages on the producer before sending them to
the broker. This gives higher throughput and lowers disk usage, but raises
CPU usage.
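Why batching and compression work well together can be seen without Kafka at all. The sketch below compresses 100 made-up log messages one by one and then as a single batch, using `zlib` from the standard library as a stand-in for whichever codec the producer is configured with.

```python
import json
import zlib

# Hypothetical log messages; real payloads will compress differently.
messages = [
    json.dumps({"level": "INFO", "message": f"request {i} served"}).encode("utf-8")
    for i in range(100)
]

raw_size = sum(len(m) for m in messages)

# Compress every message on its own.
individual_size = sum(len(zlib.compress(m)) for m in messages)

# Compress the whole batch at once, so repeated structure is shared.
batched_size = len(zlib.compress(b"\n".join(messages)))

print(raw_size, individual_size, batched_size)
```

Small messages barely shrink on their own, while a batch of similar messages compresses far better because the repeated field names are encoded once. The CPU cost is the price for that saving.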


Avro Serializer/ Deserializer

If you use Avro as the serializer/deserializer instead of plain JSON, you
have to declare your schema upfront, but this gives better performance
and saves storage.

Consumer

Poll loop

A Kafka consumer constantly polls the broker for data; the broker does not
push data to the consumer.

You can configure the partition assignment strategy:

Range: Each consumer gets a consecutive range of partitions

Round Robin: Partitions are handed out to consumers one at a time, in turn

Sticky: Tries to minimize the impact of rebalancing by keeping most of the
assignment as is

Cooperative sticky: Sticky, but allows cooperative rebalancing
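The difference between the first two strategies is easiest to see side by side. The two functions below are simplified, single-topic sketches with hypothetical names; the real assignors also handle multiple topics and consumer metadata.

```python
def range_assign(partitions, consumers):
    """Give each consumer a consecutive block; earlier consumers absorb the remainder."""
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for index, consumer in enumerate(consumers):
        count = per_consumer + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


def round_robin_assign(partitions, consumers):
    """Hand partitions out one at a time, cycling through the consumers."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        assignment[consumers[index % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3, 4]
print(range_assign(partitions, ["c1", "c2"]))        # {'c1': [0, 1, 2], 'c2': [3, 4]}
print(round_robin_assign(partitions, ["c1", "c2"]))  # {'c1': [0, 2, 4], 'c2': [1, 3]}
```

Both produce a balanced split here. The sticky variants aim for a similar balance while moving as few partitions as possible when membership changes.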

Batch size

We can configure how many records and how much data are returned per poll
call (for example, with the consumer settings max.poll.records and
fetch.max.bytes).
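A poll loop with a cap on records per call can be sketched over a plain list. The `poll` function and its `max_records` parameter are illustrative names, not a client API; a real consumer would also keep polling after it has caught up instead of stopping.

```python
def poll(log, position, max_records):
    """Return up to max_records messages starting at position, plus the new position."""
    batch = log[position:position + max_records]
    return batch, position + len(batch)


log = [f"msg-{i}" for i in range(7)]
position = 0
batches = []

# The consumer keeps asking for more; the broker never pushes on its own.
while True:
    batch, position = poll(log, position, max_records=3)
    if not batch:
        break          # a real consumer would keep polling for new data
    batches.append(batch)

print([len(b) for b in batches])   # [3, 3, 1]
```

A smaller cap means more round trips but less work per iteration; a larger cap does the opposite.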

Commit offset

After reading messages we can update the offset position for the consumer;
this is called committing the offset. Auto commit can be enabled, or the
application can commit the offset explicitly, either synchronously or
asynchronously.

Ending notes
Kafka is a great piece of software with tons of capabilities, and it can be
used in a wide variety of use cases. Kafka fits well into modern-day
distributed systems because it is distributed by design. It was originally
developed at LinkedIn and is now an open-source Apache Software Foundation
project, with Confluent among its major contributors. It is used by
top tech companies like Uber, Netflix, Activision, Spotify, Slack, Pinterest,
and Coursera. We looked into the core concepts of Kafka to get you started.
There are tons of other things, like the Kafka Streams API or ksqlDB (formerly
KSQL), that we did not talk about in the interest of time.

References
1. Kafka the Definitive Guide

2. https://www.confluent.io/blog/apache-kafka-intro-how-kafka-works/

[This article was originally published on hackernoon]

Thanks for reading!!!


I am Aritra Das, I work as a Software Developer, and really enjoy building
Distributed Systems. Feel free to reach out to me on Linkedin or Twitter for
anything related to tech.

Happy learning 😃

