
Introduction To Apache Kafka

The document provides an overview of Apache Kafka including what it is, why it is used, how it works, its architecture and components, use cases, a comparison to RabbitMQ, and a demonstration. Kafka is an open-source distributed event streaming platform that publishes and subscribes to streams of records in a fault-tolerant way and is widely used by companies like Twitter, LinkedIn, and Netflix.

Uploaded by

Bhavin Bhadran

Introduction to Apache Kafka

Presented by: Vaisakh Mohanan
Overview

• What is Apache Kafka?
• Why Kafka?
• How Kafka Works
• Kafka Architecture and Components
• Kafka Use Cases
• Kafka vs. RabbitMQ
• Demo
What is Apache Kafka?

• Apache Kafka is an open-source distributed messaging and event streaming platform
• Originally developed at LinkedIn and open sourced in early 2011
• Written in Scala and Java
• Uses a publish-subscribe (Pub-Sub) mechanism for message communication
• Uses a binary TCP-based protocol that is optimized for efficiency
Why Kafka?

Event Streaming: Definition

• The practice of capturing data in real time, as streams of events, from event sources such as databases, cloud services, and applications
• Storing, manipulating, processing, and reacting to the event streams in real time as well as retrospectively
• Ensures the right information is at the right place, at the right time

Event Streaming: Applications

• Real-time financial transactions: stock exchanges, banks, insurance
• Data platforms, event-driven architectures, and microservices
• Capturing and analyzing sensor data (IoT)
• Logistics and the automotive industry
Why Kafka?

• Highly efficient: able to handle high-velocity, high-volume data with ordered delivery within each partition
• Fault tolerant: the Kafka architecture makes message communication and event processing robust and reliable, guarding against data loss
• Highly durable: Kafka persists messages and replicates them across brokers
• Highly scalable: supports future business growth through horizontal scaling in a distributed, clustered environment
How Kafka Works

• Kafka adopts a Pub-Sub messaging architecture

[Diagram: a Publisher (Message Producer) sends messages to the Message Brokers of a Kafka Cluster, and a Subscriber (Message Consumer) receives them from a Broker]
Kafka Architecture as a Messaging System

[Diagram: Producers A, B, and C push messages to a Kafka Cluster made up of Brokers 1, 2, and 3, which hosts Topic X and Topic Y across partitions P1 to P4. Consumers A and B (Consumer Group 1) and Consumers C and D (Consumer Group 2) pull messages from the cluster. Zookeeper is used to get Kafka broker IDs and to update offsets.]
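
The consumer-group idea in the architecture diagram can be illustrated with a small sketch: within one group, each partition is read by exactly one consumer, while separate groups each receive the full set of partitions. The round-robin rule, the function name assign_partitions, and the assumption that both groups read all four partitions are hypothetical simplifications for illustration, not Kafka's actual assignment logic.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin.

    Simplified stand-in for a group assignor: every partition gets exactly
    one owner within the group, and a consumer may own several partitions.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = defaultdict(list)
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


# Hypothetical layout loosely based on the diagram: four partitions, two groups.
partitions = ["P1", "P2", "P3", "P4"]
group_1 = assign_partitions(partitions, ["Consumer A", "Consumer B"])
group_2 = assign_partitions(partitions, ["Consumer C", "Consumer D"])
```

Because each group gets its own assignment, adding a second group does not take messages away from the first; it only adds another independent reader of the same data.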
Kafka Components

1. Topic
 • A unique name for a Kafka stream
 • A category or feed name to which records or messages are published and stored
 • Always multi-subscriber
 • Has partitions: each partition is an ordered, immutable sequence of records that is continually appended to a structured commit log, with every record identified by its offset

2. Kafka Producer
 • Publishes messages to a Kafka topic
 • Responsible for choosing which partition within the topic each record is assigned to
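
A minimal in-memory sketch of the two components above, assuming a topic is a list of append-only partition logs and the producer picks a partition by hashing the record key. The class names, the "orders" topic, and the use of crc32 are assumptions made for illustration; Kafka's own partitioner uses a different hash.

```python
import zlib


class Partition:
    """Append-only log; a record's offset is its position in the log."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read(self, offset):
        return self._records[offset]


class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def publish(self, key, value):
        """Route by key so records with the same key stay in order."""
        # crc32 is a deterministic stand-in; it is NOT Kafka's real hash.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        offset = self.partitions[index].append((key, value))
        return index, offset


# Hypothetical topic and key names.
topic = Topic("orders", num_partitions=3)
first = topic.publish("customer-42", "created")
second = topic.publish("customer-42", "paid")
```

Both records share a key, so they land in the same partition with consecutive offsets, which is what preserves their relative order.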
Kafka Components

3. Kafka Consumer
 • Subscribes to topics
 • Reads and processes messages from the topics by pulling them

4. Kafka Broker
 • Manages the storage of messages in the topics
 • A group of brokers working together forms a Kafka cluster

5. Zookeeper
 • Provides the brokers with metadata about the processes running in the system
 • Facilitates health checking, management, and coordination
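
The pull behaviour of the consumer can be sketched as follows, assuming a plain Python list stands in for one partition stored on a broker. The Consumer class and its poll and commit methods are hypothetical names chosen to mirror the idea, not the real client API.

```python
class Consumer:
    """Pull-based reader that tracks its own position in a partition log."""

    def __init__(self, log):
        self._log = log       # shared list standing in for one partition
        self.position = 0     # next offset to fetch
        self.committed = 0    # position last recorded as processed

    def poll(self, max_records=2):
        """Fetch up to max_records starting at the current position."""
        batch = self._log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self):
        """Record how far processing has progressed."""
        self.committed = self.position


log = ["m0", "m1", "m2"]       # records already stored by the broker
consumer = Consumer(log)
first_batch = consumer.poll()  # the consumer asks; the broker does not push
consumer.commit()
log.append("m3")               # a producer keeps appending in the meantime
second_batch = consumer.poll()
```

The broker side here is only a list, while all bookkeeping lives in the consumer, so a consumer can stop and later resume from its committed position.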
Kafka Core APIs

• Producer API: allows an application to publish a stream of records to one or more Kafka topics
• Consumer API: allows an application to subscribe to one or more topics and process the stream of records produced to them
• Streams API: allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams into output streams
• Connector API: allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems
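
The stream-processor role described for the Streams API can be sketched in a few lines, assuming plain lists stand in for the input and output topics. The function name process_stream and the sample records are hypothetical; the real API is a Java library with a much richer set of operations.

```python
def process_stream(input_records, transform):
    """Consume each input record, apply transform, and yield the output."""
    for key, value in input_records:
        yield key, transform(value)


# Plain lists stand in for an input topic and an output topic.
input_topic = [("user-1", "login"), ("user-2", "logout")]
output_topic = list(process_stream(input_topic, str.upper))
```

Keys pass through unchanged and only the values are transformed, so records for the same key keep their relative order in the output.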
Kafka Use Cases

Messaging
 • A replacement for a traditional message broker
 • Kafka offers better throughput, built-in partitioning, replication, and fault tolerance
 • A good solution for large-scale message processing applications

Metrics
 • Used for operational monitoring data
 • Includes aggregating statistics from distributed applications to produce centralized feeds of operational data

Event Sourcing
 • An excellent backend for event-sourced applications, because it supports very large stored log data
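
As a rough illustration of event sourcing, the sketch below rebuilds state by replaying a stored log of events from the beginning. The account example, the event names, and the replay function are all hypothetical; a list stands in for the events retained in a topic.

```python
def replay(events, initial_balance=0):
    """Rebuild the current state by applying every stored event in order."""
    balance = initial_balance
    for kind, amount in events:
        if kind == "deposit":
            balance += amount
        elif kind == "withdraw":
            balance -= amount
        else:
            raise ValueError(f"unknown event type: {kind}")
    return balance


# Hypothetical event log, oldest event first.
event_log = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]
current_balance = replay(event_log)
balance_after_two = replay(event_log[:2])
```

Since the log itself is the source of truth, any earlier state can be recovered by replaying only a prefix of it.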
Kafka vs. RabbitMQ

Messaging mechanism
 • Kafka: works on the Pub-Sub principle
 • RabbitMQ: works using exchanges (Direct, Fanout, Topic, and Headers exchanges)

Usage scenarios
 • Kafka: fits high-flux event streaming use cases
 • RabbitMQ: fits well as a traditional message broker in a loosely coupled messaging environment

Message forwarding model
 • Kafka: pull model, where consumers act smart while the Kafka broker acts dumb
 • RabbitMQ: push model, with dumb consumers and a smart broker

Performance rate
 • Kafka: about 100,000 messages/second
 • RabbitMQ: about 20,000 messages/second
Kafka vs. RabbitMQ

Message retention
 • Kafka: keeps messages in the commit log according to a policy-based retention mechanism
 • RabbitMQ: uses acknowledgement-based retention, removing a message once it has been acknowledged

Data flow
 • Kafka: unbounded, continuous data in the form of key-value pairs
 • RabbitMQ: distinct, bounded data packets in the form of messages
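
The two retention styles can be contrasted with a small sketch. It assumes a purely time-based policy and per-record deletion to keep things short; Kafka itself removes data per log segment and also supports size-based limits, and the function names here are hypothetical.

```python
def apply_time_retention(log, now_ms, retention_ms):
    """Policy-based: drop records older than the retention window,
    whether or not any consumer has read them."""
    return [(ts, value) for ts, value in log if now_ms - ts <= retention_ms]


def acknowledge(queue, message):
    """Acknowledgement-based: remove a message once a consumer acks it."""
    return [m for m in queue if m != message]


log = [(1_000, "a"), (5_000, "b"), (9_000, "c")]  # (timestamp_ms, value)
kept = apply_time_retention(log, now_ms=10_000, retention_ms=6_000)

queue = ["a", "b", "c"]
remaining = acknowledge(queue, "a")
```

Under the policy-based style a record can be read again by any consumer until it ages out, whereas under the acknowledgement-based style it is gone as soon as it has been consumed.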

• Which is better, Kafka or RabbitMQ?


Demo

Demo of a simple Pub-Sub mechanism used in Kafka

• Download Apache Kafka (kafka_2.13-3.3.1.tgz, built for Scala 2.13) from the following link:
  https://kafka.apache.org/downloads
• Unzip the package
• To start Apache Kafka:
  • Open a command prompt and start the built-in Zookeeper:
    .\bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties
  • Open a new command prompt and start the Apache Kafka broker:
    .\bin\windows\kafka-server-start.bat .\config\server.properties
Demo

• Open another command prompt and create a topic with one partition and one replica:
  .\bin\windows\kafka-topics.bat --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic mySampleKafkaDemo
• Open a new command prompt, start a producer for the topic created above, and send a message to it:
  .\bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic mySampleKafkaDemo
• Finally, open a new command prompt and start a consumer that listens to the topic created above. It prints the message sent by the producer:
  .\bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic mySampleKafkaDemo --from-beginning
Conclusion

• Apache Kafka is an efficient message streaming platform
• It lets applications publish and subscribe to streams of events in a fault-tolerant manner
• It stores events durably
• Kafka arrived at the right time, captured mindshare among developers, and exploded in popularity
• It is used at Twitter, LinkedIn, Netflix, and other companies
Thank You ...
