Apache Kafka
Apache Kafka was originally developed at LinkedIn in 2010 and was later donated to the Apache Software
Foundation. It is maintained today as an Apache project, with Confluent as a major contributor.
Messaging System
1. Point-to-Point System
In a point-to-point system, messages are persisted in a queue, and each message is consumed by at most one
consumer, although multiple consumers can read from the same queue. The typical example of this system is an
order processing system, where each order will be processed by one order processor, but multiple order
processors can work at the same time.
2. Publish-Subscribe System
Messages are persisted in a topic. Unlike the point-to-point system, consumers can subscribe to one or more topics and
consume all the messages in those topics. In the publish-subscribe system, message producers are called publishers
and message consumers are called subscribers.
A real-life example is Dish TV, which publishes different channels such as sports, movies, and music; anyone can
subscribe to their own set of channels and receive new content whenever it is published on those channels.
Apache Kafka as a Messaging System
Apache Kafka Architecture
Kafka is a distributed, replicated commit log. Kafka does not have the concept of a queue, which might seem strange at
first, given that it is primarily used as a messaging system. Queues have been synonymous with messaging systems for a
long time. Let’s break down “distributed, replicated commit log” a bit:
Distributed because Kafka is deployed as a cluster of nodes, for both fault tolerance and scale.
Replicated because messages are usually replicated across multiple nodes (servers).
Commit log because messages are stored in partitioned, append-only logs called topics. This concept of a log
is the principal killer feature of Kafka.
Kafka is so powerful in terms of throughput and scalability that it allows you to handle continuous streams of messages.
Apache Kafka Workflow
Following is the step-wise workflow of Pub-Sub messaging:
• Producers send messages to a topic at regular intervals.
• The Kafka broker stores all messages in the partitions configured for that particular topic and distributes
them evenly between partitions (when no message key is set). If the producer sends two messages and there are
two partitions, Kafka will store one message in the first partition and the second message in the second partition.
• The consumer subscribes to a specific topic.
• Once the consumer subscribes to a topic, Kafka provides the current offset of the topic to the consumer and
also saves the offset in ZooKeeper. (Older Kafka versions stored consumer offsets in ZooKeeper; newer versions
store them in an internal Kafka topic.)
• The consumer requests new messages from Kafka at a regular interval (e.g., 100 ms).
• Once Kafka receives messages from producers, it forwards them to the consumers.
• The consumer receives each message and processes it.
• Once the messages are processed, the consumer sends an acknowledgement to the Kafka broker.
• Once Kafka receives an acknowledgement, it advances the offset to the new value and updates it in
ZooKeeper. Since offsets are maintained in ZooKeeper, the consumer can resume from the last committed offset
even after a failure or restart.
• The above flow repeats until the consumer stops requesting messages.
• The consumer can rewind or skip to any desired offset of a topic at any time and read all the subsequent
messages (see the consumer sketch below).
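To make this flow concrete, here is a minimal sketch of a consumer poll loop using the plain Java client. It assumes a broker on localhost:9092, the topic name used later in these slides, and a hypothetical group id demo-group:
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class WorkflowConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "demo-group");
        // Commit manually, mirroring the "acknowledgement" step described above.
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("developervisits-topic"));
            while (true) {
                // Poll at a regular interval (here 100 ms) for new messages.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                // Committing advances the stored offset, like the acknowledgement above.
                consumer.commitSync();
            }
        }
    }
}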
Apache Kafka Core API
1. Topic
A topic is a unique name for a Kafka stream: a category or feed name to which records are published and in
which messages are stored. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or
many consumers that subscribe to the data written to it.
The Kafka cluster durably persists all published records—whether or not they have been consumed—
using a configurable retention period. For example, if the retention policy is set to two days, then for the
two days after a record is published, it is available for consumption, after which it will be discarded to free
up space. Kafka's performance is effectively constant with respect to data size, so storing data for a
long time is not a problem.
This is one of the biggest differences between RabbitMQ/ActiveMQ and Kafka.
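As an illustration, here is a hedged sketch of creating such a topic programmatically with the Kafka AdminClient API and setting a two-day retention. The topic name matches the one used later in these slides; the partition and replication counts are arbitrary choices for the example:
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicAdminExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 2 partitions, replication factor 1, records retained for 2 days.
            NewTopic topic = new NewTopic("developervisits-topic", 2, (short) 1)
                    .configs(Collections.singletonMap("retention.ms",
                            String.valueOf(2L * 24 * 60 * 60 * 1000)));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}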
Kafka Components
2. Kafka Producer
It publishes messages to a Kafka topic. The producer is responsible for choosing which record to assign to which
partition within the topic.
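A minimal sketch with the plain Java client follows; the broker address and record key are illustrative. Note that records with the same key always go to the same partition:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key ("order-1") determines which partition the record lands in.
            producer.send(new ProducerRecord<>("developervisits-topic", "order-1", "hello kafka"));
        }
    }
}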
3. Kafka Consumer
This component subscribes to one or more topics, then reads and processes messages from them.
4. Kafka Broker
The Kafka broker manages the storage of messages in the topics. A deployment with more than one broker is what we
call a Kafka cluster.
5. Kafka Zookeeper
Kafka uses ZooKeeper to provide the brokers with metadata about the processes running in the system and to
facilitate health checking, management, and coordination of the cluster.
Note that partitions for the same topic are distributed across multiple brokers in the cluster.
Kafka Use Cases
There are several use cases of Kafka that show why we actually use Apache Kafka.
Messaging
Kafka works well as a replacement for a more traditional message broker. Kafka has better throughput,
built-in partitioning, replication, and fault tolerance, which makes it a good solution for large-scale
message-processing applications.
Metrics
Kafka is a good fit for operational monitoring data: aggregating statistics from distributed
applications to produce centralized feeds of operational data.
Event Sourcing
Since Kafka supports the storage of very large logs of data, it is an excellent backend for applications that use
event sourcing.
RabbitMQ Vs Kafka
Let’s see how they differ from one another:
i. Features
Apache Kafka – Kafka is distributed, and the data is shared and replicated with guaranteed durability and
availability.
RabbitMQ – It offers relatively little support for these features.
ii. Processing
Apache Kafka – It allows reliable distributed log processing. Stream-processing semantics are also built into
Kafka Streams, as sketched below.
RabbitMQ – Here, the consumer is simply FIFO-based, reading from the head of the queue and processing messages
one by one.
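For illustration, a minimal Kafka Streams topology sketch; the topic names and application id here are hypothetical:
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "demo-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read from one topic, transform each value, and write to another.
        builder.<String, String>stream("input-topic")
               .mapValues(value -> value.toUpperCase())
               .to("output-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}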
iii. Replay
Kafka suits applications that need access to stream history, delivered in partitioned order at least once. Kafka is a
durable message store, and clients can get a “replay” of the event stream on demand, as opposed to more traditional
message brokers, where once a message has been delivered it is removed from the queue.
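A hedged sketch of such a replay using the consumer's seek API; the partition number and offset are illustrative:
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayExample {
    // Rewind an already-configured consumer (see the earlier poll-loop sketch) to replay history.
    static void replayFromStart(KafkaConsumer<String, String> consumer) {
        TopicPartition partition = new TopicPartition("developervisits-topic", 0);
        consumer.assign(Collections.singletonList(partition));
        consumer.seekToBeginning(Collections.singletonList(partition)); // start from offset 0
        // ...or jump to an arbitrary offset instead:
        // consumer.seek(partition, 42L);
    }
}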
Implementation of Kafka
Add the spring-kafka dependency:
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
</dependency>
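Spring Boot can auto-configure the producer from application.properties; alternatively, a minimal explicit configuration sketch, assuming a broker on localhost:9092, might look like this:
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

@Configuration
public class KafkaProducerConfig {

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        // Broker address and serializers for String keys and values.
        Map<String, Object> config = new HashMap<>();
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}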
Implementation of Kafka
Define the KafkaSender class to send messages to the Kafka topic named developervisits-topic:
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class KafkaSender {

    private static final String kafkaTopic = "developervisits-topic";

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    // Publish the given message to developervisits-topic.
    public void send(String message) {
        kafkaTemplate.send(kafkaTopic, message);
    }
}
Implementation of Kafka
Define a controller that accepts a message and triggers sending it to the Kafka topic
using the KafkaSender class.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import com.hcl.service.KafkaSender;
@RestController
@RequestMapping(value = "/developervisits-kafka/")
public class ApacheKafkaWebController {
@Autowired
KafkaSender kafkaSender;
    @GetMapping(value = "/producer")
    public String producer(@RequestParam("message") String message) {
        kafkaSender.send(message);
        // Return a simple confirmation so the endpoint has a response body.
        return "Message sent to the Kafka topic";
    }
}
Implementation of Kafka
Finally, define the Spring Boot application class with the @SpringBootApplication annotation:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class SpringBootHelloWorldApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootHelloWorldApplication.class, args);
    }
}
Implementation of Kafka
We are done with the required Java code. Now let's start Apache Kafka. As explained in detail
in Getting started with Apache Kafka, download Kafka from the link below, then start ZooKeeper and the broker:
https://kafka.apache.org/downloads
zookeeper-server-start.bat c:\shareData\development\appachekafka\kafka_2.12-2.0.0\config\zookeeper.properties
kafka-server-start.bat c:\shareData\development\appachekafka\kafka_2.12-2.0.0\config\server.properties
Implementation of Kafka
Next, start the Spring Boot application by running it as a Java application.
PRODUCER
kafka-console-producer.bat --broker-list localhost:9092 --topic developervisits-topic
-OR-
http://localhost:8080/developervisits-kafka/producer?message="test"
CONSUMER
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic developervisits-topic --from-beginning