Introduction To Apache Kafka

Apache Kafka is an open-source distributed event streaming platform designed for real-time data processing and storage, widely used by companies like LinkedIn and Netflix. It operates on a publish/subscribe messaging model, consisting of brokers, topics, and partitions, ensuring high performance, fault tolerance, and scalability. The document also outlines installation steps and hardware considerations for setting up Kafka.

Uploaded by

Arut Jothi

UNIT-V

Dr.G.Arutjothi
Assistant Professor
Introduction to Apache Kafka

Publish/Subscribe Messaging & Installation Guide
Meet Kafka: Introduction
• Distributed event streaming platform
• Handles large-scale, real-time data processing
• Used by LinkedIn, Netflix, Uber, and more
Apache Kafka

• Apache Kafka is an open-source stream-processing
software platform designed to handle real-time data
storage and processing. It acts as a broker between a
sender (producer) and a receiver (consumer),
facilitating the exchange of messages between
applications, servers, and processors.
Kafka Key Concepts
• Kafka Broker
– A Kafka cluster consists of one or more servers known as Kafka
brokers
• Kafka Topic
– A topic in Kafka is a category or feed name to which messages
are published and under which they are stored.
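• Partitions and Consumer Groups
– Kafka topics are divided into partitions, allowing data to be
split across multiple brokers.
– A consumer group is a set of consumers that share the
partitions of a topic, so each partition is read by one
consumer in the group.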
• Core APIs
– Producer API
– Consumer API
– Streams API
– Connect API
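
The partition and consumer-group concepts above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Kafka's implementation: `choose_partition` and `assign_partitions` are hypothetical helpers, CRC32 stands in for the murmur2 hash used by Kafka's default Java partitioner, and round-robin is only one possible assignment strategy.

```python
import zlib
from typing import Dict, List


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Toy hash (CRC32), assumed for illustration only. The point is that
    the same key always lands on the same partition.
    """
    return zlib.crc32(key) % num_partitions


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the consumers of one group.

    Each partition goes to exactly one consumer; a consumer may get
    several partitions, or none if there are more consumers than partitions.
    """
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

With six partitions and two consumers, each consumer reads three partitions. With more consumers than partitions, the extra consumers stay idle.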
Real-Time Applications
• Twitter: Uses Storm-Kafka for stream processing
infrastructure.
• LinkedIn: Utilizes Kafka for activity stream data and
operational metrics.
• Netflix: Employs Kafka for real-time monitoring and
event processing.
• Box: Uses Kafka for production analytics pipeline and
real-time monitoring.
Advantages of Apache Kafka
• High Performance: Capable of handling millions of messages
per second with low latency.
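• Fault-Tolerance: Replicates partitions across brokers, so data
is not lost if a broker fails, and retains messages even if a
consumer fails to process them.
• Scalability: Scales horizontally by adding brokers and
partitions to handle large volumes of data.
• Durability: Persists streams of records to disk in a
fault-tolerant manner.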
Publish/Subscribe Messaging Model
• Producers publish messages to topics
• Consumers subscribe to topics
• Ensures reliable, scalable messaging
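
The publish/subscribe flow above can be illustrated with a minimal in-memory sketch. `ToyBroker` is a hypothetical class written for this slide, not part of any Kafka client. It assumes a single process, keeps everything in memory, and only shows the idea that a topic is an append-only log and each subscriber tracks its own read position.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker (illustration only)."""

    def __init__(self):
        self._logs = defaultdict(list)     # topic -> append-only list of messages
        self._offsets = defaultdict(int)   # (topic, subscriber) -> next offset to read

    def publish(self, topic, message):
        """Append a message to the topic and return its offset."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def poll(self, topic, subscriber):
        """Return the messages this subscriber has not read yet."""
        start = self._offsets[(topic, subscriber)]
        messages = self._logs[topic][start:]
        self._offsets[(topic, subscriber)] = len(self._logs[topic])
        return messages


broker = ToyBroker()
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")
billing_batch = broker.poll("orders", "billing")    # both messages
shipping_batch = broker.poll("orders", "shipping")  # both messages, read independently
```

Because producers only write to the topic and subscribers only read from it, neither side needs to know about the other.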
Why Kafka? & Kafka’s Data Ecosystem
• High throughput and fault tolerance
• Handles real-time data processing
• Connects with databases, microservices, and streaming applications
Kafka’s Origin & Use Cases
• Originally developed at LinkedIn
• Used for event-driven architectures, log aggregation, and analytics pipelines
Installing Kafka - First Steps
• Download Kafka from the Apache website
• Extract the archive and set up environment variables
• Start ZooKeeper, then the Kafka server
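
As a rough sketch of the last step, the helper below builds the two start commands in order. It assumes a ZooKeeper-based Kafka distribution extracted to a hypothetical directory such as `/opt/kafka`, with the start scripts and sample config files shipped in its `bin` and `config` folders. `start_commands` is an illustrative name, not a Kafka tool.

```python
from pathlib import PurePosixPath


def start_commands(kafka_home):
    """Return the start commands in the order they should be run.

    kafka_home is the directory the Kafka archive was extracted to
    (an assumed location; adjust to your installation).
    """
    home = PurePosixPath(kafka_home)
    zookeeper_cmd = [
        str(home / "bin" / "zookeeper-server-start.sh"),
        str(home / "config" / "zookeeper.properties"),
    ]
    kafka_cmd = [
        str(home / "bin" / "kafka-server-start.sh"),
        str(home / "config" / "server.properties"),
    ]
    # ZooKeeper comes first because the broker connects to it on startup.
    return [zookeeper_cmd, kafka_cmd]
```

Each command can then be launched in its own terminal or passed to `subprocess.Popen`.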
Installing a Kafka Broker & Configuration
• Configure the server.properties file
• Set the broker ID (broker.id), log directory (log.dirs), and ZooKeeper connection string (zookeeper.connect)
• Start the Kafka broker
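
The three settings named above can be rendered into a minimal server.properties fragment. `render_server_properties` is a hypothetical helper, and the values in the usage line are example values for a single local broker, not recommendations. A real file normally carries more settings.

```python
def render_server_properties(broker_id, log_dir, zookeeper_connect):
    """Render the three broker settings as properties-file text."""
    settings = {
        "broker.id": broker_id,                  # unique integer per broker in the cluster
        "log.dirs": log_dir,                     # where partition data is stored on disk
        "zookeeper.connect": zookeeper_connect,  # host:port of the ZooKeeper ensemble
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Example values (assumptions) for one broker on a local machine.
example_config = render_server_properties(0, "/tmp/kafka-logs", "localhost:2181")
```

Writing `example_config` to config/server.properties and starting the broker with that file applies the settings.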
Hardware Selection for Kafka
• Choose high-performance disks
• Size CPU and memory based on workload
• Account for network bandwidth
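
A back-of-the-envelope calculation can make the disk and network points concrete. The function below is a rough sketch with a hypothetical name and simplified assumptions: data is spread evenly across brokers, every message is stored `replication_factor` times, and compression, indexes, and free-space headroom are ignored.

```python
def estimate_broker_needs(messages_per_sec, avg_message_bytes,
                          retention_hours, replication_factor, num_brokers):
    """Estimate producer ingress and retained disk per broker."""
    ingress_bytes_per_sec = messages_per_sec * avg_message_bytes
    retained_bytes = (ingress_bytes_per_sec * retention_hours * 3600
                      * replication_factor)
    return {
        "ingress_mb_per_sec": ingress_bytes_per_sec / 1e6,
        "disk_gb_per_broker": retained_bytes / num_brokers / 1e9,
    }


# Assumed workload: 10,000 messages/s of 1,000 bytes each, kept for 24 hours,
# replicated 3 times, on a 3-broker cluster.
needs = estimate_broker_needs(10_000, 1_000, 24, 3, 3)
```

For this assumed workload the estimate is 10 MB/s of producer traffic and about 864 GB of retained data per broker, before any headroom is added.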
Sending a Message to Kafka
• Use the Kafka console producer to send messages
• Use the Kafka console consumer to read messages
• Implement producers/consumers in Python, Java, or other languages
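
For the last point, a minimal Python sketch is shown below. It assumes the third-party kafka-python package is installed and a broker is listening on localhost:9092. The helper names and the JSON encoding are illustrative choices, not part of Kafka itself.

```python
import json


def encode_event(event):
    """Serialize a dict to UTF-8 JSON bytes (Kafka message values are bytes)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def decode_event(raw):
    """Inverse of encode_event."""
    return json.loads(raw.decode("utf-8"))


def send_event(topic, event, bootstrap_servers="localhost:9092"):
    """Publish one event to a topic. Requires a running broker."""
    from kafka import KafkaProducer  # third-party: pip install kafka-python

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    producer.send(topic, value=encode_event(event))
    producer.flush()   # block until the message has actually been sent
    producer.close()


def read_events(topic, bootstrap_servers="localhost:9092", timeout_ms=5000):
    """Read events from the start of a topic until no message arrives for timeout_ms."""
    from kafka import KafkaConsumer  # third-party: pip install kafka-python

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",    # start from the oldest retained message
        consumer_timeout_ms=timeout_ms,  # stop iterating when the topic goes quiet
    )
    try:
        return [decode_event(record.value) for record in consumer]
    finally:
        consumer.close()
```

Calling `send_event("test-topic", {"user": "a", "action": "login"})` followed by `read_events("test-topic")` mirrors what the console producer and consumer do, with `test-topic` as a placeholder topic name.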
