

CLIENTS

Introduction to Apache Kafka® for Python Programmers

June 7, 2017 (last updated: November 7, 2019)

In this blog post, we’re going to get back to basics and walk through how to get started using Apache Kafka with your Python applications.

We will assume some basic knowledge of Kafka. If you’re new to the project, the introduction and design sections of the Apache documentation are an excellent place to start. The Confluent blog is also packed with great information; Jay Kreps’s A Practical Guide to Building a Streaming Platform covers many of the core Kafka concepts again, but with a focus on Kafka’s role at a company-wide scale. Also noteworthy are Ben Stopford’s microservices blog posts (The Data Dichotomy, Services on a Backbone of Events) for his unique take on the relationship between applications and data.

Installation
For our examples we’ll use Confluent Platform. This is a source-available, open distribution of Kafka that includes connectors for various data systems, a REST layer for Kafka, and a schema registry. On OS X this is easily installed via the tar archive. Instructions for all platforms are available on the Confluent website.

The Confluent Python client confluent-kafka-python leverages the high-performance C client librdkafka (also developed and supported by Confluent). Starting with version 1.0, these are distributed as self-contained binary wheels for OS X and Linux on PyPI. You can install it (generally inside a virtual environment) with:

pip install confluent-kafka
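
As a quick sanity check, you can query the installed client and bundled librdkafka versions from Python:

import confluent_kafka

# version() reports the Python client version; libversion() the bundled librdkafka
print(confluent_kafka.version())
print(confluent_kafka.libversion())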

Starting Kafka

You can get a single-broker Kafka cluster up and running quickly using default configuration files included with the Confluent Platform.

First, you’ll need to start a ZooKeeper instance, which Kafka utilizes for providing various distributed-system-related services. Assuming you used the zip or tar archive to install Confluent Platform, you can start ZooKeeper from the installation directory as follows:

./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties

Then, to start a Kafka broker:

./bin/kafka-server-start ./etc/kafka/server.properties

That’s it! You now have a Kafka broker to play with.

Producing Messages
Here’s a simple program that writes a message with key 'hello' and value 'world' to the Kafka topic mytopic:

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092'})
p.produce('mytopic', key='hello', value='world')
p.flush(30)

After importing the Producer class from the confluent_kafka package, we construct a Producer instance and assign it to the variable p. The constructor takes a single argument: a dictionary of configuration parameters. Because confluent-kafka uses librdkafka for its underlying implementation, it shares the same set of configuration properties.

The only required property is bootstrap.servers, which is used to specify the address of one or more brokers in your Kafka cluster. In our case there is only one, but a real-world Kafka cluster may grow to tens or hundreds of nodes. It doesn’t matter which broker(s) you specify here; this setting simply provides a starting point for the client to query the cluster – any broker can answer metadata requests about the cluster.
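
For illustration, a client pointed at a larger cluster might list several brokers in this setting so it can still bootstrap if one of them is down (broker1 through broker3 are hypothetical host names):

from confluent_kafka import Producer

# any subset of the cluster's brokers works as a bootstrap list;
# broker1/broker2/broker3 are placeholder host names
p = Producer({'bootstrap.servers': 'broker1:9092,broker2:9092,broker3:9092'})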

In the call to the produce method, both the key and value parameters need to be either a byte-like object (in Python 2.x this includes strings), a Unicode object, or None. In Python 3.x, strings are Unicode and will be converted to a sequence of bytes using the UTF-8 encoding. In Python 2.x, objects of type unicode will be encoded using the default encoding. Often, you will want to serialize objects of a particular type before writing them to Kafka. A common pattern for doing this is to subclass Producer and override the produce method with one that performs the required serialization.
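
As a minimal sketch of that pattern (JsonProducer is a hypothetical name; it JSON-encodes values before delegating to the base produce method):

import json

from confluent_kafka import Producer

class JsonProducer(Producer):
    # hypothetical subclass that serializes values to UTF-8 encoded JSON
    def produce(self, topic, value=None, **kwargs):
        payload = json.dumps(value).encode('utf-8')
        super(JsonProducer, self).produce(topic, value=payload, **kwargs)

jp = JsonProducer({'bootstrap.servers': 'localhost:9092'})
jp.produce('mytopic', {'greeting': 'hello', 'target': 'world'})
jp.flush(30)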

The produce method returns immediately without waiting for confirmation that the message has been successfully produced to Kafka (or otherwise). The flush method blocks until all outstanding produce commands have completed, or the optional timeout (specified as a number of seconds) has been exceeded. You can test whether all produce commands have completed by checking the value returned by the flush method: if it is greater than zero, there are still produce commands that have yet to complete. Note that you should typically call flush only at application teardown, not during the normal flow of execution, as it will prevent requests from being streamlined in a performant manner.
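
For example, at teardown you might check what flush returns and report anything left undelivered (a sketch; the 10 second timeout is arbitrary):

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092'})
p.produce('mytopic', key='hello', value='world')

# flush returns the number of messages still queued or in flight
remaining = p.flush(10)
if remaining > 0:
    print('{0} message(s) were not delivered'.format(remaining))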

To be notified when produce commands have completed, you can specify a callback function in the produce call. Here’s an example:

from confluent_kafka import Producer

def acked(err, msg):
    if err is not None:
        print("Failed to deliver message: {0}: {1}"
              .format(msg.value(), err.str()))
    else:
        print("Message produced: {0}".format(msg.value()))

p = Producer({'bootstrap.servers': 'localhost:9092'})

try:
    for val in range(1, 1000):  # xrange in Python 2
        p.produce('mytopic', 'myvalue #{0}'
                  .format(val), callback=acked)
        p.poll(0.5)

except KeyboardInterrupt:
    pass

p.flush(30)

The callback method has two parameters: the first provides information about any error that occurred while producing the message and the second provides information about the message produced. Callbacks are executed as a side effect of calls to the poll or flush methods. Unlike the flush method, the poll method always blocks for the specified timeout period (measured in seconds). An advantage of the poll-based callback mechanism is that it allows you to keep everything single-threaded and easy to reason about.

Consuming Messages
Data is read from Kafka using consumers that generally work together as part of a consumer group. Different consumers subscribe to one or more topics and are automatically assigned to a subset of each topic’s partitions. If consumers are added to or removed from the group (perhaps due to failure), the group will automatically rebalance so that one and only one consumer is ever reading from each partition in each topic of the subscription set. For more detailed information on how consumer groups work, Jason Gustafson’s blog post covering the Java consumer is an excellent reference.

Below is a simple example that creates a Kafka consumer that joins consumer group mygroup and reads messages from its assigned partitions until Ctrl-C is pressed:

from confluent_kafka import Consumer, KafkaError

settings = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'mygroup',
    'client.id': 'client-1',
    'enable.auto.commit': True,
    'session.timeout.ms': 6000,
    'default.topic.config': {'auto.offset.reset': 'smallest'}
}

c = Consumer(settings)

c.subscribe(['mytopic'])

try:
    while True:
        msg = c.poll(0.1)
        if msg is None:
            continue
        elif not msg.error():
            print('Received message: {0}'.format(msg.value()))
        elif msg.error().code() == KafkaError._PARTITION_EOF:
            print('End of partition reached {0}/{1}'
                  .format(msg.topic(), msg.partition()))
        else:
            print('Error occurred: {0}'.format(msg.error().str()))

except KeyboardInterrupt:
    pass

finally:
    c.close()

A number of configuration parameters are worth noting:

1. bootstrap.servers: As with the producer, this specifies the initial point of contact with the Kafka cluster.
2. group.id: The name of the consumer group the consumer is part of. If the consumer group does not yet exist when the consumer is constructed (there are no existing consumers that are part of the group), the group will be created automatically. Similarly, if all consumers in a group leave the group, the group and group id will be automatically destroyed.
3. client.id: Although optional, each consumer in a group should be assigned a unique id – this allows you to differentiate between clients in Kafka error logs and monitoring aggregates.
4. default.topic.config: A number of topic-related configuration properties are grouped together under this high-level property. One commonly used topic-level property is auto.offset.reset, which specifies which offset to start reading from if no offsets have been committed to a topic/partition yet. This defaults to latest; however, you will often want it to be smallest so that old messages are not ignored when you first start reading from a topic.
5. enable.auto.commit: By default, as the consumer reads messages from Kafka, it will periodically commit its current offset (defined as the offset of the next message to be read) for the partitions it is reading from back to Kafka. Often you would like more control over exactly when offsets are committed. In this case you can set enable.auto.commit to False and call the commit method on the consumer (a sketch of this pattern follows the list). For simplicity, we have left auto offset commit enabled in this example.
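
A minimal sketch of the manual commit pattern (parameter names as in recent confluent-kafka releases; process is a hypothetical application-level handler):

from confluent_kafka import Consumer

c = Consumer({
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'mygroup',
    'enable.auto.commit': False  # take control of offset commits
})
c.subscribe(['mytopic'])

while True:
    msg = c.poll(1.0)
    if msg is None or msg.error():
        continue
    process(msg)  # hypothetical processing step
    # commit this message's offset synchronously; asynchronous=False
    # blocks until the broker has acknowledged the commit
    c.commit(message=msg, asynchronous=False)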

After constructing the consumer, the subscribe method is called to inform Kafka that we wish to join the consumer group mygroup (specified in the configuration) and read messages from a single topic mytopic. It’s possible to subscribe to more than one topic by specifying more than one topic name in the list provided to the subscribe method, as shown below. Note that you can’t do this by calling the subscribe method a second time – this would result in the consumer first unsubscribing from the original subscription set and then subscribing to only the topic(s) in the newly specified one.
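
For instance, continuing with the consumer c from the example above (othertopic is a hypothetical second topic):

# a single call with both names; a second subscribe call would replace,
# not extend, the existing subscription set
c.subscribe(['mytopic', 'othertopic'])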

Having subscribed to a set of topics, we enter the main poll loop. This is wrapped in a try/except block that allows controlled shutdown of the consumer via the close method when the user interrupts program execution. If the close method is omitted, the consumer group would not rebalance immediately – removal of the consumer from the group would occur as per the consumer group failure detection protocol, after session.timeout.ms has elapsed.

On the consumer, the poll method blocks until a Message object is ready for consumption, or until the timeout period (specified in seconds) has elapsed, in which case the return value is None. When a Message object is available, there are essentially three cases to consider, differentiated by the value returned by Message.error():

1. None: The Message object represents a consumed message. The message key, value and other relevant information can be obtained via the key(), value(), timestamp(), topic(), partition() and offset() methods of the Message object (a short example follows this list).
2. KafkaError._PARTITION_EOF: The Message object does not encapsulate any consumed message – it simply signals that the end of a partition has been reached. You can use the partition() and topic() methods to determine the pertinent partition.
3. Any other value: An error occurred during consumption. Depending on the result of Message.error(), other Message object methods may return valid values. For most error types, use of topic() and partition() is valid.
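
In the first of these cases, for example, you might unpack a message as follows (a sketch; note that timestamp() returns a (timestamp type, timestamp value) pair):

# msg is a Message returned by poll() for which msg.error() is None
ts_type, ts_value = msg.timestamp()
print('{0} [{1}] at offset {2}: key={3} value={4} ts={5}'.format(
    msg.topic(), msg.partition(), msg.offset(),
    msg.key(), msg.value(), ts_value))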

Summary
That concludes our introduction on how to integrate Apache Kafka with your Python applications. In order to keep this post to a reasonable length, we’ve omitted some of the more advanced features of Kafka Python integration provided by the library. For example, you can hook into the partition assignment process that happens after you call subscribe on the consumer but before any messages are read. This allows you to do things like pre-load state associated with the partition assignment for joining with the consumed messages. The client also ships with AvroProducer and AvroConsumer classes that allow you to serialize data in Avro format and manage the evolution of the associated schemas using schema registry. For further information on Kafka Python integration, refer to the API documentation, the examples in the GitHub repo, or the user guide on our website.
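
As a brief illustration of the first of these, here is a minimal sketch of hooking the assignment process via the on_assign argument to subscribe:

from confluent_kafka import Consumer

def print_assignment(consumer, partitions):
    # called after a rebalance completes, before any messages are delivered;
    # a natural point to pre-load per-partition state
    print('Assigned: {0}'.format(partitions))

c = Consumer({'bootstrap.servers': 'localhost:9092', 'group.id': 'mygroup'})
c.subscribe(['mytopic'], on_assign=print_assignment)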

For expert advice on deploying or operating Kafka, we’ve released a range of training and technical consulting services covering all levels of expertise for you to consume and learn from. For large-scale deployments of Kafka, we offer Confluent Platform, which not only provides a number of powerful features in addition to those under the Confluent Community License but also provides enterprise-grade support. Finally, a hosted and fully managed version of Apache Kafka is just around the corner with the upcoming Confluent Cloud.


Matt Howlett

