
TIBCO solution brief

Getting Started with Apache Kafka

Developed at LinkedIn and now an Apache Software Foundation project, Apache Kafka provides a strong, simple, and straightforward approach to message distribution based on an append-only log. Kafka uses the common and well-understood publish/subscribe messaging paradigm: publishers publish messages to topics stored in a broker, and the broker delivers them to the consumers that subscribe to those topics.
Kafka contains three main components: the Kafka event broker; Kafka Connect, which connects producers and consumers to the broker; and Kafka Streams, for real-time data processing. Applications publish and subscribe to topics and topic partitions, while brokers handle distribution based on interest. Because Kafka is distributed, servers can be added to provide additional scale.

Figure 1: Basic Kafka Architecture (producer apps publish to the Kafka cluster; consumer apps subscribe to it)



Publish-Subscribe in Kafka
When a publisher publishes a message to a topic partition, the Kafka broker appends the message to the topic partition's physical log. This model allows messages to be indexed by their offset in the log, rather than the traditional approach of indexing and looking up messages by message ID. It thereby reduces complexity and, more importantly, reduces state management compared to other broker-based messaging systems.

Figure 2: Kafka Message Storage (a producer app appends messages to the partition log at sequential offsets 0–6; consumer apps read from the log)

Consumers are managed in much the same way as producers: they consume messages sequentially from a given topic partition. Because sequential consumption is built into the architecture, a consumer can acknowledge everything it has received simply by acknowledging the last message in the sequence. In addition, Kafka brokers do not maintain any information about consumption. This keeps them stateless and lets messages be purged after a configurable time period. New consumers coming online can replay history, and existing consumers can rewind and re-consume data on demand.
Because Kafka treats the topic stream like a log, the only per-consumer information retained on the Kafka server is the consumer's offset. The consumer's position in the stream is maintained on the Kafka server, but unlike other server-based messaging solutions, the rest of the metadata about the consumer is held in the consumer application. This method provides a fast and lightweight way to store and retrieve data.
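The log-and-offset model described above can be sketched in a few lines of Python. This is an illustrative toy, not the broker's actual implementation: the class and method names are invented for the example, and it shows only how offsets double as log indexes and how each consumer tracks its own position.

```python
# Toy sketch of Kafka's log-structured storage model (illustrative only,
# not the real broker implementation). Messages are appended to a list;
# the list index IS the offset, so no per-message ID lookup table is
# needed, and the broker-side structure stays stateless per consumer.

class TopicPartition:
    def __init__(self):
        self._log = []          # append-only message log

    def append(self, message):
        """Producer side: append and return the assigned offset."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset, max_messages=10):
        """Consumer side: read sequentially starting at `offset`."""
        return self._log[offset:offset + max_messages]


class Consumer:
    """Holds its own position; the 'broker' keeps no consumption state."""
    def __init__(self, partition):
        self.partition = partition
        self.offset = 0         # per-consumer offset, the only state kept

    def poll(self, max_messages=10):
        batch = self.partition.read(self.offset, max_messages)
        self.offset += len(batch)   # acknowledges the last message in sequence
        return batch

    def rewind(self, offset=0):
        """Replay retained history on demand, as Kafka allows."""
        self.offset = offset
```

A consumer that crashes and restarts needs to remember only a single integer to resume, and `rewind` mirrors Kafka's ability to replay retained history.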

The Kafka server persists message data for a configurable retention period. Consumers can replay and/or catch up to the data stream within the time period the administrator has configured for the stream to persist data. After this retention period, the message data is discarded and the space is reclaimed.
Like many messaging solutions, Kafka guarantees at-least-once delivery for each message, which means that in some cases a message may be received more than once. The community building Kafka has taken a less-is-more approach and provides this delivery model as the sole option, although it offers architectural guidelines for managing duplicate detection in the consumer application.
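Duplicate detection under at-least-once delivery can be handled by making the consumer idempotent. A minimal sketch, assuming each message carries a stable, unique key (the dictionary shape and function names here are invented for illustration):

```python
# Consumer-side duplicate detection under at-least-once delivery.
# Redeliveries carry the same key, so processing each key exactly once
# makes the consumer effectively idempotent. The message shape and
# names are illustrative assumptions, not a Kafka client API.

def process_once(messages, handler, seen=None):
    """Apply `handler` to each message whose key has not been seen yet."""
    seen = set() if seen is None else seen
    results = []
    for msg in messages:
        key = msg["key"]
        if key in seen:
            continue            # duplicate redelivery: skip it
        seen.add(key)
        results.append(handler(msg))
    return results
```

In practice the `seen` set would be bounded or persisted across restarts; the point is that deduplication lives in the consumer, keeping the broker stateless.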

Kafka Connect
Building on the simple approach and design that has made Kafka so attractive, the Kafka Connect toolkit provides a flexible and scalable approach to integrating with other systems. Kafka Connect defines a connector as the ingress or egress point for data, providing a common framework through which third-party systems interact with the core Kafka messaging system.
Like Kafka itself, Kafka Connect is designed as a simple, scalable approach to integration. It acts as a data pump into and out of Kafka core messaging: a source connector imports data, and a sink connector exports it.
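As a concrete example, a connector in Kafka Connect's standalone mode is configured with a small properties file. The FileStreamSource connector that ships with Apache Kafka streams lines of a file into a topic; the file path and topic name below are placeholders:

```properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/path/to/input.txt
topic=connect-test
```

A sink connector is configured the same way, with a sink `connector.class` and a `topics` setting naming the topics to export.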

Figure 3: Kafka Connect Source Connector (TIBCO FTL data enters the Kafka cluster through a source connector; producer and consumer apps attach to the cluster)



Figure 4: Kafka Connect Sink Connector (data leaves the Kafka cluster for a relational database through a sink connector; producer and consumer apps attach to the cluster)

The Kafka event broker treats Kafka Connect source and sink
connectors just like Kafka publishers and subscribers. The
Kafka core is not affected by how data comes in or goes out,
which keeps the broker architecture simple. The logic and
processing for a given data source or sink happens within
Kafka Connect through a special connector for the given
integration point.

Kafka Streams
Some applications require real-time stream processing on top of Kafka's simple publish/subscribe interface, and building stream processing into an application adds complexity. The Kafka Streams library lets developers invoke real-time stream processing without building it themselves: client applications can access functions purpose-built for real-time stream processing, such as data filtering, aggregation, and grouping.
The Kafka Streams interface gives client applications the flexibility not only to consume data natively from Kafka, but to transform it in the message flow, improving data visibility and access. This streaming approach opens up the Kafka message flow to applications built for data analytics, data monitoring, and real-time decision-making, supporting event-driven architectures.
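The filtering, aggregation, and grouping operations described above can be illustrated with a small sketch. Kafka Streams itself is a Java library; the Python below is not the Streams API but a plain in-memory analogue of the same operations, with invented function names:

```python
# In-memory analogue of two common Kafka Streams operations:
# filtering a stream of events, then counting events grouped by key.
# Illustrative only -- the real Streams API runs these continuously
# over topics, not once over a Python list.

from collections import Counter

def filter_stream(events, predicate):
    """Analogue of a Streams `filter` step: keep matching events."""
    return [e for e in events if predicate(e)]

def count_by_key(events, key_fn):
    """Analogue of `groupBy(...).count()`: tally events per key."""
    return Counter(key_fn(e) for e in events)
```

In a real Streams topology the equivalents would be `filter` and `groupBy(...).count()` running continuously over the topic rather than once over a list.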

Getting Started with Apache Kafka


Now that you understand how Kafka works, follow these steps to try it:

1. Download and install the software for your operating system from https://www.tibco.com/products/tibco-messaging/downloads
2. Start the ZooKeeper server, which manages the Kafka brokers
3. Start a Kafka broker
4. Create a topic
5. Publish and consume messages
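With a standard Apache Kafka distribution, steps 2–5 map onto the scripts shipped in its `bin/` directory. A minimal sketch, assuming a default install (script paths, the topic name, and the port depend on your distribution and configuration):

```shell
# 2. Start ZooKeeper, which manages the Kafka brokers
bin/zookeeper-server-start.sh config/zookeeper.properties

# 3. Start a Kafka broker (in a second terminal)
bin/kafka-server-start.sh config/server.properties

# 4. Create a topic named "quickstart"
bin/kafka-topics.sh --create --topic quickstart --bootstrap-server localhost:9092

# 5. Publish messages (type lines, then Ctrl-C) ...
bin/kafka-console-producer.sh --topic quickstart --bootstrap-server localhost:9092

# ... and consume them from the beginning of the log
bin/kafka-console-consumer.sh --topic quickstart --from-beginning --bootstrap-server localhost:9092
```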

For additional information on deploying Apache Kafka in Kubernetes environments and connecting it to other TIBCO Messaging components, see https://community.tibco.com/wiki/tibco-messaging-and-tibco-activespaces-article-links-quick-access

Conclusion
While Apache Kafka was built for real-time data distribution, it will not fit the requirements of every enterprise application. Alternatives such as Apache Pulsar, Eclipse Mosquitto, and others may be worth investigating, especially when requirements call for large-scale global infrastructure with built-in replication or for native IoT/MQTT support. For comparisons of Apache Kafka with other data distribution solutions, see the Resources section at https://www.tibco.com/solutions/apache-kafka.

Global Headquarters
3307 Hillview Avenue
Palo Alto, CA 94304
+1 650-846-1000 TEL
+1 800-420-8450
+1 650-846-1005 FAX
www.tibco.com

TIBCO Software Inc. unlocks the potential of real-time data for making faster, smarter decisions. Our Connected Intelligence platform seamlessly connects any application or data source; intelligently unifies data for greater access, trust, and control; and confidently predicts outcomes in real time and at scale. Learn how solutions to our customers' most critical business challenges are made possible by TIBCO at www.tibco.com.

©2020, TIBCO Software Inc. All rights reserved. TIBCO and the TIBCO logo are trademarks or registered trademarks of TIBCO Software Inc. or its subsidiaries in the United States and/or other countries. Apache, Kafka, and Pulsar are trademarks of The Apache Software Foundation in the United States and/or other countries. All other product and company names and marks in this document are the property of their respective owners and mentioned for identification purposes only.

16Sep2020
