This document discusses using Apache Kafka for real-time data ingestion with .NET. It describes Kafka concepts like topics, partitions, producers, and consumers. It then demonstrates building a .NET producer to read CSV data, an enricher to process messages and output to JSON, and a consumer to aggregate data and write to SQL Server. It concludes with best practices for Kafka performance around throughput, latency, and horizontal scaling.


Using Kafka for Real-Time Data Ingestion with .NET

Kevin Feasel
Engineering Manager, Predictive Analytics
ChannelAdvisor

#ITDEVCONNECTIONS | ITDEVCONNECTIONS.COM
Who Am I? What Am I Doing Here?
Catallaxy Services

Curated SQL

We Speak Linux
@feaselkl

Apache Kafka
Apache Kafka is a message broker on the Hadoop stack. It receives messages
from producers and sends messages to consumers.

Everything in Kafka is distributed.

Why Use A Broker?
Suppose we have two applications which want to
communicate. We connect them directly.

Works great at low scale: it's easy to understand, easy to work with, and has
fewer working parts to break. But it hits scale limitations.

Why Use A Broker?
We then expand out.

It is easy to expand this way as long as you don't overwhelm the DB;
eventually you will.

Why Use A Broker?
We then expand out. Again.
It takes some effort here: we need to manage connection strings and write to
the correct DB.

But it's doable and expands indefinitely.

Why Use A Broker?
But what happens when a consumer (database) goes down for too long?
• Producer drops messages
• Producer holds messages (until it runs out of disk)
• Producer returns error

There’s a better way.

Why Use A Broker?
Brokers take messages from producers and feed messages to consumers.

Brokers deal with the jumble of connections, let us be resilient to producer
and consumer failures, and help with scale-out.

Motivation
Today's talk will focus on using Kafka to ingest, enrich, and
consume data. We will build .NET applications in Windows to
talk to a Kafka cluster on Linux.

Our data source is flight data. I’d like to ask a few questions,
with answers split out by destination state:
1. How many flights did we have in 2008?
2. How many flights' arrivals were delayed?
3. How many minutes of arrival delay did we have?
4. Given a flight with a delay, how long can we expect it to be?

Kafka Concepts
Most message brokers act as queues.

Kafka Concepts
Kafka is a log, not a queue.

Multiple consumers may read the same message and a consumer may re-read
messages.

Think microservices and replaying data.
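
To make the log-versus-queue distinction concrete, here is a minimal in-memory sketch. It is written in Python for brevity, uses no Kafka library, and is not the talk's .NET code. The `Log` class and its `append` and `read_from` methods are hypothetical names invented for this illustration.

```python
class Log:
    """Append-only list of records; reading never removes anything."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record to the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return a copy of every record at or after the given offset."""
        return list(self._records[offset:])


log = Log()
for flight in ["AA100", "DL200", "UA300"]:
    log.append(flight)

# Two independent consumers both see every message.
billing_view = log.read_from(0)
audit_view = log.read_from(0)

# A consumer can rewind and replay from an earlier offset.
replayed = log.read_from(1)
```

A queue would hand each message to one reader and then discard it; here reads are only positions into shared, unchanged data.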
Kafka Concepts
Brokers foster communication between producers and
consumers. They store the produced messages and keep track
of what consumers have read.

Kafka Concepts
Topics are categories or feeds to which messages get
published. Topics are broken up into partitions. Partitions are
ordered, immutable sequences of records.

Kafka Concepts
Producers push messages to Kafka.

Kafka Concepts
Consumers read messages from topics.

Kafka Concepts
Consumers enlist in consumer groups. Consumer groups act
as "logical subscribers" and Kafka distributes load to
consumers in a group.

Kafka Concepts
Records in partitions are immutable. You do not modify the
data, but can add new rows.

Kafka Concepts
• Consumers should know where they left off. Kafka assists
by storing consumer group-specific last-read pointer values
per topic and partition.
• Kafka retains messages for a certain (configurable) amount
of time, after which point they drop off.
• Kafka can also garbage collect messages if you reach a
certain (configurable) amount of disk space.
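
The first two bullets can be sketched as a toy, single-partition model. This is an assumption-laden illustration in Python, not Kafka's implementation: the `Partition` class, its methods, and the group names are hypothetical, timestamps are passed in by hand so the behavior is repeatable, and size-based cleanup is left out. Real brokers also expire whole log segments rather than individual records.

```python
class Partition:
    """Toy partition with per-group read pointers and time-based retention."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []      # list of (offset, timestamp, value)
        self.next_offset = 0
        self.committed = {}    # consumer group name -> next offset to read

    def append(self, value, now):
        self.records.append((self.next_offset, now, value))
        self.next_offset += 1

    def poll(self, group):
        """Return the values this group has not read yet, then advance its pointer."""
        start = self.committed.get(group, 0)
        values = [value for (offset, _, value) in self.records if offset >= start]
        self.committed[group] = self.next_offset
        return values

    def expire(self, now):
        """Drop records older than the retention window."""
        cutoff = now - self.retention_seconds
        self.records = [r for r in self.records if r[1] >= cutoff]


p = Partition(retention_seconds=60)
p.append("a", now=0)
p.append("b", now=30)
first = p.poll("reports")    # this group has read nothing yet
p.append("c", now=70)
second = p.poll("reports")   # picks up where "reports" left off
other = p.poll("alerts")     # a different group has its own pointer
p.expire(now=70)             # anything stamped before t=10 drops off
remaining = [value for (_, _, value) in p.records]
```

Each group's pointer moves independently, and expiry ignores whether anyone has read the record.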

The Competition
• MSMQ and Service Broker: queues in Microsoftland
• Amazon Kinesis and Azure Event Hubs: Kafka as a Service
• RabbitMQ: complex routing & guaranteed reliability
• Celery: distributed queue built for Python
• ZeroMQ: socket-based distributed queueing
• Queues.io lists dozens of queues and brokers

Building A Producer
Our first application reads data from a CSV and pushes
messages onto a topic.

This application will not try to understand the messages; it simply takes data
and pushes it to a topic.

Building A Producer
I chose Confluent's Kafka .NET library (née RDKafka-dotnet).

There are several libraries available, each with its own benefits and
drawbacks. This library serves up messages in an event-based model and has
official support from Confluent, so use this one.

Demo Time

Building An Enricher
Our second application reads data from one topic and pushes
messages onto a different topic.

This application provides structure to our data and will be the largest
application.

Building An Enricher
Enrichment opportunities:

1. Convert "NA" values to appropriate values: either a default value or None
(not NULL!).
2. Perform lookups against airports given an airport code.
3. Convert the input CSV record into a structured type (similar to a class).
4. Output results as JSON for later consumers.
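
The four enrichment steps can be sketched end to end as follows. This is a Python illustration, not the demo code: the four-column CSV layout (year, origin, destination, arrival delay), the `Flight` type, the `AIRPORT_STATE` table, and the function names are all assumptions made for the example, and Python's `None` stands in for the option type's None.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical lookup table: airport code -> state.
AIRPORT_STATE = {"RDU": "NC", "CMH": "OH"}


@dataclass
class Flight:
    year: int
    origin: str
    dest: str
    dest_state: Optional[str]   # None when the airport code is unknown
    arr_delay: Optional[int]    # None when the source says "NA"


def parse_optional_int(text):
    """Turn "NA" into None instead of inventing a number."""
    return None if text == "NA" else int(text)


def enrich(line):
    """Convert one raw CSV line into a JSON string for later consumers."""
    year, origin, dest, arr_delay = next(csv.reader(io.StringIO(line)))
    flight = Flight(
        year=int(year),
        origin=origin,
        dest=dest,
        dest_state=AIRPORT_STATE.get(dest),
        arr_delay=parse_optional_int(arr_delay),
    )
    return json.dumps(asdict(flight))


enriched = enrich("2008,RDU,CMH,NA")
```

Keeping "NA" as an explicit missing value, rather than a default such as 0, stops later averages from being skewed by flights with no recorded delay.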

Demo Time

Building A Consumer
Our third application reads data from the enriched topic,
aggregates, and periodically writes results to SQL Server.

We’ve already seen consumer code, so this is easy.
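
The aggregation half of this application might look like the sketch below, written in Python rather than the demo's .NET. It assumes each enriched message is a JSON object with hypothetical `dest_state` and `arr_delay` fields, counts a flight as delayed when its arrival delay is greater than zero, and leaves out the periodic write to SQL Server.

```python
import json
from collections import defaultdict


def aggregate(messages):
    """Total flights, delayed flights, and delay minutes per destination state."""
    totals = defaultdict(lambda: {"flights": 0, "delayed": 0, "delay_minutes": 0})
    for message in messages:
        flight = json.loads(message)
        row = totals[flight["dest_state"]]
        row["flights"] += 1
        delay = flight["arr_delay"]
        # Missing delays count as a flight but not as a delayed flight.
        if delay is not None and delay > 0:
            row["delayed"] += 1
            row["delay_minutes"] += delay
    return dict(totals)


def expected_delay(row):
    """Average delay among delayed flights, or None if there were none."""
    return row["delay_minutes"] / row["delayed"] if row["delayed"] else None


messages = [
    '{"dest_state": "OH", "arr_delay": 30}',
    '{"dest_state": "OH", "arr_delay": null}',
    '{"dest_state": "OH", "arr_delay": 10}',
    '{"dest_state": "NC", "arr_delay": -5}',
]
results = aggregate(messages)
```

The three counters map onto the first three questions from the Motivation slide, and `expected_delay` onto the fourth.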

Demo Time

Kafka Performance
Basic tips:

• Maximize your network bandwidth! Your fibre channel will push a lot more
messages than my travel router.
• Compress your data. Compression works best with high-throughput scenarios,
so test first.
• Minimize message size. This reduces network cost.
• Buffer messages in your code using tools like
System.Collections.Concurrent.BlockingCollection<T>
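
The buffering tip can be illustrated with a bounded producer/consumer buffer. The sketch below uses Python's `queue.Queue` as a rough stand-in for a blocking collection; it is not the .NET API, and the `send_batches` function, the sentinel value, and the batch size are assumptions for the example. Appending to the `sent` list stands in for a network send.

```python
import queue
import threading

buffer = queue.Queue(maxsize=100)   # bounded: put() blocks while the buffer is full
SENTINEL = None                     # marks the end of the stream
sent = []                           # stands in for batches handed to the network


def send_batches(batch_size):
    """Drain the buffer and group messages into batches."""
    batch = []
    while True:
        item = buffer.get()         # blocks until a message is available
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) == batch_size:
            sent.append(batch)
            batch = []
    if batch:                       # flush the final partial batch
        sent.append(batch)


worker = threading.Thread(target=send_batches, args=(4,))
worker.start()
for i in range(10):
    buffer.put(f"message-{i}")
buffer.put(SENTINEL)
worker.join()
```

The bound matters: if the sender falls behind, the reader blocks instead of growing memory without limit.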
Throughput Versus Latency
Minimize latency when you want the most responsive
consumers but don't need to maximize the number of
messages flowing.

Throughput Versus Latency
Maximize throughput when you want to push as many
messages as possible. This is better for bulk loading
operations.

Throughput Versus Latency
Consumer config settings:
• fetch.wait.max.ms
• fetch.min.bytes

Producer config settings:
• batch.num.messages
• queue.buffering.max.ms
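
As a rough guide to which direction each setting moves, here are two illustrative profiles expressed as plain Python dictionaries. The property names are the four listed above; every value is an assumption chosen only to show the direction of the trade-off, not a recommendation, so measure against your own workload.

```python
# Favor responsiveness: fetch and send almost immediately, in tiny batches.
low_latency = {
    "consumer": {"fetch.wait.max.ms": 10, "fetch.min.bytes": 1},
    "producer": {"batch.num.messages": 1, "queue.buffering.max.ms": 0},
}

# Favor volume: wait longer so each fetch and each send carries more data.
high_throughput = {
    "consumer": {"fetch.wait.max.ms": 500, "fetch.min.bytes": 65536},
    "producer": {"batch.num.messages": 10000, "queue.buffering.max.ms": 100},
}
```

Larger wait times and batch sizes amortize per-request overhead across more messages, at the cost of each message waiting longer before it moves.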

More, More, More
Kafka is a horizontally distributed system, so when in doubt,
add more:
• More brokers will help accept messages from producers
faster, especially if current brokers are experiencing high
CPU or I/O.
• More consumers in a group will process messages more
quickly.
• You must have at least as many partitions as consumers in
a group! Otherwise, consumers may sit idle.
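
The last bullet is easy to see in a sketch. Within a group, each partition is read by one consumer, so spare consumers get nothing. The `assign` function below is a hypothetical round-robin illustration in Python; Kafka's actual assignment strategies are configurable and more involved.

```python
def assign(partitions, consumers):
    """Round-robin partitions over consumers; each partition gets one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, two consumers: everyone has work.
balanced = assign([0, 1, 2, 3], ["c1", "c2"])

# Two partitions, three consumers: one consumer owns nothing.
starved = assign([0, 1], ["c1", "c2", "c3"])
idle = [consumer for consumer, owned in starved.items() if not owned]
```

Adding consumers beyond the partition count buys no extra parallelism, which is why partition counts are worth planning up front.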
Wrapping Up
Apache Kafka is a powerful message broker. There is a small
learning curve associated with Kafka, but this is a technology
well worth learning.

To learn more, go here: https://CSmore.info/on/kafka

And for help, contact me: @feaselkl
