Apache-Kafka Bernhard-H Oss 2018
Bernhard Hopfenmüller
IT Consultant @ ATIX AG
IRC: Fobhep
github.com/Fobhep
#atix #ossummit
whoarewe
over 15 years
datacenter automation, Linux
Consulting, Engineering, Support, Training
Kafka
Quora.com
What is the relation between Kafka, the writer, and Apache Kafka,
the distributed messaging system?
Jay Kreps: I thought that since Kafka was a system optimized for
writing, using a writer’s name would make sense. I had taken a lot
of lit classes in college and liked Franz Kafka. Plus the name
sounded cool for an open source project.
- developed by LinkedIn, open source since 2011
Messaging-Systems
Queues vs Topics
Supermarket: wait until it’s your turn
Television: choose what you want to receive
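A minimal sketch of the two delivery models, assuming the supermarket stands for a queue and the television for a topic. The `Queue` and `Topic` classes are hypothetical in-memory toys, not Kafka or any broker API.

```python
from collections import deque


class Queue:
    """Point-to-point: each message is handed to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Whoever asks next gets the oldest message; it is then gone.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Publish/subscribe: every subscriber sees every message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        # Each subscriber gets its own copy of the message.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def inbox(self, name):
        return list(self._inboxes[name])


queue = Queue()
queue.send("m1")
queue.send("m2")
first, second = queue.receive(), queue.receive()  # each message delivered once

topic = Topic()
topic.subscribe("alice")
topic.subscribe("bob")
topic.publish("m1")  # both subscribers receive "m1"
```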
Kafka-Basic structure
Use Cases
Topics I
Topics II
Topics III
Topics IV
Topics V
- Clean-up policies:
  - default: retention time (delete old data after x days)
  - retention size (delete old data if the stored data exceeds size x)
  - log compaction (replace the old value for a key with the newest one)
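The three policies can be sketched on a toy log. This is a simplified illustration, assuming records are dicts with hypothetical `key`, `value` and `ts` fields; Kafka itself applies these policies to log segments on disk, not to single records.

```python
def apply_retention_time(log, now, max_age):
    """Time-based retention: drop records older than max_age."""
    return [r for r in log if now - r["ts"] <= max_age]


def apply_retention_size(log, max_bytes):
    """Size-based retention: drop the oldest records until the log fits."""
    kept = list(log)
    while kept and sum(len(r["value"]) for r in kept) > max_bytes:
        kept.pop(0)  # oldest record goes first
    return kept


def compact(log):
    """Log compaction: keep only the most recent record per key."""
    last_index = {r["key"]: i for i, r in enumerate(log)}
    return [r for i, r in enumerate(log) if last_index[r["key"]] == i]


log = [
    {"key": "a", "value": "1", "ts": 0},
    {"key": "b", "value": "2", "ts": 5},
    {"key": "a", "value": "3", "ts": 10},
]

by_time = apply_retention_time(log, now=12, max_age=8)
by_size = apply_retention_size(log, max_bytes=2)
compacted = compact(log)
```

In this example all three calls happen to drop the first record, but for different reasons: it is too old, it is the oldest when the log is too large, or its key "a" has a newer value.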
Topic consumption
Consumer Groups
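- parallelism allows high throughput
- never more active consumers than partitions
  (extra consumers in a group stay idle)
- Kafka features exactly-once semantics!
  (since 0.11, via idempotent producers and transactions)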
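Why extra consumers do not help can be seen in a small assignment sketch. The `assign_partitions` helper is hypothetical and uses plain round-robin; Kafka's real assignors are more elaborate, but they share the rule that a partition is read by only one consumer per group.

```python
def assign_partitions(partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer of the group."""
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


# Three partitions, two consumers: both consumers have work.
balanced = assign_partitions([0, 1, 2], ["c1", "c2"])

# Three partitions, four consumers: the fourth consumer gets nothing.
oversized = assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"])
```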
Wait but who knows what’s read?
- Consumers commit their offsets
- Upon failure, re-processing is possible
  (the consumer resumes from the last committed offset)
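A rough sketch of commit-after-processing, assuming a hypothetical in-memory `Consumer` class rather than a real Kafka client. The offset only advances once a whole batch has been handled, so a crash in the middle of a batch leads to re-processing.

```python
class Consumer:
    def __init__(self, log):
        self.log = log
        self.committed = 0  # offset of the next record to read after a restart

    def poll_and_process(self, handler, batch_size=2):
        start = self.committed
        batch = self.log[start:start + batch_size]
        for record in batch:
            handler(record)  # may raise; the commit below is then skipped
        self.committed = start + len(batch)  # commit only after success


log = ["a", "b", "c"]
seen = []
fail_once = {"armed": True}


def handler(record):
    seen.append(record)
    if record == "b" and fail_once["armed"]:
        fail_once["armed"] = False
        raise RuntimeError("crash while processing")


consumer = Consumer(log)
try:
    consumer.poll_and_process(handler)  # crashes on "b", nothing committed
except RuntimeError:
    pass
offset_after_crash = consumer.committed

consumer.poll_and_process(handler)  # "a" and "b" are processed again
consumer.poll_and_process(handler)  # "c"
```

Records "a" and "b" are handled twice here, which is why the handler in such a setup should tolerate duplicates.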
Replication
implemented on partition level
Source[3]
In and Out of Sync Replica
Did somebody hear my message?
ZooKeeper
Source[4]
Broker and ZooKeeper
Talk to Kafka - Kafka Connect
Source[7]
Talk to Kafka - Schema Registry
- define standards (schemas)
- version and store them
- open source, by Confluent
source: Confluent
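The idea of versioning and storing schemas can be sketched with a toy registry. The `SchemaRegistry` class below is a hypothetical in-memory stand-in and does not mirror the Confluent Schema Registry API; the subject name and fields are made up.

```python
class SchemaRegistry:
    """Stores every schema version registered under a subject."""

    def __init__(self):
        self._subjects = {}

    def register(self, subject, schema):
        versions = self._subjects.setdefault(subject, [])
        if schema in versions:
            # An already known schema keeps its version number.
            return versions.index(schema) + 1
        versions.append(schema)
        return len(versions)

    def get(self, subject, version=None):
        versions = self._subjects[subject]
        # No version given: return the latest one.
        return versions[-1] if version is None else versions[version - 1]


registry = SchemaRegistry()
v1 = registry.register("orders-value", {"fields": ["id"]})
v2 = registry.register("orders-value", {"fields": ["id", "amount"]})
again = registry.register("orders-value", {"fields": ["id"]})
latest = registry.get("orders-value")
```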
TV or Netflix?
source: confluent
Who likes Kafka?
- Zalando - microservices
- Cisco Systems - security
- Airbnb - event pipeline
- Netflix (monitoring!)
- The New York Times (Kafka as data storage! Super awesome blog post) [5][6]
- Audi - IoT
- Spotify
- Twitter
- Uber (Kafka = backbone!!!)
- https://kafka.apache.org/powered-by
Sources
1 https://www.informatik-aktuell.de/betrieb/verfuegbarkeit/apache-kafka-eine-schluesselplattform-fuer-hochskalierbare-systeme.html
2 https://thecattlecrew.net/2017/09/28/apache-kafka-im-detail-teil-1/ and
  https://thecattlecrew.net/2017/09/28/apache-kafka-im-detail-teil-2/
3 https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
4 https://www.infoq.com/articles/apache-kafka
5 https://www.confluent.io/blog/okay-store-data-apache-kafka/
6 https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
7 https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
Install Kafka with Docker/Ansible
Single Components
---
- name: Start zookeeper
  docker_container:
    name: zookeeper
    image: "{{ images.zookeeper }}:{{ versions.kafka }}"
    state: started
    restart_policy: unless-stopped
    ports:
      # host port : container port
      - "{{ ports.zookeeper.client }}:2181"
      - "{{ ports.zookeeper.peer }}:2888"
      # leader election uses 3888 by ZooKeeper convention
      - "{{ ports.zookeeper.leader }}:3888"
    volumes:
      - "/zookeeper/data:/var/lib/zookeeper/data"
      - "/zookeeper/log:/var/lib/zookeeper/log"
    env:
      ZOOKEEPER_SERVER_ID: "{{ zookeeper_server_id }}"
      ZOOKEEPER_CLIENT_PORT: "2181"
      # server list rendered from the template sort_zookeeper.j2
      ZOOKEEPER_SERVERS: "{{ lookup('template', 'sort_zookeeper.j2') }}"
      ZOOKEEPER_DATA_DIR: "/var/lib/zookeeper/data"
      ZOOKEEPER_LOG_DIR: "/var/lib/zookeeper/log"
...
{% for host in groups['zookeeper'] %}
{% if inventory_hostname == hostvars[host]['inventory_hostna
0.0.0.0
{% else %}
{{ hostvars[host]['ansible_default_ipv4']['address'] }}
{% endif %}
{% if not loop.last %}
;
{% endif %}
{% endfor %}
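The logic of the template can be mirrored in a few lines of Python, as a sketch only: the current host is replaced by 0.0.0.0, all other hosts by their IP address, and entries are joined with ";". The function name, host names and addresses below are made up for illustration, and the whitespace the Jinja2 template emits is ignored.

```python
def zookeeper_servers(hosts, current_host, addresses):
    """Build the server list as seen from current_host."""
    parts = []
    for host in hosts:
        if host == current_host:
            parts.append("0.0.0.0")  # the node refers to itself
        else:
            parts.append(addresses[host])
    return ";".join(parts)  # separator only between entries


addresses = {"zk1": "10.0.0.1", "zk2": "10.0.0.2", "zk3": "10.0.0.3"}
servers_on_zk2 = zookeeper_servers(["zk1", "zk2", "zk3"], "zk2", addresses)
```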
Check system health
---
- name: "Check Zookeeper Health"
  command: docker run --rm -it confluentinc/zookeeper cub zk-re
  register: output
  until: output is success
  retries: 3
...
Configure via REST/uri
---
- name: create new topic
  command: "{{ 'sudo docker run --rm confluentinc/cp-kafka
    kafka-topics --create' ... }}"
...
whoami
Bernhard Hopfenmüller
IRC: Fobhep
github.com/Fobhep twitter.com/fobhep
Kafka vs MQ