Kubernetes and Real Time World Analytics Albert Lewandowski
real-time analytics
How to connect these two worlds with Apache Flink?
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
Content
Introduction to the jungle
What is Kubernetes?
Kubernetes - Operators
Kubernetes - Custom Resource Definitions
What is Apache Flink?
Apache Flink Roadmap
Perception
Idempotency
CI/CD
Infrastructure
Business logic
Explainability
Security
Testing
Reality
Idempotency
CI/CD
Data Ingestion
Monitoring
Reprocessing
Business logic
Serving
Infrastructure
Explainability
Security
Testing
Real-time data streaming
Data Streaming vs. Batch
[Figure: events 1 to 6, grouped together as a batch vs. handled one by one as a stream]
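The contrast between batch and stream can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the six events are the numbers 1 to 6 from the figure, the computation is a running sum, and the function names `batch_sum` and `stream_sum` are made up for this example. It does not use the Flink API.

```python
from typing import Iterable, Iterator, List


def batch_sum(events: List[int]) -> int:
    """Batch: wait until the whole bounded dataset is available, then compute once."""
    return sum(events)


def stream_sum(events: Iterable[int]) -> Iterator[int]:
    """Stream: update the result as each event arrives and emit it immediately."""
    total = 0
    for event in events:
        total += event
        yield total


events = [1, 2, 3, 4, 5, 6]

batch_result = batch_sum(events)           # one result, after all events are in
stream_results = list(stream_sum(events))  # one result per event

print(batch_result)    # 21
print(stream_results)  # [1, 3, 6, 10, 15, 21]
```

The batch job produces a single answer once all six events are collected, while the streaming version has a usable answer after every event.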
Use cases
Location data
User activity
Logistics
Fraud detection
Recommendations
Industrial IoT
Use case - Kcell - Telecom company in Kazakhstan
2018: 10M subscribers, 165K events / s, 22.5 TB / month
2020: 10M subscribers, 500K events / s, 40 TB / month
Use case - Kcell
SMS events
Voice usage events
Data usage events
Roaming events
Location events
Use case - Kcell - some scenarios for Flink
Apache Flink
One tool, multiple versions
One tool, multiple languages
Java 8 or 11
SQL
Python
Where should I install?
● CICD process
● Service Discovery - monitoring with Prometheus
● Scalability
● Managing resources
● A/B Testing
High Availability of Flink
● Kubernetes Operator for Apache Flink - CICD: Kubernetes API; Installation: Helm chart or raw Kubernetes manifests
● Flink K8S Operator - CICD: Kubernetes API; Installation: Helm chart or raw Kubernetes manifests
● Ververica Platform - CICD: REST API or Web UI; Installation: Helm chart or raw Kubernetes manifests
● Native Kubernetes - Apache Flink - CICD: Kubernetes API; Installation: No need to install any component
Installation & Configuration
Kubernetes API
Flink jobs
Testing
[Diagram: Incubating mode. Production job on dedicated TaskManagers with the standard job output; incubating job on dedicated TaskManagers, fed with production data, with a separated incubating-mode output]
[Diagram: Blue Green Deployment, with a savepoint]
[Diagram: A/B Testing. A proxy in front of Flink Job #1 (Result #1) and Flink Job #2 (Result #2)]
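One way to picture the proxy in front of the two Flink jobs is a deterministic split by key. The sketch below is an assumption about how such a proxy could decide, not a description of the setup on the slide: the function name `route`, the job names, the `subscriber-N` keys and the 20% share are all hypothetical.

```python
import hashlib


def route(key: str, share_b: float = 0.5) -> str:
    """Pick a job for a key; the same key always lands on the same job."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "flink-job-2" if bucket < share_b else "flink-job-1"


# Hypothetical keys: send roughly 20% of subscribers to job #2.
subscribers = [f"subscriber-{i}" for i in range(1000)]
assignment = {s: route(s, share_b=0.2) for s in subscribers}
share = sum(1 for job in assignment.values() if job == "flink-job-2") / len(subscribers)

print(f"share routed to job #2: {share:.2f}")
```

Hashing the key instead of picking at random keeps each subscriber on one variant, so Result #1 and Result #2 can be compared per subscriber.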
Deployment process
Git Flow
Versioning images
Deployment process
Monitoring
Kubernetes aspects
Resources
Dedicated namespaces
Network performance
Self-healing and autoscaling
Flink restarts
Scale based on metrics
External tool for fixing
Automate manual tasks
Re-create cluster
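"Scale based on metrics" can be sketched with the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current * metric / target). The code below is a minimal sketch under assumptions: the function name, the bounds, the tolerance value and the example metric (records of lag per TaskManager) are illustrative, and how the new size is applied to a Flink job is left out.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Proportional scaling: grow or shrink by the ratio metric / target."""
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = metric / target
    # Inside the tolerance band nothing changes, which avoids flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # Clamp to the allowed range.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical metric: records of lag per TaskManager, with a target of 100.
print(desired_replicas(current=4, metric=150, target=100))  # 6
print(desired_replicas(current=4, metric=105, target=100))  # 4
print(desired_replicas(current=4, metric=50, target=100))   # 2
print(desired_replicas(current=8, metric=300, target=100))  # 10
```

The tolerance band and the min/max bounds are the parts worth tuning first, since they decide how often the cluster is resized.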
Job Cluster & Session Cluster
Stories from production
Local Setup
How to start locally?
APACHE FLINK
KUBERNETES
STREAMING SQL
Observability
Whitepaper - here
Part One: Metrics
Prometheus - Kubernetes-native solution
● open-source systems monitoring and alerting toolkit
● joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes
● a lot of exporters; you can write your own easily
● mature ecosystem: Pushgateway, Blackbox exporter, Alertmanager, etc.
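To show how little a custom exporter needs, here is a minimal sketch using only the Python standard library. It renders gauges in the Prometheus text exposition format and serves them on `/metrics`. The metric name `demo_queue_depth`, the `collect` function and the port are hypothetical; a real exporter would more likely use the official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict


def render_metrics(gauges: Dict[str, float]) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(gauges.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect() -> Dict[str, float]:
    """Hypothetical data source; replace with real measurements."""
    return {"demo_queue_depth": 3.0}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Blocks forever; Prometheus would scrape http://<host>:<port>/metrics."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(collect()), end="")
```

Calling `serve()` starts the endpoint; it is not called here so the snippet stays runnable as a quick check of the output format.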
Prometheus - simple or complex High Availability?
Simple Complex
Pull vs. push-based monitoring
Pull
● Workload on the central poller increases with the number of devices polled.
● The polling protocol can potentially open up the system to remote access and denial-of-service attacks.
Push
● Polling task is fully distributed among agents, resulting in linear scalability.
● Push agents are inherently secure against remote attacks since they do not listen for network connections.
Prometheus - Stories
● service discovery: simple on k8s
● limited security
● archived data: how far back is data required?
● monitor the monitoring
Part Two: Logs analytics
Logs analytics - which tool should I choose?
Loki or Elasticsearch
ELK vs. Loki
● Query performance - ELK: faster, because all the data is indexed; Loki: slower, because only labels are indexed
● Resource requirements - ELK: higher, because everything has to be indexed; Loki: lower, because only labels are indexed
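The trade-off above can be made concrete with a toy model. It is not how either system is implemented: the three log lines, the labels and the function names are invented for illustration. One index covers every token of every line, the other covers labels only and then scans the matching lines.

```python
from collections import defaultdict

# Hypothetical log entries: (labels, line).
logs = [
    ({"app": "flink", "level": "error"}, "checkpoint failed for job 42"),
    ({"app": "flink", "level": "info"}, "checkpoint completed for job 42"),
    ({"app": "kafka", "level": "error"}, "broker 3 unreachable"),
]

# Full-text approach: an inverted index over every token of every line.
full_index = defaultdict(set)
for i, (_, line) in enumerate(logs):
    for token in line.split():
        full_index[token].add(i)

# Label-only approach: index the labels, leave the line content unindexed.
label_index = defaultdict(set)
for i, (labels, _) in enumerate(logs):
    for pair in labels.items():
        label_index[pair].add(i)


def search_full(word: str) -> list:
    """Direct index lookup: fast, at the price of a large index."""
    return sorted(full_index.get(word, set()))


def search_labels(labels: dict, word: str) -> list:
    """Narrow down by labels, then scan the remaining lines for the word."""
    candidates = set(range(len(logs)))
    for pair in labels.items():
        candidates &= label_index.get(pair, set())
    return sorted(i for i in candidates if word in logs[i][1])


print(search_full("failed"))                     # [0]
print(search_labels({"app": "flink"}, "failed")) # [0]
print(len(full_index), len(label_index))         # 9 4
```

Both searches find the same line, but the label-only index stays small and pays for it with a scan at query time.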
What about alerts?
Quick start
Flink - Complex Event Processing
Article.
Codebase for example.
DevOps best practices
Article.
Kubernetes - first setup
● Minikube
● Kind
● Use a managed Kubernetes service from a public cloud provider such as AWS, GCP or Azure, within the free tier
Kubernetes + Flink - Operator
Verify if it works:
$ kubectl -n flink-operator get po
Kubernetes + Ververica Platform
Verify if Ververica is up
$ kubectl get po
Do you want to test the Flink SQL feature? Use Flink Faker (a data generator source connector):
https://github.com/knaufk/flink-faker/
It requires changing the image used for vvp-gateway.
Join Us!
Data Engineer
Spark, Kafka, Airflow, public cloud
Link
Backend Engineer
Java / Scala, microservices
Link
MLOps Engineer
MLOps tools, Python, public cloud
Link
DevOps / SRE
GCP, Terraform, Prometheus
Link
Q&A
Contact details
LinkedIn:
https://www.linkedin.com/in/albert-lewandowski
Thank you for your attention!