Common Flink Mistakes
Decodable · Apache Flink · Webinar
Robert Metzger ([email protected], @rmetzger_)
Eric Sammer ([email protected], @esammer)
Today’s Webinar
The Top 5 Mistakes Deploying Apache Flink
Common Stream Processing Patterns using SQL
Q&A
Common Flink Mistakes
Robert Metzger
Staff Engineer @ decodable, Committer and PMC Chair @ Flink
#1 Mistake: Serialization is expensive
- Mistake: People use Java Maps, Sets etc. to store state or do network transfers
- Serialization happens when
  - transferring data over the network (between TaskManagers or from/to Sources/Sinks)
  - accessing state in RocksDB (even in-memory)
  - sending data between non-chained tasks locally
- Serialization costs a lot of CPU cycles
#1 Mistake: Serialization is expensive
Example: a trip from start (lon: 11, lat: 22) to end (lon: 88, lat: 99)

    package co.decodable.talks.flink.performance;

    private static class Location {
        int lon;
        int lat;
    }
Example:

    public record OptimizedLocation(int startLon, int startLat, int endLon, int endLat) {}
Serialized as four plain ints (11, 22, 88, 99), the record takes only 16 bytes
→ 7.5x reduction in data
Fewer object allocations = fewer CPU cycles

Further reading: “Flink Serialization Tuning Vol. 1: Choosing your Serializer — if you can”
https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html

Disclaimer: The actual binary representation used by Kryo might differ; this is for demonstration purposes only
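To make the 16-byte figure concrete: this is not Flink’s or Kryo’s actual wire format (the disclaimer above applies), just a plain-Java sketch showing that four ints need only 4 * 4 = 16 bytes when written as primitives, with no object headers or boxed keys:

```java
import java.nio.ByteBuffer;

// Illustration only: four primitive ints written back-to-back take 16 bytes,
// versus the per-entry object and boxing overhead of a Map-based Location.
public class LocationSize {
    static byte[] encode(int startLon, int startLat, int endLon, int endLat) {
        ByteBuffer buf = ByteBuffer.allocate(4 * Integer.BYTES); // 16 bytes
        buf.putInt(startLon).putInt(startLat).putInt(endLon).putInt(endLat);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = encode(11, 22, 88, 99);
        System.out.println(bytes.length + " bytes"); // 16 bytes
    }
}
```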
#2 Mistake: Flink doesn’t always need to be distributed

- Flink’s MiniCluster allows you to spin up a full-fledged Flink cluster with
  everything known from distributed clusters (RocksDB, checkpointing, the web UI, SQL, …)
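As a sketch (assuming `flink-streaming-java` and `flink-runtime-web` are on the classpath; the job and class names here are made up for illustration), a local environment that starts an embedded MiniCluster with the web UI:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalDemoJob {
    public static void main(String[] args) throws Exception {
        // Runs the whole "cluster" inside this JVM; the web UI comes up on port 8081.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());

        env.fromElements(1, 2, 3)
           .map(i -> i * 2)
           .returns(Types.INT)   // explicit type info, since lambdas lose generics
           .print();

        env.execute("mini-cluster-demo");
    }
}
```

The same MiniCluster also backs `StreamExecutionEnvironment.getExecutionEnvironment()` when you run a job from the IDE, which makes it convenient for development and integration tests before any distributed deployment.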
Deployment modes: multiple jobs sharing a JobManager; one job per JobManager,
planned on the JobManager; one job per JobManager, planned outside the JobManager.
● Data:
  ○ Message size: 2 KB
  ○ Throughput: 1,000,000 msg/sec
  ○ Distinct keys: 500,000,000 (aggregation in window: 4 longs per key)
  ○ Checkpoint every minute
● Hardware:
  ○ 5 machines, each running a TaskManager
Pipeline: Kafka Source → keyBy userId → Sliding Window (5m size, 1m slide, state in RocksDB) → Kafka Sink
Example: A machine’s perspective (TaskManager n)

- Kafka source: 2 KB * 1,000,000 msg/s = 2 GB/s total; 2 GB/s / 5 machines =
  400 MB/s into each TaskManager
- keyBy shuffle: 400 MB/s / 5 receivers = 80 MB/s per receiver; 1 receiver is
  local, 4 are remote: 4 * 80 MB/s = 320 MB/s shuffled out (and 320 MB/s in)
- window → Kafka sink: 67 MB/s out to Kafka per machine
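These per-machine numbers are just arithmetic on the stated workload; a plain-Java restatement (class and method names are mine, using the slide’s decimal units where 1 KB = 1000 bytes):

```java
// Back-of-the-napkin network sizing from the slide's workload figures.
public class NetworkSizing {
    static final double MSG_KB = 2.0;             // 2 KB per message
    static final double MSGS_PER_SEC = 1_000_000; // total cluster throughput
    static final int MACHINES = 5;

    // 2 KB * 1,000,000/s = 2,000,000 KB/s = 2 GB/s across the cluster
    static double totalMBps() { return MSG_KB * MSGS_PER_SEC / 1000.0; }

    // 2 GB/s / 5 machines = 400 MB/s into each TaskManager
    static double sourcePerMachine() { return totalMBps() / MACHINES; }

    // 400 MB/s / 5 receivers = 80 MB/s each; 4 of 5 receivers are remote
    static double shufflePerMachine() {
        double perReceiver = sourcePerMachine() / MACHINES;
        return (MACHINES - 1) * perReceiver;      // 320 MB/s out (and in)
    }

    public static void main(String[] args) {
        System.out.printf("source in: %.0f MB/s, shuffle: %.0f MB/s%n",
                sourcePerMachine(), shufflePerMachine());
    }
}
```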
Excursion: State & Checkpointing

For each key-value access, we need to retrieve 40 bytes from disk, update the
aggregates, and put 40 bytes back: at 1,000,000 accesses per second, that is
80 MB/s of state reads and writes across the cluster.

Per TaskManager n: Kafka in 400 MB/s, shuffle in 320 MB/s; shuffle out 320 MB/s,
Kafka out 67 MB/s, checkpoints 333 MB/s.
Total In: 720 MB/s, Total Out: 720 MB/s
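The checkpoint and sink rates follow from the state size: 500M keys * 40 bytes, held in five overlapping windows (5-minute size, 1-minute slide), checkpointed every 60 seconds across 5 machines. A plain-Java sketch of that arithmetic (names are mine):

```java
// State and checkpoint sizing for the sliding-window example.
public class StateSizing {
    static final double KEYS = 500_000_000d;     // distinct keys
    static final double BYTES_PER_KEY = 40d;     // 4 longs per key plus key bytes (per slide)
    static final int WINDOWS_PER_KEY = 5;        // 5m window / 1m slide = 5 overlapping windows
    static final int MACHINES = 5;
    static final int CHECKPOINT_INTERVAL_S = 60;

    // 500M * 40 B * 5 windows / 5 machines = 20 GB of state per machine
    static double statePerMachineGB() {
        return KEYS * BYTES_PER_KEY * WINDOWS_PER_KEY / MACHINES / 1e9;
    }

    // 20 GB checkpointed every 60 s -> ~333 MB/s of checkpoint traffic per machine
    static double checkpointMBps() {
        return statePerMachineGB() * 1000 / CHECKPOINT_INTERVAL_S;
    }

    // one window emission per minute: 500M * 40 B / 60 s / 5 machines -> ~67 MB/s to Kafka
    static double sinkMBpsPerMachine() {
        return KEYS * BYTES_PER_KEY / 1e6 / CHECKPOINT_INTERVAL_S / MACHINES;
    }

    public static void main(String[] args) {
        System.out.printf("state/machine: %.0f GB, checkpoints: %.0f MB/s, sink: %.0f MB/s%n",
                statePerMachineGB(), checkpointMBps(), sinkMBpsPerMachine());
    }
}
```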
Cluster sizing: Conclusion

- This was just a “back of the napkin” approximation! Real-world results will differ!
- Ignored network factors:
  - Protocol overheads (Ethernet, IP, TCP, …)
  - RPC (Flink’s own RPC, Kafka, checkpoint store)
  - Checkpointing causes network bursts
  - A window emission causes bursts
  - Other systems using the network
- CPU, memory, and disk access speed have not been considered
#5 Advice: Ask for Help!
decodable.co 2022