Spark-Performance Tuning
www.zekeLabs.com
Agenda
• Data Serialization
• Memory Tuning
• Level of Parallelism
• Memory Usage of Reduce Tasks
• Determining Memory Consumption
• Broadcasting Large Values
Performance Tuning: Data Serialization, Memory Tuning, Memory Management
Data Serialization
• Java serialization
• Kryo serialization
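A minimal sketch of opting into Kryo instead of the default Java serialization. The property names `spark.serializer` and `spark.kryoserializer.buffer.max` are standard Spark settings; the helper `to_submit_args`, the `128m` buffer value, and the choice to pass the settings as `spark-submit --conf` flags are assumptions made for illustration.

```python
# Hypothetical helper: render Spark properties as `spark-submit --conf` flags.

KRYO_CONF = {
    # Replace the default Java serializer with Kryo.
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Illustrative value: raise the buffer ceiling for large objects.
    "spark.kryoserializer.buffer.max": "128m",
}


def to_submit_args(conf):
    """Return a flat list like ['--conf', 'key=value', ...] in sorted key order."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_args(KRYO_CONF)))
```

The same key/value pairs could equally be placed in spark-defaults.conf or set on a SparkConf object in application code.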
Memory Management Overview
• Execution and storage share a unified memory region (M).
• Applications that do not use caching can use the entire space for execution, avoiding unnecessary disk spills.
• Applications that do use caching can reserve a minimum storage space (R) where their data blocks are immune to eviction.
• The default settings give reasonable out-of-the-box performance for a variety of workloads.
• spark.memory.fraction: size of M as a fraction of (JVM heap space - 300 MiB); default 0.6
• spark.memory.storageFraction: size of R as a fraction of M; default 0.5
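The arithmetic behind M and R can be sketched as follows. This is an illustration under stated assumptions, not Spark's own code: the function name `unified_memory` is hypothetical, the 4 GiB heap is an example value, and clamping a too-small heap to zero is a simplification of how Spark actually handles that case.

```python
# Memory that Spark sets aside before sizing the unified region.
RESERVED_BYTES = 300 * 1024 * 1024


def unified_memory(heap_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Return (M, R) in bytes for a given executor heap size.

    M is the unified execution + storage region (spark.memory.fraction).
    R is the part of M protected from eviction (spark.memory.storageFraction).
    """
    # Simplification: treat a heap smaller than the reserve as zero usable memory.
    usable = max(heap_bytes - RESERVED_BYTES, 0)
    m = usable * memory_fraction
    r = m * storage_fraction
    return m, r


if __name__ == "__main__":
    gib = 1024 ** 3
    m, r = unified_memory(4 * gib)  # example: 4 GiB executor heap
    print(f"M = {m / gib:.2f} GiB, R = {r / gib:.2f} GiB")
```

With the defaults and a 4 GiB heap this gives roughly 2.22 GiB for M, of which about 1.11 GiB is the eviction-protected region R.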
Determining Memory Consumption