Solving the I/O Slowdown: The "Noisy Neighbor" Problem
John Fragalla
Principal Engineer
Cray, Inc.
• When multiple users share a high-speed parallel filesystem, “bad” applications
will degrade the performance of “good” applications
• Bad applications: Lots of Small Files, Random Small I/O, Unaligned I/O
• Good applications: Stream large I/O, Sequential performance, Aligned I/O
• Recent features available in Lustre help automate I/O isolation and placement
with transparent use of Flash and HDD Devices in a Single Namespace
• Progressive File Layout with Lustre Storage Pools
• Data on Metadata
• Distributed Namespace (DNE) 2 – Clustered Metadata
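As a rough illustration of how a Progressive File Layout can be combined with storage pools, the sketch below builds (but does not run) an `lfs setstripe` command in which the first component of every file lands on a flash pool and the remainder on an HDD pool. The pool names `flash` and `hdd`, the directory path, and the 1M boundary are assumptions chosen for illustration; they are not taken from the configuration used in this talk.

```python
from typing import List, Tuple

# One PFL component: (end offset as passed to `lfs setstripe -E`,
#                     storage pool name, stripe count).
Component = Tuple[str, str, int]


def pfl_setstripe_argv(directory: str, components: List[Component]) -> List[str]:
    """Build the argv for an `lfs setstripe` call describing a PFL layout.

    The command is only constructed, never executed, so this runs
    on a machine without a Lustre client.
    """
    argv = ["lfs", "setstripe"]
    for end, pool, stripe_count in components:
        # -E: component end offset, -p: pool, -c: stripe count
        argv += ["-E", end, "-p", pool, "-c", str(stripe_count)]
    argv.append(directory)
    return argv


# Hypothetical layout: first 1 MiB of each file on flash (single stripe),
# everything beyond that striped across all OSTs in the HDD pool.
example_layout = [("1M", "flash", 1), ("-1", "hdd", -1)]
print(" ".join(pfl_setstripe_argv("/mnt/lustre/project", example_layout)))
```

Applied to a directory, a layout like this is inherited by new files, so small files and the heads of large files stay on flash without any change to the applications themselves.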
• Small/random I/O performance improved
• Data on Metadata
• Ideal for small file workloads
• File data stored directly on metadata storage
• Lower communication overhead for data access
• Scales with Distributed Namespace (DNE)
• Avoids contention by not placing small files on OSTs
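The Data on Metadata points above can be sketched as follows: a minimal example, assuming a composite layout whose first component uses the `mdt` layout type and whose remainder is striped over OSTs. The directory path and the helper names are hypothetical; the code only builds the command and models where a file's bytes would land, and does not touch a filesystem.

```python
from typing import List, Tuple


def dom_setstripe_argv(directory: str, dom_size: str,
                       ost_stripe_count: int = -1) -> List[str]:
    """Build the argv for an `lfs setstripe` call that enables DoM.

    The first component (up to dom_size) is stored on the MDT;
    the rest of the file is striped across OSTs.
    """
    return [
        "lfs", "setstripe",
        "-E", dom_size, "-L", "mdt",             # DoM component on the MDT
        "-E", "-1", "-c", str(ost_stripe_count),  # remainder on OSTs
        directory,
    ]


def split_bytes(file_size: int, dom_bytes: int) -> Tuple[int, int]:
    """Return (bytes stored on the MDT, bytes stored on OSTs) for one file."""
    on_mdt = min(file_size, dom_bytes)
    return on_mdt, file_size - on_mdt


print(" ".join(dom_setstripe_argv("/mnt/lustre/smallfiles", "64K")))

# A 32 KiB file fits entirely in a 64 KiB DoM component, so it never touches an OST.
print(split_bytes(32 * 1024, 64 * 1024))
# A 1 MiB file keeps only its first 64 KiB on the MDT.
print(split_bytes(1024 * 1024, 64 * 1024))
```

Files no larger than the DoM component generate no OST traffic at all, which is how small-file workloads are kept from contending with large streaming I/O on the OSTs.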
Figure: Performance (MB/sec, axis from 0 to 25,000), Write Mean and Read Mean, for No DoM, DoM=64K, DoM=256K, DoM=1024K, and DoM=4096K.
• We want no change in performance across the various sizes
Progressive File Layout Small Component Size
Figure: two panels, y-axis in MB/s (12,000 to 24,000), with results labelled Baseline, Isolated, and Interfered. X-axis: PFL Size on Flash (Noisy Neighbor File Size). First panel: None, None (1MB), 1024K (1MB). Second panel: None, None (4MB), 1024K (4MB), 4096K (4MB).
cray.com