
Solving the I/O Slowdown:

The "Noisy Neighbor" Problem


Rice Oil and Gas Conference 2019

John Fragalla
Principal Engineer
Cray, Inc.

© 2018 Cray Inc. 1


Agenda
• Today’s I/O challenge with shared I/O
• New Lustre Features enabling Flash to improve Shared Application Performance
• Performance results isolating “Noisy Neighbor” Applications
• Summary


Today’s I/O Challenge

• When multiple users share a high-speed parallel filesystem, "bad applications" will affect "good application" performance
• Bad applications: Lots of Small Files, Random Small I/O, Unaligned I/O
• Good applications: Stream large I/O, Sequential performance, Aligned I/O
• Recent features available in Lustre help automate I/O isolation and placement
with transparent use of Flash and HDD Devices in a Single Namespace
• Progressive File Layout with Lustre Storage Pools
• Data on Metadata
• Distributed Namespace (DNE) 2 – Clustered Metadata


Hybrid File System Architecture

(Diagram: a compute cluster of CN nodes connected over an EDR fabric to a single-namespace parallel file system with scalable MDS nodes, SSD-based flash tiers, and HDD tiers.)

Flash MDT Tier:
• Large number of inodes supported per filesystem
• Improved metadata operations
• Improved small-I/O latency

Flash OST Tier:
• Optimized for throughput and IOPS ($/GB/sec)
• Improved performance for intermediate results
• Improved small/random I/O performance

High-Performance HDD Tier:
• Optimized for throughput/capacity ($/GB/sec)
• Optional flash/cache to accelerate small-block I/O within the HDD tier

Capacity HDD Tier:
• Optimized for cost ($/GB)
• Lower performance, longer-term data retention

Lustre | Flexibility and Usability
• Progressive File Layouts (PFL)
• Optimized striping based on file size
• Layout changes at specific size thresholds
• Components can be located on specific pools
• A fixed amount of each file on flash, the rest on disk
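The layout described above can be expressed with `lfs setstripe` component options. A minimal sketch, assuming Lustre pools named `flash` and `hdd` already exist and a hypothetical mount point `/mnt/lustre/mixed`:

```shell
# PFL sketch: first 64 MiB of each new file goes to the flash pool with a
# single stripe; anything beyond that stripes across 4 OSTs in the HDD pool.
# Pool names and the mount point are illustrative, not from the slides.
lfs setstripe \
  -E 64M -c 1 --pool flash \
  -E -1  -c 4 --pool hdd \
  /mnt/lustre/mixed
```

With this layout in place, small files never leave the flash pool, while large streaming files grow transparently onto HDD OSTs.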


Lustre Storage Pools Are More Relevant Now
• Lustre pools were historically used for debugging, e.g., to isolate performance issues to a subset of OSTs or OSS nodes
• Now, with PFL, Lustre storage pools are a powerful tool for automatic data placement on different storage media:
• Flash pool
• High-performing disk pool
• Lower-performing disk pool (e.g., focused on capacity)
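Pools are defined with `lctl` on the MGS node. A minimal sketch, assuming a hypothetical filesystem name `lfs01` and flash-backed OSTs at indices 0 and 1:

```shell
# Create a named pool and add the flash-backed OSTs to it
# (run on the MGS node; fsname and OST indices are illustrative).
lctl pool_new lfs01.flash
lctl pool_add lfs01.flash lfs01-OST[0-1]

# Verify pool membership
lctl pool_list lfs01.flash
```

Once defined, the pool name can be referenced from any `lfs setstripe --pool` component, which is what makes PFL-driven tiering automatic.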


Lustre | Improved Small File Performance

• Data on Metadata
• Ideal for small file workloads
• File data stored directly on metadata storage
• Lower communication overhead for data access
• Scales with Distributed Namespace (DNE)
• Avoids contention by not placing small files on OSTs
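A DoM layout is requested with the `-L mdt` component flag (Lustre 2.11+). A minimal sketch, with a hypothetical directory path:

```shell
# Data on Metadata: the first 1 MiB of each file is stored on the MDT itself,
# so small files never touch the OSTs; larger files spill to a single OST stripe.
lfs setstripe -E 1M -L mdt -E -1 -c 1 /mnt/lustre/smallfiles
```

Files at or under the DoM threshold are served entirely from the (flash) metadata target, eliminating the extra OST round trip.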


Lustre | Data on Metadata and PFL

• Leverage DoM and PFL together for a more flexible solution
• Small files land on the MDT component
• Medium files land on flash, with larger files growing onto disk (for example)
• DoM is compatible with Progressive File Layouts
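The three-tier placement above can be combined into a single layout. A minimal sketch, assuming hypothetical pools `flash` and `hdd` and illustrative thresholds:

```shell
# Combined DoM + PFL layout: first 1 MiB on the MDT, up to 256 MiB on the
# flash pool, and everything beyond that striped wide across the HDD pool.
# Pool names, thresholds, and the path are assumptions, not from the slides.
lfs setstripe \
  -E 1M   -L mdt \
  -E 256M -c 1  --pool flash \
  -E -1   -c -1 --pool hdd \
  /mnt/lustre/project
```

Each file then migrates through the tiers automatically as it grows, with no application changes.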


DNE Phase 2
• Allows a user to spread a single large directory across multiple MDTs using the
DNE striped directory feature
• Note: due to some overhead, this should only be done for very large directories, with file counts in the 50K+ range
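A striped directory is created with `lfs mkdir`. A minimal sketch, assuming four MDTs and a hypothetical path:

```shell
# DNE phase 2: stripe a new directory across 4 MDTs so its entries
# (and their metadata load) spread across all four servers.
lfs mkdir -c 4 /mnt/lustre/bigdir

# Inspect the directory's MDT striping
lfs getdirstripe /mnt/lustre/bigdir
```

Existing directories cannot be restriped in place; the stripe count is chosen at creation time.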

(Diagram: DNE phase 1 remote directories vs. DNE phase 2 striped directories across multiple MDTs.)


System Setup for Benchmarks

Hardware:
• 4 flash MDTs with RAID-10
• 2 flash IOPS-optimized OSTs with RAID-10
• 4 GridRAID OSTs (parity-declustered, RAID-6-equivalent data protection)
• Up to 64 client nodes (FDR connectivity)
• EDR InfiniBand non-blocking fabric

Software:
• Lustre 2.11.0 clients and servers
• CentOS Linux release 7.5 (server and client)
• Spectre/Meltdown mitigations enabled in client kernels, disabled on servers
• Client kernel: 3.10.0-862.el7.x86_64
• Server kernel: 3.10.0-693.21.1.x3.1.9.x86_64


LUSTRE PFL STREAMING PERFORMANCE
Flash MDTs -> HDD OST Tier (DoM)

(Chart: write and read mean throughput in MB/sec, 0–30,000, for PFL small-component sizes of No DoM, DoM=64K, DoM=256K, DoM=1024K, and DoM=4096K. The goal is no change in performance across the various sizes.)

Progressive File Layout maintains peak performance for streaming workloads

LUSTRE PFL NOISY NEIGHBOR ISOLATION

(Diagram: two competing workloads on the same resources — a streaming-workload file and a small-file workload (Files 1–4) both land on the HDD OSTs. With PFL the two workloads are separated: the streaming file stays on the HDD OSTs while the small files are placed on a flash OST or MDT.)

LUSTRE PFL NOISY NEIGHBOR ISOLATION
Flash Tier (OST or DoM with MDTs) -> HDD OST Tier

(Charts: write and read mean streaming throughput in MB/s, 12,000–24,000, for 1 MB and 4 MB noisy-neighbor files. Bars compare the baseline (no neighbor), the interfered case (neighbor files on the same HDD tier), and the isolated case (neighbor files placed on flash via a PFL component of 1024K or 4096K). X-axis: PFL size on flash (noisy-neighbor file size).)

PFL ISOLATION OF IOPS FROM STREAMING IMPROVES PERFORMANCE


Summary
• New Lustre features such as PFL, DoM, and DNE2 help improve mixed-I/O performance on a high-speed shared parallel filesystem
• Transparent data placement on flash MDTs and/or OSTs and on HDDs for various I/O sizes optimizes throughput and IOPS
• Isolating small files and small I/Os from streaming I/O solves the "noisy neighbor" slowdown for sequential performance


THANK YOU
QUESTIONS?

cray.com
