
Unified VNX Performance Tuning

V4.1.1

Best viewed in slideshow mode – includes animation



Agenda

• Terminology
– Just a few…
• Setting Expectations
– Why do we tune? Do we have to?
– Just what does tuning entail?
• Tips
1. Know Your I/O
2. Choosing a RAID Type
3. Disk Count
4. Choose the Right System
5. LU Distribution and MetaLUNs
6. Cache Allocation
7. Managing Cache Space with Watermarks
8. Cache Page Size
9. Logical Unit Settings
10. Stripes
11. High Bandwidth



Terms
• Alignment – Data block addresses compared to RAID stripe
addresses
• Coalesce – To combine multiple smaller I/O into one larger
I/O
• Concurrency – More than one application or thread writing
to a LUN or disk at the same time
• Flush – Data in write cache written to disk
• Locality – Multiple I/O requested from a reasonably small
area on the disk (same MB or GB)
• RDBMS – Relational Database Management System
• Throughput — IOPS: typically important for filesystem
access, RDBMS; small requests (2-16KB)
• Bandwidth — MB/s: typically important for backups, DSS
operations, rich media access (64KB, 256KB)
• Response time — a key measurement of quality of service;
an array can offer a high max IOPS figure, but deliver
consistently slow response time



Setting Expectations
What do we tune? And do we have to?

• Tuning is mostly upfront design


– Choosing an appropriate system, disk count, and RAID type
– A few array settings that change behavior at the margins
• Do I have to tune my design?
– A modest investment in planning will certainly pay off
• Do I have to tune the storage system?
– Standard settings are designed to work with most workloads
– Clients with unusual workloads can get better performance
• Some extreme workloads require adjustments
– High percentage sequential
– High percentage random
– All large block



Setting Expectations
What does tuning entail?

• Planning ahead is still the most effective technique


– It is harder to “fix” a problem if the problem is:
• Poor data layout
• Wrong RAID type
– Full use of physical resources is important
• No reason for idle drives or processors
• Planning puts the tuning in the right order
– Some adjustments must be made before committing data
• Selection of disks, RAID type, metaLUN design
– Some adjustments are available online
• Cache settings can all be changed while in production



Performance Tuning
1. Know your I/O
• What you need to know
– Predominant I/O request sizes
– Read/Write ratio
– Ratio of random vs. sequential access
– Steadiness or burstiness of your access patterns
– Data sets that have “linked contention” if on the same disks
• Host file system stripes
• RDBMS distributed tables
• Multiple components of the same database
– Degree of concurrency
• Multiple threads? Async I/O?
– If not, response time will increase with load.
– Skew
• The percentage of I/O concentrated on a percentage of capacity
• Required for estimations associated with FAST storage pools and
FAST cache usefulness
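
A minimal Python sketch of a skew estimate, assuming hypothetical per-slice (capacity, IOPS) statistics; illustrative only:

    # Hypothetical per-slice stats: (capacity_GB, IOPS)
    slices = [(100, 4000), (400, 900), (500, 300)]
    total_cap = sum(c for c, _ in slices)
    total_io = sum(i for _, i in slices)

    # Hottest slices first, then find the capacity share serving >= 80% of I/O
    slices.sort(key=lambda s: s[1] / s[0], reverse=True)
    io_seen = cap_seen = 0
    for cap, iops in slices:
        cap_seen += cap
        io_seen += iops
        if io_seen / total_io >= 0.8:
            break
    print(f"{io_seen/total_io:.0%} of I/O on {cap_seen/total_cap:.0%} of capacity")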



Performance Tuning
2. Choosing a RAID type: Random Reads

• RAID 1/0 is best for random I/O, but costs more


• For an equal number of disks, RAID 5 performs very close to
RAID 1/0 in read-heavy (80%) environments
• For an equal number of disks, RAID 6 performs the same as RAID 5
in read-heavy environments; write performance is lower
• At an equal capacity, RAID 1/0 will offer much higher performance

[Diagram: High Ratio of Reads
 Equivalent spindles (RAID 5 3+1 vs. RAID 1/0 2+2): about the same speed;
 RAID 1/0 has a higher cost/GB.
 Equivalent capacity (RAID 5 3+1 vs. RAID 1/0 3+3): RAID 1/0 is best;
 RAID 1/0 has a higher cost/GB.]


Performance Tuning
2. Choosing a RAID type: Random Writes

• If ratio of writes is above 20%, RAID 1/0 will be more efficient than
RAID 5 (similarly for RAID 6)
– RAID 1 and 1/0 require two disk operations for each host write
– RAID 5 requires four operations per host write
– RAID 6 requires six operations per host write

[Diagram: High Ratio of Writes
 Equivalent spindles (RAID 5 3+1 vs. RAID 1/0 2+2): RAID 1/0 is better;
 less load, but higher cost/GB.
 Equivalent capacity (RAID 5 3+1 vs. RAID 1/0 3+3): RAID 1/0 is much better;
 less load and more disks, but higher cost/GB.]



Performance Tuning
2. Choosing a RAID type: Sequential/High Bandwidth

• Parity RAID types do full-stripe writes (reduced parity penalty)


• RAID 5 better than RAID 1/0
– RAID 1/0 is good, but RAID 5 is a bit faster
– Fewer drives to synchronize: N+1 not N+N
• RAID 6 is nearly identical for reads and about 10% lower for writes
• RAID 3 has a slightly shorter code path than RAID 5
– RAID 3 with NL-SAS drives yields close to SAS bandwidth
• NL-SAS with 8-stripe parity is also near SAS bandwidth for other parity RAID
types
– Remember though: with ANY random load on these drives, you should
use another RAID type
• Anything but exclusive access would also favour another RAID type
• RAID 1/0 is best when workload is mixed (some sequential, some
random)



Performance Tuning
3. Determine your disk count
• If data comes from a host tool, we have to convert it to disk IOPS
– sar, Perfmon, iostat, etc.
• A host write may cause one or more disk I/O
– Sequential access will likely result in fewer disk I/O
• Sequential I/O are coalesced
– Random access will result in more I/O due to RAID type
• For small to moderate I/O sizes (typically up to 16 KB) and purely
random access, convert host load to disk load based on your RAID type:
• RAID 5: Total I/O = Host Reads + 4 * Host Writes
• RAID 6: Total I/O = Host Reads + 6 * Host Writes
• RAID 1 and RAID 1/0: Total I/O = Host Reads + 2 * Host Writes

Example: HOST LOAD: 5,200 Random IOPS, 60% Reads, 40% Writes

RAID 5 Disk Load RAID 6 Disk Load RAID 1/0 Disk Load
= 0.6 * 5,200 + (4 * ( 0.4 * 5,200 )) = 0.6 * 5,200 + (6 * ( 0.4 * 5,200 )) = 0.6 * 5,200 + (2 * ( 0.4 * 5,200 ))
= 3,120 + 4 * 2,080 = 3,120 + 6 * 2,080 = 3,120 + 2 * 2,080
= 3,120 + 8,320 = 3,120 + 12,480 = 3,120 + 4,160
= 11,440 IOPS = 15,600 IOPS = 7,280 IOPS
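
A minimal Python sketch of the conversion above (illustrative only; the write multipliers apply to small, purely random I/O):

    WRITE_MULT = {"RAID5": 4, "RAID6": 6, "RAID10": 2}

    def disk_iops(host_iops, read_ratio, raid):
        """Convert a random host load to back-end disk IOPS."""
        reads = host_iops * read_ratio
        writes = host_iops * (1 - read_ratio)
        return reads + WRITE_MULT[raid] * writes

    # Host load: 5,200 random IOPS, 60% reads
    for raid in ("RAID5", "RAID6", "RAID10"):
        print(raid, int(disk_iops(5200, 0.6, raid)))   # 11440, 15600, 7280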



Performance Tuning
3. Determine your disk count (cont.)
• Disk drives are a critical element of Unified performance
– Use the rule of thumb to determine the number of drives to use
• Rules of Thumb for Drive Performance
– These are a conservative starting point for analysis, not the absolute maximums!
– IOPS denotes small block random with good response time
– Bandwidth denotes large block sequential

                10K rpm         15K rpm      Flash Drive   NL-SAS
                (2.5" & 3.5")

IOPS            150 IOPS        180 IOPS     3500 IOPS     90 IOPS

Bandwidth       6-24 MB/s       8-32 MB/s    100 MB/s      2.5-16 MB/s
(64KB-512KB)

Example: HOST LOAD: 5,200 Random IOPS, 60% Reads

                 RAID 5                      RAID 6                       RAID 1/0
                 Disk Load = 11,440 IOPS     Disk Load = 15,600 IOPS      Disk Load = 7,280 IOPS

10K rpm Drives   11,440 / 150 = 77 drives    15,600 / 150 = 104 drives    7,280 / 150 = 49 drives
15K rpm Drives   11,440 / 180 = 64 drives    15,600 / 180 = 88 drives     7,280 / 180 = 42 drives
Flash Drives     11,440 / 3500 = 4 drives    15,600 / 3500 = 5 drives     7,280 / 3500 = 3 drives
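
A quick Python sketch of the drive-count arithmetic, using the rule-of-thumb figures above (note the table rounds a couple of counts up beyond a plain ceiling, presumably to practical group sizes):

    import math

    RULE_OF_THUMB_IOPS = {"10K": 150, "15K": 180, "Flash": 3500, "NL-SAS": 90}

    def drives_needed(disk_load_iops, drive_type):
        """Divide the disk load by the per-drive rule of thumb, rounding up."""
        return math.ceil(disk_load_iops / RULE_OF_THUMB_IOPS[drive_type])

    print(drives_needed(11440, "15K"))   # 64 drives for the RAID 5 load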



Performance Tuning
4. Choose the right system
• IOPS Workloads
– Choose the system that holds the number of drives required
• Unified systems typically scale up to their maximum drive counts (check the BPG)
• Bandwidth Workloads
– Bandwidth ranges below are for actively concurrent drives, 256 KB I/O

15K rpm SAS        VNX5300       VNX5500       VNX5700        VNX7500

IOPS 4KB Random    120 disks     240 disks     440 disks      920 disks
(Read IOPS)        34,288 IOPS   69,813 IOPS   123,841 IOPS   176,800 IOPS

Read Bandwidth     120 disks     140 disks     200 disks      480 disks
                   4,054 MB/s    4,144 MB/s    5,505 MB/s     8,802 MB/s

Write Bandwidth    120 disks     180 disks     160 disks      240 disks
                   1,606 MB/s    1,859 MB/s    2,193 MB/s     2,503 MB/s

Write Bandwidth    120 disks     240 disks     440 disks      920 disks
(cache bypass)     883 MB/s      1,777 MB/s    3,112 MB/s     4,085 MB/s
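
A hypothetical sizing helper in Python, using the tested drive counts from the table's IOPS row (the true per-model maximums should be confirmed in the BPG):

    SYSTEMS = [("VNX5300", 120), ("VNX5500", 240), ("VNX5700", 440), ("VNX7500", 920)]

    def smallest_system(drives_required):
        """Return the first model in the lineup that holds enough drives."""
        for model, drives in SYSTEMS:
            if drives >= drives_required:
                return model
        return None   # load needs to be split or re-planned

    print(smallest_system(200))   # VNX5500: first model with >= 200 drives in this list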



EFD Performance Rules of Thumb

3,500 IOPS per drive, 100 MB/s per drive


• Compare to 180 IOPS for a 15K rpm HDD: a ~19X increase
• For simple “one number” estimation of EFD performance
– All EFD drive choices
– Mixed read/write workloads
• That’s a “back-end” measure so add all RAID parity
operations
• Why 3500? It’s lower than the 5000 or more we see in
tests…
1. Benchmark tests under optimal conditions, with tuned hosts and
no contending I/O on the array
2. No application latency, think time or bursts
3. Real workloads are often not tuned or optimal, and apps WILL
have to be tuned to get the most out of EFDs
4. Conservative estimates give good results across a broad range of uses
• Applying EFD to the random workload in the system-sizing table above
– Uses 10X fewer drives to saturate an array



Performance Tuning
5. LU Distribution, MetaLUNs, and Pool LUNs
• Distribute your I/O load evenly across your available disks
– IOPS: Share spindles with multiple processes
– Bandwidth: Regular LUNs, not metaLUNs or pool LUNs, are usually
better
• Fewer seeks, more sequential operation
• But – Avoid linked contention
• Primary and clone on same RAID group or pool
• Primary and Reserved LUN Pool on same RAID group (RLP not
supported on pools)
• Application contention
• Meta or Host Striping across the same RAID group

In Linked Contention, disk heads must move over large ranges of disk addresses in
order to service I/O being requested at the same time.

[Diagram – EXAMPLE: Snapshot save area on the same disk group as the
primary LUN. A host write to the primary LUN triggers a read of the
original chunk from the primary LUN, followed by a write of that chunk
to the snapshot save LUN, all on the same disks.]



Performance Tuning
6. How to allocate cache

• Write Cache should get the largest share
– 14,250 MB maximum available with VNX7500
– 10,906 MB available for the VNX5700
– 6,988 MB for VNX5500
– 3,997 MB for VNX5300
– 801 MB for VNX5100
(Note: write cache size will be affected as you enable features like FAST Cache)
• Read Cache
– Smaller Systems (VNX5100, VNX5300 and VNX5500)
• Typical users need <= 10% of available cache as Read Cache ( <=20% on
CX4 products)
– Large Systems (VNX5700 and VNX7500)
• Recommend 1GB Read Cache per SP and the rest to write cache
– We could allocate less, as we don’t need to keep the data around
long (example: prefetching)
– Can turn off read cache at the LUN level if desired, but not at a pool level

[Diagram: a 1 MB HOST READ, prefetched via 64 KB disk reads, leaves
1.5 MB in read cache]
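A small Python sketch of this allocation guidance, using the per-model write-cache figures above (values in MB; a hypothetical helper, not an array API):

    MAX_WRITE_CACHE_MB = {"VNX7500": 14250, "VNX5700": 10906, "VNX5500": 6988,
                          "VNX5300": 3997, "VNX5100": 801}

    def split_cache(model, available_mb):
        """Large systems get ~1 GB read cache per SP; small systems <= 10%.
        The remainder goes to write cache (capped by the model maximum)."""
        read_mb = 1024 if model in ("VNX5700", "VNX7500") else int(available_mb * 0.10)
        write_mb = min(available_mb - read_mb, MAX_WRITE_CACHE_MB[model])
        return read_mb, write_mb

    print(split_cache("VNX5700", 11930))   # (1024, 10906)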



Performance Tuning
7. Managing cache space
• Watermarks – Your tool for overall cache optimization
– A global setting; defaults to 80 (high) / 60 (low)
– High watermark
• Sets the amount of reserve cache space (between the high watermark and 100%)
• Acts as a trigger for watermark flushing
– Low watermark
• Where watermark flushing stops
• Sets the amount of dirty pages to keep around for cache hits

[Chart: Cache Usage Over Time – percent cache used vs. time, oscillating
between the low watermark (60%) and the high watermark (80%). Regions are
labeled Idle Flush (below the low watermark), Hi-Water Flush (above the
high watermark), and Forced Flush (at 100%).]

To absorb bursts, we reserve some cache space – the area above the high
watermark. Set the watermarks lower if you hit too many forced flushes.
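
A conceptual Python sketch of the flushing behavior described above (not the actual firmware logic; thresholds are the defaults):

    HIGH_WM, LOW_WM = 80, 60   # percent of write cache holding dirty pages

    def flush_action(dirty_pct, flushing):
        """Pick a flush mode from the current dirty-page percentage."""
        if dirty_pct >= 100:
            return "forced flush"       # cache full: host writes stall
        if dirty_pct >= HIGH_WM:
            return "watermark flush"    # high watermark triggers flushing
        if flushing and dirty_pct > LOW_WM:
            return "watermark flush"    # keep flushing down to the low watermark
        return "idle flush only"        # flush opportunistically when idle

    print(flush_action(85, flushing=False))   # watermark flush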



Performance Tuning
8. Caching – Cache page
• This is the chunk used to organize the cache
– The cache page size is configurable: 2, 4, 8, or 16 KB
• Read and Write cache use the same size, as do all LUNs
– Size has a minor performance impact at the margin
• Small pages are a little better for small I/Os
• Large pages are a little better for large I/Os and sequential
• What size to use?
– Match to OS/host/application I/O size
– For high bandwidth with RAID 5, RAID 6 and RAID 3
• Use 4 KB or larger
• With disk groups of 8+1 or more – use 8 KB or larger cache page
– A 2 KB page size will not do full-stripe writes (FSW) with wide RAID groups (4+1 is OK though)
• For maximum bandwidth: use 16 KB
• Bottom line
– Default of 8 KB is the best choice if access is mixed



Performance Tuning
9. Cache – Traditional Logical unit settings
• Leave cache turned on
– There are very few cases where turning it off will help
• Write-Aside – Conditional write cache disable
– Largest I/O that the array will put in write cache
• Automatic bypass of write caching for sizes above this value
– Why?
• Write-aside can improve bandwidth beyond CMI/internal memory bandwidth
for large I/Os
– For RAID 5, RAID 6 and RAID 3, make sure bypassed I/O is:
• Multiple of the stripe size
• Aligned on the stripe
– For RAID 1/0, aligned I/O will be best
– This will probably INCREASE response time unless cache is already
saturated

• Default is 2048 blocks for VNX (and CX4 and CX3 series systems)
– VNX systems cache up to 1 MB very effectively
– Use CLI to change:
– naviseccli -user <username> -password <pwd> -scope <0|1> -h
<ip_address> chglun -l <LUN ID> -w 2048
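
A rough Python sketch of when a write bypasses cache and stays stripe-friendly under the alignment rules above (illustrative; stripe element assumed at the 64 KB default):

    STRIPE_ELEMENT_KB = 64

    def write_aside_ok(offset_kb, size_kb, data_disks, write_aside_blocks=2048):
        """Does this write bypass cache, aligned and sized to full stripes?"""
        stripe_kb = STRIPE_ELEMENT_KB * data_disks
        bypasses = size_kb > write_aside_blocks // 2   # blocks are 512 bytes
        aligned = offset_kb % stripe_kb == 0
        full_stripes = size_kb % stripe_kb == 0
        return bypasses and aligned and full_stripes

    # 2 MB aligned write to a RAID 5 4+1 (256 KB stripe), write-aside at 2048 blocks
    print(write_aside_ok(0, 2048, 4))   # True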



Performance Tuning
9. Cache – Traditional Logical unit settings
• Prefetch
– Prefetch Disable: no prefetch for requests of this size and larger
• Set in number of 512-byte blocks
• Default is 4097 blocks
– Any I/O of 2048.5 KB and larger will not cause prefetch
• Intended to regulate back-end access in mixed environments
– Max Prefetch: the most data we’ll grab (ceiling on the multiplier)
• More regulation to ensure the back end is not flooded with prefetch requests
• Default is 4096 (2 MB); can be set as high as 8192 blocks (4 MB).
– Retention Priority (retain prefetch by default)
• Determines whether prefetch is favored over host requested data when read
cache becomes full
– Idle Count
• Determines when prefetching occurs relative to the number of host I/O requests
to the LUN. If the number of host I/O requests is less than the idle count,
prefetching occurs; otherwise, prefetching does not occur.
– Really a throttle such that if there are many outstanding IO requests for a
LUN, don’t waste time prefetching but get on with doing the requested IO
– We can process a good number of Host IO and maintain good prefetch
capability
– Keep the default value (40) for this parameter.



Performance Tuning
9. Cache – Traditional Logical unit settings
• Prefetch Types
– Fixed – if sequentiality is detected, grabs a fixed number of blocks
• “Segment” is how many blocks at a time
– Variable (default)
• “Multiplier” is how much to prefetch, multiplied by the I/O size
• The “Segment” is used to control the size and number of requests to the
back end. The size of the original I/O requested is multiplied by this value.
The back end will then request chunks of this size until the total prefetch
request is satisfied.
Example 1: 2 KB I/O with Multiplier of 4 and Segment of 4: this kicks off a prefetch
totaling 8 KB. As the Segment is 4, the 8 KB is requested all at once (2 KB * 4 = 8 KB).
Example 2: 16 KB I/O with Multiplier of 4 and Segment of 2: this kicks off a prefetch
totaling 64 KB, requested in two 32 KB segments (2 * 16 KB = 32 KB).
Example 3: 1 MB I/O with Multiplier and Segment of 4. This will request up to 4 MB, but if
the Max Prefetch is set to a smaller value, the total requested will equal the Max
Prefetch value
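
The three examples in Python form, as a minimal sketch (default Max Prefetch of 4096 blocks = 2 MB assumed):

    def variable_prefetch(io_kb, multiplier=4, segment=4, max_prefetch_kb=2048):
        """Return (total prefetch, back-end chunk size, request count)."""
        total = min(io_kb * multiplier, max_prefetch_kb)   # capped by Max Prefetch
        chunk = io_kb * segment                            # size per back-end request
        requests = -(-total // chunk)                      # ceiling division
        return total, chunk, requests

    print(variable_prefetch(2))               # (8, 8, 1): one 8 KB request
    print(variable_prefetch(16, segment=2))   # (64, 32, 2): two 32 KB requests
    print(variable_prefetch(1024))            # (2048, 4096, 1): capped at Max Prefetch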



Performance Tuning
9. Cache – Pool Logical unit settings
• Pools have a FAST Cache enable/disable option
– A pool-wide option
• Pool LUNs, both thin and thick, have no cache options
– Ease of use
• Pools “under the covers” comprise private RAID groups and private
LUNs
– Cache defaults are fixed at the private pool LUN level
• Pool LUNs do benefit from disk-access optimization, but at the
private-object level, with no tunable options



Performance Tuning
10. Stripes and Traditional Logical Units

• The default stripe element size is 64 KB (128 blocks)


– Don’t change it!
• Be careful with huge stripes (large RAID groups)
– Requires a larger cache page for high bandwidth writes
– Harder to get good bandwidth on write-cache bypass
– Takes longer to rebuild
– Better to use smaller groups and MetaLUNs/host striping
• Fix alignment issues when you begin
– Windows is the primary concern (prior to Windows 2008 and 7)
– Use DISKPAR.EXE or DISKPART.EXE to align at 64 KB (see BPG for
example)
• For MetaLUNs, use a Stripe Multiplier of 4
– Additional guidance in the BPG
• For Pool LUNs, the stripe element applies to private pool LUNs that
you have no visibility into or control of; they use the default
64 KB
– Alignment is still the same for Pool LUNs, i.e., use diskpar/diskpart
prior to Windows 2008 and 7
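For instance, a 64 KB-aligned partition on an older Windows host looks roughly like this in diskpart (a sketch from memory; the BPG has the worked example):

    DISKPART> select disk 1
    DISKPART> create partition primary align=64
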
Performance Tuning
11. High Bandwidth
• Key for parity RAID (RAID 3/5/6) is MR3 writes (full-stripe writes)
– RAID 1/0 is not as good for overall system bandwidth, but it is easy to get
good bandwidth from a RAID 1/0 group
– Use RAID 3 for ATA; see “4_Parity_RAID_and_Alignment.ppt”
• With any random component, use RAID 5 (8-stripe parity on CX4 and CX3)
• VNX NL-SAS appears to be close with RAID 5 anyway, even for pure sequential
• Use the default stripe element size of 64 KB (128 blocks) even for
RAID 3
– This will keep you out of a LOT of trouble
• Max backend write (MBW) size
– MBW = Cache page size * 128
• PARITY RAID & MR3 (Full Stripe Write) Dependencies
– Stripe element must be 128 blocks or smaller
– MBW must be >= stripe size
– Cache page and stripe element must be aligned
• Integer multiples of each other; e.g., a 4 KB page with a 30-block stripe
element gets no MR3!
– MR3 requires either cached sequential I/O or uncached I/O that is aligned
to the RAID stripe and an integral multiple of the RAID stripe size
• Example: for a 256 KB stripe – 256 KB, 512 KB, or 1 MB
– You need an MBW of 1 MB for decent ATA/NL-SAS RAID 3 performance, and 2 MB
is better; that’s an 8 KB (1 MB) or 16 KB (2 MB) cache page for a 4+1 RG!
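
A small Python sketch of the MR3 dependency checks above (illustrative, with 512-byte blocks assumed):

    BLOCK_KB = 0.5   # one block = 512 bytes

    def mr3_possible(cache_page_kb, stripe_element_blocks, data_disks):
        """Check the full-stripe-write (MR3) prerequisites listed above."""
        mbw_kb = cache_page_kb * 128                          # max backend write
        stripe_kb = stripe_element_blocks * BLOCK_KB * data_disks
        page_blocks = int(cache_page_kb / BLOCK_KB)
        element_ok = stripe_element_blocks <= 128
        mbw_ok = mbw_kb >= stripe_kb
        aligned = (stripe_element_blocks % page_blocks == 0 or
                   page_blocks % stripe_element_blocks == 0)
        return element_ok and mbw_ok and aligned

    print(mr3_possible(2, 128, 8))   # False: 256 KB MBW < 512 KB stripe (8+1)
    print(mr3_possible(8, 128, 4))   # True: 1 MB MBW covers the 256 KB stripe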



Summary

1. Know Your I/O


2. Choosing a RAID Type
3. Disk Count
4. Choosing the Right System
5. LU Distribution and MetaLUNs
6. Cache Allocation
7. Managing Cache Space
8. Cache Page Size
9. Logical Unit Settings
10. Stripes
11. High Bandwidth

Revisit your settings periodically to ensure you’re still on track.



Reference

CLARiiON Storage Fundamentals – on Powerlink

EMC CLARiiON Best Practices for Performance and Availability – on Powerlink

SPEED and CSPEED Official Site* – http://speed.corp.emc.com/

* Must turn off popup blockers

