DRP Best Practices August 2019
Data Reduction
Best Practices
© Copyright IBM Corporation 2019
Contents
Data Reduction Techniques
Spectrum Virtualize Data Reduction Technologies
Data Reduction Design Patterns
Data Reduction Usage: Further Considerations
Data Reduction Techniques
Thin Provisioning
Capacity is allocated on demand when the data is first written.
Compression
Data is compressed before being written to storage.*
Deduplication
Duplicates of data are detected and replaced with references to the first copy.*
Compression and deduplication make use of thin provisioning.
*Some other vendors don’t run these operations in real time, i.e. the original data is first written uncompressed and non-deduplicated, and a background process reduces it later.
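The three techniques above can be illustrated with a toy in-memory store. This is only a sketch: the class name `ToyReducedStore`, the use of SHA-256 fingerprints, and zlib compression are all assumptions made for illustration and say nothing about how Spectrum Virtualize implements data reduction internally.

```python
import hashlib
import zlib


class ToyReducedStore:
    """Hypothetical toy store: thin provisioning, inline compression, inline dedup."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> compressed bytes, stored once
        self.volume = {}  # logical block address -> fingerprint (only written LBAs exist)

    def write(self, lba, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in self.chunks:
            # Compression happens before the data reaches storage (inline).
            self.chunks[fingerprint] = zlib.compress(data)
        # A duplicate write only records a reference to the first copy.
        self.volume[lba] = fingerprint

    def read(self, lba):
        return zlib.decompress(self.chunks[self.volume[lba]])

    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ToyReducedStore()
block = b"A" * 4096
store.write(0, block)
store.write(7, block)        # duplicate of the block at LBA 0
store.write(9, b"B" * 4096)  # new data
```

Capacity is consumed only for the three addresses that were written (thin provisioning), and the duplicate at address 7 adds a reference rather than a second stored copy.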
Spectrum Virtualize
Data Reduction Technology
Data Reduction with Data Reduction Pools
Data Reduction Pools (DRP) deliver thin provisioning, compression and deduplication across all volumes in a pool.
Requires best practices be followed because:
• Garbage discounted at a DRP level may still exist at the physical storage level.
(Garbage is data that is no longer needed but has not yet been deleted)
• Using fully allocated volumes with a compressing back end may lead to a distorted view of physical storage remaining.
(DRP sees a fully allocated volume, compressed back end sees a thin, compressed volume)
If best practices are not followed:
• Running out of storage will take volumes offline, making it difficult or time consuming to recover.
• Under-estimating the amount of storage remaining negates the cost benefits of using data reduction and makes long term planning challenging.
As with any data reduction technology, you should always track usage of underlying storage.
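The garbage point above can be made concrete with a small calculation. This is a simplified model under stated assumptions: the function names and all capacity figures are hypothetical, and real pools reclaim garbage continuously rather than in one step.

```python
def drp_level_free(capacity_tib, live_tib):
    """Free space if garbage is discounted, as a DRP-level view might report it."""
    return capacity_tib - live_tib


def physical_level_free(capacity_tib, live_tib, garbage_tib):
    """Free space on the physical storage, where garbage still occupies space."""
    return capacity_tib - (live_tib + garbage_tib)


# Hypothetical figures, in TiB.
capacity, live, garbage = 100, 60, 25
drp_view = drp_level_free(capacity, live)
physical_view = physical_level_free(capacity, live, garbage)
print(f"DRP-level free: {drp_view} TiB, physical free: {physical_view} TiB")
```

With these illustrative numbers the two views differ by the full amount of unreclaimed garbage, which is why the underlying storage has to be tracked directly.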
Data Reduction with FlashCore Modules
The FlashSystem 9100, Storwize V7000 Gen 3 and Storwize V5100 contain compressing physical storage (i.e. FlashCore Modules, or FCMs).
FCMs compress data at line speed with no performance impact to the workload.
DRP is fully supported within these products, even when using FCMs.
If virtualizing external storage using these or any other Spectrum Virtualize products, then the best practices still apply!
When using FCM based products:
The user should ALWAYS monitor physical space.
The GUI will ensure the user creates configurations that are sensible and measurable.
And the storage controller uses knowledge of both the logical space and physical space.
(This is an advantage of Spectrum Virtualize being tightly integrated with the storage controller where more information about the underlying storage is known)
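Monitoring physical space can be as simple as comparing used physical capacity against alert thresholds. The sketch below is an illustration only: the function name and the 80% and 90% thresholds are assumptions, not values taken from this deck or from any IBM product, and the capacity figures would come from whatever reporting interface is in use.

```python
def physical_space_status(used_tib, capacity_tib, warn_at=0.80, critical_at=0.90):
    """Classify physical usage; thresholds are hypothetical examples."""
    if capacity_tib <= 0:
        raise ValueError("capacity must be positive")
    fraction_used = used_tib / capacity_tib
    if fraction_used >= critical_at:
        return "critical"
    if fraction_used >= warn_at:
        return "warning"
    return "ok"


# Hypothetical readings, in TiB of physical (post-compression) capacity.
for used in (50, 85, 95):
    print(used, physical_space_status(used, 100))
```

The check is deliberately based on physical capacity rather than logical capacity, since physical space is what runs out.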
Considerations When Using Data Reduction at 2 levels
If you create a solution where data reduction technologies are applied at both the storage and the virtualization appliance levels, then here are the rules you should follow:
ALL DRP volumes should run with compression switched on.
(DRP uses Intel Compression Acceleration hardware and will not adversely affect general DRP performance)
DRP above simple RAID
Recommended!
(Diagram label: Spectrum Virtualize with DRP)
Use DRP at top level to plan for deduplication and snapshot optimisations.
DRP at top level provides best application capacity reporting (volume written capacity).
Fully-allocated above single-tier data reducing backend
Recommended with appropriate precautions!
(Diagram label: Data Reduced Storage, e.g. FlashSystem A9000)
Using DRP with an over-allocated back end could lead to the DRP garbage causing out-of-space.
Avoid!!
(Diagram label: Spectrum Virtualize DRP w/compression + FA)
Very difficult to understand used capacity when combining DRP and FA and a compression backend.
Fully-allocated above multi-tier data reducing backends
Use with great care!
(Diagram label: Spectrum Virtualize Fully Allocated)
Easy Tier is unaware of physical capacity in tiers of a hybrid pool.
Easy Tier will tend to fill the top tier with hottest data.
DRP above DRP
Avoid!!
(Diagram label: Spectrum Virtualize with DRP)
Creates two levels of IO amplification on metadata.
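The titled design patterns in this section can be collected into a small lookup table for quick reference. The verdict strings are copied from the slides; the dictionary, the key names, and the `verdict` function are hypothetical conveniences, and combinations the deck does not title explicitly are deliberately left out.

```python
# Verdicts copied from the design-pattern slides; key names are hypothetical.
DESIGN_PATTERNS = {
    ("drp", "simple_raid"): "Recommended!",
    ("fully_allocated", "single_tier_data_reducing"): "Recommended with appropriate precautions!",
    ("fully_allocated", "multi_tier_data_reducing"): "Use with great care!",
    ("drp", "drp"): "Avoid!!",
}


def verdict(top_layer, back_end):
    """Return the deck's verdict for a (virtualization layer, back end) pairing."""
    return DESIGN_PATTERNS.get((top_layer, back_end), "Not covered by this deck")


print(verdict("drp", "simple_raid"))
print(verdict("drp", "drp"))
```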
Data Reduction Usage
Further Considerations
Summary - Planning
Flexibility Is Great, But Choose The Right Technology
Use Disk Magic to model the performance with both FCM and DRP data reduction.
(IBM or your Business Partner can help with this)
Use Compresstimator to identify expected space savings.
Data Reduction Estimation Tool (DRET) can identify deduplication workloads.
Choose The Right Design Pattern
Don’t over-complicate.
If virtualizing a compressing storage system use a recommended design.
Other designs make it impossible to understand your real capacity consumption and you will run out of space!
A workload with a lot of identical data across one or more volumes
The more duplicates there are, the greater the savings realized in terms of physical space used.
As mentioned, data reduction costs IOPS, so balance the capacity savings against the performance requirements.
VDI tends to have a large write contingent of around 30/70.
As well as having a lot of duplication, this kind of workload benefits from having more cache.
To fully realize the benefits of the system, configure the VDI deployment across multiple VMFS datastores so as to increase parallelism. At least 16 if possible.
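The relationship between duplicates and savings can be sketched with a fingerprint count. This is a rough illustration, not a substitute for DRET: the function name and the sample blocks are hypothetical, and it ignores compression, metadata overhead, and block alignment.

```python
import hashlib


def dedup_savings(blocks):
    """Fraction of blocks that would not need storing if duplicates were removed."""
    if not blocks:
        return 0.0
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return (len(blocks) - len(unique)) / len(blocks)


# Hypothetical VDI-like sample: eight identical OS blocks plus two user blocks.
sample = [b"os-image-block"] * 8 + [b"user-1-data", b"user-2-data"]
savings = dedup_savings(sample)
print(f"Estimated deduplication savings: {savings:.0%}")
```

Ten blocks reduce to three unique ones in this sample, so seven of the ten writes would be stored as references only.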
Summary - Monitoring
Further Reading
White Paper - Best Practices Details for Managing Physical Space on FlashSystems 900-AE3
https://www-01.ibm.com/support/docview.wss?uid=ibm10735459