DRP Best Practices August 2019


IBM Spectrum Virtualize

Data Reduction
Best Practices

© Copyright IBM Corporation 2019
Contents

• Data Reduction Techniques
• Spectrum Virtualize Data Reduction Technology
• Data Reduction Design Patterns
• Data Reduction Usage: Further Considerations
Data Reduction Techniques

Thin Provisioning
Capacity is allocated on demand when the data is first written. Compression and deduplication make use of thin provisioning.

Compression
Data is compressed before being written to storage*

Deduplication
Duplicates of data are detected and replaced with references to the first copy*

*Some other vendors don't run these operations in real time, i.e. the original data is first written uncompressed/non-deduplicated and a background process reduces it later.
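The deduplication idea above can be sketched as a content-addressed block store: each incoming block is hashed, and a block whose hash has been seen before is stored only as a reference to the first copy. A minimal toy illustration (not IBM's implementation; all names are invented):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: duplicate blocks become references."""
    def __init__(self):
        self.blocks = {}   # hash -> actual data (first copy only)
        self.refs = []     # one entry per logical write, referencing a block

    def write(self, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = data   # first copy: store the data
        self.refs.append(digest)         # duplicates: store only a reference

    def physical_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())

store = DedupStore()
for block in [b"A" * 4096, b"B" * 4096, b"A" * 4096]:  # one duplicate block
    store.write(block)

print(len(store.refs))         # 3 logical writes
print(store.physical_bytes())  # 8192 physical bytes (2 unique blocks)
```

The savings grow with the duplicate ratio, which is exactly why workloads like VDI (discussed later) benefit so much.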
Spectrum Virtualize
Data Reduction Technology

Data Reduction with Data Reduction Pools

Data Reduction Pools (DRP) deliver thin provisioning, compression and deduplication across all volumes in a pool.

DRP requires that best practices be followed because:

• Garbage discarded at the DRP level may still exist at the physical storage level. (Garbage is data that is no longer needed but has not yet been deleted.)
• Using fully allocated volumes with a compressing back end may lead to a distorted view of the physical storage remaining. (DRP sees a fully allocated volume; the compressed back end sees a thin, compressed volume.)

If best practices are not followed:

• Running out of storage will take volumes offline, making it difficult or time consuming to recover.
• Under-estimating the amount of storage remaining negates the cost benefits of using data reduction and makes long term planning challenging.

As with any data reduction technology, you should always track usage of the underlying storage.
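The "distorted view" risk above is simple arithmetic: the virtualization layer counts a fully allocated volume at its full logical size, while the compressing back end only consumes compressed bytes, so the two layers disagree about how much space is really committed. A hypothetical example (all numbers invented):

```python
# A 10 TiB fully allocated volume sitting on a back end compressing 2:1.
volume_logical_tib = 10.0
backend_compression_ratio = 2.0   # assumed ratio; real ratios vary with data

# What each layer believes the volume consumes:
drp_view_tib = volume_logical_tib                                  # 10 TiB
backend_view_tib = volume_logical_tib / backend_compression_ratio  # 5 TiB

# If the data later becomes incompressible (e.g. it is encrypted), the back
# end suddenly needs the full logical size, capacity it may already have
# promised to other volumes.
shortfall_tib = drp_view_tib - backend_view_tib
print(shortfall_tib)  # 5.0 TiB of "phantom" free space
```

This is why the slides insist on tracking the underlying physical usage rather than trusting the logical view alone.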
Data Reduction with FlashCore Modules

The FlashSystem 9100, Storwize V7000 Gen 3 and Storwize V5100 contain compressing physical storage (i.e. FlashCore Modules, or FCMs).

• FCMs compress data at line speed with no performance impact to the workload.
• DRP is fully supported within these products, even when using FCMs.
• If virtualizing external storage using these or any other Spectrum Virtualize products, the best practices still apply!

When using FCM-based products:

• The user should ALWAYS monitor physical space.
• The GUI will ensure the user creates configurations that are sensible and measurable.
• The storage controller uses knowledge of both the logical space and the physical space. (This is an advantage of Spectrum Virtualize being tightly integrated with the storage controller, where more information about the underlying storage is known.)
Considerations When Using Data Reduction at Two Levels

If you create a solution where data reduction technologies are applied at both the storage and the virtualization appliance levels, follow these rules:

• Physical storage should be allocated 1:1. This means the usable capacity in the storage must be equal to the effective capacity provisioned to SVC. Review the documentation for how to do this for your storage system.
• Fully allocated volumes should be in their own pool.
• ALL DRP volumes should run with compression switched on. (DRP uses Intel compression acceleration hardware, and this will not adversely affect general DRP performance.)
• If you want to use DRP with an existing over-allocated back end, you need to reclaim storage and configure it according to the best practices.
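The 1:1 rule is easy to express as a sanity check: the effective capacity the back end presents to the virtualization layer must not exceed its real usable capacity. A hypothetical helper (the names and figures are illustrative; this is not a Spectrum Virtualize API):

```python
def check_one_to_one(usable_tib: float, effective_tib: float) -> bool:
    """Return True if the back end is provisioned 1:1 (no overcommit).

    usable_tib:    real usable capacity of the backing storage
    effective_tib: effective capacity presented to the virtualization layer
    """
    if usable_tib <= 0 or effective_tib <= 0:
        raise ValueError("capacities must be positive")
    return effective_tib <= usable_tib

# Overcommitted back end: 100 TiB usable presented as 250 TiB effective.
print(check_one_to_one(usable_tib=100.0, effective_tib=250.0))  # False
# Correctly provisioned 1:1:
print(check_one_to_one(usable_tib=100.0, effective_tib=100.0))  # True
```

In practice the two figures come from the back-end array's own capacity reporting; consult that product's documentation, as the slide advises.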


Data Reduction
Design Patterns

DRP above simple RAID
Recommended!

Layers: Spectrum Virtualize with DRP, above Fully Allocated Storage (e.g. Storwize V5010).

• Use DRP at the top level to plan for deduplication and snapshot optimisations.
• DRP at the top level provides the best application capacity reporting (volume written capacity).
• Always use compression in DRP to get the most benefit from the technology.
Fully-allocated above single-tier data reducing backend
Recommended with appropriate precautions!

Layers: Spectrum Virtualize with Fully Allocated volumes, above Data Reduced Storage (e.g. FlashSystem A9000).

• Need to track physical capacity use carefully to avoid out-of-space.
• Spectrum Virtualize can report physical use but does not manage it to avoid out-of-space.
• No visibility of each application's data usage at the Spectrum Virtualize layer.
• If an actual out-of-space happens, there is limited ability to recover with unmap. Also consider creating a sacrificial emergency space volume.
DRP above data reducing backend
Recommended with appropriate precautions!

Layers: Spectrum Virtualize with DRP (with compression), above Data Reduced Storage (e.g. FlashSystem A9000).

• Assume 1:1 (effective:usable) compression in the backend storage. Do not overcommit! Review the product documentation for how to do this.
• Small extra savings will be realized from compressing metadata.
• Monitor capacity and out-of-space alerts from the backend storage system.
• Using DRP with an over-allocated back end could lead to the DRP garbage causing out-of-space.
• Think: for existing systems, do you need to move to DRP to get the benefits of dedup, or is line-speed hardware compression the right performance/capacity balance?
DRP + Fully-Allocated above data reducing backend
Avoid!!

Layers: Spectrum Virtualize with DRP (with compression) plus Fully Allocated volumes, above Data Reduced Storage (e.g. FlashSystem A9000).

• Very difficult to understand used capacity when combining DRP, FA and a compressing backend.
• The temptation is to try to exploit capacity savings, which might overcommit the backend.
• If a mix of DRP and FA is required, use separate pools; the previous design patterns can be used, but must not be combined in the same pool. (Or use with extreme caution and with a mix of roughly under 20% fully allocated.)
Fully-allocated above multi-tier data reducing backends
Use with great care!

Layers: Spectrum Virtualize with Fully Allocated volumes, above multiple data reduced tiers (e.g. FlashSystem A9000 and V7000 w/ DRP).

• Easy Tier is unaware of the physical capacity in the tiers of a hybrid pool.
• Easy Tier will tend to fill the top tier with the hottest data.
• Changes in the compressibility of data between the tiers can overcommit the storage, leading to out-of-space.
• Think: is there a risk my hot data is incompressible?
DRP above DRP
Avoid!!

Layers: Spectrum Virtualize with DRP, above another Spectrum Virtualize system with DRP (e.g. V7000 w/DRP).

• Creates two levels of IO amplification on metadata.
• Introduces an additional layer of capacity management (two DRP, one physical).
• There is no performance, capacity or usability benefit in using DRP on top of DRP.
Data Reduction Usage
Further Considerations

Summary - Planning

Flexibility Is Great, But Choose The Right Technology

• Use Disk Magic to model the performance with both FCM and DRP data reduction. (IBM or your Business Partner can help with this.)
• Use Compresstimator to identify expected space savings. The Data Reduction Estimation Tool (DRET) can identify deduplication workloads.
• Use FlashCore Modules with inline hardware compression for the best performance.
• Use Data Reduction Pools if you can really benefit from deduplication capacity savings.

Choose The Right Design Pattern

• Don't over-complicate.
• If virtualizing a compressing storage system, use a recommended design.
• Other designs make it impossible to understand your real capacity consumption, and you will run out of space!
Summary - System Sizing

Right Sizing The Cache

Appropriately sizing the cache can improve the operation of the system:

• 256GB is a good starting point to allow a reasonable size non-DRP working set in cache.
• Another 128GB is recommended if you're making heavy use of copy services.
• If consolidating workloads, consider how much cache you currently have available.
• Maximize the cache to extract the best performance out of DRP (lower latencies, avoid response time spikes) and optimize DRP metadata hits.

Right Sizing Your Capacity

As a rule, try to avoid running your system past 85% full.

• Workloads can be unpredictable, and a large demand for capacity could quickly fill up system capacity. A 15% buffer provides the opportunity to handle such situations.
• As volumes fill, garbage collection could trigger write amplification, placing additional write workload on the system and slowing down application response times.
• Monitor your free space and plan ahead to avoid running garbage collection too frequently and to fully realize the performance required by your workloads.
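The 85%-full guideline above can be wired into a simple alerting check. A sketch using the thresholds from the slide (this is not a Spectrum Virtualize API; a real tool would read used and total capacity from the system's CLI or REST interface):

```python
def capacity_status(used_tib: float, total_tib: float,
                    warn_at: float = 0.85) -> str:
    """Classify pool fullness against the 85%-full guideline."""
    if total_tib <= 0:
        raise ValueError("total capacity must be positive")
    utilization = used_tib / total_tib
    if utilization >= 1.0:
        return "OFFLINE_RISK"   # running out of space takes volumes offline
    if utilization >= warn_at:
        return "WARN"           # inside the 15% buffer: act now
    return "OK"

print(capacity_status(used_tib=70.0, total_tib=100.0))  # OK
print(capacity_status(used_tib=90.0, total_tib=100.0))  # WARN
```

The point of the buffer is reaction time: a WARN state should leave enough headroom to delete, migrate, or add capacity before volumes go offline.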
Summary - Deduplication Workloads

Good Deduplication Workloads

• A workload with a lot of identical data across one or more volumes.
• The more duplicates there are, the greater the savings realized in terms of physical space used.
• As mentioned, data reduction costs IOPS, so balance the capacity savings against the performance requirements.
• A great example of a deduplication-friendly workload is VDI.

Successful VDI Deployment

• VDI tends to have a large write contingent, around 30/70.
• As well as having a lot of duplication, this kind of workload benefits from having more cache.
• To fully realize the benefits of the system, configure the VDI deployment across multiple VMFS datastores so as to increase parallelism: at least 16 if possible.
Summary - Monitoring

Monitor Capacity

• Configure alerts: get warned EARLY when capacity starts to fill up. Allow enough time to delete or migrate data, or to augment existing capacity.
• Running out of physical capacity will take volumes offline.
• Monitor your reclaimable capacity and free space for longer term planning.
• Space savings need IOPS to be realized.

Monitor System Performance

• Plan for high-bandwidth workloads (such as inbound migration) consuming excessive system resource.
• Follow Spectrum Virtualize best practices to create a balanced system.
• Consider the additional resource usage when using DRP.
• Use QoS to throttle heavy workloads and protect other workloads running on the system. (Throttling may elongate the time taken to run a workload.)
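The "longer term planning" point lends itself to a simple linear projection: from recent daily usage samples, estimate how many days remain until the pool crosses the alert threshold. A rough sketch (numbers invented; a real tool would pull the samples from the system's monitoring interface):

```python
def days_until_threshold(samples_tib: list[float], total_tib: float,
                         threshold: float = 0.85) -> float:
    """Linear projection of days until usage crosses `threshold` of total.

    samples_tib: one used-capacity sample per day, oldest first.
    Returns float('inf') if usage is flat or shrinking.
    """
    if len(samples_tib) < 2:
        raise ValueError("need at least two daily samples")
    growth_per_day = (samples_tib[-1] - samples_tib[0]) / (len(samples_tib) - 1)
    if growth_per_day <= 0:
        return float("inf")
    remaining = total_tib * threshold - samples_tib[-1]
    return max(remaining, 0.0) / growth_per_day

# Four daily samples growing 2 TiB/day toward an 85 TiB alert line
# on a 100 TiB pool: (85 - 76) / 2 = 4.5 days of headroom.
print(days_until_threshold([70.0, 72.0, 74.0, 76.0], total_tib=100.0))  # 4.5
```

A linear fit is crude, as the slides note workloads are unpredictable, but even a rough days-remaining figure is far better than reacting only when the pool is already full.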
Further Reading

White Paper - Best Practices Details for Managing Physical Space on FlashSystems 900-AE3
https://www-01.ibm.com/support/docview.wss?uid=ibm10735459
