
CommVault Building Block Configuration White Paper

June 2011


Copyright 2011 CommVault Systems, Incorporated. All rights reserved. CommVault, CommVault and logo, the "CV" logo, CommVault Systems, Solving Forward, SIM, Singular Information Management, Simpana, CommVault Galaxy, Unified Data Management, QiNetix, Quick Recovery, QR, CommNet, GridStor, Vault Tracker, InnerVault, Quick Snap, QSnap, SnapProtect, Recovery Director, CommServe, CommCell, ROMS, and CommValue are trademarks or registered trademarks of CommVault Systems, Inc. All other third party brands, products, service names, trademarks, or registered service marks are the property of and used to identify the products or services of their respective owners. All specifications are subject to change without notice.

The information in this document has been reviewed and is believed to be accurate. However, neither CommVault Systems, Inc. nor its affiliates assume any responsibility for inaccuracies, errors, or omissions that may be contained herein. In no event will CommVault Systems, Inc. or its affiliates be liable for direct, indirect, special, incidental, or consequential damages resulting from any defect or omission in this document, even if advised of the possibility of such damages.

CommVault Systems, Inc. reserves the right to make improvements or changes to this document and information contained within, and to the products and services described at any time, without notice or obligation.


Contents

1. Introduction: What is a Building Block
   1.1. Physical Layer
        1.1.1. At a Glance Specifications and Configurations
        1.1.2. Examples of Servers that Meet Building Block Requirements
   1.2. Logical View
        1.2.1. Average Throughput
        1.2.2. Deduplication Databases
        1.2.3. Number of Deduplication Databases per Building Block
        1.2.4. Deduplication Building Block Size Settings
        1.2.5. Managing Multiple DDBs and Hardware Requirements
        1.2.6. Disk Space Required for DDBs
        1.2.7. Disk Library
   1.3. Disk Attachment Considerations
2. Global Deduplication Storage Policy
   2.1. Block Size
   2.2. Disk Libraries
   2.3. Remote Offices
   2.4. Global Deduplication Storage Policy Caveats
   2.5. Streams
   2.6. Data Path Configuration
   2.7. Use Store Priming Option with Source-Side Deduplication
3. Deduplication Database Availability
   3.1. Considerations
4. Building Block Design
   4.1. Choosing the Right Building Block
   4.2. Building Block Configuration Examples
   4.3. Staggering Full Backups
5. Conclusion



1. What is a Building Block?


A large data center requires a data management solution that is flexible, scalable and hardware agnostic. This paper illustrates how the CommVault Building Block Data Management Solution delivers on all three.

The Building Blocks are flexible because they can grow by adding mount paths. They can also accommodate different retentions and different data types all within the same deduplication framework.

The Building Blocks are scalable because they can grow to hundreds of TB of unique data. Through staggering full backups, the Building Blocks can protect large amounts of data with minimal infrastructure, which holds down cost and liability.

The Building Blocks are hardware agnostic because they require hardware classes instead of specific models. Within this paper we describe six different examples of adequate servers from three major manufacturers. Additionally, the solution is completely flexible with respect to the storage infrastructure, including disk types, connectivity and brand.

A Building Block is a modular approach to data management. A single Building Block is capable of managing 64 TB of deduplicated data within a Disk Library. Each Building Block also provides processing throughput of at least 2 TB/hr. The Deduplication Building Block design comprises two layers: the physical layer and the logical layer. The physical layer is the actual hardware specification and configuration. The logical layer is the CommCell configuration that controls that hardware.

Physical Layer

There are four design considerations that make up the Building Block's physical layer:
- Server
- Data Throughput Rate
- Disk Library Hardware
- Deduplication Database (DDB) LUN

Logical Layer

There are seven aspects that comprise the Building Block logical layer:
- Average Throughput
- Deduplication Databases
- Number of Deduplication Databases per Building Block
- Deduplication Building Block Size Settings
- Managing Multiple Global Deduplication Databases and Hardware Requirements
- Disk Space Required for the Deduplication Database
- Disk Library


1.1. The Physical Layer


The physical layer comprises the hardware of the solution. In addition to servers, storage and networking play a part in the physical layer.

1.1.1. At a Glance Specifications and Configurations

Minimum Server Specifications


- 64-bit OS, Windows or Linux
- 2 CPUs, quad core
- 32 GB RAM

Minimum Data Throughput Port Specifications

- Option 1 (Recommended): 1 exclusive 10 GigE port
- Option 2: 4 x 1 GigE ports with NIC teaming on the host


Disk Library Configuration

Option 1 (Recommended)
- Network Attached Storage (NAS)
- Exclusive 10 GigE port
- 7.2K RPM SAS spindles

Option 2
- SAS/FC/iSCSI (SAS: 6 Gbps HBA; FC: 8 Gbps HBA; iSCSI: exclusive 10 GigE NIC)
- 7.2K RPM SATA/SAS spindles
- Minimum RAID 5, RAID groups with 7+ spindles each
- 2 TB LUNs, up to 50 LUNs
- Dedicated storage adaptors

Minimum DDB LUN Specifications

Option 1 - Internal Disk
- 6 Gbps SAS HBA
- DDB volume: 15K RPM SAS spindles
- RAID 0: 4 spindles; RAID 5: 5-6 spindles; RAID 10: 8 spindles; RAID 50: 10-12 spindles

Option 2 - SAN Disk
- FC: 8 Gbps HBA; iSCSI: exclusive 10 GigE NIC
- DDB volume: 15K RPM physical disks
- RAID 0: 4 spindles; RAID 5: 5-6 spindles; RAID 10: 8 spindles; RAID 50: 10-12 spindles

Note: The LUN hosting the DDB should be 3x the size of the active DDB in order to allow for recovery point reconstruction operations.


1.1.2. Examples of Servers that Meet Building Block Requirements

Servers
- Dell R710 with H700 and H800 controllers and MD storage
- HP DL 380 G6 with 480i internal controller and FC/10 GigE iSCSI/6 Gbps SAS for external storage
- IBM x3550 or above with internal SAS controller and external SAS/FC/10 GigE iSCSI controller

Blades
- Dell M610 blades in a Dell M1000e enclosure with 10 GigE backplane, with EqualLogic or MD3000i storage or an 8 Gbps FC fabric
- HP BL460 or BL600 blades in an HP c7000 enclosure with 8 Gbps FC fabric and 10 GigE Ethernet fabric
- IBM JS, PS or HS blade servers with FC/10 GigE fabrics


1.2. The Logical Layer

The logical layer is the software and configuration that controls the hardware. A properly configured logical layer allows the physical layer to achieve its potential.

1.2.1. Average Throughput

The Building Block has a minimum throughput rate of 2 TB/hr up to a maximum of 4 TB/hr. A single Building Block can transfer between 48 TB and 96 TB in a 24 hour period. A typical streaming backup window is 8 hours, which allows a Building Block to transfer 16 TB to 32 TB of data. The following table shows the expected amount of data transferred for specific time periods and throughputs. Most design cases should be scaled from an assumption of 2 TB/hr, assuming a configuration as recommended in this document.

Table: Data transferred per backup window

Backup Window    2 TB/hr    3 TB/hr    4 TB/hr
8 Hours          16 TB      24 TB      32 TB
10 Hours         20 TB      30 TB      40 TB
12 Hours         24 TB      36 TB      48 TB
14 Hours         28 TB      42 TB      56 TB
16 Hours         32 TB      48 TB      64 TB
18 Hours         36 TB      54 TB      72 TB
20 Hours         40 TB      60 TB      80 TB
22 Hours         44 TB      66 TB      88 TB
24 Hours         48 TB      72 TB      96 TB
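The figures in this table follow directly from multiplying the throughput rate by the backup window. A minimal sketch of that arithmetic, reproducing the rows above:

```python
# Sketch: reproduce the backup window table above.
# Data protected (TB) = throughput (TB/hr) x backup window (hr).
rates_tb_per_hr = (2, 3, 4)           # Building Block throughput range
for hours in range(8, 26, 2):         # 8 to 24 hour windows
    cells = "  ".join(f"{rate * hours:>3} TB" for rate in rates_tb_per_hr)
    print(f"{hours:>2} hours: {cells}")
```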


1.2.2. Deduplication Database

The Simpana v9 Deduplication engine utilizes a multi-threaded C-Tree server mode database. This database can scale to a maximum of 750 million records. This record limit is equivalent to 90 TB of data residing on the Disk Library and 900 TB of application data assuming a 10:1 deduplication ratio. The DDB has a recommended maximum of 50 concurrent connections, or streams. Any configuration above 50 concurrent DDB connections will have a negative impact on the Building Block's performance and scalability.
Deduplication Database Characteristics

Database:          C-Tree server mode
Threading:         Multi-threaded
DDB Rows:          500 million to a maximum of 750 million records
Capacity:          60-90 TB of unique data @ 128 K block size
Application Data:  600 TB to 900 TB @ 10:1 deduplication ratio
Connections:       50 concurrent connections
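The capacity figures above follow from the record limit: each DDB record tracks one unique block, so unique-data capacity is roughly the record count multiplied by the block size, and application-data capacity scales with the deduplication ratio. A minimal sketch of that arithmetic, assuming an average block size equal to the 128 K setting:

```python
# Sketch: where the 60-90 TB / 600-900 TB figures come from.
BLOCK_SIZE_KB = 128                      # deduplication block size
KB_PER_TB = 1024 ** 3                    # KB per (binary) TB

for records in (500_000_000, 750_000_000):
    unique_tb = records * BLOCK_SIZE_KB / KB_PER_TB
    print(f"{records:,} records -> ~{unique_tb:.0f} TB unique data, "
          f"~{unique_tb * 10:.0f} TB application data at 10:1")
```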

1.2.3. Number of Deduplication Databases per Building Block

CommVault recommends hosting a single deduplication database per Building Block. However, certain workloads may require higher concurrency but lower capacity. The Simpana Desktop/Laptop Solution is a perfect example of this workload. For such workloads, it is possible to host up to 2 DDBs per Building Block. This is known as DDB Extended Mode. Having the additional DDB allows a total of 100 streams per Building Block, enabling higher concurrency for the workloads.

In DDB Extended Mode, the total capacity of the DDBs will reach 60-90 TB combined. One DDB may scale to 20 TB of raw data and the other to 40 TB raw data. There is no way to easily predict the size to which a DDB will grow. In this configuration, it is a best practice to stagger the backups so that only one DDB is utilized at a time. This will ensure that each DDB will scale closer to the 60-90 TB of raw data capacity.


1.2.4. Deduplication Block Size Setting

It is a CommVault best practice to configure Simpana v9 Deduplication Storage Policy block sizes at a minimum of 128 K. This recommendation is for all data types other than databases larger than 100 GB. Large databases can be configured at 256 K (1 TB to 5 TB) or 512 K (> 5 TB) block sizes and should be configured WITHOUT software compression enabled at the Storage Policy level. This setting represents the block size that the data stream is divided into. In Simpana v9, enhancements have been made to eliminate the need for Storage Policies per data type. Any block from 16 K to the configured block size will automatically be hashed and checked into the deduplication database. This eliminates the complexity of multiple storage policies per data type.
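As an illustration only, the guidance above can be expressed as a small selection helper. The function name is hypothetical, and the treatment of databases between 100 GB and 1 TB (defaulting to 128 K) is an assumption, since the paper only calls out the 1-5 TB and greater-than-5 TB tiers:

```python
# Sketch: block-size selection per the guidance above. The helper name and the
# default for databases under 1 TB are assumptions, not CommVault settings.
def recommended_block_size_kb(is_database: bool, size_tb: float) -> int:
    if is_database and size_tb > 5:
        return 512          # very large databases, no software compression
    if is_database and size_tb >= 1:
        return 256          # 1-5 TB databases, no software compression
    return 128              # minimum recommended block size for everything else

print(recommended_block_size_kb(is_database=True, size_tb=3))    # 256
print(recommended_block_size_kb(is_database=False, size_tb=0.5)) # 128
```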

1.2.5. Managing Multiple DDBs and the Hardware Requirements

The scalability of a DDB is highly dependent upon the deduplication block size. The larger the block size, the more data can be stored in the Disk Library. Assuming a standard block size of 128 K, a DDB using a single store can comfortably grow to 64 TB without performance penalty. Using this conservative number as a guide, one can predict the number of DDBs required for a given amount of unique data. By default, the software will generate hashes for blocks that are smaller than the specified size down to a minimum size of 16 K. In Simpana v9, the minimum block size for hashing can be further reduced by using the registry key SignatureMinFallbackDataSize. This reduces the minimum deduplication block size from 16 K to 4 K. With a 128 K block storage policy, any block of 4 K or larger will be checked into the deduplication database. This registry key is ideal for Client Side Deduplication or a network optimized DASH copy over a slow network.

SignatureMinFallbackDataSize
Location: MediaAgent
Type: DWORD
Value: 4096

This registry key should be installed on the MediaAgent or the client performing signature generation. It can be pushed out via the CommCell GUI; the MediaAgent subkey will be created on the client.
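A minimal sketch of creating this value with Python's winreg module. The full registry path under the CommVault instance hive (shown here as Instance001) is an assumption and should be verified on the MediaAgent or client before use:

```python
import winreg

# Assumed hive path for the MediaAgent subkey; confirm the instance name
# (e.g. Instance001) on the machine that generates signatures.
KEY_PATH = r"SOFTWARE\CommVault Systems\Galaxy\Instance001\MediaAgent"

def set_min_fallback_size(value_bytes: int = 4096) -> None:
    """Create or update the SignatureMinFallbackDataSize DWORD."""
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "SignatureMinFallbackDataSize", 0,
                          winreg.REG_DWORD, value_bytes)

set_min_fallback_size()  # lowers the minimum hashed block from 16 K to 4 K
```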


1.2.6. Disk Space Required for DDB

The amount of disk space required for the DDB will depend on the amount of data protected, deduplication ratios, retention, and change rate. This information should be placed in the Storage_Policy_Plan table of the Deduplication Calculator. The top number (in yellow outline) is the total amount of disk space required for the active DDBs. The lower number (in blue outline) is the individual total used by each Storage Policy copy. These numbers don't take into account the DDB recovery point or the working space required, which is 3 times the store size.

1.2.7. Disk Library

A best practice is to create a single Disk Library for deduplication with no more than three Building Blocks. This is illustrated in the following table.

Data per DDB    Data Total in the Disk Library    Application Data at a 10:1 ratio    Throughput
60 TB           180 TB                            1.8 PB                              6 TB/hr
90 TB           270 TB                            2.7 PB                              6 TB/hr

Non-deduplicated data should be backed up to a separate Disk Library whenever possible. Sequestering the data types into separate Disk Libraries allows for easier reporting on the overall deduplication savings. Mixing deduplicated and non-deduplicated data in a single library will skew the overall disk usage information and make space usage prediction difficult.


Each Building Block can support 100 TB of disk storage. The disk storage should be partitioned into 2-4 TB LUNs and configured as mount points in the operating system. This equates to 50 x 2 TB LUNs, 33 x 3 TB LUNs, or 25 x 4 TB LUNs. This LUN size is recommended to allow for ease of maintenance for the Disk Library. Additionally, a larger array of smaller LUNs reduces the impact of a failure of a given LUN.

Additional disk capacity should be added in 2-4 TB LUNs matching the original LUN configuration if possible. When GridStor is used, apply an equal amount of capacity across all MediaAgents. For example, three MediaAgents would require a total of 6 TB, 2 TB per Building Block. It is not recommended to use third party real-time disk de-fragmentation software on a Disk Library or DDB-LUN. This can cause locks on files that are being accessed by backup, restore, DASH copy and data aging operations. Third party software can be used to defragment a mount path after it has been taken offline. Anti-virus software should also be configured to NOT scan CommVault Disk Libraries and DDB-LUNs.

1.3. Disk Attachment Considerations

Mount paths can be of two types: NAS paths (Disk Library over shared storage) or direct attached block storage (Disk Library over direct attached storage). In direct attached block storage (SAN), the mount paths are locally attached to the MediaAgent. With NAS, the disk storage is on the network and the MediaAgent connects via a network protocol. The NAS mount path is the preferred method for a mount path configuration. This provides several benefits over the direct attached configuration. If a MediaAgent goes offline, the Disk Library is still accessible by other MediaAgents in the library. With direct attached storage, if a MediaAgent is lost then the Disk Library is offline. Secondly, all network communication to the mount path occurs from the MediaAgent to the NAS device.


During restores and DASH copies, there is no intermediate communication between MediaAgents. With direct attached storage, all communication must pass through the hosting MediaAgent in order to service the DASH copy or restore. Backup activities are not affected by the mount path choice.

In a direct attached design, configure the mount paths as mount points instead of drive letters. This allows larger capacity solutions to configure more mount paths than there are drive letters. Smaller capacity sites can use drive letters as long as they do not exceed the number of available drive letters. From an administration perspective it's better to stick with either drive letters or mount paths and not mix the two. There are no performance advantages to either configuration.

Each MediaAgent should have no more than 50 writers across all the mount paths. A MediaAgent with 10 x 2 TB mount paths (20 TB of raw capacity) would have 5 writers per mount path. The purpose behind this is to evenly distribute the load across all mount paths and to ensure the number of concurrent connections to the DDB remains under the 50 connection limit. In a 3 Building Block GridStor configuration the total number of writers should not exceed 150 writers, 50 writers per MediaAgent.

Configure the Disk Library to use Spill and fill mount paths as this allows for load balancing the writers evenly across all mount paths in the library. This setting is located in the Disk Library Properties > Mount Paths Tab. For further information please refer to Establish the Parameters for Mount Path Usage.

Regardless of the type of disk being used, SAN or NAS, the configuration is the same. The Disk Library consists of disk devices that point to the location of the Disk Library folders. Each disk device will have a read/write path and a read only path. The read/write path is for the MediaAgent controlling the mount path to perform backups. The read only path is for an alternate MediaAgent to be able to read the data from the host MediaAgent. This allows for restores or aux copy operations while the local MediaAgent is busy. For step by step instructions on configuring a shared Disk Library with alternate data paths please reference Configuring a Shared Disk Library with Alternate Data Paths.



2. Global Deduplication Storage Policy


Global Deduplication Policy introduces the concept of a common deduplication store that can be shared by multiple Storage Policy copies, Primary or DASH, to provide one large global deduplication store. Each Storage Policy copy defines its own retention rules. However, all participating Storage Policy copies share the same data paths, which consist of MediaAgents and Disk Library mount paths.

A Global Deduplication Storage Policy (GDSP) should be used instead of a standard deduplication storage policy whenever possible. A GDSP allows multiple standard deduplication policies to be associated to it, allowing for global deduplication across all associated clients. The requirements for a standard Deduplication Storage Policy to become associated to a GDSP are a common block size and Disk Library.

2.1. Block Size

All associated standard Deduplication Policies are configured with the same block size regardless of the copy being associated to the GDSP. For example, the primary copy has a standalone deduplication database and the DASH copy is associated to a GDSP. Both the Primary and DASH copy will require the same block size. This is because the block size is configured at the Storage Policy level and all copies will adhere to that value. Trying to associate a Storage Policy copy to a GDSP with a different block size will generate an error.


2.2. Disk Libraries

All associated storage policies in a GDSP will back up to the same Disk Library. If a different Disk Library is required then a different GDSP will be needed. All disk based library configurations are supported for a GDSP. There is no limit to the number of standard Deduplication Policies that can be associated to a GDSP. However, there are operational benefits to maintaining a simple design. Create standard Deduplication Policies based on client specific requirements and retention needs such as compression, signature generation, and encryption requirements. With a standard Deduplication Policy each specific backup requirement noted above would need a separate DDB.

2.3. Remote Offices

Remote offices with local restorability requirements typically have small data sets and low retention. Although a single standard Deduplication Policy will, in most cases, service the remote site's requirements for data availability, it is recommended to use a GDSP. Remote sites may need flexibility to handle special data such as legal information. In this case, a GDSP would allow this data to deduplicate with other data at the site.

2.4. Global Deduplication Storage Policy Caveats

There are three important considerations when using Global Deduplication Storage Policies:
- Client computers cannot be associated to a GDSP; only to standard storage policies.
- Once a storage policy copy has been associated to a GDSP there is no way to change that association.
- Multiple copies within a storage policy cannot use the same GDSP.


2.5. Streams

The stream configuration in a Storage Policy design is also important. When a Round-Robin design is configured, ensure the total number of streams across the storage policies associated to the GDSP does not exceed 50. This ensures that no more than 50 jobs will protect data at a given time, which would otherwise overload the DDB. For example, a GDSP may have four associated storage policies with 50 streams each for a total of 200 streams. If all policies were in concurrent use, the DDB would have 200 connections and performance would degrade. By limiting the number of writers to a total of 50, all 200 jobs may start; however, only 50 will run at any one time. As resources become available from jobs completing, the waiting jobs will resume.

2.6. Data Path Configuration

When using SAN storage for the mount path, use Alternate Data Paths -> When Resources are Offline -> Immediately. In a GridStor environment this will ensure the backups are configured to go through the designated Building Block. If a data path fails or is marked offline for maintenance, the job will fail over to the next data path configured in the Data Path tab. Although Round-Robin between data paths will work for SAN storage, it's not recommended because of the performance penalty during DASH copies and restores. This is because of the multiple hops that have to occur in order to restore or copy the data. When using Use Alternate Data Path with When Resources are Offline, the number of streams per client storage policy should not exceed 50.


When using NAS storage for the mount path, Round-Robin between Data Paths is recommended. This is configured in the Storage Policy copy properties -> Data Path Configuration tab of the storage policy associated to the GDSP, not in the GDSP properties. NAS mount paths do not have the same performance penalty because the network communication is between the servicing MediaAgent and the NAS mount path directly.

2.7. Use Store Priming Option with Source-Side Deduplication

The store priming feature queries a previously sealed DDB for hash lookups before requesting a client to send the data. The purpose of this feature is to leverage existing protected data in the Disk Library before sending new data over the network. The feature is designed for slow network based backups only. This would include Client-Side Deduplication and DASH copies. The feature is not recommended for LAN based backups or network links faster than 1 Gbps. Lab testing has shown that using this feature on the LAN can actually hinder backup performance. This is because it is faster to request the data from the client than to perform the queries on the previously sealed DDB. This feature does not eliminate the need to re-baseline after a deduplication database is sealed. It only eliminates the need for the client to send the data over the network to the MediaAgent. This feature requires Source-Side Deduplication to be enabled.



3. Deduplication Database Availability


The DDB recovery point is a copy of the active DDB. This copy is used to rebuild the DDB in the event of failure. When the recovery point process is initiated, all communication to the active DDB is paused. The information in memory is committed to disk to ensure the DDB is in a quiesced state. The DDB is then copied from the active location to the backup location. After a DDB has been backed up successfully, the previous recovery point is deleted. All communication to the DDB is then resumed. Throughout this time, the Job Controller will show the jobs in a running state.

By default, the DDB recovery point is placed in a folder called BACKUP in the DDB location. Since this is a copy of the active DDB, the LUN hosting the DDB will need THREE times the amount of disk space as the active DDB. This allows for the active DDB, the DDB recovery point, and an equal amount of working space. The DDB recovery point can be moved to an alternate location if more space is required. If this process is going to be used, then the DDB LUN requires enough disk space for the active DDB plus growth. The DDB recovery point location will require two times the size of the active DDB. This allows for the recovery point and the working space for the DDB recovery point process. The best practice is to use the Disk Library for the recovery point destination.

The default interval for recovery point creation is 8 hours. The registry key that controls this is the Create Recovery Points Every registry key. Once the time interval has been reached, the next backup will create the recovery point. It is not recommended to lower the Create Recovery Points Every setting below 4 hours. Doing so can have a negative impact on backup performance, for two reasons. First, the recovery point flushes the DDB that is residing in memory to disk. When the jobs resume, the DDB has to be loaded back into memory. This process can be time consuming. Secondly, all backup activity pauses while the active DDB is copied to the recovery point.
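The space rules above reduce to simple multipliers. A minimal sketch, with illustrative sizes:

```python
# Sketch: disk space needed for the DDB under the two layouts described above.
def ddb_space(active_ddb_gb: float, recovery_point_local: bool = True) -> dict:
    if recovery_point_local:
        # active DDB + recovery point copy + equal working space on the DDB LUN
        return {"ddb_lun_gb": 3 * active_ddb_gb}
    # recovery point moved elsewhere: the DDB LUN holds only the active DDB
    # (plus growth); the alternate location needs the copy plus working space
    return {"ddb_lun_gb": active_ddb_gb,
            "recovery_location_gb": 2 * active_ddb_gb}

print(ddb_space(500))                               # {'ddb_lun_gb': 1500}
print(ddb_space(500, recovery_point_local=False))   # split across two locations
```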


3.1. Considerations

Changing the DDB recovery point interval requires the DDB engine to be restarted. This can be done by restarting the Media Management services from the CommVault Services Control Panel. To view the running time interval, locate the following entry in the SIDBEngine.log file. The value in brackets represents the interval in seconds. The valid range for the DDB recovery point interval is 0-99 hours.

    ### Backup interval set to [28800]

When moving the DDB recovery point to a network share, take the network speed into consideration when choosing the destination. The best practice is to use the fastest network connection available. During the DDB recovery point operation, if the copy process of the DDB to the backup folder takes longer than 20 minutes, the running jobs will move into a pending state. This is because clients, by default, wait a maximum of 20 minutes when there is no response from the DDB. While the default value can be changed, the best practice is to ensure the DDB recovery point process completes within 20 minutes. In order to extend the wait time, three possible registry keys may need to be applied. The examples that follow are all set for one hour. If the timeout value is set to accommodate the backup time for the DDB, then the backup will wait until the SIDB starts allowing threads to continue and will not go pending or show any errors.

MediaAgent, when Source-Side Deduplication is not being used:
Location: MediaAgent
Key: SIDBReplyTimeoutInS
Type: DWORD
Value: 3600

Client, for Source-Side Deduplication:
Location: iDataAgent
Key: SignatureWaitTimeSeconds
Type: DWORD
Value: 360


MediaAgent, for DASH copy (which uses the same code as Source-Side Deduplication):
Location: MediaAgent
Key: SignatureWaitTimeSeconds
Type: DWORD
Value: 360
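The three keys differ only in which hive they are created under (MediaAgent versus the client's iDataAgent) and which data path they cover. A minimal sketch of applying them with winreg, using the values from the tables above; the instance hive path is an assumption, and each entry must be created on the appropriate machine:

```python
import winreg

# Assumed CommVault instance hive; confirm the instance name on each machine.
BASE = r"SOFTWARE\CommVault Systems\Galaxy\Instance001"

# (subkey, value name, seconds) from the tables above. The MediaAgent entries
# belong on the MediaAgent; the iDataAgent entry belongs on the client that
# performs Source-Side Deduplication, so run this on the appropriate machine.
TIMEOUT_KEYS = [
    ("MediaAgent", "SIDBReplyTimeoutInS", 3600),
    ("iDataAgent", "SignatureWaitTimeSeconds", 360),
    ("MediaAgent", "SignatureWaitTimeSeconds", 360),
]

for subkey, name, seconds in TIMEOUT_KEYS:
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, rf"{BASE}\{subkey}", 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, seconds)
```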

When using a Disk Library as a recovery point destination, ensure that the mount path reserve space is set appropriately to accommodate the DDB recovery point. Without this, the mount path could run out of disk space and fail all DDB recovery point operations until free space is available.

To move the DDB recovery point to a network path, the following registry values must be created. This change requires a support case to be opened, as the SIDBBackupPathPassword string must be encrypted via a proprietary encryption tool that is not publicly available.

SIDBBackupPath
Location: MediaAgent
Type: String
Value: Local or network path

SIDBBackupPathUser
Location: MediaAgent
Type: String
Value: Domain/User. Only required for a network share.

SIDBBackupPathPassword
Location: MediaAgent
Type: String
Value: Encrypted by a CommVault tool. Only required for a network share.



4. Building Block Design


Designing the operational architecture involves several important considerations. These considerations include backup windows, data sets, throughput and retention.

4.1. Choosing the Right Building Block

- Backup Window: the total amount of time allotted to protect the data set
- Data Set: the amount of data to protect in the backup window
- Throughput: the required throughput to protect the data set within the backup window
- Retention: how long the data is to be kept before aging off the system

To determine the correct Building Block configuration, the Deduplication Calculator can be populated with the appropriate data. The summary page of the Deduplication Calculator provides the Backup Window, Total Amount of data to protect in a full cycle and the number of DDBs to protect the data for the required retention.

To determine the required throughput, divide the Production Site Size by the Backup Window. The result is the required throughput needed by the Building Blocks to protect the data within the backup window. Take the required throughput and divide this by 2 to generate the number of Building Blocks required to protect the data set.
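This sizing rule is easy to script. A minimal sketch, using the conservative 2 TB/hr per Building Block planning figure from section 1.2.1 and rounding up:

```python
import math

def building_blocks_needed(data_set_tb: float, backup_window_hr: float,
                           bb_rate_tb_per_hr: float = 2.0) -> int:
    """Production data / backup window = required throughput; divide by the
    per-Building-Block rate and round up."""
    required_tb_per_hr = data_set_tb / backup_window_hr
    return math.ceil(required_tb_per_hr / bb_rate_tb_per_hr)

print(building_blocks_needed(16, 8))    # Example 1 -> 1
print(building_blocks_needed(48, 8))    # Example 2 -> 3
print(building_blocks_needed(120, 8))   # Example 3, Part 1 -> 8
```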


When using Building Blocks, different block sizes for different storage policies will require another deduplication database. Each deduplication database will have a specific hardware requirement as outlined in this document.

4.2. Building Block Configuration Examples

In this section we will cover several configuration examples. These examples include a 1 Building Block configuration, a 3 Building Block configuration and a staggered full backup configuration.

Example 1: 1 Building Block


Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set:           16 TB
Backup Window:      8 Hours
Retention:          4 Weeks
Daily Change Rate:  2% or 320 GB

Only one Building Block is required to protect the amount of data specified during the backup window. The daily change rate is 320 GB, which can be protected by a single Building Block.
Figure: Clients -> one MediaAgent hosting the DDB (50 writers) -> Disk Library


Example 2: 3 Building Blocks


Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set:           48 TB
Backup Window:      8 Hours
Retention:          4 Weeks
Daily Change Rate:  2% or 950 GB

This site would require three Building Blocks in order to protect the data within the backup window. The incremental change rate is 950 GB and can be handled by the Building Blocks. This will allow a total of 150 concurrent streams and an overall deduplication capacity, between the 3 nodes, of 180-270 TB of unique data across all the DDBs. The Deduplication Calculator estimates the deduplication store to be at 42 TB. Per the Deduplication Calculator the total required Disk Library space is 52 TB. Using 2 TB LUNs would yield 26 mount paths (52 TB / 2 TB LUN = 26). Rounding the number of mount paths up to 27 results in each node hosting 9 mount paths (27 mount paths / 3 BBs = 9). Increasing the number of mount paths to 27 would also increase the disk space to 54 TB.
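A minimal sketch of the mount path arithmetic in this example, rounding the LUN count up to the nearest multiple of the node count so that each MediaAgent hosts the same number of paths:

```python
import math

def mount_path_layout(library_tb: float, lun_tb: float, nodes: int) -> dict:
    luns = math.ceil(library_tb / lun_tb)          # 52 / 2 -> 26
    luns_even = math.ceil(luns / nodes) * nodes    # round up to 27
    return {"luns": luns_even,
            "luns_per_node": luns_even // nodes,   # 9 per Building Block
            "capacity_tb": luns_even * lun_tb}     # 54 TB total

print(mount_path_layout(52, 2, 3))
# {'luns': 27, 'luns_per_node': 9, 'capacity_tb': 54}
```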

Figure: Clients -> three MediaAgents, each hosting its own DDB (50 writers each) -> Disk Library mount paths


4.3. Staggering Full Backups

Staggering full backups can have a major impact on the overall architecture and design. The next example will show the architectural impact of staggered full backups in a large environment.

Example 3: Part 1: Traditional Backups


Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set:           120 TB
Backup Window:      8 Hours
Retention:          4 Weeks
Daily Change Rate:  2% or 2.4 TB
Number of DDBs:     2

This site would require eight Building Blocks in order to protect the data within the backup window. The incremental change rate is 2.4 TB and can be handled by the Building Blocks. To protect the data within the backup window, 8 DDBs will be required. This will allow a total of 400 concurrent operational streams and an overall deduplication capacity between the eight nodes of 660-720 TB of unique data across all the DDBs. The Deduplication Calculator estimates the deduplication store to be at 106 TB. Per the Deduplication Calculator, the total required Disk Library space is 130 TB. Using 2 TB LUNs would yield 65 mount paths (130 TB/2 TB LUN = 65). For evenly distributed mount paths the number would have to decrease to 64 or increase to 72. Decreasing the mount paths to 64 would reduce the overall capacity to 128 TB. Increasing the mount paths to 72 would increase the capacity to 144 TB. In this case, keep the mount paths at 65. Configure 8 mount paths for 7 MediaAgents and 9 mount paths for the 8th.


Example 3: Part 2: Staggered Full Backups


Full backups performed six days a week. Information obtained from the Deduplication Calculator:

Data Set:           120 TB
Backup Window:      8 Hours
Full Backups:       Monday - Saturday
Retention:          4 Weeks
Daily Change Rate:  2% or 2.4 TB
Number of DDBs:     2

In this scenario, the site has a total data set of 120 TB. The full backups will occur Friday through Wednesday, leaving Thursday available for data aging operations. To figure out the daily data to protect, the following formula is used:

    120 TB / 6 full backup days = 20 TB/day

Next, determine the number of Building Blocks required for that data rate:

    20 TB/day / 8 hour backup window = 2.5 TB/hr
    2.5 TB/hr / 2 TB/hr per Building Block = 1.25, rounded up to 2 Building Blocks


Staggering the full backups would only require this site to use two Building Blocks in order to protect the data footprint from Part 1 within the backup window. The Deduplication Calculator only calls for 2 DDBs for the amount of data being protected and the retention. This will allow a total of 100 concurrent streams and an overall deduplication capacity, between the two nodes, of 120-180 TB of unique data across the DDBs. The Deduplication Calculator estimates the deduplication store to be at 106 TB. The total required Disk Library space is 130 TB. This is the same deduplication footprint as in Part 1. Using 2 TB LUNs would yield 65 mount paths (130 TB / 2 TB LUN = 65). For evenly distributed mount paths the number would have to increase to 66. This also increases the total capacity to 132 TB. Each Building Block would have 33 mount paths. Staggering the backups across the week significantly reduces the overall infrastructure required to protect the data set.
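A minimal sketch comparing the two layouts, reusing the sizing rule from section 4.1 on the same 120 TB data set:

```python
import math

def building_blocks(data_per_day_tb, window_hr=8, bb_rate_tb_per_hr=2.0):
    # data per day / window = required throughput; divide by per-BB rate, round up
    return math.ceil(data_per_day_tb / window_hr / bb_rate_tb_per_hr)

DATA_SET_TB = 120
print(building_blocks(DATA_SET_TB))       # traditional: all fulls in one day -> 8
print(building_blocks(DATA_SET_TB / 6))   # staggered over 6 full backup days -> 2
```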

Figure: Clients -> two MediaAgents, each hosting its own DDB (50 writers each) -> Disk Library mount paths



5. Conclusion

The Building Block data management solution is flexible, scalable and hardware agnostic. The Building Blocks are flexible because they can grow by adding mount paths and they can accommodate different retentions and different data types all within the same deduplication framework. The Building Blocks are scalable because they can grow to hundreds of TB of unique data. Through staggering full backups, the Building Blocks can protect large amounts of data with minimal infrastructure, which holds down cost and liability. The Building Blocks are hardware agnostic by requiring hardware classes instead of specific models. As detailed in the preceding sections, there are six different examples of adequate servers across three major manufacturers.
