Implementation and Maintenance
Student Guide
Copyright
Information in this document, including URL and other website references, represents the current view
of CommVault Systems, Inc. as of the date of publication and is subject to change without notice to you.
Descriptions or references to third party products, services or websites are provided only as a
convenience to you and should not be considered an endorsement by CommVault. CommVault makes
no representations or warranties, express or implied, as to any third party products, services or
websites.
The names of actual companies and products mentioned herein may be the trademarks of their
respective owners. Unless otherwise noted, the example companies, organizations, products, domain
names, e-mail addresses, logos, people, places, and events depicted herein are fictitious.
Complying with all applicable copyright laws is the responsibility of the user. This document is intended
for distribution to and use only by CommVault customers. Use or distribution of this document by any
other persons is prohibited without the express written permission of CommVault. Without limiting the
rights under copyright, no part of this document may be reproduced, stored in or introduced into a
retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying,
recording, or otherwise), or for any purpose, without the express written permission of CommVault
Systems, Inc.
CommVault may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any written
license agreement from CommVault, this document does not give you any license to CommVault’s
intellectual property.
CommVault, CommVault and logo, the “CV” logo, CommVault Systems, Solving Forward, SIM, Singular
Information Management, Simpana, CommVault Galaxy, Unified Data Management, QiNetix, Quick
Recovery, QR, CommNet, GridStor, Vault Tracker, InnerVault, QuickSnap, QSnap, Recovery Director,
CommServe, CommCell, IntelliSnap, ROMS, Simpana OnePass, CommVault Edge and CommValue, are
trademarks or registered trademarks of CommVault Systems, Inc. All other third party brands, products,
service names, trademarks, or registered service marks are the property of and used to identify the
products or services of their respective owners. All specifications are subject to change without notice.
All right, title and intellectual property rights in and to the Manual is owned by CommVault. No rights
are granted to you other than a license to use the Manual for your personal use and information. You
may not make a copy or derivative work of this Manual. You may not sell, resell, sublicense, rent, loan or
lease the Manual to another party, transfer or assign your rights to use the Manual or otherwise exploit
or use the Manual for any purpose other than for your personal use and reference. The Manual is
provided "AS IS" without a warranty of any kind and the information provided herein is subject to
change without notice.
Introduction .............................................................................................................................7
Preliminaries .................................................................................................................................... 8
Education Advantage ....................................................................................................................... 9
Customer Education Lifecycle ......................................................................................................... 10
CommVault Certification ................................................................................................................ 11
CommVault Advantage................................................................................................................... 12
Course Modules ............................................................................................................................... 13
Course Objective ............................................................................................................................ 14
Common Technology Engine ........................................................................................................... 15
Training Environment ..................................................................................................................... 17
Module 1 – Planning a CommCell Architecture ....................................................................... 19
Topics ............................................................................................................................................ 20
Common Technology Engine Architecture ....................................................................................... 21
CommCell Architecture Overview ................................................................................................... 22
CommServe Server ......................................................................................................................... 23
Indexing Structure .......................................................................................................................... 25
Common Technology Engine Best Practices ..................................................................................... 27
Architecting a Storage Solution ....................................................................................................... 29
Simpana Deduplication................................................................................................................... 32
Understanding Simpana Deduplication ........................................................................................... 33
Deduplication Building Block Guidelines ......................................................................................... 36
Deduplication Storage Options ....................................................................................................... 38
Partitioned Deduplication Database ............................................................................................... 40
Enterprise Building Block Guidelines ............................................................................................... 41
SILO Storage................................................................................................................................... 42
Advanced Deduplication Configurations ......................................................................................... 45
Deduplication Best Practices........................................................................................................... 47
Designing a Sound Data Protection Strategy.................................................................................... 50
Introduction
Preliminaries
• Who am I?
• Who are you?
• Why are we here?
• How will this course be conducted?
Preliminaries
The value of this course comes from three distinct areas: first, the content of the material,
which guides your exploration and understanding of the product; second, the skill of the
instructor in expanding on areas of interest and adding value from their experience with the
product; and lastly, you, the student, whose questions and experiences help not only
yourself but others in understanding how Simpana® software can help you with your data
management requirements.
Education Advantage
Education Advantage
The CommVault Education Advantage product training portal contains a set of powerful tools to
enable CommVault customers and partners to better educate themselves on the use of the
CommVault software suite. The portal includes:
Customer Education Lifecycle
Before customers install CommVault® Simpana® software, they should have a basic
understanding of the product. This learning timeline illustrates the role of product education
over the early years of owning CommVault Simpana software, a lifecycle ranging from the
pre-installation review of the "Introduction to Simpana Software" eLearning module to the
pursuit of Masters Program certifications.
CommVault® Certification
CommVault Certification
CommVault® Advantage
CommVault Advantage
CommVault® Advantage is your profile as a CommVault consumer and expert. The CommVault
Advantage system captures your certifications, participation in learning events and courses,
your Forum participation, Support interaction and much more. Profile Points awarded through
your CommVault interactions are collected and compared with those of other CommVault
consumers worldwide. These Profile Points allow our users to demonstrate their
Simpana® software expertise for personal and professional growth. Log in to CommVault
Advantage to check your progress and compare yourself to the Global CommVault community
or create an account today.
Course Modules
Course Objective
Slide diagram: the Physical View (Client, MediaAgent, Libraries, and the communication paths for data protection / recovery) and the Logical View (Agent, default data set / subclient, storage policy and copy) of the CommCell environment.
The CommCell® environment is the logical management boundary for all components that
protect, move, store and manage the movement of data and information. All activity within the
CommCell environment is centrally managed through the CommServe® server. Users log on to
the CommCell® Console Graphical User Interface (GUI) which is used to manage and monitor
the environment. Agents are deployed to clients to protect production data by communicating
with the file system or application requiring protection. The data is processed by the agents and
protected through MediaAgents to disk, tape or cloud storage. Clients, MediaAgents and
libraries can be in local or remote locations. All local and remote resources can be centrally
configured and managed through the CommCell console. This allows both centralized and
decentralized organizations to manage all data movement activities through a single interface.
All production data protected by agents, and all MediaAgents and libraries that are controlled by
a CommServe server, are collectively referred to as the CommCell environment.
Physical Architecture
A physical CommCell® environment is made up of one CommServe® server, one or more
MediaAgents and one or more Clients. The CommServe server is the central component of a
CommCell environment. It hosts the CommServe database which contains all metadata for the
CommCell environment. All operations are executed and managed through the CommServe.
MediaAgents are the workhorses which move data from source to destination. Sources can be
production data or protected data and destinations can be disk, cloud or removable media
libraries. Clients are production systems requiring protection and will have one or more Agents
installed directly on them or on a proxy server to protect the production data.
Logical Architecture
CommVault’s logical architecture is defined in two main areas. The first area depicts the logical
management of production data, which is designed in a hierarchical tree structure. Production
data is managed using Agents. These agents interface natively with the file system or application
and can be configured based on the specific functionality of the data being protected. Data within these
agents is grouped into a data set (backup set, replication set, or archive set). These data sets
represent all data the Agent is designed to protect. Within the data set, one or more subclients
can be used to map to specific data. The flexibility of subclients is that data can be grouped into
logical containers which can then be managed independently in the CommVault protected
environment.
The second area depicts managing data in CommVault protected storage. This is facilitated
through the use of storage policies. Storage policies are policy containers which contain one or
more rule sets for managing one or more copies of protected data. The first rule set is the
primary copy. This copy manages data being protected from the production environment.
Additional secondary copies can be created with their own rule sets. These rule sets will
manage additional copies of data which will be generated from existing copies within the
CommVault protected environment. The rule sets define what data will be protected
(subclients), where it will reside (data path), how long it will be kept (retention), encryption
options, and media management options.
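To make this hierarchy concrete, the following sketch models these relationships in Python. It is purely illustrative; the class and field names are assumptions used for teaching purposes and do not represent Simpana software interfaces.

from dataclasses import dataclass, field
from typing import List

# Illustrative model of the CommVault logical architecture described above.
# Class names are assumptions for explanation, not Simpana APIs.

@dataclass
class Subclient:
    name: str
    content_paths: List[str]          # the specific production data this subclient maps to

@dataclass
class BackupSet:                      # a data set: backup set, replication set or archive set
    name: str
    subclients: List[Subclient] = field(default_factory=list)

@dataclass
class Agent:                          # interfaces natively with a file system or application
    agent_type: str                   # e.g. "Windows File System"
    backup_sets: List[BackupSet] = field(default_factory=list)

@dataclass
class StoragePolicyCopy:              # one rule set for one copy of protected data
    name: str                         # "Primary" or a secondary copy name
    data_path: str                    # where the copy resides (MediaAgent and library)
    retention_days: int               # how long the copy is kept

@dataclass
class StoragePolicy:                  # container of one or more rule sets (copies)
    name: str
    copies: List[StoragePolicyCopy] = field(default_factory=list)

# Example: one agent with a default backup set and two subclients, managed by a
# storage policy with a primary copy and one secondary copy.
fs_agent = Agent("Windows File System", [
    BackupSet("defaultBackupSet", [
        Subclient("default", ["C:\\"]),
        Subclient("UserData", ["D:\\Shares"]),
    ])
])
policy = StoragePolicy("FS_Policy", [
    StoragePolicyCopy("Primary", "MediaAgent1 to DiskLibrary1", retention_days=30),
    StoragePolicyCopy("Offsite", "MediaAgent2 to TapeLibrary1", retention_days=90),
])
print(policy.copies[0].name, "retains data for", policy.copies[0].retention_days, "days")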
Training Environment
Training Environment
The CommVault Virtual Training environment, when available, can be used by students to
perform course activities or explore the product’s user interface. The training environment is
NOT fully resourced, nor are all components installed or available. All course activities are
supported, but due to host memory (RAM and disk space) constraints, only a limited number of
Virtual Machines can be operational at the same time, and few tasks beyond the activities listed
in the course manual can be performed. Please discuss with your instructor what other
activities or tasks you can perform.
Module 1
Planning a CommCell® Architecture
Topics
• Common Technology Engine Architecture
• CommCell® Architecture Overview
• CommServe® Server
• Indexing Structure
• Common Technology Engine Best Practices
• Architecting a Storage Solution
• Simpana® Deduplication
• Understanding Simpana Deduplication
• Deduplication Building Block Guidelines
• SILO Storage
• Deduplication Best Practices
• Data Protection Best Practices
• Planning a Sound Data Protection Strategy
• Disaster Recovery Concepts
• Business Continuity Concepts
• Protection Methods
• Data Description
• Data Availability
• Protected Storage Requirements
• Designing a Sound Data Protection Strategy
• Understanding Client Agents
• Protecting Virtual Environments
• VSA Backup Process
• Protecting Applications
• Snapshot Management
Topics
• CommServe® Server
• MediaAgent
• Indexing
• Library
• Client
Slide diagram: data flow between the Client and MediaAgent, showing compression / deduplication / encryption, data protection operations (snap, backup, archive, revert), and index cache operations (create / update, browse / retrieve, archive / prune).
The heart of any Simpana deployment is the CommServe® server. All activity is managed from
this central point and all backup and restore activity must be initiated from the CommServe
server. A Microsoft SQL metadata database is used to store all CommServe configuration and
job history data.
Data movement is conducted from source to destination using MediaAgents. One or more
MediaAgents can be used to move data providing greater flexibility and scalability.
Production data is managed by installing iDataAgents on physical hosts, virtual hosts or on proxy
hosts. The iDataAgent communicates with the file system or application being protected and
uses native APIs and / or scripting to conduct data protection operations. Physical and virtual
hosts with iDataAgents installed are referred to as clients.
Libraries are used to store protected data. CommVault software supports a wide range of library
configurations.
The CommServe server, MediaAgents, libraries and clients that communicate with one another
make up the CommCell architecture.
CommServe Server
Within a CommCell environment there can only be one active CommServe server. For high
availability and failover there are several methods that can be implemented. The following
information explains each of these methods.
Virtualization
Some customers with virtual environments are choosing to virtualize the production
CommServe server. A virtualized CommServe server has the advantage of using the hypervisor's
high availability functionality (when multiple hypervisors are configured in a cluster) and
reduces costs, since separate CommServe hardware is not required. Although this method could
be beneficial, it should be properly planned and implemented. If the virtual environment is not
properly scaled the CommServe server could become a bottleneck when conducting data
protection jobs. In larger environments where jobs run throughout the business day,
CommServe server activity could have a negative performance impact on production servers.
When virtualizing the CommServe server it is still critical to run the CommServe DR backup. In
the event of a disaster the CommServe server may still have to be reconstructed on a physical
server. Do not rely on the availability of a virtual environment in the case of a disaster. Follow
normal CommVault best practices in protecting the CommServe metadata.
Clustering
The CommServe server can be deployed in a clustered configuration. This will provide high
availability for environments where CommCell operations run 24/7. A clustered CommServe
server is not a DR solution and a standby CommServe server must be planned for at a DR site.
Clustering the CommServe server is a good solution in large environments where performance
and availability are critical.
Another benefit of using a clustered CommServe server applies when using Simpana data archiving.
Archiving operations can be configured to create stub files which allow end users to initiate
recall operations. For the end user recall to complete successfully the CommServe server must
be available.
CommServe DR IP Address
A CommCell license is bound to the IP address of the CommServe server. In situations where a
standby CommServe server has a different IP address, that address must be included in the
CommCell license information.
Indexing Structure
Slide diagram: each MediaAgent's index cache, with index files copied to media (disk and tape libraries) and index file logs shipped to a shared Index Cache Server (ICS).
Indexing Structure
Simpana software uses a distributed indexing structure that provides for enterprise level
scalability and automated index management. This works by using the CommServe database to
only retain job based metadata which will keep the database relatively small. Job and detailed
index information will be kept on the MediaAgent protecting the job, automatically copied to
media containing the job and optionally copied to an Index Cache Server.
Job summary data maintained in the CommServe database will keep track of all data chunks
being written to media. As each chunk completes, it is logged in the CommServe database. This
information also records the identities of the media the job was written to, which can be used
when recalling off-site media for restores. This data will be held in the database for as long
as the job exists. This means even if the data has exceeded defined retention rules, the
summary information will still remain in the database until the job has been overwritten. An
option to browse aged data can be used to browse and recover data on media that has
exceeded retention but has not been overwritten.
The detailed index information for jobs is maintained in the MediaAgent’s Index Cache. This
information will contain each object protected, what chunk the data is in, and the chunk offset
defining the exact location of the data within the chunk. The index files are stored in the index
cache and after the data is protected to media, an archive index operation is conducted to write
the index to the media. This method automatically protects the index information eliminating
the need to perform separate index backup operations. The archived index can also be used if
the index cache is not available, when restoring the data at alternate locations, or if the indexes
have been pruned from the index cache location.
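As a rough illustration of this two-tier design, the sketch below models job summary data and a detailed index in Python. The structures and names are assumptions used for explanation only and do not reflect the actual CommServe database schema or index file format.

# Simplified two-tier indexing model (illustrative only; not the real index format).

# Job summary data kept in the CommServe database: which chunks a job wrote
# and which media those chunks landed on.
job_summary = {
    "job_1001": [
        {"chunk": "CHUNK_0001", "media": "Tape_A01"},
        {"chunk": "CHUNK_0002", "media": "Tape_A01"},
    ]
}

# Detailed index kept in the MediaAgent index cache (and archived to media):
# each protected object, the chunk it is in, and the offset inside that chunk.
index_cache = {
    "job_1001": {
        r"C:\Users\docs\budget.xlsx": ("CHUNK_0001", 4_194_304),
        r"C:\Users\docs\report.docx": ("CHUNK_0002", 131_072),
    }
}

def locate_object(job_id, path):
    """Return (media, chunk, offset) needed to restore a single object."""
    detail = index_cache.get(job_id)
    if detail is None:
        # Index pruned or cache lost: the archived copy of the index on media
        # would be restored first, then the lookup repeated.
        raise LookupError("Index not in cache; restore archived index from media first")
    chunk, offset = detail[path]
    media = next(c["media"] for c in job_summary[job_id] if c["chunk"] == chunk)
    return media, chunk, offset

print(locate_object("job_1001", r"C:\Users\docs\budget.xlsx"))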
Transaction logging
The index copy on the Index Cache server is created by either copying the original index during
the Archive Index phase of the data protection job, or dynamically through transactional log
replay. Transactional logs are sent at the completion of each storage chunk. In the event the
local cache is lost while indexing a job, the job can be restarted at the last transaction
successfully entered on the Index Cache Server.
Ensure that you have enough space to accommodate the index cache from all participating
MediaAgents.
Note: When using a network share, the local index and the shared index are one and the same.
A network disruption might corrupt the index and jobs might have to be restarted due to index
cache failure.
• CommServe® Server
• Index Cache Settings
CommServe Server
The CommServe metadata database is the most critical component within the CommCell
infrastructure. If the database becomes corrupted, the CommServe server disk crashes, or you are
faced with a full site disaster, having the metadata backup readily accessible is critical.
All object level data protection jobs will use indexes for all operations. These indexes are
maintained in the index cache. Improper configuration of the index cache can result in job
failures and long delays in browse and recovery operations.
Consider the following when designing and configuring the index cache:
• Do NOT put the index cache on the system drive. Use a dedicated drive (recommended) or a
dedicated partition (for smaller environments). During MediaAgent installation the default
path for the index cache is the system drive. The location of the cache can be changed by
right-clicking the MediaAgent, selecting Properties, and then the Catalog tab.
• Size the index cache appropriately based on the size of your environment and the estimated
number of objects that will be protected. It is much better to overestimate than
underestimate index cache size. Sizing guidelines are available on CommVault’s
documentation site.
• If you will be running many concurrent jobs protecting millions of objects locate the index
cache on high speed dedicated disks. Backup performance can suffer during index update
operations.
• The default retention time for the index cache is 15 days. If you will be frequently browsing
for data older than 15 days increase this setting and allocate enough disk space for the index
cache.
• Index files are automatically backed up to media after each data protection job so there is no
need to perform backups of the index cache location. If you are concerned about having fast
access to indexes in the event that the cache is lost consider using an Index Cache Server.
Architecting a Storage Solution
All data storage devices associated with or configured in the Simpana® software are referred to as
Libraries. All data moving to and from a library must pass through a MediaAgent (or a NAS
Filer). Libraries can be shared between MediaAgents hosted on dissimilar operating systems. Data written by the
Simpana software is OS independent. This means data written by a UNIX MediaAgent can be
restored via a Windows MediaAgent and vice-versa.
The most common supported library types are listed below. For a list of specific vendor devices
consult the Hardware Compatibility List on CommVault's Maintenance Advantage website. For
a list of all supported library types consult CommVault's Books Online website.
Disk library - A disk library is a virtual library associated with disk media configured for read/write
access as one or more mount paths. The disk library is a software entity and does not represent
a specific hardware entity. The storage capacity of a disk library is determined by the total
storage space in its mount paths.
Tape library - Tape libraries are made up of one or more tape devices with a library controller
and internal media storage. A Tape library can have mixed media and shared access with one or
more MediaAgents (or NAS Filers) in the same CommCell® group. Note that tape libraries can
be configured to use WORM media.
Blind Library - A blind library is a tape library without a barcode reader, and is the opposite of a
sighted library which has a barcode reader. A blind library must have all its drives (and media) of
the same type. Once configured, a blind library cannot be configured as a sighted library.
Stand-alone Tape Library - A single tape device with no library controller or internal storage that
is accessible from a MediaAgent. Stand-alone Tape drives can be pooled together for a multi-
stream job or single stream failover configuration.
NAS NDMP Library - A tape library attached to a NAS Filer for NDMP data storage. The library
control and drives in a NAS NDMP library can be dynamically shared between multiple devices
(NAS file servers and MediaAgents) if these devices are connected to the library in a SAN
environment. The device initially having library control (media changer) would be the first
configured device.
Virtual Tape Library - A software representation of a tape library using disk storage. Virtual tape
libraries are supported, but not recommended because a normal disk library provides many
more features and capabilities.
Plug & Play Library - Plug and Play (PnP) storage devices (e.g., FireWire, USB, SATA storage
devices, etc.) can be used for storage instead of tapes. Once configured, PnP disks are treated
like tapes in a Stand-Alone drive. PnP libraries are useful in locations where it is hard to
configure and manage tapes due to operational issues. Only one PnP library can be configured
per MediaAgent. Although multiple drives can be configured, only single-streamed jobs are
supported. (Multiple drives provide the ability to span across multiple media for a single-
streamed job.)
Cloud Library - A Cloud library uses online storage devices — cloud storage devices — as storage
targets. Cloud libraries provide a pay-as-you-go capability for network storage. Data is
transferred through secured channels using HTTPS protocol.
Storage Connections
Direct Attached Storage (DAS)
Direct Attached Storage (DAS) means the production storage location is directly attached (not
SAN) to the production server. In situations where many production servers use DAS, there is no
single point of failure. The primary disadvantages are higher administrative overhead and,
depending on budget limitations, lower quality storage being used instead of the high quality
enterprise class disks typically found in SAN/NAS storage.
For some applications such as Exchange 2010 using DAG (Database Availability Groups), Direct
Attached Storage may be a valid solution. The main point is that although the storage trend over
the past several years has been toward storage consolidation, DAS storage should still be considered
for certain production applications.
One key disadvantage regarding DAS protection is that backup operations will likely require data
to be moved over a network. This problem can be reduced by using dedicated backup networks.
Another disadvantage is that DAS is not as efficient as SAN or NAS when moving large amounts
of data.
One key disadvantage of NAS is that it typically requires network protocols when performing
data protection operations. This disadvantage can be greatly reduced through the use of
snapshots and proxy based backup operations.
SIMPANA® DEDUPLICATION
Simpana Deduplication
Understanding Simpana Deduplication
Storage Policy
All deduplication activity is centrally managed through a storage policy. Configuration settings
are defined in the policy, the location of the deduplication database is set through the policy,
and the disk library which will be used is also defined in the policy.
During data protection, data is divided into blocks and a signature hash is generated for each
block, which uniquely represents the data within the block. This hash will then be used to
determine if the block already exists in storage.
The block size that will be used is determined in the Storage Policy Properties, Advanced
tab. CommVault® recommends using the default value of 128 KB, but the value ranges from 32 KB to
512 KB. Higher block sizes are recommended for large databases.
Deduplication can be configured for Storage Side Deduplication or Client (source) Side
Deduplication. Depending on how deduplication is configured, the process will work as follows:
Storage Side Deduplication. Once the signature hash is generated on the block, the block and
the hash are both sent to the MediaAgent. The MediaAgent with a local or remotely hosted
deduplication database (DDB) will compare the hash within the database. If the hash does not
exist that means the block is unique. The block will be written to disk storage and the hash will
be logged in the database. If the hash already exists in the database that means the block
already exists on disk. The block and hash will be discarded but the metadata of the data being
protected will be written to the disk library.
Client Side Deduplication. Once the signature is generated on the block, only the hash will be
sent to the MediaAgent. The MediaAgent with a local or remotely hosted deduplication
database will compare the hash within the database. If the hash does not exist that means the
block is unique. The MediaAgent will request the block to be sent from the Client to the
MediaAgent which will then write the data to disk. If the hash already exists in the database
that means the block already exists on disk. The MediaAgent will inform the Client to discard
the block and only metadata will be written to the disk library.
Client Side Disk Cache. An optional configuration for low bandwidth environments is the
client side disk cache. This will maintain a local cache for deduplicated data. Each
subclient will maintain its own cache. The signature is first compared in the local cache.
If the hash exists the block is discarded. If the hash does not exist in the local cache, it is
sent to the MediaAgent. If the hash does not exist in the DDB, the MediaAgent will
request the block to be sent to the MediaAgent. Both the local cache and the
deduplication database will be updated with the new hash. If the block does exist the
MediaAgent will request the block to be discarded.
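The following Python sketch summarizes the deduplication decision flow described above, including the optional client side disk cache. It is a simplified illustration under assumed names; the actual hash algorithm, protocol and data structures used by Simpana software are internal to the product.

import hashlib

# Illustrative client-side deduplication flow (assumed names, not actual Simpana logic).

ddb = set()            # signatures known to the MediaAgent's deduplication database
client_cache = set()   # optional client side disk cache of signatures (per subclient)
disk_library = {}      # signature -> block actually written to disk storage

def signature(block: bytes) -> str:
    # A hash that uniquely represents the block's content.
    return hashlib.sha256(block).hexdigest()

def backup_block(block: bytes, use_client_cache: bool = True) -> str:
    sig = signature(block)

    # 1. Optional local cache check on the client (low bandwidth option).
    if use_client_cache and sig in client_cache:
        return "discarded on client (signature found in local cache)"

    # 2. Only the signature is sent to the MediaAgent, which checks the DDB.
    if sig in ddb:
        if use_client_cache:
            client_cache.add(sig)
        return "discarded (block already in disk library); only metadata written"

    # 3. Unique block: the MediaAgent requests the block and writes it to disk.
    ddb.add(sig)
    if use_client_cache:
        client_cache.add(sig)
    disk_library[sig] = block
    return "unique block written to disk library; signature logged in DDB"

print(backup_block(b"A" * 131072))   # first occurrence: written
print(backup_block(b"A" * 131072))   # duplicate: discarded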
Deduplication Database
The deduplication database currently can scale from 500 to 750 million records. This results in
up to 90 Terabytes of data stored within the disk library and up to 900 Terabytes of production
data in protected storage. It is important to note that the 900 TB is not source size but the amount
of data that is backed up over time. For example, if 200 TB of data is being protected and retained
for 28 days using weekly full and daily incremental backups, the total amount of protected data
would be 800 TB (200 TB per cycle multiplied by 4 cycles since a full is being performed every
seven days). These estimations are based on a 128k block size and may be higher or lower
depending on the number of unique blocks and deduplication ratio being attained.
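The arithmetic in the example above can be reproduced with a short calculation. This is a rough estimate only; it counts the weekly full backups and ignores incremental data, compression and deduplication savings.

# Rough estimate of protected data held over the retention period.
# Simplification: counts only the weekly fulls, ignoring incremental changes.

source_tb = 200          # production data protected per full cycle
retention_days = 28
cycle_days = 7           # weekly full backups

cycles_retained = retention_days // cycle_days        # 4 cycles
protected_tb = source_tb * cycles_retained            # 200 TB x 4 = 800 TB

print(f"{protected_tb} TB of protected data referenced over {retention_days} days")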
Deduplication Store
Each storage policy copy configured with a deduplication database will have its own
deduplication store. Quite simply a deduplication store is a group of folders used to write
deduplicated data to disk. Each store will be completely self-contained. Data blocks from one
store cannot be written to another store and data blocks in one store cannot be referenced
from a different deduplication database for another store. This means that the more
independent deduplication storage policies you have, the more duplicate data will exist in disk
storage.
Deduplication Building Block Guidelines
Deduplication Database (DDB) guidelines, based on a 128 KB block size:
• Dedicated high speed SSD disks
• Must meet IOPS requirements
• 300 – 500 GB capacity
CommVault recommends using building block guidelines for scalability in large environments.
There are two layers to a building block, the physical layer and the logical layer.
For the physical layer, each building block will consist of one or more MediaAgents, one disk
library and one deduplication database.
For the logical layer, each building block will contain one or more storage policies. If multiple
storage policies are going to be used they should all be linked to a single global deduplication
policy for the building block.
A building block using a deduplication block size of 128 KB can scale to retain up to 96 TB of
deduplicated data. This could retain approximately 40 – 60 TB of production data with retention
of 30 – 90 days. The actual size of data will vary depending on the uniqueness of production
data and the incremental block rate of change.
Performance starts with properly scaling the MediaAgent. There should be a minimum of 32 GB
of RAM on each MediaAgent hosting the deduplication database.
The disk library can be sized up to 100 TB for a single building block. Mount paths should be
configured between 2 – 8 TB.
In order to meet deduplication database IOPS requirements, high performance disks in a RAID
array must be used. Enterprise class Solid State Disks or high speed SCSI disks are
recommended. The disks should be configured in a RAID 0 or RAID 10 configuration. RAID 0
provides the best read / write performance but creates multiple single points of failure. If RAID
0 is going to be used, ensure you are frequently protecting the deduplication database. Dedupe
database backup and recovery will be covered later in this eLearning course.
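The guidelines above can be turned into a simple sanity check for a proposed building block. The thresholds below are taken directly from this section; the helper function itself is illustrative and is not a CommVault sizing tool.

# Illustrative check of a proposed building block against the guidelines in this section
# (128 KB block size: about 96 TB of deduplicated data, disk library up to 100 TB,
# mount paths of 2-8 TB, and at least 32 GB of RAM on the MediaAgent hosting the DDB).

def check_building_block(ram_gb, library_tb, mount_path_tb, dedup_data_tb):
    issues = []
    if ram_gb < 32:
        issues.append("MediaAgent hosting the DDB should have at least 32 GB of RAM")
    if library_tb > 100:
        issues.append("disk library should be sized up to 100 TB per building block")
    if not 2 <= mount_path_tb <= 8:
        issues.append("mount paths should be configured between 2 and 8 TB")
    if dedup_data_tb > 96:
        issues.append("a 128 KB building block scales to about 96 TB of deduplicated data")
    return issues or ["within the guidelines described in this section"]

for note in check_building_block(ram_gb=24, library_tb=100, mount_path_tb=4, dedup_data_tb=90):
    print(note)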
Deduplication Storage Options
There are three ways in which disk library data paths can be configured when using
deduplication: Direct Attached Storage (DAS), Storage Area Network (SAN) and Network
Attached Storage (NAS).
Direct attached storage is when the disk library is physically attached to the MediaAgent. In this
case each building block will be completely self-contained. This provides for high performance
but limits resiliency. If the MediaAgent controlling the building block fails, data stored in the disk
library cannot be recovered until the MediaAgent is repaired or replaced.
Keep in mind that, in this case, all the data in the disk library is still completely indexed and
recoverable, even if the index cache is lost. Once the MediaAgent is reconstructed, data from
the disk library can be restored.
Storage Area Networks or SANs are very common in many data centers. SAN storage can be
zoned and presented to MediaAgents using either Fibre Channel or iSCSI. In this case the zoned
storage is presented directly to the MediaAgent providing Read / Write access to the disks.
When using SAN storage, each building block should use a dedicated MediaAgent, deduplication
database and disk library. Although the backend disk storage in the SAN can reside on the same
disk array, logically in the Simpana software it should be configured as two separate libraries.
This provides for fast and protocol efficient movement of data but, as in the case of Direct
Attached Storage, if the building block MediaAgent fails, data cannot be restored. When using
SAN storage either the MediaAgent can be rebuilt or the disk library can be re-zoned to a
different MediaAgent. If the disk library is rezoned, it must be reconfigured in the Simpana
software to the MediaAgent that has access to the LUN.
Network Attached Storage has an advantage in that the path to the storage is directly through
the NAS hardware. This means that by using CIFS or NFS, UNC paths can be configured for a disk
library to read and write directly to storage. When using NAS storage as a disk library, it is still
recommended to configure two separate disk libraries in the Simpana software. In this case the
library can be configured as a shared library, where both MediaAgents can see all storage.
Separate building blocks should still be used for each MediaAgent providing Read / Write access
to a disk library but Read Only access can also be granted to all libraries on the NAS storage. In
this case, if a MediaAgent fails, any other MediaAgent with access to the library can conduct
restore operations.
Partitioned Deduplication Database
Slide diagram: the deduplication database split into Partition 1 and Partition 2, each DDB partition hosted on a separate MediaAgent.
Parallel deduplication is a highly scalable and resilient solution that allows the deduplication
database to be partitioned. It works by dividing signatures between multiple databases to
increase the capacity of a single building block. If two dedupe partitions are used, it effectively
doubles the size of the deduplication store.
In this example, two dedupe partitions have been configured, each on a separate MediaAgent.
Signatures are generated on the Client and depending on the signature generated it will be
directed to one of the two partitions for processing. Although either MediaAgent can process
signature lookups, the data for the client will always use its default MediaAgent path. This
allows all unique deduplication blocks to be protected through a single MediaAgent although
duplicate blocks may have been protected by either of the MediaAgents.
Since deduplicated data can exist on either of the partitions, the disk library should be
configured using NAS storage. UNC paths should be used for the NAS disk library so either
MediaAgent will be able to access data even if the other MediaAgent is unavailable.
Parallel deduplication is an advanced feature for large enterprise environments and CommVault
Professional Services should be consulted when designing deduplication building blocks using
this solution.
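Conceptually, each signature is routed to one of the partitions based on the signature value itself, while the client's data continues to flow through its default MediaAgent path. The sketch below illustrates that routing idea with an assumed two-partition modulo scheme; it is not the actual partitioning algorithm used by the software.

import hashlib

# Illustrative routing of signatures across two DDB partitions (not the actual scheme).

NUM_PARTITIONS = 2

def partition_for(signature_hex: str) -> int:
    # Route the signature lookup to a partition based on its value,
    # so each partition handles roughly half of the signature space.
    return int(signature_hex, 16) % NUM_PARTITIONS

sig = hashlib.sha256(b"example block").hexdigest()
print("signature lookup handled by DDB partition", partition_for(sig))
# The block itself, if unique, is still written through the client's
# default MediaAgent data path regardless of which partition answered.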
Enterprise Building Block Guidelines
Slide diagram: production data sets deduplicated at 128 KB with 30 day and 90 day retention share a global deduplication storage policy, MediaAgent and disk library, while a separate production data set deduplicated at 256 KB with 14 day retention uses its own MediaAgent and disk library (capacity figures shown: 800 GB, 4 TB, 24 TB).
When designing storage policy and building block architecture, another consideration is that
certain data types do not deduplicate well against other data types. A prime example would be
file system data and database data. In this case, different building blocks and storage policies
can be configured to manage different data types. In this example a global deduplication
storage policy has been configured with a block size of 128 KB. Two data management storage
policies have been configured, one with a 30 day retention and the other with a 90 day
retention. All deduplication blocks from both storage policies will deduplicate based on the
global deduplication policy setting, but will be retained based on the data management storage
policy retention.
A second building block using a dedicated storage policy has been configured for database
backups. In this example a 256 KB block size has been configured and the storage policy has
retention of 14 days.
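The design described above can be summarized as a small configuration sketch. The policy names and structure below are assumptions used to illustrate the relationships, not actual Simpana configuration syntax.

# Illustrative representation of the design described above (names are assumptions).

global_dedup_policy = {"name": "Global_FS_Dedup", "block_size_kb": 128}

storage_policies = [
    # Both file system policies deduplicate against the same global policy,
    # but keep their data for different retention periods.
    {"name": "FS_30day", "global_dedup": "Global_FS_Dedup", "retention_days": 30},
    {"name": "FS_90day", "global_dedup": "Global_FS_Dedup", "retention_days": 90},
    # Databases go to a separate building block with a larger block size.
    {"name": "DB_14day", "global_dedup": None, "block_size_kb": 256, "retention_days": 14},
]

for sp in storage_policies:
    block = sp.get("block_size_kb", global_dedup_policy["block_size_kb"])
    print(f"{sp['name']}: {block} KB blocks, retained {sp['retention_days']} days")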
SILO Storage
• How SILO Works
• SILO Folder Recovery Process
Slide diagram: a Client protected through a storage policy with Primary, Secondary and SILO copies; metadata, block data and index data are written to volume folders, and each folder is closed when its size limit is reached.
SILO Storage
Consider all the data that is protected within one fiscal quarter within an organization.
Traditionally a quarter end backup would be preserved for long term retention. Let’s assume
that quarter end backup of all data requires 10 LTO 5 tapes. Unfortunately with this strategy the
only data that could be recovered would be what existed at the time of the quarter end backup.
Anything deleted prior to the backup within the specific quarter would be unrecoverable unless
it existed in a prior quarter end backup. This results in a single point in time that data can be
recovered. Now let’s consider those same 10 tapes containing every backup that existed within
the entire quarter. Now any point in time within the entire quarter can be recovered. That is
what SILO storage can do.
SILO storage allows deduplicated data to be copied to tape without rehydrating the data. This
means the same deduplication ratio that is achieved on disk can also be achieved to tape. As
data on disk storage gets older the data can be pruned to make space available for new data.
This allows disk retention to be extended out for very long periods of time by moving older data
to tape.
Data blocks are written to volume folders in disk storage. These folders make up the
deduplication store. The folders have a maximum size; once that size is reached, the folder is marked
closed. New folders will then be created for new blocks being written. The default volume folder
size for a SILO enabled copy is 512 MB. This value can be set in the Control Panel, in the Media
Management Applet. The SILO Archive Configuration setting Approximate Dedup disk volume
size in MB for SILO enabled copy is used to specify the volume folder size. It is strongly
recommended to use the default 512 MB value. For a SILO enabled storage policy, when the
folder is marked full it can then be copied to tape. What this really is doing is backing up the
backup.
When a storage policy is enabled for SILO storage an On Demand Backup Set is created in the
File System iDataAgent on the CommServe server. The On Demand Backup Set will determine
which volume folders have been marked full and back them up to tape each time a SILO
operation runs. Within the backup set a Default Subclient is used to schedule the SILO
operations to run. Just like an ordinary data protection operation, right-click the subclient and
select Backup. The SILO backup will always be a full backup operation and use the On Demand
Backup to determine which folders will be copied to SILO storage.
In traditional recovery from tape, the tape is mounted in a drive and the data is recovered
directly back to the recovery location. With SILO to tape the data must first be staged to the disk
before the data can be recovered. Each volume folder that contains data blocks for the restore
must be staged to the disk library for the recovery operation to complete. Since block level
deduplication will result in blocks in different locations being referenced by data, multiple
volume folders may be needed for a single recovery operation. This can result in a slower
restore performance.
SILO storage is intended to be a compliance solution by storing data with long retention in
deduplicated form. Time to recover SILO data will be longer than traditional tape or disk storage
since it needs to be pre-staged to disk before recovery. SILO storage is not an option to recover
data from last week but rather is a feature to recover data from last year or five years ago.
Understanding this concept places SILO storage into proper perspective. This feature is for long
term preservation of data to allow for point in time restores within a time period with
considerably less storage requirements than traditional tape storage methods.
For example, we could define the SLA for data up to 6 months old to be 2 hours, from 6 months to
1 year to be 2-4 hours, and beyond that point to be 4+ hours. The recovery process then works as
follows (a simplified sketch of the staging flow appears after this list):
• The CommVault administrator performs a browse operation to restore a folder from eight
months ago.
• If the volume folders are still on disk the recovery operation will proceed normally.
• If the volume folders are not on disk the recovery operation will go into a waiting state.
• A SILO recovery operation will start and all volume folders required for the restore will be
staged back to the disk library.
• Once all volume folders have been staged, the recovery operation will run.
• To ensure adequate space for SILO staging operations a disk library mount path can
optionally be dedicated to SILO restore operations. To do this, in the Mount Path Properties
General tab select the option Reserve space for SILO restores.
• The procedure is straight forward and as long as SILO tapes are available the recovery
operation is fully automated and requires no special intervention by the CommVault
administrator.
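The staging flow described in this list can be summarized in a short sketch. The function and variable names are assumptions, and the example is illustrative only; the actual SILO restore process is fully automated by the software.

# Illustrative SILO recovery flow: volume folders must be back on disk before restore.

volume_folders_on_disk = {"V_000123", "V_000124"}        # currently in the disk library
silo_tape_contents = {"V_000087": "Tape_S01", "V_000088": "Tape_S01"}

def restore_from_silo(required_folders):
    missing = [v for v in required_folders if v not in volume_folders_on_disk]
    if not missing:
        print("All volume folders on disk: recovery proceeds normally")
        return
    print("Recovery waits; SILO restore stages folders back to the disk library:")
    for folder in missing:
        tape = silo_tape_contents[folder]
        print(f"  staging {folder} from {tape}")
        volume_folders_on_disk.add(folder)
    print("Staging complete: recovery operation runs")

# Restoring a folder from eight months ago whose blocks span old and current folders.
restore_from_silo(["V_000123", "V_000087"])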
Advanced Deduplication Configurations
• Compression
• Client side disk cache
• Variable Content Alignment
• Fragmentation considerations
Compression
It is recommended for most data types to enable compression during the deduplication process.
Compression can be enabled in the storage policy primary copy or in the subclient properties.
By default compression is enabled for a deduplication storage policy. You can turn compression
off in the storage policy copy or you can override the use of compression in the subclient
properties.
For certain application types, such as Oracle and SQL, which may perform application level
compression, you should use a dedicated deduplication storage policy with compression turned
off. In some cases using application compression can cause deduplication rates to suffer. In this
case you should experiment with using application compression or CommVault compression to
determine which results in better deduplication ratios. For large databases it is recommended
to consult with CommVault on best practices.
Fragmentation Considerations
Since CommVault stores data in the disk library in chunks, when blocks are deleted from disk it
causes empty spaces within the chunk. For Windows MediaAgents, the sparse file attribute is
used to allow empty spaces within the chunk to be used to store new blocks. Since Windows
uses a write next mechanism when writing data to disk, the empty spaces will only be allocated
to new data when the disk starts to reach full capacity. If new data is written to the empty
spaces, fragmentation could occur. This could negatively affect performance for auxiliary copy
and restore operations. Scheduled fragmentation analysis operations can be configured for the
disk library. This will analyze each mount path to determine the level of chunk fragmentation
that exists. If fragmentation levels are too high, defragmentation operations can be run by using
third party file level defrag tools. When performing defragmentation operations on a mount
path, the mount path should be placed in an offline state.
Deduplication Best Practices
• General Guidelines
• Deduplication Database
• Disk Library Considerations
• GridStor™ Technology Considerations
• Deduplication Storage
• Block Size Settings
• Performance
• Global Deduplication
• SILO Storage
General Guidelines
• Carefully plan your environment before implementing deduplication policies.
• Factor current protection needs and future growth into your storage policy design. Scale your
deduplication solution accordingly so the deduplication infrastructure can grow with your
environment.
• Once a storage policy has been created the option to use a global dedupe policy cannot be
modified.
• When using encryption use dedicated policies for encrypted data and other policies for non-
encrypted data.
• Not all data should be deduplicated. Consider a non-deduplicated policy for certain data
types.
• Non-deduplicated data should be stored in a separate disk library. This will ensure accurate
deduplication statistics which can assist in estimating future disk requirements.
Deduplication Database
• Ensure there is adequate disk space for the deduplication database.
• Use dedicated dedupe databases with local disk access on each MediaAgent.
• Use high speed SCSI disks in a RAID 0, 5, 10, or 50 configuration.
• Ensure the deduplication database is properly protected.
• Do NOT back up the deduplication database to the same location where the active database resides.
Deduplication Store
• Only seal deduplication stores when databases grow too large or when using SILO storage.
• When using SILO storage consider sealing stores at specific time intervals e.g. monthly or
quarterly to consolidate the time period to tape media.
• For WAN backups you can seed active stores to reduce the data blocks that must be
retransmitted when a store is sealed. Use the Use Store Priming option with Source-Side
Deduplication to seed new active stores with data blocks from sealed stores.
Performance
• Use DASH Full backup operations to greatly increase performance for full data protection
operations.
• Use DASH Copy for auxiliary copy jobs to greatly increase auxiliary copy performance.
• Ensure the deduplication database is on high speed SCSI disks.
• Ensure MediaAgents hosting a dedupe database have enough memory (at least 32 GB).
Global Deduplication
• Global deduplication is not a be-all-end-all solution and should not be used all the time.
• Consider using global dedupe policies as a base for other object level policy copies. This will
provide greater flexibility in defining retention policies when protecting object data.
• Use global deduplication storage policies to consolidate remote office backup data in one
location.
• Use this feature when like data types (file data and/or virtual machine data) need to be
managed by different storage policies but stored in the same disk library.
SILO storage
• SILO storage is for long term data preservation and not short term disaster recovery.
• Recovery time will be longer if data is in a tape SILO, so for short term, fast data recovery use
traditional auxiliary copy operations.
Disaster recovery or ‘DR’ is much more than backing up data and sending it off-site. Like other
areas of technology, disaster recovery has been refined to a science encompassing all aspects of
data protection, data preservation and data recovery. This science has been molded to a point
where several key concepts and definitions are commonly used when planning, testing and
implementing DR plans. The following information provides a high level overview of each of
these concepts.
disaster and should be quantified by business system owners and other business units.
Technologies such as clustering, virtualization and disk replication/mirroring are implemented
with the intention to reduce and in some cases eliminate system outages. These systems
provide a level of high availability that, when planned correctly, can guarantee a high level of
uptime. However, it is important to properly understand the type of disasters that may occur and
how they might affect RTO.
Gap Analysis
Gap Analysis is a process in which business units define SLA values for various business systems
and then pass them along to the technical teams. The technical teams conduct tests to establish
their current capability to meet those SLAs. A gap analysis is then performed to see whether the
established SLAs can be met. If not, the technical teams must address the shortcomings and adapt
to better meet the business units' requirements. In some cases procedural adjustments can be made
to better meet the business's needs. In other cases additional investments must be made to meet
SLA requirements. If the business units' needs cannot be met, or budget limitations prevent gap
reduction, then the business units must redefine their SLAs to be more in line with the realistic
capabilities of the technical teams.
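As a purely illustrative sketch (the system names, SLA hours, and field names below are invented, not taken from this guide), a gap analysis can be reduced to comparing the recovery time a business unit requires against the recovery time the technical team has demonstrated in testing:

# Hypothetical gap-analysis sketch: compare required SLAs against tested capability.
# All systems, hours, and field names are illustrative assumptions.
systems = [
    {"name": "Finance DB",   "required_rto_hrs": 4,  "tested_rto_hrs": 6},
    {"name": "Home Folders", "required_rto_hrs": 24, "tested_rto_hrs": 12},
]

for s in systems:
    gap = s["tested_rto_hrs"] - s["required_rto_hrs"]
    if gap > 0:
        print(f"{s['name']}: gap of {gap} hours - invest, adjust procedures, or renegotiate the SLA")
    else:
        print(f"{s['name']}: SLA met with {-gap} hours to spare")

Systems that print a positive gap are the ones where procedures, investment, or the SLA itself must change.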
Another key point regarding gap analysis is that each business unit will always think that its
systems are the most important. Fairly determining system priority and properly defining SLAs is
sometimes a better fit for outside consultants or auditors. If outside consultants are used, it is
important that they do not represent specific products or technologies, as they may push their
preferred solutions rather than the best solution for your situation. Auditors can be a big benefit,
as their knowledge of compliance requirements such as Sarbanes-Oxley can be used to push
through technology upgrades and change legacy thought processes that impede progress toward
a sound disaster recovery strategy.
Risk Assessment
Risk Assessment is a companywide, coordinated effort to address the likelihood of a disaster, the
effect it may have on the business and the cost involved in preparing for it. Risks such as air
conditioner leaks, fire, hacking or sabotage are disaster situations that every company should
be prepared to deal with. Major disasters such as a tornado, hurricane, volcanic eruption or
terrorist attack are more complicated disasters that, depending on the nature of a business, may
or may not be considered in a DR plan. This may sound contrary to what a DR course should
state, but the truth is that location, disaster probability, the nature of the business and the data
being protected will all factor into planning a sound DR strategy.
If you work for a small company on the outskirts of Mt. Rainier, the potential of a volcanic
eruption and the cost of defining short SLAs for it, such as those that might be defined for an air
conditioner leak, may not be worth the money and effort when the likelihood of an eruption is
very small. In this case the cost associated with meeting short SLAs for an eruption would be
substantially greater than for an air conditioner leak. On the other hand, if you work for a major
bank in the same location, short SLAs would most likely be required. The point here is not that a
DR plan should not be put in place, but rather that the SLAs for the various levels of disaster should
be realistically weighed on a cost/benefit scale before investing in meeting SLA requirements. Not
all disasters are created equal, so risk assessment should be considered at various disaster levels:
business system outage, limited site disaster, site disaster and regional disaster.
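One way to put disasters of very different likelihood on the cost/benefit scale described above is a simple annualized-loss comparison. The probabilities, impact costs, and mitigation costs below are invented purely to illustrate the arithmetic and are not figures from this guide:

# Illustrative risk-assessment arithmetic (hypothetical numbers).
# annualized_loss = probability of the event per year * cost if it happens
risks = [
    {"event": "air conditioner leak", "annual_probability": 0.20,  "impact_cost": 150_000,   "mitigation_cost": 20_000},
    {"event": "volcanic eruption",    "annual_probability": 0.001, "impact_cost": 5_000_000, "mitigation_cost": 400_000},
]

for r in risks:
    annualized_loss = r["annual_probability"] * r["impact_cost"]
    worthwhile = annualized_loss > r["mitigation_cost"]
    print(f"{r['event']}: expected annual loss ${annualized_loss:,.0f}, "
          f"mitigation ${r['mitigation_cost']:,}, invest: {worthwhile}")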
Where the TCO can usually be quantified with various calculations, Return on Investment (ROI)
is not as easy to quantify. If disaster strikes two months after a DR plan is implemented, the ROI
is obvious. If disaster never strikes, the ROI may be thought of as nothing. The truth is that ROI
can be quantified when put into perspective. The peace of mind that a sound DR plan brings to a
company can be factored into the ROI. Many companies that implement sound DR plans may
receive a break on insurance, pass security and DR audits, and even see an increase in customer
and investor confidence. These factors should not be taken lightly and, depending on the company
and the services it provides, a sound DR plan can even be used in advertising. In the overall planning
of a DR strategy, TCO and ROI should be taken into account to properly define SLAs.
cheaper and Simpana features such as deduplication, DASH Full and DASH Copy, a branch office
can be quickly and inexpensively converted into a warm or hot DR site.
In some cases cost reduction can have a negative effect on DR. Consider deduplication, currently a
central concept in data protection. When blocks are deduplicated they are only stored once. In
this case the cost reduction in disk storage is countered by the increased risk of a corrupt block
affecting the ability to recover data. This is the concept of cost reduction vs. risk reduction.
Saving money on disk storage results in increased risk. Another example is implementing
archiving solutions, where data is moved to secondary storage to free up space in production.
Like deduplication, this results in data being stored in a single location, which may increase risk.
That said, technologies such as deduplication and archiving can also be methods of reducing risk
without increasing cost. When the Simpana software is configured properly and CommVault best
practices are followed, cost and risk reduction can both be achieved.
• Concepts of Business Continuity
• High Availability
• Disaster Recovery
The concept of Business Continuity (BC) is the holistic approach of defining guidelines and
procedures for the continuation of a business in the face of any disaster situation. In this case
disaster may or may not even involve technical aspects or require DR planning. Business
continuity is beyond the scope of an IT department and beyond the scope of this course, but it
is extremely important to consider in regard to DR planning. A DR strategy may be perfectly
planned and executed, but without proper BC plans and procedures the effort of IT may be in
vain. The primary point to consider here is that on the technical end of things you may not have
the ability to design a BC strategy, but you do have the power to influence it. In some cases that
influence may include ensuring that the DR aspects of a BC plan, such as facilities, chain of
command, communication and power sources, are properly addressed. In other cases it might be
making upper management aware that they need to create a BC plan, as some companies may
have no idea of how important BC planning is.
Consider the following critical BC points and questions as they relate to DR planning:
• Facilities – How secure is the main data center? Is the air conditioner right on top of the data
center? How reliable is the power source? Is there a generator? How often is it tested? How
much fuel does it have?
• Chain of command – Who is in charge when the person in charge is not there? Who is next on
the list? Who on the management team do you contact if you need to make a substantial
emergency purchase? What are ALL the methods to contact ALL the people in the chain?
• Communication – Who is our cell phone provider and what are their contingency plans in the
event of disaster? Who is responsible for communicating with them? In the case of disaster
how will management communicate with employees on status updates?
• Contingencies – What happens when DR plans need to be changed? How does the company
deal with extended outages, such as utility failures where the ability to restore power or
communication is out of the company's hands?
• Continuation of business – How will employees work if there is no facility to work from? How
will they access resources? How will they communicate?
Protection Methods
• Traditional Backups
• Archiving
• Simpana OnePass™ Feature
• Snapshots
• Edge Data Protection
• Image Level Backups
Protection Methods
There are several primary protection methods used in modern data centers. Each of these
technologies has its advantages and disadvantages. It is important to understand that not all
technologies are created equal, and a holistic approach should be taken when designing a
data protection strategy to meet SLAs.
Traditional Backups
Traditional backups to disk or tape protect data by backing up each object to protected storage.
This is the tried and true method that has been used for decades, making it the most reliable
protection technology. The main advantage of traditional backups is that each protected item is a
complete, separate copy backed up to separate media. When using tape media the backup also
becomes portable. Many modern backup solutions incorporate traditional backups to disk storage,
which is then replicated to a DR site. CommVault's deduplication with DASH Copy is an example of
using traditional backups with scheduled replication (DASH Copy), where only changed blocks are
transmitted to the DR location. Traditional backups and restores are usually slower than some more
modern protection technologies, which can have a negative effect on SLAs. This performance
bottleneck is more severe when millions of items require protection, such as large file repositories.
Nevertheless, traditional backups are still the most common and cost effective data protection
technology.
Archiving
Data Archiving is not technically a data protection technology but can be used to improve SLAs.
Archiving removes infrequently accessed data from production disks and moves it to less
expensive secondary storage. The archived data can be recalled by end users or Simpana
administrators. By removing data from the production environment, backups and restores
complete faster since less data needs to be moved, improving RTO and RPO.
The Simpana OnePass™ Agent is a comprehensive solution included in the Simpana® product suite
that incorporates traditional backup and archiving into a single operation. It enables the
movement of data to a secondary storage location and uses this data to meet both data
protection and storage management archiving business objectives. Secure data recovery is
available to both administrators and end-users via a platform-independent web-based console,
file stub recovery and a tightly integrated Outlook add-in. Policy-driven selective stubbing and
deletion from front-end storage provides storage management archiving without the need to
process the data a second time.
Snapshots
Snapshots are logical point-in-time views of source volumes that can be created almost
instantaneously. This allows for shortened RPOs, since snapshots can be taken more
frequently throughout the day. A snapshot on its own is not truly considered a DR protection
strategy, since the protected data is not physically moved to separate media. Advanced snapshot
technologies allow data to be mirrored or vaulted to separate physical disks, which can be located
at off-site DR locations. Snapshot technologies are used to meet strict SLA requirements but are
considerably more expensive to implement, as they require dedicated hardware. Simpana's
Continuous Data Replicator (CDR) is a software based snapshot and replication technology
which is a cost effective alternative to hardware snapshots. For supported hardware and CDR,
SnapProtect™ technology can be used to conduct and manage snapshots.
Replication
Replication technology is used to replicate block or object changes from a source volume to a
destination volume. Replication methods can use synchronous or asynchronous replication to
synchronize source and destination volumes using a one-to-one, one-to-many (fan out), or
many-to-one (fan in) replication strategy. Production data can be replicated, providing fast SLAs
for high availability. Backup data or snapshot data can be replicated, providing for a more
complete DR solution. A disadvantage of replication is that if corruption occurs at the source, it
may be replicated to the destination, so replication should be used along with point-in-time
snapshots.
Data Description

Data | Classification | Data Type | Protection Priority (1-4) | Data Center Location | Server Location | Size of Data | % Inc. Rate of Change / Week | Projected Growth (6 Month)
HR Database | Business | Oracle DB | 3 | Main Center / corp gov | Physical OR_Corp | 450 GB | 2% | 680 GB
Sales Mail Database | IT | Exchange DB store | 3 | Main Center / Corp | Physical Exch_1 | 120 GB | 4% | 140 GB
Users Mail Database | IT | Exchange DB store | 2 | Main Center / corp | Physical Exch_1 | 350 GB | 3% | 400 GB (quota limited)
Managers + mailboxes | Business | Exchange mailboxes | 4 | Main Center / corp | Physical Exch_1 | 220 GB | 4% | 450 GB
Manager + Journal | Business | Exchange Journal MB | 4 | Main Center / Corp | Physical Exch_1 | 15 GB | 100% | 390 GB
Finance DB | Business | MS-SQL DB | 4 | Main / Accounting | Virtual / SQL_1 | 50 GB | 2% | 74 GB
Finance SQL Server | IT | File System | 4 | Main / Accounting | Virtual / SQL_1 | 25 GB | 0.05% | 27 GB
SQL System DB | IT | MS-SQL DB | 4 | Main / Accounting | Virtual / SQL_1 | 15 GB | 0.05% | 16 GB
Windows File System | IT | Win File System | 2 | Main / Production | Virtual / WIN_1 | 20 GB | 2% | 30 GB
Home Folders | Business / IT | Win File System | 3 | Main / Production | Virtual / WIN_1 | 800 GB | 3% | 1.4 TB
NAS File Share | IT | NAS NDMP | 3 | Main Data Center | Physical / NAS_1 | 1.8 TB | 4% | 3.5 TB
Data Description
Data Description is based on the business data residing in the production environment. In some
cases this data can be an entire server; in other cases a business system may span multiple
servers; and in still other cases business data requiring different protection may exist on a single
server. The key aspect of describing data should be its business value and not its physical
location.
The Client/Host is the system through which the specified data set will be accessed. In the case
of shared or distributed storage there may be more than one client per data set. Identifying the
client(s) marks the first transition point for data movement. Data will be read from primary
storage through the client host onto protected storage. Its path from the client to protected
storage will be determined by the placement of MediaAgents in the final storage design.
The location of the data will help determine whether some protection options (such as Snap or
replication) are possible and it will also determine any possible resource/data path sharing
requirements. Several sets of data located on the same shared storage device but under
different client management can present potential performance problems.
Volume information is, of course, essential to sizing protected storage, but it’s also essential to
determine data movement resource requirements and potentially the need for parallel data
movement to meet operation window requirements.
The dynamics of the data is its daily change rate and annual growth rate. Both are key data
points for storage and storage policy design. Daily change quantifies both modified and new
data as the minimum data volume that requires protection. This impacts the rate of protected
storage growth and resources required to move the new data into storage. Annual growth rate
helps determine future storage capacity which must be accounted for in any storage design.
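A minimal sketch of how these two data points feed a capacity estimate follows. The change rate, growth rate, and retention figures are assumptions for illustration only, and the calculation deliberately ignores compression and deduplication:

# Rough protected-storage estimate from size, weekly change rate, and growth (illustrative).
size_gb = 800            # current front-end size of the data set
weekly_change = 0.03     # 3% of the data changes or is added per week
annual_growth = 0.75     # expected front-end growth over 12 months
retained_cycles = 4      # weekly fulls kept on the primary copy (assumption)

incremental_per_week_gb = size_gb * weekly_change
size_next_year_gb = size_gb * (1 + annual_growth)
primary_copy_estimate_gb = size_next_year_gb * retained_cycles  # no compression/dedupe

print(f"Weekly incremental volume today:   ~{incremental_per_week_gb:.0f} GB")
print(f"Front-end size in 12 months:       ~{size_next_year_gb:.0f} GB")
print(f"Primary copy capacity (no dedupe): ~{primary_copy_estimate_gb:.0f} GB")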
Data Availability

Data | Priority (1-4) | Data Center Location | Size of Data | % Inc. Rate of Change / Week | Projected Growth (6 Month) | Recovery Time Objective | Recovery Point Objective | Online / Offline Protection to Meet Objectives
HR Database | 3 | Main Center / corp gov | 450 GB | 2% | 680 GB | 8 hours | 8 hours | Point-in-time log shipping
Sales Mail Database | 3 | Main Center / Corp | 120 GB | 4% | 140 GB | 8 hours | 24 hours | Nightly full backups
Users Mail Database | 2 | Main Center / corp | 350 GB | 3% | 400 GB (quota limited) | 24 hours | 24 hours | Nightly full backups
Managers + mailboxes | 4 | Main Center / corp | 220 GB | 4% | 450 GB | 2 hours (first 2 months), 24 hrs after | 24 hours | Nightly mailbox backups to disk target for recovery requests
Manager + Journal | 4 | Main Center / Corp | 15 GB | 100% | 390 GB | 1 hour (first 6 months), 24 hrs after | 24 hours | Nightly compliance backup
Finance DB | 4 | Main / Accounting | 50 GB | 2% | 74 GB | 4 hours | Near zero data loss | Synchronous DB replication
Finance SQL Server | 4 | Main / Accounting | 25 GB | 0.05% | 27 GB | 4 hours | 24 hours | Nightly backups
SQL System DB | 4 | Main / Accounting | 15 GB | 0.05% | 16 GB | 4 hours | 24 hours | Nightly backups
Windows File System | 2 | Main / Production | 20 GB | 2% | 30 GB | 24 hours | 24 hours | Nightly backups
Home Folders | 3 | Main / Production | 800 GB | 3% | 1.4 TB | 24 hours | 24 hours | Nightly backups
NAS File Share | 3 | Main Data Center | 1.8 TB | 4% | 3.5 TB | 1 hour | 4 hours | Hardware point-in-time snapshots (4 hour)
Data Availability
Data Availability is the speed and ease of access to data. Understanding Data Availability
options, capabilities, and limitations is essential to designing protected storage. Production disk
is the primary data availability media/location. Data at that level is instantaneously and
transparently available to both applications and users. It’s where the data is originally written
and read from.
FS/MB data recovery requests > 1 year old but < 7 years old can be recovered within (24) hours
of request. The data recovered will be the last monthly full iteration of the data.
Business Continuity
Business Continuity (BC) is the immediate availability of data as may be required to minimize
the interruption of day-to-day business. This usually involves loss of a file, folder, disk, or server
and is normally satisfied by restore from on-site backup data on magnetic storage. BC
requirements are usually specified in media type and length of availability.
Disaster Recovery
Disaster Recovery (DR) provides for protection against loss of both production and on-site
backup data and usually implies loss of a critical business function. DR requirements are
usually specified in frequency and duration of data movement off-site.
Archive
Archive implies long-term availability of data that has value to the company. It can also mean
movement of less frequently accessed data to less expensive storage. This data may be historical
records required by legal, industry, or company requirements. The recall of archived data may be
transparent (on-site magnetic storage) or limited (vaulted off-site storage). Archive requirements
are usually specified in levels of availability and/or retention.
Data | Protection Method | Primary Target | Primary Retention | Near Line Target | Near Line Retention | Off Site Target | Off Site Retention | Archive Target | Archive Retention | Encryption | Content Index
HR Database | OR DB Agent | LTO 4 Tape | 5 days | N/A | N/A | LTO 4 Tape | 14 days | LTO 4 Tape (EOQ) | 5 years | Hardware LTO 4 | N/A
Sales Mail Database | Exch DB Agent | Dedupe Disk_1 | 14 days | N/A | N/A | LTO 4 Tape | 1 month | N/A | N/A | N/A | N/A
Users Mail Database | Exch DB Agent | Dedupe Disk_1 | 14 days | N/A | N/A | LTO 4 Tape | 1 month | N/A | N/A | N/A | N/A
Managers + mailboxes | Exch mailbox Agent | Dedupe Disk_1 | 2 months | LTO 4 Tape | 1 year | N/A | N/A | N/A | N/A | N/A | Yes
Manager + Journal | Exchange Compliance Archive | Dedupe Disk_1 | 6 months | LTO 4 Tape | 2 years | N/A | N/A | LTO 4 Tape (EOQ) | 5 years | Yes | Yes
Finance DB | SQL Agent on VM | Dedupe Disk_1 | 5 days | N/A | N/A | LTO 4 Tape | 14 days | LTO 4 Tape (EOQ) | 5 years | Hardware LTO 4 | N/A
Finance SQL Server | VSA Snapshot | Dedupe Disk_1 | 5 days | N/A | N/A | LTO 4 Tape | 14 days | N/A | N/A | N/A | N/A
SQL System DB | SQL Agent on VM | Dedupe Disk_1 | 5 days | N/A | N/A | LTO 4 Tape | 14 days | N/A | N/A | N/A | N/A
Windows File System | VSA Snapshot | Dedupe Disk_1 | 14 days | N/A | N/A | LTO 4 Tape | 1 month | N/A | N/A | N/A | N/A
Home Folders | VSA Snapshot | Dedupe Disk_1 | 1 month | LTO 4 Tape | 6 months | LTO 4 Tape | 1 month | N/A | N/A | N/A | N/A
NAS File Share | Hardware Snap / SnapProtect | Dedupe Disk_1 | 2 weeks | LTO 4 Tape | 1 month | N/A | N/A | N/A | N/A | N/A | N/A
Using this chart, Storage Policies can be configured in an efficient manner. A chart such as the
one above, created in a spreadsheet program, can be sorted by fields to determine common
requirements such as storage location and retention. This can simplify the process of creating
Storage Policies.
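The same grouping can be done outside a spreadsheet. This sketch sorts a few rows (abbreviated from the table above; the field names are my own, not a CommVault format) by target and retention so that data sets that could share a storage policy copy fall next to each other:

# Group candidate data sets by common storage requirements (illustrative field names).
rows = [
    {"data": "Sales Mail Database", "primary": "Dedupe Disk_1", "primary_ret": "14 days", "offsite": "LTO 4 Tape", "offsite_ret": "1 month"},
    {"data": "HR Database",         "primary": "LTO 4 Tape",    "primary_ret": "5 days",  "offsite": "LTO 4 Tape", "offsite_ret": "14 days"},
    {"data": "Windows File System", "primary": "Dedupe Disk_1", "primary_ret": "14 days", "offsite": "LTO 4 Tape", "offsite_ret": "1 month"},
]

for row in sorted(rows, key=lambda r: (r["primary"], r["primary_ret"], r["offsite"], r["offsite_ret"])):
    print(row["data"], "->", row["primary"], row["primary_ret"], "|", row["offsite"], row["offsite_ret"])
# Rows printed with identical targets and retentions are candidates for the same storage policy.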
• Agent
• File System
• Application
• Data Set
• Subclients
• Default
• Custom Defined
(Diagram: a Client with an Agent; the Agent contains a data set with a default subclient and a user data subclient; VSS is used during protection.)
The Simpana product suite uses iDataAgents, or ‘Agents’, to communicate with the file systems and
applications that require protection. Any server with an Agent installed on it is referred to as a
Client. Each Agent contains code that is used to communicate directly with the system requiring
protection. The Agent communicates using APIs or scripting that is native to the file system
or application. For example, a Windows 2008 file system can use VSS to protect file data, so the
Windows Agent has the option to enable VSS during backup operations.
The Agent will then have a data set defined. The data set is a complete representation of all
data the Agent is responsible to protect. Within the data set, subclients are used to define the
actual data requiring protection. By default, a Default Subclient is used to define ALL data
requiring protection within the backup set.
Additional subclients can be created to define specific content requiring protection. When
content is defined within a user defined subclient, it is automatically excluded from the
default subclient. An example of a custom subclient could be a specific drive containing user
data, where VSS is initiated for the drive during backup jobs to ensure all open files are
protected.
• VMware®
• VSA on physical or virtual proxy
• Hyper-V
• VSA on physical host
• Agents installed in VMs
(Diagram: for VMware, the VSA is installed on a physical or virtual proxy server; for Hyper-V, the VSA is installed on the physical Hyper-V server (client/MA); client agents can also be installed within the virtual machines.)
For VMware, Simpana supports the VMware vStorage APIs for Data Protection (VADP) method and
the VMware Consolidated Backup (VCB) method. To support VADP backups, Changed Block
Tracking (CBT) must be enabled. The Virtual Server iDataAgent will automatically check and enable
CBT at the time of backup. The Virtual Server iDataAgent may not be able to enable CBT for cloned
or migrated virtual machines.
The traditional method for protecting virtual machines is to install a file system or application
agent within the virtual machine itself. There are advantages and disadvantages to using this
method:
The first advantage is that it is simple to deploy. When agents are installed in the virtual machine,
the Simpana software treats the machine as if it were a physical client. All of the functionality for
managing physical clients is the same for virtual clients.
The second advantage of installing an agent in a virtual machine is that application-specific
functionality becomes available; for example, using MAPI to conduct granular mailbox backups,
or conducting transaction log backups for a virtualized database application.
There are also disadvantages. The first is that installing agents within the virtual machine results
in all data being granularly backed up and restored, which can be a slow process if there are many
files within the machine.
The second is that the backup and recovery process requires all data to be moved over the
network, which can become a bottleneck.
A third disadvantage is that all processing during the backup or recovery process is conducted on
the hypervisor. This could potentially become a bottleneck if too many virtual machines are being
backed up at the same time.
(Diagram: step 1 – the Virtual Server Agent communicates with vCenter to locate the virtual machines to be backed up.)
• The backup is initiated by the Simpana software for each virtual machine that will be backed
up.
• The VSA communicates with hypervisor with the list of virtual machines that have been
defined within the subclient contents of the virtual server agent.
• All virtual machines will have their disks quiesced. For Windows virtual machines, VSS will be
enabled on all of the disks to provide a consistent point-in-time backup of each disk.
• Once the disks for the virtual machines are quiesced, the Hypervisor conducts a software
snapshot which will be used to back up the VM.
• VSA backs up the virtual machines either through the physical hypervisor or a physical proxy.
With VMware, if the VSA is installed on a physical host it will be used as a proxy to back up
the VMs. If a virtual proxy is being used, the VMs will be backed up through the virtual proxy
on the physical hypervisor. For Hyper-V the VMs will be backed up through the physical
hypervisor.
• When the backup process runs, virtual disks are indexed to provide granular recovery of files
and folders within the virtual machine.
• Once the backup is complete, the hypervisor releases the software snapshot. The disks
within the virtual machines are unquiesced and any transactions that were recorded while
the disks were in the quiescent state are replayed.
This process ensures a consistent state of the snapshot at the time the backup was initiated.
For Windows virtual machines, Microsoft’s Volume Shadow Copy Service, or VSS, can be used to
provide consistent point in time backups of disk volumes.
VSS is Windows’ built-in infrastructure for application backups. A native Windows service, VSS
facilitates creating a consistent view of application data during the course of a backup. It relies
on coordination between VSS requestors, writers, and providers to quiesce – or “quiet” – a disk
volume so that a backup can be successfully obtained without data corruption.
In order for this to work, the VMware Tools VSS component must be enabled. The Virtual Server
Agent requests VMware Tools to initiate a VSS snapshot in the guest OS. All registered VSS
writers in the guest OS receive the request and prepare their applications to be backed up by
committing all pending transactions to disk. Once all VSS writers are finished, they signal the
backup software, which then initiates a VMware snapshot. This makes the backup application
consistent.
Protecting Applications
• Application Agents
• Freeze/Thaw Scripts
• VSS Aware Applications
(Diagram: a client with an application database and log files protected through a MediaAgent; a client/MA whose database and log files are protected via an IntelliSnap™ snapshot and a proxy/MA.)
Protecting Applications
Simpana® software supports most major applications through the use of agents installed on the
application servers or on proxy servers with access to data. For unsupported applications,
scripts can be used to properly quiesce application databases and then back them up as file
data.
Virtualized Applications
Virtualized applications pose a challenge when it comes to data protection. Issues such as disk
I/O activity, application type and application state at the time of backup can significantly affect
the backup process. There are several methods that can be used to protect virtualized
applications.
VSA and VSS aware applications – Some applications, such as Microsoft SQL Server and Exchange,
are VSS aware. When VSS is initiated on the virtual machine, it will attempt to quiesce the
VSS-aware applications.
Scripting database shutdowns – Using external scripts which can be inserted in the Pre/Post
processes of a subclient, application data can be placed in an offline state to allow for a
consistent point-in-time snap and backup operation. This will require the application to remain
in the offline state for the entire time of the snapshot operation. When the VM is recovered the
application will have to be restarted after the restore operation completes. This method is only
recommended when Simpana agents are not available for the application.
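The following is a hedged sketch of what such a pre-process script might look like. The appctl command, its arguments, and the exit-code convention shown here are assumptions made for illustration only; consult the application vendor and CommVault documentation for the actual quiesce commands and script requirements.

#!/usr/bin/env python3
# Hypothetical pre-backup script: place an application database in an offline/quiesced
# state before a snapshot. 'appctl' is a stand-in for whatever utility the application provides.
import subprocess
import sys

def quiesce_database(instance: str) -> bool:
    """Return True if the (hypothetical) control utility reports success."""
    result = subprocess.run(["appctl", "suspend", "--instance", instance])
    return result.returncode == 0

if __name__ == "__main__":
    # A non-zero exit is assumed to signal failure to the calling backup job.
    sys.exit(0 if quiesce_database("FINANCE_DB") else 1)

A matching post-process script would resume the application after the snapshot completes.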
Snapshot Management
• IntelliSnap® Features
• How the IntelliSnap Feature Works
• IntelliSnap Architecture and Deployment Requirements
(Diagram: a client with File System iDA, Application iDA, VSS Provider (Windows OS) and SnapProtect; a proxy host running the same OS as the host with a MediaAgent and SnapProtect; a MediaAgent and library holding snap copies and a backup copy.)
Snapshot Management
Snapshot backups to reclaim disk cache space – By managing the snapshots, Simpana software
can also be used to back up the snapped data. As older snapshots are backed up to protected
storage, the snaps can be released on the source disk and the space can be freed for new snap
operations.
Granular recovery - Snapshots can be mounted for Live Browse and indexed during backup
operations for granular recovery of objects within the snap. Whether using live browse or a
restore from a backup, the method to restore the data is consistent. Using the proper
iDataAgent you can browse the snapped data and select objects for recovery. This process is
especially useful when multiple databases or virtual machines are in the same snap and a full
revert cannot be done. In this case just the objects required for recovery can be selected and
restored.
Simplified management – Multiple hardware vendors supported by the IntelliSnap feature can
all be managed through the Simpana interface. Little additional training is involved since the
same subclient and storage policy strategies used for backing up data are extended when using
snapshots. Just a few additional settings are configured to enable snapshots within the
CommCell environment.
Note: The Simpana IntelliSnap feature is rapidly evolving to incorporate increased capabilities as
well as expanded hardware support. Check the CommVault online documentation for the current
list of supported features and vendors.
Configure arrays – Array information is set in the Array Management applet in the Control Panel.
Depending on the vendor, different information may be required.
Configure storage policies – Storage policies are used to centrally manage snapshots of
subclient data just like backup data. When configuring storage policies, a snapshot copy is
added to the policy. For some vendors, multiple snap copies can be added. NetApp DFM
enabled policies currently support multiple snap mirror/vault copies.
Configure subclients – The IntelliSnap capability is enabled at the client level in the Advanced
tab of the client properties. Once enabled for the client, subclients will have an IntelliSnap
Operations tab that can be used to enable and configure snapshots for the subclient.
IntelliSnap Architecture
IntelliSnap architecture is made up of host servers and proxy servers that work together to
provide snap and optional backup operations.
MediaAgent – Provides capabilities to execute array functions and access to snapshots on the
host. It can also be used when backing up snapshots to CommVault protected storage if no
proxy is being used or if the proxy server is unavailable.
IntelliSnap – IntelliSnap options are built into iDataAgents and do not require additional
software to be installed on the host. IntelliSnap capabilities are enabled in the Advanced tab of
the client properties. This adds an IntelliSnap Operations tab to subclients for configuring snap
operations.
CommVault VSS Provider – used to properly quiesce Microsoft applications for application
consistent snapshots.
A proxy server can be used to back up snapshots and as an off-host proxy to mount snapshots.
The use of a proxy server eliminates load on the host server for snap mining, mounting, and
backup operations. The proxy server requires the following:
OS must be same as host – For a mount or backup operation to be performed the snap must be
mounted on the proxy. In order for the proxy to recognize the file system, the same OS must be
used on the proxy.
File System iDataAgent – A file system agent is required for backup operations. When a
snapshot is backed up it is treated like a file system backup job.
MediaAgent – Used for array access, mounting snaps on the proxy and data movement from
array to CommVault protected storage.
• It is critical to meet data protection windows. If backup windows are not being met, then
restore windows may not be met either. If data is scheduled to go off-site daily but it takes four
days to back up the data, then the data cannot be sent off-site until the job completes.
• If you are currently meeting protection windows, then there is no need to modify anything.
Improving windows from six to four hours when your window is eight hours just creates more
work and a more complex environment. The following recommendations are intended to
improve performance when protection windows are NOT being met.
• Device Streams – Increase device streams to allow more concurrent job streams to write,
if adequate resources are available.
• MediaAgent – ensure the MediaAgent is properly scaled to accommodate higher stream
concurrency.
• Network – ensure network bandwidth can manage the higher traffic.
• Disk Library (Non-Deduplicated) – ensure the library can handle a higher number of write
operations. Increase the number of mount path writers so the total number of writers across
all mount paths equals the number of device streams (a sketch of this arithmetic follows this
list).
• Disk Library (Deduplication enabled) – if not using Client Side Deduplication, enable it. Each
deduplication database can manage up to 50 concurrent streams. If using Client Side
Deduplication, after the initial full is complete most data processing will be done locally on
each Client. This means minimal bandwidth, MediaAgent, and disk resources will be
required for data protection operations.
• Tape Library – If tape write speeds are slow, enable multiplexing. Note: enabling multiplexing
can have a positive effect on data protection jobs but may have a negative effect on restore
and auxiliary copy performance.
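A quick sketch of the writer arithmetic referenced in the non-deduplicated disk library bullet above; the stream and mount path counts are arbitrary example figures:

# Spread device streams evenly across mount path writers (illustrative numbers).
import math

device_streams = 24   # streams allowed to write to the library concurrently
mount_paths = 5       # mount paths configured in the disk library

writers_per_mount_path = math.ceil(device_streams / mount_paths)
total_writers = writers_per_mount_path * mount_paths
print(f"Set each mount path to {writers_per_mount_path} writers "
      f"({total_writers} total, covering {device_streams} device streams)")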
General recommendations:
Ensure all data is properly being filtered. Use the job history for the client to obtain a list of all
objects being protected. View the failed items log to determine whether files are being skipped
because they are open, or because they existed at the time of scan but not at the time of backup.
This is common with temp files. Filters should be set to eliminate failed objects as much as possible.
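As a purely illustrative aid (the file name failed_items.txt and the one-path-per-line format are assumptions, not the Simpana failed-items log format), a short script can summarize which extensions fail most often so that filters can be set accordingly:

# Summarize a failed-items list to suggest filter candidates (hypothetical log format:
# one failed path per line in failed_items.txt).
from collections import Counter
from pathlib import Path

failed_items = Path("failed_items.txt").read_text().splitlines()

by_extension = Counter(Path(p).suffix.lower() or "<none>" for p in failed_items if p.strip())
for ext, count in by_extension.most_common(10):
    print(f"{ext}: {count} failures - consider a global or subclient filter if these are temp files")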
For file systems and applications with granular object access (Exchange, Domino, SharePoint),
consider using data archiving. This will move older and infrequently accessed data to protected
storage, which will reduce backup and recovery windows.
• For backups on Windows operating systems ensure source disks are defragmented.
• Ensure all global and local filters are properly configured.
• If source data is on multiple physical drives, increase the number of data readers to multi-
stream protection jobs.
• If source data is on a RAID volume, create subclient(s) for the volume and increase the
number of data readers to improve performance. Enable the Allow Multiple Data Readers
within a Drive or Mount Point option.
• For large volumes containing millions of objects:
• Consider using multiple subclients and stagger scheduling backup operations over a
weekly or even monthly time period.
• For supported hardware consider using the Simpana IntelliSnap™ feature to snap and
backup volumes using a MediaAgent proxy server.
• Consider using the Simpana Image Level backup agent.
Database applications
• For large databases that are being dumped by application administrators, consider using
Simpana database agents to provide multi-streamed backups and restores.
• When using Simpana database agents for instances with multiple databases, consider creating
multiple subclients to manage the databases.
• For large databases, consider increasing the number of data streams used to back up the
database. Note: For multi-streamed subclient backups of SQL, DB2, and Sybase databases, the
streams cannot be multiplexed. During auxiliary copy operations to tape, if the streams are
combined onto a tape, they must be pre-staged to a secondary disk target before they can be
restored.
• For MS-SQL databases using file/folder groups, separate subclients can be configured to
manage databases and file/folder groups.
General Guidelines
• Consider using the Simpana Virtual Server Agent (VSA).
• Determine which virtual machines DO NOT require protection and do not back them
up.
• Consider the pros and cons of using Simpana compression and client side
deduplication. Using application level compression may have a better compression
ratio but deduplication efficiency can suffer.
If streams from different data sets are multiplexed or combined to a tape, only one data set can
be restored at a time. Consider isolating different data set streams to different media using
separate secondary copies for each data set and using the combine to streams option.
For large amounts of data that are multi-streamed during backups, do not multiplex or
combine the streams to tape. If the streams are on separate tapes, the Restore by Job option can
be used to multi-stream restore operations, improving performance.
When using Simpana deduplication, use the recommended minimum 128 KB block size. Smaller
block sizes will result in heavier data fragmentation on disk, which can reduce restore
performance.
• Filter out data that is not required for data protection operations. The less you back up, the
less you have to restore.
• Strongly consider data archiving. It will improve backup and restore performance. Note that
while deduplication will improve backups and reduce storage requirements, it can actually
have a negative effect on restore performance.
• If a subclient job was multi-streamed you can restore it using multiple streams through the
Restore by Job option.
• Consider assigning different RTOs for different business data. It is not always about restoring
everything. Consider a database server with five databases. Each one can be defined in a
separate subclient. This will allow each database to have a separate RTO so they can be
recovered by priority.
• Run point-in-time backups such as incrementals or transaction logs more frequently for a
shorter RPO.
• Consider prioritizing data for RPO requirements, defining that data as a separate subclient
and assigning separate schedules. For example, a critical database with frequent changes can
be configured in a separate subclient and scheduled to run transaction log backups every
fifteen minutes. To provide short off-site RPO windows, consider running synchronous copies
with the automatic schedule enabled.
• Consider using hardware snapshots with the Simpana IntelliSnap feature to manage and
back up snapshots.
Module 2
CommCell® Environment
Deployment
Topics
• CommCell® Deployment Process
• New CommCell Deployment Process
• Existing CommCell Upgrade Process
• CommCell Disaster Recovery Process
• Environment Requirements
• Installing CommServe® Software
• Installing MediaAgent Software
• Index Cache Configuration
• Library Detection, Installation and Configuration
• Client Agent Deployment methods
• Standard Installation Methods – Interactive
• CommCell Console Push Install
• Custom Installation Methods
• Deployment Best Practices
The first component to be installed in a new CommCell® environment is the CommServe®
server. Once it is installed, the next step is to install the MediaAgent software and to detect and
configure libraries. Policy configuration for Storage Policies, Schedule Policies, Subclient Policies
and Global Filters should be done prior to installing any client agents. When installing client
agents, options to associate the agent's default subclient with these policies can be selected,
so preconfiguring policies makes the agent deployment process smoother.
MediaAgents should be upgraded next, and libraries should be tested to ensure everything is
functioning properly. Clients can then be upgraded on an as-needed basis. Note that with
Simpana® software, client agents up to two versions back can coexist with a CommServe server
and MediaAgents at the latest version.
CommServe® Server
The CommServe server must be the first machine recovered before the recovery of any
production data can be accomplished. The speed and method of recovering the CommServe
server ultimately depends on the combination of several factors:
• Which High Availability CommServe server option or Standby CommServe server option was
configured and the access to it.
• Access to the DR Backup metadata.
• How prepared the production and DR environment is and how practiced and efficient the
Simpana Administrators are at recovering the CommServe server.
• The scope of the actual disaster scenario you are confronting (site or regional), or the
practice DR run you are simulating.
The CommServe Disaster Recovery Tool can be used to rebuild the CommServe server on the
same or a different computer, change the name of the CommServe computer, create and
maintain a CommServe server at the hot site, and update the license.
MediaAgent
MediaAgents at a Disaster Recovery site
If it becomes necessary to build a completely new MediaAgent, as in the case of a complete
Disaster Recovery at a location other than the production site, there are a few things to
keep in mind. Usually, this type of scenario will be using removable media as the primary
source of data recovery and the MediaAgent will be new to the CommCell environment. This is
not a problem for the Simpana® software. The new MediaAgent will be installed after the
CommServe server Disaster Recovery has taken place and connected to a library where the
media has been loaded. Restores can take place from any library; in fact one of the advanced
options of a restore job is to select the desired MediaAgent and Library.
Another consideration is licensing. With volume-based licensing, as with a DR license, there is no
issue installing an additional MediaAgent; but if the licensing is per agent, it may be necessary to
release the license of an existing MediaAgent to apply a license to the new one.
Rebuild Libraries
Rebuilding Disk Libraries
The main reason for changing the hardware of a disk library is to upgrade to better devices.
Most disk libraries will have fault tolerant redundancies built in to avoid major disk failures.
Changing the hardware is a relatively simple process. The original CV_Magnetic folder and all
its contents must be copied to the new location. After moving the physical data, the original
mount path can be changed in the Library and Drive Configuration tool by right-clicking the old
mount path and selecting the new location.
can also be manually detected by selecting Mark Drive Replaced from the task menu of the
replaced drive.
Replacing a Library
There are two reasons to replace a library, the first being an upgrade to newer hardware and
the second being a major hardware malfunction. In either case it is fairly simple to replace the
library. Keep in mind the new library must support the same drive type and must have the
same or more drives as the original. Once the hardware is connected and can communicate
with the MediaAgent's operating system, you can move the media from the old library to the
new one. Then use the Library and Drive Configuration tool to configure the new library. The
procedure to modify the existing library so it is replaced with the new library is documented on
the documentation website.
Caution: If a library is being replaced, do not deconfigure the library, master drive pool or
drive pool. Doing so will cause the loss of all data associated with the library.
Environment Requirements
• Hardware Requirements
• Software Requirements
• Network Requirements
• Domain and DNS Requirements
Environment Requirements
Minimum system, software, and application requirements are documented in Simpana's Books
Online (BOL). If you do not see your specific OS, application, or environment, don't panic.
Contact your Simpana® Support group to check whether you can still install the software
component. BOL may not have the latest information available from the testing group.
Additionally, CommVault will often authorize "Field Certification" of new versions or
unique environments to allow installation and support.
Be sure to read the notes included on the System Requirements page. These notes often
contain caveats or additional information essential to the installation process.
• Installing in a Clustered Environment
• Standby CommServe Server
• Install with Existing Database
(Diagram: the production CommServe server's metadata database backup is used to maintain a virtual standby CommServe server on a hypervisor.)
Within a CommCell environment there can only be one active CommServe server. For high
availability and failover there are several methods that can be implemented. The following
information explains each of these methods.
When using a hot / cold standby CommServe server consider the following key points:
It is critical that both the production and standby CommServe servers are patched to the same
level. After applying updates to the production CommServe server, ensure the same updates are
applied to the standby CommServe server.
Multiple standby CommServe servers can be used. For example: an on-site standby and an off-
site DR CommServe server. Use post script processes to copy the raw DR metadata to additional
CommServe servers.
The CommCell license is bound to the IP address of the production CommServe server. If a
standby CommServe server will be used, purchase the CommServe DR license, which allows
multiple IP addresses to be associated with the standby CommServe servers.
A standby CommServe server can be a multi-function system. The most common multi-function
system is the CommServe software installed on a MediaAgent. It is important to note that the
CommServe software must be installed on a system with no other CommVault agents already
installed: first install the CommServe software, and then install the other agents. When using a
multi-function server the DR restore operation will not work; contact support for instructions on
how to proceed in the event that the standby CommServe server must be activated.
If a virtual environment is present consider using a virtual standby CommServe server. This
avoids problems associated with multi-function standby CommServe servers and eliminates the
need to invest in additional hardware. Ensure the virtual environment is properly scaled to
handle the extra load that may result when activating the virtual standby CommServe server.
Virtualization
Many customers with virtual environments are choosing to virtualize the production
CommServe server. A virtualized CommServe server has the advantage of using the hypervisor's
high availability functionality (when multiple hypervisors are configured in a cluster) and
reduces costs, since separate CommServe hardware is not required. Although this method can
be beneficial, it should be properly planned and implemented. If the virtual environment is not
properly scaled, the CommServe server could become a bottleneck when conducting data
protection jobs. In larger environments where jobs run throughout the business day,
CommServe server activity could have a negative performance impact on production servers.
When virtualizing the CommServe server it is still critical to run the CommServe DR backup. In
the event of a disaster the CommServe server may still have to be reconstructed on a physical
server. Do not rely on the availability of a virtual environment in the case of a disaster. Follow
normal CommVault best practices in protecting the CommServe metadata.
Clustering
The CommServe server can be deployed in a clustered configuration. This provides high
availability for environments where CommCell operations run 24/7. A clustered CommServe
server is not a DR solution, and a standby CommServe server should still be planned for at a DR
site. Clustering the CommServe server is a good solution in large environments where
performance and availability are critical.
Another benefit of a clustered CommServe server arises when using Simpana data archiving.
Archiving operations can be configured to create stub files which allow end users to initiate
recall operations. For the recall to complete successfully the CommServe server must be
available.
How to configure:
Pre-stage a CommServe server at a hot site and/or locally, with a different host name and a
different IP address, and with all Simpana services stopped. The DR Backup export phase is
configured to run directly to this server via a network share.
Reinstalling Software:
Ideally the standby DR CommServe server has the OS and the CommServe module preloaded
with all of the CommServe services stopped. You should never run two CommServe servers at
the same time in the same CommCell environment with their Simpana services running.
If the CommServe module is not preinstalled, install only the CommServe module, install the
same Service Pack and patches as the production CommServe server and recover the
CommServe server using the CSDR tool outlined below.
After recovering the CommServe server you can install the MediaAgent Module on the same
machine if that is desirable.
A MediaAgent component can be located on the same host as the CommServe component, the
same host as an iDataAgent component, or on a separate host by itself.
The MediaAgent component can be installed in any order after the CommServe component has
been installed.
Normally, at least one MediaAgent is installed with a configured library prior to installing
iDataAgents, but it is not required.
If the client data will transit over a LAN, verify the expected access path is available and
addressable. This data path may be different from the control/coordination path used for the
install. Such paths can be set up using Data Interface Pairs.
Calculations for sizing the index can be found on the documentation web site. However, a rough
estimate (SWAG) of 4% of the protected indexed volume is often used to size the Index Cache for
“average” environments. In most cases this is more than adequate.
After installation, the location of the Index Cache directory can be changed. Multiple
MediaAgents can share a common Index Cache directory.
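As a quick illustration of the 4% rule of thumb, the Python sketch below estimates an index cache size; the protected volume figure is an assumed example, not a measured value.

# Rough index cache estimate using the 4% rule of thumb above.
protected_indexed_tb = 10                      # indexed data under protection (assumed example)
index_cache_gb = protected_indexed_tb * 1024 * 0.04
print(f"Suggested index cache size: ~{index_cache_gb:.0f} GB")   # ~410 GB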
CommVault software uses a two tiered distributed indexing structure providing great resiliency,
availability, and scalability. Job summary data is maintained in the CommServe metadata
database and requires minimal space to retain the data. The job summary information will be
maintained as long as the data is being retained. An Index Cache maintains detailed indexing
information for all objects being protected. Multiple index caches can be used for more efficient
index scaling and to keep index files in close proximity to the MediaAgents. Index data is
maintained in the cache based on retention settings of days or disk usage percentage. Each
subclient will have its own index file and new index files are generated by default during a full
data protection operation. Index files are copied to media automatically at the end of each job.
The index cache should be sized based on the need to browse back in time for data to be
recovered. The farther back in time you need to browse, the larger the cache should be. If the
index cache is undersized, index files will be pruned sooner to maintain a default 90% disk
capacity. When you attempt to perform a browse or find operation and the index file is not in
the cache it will automatically be restored from media. If the index file is in magnetic storage
there will be a short delay in recovering the index but if it is on removable media the time to
recover the index can be much longer.
The index cache should be on a dedicated disk or partition with no other data being written to
the disk.
To reduce the probability of pulling an index file back from media use a large index cache
location.
A Minimum Free Space can also be configured to reserve space in the index cache location. The
cleanup percent setting would be based on the space allocated to the index cache. So if you had
a 100 GB partition and wanted to reserve 10 GB of space, the cleanup percent would be based
on the remaining 90 GB.
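The interaction between the Minimum Free Space reservation and the cleanup percent can be checked with simple arithmetic; the sketch below reuses the 100 GB / 10 GB example above and the default 90% cleanup threshold.

# Worked example of the cleanup threshold on a partition with reserved space.
partition_gb = 100        # index cache partition size
reserved_gb = 10          # Minimum Free Space reserved on the partition
cleanup_percent = 90      # default disk capacity threshold
usable_gb = partition_gb - reserved_gb                   # 90 GB available to the index cache
prune_threshold_gb = usable_gb * cleanup_percent / 100   # pruning begins around 81 GB of index data
print(usable_gb, prune_threshold_gb)                     # 90 81.0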
To configure a shared index cache, one MediaAgent will host the index cache. The other
MediaAgents will connect to it. The index cache location can be a local drive, SAN disk space, or
a NAS device.
Libraries are either detected (e.g. tape device, library controller) or added (e.g. disk, cloud, IP-
based controller). Essential to both is the ability of the MediaAgent to correctly see/access the
device. Prior to any detection or adding of devices to a MediaAgent, the Engineer should
confirm the physical and logical view of the device from the operating system. If multiple
similar devices are involved (e.g. a multi-drive library), all such devices should be at the same
firmware level.
Detection
The system only detects devices for which device drivers are loaded. A detected device may
have the following status:
• Success indicates that the system has all of the information necessary to use the device.
• Partially configured, detect fail - connection error status when the detection fails due to
an error connecting to the MediaAgent
• Partially configured, detect fail - device not found status if the detection fails due to a
missing device
Note that some devices (e.g., the library associated with a stand-alone drive) have no detection
status, since they are virtual entities and as such have no hardware components that can be
detected.
Exhaustive Detection
Modern tape drives have serial numbers which are used by the Simpana software to properly
place a drive physically and logically within a library. Older drives without serial numbers require
manual locating. Exhaustive detection is the process of associating each drive number with its
correct SCSI address. This is done by mounting media in each of the drives in the library to
obtain the drive’s SCSI address.
Adding
Logical libraries (e.g. Disk, Cloud, PnP) are added by the user allocating assets and/or access to
devices. This usually involves the grouping of devices (mount paths) identified by providing data
paths and user access authority.
A hybrid library requiring both addition and detection would be an IP-based library. The IP
address for the library control is added while the tape devices used by the MediaAgent(s) are
detected and logically associated with the IP-based library.
Configuration
Added or Detected devices can be configured as new libraries or added to existing libraries (e.g.
adding an additional tape drive in an already detected/configured library.) Configuration gives
the device an identity within the CommCell environment and, as appropriate, an association
with other devices for management/control (e.g. tape drives in an automated library, new
mount paths).
[Slide: installation media options – Install Disks (run Setup.exe from the appropriate installation disc) or Download Software]
Interactive Install
Interactive installation can be performed directly from the installation software disc by running
the setup.exe command. Optionally you can copy the installation files to a disk or a network
share accessible to the client and execute the setup.exe command from the client.
The SQL Server instance for the CommServe component cannot be installed from a UNC
path. The path must be a mapped drive letter.
The user performing the installation must have administrator privileges on the client to install
software.
Any number of components can be selected for installation at the same time. For a new client,
the Base Agent (not visible in the component list) will automatically be selected and the first
item installed. The Base Agent provides files needed for communication with the CommServe
component.
While every effort is made to not require a reboot of the host during or after the installation,
the state of the system at the time of install may require a reboot. If this happens, you will
be presented with an option to not reboot at that time.
• Installation Path
• Authorized user/password to interact with an application for backup/restore (Application
Agent only)
• Firewall access (if required)
• Computer Group membership
• Default Storage Policy
• Filter Policy
• Include patches/service packs
• Update Schedule
Use a common installation path for all clients if possible; this will help with re-installation
and/or a full system restore should it become necessary.
[Diagram: clients installed from the central CommServe software cache and from a remote software cache]
You can manage the installation of agent and component software packages on client
computers, or even on network computers not yet part of the CommCell environment, from
the CommCell Console.
The required software packages can be downloaded or copied to the CommServe Cache
Directory and then pushed to selected computers. Remote Software Cache directories may
also be configured and used to locate installation software closer to their prospective targets or
for different access privileges.
Remote caches can be configured for automatic synchronization with the CommServe cache
directory. This entire process is all conveniently managed from the CommCell console.
Remote software cache directories can be created and managed via the Add/Remove Software
Configuration applet located in the CommCell Console's Control Panel.
Prior to configuring the installation of software packages to specific computers to build your
CommCell environment, you must copy or download the required software packages to the
CommServe cache directory. The directory is configured to serve as a holding area for software
and update packages. To install from any of the software cache directories the directory must be
a shared network directory with permissions set to write to the directory.
The CommServe cache directory can be populated during install of the CommServe host, by FTP
download, or by using the Copy Software task found in the install path's base
directory. Remote cache directories can be populated using the synchronization feature or the
CopyToCache utility.
Client computers that are not in the same domain as the domain in which the CommServe
cache is located must have bidirectional trust in place.
If Authentication for Agent Install is enabled for the CommCell environment, installation from
the CommCell Console is restricted to only those users belonging to a user group assigned with
Administrative Management capabilities for the CommCell computer or an existing Client
computer within the CommCell environment. However, if it is a new computer, not yet part of
the CommCell group, you must have Administrative Management capabilities for the CommCell
group.
During configuration, computers within the domain that are not yet part of the CommCell group
can be selected for installation. Users accessing these computers must have administrative
privileges required for installing software.
Software packages are intelligently pushed to the computers. This means that a Windows
package pushed to a domain consisting of both Windows and UNIX computers will only install
on the Windows systems. Additionally, application package software will only install on systems
with the prerequisite software installed.
Install options include "For Restore Only" and the optional "CommServe Host Name" which, if
left blank, leaves the client in an unregistered state. Neither of these options consumes a
license until the client is registered and/or enabled for data protection.
• Decoupled Install
• Custom Package Install
• Restore Only Agent
• Silent Install
Decoupled Install
Decoupled install is performed without involving the CommServe server until you are ready to
add the Client and/or MediaAgent to the CommCell environment. Once all necessary physical
connections are established, the computer can be added to the CommCell environment. This
feature will be useful when you want to pre-image computers with the software at a central
location and later ship them to the environment where you plan to use them.
Custom packages can be configured from the Simpana Installer's Advanced options. For UNIX
systems, select the cvpkgadd's Advanced options menu choice.
Installation involves copying the custom package .exe file to the target host and executing it to
expand the installation files and start the silent install process. Upon completion, a success
message will be displayed.
For UNIX, copy the folder to the target host and execute the cvpkgadd command.
Silent Install
A silent install consists of two distinct phases:
Recording Mode - In this phase, an install is recorded, saving your install options to an .xml file.
Playback Mode (XML input file) - In this phase, the .xml file is played back by the install
program. The software components are installed as per the recorded options without
prompting for any user inputs. Through this method, the deployment of the software can be
automated.
• CommServe® Server
• MediaAgents
• Index Cache
• Libraries
• Clients
MediaAgents
DR MediaAgents
• A DR MediaAgent is installed and preconfigured at a DR location. The most common
implementation of DR MediaAgents is in the use of replica libraries or a secondary disk
library using Simpana deduplication and the DASH Copy feature. By having an active and
registered MediaAgent configured with a library at a DR location RTOs can be more
realistically achieved. Incorporating a DR MediaAgent with a standby CommServe server
provides a ‘ready to go’ DR infrastructure which can expedite recovery procedures in the case
of disaster.
• Another use of the DR MediaAgent is the ability to pre-stage recovery operations at a DR
location. This is most commonly implemented in virtual environments. CommVault provides
documentation on the proper implementation of pre-staging the recovery of virtual
machines in DR environments.
Index Cache
Sizing the Index Cache
• The index cache should be sized based on the need to browse back in time for data to be
recovered. The farther back in time you need to browse, the larger the cache should be. If
the index cache is undersized, index files will be pruned sooner to maintain a default 90%
disk capacity. When you attempt to perform a browse or find operation and the index file is
not in the cache it will automatically be restored from media. If the index file is in magnetic
storage there will be a short delay in recovering the index but if it is on removable media the
time to recover the index can be much longer.
• To properly size the index cache, consider the following:
• The index file size is based on the number of objects being protected. Estimate 150 bytes per
object; the more objects you are protecting, the larger the index files will be (see the sizing
sketch after this list).
• Each subclient will contain its own index files within the cache.
• The index cache should be on a dedicated disk or partition with no other data being written
to the disk.
• To reduce the probability of pulling an index file back from media use a large index cache
location.
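The sizing sketch below turns the 150 bytes-per-object estimate into a cache figure; the object count and the number of cycles kept in cache are assumed examples only.

# Rough index cache estimate from object counts (~150 bytes per object).
objects_protected = 20_000_000   # total objects across all subclients (assumed)
bytes_per_object = 150
cycles_in_cache = 4              # full backup cycles whose index files stay in cache (assumed)
index_gb = objects_protected * bytes_per_object * cycles_in_cache / 1024**3
print(f"~{index_gb:.1f} GB of index data")   # ~11.2 GB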
Libraries
Consider the following key points for library connections:
• For client servers where the source data is in a SAN or DAS environment and target storage
can be made directly accessible to the client, install a MediaAgent on the client server to
provide LAN free backups.
• When backing up to disk storage attached to a network use a dedicated backup network for
library read/write operations. Do not use the same NIC that is receiving data from a client to
write the data to the library.
• If using Fibre SAN storage with an iDataAgent and MediaAgent installed on the same system
use separate HBAs to receive the source data and write the data to storage.
• If using iSCSI ensure the iSCSI initiator and target systems being used are enterprise class.
Consider using a TCP/IP Offload Engine (TOE) NIC card to reduce CPU load on the server. Do
not use the same NIC receiving the data to write the data to storage.
• If considering using a Virtual Tape Library (VTL) carefully weigh the advantages and
disadvantages. Simpana disk features such as deduplication and DASH operations will not
work if disk storage is configured as a VTL.
• If using a shared disk library, where the library will be shared between multiple MediaAgents
and Simpana’s deduplication, it is strongly recommended to use NAS storage instead of SAN
storage.
Module 3
Advanced Configurations
Topics
• Storage Policy Design
• Storage Based Design Strategy
• Business Based Design Strategy
• Deduplication’s Impact on Policy Design
• How Many Storage Policies do I Really Need?
• Advanced Storage Policy Features
• Storage Policy Design Best Practices
• Advanced Job Control
• Controlling Data Protection and Recovery Jobs
• Firewall Configuration
• Network Control
• Stream Management
• Configuring Data Encryption
• Advanced Media Management
• Retention
• Data Aging
• Tape Media Lifecycle
• Vault Tracker® Technology
This strategy starts with the assumption that protection for the largest data set for a particular
data type is the biggest challenge. For example, if you have hundreds of Oracle databases that
drive your business then their protection should be handled first, even though the databases
cross business function lines. Once you get that storage policy in place, you deal with the next
largest data set. This strategy is driven more by resources than protection requirements.
As the title says, this strategy approaches your data from the business side. Build the storage
policies you need for your mission critical data/business function first. These storage policies
become your core set. As you review other data sets and business groups look to see which can
be incorporated/covered by existing storage policies and which need new policies. This strategy is
driven by protection requirements rather than resources. It often results in the purchase of
more storage and data transmission resources.
[Slide: deduplication policy building blocks – data types, location, global policy, building block settings (no compression, 256 KB block factor)]
Block Factor
When using Simpana Deduplication, the dedupe block factor is a primary concern when
developing storage policy strategies. The smaller the block size the more entries are made to
the dedupe database. Currently the database can scale from 500-750 million records. The total
volume of data being protected (which is relatively simple to estimate) and the estimated
number of unique blocks (which is certainly not easy to estimate) should both be taken into
consideration when determining block size. The recommendations for block factor
settings are as follows:
• 128 KB – All object level protection, virtual machines, and smaller databases.
• 128 KB – 512 KB – Current recommendation for database backups, depending on the size of all
database data managed by the policy. For large databases it is recommended to engage
CommVault Professional Services for proper deployment.
In this case different storage policies should be configured for the different block factors. It is
not recommended to use a single policy for all data when mixed data types are involved since
different data may not deduplicate well in mixed dedupe stores.
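A back-of-the-envelope check of the deduplication database against the 500-750 million record guideline can help when choosing a block factor; in the sketch below, the amount of unique (post-dedupe) data is an assumed figure, not a measurement.

# One DDB record is kept per unique block, so record count ~= unique data / block size.
unique_data_tb = 60      # unique data expected in the store (assumed example)
block_kb = 128           # candidate dedupe block factor
records_millions = unique_data_tb * 1024**3 / block_kb / 1e6
print(f"~{records_millions:.0f} million DDB records")   # ~503 million; a 256 KB block would roughly halve this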
Another factor that should be considered is how long the data will be retained. Longer retention
will result in larger databases. Since different data types typically will have different retention
settings, separate storage policies are required to manage the data so that separate dedupe
databases will be used. It is NOT recommended to use global deduplication for long retention or
large volume protection.
When determining the number of policies that will be needed in large environments, data
growth projections should be considered. Although a single dedupe database may be able to
manage all current data, if the data growth rate is expected to change significantly, you may find
yourself scrambling to redesign your policies at the last minute to accommodate changes in
your environment. This will have a negative effect on deduplication efficiency especially when
data is being retained for longer periods of time.
The use of global dedupe policies is mainly for consolidating small amounts of data with
different primary retention needs or for consolidating remote location data to a central location.
Global deduplication policies should NOT be used across the board for everything in your
datacenter. You can quickly grow out of the database maximum size which will then require a
complete redesign of your storage policy structure. Realize that policy copies attached to a
global dedupe policy cannot be unattached. New policies will have to be created and the old
policies cannot be deleted until ALL data has aged from the policies.
A general rule of thumb has been – “The more storage policies you have, the more
management is required.” This is not entirely true. Following the rule of thumb you would
think the ultimate solution would be to have just one storage policy. While possible, the
problem with this is the potential complexity of this single storage policy and the efforts needed
to handle any additional data/clients.
Storage policies need to reflect your storage organization and business needs. If that means
having 5, 10, or even 100 storage policies, then that's the correct number of storage policies you
need.
From the previous design strategies you always start with one storage policy. Within a storage
policy you can add, delete, and modify copies by just moving data around. You can’t move/re-
associate existing data between storage policies.
[Diagram: SAN library with dynamic drive sharing across MediaAgents]
• Automatic switch-over to an alternate data path, when one of the components in the default
data path is not available.
• Utilization of available libraries and drives in the event of failure or non-availability of these
resources.
• Minimizes media utilization by routing backup operations from several subclients to the same
storage policy and hence the same media, instead of creating several storage policies which
in turn utilize different media for each subclient.
• Load balancing (round robin) between alternate data paths provides the mechanism to
evenly distribute backup operations between available resources.
• Facility to define a subset of the data paths at the subclient level within the selected storage
policy and its data paths.
Alternate data paths are supported for both the primary and secondary copies associated with
storage policies for all libraries.
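A minimal sketch of the round-robin idea is shown below; the MediaAgent and library names are hypothetical, and this is only an illustration of how jobs are spread across alternate data paths, not the Simpana implementation.

from itertools import cycle

# Hypothetical alternate data paths defined on a storage policy copy.
data_paths = ["MA1 -> DiskLib1", "MA2 -> DiskLib1", "MA3 -> TapeLib1"]
next_path = cycle(data_paths)          # round-robin selector

# Five subclient jobs are distributed evenly across the available paths.
for job in range(1, 6):
    print(f"Job {job} runs via {next(next_path)}")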
Data Interface pairs are stored in the CommServe database, and are communicated to the
clients via the FwConfig.txt file. When a Data Interface Pair configuration changes, the
CommServe software automatically generates the appropriate FwConfig.txt and pushes it to all
affected parties. There is no requirement that a firewall needs to be configured. It was for
efficiency only that the same file was used for holding both firewall and Data Interface Pair
configuration information on the client.
If hidden storage policies need to be visible in the storage policy tree, set the Show hidden
storage policies parameter to 1 in the Service Configuration tab in the Media Management
applet.
Copy Precedence
Copy precedence determines the order in which restore operations will be conducted. By
default, the precedence order specified is based on the order in which the policy copies are
created. The default order can be modified by selecting the copy and moving it down or up. This
changes the default order. Precedence can also be specified when performing browse and
recovery operations in the Advanced options of the browse or restore section. When using the
browse or restore precedence, the selected copy becomes explicit. This means that if the data is
not found in that location, the browse or restore operation will fail.
Erase Data
Erase data is a powerful tool that allows end users or Simpana administrators to granularly mark
objects as unrecoverable within the CommCell environment. For object level archiving such as
files and email messages, if an end user deletes a stub, the corresponding object in CommVault
protected storage can be marked as unrecoverable. Administrators can also browse or search
for data through the CommCell Console and mark the data as unrecoverable.
It is technically not possible to erase specific data from within a job. The way Erase data works is
by logically marking the data unrecoverable. If a browse or find operation is conducted the data
will not appear. In order for this feature to be effective, any media managed by a storage policy
with Erase Data enabled cannot be recovered through Media Explorer, Restore by Job, or catalog
operations.
It is important to note that enabling or disabling this feature cannot be applied retroactively to
media already written. If this option is enabled, then all media managed by the policy cannot be
recovered other than through the CommCell Console. If it is not enabled, then all data managed
by the policy can be recovered through Media Explorer, Restore by Job, or catalog operations.
If this feature is going to be used it is recommended to use dedicated storage policies for all
data that may require the Erase Data option to be applied. For data that is known to not require
this option, disable this feature.
Content Indexing
Content indexing allows selected object level data to be indexed for eDiscovery, Records
Management, and compliance purposes. Simpana software allows data to be proactively or
retroactively indexed. This means any jobs being retained in the CommCell environment can be
indexed. Content director schedules can be set to index new data as it is protected, or jobs
currently managed by a storage policy can be selected and indexed.
Subclients can be defined to protect specific data required for indexing. This allows for several
key advantages when using CommVault content indexing:
Selected data and users can be defined in specific subclients for investigative purposes.
When using the Intelligent Archiving Agent, data can be defined based on file type in separate
subclients for Information Lifecycle management policies. This allows specific data types to be
removed from production storage and set in a standard lifecycle management policy where the
data will be retained and destroyed based on retention policies. Indexing of the data can be
conducted to associate relevant search results with specific retention policies to manage data
based on content throughout its useful lifecycle.
Data can be defined in separate subclients for records management policies. This allows data to
be searched based on content and ownership, and relevant information to be moved to ERM
(SharePoint), exported to 3rd party analysis tools, or moved into separate legal policies for data
preservation.
The Content Indexing tab allows subclient data to be selected for indexing. This allows for a
policy retaining protected data to selectively index relevant data while adhering to standard
retention policies.
Legal Hold Storage Policies can also be used with Content Director for records management
policies. This allows content searches to be scheduled and results of the searches can be
automatically copied into a designated Legal Hold Policy.
Subclient Associations
Subclient Properties
In order to protect a subclient, it must be associated with a storage policy. During an iDataAgent
install, a storage policy can be selected for the Default Subclient. When creating additional
Subclients you must select a storage policy. The policy defined to manage the subclient is
configured in the Storage Device tab – Data Storage Policy sub tab. Use the storage policy drop
down box to associate the subclient with a policy.
Storage Policy Level
All subclients for a specific storage policy can be associated with another policy in the
Associated Subclients tab of the Storage Policy Properties. Choose Re-Associate All to
change all policies, or use the Shift or Ctrl keys to select specific subclients and choose the
Re-Associate button to associate the selected subclients to a new policy.
Policies Level Subclient Association
If subclient associations need to be made for more than one storage policy you can use the
Subclient Associations option by expanding Policies, right-clicking on Storage Policies, and
selecting Subclient Associations.
The window will display all subclients for the CommCell environment. There are several
methods that can be used to associate subclients to storage policies.
Select the subclient and use the drop down box under the storage policy field to select the
storage policy.
You can use the Shift or Ctrl keys to select multiple subclients then use the Change all selected
Storage Policies to drop down box to associate all selected subclients to a specific storage
policy.
Consider these four basic rules for approaching storage policy design:
When designing a CommCell environment, focus should always be placed on how data will be
recovered. Does an entire server need to be recovered, or does only certain critical data on the
server require recovery? What other systems are required for the data to be accessible by users?
What business function relies on the data? What is the cost associated with that system
being down for long periods of time? The following sections will address RTO and RPO and
methods for improving recovery performance.
The Simpana software suite offers powerful features to provide great flexibility in managing
data. One of the most powerful features is the ability to logically address content requiring
protection by using subclients. Subclients allow content to be explicitly defined such as files,
folders, mailboxes, document repositories, or databases. Although most environments only use
the Default subclient to protect all data managed by an agent, custom subclients can provide
granular management of data which can be used to improve performance, make more efficient
use of media, or define custom data handling methods to meet specific protection
requirements.
• When custom retention settings are required for specific data such as a folder, virtual
machine or a database.
• When special storage requirements exist for specific data such as isolating financial data onto
separate media from other data being managed by the agent.
• When special file handling must be performed such as using VSS or Simpana QSnap to
protect open files.
• When specific files must be protected and managed independently from other data in the
same location such as PDF and DOC files requiring specific retention or storage requirements.
• When scripts need to be used to place data in a specific state prior to backup such as
quiescing a database before backing it up.
When Simpana software is initially deployed in most environments, only a default subclient is
used. In the KISS philosophy of keeping things simple, unless there are predefined reasons for
protecting specific data, this step may be skipped. Modifying, adding or deleting subclients is a
simple process that can be performed at any time.
Retention Considerations
Retention requirements should be based on specific contents within a file system or application.
All too often, determining retention requirements is not easy, especially when data owners do
not want to commit to specific numbers.
Consider using default retention policies providing several levels of protection. Provide the
options to the data owners and allow them to choose. Stipulate that if they do not make a
choice, a primary default retention will be used, and state a deadline by which they must
provide their retention requirements. It is important to note that this is a basic
recommendation and you should always follow policies based on company and compliance
guidelines.
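One way to present such default levels is sketched below; the tier names and values are hypothetical examples, not CommVault defaults.

# Hypothetical default retention tiers offered to data owners. If no choice is
# made by the stated deadline, the primary default ("standard") is applied.
retention_tiers = {
    "standard":   {"cycles": 2, "days": 30},
    "extended":   {"cycles": 4, "days": 90},
    "compliance": {"cycles": 4, "days": 365},
}
owner_choice = None                                # data owner did not respond
policy = retention_tiers[owner_choice or "standard"]
print(policy)                                      # {'cycles': 2, 'days': 30}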
• Disaster Recovery requirements should be based on the number of Cycles of data that should
be retained. This should also include how many copies (on-site / off-site) for each cycle.
• Data Recovery requirements should be based on how far back in time (days) that data may
be required for recovery.
• Data Preservation/Compliance should be based on the frequency of point-in-time copies
(Monthly, Quarterly, Yearly) and how long the copies should be kept for (Days).
Managing different data types such as file systems and databases typically requires special
design considerations for storage policies. These factors should be considered in the initial
design strategy.
Consider special protection requirements for different data types in the following situations:
• Typically different data types such as databases and files will require different retention
settings which will result in different policies being used to protect the data.
• If the primary storage target is a tape library and multiplexing will be used it is not
recommended to mix database and object level backups to the same media. Using different
storage policies will force different data types to use different media.
• When using Simpana deduplication it is recommended to use different policies to manage
various data types. This should be done for three reasons:
• Different block sizes may be recommended for different data types depending on the volume
of data requiring protection.
• Different data types do not always deduplicate well against each other.
• It provides for greater storage scalability since each policy will maintain its own dedupe
database.
Library Considerations
Library and Data Paths
Consider the following when determining storage policy strategies for libraries and data
paths:
When using Simpana deduplication, for performance and scalability reasons different policies
should be used for each MediaAgent data path. This will allow the deduplication database to be
locally accessible by each MediaAgent providing better throughput, higher scalability, and more
streams to be run concurrently.
If a shared disk (not using Simpana deduplication) or shared tape library is being used where
multiple Client / MediaAgents have LAN free (Preferred) paths to storage, a single storage policy
can be used. Add each path in the Data Path Properties tab of the Primary Copy. Each Client /
MediaAgent will use the LAN Free path to write to the shared library. This will allow for
simplified storage policy management and the consolidation of data to tape media during
auxiliary copy operations.
If a shared disk (not using Simpana deduplication) or tape library is protecting LAN based client
data where multiple MediaAgents can see the library, each data path can be added to the
primary copy. GridStor Round Robin or failover can be implemented to provide data path
availability and load balancing for data protection jobs.
Deduplication Considerations
When using Simpana deduplication careful planning is essential. Whether a policy will use
deduplication or if a policy copy will be associated with a global dedupe policy must be
determined during the initial configuration of the policy copy. Although the block size specified
during policy creation can be modified, doing so will seal the store, which will have a negative
impact on dedupe ratios and backup performance.
• Firewall Configurations
• Network Control
• Network throttling
• Data Interface pairs
• Robust Network Layer
• Job Control
• Activity Control
• Operation Windows
• Holidays
• Stream Management
Several services used by Simpana software are designed to listen for incoming network traffic
on specified network ports; this is how the CommServe server, MediaAgents, and Agents within the
CommCell environment communicate with each other. Essential CommServe services are
automatically assigned registered static port numbers during installation. MediaAgents, Agents,
and other software components can utilize the same default static port numbers, or any static
port numbers specified during installation.
Those services assigned static ports are constantly listening for commands/results from other
services.
Those services assigned dynamic ports are only active when required.
Port usage can be curtailed through defining Firewall port restrictions. However, restricting port
usage can negatively impact performance during concurrent operations.
Base Services
CommServe Services
Monitor (QNServer)
Present only if CommNet Server is installed. Responsible for communicating with CommCell
servers (including SRM) and the CommNet Browser for CommNet Server components. Uses
static port 8403.
MediaAgent Services
Client/Agent Services
Firewall Configuration
[Diagram: firewall scenarios – an external user connecting through a CommServe proxy with certificate authentication, MediaAgents and clients communicating over static port 8400 plus dynamic ports, and a remote site client connecting over ports 8400 and 9520]
In scenarios like this, you can establish port-forwarding at the gateway to forward incoming
connections on specific ports to certain machines on the internal network (on specific ports).
You can then configure the client to open a direct connection to the port-forwarder’s IP on a
specific port to reach a particular internal server. This creates a custom route from client
towards the internally running server(s).
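Conceptually, the gateway maintains a table that maps a public port to an internal host and port; the sketch below illustrates that idea with hypothetical addresses and ports, and is not part of the Simpana software.

# Hypothetical port-forwarding table on the gateway.
port_forwards = {
    8400: ("10.0.0.10", 8400),   # forwarded to an internal CommServe-side endpoint (assumed)
    8403: ("10.0.0.20", 8403),   # forwarded to another internal server (assumed)
}

def route(incoming_port):
    """Return the internal (host, port) an incoming connection is relayed to."""
    return port_forwards[incoming_port]

print(route(8400))   # ('10.0.0.10', 8400)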
The Simpana proxy acts like a Private Branch Exchange (PBX) that sets up secure conferences
between dial-in client calls. With this setup, firewalls can be configured to disallow straight
connections between inside and outside networks.
The diagram on the right illustrates a DMZ setup where a client from outside communicates to
the CommServe server and MediaAgent operating in an internal network through the Simpana
proxy.
Network Control
• Network Communications
• Data Interface Pairs
• Robust Network Layer
• Job Control
• Activity Control
• Operation Windows
Network Communication
Communication within a CommCell® environment is based on TCP/IP. It is recommended to use
hostnames for communication between agents and with the CommServe server. Due to this
recommendation, a properly configured DNS environment with forward and reverse lookup
zones should be used. Hosts files can also be used in situations where DNS is not available or
not reliable. If using hostnames is not preferred, IP addresses can be used to bypass host name
resolution altogether.
During agent software installation you will be prompted to choose the hostname of the server
you are installing the software on. The hostname will automatically be populated in the drop
down box. If there are multiple interfaces you can use the drop down box to select the
preferred interface. You can also enter an IP address in place of the hostname though this can
lead to communication problems when IP addresses are changed.
Data Interface Pairs (DIP) are used to explicitly define the physical IP network path the data will
take from source to target. This is done by specifying source and destination network interfaces
using host name or IP address. When multiple paths from source to target exist, multiple DIPs
can be configured allowing multi-stream operations to use separate network paths for streams.
This permits the aggregate bandwidth of multiple source-to-target physical connections (one per DIP)
to improve data movement performance.
Job Control
Activity Control
Operation Windows
Operation windows allow the CommVault administrator to designate blackout windows in
which designated operations will not run. These rules can be set at the global, client computer
group, client, iDataAgent and subclient levels.
Different operation windows can be defined for data protection jobs, recovery jobs, copy jobs
and administrative jobs. Each defined operation window can have one or more Do not run
intervals defined.
Different operation rules can be specified for the same operation type to define specific time
intervals for different days of the week.
Key Points
Job starts during an operation window blackout period
If a job starts and an operation window is currently preventing jobs from running it will be
placed in a Queued state. This will apply to both indexed and non-indexed jobs. Once the
operation window is lifted and jobs are able to run, the jobs will change to a running state.
A client computer with multiple Network Interface Cards (NIC) and networks can route control
and data traffic using Data Interface Pairs.
Data Interface Pairs can be configured in the CommCell Console’s Control Panel or on the Job
Configuration tab of the Client’s Properties dialog page.
Data Interface Pairs can use the NIC’s host name (ex: client.backup.net) or IP address. In
situations where name resolution is slow or erratic, use of an IP address can improve
performance.
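The sketch below shows how a pair of interface definitions might be pictured, following the client.backup.net naming example above; the interface names and addresses are hypothetical.

# Hypothetical data interface pairs: control traffic stays on the production names,
# while backup data is forced over the dedicated backup network interfaces.
data_interface_pairs = [
    ("client1.backup.net", "ma1.backup.net"),   # pair defined by host name
    ("192.168.20.11",      "192.168.20.5"),     # second pair defined by IP address
]
for source_nic, destination_nic in data_interface_pairs:
    print(f"{source_nic} -> {destination_nic}")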
Network Throttling
The network traffic for Clients and MediaAgents can be throttled based on the network
bandwidth in your environment. This is useful to regulate network traffic and minimize
bandwidth congestion.
By default, network throttling is disabled. You can enable the throttling options for an individual
client, a client group consisting of multiple clients, and/or a MediaAgent. Once configured, the
throttling options are applied to all data transfer and control message operations, such as
Backup operations including Laptop Backups, Copy operations including DASH copy, Restore
operations, etc.
The throttling values set up in a throttling rule regulate the rate at which data is sent and
received.
You can also set up relative bandwidth throttling to ensure performance when the client
machine connects with limited bandwidth. Multiple rules can be created for the same client or
client group; however, where the time windows of different rules intersect, the lowest value
configured takes precedence.
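The "lowest value wins" behavior where rules intersect can be pictured with the small sketch below; the time windows and rates are hypothetical examples.

# Two hypothetical throttling rules covering the same client group.
rules = [
    {"hours": range(8, 18), "send_mbps": 50},    # business-hours rule
    {"hours": range(0, 24), "send_mbps": 200},   # all-day rule
]

def effective_limit(hour):
    limits = [r["send_mbps"] for r in rules if hour in r["hours"]]
    return min(limits) if limits else None       # no matching rule -> unthrottled

print(effective_limit(10))   # 50  (both rules apply, lowest wins)
print(effective_limit(22))   # 200 (only the all-day rule applies)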
[Slide: data encryption topics – inline encryption (client, storage policy), auxiliary copy encryption, hardware encryption, encryption key management, Media Password, best practices]
Inline Encryption
Enable client encryption: Client Properties | Encryption tab
Apply encryption: Subclient Properties | Encryption tab
Data can be encrypted as it is being backed up using inline encryption. Encryption can take place
on the client or on the MediaAgent. There are two steps to implement inline encryption:
Enable encryption for the client - encryption can be enabled to use specific encryption
algorithms and bit length. Options to place keys on media and use a pass phrase can also be
configured.
Encryption is applied at the subclient level - Choose which subclients will be encrypted. This
allows you to specifically define data that will be encrypted.
Hardware Encryption
Storage Policy Copy Properties | Data path tab | Data path properties
Simpana software supports LTO standards for data encryption. For LTO generation 4 and above
drives the LTO standard includes AES encryption. In this case the drive will perform all
encryption and decryption. The Simpana software can manage encryption keys in the
CommServe database and optionally include the keys on the media for recovery through the
Media Explorer tool.
Media Password
The Media Password is a CommCell and/or storage policy level password that is written to all
media. When using Media Explorer or the catalog feature and a Media Password has been set,
the administrator must enter the password before media catalog operations can be conducted.
It is strongly recommended a Media Password is always set.
When using LTO hardware encryption or Simpana offline copy based encryption there is an
option to place the encryption keys on the media. If the keys are placed on the media, a Media
Password must be set or encrypted data will be recoverable without entering any password.
Ensure the media password is configured. The CommCell level media password is set in the
System settings in control panel. Optionally, a media password can be configured for specific
storage policies.
If the encryption keys will be placed on the media for recovery using Media Explorer, ensure
that any storage policies where data may need to be recovered using Media Explorer have the
Enable Erase Data option deselected. If this option is enabled, Media Explorer can NOT be used
to recover data.
If a Pass-phrase is going to be used for inline encryption, ensure the password is properly
documented. If the password is not known when recovering data through Media Explorer, the
data cannot be recovered. If the option for regular restore access With a Pass-Phrase is enabled
and the pass-phrase is not known then no data can be recovered using any means.
Using GXAdmin
The GXAdmin.exe utility located in the <install folder>\Base directory is used by Engineers and
Support personnel to monitor Simpana services and processes during
troubleshooting. The GXAdmin.exe utility's Log Params tab replaces the SetLogParamsGUI.exe
utility, which was used to set log size, retention, and verbosity (debug) parameters.
Available Actions:
Select Retrieve Remote Clients to enable viewing and management of other clients' information and settings.
DBMaintenance
DBMaintenance utility can be used to perform the following maintenance tasks on the
CommServe database:
• Check the database for inconsistencies
• Re-index all database tables
• Shrink the database
This utility is located in the <Software Installation Path>/Base directory. From the command
prompt, run dbmaintenance with appropriate parameters from the list of available parameters.
Running the utility without any parameters will give the complete list of supported parameters.
The CommCell services must be stopped before performing database maintenance.
Using CommServeDisasterRecoveryGUI
Disaster Recovery Backup data can be restored at any production-site or a hot-site any time
using the CommServe Disaster Recovery Tool; however, the operation must be run on a
CommServe machine that does not have any other platforms installed, e.g., MediaAgents,
iDataAgents. Running the restore on a CommServe-only machine ensures that conflicts caused
by mismatched product versions or dynamic-link library (DLL) files are avoided. The backup data
can be restored from the Export Destination (Disaster Recovery Backups on disk) or the Backup
Destination (Disaster Recovery Backups on media).
When selecting the Disk Read option, run the tool on the MediaAgent and specify a UNC path to a
source folder on the Client. This will yield full end-to-end throughput performance. The tool
gives you the option to use the Windows ReadFile API to give you an environmental benchmark
for real achievable performance levels.
Module 4
Performance Tuning
Topics
• Performance Basics
• Establishing Benchmarks
• Storage Performance
• Performance Parameters
• Stream Management
• Data Streams
• Deduplication Stream Management
Performance Basics
Establishing Benchmarks
Benchmarks can be divided into two kinds, component and system. Component benchmarks
measure the performance of specific parts of a process, such as the network, tape, or hard disk
drive, while system benchmarks typically measure the performance of the entire process end-
to-end.
Establishing a Benchmark focuses your performance tuning and quantifies the effects of your
efforts. Building a benchmark is made up of the following 5 steps:
For example: a backup job over a network to a tape library takes 2 hours to complete. You think
it should take a lot less and you spend time, effort, and money to improve your network and
tape drives and parallel the movement of data. The job now takes 1.8 hours to complete. You
gained a 10% improvement.
Looking at the job in more detail we find that the scan phase of the job is taking 1.5 hours and
the rest is the actual data movement. Switching the scan method reduces the scan phase time
to 12 minutes. The job now takes 0.4 hours. You gained a 78% improvement.
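The percentages quoted in this example follow directly from the times given:

# Verifying the improvement figures above.
baseline = 2.0                 # hours, original end-to-end job
after_network_upgrade = 1.8    # hours, after network/tape improvements
after_scan_change = 0.4        # hours, after switching the scan method
print(f"{(baseline - after_network_upgrade) / baseline:.0%}")                         # 10%
print(f"{(after_network_upgrade - after_scan_change) / after_network_upgrade:.0%}")   # 78%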
Knowing what phases a job goes through and how much each phase impacts the overall
performance can help you focus your time, effort, and money on the real problems.
5. Write it down
The hardest lessons are the ones you have to learn twice. Once you’ve established your
acceptable and/or expected performance levels for each resource and end-to-end, write them
down and use them as the baseline for comparing future performance.
Storage Performance
• Disk Performance (DAS)
• Tape
• Cloud
Storage Connections
TCP/IP
TCP/IP is the most common network transmission protocol and the least efficient of the three
connection types discussed here. Factors that can degrade TCP/IP performance are:
• Latency - Packet retransmissions over distance take longer and negatively impact overall
throughput for a transmission path.
• Concurrency - TCP/IP was intended to provide multiple users with a shared transmission
medium. For a single user, it is an extremely inefficient means of moving data.
• Line quality - Transmission packet sizes are negotiated between sender and receiver based
on line quality. A poor line connection can degrade a single link's performance.
• Duplex setting - Automatic detection of connection speed and duplex setting can result
in a half-duplex connection. Full duplex is needed for best performance.
• Switches - Each switch in the data path is a potential performance degrader if not
properly configured.
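Before tuning backup-side settings, it can help to baseline the raw TCP path on its own. The
following is a minimal, generic sketch (plain Python sockets, not CommVault tooling; the port
number and host names are placeholders) for measuring sustained throughput between two
hosts, for example a Client and a MediaAgent:

    # Generic TCP throughput probe: run receiver() on one host and
    # sender("receiver-host") on the other. Port and hosts are placeholders.
    import socket, time

    PORT = 50000           # placeholder port
    CHUNK = 64 * 1024      # 64 KB per send, roughly one pipeline buffer

    def receiver(bind_addr: str = "0.0.0.0") -> None:
        with socket.create_server((bind_addr, PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                total, start = 0, time.time()
                while data := conn.recv(CHUNK):
                    total += len(data)
                secs = time.time() - start
                print(f"{total / secs / 1e6:.1f} MB/s over {secs:.1f} s")

    def sender(host: str, seconds: int = 10) -> None:
        payload = b"\0" * CHUNK
        with socket.create_connection((host, PORT)) as conn:
            end = time.time() + seconds
            while time.time() < end:
                conn.sendall(payload)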
SCSI/RAID
SCSI is the most common device protocol used and provides the highest direct connection
speed. The current maximum speed for a SCSI controller is 640 MB/sec. An individual SCSI
drive's speed is determined by spindle speed, access time, latency, and buffer size. Overall SCSI
throughput also depends on how many devices are on the controller and in what type of
configuration. The limitations of SCSI are the distance between devices and the number of
devices per controller.
RAID arrays extend the single addressable capacity and random access performance of a set of
disks. The fundamental difference between reading and writing under RAID is this: when you
write data in a redundant environment, you must access every place where that data is stored;
when you read the data back, you only need to read the minimum amount of data necessary to
retrieve it, so the redundant information does not need to be accessed on a read. In a nutshell:
writes are slower than reads.
RAID 0 (striping), RAID 1 (mirroring), or RAID 1+0 with narrow striping are the fastest
configurations for sequential write performance. Wider striping is better for concurrent use. A
RAID 5 array, regardless of the striping, has the worst write performance; it is even worse than
a single disk. Of course, the tradeoff is redundancy should a disk fail.
Note that fine-tuning a RAID controller for sequential read/write may be counterproductive for
concurrent read/write. A compromise needs to be worked out if backup/archive performance is
an issue.
iSCSI/Fibre Channel
iSCSI and Fibre Channel Protocol (FCP) are essentially serial SCSI with increased distance and
device support. SCSI commands and data are assembled into packets and transmitted to devices,
where the SCSI command is reassembled and executed. Both protocols are more efficient than
TCP/IP. FCP has better statistics than iSCSI for moving data, but not by much. Performance
tuning usually consists of setting the correct Host Bus Adapter configuration (as recommended
by the vendor for sequential I/O) or correcting a hardware mismatch. Best performance is
achieved when the hardware involved is from the same vendor. Given optimal configuration and
hardware, performance for both iSCSI and FCP is limited only by available server CPU resources.
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\lpxnds\Parameters\Device\NumberOfRequests
Disk I/O
Optimize Readers/Writers
Parallel reads and writes are possible on disk devices, and adding readers and writers can
improve throughput. At some point, however, there are diminishing returns from adding more
readers or writers. For a single disk, the best per-stream speed is obtained with no more than 2
readers/writers. Additional throughput may be achieved by adding more readers/writers, but
the speed of each data stream and the amount of improvement will diminish.
For example: two readers @ 24 Mb/sec can move approximately 20 GB/hour. Adding a third
reader may drop the speed of each stream to 20 Mb/sec while moving 27 GB/hour. The total
throughput is higher, but the speed of each stream is lower.
RAID devices use multiple disks, so more parallel I/O streams are possible. CommVault software
recommends a maximum of 5 readers/writers for a RAID 5 volume; this gives you the fastest
performance per stream. Again, more streams may improve the overall throughput but slow the
individual streams. Consider what is on the other end of the stream.
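A few lines of arithmetic (using the illustrative speeds from the example above, not measured
figures) make the tradeoff visible:

    # Illustrative only: total throughput versus per-stream speed as readers
    # are added.

    def total_gb_per_hour(readers: int, mbit_per_sec_per_stream: float) -> float:
        # Mb/sec -> GB/hour (decimal units): x Mb/s = x/8 MB/s = x * 3600 / 8000 GB/h
        return readers * mbit_per_sec_per_stream * 3600 / 8000

    print(total_gb_per_hour(2, 24))   # ~21.6 GB/hour with two 24 Mb/sec streams
    print(total_gb_per_hour(3, 20))   # ~27 GB/hour, but each stream is slower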
Tape
Tape I/O
So how do you keep the tape drive's write buffer filled? Match the buffer's data input rate to its
output rate. This can be done by providing sufficiently fast individual job streams or by
multiplexing slower job streams together. Note that excessive input to a tape write buffer
impacts the previous buffer and operation, so do not multiplex simply to have more parallel
data streams and then wonder about poor performance.
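As a rough rule of thumb (stated here as an assumption, not a CommVault formula), a
multiplexing factor can be sized by comparing the drive's streaming rate to the typical
per-stream rate of the source:

    # Rough sizing sketch: interleave just enough slow job streams to keep the
    # drive's write buffer fed. Speeds below are assumed example values.
    import math

    def multiplexing_factor(drive_mb_per_sec: float, stream_mb_per_sec: float) -> int:
        return max(1, math.ceil(drive_mb_per_sec / stream_mb_per_sec))

    # e.g. a drive that streams at ~160 MB/sec fed by sources averaging ~40 MB/sec
    print(multiplexing_factor(160, 40))   # 4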
Performance Parameters
Chunk Size
A chunk is the unit of data that the MediaAgent software uses to store data on media. For
sequential access media (tape), a chunk is defined as the data between two file markers. The
default chunk size on tape is 4 GB for indexed data and 16 GB for non-indexed data (databases).
For disk libraries the default chunk size is 2 GB, and NDMP libraries use 4 GB chunks.
The chunk size set in the Data Path Properties has an impact only when backing up to tape.
A higher chunk size gives better data throughput for backups, but granular restores (e.g., a
single file restore) will be slower. On the other hand, large restores, such as a full machine
rebuild, will be a bit faster. Recommended values are 4 GB, 8 GB, 16 GB, or 32 GB.
A lower value is recommended when slower data protection operations need more frequent
checkpoints, especially when data is moving across a WAN link. The chunk size can be set as low
as 32 KB, but we do not recommend a chunk size below 512 KB.
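The tradeoff can be illustrated with simple arithmetic (the 500 GB job size below is an assumed
example, not a product figure): larger chunks mean fewer chunk boundaries, and therefore fewer
file markers and restart points, per job.

    # Illustrative only: number of chunks written for a hypothetical 500 GB job.
    import math

    def chunks_per_job(job_gb: float, chunk_gb: float) -> int:
        return math.ceil(job_gb / chunk_gb)

    for chunk_gb in (2, 4, 8, 16, 32):
        print(f"{chunk_gb:>2} GB chunks -> {chunks_per_job(500, chunk_gb)} chunks for a 500 GB job")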
Block Size
CommVault software uses a default block size of 64 KB for tape devices, 32 KB for Centera
devices, and whatever the formatted block size is on disk. MediaAgents can write to media
using different block sizes if the MediaAgent host Operating System and the media device
support that block size.
With tape devices, the higher the block size, the better the performance. CommVault software
can write block sizes up to 256 KB and can automatically read block sizes up to 512 KB. If block
sizes are larger than 512 KB, read operations from the media will fail, and such media will be
overwritten and re-used if the When Content Verification Failed option is enabled in the
Library Properties (Media) dialog box.
Ensure that the device hardware and the MediaAgent host Operating System support higher
block sizes; if a block size is not supported, the data cannot be restored.
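As an illustration of why larger blocks help (the 160 MB/sec throughput target below is an
assumed figure, not a product default), fewer and larger I/O operations are needed to sustain
the same rate:

    # Illustrative only: I/O operations per second needed to sustain a target rate.

    def io_ops_per_second(target_mb_per_sec: float, block_kb: int) -> float:
        return target_mb_per_sec * 1024 / block_kb

    for block_kb in (64, 128, 256):
        print(f"{block_kb} KB blocks: {io_ops_per_second(160, block_kb):.0f} ops/sec at 160 MB/sec")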
Subclient Parameters
Compression
Data compression options are provided for data secured by data protection operations.
Compression reduces the quantity of data sent to storage, often doubling the effective capacity
of the media (depending on the nature of the data). Hardware compression is usually available
on tape devices and is the recommended method of compression. If hardware compression is
not available or not enabled, software compression can be configured to occur on the Client or
on the MediaAgent. When the data is later restored or recovered, the system automatically
decompresses the data on the Client and restores it to its original state. Software compression
is CPU intensive; we recommend using it only for WAN-based Clients writing to disk libraries,
with compression enabled on the Client.
Because compressed data often increases in size if it is compressed again, the system applies
only one type of compression for a given data protection operation. You can redefine the
compression type at any time without compromising your ability to restore/recover data.
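A minimal sketch of that decision flow (my own summary of the guidance above, not a
CommVault API or setting):

    # Sketch of the compression-placement guidance above; not product code.

    def compression_location(hardware_compression_available: bool,
                             wan_client_writing_to_disk: bool) -> str:
        if hardware_compression_available:
            return "tape drive (hardware compression)"
        if wan_client_writing_to_disk:
            return "Client (software compression, so less data crosses the WAN)"
        return "consider leaving software compression off (it is CPU intensive)"

    print(compression_location(False, True))   # -> Client (software compression, ...)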
Magnetic Writers
The previous section on disk I/O contains a more detailed discussion of configuring the number
of magnetic writers. Setting the number of magnetic writers is a balance between MediaAgent
capacity, drive configuration, data stream speed, and overall throughput.
Multiplexing factor
The previous section on tape I/O contains a more detailed discussion of setting a multiplexing
factor. However, a few more things to note:
• The multiplexing factor is effective only for the primary copy, and for a secondary copy with
Inline Copy enabled.
• The multiplexing factor for tape libraries has no impact on the maximum number of streams
for the storage policy.
• The multiplexing factor for magnetic libraries does impact the maximum number of streams
for the storage policy, so if you enable multiplexing for magnetic libraries, adjust the storage
policy's maximum number of streams accordingly.
Network Agents
Network Agents are parallel processes that read/write buffers to the transmission path. If not
fully used, they consume resources that might be used elsewhere. If LAN Optimized is enabled,
the number of Network Agents is automatically set to 1, and changing the setting has no effect.
Pipeline buffers
CommVault's data pipe technology in a non-LAN-optimized configuration uses, by default, thirty
(30) 64 KB buffers in shared memory to transmit and receive data. For large databases writing to
tape devices, increasing the number of buffers to 150 or 300 can improve performance. The
amount of shared memory available and the impact on other production processes must be
considered before increasing this value.
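The shared-memory cost of raising this value is easy to estimate (assuming, as described above,
64 KB buffers allocated per data pipe):

    # Illustrative only: shared memory consumed by pipeline buffers per data pipe.
    BUFFER_KB = 64

    def pipeline_memory_mb(buffer_count: int) -> float:
        return buffer_count * BUFFER_KB / 1024

    for buffer_count in (30, 150, 300):
        print(f"{buffer_count} buffers -> {pipeline_memory_mb(buffer_count):.1f} MB per data pipe")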
STREAM MANAGEMENT
Data Streams
Data Streams are what CommVault software uses to move data from source to destination. The
source can be production data or CommVault protected data. A destination stream will always
be to CommVault protected storage. Understanding the data stream concept will allow a
CommCell environment to be optimally configured to meet protection and recovery windows.
This concept will be discussed in great detail in the following sections.
Job Streams
Content requiring protection is defined within a subclient. Each subclient will contain one or
more streams for data protection jobs. For most iDataAgents, it is possible to multi-stream
subclient operations. Depending on performance requirements and how the data is organized in
the production environment, multi-streaming source data can be done by adding more
subclients or increasing the streams for an individual subclient.
Multiple Subclients
There are many advantages to using multiple subclients in a CommCell environment. These
advantages are discussed throughout this book; this section focuses only on the performance
aspects of using multiple subclients.
Running multiple subclients concurrently allows multi-stream reads and data movement during
protection operations. This can improve data protection performance and, when multi-stream
restore methods are used, it can also improve recovery times. Using multiple subclients to
define content is useful in the following situations:
Using multiple subclients to define data on different physical drives – This method can be used
to optimize read performance by isolating subclient contents to specific physical drives. By
running multiple subclients concurrently each will read content from a specific drive which can
improve read performance.
Using multiple subclients for iDataAgents that don’t support multi-stream operations – This
method can be used for agents such as the Exchange mailbox agent to improve performance by
running data protection jobs on multiple subclients concurrently.
Using multiple subclients to define different backup patterns – This method can be used when
the amount of data requiring protection is too large to fit into a single operation window.
Different subclients can be scheduled to run during different protection periods making use of
multiple operation windows to meet protection needs.
Multi-Stream Subclients
For iDataAgents that support multi-streaming, individual subclients can be set to use multiple
read streams for data protection operations. Depending on the iDataAgent being used, this can
be done through the Data Readers setting or the Data Streams setting.
Data Readers
Data Readers determine the number of concurrent read operations that will be performed
when protecting a subclient. By default, the number of readers permitted for concurrent read
operations is based on the number of physical disks available, with a limit of one reader per
physical disk. If there is one physical disk with two logical partitions, setting the readers to 2 will
have no effect. Having too many simultaneous read operations on a single disk can cause the
disk heads to thrash, slowing down read operations and potentially decreasing the life of the
disk. The Data Readers setting is configured in the General tab of the subclient and defaults to
two readers.
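A minimal sketch of that rule (my own restatement, not product code):

    # The effective number of concurrent reads is capped by the physical disks
    # backing the subclient content, regardless of the configured Data Readers.

    def effective_readers(configured_readers: int, physical_disks: int) -> int:
        return min(configured_readers, physical_disks)

    print(effective_readers(2, 1))   # one physical disk: the second reader adds nothing
    print(effective_readers(4, 4))   # one reader per physical disk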
Data Streams
Some iDataAgents will be configured using data streams and not data readers. For example,
Microsoft SQL and Oracle subclients use data streams to determine the number of job streams
that will be used for data protection operations. Data Streams are configured in the Storage
Device tab of the subclient. Although they will be configured differently in the subclient, they
still serve the same purpose of multi-streaming data protection operations.
Device Streams
As job streams are received by the MediaAgent, data is put into chunk format and is written to
media as device streams. The number of device streams that can be used depends on the
library type, the library configuration, and the storage policy configuration.
[Diagram: clients sending job streams to a storage policy configured with 50 device streams,
writing to a disk library whose mount paths are set to allow maximum writers.]
Most disk libraries consist of multiple mount paths. To load balance between mount paths, it is
recommended to set the Mount Path Usage option in the Mount Paths tab to Spill and Fill. By
setting the number of device streams to 50, the storage policy streams to 50, and the mount
path usage to Spill and Fill, maximum performance can be achieved.
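A short sketch of the load-balancing idea (a simplified model of stream placement, not the
product's allocator):

    # Simplified model: "spill and fill" rotates new device streams across mount
    # paths instead of filling one mount path before moving to the next.
    from itertools import cycle

    def spill_and_fill(streams: int, mount_paths: list) -> dict:
        counts = {mp: 0 for mp in mount_paths}
        for _, mp in zip(range(streams), cycle(mount_paths)):
            counts[mp] += 1
        return counts

    print(spill_and_fill(50, ["MountPath1", "MountPath2", "MountPath3", "MountPath4"]))
    # -> {'MountPath1': 13, 'MountPath2': 13, 'MountPath3': 12, 'MountPath4': 12}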
THANK YOU