Commvault Cloud Architecture Guide v11 SP15
REVISION HISTORY
VERSION DATE CHANGES
1.0 March 2015 • Initial Version
1.1 May 2015 • Updated AWS Architecture recommendations
1.2 June 2015 • Added new Architecture Consideration sections - Networking (AWS VPC), Infrastructure Access, Performance / Storage
• Added new Installation sections - Bring Your Own License / Software sections (Installation), Video Tutorial links
• Added new Additional Architecture Resources section
• Updated document & layout for new Commvault® branding
• Updated core cloud concepts and technology, AWS Sizing Recommendations and Security (Architecture Considerations) section
• Modified section layout
• Removed Data Aging caveats with SP11 micro pruning for Cloud Storage release, replaced text to refer to this only for pre-SP11 sites
1.3 July 2015 • Updated with new trademark guidelines
1.4 August 2015 • Minor reformatting
• Added new links to video content
1.5 September 2015 • Added Selecting the right Storage Class section
• Minor reformatting
1.6 November 2015 • New logo style
• Updated requirements for Disaster Recovery to the Cloud
• Updated cloud use case diagrams
• Updated VM Recovery to AWS feature, and updated documentation links
• Added Unsupported Cloud Configurations section
2.0 March 2016 • Updated to reflect new Virtual Server Agent for AWS methodologies, deployment and changes to use cases
• Updated Backup to the Cloud, DR to the Cloud and Protection in the Cloud use case scenarios and requirements
• Updated micro pruning section
• Updated Drive Shipping to add note about Snowball support arriving 2016
• Updated all BOL links to use Commvault® Version 11 documentation
• Added new Documentation section to Additional Resources
• Added Automating Deployment with Puppet/Chef section
• Added Pre-packaging Commvault within a VM Template section
• Minor reformatting changes
2.1 June 2016 • Updated Backup to the Cloud use case for clearer language around DDB requirements
• Added Commvault IntelliSnap functionality into VSA for AWS
2.2 September 2016 • Added Migration to the Cloud use case and Application Migration section
• Fixed error in data seeding table
• Minor updates to Backup/Archive to the Cloud use case verbiage
• Updated links to 2 Clicks to the Cloud videos, added new Backup videos for AWS
• Updated all original “Agent-in-Guest” references to “Agent-in-Guest (Streaming)”
• Added Snapshot-based Agent-in-Guest approach to Amazon Web Services
• Added AWS Reference Architectures
• Updated verbiage on Selecting the right Storage Class
2.3 March 2017 • Revised cloud library storage deduplication performance recommendations
• Added Live Sync DR for Amazon EC2 and revised DR structure
• Added Amazon S3/Blob storage backup feature
• Added GET/PUT storage cost considerations
• Added partitioned dedupe recommendation
• Added information about leveraging multiple mount points for cloud libraries
• Added EC2 to Azure conversion feature
• Updated AWS VSA features and recommendations (AppAware, Proxy restore AZ, advanced file system restore)
• General grammatical fixes
• Added performance test results for AWS backup methods: VSA and agent-in-guest
2.4 April 2017 • Updated Cloud pricing
• Changed a few areas to Commvault® where Simpana was referenced
2.5 August 2017 • Changed CS & MA sizing wording
• Added Windows 2016 (SP7) to CS and MA
2.6 September 2017 • Added Extra Large specs for AWS MediaAgents
• Added recommendations on dedupe block size to use in hybrid environments
• Added Oracle E-Business Suite migration functionality
• Added video link to Oracle EBS migration
2.7 February 2018 • Updated sizing for MediaAgents running in AWS
2.8 May 2018 • Reviewed entire document and updated with new content
• Updated MediaAgent instance sizing
• Included Move, Use, Manage use cases
• Separated document specific to AWS
2.9 October 2018 • Corrected various minor errors
• Updated document with SP12 and SP13 functionality
2.10 January 2019 • Updated document with SP14 functionality
• Revised Amazon EC2 sizing recommendations for MediaAgents
• Corrected various minor errors
2.11 March 2019 • Updated document with SP15 functionality
• Modified Best Practices for Amazon EC2 instance protection and performance numbers
• Added Architecture recommendations for Cross-Account Operations using VSA and IntelliSnap
• Corrected various minor errors
Table of Contents
NOTICES
REVISION HISTORY
ABSTRACT
SCALABILITY
AUTOMATION
ARCHITECTURE CONSIDERATIONS
NETWORKING
• VIRTUAL PRIVATE CLOUD/NETWORKING
INFRASTRUCTURE ACCESS
• IN-FLIGHT
• AT-REST
• HTTPS PROXIES
• “OVER-THE-WIRE”
• DRIVE SEEDING
COST / CONSUMPTION
• NETWORK EGRESS
• STORAGE I/O
• DATA RECALL
PERFORMANCE / STORAGE
• PARTITIONED DEDUPLICATION
• MICRO PRUNING
• SELECTING THE RIGHT STORAGE CLASS FOR BACKUP AND ARCHIVE DATA
AMAZON-SPECIFIC WORKLOADS
• VIRTUAL MACHINE RECOVERY INTO AMAZON EC2 INSTANCES
• AGENT-LESS AMAZON EC2 INSTANCE PROTECTION (VIRTUAL SERVER AGENT FOR AWS)
APPLICATION MIGRATION
• VIRTUAL MACHINE RESTORE & CONVERT (LIFT AND SHIFT TO AWS)
DEPLOYMENT
REMOTE ACCESS / BRING YOUR OWN SOFTWARE
• INSTALLATION BASICS
ARCHITECTURE SIZING
AMAZON WEB SERVICES (AWS)
ADDITIONAL RESOURCES
DOCUMENTATION
ABSTRACT
This document serves as an architecture guide for solutions architects and Commvault® customers who are
building data protection and management solutions utilizing public cloud environments and Commvault® software.
It includes public cloud concepts, architectural considerations, and sizing recommendations to support
Commvault® software in public cloud. The approach defined in this guide applies to both running Commvault solely
in public cloud environments and extending existing on-premises Commvault® functionality into hybrid cloud
architectures. The guide covers several common use cases for public cloud, including moving data to public cloud, disaster recovery to public cloud, and protecting workloads in public cloud.
This guide delivers architecture considerations and sizing recommendations for Amazon Web Services™ (AWS).
Guides for other public cloud environments are available as well.
Within the cloud, infrastructure building blocks are not only provisioned on-demand, driven by actual usage, but can also be programmed and addressed by code, allowing for a cost-effective pay-as-you-go model. This greatly enhances flexibility for both production and non-production environments in scenarios such as development, testing, and disaster recovery.
Resources in cloud environments can be provisioned as temporary disposable units, freeing users from the
inflexibility and constraints of a fixed and finite IT infrastructure. Infrastructure provisioning is automated through
code, allowing for greater self-service and more agile delivery of desired business and technical outcomes.
Intelligently provisioned and managed resource consumption in cloud environments is measured by what you
consume, not what you could consume, drastically changing the cost model challenges experienced today in
traditional on-premises architectures that typically operate on a three-year to five-year technology refresh cycle.
This represents a major, disruptive reset for the way in which you approach common infrastructure for data usage
such as secondary backup storage, long-term archiving, disaster recovery, new application development and
testing, reliability and capacity planning for bursting production workloads. Commvault utilizes this attribute of public cloud to enable cost-effective, on-demand use cases for data protection and data management, both to and in public cloud platforms.
GLOBAL, FLEXIBLE AND UNLIMITED RESOURCES
Public cloud providers offer globally distributed infrastructure to customers on a pay-as-you-go model, allowing for more flexibility in meeting requirements for both geographically distributed workloads and recoveries. Cloud resources and bandwidth are often available in regions in close proximity to on-premises corporate assets and personnel, providing an easy on-ramp to public cloud. The cost model implications of pay-as-you-go extend not only to production workloads, but also to the ever-present challenge of providing a flexible, agile, yet capable recovery solution for your applications and data. Today, many
recovery environments have less compute and storage capacity than their production counterparts, resulting
in degraded service in the event of a disaster. Even more so, hardware is often re-purposed to fulfill both the
recovery requirements as well as non-production uses, resulting in higher than expected maintenance costs and
slowed recovery times.
With the public cloud model, the infrastructure availability and refresh aspects are disrupted by removing the
need to maintain a hardware fleet that can meet both your recovery requirements and sustain your service level
agreements. Public cloud instances can be rapidly provisioned to meet the needs tied to business requirements,
rather than purchasing cycles. For specific recovery events – both real and simulated – the underpinning hardware
is maintained and upgraded by the Cloud provider without any need for technical input, and no upgrade costs are
incurred by the organization.
This dynamic shift allows you to begin costing per recovery event, instead of paying for availability, improving
your level of disaster recovery preparedness through the application of flexible, unlimited resources to both stage recovery tests and execute actual recovery events, all without requiring pre-purchased hardware or disrupting
production operations. While the recovery use case is the most common foray into a public cloud architecture,
many other use cases such as application testing and development, business intelligence and analytics, and
production bursting all benefit from the public cloud model.
Commvault® software is designed as an orchestrated, hardware and cloud agnostic, highly modular, distributed
solution that conforms with this new architecture reality, allowing data protection and management solutions to
be built to support and remain flexible with a highly distributed infrastructure built on-top of cloud architecture –
public, private or hybrid.
Due to the existence of a variety of public cloud platforms, many organizations are deploying multi-cloud
architectures out of a need to overcome their technological challenges and to optimize costs for specific types of
services. In some cases, this is being coupled with in-house developed private cloud environments that operate
with the agility and flexibility of public cloud for specific data types. The resulting reality for most environments is
a hybrid multi-cloud architecture, which offers optimal flexibility and control of the costs.
With the formation of hybrid multi-cloud environments, some common challenges surface with respect to data
management and protection. The human element is fundamental to cloud transformation. Having the appropriate
skillsets available in a timely manner to create, administer, and manage the data across cloud environments
becomes a top priority and can quickly result in a costly offset to the benefits of cloud being originally sought. The
number of different toolsets and techniques to perform the same operation can become an overwhelming decision
point and can result in a steep learning curve coupled with considerable custom automation and added cloud cost
if not done correctly. Lastly, the risks associated with the volume of data movement to, from, and across clouds, and the associated regulatory and compliance implications of your data being placed in public cloud, often warrant a deeper examination before embarking on cloud adoption.
Regardless of cloud preferences, the data is yours and needs to adhere to the standards set forth by your organization with respect to its management, compliance, and protection. With the Commvault® platform, several of the barriers discussed above are mitigated by having a single software-defined solution that provides the following capabilities:
• native integration with the compute, application, and storage layers of multiple cloud platforms
• a single pane of glass for managing all data in hybrid environments – public and private clouds and multiple hybrid clouds
• cost optimization in public cloud through usage-based scaling of infrastructure
• integrated global deduplication and encryption
• data portability through cross-platform conversion and recoveries
The Commvault® Cloud Library works by communicating directly with the object storage REST API interface over HTTP or HTTPS, allowing Commvault® platform deployments on both virtual and physical compute layers to perform read/write operations directly against cloud storage targets, reducing the TCO of the data management solution. The Cloud Library is part of the native code of the Commvault platform, and it optimizes the data exchange with cloud object storage platforms to maximize transfer speed while minimizing recall needs and costs.
Since the Cloud Library essentially treats cloud storage akin to a disk target, data management functions such
as compression, encryption, deduplication, and data life-cycling can be performed against cloud storage targets
to ensure that both costs and risks are managed effectively. This also allows the data to be retained independent of the cloud format, thereby enabling optimized recall and movement of data across different cloud platforms for future use cases.
*For more information on all the supported vendors, please refer to this comprehensive list located in Commvault
online documentation.
SCALABILITY
Application environments and the data and instances that service those environments grow over time, and a
data protection and management solution needs to adapt with the change rate to protect the dataset quickly and
efficiently, while maintaining an economy of scale that continues to generate business value out of that system.
Commvault® software maintains a scale-up or scale-out “building block” approach for protecting datasets,
regardless of the origin or type of data. These blocks are sized based on the front-end data they will ingest, prior
to compression and deduplication. This provides clear scale-out and scale-up guidelines for the capabilities and
requirements for each Commvault MediaAgent that will perform the data movement (both ingestion and storage),
compression and deduplication.
Furthermore, these deduplication MediaAgent building blocks may be logically grouped together in a grid
formation, providing further global deduplication scale, load balancing, and redundancy across all nodes within
the grid.
This software architecture, with scale-up and scale-out capabilities, enables cloud adoption to start with a cost-conscious approach yet scale to meet SLAs quickly, without locking the architecture into a specific unit of operation.
CLIENT-SIDE DEDUPLICATION
As is the nature of deduplication operations, each data block must be hashed to determine whether it is a duplicate or a unique block before it is captured. With client-side deduplication, this hashing is performed on the client itself, so only unique blocks are sent to the data mover (MediaAgent). This improves the ingest performance of the data mover and has the secondary effect of reducing the network traffic stemming from each client communicating through to the data mover.
In public cloud environments where network performance can vary, the use of client-side deduplication can reduce
backup windows and drive higher scale, freeing up bandwidth for both production and backup network traffic. By
utilizing client-side deduplication, the workload of backup can be distributed across all the instances, compared
to building a larger data protection architecture in the cloud. This can also help reduce recovery point objectives for critical applications by enabling more frequent protection.
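To illustrate the principle (this is a conceptual sketch, not Commvault's internal implementation), the following Python fragment shows why client-side deduplication saves bandwidth: blocks are hashed on the client, and only blocks whose signatures are unknown to the data mover cross the network. The block size, hash algorithm, and in-memory signature set are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # illustrative; the real block size is a Commvault configuration choice

def blocks_to_send(path, known_signatures):
    """Hash each block client-side and return only blocks the data mover has not seen.

    known_signatures stands in for the signature lookup a client performs against
    the deduplication database (DDB) before transmitting any data.
    """
    unique = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            signature = hashlib.sha256(block).hexdigest()
            if signature not in known_signatures:
                known_signatures.add(signature)
                unique.append((signature, block))  # only unique blocks cross the network
    return unique
```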
VIRTUAL SERVER AGENT (VSA) OR PROXY FOR CLOUD PLATFORMS
Utilizing agents in each cloud operating instance is an approach that distributes the overall workload and cost
for data protection across all the instances. However, in many cases with large scale deployments, management
of each instance can become an overhead issue. The Commvault® platform automates the management of agent
operations from initial deployment through upgrades and removal from instances. When this approach is deemed
insufficient, the Commvault Virtual Server Agent (VSA) proxy software capability can be loaded into a public cloud
instance to perform complete agent-less operations.
Akin to proxy-based protection for on-premises hypervisors, the Commvault VSA* proxy interfaces directly
with APIs available from the hypervisor layer of public cloud platforms to perform protection and management
operations of instances within the public cloud platform. The VSA not only manages operations such as snapshot
creation and orchestration, but can also perform automatic instance identification and selective data reads (Changed Block Tracking) from cloud platforms that support this capability. The VSA further performs
any data format conversions and enables disaster recovery operations for instances to cloud platforms. Working
together with the MediaAgent (data mover), the VSA offers enhanced protection and management of cloud
workloads.
*For a complete list of supported VSAs and updated VSA capabilities please review the online VSA Feature
Comparison Matrix.
As part of any data protection and management solution, it is important to ensure that you design for recovery in order to maintain and honor the recovery time objective (RTO) and recovery point objective (RPO) requirements identified for your individual application groups.
While crash-consistency within a recovery point may be sufficient for a file-based dataset or cloud instances such
as Amazon EC2 instances, it is not generally appropriate for an application such as Microsoft SQL Server or Oracle
Database where the database instance needs to be quiesced to ensure the database is valid at the time of backup.
Commvault® software supports both crash- and application-consistent backups, providing flexibility in your design
while assuring instance recoverability coupled with application recovery to a specific point in time. Not only are
the most common types of applications covered, but a wide variety of classic applications and cloud applications
are supported. For a complete list of updated application support please review the online documentation: Data
Protection and Recovery Agents.
Many cloud providers support replication at the object storage layer from one region to another. However, in the
circumstance that bad or corrupted blocks are replicated to the secondary region, your recovery points are invalid.
Further, network and storage costs continue to accumulate regardless of the validity of both “sides” of the data.
While Commvault® software can support a replicated cloud library model, in which the secondary storage location for backups is replicated using the cloud vendor's storage-based replication tools, we recommend that you consider using Commvault® software to create an independent copy of your data, whether to another region, another cloud provider, or back to an on-premises infrastructure, to address broader risks. Deduplication is also vital in the latter options, as it ensures that Commvault® software can minimize cross-region and cross-provider copy time and costs by transferring only the unique changed blocks over the network.
This recommendation not only ensures recoverability to multiple points in time, it further manages the cost
and risk through the assurance that the data is independent of the platform and ensures that different SLAs for
protection and retention can be maintained for different classes of data.
Not all workloads within the cloud need protection – for example, with microservices architectures, or any
architecture that involves worker nodes that write out the valued data to an alternate location, there is no value in
protecting the worker nodes. Instead, the protection of the gold images and the output of those nodes provides the
best value for the business. However, it is important to note that data stored in ephemeral locations may need to be
protected prior to termination operations against those instances to ensure that any valuable data is not lost.
A common consideration is to utilize multiple tiers of storage for data as the service life of that data reduces. This
has been a common practice on-premises and the Commvault platform extends this capability to cloud platforms.
By having native integration to primary object storage targets such as Amazon S3 Standard and having native
access to more cost-effective tiers such as Amazon S3 Standard-Infrequent Access (S3-IA), Amazon S3 One Zone-
Infrequent Access (S3 One Zone-IA) and Amazon Glacier, data lifecycle management can be performed within the
cloud. For example, it is not uncommon to see Amazon S3 Standard being used as the secondary copy for short-
term retention followed by Amazon Glacier being used for long-term retention copies. Having a data management
platform that can utilize SLA policies to orchestrate the data movement and be aware of the location of data for
recall and disposition becomes a valuable quality in gaining cloud efficiency.
Shutting down instances in an on-premises data center is an uncommon practice or design concept. However, in cloud environments this type of operation is welcomed by those paying the cloud bills. Commvault® software provides the ability to create policies that monitor the resource usage of cloud instances and can both alert on and act upon low usage by terminating such instances. The risk of data loss is mitigated by ensuring a copy of any ephemeral data is protected before such an operation is performed.
The ability to shut down instances extends to Commvault platform components running in the public cloud. Referring to the MediaAgents (data movers) referenced above, these instances can be both shut down and powered up via a policy operating on a proxy running in public cloud or on-premises. The trigger events are tied to data protection operations. For example, shutting down the cloud MediaAgent after all protection operations have ceased, and restarting it prior to the next SLA window, can help further reduce operational costs within public cloud environments.
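Commvault drives this power management through its own policies; purely as an illustration of the underlying AWS calls such a policy would issue, the boto3 sketch below stops and starts a cloud MediaAgent instance. The region and instance ID are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption
MEDIA_AGENT_ID = "i-0123456789abcdef0"  # placeholder ID of the cloud MediaAgent instance

def stop_media_agent():
    """Stop the MediaAgent once all protection operations have ceased."""
    ec2.stop_instances(InstanceIds=[MEDIA_AGENT_ID])

def start_media_agent():
    """Start the MediaAgent ahead of the next SLA window."""
    ec2.start_instances(InstanceIds=[MEDIA_AGENT_ID])
```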
AUTOMATION
The cloud encourages automation, not just because the infrastructure is programmable, but because repeatable actions reduce operational overheads, bolster resilience through known good configurations, and allow for greater levels of scale. Commvault® software provides this capability through three key tenets:
Commvault® software provides a robust Application Programming Interface (API) that allows for automated control
over deployment, configuration, and backup and restore activities within the solution.
Whether you are designing a continuous delivery model that requires automated deployment of applications,
or automating the refresh of a disaster recovery copy, data warehouse or development/testing environment
that leverages data from a protection copy, Commvault® software provides the controls necessary to reduce
administrative overhead and integrate with your toolset of choice.
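As a brief, hedged illustration of API-driven control, the sketch below authenticates against the Commvault REST API and enumerates clients. The hostname and credentials are placeholders, and the endpoint paths and payload shapes shown here should be verified against the REST API documentation for your service pack.

```python
import base64
import requests

BASE_URL = "http://webconsole.example.com/webconsole/api"  # placeholder Web Console host

# Log in; the password is Base64-encoded per the Commvault REST API convention
login = requests.post(
    f"{BASE_URL}/Login",
    headers={"Accept": "application/json"},
    json={"username": "admin", "password": base64.b64encode(b"secret").decode()},
)
login.raise_for_status()
token = login.json()["token"]  # authentication token returned on successful login

# Use the token to list clients known to the CommServe
clients = requests.get(
    f"{BASE_URL}/Client",
    headers={"Accept": "application/json", "Authtoken": token},
)
print(clients.json())
```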
Beyond API access, the most common use cases for data protection and management are built into the Commvault user interface. Simply enter the cloud credentials and necessary permissions, and the Commvault platform will query the cloud environment accounts and present wizards with the necessary attributes to create instances and populate them with the data required to support the use cases discussed above. Since format conversions are handled by the VSA, the entire operation is orchestrated even if the source of data is an on-premises hypervisor. This reduces the operational overhead and unique skillsets required to on-board cloud usage.
The Commvault® Intelligent Data Agents (iDA), whether via the Virtual Server Agent for the various cloud platforms,
or the multitude of application and database iDAs, provide auto-detection capabilities to reduce administrative
load.
Fresh instances, new volumes recently attached to cloud instances and virtual machines, or databases imported
and created into a database instance are some examples of how Commvault® software automatically detects new
datasets for inclusion in the next data protection SLA window, all without manual intervention. Even agent-in-
guest deployments can be auto-detected by Commvault software and included in the next data protection schedule
through intelligent Client Computer Groups. This capability is especially valuable in the assurance of data protected
in large scale cloud environments where many users can provision workloads in the cloud but may have little or no
consideration for the protection of those workloads.
This auto-detection and auto-protection level removes the requirement for a backup or cloud administrator to
manually update the solution to protect the newly created datasets. This results in improving your operational
excellence, improving resiliency within your cloud infrastructure, and ensuring new data is protected so that Service Level Agreements (SLAs) are maintained.
A common task performed by system administrators is facilitating access to recovery points for end-users and
application owners, shifting their attention away from other day-to-day operations and strategic projects.
The Commvault self-service interfaces empower users to access their datasets through a web-based interface,
allowing security mapped access to individual files and folders within the protected dataset, freeing up
administrators to work on critical tasks. The robust Commvault role-based security function provides assurance that self-servicing users have access only to their own data assets, while audit reporting capabilities capture how these users are accessing those data assets.
MOST COMMON CLOUD USE CASES WITH COMMVAULT® SOFTWARE
The most common cloud-related use cases Commvault observes across customer environments fall into three categories, depending on the maturity of cloud adoption initiatives.
• Move data to the cloud – typically involves using public cloud object storage as a target for backups and archive
data and moving certain types of VM workload into cloud instances.
• Manage data in and across clouds – protecting and life-cycling data and instances in cloud, moving data across
clouds and back to on-premises in some cases.
• Use data in the cloud – utilizing the data stored in public cloud for use cases such as disaster recovery, dev/test, and other production and non-production use cases.
Each use case can have multiple phases and types of data associated. For example, movement could involve
simple backup data, but can graduate to workloads being moved back and forth for agility as an extension to on-
premises. Management of data can start with basic snapshot management and graduate to complete data lifecycle
management with cloud snapshots, deduplicated operational recovery copies, and archiving of data coupled with searching and indexing for compliance. The use of data can involve scenarios such as disaster recovery that eliminate
the need to sustain secondary on-premises sites and utilize the agility of the cloud to on-ramp recovery testing and
real recoveries.
MOVE DATA – BACKUP AND ARCHIVE TO THE CLOUD
Business Value: Protecting data at the primary on-premises location by writing directly to an external cloud
provider’s storage solution or retaining a local copy and replicating the backup and archive data copy (either in full,
or only selective portions of that data) into an external cloud provider’s storage service suitable for both short and
long-term retention configurations.
• Native, direct connectivity to 40+ object storage endpoints – no requirement for translation, gateway, or hardware deduplication devices.
• Avoid point solutions on a per-application basis. Any data (physical or virtual) that can be backed up by Commvault on-premises can be moved to cloud.
• Cloud object storage target can be provided by either a public IaaS provider (such as AWS) or via a Managed Service Provider (MSP).
• Can use a direct internet connection or a dedicated network to the cloud provider for optimized data transport performance in a secure manner (e.g., AWS Direct Connect).
• An in-cloud MediaAgent can be created to support a DR solution to cloud using the data that is placed in the cloud storage service. This cloud MediaAgent can be deployed at the time of the DR event or DR test, as required.
MOVE DATA - MIGRATION OF VMS AND APPLICATIONS TO THE CLOUD
Business Value: Upon protecting VM and application data at the primary on-premises location, Commvault®
software orchestrates the migration of application workloads into the cloud, either at the VM container level or the
application level. While the migration is underway and workloads are in a transition phase between on-premises and public cloud, data is still protected on-premises.
• Lift & Shift of virtual machines – application-consistent VM backups are used to restore and convert VMware and Hyper-V VMs into Amazon EC2 instances as part of a migration with a phased cut-over strategy, reducing on-premises downtime.
• Oracle Migration Feature (Linux, AIX, Solaris to AWS) – an Oracle application migration feature to both synchronize the baseline and any incremental changes as part of a migration lifecycle.
• Application Restore Out-of-Place – leverage Commvault® iDataAgents for your supported workload to restore the target application out-of-place to a warm instance residing in cloud.
• Minimum 1x MediaAgent on-premises to protect and capture workloads.
• Minimum 1x MediaAgent (& DDB) in-cloud to protect workloads post-migration in-cloud, and for optimal migration performance.
• The Oracle migration feature to Amazon EC2 supports Oracle on Linux, AIX and Solaris. For AIX and Solaris source databases, a destination Amazon EC2 instance must exist with Oracle installed. However, for Linux source databases, the destination Amazon EC2 instance can be provisioned as part of the process.
• It is highly recommended to use dedicated network links to the cloud provider for best performance (e.g., AWS Direct Connect).
MANAGE DATA - PROTECTION IN THE CLOUD
Business Value: Providing operational recovery for active workloads and data within an external provider’s cloud.
Provide the ability to lifecycle data and cloud instances to meet SLA and cost requirements.
• Data protection for cloud-based workloads – protecting active workloads within an existing IaaS cloud (Production, Dev/Test, etc.).
• Agentless Instance Protection (AWS) – protect instances with an agentless and script-less protection mechanism through the Virtual Server Agent.
• DASH Copy data to another region, cloud, or back to on-premises – complete data mobility by replicating to another geographical region within an IaaS provider, a different IaaS provider, or back to on-premises sites.
• Protect Amazon S3 – back up object storage repositories containing data created by other third-party applications either in cloud, to an alternative provider, or back to on-premises sites.
• (AWS) Virtual Server Agent and MediaAgent deployed on a proxy within an Amazon EC2 instance for agentless backup. Applications will require agent-in-guest deployed in-instance.
• Applications requiring application-level consistency, and all other cloud providers, can be protected via agents deployed in each VM/instance within the IaaS provider.
• Minimum 1x MediaAgent in source cloud, and (optional) minimum 1x MediaAgent at secondary site (whether cloud or on-premises) for receiving a replicated copy of data.
• Recommended to use a dedicated network from cloud provider to on-premises for best performance when replicating back to on-premises (e.g., AWS Direct Connect).
USE DATA - DISASTER RECOVERY TO THE CLOUD
Business Value: Providing operational recovery of primary site applications to a secondary site from an external
cloud provider.
• Off-site storage and cold DR site in the Cloud – only use the cloud compute infrastructure when a DR event occurs, saving time and money via the elimination of asset allocation with long idle periods between DR operations.
• Live Sync data replication for warm recovery in cloud – automate the creation of cloud instances and replication of on-premises VMs to Amazon EC2 instances on a periodic cycle, more frequently than backups. Reduces recovery time to the cloud.
• VM Restore & Convert – convert VMware and Hyper-V VMs into Amazon EC2 instances on-demand with data intact. This data transformation automation reduces time and complexity costs.
• Database/Files – restore out-of-place, whether on-demand or scheduled, to refresh DR targets. When combined with job-based reporting, this scheduled operation is of benefit to enterprises that must maintain audit and compliance reporting associated with business continuity.
• Minimum 1x MediaAgent on-premises, and minimum 1x MediaAgent in cloud.
• MediaAgent in cloud only needs to be powered on for recovery operations.
• Highly recommended to use dedicated network links to the cloud provider for best performance (e.g., AWS Direct Connect).
ARCHITECTURE CONSIDERATIONS
NETWORKING
AWS has the capability to establish an isolated logical network. This is referred to within AWS as a Virtual Private
Cloud (VPC).
Instances/Virtual Machines deployed within a VPC, by default, have no access to the public Internet, and utilize a
subnet of the customer’s choice. Typically, VPCs are used when creating a backbone between Virtual Machines
(Amazon EC2 instances), and when establishing a dedicated network route from a customer’s existing on-
premises network directly into the public cloud provider via AWS Direct Connect.
Customers may find a need to bridge their existing on-premises infrastructure to their public cloud provider, or
bridge systems and workloads running between different Cloud providers to ensure a common network layer
between compute nodes and storage endpoints.
This is particularly relevant to solutions where you wish to backup/archive directly to the cloud or create
deduplicated secondary data copies (DASH Copy) of existing backup/archive data to object storage within a cloud
provider.
• VPN Connection – network traffic is routed between network segments over public Internet, encapsulated
in a secure, encrypted tunnel over the customer’s existing Internet connection. As the connection is shared,
bandwidth is limited, and regular data transfer fees apply as per the customer’s current contract with their ISP.
• AWS Direct Connect – a dedicated network link is provided at the customer's edge network at an existing on-premises location that provides secure routing into an AWS Virtual Private Cloud network. Typically, these
links are less expensive when compared to a customer’s regular internet connection, as pricing is charged on
a monthly dual-port fee, with all inbound and outbound data transfers included free of charge, with bandwidth
from 10 Mbit/s to 10 Gbit/s.
INFRASTRUCTURE ACCESS
Public cloud providers do not allow direct access to the underlying hypervisor; instead, access to functionality such as VM power on/off and console access is provided through a REST API.
AWS provides VPC Endpoints that enable you to create private connections between a given VPC and another
AWS service without having to route via public Internet space. Support for Amazon S3 VPC Endpoints was
announced in May 2015, and while it is only supported within the same region as the VPC, use of this function is
highly recommended as it reduces availability risks and bandwidth constraints on the VPC’s link through to public
Internet.
An Amazon S3 VPC Endpoint must first be defined by creating an Endpoint Policy within the AWS console, but there
is no change to the FQDN hostname used to define the Cloud Library within Commvault. Instead, AWS will ensure
that DNS queries for the hostname will resolve against the Amazon S3 VPC Endpoint, instead of the public address,
and apply appropriate routing (provided the Endpoint Policy is successfully created).
For more information on VPC Endpoints, please refer to this AWS documentation: VPC Endpoints.
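For illustration, a gateway VPC Endpoint for Amazon S3 can also be created programmatically; the boto3 sketch below assumes placeholder VPC and route table IDs and the us-east-1 region.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption

# Create a gateway endpoint so S3 traffic from the VPC avoids the public internet
response = ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",             # placeholder VPC ID
    ServiceName="com.amazonaws.us-east-1.s3",  # S3 service name matching the VPC's region
    RouteTableIds=["rtb-0123456789abcdef0"],   # placeholder route table ID
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```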
DATA SECURITY
IN-FLIGHT
By default, all communication with Cloud Libraries utilizes HTTPS, which ensures that all traffic is encrypted in-flight between the MediaAgent and the Cloud Library endpoint; however, traffic between Commvault® nodes is not encrypted by default. We recommend that any network communications between Commvault® modules routing over public internet space be encrypted to ensure data security. This is accomplished by using standard Commvault® firewall configurations (Two-Way and One-Way).
AT-REST
Data stored in a public cloud is usually on shared infrastructure logically segmented to ensure security. Commvault
recommends adding an extra layer of protection by encrypting all data at-rest. Most cloud providers require that
any seeded data is shipped in an encrypted format. An example of seeding data in AWS is with the use of AWS
Snowball or AWS Snowball Edge devices.
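Commvault software can encrypt the data itself before it is written to the cloud library; in addition, default server-side encryption can be enforced on the bucket itself. A minimal boto3 sketch, assuming a placeholder bucket name and SSE-S3 (AES-256):

```python
import boto3

s3 = boto3.client("s3")

# Enforce default server-side encryption for every object written to the bucket
s3.put_bucket_encryption(
    Bucket="commvault-cloud-library",  # placeholder bucket name
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    },
)
```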
HTTPS PROXIES
Please take note of any HTTP(S) proxies between MediaAgents and endpoints, whether via public Internet or
private space, as this may have a performance impact upon any backup/restore operations to/from an object
storage endpoint. Where possible, Commvault® software should be configured to have direct access to an object
storage endpoint.
DATA SEEDING
Data seeding is the process of moving the initial set of data from its current location to a cloud provider in a method
or process that is different from regular or normal operations. For seeding data to an external cloud provider, there
are two primary methods:
“OVER-THE-WIRE”
This is typically performed in a small logical grouping of systems to maximize network utilization in order to more
quickly complete the data movement per system. Some organizations will purchase “burst” bandwidth from their
network providers for the seeding process to expedite the transfer process.
Major cloud providers offer a direct network connection service option for dedicated network bandwidth from your
site to their cloud such as AWS Direct Connect.
Please see the chart below for estimated payload transfer time for various data sizes and speeds.
Speed        1 GB       10 GB      100 GB     1 TB       10 TB      100 TB     1 PB       10 PB
10 Mbit/s    14 min     2.2 hrs    22.2 hrs   9.2 days   92.6 days  -          -          -
100 Mbit/s   1 m 20 s   13 m 20 s  2.2 hrs    22.2 hrs   9.2 days   92.6 days  -          -
10 Gbit/s    0.8 s      8 s        1 m 20 s   13 m 20 s  2.2 hrs    22.2 hrs   9.2 days   92.6 days
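The figures in the chart can be reproduced with a simple calculation (decimal units, ideal link with no protocol overhead assumed):

```python
def transfer_hours(payload_gb: float, link_mbit: float) -> float:
    """Hours to move payload_gb gigabytes over a link_mbit megabit-per-second link."""
    bits = payload_gb * 8 * 10**9           # decimal GB -> bits
    return bits / (link_mbit * 10**6) / 3600

# 1 TB over a 100 Mbit/s link -> roughly 22.2 hours, matching the chart
print(f"{transfer_hours(1000, 100):.1f} hrs")
```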
DRIVE SEEDING
If the data set is too large to copy over the network, or transport over network is too costly, then physical drive
seeding is a valid alternative option. Drive seeding is copying the initial data set to external physical media and then
shipping it directly to the external cloud provider for local data ingestion.
Please refer to Commvault's online documentation for the Seeding the Cloud Library procedure for more information: documentation.commvault.com/commvault/v11_sp14/article?p=9274.htm.
In addition to this, each external cloud provider has their own process for drive seeding:
Amazon
• Snowball
• Snowball Edge
• Snowmobile
COST / CONSUMPTION
NETWORK EGRESS
Moving data into a cloud provider, in most cases, has no provider cost; however, moving data outside the cloud provider, virtual machine instance, or cloud provider region usually has a cost associated with it. Restoring data
from the cloud provider to an external site or replicating data between provider regions are examples of activities
that are classified as Network Egress and usually have additional charges. Pay special attention to the tier of
storage. Some storage tiers cost more for egress and others are free. This may impact your storage costs enough
to decide to choose a higher tier of storage like Amazon S3 Standard instead of Amazon S3-IA or Amazon Glacier.
STORAGE I/O
Storage I/O refers to the input and output operations against storage attached to the virtual machine instance. Cloud storage is usually metered with a fixed allowance included per month and per-unit “overage” charges beyond the allowance. Frequent
restores, active data, and active databases may go beyond a cloud provider’s storage I/O monthly allowance, which
would result in additional charges.
Amazon S3 storage usually incurs a cost for GET/PUT transactions to cloud object storage. These costs are
primarily to enforce good practices for applications when retrieving and placing data in the cloud. As such, the cost
when using the Commvault® solution is minimal.
When Commvault® software writes data to a cloud library, the cloud library splits the data into sub-chunks of 32 MB. Each 32 MB chunk write or read incurs a PUT or GET request, respectively. As of January 2018, AWS charges $0.005 per 1,000 PUT requests and $0.004 per 10,000 GET requests for an Amazon S3 Standard bucket, for example. A baseline of 200 GB with a savings of 40% at a 32 MB sub-chunk size would result in approximately 3,840 PUT requests. At a charge of $0.005 per 1,000 requests, that is a cost of roughly 2 cents.
Note: All cost figures are referenced in USD and based on pricing listed on the AWS website
at time of this document’s publication.
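The arithmetic above generalizes into a small cost model, using the January 2018 pricing quoted in this section:

```python
def put_request_cost(baseline_gb: float, dedup_savings: float,
                     chunk_mb: int = 32, usd_per_1000_puts: float = 0.005) -> float:
    """Estimate PUT request cost for writing a deduplicated baseline to Amazon S3."""
    written_gb = baseline_gb * (1 - dedup_savings)  # data actually written after savings
    put_requests = written_gb * 1024 / chunk_mb     # one PUT per 32 MB sub-chunk
    return put_requests / 1000 * usd_per_1000_puts

# 200 GB baseline with 40% savings -> 3,840 PUTs -> roughly $0.02
print(f"${put_request_cost(200, 0.40):.4f}")
```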
DATA RECALL
Low-cost cloud storage solutions may have a cost associated with accessing data or deleting data earlier than an
agreed time. Storing infrequently accessed data on a low-cost cloud storage solution may be attractive upfront,
however Commvault recommends modeling realistic data recall scenarios. In some cases, the data recall charges
may be more than the potential cost savings vs. an active cloud storage offering.
As a best practice, Commvault recommends developing realistic use case scenarios and modeling cost against
the identified scenarios to ensure the cloud solution meets your organization’s SLAs, as well as cost objectives, by
leveraging the AWS cost calculator.
PERFORMANCE / STORAGE
Object storage performs best with concurrency, and as such with any Cloud Libraries configured within
Commvault, best performance is achieved when configured for multiple readers / streams.
There are additional Data Path settings and additional settings used to adjust and fine-tune the performance of the
Cloud Library. The exact settings for best performance may vary between cloud vendors.
The following combined settings are recommended to increase data read performance from cloud libraries
utilizing deduplication.
1 Increase deduplication block size to either 256 KB or 512 KB for maximum performance
Cloud object storage is subject to 50% or higher latencies than traditional storage. When requesting blocks
from object storage, this delay may reduce overall read performance. To counteract the delay, increase the
deduplication block size to allow larger data retrievals for each request. Note that changing existing storage
policies will initially cause an increase in storage as new deduplication data re-baselines. Only new data written
with the higher block size will benefit from retrieval performance improvements.
If the requirement is to keep one copy on-premises and another in the cloud, the recommendation is to use a 256 KB block size for both the on-premises and cloud copies. Otherwise, if one or all copies involved will be using cloud storage, the recommendation is to use a 512 KB block size. The reason is that you cannot choose a different deduplication block size for multiple copies within a Storage Policy; allowing this would unnecessarily increase the overhead in creating the secondary copies, as data would have to be rehydrated and re-deduplicated with the new block size. As of April 2018 (Commvault® V11 SP11), a deduplicated cloud library will automatically be created to use a 512 KB block size.
When used in conjunction with either a 256 KB or 512 KB block size, the additional data path settings referenced above will further increase read performance from a cloud library utilizing deduplication.
For more tunable settings and information, please refer to “Cloud Connection Performance Tuning” in Commvault
online documentation: Cloud Connection Performance Tuning
Deduplication is recommended where possible, except for environments where there are significant bandwidth concerns for re-baselining operations, or for archive-only use cases where the data pattern spread generates no benefit from deduplication operations.
While additional compute resources are required to provide the necessary foundation for optimal deduplication
performance, using deduplication in a cloud context can still achieve greater than a 10:1 reduction.
Even with sealing of the deduplication database (DDB), stored data results can achieve a 7:1 reduction in footprint,
providing significant network savings and reduced backup/replication windows (DASH Copy).
In comparison, software compression can only achieve 2:1 reduction on average and will constantly consume the
same bandwidth when in-flight between endpoints (no DASH Copy).
LEVERAGING MULTIPLE MOUNT PATHS FOR A CLOUD LIBRARY
Just like regular disk libraries, Cloud Libraries have the option to leverage multiple mount paths. The benefit of
using multiple mount paths depends on the Cloud storage vendor.
For Amazon S3, using multiple mount paths may help to increase performance due to the nature of how the
Amazon S3 subsystem distributes data.
While public IaaS environments allow block-based storage to be provisioned and leveraged as disk libraries, the
overall cost of those volumes can quickly exceed that of object storage. Based on typical cloud pricing, object
storage could store 3x as much data as block-based storage (Amazon EBS “General Purpose SSD” gp2 volumes)
for 33% less cost. This will vary based on capacity and performance needed from block-based storage.
Additionally, with the inclusion of Commvault® micro pruning, and its benefit of reducing cost of data stored in
object storage, it is highly recommended that object storage be the primary choice for writing data to the cloud,
and other forms of storage by exception.
If you are unsure as to which offering to use, you should consume regular object storage (Amazon S3 Standard).
PARTITIONED DEDUPLICATION
Like on-premises configurations, making use of partitioned deduplication can provide several benefits. When
possible, make use of partitioned deduplication to increase scale, load balancing, and failover. Version 11 allows for
the addition of two extra nodes (up to 4) to an existing deduplication store dynamically, allowing for rapid scale-out
configurations. See Commvault online documentation for more information: Partitioned Deduplication.
MICRO PRUNING
The micro pruning support for object storage is effective for any new data written into the active store.
For customers who have upgraded from Version 10 of Commvault software, but have not yet enabled micro pruning
support, macro pruning rules will still apply to existing data within the active store until the store has been sealed.
Once the active store is sealed, there will no longer be a need for continued periodic sealing against that store.
SELECTING THE RIGHT STORAGE CLASS FOR BACKUP AND ARCHIVE DATA
Depending on the provider, there may be different tiers of object storage available that offer different levels of cost,
performance and access/SLA’s. This can have a significant impact on both the cost and the user experience for the
datasets contained within the cloud storage.
For example, storing infrequently accessed backups within Amazon S3 Standard-Infrequent Access class (Amazon
S3-IA) can significantly lower the cost of your cloud bill, while storing archives in an Infrequent Access tier vs. a Deep Storage tier (e.g., Amazon Glacier) can greatly impact accessibility for end-users to the archived data.
To delve into further detail, these storage classes can be broken into three primary categories:
• Standard storage – this storage class represents the base offering of any object storage platform – inexpensive,
instant access to storage on-demand. Offerings in this category include Amazon S3 Standard, at an average
price of $0.024/GB/month (as of April 2018 and depending on geographic region). Typically, this tier would be
used for backup and archive workloads in a short-term retention configuration.
• Infrequent Access – this is a relatively new offering that addresses what was a gap between the Standard offering and Deep Archive storage tiers, in that it is offered at a lower price point than Standard storage ($0.004/GB/month for AWS), but is aimed at scenarios where data is infrequently accessed. Offerings in this category include Amazon S3 Standard-Infrequent Access (Amazon S3-IA) and Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA).
While the storage is always accessible, like the Standard offering, the cost model is structured to enforce an
infrequent access use case by charging $0.01/GB for any retrieval from this storage tier.
This tier would be best leveraged for backup workloads in a medium to long-term retention configuration, and
for archive workloads that require instant access to the archived data.
• Deep Archive – sometimes referred to as “cold storage”, this tier is intended for data that will probably not
be accessed again, but must be retained in the event of compliance, legal action, or another business reason.
Amazon Glacier is an example of archive storage which Commvault supports.
The cost of this storage class is the lowest compared to all three offerings – between $0.002/GB/month to $0.01/
GB/month depending on geographic region – but as with the Infrequent Access class, the Deep Archive class’s
cost model is also structured with the expectation that retrievals are infrequent and unusual, and data will be
stored for an extended period of time.
Typically, providers will charge a fee if data is deleted prior to 30-90 days of an object’s creation, and if more than
a set percentage of your data set per month is retrieved, then additional costs may apply. You can think of this class of storage as equivalent to tape; it is therefore recommended not to use deduplication with it.
It is highly recommended that you review the cost options and considerations of each of these storage classes
against the use case for your architecture to gain the best value for your cost model. Commvault® Professional
Services can assist in necessary service class / data class valuations in designing the correct cost value model for
your enterprise.
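To make such a valuation concrete, the sketch below compares monthly storage-plus-recall cost across classes using the per-GB figures quoted in this section; the Glacier figures are assumptions within the quoted ranges, and actual pricing varies by region, so verify against the AWS cost calculator.

```python
# Per-GB figures quoted in this section (USD, circa April 2018; verify for your region)
CLASSES = {
    "S3 Standard":           {"store": 0.024, "retrieve": 0.00},
    "S3 Infrequent Access":  {"store": 0.004, "retrieve": 0.01},
    "Glacier (assumed mid)": {"store": 0.006, "retrieve": 0.01},  # assumed within quoted/IA-like ranges
}

def monthly_cost(stored_gb: float, recalled_gb: float) -> None:
    """Print storage-plus-recall cost per month for each storage class."""
    for name, price in CLASSES.items():
        total = stored_gb * price["store"] + recalled_gb * price["retrieve"]
        print(f"{name:22s} ${total:,.2f}/month")

monthly_cost(stored_gb=50_000, recalled_gb=1_000)  # e.g., 50 TB retained, 1 TB recalled per month
```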
Support for Infrequent Access storage classes such as Amazon S3-IA and Amazon S3 One Zone-IA is available in Commvault® Version 11.
For remote office locations, small cloud environments, roaming devices such as laptops, and any architecture
that proves unfeasible or cost prohibitive to implement a traditional or cloud-based MediaAgent, backups can be
done directly from the source to a cloud target such as Amazon S3, completely bypassing the MediaAgent. This
is achieved by installing and enabling the Storage Accelerator feature on the client for direct communication to a storage target, and will speed up the backup and reduce costs in these situations.
PERFORMING DISASTER RECOVERY TO THE CLOUD
This section will cover the steps required to perform disaster recovery into the Amazon Web Services cloud
platform. We examine recovery methods available for both image and agent-based protection. This also addresses
different recovery scenarios that may be needed to meet short recovery time objectives.
An agent-in-guest approach allows for the recovery of a wide variety of operating systems and applications.
These can be captured at the primary site and replicated to the cloud-based MediaAgent in a deduplicated, efficient manner. Once replicated, the data can be held and restored in the event of a DR scenario or automatically
recovered to existing instances for more critical workloads.
Live Sync also allows you to replicate VMs to public cloud infrastructure. As of December 2016, Amazon EC2 is a
supported cloud infrastructure vendor for Commvault® v11. Live Sync combines the VM conversion feature with
incremental replication to provide a DR solution utilizing on-demand cloud infrastructure. As Live Sync to cloud
integrates with DASH Copy, highly efficient WAN replication is possible by reducing the amount of data being
replicated.
As cloud infrastructure is securely shared and protected, only a limited subset of integration points are available
for Commvault® software to integrate with Amazon EC2. As such, there are extra steps to convert virtual machines
and prepare them for recovery. For example, Amazon EC2 stipulates that all imported machines must run through
the VM import process – this can take several hours depending on the size of the VMs, increasing the RTO value if a
failover is required while conversion is taking place. This must be considered when designing Live Sync-based DR,
as each incremental update will require a full run-through of the import process, delaying the replication process.
As of June 2018 (Commvault V11 SP12), if a Commvault VSA proxy running inside an Amazon EC2 instance is used
as the destination VSA proxy for the Live Sync replication/conversion, the Commvault software will bypass the AWS
VM Import/Export process and will instead natively perform the VM conversion. This results in significant speed
increases compared to previous Commvault software releases.
As such, a good strategy is to identify multiple approaches depending on the business RTO/RPO requirements and
implement them accordingly, while also considering the limitations and requirements specific to the cloud vendor.
For example, Tier 1 applications may be a better fit for near-continuous replication using the Commvault CDR
technology, while Tier 2 applications could make use of Live Sync (VMs, Files, DB), and Tier 3 apps could use on-
demand VM conversion from cloud storage when needed.
Additional information on the conversion feature is accessible using the link below.
Commvault® Continuous Data Replicator (CDR) allows near real-time continuous data replication for critical workloads that must be recovered in adherence to Service Level Agreements that exceed the capabilities associated with Live Sync operations. These VMs require a similarly sized Amazon EC2 instance running in AWS to receive any
replicated data. In order for CDR to operate, an Amazon EC2 instance must be running at all times to receive
application changes. Additional information on CDR can be located using the link below.
• Continuous Data Replicator (CDR)
AMAZON-SPECIFIC WORKLOADS
With the release of Version 11, the Commvault® Virtual Server Agent allows the ability to easily perform direct
conversion of protected VMware or Hyper-V virtual machines into Amazon EC2 instances, from backups stored
either within Amazon S3 (any tier) or another Cloud Library or from an on-premises Disk Library.
This process could be used as part of a disaster recovery strategy using Amazon as a cold DR site, or as a
migration strategy (Lift-and-Shift).
Additional information on the Conversion feature can be located using the link below.
VIRTUAL MACHINE RECOVERY FROM AMAZON EC2 TO MICROSOFT AZURE
With Commvault® V11 SP7 or newer, you can now recover Amazon EC2 instances protected with the Virtual Server
Agent to an Azure Virtual Machine for disaster recovery or migration purposes. Currently streaming backups are
supported for recovery to either Azure Classic or Azure Resource Manager.
Additional information on the Conversion feature can be located using the link below.
VIRTUAL MACHINE RECOVERY FROM AMAZON EC2 TO VMWARE
With Commvault® V11 SP13 or newer, you can now recover Amazon EC2 instances protected with the Virtual Server Agent to VMware virtual machines for disaster recovery or migration purposes. The Amazon EC2 instance to be converted into a VMware virtual machine must be one that was previously converted by the Commvault software from VMware into Amazon EC2.
Additional information on the Conversion feature can be located using the link below.
For more information on the Commvault Personalization Services team, please contact Commvault or your
Commvault Partner Account team.
For more information on the Workflow engine, please refer to the Workflow Overview link.
AGENT-IN-GUEST (STREAMING)
An agent-in-guest approach is used to protect a wide variety of operating systems and applications. The agent runs on the production workload and protects data to the MediaAgent residing in AWS, leveraging client-side deduplication to reduce network consumption within the cloud. This data can also be replicated to a secondary
MediaAgent residing in a different geographic region. Once replicated, the data can be held and restored in the
event of a DR scenario or automatically recovered to existing instances for the more critical workloads.
• When you require granular-level protection and restoration features for applications – the Commvault®
iDataAgents can deliver granular-level protection for supported application workloads, such as SQL Server or
Oracle Database, in comparison to a Full VM or File-level approach.
ARCHITECTURE RECOMMENDATIONS
• Use of multiple readers to increase concurrency to the object storage target is recommended.
• Use of the Amazon S3 VPC Endpoint is highly recommended to improve throughput to/from Amazon S3 buckets.
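The Amazon S3 VPC Endpoint recommended above is a standard AWS Gateway endpoint configured in AWS rather
than in Commvault. As an illustration only, a minimal boto3 sketch of creating one is shown below; the region, VPC
ID and route table ID are placeholder values to substitute with your own.

    # Minimal sketch: create an S3 Gateway VPC endpoint so MediaAgent traffic
    # to Amazon S3 stays on the AWS network. IDs below are placeholders.
    import boto3

    REGION = "us-east-1"  # assumption: replace with your region
    ec2 = boto3.client("ec2", region_name=REGION)

    response = ec2.create_vpc_endpoint(
        VpcEndpointType="Gateway",
        VpcId="vpc-0123456789abcdef0",             # placeholder VPC ID
        ServiceName=f"com.amazonaws.{REGION}.s3",  # S3 service in this region
        RouteTableIds=["rtb-0123456789abcdef0"],   # placeholder route table
    )
    print(response["VpcEndpoint"]["VpcEndpointId"])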
In addition to the standard agent-in-guest streaming approach, for supported configurations the agent can
integrate the Amazon EBS snapshot facility into a seamless framework that combines fast storage-based
snapshots with application consistency, without the need for a collection of scripts and disparate snapshot, backup
and recovery tools.
While the Snapshot-based Agent-in-Guest approach is currently only supported for Oracle on Linux, SAP HANA on
Linux, and the Linux File System iDataAgent, the agents themselves are identical – each is capable of both the
Streaming and Commvault IntelliSnap® backup approaches; IntelliSnap only needs to be configured.
The agent is deployed on the production Amazon EC2 instance, orchestrating work between the target workload
and the Amazon EBS API, with additional support for extracting data from those snapshots to be stored in a
deduplicated, compressed format within Amazon S3 Standard or Amazon S3-IA buckets for long-term retention at
a lower cost point. This process of extracting from the snapshot is known as a Backup Copy job. Combining Amazon
EBS and Amazon S3 storage options and redundancy offerings ensures that consumption costs are minimized while
also protecting against the failure of single Availability Zones and Regions.
WHEN TO USE SNAPSHOT-BASED AGENT-IN-GUEST APPROACH INSTEAD OF THE STREAMING BASED APPROACH:
• When fast, snapshot-based recovery is required – use the snapshot-based agent-in-guest to leverage Amazon
EBS snapshots while maintaining application consistency and granular-level recovery options for workloads that
require fast recovery.
• When application-consistent backups with a shorter backup window are required – the snapshot-based agent-in-
guest approach allows you to execute fast backups, reducing the amount of time normally reserved for a backup
window in order to extract the blocks.
• When minimal load against the production server is required – the snapshot-based agent-in-guest approach
operates with minimal load against the production host and can be instructed to perform the Backup Copy
(extracting from snapshot to Amazon S3 Standard or Amazon S3-IA buckets) using a separate Amazon EC2
instance as a proxy.
ARCHITECTURE REQUIREMENTS FOR SNAPSHOT-BASED AGENT-IN-GUEST:
• Minimum 1x iDataAgent per Amazon EC2 instance for the intended dataset (i.e. Oracle, File). Multiple iDataAgents
can be deployed on the same Amazon EC2 instance.
• Amazon must be configured as an “array” in Commvault under Array Management. For more information, see
Setting Up the Amazon Array Using Array Management using the Commvault online documentation.
• Any selected proxy must be in the same Availability Zone and Region, both to access any volumes created from
snapshots and to achieve the best performance.
• (Backup Copy) Minimum 1x MediaAgent per region. MediaAgents connect to the target object storage, and
can either be deployed on the same Amazon EC2 instance as the client, or on a dedicated host for a fan-
in configuration. The Amazon EC2 instance specifications of the MediaAgent should match the MediaAgent
specifications within this Architecture Guide.
• Check the Systems Requirements section in Books Online to determine if the iDataAgent supports your
application: documentation.commvault.com/commvault/v11_sp14/article?p=60237.htm.
ARCHITECTURE RECOMMENDATIONS
• For AWS environments with multiple accounts you can configure a virtualization client for Amazon Web Services
to use a separate Admin account for data protection operations. This approach reduces the impact of backup
operations and restore operations on production accounts. For more information on Cross-Account operations,
see documentation.commvault.com/commvault/v11_sp15/article?p=108887.htm.
• While Amazon EBS snapshots are independent of the original volume, they are still only redundant within a single
Availability Zone. Extracting the data from the snapshot into object storage is highly recommended to gain higher
redundancy, whether just single region (store in Amazon S3 Standard or Amazon S3-IA buckets) or multiple
regions (store in an Amazon S3 Standard or Amazon S3-IA bucket, or DASH Copy/replicate to another region).
• (Backup Copy) Use of multiple readers to increase concurrency to the object storage target is recommended.
• (Backup Copy) Use of the Amazon S3 VPC Endpoint is highly recommended to improve throughput to/from
Amazon S3 buckets.
NOTES
• The Oracle iDataAgent, when in a Commvault IntelliSnap configuration, will only snapshot Amazon EBS volumes
that contain the target database. At backup time, the agent automatically interrogates the Oracle instance to map
the data and log destinations back to OS mount paths, then resolves each path to the Amazon EBS volume ID used
during the snapshot API call.
• For more information on using Commvault IntelliSnap® with Oracle, please consult the Books Online section:
http://documentation.commvault.com/commvault/v11/article?p=35532.htm.
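For illustration of the EBS-level steps this note describes, the boto3 sketch below resolves the volume attached at
a given device on an instance and snapshots it. The instance ID, device name and region are placeholders; in
practice, Commvault IntelliSnap performs this orchestration automatically.

    # Minimal sketch of the EBS-level steps described above: find the volume
    # attached at a given device on an instance, then snapshot it.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

    INSTANCE_ID = "i-0123456789abcdef0"  # placeholder instance
    DEVICE = "/dev/sdf"                  # placeholder: device backing the DB mount

    vols = ec2.describe_volumes(Filters=[
        {"Name": "attachment.instance-id", "Values": [INSTANCE_ID]},
        {"Name": "attachment.device", "Values": [DEVICE]},
    ])["Volumes"]

    for vol in vols:
        snap = ec2.create_snapshot(
            VolumeId=vol["VolumeId"],
            Description=f"Snapshot of {DEVICE} on {INSTANCE_ID}",
        )
        print("Created", snap["SnapshotId"], "from", vol["VolumeId"])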
AGENT-LESS AMAZON EC2 INSTANCE PROTECTION (VIRTUAL SERVER AGENT FOR AWS)
Introduced in Version 11 Service Pack 3, the Virtual Server Agent for AWS (VSA for AWS) delivers an agent-less,
block-level capture of Amazon EC2 instances and their attached Amazon EBS volumes. Restoration options include
both Full Virtual Machine recovery and granular-level file recovery.
Commvault® software does not require access to the AWS hypervisor, instead using the Amazon EC2/EBS REST
APIs to create a Full VM Clone (AMI creation) of each Amazon EC2 instance, then attaching the Amazon EBS
volumes to a nominated proxy (an Amazon EC2-based VSA/MediaAgent) to read and deduplicate the blocks before
writing them out to an Amazon S3 bucket.
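At the API level, this flow corresponds roughly to the boto3 sketch below. All identifiers are placeholders, and the
actual orchestration, block reading and deduplication are performed by the Commvault software rather than by a
script like this.

    # Rough sketch of the API-level flow: clone the instance to an AMI (which
    # snapshots its EBS volumes), then surface a snapshot as a volume on the
    # VSA/MediaAgent proxy for block-level reading.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

    SOURCE_INSTANCE = "i-0123456789abcdef0"  # placeholder: instance to protect
    PROXY_INSTANCE = "i-0fedcba9876543210"   # placeholder: VSA/MediaAgent proxy

    # 1. Full VM Clone: create an AMI, which snapshots the attached EBS volumes.
    ami = ec2.create_image(InstanceId=SOURCE_INSTANCE,
                           Name="vsa-backup-clone-example",  # placeholder name
                           NoReboot=True)
    ec2.get_waiter("image_available").wait(ImageIds=[ami["ImageId"]])

    # 2. Surface a snapshot as a volume and attach it to the proxy, where the
    #    blocks can be read and deduplicated.
    image = ec2.describe_images(ImageIds=[ami["ImageId"]])["Images"][0]
    for mapping in image["BlockDeviceMappings"]:
        snap_id = mapping.get("Ebs", {}).get("SnapshotId")
        if not snap_id:
            continue
        vol = ec2.create_volume(SnapshotId=snap_id,
                                AvailabilityZone="us-east-1a")  # proxy's AZ
        ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
        ec2.attach_volume(VolumeId=vol["VolumeId"],
                          InstanceId=PROXY_INSTANCE,
                          Device="/dev/sdf")  # placeholder device name
        break  # sketch: handle the first volume only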
Commvault IntelliSnap® functionality for the VSA for AWS modifies this behavior by simply creating an Amazon EBS
snapshot and retaining the snapshot based on the Storage Policy’s Snap Primary retention setting.
This has the effect of reducing the amount of time required to protect data, providing fast, snapshot-based
restoration capability and offloading the task of extracting blocks from the snapshot into Amazon S3 for longer-
term retention through the Backup Copy operation.
On snapshot cost: use of the Commvault IntelliSnap method does mean that snapshots remain available for longer
than the backup window; however, we recommend that snapshots be retained only for as long as required. The
Snap Primary can be configured to retain as few as one snapshot, keeping snapshot costs at a minimum while
providing fast backup and restoration capabilities.
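Snapshot retention itself should be managed through the Snap Primary settings; to independently keep an eye on
snapshot storage accumulating in an account, a minimal boto3 sketch such as the one below (the region and age
threshold are assumptions) can list snapshots older than a chosen age.

    # Minimal sketch: list EBS snapshots owned by this account that are older
    # than MAX_AGE_DAYS, to keep an eye on snapshot storage costs.
    import boto3
    from datetime import datetime, timedelta, timezone

    MAX_AGE_DAYS = 7  # assumption: align with your Snap Primary retention
    cutoff = datetime.now(timezone.utc) - timedelta(days=MAX_AGE_DAYS)

    ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region
    paginator = ec2.get_paginator("describe_snapshots")

    for page in paginator.paginate(OwnerIds=["self"]):
        for snap in page["Snapshots"]:
            if snap["StartTime"] < cutoff:
                print(snap["SnapshotId"], snap["StartTime"],
                      snap["VolumeSize"], "GiB")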
ARCHITECTURE RECOMMENDATIONS
• For AWS environments with multiple accounts, you can configure a virtualization client for Amazon Web Services
to use a separate Admin account for data protection operations. This approach reduces the impact of backup
and restore operations on production accounts. For more information on Cross-Account operations, see
documentation.commvault.com/commvault/v11_sp15/article?p=108887.htm.
• Use of the Commvault IntelliSnap® configuration is highly recommended (requires Version 11 Service Pack 4,
available as of June 2016, or later) to improve backup and restore times; see the note on snapshot cost above.
• By default, VSA backups of Amazon EC2 instances are crash-consistent. To get application-consistent backups,
leverage the AppAware feature available as of Commvault® Version 11 Service Pack 7, released in March 2017.
Additional information regarding AppAware can be found in Books Online: documentation.commvault.com/
commvault/v11_sp15/article?p=14241.htm.
• If the AMI was sourced from the AWS Marketplace, any volumes that were deployed as part of that AMI cannot be
targeted for snapshots. Only new volumes created and attached post-instance launch can be snapshotted. This is
by Amazon Web Services design.
• (Backup Copy / Streaming) Use of the Amazon S3 VPC Endpoint is highly recommended to improve throughput
to/from Amazon S3 buckets.
• (Backup Copy / Streaming) Configuring more than 10 readers for a single VSA for AWS proxy may cause snapshot
mount operations to fail. Consider scaling out with multiple VSA proxies if higher concurrency is required.
• (Backup Copy / Streaming) Use of “Provisioned IOPS SSD” (io1) EBS volumes for the deduplication database used
by the MediaAgent module is highly recommended for optimal performance.
• (Backup Copy / Streaming) Disable Granular Recovery of files if granular recovery is not required, or agents-in-
guest are used to collect large file system datasets. This will improve the backup window by removing the need
to ‘walk’ the file system structure within the Amazon EBS volumes.
• (Backup Copy / Streaming) To restore from advanced Linux file systems such as EXT4, XFS and others, you can
deploy a file recovery enabler by converting an existing Linux MediaAgent to a FREL. When browsing data on
advanced file systems, Commvault® software will leverage the FREL to access the file system. Introduced in
December 2018 with Commvault V11 SP14, the FREL can now be deployed from an AWS Marketplace AMI.
Below is a standard benchmark comparison of backup methods (5 instances, 1 TB used data, 20% incremental
change rate, run over a 4-job schedule) to help illustrate operational outcomes with more typical enterprise
workloads running in an IaaS environment. The AWS example chart below is based on Commvault V11 SP14 testing
using Amazon EBS “Provisioned IOPS SSD” (io1) volumes.
These tests reveal that agent-less snapshots move data from the Amazon EBS volumes attached to Amazon EC2
instances into a non-selectable Amazon S3 bucket, and then rehydrate it from that shared bucket back onto the
selected Amazon EBS io1 volumes attached to the Commvault VSA; as a result, the initial full backup can take an
extended time. Subsequent backups via the Amazon EBS snapshot method are subject to the same data movement
even though the changes are incremental, and, coupled with the lack of change block tracking (CBT), incremental
agent-less backups also take considerable time. Only once this movement of data from Amazon EBS to Amazon S3
and back to Amazon EBS is complete can the Commvault platform retrieve, deduplicate, and back up the data to a
target.
Due to these architectural limitations, the best approach to protect data is to utilize a hybrid protection strategy
that comprises:
• Utilize Commvault IntelliSnap for AWS to create low RPO and RTO point-in-time backups for Amazon EC2
instances and Amazon EBS volumes by utilizing snapshots. These can be used to quickly recover entire Amazon
EC2 instances and entire Amazon EBS volumes. These provide an RPO of minutes to hours, depending on
frequency of snapshots. To keep costs low, these snapshots can be retained for a short period of time ranging
from hours to days purely dependent on business requirements and cost.
• Simultaneously utilize block-level backups using agents-in-guest to protect the larger file systems and
applications on the same Amazon EC2 instances and Amazon EBS volumes as above, with a typical RPO of 12-24
hours. These streaming backups move only incremental block changes at the file system level and, coupled with
source-side deduplication, minimize data movement. The data can be stored in a deduplicated format at a
destination such as Amazon S3-IA storage for a longer retention period ranging from days to months. These
backups can be further migrated to lower-cost storage such as Amazon S3 Glacier for elongated retention.
• For small Amazon EC2 instances, such as t2.micro, or instances that are running single applications (or near
stateless operations) on the operating system drives and therefore do not have the additional disk capacity or
CPU to support agents-in-guest, it is recommended to utilize Commvault IntelliSnap snapshots of these Amazon
EC2 instances. The snapshots can then be mounted on a Commvault VSA proxy and then have the data streamed
off of the snapshots for a complete agentless approach. For smaller instances and instances with minimal
change rates, this approach will prove to be most efficient.
• For remote office locations, small cloud environments, roaming devices such as laptops, and any architecture
where implementing a traditional or cloud-based MediaAgent is unfeasible or cost-prohibitive, backups can be
sent directly from the source to a cloud target such as Amazon S3, completely bypassing the MediaAgent. This
is achieved by installing and enabling the Storage Accelerator feature on the client for direct communication
with the storage target, which speeds up backups and reduces costs in these situations.
This type of hybrid architecture will provide protection from multiple types of scenarios – entire instances and
volumes can be recovered near instantaneously while granular recovery of individual files, folders and application
data can be performed quickly from the Commvault platform, utilizing browsing of indexed data without having to
stage and recover the entire volume first. This hybrid approach also maintains a balance of costs since snapshot
retention and life-cycling can be managed via Commvault IntelliSnap and the data in the longer-term retention
storage can be stored in a deduplicated format. Some Amazon EC2 instances may use a single approach
exclusively, depending on the requirements for backup speed, RPO, RTO and retention. To further investigate and
discuss this architecture, please work with your Commvault Pre-sales Systems Engineer.
The Amazon Relational Database Service (RDS) offers customers access to databases without having to manage
the infrastructure supporting the databases. However, eliminating the need to administer the underlying operating
system also comes with a drawback. AWS restricts users from accessing the operating system and restricts
privileged connections to the databases. This limits the options that are available for database protection.
SQL CONSISTENT STREAMING BACKUP OF AMAZON RDS SQL INSTANCES
While most Amazon RDS offerings can only be protected via snapshots (and consequently the Commvault VSA), an
Amazon RDS SQL instance can also be protected via the Commvault SQL Server iDataAgent in addition to
snapshots (as of Commvault V11 SP12, released in June 2018). This method does not utilize Amazon RDS
snapshots, but instead performs an export of the SQL Server database. This export can be used to port the data to
another Amazon RDS instance, as well as between SQL Server databases that may reside either on-premises or in
the cloud. While this method provides the flexibility of a portable backup, it will generally be slower than the
snapshot orchestration outlined earlier. Please check Commvault online documentation for details regarding this
Amazon RDS SQL instance protection method: Backup and Restore Amazon RDS SQL Databases.
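For reference, the RDS-native snapshot protection that the VSA orchestrates rests on a single API call per
instance; a minimal boto3 sketch is shown below with placeholder identifiers.

    # Minimal sketch: the RDS-native snapshot call that underlies snapshot-
    # based protection of Amazon RDS instances. Identifiers are placeholders.
    import boto3
    from datetime import datetime, timezone

    rds = boto3.client("rds", region_name="us-east-1")  # placeholder region

    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    rds.create_db_snapshot(
        DBInstanceIdentifier="my-sqlserver-instance",  # placeholder instance
        DBSnapshotIdentifier=f"manual-{stamp}",        # unique snapshot name
    )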
ALTERNATIVE PROTECTION AND MIGRATION METHODS FOR AMAZON RDS ORACLE INSTANCES
The Commvault software includes workflows to allow the migration of Amazon RDS Oracle instances to either a
Microsoft Azure Database or to an Oracle instance running in a Microsoft Azure VM. This provides flexibility for
your Oracle workloads and enables a multi-cloud strategy. Please see Commvault online documentation for details:
Amazon RDS Database Migration to an Azure Database.
In addition to the Amazon RDS snapshot protection method described in the previous section, as of Commvault
V11 SP13 (September 2018) an Amazon RDS Oracle instance can also be protected using the Oracle Data Pump
utility installed on a Windows proxy computer. The protected Oracle database can subsequently be restored to
an Amazon RDS Oracle instance, a traditional Oracle database instance running in an Amazon EC2 instance, or to
a traditional Oracle database instance running on-premises or in another Cloud provider. For more details, see
Commvault online documentation: Amazon RDS Oracle Protection Using Oracle Native Export Utility.
AMAZON S3 BACKUP
As of December 2016, Commvault® Version 11 includes the ability to perform backups of Amazon S3 object storage
created by third-party applications. This capability allows Commvault® software to capture the data contained
inside an Amazon S3 bucket, allowing restore back to an Amazon S3 bucket or a file system client.
Additional information on Amazon S3 backups can be located using the link below.
• Amazon S3 Backup
ARCHITECTURE RECOMMENDATIONS:
• For large datasets, consider using multiple subclients to increase scan performance and lower the amount of
time taken to traverse the bucket contents.
• Configure data operations for multi-streaming using multiple readers for best performance.
• To improve backup performance, we recommend you disable bucket logging, or redirect the logging to another
bucket, in a user-defined subclient (see the sketch below).
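The logging redirection in the last recommendation is standard Amazon S3 server access logging configuration
rather than a Commvault setting; a minimal boto3 sketch with placeholder bucket names is shown below. Note that
the target bucket must grant log delivery permissions.

    # Minimal sketch: redirect S3 server access logging for the protected
    # bucket to a separate logging bucket (bucket names are placeholders).
    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_logging(
        Bucket="protected-data-bucket",  # bucket being backed up
        BucketLoggingStatus={
            "LoggingEnabled": {
                "TargetBucket": "access-logs-bucket",  # separate logging bucket
                "TargetPrefix": "protected-data-bucket/",
            }
        },
    )
    # To disable logging entirely, pass an empty BucketLoggingStatus:
    # s3.put_bucket_logging(Bucket="protected-data-bucket",
    #                       BucketLoggingStatus={})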
APPLICATION MIGRATION
Commvault can assist in application migration efforts when shifting from on-premises facilities to public cloud
providers such as AWS. By leveraging the power of the data management platform, workloads can be migrated
through a number of methods.
The Virtual Server Agent can capture virtual machines from VMware and Hyper-V based platforms in an
application-consistent manner (VSS calls / VMware Guest Tools hooks / Hyper-V Integration Tools) to ensure that a
consistent image of the guest, and the applications residing within it, is captured correctly.
With this image, the Virtual Server Agent can then restore and convert (VMware and Hyper-V) the Virtual Machine
into Amazon EC2 instances directly, and the process can handle single or multiple Virtual Machines.
This process is performed interactively through the CommCell® Console, via Commvault® Workflow or API calls.
Introduced in Version 11 Service Pack 5, the Oracle iDataAgent can now perform a migration of an Oracle instance
from a physical or virtual machine on-premises into an Amazon EC2 instance.
The process requires the Oracle iDataAgent for Linux to be deployed on the source machine, at which point the
migration process automates the remaining steps.
For more information regarding the migration of an Oracle instance into an Amazon EC2 instance, see Commvault
online documentation: Oracle Database Application Migration to an Amazon EC2 Instance.
The Commvault software also can migrate a traditional Oracle database instance into an Amazon RDS Oracle
instance. See Commvault online documentation for more details: Oracle Database Application Migration to an
Amazon RDS Database.
Introduced in Version 11 Service Pack 7, the Microsoft SQL Server iDataAgent can now perform a migration of a
SQL Server database from a physical or virtual machine on-premises into an Amazon EC2 instance. See Commvault
online documentation: Microsoft SQL Database Migration.
APPLICATION OUT-OF-PLACE RESTORE (ALL SUPPORTED PLATFORMS)
All application iDataAgents support the capability to restore a given source dataset out-of-place to an alternate
location. In this method, data is captured from the source system (physical or virtual), and a restore to the
destination is then submitted either directly from the source copy or from a copy replicated to the cloud (DASH
Copy).
The process requires the supported iDataAgent to be deployed on both the source instance, and the destination
Amazon EC2 instance.
This process is performed interactively through the CommCell® Console, via Commvault® Workflow or API calls.
DEPLOYMENT
INSTALLATION BASICS
The following links cover the steps when installing the CommServe® in the cloud. This is only needed when the
primary CommServe will be running on the hosted cloud VM or used for DR recovery. Multiple modules can be
deployed in a single installation pass to streamline deployment.
• Installation Overview
• Installing the CommServe
• Installing the MediaAgent
• Installing the Virtual Server Agent (Amazon)
The following link covers CommServe® DR Solution comparisons for building a standby DR CommServe in the
cloud, or simply restoring on-demand (DR Backup restore): CommServe Disaster Recovery.
For more information, please refer to the Installing the Custom Package instructions within Books Online:
• Custom Packages
AUTOMATING DEPLOYMENT WITH CONTINUOUS DELIVERY
For environments using Continuous Delivery toolsets such as Puppet, Chef or Ansible, Commvault® supports
deployment methods that allow administrators to both control agent deployment and configuration to provide an
automated deploy-and-protect outcome for applications and servers.
For more information on creating an unattended installation package for inclusion in a recipe, please refer to the
Unattended Installation guide within Commvault® Books Online:
• Unattended Installation
For more information on using Commvault® software’s XML / REST API interface to control configuration post-
deployment, please refer to the online documentation links below to review options available for each iDataAgent:
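As a hypothetical example of driving configuration over the REST interface, the sketch below authenticates
against the Commvault Web Server Login endpoint and reuses the returned token. The host, credentials and exact
endpoint paths are assumptions that should be verified against Books Online for your service pack.

    # Hypothetical sketch: authenticate to the Commvault REST API and reuse
    # the token for follow-up configuration calls. Verify endpoint paths and
    # payload shapes in Books Online; host and credentials are placeholders.
    import base64
    import requests

    BASE = "http://webconsole.example.com/webconsole/api"  # placeholder host

    resp = requests.post(
        f"{BASE}/Login",
        headers={"Accept": "application/json",
                 "Content-Type": "application/json"},
        json={
            "username": "admin",  # placeholder
            "password": base64.b64encode(b"secret").decode(),  # Base64-encoded
        },
    )
    resp.raise_for_status()
    token = resp.json()["token"]

    # Subsequent calls pass the token in the Authtoken header, for example:
    clients = requests.get(f"{BASE}/Client",
                           headers={"Accept": "application/json",
                                    "Authtoken": token})
    print(clients.status_code)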
For most on-premises backup use cases (except for very small environments limited to 100 GB in payload size),
cloud as a direct storage target for the primary copy is not recommended. For performance and responsiveness,
a primary copy should be stored on an on-site disk library and a secondary copy should be hosted on the cloud
storage. The secondary copy should be setup as an encrypted network optimized DASH copy to the cloud.
The link below lists all of the supported direct cloud storage targets.
The link below covers cloud storage target setup and management.
Depending upon your cloud device type you may choose to verify the compatibility between:
For devices that are not publicly accessible, please contact your account manager.
ARCHITECTURE SIZING
Start with the smallest category (Small or Extra-Small) in the tables below for the Amazon EC2 instance size and
Amazon EBS volume sizes that fit your situation, and upgrade as needed to meet your environment’s requirements
(e.g., start with an Extra-Small MediaAgent, then upgrade it to a Small MediaAgent as you onboard more client
computers).
For CommServe servers, adjust the Amazon EC2 instance size upwards when CPU and RAM loads become
consistently high; add more space to the Amazon EBS volumes as needed to accommodate the size of the
CommServe database as you add more client computers and jobs to your CommCell. CPU and RAM load can be
monitored in the AWS Management Console, or in the CommCell Command Console using the Infrastructure Load
Report (see Commvault documentation for details: http://documentation.commvault.com/commvault/v11_sp14/
adminconsole/article?p=105205.htm).
For MediaAgent servers, monitor CPU and RAM utilization and network I/O on the Amazon EC2 instance using
AWS native tools such as the AWS Management Console. Upgrade to a larger Amazon EC2 instance when you start
encountering bottlenecks in any of these resources. Consult the CommCell Health Report to view the performance
of the DDB. If the DDB is experiencing degraded performance, consider adding more capacity to the Amazon EBS
volume holding the DDB or move the DDB to a higher tier of Amazon EBS volume. For more details about the Health
Report view in the CommCell Command Console, see Commvault documentation: http://documentation.commvault.
com/commvault/v11_sp14/adminconsole/article?p=38680.htm. The Health Report view can be downloaded from
the Commvault Store.
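As a simple illustration of the AWS-native monitoring described above, the boto3 sketch below retrieves 24 hours
of average CPU utilization for a MediaAgent instance from Amazon CloudWatch; the instance ID and region are
placeholders.

    # Minimal sketch: pull 24 hours of hourly average CPU utilization for a
    # MediaAgent EC2 instance from CloudWatch (identifiers are placeholders).
    import boto3
    from datetime import datetime, timedelta, timezone

    cw = boto3.client("cloudwatch", region_name="us-east-1")
    end = datetime.now(timezone.utc)

    stats = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=end - timedelta(hours=24),
        EndTime=end,
        Period=3600,            # one datapoint per hour
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], f'{point["Average"]:.1f}%')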
Avoid using Amazon EC2 instance types with local NVMe SSD volumes (such as the R5d, M5d and C5d instance
types), as these local NVMe volumes will not retain their data if the Amazon EC2 instance is powered off or
terminated, which makes them unsuitable choices for the Index Cache or DDB. If choosing an Amazon EC2 type
other than ones listed in the tables below, choose an Amazon EC2 instance type which is “EBS optimized” to ensure
good performance with the Index Cache and DDB.
Consult the AWS Cost Calculator and AWS documentation for the best choice of Amazon EC2 instances for your
AWS region and workload, as Amazon EC2 instance types are frequently evolving.
AWS COMMSERVE® SPECIFICATIONS
SMALL
• c5.xlarge EC2 instance (4 vCPU, 8 GB RAM), m5a.xlarge EC2 instance (4 vCPU, 16 GB RAM), m5.xlarge EC2
instance (4 vCPU, 16 GB RAM), or m4.xlarge EC2 instance (4 vCPU, 16 GB RAM)
• 1x 100 GB EBS “General Purpose SSD” (gp2) volume for CS Software & CSDB
• Windows Server 2012 R2 or Windows Server 2016 (Commvault® V11 SP7+)

MEDIUM
• c5.2xlarge EC2 instance (8 vCPU, 16 GB RAM), m5a.2xlarge EC2 instance (8 vCPU, 32 GB RAM), m5.2xlarge EC2
instance (8 vCPU, 32 GB RAM), or m4.2xlarge EC2 instance (8 vCPU, 32 GB RAM)
• 1x 150 GB EBS “General Purpose SSD” (gp2) volume for CS Software & CSDB
• Windows Server 2012 R2 or Windows Server 2016 (Commvault® V11 SP7+)

LARGE
• c5.4xlarge EC2 instance (16 vCPU, 32 GB RAM), m5a.4xlarge EC2 instance (16 vCPU, 64 GB RAM), m5.4xlarge
EC2 instance (16 vCPU, 64 GB RAM), or m4.4xlarge EC2 instance (16 vCPU, 64 GB RAM)
• 1x 300 GB EBS “General Purpose SSD” (gp2) volume for CS Software & CSDB
• Windows Server 2012 R2 or Windows Server 2016 (Commvault® V11 SP7+)

EXTRA LARGE
• m5a.4xlarge EC2 instance (16 vCPU, 64 GB RAM), m5.4xlarge EC2 instance (16 vCPU, 64 GB RAM), or m4.4xlarge
EC2 instance (16 vCPU, 64 GB RAM)
• 1x 300 GB EBS “Provisioned IOPS SSD” (io1) volume @ 2000 IOPS for CS Software & CSDB
• Windows Server 2012 R2 or Windows Server 2016 (Commvault® V11 SP7+)
AWS MEDIAAGENT SPECIFICATIONS
EXTRA SMALL – 60 TB BET
SMALL – 120 TB BET
MEDIUM – 240 TB BET
LARGE – 600 TB BET
EXTRA LARGE – 800 TB BET
Important: EBS-optimized instances are recommended as they provide dedicated network bandwidth for EBS
volumes, improving deduplication and Index Cache performance and freeing up bandwidth to send/receive from
clients, other MediaAgents, and Amazon S3 endpoints.
R5 and M5 Amazon EC2 instance types provide higher network bandwidth and lower costs than M4 instance types.
It is recommended to deploy multiple smaller instances for MediaAgents rather than a single larger instance, to
avoid disruptions to backup/restore operations in the event of an instance becoming non-functional.
The back-end terabyte (BET) and front-end terabyte (FET) sizing for maximum capacity is based on a 512 KB
deduplication block size, which is the default for writing to cloud libraries.
Note: For more detailed information, please refer to the following link:
http://documentation.commvault.com/commvault/v11/article?p=1647_1.htm.
EXAMPLE OF SIZING IN CLOUD
It is also assumed that the daily change rate is approximately 2%, or roughly 2 TB of net new data created per day
before it is compressed and deduplicated. The change rate varies greatly by environment and, coupled with the
retention period and the deduplication ratio of the data (both of which are also highly environment-specific),
these three factors determine the back-end storage capacity that is ultimately required. For this example, we
shall assume 90 days of retention and an approximately 5:1 deduplication ratio; both are typically observed within
most virtual environments running a mix of Windows and Linux operating systems. With retention, it is important
to note that data that is 90 days old, including the first full backup, is not deleted until the most recent full is
fully committed to the backup target; this accounts for the retention+1 storage requirement. This results in
approximately 117 TB used for back-end capacity.
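As a first-order sanity check of these numbers (and only that; Commvault sizing tools model additional factors),
the arithmetic below shows the logical data under management implied by the stated assumptions and the overall
reduction implied by the ~117 TB back-end figure.

    # First-order check of the worked example. Per-job deduplication ratios
    # do not map one-to-one onto the final back-end figure, so this shows the
    # shape of the calculation, not Commvault's exact sizing model.
    front_end_tb = 100      # assumed protected baseline (FET)
    daily_change_tb = 2     # ~2% of 100 TB per day, before reduction
    retained_days = 90 + 1  # 90-day retention plus one extra cycle

    logical_tb = front_end_tb + daily_change_tb * retained_days
    back_end_tb = 117       # figure quoted in this guide

    print(f"Logical data under management: {logical_tb} TB")  # 282 TB
    print(f"Implied overall reduction: "
          f"{logical_tb / back_end_tb:.1f}:1")                # ~2.4:1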
It is also assumed that absolutely no infrastructure to manage the cloud protection environment is present
outside the cloud, and that the Commvault Cloud MediaAgent power management feature is enabled, allowing
resources to be shut down when backups and recoveries are not occurring. Most backup windows are around 8
hours; assuming restore jobs account for another 4 hours per day, power management can operate for half of a
given day.
Using publicly available pricing for AWS resources (as of January 2019) the cost of performing protection in AWS by
utilizing any combination of iDataAgents and coupled with Commvault IntelliSnap for AWS snapshot management,
the following becomes a rough estimate of the cost of the AWS infrastructure required for a period of 1 year:
COMMVAULT COMPONENT   QTY.   AWS RESOURCE TYPE                        AWS COST/HR OR /GB   AWS COST/YEAR   AWS COST W/ POWER OFF
CommServe Medium VM   1      c5.2xlarge EC2 reserved instance, Win    $0.568               $4,975.00       $4,975.00
CommServe OS Disk     1      150 GB gp2 EBS volume                    $0.100               $180.00         $180.00
*Note that this is a sample configuration utilizing estimated sizing data; actual costs will vary depending on data
type, retention and numerous other factors. This assumes the environment has scaled up to 100 TB FET; starting
with a much smaller footprint and growing as the source grows is perfectly acceptable.
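The yearly figures in the table follow directly from the published rates; as a quick arithmetic check (the reserved
CommServe instance runs continuously, so its cost is unchanged by power management):

    # Quick arithmetic check of the cost table rows (AWS rates, January 2019).
    HOURS_PER_YEAR = 8760

    commserve_hourly = 0.568  # c5.2xlarge EC2 reserved instance, Windows
    print(f"CommServe/yr: ${commserve_hourly * HOURS_PER_YEAR:,.2f}")  # ~$4,975.68

    gp2_per_gb_month = 0.10   # gp2 EBS volume, $/GB-month
    os_disk_gb = 150
    print(f"OS disk/yr: ${gp2_per_gb_month * os_disk_gb * 12:,.2f}")   # $180.00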
ADDITIONAL RESOURCES
DOCUMENTATION
The Cloud Storage section from the Commvault® Books Online documentation covers technical procedures and
information on Supported Cloud Targets, Advanced procedures, Troubleshooting and FAQ sections for Commvault
customers.
VIDEOS
https://www.youtube.com/watch?v=H5MOG0NZV0Q
Focuses on creating an Amazon S3 container and configuring as a Cloud Library within Commvault® v11.
https://www.youtube.com/watch?v=h3L-FDdSD7w
Technical showcase for the Virtual Server Agent for AWS, demonstrating Commvault IntelliSnap® functionality
(Version 11 SP4).
©1999-2019 Commvault Systems, Inc. All rights reserved. Commvault, Commvault and logo, the “C hexagon” logo, Commvault Systems, Commvault
HyperScale, ScaleProtect, Commvault OnePass, GridStor, Vault Tracker, IntelliSnap, CommServe, CommCell, APSS, Commvault Edge, Commvault
GO, Commvault Advantage, Commvault Complete, Commvault Activate, Commvault Orchestrate, and CommValue are trademarks or registered
trademarks of Commvault Systems, Inc. All other third party brands, products, service names, trademarks, or registered service marks are
the property of and used to identify the products or services of their respective owners. All specifications are subject to change without notice.