IBM Spectrum Protect In-the-Cloud Deployment Guidelines with Microsoft Azure V1.3
In-the-Cloud Deployment
Guidelines with Microsoft Azure
James Damgar
Daniel Benton
Jason Basler
IBM Spectrum Protect Performance Evaluation
© Copyright International Business Machines Corporation 2018, 2020
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
CONTENTS
Contents
List of Figures
List of Tables
Introduction
1.1 Purpose of this Paper
1.2 Considerations for Disk-to-Cloud Tiering Versus Direct-to-Cloud Data Movement
1.2.1 Cloud Accelerator Cache Considerations
1.2.2 Workload Limitations and Considerations with Tiering
1.3 Cloud Deployment Patterns
1.4 Cloud Environment Considerations
1.4.1 Importance of Adequate Sizing
1.4.2 Linux Logical Volume Manager (LVM)
1.5 References to Physical IBM Spectrum Protect Blueprints
1.6 Database Backup to Object Storage
1.6.1 Tuning Database Backup Operations to Object Storage
1.7 Server Maintenance Scheduling Considerations
1.8 Session Scalability by Blueprint Size
Microsoft Azure Configurations
2.1 Design Considerations for Microsoft Azure Instances
2.1.1 Considerations for Direct-to-Cloud Architectures
2.1.2 Sizing the Cloud Accelerator Cache
2.1.3 Microsoft Azure: Large Instance Considerations
2.1.4 Microsoft Azure: Medium and Small Instance Considerations
Throughput Measurements and Results
3.1 Dataset Descriptions
3.2 Backup and Restore Measurements
3.2.1 Microsoft Azure: Large Instance Measurements
Appendix
Disk Benchmarking
Object Storage Benchmarking
Instance and Object Storage: Navigating the Microsoft Azure Portal
References
Notices
Trademarks
LIST OF FIGURES
Figure 4: Sizing the cloud accelerator cache for Microsoft Azure
Figure 5: Microsoft Azure large configuration; database volume average throughput; 8 KiByte random writes/reads
Figure 6: Microsoft Azure large configuration; database volume average IOPS; 8 KiByte random writes/reads
Figure 7: Microsoft Azure large configuration; cloud cache volume average throughput; mixed 256 KiByte writes and reads
LIST OF TABLES
Table 1: IBM Spectrum Protect physical Blueprint targets (V4.2, Linux x86)
Table 2: Preferred ranges of maximum values for client session counts
Table 10: Microsoft Azure, large configuration, 128 MiByte VE-like dataset backup results
Table 11: Microsoft Azure, large configuration, 128 MiByte VE-like dataset restore results
Table 12: Microsoft Azure, large configuration, 1 GiByte dataset backup results
Table 13: Microsoft Azure, large configuration, 1 GiByte dataset restore results
Table 14: Microsoft Azure, large configuration, 128 KiByte dataset backup results
Table 15: Microsoft Azure, large configuration, 128 KiByte dataset restore results
Introduction
greater, with the caveat that a slower-performing disk might be sufficient for this case. In all
cases, you must understand the ingestion targets (after data deduplication and
compression) to determine a daily disk capacity for a transient disk case. Meanwhile,
operational recovery requirements in terms of the number of days’ worth of recovery data
(after data deduplication and compression) should be determined to further size a
directory-container storage pool with tiering to cloud if necessary.
With the direct-to-cloud model, you can minimize local block storage capacity. This is an
advantage because local block storage can be cost prohibitive in cloud-hosted
environments.
data will not be removed from the disk tier (although it will be copied to the object storage
tier).
The following figures illustrate how data movement with disk-to-cloud tiering can occur.
Figure 1 depicts a scenario in which multiple versions of three backup objects (A, B, and C)
have been ingested and are stored in a directory-container storage pool on disk. Dotted
lines represent references to deduplicated extents (colored, numbered boxes). With the
tier-by-state option, the inactive object copies (shown in the gray rectangle) would be tiered
to a cloud-container storage pool.
Figure 2 depicts the situation after tiering is completed and the REUSEDELAY parameter
value of the source directory-container storage pool is exceeded (so that deduplicated
extent removal for extents with zero reference count can occur).
Figure 2: Disk-to-cloud tiering, after tiering
Notice that deduplicated extents 1 and 2 remain on disk even after tiering and extent
cleanup have occurred. This is due to the fact that those extents are shared between the
active and inactive backup copies. If many deduplicated extents are shared by objects (a
high duplicate data rate with high data deduplication ratios), it is more likely that data will
remain on disk, even after backup objects have been tiered at an IBM Spectrum Protect
inventory level. Keep this factor in mind when you consider a disk-to-cloud tiering model
and when you size an environment.
For workloads that deduplicate well from day to day, there will be many shared extents
across backup and archive generations and a smaller capacity footprint on tiered object
storage as a result because these backup and archive generations will also share many
extents in the cloud-container storage pool. For workloads that deduplicate poorly day to
day (highly unique data change each day), there will be few shared extents across backup
and archive generations and potentially a larger capacity footprint on tiered object storage
because these backup and archive generations will each point to (more) unique data in the
cloud-container storage pool.
If the primary motivation for using disk-to-cloud tiering is rapid recovery of operational data,
a tiering model might provide the best approach. You must understand the nature of the
client workload to accurately size the directory-container storage pool on disk.
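For illustration, a tiering relationship of this kind might be defined with a storage rule along the following lines. This is a minimal sketch: the pool names are hypothetical, and the parameters should be verified against the DEFINE STGRULE command reference:

define stgrule tierrule cloudpool srcpools=diskpool actiontype=tier tierdelay=7

Here, objects in the directory-container storage pool diskpool become eligible for tiering to the cloud-container storage pool cloudpool after 7 days.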
1.3 Cloud Deployment Patterns
The described configurations can be used as starting points in situations where the IBM
Spectrum Protect cloud instance will be a primary server and in situations where it is
used as a replication target. In scenarios where the cloud-based instance is a replication
target, adequate “public” network capability might be necessary to satisfy replication
throughput requirements. Microsoft Azure ExpressRoute can be used to establish a
dedicated link ranging from 50 Mbps to 10 Gbps from an on-premises data center to
Microsoft Azure private and public resources to facilitate efficient IBM Spectrum Protect
replication or backup processing from peer servers or clients outside of the Microsoft Azure
infrastructure.
Generally, IBM Spectrum Protect deployments making use of cloud-based object storage
will align with one of the following three patterns:
In the figure, the first deployment pattern could involve an IBM Spectrum Protect server
that is installed on premises or on a Microsoft Azure system, with primary backup and
archive data landing in object storage immediately. The positioning of the IBM Spectrum
Protect server in relationship to clients could be one critical decision point when you
consider whether to have a server instance on premises or within Microsoft Azure. This
pattern could involve use of a direct-to-cloud architecture with accelerator cache or a small
directory-container storage pool with immediate tiering to a second cloud-container storage
pool without accelerator cache.
The second deployment pattern would make use of cloud-based Azure Blob object storage
at the secondary disaster recovery (DR) site. This DR server could be installed at an on-
premises site or on a Microsoft Azure system. In the latter case, sufficient wide area
network (WAN) bandwidth between the primary and secondary sites is required for
acceptable performance. Much like the first deployment pattern, here the IBM Spectrum
Protect server at the DR site could make use of a direct-to-cloud topology with a cloud-
container storage pool featuring accelerator cache, or it could use a small directory-
container storage pool landing spot with immediate tiering to a cloud-container storage
pool backed by object storage.
The third deployment pattern features specific use of disk-to-cloud tiering, available with
IBM Spectrum Protect V8.1.3 and later, to allow for operational recovery data to reside on
faster performing disk storage. Data that is older, archived, or both would be tiered to
Microsoft Azure Blob object storage after a specified number of days. This deployment
could also be performed at an on-premises site or within a Microsoft Azure instance.
However, the additional cost of having a larger capacity directory-container storage pool
should be factored into cost estimates with an in-the-cloud solution.
A combination of approaches is also possible within the same deployment. For example,
a cloud-container storage pool could be configured with accelerator cache disk and made
to store long-term retention or compliance archives. A directory-container storage pool
could be configured as a disk tier for normal backups, and a tiering relationship could be
set up so that operational recovery data (for example, backups from the previous 7 days) is
kept on this disk tier, while older data is demoted to the same cloud-container storage pool.
The same cloud-container storage pool can be a direct backup target and a tiering target.
However, if the pool is a direct target of a backup-archive client, the pool must be
configured with accelerator cache disk.
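As a sketch of how such a combination of pools might be defined (the pool names, account name, SAS token, and directory paths are placeholders; see the DEFINE STGPOOL and DEFINE STGPOOLDIRECTORY command references for exact syntax):

define stgpool cloudpool stgtype=cloud cloudtype=azure cloudurl=https://myaccount.blob.core.windows.net identity=myaccount password=MY_SAS_TOKEN
define stgpooldirectory cloudpool /sp/accelcache
define stgpool diskpool stgtype=directory
define stgpooldirectory diskpool /sp/fs1

The first DEFINE STGPOOLDIRECTORY command assigns a local disk location to the cloud-container storage pool as accelerator cache, which is required when that pool is a direct backup target.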
Certain Microsoft Azure instances might or might not have access to dedicated bandwidth to
attached disks (Microsoft managed disks). A lack of access can create a bottleneck in the
database operations of the IBM Spectrum Protect server. Certain instances might have
limited throughput over Ethernet, and this limitation could hamper ingestion and restore
throughput with object storage. During the planning phase, consider how the ingested data
will be reduced via data deduplication and compression in the back-end storage
location. These factors will help you estimate how much back-end data must be moved
within a certain time window (measured in hours) and can help predict the throughput
(megabytes per second or terabytes per hour) that the Ethernet network and object storage
endpoint require to satisfy ingestion requirements. Generally, 10 Gbps Ethernet capability
to private Microsoft Azure Blob storage endpoints is required for large, medium, or small
Blueprint ingestion targets, while 1 Gbps is sufficient for extra-small targets.
Beginning with IBM Spectrum Protect V8.1.3, the server automatically throttles client
backup operations if the cloud accelerator cache portion of a cloud-container storage pool
is nearing full capacity. As a result, it is not mandatory to configure cloud accelerator disk
cache space that would be large enough to hold a full day’s worth of backups (after data
deduplication and compression). However, disk benchmarks should be run to ensure that
the anticipated back-end workload that an IBM Spectrum Protect server is expected to
support will not result in this disk location being the primary bottleneck of the system (see
Disk Benchmarking). In practice, any planned deployment should be validated to ensure
that it will meet performance requirements.
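To illustrate with hypothetical figures: an environment ingesting 8 TiBytes of front-end data per day at a 4:1 data reduction rate writes roughly 2 TiBytes per day into the cloud accelerator cache. Provisioning the cache at or near that 2 TiByte daily landing footprint provides headroom, while the throttling behavior described above makes a smaller cache workable at some cost to ingest throughput.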
Although not defined explicitly in the physical Blueprints, the extra-small cloud Blueprint systems target on the order of 10 TB of total managed (front-end) data, with a daily ingestion rate of up to 1 TB.
Hot access tier, LRS Blob (object) storage for database backup purposes compared to
statically provisioning 16 TB of Standard class block disk storage.
Another advantage of using Blob object storage for IBM Spectrum Protect database
backups is that Blob object storage pricing with Microsoft Azure is based on the amount of
used storage, while disk storage pricing is based on the amount of storage space
provisioned, even if a portion is unused. Not only does unused provisioned disk space undermine cost savings, but the rate charged for this space is also much higher than for object storage, especially considering that the data involved (database backups) is archive-like in nature.
Static provisioning of disk storage is no longer required and the amount of storage
consumed for database backup can better match the requirements of the environment. By
taking advantage of this pricing model, you can enjoy greater freedom in choosing and
changing retention policies for database backups to achieve the required recovery window.
For example, you can transition from 2 days’ worth of full database backups to 7 days
without having to re-provision and configure disk storage.
A further benefit of database backup operations to Blob object storage is that increased
data redundancy, availability, and durability can be achieved by using a Blob object
storage account with different data redundancy settings. Locally redundant storage (LRS)
is the most cost-efficient option, where data is copied synchronously three times within a
single physical location in the primary Microsoft Azure region. Zone-redundant storage
(ZRS) copies data across three availability zones (data centers) in the primary region
synchronously with each write operation and can be used to protect against the outage of a
single availability zone. Greater availability and durability can be achieved by using Geo-
redundant storage (GRS) or Geo-zone-redundant storage (GZRS) to replicate data from
the primary Microsoft Azure region to a secondary region. Both of these options copy data
to a single physical location in the secondary region (as with LRS), but differ in how they
copy data in the primary region. As with LRS, GRS makes three copies in a single physical location in the primary region, while GZRS, like ZRS, copies data to three availability
zones in the primary region. In the case of GRS and GZRS, data is copied to the additional
Microsoft Azure region asynchronously with a recovery point objective (RPO) of
approximately 15 minutes or less (although with no guaranteed service level agreement,
SLA).
You can use the same Microsoft Azure Blob object storage account for database backups
and the cloud-container storage pool of the IBM Spectrum Protect server to ensure
matching redundancy, availability, and durability attributes for database metadata and
storage pool data. In the case of an outage of an availability zone within a Microsoft Azure
region, an IBM Spectrum Protect server instance can be recovered via a database restore
operation and by using the cloud-container storage pool resident data that is accessed by
a different Microsoft Azure server instance located within the same region. For more
information about Microsoft Azure redundancy options, see References [8]. For detailed
guidance about setting up database backup operations to object storage, see References
[4].
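For example, a zone-redundant Blob storage account for database backups might be created with the Azure CLI as follows (a sketch; the account and resource group names are placeholders):

az storage account create --name spdbbackup --resource-group sp-rg --location westus2 --kind BlobStorage --sku Standard_ZRS --access-tier Hot

Substituting Standard_LRS, Standard_GRS, or Standard_GZRS for the --sku value selects the other redundancy options described above.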
• A server thread that sends data from the server to object storage
Several performance factors affect operations for backing up the server database to object
storage, for example:
database restore performance. For smaller IBM Spectrum Protect servers with smaller
databases (such as the extra-small and small configurations shown here), use compression when the following conditions are met:
• The retention policy that affects client node data on the target replication
server should match the value of the TIERDELAY parameter of the storage
rule that is responsible for tiering the same client node data on the source
server.
In general, the server that is used for disk-to-cloud tiering (whether it be the source
replication server or the target replication server) should be the server with the longer
retention policy for the client nodes that are affected by the tiering storage rule.
Objects that feature smaller, deduplicated extent sizes (for example, 60 - 100 KiBytes or
similar) and that deduplicate and compress well (for example, 50% data deduplication with
50% compressibility) will result in less network, disk, and object storage bandwidth used,
but will lead to more database and computation overhead to facilitate these data reduction
operations. As session counts increase, CPU and database-related memory are likely to
first become limiting factors for these data types. In general, the more successfully data
can be deduplicated and compressed (and therefore the greater the data reduction from
front-end to back-end data), the greater the number of feasible client sessions. The
following table indicates a reasonable range of client session counts based on system size
and data type, as well as the likely limiting factor for the system as the high end of the
range is approached. For more information about these data types, see Throughput
Measurements and Results.
Table 2: Preferred ranges of maximum values for client session counts

System size   Data type 1   Data type 2   Data type 3   Data type 4
Extra small   10 – 50       25 – 50       10 – 50       10 – 50

1 This model uses 128 MiByte objects, 250 - 350 KiByte extents, and <10% data deduplication and compressibility. Full backup operations are used with pseudo random data or data that cannot be easily deduplicated or compressed. For example, this model can be applied to encrypted data.
2 This model uses 128 MiByte objects, 150 - 200 KiByte extents, and 50% data deduplication and compressibility. For example, this model can be applied to virtual machine backups.
3 This model uses 1 GiByte objects, 60 - 100 KiByte extents, and 50% data deduplication and compressibility. For example, this model can be applied to database image backups.
4 This model uses 128 KiByte objects and <10% data deduplication and compressibility. For example, this model can be applied to file server data and other small files or objects.
Often, a diminishing rate of return in regard to throughput is experienced when 50 - 100
total client sessions are exceeded, regardless of data type. More sessions are possible
and might be warranted, given the client schedule or other requirements. However,
aggregate gains in total throughput of a single IBM Spectrum Protect instance might not be
substantial past this point.
Table 3: Microsoft Azure, large configuration
448 GB RAM
896 GB local SSD; 64 maximum attached disks for this instance
16 Gbit Ethernet connectivity to Blob Storage
The following environments were not tested, but can be used as starting points for
medium, small, and extra-small Microsoft Azure based instances that could satisfy CPU,
memory, ingest throughput, and server database requirements, among other concerns.
Also included is an alternative large system that would be more price-competitive.
256 GB RAM
32 maximum attached disks for this instance
"Extremely high" Ethernet connectivity to Blob Storage
Cloud component | Microsoft Azure component | Detailed description | Quantity
160 GB RAM
32 maximum attached disks for this instance
Shared 8 Gbit Ethernet connectivity to Blob Storage (~4 Gbit Ethernet)
Cloud component | Microsoft Azure component | Detailed description | Quantity
region as the
instance
128 GB RAM
32 maximum attached disks for this instance
Shared 8 Gbit Ethernet connectivity to Blob Storage (~4 Gbit for Ethernet)
16 GB RAM
8 maximum attached disks for this instance
Shared 8 Gbit Ethernet connectivity to Blob Storage (~2 Gbit for Ethernet)
Cloud component | Microsoft Azure component | Detailed description | Quantity
The above table is sorted from lowest to highest cost instance in each category. Costs
are estimated by using the Microsoft Azure calculator based on a RHEL instance in the
West US 2 Region with hourly pricing. As pricing is subject to change, actual figures are
not included here.
When possible, enable and use Microsoft Azure Accelerated Networking for supported
Azure instance types. Instances with Azure Accelerated Networking feature the ability to
offload computationally expensive network policy enforcement (such as network security groups, access control lists, isolation, and other virtualized network services applied to network traffic) to hardware. This permits the bypassing of the Azure virtual switch layer
and can help facilitate more efficient Azure Blob storage and IBM Spectrum Protect backup
client communication. See References [7].
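For example, accelerated networking can be requested when the network interface is created with the Azure CLI (a sketch with placeholder resource names; support varies by instance type and CLI version):

az network nic create --resource-group sp-rg --name sp-nic --vnet-name sp-vnet --subnet sp-subnet --accelerated-networking true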
Beginning with IBM Spectrum Protect V8.1.3, the server automatically throttles client
backup operations if the cloud accelerator cache portion of a direct-to-cloud storage pool is
nearing full capacity. As a result, it is not mandatory to configure cloud accelerator disk
cache space that would be large enough to hold a full day’s worth of backups (after data
deduplication and compression). However, disk benchmarks should be run to ensure that
the anticipated back-end workload that an IBM Spectrum Protect server is expected to
support will not result in this disk location being the primary bottleneck of the system (see
Disk Benchmarking). In the previous table, the Microsoft Azure instances were carefully selected to ensure that they have the dedicated disk bandwidth and disk resources that are necessary for the assigned role. In practice, any planned deployment should be validated to ensure
that it will meet performance requirements.
The Microsoft Azure Blob storage account underlying an IBM Spectrum Protect cloud-
container storage pool can be configured as a hot access tier or as a cool access tier
storage account. The Azure storage archive access tier is currently not supported by IBM
Spectrum Protect. The performance characteristics of the hot and cool storage account
types should be identical. Hot access tier storage accounts feature higher storage costs
than the cool access tier but with lower access costs. For data that will be retained for at
least 30 days and that will be restored infrequently (for example, long-term or compliance
retention data and archive data), consider directing this workload to an IBM Spectrum
Protect cloud-container storage pool with an underlying cool access tier Azure Blob storage
account to save on costs. However, be aware that segmenting your data ingestion
workload into more than one cloud-container storage pool could potentially reduce the
overall data deduplication efficiency of the solution, as IBM Spectrum Protect data deduplication takes place at the granularity of the container storage pool. Consider balancing
the cost savings from segmenting longer-term retention workloads with potential losses in
data deduplication efficiency and a greater data storage footprint.
Figure 4: Sizing the cloud accelerator cache for Microsoft Azure
Front-end object size | Deduplicated extent size | Duplicate data rate | Compressibility | Description
128 MiByte | ~200 – 300 KiByte | ~0% | ~0% | Random data, large extent size
The 128 MiByte, VE-like front-end dataset represents a relatively large object size that
aligns with the IBM Spectrum Protect for Virtual Environments: Data Protection for VMware
API client’s VE megablock size for virtual machine disk backups. The large object size
and relatively large, though realistic, deduplication extent size represent a favorable
profile for the IBM Spectrum Protect server’s ingestion engine to achieve good
performance. A duplicate data rate of 70% combined with a compressibility rate of 50% for
this dataset yields an 85% total data reduction from front-end data as compared with data
that is actually stored to the (cloud-accelerator cache and object storage) back-end after
data deduplication, compression, and encryption processing. Although this workload does
not qualify as a “best case,” it does represent a realistic, favorable scenario in which to
model top-end throughput capability of an IBM Spectrum Protect system without
overestimating throughput performance.
The 128 MiByte, random front-end dataset represents a larger object size with a large,
favorable deduplication extent size. However, the random nature of the data ensures that it
does not deduplicate well with existing storage pool data or compress well. This dataset is
included to represent a workload that is throughput intensive from the perspective of
storage pool disk and object storage network load. Full backups of large objects containing
relatively random data content would be modeled well by this dataset.
The 1 GiByte front-end dataset represents a model of structured, database-like data
possessing a relatively small deduplication extent size relative to the front-end object size.
Such a workload is representative of what might be experienced with an IBM Spectrum
Protect for Databases: Data Protection for Oracle backup environment protecting
production databases. The smaller extent size causes additional strain and overhead for
the IBM Spectrum Protect ingestion engine and typically results in less throughput than the
128 MiByte dataset. A duplicate data rate of 50% and compressibility of 50% yield a 75%
overall front-end to back-end reduction for this workload, with a 4:1 ratio reduction, which
approaches what is seen for this type of data in the field.
The 128 KiByte front-end dataset is used here as a relative “worst case” workload in
regard to throughput efficiency for the IBM Spectrum Protect server. This “small file”
workload will stress the data deduplication, compression, encryption, and object storage
transfer components of the IBM Spectrum Protect server to a higher degree relative to the
amount of data that is actually protected and transferred to object storage. This high
overhead dataset allows for predicting a lower estimate on performance of an IBM
Spectrum Protect environment within these cloud solutions.
3.2 Backup and Restore Measurements
The following sections outline the backup and restore throughput results that were
experienced with the previously mentioned datasets in the built cloud environments.
Prior to conducting backup and restore tests on the IBM Spectrum Protect environments, a
load phase was conducted whereby the servers were initially loaded with a set of
deduplicated 128 MiByte front-end data in order to populate the server database tables to
provide for a more realistic customer configuration. IBM Spectrum Protect database
queries can change their behavior based on the size and layout of server database tables.
This load phase was performed to bring behavior in line with real environment
expectations.
For each dataset, up to 50 IBM Spectrum Protect client backup sessions were initiated in
parallel to the server for the large Microsoft configuration. The results presented here for
backup represent the maximum front-end throughput experienced with the largest number
of sessions tested against that system.
For each dataset on restore, between 1 and 40 client restore sessions were initiated for the
large Microsoft system. Results presented here include the intermediate session count
values to give an idea of how restore throughput scales with the number of restore
sessions involved for datasets similar to these types.
All throughput values represent front-end, “protected data” values, before inline data
deduplication, compression, and encryption. These are the data rates experienced by a
client that is backing up data to or restoring data from the IBM Spectrum Protect server.
The rates are similar to what customers would likely describe as their performance
experience with the product. On ingestion, the actual quantity of data that makes it to
accelerator cache disk and onwards to object storage will be less, depending on the data
deduplication and compression rate. On restore, all individual extents comprising a front-
end object will be restored using HTTP GET calls from the object storage device. However,
the built-in caching within the IBM Spectrum Protect server’s restore engine might reduce
the number of restore operations needed if a workload contains duplicate data.
50 0% 696.9 2.4 19.1 23.9
Ingestion throughput for the large Microsoft Azure build with the favorable 128 MiByte
dataset was well within the target throughput range of a large Blueprint system (20 – 100
TiBytes per day).
Table 11: Microsoft Azure, large configuration, 128 MiByte VE-like dataset restore results
Sessions GiBytes/hour
Top-end 483
Ingestion rates for the 1 GiByte object workload, with its smaller extent sizes, at 50 sessions resulted in an aggregate throughput rate that was slightly less than the lower end of the large Blueprint target.
Table 13: Microsoft Azure, large configuration, 1 GiByte dataset restore results
Sessions GiBytes/hour
Top-end 389
Sessions GiBytes/hour
Per session 66
Top-end 664
APPENDIX
Disk Benchmarking
As a part of vetting a Microsoft Azure test configuration, disk benchmark tests were
performed to validate the capability of the disk volumes underlying the IBM Spectrum
Protect database and cloud accelerator cache. From a database point of view, this vetting
was done to ensure that the volumes were sufficiently capable from an IOPS perspective
to support the 8 KiByte random mixed write and read workload that a busy Blueprint-level
system would demand. From a cloud cache standpoint, the vetting was performed to
ensure that overlapped 128-256 KiByte write and read throughput could achieve a rate
high enough such that the server’s bottleneck for IO would be at the instance-to-object
storage network level and not the disk level. The goal was to ensure that the disk could
perform at a rate such that the IBM Spectrum Protect server could utilize it during
overlapped ingest and be able to stress the network link layer simultaneously.
Disk benchmarking was performed by using the tsmdiskperf.pl Perl script, provided as a
part of the Blueprint configuration scripts package found on the IBM Spectrum Protect
Blueprints page (References [1]). Execution of the script was performed as follows:
perl tsmdiskperf.pl workload=stgpool fslist=directory_list
perl tsmdiskperf.pl workload=db fslist=directory_list
With a stgpool workload specification, the script drives a 256 KiByte IO pattern, whereas
with a db workload specification, the script drives 8 KiByte operations. For each directory
location provided as a value to the comma-separated fslist, a pair of IO processes is
created to perform writes and reads to test files that are generated in that directory.
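For example, to test two cloud accelerator cache directories (two write/read process pairs; the paths shown are hypothetical):

perl tsmdiskperf.pl workload=stgpool fslist=/sp/sp_cc/1,/sp/sp_cc/2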
Typical script output for a stgpool workload run resembles the following example:
======================================================================
: Number of filesystems: 1
: Mode: readwrite
: File size: 2 GB
======================================================================
:
: The test can take upwards of ten minutes, please be patient ...
===================================================================
: RESULTS:
: dm-2
===================================================================
The value that was extracted for the purposes of comparison and validation for stgpool
workloads was Avg Combined Throughput (MB/sec). The goal was to determine the
largest aggregate average throughput for writes and reads to the accelerator cache disk
such that overlapped backup ingest and transfer to object storage will not be constrained
by disk capability.
When running the tool in db workload mode, output should appear similar to the following
example:
======================================================================
: Workload type: db
: Number of filesystems: 1
: Mode: readwrite
: File size: 10 GB
======================================================================
: The test can take upwards of ten minutes, please be patient ...
===================================================================
: RESULTS:
: dm-6
===================================================================
For the db workload tests, the Avg Combined Throughput (MB/sec) and Average
IOPS metrics are significant for evaluating database disk capability. Here, the small
random IOPS capability of the underlying disk that is used for the IBM Spectrum Protect
Db2 database is of interest.
To conduct measurements of your own, increase the number of write/read thread pairs
(and directories) by 1 for each test until the average throughput, the average IOPS, or both
stabilize (level off). Benchmark test results are provided here as a reference for those who
want to build systems resembling those laid out in this document and who want to validate
that their system is capable of supporting the described level of ingestion. For each graph,
the horizontal axis represents the quantity of write/read thread pairs (and the number of
directory locations used with fslist). For each successive bar to the right, the thread
count affecting the disk is increased by 2 (1 write thread, 1 read thread, and adding a
directory location). The vertical axis represents total average throughput in MiBytes/s.
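Such a sweep can be scripted. The following shell sketch (with hypothetical directory paths) adds one directory, and therefore one write/read thread pair, per iteration:

dirs=""
for i in 1 2 3 4 5 6 7 8; do
    # Append the next hypothetical cache directory to the comma-separated list.
    if [ -z "$dirs" ]; then dirs="/sp/sp_cc/$i"; else dirs="$dirs,/sp/sp_cc/$i"; fi
    perl tsmdiskperf.pl workload=stgpool fslist="$dirs"
done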
[Chart: vertical axis: throughput (MiBytes/s), 0 – 120; horizontal axis: number of write/read thread pairs, 1 – 24]
Figure 5: Microsoft Azure large configuration; database volume average throughput; 8 KiByte random writes/reads
[Chart: vertical axis: IOPS, 0 – 18,000; horizontal axis: number of write/read thread pairs, 1 – 24]
Figure 6: Microsoft Azure large configuration; database volume average IOPS; 8 KiByte random writes/reads
[Chart: vertical axis: throughput (MiBytes/s), 0 – 1,600; horizontal axis: number of write/read thread pairs, 1 – 24]
Figure 7: Microsoft Azure large configuration; cloud cache volume average throughput; mixed 256 KiByte writes and reads
2. To run a set of automated tests scaling from 1 to 100 threads, run the
tsmobjperf.pl tool by using the recently created RAM disk files as source files to
upload. If more threads are specified than files are present in the source list, the tool
completes a round-robin action over these source files. Because all activity is read-
only, using separate file handles from memory-mapped sources, multiple threads
sharing the same file is not a concern. To test with 1, 10, 20, 30, 40, 50, 60, 70, 80, 90,
and 100 threads, run the tool as follows, specifying the arguments as needed:
perl tsmobjperf.pl type=type endpoints=endpoint user="user" pass="pass" bucket=bucket min=1 max=100 step=10 flist=comma_delimited_source_files_list
where:
• For Microsoft Azure, the user should be the Azure Account Name.
• For Microsoft Azure, the pass should be a SAS token that was configured for
the Azure Storage Account in that specific Azure region. This user must have
valid Azure credentials to create containers (buckets) and PUT and GET Blob
objects in the region indicated by the endpoint URL. These values align with
those that are used to define an IBM Spectrum Protect cloud-container storage
pool, either via the Operations Center or the command line.
• The bucket value should be a Microsoft Azure container name that the
credentialed user has create/PUT/GET access to and that exists in the object
storage system.
• The min and max values should indicate the minimum and maximum thread
counts to test.
• The step value should indicate the increase in thread count from test to test.
Each thread count test (for 1, 10, 20, or more threads) uploads 10 x 1 GB objects per
thread. The previous example would result in a total of 5510 GB of data being stored to
the test container after all thread tests are completed. The tool does not remove
objects that are created. You must remove the objects manually after test completion.
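For example, the test container and its objects could be removed afterward with the Azure CLI (the container and account names here are placeholders):

az storage container delete --account-name spobjstorage --name benchbucket --sas-token "SAS_TOKEN"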
Upon completion, the tool generates aggregate throughput metrics that can be used to
estimate practical instance-to-object storage performance rates for IBM Spectrum
Protect. Data is provided in comma-separated-value format (CSV) and the output of
the SPObjBench.jar tool can be inspected upon completion as well:
===================================================================
: Type: azure
: Endpoints: https://spobjpvthot.blob.core.windows.net/
: User: spobjpvthot
: Pass: SASTOKENSTRING
: Min Threads: 1
: Thread Step: 10
: File List:
/mnt/ramdisk/file.1,/mnt/ramdisk/file.2,/mnt/ramdisk/file.3,/mnt/ramdisk/file.4,
/mnt/ramdisk/file.5,/mnt/ramdisk/file.6,/mnt/ramdisk/file.7,/mnt/ramdisk/file.8,
/mnt/ramdisk/file.9,/mnt/ramdisk/file.10
===================================================================
: Test Results
1, XXX, YYY
===================================================================
It can be beneficial to monitor network transmission rates externally from the tool, as
well, to validate the absolute throughput rate that is experienced to object storage over
the (Ethernet) network. The tool reports an aggregate rate that can include build-up
and tear-down overhead associated with the tool. Calculating an actual transmission
rate from the instance-to-object storage while the test is running can give an indication
of the throughput limits of the environment. On Linux, for example, the dstat utility
can be used to monitor several system metrics at once, including network interface
send and receive statistics, by using the basic command:
% dstat
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
15 1 83 0 0 1| 0 0 |1778k 62M| 0 0 | 45k 3068
The dstat tool outputs a new line of metrics at a configured interval, much like the
standard iostat and netstat utilities. For the execution above, the net/total
send column is of greatest interest, here reported in MiBytes, as an indication of how
quickly data could be sent to the object storage endpoint from the server.
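For example, to report only network statistics at a 60-second interval (the interval choice is arbitrary):

% dstat -n 60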
3. On the Basics tab, in the Subscription and Resource group sections, specify
appropriate settings.
4. In the Virtual machine name field, enter a name for the virtual machine.
5. In the Region and Availability options sections, specify appropriate settings. In the
Image section, specify an image, for example, Red Hat Enterprise Linux 7.6. Then,
click Select size.
6. In the Select a VM size pane, search for an appropriate virtual machine instance type,
for example, e32s for a large Blueprint. Click on the instance type and then click the
Select button at the bottom.
7. In the Username and SSH public key sections, specify settings for initial instance
authentication.
8. Optionally, specify settings in the Inbound port rules section. Click the Next: Disks
button.
9. On the Disks tab, add IBM Spectrum Protect premium SSD and standard HDD block
disks that are appropriate for the planned cloud Blueprint size. For each disk added,
click the Change size link to set the appropriate size for the disk.
Warning: Microsoft Azure bills for disk usage based on the value in the Disk Tier field.
For each disk tier, the maximum size of a disk is listed. If a custom disk size is used
that is larger than the size specified, the next disk tier up is used, and usage is billed
for this next size. For example, if a 200 GiB disk is chosen, you are billed for a 256 GiB
P15 tier disk. Guidance in this paper is to use disks only of the appropriate disk tier
size for optimal capacity and billing.
10. When all disks are added for the appropriate cloud Blueprint size, click Next:
Networking.
11. On the Networking tab, in the Virtual network section, create a virtual network or
place the virtual machine into an existing network. Select appropriate subnet and
public IP values for the virtual machine.
12. Optionally, set public inbound ports and other options. Click Next: Management.
Tip: Place the IBM Spectrum Protect server virtual machine in the same virtual network
as the client systems that are being protected. In this way, you can help to ensure
optimal performance and minimize ingress and egress charges.
13. On the Management tab, add or customize the settings in the Azure Security Center
and Azure Active Directory sections for the virtual machine. Click Next: Advanced.
14. On the Advanced tab, add any extensions or customizations that you require. Click
Next: Tags.
15. On the Tags tab, add any tags that you require for this virtual machine. Click Next:
Review + create.
16. On the Review + create tab, ensure that the virtual machine passes the validation test.
To create the virtual machine, click Create.
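The same virtual machine can also be provisioned with the Azure CLI. The following is a rough, unofficial equivalent of the steps above; the resource group, names, instance size, image URN, and SSH key path are placeholders to adapt:

az vm create --resource-group sp-rg --name sp-server --size Standard_E32s_v3 --image RedHat:RHEL:7.6:latest --admin-username azureuser --ssh-key-value ~/.ssh/id_rsa.pub --accelerated-networking true

Data disks would then be attached separately (for example, with az vm disk attach) to match the Blueprint disk layout.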
Microsoft Azure Blob Object Storage
1. In the navigation pane on the left side of the window, either click Create a resource
and search for Storage accounts in the search bar or click Storage accounts if it
appears in the list.
2. In the Storage accounts pane, click Add.
3. On the Basics tab, in the Subscription and Resource group sections, select
appropriate settings. Specify settings in the Storage account name and Location
fields. For the performance level, select Standard. For the account kind, select Blob
Storage. For the replication setting, select the appropriate redundancy setting for your
Blob storage. The Locally-redundant storage (LRS) selection provides the lowest
cost option. For the access tier, select Hot. Click Next: Advanced.
4. On the Advanced tab, specify whether to require secure data transfer (HTTPS with
TLS) and other options. Click Next: Tags.
5. On the Tags tab, add any tags that you require for this storage account. Click Next:
Review + create.
6. On the Review + create tab, ensure that the storage account passes the validation
test. To create the storage account, click Create.
After a Blob storage account is created, a shared access signature (SAS) can be
created along with a SAS token to use with IBM Spectrum Protect cloud-container
storage pools.
7. To create the SAS and SAS token, click the relevant storage account on the Storage
accounts pane. Click the Shared access signature link in the navigation pane for
the Blob storage account.
8. In the Shared access signature view, select an appropriate start and end range for
the SAS token based on your security needs. After a SAS token expires, another one
must be created, and the cloud-container storage pool must be updated with this new
token value. Read, Write, Delete, List, Add, and Create permissions are required for
cloud container storage pool access. Optionally, restrict access to certain IP addresses
and for the HTTPS protocol only. In the Signing key section, select an appropriate
signing key. Click Generate SAS and connection string.
9. Copy the SAS token entry that appears below the Generate SAS and connection
string button and use this for IBM Spectrum Protect cloud-container storage pools.
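For scripted setups, a comparable SAS token can be generated with the Azure CLI. This is a sketch: the account name and expiry are placeholders, and the permission letters correspond to the Read, Write, Delete, List, Add, and Create permissions noted above:

az storage account generate-sas --account-name spobjstorage --services b --resource-types sco --permissions acdlrw --expiry 2021-12-31T23:59Z --https-only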
REFERENCES
Notices
This information was developed for products and services offered in the US. This material might be available from IBM in
other languages. However, you may be required to own a copy of the product or product version in that language in order to
access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM
representative for information on the products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used.
Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be
used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program,
or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of
this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property
Department in your country or send inquiries, in writing, to:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some
jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may
not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve
as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and
use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any
obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information
between independently created programs and other programs (including this one) and (ii) the mutual use of the information
which has been exchanged, should contact:
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.
The licensed program described in this document and all licensed material available for it are provided by IBM under terms
of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
The performance data discussed herein is presented as derived under specific operating conditions. Actual results may
vary.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM
products should be addressed to the suppliers of those products.
This information is for planning purposes only. The information herein is subject to change before the products described
become available.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely
as possible, the examples include the names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to actual people or business enterprises is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on
various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application
programming interface for the operating platform for which the sample programs are written. These examples have not been
thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of
these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any
damages arising out of your use of the sample programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp.,
registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other
companies. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml.
Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and
other countries.
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive licensee of
Linus Torvalds, owner of the mark on a worldwide basis.
Microsoft is a trademark of Microsoft Corporation in the United States, other countries, or both.
Java™ and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
VMware is a registered trademark of VMware, Inc. or its subsidiaries in the United States and/or other jurisdictions.