Rethinking Enterprise Storage
A Hybrid Cloud Model
Marc Farley

This title is also available as a free eBook on the Microsoft Download Center (microsoft.com/download).

microsoft.com/mspress
U.S.A. $14.99
Canada $15.99
[Recommended]
Operating Systems/Windows
PUBLISHED BY
Microsoft Press
A Division of Microsoft Corporation
One Microsoft Way
Redmond, Washington 98052-6399
All rights reserved. No part of the contents of this book may be reproduced or transmitted in any form or by any
means without the written permission of the publisher.
Microsoft Press books are available through booksellers and distributors worldwide. If you need support related to this
book, email Microsoft Press Book Support at [email protected]. Please tell us what you think of this book at
http://www.microsoft.com/learning/booksurvey.
The example companies, organizations, products, domain names, email addresses, logos, people, places, and
events depicted herein are fictitious. No association with any real company, organization, product, domain name,
email address, logo, person, place, or event is intended or should be inferred.
This book expresses the author’s views and opinions. The information contained in this book is provided without
any express, statutory, or implied warranties. Neither the authors, Microsoft Corporation, nor its resellers or
distributors will be held liable for any damages caused or alleged to be caused either directly or indirectly by
this book.
Contents
Foreword ix
Introduction xi
Next steps xv
Summary 9
Backing up to disk 15
Virtual tape: A step in the right direction 15
Incremental-only backup 16
Dedupe makes a big difference 17
For the love of snapshots 17
Summary 23
Summary 40
CiS designs for efficient working set storage 53
Data reduction and tiering within the CiS system 53
Summary 54
Summary 66
Summary 78
Data portability in the hybrid cloud 84
Migrating applications and copying data 84
Can you get there from here? 85
Recovery in the cloud 86
Big Data and discovery in the cloud 88
Summary 89
Appendix 91
Glossary 93
Index 97
Foreword
When I started my career in IT, storage was incredibly boring and something
that most people tried to avoid. Enterprise data storage was the domain of
strange people interested in tracks, cylinders, and data placements; they did not
write code—they were the forgotten people.
Twenty-five years or so later, storage is neither boring nor straightforward.
Data growth flows at exponential rates; structured data has been joined by
unstructured data, the Facebook generation creates extensive social content in
unprecedented quantities, and the enterprise is looking not only at how they store
but also how they derive value from this content in the form of Big Data analytics.
And somewhere along the line, I became a storage person—a StorageBod if you
will.
We are at the centre of the storm brought on by cloud computing, and
the promise of infinite scale and elasticity is changing the questions asked
of enterprise storage. The certainty of managing data storage with enterprise
arrays from the big five storage vendors is gone. There are now many possible
answers to a problem that has moved away from simply being a case of how
much capacity we require to store our application’s data. Instead, we are thinking
about how to balance user and business requirements in the context of flat-lining
IT budgets. Should all our data be stored off-premises in the cloud or should we
look at everything being stored in-house? Should all our data be stored in an
object store? If so, whose?
This ambiguity brings increasing levels of complexity to the storage world.
Data will live in many places on many different platforms and how we manage it,
access it, and secure it for the enterprise is the next big question to be answered
in storage.
Martin Glassborow
Blogger, Storagebod.com
June 2013
ix
Introduction
Just as the Internet has fundamentally changed many industries, cloud
computing is fundamentally changing the information technology industry,
including infrastructures such as enterprise data storage. This book is about one
of the new infrastructure game changers—a storage architecture called hybrid
cloud storage that was developed by a company called StorSimple, now a part of
Microsoft, as a way to integrate cloud storage services with traditional enterprise
storage. Hybrid cloud storage is a completely different approach to storing data
with a single comprehensive management system covering data through its
entire life cycle, including active and inactive states as well as backup and archive
versions. IT teams with cloud-integrated storage arrays running in their data
centers use cloud storage as a data management tool and not simply as additional
storage capacity that needs to be managed. That concept takes a little time to
fully understand and it’s why this book was written.
The audience for this book includes all levels of IT professionals, from executives responsible for determining IT strategies to systems administrators who manage systems and storage. The book explains how hybrid cloud storage
changes the ways data protection is accomplished without tape backup systems;
how disaster recovery works with data that is stored in the cloud; how cloud
services are used to facilitate capacity management; and how the performance of
data stored in the cloud is managed. Several applications for hybrid cloud storage
are discussed to help IT professionals determine how they can use the Microsoft
hybrid cloud storage (HCS) solution to solve their own storage problems. The last
chapter is a hypothetical look into the future that speculates how this technology
might evolve.
Conventions
The following naming conventions are used in this book:
■■ The Microsoft HCS solution The hybrid cloud storage solution
discussed in this book combines a StorSimple-designed Cloud-integrated
Storage system with the Windows Azure Storage service. This combination
is referred to throughout the book as “the Microsoft HCS solution.”
■■ Hybrid cloud boundary The term is used in this book to identify the
aspects of hybrid cloud that create a separation between computing
on-premises and computing in the cloud. Physical location, bandwidth
availability, and latency are examples of things that can form a hybrid cloud
boundary.
■■ The IT team The term refers to all the employees and contractors that
work together to manage the technology infrastructure of an organization.
Sidebars are used throughout the book to convey information, ideas, and
concepts in a less formal fashion or to draw attention to tangential topics that I
thought might be interesting to readers. Sidebars are easy to identify by being
offset from the rest of the text with a shaded background. An example of a
sidebar is in Chapter 1, “Rethinking enterprise storage,” in the section “The hybrid
cloud management model.”
Acknowledgments
Even a short book like this one has many contributors. I’d like to thank a number
of people who helped make this book happen: Maurilio Cometto for his kind
patience, Sharath Suryanarayan for his experience and perspective, Guru Pangal
for his encouragement, Gautam Gopinadhan for his depth of knowledge, Mark
Weiner for his unwavering support, Ursheet Parikh for his vision and faith, and
Carol Dillingham for her insights and guidance throughout.
Errata & book support
Any errors that have been reported since this book was published are listed at:
http://aka.ms/HybridCloud/errata
If you find an error that is not already listed, you can report it to us through the
same page.
If you need additional support, email Microsoft Press Book Support at
[email protected].
Please note that product support for Microsoft software is not offered through
the addresses above.
We want to hear from you
At Microsoft Press, your satisfaction is our top priority, and your feedback our
most valuable asset. Please tell us what you think of this book at:
http://aka.ms/tellpress
The survey is short, and we read every one of your comments and ideas.
Thanks in advance for your input!
Stay in touch
Let’s keep the conversation going! We’re on Twitter: http://twitter.com/MicrosoftPress.
Next steps
We hope this book piques your interest in the Microsoft hybrid cloud
storage (HCS) solution. If you want to learn more about implementing
the Microsoft HCS solution in your own enterprise, please visit the following site,
where you can read case studies and request a demo:
http://www.microsoft.com/StorSimple
To connect with the author or other readers of this book, check out:
■■ Marc Farley’s blog, “Hybrid Cloud Storage”: http://blogs.technet.com/b/cis/
■■ The book’s website: http://blogs.technet.com/b/cis/p/rethinkingenterprisestorage.aspx
■■ StorSimple on Twitter: https://twitter.com/StorSimple
■■ Marc Farley on Twitter: https://twitter.com/MicroFarley
CHAPTER 1
Rethinking enterprise
storage
The information technology (IT) world has always experienced rapid changes, but the
environment we are in today is bringing the broadest set of changes that the industry
has ever seen. Every day more people are accessing more data from more sources and
using more processing power than ever before. A profound consequence of this growing
digital consumption is that the corporate data center is no longer the undisputed
center of the computing universe. Cloud computing services are the incubators for new
applications that are driving up the demand for data.
IT managers are trying to understand what this means and how they are going
to help their organizations keep up. It is abundantly clear that they need the ability
to respond quickly, which means slow-moving infrastructures and management
processes that were developed for data center-centric computing need to become
more agile. Virtualization technologies that provide portability for operating systems
and applications across hardware boundaries are enormously successful, but they are
exposing the limitations of other data center designs, particularly constraints that hinder
storage and data management at scale.
It is inevitable that enterprise storage technologies will change to become more
scalable, agile, and portable to reflect the changes to corporate computing. This book
examines how storage and data management are being transformed by a hybrid cloud
storage architecture that spans on-premises enterprise storage and cloud storage
services to improve the management capabilities and efficiency of the organization. The
Microsoft hybrid cloud storage (HCS) solution is an implementation of this architecture.
metadata from on-premises storage to the cloud, fulfilling the roles for a number of storage
and data management practices.
FIGURE 1-1 Three on-premises data centers exchange management information with cloud resources
and management services across the hybrid cloud boundary.
If VMs can run either on-premises or in the cloud, companies will want a way to
copy data across the hybrid cloud boundary so it can be accessed locally (“local”
in the cloud context means both the VM and data are located in the same cloud
data center). However, if copying data takes too long, the hybrid cloud application
might not work as anticipated. This is an area where hybrid cloud storage could play
a valuable role by synchronizing data between on-premises data centers and the
cloud. Chapter 7, “Imagining the possibilities with hybrid cloud storage,” discusses
future directions for this technology, including its possible use as a data portability
tool.
Many of the advancements in storage and data management today are based
on advanced mathematical algorithms for hashing, encoding, and encrypting
data. These algorithms tend to assume that there is enough processing power
available to not impact system performance and that the data being operated on is
stored on devices with sufficient performance so bottlenecks can be avoided. Much
of the design work that goes into storage systems today involves balancing the
resources used for serving data with the resources used for managing it.
So, if data growth has been a problem for some time, why hasn’t data reduction been
used more broadly in enterprise storage arrays? The answer is the performance impact it can
have. One of the most effective data reduction technologies is deduplication, also known as
dedupe. Unfortunately, dedupe is an I/O intensive process that can interfere with primary
storage performance, especially when device latencies are relatively high as they are with
disk drives. However, enterprise storage arrays are now incorporating low-latency solid state
disks (SSDs) that can generate many more I/O operations per second (IOPS) than disk drives.
This significantly reduces the performance impact that dedupe has on primary storage. The
Microsoft HCS solution discussed in this book uses SSDs to provide the IOPS for primary
storage dedupe.
Chapter 4, “Taming the capacity monster,” looks at all the various ways the Microsoft HCS
solution reduces storage capacity problems.
SSDs are one of the hottest technologies in storage. Made with nonvolatile flash
memory, they are unencumbered by seek time and rotational latencies. From a
storage administrator’s perspective, they are simply a lot faster than disk drives.
However, they are far from being a “bunch of memory chips” that act like a disk
drive. The challenge with flash is that individual memory cells can wear out over
time, particularly if they are used for low-latency transaction processing applications.
To alleviate this challenge, SSD engineers design a number of safeguards, including
metadata tracking for all cells and data, compressing data to use fewer cells, parity
striping to protect against cell failures, wear-leveling to use cells uniformly, “garbage collecting” to remove obsolete data, trimming to remove deleted data, and metering
to indicate when the device will stop being usable.
SSDs manage everything that needs to be managed internally. Users are advised not
to use defrag or other utilities that reorganize data on SSDs. They won’t perform
faster, but they will wear out faster.
Doing things the same old way doesn’t solve new problems
The root cause of most storage problems is the large amount of data being stored. Enterprise
storage arrays lack capacity “safety valves” to deal with capacity-full scenarios and slow to
a crawl or crash when they run out of space. As a result, capacity planning can take a lot of
time that could be used for other things. What many IT leaders dislike most about capacity
management is the loss of reputation that comes with having to spend money unexpectedly
on storage that was targeted for other projects. In addition, copying large amounts of
data during backup takes a long time even when they are using dedupe backup systems.
Technologies like InfiniBand and Server Message Block (SMB) 3.0 can significantly reduce the
amount of time it takes to transfer data, but they can only do so much.
More intelligence and different ways of managing data and storage are needed to change
the dynamics of data center management. IT teams that are already under pressure to work
more efficiently are looking for new technologies to reduce the amount of time they spend
on it. The Microsoft HCS solution discussed in this book is designed to replace existing management technologies and methods that can’t keep up.
FIGURE 1-2 In the hybrid cloud storage architecture, the CiS SAN system accesses the expanded capacity
available to it in Windows Azure Storage over an extended distance.
Data tiering
CiS transparently performs data tiering, a process which moves data between the SSDs and
HDDs in the CiS system according to the data’s activity level with the goal of placing data on
the optimal cost/performance devices. Expanding data tiering with a hybrid cloud storage
architecture transparently moves dormant data off site to the cloud so it no longer occupies
on-premises storage. This transparent, online “cold data” tier is a whole new storage level that
is not available with traditional storage architectures, and it provides a way to have archived
data available online.
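To make the tiering idea concrete, here is a minimal sketch of an activity-driven tiering pass that demotes idle data toward the cloud tier. It is an illustration only, not the CiS system's actual design; the tier names, the idle threshold, and the chunk structure are assumptions.

```python
from dataclasses import dataclass

# Illustrative tiers, ordered fastest to slowest; the names are assumptions,
# not the CiS system's internal terminology.
TIERS = ["ssd", "hdd", "cloud"]

@dataclass
class Chunk:
    name: str
    tier: str
    idle_seconds: float  # time since last access

def demote(chunk: Chunk) -> None:
    """Move a chunk one tier down (toward the cloud)."""
    i = TIERS.index(chunk.tier)
    if i < len(TIERS) - 1:
        chunk.tier = TIERS[i + 1]

def tiering_pass(chunks: list[Chunk], idle_threshold: float) -> None:
    # Demote any chunk that has been idle longer than the threshold so that
    # dormant data drifts toward the cloud tier while active data stays local.
    for chunk in chunks:
        if chunk.idle_seconds > idle_threshold:
            demote(chunk)

# Example: data untouched for 30 days or more moves down one tier.
data = [Chunk("vm-logs", "hdd", 45 * 86400), Chunk("db-index", "ssd", 120)]
tiering_pass(data, idle_threshold=30 * 86400)
print([(c.name, c.tier) for c in data])  # vm-logs lands in "cloud"
```

In the real solution the decision is driven by free capacity as well as access history, but the basic motion is the same: cold data moves down, active data stays where latency is lowest.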
Thin provisioning
SAN storage is a multitenant environment where storage resources are shared among
multiple servers. Thin provisioning allocates storage capacity to servers in small increments
on a first-come, first-served basis, as opposed to reserving it in advance for each server. The
caveat almost always mentioned with thin provisioning is the concern about over-committing
resources, running out of capacity, and experiencing the nightmare of system crashes, data
corruptions, and prolonged downtime.
However, thin provisioning in the context of hybrid cloud storage operates in an
environment where data tiering to the cloud is automated and can respond to capacity-
full scenarios on demand. In other words, data tiering from CiS to Windows Azure S torage
provides a capacity safety valve for thin provisioning that significantly eases the task of
managing storage capacity on-premises.
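The sketch below illustrates the thin-provisioning idea under the assumptions described above: capacity is consumed only when data is written, volumes can be over-committed, and a hypothetical cloud "safety valve" reclaims space instead of letting the pool run dry. The numbers and method names are illustrative, not product behavior.

```python
class ThinPool:
    """Toy thin-provisioned pool: capacity is consumed only when data is written."""

    def __init__(self, physical_gb: int):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.provisioned_gb = 0  # sum of volume sizes promised to servers

    def create_volume(self, size_gb: int) -> None:
        # Promising capacity costs nothing yet; this is the over-commit.
        self.provisioned_gb += size_gb

    def write(self, gb: int) -> None:
        if self.used_gb + gb > self.physical_gb:
            # Hypothetical safety valve: tier cold data to cloud storage
            # instead of failing the write or crashing the array.
            self.used_gb -= self.tier_cold_data_to_cloud(gb)
        self.used_gb += gb

    def tier_cold_data_to_cloud(self, needed_gb: int) -> int:
        # Placeholder: in a hybrid cloud system this would upload dormant
        # fingerprints and release their on-premises capacity.
        return needed_gb

pool = ThinPool(physical_gb=1000)
pool.create_volume(800)
pool.create_volume(800)   # 1,600 GB promised against 1,000 GB physical
pool.write(950)
pool.write(200)           # triggers the cloud safety valve instead of a crash
print(pool.used_gb, "GB used of", pool.physical_gb)
```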
Summary
The availability of cloud technologies and solutions is pressuring IT teams to move faster
and operate more efficiently. Storage and data management problems are front and center
in the desire to change the way data centers are operated and managed. E xisting storage
technologies and best practices are being questioned for their ability to support data-driven
business goals.
A new architecture called hybrid cloud storage improves the situation by integrating
on-premises storage with cloud storage services providing both the incremental allocation
of cloud storage as well as remote data protection. Extending the traditional o n-premises
storage architecture to include cloud storage services enables much higher levels of
management automation and expands the roles of traditional storage management
functions, such as snapshots and data tiering, by allowing them to be used for backup and
off-site archiving.
The rest of the book explores the implementation of the Microsoft HCS solution and how
it fundamentally changes how data and storage management is done.
CHAPTER 2
Leapfrogging backup with cloud snapshots
Technology obsolescence is another difficult aspect of data protection. As new backup
storage technologies are introduced, IT teams have to manage the transition to those
technologies as well as retain access to data across multiple technologies. This tends to be
more problematic for long-term data archiving than backup, but it is a consideration that
weighs on IT teams nonetheless.
Disaster recovery is the most stressful, complex undertaking in all of IT. Recreating replacement systems from tape backups involves many intricate details that are very difficult
to foresee and plan for. Doing this without the usual set of online resources is the ultimate
test of the IT team’s skills—a test with a very high bar and no chance for a retry. Most IT
teams do not know what their own recovery capabilities are; for example, how much data
they could restore and how long it would take. When you consider how much time, money,
and energy has been invested in backup, this is a sad state of affairs for the IT industry. Data
growth is only making the situation worse.
Tape media
Tape complexity starts with its physical construction. In one respect, it is almost miraculous
that tape engineers have been able to design and manufacture media that meets so many
challenging and conflicting requirements. Magnetic tape is a long ribbon of multiple
laminated layers, including a microscopically jagged layer of extremely small metallic particles
that record the data and a super-smooth base layer of polyester-like material that gives the
media its strength and flexibility. It must be able to tolerate being wound and unwound
and pulled and positioned through a high-tension alignment mechanism without losing the
integrity of its dimensions. Manufacturing data grade magnetic tapes involves sophisticated
chemistry, magnetics, materials, and processes.
Unfortunately, there are many environmental threats to tape, mostly because metals tend
to oxidize and break apart. Tape manufacturers are moving to increase the environmental
range that their products can withstand, but historically, they have recommended storing
them in a fairly narrow humidity and temperature range. There is no question that the
IT teams with the most success using tape take care to restrict its exposure to increased
temperatures and humidity. Also, as the density of tape increases, vibration during transport
has become a factor, resulting in new packaging and handling r equirements. Given that tapes
Most tape rotation schemes make periodic full copies of data in order to avoid
the potential nightmare of needing data from tapes that can’t be read. The
thinking is that tapes that were recently written to will be easier to recover from
and that the backup data will be more complete. The simplest rotation scheme
makes full copies once a week on the weekends and then once a day during
workdays. Sometimes IT teams use other rotation schemes that include making full
copies at monthly or longer intervals.
One of the problems with making full backup copies is that the operation can take
longer to finish than the time available to get the job done. When that happens,
system performance can suffer and impact productivity. Obviously, being able to
skip making full copies would be a big advantage, and that is exactly what the Microsoft HCS solution does.
Backing up to disk
With all the challenges of tape, there has been widespread interest in using disk instead of
tape as a backup target. At first glance, it would seem that simply copying files to a file server
could do the job, but that doesn’t provide the ability to restore older versions of files. There
are workarounds for this, but workarounds add complexity to something that is already
complex enough.
Several disk-based backup solutions have been developed and have become popular, despite the fact that they tend to be more expensive than tape and require careful planning and administration. As replacements for tape, they rely on backup and archiving software to
make a complete solution. When all the pieces of disk-based backup are put together it can
get fairly complicated; however, most users of the technology believe it is well worth it as a
way to avoid all the problems of tape.
VTLs significantly improve the automation of backup processes and provide good backup
performance, but are more expensive than tape backup systems. Because the storage
capacity of virtual tape products is limited, it might not be possible to back up as many servers
or retain as much backup data as desired. For cost, capacity, and performance reasons, VTLs
were mostly used in niche environments until dedupe technology was integrated with them
and made them more widely applicable.
Incremental-only backup
The incremental-only approach to backup makes a single full backup copy and thereafter
makes incremental backup copies to capture newly written data. If synthetic full tapes are
not made, this approach leads to horrendously long and troublesome restores because every
tape that was ever made might be needed for recovery. This implies copies need to be made
of every tape in case they fail and also requires them to be stored in different locations, which
means it might be necessary to have multiple copies at each location to account for media
failures and so on and so forth. (It’s funny what disaster paranoia will lead you to think about.)
That’s why backup vendors developed disk-based, incremental-only backup systems that
automatically copy backup data from a backup system at one site to another system at a
remote location. When a disaster happens at the primary site, a full recovery can be made at
the remote site from backup data in the remote system.
Incremental-only backup solutions integrate database, replication, and backup software
along with the redundant hardware systems and facilities overhead at the remote site.
Like other disk-based backup systems, they have capacity limitations that restrict the
amount of backup data that can be kept, requiring management diligence and planning.
Incremental-only backup systems are effective for solving backup problems, but, if the IT
team also wants to reduce the cost of storage, incremental-only systems probably don’t fit
the bill.
Primary dedupe is the application of dedupe technology for primary production data,
as opposed to being limited to backup data. The main advantage of primary dedupe
is that it reduces the amount of capacity consumed on primary storage—which
tends to be the most expensive storage in the data center. The main disadvantage of
primary dedupe is the performance impact of running dedupe on production data.
A variation of CDP is near-CDP, where the system doesn’t quite stay up to speed,
but is probably close enough for affordability’s sake. Data Protection Manager
from Microsoft is an example of a near-CDP solution that integrates tightly with
Microsoft server products and allows users to pick and choose from numerous
copies of the data on which they were working.
One problem with snapshots is that they consume additional storage capacity on primary
storage that has to be planned for. The amount of snapshot data depends on the breadth
of changed data and the frequency of snapshots. As data growth consumes more and more
capacity the amount of snapshot data also tends to increase and IT teams may be surprised
to discover they are running out of primary storage capacity. A remedy for this is deleting
snapshot data, but that means fewer versions of data are available to restore than expected.
In many cases, that may not be a huge problem, but there could be times when not being
able to restore previous versions of data could cause problems for the IT team. In addition, how easily snapshot capacity can be returned to free space depends on the storage system and may not be as simple as expected.
But Windows Azure Storage works very well for storing fingerprints and fingerprints
definitely are not files—they are logical packages of contiguous blocks of data.
Blocks are an addressing mechanism that operating systems use to calculate where
to put data to maintain system performance. CiS systems exchange blocks, not
files, with servers. On the cloud side, CiS systems exchange objects with Windows
Azure Storage—and fingerprints, chock full of blocks, are the objects used by the
Microsoft HCS solution. Just as the operating system translates files into blocks in
order to store them on storage devices, CiS translates blocks into fingerprints so
they can be stored in the cloud.
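As a rough illustration of that block-to-object translation, the sketch below packages a run of contiguous blocks into a "fingerprint" object named by its content hash before it is uploaded. The chunk sizes, naming scheme, and the `upload_to_azure_storage` stub are assumptions made for illustration; they are not the actual CiS data format or a real storage API call.

```python
import hashlib

BLOCK_SIZE = 4096              # assumed block size
BLOCKS_PER_FINGERPRINT = 256   # assumed packaging: about 1 MiB of contiguous blocks

def make_fingerprints(volume_bytes: bytes):
    """Package contiguous blocks into content-named objects ("fingerprints")."""
    span = BLOCK_SIZE * BLOCKS_PER_FINGERPRINT
    for offset in range(0, len(volume_bytes), span):
        chunk = volume_bytes[offset:offset + span]
        name = hashlib.sha256(chunk).hexdigest()  # same name on-premises and in the cloud
        yield name, chunk

def upload_to_azure_storage(container: str, name: str, data: bytes) -> None:
    # Placeholder for an object (blob) upload; a real system would call the
    # Windows Azure Storage APIs here and skip names it has already uploaded.
    print(f"PUT {container}/{name} ({len(data)} bytes)")

for name, chunk in make_fingerprints(b"\x00" * (2 * 1024 * 1024)):
    upload_to_azure_storage("cis-volume-fingerprints", name, chunk)
```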
Summary
Backing up data, in preparation for recovering from a disaster, has been a problem for IT
teams for many years due to problems with tape technology and the time-consuming manual
processes that it requires. New technologies for disk-based backup, including virtual tape
libraries and data deduplication, have been instrumental in helping organizations reduce or
eliminate the use of tape. Meanwhile, snapshot technology has become very popular with IT
teams by making it easier to restore point-in-time copies of data. The growing use of remote
replication with dedupe backup systems and snapshot solutions indicates the importance
IT teams place on automated off-site backup storage. Nonetheless, data protection has
continued to consume more financial and human resources than IT leaders want to devote
to it.
The Microsoft HCS solution replaces traditional backup processes with a new technology, cloud snapshots, which automates off-site data protection. The integration of
data protection for primary storage with Windows Azure Storage transforms the error-prone
tedium of managing backups into short daily routines that ensure nothing unexpected
occurred. More than just a backup replacement, cloud snapshots are also used to quickly
access and restore historical versions of data that were uploaded to the cloud.
One of the key technology elements of the Microsoft HCS solution is granular data objects called fingerprints, which are created by the integrated dedupe process. The Microsoft HCS solution tracks all fingerprints on premises and in the cloud with a system
of pointers that provide disaster recovery capabilities as well as the ability to recover
point-in-time copies of data. The system of fingerprints and metadata pointers in the
Microsoft HCS solution forms a hybrid cloud management system that is leveraged to
provide functions that supersede those of backup systems grounded in tape technologies
and processes. The next chapter, “Accelerating and broadening disaster recovery protection,”
continues the discussion by showing how the hybrid data management system enables
deterministic, thin full recoveries of data from Windows Azure Storage.
CHAPTER 3
Accelerating and broadening disaster recovery protection
Planning for the unexpected
DR plans are customized documents that identify the roles, processes, technologies, and
data that are used to recover systems and applications. The best plans are comprehensive
in scope, identifying the most important applications that need to be recovered first as well
as providing contingency plans for the inevitable obstacles that arise from the chaos of a
disaster. DR plans should be regularly updated to address changing application priorities.
Data growth complicates things by forcing changes to servers and storage as capacity is
filled and workloads are redistributed. Hypervisors that allow applications and storage to be
relocated help IT teams respond quickly to changing requirements, but those changes are
somewhat unlikely to be reflected in the DR plan. That doesn’t mean the application and data
can’t be restored, it simply means that the IT team could discover the plan isn’t working as
expected and have to rely on memory and wits. The Microsoft HCS solution accommodates
data growth without having to relocate data onto different storage systems. The advantage
of knowing where your data is and where it should be recovered to after a disaster cannot be
emphasized enough.
Statistics are often quoted for the high failure rate of businesses that do not have a DR plan when a disaster strikes. These so-called statistics probably are fictitious because there is no way of knowing if a business had a DR plan or if it moved to a new location, changed its name, or was out of business temporarily while facilities
were being rebuilt. It’s difficult to put calipers on survival when it can take so many
forms.
However, it is also obvious that some businesses do indeed fail after a disaster
and that businesses struggle to return to pre-disaster business levels. The loss of
business records, customer information, and business operations contributes heavily
to the eventual failure of a business. There are many things that are weakened by a
disaster and losing data certainly doesn’t help.
FIGURE 3-1 The timeline of a recovery includes the disaster, recovery point, and recovery time.
The timeline shown in Figure 3-1 will likely have different dimensions and scales depending on the importance of the application. The IT team sets recovery point objectives
(RPOs) and recovery time objectives (RTOs) for applications based on their importance to the
organization and the data protection technology they are using for recovery. The highest
priority applications typically are protected by technologies providing the shortest RPOs and
RTOs while the lowest priority applications are protected by tape backup technology, which
provides the longest RPOs and RTOs.
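A simple way to see how those objectives turn into numbers is sketched below. The worst-case RPO is bounded by the interval between protection points, and a rough RTO estimate adds setup time to the time needed to bring back the data. The schedules, data sizes, and restore rates are hypothetical values used only for illustration.

```python
def worst_case_rpo_hours(interval_hours: float) -> float:
    # Data written just after one protection point is exposed until the next one.
    return interval_hours

def estimated_rto_hours(restore_gb: float, rate_gb_per_hour: float,
                        setup_hours: float = 1.0) -> float:
    # RTO is roughly the time to stand up systems plus the time to restore their data.
    return setup_hours + restore_gb / rate_gb_per_hour

# Hypothetical protection tiers for three applications.
apps = [
    ("order processing", 0.25,  500, 400),   # replication, 15-minute intervals
    ("file shares",      6.0,  2000, 200),   # cloud snapshots four times a day
    ("archive reports", 24.0,  8000, 100),   # nightly tape backup
]
for name, interval, gb, rate in apps:
    print(f"{name}: RPO <= {worst_case_rpo_hours(interval)} h, "
          f"RTO ~ {estimated_rto_hours(gb, rate):.1f} h")
```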
STORAGE-BASED REPLICATION
Replication between storage systems is a proven method for providing excellent RPOs and
RTOs. With storage-based replication, applications running on servers at a secondary site can
read data that had been written to storage at the primary site only a few moments earlier.
Storage-based replication can easily multiply data center costs by adding the costs of
duplicate storage and server systems, backup systems and software at both sites, low-latency
network equipment, and the management, maintenance, and facilities overhead associated
with running dual sets of equipment.
A bucket is the generic word for a storage container that holds data objects in the
cloud. They are sometimes compared to large disk drives, but it is more useful
to think of them as specialized servers that store data objects. They are accessed
and managed using cloud APIs. A storage volume is the generic word for a storage
container for data in on-premises storage systems. It is more frequently used for
block data than for file shares, but it is sometimes used to refer to the container
where a file share is.
In the hybrid cloud storage model, the contents of a volume are protected by
uploading them as fingerprints to a Windows Azure Storage bucket. A Windows
Azure Storage bucket typically stores fingerprints from multiple volumes on the
CiS system. In fact, it is not unusual for a single Azure storage bucket to store the
fingerprints for all the volumes in a CiS system.
Disaster recovery operations begin by selecting a cloud snapshot date and time and downloading the metadata map from its bucket to a recovery CiS system. When the map
is loaded, servers and VMs at the recovery site can mount the storage volumes that had
previously been on a source CiS system, and then users and applications can browse and open
files. The fingerprints from the source CiS system are still on the other side of the hybrid cloud
boundary, but can now be accessed and downloaded in a way that is similar to a remote file
share.
Figure 3-3 shows the relationship between Windows Azure Storage, source and recovery
CiS systems, and illustrates how the metadata map is uploaded, stored, and downloaded.
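A minimal sketch of that recovery sequence follows. The class and method names are assumptions that only illustrate the flow Figure 3-3 describes, not actual product APIs, and the bucket object stands in for whatever wraps the cloud storage calls.

```python
class FakeBucket:
    """Stand-in for a Windows Azure Storage container holding a cloud snapshot."""
    def __init__(self, snapshot_map, fingerprints):
        self._map, self._fps = snapshot_map, fingerprints
    def download_map(self, snapshot_id):
        return dict(self._map)
    def download_fingerprint(self, name):
        return self._fps[name]

class RecoveryCiS:
    """Toy model of a recovery-side CiS system after a disaster."""
    def __init__(self, bucket):
        self.bucket = bucket          # cloud container with fingerprints and the map
        self.metadata_map = {}        # (volume, block address) -> fingerprint name
        self.local_fingerprints = {}  # fingerprints downloaded so far

    def start_recovery(self, snapshot_id: str) -> None:
        # Step 1: download only the metadata map; volumes become mountable
        # immediately, long before the bulk of the data comes back.
        self.metadata_map = self.bucket.download_map(snapshot_id)

    def read(self, volume: str, block: int) -> bytes:
        # Step 2: fingerprints are fetched on demand as applications touch data,
        # so only the working set ever crosses the hybrid cloud boundary.
        name = self.metadata_map[(volume, block)]
        if name not in self.local_fingerprints:
            self.local_fingerprints[name] = self.bucket.download_fingerprint(name)
        return self.local_fingerprints[name]

bucket = FakeBucket({("vol1", 0): "fp-a"}, {"fp-a": b"payload"})
cis = RecoveryCiS(bucket)
cis.start_recovery("2013-06-01T02:00")
print(cis.read("vol1", 0))   # downloaded from the cloud on first access only
```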
There are many variables that can influence download performance and individual results will vary; nonetheless, readers will want some idea of
download times for the metadata map. In a hypothetical example of 10 TB of data
stored in the cloud over an unencumbered DS3 (44.7 Mbps) internet connection,
the metadata map would likely be downloaded in less than 2 hours.
Let’s take the hypothetical example of 10 TB of data stored in the cloud with a
DS3 Internet connection and estimate the difference in recovery times between
the Microsoft HCS solution and using cloud-storage-as-virtual-tape with backup
software. In the previous sidebar, we estimated the time to download the metadata
map to be less than 2 hours. From then on, applications can access their data. With
virtual tape, however, all the tape images would be downloaded first, which would
probably take over 3 weeks. Using dedupe with a cloud-storage-as-virtual-tape
would improve download performance considerably, but recovery times would
likely be slower by an order of magnitude compared to the Microsoft HCS solution.
Clearly, there are big differences in the way that cloud storage is used.
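The sidebar's estimates are easy to sanity-check with back-of-the-envelope arithmetic. The metadata-map size below is an assumed figure chosen only to show the scale of the difference; the link speed is the DS3 rate quoted above.

```python
DS3_MBPS = 44.7          # link speed quoted in the sidebar
BYTES_PER_TB = 10**12

def transfer_hours(size_bytes: float, link_mbps: float) -> float:
    # bytes -> bits, divided by link rate in bits per second, converted to hours
    return (size_bytes * 8) / (link_mbps * 10**6) / 3600

full_data = 10 * BYTES_PER_TB     # all 10 TB, as virtual tape images would require
metadata_map = 35 * 10**9         # assumed ~35 GB map for 10 TB of protected data

print(f"all data:     {transfer_hours(full_data, DS3_MBPS) / 24:.1f} days")   # ~20.7 days (about 3 weeks)
print(f"metadata map: {transfer_hours(metadata_map, DS3_MBPS):.1f} hours")    # ~1.7 hours (under 2 hours)
```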
Windows Azure Storage does not offer the same types of services, but instead
provides affordable and reliable storage with built-in data protection features that
IT teams can rely on to recover from a disaster. Rather than consulting on how to
recover, Windows Azure Storage services is part of the actual recovery process.
Location-independent recovery
Through cloud snapshots, the Microsoft HCS solution uploads all the data needed for
recovery into one or more Windows Azure Storage buckets. The portability of fingerprints
and the metadata map makes it possible for one or more recovery CiS systems to access
those buckets from virtually any location with a suitable Internet connection.
An organization does not have to operate multiple data centers in order to take advantage
of location-independent recovery. An example would be a business with a primary data
center that has the ability to quickly set up VMs and a spare CiS system in a local colocation
facility. Location-independent recovery gives the IT team many options for developing a DR
strategy that fits their operations and their budgets.
Summary
Disaster recovery is a fundamental best practice for all IT teams, yet many of them struggle
with the technologies, tools, and processes they have. The combination of data growth, the
difficulty writing, updating, and testing DR plans, and the need to make DR more cost-
effective is making it very difficult for IT teams to do the job the way they know it needs to
be done. Solutions like remote replication work well to reduce RPOs and RTOs for a limited
number of mission-critical applications, but the expense of owning and operating dual
environments for replication means that a lot of data does not get the DR coverage that the
organization needs.
The Microsoft HCS solution is based on the hybrid management model where deduped
fingerprints on a source CiS system are uploaded to Windows Azure Storage where they
can be downloaded to another recovery CiS system for DR purposes. The recovery data
that is stored in the cloud does not consume floor space, power, or cooling costs in any of
the organization’s data centers. Fingerprints in Windows Azure Storage are protected in the
cloud by replication and geo-replication services. One of the key management elements is
an object called the metadata map, which contains pointers to all the fingerprints that were
uploaded by the source CiS system. The combination of the fingerprints and the metadata
map creates a portable, deduped data volume that can be downloaded to another CiS system
during recovery operations.
In a recovery operation, the metadata map is downloaded first and then all the data
that had been uploaded becomes visible to applications and users. Thereafter, the
download process is driven by applications as they access their data. This deterministic,
application-driven recovery process limits the data that is downloaded to only the deduped
working set, leaving all the data that is not needed in the cloud. The thin, fast recovery
capabilities of the Microsoft HCS solution enable IT teams to test their DR plans without
disrupting their production operations. Recovery times with deterministic restores are short.
Recovery points can be reduced by taking cloud snapshots several times a day.
The hybrid cloud management model enables a number of flexible, cost-reducing
data recovery architectures. A single CiS system can be a spare for other CiS systems in
an N:1 topology, or one or more CiS systems can be used to recover data for one or more disaster-stricken CiS systems in an N:N topology. There is no need to duplicate a data center
environment for DR with the Microsoft HCS solution.
CHAPTER 4
Taming the capacity monster
Here’s how it works. VM data is formatted and stored as a portable storage object called a
virtual machine disk (VMDK) in VMware ESX environments or a virtual hard disk (VHD) in
Windows Server 2012 Hyper-V environments. A VMDK/VHD on a source storage system can
be synchronized with a copy running on a destination storage system. When all the data
on the source is synchronized at the destination, the VM switches to using the destination
VMDK/VHD. Figure 4-1 illustrates the process of storage migration.
FIGURE 4-1 In storage migration, a VM moves its VHD from one storage system to another.
After the synchronization process has completed and the VM is accessing data from
the destination VMDK/VHD, the source VMDK/VHD and all its data can be deleted and its
capacity reclaimed and used for other purposes.
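The sketch below illustrates the synchronize-then-switch idea in a highly simplified form; the classes and method names are placeholders for what a hypervisor's storage migration feature does, not actual SVMotion or Hyper-V APIs.

```python
class Volume:
    """Toy storage volume holding a VHD image and tracking changed regions."""
    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})
        self.dirty = set(self.blocks)      # regions not yet synced to a destination

    def write(self, region, data):
        self.blocks[region] = data
        self.dirty.add(region)

    def take_delta(self):
        delta = {r: self.blocks[r] for r in self.dirty}
        self.dirty.clear()
        return delta

def migrate(source: Volume, destination: Volume):
    # Keep copying changed regions while the VM stays online; when the remaining
    # delta is empty, the VM can switch to the destination copy almost instantly.
    while True:
        delta = source.take_delta()
        if not delta:
            break
        destination.blocks.update(delta)

src = Volume({0: b"os", 1: b"app"})
dst = Volume()
src.write(2, b"new data")   # the VM keeps writing during the migration
migrate(src, dst)
print(sorted(dst.blocks))   # [0, 1, 2] -- the destination is now in sync
```

After the switchover, the source copy can be deleted and its capacity reclaimed, which is the step the paragraph above describes.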
Storage migration is a powerful tool for managing data on a storage system that is approaching its capacity limits. Without it, the IT team would have to stop all the applications on the source volume, copy them to the destination volume, redirect the VM to access the
volume on the destination, and then bring up the applications. This is a lot of work that
involves application downtime and planning.
Storage migration depends on having a suitable destination storage system with sufficient
capacity and performance capabilities to accommodate the new workload. Finding a suitable
destination storage system becomes more difficult as utilization levels increase across
available storage resources.
The Microsoft HCS solution works with SVMotion and Storage Live Migration
in either the source or destination roles. Its scalability characteristics make it a
valuable tool for managing environments suffering from VM sprawl.
VM sprawl occurs when there are so many VMs created that it is virtually impossible
to manage them all. Over time, many of these “zombie” VMs are no longer used,
but the data that was stored for them continues to consume storage capacity. IT
teams use the Microsoft HCS solution as a destination storage system for migrated
zombie data from other storage systems. The bottomless capacity of the Microsoft
HCS solution with Windows Azure Storage really comes in handy when you are
fighting zombies.
FIGURE 4-3 The Microsoft HCS solution offers expandable capacity for thin provisioning.
Storage architectures: Scale-up, scale-out, and scale-across with cloud storage as a tier
Scale-up storage systems tend to be the least flexible and involve the most planning and
longest service interruptions when adding capacity. The cost of the added capacity depends
on the price of the devices, which can be relatively high with some storage systems. High
availability with scale-up storage is achieved through component redundancy that eliminates
single points of failure, and by software upgrade processes that take several steps to
complete, allowing system updates to be rolled back, if necessary.
An architecture that adds capacity by adding individual storage systems, or nodes, to a
group of systems is called a scale-out architecture. Products designed with this architecture
are often classified as distributed storage or clustered storage. The multiple nodes are typically
connected by a network of some sort and work together to provide storage resources to
applications.
In general, adding network storage nodes requires less planning and work than adding
devices to scale-up storage, and it allows capacity to be added without suffering service
interruptions. The cost of the added capacity depends on the price of an individual storage
node—something that includes power and packaging costs, in addition to the cost of the
storage devices in the node.
Scale-up and scale-out storage architectures have hardware, facilities, and maintenance
costs. In addition, the lifespan of storage products built on these architectures is typically
four years or less. This is the reason IT organizations tend to spend such a high percentage of
their budget on storage every year.
Scale-across storage
The Microsoft HCS solution uses a different method to increase capacity—by adding storage
from Windows Azure Storage in a design that scales across the hybrid cloud boundary.
Scale-across storage adds capacity by incrementally allocating it from a Windows Azure
Storage bucket. This allocation is an automated process that needs no planning and requires
no work from the IT team. The cost of additional capacity with this solution is the going rate
for cloud storage and does not include facilities and maintenance costs. However, using cloud
storage this way involves much higher latencies than on-premises storage, which means it
should be used for application data that can be separated by latency requirements—in other
words, data that is active and data that is dormant.
Figure 4-4 compares the three scaling architectures.
FIGURE 4-5 The typical data life cycle with access frequency as a function of data age.
The access frequency of data depends on the application, and there are certainly other
life cycles that don’t follow the curve of Figure 4-5. For instance, accounting applications that
process data on monthly or quarterly schedules will have a different curve. Nonetheless, a lot
of data does follow this asymptotic curve, where access declines over time.
Data in the Microsoft HCS solution is encapsulated as fingerprints. As each fingerprint
ages, it moves down the life cycle curve from top left to bottom right and at some point
the capacity management algorithms in the CiS system determine that data needs to be
relocated to the cloud. There is no specific point along the curve that determines when this
happens because the process is driven by the available free capacity in the CiS system.
The life cycle of a fingerprint in the Microsoft HCS solution includes the creation
and management of twin copies that are on both sides of the hybrid cloud
boundary. Both copies are managed as part of the hybrid cloud data management
system. This is different than cloud backup or archiving solutions where the copies
in the cloud are managed by a different process.
Because there is a single management system that spans on-premises and cloud
storage, the processes for deleting on-premises copies and downloading cloud
copies are transparent and automated. Transparency combined with automation is a
hallmark of hybrid cloud storage.
The active data in Figure 4-6 is the working set stored in a CiS system. It follows that the
dormant data in the figure is stored in cloud storage. As the graphic indicates, the size of the
working set grows much more slowly over time, even though the overall data growth—and
the growth of dormant data—is much faster. The reason for the difference in the two curves
is that the working set does not accumulate data; instead, it continuously expels aging
active data to dormant storage at a rate that tends to match the ingestion rate of new data.
Backup dedupe ratios are typically much higher than dedupe ratios for primary
storage. Dedupe processes are I/O intensive, which can create performance
problems for other concurrent storage processes. Backup dedupe doesn’t have
any other storage processes to interfere with, but primary dedupe designs have to
consider the performance of primary storage. Two ways to reduce the performance
impact of dedupe are to put the dedupe data on low-latency storage and use a less
aggressive dedupe process.
Although the dedupe ratios for primary storage are less than they are for backup
storage, they can be leveraged in more significant ways. For starters, deduping
primary storage directly counteracts the effects of data growth—which means
the IT team does not have to purchase storage capacity as frequently. Deduping
primary storage can also act like source-side dedupe, which reduces the amount
of data that needs to be copied during data protection operations, for example,
nightly cloud snapshots will copy less data to Windows Azure Storage.
Summary
IT teams looking for ways to manage data growth have many things to consider, including
how often they will need to upgrade the capacity of their storage systems and what the
associated management overhead is. Automated storage and data management tools that
provide flexible ways to structure storage and lower its costs are highly valued.
Virtual server tools, such as SVMotion and Storage Live Migration, that migrate VM data
from one storage system to another are an effective way to manage capacity crises when
CHAPTER 5
NOTE The word archive is sometimes used instead of “backup,” a topic that was
discussed in Chapter 2, “Leapfrogging backup with cloud snapshots.” Sometimes
the word archive is used to refer to migrating data from primary storage to free disk
capacity, a topic that was discussed in Chapter 4, “Taming the capacity monster.”
In this chapter the word archive refers to storing digital business records for an
extended period of time.
important elements of library science, with specialized requirements for very long-term data
storage (think millenniums) and methods which are beyond the scope of this book.
The other use case for digital archiving is for business purposes and is the subject of this
chapter. Digital archiving in the business context is one of the most challenging management
practices in all of IT because it attempts to apply legal and compliance requirements over a
large and growing amount of unstructured data. Decisions have to be made about what data
to archive, how long to keep the data that is archived, how to dispose of archived data that
is no longer needed, what performance or access goals are needed, and where and how to
store it all.
Like disaster recovery (DR), archiving for legal and compliance purposes is a cost without
revenue potential. For that reason, companies tend to limit their expenditures on archiving
technology without hindering their ability to produce documents when asked for them. There
are other reasons to archive business data but, in general, business archiving is closely tied to
compliance and legal agendas.
The ability to find and access archived data tends to be a big problem. Storing dormant
data safely, securely, and affordably for long periods of time is at odds with being able to
quickly find specific files and records that are pertinent to unanticipated future queries. The
selection of the storage technology used to store archived data has big implications for the
long-term cost of archiving and the service levels the IT team will be able to provide.
Compliance and legal requirements have driven the development of electronic discovery
(eDiscovery) software that is used to quickly search for data that may be relevant to an inquiry
or case. Courts expect organizations to comply with orders to produce documents and have
not shown much tolerance for technology-related delays. Due to the costs incurred in court
cases, eDiscovery search and retrieval requirements are often given higher priority than
storage management requirements. In other words, despite the desire to limit the costs of
archiving, in some businesses, the cost of archival storage is relatively high, especially when
one considers that the best case scenario is one where archived data is never accessed again.
Complete archiving solutions often combine long-term, archival storage with eDiscovery
software, but there is a great deal of variation in the ways archiving is implemented. Many
companies shun eDiscovery software because they can’t find a solution that fits their needs
or they don’t want to pay for it. Unlike backup, where best practices are fairly common across
all types of organizations, archiving best practices depend on applicable regulations and the
experiences and opinions of an organization’s business leadership and legal team. Digital
archiving is a technology area that is likely to change significantly with the development of
cloud-storage archiving tools in the years to come.
Archiving to disk
High-capacity disk systems are also used for archiving, even though they are much
more expensive to operate than tape. A feature, referred to as drive spindown, has been
incorporated into some disk systems to reduce power and cooling costs by selectively
stopping individual disk drives in the storage system. When data is needed on drives that are
spun down, the system starts them again and reads the data. The problem with spindown
technology is that disk drives are generally not made to be powered on and off and
sometimes they do not respond as expected. Application performance can also be erratic.
There is no question that disk is superior to tape for searching with eDiscovery solutions.
The immediate access to files and the ability to search both production data and online
archived data on disk saves everybody involved a great deal of time—which is a big deal
to corporate legal teams. However, disk-based archiving still requires some form of disaster
protection, which is usually tape, and all the overhead related to data protection, including
administrative time, equipment, media, and facilities costs.
FIGURE 5-1 Archive data is migrated from primary archive storage connected to an archive server to a file
server connected to Microsoft HCS.
Summary
Digital archiving is a growing concern for most IT teams due to the necessity and challenges
of complying with regulations and policies. IT teams are looking for solutions that help them
comply with regulations and reduce the cost of archiving by reducing the cost of archival
storage and by automating the process.
The Microsoft HCS solution features Windows Azure Storage and is capable of storing data
for many years in compliance with government regulations. Windows Azure has achieved
several important compliance milestones including the HIPAA Business Associate Agreement
(BAA), ISO/IEC 27001:2005 certification, and SSAE 16 / ISAE 3402 attestation.
Cloud snapshot policies determine the length of time that data is retained in the cloud and
are easily customized to fit a wide variety of data retention requirements. IT teams can choose
to archive data in place or in special archive volumes, or choose to use the Microsoft HCS
solution as secondary storage for offloading existing enterprise storage systems. Encryption technology protects privacy, hashing provides data integrity checks, and Windows Azure Storage replication services provide availability and DR protection.
The next chapter, “Putting all the pieces together,” looks at the broader system capabilities
of the Microsoft HCS solution and discusses a number of use cases where IT teams can
successfully deploy it.
CHAPTER 6
Putting all the pieces together
The system of fingerprints and pointers
When data is initially written to the CiS system it is stored as block data and placed in
an input queue, waiting to be deduped. This input queue is also referred to as the linear
tier. Data in the linear tier is a very small percentage of all the data in the system, and like
fingerprints, it is protected by cloud snapshots.
When data exits the linear tier, it is run through the dedupe process and if there is a match
to an existing fingerprint, the data is assigned to that fingerprint’s pointer. If the dedupe
process does not find a matching fingerprint, a new fingerprint is generated, and a new
pointer is created mapping the incoming data to the new fingerprint. It follows that the block
storage address spaces that are exported externally to servers are mapped internally by
pointers to fingerprints. When a server accesses data stored by the Microsoft HCS solution,
the CiS system looks up the fingerprints needed to service the request using these pointers.
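To make that write path concrete, here is a minimal sketch of a dedupe step that maps incoming block data to fingerprints by content hash. The hash choice and data structures are illustrative assumptions, not the CiS system's actual design.

```python
import hashlib

fingerprint_store = {}   # fingerprint name -> deduped data (whatever tier it lives on)
block_map = {}           # (volume, block address) -> fingerprint name (the "pointers")

def write_block(volume: str, address: int, data: bytes) -> None:
    """Dedupe one block as it exits the linear tier."""
    name = hashlib.sha256(data).hexdigest()
    if name not in fingerprint_store:
        fingerprint_store[name] = data        # new fingerprint, stored once
    block_map[(volume, address)] = name       # pointer maps the block address to it

def read_block(volume: str, address: int) -> bytes:
    # Servers see block addresses; the system looks up the fingerprint behind them.
    return fingerprint_store[block_map[(volume, address)]]

write_block("vol1", 0, b"hello world")
write_block("vol1", 7, b"hello world")   # duplicate data creates only a second pointer
print(len(fingerprint_store), "fingerprint stored for 2 block writes")
```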
Fingerprints can be stored on any tier, determined by how recently they were accessed.
Every tier is managed as a queue where least recently accessed fingerprints are moved to
lower tiers as part of capacity management.
The pointers that map block storage to fingerprints are sometimes called the metadata.
Metadata is also tiered by the system based on least recently used information, but fingerprints
and metadata are tiered independently of each other. In general, metadata is located
on a tier with less latency than where the fingerprints are placed in order to minimize the
performance impact of looking up fingerprints.
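To make the relationship between fingerprints, pointers, and tiers more concrete, the following toy model sketches the general idea in Python. It is only an illustration under stated assumptions, not the CiS implementation: the class name, the SHA-256 hashing, the fixed chunk size, and the three tier labels are all invented for the example.

import hashlib
from collections import OrderedDict

CHUNK_SIZE = 64 * 1024  # illustrative chunk size; the real system defines its own granularity

class HybridStoreSketch:
    """Toy model of dedupe with fingerprints, pointers, and least recently accessed tiering."""

    def __init__(self):
        self.fingerprints = {}    # fingerprint name -> chunk bytes (None once tiered to the cloud)
        self.location = {}        # fingerprint name -> "ssd", "hdd", or "cloud"
        self.pointers = {}        # (volume, block address) -> fingerprint name
        self.lru = OrderedDict()  # least recently accessed fingerprint names first

    def write(self, volume, address, chunk):
        name = hashlib.sha256(chunk).hexdigest()      # the fingerprint's unique name
        if name not in self.fingerprints:
            # No matching fingerprint: store the data once, starting in the fastest tier.
            self.fingerprints[name] = chunk
            self.location[name] = "ssd"
        # Duplicate or not, the exported block address simply points at the fingerprint.
        self.pointers[(volume, address)] = name
        self._touch(name)

    def read(self, volume, address):
        name = self.pointers[(volume, address)]       # pointer lookup, then fingerprint lookup
        self._touch(name)
        return self.fingerprints[name]

    def _touch(self, name):
        # Move to the most recently used end; capacity management demotes from the other end.
        self.lru.pop(name, None)
        self.lru[name] = True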
Fingerprints provide a data abstraction across all tiers, including the cloud tier in Windows
Azure Storage. They are identified uniquely, by name, regardless of which tier they are in.
Fingerprints stored in Windows Azure Storage are not encapsulated in other data formats,
such as tape formats, and are managed as part of a single hybrid data management system.
This allows fingerprints to be immediately accessed by all storage functions, without having
to convert and load them into primary storage before using them. Fingerprints stored in the
cloud tier are encrypted, however, so they must be unencrypted before they can be used
again on-premises.
Because the Microsoft HCS solution uses the same names for fingerprints on-premises as in
the cloud, tiering is done by updating metadata to use a fingerprint's cloud copy. Tiering this
way is much faster and more efficient than uploading data and helps explain how capacity
can be reclaimed so quickly when facing a sudden influx of new data.
Recovery operations also leverage the fingerprint naming scheme. Using the pointers
downloaded with the metadata map, a recovery CiS system can access and download the
fingerprints needed to fulfill IO requests from servers. As the fingerprints are downloaded to
the CiS system they are placed in the SSD tier and their pointers are updated to reflect their new
location. The name of the fingerprint does not change.
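Extending the toy model above, tiering a fingerprint out and recalling it later can be modeled as little more than a change to its recorded location, because the fingerprint keeps the same name in every tier. The function names and the cloud_bucket dictionary standing in for cloud storage are assumptions made for the illustration.

def demote_to_cloud(store, name, cloud_bucket):
    # Tier out a least recently accessed fingerprint. If a cloud snapshot already uploaded
    # it, the upload is a no-op and only the on-premises location metadata changes.
    cloud_bucket.setdefault(name, store.fingerprints[name])
    store.fingerprints[name] = None          # local capacity is reclaimed immediately
    store.location[name] = "cloud"

def recall_from_cloud(store, name, cloud_bucket):
    # On access, or during a recovery, download the fingerprint on demand, place it in
    # the SSD tier, and update its location; the fingerprint's name never changes.
    store.fingerprints[name] = cloud_bucket[name]
    store.location[name] = "ssd"
    store._touch(name)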
Another technique is wide striping, which uses many disk drives in order to
increase the available input/output operations per second (IOPS) in the system. The more
disk drives there are in a system, the more IOPS can be generated to support
transaction processing applications. Performance is achieved by adding disk
drives—and capacity—with far less interest in optimizing capacity than in boosting
performance. Capacity utilization can be increased in wide-striping disk systems
by filling them up with data from more applications, but at the risk of creating
contention for disk resources, which adversely impacts performance.
Storage systems that use flash SSDs are turning the tables on the old “performance
first” designs of disk-based storage. The IOPS-generating capabilities of SSDs are
great enough that their performance can be used to increase capacity through
techniques like dedupe—which are too IOPS-intensive to be practical on disk-based
storage systems. Increasing the effective capacity through dedupe allows more
data to be stored in flash SSDs, which improves the overall performance of the
system.
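A rough back-of-the-envelope comparison shows the scale of the difference. The per-device figures below are ballpark assumptions used only for illustration, not measurements of any particular product.

# Aggregate IOPS: wide-striped HDDs versus a handful of SSDs (assumed per-device figures).
HDD_IOPS = 180        # a 15K RPM enterprise disk is commonly quoted in this neighborhood
SSD_IOPS = 20_000     # even a modest enterprise SSD delivers orders of magnitude more

hdd_count, ssd_count = 100, 4
print(f"{hdd_count} wide-striped HDDs: ~{hdd_count * HDD_IOPS:,} IOPS")
print(f"{ssd_count} SSDs:              ~{ssd_count * SSD_IOPS:,} IOPS")
# The SSDs' surplus IOPS is what makes IOPS-hungry features such as inline dedupe practical.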
The best performance with the Microsoft HCS solution is achieved when the least amount
of data needs to be downloaded from cloud storage. In other words, the working set of the
data easily fits within the capacity resources of the CiS system.
FIGURE 6-2 Fingerprints are tiered to cloud storage based on their access frequency and the available
capacity in the CiS system.
The working set in a CiS system is a dynamic entity that changes as new data is written to
the CiS system and as applications and users change the data they are actively working with.
When fingerprints that were previously tiered to the cloud are accessed and downloaded
from the cloud, they are added back to the working set.
Just don't do it. Don't defrag. Not only does defragmentation fail to improve the
performance of the SSD tier in a CiS system, it also contributes to the SSDs wearing out
faster. Other server utilities that access large amounts of data should likewise be
avoided.
The optimal practice for migrating data to a new CiS system is to copy data
based on its access dates in its pre-CiS storage. One way to do this would be
to first copy the half of the data from each volume that has been least recently
accessed, and then copy the other half that was most recently accessed. That way
the CiS system starts with an access history that roughly resembles the access
characteristics of the data before it was migrated. True geeks will think of all kinds
of interesting ways to do this and, as usual, their mileage will vary.
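As one hedged example, a script along the following lines could perform the two-pass copy; the source and destination paths are placeholders, and the approach is only as good as the access-time (atime) tracking on the pre-CiS file system.

import os
import shutil

def two_pass_copy(src_root, dst_root):
    # Copy the least recently accessed half of the files first, then the most recently
    # accessed half, so the new volume ends up with a realistic access history.
    files = []
    for dirpath, _, names in os.walk(src_root):
        for name in names:
            path = os.path.join(dirpath, name)
            files.append((os.stat(path).st_atime, path))
    files.sort()                                   # oldest access times first
    midpoint = len(files) // 2
    cold, hot = files[:midpoint], files[midpoint:]
    for batch in (cold, hot):                      # pass 1: cold half, pass 2: hot half
        for _, path in batch:
            dst = os.path.join(dst_root, os.path.relpath(path, src_root))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(path, dst)

# two_pass_copy("/mnt/old-volume", "/mnt/cis-volume")   # placeholder mount points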
Establishing DR competency
Chapter 3, “Accelerating and broadening disaster recovery protection,” was largely devoted
to the problems IT teams have with DR and how the Microsoft HCS solution improves the
situation. To summarize briefly, data growth is making an already bad situation worse: the
amount of data that needs to be protected and later restored is too large to handle with existing
technologies and processes. Backup processing can't be counted on to complete within the
backup window, which means data might not be restorable. Many IT teams are struggling
with how to protect a swelling volume of data across their systems and applications.
IT teams recognize that their ability to recover is impaired. In many cases, they cannot test
their recovery plans because the disruption to production operations would be too great,
or if they do test them, they encounter too many problems and can’t finish. As a result, they
can only guess what the recovery time objectives (RTOs) or recovery point objectives (RPOs)
might be for their applications, and although they find this situation unacceptable, they don’t
know what to do about it.
The Microsoft HCS solution gives the IT team a new, efficient DR tool. Cloud snapshots are
much more efficient than other data protection technologies, completing the task in much
less time and requiring far less administrative effort. Just as importantly, successful recovery tests
can be conducted with a relatively small amount of hardware and minimal interruptions
to production operations. The CiS system used for the test needs only an adequate Internet
connection to access fingerprints; from there, deterministic restores ensure that only
the metadata map and working sets are downloaded to establish realistic RPOs and RTOs.
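A back-of-the-envelope estimate makes the point; the sizes and link speed below are hypothetical and exist only to show the arithmetic behind a recovery-time estimate.

# Only the metadata map and the working set need to come down before servers resume work.
metadata_map_gb = 5      # assumed size of the downloaded metadata map
working_set_gb = 200     # assumed working set needed on-premises for normal service
link_mbps = 100          # assumed dedicated Internet bandwidth at the recovery site

def hours_to_download(gigabytes, mbps):
    return gigabytes * 8 * 1000 / mbps / 3600     # GB -> megabits, then seconds -> hours

print(f"Metadata map: ~{hours_to_download(metadata_map_gb, link_mbps):.1f} hours")
print(f"Working set:  ~{hours_to_download(working_set_gb, link_mbps):.1f} hours")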
A lot of IT teams are looking for ways to exchange corporate data similar to how
they use consumer cloud storage services, such as SkyDrive from Microsoft,
but with more security and control. The Microsoft HCS solution provides a way for
IT teams to do this that is completely under their control, secured by encryption,
and with data integrity ensured.
Summary
The scale-across architecture of the Microsoft HCS solution is unique in the industry.
Designed for the problems of data growth, it does more than simply transfer data between
on-premises storage and cloud storage—it keeps data online and instantly accessible using
the same names and data format regardless of where it is stored. This means data never has
to be copied by additional storage products and processes, such as tape or dedupe backup
equipment. It also explains how a fingerprint uploaded to Windows Azure Storage by a cloud
snapshot during a backup operation can be used by the cloud-as-a-tier feature months later.
The fact that data spans on-premises and Windows Azure Storage has important
implications for performance. The key to performance with the Microsoft HCS solution
is maintaining a reasonably stable working set and knowing that least recently accessed
fingerprints are tiered first when the system needs to expand capacity. Applications that do
not have a high degree of volatility form the most predictable working sets.
There are a number of ways IT teams can use the Microsoft HCS solution to solve their
storage problems. In addition to the backup, disaster recovery, capacity growth, and
archiving examples detailed in Chapters 2-5, it also can be used very effectively for enterprise
document management and for large Microsoft SharePoint environments.
CHAPTER 7
NOTE This chapter is a piece of technology fiction that is based on the state of
hybrid cloud storage technology today and projecting how it might develop in the
years to come. Hybrid cloud storage is certain to have an interesting future that will
be shaped by unforeseen events, technical inventions, and business changes. This
chapter is bound to get some things wrong but, with some luck, may get a few things
right. Read at your own risk.
MORE INFO An excellent blog post on setting up Windows Azure environments can
be found here: http://sqlblog.com/blogs/buck_woody/archive/2013/04/17/creating-a-
windows-azure-virtual-machine-the-right-way.aspx
VMs have become the granular infrastructure building blocks of corporate data centers.
With VMs everywhere on premises and VMs everywhere in the cloud, it follows that effective
VM portability across the hybrid cloud boundary will be an important enabler to hybrid
infrastructures. IT teams want to copy successful VM implementations between their data
centers and the cloud where they can be run with different goals and circumstances. VM
portability provides the flexibility to change how processing is done and is a guard against
being locked in by any one CSP. System Center 2012 App Controller is an example of a
management tool that automates the process of uploading and installing VMs in Windows
Azure, and it is an excellent example of the progress being made to integrate cloud and
on-premises data centers.
There will always be differences between the things that on-premises and cloud data
centers do best. Certain types of applications and data are likely going to stay on premises,
while others can only be justified economically in the cloud. Then there will be everything
else that probably could be run either on premises or in the cloud. The final decision on those
will be made based on cost, reliability, and security.
Infrastructure virtualization
Abstracting physical systems as VMs started a revolution in computing that continues today
with cloud computing. The initial breakthrough technology for VMs was VMware’s ESX
hypervisor, a special operating system that allowed other guest operating systems to run on it
as discrete, fully-functioning systems. IT teams used ESX to consolidate many server instances
onto a single physical server, dramatically reducing the number of physical servers and their
associated footprint, power, and cooling overhead. There are several hypervisors in use
today, including VMware's ESXi and Hyper-V, which is part of both Windows Server 2012 and
Windows Azure.
But virtualization technologies have been around for much longer than ESX. The
technology was first invented for mainframes, and both virtual networking and virtual storage
were well-established when the first VMs from VMware were introduced. Virtualization is one
of the most important technologies in the history of computing and will continue to be.
In addition to VMs, virtual switches (v-switches) and virtual storage appliances (VSAs)
were also developed to run on hypervisors in server systems. Of these three virtualized
infrastructure technologies, VSAs have been the least successful at imitating the functionality
of their corresponding hardware systems. This is not so hard to understand considering the
performance challenges of running a storage system developed for specialized hardware on a
PC-based hypervisor.
However, hypervisors in the cloud are different, simply by virtue of where they run, and are
much more likely to attract the interest of storage vendors. The most successful infrastructure
transitions tend to be those that require the least amount of change, and storage vendors will
want to sell VSA versions of their on-premises products to ensure that customers making the
transition to the cloud will continue to use their technologies—whether they are products or
services.
Managing data growth in a hybrid cloud
Organizations using hybrid cloud designs will expect to limit the costs of running their own
corporate data centers. Considering the cost of storage and the rate of data growth, it follows
that most of the data growth needs to be absorbed by cloud storage while maintaining
steady storage capacity levels on premises. The Microsoft HCS solution provides an excellent
way to limit capacity growth on-premises by deduplicating primary storage and using
the cloud as a tier for low-priority data. There is nothing particularly futuristic about that,
however, because the solution does that already.
Another way to limit the storage footprint on-premises is to migrate applications to the
cloud. Just as the Microsoft HCS solution migrates lower-priority data to the cloud, the
applications that are migrated from the on-premises data center to the cloud could also have
lower priorities, and less sensitivity to the effects of migration.
Cloud data centers have the basic server, storage, and network elements of an
infrastructure, but from the perspective of the IT team, they are missing a lot of
the elements they work with on a regular basis, including NICs, HBAs, device drivers,
backup systems, tapes, cables, and power. Avoiding all these infrastructure bumps
in the road is the point of cloud computing, after all. The idea of infrastructure as a
service should be to simplify the work of providing resources so more attention can
be paid to applications, data, devices, and people.
Like anything else in IT, there are big differences between rookie-level and
guru-level plays. The best hybrid clouds will be designed by people who understand
the nuances and details of virtual appliances and how to make them operate most
effectively in various cloud platforms. In the years to come, organizations will likely
be looking for cloud administrators who know their stuff. The combination of cloud
and virtualization skills will be highly desired by IT teams looking for individuals
who can get the cloud job done.
Searches of data archives are best done with data that has been indexed by
archiving software; however, it might not always be possible to run everything
through the archiving system. Map-reduce processes could be applied to comb
through unstructured data that has not been indexed—that’s more or less what
it was originally invented for when Google was first trying to figure out Internet
search.
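A toy sketch of the idea, using only the Python standard library rather than any particular Hadoop tooling, might look like the following; the archive path and search pattern are placeholders.

from multiprocessing import Pool
import glob
import re

def map_hits(path, pattern=r"contract"):
    # Map step: count pattern matches in one unindexed file.
    with open(path, errors="ignore") as f:
        return path, len(re.findall(pattern, f.read(), re.IGNORECASE))

if __name__ == "__main__":
    paths = glob.glob("/archive/unindexed/**/*.txt", recursive=True)   # placeholder location
    with Pool() as pool:
        mapped = pool.map(map_hits, paths)
    # Reduce step: keep only the files that deserve a closer look, most hits first.
    hits = sorted((item for item in mapped if item[1] > 0), key=lambda kv: kv[1], reverse=True)
    print(hits[:20])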
APPENDIX
Considerations and recommendations for networking, privacy, and data protection
This appendix discusses a variety of topics that IT team members should consider when
implementing the Microsoft hybrid cloud storage (HCS) solution.
iSCSI considerations
Setting up an iSCSI SAN to connect servers to the CiS system is relatively straightforward.
Most server operating systems have device drivers to establish iSCSI sessions
between them and storage. iSCSI provides robust communications for storage I/O traffic
over Ethernet networks. The CiS system has multiple Ethernet ports for high availability.
iSCSI SANs should be segregated from LAN traffic using subnets, VLANs, or separate
physical networks. In general, the larger the SAN and LAN, the greater the need for
segregation.
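As a small illustration of that segregation, a sketch like the following, with placeholder addresses, can sanity-check that the CiS system's iSCSI ports and the servers' iSCSI interfaces all sit on a dedicated subnet that does not overlap the LAN.

import ipaddress

# Placeholder addressing plan: a dedicated /24 for iSCSI, separate from the LAN.
iscsi_subnet = ipaddress.ip_network("192.168.50.0/24")
lan_subnet = ipaddress.ip_network("10.0.0.0/16")
cis_ports = ["192.168.50.11", "192.168.50.12"]     # example CiS system Ethernet ports
initiators = ["192.168.50.21", "192.168.50.22"]    # example server iSCSI interfaces

assert not iscsi_subnet.overlaps(lan_subnet), "SAN and LAN subnets should not overlap"
for addr in cis_ports + initiators:
    assert ipaddress.ip_address(addr) in iscsi_subnet, f"{addr} is not on the iSCSI subnet"
print("iSCSI addressing is segregated from the LAN")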
Internet connection considerations
Unlike an iSCSI network, where it is easy to segregate SAN and LAN traffic with subnets or
VLANs, the connection to the Internet is almost always shared with other Internet traffic at
the site. The minimal dedicated bandwidth recommendation for the Internet link between the
CiS system and Windows Azure Storage is 20 Mb/second. Obviously connections with more
bandwidth will provide faster uploads and downloads. The CiS system can have its bandwidth
throttled during production hours to keep it from interfering too much with other work. This
typically doesn’t create problems for the Microsoft HCS solution because most of the time its
Internet traffic is generated at night during cloud snapshots.
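A quick calculation helps sanity-check whether the nightly window is long enough; the amount of changed data below is an assumption for illustration, not a sizing guideline for the solution.

# How long might one nightly cloud snapshot take on the recommended minimum link?
changed_data_gb = 50      # assumed deduplicated data uploaded by one cloud snapshot
link_mbps = 20            # minimum recommended dedicated bandwidth

hours = changed_data_gb * 8 * 1000 / link_mbps / 3600
print(f"~{hours:.1f} hours at {link_mbps} Mb/s")   # roughly 5.6 hours for 50 GB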
Glossary
Active data Data that is expected to be accessed again relatively soon or periodically
Archiving A storage process that preserves data for an extended period of time
At-rest An IT resource that has a stable state and is not being copied
Backup target A storage device or system that backup software writes data to when performing backups
Backup A data protection method that was developed to work with tape and usually combines periodic full copies of data with incremental copies of new data
Best practices IT management informed by advanced knowledge and experience
Big Data Vernacular term for data analytics, associated with, but not restricted to, Hadoop technology and methods
BLOB Binary large object, often a file
Block storage A storage environment characterized by devices and protocols that are designed to consume storage based on the granular element, blocks
Bucket A storage container provided by a cloud storage service
CDP (continuous data protection) A method of data protection that makes copies of all changes made to data
CiS (Cloud-integrated Storage) An on-premises storage system that stores data for on-premises systems and incorporates cloud storage services as a resource for storing on-premises data
Cloud computing Scalable computing services provided on a short- or long-term basis by a large number of systems
Cloud snapshot A data protection method that stores point-in-time copies of data for on-premises systems in cloud storage
Cloud storage-as-a-tier Scale-across storage, CiS that works with cloud storage to provide a single, scalable storage system
Cloud storage Scalable, object-based storage capacity provided as a service on a short- or long-term basis
Cloud A data center providing scalable computing and storage services, characterized by a large number of systems that can be accessed for long-term or short-term projects
Clustered storage Scale-out storage, a tightly coupled group of storage systems that function as a single, scalable storage system
Data analytics Computing processes looking for patterns or correlating factors in large amounts of data
Data reduction Processes that reduce the amount of storage capacity consumed for a given amount of information
Data tiering A storage management process that determines the performance and storage requirements for data and locates it on a cost-effective storage resource
Data volatility An indication of the percentage of an application's data that may be accessed in day-to-day operations
Dedupe (Deduplication) A process that identifies duplicate copies of data and eliminates them by linking to reference copies
Deterministic Precisely specified
Discovery The process for finding data that may be needed for legal or regulatory reasons
Dormant data Data that is accessed very rarely, if ever
Downtime The amount of time systems and data are not available for processing, usually associated with a disaster or failure, but also maintenance operations
DR Disaster recovery, the process of resuming operations following a disaster that causes the unexpected stoppage or failure of computing equipment and processes
Enterprise A business or government entity of substantial size
Fingerprint The granular data structure that is managed in the Microsoft hybrid cloud storage solution comprised of data and metadata
Geo-replication A cloud storage service that copies data from a cloud data center to a remote cloud data center
Hash A numerical value calculated from processing a data string that uniquely identifies the string
High availability A system design designed to continue operating after the loss or failure of system components or entire systems
Hybrid cloud boundary The distance, time, or technology barrier that separates an on-premises data center from a cloud data center
Hybrid cloud storage Data storage formed by the combination of on-premises storage and cloud storage services
Hybrid cloud A computing service that combines public compute services with private compute services
Hypervisor A software program that provides an operating environment for virtual machines
IaaS (infrastructure-as-a-service) A cloud service offering the use of virtual computer, storage, and network systems
Index A way of condensing or collating data electronically that facilitates searching
In-flight An IT resource that is being copied or transmitted from one location to another
IOPS The total sum of read and write operations per second; input/output operations per second
iSCSI (Internet Small Computer System Interface) An Ethernet protocol for exchanging storage commands and data between computer systems and storage systems
IT team Employees and contractors that plan, acquire, manage, and operate IT
IT Information technology, the profession and industry of developing, manufacturing, selling, implementing, and operating data processing and communications products and services
Local snapshot A point-in-time copy of data or pointers to data stored on a storage system's own disk drives
Metadata Data that describes data or attributes of data, such as a hash value of its contents
Migration time The time that an application is offline while it and its data are being relocated from one data center to another
Monolithic storage Scale-up storage, a single system storage design that scales by adding components
Near CDP A method of data protection that makes copies of most changes made to data
NV-RAM (non-volatile random access memory) Fast memory storage that retains data even after the loss of power
On-premises A facility owned and operated by an organization such as a business or government
Orchestration An intelligent installation process that manages multiple related technologies to create a solution
Portability The ability to relocate compute resources and processes from one location or set of resources to another
Primary site A data center where production operations run and where replication copies originate
Primary storage Storage where applications read and write data during normal processing
Private-cloud A computing service that is restricted to a specific set of users, often implemented at a corporate-owned facility
Public-cloud A multi-tenant computing service that is provided openly over the Internet
Recovery point The time in the past when the last data was captured prior to a disaster event
Recovery site A data center where DR operations are conducted
Recovery time The amount of time it takes to return a system to full functionality after a disaster event
Replication A data protection method that copies written data from one location to another
ROBO Remote office / branch office
SAN Storage area network
Scale-across A storage design that scales by adding resources from cloud storage
Scale-out A storage design that scales by adding additional systems to a group of systems
Scale-up A storage design that scales by adding components to a single system
Secondary site A data center where replicated data is copied to
Secondary storage Storage that is used for data protection or archiving
Snapshot A method of data protection that uses a system of pointers to make point-in-time copies of data
Spindown A process of stopping the rotation of disk drives for dormant data to reduce the power costs of storing it
SSD (solid state disk) Amassed memory technology that functions like a disk drive
Tape rotation The schedule that backup software creates for naming, using, and cycling tapes
Thin provisioning A method of allocating storage capacity from a common resource to individual storage volumes on a first-come, first-served basis.
Thin A storage process designed to minimize resource consumption
Virtual machine (VM) The functionality of a physical computer provided by software operating as a logical system image
Virtual storage A storage resource that is comprised of elements of other storage resources
Virtual tape The use of a disk storage system to replace tapes used for backup
Virtualization The process of using software to mimic the functionality of physical equipment
VM sprawl A phenomenon where the number of virtual machines in an organization scales beyond the ability of the IT team to manage
VM Virtual disk A virtual disk managed by a hypervisor for storing the data for a VM, typically stored as a file.
VSA (virtual storage appliance) The equivalent of a VM, but for a storage system
VTL (virtual tape library) A storage system used to backup data to disk
Working set The data normally accessed in regular daily processing
Index
  thin provisioning, 9, 45–46
  working set, 72
cloud-storage-as-a-tier, 49–53, 68, 76, 93
clustered storage, 48, 93
compliance, 58, 61–62, 77–78
compression, data, 54
continuous data protection (CDP), 18, 93
cost considerations, 4, 7, 17, 37, 87
D
data. See also working sets
  access to, 72–73
  archived, 62
  order of incoming, 73–74
  unstructured, 50, 58, 88
data analytics, 88, 93
data availability, 4, 48, 59, 66
data centers
  cloud, 8, 82, 85
  on-premises, 1, 53, 82, 84, 94
  virtual, 83
data growth, 3–4, 26, 29, 84
data integrity, 59, 66, 92
data life cycles, 49–52
data protection
  at ROBO sites, 40
  cloud snapshots and, 20–21, 36, 92
  continuous, 18, 93
Data Protection Manager, 18
data reduction, 5–6, 53–54, 93
data tiering, 9, 22, 53–54, 93
data volatility, 47, 72–73, 94
dedupe ratios, 54, 76, 78
deduplication (dedupe), 6, 17, 29, 53–54, 94. See also primary dedupe; source dedupe
defragmentation, 73
deterministic recovery, 27, 34, 89
deterministic, definition of, 94
digital archiving. See archiving
disaster recovery (DR)
  as a best practice, 7
  definition of, 94
  in the cloud, 86–88
  problems of, 12
  strategies of, 25–30
  with hybrid cloud storage, 30–39, 68–70, 75–76
discovery, 57–59, 88, 94. See also eDiscovery
disk storage, 15, 60–61
disk-to-disk-to-tape (D2D2T), 15–16
documentation, 61–62
dormant data, 36, 49–52, 58, 94
download performance, 34
downtime, 25, 27, 94
E
eDiscovery, 58–60, 88
Elastic Block Storage (EBS), 86
electronic discovery. See eDiscovery
encryption, 59, 61, 92
enterprises, 3, 94
erasure coding, 3
F
fingerprints
  block data and, 22
  data integrity, 92
  data life cycle of, 50–52
  definition of, 94
  expiration of, 63
  in CiS system, 68
  overview, 19–20
  storing in Windows Azure Storage, 31–32, 69
  working set of, 35–36
G
geo-replication, 39–40, 66, 69, 94
Google, 88
H
Hadoop, 88
hard disk drives (HDDs), 8–9, 53–54
hash, 94
hashing algorithms, 6, 66
high availability, 48, 94
HIPAA Business Associate Agreement (BAA), 62
hybrid cloud, 94
hybrid cloud boundary, 2, 22–23, 82, 84, 94
hybrid cloud storage
  architecture, 1
  definition of, 94
  disaster recovery (DR) with, 30–39, 68–70, 75–76
  management model, 1
  performance, 71–72
  storage volumes in, 32
  Windows Azure Storage in, 8
Hyper-V, 4–5, 74, 82
Hyper-V Recovery Manager, 2, 87
Hyper-V Replica, 87
hypervisors, 4, 26, 82, 86, 94
I
IaaS (Infrastructure-as-a-Service), 81, 94
incremental-only backup, 16, 20
index, 60, 94
in-flight resources, 65, 94
Internet connection and CiS system, 34–35, 92
IOPS (input/output per second), 6, 49–50, 53, 71, 94
iSCSI (Internet Small Computer System Interface), 8, 70, 91, 94
ISO/IEC 27001 2005 certification, 62
IT, 94
IT managers, 1, 7
IT team, 94
J
Joyner, John, 2
L
life cycles, 50–52
linear tiers, 68
local replication, 39
local snapshots, 21, 69–70, 94
M
magnetic tapes, 12–14, 60
metadata, 68, 77, 94
metadata maps, 31–34, 37, 70, 87
Microsoft HCS
  benefits of incremental storage, 49
  cost advantages of, 37
  data growth, 26
  defragmentation with, 73
  deployment scenarios, 74–78
  differences from other storage systems, 69
Microsoft Sharepoint, 76
Microsoft System Center 2012, 82
migration, 43–44, 64, 73–74, 77, 84
migration time, 84–85, 94
monolithic storage, 47, 94
N
near-CDP solutions, 18, 94
nodes, storage, 48
NV-RAM (non-volatile random access memory), 53, 94
O
object storage, 85
on-premises data centers, 1, 53, 84, 94
orchestration, 83, 94
overprovisioning, 47
over-subscription, 73
P
performance
  capacity and, 71
  cloud-storage-as-virtual-tape, 35
  download, 34
  hybrid cloud storage, 71–72
  primary storage and, 6
  solid state disks (SSDs), 53
pointers, 17, 22, 40, 51, 68
portability, 5, 82, 84, 94
primary dedupe, 17, 22, 29, 54, 77
primary site, 16, 28–29, 95
primary storage, 21, 54, 95
  archived data in, 62, 77
  capacity management, 17–18, 64
  data protection in, 19, 21
  dedupe ratios, 54, 78
  deduping, 53–54
  performance and, 6
  storage location in CiS system, 69–70
privacy, protection of, 59, 65, 92
private cloud, 95
public cloud, 59, 95
scale-out storage design, 47–49, 55, 95
scale-up storage design, 47–49, 55, 95
scheduling, 20–21, 92
secondary archive storage, 69–70, 77
secondary site, 25, 28, 34, 86, 95
secondary storage, 64–66, 69–70, 77, 95
server virtualization technology, 43, 77
service level agreements, 86
short-stroking technique, 71
snapshots, 9, 17–18, 92, 95. See also cloud snapshots; local snapshots
solid state disks (SSDs), 6, 9, 53–54, 71, 95
source CiS system, 32–33, 37–38
source dedupe, 17
spindown, 60, 95
SSAE 16 / ISAE 3402 attestation, 62
T
tape rotation, 13–15, 95
thin, 34, 95
thin provisioning, 9, 45–47, 69, 95
transparency, 51
Trust Center, 62
VMware, 4, 74, 82
volumes. See storage volumes
W
wide striping technique, 71
Windows Azure, 81–82
About the author
MARC FARLEY is a senior product marketing manager at
Microsoft working on hybrid cloud storage solutions. Rethinking
Enterprise Storage: A Hybrid Cloud Model is his fourth book on
network storage; his previous books are the two editions of
Building Storage Networks (McGraw-Hill, 2001 and 2002) and
Storage Networking Fundamentals (Cisco Press, 2004). In addition
to writing books about storage, Marc has blogged about storage
while working for EqualLogic, Dell, 3PAR, HP, StorSimple, and now
Microsoft.
When he is not working, Marc likes to ride bicycles, listen to
music, dote on his cats, and spend time with his family.
Now that you've read the book...
Tell us what you think!
Was it useful?
Did it teach you what you wanted to learn?
Was there room for improvement?