Module 4: Implementing Storage Spaces and Data Deduplication

Contents:

Lesson 1: Implementing Storage Spaces

Lesson 2: Managing Storage Spaces

Lab A: Implementing Storage Spaces

Lesson 3: Implementing Data Deduplication

Lab B: Implementing Data Deduplication

Module review and takeaways

Module overview

The Windows Server 2016 operating system introduces a number of storage

technologies and improvements to existing storage technologies. You can use
the Storage Spaces feature in Windows Server 2016 to provision virtual disks
from pools of physical storage. The goal of Data Deduplication is to find
and remove duplication within data while maintaining its integrity. This module
describes these two features within the Windows Server 2016 storage
architecture.


Objectives
After completing this module, you will be able to:

• Describe and implement the Storage Spaces feature in the context of enterprise
storage needs.

• Manage Storage Spaces.

• Describe Data Deduplication.

Lesson 1: Implementing Storage Spaces

Managing direct-attached storage (DAS) on a server can be a tedious task for

administrators. To overcome this problem, many organizations use storage area
networks (SANs), which group disks together. However, because SANs require
special configuration, and sometimes special hardware, they can be expensive. To help
overcome these issues, you can use Storage Spaces to group disks together.
Storage spaces are virtual disks presented to the operating system that can
span multiple disks in a pool. This lesson describes how to implement
Storage Spaces.

Lesson objectives
After completing this lesson, you will be able to:

• Implement Storage Spaces as an enterprise storage solution.

• Describe the Storage Spaces feature and its components.

• Describe the features of Storage Spaces, including storage layout, drive


allocation, and provisioning schemes such as thin provisioning.


• Describe changes to the Storage Spaces feature in Windows Server 2016.

• Describe common usage scenarios for storage spaces, and weigh their benefits
and limitations.

• Compare using Storage Spaces to using other storage solutions.

Enterprise storage needs

In most organizations, discussions about storage needs can be straining. This is

typically because storage costs are a major item on many Information Technology
(IT) budgets. Despite the decreasing cost of individual disks, the amount of
data continues to grow, so the overall cost of
storage continues to rise.
storage

Consequently, organizations are investigating storage solutions that provide a


cost-effective alternative to their existing solution, without sacrificing performance. A
typical demand from organizations during storage planning is how to lower the costs


and effort of delivering infrastructure as a service (IaaS) storage services. When


planning your storage solution, you need to assess how well the storage solution
scales. If your storage solution does not scale well, it will cost more. Additionally, you
should consider deploying inexpensive networks and storage environments. This
comes by deploying industry-standard server, network, and storage infrastructure to
build highly available and scalable software-defined storage.

Finally, consider using disaggregated deployments

when you want to lower the costs of delivering IaaS storage services. While
many combined compute/storage solutions provide advanced features,
they require you to scale both components simultaneously. In other words, you might
have to add compute power in the same ratio as previous hardware when expanding
storage. To achieve lower costs of delivering IaaS storage service, you should
consider independent management and independent scaling when planning your
storage solution.

While your workloads dictate which advanced features you require, during

storage planning your primary drivers are typically performance, cost,
and capacity of the storage solutions. Rather than holding lengthy
discussions about each of these drivers separately, what is needed is
a balanced deployment approach.

When planning your balanced storage deployment approach to meet your storage
needs, you will need to assess your capacity and performance requirements in
relation to your cost. For cost efficiency, your storage environment should utilize
solid-state drives (SSDs) for highly active data (at a higher cost) and
hard disk drives (HDDs) for data accessed infrequently (at a lower cost).

If you deploy only HDDs, your budget constraints will prevent you from meeting your

performance requirements; this is because HDDs provide higher capacity, but with
lower performance. Likewise, if you deploy only SSDs, your budget constraints will
prevent you from meeting your capacity requirements; this is because SSDs provide


higher performance, but with lower capacity. As a result, your balanced storage
deployment approach will most likely include a mix of HDDs and SSDs to achieve the
best performance and capacity at the appropriate cost.

Included in your storage planning, you should consider whether your storage solution
needs to support the common capabilities of most storage products, such as:

• Mirror/parity protection

• Data deduplication

• Enclosure awareness

• Storage tiering

• Storage replication

• Data encryption

• Data integrity validation

• Performance monitoring

Note: This list is only meant to provide suggestions and is not an exhaustive
list of the common capabilities of most storage products. The storage
requirements of your organization might differ.

The growing size of volumes, the ever-increasing amounts of data, and the

need to manage data volumes can overwhelm IT
departments. Windows Server 2016 provides features
that address these important facets of storage management.

Check Your Knowledge


Discovery
Which factors should you consider when planning your enterprise storage strategy?


Check Your Knowledge

Discovery
Which of these storage capabilities does your organization require?


What are Storage Spaces?

Storage Spaces is a storage virtualization feature included in Windows Server 2016 and


Windows 10.

The Storage Spaces feature consists of two components:


• Storage pools. Storage pools are a collection of physical disks aggregated into a
single logical disk, allowing you to manage the multiple physical disks as a single
disk. You can use Storage Spaces to add physical disks of any type and size to a
storage pool.

• Storage spaces. Storage spaces are virtual disks created from free space in a
storage pool. Storage spaces have attributes such as resiliency level, storage
tiers, fixed provisioning, and precise administrative control. The primary advantage
of storage spaces is that you no longer need to manage single disks. Instead, you
can manage multiple disks as one unit. Virtual disks are the equivalent of a logical unit
number (LUN) on a SAN.

Note: The virtual disks that you create with the Storage Spaces feature are
not the same as the virtual hard disk files that have the .vhd and .vhdx file
extensions.

To create a virtual disk, you need the following:

• Physical disks. Physical disks are disks such as Serial Advanced Technology


Attachment (SATA) or serial-attached SCSI (SAS) disks. If you want to add
physical disks to a storage pool, the disks must adhere to the following
requirements:

o One physical disk is required to create a storage pool.

o Two physical disks are required to create a resilient mirror virtual disk.

o Three physical disks are required to create a virtual disk with resiliency through parity.

o At least five physical disks are required for three-way mirroring.


o Disks must be blank and unformatted, which means no volumes can exist on
the disks.

o Disks can be attached using a variety of bus interfaces including SAS, SATA,
SCSI, Non-Volatile Memory Express (NVMe), and universal serial bus (USB). If
you plan to use failover clustering with storage pools, you cannot use SATA,
SCSI, or USB disks.
• Storage pool. A storage pool is a collection of one or more physical disks that you


can use to create virtual disks. You can add a nonformatted
physical disk to a storage pool, but you can attach each physical disk to only one
storage pool.

• Virtual disk or storage space. This is similar to a physical disk from the perspective
of users and applications. However, virtual disks are more flexible because they
include both fixed provisioning and thin provisioning, also known as just-in-time
(JIT) allocations. They are also more resilient to physical disk failures with built-in
functionality such as mirroring and parity. These resemble Redundant Array of
Independent Disks (RAID) technologies, but Storage Spaces stores the data
differently.

• Disk drive. This is a volume that you can access from the Windows operating


system, for example, by using a drive letter.

Note: When planning your Storage Spaces deployment, you need to verify
whether the storage enclosure is certified for Storage Spaces in Windows
Server 2016. For Storage Spaces to identify the array’s
failure lights, the array must support SCSI Enclosure Services (SES).
Additional Reading: For more information, refer to “Windows Server Catalog”


at: http://aka.ms/Rdpiy8


You can format a storage space virtual disk with a FAT32 file system, New
Technology File System (NTFS) file system, or Resilient File System (ReFS). You will
need to format the virtual disk with NTFS if you plan to use the storage space as part
of a Clustered Shared Volume (CSV), for Data Deduplication, or with File Server
Resource Manager (FSRM).
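
The following Windows PowerShell sketch ties these components together, assuming a server with several blank, poolable disks; the pool, virtual disk, and volume names are illustrative:

# Gather the physical disks that are eligible for pooling (blank and unformatted)
$disks = Get-PhysicalDisk -CanPool $true

# Create a storage pool from those disks on the local storage subsystem
New-StoragePool -FriendlyName "Pool1" `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks $disks

# Create a two-way mirrored, thinly provisioned virtual disk (storage space)
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Data1" `
    -ResiliencySettingName Mirror -ProvisioningType Thin -Size 100GB

# Initialize, partition, and format the new disk as an NTFS volume
Get-VirtualDisk -FriendlyName "Data1" | Get-Disk |
    Initialize-Disk -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS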

Components and features of Storage Spaces

An important step when configuring storage spaces is planning virtual disks. To


configure storage spaces to meet your requirements, you must consider the Storage
Spaces features described in the following table before you implement virtual disks.

Feature: Storage layout

Storage layout is one of the characteristics that defines the number of disks from the


storage pool that are allocated to the virtual disk. Valid options include:

• Simple. A simple space has data striping but no redundancy. In data striping,
logically sequential data is segmented across several disks in a way that enables


different physical storage drives to access these sequential segments. Striping


can improve performance because it is possible to access multiple segments of
data at the same time. To enable data striping, you must deploy at least two disks.
The simple storage layout does not provide any redundancy, so if one disk in the
storage pool fails, you will lose all data unless you have a backup.

• Two-way and three-way mirrors. Mirroring helps protect against the loss of one or


more disks. Mirror spaces maintain two or three copies of the data
they host. Specifically, two-way mirrors maintain two data copies and three-
way mirrors maintain three data copies. Duplication occurs with
every write to ensure that all data copies are always current. Mirror spaces
also stripe the data across multiple physical drives. To implement mirroring, you
must deploy at least two physical disks. Mirroring provides protection against the
loss of one or more disks, so use mirroring when you are storing important data.
The disadvantage of using mirroring is that the data is duplicated on multiple
disks, so disk usage is inefficient.

• Parity. A parity space resembles a simple space, because data is written across


multiple disks. However, parity information is also written across the disks when
you use the parity storage layout. You can use the parity information to calculate
data if you lose a disk. Parity enables Storage Spaces to continue to perform
read and write requests even when a drive has failed. The parity information is
rotated across the available disks to enable I/O optimization. A storage space
requires a minimum of three physical drives for parity spaces. Parity spaces have
increased resiliency through journaling. The parity storage layout provides
redundancy but is more efficient in utilizing disk space than mirroring.

Note: The number of columns for a given storage space can also impact the
number of disks.

Feature: Disk sector size

A storage pool's sector size is set the moment the pool is created. Its default sizes are set as
follows:

• If the list of drives being used contains only 512 and 512e drives, the pool sector size is


set to 512e. A 512 disk uses 512-byte sectors. A 512e drive is a hard disk
with 4,096-byte sectors that emulates 512-byte sectors.


• If the list contains at least one 4-kilobyte (KB) drive, the pool sector size is set to
4 KB.

Feature: Cluster disk requirement

Failover clustering prevents work interruptions if there is a computer failure. For a pool to
support failover clustering, all drives in the pool must support SAS.

Feature: Drive allocation

Drive allocation defines how the drive is allocated to the pool. Valid options include:

• Automatic. This is the default allocation when any drive is added to a pool. Storage
Spaces can automatically select available capacity on data-store drives for
both storage space creation and JIT allocation.

• Manual. A manual drive is not used as part of a storage space unless it is


specifically selected when you create that storage space. This drive allocation
property lets administrators specify particular types of drives for use only by
certain storage spaces.

• Hot spare. These are reserve drives that are not used in the creation of a storage


space, but are added to a pool. If a drive that is hosting columns of a storage
space fails, one of these reserve drives is called on to replace the failed drive.

Feature: Provisioning schemes

You can provision a virtual disk by using one of two schemes:

• Thin provisioning space. Thin provisioning enables storage to be allocated readily
on a just-enough and JIT basis. Storage capacity in the pool is organized into
provisioning slabs that are not allocated until datasets require the storage. Instead
of the traditional fixed storage allocation method in which large portions of storage
capacity are allocated but might remain unused, thin provisioning optimizes the
use of any available storage by reclaiming storage that is no longer needed, using
a process known as trim.

• Fixed provisioning space. In Storage Spaces, fixed provisioned spaces also use


flexible provisioning slabs. The difference is that the storage capacity is allocated at the
time that you create the space.

You can create both thin and fixed provisioning virtual disks within the same storage pool.


Having both provisioned types in the same storage pool is convenient, especially when
they are related to the same workload. For example, you can choose to use a thin
provisioning space for a shared folder containing user files, and a fixed provisioning


space for a database that requires high disk I/O.

Feature: Stripe parameters

You can increase the performance of a virtual disk by striping data across multiple
physical disks. When creating a virtual disk, you can configure the stripe by using two
parameters, NumberOfColumns and Interleave.

• A stripe represents one pass of data written to a storage space, with data written in
multiple stripes, or passes.

• Columns correlate to underlying physical disks across which one stripe of data for


a storage space is written.

• Interleave represents the amount of data written to a single column per stripe.

The NumberOfColumns and Interleave parameters determine the width of the stripe (e.g.,
stripe_width = NumberOfColumns * Interleave). In the case of parity spaces, the stripe
width determines how much data and parity Storage Spaces writes across multiple disks
to increase performance available to apps. You can control the number of columns and
the stripe interleave when creating a new virtual disk by using the Windows PowerShell cmdlet
New-VirtualDisk with the NumberOfColumns and Interleave parameters.
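
As a sketch of these parameters, the following New-VirtualDisk call (the pool and disk names are illustrative) creates a simple space striped across four physical disks with a 64-KB interleave, for a stripe width of 4 * 64 KB = 256 KB:

# Four columns, with 64 KB written to each column per stripe
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "FastData" `
    -ResiliencySettingName Simple -UseMaximumSize `
    -NumberOfColumns 4 -Interleave 64KB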

When creating storage spaces, Storage Spaces can use any type of disk, including SATA


and integrated drive electronics (IDE) drives, that is attached
internally to the computer. When you use Storage Spaces to design highly available
storage subsystems, you must consider the following factors:

• Fault tolerance. Do you want data to be available in case a physical disk fails? If
so, you must use multiple physical disks and provision virtual disks by using
mirroring or parity layouts.

• Performance. You can improve performance for read and write operations by using a


simple or mirrored layout rather than a parity layout. You also need to consider each
individual disk's performance when determining overall performance. You can use
different types of disks to provide a tiered storage system. For example, you
can use SSDs for data to which you require fast and frequent access and use

SATA drives for data that you do not access as frequently.

• Reliability. Virtual disks in parity layout provide some reliability. You can improve
that degree of reliability by using hot spare physical disks in case a physical disk
fails.

• Extensibility. One of the main advantages of using Storage Spaces is the ability to
expand storage by adding physical disks. You can add physical disks
to a storage pool after you create it, either to expand its capacity or to
provide fault tolerance.

Demonstration: Configuring Storage Spaces


In this demonstration, you will see how to:

• Create a storage pool.

• Create a virtual disk and a volume.

Demonstration steps

Create a storage pool

1. On LON-SVR1, in Server Manager, access File and Storage Services and


Storage Pools.

2. In the STORAGE POOLS pane, create a New Storage Pool, and


add some of the available physical disks.

Create a virtual disk and a volume

1. In the VIRTUAL DISKS pane, create a New Virtual Disk with the following
settings:

o Storage pool: StoragePool1

o Disk name: Simple vDisk

o Storage layout: Simple

o Provisioning type: Thin

2. On the results page, wait until the task completes, and then ensure that


the Create a volume when this wizard closes check box is selected.

3. In the New Volume Wizard, create a volume with these settings:

o Virtual disk: Simple vDisk

o Volume label: Simple Volume

4. Wait until the volume creation completes, and then click Close.

Changes to file and storage services in Windows Server 2016


File and storage services includes technologies that help you deploy and manage
one or more file servers, which provide central locations where you can store files and share them with users.

New features in Windows Server 2016

The following file and storage services features are new in Windows


Server 2016:

• Storage Spaces Direct. This feature enables you to build highly available storage
systems by using storage nodes with only local storage. You will learn more about
this feature later in this module.

• Storage Replica. This is a new feature in Windows Server 2016 that enables block-level replication


—between servers or clusters that are in the same location or in different sites—for
disaster recovery. Storage Replica includes both synchronous and asynchronous
replication for shorter or longer distances between sites. Storage Replica enables you to
achieve storage replication at a lower cost.


• Storage Quality of Service (QoS). With this feature, you can create centralized
QoS policies on a Scale-Out File Server and assign them to virtual disks on
Hyper-V virtual machines. QoS ensures that performance for the storage adapts
to meet policies as the storage load changes.

• Data Deduplication. This feature was introduced in Windows Server 2012 and is
improved in Windows Server 2016 in the following ways (more information about
Data Deduplication is covered later in this module):

o Support for volume sizes up to 64 terabytes (TB). Deduplication has been
redesigned in Windows Server 2016 and is now multithreaded and able to
utilize multiple CPUs per volume to increase optimization throughput rates on
volume sizes up to 64 TB.
o Support for file sizes up to 1 TB. With the use of new stream map structures
and other improvements to increase optimization throughput and access
performance, deduplication in Windows Server 2016 performs well on files up to 1 TB.

o Simplified configuration for virtualized backup applications. In


Windows Server 2016, the configuration of deduplication for virtualized backup
applications is simplified when enabling deduplication for a volume.

o Support for Nano Server. A new deployment option in Windows Server 2016,
Nano Server fully supports Data Deduplication.

• Support for cluster rolling upgrades. You can upgrade each node in an existing
Windows Server 2012 R2 cluster to Windows Server 2016 without incurring
downtime to upgrade all the nodes at once.

• Server Message Block (SMB) hardening improvements. In Windows Server 2016,


client connections to the Active Directory Domain Services default SYSVOL and
NETLOGON shares on domain controllers now require SMB signing and mutual
authentication (e.g., Kerberos authentication). This change reduces the likelihood
of man-in-the-middle attacks. If SMB signing and mutual authentication are


unavailable, a Windows Server 2016 computer won’t process domain-based


Group Policy and scripts.

Note: The registry values for these settings aren’t present by default;
however, the hardening rules still apply until overridden by Group Policy or other
registry values.
New features in Windows Server 2012 and Windows Server 2012 R2

Windows Server 2012 R2 and Windows Server 2012 offered several new and
improved file and storage services features over their predecessors, including:

• Multiterabyte volumes. This feature deploys multiterabyte NTFS file system


volumes, which support consolidation scenarios and maximize storage use. NTFS
volumes on master boot record (MBR)-formatted disks can be up to 2 terabytes
(TB), while volumes on globally unique identifier (GUID) partition table (GPT)-
formatted disks can be up to 18 exabytes.

• Data Deduplication. This feature saves disk space by storing a single copy of


identical data on the volume.

• iSCSI Target Server. The iSCSI Target Server provides block storage to other
servers and applications on the network by using the iSCSI standard. Windows
Server 2012 R2 also includes VHDX support and end-to-end management by
using the Storage Management Initiative Specification (SMI-S).

• Storage pools. This feature provides storage virtualization


by grouping industry-standard disks into storage pools, from which you create storage
spaces by using the available capacity in the pools. Storage tiering in
Windows Server 2012 R2 enables you to create a single storage solution that
transparently delivers an appropriate balance between capacity and performance
that can meet the needs of enterprise workloads.


• Unified remote management of File and Storage Services in Server Manager. You
can use the Server Manager to manage multiple file servers remotely, including
their role services and storage.

• Windows PowerShell cmdlets for File and Storage Services. You can use the
Windows PowerShell cmdlets for performing most administration tasks for file and
storage servers.

• ReFS. The Resilient File System (ReFS), introduced in Windows Server 2012,


offers improved availability, scalability, and data integrity for file-based
data storage.

• Server Message Block (SMB) 3.0. SMB protocol is a network file-sharing protocol
that allows applications to read and write to files and request services from server
programs on a network.

• Offloaded Data Transfer (ODX). ODX functionality enables ODX-capable storage


arrays to bypass the host computer and transfer data directly between
compatible storage devices.

• Chkdsk. Chkdsk runs automatically in the background and


monitors the health of the system volume, enabling organizations to deploy
multiterabyte NTFS file system volumes without endangering their
availability. The Chkdsk tool introduces a new approach. It prioritizes volume
availability and allows for the detection of corruption while the volume remains
online, and its data remains available to the user during maintenance.

Storage Spaces usage scenarios


When considering whether to use Storage Spaces in a given situation, you should
weigh its benefits and limitations. The Storage Spaces feature was
designed to enable administrators to:

• Implement and easily manage scalable, reliable, and inexpensive storage.

• Aggregate individual drives into storage pools, which are managed as a single
entity.

• Use inexpensive storage with or without external storage.

• Use different types of storage in the same pool (for example, SATA, SAS, USB, SCSI).

• Grow storage pools as required.

• Provision storage as required from previously created storage pools.

• Designate drives as hot spares.

• Automatically repair pools containing hot spares.


• Delegate administration by pool.

• Use the existing tools for backup and restore and Volume Shadow Copy Service
(VSCS) for snapshots.

• Manage storage locally or remotely, by using Microsoft Management Console


(MMC) snap-ins or Windows PowerShell.

• Utilize Failover Clusters.

Note: Although the list above mentions USB as a storage medium,


using USB in a pool might be more practical on a Windows 8 client or while
developing a proof of concept. The performance of this technology also
depends on the performance capabilities of the storage you choose to pool
together.

There are limitations in Storage Spaces in Windows


Server 2016; some of the limitations to consider when
planning include:

• Storage Spaces volumes are not supported on boot or system volumes.

• The contents of a drive are lost when you introduce that drive into a storage pool.

o You should add only unformatted, or non-partitioned, drives.

• You lose the data on a failed drive in a simple storage layout.

• Fault-tolerant configurations have specific requirements:

o Mirroring requires a minimum of two drives.

o Three-way mirroring requires a minimum of five drives.

o Parity requires a minimum of three drives.


• All drives in a pool must use the same sector size.

• Storage layers that abstract the physical disks are not compatible with Storage
Spaces, including:

o VHDs and pass-through disks in a virtual machine (VM).

o Storage subsystems deployed in a separate RAID layer.

• Fibre Channel and iSCSI are not supported.

• Failover clusters are limited to SAS as a storage medium.

Note: Microsoft Support provides troubleshooting assistance only in


environments when you deploy Storage Spaces on a physical machine, not a
virtual machine. In addition, just a bunch of disks (JBOD) hardware solutions
that you implement must be certified by Microsoft.

When considering the demands of a particular workload environment, note that Storage


Spaces offers several resiliency types. As a result, some types are better
suited to particular workload scenarios. The following table lists these recommended
workload types.

Resiliency type | Number of data copies maintained | Workload recommendations

Mirror | 2 (two-way mirror) or 3 (three-way mirror) | Recommended for all workloads

Parity | 2 (single parity) or 3 (dual parity) | Sequential workloads with large units of
read/write, such as archival storage

Simple | 1 | Workloads that do not need resiliency, or that


provide an alternate resiliency mechanism

Storage Spaces Direct deployment scenarios

Storage Spaces Direct removes the need for a shared SAS fabric, simplifying
deployment and configuration. Instead, it uses the existing network as a storage
fabric, leveraging SMB 3.0 and SMB Direct for high-speed, low-latency CPU efficient
storage. To scale out, you simply add more servers to increase storage capacity and
I/O performance.

Storage Spaces Direct can be deployed in support of


Hyper-V, providing either primary file storage or secondary storage for virtual
machine files. Windows Server 2016 supports both options for Hyper-V,
specifically focusing on Hyper-V IaaS (Infrastructure as a Service) for service
providers and enterprises.

In the disaggregated deployment scenario, the Hyper-V servers (compute


component) are in a separate cluster from the Storage Spaces Direct servers
(storage component). The virtual machines are configured to store their files on the
Scale-Out File Server (SOFS). The SOFS is designed to provide file shares for
server applications, which are accessed over the SMB 3.0 protocol.
This allows you to scale the Hyper-V clusters (compute) and the SOFS cluster (storage)
independently.

In the hyper-converged deployment scenario, the Hyper-V (compute) and Storage


Spaces Direct (storage) components are on the same cluster. This option does not
require deploying a SOFS, because the virtual machine files are stored on the CSVs.
This also means that a separate compute cluster does not
require configured file server access and permissions. After the Storage
Spaces Direct volumes are available, provisioning
Hyper-V virtual machines uses the same process as any other
Hyper-V deployment on a failover cluster.

You also can deploy Storage Spaces Direct in support of SQL Server 2012 or newer,


which can store both system and user database files. SQL Server is configured to
store these files on SMB 3.0 file shares for both stand-alone and clustered instances
of SQL Server. The database server accesses the SOFS over the network using the
SMB 3.0 protocol. This scenario requires Windows Server 2012 or newer on both the
file servers and the database servers.

Note: This scenario does not support Microsoft Exchange Server.

Interoperability with Azure virtual machines scenarios

You can use Storage Spaces inside an Azure virtual machine to combine multiple
virtual hard drives, creating more storage capacity or performance than is available
from a single Azure virtual hard drive. There are three supported scenarios for using
Storage Spaces in Azure virtual machines, but there are some limitations and best
practices to follow, as described below.

• As capacity storage for a file server.

• As storage for System Center Data Protection Manager.

• As storage for Azure Site Recovery.

Multi-tenant scenarios

You can delegate administration of storage by using access control


lists (ACLs) on a per-storage-pool basis, supporting
hosting scenarios that require tenant isolation. Because Storage Spaces uses the
Windows security model, it can be integrated fully with Active Directory Domain
Services.

Storage Spaces can be made visible only to a subset of nodes in the file cluster. This


can be used in some scenarios to leverage the cost and management advantage of
larger shared clusters and to segment those clusters for performance or access
purposes. Additionally, you can apply ACLs at various levels of the storage stack (for
example, file shares, CSV, and storage spaces). In a multitenant scenario, this
means that the full storage infrastructure can be shared and managed centrally and
that you can design dedicated and controlled access to segments of the storage
infrastructure. For example, you can configure a particular customer's storage pools,
storage volumes, and file shares, and apply
ACLs so that only that tenant has access.

Additionally, by using SMB Encryption, you can ensure that access to the file-based


storage is encrypted to protect against tampering and eavesdropping attacks. The
biggest benefit of using SMB Encryption over more general solutions, such as IPsec,
is that there are no deployment requirements or costs beyond changing the SMB
settings on the server. The encryption algorithm used is AES-CCM, which also
provides data integrity validation.
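
For example, you can require SMB Encryption on a single tenant share, or for the whole server, by using the SMB cmdlets; a minimal sketch, with an illustrative share name:

# Require SMB Encryption on one tenant's share
Set-SmbShare -Name "TenantAData" -EncryptData $true

# Or require SMB Encryption for every share on the server
Set-SmbServerConfiguration -EncryptData $true -Force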

Discussion: Comparing Storage Spaces to other storage solutions


Storage Spaces in Windows Server 2016 provides an alternative to using more


traditional storage solutions such as SANs and network-attached storage (NAS).

Consider the following questions to prepare for the discussion.

Check Your Knowledge

Discovery
What are the advantages of using Storage Spaces compared to using SANs or NAS?


Check Your Knowledge

Discovery
What are the disadvantages of using Storage Spaces compared to using SANs or NAS?



Check Your Knowledge

Discovery
In what scenarios would you recommend each option?


Lesson 2: Managing Storage Spaces

Once you have implemented Storage Spaces, you must manage and


maintain them. This lesson explores how to use Storage Spaces to mitigate disk
failure, to expand your storage pool, and to use logs and performance counters to
ensure the optimal behavior of your storage.

Lesson objectives
After completing this lesson, you will be able to:

• Describe how to manage Storage Spaces.

• Explain how to use Storage Spaces to mitigate storage failures.

• Explain how to expand your storage pool.

• Describe how to use event logs and performance counters to monitor Storage
Spaces.

Managing Storage Spaces


Storage Spaces is integrated with failover clustering for high availability, and
integrated with cluster shared volumes (CSV) for scale-out deployments. You can
manage Storage Spaces by using:

• Server Manager

• Windows PowerShell

• Failover Cluster Manager

• System Center Virtual Machine Manager

• Windows Management Instrumentation (WMI)

Manage Storage Spaces by using Server Manager

Server Manager provides you with the ability to perform basic management of virtual
disks and storage pools. In Server Manager, you can create storage pools; add and


remove physical disks from pools; and create, manage, and delete virtual disks. For
example, in Server Manager you can view the physical disks that are attached to a
virtual disk. If any of these disks are unhealthy, you will see an unhealthy disk icon
next to the disk name.

Manage Storage Spaces by using Windows PowerShell

Windows PowerShell provides advanced management options for virtual disks and


storage pools. The following table lists some example cmdlets.

Windows PowerShell cmdlet Description

Get-StoragePool Lists storage pools.

Get-VirtualDisk Lists virtual disks.

Repair-VirtualDisk Repairs a virtual disk.

Get-PhysicalDisk | Where {$_.HealthStatus -ne Lists unhealthy physical disks.


“Healthy”}

Reset-PhysicalDisk Removes a physical disk from a storage pool.

Get-VirtualDisk | Get-PhysicalDisk Lists the physical disks that are used for a virtual disk.

Optimize-Volume Optimizes a volume, performing such tasks on


supported volumes and system SKUs as
defragmentation, trim, slab consolidation, and
storage tier processing.

Additional Reading: For more information, refer to “Storage Cmdlets in
Windows PowerShell” at: http://aka.ms/po9qve. You can also download a
module of Storage Spaces cmdlets for use in Windows PowerShell. For more
information, refer to “Storage Spaces Cmdlets in Windows PowerShell” at:
http://aka.ms/M1fccp
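
As a sketch, you can combine these cmdlets in a pipeline to find unhealthy components and trigger repairs:

# List physical disks that are not healthy
Get-PhysicalDisk | Where-Object { $_.HealthStatus -ne "Healthy" }

# Repair any virtual disks that are in a degraded state
Get-VirtualDisk | Where-Object { $_.HealthStatus -ne "Healthy" } | Repair-VirtualDisk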

Monitoring storage tier performance


When planning for storage tiering, you should assess the workload characteristics of
your storage environment so that you can store your data most cost-effectively
depending on how you use it. In Windows Server 2016, the server automatically
optimizes your storage performance by transparently moving the data that's
accessed more frequently to your faster solid state drives (the SSD tier) and moving
less active data to your less expensive, but higher capacity, hard disk drives (the
HDD tier).

In many environments, the most common workload characteristic is that a large


portion of the data set is cold. Cold data, which is data that is typically inactive, is files that
you access infrequently and that have a longer lifespan. The most common
workload characteristic also includes a smaller portion of the data that is typically
hot. Hot data, commonly referred to as the working set, is files that you are working on
currently; this part of the data set is highly active and changes over time.

Note: During the storage tiers optimization process, the data is


mapped at the sub-file level. For example, if only 30 percent of the
data in a file is hot, only that 30 percent is moved to the SSD tier.

Additionally, when planning for storage tiering, you should assess if there are
situations in which a file works best when placed in a specific tier. For example, you
need to place an important file in the fast tier, or you need to place a backup file in
the slow tier. For these situations, your storage solution might have the option to
assign a file to a particular tier, also referred to as pinning the file to a tier.

Before creating your storage spaces, plan ahead so that you can fine-tune the


storage tiers later. If you observe your workloads'
input/output operations per second (IOPS) and latency, you can predict the
storage requirements of each workload more accurately. Consider some
recommendations when planning ahead:


• Don't allocate all available SSD capacity for your storage spaces immediately.
Keep some SSD capacity in the storage pool in reserve, so you can increase the
size of an SSD tier when a workload demands it.

• Don't pin files to storage tiers until you see how well Storage Tiers Optimization
can optimize storage performance. When a tenant or workload requires a
particular level of performance, you can pin files to a storage tier to ensure that all
I/O activity for those files is performed on that tier.

• Do pin the parent VHDX file to the SSD tier if you are providing pooled


virtual desktops. If you have deployed a Virtual Desktop Infrastructure (VDI)
to provide desktops for users, you should pin the master
image that's used to clone users' desktops to the SSD tier, as shown in the sketch
after this list.

You should use the Storage Tier Optimization Report when observing or monitoring
your storage tiers. This report is used to check how well the storage tiers are working
and how you might optimize their configuration to improve the
performance. The report provides data for questions such as,
“How large is my working set?” and “How much do I gain by adding SSD capacity?”

Additional Reading: For more information, refer to “Monitoring Storage Tiers


Performance” at: http://aka.ms/Sz4zfi

Managing disk failure with Storage Spaces


Before deployment, you should plan Storage Spaces to handle disk and JBOD
enclosure failures without impact on service or data loss. With
any storage solution, you should expect that hardware will fail; this is
especially true in a large storage solution. To avoid outages caused by
failing hardware, your plan should account for the number and type of
failures that might occur in your environment. You should also plan for how your
solution should handle each fault without service interruption.

• Design a complete, fault-tolerant storage solution. For example, if you want your
storage solution to be able to tolerate a single fault at any level, you need this
minimum deployment:

o Two-way mirror or single-parity storage spaces.

o Redundant SAS connections between each node and each JBOD.

o Redundant network adapters and network switches.


o Enough JBOD enclosures to tolerate an entire JBOD failing or becoming


disconnected.

• Deploy a highly available storage pool. Using mirrored or parity virtual disks in
Storage Spaces provides some fault tolerance and high availability to storage
resources. However, because all physical disks connect to a single system, that
system is a single point of failure. If that system fails, access to the storage
will not exist. Storage Spaces in Windows Server 2016 supports clustered storage
pools with mirror spaces, parity spaces, and simple spaces. To cluster
Storage Spaces, your environment must meet the following requirements:

o All storage spaces in the storage pool must use fixed provisioning.

o Two-way mirror spaces must use three or more physical disks.

o Three-way mirror spaces must use five or more physical disks.

o All physical disks in a clustered pool must be connected by using SAS.

o All physical disks must support persistent reservations and pass the failover
cluster validation tests.

Note: The SAS JBOD must be physically connected to all cluster nodes
that will use the storage pool. Direct attached storage that is not
connected to all cluster nodes is not supported for clustered storage
pools with Storage Spaces.

• Unless you deploy a highly available storage pool, be prepared to mount the storage pool on


another server if the original server fails. In Windows Server 2016, Storage Spaces writes
the configuration to the storage pool directly. Therefore, if the
single-point-of-failure system fails and the server requires replacement
or a complete reinstall, you can mount a storage pool on another server.


• Most problems with Storage Spaces occur because of incompatible hardware or


because of firmware issues. To reduce problems, follow these best practices:

o Use only certified SAS-connected JBODs. These enclosure models have been
tested with Storage Spaces and enable you to identify the enclosure and slot
for a physical disk easily.

o Use the same disk models within a JBOD. Use one model of solid-state


drive (SSD) and one model of HDD for all disks that you use within a JBOD,
and make sure that their firmware is compatible with Storage Spaces.

o Install the latest firmware and driver versions on all disks. Install the firmware
version that is listed as approved for the device in the Windows Server Catalog
or is recommended by your hardware vendor. Within a JBOD, it's important that
all disks of the same model have the same firmware version.

o Follow the recommendations for disk placement in the documentation from


your hardware vendor. Different JBOD models might require different
placement of SSDs and HDDs, for airflow or other reasons.

• Unless you are using hot spares, retire missing disks automatically. The default


policy for a physical disk that goes missing from a storage pool
(-RetireMissingPhysicalDisks = Auto) simply marks the disk as missing (Lost
Communication), and no repair operation on the virtual disks takes place. This
policy avoids potentially I/O-intensive virtual disk repairs if a disk temporarily goes
offline, but the storage pool health will remain degraded, compromising resiliency
if no administrator takes action. Unless you are using hot
spares, you should change the RetireMissingPhysicalDisks policy
to Always, as shown in the sketch below, so that virtual disk repair operations start
when a disk loses communication with the system, restoring the health of the
dependent storage spaces as soon as possible.
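
A minimal sketch of changing this policy on an existing pool (the pool name is illustrative):

# Initiate virtual disk repairs as soon as a disk loses communication
Set-StoragePool -FriendlyName "StoragePool1" -RetireMissingPhysicalDisks Always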

• Always replace the physical disk before you remove the drive from the storage
pool. Changing the storage pool configuration before you replace the physical disk


in the enclosure can cause an I/O failure or initiate virtual disk repair, which can
result in a “STOP 0x50” error and potential data loss.

• As a general rule, keep unallocated disk space in the pool for virtual disk repairs
instead of using hot spares. In Windows Server 2016, you have the option to use
available capacity on existing disks in the pool for disk repair operations instead of
bringing a hot spare online. This enables Storage Spaces to automatically repair
storage spaces that have failed disks by copying data to other disks in the pool,
significantly reducing the time it takes to recover from the failure when
compared with using hot spares, and it lets you use the capacity on all disks
instead of setting some aside as hot spares.

o To correct a failed disk in a virtual disk or storage pool, you must remove the
disk that is causing the problem. Actions such as defragmenting, scan disk, or
using chkdsk cannot repair a storage pool.

o To replace a failed disk, you must add a new disk to the pool. The new disk
is used automatically when the next disk maintenance job runs (daily by default).
Alternatively, you can trigger a repair manually by using Repair-VirtualDisk.

• When configuring column counts, make sure that you have enough physical disks to


support virtual disk repairs. Typically, you should configure the virtual
disk with 3-4 columns for a good balance of throughput and low latency.
Increasing the column count increases the number of physical disks across which
a virtual disk is striped, which increases throughput and IOPS for that virtual disk.
However, increasing the column count can also increase latency. For this reason,
you should optimize overall cluster performance by using multiple virtual disks with
3−4 columns (for mirrors) or seven columns for parity spaces.
The entire cluster remains functional while failed virtual disks
are repaired; the only impact is the reduced performance during the repair.

• Be prepared for multiple disk failures. If you purchased the disks in an


enclosure at the same time, the disks are the same age, and the failure of one
disk might be followed fairly quickly by other disk failures. Even if the storage


spaces return to health after the initial disk repairs, you should replace the failed
disk as soon as possible to avoid the risk of additional disk failures, which might
compromise storage health and availability and risk data loss. If you want to be
able to delay disk repairs safely until your next scheduled maintenance, configure
your storage spaces to tolerate two disk failures.

• Provide fault tolerance at the enclosure level. For added


fault tolerance at the enclosure level, deploy compatible JBODs
that provide enclosure awareness. In an enclosure-aware solution,
Storage Spaces writes each copy of data to a specific JBOD enclosure. As a
result, if one enclosure fails or goes offline, the data remains available in one or
more alternate enclosures. To use enclosure awareness with Storage Spaces,
your environment must meet the following requirements (see the sketch after
these requirements):

o JBOD storage enclosures must support SCSI Enclosure Services (SES).

o Storage Spaces must be configured as a mirror.

o You must deploy at least three compatible storage enclosures with two-way


mirror spaces.

o You must deploy at least five compatible storage enclosures with three-way


mirror spaces.
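
If your JBODs support SES, you can verify what Storage Spaces sees and then create an enclosure-aware space; a sketch, with illustrative names:

# List the storage enclosures visible to Storage Spaces
Get-StorageEnclosure

# Create a mirror space that keeps each data copy in a separate enclosure
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "Mirror1" `
    -ResiliencySettingName Mirror -UseMaximumSize -IsEnclosureAware $true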

Storage pool expansion


One of the main benefits of using Storage Spaces is the ability to expand your
storage pool by adding additional storage. Occasionally, you might need to
investigate how the storage is being used within the pool
before expanding the storage. This is because your various
virtual disks are laid out across the physical disks in a particular
configuration, based on the storage layout selected when
creating the virtual disk. Depending upon the specifics, you might not be able to extend the
storage, even if there is available space in the pool.

Example

Consider the following example:

In the first figure, the pool consists of five disks, with one disk larger than


the others. Vdisk1 consumes space across all five disks, while vdisk2
consumes space only on disks 1 through 3.


FIGURE: A POOL CONSISTING OF FIVE DISKS

In the second figure, a sixth disk has been added to the pool.

FIGURE: A POOL CONSISTING OF SIX DISKS

• If you attempt to extend vdisk1, you cannot, because the maximum number of disks for its layout has


already been consumed, even though more space is available in the pool on disk 6.
This is because the layout required by vdisk1—due to the options chosen at
creation (such as mirroring and parity)—needs five disks. Therefore, to expand
vdisk1, you would need to add four additional disks.

• However, if you attempt to extend vdisk2, you can do so because that disk is
currently laid out across only three devices, and enough free space exists across
the available disks to extend it.

In Storage Spaces terms, what blocked the expansion of vdisk1 is the number of columns in its storage


layout: in the pre-expanded state, vdisk1 uses five columns and vdisk2
uses three columns.


• Vdisk2 might just be a virtual disk that used two-way mirroring. This means that
data on disk1 is duplicated on disk2 and disk3. If you wish to expand a virtual disk
with two-way mirroring, it has to have the appropriate number of columns available
to accommodate the needs of the virtual disk.

Determining column usage

Before expanding a storage pool, you must understand the distribution


of blocks by determining the column count of each virtual disk. You can use
the Windows PowerShell cmdlet Get-VirtualDisk to do this.

Note: For more information, refer to “Storage Spaces Frequently Asked


Questions (FAQ)” at: http://aka.ms/knx5zg
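
For example, the following sketch lists each virtual disk's column count and interleave, which tells you how many physical disks an extension must span:

# Show the layout parameters that constrain expansion
Get-VirtualDisk | Select-Object FriendlyName, NumberOfColumns, Interleave, ResiliencySettingName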

Expanding the storage pool

After determining column usage and adding physical disks where necessary, you can expand the storage


pool by using:

• Server Manager. Open Server Manager, select File and Storage Services, and
then click Storage Pools. You can add a physical disk by right-clicking the pool,
and then click Add Physical Disk.

• Windows PowerShell. You can use the Windows PowerShell cmdlet Add-
PhysicalDisk to add a physical disk to the storage pool. For example:

Add-PhysicalDisk –VirtualDiskFriendlyName "UserData" –PhysicalDisks


(Get-PhysicalDisk -FriendlyName PhysicalDisk3, PhysicalDisk4)


Demonstration: Managing Storage Spaces by using Windows PowerShell
In this demonstration, you will see how to use Windows PowerShell to:

• View the properties of a storage pool.

• Add physical disks to a storage pool.

Demonstration steps
View the Properties of a Storage Pool

1. On LON-SRV1, open Windows PowerShell.

2. View the current storage configuration in Server Manager.

3. Run the following commands:

a. To list the storage pools with their operational status, run the


following command:

Get-StoragePool

b. To return more information about StoragePool1, run the following


command:

Get-StoragePool –FriendlyName StoragePool1 | fl

c. To return detailed information about your virtual disks, including


provisioning type, parity layout, and health, run the following command:

39 of 93 3/12/2019, 12:38 PM
about:blank

Get-VirtualDisk | fl

d. To return a list of physical disks than can be pooled, run the following
command:

Get-PhysicalDisk | Where {$_.CanPool -eq $true}

Add Physical Disks to a Storage Pool

1. Run the following commands:

a. To create a new virtual disk in StoragePool1, run the following command:

New-VirtualDisk –StoragePoolFriendlyName StoragePool1 –FriendlyName
Data -Size 2GB

You can see this new virtual disk in Server Manager.

b. To add a list of physical disks that can be pooled to the variable, run the
following command:

$canpool = Get-PhysicalDisk –CanPool $true

c. To add the physical disks in the variable to the storage pool, run the following
command:

Add-PhysicalDisk -PhysicalDisks $canpool
-StoragePoolFriendlyName StoragePool1

2. View the additional physical disks in Server Manager.

Event logs and performance counters

With any storage technology, it is important that you monitor storage behavior and
function to ensure ongoing reliability, availability, and optimal performance.

Using event logs

When a failure occurs within the storage architecture, the system generates


errors and writes those errors to the event log. You can view events by
using Event Viewer, or by accessing the recorded events by using Server
Manager or Windows PowerShell cmdlets. The following list identifies common
Event IDs associated with problematic storage, along with each event's message and
cause.


Event ID: 100

Message: Physical drive %1 failed to read the configuration or returned corrupt data
for storage pool %2. As a result, the in-memory configuration might not be the
most recent copy of the configuration. Return Code: %3.

Cause: A physical drive can fail to read the configuration or return corrupt data for a
storage pool for the following reasons:

• The physical drive might fail requests with device I/O errors.

• The physical drive might contain corrupted storage pool configuration data.

• The physical drive might contain insufficient memory resources.

Event ID: 102

Message: Majority of the physical drives of storage pool %1 failed a configuration
update, which caused the pool to go into a failed state. Return Code: %2.

Cause: A failure might occur when writing a storage pool configuration to physical
drives for the following reasons:

• Physical drives might fail requests with device I/O errors.

• An insufficient number of physical drives are online and updated with their
latest configurations.

• The physical drive might contain insufficient memory resources.

Event ID: 103

Message: The capacity consumption of the storage pool %1 has exceeded the
threshold limit set on the pool. Return Code: %2.

Cause: The capacity consumption of the storage pool has exceeded the threshold
limit set on the pool.

Event ID: 104

Message: The capacity consumption of the storage pool %1 is now below the
threshold limit set on the pool. Return Code: %2.

Cause: The capacity consumption of the storage pool returns to a level that is below
the threshold limit set on the pool.

Event ID: 200

Message: Windows was unable to read the drive header for physical drive %1. If you
know the drive is still usable, resetting the drive health status by using the command
line or GUI might clear this failure condition and enable you to reassign the drive to its
storage pool. Return Code: %2.

Cause: Windows was unable to read the drive header for a physical drive.

Event ID: 201

Message: Physical drive %1 has invalid meta-data. Resetting the health status by
using the command line or GUI might bring the physical drive to the primordial pool.
Return Code: %2.

Cause: The metadata on a physical drive has become corrupt.

Event ID: 202

Message: Physical drive %1 has invalid meta-data. Resetting the health status by
using the command line or GUI might resolve the issue. Return Code: %2.

Cause: The metadata on a physical drive has become corrupt.

Event ID: 203

Message: An I/O failure has occurred on physical drive %1. Return Code: %2.

Cause: An I/O failure has occurred on a physical drive.


Event ID: 300

Message: Physical drive %1 failed to read the configuration or returned corrupt data
for storage space %2. As a result, the in-memory configuration might not be the most
recent copy of the configuration. Return Code: %3.

Cause: A physical drive can fail to read the configuration or return corrupt data for the
following reasons:

• The physical drive might fail requests with device I/O errors.

• The physical drive might contain corrupted storage space configuration data.

• The physical drive might contain insufficient memory resources.

Event ID: 301

Message: Majority of the pool drives failed to read the configuration or returned
corrupt data for storage space %1. As a result, the storage space will not attach.
Return Code: %2.

Cause: Storage spaces might experience all pool drives failing to read the
configuration or returning corrupt data for the following reasons:

• Physical drives might fail requests with device I/O errors.

• Physical drives might contain corrupted storage pool configuration data.

• The physical drive might contain insufficient memory resources.

Event ID: 302

Message: Majority of the pool drives hosting space meta-data for storage space %1
failed a space meta-data update, which caused the storage pool to go into a failed
state. Return Code: %2.

Cause: A failure when updating the storage space metadata might occur for the
following reasons:

• Physical drives might fail requests with device I/O errors.

• An insufficient number of physical drives have the latest storage space
configuration.

• The physical drive might contain insufficient memory resources.

Event ID: 303

Message: Drives hosting data for storage space %1 have failed or are missing. As a
result, no copy of data is available. Return Code: %2.

Cause: This event can occur if a drive in the storage pool fails or is removed.

Event ID: 304

Message: One or more drives hosting data for storage space %1 have failed or are
missing. As a result, at least one copy of data is not available. However, at least one
copy of data is still available. Return Code: %2.

Cause: One or more drives hosting data for a storage space have failed or are
missing; at least one copy of the data is not available, but at least one copy of the
data is still available.


Event ID: 306

Message: The attempt to map or allocate more storage for the storage space %1 has
failed. This is because there was a write failure involved in the updating of the storage
space metadata. Return Code: %2.

Cause: The attempt to map or allocate more storage for the storage space has failed.
More physical drives are needed.

Event ID: 307

Message: The attempt to unmap or trim the storage space %1 has failed. Return
Code: %2.

Cause: The attempt to unmap or trim the storage space has failed.

Event ID: 308

Message: The driver initiated a repair attempt for storage space %1. Return Code: %2.

Cause: The driver initiated a repair attempt for the storage space. This is a normal
condition. No further action is required.

Performance Monitoring

Most decisions regarding the configuration of your storage architecture


have an impact on the performance of your storage. This is also true for
Storage Spaces: how you implement your storage determines whether performance is better
or worse, and you must find a balance between multiple factors: cost, reliability,
availability, power, and ease-of-use.

There are multiple components that handle storage requests within your storage
architecture, including:

• File shares.

• File systems.

• Volumes.

• Physical storage hardware.


• Storage Spaces configuration options.

You can use Windows PowerShell and Performance Monitor to monitor the
performance of your storage pools. If you want to use Windows PowerShell, you
must install the Storage Spaces Performance Analysis module for Windows
PowerShell.

Note: You can download the “Storage Spaces Performance Analysis” module for


Windows PowerShell. To download the module, go to: http://aka.

To use Windows PowerShell to generate and collect performance data, at a Windows


PowerShell prompt, run the following cmdlet:

Measure-StorageSpacesPhysicalDiskPerformance
-StorageSpaceFriendlyName StorageSpace1 -MaxNumberOfSamples 60
-SecondsBetweenSamples 1 -ReplaceExistingResultsFile
-ResultsFilePath StorageSpace1.blg -SpacetoPDMappingPath PDMap.csv

This cmdlet:

• Monitors the performance of all physical disks associated with the storage space
named StorageSpace1.

• Captures 60 samples at one-second intervals.

• Replaces the results files if they already exist.

• Stores the performance log in the file named StorageSpace1.blg.

• Stores the physical disk mapping information in a file named PDMap.csv.


You can use Performance Monitor to view the data collected in the two files specified
in the cmdlet above, named StorageSpace1.blg and PDMap.csv.

Lab A: Implementing Storage Spaces

Scenario
A. Datum has purchased a number of hard disk drives and SSDs, and you
have been asked to configure a storage solution that uses these devices
to meet the requirements at A. Datum. For data redundancy,
you must have a redundant storage solution for data that does not
require fast disk read and write access. You also must create a solution for data that
does require fast read and write access.

You decide to use Storage Spaces and storage tiering to meet the requirements.

Objectives
After completing this lab, you will be able to:

• Create a storage space.

• Enable and configure storage tiering.

Lab setup

Estimated time:

Virtual machines: 20740C-LON-DC1 and 20740C-LON-SVR1

User name: Adatum\Administrator

Password: Pa55w.rd


For this lab, you need to use the available virtual machine environment. Before you
begin the lab, you must complete the following steps:

1. On the host computer, start Hyper-V Manager.

2. In Hyper-V Manager, click 20740C-LON-DC1, and, in the Actions pane, click
Start.

3. In the Actions pane, click Connect. Wait until the virtual machine starts.

4. Sign in by using the following credentials:

• User name: Administrator

• Password: Pa55w.rd

• Domain: Adatum

5. Repeat steps 2 through 4 for 20740C-LON-SVR1.

Exercise 1: Creating a storage space

Scenario

Your server does not have a hardware-based RAID card, but you have been asked to
configure redundant storage. To support this feature, you must create a storage pool.

After creating the storage pool, you must create a redundant virtual disk. Because the


data requires redundant storage, but not fast read and write access, you will use a three-
way mirror. To simulate a disk failure after the volume is created, you will
remove a physical disk; you then have to
replace the failed disk by adding a new disk to the storage pool.

The main tasks for this exercise are as follows:


1. Create a storage pool from six disks that are attached to the server

2. Create a three-way mirrored virtual disk (need at least five physical disks)

3. Copy a file to the volume, and verify it is visible in File Explorer

4. Remove a physical drive to simulate drive failure

5. Verify that the file is still accessible

6. Add a new disk to the storage pool and remove the failed disk


Result: After completing this exercise, you should have successfully created a
storage pool and added five disks to it. Additionally, you should have created a
three-way mirrored, thinly-provisioned virtual disk from the storage pool. You also
should have copied a file to the new volume and then verified that it is accessible.
Next, after removing a physical drive, you should have verified that the virtual disk
was still available and that you could access it. Finally, you should have added
another disk to the storage pool.

Exercise 2: Enabling and configuring storage tiering

Scenario


Management wants you to implement storage tiers to take advantage of the high-
performance attributes of a number of SSDs, while utilizing less expensive hard disk
drives for less frequently accessed data.

The main tasks for this exercise are as follows:

1. Use the Get-PhysicalDisk cmdlet to view all available disks on the system

2. Create a new storage pool

3. View the media types of the disks in the pool

4. Specify the media type for the sample disks and verify that the media type is
changed

5. Create pool-level storage tiers by using Windows PowerShell

6. Create a new virtual disk with storage tiering by using the New Virtual Disk
Wizard

7. Prepare for the next module


Result: After completing this exercise, you should have successfully enabled and


configured storage tiering.

Review Question(s)

Check Your Knowledge

Discovery
At a minimum, how many physical disks must you add to a storage pool to create a three-way
mirrored virtual disk?


Check Your Knowledge

Discovery
You have four SAS disks and four SATA disks attached to a
Windows Server 2016 server. You want to provide a single volume to users that
they can use for file storage. What would you use?


Lesson 3: Implementing Data Deduplication

Data Deduplication is a role service of Windows Server 2016. This service identifies
and removes duplications within data without compromising data integrity. It does this
to achieve the goal of storing more data in less disk space.
This lesson describes how to implement Data Deduplication in Windows Server 2016
storage.

Lesson objectives
After completing this lesson, you will be able to:


• Describe Data Deduplication in Windows Server 2016.

• Identify Data Deduplication components in Windows Server 2016.

• Explain how to deploy Data Deduplication.

• Describe common usage scenarios for data deduplication.

• Explain how to monitor and maintain data deduplication.

• Describe backup and restore considerations with Data Deduplication.

What is Data Deduplication?

To cope with data growth in the enterprise, organizations are consolidating


servers and making capacity scaling and data optimization key goals. Data
Deduplication provides practical ways to achieve these goals, including:

• Capacity optimization. Data Deduplication stores more data in less physical


space. It achieves greater storage efficiency as compared to features such as


Single Instance Store (SIS) or NTFS compression. Data deduplication uses subfile
variable-size chunking and compression, which deliver optimization ratios of 2:1
for general file servers and up to 20:1 for virtualization data.

• Scale and performance. Data Deduplication is highly scalable, resource efficient,


and can process up to 50 MB of data per second in Windows Server
2012 R2. In Windows Server 2016, Data Deduplication can
perform significantly better because of advancements in
the Deduplication Processing Pipeline. In this version of Windows Server,
Data Deduplication can run multiple threads in parallel by using multiple I/O
queues on multiple volumes simultaneously without affecting other workloads on
the server. Throttling the CPU maintains the low impact on the server workloads
and memory resources that are consumed; if the server is very busy, deduplication
can stop completely. In addition, you have the flexibility to run Data Deduplication
jobs at any time, set schedules for when data deduplication should run, and
establish policies.

• Reliability and data integrity. When you apply Data Deduplication to a volume on a


server, Windows maintains the integrity of the data. Data Deduplication uses checksum
results, consistency, and identity validation to ensure data integrity. Data
Deduplication maintains redundancy, for all metadata and the most frequently
referenced data, to ensure that the data is repaired, or at least recoverable, in the
event of data corruption.

• Bandwidth efficiency with BranchCache. Through integration with BranchCache,


the same optimization techniques are applied to data transferred over the WAN to
a branch office. The result is faster file download times and reduced bandwidth
consumption.

• Optimization with familiar tools. Data Deduplication has optimization


functionality built into Server Manager and Windows PowerShell. Default settings
can provide savings immediately, or you can fine-tune the settings to see more
gains. By using Windows PowerShell cmdlets, you can start an optimization job or

54 of 93 3/12/2019, 12:38 PM
about:blank

schedule one to run in the future. Installing the Data Deduplication feature and
enabling deduplication on selected volumes can also be accomplished by using
the Unattend.xml file that calls a Windows PowerShell script and can be used with
Sysprep to deploy deduplication when a system first boots.

The deduplication process involves finding and removing duplication within data


without compromising its fidelity or integrity. The goal is to store more data in less
space by segmenting files into small variable-sized chunks, identifying
duplicate chunks, and maintaining a single copy of each chunk.

After deduplication, files are no longer stored as independent streams of data, and
they are replaced with stubs that point to data blocks that are stored within a common
chunk store. Because these files share blocks, those blocks are only stored once,
which reduces the disk space needed to store all files. During file access, the correct
blocks are transparently assembled to serve the data without the application or the
user having any knowledge of the on-disk transformation. This enables you
to apply deduplication to files without having to worry about any change in behavior
to the applications or users who are accessing them. Data
Deduplication is best suited for storage scenarios with larger amounts of data that are not
modified frequently.

Enhancements to the Data Deduplication Role Service

Windows Server 2016 includes several important improvements to the way Data
Deduplication worked in Windows Server 2012 R2 and Windows Server 2012,
including:

• Support for volume sizes up to 64 TB. Data Deduplication in Windows Server


2012 R2 does not perform well on volumes greater than 10 TB in size (or less for
workloads with a high rate of data changes), so the feature has been redesigned in
Windows Server 2016. The Deduplication Processing Pipeline is now multithreaded
and able to utilize multiple CPUs per volume to increase optimization throughput
rates on volume sizes up to 64 TB. This upper limit comes from VSS, on which Data
Deduplication is dependent.

• Support for file sizes up to 1 TB. In Windows Server 2012 R2, very large files are
not good candidates for Data Deduplication. However, with the use of the new
stream map structures and other improvements that increase optimization
throughput and access performance, deduplication in Windows Server 2016
performs well on files up to 1 TB.

• Simplified configuration for virtualized backup applications. Although


Windows Server 2012 R2 supports deduplication for virtualized backup
applications, it requires manually tuning the deduplication settings. In Windows
Server 2016, however, the configuration of deduplication for virtualized backup
applications is drastically simplified by a predefined usage-type option when
enabling deduplication for a volume.

• Support for Nano Server. Nano Server is a new deployment option in Windows


Server 2016 that has a far smaller system resource footprint, starts significantly
faster, and requires fewer updates and restarts than the Server Core
deployment option of Windows Server. In addition, Nano Server fully supports
Data Deduplication.

• Support for cluster rolling upgrades. Windows servers in a failover cluster running
deduplication can include a mix of nodes running Windows Server 2012 R2 and
nodes running Windows Server 2016. This major enhancement provides full data
access to all of your deduplicated volumes during a cluster rolling upgrade. For
example, you can upgrade each deduplication node in an existing
Windows Server 2012 R2 cluster to Windows Server 2016 without incurring
downtime by not taking all the nodes offline at once.

Note: Although both the Windows Server versions of deduplication can


access the optimized data, the optimization jobs run only on the Windows


Server 2012 R2 deduplication nodes and are blocked from running on the
Windows Server 2016 deduplication nodes until the cluster rolling upgrade is
complete.

Effectively, Data Deduplication in Windows Server 2016 allows you to efficiently


store, transfer, and back up fewer bits.

Volume requirements for Data Deduplication

After installing the Data Deduplication role service, you can enable deduplication on a per-volume


basis. Deduplication on a volume includes the following requirements:

• Volumes must not be a system or boot volume. Because most files used by an
operating system are constantly open, Data Deduplication on system volumes
would negatively affect the performance because deduplicated data would need to
be expanded before you could use the files.

• Volumes must be partitioned by using master boot record (MBR) or GUID


partition table (GPT), and must be formatted with the NTFS file system (the ReFS
file system is not supported).

• Volumes must be attached to the Windows Server and must appear as non-
removable drives. This means that you cannot use USB or floppy drives for Data
Deduplication, nor use remotely-mapped drives.

• Volumes can be on shared storage, such as Fibre Channel, iSCSI SAN, or SAS
array.

• Files with extended attributes, encrypted files, and


reparse point files are not processed for Data Deduplication.

• Data Deduplication is not available for Windows client operating systems.


Data Deduplication components

The Data Deduplication role service consists of several


components, including the following:

• Filter driver. This component monitors local or remote I/O and handles the chunks


of data on the file system by interacting with the various jobs. There is one filter
driver for every volume.

• Deduplication service. This component manages the following job types:

o Optimization. Consisting of multiple jobs, optimization performs both deduplication and


compression of file chunks according to the data deduplication policy for the volume.
After initial optimization of a file, if the file is then modified and meets the data
deduplication policy threshold for optimization, it will be optimized again.

o Garbage collection. Data Deduplication includes garbage collection jobs to


process deleted or modified data on the volume so that any data chunks no
longer referenced are cleaned up. This job processes previously deleted or
logically overwritten optimized content to create usable volume free space.


When an optimized file is deleted or overwritten by new data, the old data in
the chunk store is not deleted right away. While garbage collection is scheduled
to run weekly, you might consider running garbage collection only after large
deletions have occurred.

o Scrubbing. Data Deduplication has built-in data integrity features such as


checksum validation and metadata consistency checking, as well as built-in
redundancy for critical metadata and the most popular data chunks. As data is
accessed or deduplication jobs process data, if they encounter corruption, they
record the corruption in a log file. Scrubbing jobs use these
features to analyze the chunk store corruption logs and, when possible, to
make repairs. Possible repair operations include using three sources of
redundant data:

▪ Deduplication keeps backup copies of popular chunks when they are


referenced more than 100 times in an area called the hotspot. If the working copy
is corrupted, deduplication uses its redundant copy to fix soft
corruptions such as bit flips or torn writes.

▪ If using mirrored Storage Spaces, deduplication can use the mirror image of the


redundant chunk to serve the I/O and fix the corruption.

▪ If a file is processed with a chunk that is corrupted, the corrupted chunk is


eliminated, and the new incoming chunk is used to fix the corruption.

Because of the additional validations that are built into


the deduplication subsystem, the system is often the first to report any early
signs of data corruption in the hardware or file system.

o Unoptimization. This job undoes deduplication on all of the optimized files on


the volume. Some of the common scenarios for using this type of job include
decommissioning a server with volumes enabled for Data Deduplication,
troubleshooting issues with deduplicated data, or migration of data to another
system that doesn’t support Data Deduplication. Before you start this job, you
should use the Disable-DedupVolume Windows PowerShell cmdlet to disable
further data deduplication activity on one or more volumes. After you disable
Data Deduplication, the volume remains in the deduplicated state, and the
existing data remains accessible; however, optimization of new data stops
for the volume. To undo deduplication of the existing data on the volume, you
would use the unoptimization job. At the end of a successful unoptimization job,
all of the data deduplication metadata is deleted from the volume.

Note: You should be cautious when using the unoptimization job because all
the deduplicated data will return to the original logical file size. As such, you
should verify that the volume has enough free space to hold the expanded
data to allow the job to complete successfully.
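
As a brief sketch of this workflow, the following commands assume a
deduplicated data volume D: that you want to return to an unoptimized state
(the drive letter is an assumption for illustration):

# Stop further deduplication activity on the volume
Disable-DedupVolume -Volume "D:"

# Undo deduplication of the existing optimized files
Start-DedupJob -Volume "D:" -Type Unoptimization

# Track the progress of the running job
Get-DedupJob -Volume "D:"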

The Data Deduplication process

In Windows Server 2016, Data Deduplication transparently removes duplication


without changing access semantics. When you enable Data Deduplication on a
volume, a post-process, or target, deduplication is used to optimize the file data on
the volume by performing the following actions:

• Uses optimization background tasks with low priority, allowing the server


to continue serving the data on the volume.

• By using a chunking algorithm, segments all file data on the volume into small, variable-


sized chunks that range from 32 KB to 128 KB.

• Identifies chunks that have one or more duplicates on the volume.


• Inserts chunks into a common chunk store.

• Replaces all duplicate chunks with a reference, or stub, to a single copy of the
chunk in the chunk store.

• Replaces the original files with a reparse point, which contains references to its
data chunks.

• Compresses the chunks and organizes them in container files in the System Volume


Information folder.

• Removes the original data stream of the files.

The Data Deduplication process works through scheduled tasks on the local server,
but you can run the process interactively by using Windows PowerShell. More
information about this is discussed later in the module.

Data Deduplication does not have any write-performance impact because the data is


not deduplicated while it is being written. Windows Server uses post-
process deduplication, which ensures that the deduplication savings are maximized.
Another benefit of this type of deduplication is that application
servers and client computers offload all processing, which means less stress on the
other resources in your environment. There is, however, a small performance impact
when reading deduplicated files.

Note: The three main types of data deduplication include source, target (or post-
process), and in-line (or transit) deduplication.

Data Deduplication can process all of the data on a selected volume,


except for files that are less than 32 KB in size, and files in folders that are excluded.
You must carefully determine if a server and its attached volumes are suitable
candidates for deduplication prior to enabling the feature. You should also consider


backing up important data regularly during the deduplication process.

After you enable a volume for deduplication and the data is optimized, the volume
contains the following elements:

• Unoptimized files. Includes files that do not meet the selected file-age policy
setting, system state files, alternate data streams, encrypted files, files with
extended attributes, files smaller than 32 KB, or other reparse point files.

• Optimized files. Files that are stored as reparse points that contain


pointers to a map of the respective chunks in the chunk store that are needed to
restore the file when it is requested.

• Chunk store. Location for the optimized file data.

• Additional free space. The optimized files and chunk store occupy much less
space than they did prior to optimization.

Deploying Data Deduplication

Planning a Data Deduplication deployment

Prior to installing and configuring Data Deduplication in your environment, you must


plan your deployment by using the following steps:

• Target deployments. Data Deduplication is designed to be applied on primary –


and not to logically extended – data volumes without adding any additional
dedicated hardware. You can schedule deduplication based on the type of data
that is involved and the frequency and volume of changes that occur to the volume
or particular file types. You should consider using deduplication for the following
data types:

o General file shares. Group content publication or sharing, user home folders,
and Folder Redirection/Offline Files.

o Software deployment shares. Software binaries, images, and updates.

o VHD libraries. Virtual hard disk (VHD) file storage for provisioning to


hypervisors.

o VDI deployments. Virtual Desktop Infrastructure (VDI) deployments using


Hyper-V.

o Virtualized backup. Backup applications running as Hyper-V guests saving


backup data to mounted VHDs.

• Determine which volumes are candidates for deduplication. Deduplication can be


very effective for optimizing storage and reducing the amount of disk space
consumed – saving you up to 50 to 90 percent of your system’s storage space when
applied to the right data. Use the following considerations to evaluate which
volumes are ideal candidates for deduplication:

o Is duplicate data present?

File shares or servers which host user documents, software deployment


binaries, or virtual hard disk files tend to have plenty of duplication, and yield
higher savings from deduplication. The deployment candidates for
deduplication and the supported/unsupported scenarios are covered later in this
module.

o Does the data access pattern allow for sufficient time for deduplication?

For example, files that frequently change and are often accessed by users or
applications are not good candidates for deduplication. In these scenarios,
deduplication might not be able to process the files, as the constant access and
change to the data are likely to cancel any optimization gains made by
deduplication. On the other hand, data that is accessed or modified
infrequently makes a good candidate for deduplication.

o Does the server have sufficient resources and time to run deduplication?

Deduplication requires reading, processing, and writing large amounts of data,


which consumes server resources. Servers typically have periods of high
activity and times when there is low resource utilization; the deduplication jobs
work more efficiently when resources are available. However, if a server is


constantly at maximum resource capacity, it might not be an ideal candidate for
deduplication.

• Evaluate savings with the Deduplication Evaluation Tool. You can use the
Deduplication Evaluation Tool, DDPEval.exe, to determine the expected savings
that you would get if you enable deduplication on a volume.
DDPEval.exe supports evaluating local drives and mapped or unmapped remote
shares.

When you install the Data Deduplication role service, the Deduplication


Evaluation Tool (DDPEval.exe) is automatically installed to the \Windows
\System32\ directory.
For more information, refer to “Plan to Deploy Data Deduplication” at:
http://aka.ms/sxzd2l
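
As a rough sketch, an evaluation run against a local folder and a remote share
might look like the following (E:\Shares and \\LON-SVR1\Data are hypothetical
paths used only for illustration):

# Estimate deduplication savings for a local folder
DDPEval.exe E:\Shares

# Estimate deduplication savings for a remote share
DDPEval.exe \\LON-SVR1\Data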

• Plan the scheduling and scope of deduplication. The default deduplication job


schedule and policy settings are sufficient for most environments. However, if your
deployment includes any of the following conditions, consider altering the
default settings:

o Incoming data is static or expected to be read-only, and you want to process


files on the volume sooner. In this scenario, change the MinimumFileAgeDays
setting to a smaller number of days to process files earlier.

o You have directories that you do not want to deduplicate. Add a directory to the
exclusion list.

o You have file types that you do not want to deduplicate. Add a file type to the
exclusion list.

o The server has different off-peak hours than the default schedule, and you want to


change the Garbage Collection and Scrubbing schedules. Update the
schedules by using Windows PowerShell, as shown in the scheduling example
later in this lesson.

Installing and configuring Data Deduplication

After completing your planning, you need to use the following steps to deploy Data
Deduplication in your environment:

• Install Data Deduplication components on the server. Use one of the following options to


install the components on the server:

o Server Manager. In Server Manager, you can install Data Deduplication by


navigating to Add Roles and Features Wizard > under Server Roles > select
File and Storage Services > select the File Services check box > select the
Data Deduplication check box > click Install.

o Windows PowerShell. You can use the following commands to install the Data
Deduplication components:

Import-Module ServerManager
Add-WindowsFeature -Name FS-Data-Deduplication
Import-Module Deduplication

• Enable Data Deduplication. Use the following options to enable Data


Deduplication on the server:

o Server Manager. Use the following steps in Server Manager:

i. Navigate to File and Storage Services > Volumes, right-click the required
volume, and select Configure Data Deduplication.

ii. In the deduplication settings box, select the type of workload to host on the


volume. For example, select General purpose file server for general data
files or Virtual Desktop Infrastructure (VDI) server when configuring
storage for running virtual machines.

iii. Enter the minimum number of days that should elapse from the date of file
creation before files are deduplicated, enter the extensions of any file types
that should not be deduplicated, and then click Add to browse to any folders
with files that should not be deduplicated.

iv. Apply these settings, and then in the Server Manager


Set Deduplication Schedule dialog box, you can continue to
configure a schedule for deduplication.

o Windows PowerShell. Use the following command to enable deduplication on a


volume:

Enable-DedupVolume -Volume VolumeLetter -UsageType StorageType

Replace VolumeLetter with the drive letter of the volume. Replace


StorageType with the value corresponding to the expected type of
workload for the volume. Acceptable values include:

• HyperV. A volume for Hyper-V storage.

• Backup. A volume that is optimized for virtualized backup servers.

• Default. A general purpose volume.
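
For example, a minimal sketch that enables deduplication on a hypothetical
volume E: used for Hyper-V storage might look like this:

Enable-DedupVolume -Volume E: -UsageType HyperV

# Confirm that deduplication is enabled and view the current settings
Get-DedupVolume -Volume E: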

You can use the Windows PowerShell cmdlet Set-DedupVolume to configure additional


options, such as the minimum number of days that should elapse from the
date of file creation before files are deduplicated, the file
types that should not be deduplicated, and the folders that are excluded from
deduplication.
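
A sketch of such a configuration, assuming a volume E:, a five-day minimum
file age, and a hypothetical exclusion folder and file type, might be:

# Tune the deduplication policy for volume E:
Set-DedupVolume -Volume E: -MinimumFileAgeDays 5 `
    -ExcludeFolder "E:\Temp" -ExcludeFileType "tmp"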

• Configure Data Deduplication jobs. You can run Data Deduplication jobs
manually, on demand, or on a schedule. The following list describes the types of
jobs that you can perform on a volume:

o Optimization. Includes built-in jobs which are scheduled automatically for


optimizing the volumes on a periodic basis. Optimization jobs deduplicate data
and compress file chunks on a volume per the policy settings. You can also use
the following command to trigger an optimization job on demand:

Start-DedupJob -Volume VolumeLetter -Type Optimization

o Scrubbing. Includes built-in jobs which are scheduled to automatically analyze the


volume on a weekly basis and produce a summary report in the Windows event
log. You can also use the following command to trigger a scrubbing job on
demand:

Start-DedupJob -Volume VolumeLetter -Type Scrubbing

o Garbage collection. Includes built-in jobs which are scheduled to run automatically against the


volume on a weekly basis. Because garbage collection is a
processing-intensive operation, you might consider waiting until after the
deletion load reaches a threshold to run this job on demand, or schedule the job
for after hours. You can also use the following command to trigger a garbage
collection job on demand:

Start-DedupJob -Volume VolumeLetter -Type GarbageCollection

o Unoptimization. Unoptimization jobs are available on an on-demand basis and


are not scheduled automatically. However, you can use the following command
to trigger an unoptimization job on demand:

Start-DedupJob -Volume VolumeLetter -Type Unoptimization

Note: For more information, refer to “Set-DedupVolume” at:


http://aka.ms/o30xqw

• Configure Data Deduplication schedules. When you enable Data Deduplication on


a server, three schedules are enabled by default: optimization is scheduled to run
every hour, and Garbage Collection and Scrubbing are scheduled to run once a
week. You can view the schedules by using the Windows PowerShell cmdlet Get-
DedupSchedule. These scheduled jobs run on all the volumes on the server.
However, if you want to run a job only on a particular volume, you must create a
new job. You can create, modify, or delete job schedules from the Deduplication
Settings page in Server Manager, or by using the Windows PowerShell cmdlets:
New-DedupSchedule, Set-DedupSchedule, or Remove-DedupSchedule.

Note: Deduplication jobs support only daily and weekly schedules. If you need to create a


schedule for a monthly job or any other custom time period, use
Windows Task Scheduler. However, you will be unable to view
these custom job schedules created with Windows Task Scheduler by using
the Windows PowerShell cmdlet Get-DedupSchedule.
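
As a sketch of creating a custom schedule, the following assumes you want a
nightly weekday optimization job with a six-hour window; the schedule name is
hypothetical:

New-DedupSchedule -Name "NightlyOptimization" -Type Optimization `
    -Days Monday,Tuesday,Wednesday,Thursday,Friday -Start 22:00 -DurationHours 6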

Demonstration: Implementing Data Deduplication


In this demonstration, you will see how to:

• Install the Data Deduplication role service.

• Enable Data Deduplication.

• Check the status of Data Deduplication.


Demonstration steps
Install the Data Deduplication Role Service

• On LON-SVR1, in Server Manager, add the Data Deduplication role service.

Enable Data Deduplication

1. On LON-SVR1, observe the available space on drive D.

2. In Server Manager, click File and Storage Services.

3. Click Disks.

4. Click disk 1, and then click the D volume.

5. Enable Data Deduplication, and then click the General purpose file server setting.

6. Configure the following settings:

a. Deduplicate files older than (in days):

b. Enable throughput optimization

c. Exclude: D:\shares

Check the status of Data Deduplication

1. Open Windows PowerShell.

2. Run the following commands to verify the status of Data Deduplication:

a. Get-DedupStatus

b. Get-DedupStatus | fl

c. Get-DedupVolume

d. Get-DedupVolume | fl

e. Start-DedupJob D: -Type Optimization -Memory 50

3. Repeat commands 2a and 2c.

Because most of the files on drive D are duplicates, you should notice a


significant amount of saved space.

4. Close all open windows.

Usage scenarios for Data Deduplication

The following list highlights typical deduplication savings for various content types.
Your data storage savings will vary by data type, the mix of data, and the size of the


volume and the files that the volume contains. You should consider using the
Deduplication Evaluation Tool to evaluate the volumes before you enable
deduplication.

• User documents. This includes group content publication or sharing, user home
folders (or MyDocs), and profile redirection for accessing offline files. Applying
Data Deduplication to these shares might be able to save you up to 30 to 50 percent of your
system’s storage space.

• Software deployment shares. This includes software binaries, symbols


files, images, and updates. Applying Data Deduplication to these shares might be
able to save you up to 70 to 80 percent of your system’s storage space.

• Virtualization libraries. This includes virtual hard disk files (i.e., .vhd and .vhdx
files) storage for provisioning to hypervisors. Applying Data Deduplication to these
libraries might be able to save you up to 80 to 95 percent of your system’s storage
space.

• General file shares. This includes a mix of all the above data types.


Applying Data Deduplication to these shares might be able to save you up to 50 to 60
percent of your system’s storage space.

Data Deduplication deployment candidates

Based on observed savings and typical resource usage in Windows Server 2016,
deployment candidates for deduplication are ranked as follows:

• Ideal candidates for deduplication:

o Virtualization depot or provisioning library


o Software deployment shares

o SQL Server and Exchange Server backup volumes

o Scale-out File Servers (SoFS) CSVs

o Virtualized backup VHDs (e.g., DPM)

o Virtual Desktop Infrastructure (VDI) VHDs (e.g., personal VDIs)

Note: For VDI deployments, special consideration should be given to the boot


storm, which is the name given to the spike in storage I/O caused by large numbers of
users trying to simultaneously log in to their VDI, typically upon arriving
to work in the morning. In turn, this hammers the VDI storage system
and can cause long delays for VDI users. However, in Windows Server
2016, when chunks are read from the on-disk deduplication store during
startup of a virtual machine, they are cached in memory. As a result,
subsequent reads don’t require frequent access to the chunk store
because the cache intercepts them; the reads are served from memory,
which is much faster.

• Should be evaluated based on content:

o Line-of-business servers

o Static content providers

o Web servers

o High-performance computing (HPC)

• Not ideal candidates for deduplication:

o WSUS


o SQL Server and Exchange Server database volumes

Data Deduplication Interoperability

In Windows Server 2016, you should consider the following related technologies and
potential issues when deploying Data Deduplication:

• BranchCache. Access to data over the network can be optimized by enabling


BranchCache on Windows servers and clients. When a BranchCache-enabled
system communicates over a WAN with a remote file server that is enabled for
Data Deduplication, all of the deduplicated files are already indexed and hashed,
so requests for data from a branch office are quickly computed. This is similar to
preindexing or prehashing a BranchCache-enabled server.

Note: BranchCache is a feature which can reduce wide area network (WAN)


utilization and enhance network application responsiveness when users access content
in a central office from branch office locations. When you enable
BranchCache, a copy of the content retrieved from the
server is cached within the branch office. If another client in the branch
requests the same content, the client can download it directly
from the local branch network without needing to retrieve the content by
using the WAN.

• Failover Clusters. Windows Server 2016 fully supports failover clusters, which
means deduplicated volumes will fail over gracefully between nodes in the cluster.
Effectively, a deduplicated volume is a self-contained unit; i.e., all of
the data and the information that Data Deduplication requires is on the volume. However,
each node that accesses deduplicated volumes must be running the
Data Deduplication feature. When a cluster is formed, the Data Deduplication schedule
information is configured in the cluster. As a result, if a deduplicated volume is
taken over by another node, the scheduled jobs will be applied on the next
scheduled interval by the new node.

• FSRM quotas. Although you should not create a hard quota on a volume root
folder enabled for deduplication, using File Server Resource Manager (FSRM),
you can create a soft quota on a volume root which is enabled for deduplication.
When FSRM encounters a deduplicated file, it will identify the file’s logical size for
quota calculations. Consequently, quota usage (including any quota thresholds)
does not change when deduplication processes a file. All other FSRM quota
functionality, including volume-root soft quotas and quotas on subfolders, will work
as expected when using deduplication.

Note: File Server Resource Manager (FSRM) is a suite of tools for


Windows Server 2016 that allows you to identify, control, and manage the
quantity and type of data stored on your servers. FSRM enables you to
configure hard or soft quotas on folders and volumes. A hard quota
prevents users from saving files after the quota limit is reached; whereas, a
soft quota does not enforce the quota limit, but generates a notification when
the data on the volume reaches a threshold. When a hard quota is
configured on a volume root folder enabled for deduplication, the actual free
space on the volume and the quota-restricted space are not the same, which may
cause deduplication optimization jobs to fail.

• DFS Replication. Data Deduplication is compatible with Distributed File System


(DFS) Replication. Optimizing or unoptimizing a file will not trigger a replication
because the file does not change. DFS Replication uses Remote Differential
Compression (RDC), not the chunks in the chunk store, for over-the-wire savings.
In addition, files on the replica server will benefit from deduplication if
the replica server is enabled for Data Deduplication.

Note: Single Instance Storage (SIS), a file system filter used for NTFS


file deduplication, was deprecated in Windows Server 2012 R2 and
completely removed in Windows Server 2016.


Monitoring and maintaining Data Deduplication

After deploying Data Deduplication in your environment, you should


monitor the systems that are enabled for deduplication and the
corresponding volumes to ensure optimal performance. Although Data Deduplication
in Windows Server 2016 includes a lot of automation, such as scheduled optimization jobs, the
deduplication process requires that you verify the efficiency of optimization; make the
appropriate adjustments to systems, storage architecture, and volumes; and
troubleshoot any issues with Data Deduplication.

Monitoring and reporting of Data Deduplication

When deploying Data Deduplication in your environment, you will inevitably ask


yourself, what size should be configured for a deduplicated volume? Although
Windows Server 2016 supports Data Deduplication on volumes of up to 64 TB, you
must determine the appropriate size of the deduplicated volumes that your environment
can support. For many, the answer to this question is that it depends on your
hardware specifications and your unique workload. More specifically, it depends


primarily on how much and how frequently the data on the volume changes and the
data access throughput rates of the disk storage subsystem.

Monitoring the efficiency of Data Deduplication in your environment is instrumental in


every phase of your deployment, especially during your planning phase. As detailed
earlier in the module, Data Deduplication in Windows Server 2016 performs intensive
I/O and compute operations. In most deployments, deduplication runs in the
background on a default schedule and must keep up with each day’s data
churn; otherwise, deduplication is not able to optimize all of the new data on a daily
basis, which defeats the purpose of deduplication. For example, some
organizations create a 64 TB volume, enable deduplication, and then wonder
why they experience low optimization rates. Most likely in this scenario, deduplication
is not able to keep up with the incoming churn from a dataset that is too large on a
poorly configured volume. Although Data Deduplication in Windows Server 2016 runs
multiple threads in parallel using multiple I/O queues on multiple volumes
simultaneously, a deduplication environment might still lack sufficient computing
power.

You should consider the following factors when estimating the size of volumes enabled


for Data Deduplication:

• Deduplication optimization must be able to keep up with the daily data churn.

• The total amount of churn scales with the size of the volume.

• The speed of deduplication optimization significantly depends on the data access


throughput rates of the disk storage subsystem.

Therefore, to estimate the maximum size for a deduplicated volume, you should be


familiar with the data churn and the speed of deduplication processing on
your volumes. You can choose to use reference data, such as server hardware
specifications, storage drive/array speed, and deduplication speed of various usage


types, for your estimations. However, the most accurate method of assessing the
appropriate volume size is to perform the measurements directly on your
deduplication system based on the representative samples of your data, such as data
churn and deduplication processing speed.

You should consider using the following options to monitor deduplication in your
environment and verify its health:

• Windows PowerShell cmdlets. After you enable the Data Deduplication feature on


a volume, you can use the following Windows PowerShell cmdlets:

o Get-DedupStatus. The most commonly used cmdlet, this cmdlet returns the
deduplication status for volumes which have data deduplication metadata,
which includes the deduplication rate, the number/sizes of optimized files, the
last run-time of the deduplication jobs, and the amount of space saved on the
volume.

o Get-DedupVolume. This cmdlet returns the volumes that have data


deduplication metadata, which includes the volume capacity and free space, the
number/sizes of optimized files, and the deduplication settings, such as the
minimum file age, minimum file size, excluded files/folders,
compression-excluded file types, and the chunk redundancy threshold.

o Get-DedupMetadata. This cmdlet returns status information of the


deduplicated data store for volumes that have data deduplication metadata,
which includes the number of:

▪ Data chunks in a container.

▪ Containers in the chunk store.

▪ Data streams in a container.

▪ Containers in the stream map store.

▪ Hotspots in a container.

▪ Hotspots in the stream map store.

▪ Corruptions on the volume.

o Get-DedupJob. This cmdlet returns the deduplication status and information


for currently running or queued deduplication jobs.

To assess whether deduplication optimization is keeping pace with


incoming data, you can use the Get-DedupStatus cmdlet to monitor the number of
optimized files compared with the number of in-policy files. This tells
you if all the in-policy files are being processed. If the number of in-
policy files is continuously rising faster than the number of optimized files, you
should examine your hardware specifications for appropriate utilization, or reevaluate the
type of data on the volume and its usage type, to ensure deduplication efficiency.
However, if the output value from the cmdlet for LastOptimizationResult is
0x00000000, the entire dataset was processed the previous time the optimization job ran.

Additional Reading: For more information, refer to “Storage Cmdlets in Windows


PowerShell” at: http://aka.ms/po9qve
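
A minimal monitoring sketch, assuming a deduplicated volume D:, might compare
the in-policy and optimized file counts and check the last optimization result:

Get-DedupStatus -Volume D: |
    Select-Object Volume, InPolicyFilesCount, OptimizedFilesCount,
        SavedSpace, LastOptimizationResult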

• Event Viewer logs. Monitoring the event log can also be helpful to understand
deduplication events and status. To view deduplication events, in Event Viewer,
navigate to Applications and Services Logs, click Microsoft, click Windows,
and then click Deduplication. For example, the events in this log provide you with
the results of each deduplication job and the corresponding statistics.
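
You can also query these events from Windows PowerShell. The sketch below
assumes the name of the Deduplication operational channel, which you can
confirm first by running Get-WinEvent -ListLog *Dedup*:

Get-WinEvent -LogName "Microsoft-Windows-Deduplication/Operational" |
    Select-Object -First 10 -Property TimeCreated, Id, Message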

• Performance monitoring. In addition to using Performance Monitor for overall server


performance, such as CPU and memory, you can use disk counters to
monitor the throughput rates of the jobs that are currently running, such as: Disk


Read Bytes/sec, Disk Write Bytes/sec, and Average Disk sec/Transfer. Depending
on other activities on the server, you might be able to use the data results from
these counters to get a rough estimate of the saving ratio by examining how much
data is being read and how much is being written per interval. You can also use
the Resource Monitor to identify the resource usage of specific programs/services.
To view disk activity, in Windows Resource Monitor, filter the list of processes to
locate fsdmhost.exe, and then examine the I/O activity on the Disk tab.

Note: Fsdmhost.exe is the executable file for the File Server Data


Management Host process, which runs the Data Deduplication jobs in
Windows Server 2016.

• File Explorer. While not the ideal choice for validating deduplication on an entire
volume, you can use File Explorer to spot check deduplication on individual files.
In viewing the properties of a file, you will notice that Size displays the logical size of
the file, while Size on disk displays the true physical allocation. For an
optimized file, Size on disk is less than the actual file size, because
deduplication moves the contents of the file to the chunk store and
replaces the file with an NTFS reparse point.

Maintaining Data Deduplication

With the data that is collected by monitoring, you can use the following Windows
PowerShell cmdlets to maintain the optimal efficiency of deduplication in your environment.

• Update-DedupStatus. Some of the storage cmdlets, such as Get-DedupStatus


and Get-DedupVolume, retrieve information from a metadata cache. This
cmdlet scans the volume to compute new Data Deduplication information and
updates the metadata.


• Start-DedupJob. This cmdlet is used to launch ad hoc deduplication jobs, such


as optimization, garbage collection, scrubbing, and unoptimization. For example,
you might consider launching an ad hoc optimization job if a deduplicated volume
is low on available space because of extra churn.

• Measure-DedupFileMetadata. This cmdlet is used to measure potential disk


space savings on a volume. More specifically, this cmdlet measures how much disk space
you can reclaim if you delete a group of folders and subsequently run
a garbage collection job. Files often have chunks that are shared with other
folders. The deduplication engine calculates which chunks are unique and would
be deleted after the garbage collection job.

• Expand-DedupFile. This cmdlet expands an optimized file into its original


location. You might need to expand optimized files because of compatibility with
applications or other requirements. Ensure there is enough space on the volume
to store the expanded file.
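
A brief sketch of these maintenance cmdlets, using hypothetical paths on volume
D:, might look like the following:

# Estimate how much space deleting a folder tree would actually reclaim
Measure-DedupFileMetadata -Path "D:\Shares\OldProjects"

# Rehydrate a single optimized file back to its original, full-size form
Expand-DedupFile -Path "D:\Shares\App\Setup.exe"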

Troubleshooting effects of Data Deduplication

When Data Deduplication in Windows Server 2016 adversely affects an application


or the ability to access a file, several options are available, including:

• Use a different deduplication frequency by changing the schedule or opting for


manual deduplication jobs.

• Use job-level parameters of the Start-DedupJob cmdlet, such as:

o StopWhenSystemBusy, which halts deduplication when it interferes with the
server's workload.

o Preempt, which causes the deduplication engine to move specific


deduplication jobs to the top of the job queue and cancel the current job.


o ThrottleLimit, which sets the maximum number of concurrent operations


which can be established by specific deduplication jobs.

o Priority, which sets the CPU and I/O priority for specific deduplication jobs.

o Memory, which specifies the maximum percentage of physical computer


memory that the data deduplication job can use.

Note: Although allowing deduplication to manage its own memory allocation
is recommended, you might need to specify a maximum
memory percentage in some scenarios. For most scenarios, you should keep the
maximum percentage within the range of 15 to 50, and use a
higher memory consumption for jobs that you schedule to run when you
specify the StopWhenSystemBusy parameter. For garbage collection
and scrubbing deduplication jobs, which you typically schedule to run
after business hours, you can consider using a higher memory
consumption, such as 50.

• Use the Expand-DedupFile cmdlet to expand, or rehydrate, optimized files if


needed for application compatibility or performance.

• Use the Start-DedupJob cmdlet with the Unoptimization job type to disable


deduplication on a volume.
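
As an example of combining these parameters, the following sketch starts a
deliberately low-impact optimization job on an assumed volume D::

Start-DedupJob -Volume D: -Type Optimization `
    -Priority Low -Memory 25 -StopWhenSystemBusy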

Troubleshooting Data Deduplication corruptions

Data Deduplication in Windows Server 2016 provides features to detect, report,


and repair data corruptions. In fact, data integrity is especially important
with deduplication, because a large number of deduplicated files referencing
a single chunk could all be affected if that chunk gets corrupted. While there are features
built into deduplication that help protect against corruption, there are still some
scenarios where deduplication might not recover automatically from corruption.


Additional Reading: For more information, refer to “Troubleshooting Data


Deduplication Corruptions” at: http://aka.ms/Tdz13m

Some of the most common causes for deduplication to report corruption are:

• Incompatible options used when copying files with Robocopy. Using Robocopy with


the /MIR option and the volume root as the target can damage the chunk store. To
avoid this, use the /XD option to exclude the System Volume
Information folder from the scope of the Robocopy command.

Note: For more information, refer to “FSRM and Data Deduplication may
be adversely affected when you use Robocopy /MIR in Windows Server
2012” at: http://aka.ms/W0ux7m

• Incompatible backup/restore program used on deduplicated volumes. You


should verify that your backup solution supports Data Deduplication in
Windows Server 2016, because unsupported backup and restore operations might introduce
corruptions. More information about this topic is provided later in this
module.

• Migrating a deduplicated volume to a down-level Windows Server version. File


corruption messages might be reported on files accessed from a deduplicated
volume, which is mounted on an older version of Windows Server, but were
optimized on a later version of the operating system. In this scenario, you should
verify that the server accessing the deduplicated volume is running the same
version of Windows Server (or a newer version) as the one that deduplicated the data on the
volume. Although deduplicated volumes can be moved between servers,
deduplication is backward compatible but not forward compatible; i.e., you can
upgrade to a newer version of Windows Server and still read the data, but data deduplicated
by a newer version of Windows Server cannot be read on older versions of
Windows Server and might report the data as corrupted when trying to read.


• Enabling compression on the root of a volume also enabled with deduplication.


Deduplication is not supported on volumes that have compression enabled at the
root. As a result, this might lead to the corruption and inaccessibility of
deduplicated files.

Note: Compressed files stored on a deduplicated volume in Windows


Server 2016 should function normally.

• Hardware issues. You can detect and repair corruptions caused by hardware storage issues by using the


deduplication Scrubbing job. Refer to the general troubleshooting steps
below for more information.

• General corruption. You can use the steps below to troubleshoot most general
causes for deduplication to report corruption:

1. Review the details of the corruption events in the deduplication logs. In most


cases, early file corruption is detected by the Scrubbing
job. Any corruption detected is logged in the deduplication
Scrubbing event channel, which lists the corruptions that were found and the repairs
that were attempted to be made. The deduplication
Scrubbing logs are located in the Event Viewer (Application and
Services > Microsoft > Windows > Deduplication > Scrubbing). In addition,
searching for hardware events in the System Event logs and Storage Spaces
Event logs will often yield additional information about hardware issues.

Note: Because of the potentially large number of deduplication corruption


events, they might be difficult to review in Event Viewer.
You can use a Windows PowerShell script that generates an HTML report which
lists the corruptions and the results of the attempted
fixes from the Scrubbing job.
For more information, refer to “Generate Deduplication Scrubbing
Report” at: http://aka.ms/N75avw


You can use the following command in Windows PowerShell to initiate a deep
Scrubbing job:

Start-DedupJob -Volume VolumeLetter -Type Scrubbing -Full

Replace VolumeLetter with the drive letter of the volume.

Backup and restore considerations with Data Deduplication

One of the benefits of Data Deduplication is that backup and restore operations


are faster. This is because optimized files have reduced in size, meaning
there is less data to back up. When you perform an optimized backup, your backup is
also smaller, because the total size of the optimized files, non-optimized files,
and data deduplication chunk store files are much smaller than the logical size of the
volume.


Note: Many block-based backup systems should work with data


deduplication, maintaining the optimization on the backup media. File-based
backup operations that do not use deduplication usually copy the files in their
original format.

The following backup and restore scenarios are supported with deduplication in


Windows Server 2016:

• Individual file backup/restore

• Full volume backup/restore

• Optimized file-level backup/restore using VSS writer

On the other hand, the following backup and restore scenarios are not supported with
deduplication in Windows Server 2016:

• Backing up and restoring individual reparse points.

• Backing up and restoring only the chunk store.

In addition, a backup application can perform an incrementally optimized backup as


follows:

• Back up only the files that were created or modified since the last


backup.

• Back up the new chunk store container files.

• Perform incremental backup at the sub-file level.


Note: New chunks are appended to the current chunk store container. When
its size reaches approximately 1 GB, that container file is sealed and a new
container file is created.

Restore Operations

Restore operations can also benefit from data deduplication. Full-


volume restore operations benefit because they are essentially the reverse of the
backup procedure, and less backup data means quicker restore operations. A full
volume restore proceeds as follows:

1. The complete set of data deduplication metadata and container files are
restored.

2. The complete set of data deduplication reparse points are restored.

3. All non-deduplicated files are restored.

A block-level restore from an optimized backup is automatically an optimized restore,


because the restore process occurs at the block level, beneath data deduplication, which works at the file
level.

As with any product from a third-party vendor, you should verify whether the backup
solution supports Data Deduplication in Windows Server 2016, as unsupported
backup solutions might introduce corruptions after a restore. The following are common
methods that backup vendors use to support Data Deduplication in Windows Server 2016:

• Some backup vendors support unoptimized backup, which rehydrates the


deduplicated files upon backup; i.e., backs up full-size files.

• Some backup vendors support optimized backup for a full volume backup, which
backs up the deduplicated files as-is; i.e., as a reparse point stub with the chunk


store.

• Some backup vendors support both methods.

The backup vendor should be able to comment on what their product supports, and the
method used, for each Windows Server version.

Additional Reading: For more information, refer to “Backup and Restore of Deduplicated


Volumes” at: http://aka.

Check Your Knowledge

Discovery
Can you enable Data Deduplication on a drive with storage tiering enabled?

Show solution

Check Your Knowledge

Discovery
Can you enable Data Deduplication on ReFS formatted drives?

Show solution

Check Your Knowledge

Discovery
Can you enable Data Deduplication on volumes in Windows Server 2016 that host running
virtual machines, and apply deduplication to the virtual machines?

Show solution


Lab B: Implementing Data Deduplication

Scenario
After you have tested the storage redundancy and performance options, you decide
that it also would be beneficial to maximize the available disk space that you have,
especially on your file servers. You decide to test Data Deduplication solutions to
maximize the storage availability for users.

New: After you have tested the storage redundancy and performance options, you


now decide that it would also be beneficial to maximize the available disk space that
you have, especially around virtual machine storage which is in ever increasing
demand. You decide to test out Data Deduplication solutions to maximize storage
availability for virtual machines.

Objectives

After completing this lab, you will be able to:

• Install the Data Deduplication role service.

• Enable Data Deduplication.

• Check the status of Data Deduplication.

Lab setup

Estimated time:

Virtual machines: 20740C-LON-DC1 and 20740C-LON-SVR1

User name: Adatum\Administrator


Password: Pa55w.rd

For this lab, you must use the available virtual machine environment. These should
already be running from Lab A. If they are not, before you begin the lab, you must
complete the following steps and then complete Lab A:

1. On the host computer, start Hyper-V Manager.

2. In Hyper-V Manager, click 20740C-LON-DC1, and in the Actions pane, click Start.

3. In the Actions pane, click Connect. Wait until the virtual machine starts.

4. Sign in using the following credentials:

• User name: Adatum\Administrator

• Password: Pa55w.rd

5. Repeat steps 2 through 4 for 20740C-LON-SVR1.

Exercise 1: Installing Data Deduplication

Scenario

You plan to install the Data Deduplication role service on your file


servers by using Server Manager.

The main tasks for this exercise are as follows:

1. Install the Data Deduplication role service


2. Check the status of Data Deduplication

3. Verify the virtual machine performance


Result: After completing this exercise, you should have successfully installed the


Data Deduplication role service and enabled it on the appropriate servers.

Exercise 2: Configuring Data Deduplication

Scenario

You have noticed that the D volume is heavily used, and you suspect duplicate files


are stored in some of its folders. You decide to enable and configure the Data Deduplication role to
reduce the space consumed on this volume.

The main tasks for this exercise are as follows:

1. Configure Data Deduplication

2. Configure optimization to run now and view the status

3. Verify whether the file has been optimized

4. Run the optimization job again

5. Prepare for the next module


Result: After completing this exercise, you should have successfully configured


Data Deduplication for the appropriate data volume.

Review Question(s)

Check Your Knowledge

Discovery
Your manager has asked about the impact that using Data Deduplication will have on the
write performance of your file servers' volumes. Is there an impact?

Show solution

Module review and takeaways

Common Issues and Troubleshooting Tips

Common Issue: Some free disk space is missing.

Troubleshooting approaches: Please see the Student Companion Content for this course.

Review Question(s)


Check Your Knowledge

Discovery
You attach five 2-TB disks to your Windows Server 2012 computer. You want to simplify
the process of managing the disks. In addition, you want to ensure that if one disk fails,
the failed disk's data is not lost. What feature can you implement to accomplish these
goals?

Show solution

Check Knowledge

Discovery
Your manager has asked you to consider the use of Data Deduplication within your
storage architecture. In what scenarios is the Data Deduplication role service
particularly useful?

Show solution
