DAG 2013
Storage
High Availability
Site Resilience
Storage
Storage Challenges
Disks
Capacity is increasing, but IOPS are not
Databases
Database sizes must be manageable
Database Copies
Reseeds must be fast and reliable
Passive database copy IOPS are inefficient
Lagged copies have asymmetric storage requirements
Storage Innovations
[Diagram: multiple databases per volume — DB1–DB4 distributed across servers as active, passive, and lagged copies; per-database reseed throughput shown at 12–20 MB/s]
Recommendations
Databases per volume should equal the number of copies per database
Same neighbors on all servers
Autoreseed
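A minimal Exchange Management Shell sketch of the recommendations above, keeping the same copy layout ("neighbors") on every server; the database and server names (DB1, MBX2–MBX4) and the lag value are placeholders, not values from this deck:

# Add copies of DB1 in the same order on every server so all servers share the same neighbors
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2 -ActivationPreference 2
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX3 -ActivationPreference 3
# One lagged copy per database (7-day replay lag assumed here)
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX4 -ActivationPreference 4 -ReplayLagTime 7.00:00:00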
Seeding Challenges
Disk failure on active copy = database failover
Failed disk and database corruption issues need to be addressed quickly
Fast recovery to restore redundancy is needed
Seeding Innovations
[Diagram: disk reseed operation — a failed disk (X) is reseeded onto a disk taken from a pool of spares]
Autoreseed Workflow
1. Detect a database copy in a Failed and Suspended (F&S) state
2. Try to resume the copy 3 times (with 5 minute sleeps in between)
3. Try assigning a spare volume 5 times (with 1 hour sleeps in between)
4. Try InPlaceSeed with SafeDeleteExistingFiles 5 times (with 1 hour sleeps in between)
5. Once all retries are exhausted, workflow stops
6. If 3 days have elapsed and copy is still F&S, workflow state is reset and starts from Step 1
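The workflow above runs automatically; a rough Exchange Management Shell equivalent of the individual steps, for manual use, is sketched below (the copy identity MDB1\MBX1 and server name MBX1 are placeholders):

# Find copies that are Failed and Suspended on a server (what Step 1 detects)
Get-MailboxDatabaseCopyStatus -Server MBX1 | Where-Object { $_.Status -eq 'FailedAndSuspended' }
# Step 2 equivalent: try to resume the copy
Resume-MailboxDatabaseCopy -Identity "MDB1\MBX1"
# Step 4 equivalent: in-place reseed, safely deleting the existing files first
Update-MailboxDatabaseCopy -Identity "MDB1\MBX1" -SafeDeleteExistingFiles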
Autoreseed Workflow
Prerequisites
Copy is not ReseedBlocked or ResumeBlocked
Logs and database file(s) are on same volume
Database and log folder structure matches required naming convention
No active copies on failed volume
All copies are F&S on the failed volume
No more than 8 F&S copies on the server (if so, it might be a controller failure)
For InPlaceSeed
Up to 10 concurrent seeds are allowed
If database files exist, wait 2 days before in-place reseeding
Autoreseed
[Diagram: AutoDagDatabasesRootFolderPath (ExchDbs) contains mount points for MDB1 and MDB2; AutoDagVolumesRootFolderPath (ExchVols) contains Vol1, Vol2, and Vol3; each database mount point contains MDB1.db and MDB1.log; AutoDagDatabaseCopiesPerVolume = 1]
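A hedged sketch of setting and checking these DAG-level Autoreseed properties; the DAG name (DAG1) and root paths are assumptions chosen to match the folder names in the diagram:

# Configure the Autoreseed root paths and databases-per-volume count on the DAG
Set-DatabaseAvailabilityGroup DAG1 -AutoDagDatabasesRootFolderPath "C:\ExchDbs" -AutoDagVolumesRootFolderPath "C:\ExchVols" -AutoDagDatabaseCopiesPerVolume 1
# Verify the settings
Get-DatabaseAvailabilityGroup DAG1 | Format-List AutoDag*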
Autoreseed
Requirements
Single logical disk/partition per physical disk
Specific database and log folder structure must be used
Recommendations
Same neighbors on all servers
Databases per volume should equal the number of copies per database
Autoreseed
Numerous fixes in CU1
Autoreseed not detecting spare disks correctly
Autoreseed not using spare disks
Increased Autoreseed copy limits (previously 4, now 8)
Better tracking around mount path and ExchangeVolume path
Get-MailboxDatabaseCopyStatus displays ExchangeVolumeMountPoint
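For example, the mount point tracking can be checked per copy (the copy identity MDB1\MBX1 is a placeholder):

# CU1 exposes ExchangeVolumeMountPoint alongside the other mount point properties
Get-MailboxDatabaseCopyStatus -Identity "MDB1\MBX1" | Format-List Name, Status, *MountPoint*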
Parameters designed to aid with automation

Parameter: BeginSeed
Description: Used as part of a full server reseed operation to reseed all database copies in an F&S state. Can be used with MaximumSeedsInParallel to start reseeds of database copies in parallel across the specified server, in batches of up to the value of the MaximumSeedsInParallel parameter at a time.
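BeginSeed and MaximumSeedsInParallel belong to Update-MailboxDatabaseCopy; a sketch of a full server reseed, with the server name MBX1 assumed:

# Reseed every Failed and Suspended copy on the server, up to 10 seeds at a time
Update-MailboxDatabaseCopy -Server MBX1 -BeginSeed -MaximumSeedsInParallel 10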
Recovery Challenges
Lagged copy innovations
High Availability
Managed Availability
Key tenets for Exchange 2013
Access to a mailbox is provided by the protocol stack on the Mailbox server that hosts the active copy of the mailbox
If a protocol is down on a Mailbox server, all access to active databases on that server via that protocol is lost
Managed Availability was introduced to detect these failures and recover from them, for example with a restart action
If the restart action fails, a failover can be triggered
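The health state that Managed Availability acts on can be inspected from the shell; a minimal sketch, with the server name MBX1 assumed:

# Summarize health sets on a Mailbox server
Get-HealthReport -Identity MBX1
# List the individual monitors that are currently unhealthy
Get-ServerHealth -Identity MBX1 | Where-Object { $_.AlertValue -eq 'Unhealthy' }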
Managed Availability
An internal framework used by component teams
Sequencing mechanism to control when recovery actions are taken
Managed Availability
Managed Availability imposes 4 new constraints on the Best Copy Selection algorithm
DAG Network Innovations
DAG networks are automatically configured, including in multi-subnet deployments
Small remaining administrative burden for deployment and initial configuration
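By default the DAG networks are created and configured automatically; a sketch of the remaining manual option, with the DAG name DAG1 assumed:

# Switch to manual DAG network configuration only if the automatic layout needs to be overridden
Set-DatabaseAvailabilityGroup DAG1 -ManualDagNetworkConfiguration $true
# Review the DAG networks and their subnets
Get-DatabaseAvailabilityGroupNetwork -Identity DAG1 | Format-List Name, Subnets, ReplicationEnabled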
Site Resilience
Operationally Simplified
Previously, loss of CAS, CAS array, VIP, LB, etc., required the admin to perform a datacenter switchover
In Exchange Server 2013, recovery happens automatically
Site Resilience
Mailbox and CAS recovery independent
Previously, CAS and Mailbox server recovery were tied together in site recoveries
In Exchange Server 2013, recovery is independent, and may come automatically in the form of a failover
Site Resilience
Namespace provides redundancy
Previously, the namespace was a single point of failure
In Exchange 2013, the namespace provides redundancy by leveraging multiple DNS A records and the client OS/HTTP stack's ability to fail over
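This failover relies on ordinary DNS and the client HTTP stack rather than on Exchange itself; one quick way to confirm the namespace returns more than one A record (the name and addresses match the diagram later in this deck):

# Confirm mail.contoso.com resolves to multiple VIPs
Resolve-DnsName -Name mail.contoso.com -Type A | Select-Object Name, IPAddress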
Site Resilience
Support for new deployment scenarios
With the namespace simplification, consolidation of server roles, separation of CAS array and DAG recovery, de-coupling of CAS and Mailbox by AD site, and load balancing changes, three locations (if available) can simplify mailbox recovery in response to datacenter-level events
You must have at least three locations
Two locations with Exchange; one with witness server
Exchange sites must be well-connected
Witness server site must be isolated from network failures affecting Exchange sites
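A sketch of placing the file share witness in the third, isolated location; the DAG name, witness FQDN, and directory are assumptions:

# Witness server lives in the third datacenter, isolated from failures affecting either Exchange site
Set-DatabaseAvailabilityGroup DAG1 -WitnessServer fsw01.paris.contoso.com -WitnessDirectory "C:\DAG1\FSW"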
With multiple VIP endpoints sharing the same namespace, if one VIP fails, clients automatically fail over to alternate VIP(s)
Removing the failing VIP's IP from DNS puts you in control of service recovery time
[Diagram: mail.contoso.com resolves to two VIPs, 192.168.1.50 and 10.0.1.50, fronting cas1–cas4; dag1 spans mbx1 and mbx2 in the primary datacenter (Redmond) and mbx3 and mbx4 in the alternate datacenter (Portland), with the witness server in a third datacenter (Paris); the remaining panels show failure of one VIP, loss of the primary datacenter, and activation of an alternate witness]
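Most of this recovery is automatic in Exchange 2013, but the manual switchover cmdlets remain for the case where the primary datacenter (Redmond in the diagram) is lost and the DAG must be activated in the alternate datacenter (Portland) against an alternate witness; the server, site, and witness names below are placeholders matching the diagram:

# Mark the DAG members in the failed primary site as stopped
Stop-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite Redmond -ConfigurationOnly
# Activate the surviving members in the alternate site, using the alternate witness
Restore-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite Portland -AlternateWitnessServer fsw02.portland.contoso.com -AlternateWitnessDirectory "C:\DAG1\FSW"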