Introduction To Data Protection - SNIA - 2013
DMF Workgroups:
Data Protection Initiative (DPI)
Defining best practices for data protection and recovery technologies such as Backup, CDP,
Data deduplication and VTL
Information Lifecycle Management Initiative (ILMI)
Developing, educating and promoting ILM practices, implementation methods, and benefits
Long-term Archive and Compliance Storage Initiative (LTACSI)
Addressing the challenges of retaining, securing, and preserving digital information for
the long-term
It is critical to keep focused on the actual goal -- availability of the data -- and
to balance how we achieve this by using the right set of tools for the specific
job.
Held in the balance are concepts like data importance or business criticality,
budget, speed, and cost of downtime.
Detection
Corruption or failure noted
Diagnosis / Decision
What went wrong?
What recovery point should be used?
What method of recovery -- overall strategy for the recovery?
Restoration
Moving the data
From tape to disk, or disk to disk, from the backup or archive (source), to the primary
or production disks.
Recovery – Almost done!
Application environment performs standard recovery and startup operations
Any additional steps
Log replays for a database
Journal replays for a file system
[Figure labels: Tape Backups, Vaults, Disk Backups, Replication, Archival, Capture on Write,
Real Time, Point-in-Time, Snapshots, Synthetic Backup, VTL, Instant Recovery, Roll Back,
Disk Restores, Tape Restores, Search & Retrieve]
Cold
Offline image of all the data
As backup window shrinks and data size expands, cold backup becomes
untenable.
Cheapest and simplest way to back up data
Application Consistent
Application supports ability to take pieces of overall data set offline for a
period of time to protect it - application knows how to recover from a
collection of individual consistent pieces.
No downtime for backup window.
Crash Consistent or Atomic
Data can be copied or frozen at the exact same moment across the entire
dataset.
Application recovery from an atomic backup performs like a high availability
failover.
No backup window.
[Figures: backup configuration diagrams. Recoverable labels: Application Server, Backup Server,
Media Server, Agent, Catalog, Data, Metadata, LAN, SAN / SCSI, Mirror, Snapshot, Data Mover,
Secondary Storage]
Backup server delegates the data movement and I/O processing to a “Data-mover”
enabled on a device within the environment
SCSI Extended Copy (XCOPY or “Third-Party Copy”)
Metadata still sent to the backup server for catalog updates
Much less impact on the LAN
Network Data Management Protocol (NDMP)
NDMP is a general open network protocol for controlling the exchange of data between two
parties
Introduction to Data Protection: Disk, Tape and Beyond
© 2009 Storage Networking Industry Association. All Rights Reserved.
Traditional Backup Schedules
Full Backup
Everything copied to backup (cold or hot backup)
Full view of the volume at that point in time
Restoration straight-forward as all data is available in one backup image
Huge resource consumption (server, network, tapes)
Incremental Backup
Only the data that changed since last full or incremental
Change in the archive bit
Usually requires multiple increments and previous full backup to do full
restore
Much less data is transferred
Differential backup
All of the data that changed from the last full backup
Usually less data is transferred than a full
Usually less time to restore full dataset than incremental
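The restore-time difference between the two schedules can be sketched as follows. This is a minimal illustration, not a product's behavior: the `Backup` record, the `restore_chain` helper and the day numbering are hypothetical names introduced here, and the sketch assumes a schedule uses either incrementals or differentials between fulls, never both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int
    kind: str  # "full", "incremental" or "differential" (hypothetical labels)


def restore_chain(history, target_day):
    """Return the backups to apply, in order, to restore the state as of target_day."""
    usable = sorted((b for b in history if b.day <= target_day), key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup at or before the target day")
    base = usable[full_positions[-1]]
    later = usable[full_positions[-1] + 1:]
    incrementals = [b for b in later if b.kind == "incremental"]
    differentials = [b for b in later if b.kind == "differential"]
    if incrementals and differentials:
        raise ValueError("mixed schedules are not modelled in this sketch")
    if differentials:
        # A differential holds everything changed since the full,
        # so only the most recent one is needed.
        return [base, differentials[-1]]
    # Each incremental holds only the changes since the previous backup,
    # so every one of them must be replayed in order.
    return [base] + incrementals


incremental_week = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 6)]
differential_week = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 6)]

incremental_days = [b.day for b in restore_chain(incremental_week, 4)]
differential_days = [b.day for b in restore_chain(differential_week, 4)]
print(incremental_days)   # full plus every incremental up to day 4
print(differential_days)  # full plus the latest differential only
```

Restoring to day 4 needs five backup images under the incremental schedule and two under the differential schedule, which is the trade-off the slide describes: incrementals move less data each night, differentials restore with fewer steps.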
What gets backed up and how
File-level backups
Any change to a file will cause entire file to be backed up
Open files often require special handling SW
Open files may get passed over – measure the risks
PRO: File level backup simplifies both backup and recovery
CON: Small changes to large files result in large backups
Block-level backups
Only the blocks that change in a file are saved
Requires additional client-side processing to discover change blocks versus
entire file
PRO:
Reduce size of backup data thus improving network utilization
In some cases may speed backups
CON: Client-side impact may affect client performance
Increases backup and restore complexity
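The file-level versus block-level trade-off can be illustrated with a small sketch. It assumes fixed 4 KiB blocks and SHA-256 digests for change discovery; both are choices made here for illustration, and `block_hashes` and `changed_blocks` are hypothetical helpers rather than any backup product's API. The sketch also ignores files that shrink between backups.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_hashes(data, block_size=BLOCK_SIZE):
    """Digest of every fixed-size block, kept from the previous backup."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(previous_hashes, data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ."""
    changed = {}
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            changed[index] = block
    return changed


original = bytes(1024 * 1024)      # a 1 MiB file of zeros
edited = bytearray(original)
edited[5000] = 1                   # change a single byte (falls in block 1)
edited = bytes(edited)

delta = changed_blocks(block_hashes(original), edited)
file_level_bytes = len(edited)                            # whole file is re-sent
block_level_bytes = sum(len(b) for b in delta.values())   # only the changed block
print(file_level_bytes, block_level_bytes)
```

The hashing loop is the "additional client-side processing" named above: the client reads and digests the whole file to find one changed block, trading client CPU and I/O for a much smaller transfer.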
Incremental Forever
FULL
Considerations
Fibre Channel Disks versus ATA versus SAS
Why:
Improved performance and reliability (see B2D)
Reduced complexity versus straight B2D or tape
Unlimited tape drives reduce device sharing, improve backup times *
Enables technologies such as remote replication, data deduplication
[Figure labels: Tape Library, Tape Library]
What:
Continuous Data Protection
Capture every change as it occurs
May be host-based, SAN-based, array-based
How:
Block-based
File-based
Application-based
[Figure labels: App Server, Normal Path, Backup Path, Record of Updates, Protect, Storage Object]
Why:
Implementations of true CDP today are delivering zero data loss, zero backup
window and simple recovery. CDP customers can protect all data at all
times and recover directly to any point in time.
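The "record of updates" idea behind CDP can be sketched as a time-ordered write journal that is replayed up to the chosen recovery point. This is a simplified block-based illustration under assumptions made here: the `CdpJournal` class, its method names and the integer timestamps are hypothetical, and a real implementation would persist the journal and bound its size.

```python
class CdpJournal:
    """Toy block-level CDP: every write is journaled with its timestamp."""

    def __init__(self, initial_blocks=None):
        self._baseline = dict(initial_blocks or {})  # state before any journaled write
        self._log = []                               # (timestamp, block_id, data)

    def write(self, timestamp, block_id, data):
        if self._log and timestamp < self._log[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self._log.append((timestamp, block_id, data))

    def recover(self, point_in_time):
        """Rebuild the block map as it stood at point_in_time (inclusive)."""
        state = dict(self._baseline)
        for timestamp, block_id, data in self._log:
            if timestamp > point_in_time:
                break
            state[block_id] = data
        return state


journal = CdpJournal({0: b"A"})
journal.write(10, 0, b"B")
journal.write(20, 1, b"C")
journal.write(30, 0, b"corrupt")   # the bad write we want to roll back past

before_corruption = journal.recover(25)
print(before_corruption)
```

Because every change is captured as it occurs, the recovery point is chosen after the fact (here, any time before 30) instead of being fixed by a backup schedule.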
What?
The process of examining a data-set or I/O stream at the sub-file level and storing
and/or sending only unique data
Client-side SW, Target-side HW or both
Why?
Reduction in cost per terabyte stored
Significant reduction in storage footprint
Less network bandwidth required
Check out SNIA Tutorial: Understanding Data Deduplication
Considerations
Greater amount of data stored in less physical space
Suitable for backup, archive and (maybe) primary storage
Enables lower cost replication for offsite copies
Store more data for longer periods
Beware 1000:1 dedupe claims – Know your data and use case
Multiple performance trade-offs
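Storing only unique sub-file data can be sketched with a content-addressed chunk store. The sketch assumes fixed-size 4 KiB chunks identified by SHA-256 digest; real products may use variable-size chunking and other fingerprints, and the `DedupStore` class and its methods are hypothetical names introduced for illustration.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: each unique chunk is kept exactly once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # digest -> chunk bytes
        self.logical_bytes = 0  # bytes presented by clients before deduplication

    def put(self, data):
        """Store data and return its recipe: the ordered list of chunk digests."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # only new chunks consume space
            recipe.append(digest)
        self.logical_bytes += len(data)
        return recipe

    def get(self, recipe):
        """Reassemble the original data from its recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def ratio(self):
        physical = sum(len(chunk) for chunk in self.chunks.values())
        return self.logical_bytes / physical if physical else 0.0


store = DedupStore()
monday = b"A" * 4096 + b"B" * 4096
tuesday = b"A" * 4096 + b"C" * 4096   # one chunk unchanged since Monday
monday_recipe = store.put(monday)
tuesday_recipe = store.put(tuesday)
print(len(store.chunks), store.ratio())
```

The ratio here depends entirely on how much of Tuesday's data repeats Monday's, which is why the slide advises knowing your data and use case before trusting a quoted dedupe ratio.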
Next Steps in Data Protection
Choose the appropriate level of protection
Assess risk versus cost versus complexity
Include your “customers” in your decisions
Match RPO, RTO goals with technology
Consider resources required to support your decisions
Consider centralized versus distributed solutions
Performance is ALWAYS a consideration
Assess your system today for strengths and weaknesses
A new box or new SW may NOT be the answer
When in doubt, call in the experts
Related tutorials
Trends in Data Protection and Restoration Technologies
Deduplication – Methods of Achieving Data Efficiency
In the Face of Litigation: Best Practices for Retention,
Discovery, and Deletion
A Crash Course in Wide Area Data Replication
Visit the Data Management Forum website
http://www.snia-dmf.org
Data Protection Buyers Guides available
Chapters on Continuous Data Protection, Deduplication,
and Virtual Tape Libraries
Q&A / Feedback
Backup versus Mirroring
Backup
Protecting data by making copies or allowing copies to be generated from
saved data
Examples: snapshots, split mirrors, VTL, tape, CDP
When?
Multiple Recovery Points needed
Recovery from data corruption
Archival and indexing
Mirroring/Replication
Protecting data by moving the data, usually as it changes, to a remote copy.
Synchronous or Asynchronous mirroring
When?
Disaster recovery (usually DR / RTO centric)
Data Migration
Content Distribution
A disk-based “instant copy” that captures the original data at a specific point in time.
Snapshots can be read-only or read-write.
“A fully usable copy of a defined collection of data that contains an image of the data as
it appeared at the point in time at which the copy was initiated. A snapshot may be either
a duplicate or a replicate of the data it represents.”
www.snia.org/dictionary
Heterogeneous Environment
Multiple platforms (HW / OS)
Multiple tape drives & libraries
Multiple applications
NAS and SAN
Snapshot facilities
Advanced Tape Management
Tape mirroring
Off-site storage
Multiplexing
Advanced Library Management
Sharing, partitioning
Port handling
Security
Authentication & Encryption
DMZ / Firewall support
Centralized Administration
Web GUI & smart interface
Backup strategies
Scheduling
Onsite and offsite media management
Centralized Supervision
Real-time monitoring
Alarms
Event log
SNMP compliant, integration with
Frameworks
Bill back