
Module 5 Main Data Recovery and Backups-1

The document discusses data recovery, backups, and disaster recovery planning. It defines key concepts like data recovery, different types of backups including full, differential and incremental backups. It also discusses backup solutions, testing backups, and implementing both on-site and off-site backups. The document then covers disaster recovery plans, defining preventative, detection and corrective measures. It also discusses the importance of post-mortems after incidents to analyze what went wrong and avoid issues in the future.

Uploaded by

jericho
Copyright
© © All Rights Reserved

System Administration and Maintenance
Module 5
Data Recovery & Backups
• Define Data Recovery
• Learn how to back up data
• Enumerate Backup solutions
• Testing of backups
• Types of Backup
• Discuss User Backups
• Define Disaster Recovery Plan
• Learn how to design a Disaster Recovery Plan
• Define a Post-Mortem
• Learn how to write a Post-Mortem
Planning for Data Recovery

5.1

• Data recovery is the process of trying to restore data after an
unexpected event that results in data loss or
corruption.

• When an unexpected event occurs, your main
objective is to resume normal operations as soon as
possible, while minimizing the disruption to your
business functions.
• Backing up data isn't free. Every additional file you
back up takes up a little more disk space, increasing the
overall cost of your backup solution.

• Make sure that you account for future growth and
choose a solution that is flexible enough to easily
accommodate increases in data backups.
• Data can be backed up either locally to systems on site, or the
backup data can be sent offsite to remote systems.

• The advantage of onsite backup solutions is that the data is
physically very close. The drawback is that a single unexpected event,
such as a building fire, can destroy both the original systems and
their onsite backups.

• Offsite backup involves making backups of critical data and sending the
backup data offsite to remote systems in a different physical
location. This could be another backup server that you control in a
different office, or a cloud hosted backup service.

• Consider encryption of backups, especially those sent offsite.
• Implementing both onsite and offsite backups is
recommended if it's within your organization's budget.
• Decide on a backup retention period: how long do you need to hang on
to backups?
• Archive older data using a slower but cheaper storage
mechanism, such as data tapes.
• Tape storage is pretty cheap, but isn't as easy or quick
to access as data stored on hard drives or solid state
drives. It is usually used for long term archival
purposes.
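The retention and archival decisions above can be sketched in a few lines of Python. This is a minimal illustration, not a policy recommendation: the function name `classify_backups` and the 30-day and 365-day cutoffs are assumptions chosen for the example.

```python
from datetime import date, timedelta


def classify_backups(backup_dates, today, keep_days, archive_days):
    """Sort backups into keep / archive / delete buckets by age in days.

    keep_days and archive_days are hypothetical policy values: backups up to
    keep_days old stay on fast storage, backups up to archive_days old move
    to cheaper archival storage (for example tape), older ones are deleted.
    """
    plan = {"keep": [], "archive": [], "delete": []}
    for backup_date in sorted(backup_dates):
        age = (today - backup_date).days
        if age <= keep_days:
            plan["keep"].append(backup_date)
        elif age <= archive_days:
            plan["archive"].append(backup_date)
        else:
            plan["delete"].append(backup_date)
    return plan


if __name__ == "__main__":
    today = date(2024, 6, 30)
    backups = [today - timedelta(days=n) for n in (1, 20, 45, 200, 400)]
    plan = classify_backups(backups, today, keep_days=30, archive_days=365)
    for action, dates in plan.items():
        print(action, [d.isoformat() for d in dates])
```

Keeping the cutoffs as parameters makes it easy to adjust the policy as storage costs or retention requirements change.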
• rsync is a command line utility.
• Rsync isn't explicitly a backup tool, but it's very
commonly used as one.
• It's a file transfer utility that's designed to efficiently
transfer and synchronize files between locations or
computers.
• It supports compression, and you can use SSH to secure the transfer.
• Using SSH, it can also synchronize files between
remote machines, making it super useful for simple
automated backups.
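As a rough sketch of how an automated rsync backup over SSH might be scripted, the Python below builds the command line and hands it to `subprocess`. The helper names, the source path, and the host `backup.example.com` are hypothetical; the rsync options used (`--archive`, `--compress`, `--rsh`, `--dry-run`) are standard ones, but check them against the rsync version you have installed.

```python
import subprocess


def build_rsync_command(source, destination, use_ssh=True, compress=True, dry_run=False):
    """Return the rsync argument list for copying source to destination."""
    cmd = ["rsync", "--archive"]      # preserve permissions, times, symlinks
    if compress:
        cmd.append("--compress")      # compress data during the transfer
    if use_ssh:
        cmd.append("--rsh=ssh")       # use SSH as the remote shell
    if dry_run:
        cmd.append("--dry-run")       # report what would change, copy nothing
    cmd += [source, destination]
    return cmd


def run_backup(source, destination, **options):
    """Run rsync; raises CalledProcessError if rsync exits with an error."""
    subprocess.run(build_rsync_command(source, destination, **options), check=True)


if __name__ == "__main__":
    # Hypothetical paths and host, printed rather than executed.
    print(build_rsync_command("/home/", "backup@backup.example.com:/srv/backups/home/",
                              dry_run=True))
```

A scheduler such as cron could call `run_backup` on a fixed interval; running once with `dry_run=True` first is a cheap way to confirm the paths are right.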
• Time Machine is Apple’s first party backup solution, available for Mac
operating systems.
• It uses an incremental backup model.
• It supports restoring an entire system from backup or
individual files.
• It allows restoring older versions of backed-up files.
• Backup and Restore is Microsoft’s first party backup solution. It has two modes of operation:
• a file based version, where files are backed up to a zip archive
• a system image, where the entire disk is saved block by block to a file
• File based backups support either complete or incremental
backups.
• System image backups support differential mode, only backing up
blocks on the disk that have changed since the last backup.
• The takeaway here is that it isn't sufficient to just set
up regular backups.
• You also need a recovery process, and that process needs to be tested
regularly.
• Restoration procedures should be documented and
accessible so that anyone with the right access can
restore operations when needed.
• This process is called Disaster Recovery testing and
is critical to ensuring a well functioning recovery
system.
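One simple way to test a restore is to restore a backup to a scratch location and compare checksums against the original data. The sketch below assumes both trees are reachable as local directories; the function names are illustrative and not taken from any backup product.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): file_digest(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def compare_restore(original_root, restored_root):
    """Return (missing, changed) relative paths found in the restored tree."""
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    changed = sorted(name for name in original
                     if name in restored and original[name] != restored[name])
    return missing, changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        Path(src, "report.txt").write_text("quarterly numbers")
        Path(dst, "report.txt").write_text("quarterly numbers")
        print(compare_restore(src, dst))
```

An empty result for both lists means every original file was restored with identical contents. On a live system, files modified after the backup was taken will show up as changed, so the comparison is best run against a snapshot or a quiet data set.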
• Full backup.
• Making a copy of all the data to be backed up. The full unmodified
contents of all files are included in this backup mechanism,
whether the data was modified or not.
• Differential backup.
• Only back up files that have changed or been created since the last full
backup.
• It's a good practice to perform infrequent full backups,
while also doing more frequent differential backups.
• Regular incremental backups.
• Only the data that's changed in files is backed up.
• Only stores differences in the files that have changed since the last
incremental backup.
• More efficient in disk space, but restoring is more time consuming, because
the full backup and every incremental backup after it must be applied.
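The difference between the three backup types comes down to which cutoff time is used when selecting files. The sketch below is a simplification under stated assumptions: it works at whole-file granularity using modification times (the slides describe incremental backups as storing only the changed data within files), and the function and parameter names are illustrative.

```python
def select_files(mtimes, mode, last_full=0.0, last_backup=0.0):
    """Pick which files a backup run would copy.

    mtimes maps a file path to its modification time.
    last_full is the time of the last full backup; last_backup is the time
    of the most recent backup of any kind.
    """
    if mode == "full":
        return sorted(mtimes)          # everything, modified or not
    if mode == "differential":
        cutoff = last_full             # changes since the last full backup
    elif mode == "incremental":
        cutoff = last_backup           # changes since the previous backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


if __name__ == "__main__":
    mtimes = {"a.txt": 10, "b.txt": 25, "c.txt": 40}
    for mode in ("full", "differential", "incremental"):
        print(mode, select_files(mtimes, mode, last_full=20, last_backup=30))
```

With a full backup at time 20 and the latest backup at time 30, the differential run picks up `b.txt` and `c.txt`, while the incremental run only picks up `c.txt`.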
• File Compression.
• Helps save space.
• When creating a backup, all the files and folder structures will be copied
and put into an archive.
• Archives are useful for keeping files organized and preserving the full folder structure.
• Besides archiving the files, backups can also be compressed. Compression is a
mechanism for storing the same data while requiring less disk space by
using complex algorithms.
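Archiving and compression can be combined in one step. The following sketch uses Python's standard `tarfile` module to write a gzip-compressed archive that preserves the folder structure; the directory and file names are made up for the demonstration.

```python
import tarfile
import tempfile
from pathlib import Path


def create_archive(source_dir, archive_path):
    """Archive source_dir into a gzip-compressed tar file; return its size in bytes."""
    source_dir = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths relative to the directory name, not absolute.
        tar.add(source_dir, arcname=source_dir.name)
    return Path(archive_path).stat().st_size


def list_archive(archive_path):
    """Return the paths of the regular files stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp, "data")
        (source / "logs").mkdir(parents=True)
        log_file = source / "logs" / "app.log"
        log_file.write_text("backup ok\n" * 10000)   # highly repetitive data
        archive = Path(tmp, "data.tar.gz")
        size = create_archive(source, archive)
        print("original bytes:", log_file.stat().st_size)
        print("archive bytes:", size)
        print(list_archive(archive))
```

Repetitive data such as logs compresses very well; files that are already compressed (images, video, zip files) will shrink far less.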
• RAID (Redundant Array of Independent Disks)
• a method of taking multiple physical disks and combining them into one
large virtual disk
• There are lots of types of RAID configuration called levels.
• inexpensive way of creating a lot of data capacity, while minimizing risk of
data loss in the event of disk failure
• RAID isn't a backup solution; it's a data storage solution that has some
hardware failure redundancy available in some of the RAID levels.
• RAID is not a replacement for backups.
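To make the trade-off between capacity and redundancy concrete, the sketch below computes usable capacity for a few common RAID levels. The slides do not name specific levels; the formulas here are the standard definitions of RAID 0, 1, 5, 6, and 10, assuming identical disks, and the function name is illustrative.

```python
def usable_capacity(level, disks, disk_size_tb):
    """Return (usable capacity in TB, guaranteed disk failures tolerated)."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == 0:                       # striping only, no redundancy
        return disks * disk_size_tb, 0
    if level == 1:                       # every disk holds a full copy
        return disk_size_tb, disks - 1
    if level == 5:                       # one disk's worth of parity
        return (disks - 1) * disk_size_tb, 1
    if level == 6:                       # two disks' worth of parity
        return (disks - 2) * disk_size_tb, 2
    if disks % 2:                        # RAID 10: striped mirrored pairs
        raise ValueError("RAID 10 needs an even number of disks")
    return (disks // 2) * disk_size_tb, 1


if __name__ == "__main__":
    for level in (0, 1, 5, 6, 10):
        print(f"RAID {level}:", usable_capacity(level, 4, 2.0))
```

Even at levels that survive a disk failure, deleted or corrupted files are deleted or corrupted on every disk, which is why RAID does not replace backups.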
• One solution to user backups is to use a cloud service designed for
syncing and backing up files across platforms and devices.
• Examples include Dropbox, Apple iCloud, and Google Drive.
Disaster Recovery Plans and Post-Mortems

5.2

• As an IT support specialist, it's important that you're
prepared for a disastrous accident or mistake.
• A disaster recovery plan is a collection of documented procedures and plans on how to react to and
handle an emergency or disaster scenario, from the operational
perspective.
• It covers things that should be done before, during, and after a disaster.
• The goal of the disaster recovery plan is to minimize disruption to business and
IT operations, by keeping downtime of systems to a minimum and
preventing significant data loss.
• Preventative measures
• cover any procedures or systems in place that will proactively minimize
the impact of a disaster.
• includes things like regular backups and redundant systems
• Anything that's done before an actual disaster that's able to reduce the
overall downtime of the event is considered preventative
• Detection measures
• are meant to alert you and your team that a disaster has occurred that can
impact operations.
• Timely notification of a disaster is critical
• Environmental Sensors
• Flood Sensors
• Temperature and Humidity Sensors
• Evacuation Procedures

• If there's a fire and the building needs to be
evacuated, you should be prepared to set up
temporary accommodation so people can still work
effectively.
• Corrective or recovery measures
• are those enacted after a disaster has occurred.
• involve restoring lost data from backups, or rebuilding and reconfiguring
systems that were damaged.

• Single point of failure
• When one system in a redundant pair suffers a failure, the remaining
system becomes a single point of failure.
• It would take only one more failure to completely take the system down.
• There's no one-size-fits-all for disaster recovery plans.
• The mechanisms chosen and procedures put in place
will depend a lot on the specifics of your organization
and environment.
1. Perform a risk assessment
• This involves taking a long hard look at the operations and characteristics
of your teams
• allows you to prioritize the aspects of the organization that are more
at risk if there's an unforeseen event
• brainstorming hypothetical scenarios and analyzing these events to
understand how they'd impact your organization and operations
2. Determine Backup and Recovery Systems
• Make sure you have a sound backup and a recovery system, along with a
good strategy in place
• regular, automated backups to backup systems located both on site
and off site
• data recovery procedures clearly documented and kept up-to-date
• Redundancies shouldn't be limited only to systems. Anything critical to
operations should be made redundant whenever possible
• operational documentation. Make sure that every important operational
procedure is documented and accessible
• periodically verify that the steps documented actually work
3. Determine Detection and Alert Measures
• If uptime and availability are important for your organization, you'll likely
have two internet connections: a primary and a secondary.
• Monitor the condition of services and infrastructure equipment: things like
temperature, CPU load, and network load. For a service, monitoring
error rates and requests per second will give you insight into the
performance of the system.
• investigate any unusual spikes or unexpected increases. These early
warning systems allow you to head off disaster before it brings operations
to a halt.
• test these systems. Simulate the conditions your monitoring systems are
designed to catch
• Make sure the detection thresholds actually fire the alerts like they're
supposed to
4. Determine Recovery Measures
• include actions that are taken to restore normal operations and to recover
from an incident or outage
• restoring a corrupted database from a backup or rebuilding and re-
configuring a server
• include references or links to documentation for these types of tasks
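The advice in step 3 to test detection thresholds can be sketched as a small check that is fed both normal and simulated failure readings. This is a minimal illustration under stated assumptions: the metric names and limits are invented for the example, and a real monitoring system would send a notification rather than return a list of strings.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str     # name of the monitored value, e.g. "temperature_c"
    limit: float    # alert when the reading rises above this value


def check_thresholds(readings, thresholds):
    """Return an alert message for every reading above its limit."""
    alerts = []
    for threshold in thresholds:
        value = readings.get(threshold.metric)
        if value is not None and value > threshold.limit:
            alerts.append(f"{threshold.metric}={value} exceeds {threshold.limit}")
    return alerts


if __name__ == "__main__":
    # Hypothetical limits for a server room and a web service.
    thresholds = [
        Threshold("temperature_c", 30.0),
        Threshold("cpu_load", 0.9),
        Threshold("error_rate", 0.05),
    ]
    normal = {"temperature_c": 22.0, "cpu_load": 0.4, "error_rate": 0.01}
    simulated_failure = {"temperature_c": 41.5, "cpu_load": 0.95, "error_rate": 0.01}
    print("normal:", check_thresholds(normal, thresholds))
    print("simulated:", check_thresholds(simulated_failure, thresholds))
```

Feeding in simulated readings like this is the code equivalent of the drill described above: it confirms that the alert actually fires before a real incident depends on it.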
• In a post-mortem, we look at what happened and try to understand why it
happened, so we can avoid it in the future.
• It's important that we're able to learn from our mistakes
• We create a post-mortem after an incident, an outage, or
some event when something goes wrong, or at the end of
a project to analyze how it went.
• This report documents, in detail, what exactly happened leading up to, during, and after
the event or project.
• The purpose of a post-mortem is to learn something from
an event or project, not to punish anyone or highlight
mistakes.
• One thing that often gets overlooked in a post-mortem is
what went well.
• Highlight things that went well. These include fail-safes
or failover systems that worked as designed and
prevented a larger outage, or minimized the severity of
the outage.
• This helps to demonstrate the effectiveness of the systems
in place.
