TM07 Database Backup and Recovery

Uploaded by

Eyachew Tewabe
Copyright
© All Rights Reserved

Module Title: Database Backup and Recovery

Unit one: Database Architecture

1.1. Architecture of database file system


Database architecture focuses on database design and construction for large enterprise
database systems that manage massive amounts of information for organizations.

The design of a DBMS depends on its architecture. Selecting the correct Database
Architecture helps in quick and secure access to data. The architecture of a DBMS can be
seen as either single tier or multi-tier. The tiers are classified as follows:
1.1.1. Single tier architecture
The simplest database architecture is the 1-tier (single-tier) architecture, in which the
client, server, and database all reside on the same machine. In other words, it keeps all of
the elements of an application, including the interface, middleware, and back-end data, in
one place. Developers regard this as the simplest and most direct arrangement.
• The database is directly available to the user, who works on the DBMS itself.
• Any change made here is applied directly to the database. This architecture does not
provide a handy tool for end users.
• The 1-tier architecture is used for developing local applications, where programmers
communicate directly with the database for a quick response.

Figure 1.1: Single tier architecture


1.1.2. Two-tier Architecture
The two-tier architecture is based on the client-server model.

Communication takes place directly between client and server; there is no intermediate
layer between them.

The user interfaces and application programs run on the client side. The server side
provides functionality such as query processing and transaction management.

To communicate with the DBMS, the client-side application establishes a connection with the
server side. Two-tier architecture provides added security to the DBMS because the database
is not exposed to the end user directly.

In a two-tier architecture:

1. The presentation layer runs on a client (PC, mobile, tablet, etc.).
2. Data is stored on a server.

Figure 1.2. Two Tier architecture


1.1.3. Three-tier Architecture
It is an extension of the 2-tier architecture. A 3-tier architecture separates its tiers from each
other based on the complexity of the users and how they use the data in the database.
The three-tier architecture is the most popular DBMS architecture.

This architecture is used differently in different applications, including web applications
and distributed applications. The 3-tier architecture has the following layers:

• Database server (Data) Tier: the database resides at this tier, along with its query
processing languages.
• Application (Middle) Tier: also called the business logic layer, it processes functional
logic, constraints, and rules before passing data to the user or down to the DBMS. For a
user, this tier presents an abstracted view of the database; end users are unaware of any
existence of the database beyond the application.
• User (Presentation) Tier: end users operate on this tier and know nothing about any
existence of the database beyond this layer (for example, a PC, tablet, or mobile device).

The goals of the three-tier architecture are:
• To separate the user applications from the physical database
• To support DBMS characteristics such as:
  - Program-data independence
  - Support of multiple views of the data

Figure 1.3. Three-tier Architecture Diagram

1.2.Risks and Failure Scenario


A database is a means of organizing information so that it can be easily managed, updated,
and retrieved. Losing a database means losing the associated data: if a business loses its
databases for any reason and has no backups stored, it will most likely lose the data too.
Many types of failure can affect database processing. Some affect main memory only, while
others involve secondary storage. The scenarios or causes that can lead to database loss
include:

Figure 1.4. Cause of database failure

• Power Failure
Power failures can lead to hardware failure. The affected hardware components could be
cables, power supplies, or storage devices.

• Disk Failure
Power failures can lead to disk failure, but disks can also fail because of physical damage or
a logical failure. Such failures result from head crashes or unreadable media and cause the
loss of parts of secondary storage.

• Human Error (Carelessness)
This is failure due to unintentional destruction of data or facilities by operators or users.
An employee may unintentionally delete some data, or may unknowingly modify the data in a
way that prevents the DBMS from interacting with the database effectively. Human error is
often cited as the number one cause of data loss.

• Software Corruption
Companies using traditional in-house IT infrastructures are more at risk of software
corruption than those relying on cloud-based services. While cloud vendors provide
flexibility and scalability of resources, traditional IT environments have fixed sets of
hardware resources which they upgrade manually.

Repeated software crashes can cause serious damage, especially if the user is working on a
database at the time. Software failures also include logical errors in the program that
accesses the database, which cause one or more transactions to fail.

• Virus Infection
An enterprise cannot operate safely without a good security solution. Cyber-attacks are
among the biggest threats a company faces today, and it is imperative that the security
solution performs real-time scanning.

• Natural Disasters
Natural disasters such as fires, floods, earthquakes, and tsunamis can destroy the entire
infrastructure.

• Hardware Failure
Hardware failure may include memory errors, disk crashes, bad disk sectors, disk-full errors,
and so on.

• System Crash
System crashes are due to hardware or software errors and result in the loss of main memory
contents. This can happen when the system has entered an undesirable state.

• Network Failure
Network failure can occur in a client-server configuration or in a distributed database
system where multiple database servers are connected by a common network.

• Sabotage
These are failures due to intentional corruption or destruction of data, hardware, or
software facilities.

1.3 OHS
Occupational Health and Safety (OHS) requirements for database backup and recovery are
crucial to ensure the well-being and safety of individuals involved in managing and
maintaining databases.

• Training and Competency
Ensure that personnel responsible for database backup and recovery are adequately trained
and competent in their roles.

• Ergonomics
Design workstations and environments with ergonomic principles in mind to prevent
musculoskeletal issues among personnel. Ensure that seating, lighting, and computer
equipment are conducive to a healthy working environment.

• Workload Management
Monitor and manage the workload of personnel involved in database backup and recovery to
prevent stress and burnout.

• Emergency Procedures
Establish clear emergency procedures for unexpected situations during database backup and
recovery processes. Ensure that personnel are aware of evacuation plans and procedures in
case of emergencies.

• Security Measures
Implement security measures to protect personnel from potential cybersecurity threats during
backup and recovery operations.

• Equipment Safety
Regularly inspect and maintain all equipment used in database backup and recovery to ensure
it meets safety standards.

• Health Monitoring
Implement health monitoring programs to identify and address any health issues among
personnel promptly.

• Documentation and Procedures
Document clear and detailed procedures for database backup and recovery tasks. Include
safety guidelines and precautions within the documentation to ensure safe work practices.

• Communication Protocols
Establish effective communication protocols to ensure that team members can communicate
efficiently during backup and recovery operations. Encourage open communication about
any concerns related to health and safety.

Unit Two: Database Backup Methods

2.1. Introduction to Backup


Backup is the process of creating a copy of data to protect against accidental or malicious
deletion, corruption, hardware failure, ransomware attacks, and other types of data loss. Data
backups can be created locally, offsite, or both.

Restore is the process of retrieving data from a backup. This might mean copying data from
backup media to an existing device or to a new device. It also could mean copying data from
the cloud to a local device or from one cloud to another.

Recovery refers to the process of restoring data and operations (e.g., returning a server to
normal working order following hardware failure).
Restore and recovery times can vary widely depending on the backup format and data
recovery methods you choose. Additionally, restore needs also vary (e.g., restoring a single
file vs. an entire server). Finally, critical data may live on workstations, local servers, and in
the cloud. These are important considerations when selecting a backup and recovery solution.
The most common types of database backup are:
- Logical backup: the data is stored in a human-readable format such as SQL statements
- Physical backup: the backup contains binary data

2.2.Methods for back-up and recovery


2.2.1. Types of Backup

There are different types of backup, and each backup process works differently.
Table 2.1. A comparison of different types of backup

Backup     Full       Mirror              Incremental             Differential
Backup 1   All data   All data selected   -                       -
Backup 2   All data   All data selected   Changes from backup 1   Changes from backup 1
Backup 3   All data   All data selected   Changes from backup 2   Changes from backup 1
Backup 4   All data   All data selected   Changes from backup 3   Changes from backup 1

1. Full backups
The most basic and complete type of backup operation is a full backup. As the name implies,
this type of backup makes a copy of all data to a storage device, such as a disk or tape. The
primary advantage to performing a full backup during every operation is that a complete
copy of all data is available with a single set of media. This results in a minimal time to
restore data, a metric known as a recovery time objective. However, the disadvantages are
that it takes longer to perform a full backup than other types (sometimes by a factor of 10 or
more), and it requires more storage space.
Thus, full backups are typically run only periodically, and most backup operations combine a
full backup with either incremental or differential backups.

Figure 2.1. How a full backup is performed


2. Incremental backups
An incremental backup operation will result in copying only the data that has changed since
the last backup operation of any type. An organization typically uses the modified time stamp
on files and compares it to the time stamp of the last backup.

The benefit of an incremental backup is that it copies a smaller amount of data than a full
backup. Thus, these operations have a faster backup speed and require less media to store
the backup.

Figure 2.2. How an incremental backup is performed


3. Differential backups
A differential backup operation is similar to an incremental backup the first time it is
performed, in that it copies all data changed since the previous backup. However, each time
it is run afterwards, it continues to copy all data changed since the previous full backup.
As a result, differential backups require more space and time to complete than incremental
backups, although less than full backups.

Fig 2.3. Differential backup


4. Mirror backups
A mirror backup is comparable to a full backup. This backup type creates an exact copy of the source
data set, but only the latest version of the data is stored in the backup repository, with no record of
earlier versions of the files. The backup is a mirror of the source data. All the backed-up files are
stored separately, as they are in the source.
One of the benefits of mirror backup is a fast data recovery time. It is also easy to access
individual backed-up files. Mirror backup is also fast to perform, because it copies files
and folders to the destination without any compression.
One of the main drawbacks, though, is the amount of storage space required: it needs more storage
space than any other backup type. In addition, password protection is not possible, and different
versions of files cannot be tracked.

Figure 2.5. How a mirror backup is performed

2.2.2. Determining appropriate methods


To determine the type of backup strategy to be used there are different determinant factors
such as overall business cost, performance, data protection levels, total amount of data
retained and availability goals.

Choose the backup strategy that is appropriate for your organization. For organizations with small data sets,
running a daily full backup provides a high level of protection without much additional storage space
costs. Larger organizations or those with more data or server volume find that running a weekly full
backup, coupled with either daily incremental backups or differential backups, provides a better
option. Using differentials provides a higher level of data protection with less restore time for most
scenarios and a small increase in storage capacity. For this reason, using a strategy of weekly full
backups with daily differential backups is a good option for many organizations.

From these types of backup, it is possible to develop an approach for comprehensive data
protection. An organization often uses one of the following backup settings:

• Full daily
• Full weekly + differential daily
• Full weekly + incremental daily
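
As a rough sketch of the "full weekly + differential daily" setting, the backups could be taken with statements like the ones below. This assumes Microsoft SQL Server, the DBMS used by the T-SQL restore examples in Unit Three. The database name SalesDB and the D:\Backups paths are hypothetical. SQL Server has no native incremental backup type; a transaction log backup, which captures the changes since the previous log backup, is the closest equivalent and is shown here in that role.

```sql
-- Hypothetical names: SalesDB and D:\Backups\ are placeholders for illustration.

-- Weekly: full backup (a complete copy of the database)
BACKUP DATABASE SalesDB
TO DISK = 'D:\Backups\SalesDB_full.bak'
WITH INIT, CHECKSUM;
GO

-- Daily: differential backup (everything changed since the last full backup)
BACKUP DATABASE SalesDB
TO DISK = 'D:\Backups\SalesDB_diff.dif'
WITH DIFFERENTIAL, INIT, CHECKSUM;
GO

-- Optionally, several times a day: transaction log backup
-- (requires the Full or Bulk-Logged recovery model)
BACKUP LOG SalesDB
TO DISK = 'D:\Backups\SalesDB_log.trn'
WITH INIT, CHECKSUM;
GO
```

In practice these statements would be run on a schedule (for example by a job scheduler), and each run would normally write to a new, dated file rather than overwrite the previous one as INIT does here.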


2.3. Range of back-up and restoration
While each approach carries its own benefits and risks, organizations need to consider their
need for performance, data protection, their total volume of data assets, and the cost of
recovery. The following five factors can be used in making a decision about which backup
schedule is right for you.

Table 2.1. Comparison of backup type

• The advantages and disadvantages of each backup type

• Pros of Full Backups
  - Potential for fast, total recovery of data assets.
  - Simple access to the most recent backup version.
  - All backups are contained in a single version.
  - Minimal time needed to restore business operations.
• Cons of Full Backups
  - Requires the most storage space.
  - Demands the most bandwidth.
  - Relatively time-consuming to complete the backup process.
• Pros of Incremental Backups
  - Minimal time to complete backup.
  - Requires the least storage space.
  - Demands the least bandwidth.
• Cons of Incremental Backups
  - Recovery time may be slower.
  - Requires a full backup in addition to incremental backups for complete recovery.
  - Recovery requires the piecing together of data from multiple backup sets.
  - Small potential for incomplete data recovery if one or more backup sets has failed.
• Pros of Differential Backups
  - Requires less storage space than full backups.
  - Only two backups (the last full and the most recent differential) are required for
    recovery.
• Cons of Differential Backups
  - Slower to back up than incremental.
  - Requires an initial full backup for complete recovery.
  - IT will need to piece together two backup sets.
  - Potential for failed recovery if one or more backups is incomplete.

2.4. Off-line back-ups


Off-line backup is also called cold backup or static backup. It is a database backup taken
while the database is offline: the database is not accessible for updates, or database
operations are stopped entirely, before the backup is performed. While the backup is in
progress, no business operations can be performed.

It is mostly carried out before the beginning of the day or at the end of the day.

Cold backups consume fewer resources but have a limitation: the database cannot be
accessed while the backup operation is in progress.

By contrast, the advantage of a hot backup (Section 2.5) is that users are still able to access
the system during the backup. However, if the server crashes while the hot backup is running,
the backup will also be lost. A further risk with a hot backup is that the data may be modified
during the process, resulting in inconsistent data.

2.5. On-line file back-ups


It is also called a hot backup or dynamic backup. A hot backup is a backup performed while
the database is open and available for use (read and write activity). It is performed in near
real-time when the systems are up and running, and new data is continuously generated or
captured.
A hot backup involves a time parameter that determines how often the backup is performed,
which can range from seconds to minutes. All of the data is copied to the secondary location,
so the relevant changes are reflected in the new backup.

The most important advantage here is the capability to continue business operations while
the backup is in progress. The database is available at all times, and hence the business can
continue as usual.

Table 2.2. Comparison of Hot Backup and Cold Backup

2.6.Disk mirroring
Disk mirroring, also known as RAID 1, is the replication of data to two or more disks. Disk
mirroring is a strong option for data that needs high availability because of its quick recovery
time. It's also helpful for disaster recovery because of its immediate failover capability. Disk
mirroring requires at least two physical drives. If one hard drive fails, an organization can use
the mirror copy. While disk mirroring offers comprehensive data protection, it requires a lot
of storage capacity.

Figure 2.6. Database mirroring


2.7. RAID
RAID stands for Redundant Array of Independent Disks. It is a technology that connects
multiple secondary storage devices for increased performance, data redundancy, or both.
Depending on the RAID level used, the array can survive the failure of one or more drives.
It consists of an array of disks in which multiple disks are connected to achieve different
goals.
There are seven standard RAID levels: RAID 0, RAID 1, ..., RAID 6.
These levels have the following characteristics:
- A RAID array contains a set of physical disk drives.
- The operating system views these separate disks as a single logical disk.
- Data is distributed across the physical drives of the array.
- In levels that provide redundancy, redundant disk capacity is used to store parity
  information or a mirror copy.
- In case of a disk failure, the parity information can be used to recover the data.

2.8. Off-site back-up files

Offsite backup is the replication of the data to a server which is separated geographically
from a production systems site. Offsite data backup may also be done via direct access, over
Wide Area Network (WAN). An offsite backup is a backup process or facility that stores
backup data or applications external to the organization or core IT environment.
It is similar to a standard backup process, but uses a facility or storage media that is not
physically located within the organization’s core infrastructure.

Offsite backups are primarily used in data backup and disaster-recovery measures. The
core objectives of storing and maintaining data at a backup facility are to:

- Secure data from malicious attacks
- Keep a backup copy of data in case the primary site is damaged or destroyed

Cloud backup, online backup, and managed backup are examples of offsite backup solutions
that enable an individual or organization to store data at facilities that are geographically and
logically external.

• Advantages and disadvantages of offsite storage

Advantages
Offsite storage has several major advantages. Some of them are:
• Scalability
• Cost and value
• Fast deployment
• Managed storage service
• Connectivity
• Performance
Disadvantages
Some disadvantages of offsite storage include:
• It can be difficult to access
• Security and privacy
• Compliance and data governance
• Lifetime costs
• Speed

2.9. Onsite Backup


In onsite storage, the data and storage hardware are physically located within your
business or organization. You may have a computer room or data center onsite where the
storage arrays are securely located.

Onsite storage usually entails storing important data on a periodic basis on local storage
devices, such as hard drives, DVDs, magnetic tapes, or CDs. Offsite storage requires storing
important data on a remote server, usually via the Internet, although it can also be done via
direct access.
• Advantages of onsite storage:
  - Immediate access to data
  - Less expensive
  - Internet access not needed
  - Control of your own data security
  - Performance improvement
• Disadvantages of onsite storage:
  - In the event of a catastrophic event, onsite data storage can be destroyed.
  - Storage can be extremely expensive depending on the size of the storage array.
  - Storage devices need to be managed, maintained, and upgraded in-house.
Unit Three: Database Recovery Points & Procedures
3.1.Database recovery point
Database recovery is the process of restoring the database to a correct (consistent) state in the
event of a failure. In other words, it is the process of restoring the database to the most recent
consistent state that existed shortly before the time of system failure.

There are many situations in which a transaction may not reach a commit or abort point.
Some of them include:

• An operating system crash can terminate the DBMS processes.
• The DBMS can crash.
• System failure (e.g. power outage)
  - Affects all transactions currently in progress but does not physically damage the
    data (soft crash)
• Media failure (e.g. head crash on the disk)
  - Damages the database (hard crash)
  - Backup data is needed
• The system might lose power.
• Human error can result in deletion of critical data.

In any of these situations, data in the database may become inconsistent or lost.

When a DBMS recovers from a crash, it should do the following:

- It should check the states of all the transactions that were being executed.
- A transaction may be in the middle of some operation; the DBMS must ensure the
  atomicity of the transaction in this case.
- It should check whether the transaction can be completed now or needs to be rolled
  back.
- No transaction should be allowed to leave the DBMS in an inconsistent state.

In case of any type of failure, a transaction must either be aborted or committed to maintain
data integrity.
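
The all-or-nothing rule can be sketched in T-SQL as follows. This is a minimal illustration assuming Microsoft SQL Server; the table dbo.Accounts and its columns AccountId and Balance are hypothetical. Either both updates are committed together, or an error rolls both of them back.

```sql
-- Hypothetical table: dbo.Accounts(AccountId, Balance)
BEGIN TRY
    BEGIN TRANSACTION;

    -- Move 100 from account 1 to account 2
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;

    -- Both changes become permanent together
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Any error undoes every change made since BEGIN TRANSACTION
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error to the caller
END CATCH;
```

If the server crashes between the two updates, the transaction never reaches COMMIT, so the recovery process rolls it back when the database restarts.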

The transaction log plays an important role in database recovery and in bringing the database
to a consistent state in the event of failure. Transactions represent the basic unit of recovery in a
database system. The recovery manager guarantees the atomicity and durability properties of
transactions in the event of failures. During recovery from failure, the recovery manager
ensures that either all the effects of a given transaction are permanently recorded in the
database or none of them are recorded. A transaction begins with successful execution of a
"<T, BEGIN>" (begin transaction) statement.

3.1.1. Database Recovery Techniques


• For fast recovery of data, the DBMS must provide tools that recover the data
efficiently. It must preserve atomicity: either the effects of a successfully completed
transaction are recorded permanently in the database, or the transaction leaves no trace
in the database at all.

• Recovery techniques based on deferred update, immediate update, or backing up data
can therefore be used to prevent data loss.

• Immediate update: as soon as a data item is modified in cache, the disk copy is
updated.

• Deferred update: all modified data items in the cache are written either after a
transaction ends its execution or after a fixed number of transactions have completed
their execution.

• Shadow update: the modified version of a data item does not overwrite its disk copy
but is written at a separate disk location.

• In-place update: the disk version of the data item is overwritten by the cache version.

3.1.2. Two approaches of Recovery


• Manual Reprocessing

In the manual reprocessing approach, the database is periodically backed up (a database
save) and all transactions applied since the last save are recorded.
If the system crashes, the latest database backup set is restored and all of the transactions are
re-applied (by users) to bring the database back up to the point just before the crash.

• The manual reprocessing approach has several shortcomings:
  - Time is required to re-apply transactions.
  - Transactions might have other (physical) consequences.
  - Re-applying concurrent transactions in the same original sequence may not be
    possible.

• Automated Recovery with Rollback / Roll forward

• Introduce a log file: a file, separate from the data, that records all of the changes
made to the database by transactions. It is also referred to as a journal.
• The transaction log includes information helpful to the recovery process, such as a
transaction identifier, the date and time, the user running the transaction, before
images, and after images.
• Before image: a copy of the table record (or data item) before it was changed by the
transaction.
• After image: a copy of the table record (or data item) after it was changed by the
transaction.
• The automated recovery process uses both rollback and roll forward to restore the
database.
• Rollback: UNDO any partially completed transactions (those in progress when the crash
occurred) by applying the before images to the database.
• Roll forward: REDO the transactions that committed before the crash but whose changes
had not yet been written to the physical disk, by applying the after images to the database.
• Checkpoint: a mechanism by which all previous log records are removed from the system
and stored permanently on disk. A checkpoint marks a point before which the DBMS was in
a consistent state and all transactions were committed. Checkpoints take less time than
database saves and can be taken in between them.
• At a checkpoint, the DBMS flushes all pending transactions and writes all data to disk and
to the transaction log.
• The database can then be recovered from the last checkpoint in much less time.

Figure 3.1. Recovery with Rollback / Roll forward
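
As one concrete illustration of a checkpoint, Microsoft SQL Server (assumed here) normally issues checkpoints automatically, but one can also be requested manually with the CHECKPOINT statement. The database name SalesDB is hypothetical.

```sql
-- Hypothetical database name
USE SalesDB;
GO
-- Write all modified (dirty) pages of the current database from memory to disk
CHECKPOINT;
GO
```

After a checkpoint, crash recovery has less of the log to roll forward, which is why recovery from the last checkpoint takes less time.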

3.2.Testing restore process


Database recovery testing is used to confirm that the database can actually be recovered. It
lets you find out whether the application runs properly after a restore, and whether you can
retrieve valuable data that would otherwise be lost if the recovery method were not set up
properly.
The key aim of backup testing is to ensure the business can retrieve its data and continue
operations. Businesses should test that they can restore files, folders and volumes from
backups on a storage volume, user and application basis. Backup testing should be regular
and routine. In an ideal world, businesses would test every backup, but that is rarely
practical.

• Common Steps in Database Backup and Recovery Testing

In database recovery testing, you need to run the test in the actual environment to check
whether the system or the data can actually be recovered after a disaster or any other
unforeseen event in the business environment.
• The common actions performed in database recovery testing are:
  - Testing of the database system
  - Testing of the SQL files
  - Testing of partial files
  - Testing of data backup
  - Testing of the backup tool
  - Testing of log backups
Backup and recovery policies should set out the recovery point objective (RPO) and the
recovery time objective (RTO).
The RPO sets out how old the most recent backup can be, or put another way, the amount of
data loss the organization can tolerate and still operate. The RTO specifies how quickly
systems must be recovered. Unless the business tests recovery, CIOs will not know if they
can meet the RTO and RPO, or if recovery works at all.
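
A restore test along these lines might look like the sketch below. It assumes Microsoft SQL Server; the database names, file paths, and logical file names ('SalesDB' and 'SalesDB_log') are hypothetical, and the real logical names should be read from the RESTORE FILELISTONLY output. The backup is restored under a different name so the production database is not touched.

```sql
-- 1. Check that the backup set is complete and readable (no data is restored)
RESTORE VERIFYONLY FROM DISK = 'D:\Backups\SalesDB_full.bak';
GO

-- 2. List the logical file names stored in the backup (needed for MOVE below)
RESTORE FILELISTONLY FROM DISK = 'D:\Backups\SalesDB_full.bak';
GO

-- 3. Restore the backup as a separate test database
RESTORE DATABASE SalesDB_RestoreTest
FROM DISK = 'D:\Backups\SalesDB_full.bak'
WITH MOVE 'SalesDB'     TO 'D:\RestoreTest\SalesDB_RestoreTest.mdf',
     MOVE 'SalesDB_log' TO 'D:\RestoreTest\SalesDB_RestoreTest_log.ldf',
     RECOVERY;
GO

-- 4. Check the logical and physical integrity of the restored copy
DBCC CHECKDB ('SalesDB_RestoreTest') WITH NO_INFOMSGS;
GO
```

Timing step 3 gives a rough measure to compare against the RTO, and the age of the newest backup that restores successfully can be compared against the RPO.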

3.3.Restore a database to a point in time


A point-in-time recovery can be used to return the database data and database objects to the
functional state they were in before a detrimental action was performed.

The ability to perform this kind of recovery depends on the recovery model set for the
database. The database must use either the Full or the Bulk-Logged recovery model. If the
Simple recovery model is used, this recovery method is not possible.

With the Bulk-Logged recovery model, recovery to a point in time might fail. An error is
raised if any bulk-logged operations were performed during the period covered by the log
backup: such operations are minimally logged, so the transaction log does not contain
enough data for a point-in-time restore.
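
The recovery model can be checked and changed as sketched below, assuming Microsoft SQL Server. The database name SalesDB is hypothetical.

```sql
-- Show the current recovery model (SIMPLE, FULL, or BULK_LOGGED)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'SalesDB';
GO

-- Switch to the Full recovery model so that point-in-time restore is possible
ALTER DATABASE SalesDB SET RECOVERY FULL;
GO
```

After switching from Simple to Full, a full backup should be taken so that later transaction log backups have a base to build on.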

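When you issue a RESTORE DATABASE or RESTORE LOG command, the WITH
RECOVERY option is used by default. It brings the database online, after which no further
backups can be restored.

To restore multiple backup files, every restore except the last must specify WITH
NORECOVERY, which leaves the database in a restoring state. NORECOVERY is not on by
default, so if you forget to use it you have to start the restore process all over again.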

The most common example of this would be to restore a "Full" backup and one or more
"Transaction Log" backups.

3.3.1. Restore a database using T-SQL


• Restore full backup and one transaction log backup

The first command restores the full backup and leaves the database in a restoring state; the
second command restores the transaction log backup and then makes the database usable.
-- Restore the full backup; NORECOVERY keeps the database in a restoring state
RESTORE DATABASE <<DatabaseName>> FROM DISK = 'C:\BackupName.BAK'
WITH NORECOVERY
GO
-- Restore the log backup; RECOVERY brings the database online
RESTORE LOG <<DatabaseName>> FROM DISK = 'C:\BackupName.TRN'
WITH RECOVERY
GO

• Restore full backup and two transaction log backups

This restores the first two backups using NORECOVERY and then uses RECOVERY for the last
restore.
-- Full backup first
RESTORE DATABASE <<DatabaseName>> FROM DISK = 'C:\BackupName.BAK'
WITH NORECOVERY
GO
-- First log backup; more restores follow, so keep NORECOVERY
RESTORE LOG <<DatabaseName>> FROM DISK = 'C:\BackupName.TRN'
WITH NORECOVERY
GO
-- Last log backup; RECOVERY brings the database online
RESTORE LOG <<DatabaseName>> FROM DISK = 'C:\BackupName1.TRN'
WITH RECOVERY
GO

• Restore full backup, latest differential and two transaction log backups

This restores the first three backups using NORECOVERY and then uses RECOVERY for the
last restore.
-- Full backup first
RESTORE DATABASE <<DatabaseName>> FROM DISK = 'C:\BackupName.BAK'
WITH NORECOVERY
GO
-- Latest differential backup
RESTORE DATABASE <<DatabaseName>> FROM DISK = 'C:\BackupName.DIF'
WITH NORECOVERY
GO
-- Log backups taken after the differential, in order
RESTORE LOG <<DatabaseName>> FROM DISK = 'C:\BackupName.TRN'
WITH NORECOVERY
GO
-- Last log backup; RECOVERY brings the database online
RESTORE LOG <<DatabaseName>> FROM DISK = 'C:\BackupName1.TRN'
WITH RECOVERY
GO
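
The examples above restore each log backup in full. To stop at a specific moment, as the point-in-time recovery described in Section 3.3 requires, the STOPAT option can be added to the log restore. The sketch below assumes Microsoft SQL Server; the database name, file paths, and timestamp are hypothetical.

```sql
-- Restore the full backup and leave the database in a restoring state
RESTORE DATABASE SalesDB FROM DISK = 'D:\Backups\SalesDB_full.bak'
WITH NORECOVERY;
GO

-- Replay the log only up to the chosen moment, then bring the database online.
-- The timestamp is a placeholder: use a time just before the unwanted change.
RESTORE LOG SalesDB FROM DISK = 'D:\Backups\SalesDB_log.trn'
WITH STOPAT = '2024-01-15T10:30:00', RECOVERY;
GO
```

Transactions committed after the STOPAT time are not restored, so the time should be chosen as late as possible while still preceding the damaging action.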
