Unit IV Backup - Lecture Notes
Lecture notes
Information Availability
Information availability (IA) refers to the ability of the infrastructure to function according to business
expectations during its specified time of operation.
Information availability ensures that people (employees, customers, suppliers, and partners) can access
information whenever they need it. Information availability can be defined with the help of reliability,
accessibility and timeliness.
Reliability: This reflects a component’s ability to function without failure, under stated conditions, for a
specified amount of time.
Accessibility: This is the state within which the required information is accessible at the right place, to
the right user. The period of time during which the system is in an accessible state is termed system
uptime; when it is not accessible it is termed system downtime.
Timeliness: Defines the exact moment or the time window (a particular time of the day, week, month,
and/or year as specified) during which information must be accessible. For example, if online access to an
application is required between 8:00 am and 10:00 pm each day, any disruptions to data availability
outside of this time slot are not considered to affect timeliness.
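Uptime and downtime can be combined into a single availability figure, conventionally computed as system uptime divided by total agreed service time. A minimal Python sketch (the function name and sample figures are illustrative, not from the text):

def availability(uptime_hours: float, downtime_hours: float) -> float:
    # Fraction of the agreed service time the system was accessible.
    total = uptime_hours + downtime_hours
    if total == 0:
        raise ValueError("no observation period")
    return uptime_hours / total

# Example: 8,740 hours up and 20 hours down in a year -> about 99.77%.
print(f"{availability(8740, 20):.4%}")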
Causes of Information Unavailability
Various planned and unplanned incidents result in data unavailability. Planned outages include installation/integration/maintenance of new hardware, software upgrades or patches, taking backups, application and data restores, facility operations (renovation and construction), and refresh/migration of the test environment to the production environment. Unplanned outages include failures caused by database corruption, component failure, and human errors.
Another type of incident that may cause data unavailability is natural or man-made disasters such as
flood, fire, earthquake, and contamination. As illustrated in Figure 11-1, the majority of outages are
planned. Planned outages are expected and scheduled, but still cause data to be unavailable. Statistically,
less than 1 percent is likely to be the result of an unforeseen disaster.
Consequences of Downtime
Data unavailability, or downtime, results in loss of productivity, loss of revenue, poor financial
performance, and damages to reputation. Loss of productivity reduces the output per unit of labor,
equipment, and capital. Loss of revenue includes direct loss, compensatory payments, future revenue
losses, billing losses, and investment losses. Poor financial performance affects revenue recognition,
cash flow, discounts, payment guarantees, credit rating, and stock price. Damages to reputation may result
in a loss of confidence or credibility with customers, suppliers, financial markets, banks, and business
partners.
Other possible consequences of downtime include the cost of additional equipment rental, overtime, and
extra shipping.
The business impact of downtime is the sum of all losses sustained as a result of a given disruption. An important metric, the average cost of downtime per hour, provides a key estimate for determining the appropriate business continuity (BC) solutions.
It is calculated as follows:
Average cost of downtime per hour = average productivity loss per hour + average revenue loss per hour
Where:
Productivity loss per hour = (total salaries and benefits of all employees per week) / (average number of
working hours per week)
Average revenue loss per hour = (total revenue of an organization per week) / (average number of hours
per week that an organization is open for business)
The average downtime cost per hour may also include estimates of projected revenue loss due to other
consequences such as damaged reputations and the additional cost of repairing the system.
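As a worked example of the formula above, here is a short Python calculation with purely hypothetical figures:

# All figures below are made up for illustration only.
weekly_salaries_and_benefits = 500_000  # total salaries and benefits per week
weekly_working_hours = 40               # average working hours per week
weekly_revenue = 2_000_000              # total revenue per week
weekly_business_hours = 98              # hours open for business per week

productivity_loss_per_hour = weekly_salaries_and_benefits / weekly_working_hours
revenue_loss_per_hour = weekly_revenue / weekly_business_hours
downtime_cost_per_hour = productivity_loss_per_hour + revenue_loss_per_hour

print(f"Productivity loss per hour: {productivity_loss_per_hour:,.0f}")  # 12,500
print(f"Revenue loss per hour:      {revenue_loss_per_hour:,.0f}")       # 20,408
print(f"Downtime cost per hour:     {downtime_cost_per_hour:,.0f}")      # 32,908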
A backup is a copy of production data, created and retained for the sole purpose of recovering deleted or
corrupted data. With growing business and regulatory demands for data storage, retention, and
availability, organizations are faced with the task of backing up an ever-increasing amount of data. This
task becomes more challenging as demand for consistent backup and quick restore of data increases
throughout the enterprise, which may be spread over multiple sites. Moreover, organizations need to accomplish backups at lower cost and with minimal resources.
Backups are performed to serve three purposes: disaster recovery, operational backup, and archival.
Disaster Recovery
Backups can be performed to address disaster recovery needs. The backup copies are used for restoring
data at an alternate site when the primary site is incapacitated due to a disaster. Based on RPO and RTO
requirements, organizations use different backup strategies for disaster recovery. When a tape-based
backup method is used as a disaster recovery strategy, the backup tape media is shipped and stored at an
offsite location. These tapes can be recalled for restoration at the disaster recovery site. Organizations
with stringent RPO and RTO
requirements use remote replication technology to replicate data to a disaster recovery site. This allows organizations to bring production systems online in a relatively short period of time in the event of a disaster.
Operational Backup
Data in the production environment changes with every business transaction and operation. Operational
backup is a backup of data at a point in time and is used to restore data in the event of data loss or logical
corruptions that may occur during routine processing. The majority of restore requests in most
organizations
fall in this category. For example, it is common for a user to accidentally delete an important e‑mail or for
a file to become corrupted, which can be restored from operational backup.
Operational backups are created for the active production information by using incremental or differential
backup techniques, detailed later in this chapter. An example of an operational backup is a backup
performed for a production database just before a bulk batch update. This ensures that a clean copy of the production database is available if the batch update corrupts it.
Archival
Backups are also performed to address archival requirements. Although content-addressed storage (CAS) has emerged as the primary solution for archives, traditional backups are still used by small and medium enterprises for long-term preservation of transaction records, e‑mail messages, and other business records required for regulatory compliance.
Apart from addressing disaster recovery, archival, and operational requirements, backups serve as a
protection against data loss due to physical damage of a storage device, software failures, or virus attacks.
Backups can also be used to protect against accidents such as a deletion or intentional data destruction.
Backup Granularity
Backup granularity depends on business needs and the required RTO/RPO. Based on granularity, backups can be categorized as full, incremental, and cumulative (differential); most organizations use a combination of these three backup types to meet their backup and recovery requirements.
Full backup is a backup of the complete data on the production volumes at a certain point in time. A full
backup copy is created by copying the data on the production volumes to a secondary storage device.
Incremental backup copies the data that has changed since the last full or incremental backup, whichever has occurred more recently. Backing up is much faster (because the volume of data backed up is restricted to changed data), but restores take longer, because the last full backup and all subsequent incrementals must be applied.
Cumulative (or differential) backup copies the data that has changed since the last full backup. This method takes longer to back up than an incremental backup but is faster to restore, because only the last full backup and the latest cumulative copy are needed.
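The selection rule behind the three granularities can be sketched in a few lines of Python. This is a simplified illustration, assuming each file's last-modified timestamp decides whether it has changed; the function name is hypothetical:

import os

def files_to_back_up(root, mode, last_full, last_backup):
    """Select files for a backup run.
    mode: 'full'        -> copy everything
          'incremental' -> changed since the last backup of any kind
          'cumulative'  -> changed since the last full backup (differential)
    last_full and last_backup are POSIX timestamps of the previous runs."""
    cutoff = {"full": 0, "incremental": last_backup, "cumulative": last_full}[mode]
    selected = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > cutoff:
                selected.append(path)
    return selected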
Topic: Backup targets and methods - Backup Process and Architecture - Backup and Restore Operations
Lecture notes
Cloud storage providers, including Amazon Web Services (AWS) and Microsoft Azure, offer
Infrastructure as a Service (IaaS) models that let you create servers in the cloud as backup targets.
Most cloud backup services provide dedicated solutions that look like network drives to software and
users. While this option provides a high level of flexibility, it also comes with additional fees. You might
also incur additional costs needed to protect the data stored in these services.
Cloud customers can also employ a local storage resource, such as a network-attached storage (NAS)
device, to serve as a middleman for backups. This type of resource can store frequently used files and
then serve them through the faster local network.
Cloud backup vendors usually provide management tools according to customer requirements in terms of
size and demand, security, and changing bandwidth conditions. In some scenarios, variable data retention
requirements are also factored in. This enables cloud providers to automatically drop all files or folders
that are older than the time specified by the administrator.
Storage tiers, like frequently-accessed storage or archival storage, are priced differently. Customers can
define policies and automate how data moves between tiers, to conserve costs for data that is less
frequently used or retained only for compliance purposes.
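The retention rule described above (drop files or folders older than the administrator-specified time) reduces to a simple age check. A hedged sketch in Python, using file modification time as the age; the 90-day cutoff is illustrative:

import os, time

RETENTION_DAYS = 90  # administrator-defined cutoff (illustrative value)

def expired(path, now=None):
    # True if the file is older than the retention window and can be
    # dropped or moved to a cheaper archival tier.
    now = now if now is not None else time.time()
    return now - os.path.getmtime(path) > RETENTION_DAYS * 86400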
Backup Methods
Hot backup and cold backup are the two methods deployed for backup. They are based on the state of the
application when the backup is performed. In a hot backup, the application is up and running, with users
accessing their data during the backup process. In a cold backup, the application is not active during the
backup process.
The backup of online production data becomes more challenging because data is actively being used and
changed. An open file is locked by the operating system and is not copied during the backup process until
the user closes it. The backup application can back up open files by retrying the operation on them later in the backup process; files that were open earlier may have been closed by then, allowing the retry to succeed.
The maximum number of retries can be configured depending on the backup application. However, this
method is not considered robust because in some environments certain files are always open.
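The retry loop described above can be sketched as follows. This is a simplified illustration, assuming a locked file raises PermissionError on the copy attempt (as it typically does on Windows); the retry limit and wait time stand in for the application's configurable settings:

import shutil, time

MAX_RETRIES = 3  # configurable in a real backup application

def back_up_with_retry(paths, dest_dir):
    # Copy each file, revisiting files that were locked on earlier passes.
    pending = list(paths)
    for _ in range(MAX_RETRIES):
        still_locked = []
        for path in pending:
            try:
                shutil.copy2(path, dest_dir)
            except PermissionError:   # file still open; try again next pass
                still_locked.append(path)
        if not still_locked:
            return []                 # everything copied
        pending = still_locked
        time.sleep(60)                # give users time to close files
    return pending                    # files that stayed open on every pass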
Backup Process
A backup system uses client/server architecture with a backup server and multiple backup clients. The
backup server manages the backup operations and maintains the backup catalog, which contains
information about the backup process and backup metadata. The backup server depends on backup clients
to gather the data to be backed up. The backup clients can be local to the server or they can reside on
another server, presumably to back up the data visible to that server. The backup server receives backup
metadata from the backup clients to perform its activities.
Figure 12-4 illustrates the backup process. The storage node is responsible for writing data to the backup
device (in a backup environment, a storage node is a host that controls backup devices). Typically, the
storage node is integrated with the backup server and both are hosted on the same physical platform. A
backup device is attached directly to the storage node's host platform. Some backup architectures refer to the storage node as the media server because it connects to the storage device. Storage nodes play an important role in backup planning because they can be used to consolidate backup servers.
The backup process is based on the policies defined on the backup server, such as the time of day or
completion of an event. The backup server then initiates the process by sending a request to a backup
client (backups can also be initiated by a client). This request instructs the backup client to send its
metadata
to the backup server, and the data to be backed up to the appropriate storage node. On receiving this
request, the backup client sends the metadata to the backup server. The backup server writes this metadata
on its metadata catalog.
The backup client also sends the data to the storage node, and the storage node writes the data to the
storage device. After all the data is backed up, the storage node closes the connection to the backup
device. The backup server writes the backup completion status to the metadata catalog.
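The whole exchange can be condensed into a sketch. The classes below are illustrative stand-ins for the backup server, backup client, and storage node; they are not the API of any real backup product:

class BackupClient:
    def __init__(self, name, files):
        self.name, self.files = name, files
    def collect_metadata(self):
        return {"client": self.name, "files": list(self.files)}
    def read_data(self):
        return {path: f"<contents of {path}>" for path in self.files}

class StorageNode:
    def __init__(self):
        self.device = {}                 # stands in for the backup device
    def write(self, data):
        self.device.update(data)         # write data to the backup device

class BackupServer:
    def __init__(self):
        self.catalog = []                # the backup (metadata) catalog
    def run_backup(self, client, node):
        self.catalog.append(client.collect_metadata())  # client sends metadata
        node.write(client.read_data())   # client sends data to the storage node
        self.catalog.append({"client": client.name, "status": "complete"})

server, node = BackupServer(), StorageNode()
server.run_backup(BackupClient("mail01", ["/var/mail/inbox"]), node)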
Backup software also provides extensive reporting capabilities based on the backup catalog and the log
files. These reports can include information such as the amount of data backed up, the number of
completed backups, the number of incomplete backups, and the types of errors that may have occurred.
Reports can be customized depending on the specific backup software used.
Backup and Restore Operations
When a backup process is initiated, significant network communication takes place between the different
components of a backup infrastructure. The backup server initiates the backup process for different clients
based on the backup schedule configured for them. For example, the backup process for a group of clients
may be scheduled to start at 3:00 am every day.
The backup server coordinates the backup process with all the components in a backup configuration (see
Figure 12-5). The backup server maintains the information about backup clients to be contacted and
storage nodes to be used in a backup operation. The backup server retrieves the backup-related
information
from the backup catalog and, based on this information, instructs the storage node to load the appropriate
backup media into the backup devices.
Simultaneously, it instructs the backup clients to start scanning the data, package it, and send it over the
network to the assigned storage node. The storage node, in turn, sends metadata to the backup server to
keep it updated about the media being used in the backup process. The backup server continuously
updates the backup catalog with this information.
Upon receiving a restore request, an administrator opens the restore application to view the list of clients that have been backed up. When selecting the client for which the restore request has been made, the administrator also needs to identify the client that will receive the restored data. Data can be restored on the same client for which the restore request was made or on any other client.
The administrator then selects the data to be restored and the point in time to which it has to be restored, based on the RPO. Because all of this information comes from the backup catalog, the restore application must also communicate with the backup server.
Lecture notes
Data Deduplication
Data deduplication is a process that eliminates redundant copies of data and reduces storage overhead.
Data deduplication techniques ensure that only one unique instance of data is retained on storage media,
such as disk, flash or tape. Redundant data blocks are replaced with a pointer to the unique data copy. In
that way, data deduplication closely aligns with incremental backup, which copies only the data that has
changed since the previous backup.
An example of data deduplication
A typical email system might contain 100 instances of the same 1 megabyte (MB) file attachment. If the
email platform is backed up or archived, all 100 instances are saved, requiring 100 MB of storage space.
With data deduplication, only one instance of the attachment is stored; each subsequent instance is
referenced back to the one saved copy.
In this example, a 100 MB storage demand drops to 1 MB.
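The mechanism behind this example is content hashing: each chunk is fingerprinted, stored once, and every later occurrence becomes a pointer to the stored copy. A minimal sketch in Python (the class is illustrative, not a production design):

import hashlib

class DedupStore:
    # Store each unique chunk once; duplicates become pointers (hashes).
    def __init__(self):
        self.chunks = {}                          # hash -> unique data

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)      # stored only on first sight
        return digest                             # pointer to the single copy

store = DedupStore()
attachment = b"x" * 1_000_000                     # the 1 MB attachment
pointers = [store.put(attachment) for _ in range(100)]  # 100 mailbox copies
print(len(store.chunks), "unique chunk stored for", len(pointers), "references")
# -> 1 unique chunk stored for 100 references: 100 MB shrinks to 1 MB.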
A cloud backup service offers the use of shared software-defined storage, which is managed like a virtual
resource. This type of virtual architecture enables providers to create a large pool of storage and parcel it
out among many customers.
To use these managed services, there is no need for a dedicated physical or virtual server, off-site
facilities for storing tape backup, or expensive tape drives with dedicated backup software.
Virtualized storage
Software-defined storage enables cloud providers to manage storage at the byte level, and employ multi-
tenant architectures to ensure each account is completely separated from others. This helps isolate data
belonging to different customers.
Customers of cloud backup services can store their frequently used data in several geographic locations.
A vendor-owned data center located close to your office, for example, can provide fast access.
Unit No 4 - BACKUP, ARCHIVE AND REPLICATION (Lecture No 4)
Lecture notes
Data Archive
Data archiving moves data that is no longer actively used to a separate storage device for long-term
retention. Archive data consists of older data that remains important to the organization or must be
retained for future reference or regulatory compliance reasons. Data archives are indexed and have search
capabilities, so files can be located and retrieved.
Archived data is stored on a lower-cost tier of storage, reducing primary storage consumption and the
related costs. An important aspect of a business's data archiving strategy is to inventory its data and
identify what data is a candidate for archiving.
Some archive systems treat archive data as read-only to protect it from modification, while other data
archiving products enable writes as well as reads. For example, WORM -- write once, read many --
technology uses media that is not rewritable.
Data archiving is most suitable for data that must be retained due to operational or regulatory
requirements, such as document files, email messages and possibly old database records.
Data archiving benefits
The greatest benefit of archiving data is it reduces the cost of primary storage. Primary storage is typically
expensive, because a storage array must produce a sufficient level of input/output operations per second
to meet operational requirements for user read/write activity. In contrast, archive storage costs less,
because it is typically based on a low-performance, high-capacity storage medium. Data archives can be
stored on low-cost hard disk drives (HDDs), tape or optical storage that is generally slower than
performance disk or flash drives.
Archive storage also reduces the volume of data that must be backed up. Removing infrequently accessed
data from the backup data set improves backup and restore performance. Typically, organizations perform
data deduplication on data being moved to a lower storage tier, which reduces the overall storage
footprint and lowers secondary storage costs.
Replication is the process of creating an exact copy of data. Creating one or more replicas of the
production data is one of the ways to provide Business Continuity (BC). These replicas can be used for
recovery and restart operations in the event of data loss. The primary purpose of replication is to enable
users to have designated data at the right place, in a state appropriate to the recovery need. The replica
should provide recoverability and restartability.
Recoverability enables restoration of data from the replicas to the production volumes in the event of data
loss or data corruption. It must provide minimal RPO and RTO for resuming business operations on the
production volumes, while restartability must ensure consistency of data on the replica. This enables
restarting business operations using the replicas.
Lecture notes
In storage system-based remote replication, the replication is performed between storage systems. Typically, one of the storage systems is at the source site and the other is at a remote site for disaster recovery (DR) purposes. Data can be transmitted from the source storage system to the target system over a shared or a dedicated network. Replication between storage systems may be performed in synchronous or asynchronous mode.
In full-volume (clone) replication, changes made to both the source and the replica after detachment can typically be tracked at some predefined granularity. This enables incremental resynchronization (source to target) or incremental restore (target to source). The clone must be the same size as the source LUN.
In pointer-based virtual replication, data on the target is a combined view of the unchanged data on the source and the data in the save location. The unavailability of the source device invalidates the data on the target. The target contains only pointers to the data; therefore, the physical capacity required for the target is a fraction of the source device. The capacity required for the save location depends on the amount of expected data change.
Some pointer-based virtual replication implementations use redirect-on-write (RoW) technology. RoW redirects new writes destined for the source LUN to a reserved LUN in the storage pool. This differs from copy on first write (CoFW), where a write to the source LUN is held until the original data has been copied to the save location to preserve the point-in-time replica. With CoFW, a lookup is always needed to determine whether the data is on the source LUN or in the save location, which causes snapshot reads to be slower than source LUN reads. In the case of a RoW snapshot, the original data remains where it is, and is therefore read from its original location on the source LUN.
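The difference between the two policies is what happens on the first write to a block after the snapshot is taken. A simplified sketch, with plain dictionaries standing in for the source LUN, save location, and reserved LUN:

# Copy on First Write (CoFW): preserve the original block in the save
# location before overwriting it in place on the source.
def cofw_write(source, save_location, block, new_data):
    if block not in save_location:              # first write since snapshot?
        save_location[block] = source[block]    # copy the original out first
    source[block] = new_data

# Redirect on Write (RoW): leave the source untouched; send the new
# write to a reserved area instead.
def row_write(source, reserved, block, new_data):
    reserved[block] = new_data

# A CoFW snapshot read needs a lookup (save location first, then source),
# which is why such snapshot reads are slower than source reads.
def cofw_snapshot_read(source, save_location, block):
    return save_location.get(block, source[block])

source, save = {"b1": "old"}, {}
cofw_write(source, save, "b1", "new")
print(cofw_snapshot_read(source, save, "b1"))   # -> 'old' (point-in-time view)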
Storage System-Based Remote Replication Techniques
Synchronous Remote Replication
Storage system-based remote replication solutions can avoid downtime by enabling business operations at remote sites. Storage-based synchronous remote replication provides a near-zero RPO, where the target is identical to the source at all times. In synchronous replication, writes must be committed to the source and the remote target prior to acknowledging "write complete" to the production server. Additional writes on the source cannot occur until each preceding write has been completed and acknowledged. This ensures that data is identical on the source and the target at all times. Further, writes are transmitted to the remote site exactly in the order in which they are received at the source, so write ordering is maintained and transactional consistency is ensured when applications are restarted at the remote location. Most storage systems support consistency groups, which allow all LUNs belonging to a given application, usually a database, to be treated as a single entity and managed as a whole. This helps ensure that the remote images are consistent, so they are always restartable copies.
Asynchronous Remote Replication
In asynchronous remote replication, a write from a production server is committed to the source and immediately acknowledged to the server. Because writes are acknowledged immediately, asynchronous replication mitigates the impact on the application's response time. This enables data to be replicated over distances of up to several thousand kilometers between the source site and the secondary (remote) site. In asynchronous replication, the server writes are collected into a buffer (a delta set) at the source. This delta set is transferred to the remote site at regular intervals, so adequate buffer capacity must be provisioned. The RPO depends on the size of the buffer, the available network bandwidth, and the write workload to the source. Asynchronous replication can take advantage of locality of reference (repeated writes to the same location): if the same location is written multiple times in the buffer prior to transmission to the remote site, only the final version of the data is transmitted, which conserves link bandwidth.
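The buffering and write-coalescing behavior can be captured in a short sketch: keying the delta set by write location means repeated writes to one location overwrite each other in the buffer, so only the final version crosses the link. The class below is illustrative:

class AsyncReplicator:
    # Collect server writes into a delta set; ship it at regular intervals.
    def __init__(self, link):
        self.delta = {}              # location -> latest data (coalesces writes)
        self.link = link             # callable that transmits to the remote site

    def write(self, location, data):
        self.delta[location] = data  # buffered; later writes overwrite earlier
        return "ack"                 # acknowledged to the server immediately

    def cycle(self):
        self.link(self.delta)        # transfer the delta set to the remote site
        self.delta = {}              # start the next interval

sent = []
rep = AsyncReplicator(link=sent.append)
for _ in range(1000):
    rep.write("block-42", b"latest")  # 1,000 writes to the same location
rep.cycle()
print(len(sent[0]))                   # -> 1: only the final version was sent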
Review question (True/False): Remote replicas help organizations mitigate the risks associated with regionally driven outages resulting from natural or human-made disasters. Answer: True (K1)
Data migration
Data migration is the transfer of the existing historical data to new storage, system, or file format. This
process is not as simple as it may sound. It involves a lot of preparation and post-migration activities
including planning, creating backups, quality testing, and validation of results. The migration ends only
when the old system, database, or environment is shut down.
Usually, data migration comes as a part of a larger project such as
● legacy software modernization or replacement,
● the expansion of system and storage capacities,
● the introduction of an additional system working alongside the existing application,
● the shift to a centralized database to eliminate data silos and achieve interoperability,
● moving IT infrastructure to the cloud, or
● merger and acquisition (M&A) activities when IT landscapes must be consolidated into a single
system.
There are six commonly used types of data migration. However, this division is not strict. A particular
case of the data transfer may belong, for example, to both database and cloud migration or involve
application and database migration at the same time.
Storage migration
Storage migration occurs when a business acquires modern technologies discarding out-of-date
equipment. This entails the transportation of data from one physical medium to another or from a physical
to a virtual environment. Examples of such migrations are when you move data
● from paper to digital documents,
● from hard disk drives (HDDs) to faster and more durable solid-state drives (SSDs), or
● from mainframe computers to cloud storage.
Database migration
A database is not just a place to store data. It provides a structure to organize information in a specific way and is typically controlled via a database management system (DBMS). Database migration therefore means either upgrading to the latest version of the DBMS or switching to a new DBMS from a different vendor.
Application migration
When a company changes an enterprise software vendor — for instance, a hotel implements a
new property management system or a hospital replaces its legacy EHR system — this requires moving
data from one computing environment to another. The key challenge here is that old and new
infrastructures may have unique data models and work with different data formats.
Cloud migration
Cloud migration is a popular term that embraces all the above-mentioned cases, if they involve moving
data from on-premises to the cloud or between different cloud environments.
Step 1 Refine the scope. The key goal of this step is to filter out any excess data and to define the
smallest amount of information required to run the system effectively. So, you need to perform a high-
level analysis of source and target systems, in consultation with data users who will be directly impacted
by the upcoming changes.
Step 2 Assess source and target systems. A migration plan should include a thorough assessment of the
current system’s operational requirements and how they can be adapted to the new environment.
Step 3 Set data standards. This will allow your team to spot problem areas across each phase of the
migration process and avoid unexpected issues at the post-migration stage.
Step 4 Estimate budget and set realistic timelines. After the scope is refined and systems are evaluated, it is easier to select the approach (big bang or trickle), estimate the resources needed for the project, and set schedules and deadlines. According to Oracle's estimates, an enterprise-scale data migration project lasts six months to two years on average.
Unit No 4 - BACKUP, ARCHIVE AND REPLICATION (Lecture No 8)
Lecture notes
Disaster recovery as a service (DRaaS) is a cloud computing service model that allows an organization to back up its data and IT infrastructure in a third-party cloud computing environment, with all DR orchestration provided through a SaaS solution, so that access to and functionality of the IT infrastructure can be regained after a disaster. The as-a-service model means that the organization itself does not have to own all the resources or handle all the management for disaster recovery, relying instead on the service provider.
Disaster recovery planning is critical to business continuity. Many disasters that have the potential to
wreak havoc on an IT organization have become more frequent in recent years:
● Natural disasters such as hurricanes, floods, wildfires and earthquakes
● Equipment failures and power outages
● Cyberattacks
DRaaS works by replicating and hosting servers in a third-party vendor's facilities rather than in the physical location of the organization that owns the workload. The disaster recovery plan is executed on the third-party vendor's facilities in the event of a disaster that shuts down a customer's site. Organizations may purchase DRaaS plans through a traditional subscription model or a pay-per-use model that allows them to pay only when disaster strikes. As-a-service solutions vary in scope and cost, so organizations should evaluate potential DRaaS providers according to their own unique needs and budget.
DRaaS can save organizations money by eliminating the need for provisioning and maintaining an
organization’s own off-site disaster recovery environment. However, organizations should evaluate and
understand service level agreements. For instance, what happens to recovery times if both the provider and the customer are affected by the same natural disaster, such as a large hurricane or earthquake? Different DRaaS providers have different policies on which customers get priority in a large regional disaster and on whether customers can perform their own disaster recovery testing.
With disaster recovery as a service, the service provider moves an organization’s computer processing to
its cloud infrastructure in the event of a disaster. This way, the business can continue to operate, even if
the original IT infrastructure is totally destroyed or held hostage. This differs from backup as a service,
where only the data, but not the ability to process the data, is duplicated by a third-party provider.
Because BaaS protects only the data, and not the infrastructure, it is typically less expensive than DRaaS. BaaS can be a good solution for companies that need to archive data or records for legal reasons, but most organizations that use BaaS will want to combine it with another disaster recovery tool to ensure business continuity.
Planning for disaster and getting the help you need is something every business needs to consider.
Whatever option you choose, a disaster recovery plan is essential for business continuity, and
organizations are increasingly turning to DRaaS.
Review question: Define DRaaS. (2 marks, CO2, K1)
Lecture notes
Cloud backup, also known as online backup or remote backup, is a strategy for sending a copy of a
physical or virtual file or database to a secondary, off-site location for preservation in case of equipment
failure, site catastrophe or human malfeasance. The backup server and data storage systems are usually
hosted by a third-party cloud or SaaS provider that charges the backup customer a recurring fee based on
storage space or capacity used, data transmission bandwidth, number of users, number of servers or
number of times data is retrieved.
Implementing cloud data backup can help bolster an organization's data protection, business continuance
and regulatory compliance strategies without increasing the workload of IT staff. The labor-saving benefit
can be significant and enough of a consideration to offset some of the additional costs associated with
cloud backup, such as data transmission charges.
Most cloud subscriptions run on a monthly or yearly basis. Initially used mainly by consumers and home
offices, online backup services are now commonly used by SMBs and larger enterprises to back up some
forms of data. For larger companies, cloud data backup can serve as a supplementary form of backup.
What is the cloud?
Cloud computing is a general term that refers to hosted resources and services that are delivered over the
internet. Unlike traditional web hosting, the services in the cloud are sold on demand, offered in an elastic manner -- meaning the customer can use as much or as little of the service as needed -- and managed completely by the service provider. Additionally, a cloud can be private or public. A public
cloud sells services to anyone on the internet, such as how AWS operates, while a private cloud supplies
hosted services to a limited number of users within the business.
Although the steps can vary based on backup method or type, this is the basic process for cloud backups.
When an organization engages a cloud backup service, the first step is to complete a full backup of the
data that must be protected. This initial backup can sometimes take days to finish uploading over a
network as a result of the large volume of data being transferred. In a 3-2-1 backup strategy, where an
organization has three copies of data on two different media, at least one copy of the backed up data
should be sent to an off-site backup facility so that it's accessible even if on-site systems are unavailable.
Using a technique called cloud seeding, a cloud backup vendor sends a storage device -- such as a hard
drive or tape cartridge -- to its new customer, which then backs up the data locally onto the device and
returns it to the provider. This process removes the need to send the initial data over the network to the
backup provider. One example of a device that employs this technique is AWS Snowball Edge.
If the amount of data in the initial backup is substantial, the cloud backup service might provide a full
storage array for the seeding process. These arrays are typically small network-attached storage (NAS)
devices that can be shipped back and forth relatively easily. After the initial seeding, only changed data is
backed up over the network.
How data is restored
Cloud backup services are typically built around a client software application that runs on a schedule
determined by the purchased level of service and the customer's requirements. For example, if the
customer has contracted for daily backups, the application collects, compresses, encrypts and transfers
data to the cloud service provider's servers every 24 hours. To reduce the amount of bandwidth consumed
and the time it takes to transfer files, the service provider might only provide incremental backups after
the initial full backup.
Cloud backup services often include the software and hardware necessary to protect an organization's
data, including applications for Microsoft Exchange and SQL Server. Whether a customer uses its own
backup application or the software the cloud backup service provides, the organization uses that same
application to restore backed up data. Restorations could be on a file-by-file basis, by volume or a full
restoration of the complete backup. More granular file-by-file restoration is typically the preferred method because it enables a business to quickly recover individual lost or damaged files rather than taking the time and risk of restoring entire volumes.
If the volume of data to be restored is very large, the cloud backup service might ship the data on a
complete storage array that the customer can hook up to its servers to recover its data. This is, in effect, a
reverse seeding process. Restoring a large amount of data over a network can take an unacceptably long
time, depending on the organization's recovery time objective.
A key feature of cloud backup restorations is that they can be done anywhere, from nearly any kind of
computer. For example, an organization could recover its data directly to a disaster recovery site in a
different location if its primary data center is unavailable.
Mobile Device Backup:
Mobile device backup involves creating copies of data stored on smartphones, tablets, or other mobile
devices to prevent data loss in case of device damage, loss, or theft. Key aspects of mobile device backup
include:
● Data Types: Mobile device backup typically includes contacts, photos, videos, messages, app data, and other user-generated content.
● Backup Methods: Mobile device backup can be performed using various methods, including built-in backup features provided by the device's operating system, third-party backup apps, or cloud-based backup services specifically designed for mobile devices.
● Automatic Backup: Many mobile devices offer automatic backup options that regularly back up data to the cloud or other storage locations, ensuring that the backup is up to date.
● Restore Options: In addition to backup, mobile devices often provide easy-to-use restore options to recover data onto a new or reset device.
● Data Encryption: To ensure data security, mobile device backup solutions may incorporate encryption techniques to protect data during transmission and storage.
Examples of mobile device backup solutions include iCloud for iOS devices, Google Backup for Android
devices, and various third-party backup apps available on app stores.
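To illustrate the encryption aspect, here is a hedged sketch using the third-party cryptography package (an illustrative choice; real mobile backup services implement their own encryption pipelines, and the key would normally be derived from user credentials rather than generated ad hoc):

# pip install cryptography   (third-party library, used here for illustration)
from cryptography.fernet import Fernet

key = Fernet.generate_key()     # illustrative; not how real services manage keys
cipher = Fernet(key)

backup_blob = b"contacts, photos, messages ..."   # stand-in for device data
encrypted = cipher.encrypt(backup_blob)           # protected in transit and at rest
restored = cipher.decrypt(encrypted)              # the restore path
assert restored == backup_blob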