A Guide To Site Disaster Recovery Options
Issue Date: 16th December 2008
A Guide to Site Disaster Recovery Options Version 1.0 16th December 2008
1.0 Introduction
This document has been produced as an aid to creating a site disaster recovery plan. While the
information contained in this document is generic and can be applied to any NetBackup domain,
it is not intended to be a definitive disaster recovery plan for all environments.
Instead, it is assumed that readers will use the information it contains to develop
specific site disaster recovery plans for their NetBackup environments.
One of the biggest problems encountered during site disaster recovery is that the DR site is not a
mirror image of the production site. The NetBackup catalog backup and recovery process is
primarily intended for recovering from catalog storage or Master Server failure rather than site
loss and, by default, restores the complete catalog including the EMM database which includes
details of the Media Servers, backup devices and Storage Units. The Master Server uses this
information to direct backups and restores but also uses it to interrogate the Media Servers to
establish the status of the backup devices. In a DR environment which does not contain these
Media Servers the performance of the Master Server can be severely impacted, and along with it
the ability to carry out restore operations, as polling operations fail to connect and time out.
This document examines two approaches to recovering the NetBackup environment at a DR site
where the arrangement of Media Servers and clients is different to that at the main production
site. The first approach involves recovering the whole catalog and then disabling or removing the
unwanted configuration elements. The second approach involves a partial catalog recovery in
which the EMM and BMR databases are not restored. Both approaches have advantages and
disadvantages which are discussed in the following sections.
When creating your disaster recovery plan, ensure that it includes all the steps identified in section
3.0 and either section 4.0 or section 5.0, depending on the recovery process used. Always test the
plan to ensure the steps are correct and the recovery process works as planned.
2.0 Full Catalog Recovery vs. Partial Catalog Recovery
The catalog recovery wizard offers two recovery options: full recovery, in which both the relational
database and flat file components of the catalog are recovered, and partial recovery, in which
only the flat file components are recovered. The most appropriate method for recovery will be
partially determined by the nature of the DR facility and how similar to the production facility it is.
3.0 DR Site Preparation
Before recovering the NetBackup catalogs to the Master Server at the DR site the infrastructure
to be used for disaster recovery (master server, media servers, network connections, NetBackup
software) must be in place and working.
When developing a disaster recovery plan it is important to have the steps required to get to this
state carefully documented, particularly if the DR site is not normally configured (for example it is
a facility provided by a specialist DR services company).
In particular the following points should be noted:
1. The same version of NetBackup used at the production site must be installed on a Master
Server and a number of Media Servers and clients at the DR site.
Note: If the production site includes media servers running older versions of NetBackup
(e.g. NetBackup 5.1) it is not necessary to provide any media servers running these older
versions at the DR site and it is generally recommended to use the same version for the
Master Server and Media Servers at the DR site.
Note: If the full catalog recovery method is going to be used and the Master Server at the
production site is clustered, a clustered master server must also exist at the DR site (see
tech note 301049). However, the member nodes of the cluster do not need to be the same
as those at the production site. If the partial catalog recovery method is used there is no
requirement to have a clustered master server at the DR site.
2. Network connectivity and authentication between the clients and servers have been tested
and confirmed using test backup policies (these policies should be disabled after testing).
3. Suitable tape drives and libraries have been connected to the Media Servers (as a
minimum requirement the tape drives used at the DR site must be read compatible with
the tapes from the production site and must be configured as the same media type in
NetBackup).
4. Appropriate ‘failover restore Media Server’ settings have been configured to allow backups
written to the Media Servers at the production site to be restored using the Media Servers
at the DR site.
5. In the case of the partial recovery approach a ‘non-scratch’ Media Pool that is not used
by any backup policy has been created and bar code rules configured to ensure that the
backup tapes are automatically added to that pool.
6. If the library type used at the DR site is different to the one used at the production site
ensure that the bar code masking operates in the same way (i.e. that trailing characters
are removed where appropriate) and, if necessary, configure rules to manage this.
7. Either:
a. If the original backup tapes are used for DR purposes ensure they have been
loaded in the tape libraries at the DR site.
b. If backups have been duplicated to secondary tapes for DR purposes ensure
these ‘offsite’ tapes have been loaded in the tape libraries and the
ALT_RESTORE_COPY_NUMBER file has been created with the appropriate
copy number in it.
Note: Wherever possible it is strongly recommended that tapes are physically write
locked before being placed in libraries at the DR site to reduce the risk of accidental
overwriting of valid backups.
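As an illustration of item 7b above, the following is a minimal sketch of how the ALT_RESTORE_COPY_NUMBER file might be created on a UNIX or Linux Master Server. The function name write_alt_restore_copy is hypothetical, and the default directory /usr/openv/netbackup is an assumption about the install path that should be confirmed for your platform and NetBackup version.

```shell
#!/bin/sh
# Hypothetical helper: record which backup copy restores should read from.
# Usage: write_alt_restore_copy <copy_number> [netbackup_dir]
# The default directory below is an assumption; verify it for your install.
write_alt_restore_copy() {
    copy_number=$1
    nb_dir=${2:-/usr/openv/netbackup}

    # Reject empty or non-numeric input.
    case $copy_number in
        ''|*[!0-9]*)
            echo "copy number must be a positive integer" >&2
            return 1
            ;;
    esac

    # Copy numbers start at 1.
    if [ "$copy_number" -lt 1 ]; then
        echo "copy number must be 1 or greater" >&2
        return 1
    fi

    if [ ! -d "$nb_dir" ]; then
        echo "directory not found: $nb_dir" >&2
        return 1
    fi

    # The file contains only the copy number.
    printf '%s\n' "$copy_number" > "$nb_dir/ALT_RESTORE_COPY_NUMBER"
}
```

For example, if the offsite tapes hold copy 2 of each backup, running write_alt_restore_copy 2 would write the value 2 to the file in the default directory.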
4.0 Full Catalog recovery approach
This section details the steps to be followed before and after recovering the catalog backup
where a full catalog recovery is carried out.
In this approach the complete catalog backup is recovered to the DR Master Server and the
Media Servers that do not exist in the DR environment are deactivated to avoid unnecessary
polling. As the device configuration at the DR site is likely to be different to the production site all
device records are removed and device discovery is run to update the EMM database.
The following steps must be carried out before restores can be started and should be clearly
called out in any DR plan:
1. On UNIX and Linux Master Servers create copies of the bp.conf and vm.conf files – these
will be overwritten by the next step of the process.
2. Recover the entire catalog using the bprecover command.
Note: The Master Server must have the same name and topology as the production
Master Server (i.e. if the production Master Server is a cluster the DR Master Server
must also be a cluster although the number of member nodes and the names of those
nodes can be different).
Note: If a hot catalog backup is used that was created on a separate Media Server a
Media Server with the same name is required for the catalog recovery.
3. Deactivate all backup policies to prevent backups from starting automatically. This must
be done manually either via the GUI or the CLI (bpplinfo <policy> -set -inactive).
4. Shut down NetBackup
5. On UNIX and Linux Master Servers replace the bp.conf and vm.conf files restored from
the catalog backup with the copies created in step 1 above.
6. Start only the Sybase ASA, NetBackup PBX and EMM services on the new Master Server.
a. On UNIX/Linux Master Servers run the following commands:
/usr/openv/netbackup/bin/nbdbms_start_stop start
/opt/VRTSpbx/bin/pbx_exchange
/usr/openv/netbackup/bin/nbemm
b. On Windows Master Servers start the following Windows services:
Adaptive Server Anywhere – VERITAS_NB
Symantec Private Branch Exchange
NetBackup Enterprise Media Manager
Note: The PBX process may already be running as it is not stopped and started by the
NetBackup startup and shutdown commands.
Note: For NetBackup versions 6.5.3 and above the qualifier ‘-maintenance’ should be
used on the nbemm command
7. Deactivate any Media Servers that do not form part of the DR environment using the
command:
nbemmcmd -updatehost -machinename <Media Server> -machinestateop
set_admin_pause -machinetype media -masterserver <Master Server>
8. Delete all tape devices from the EMM database using the command:
nbemmcmd -deletealldevices -allrecords
9. Restart NetBackup
10. Run the device configuration wizard to create the new tape drive and library
configuration.
11. If bar code masking rules were put in place at step 6 of the site preparation ensure that
the same rules are set at this time and, if necessary, add them.
12. Use the GUI to verify all the recovery media are set to non-robotic. It is extremely likely
that some media will be set to robotic, particularly if the primary media are being used for
recovery. To set media to non-robotic select all the robotic media, right click and select
‘Move’. Change the robot field to ‘standalone’ and click OK to save the changes. Once all
the recovery media are set to non-robotic, inventory all the tape libraries. This will
ensure that the media are identified in the correct library.
It should now be possible to start restore and recovery operations of client data backed up at the
production data center.
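Steps 7 and 8 above are easy to get wrong under pressure, so it can help to generate the exact commands in advance and review them before running anything. The sketch below only prints the nbemmcmd commands given in those steps and does not execute them. The function name print_emm_cleanup and the host names in the usage note are hypothetical.

```shell
#!/bin/sh
# Print (do not execute) the EMM clean-up commands from steps 7 and 8.
# Usage: print_emm_cleanup <master_server> [media_server ...]
# List only the Media Servers that do NOT exist at the DR site.
print_emm_cleanup() {
    if [ "$#" -lt 1 ]; then
        echo "usage: print_emm_cleanup <master_server> [media_server ...]" >&2
        return 1
    fi

    master=$1
    shift

    # Step 7: one deactivation command per absent Media Server.
    for media in "$@"; do
        printf 'nbemmcmd -updatehost -machinename %s -machinestateop set_admin_pause -machinetype media -masterserver %s\n' \
            "$media" "$master"
    done

    # Step 8: remove all device records so device discovery can rebuild them.
    printf 'nbemmcmd -deletealldevices -allrecords\n'
}
```

For example, print_emm_cleanup master01 media-prod-01 media-prod-02 prints two deactivation commands followed by the device deletion command. The output can be pasted into the DR plan or reviewed and then run by hand once only the Sybase ASA, PBX and EMM services are running.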
5.0 Partial catalog recovery approach
This section details the steps to be followed before and after recovering the catalog backup
where a partial catalog recovery is carried out.
This approach uses a technique referred to as ‘recovery without import’ in which the EMM
database is not restored from the catalog backup. The technique relies on the fact that restore
operations do not need tapes to be assigned or located in specific media pools. Provided a tape
exists in EMM and can thus be mounted and read by NetBackup it can be restored from.
The following steps must be carried out before restores can be started:
1. On UNIX and Linux Master Servers create copies of the bp.conf and vm.conf files – these
will be overwritten by the next step of the process.
2. Recover only the NetBackup catalog image and configuration files using the bprecover
command.
a. For hot catalog backups use the bprecover -wizard or GUI option and select the
partial recovery when prompted.
b. For cold catalog backups use the bprecover -r command and select only the
/usr/openv/netbackup/db (or Windows equivalent) path.
Note: The Master Server must have the same name as the production Master Server
Note: If a hot catalog backup is used that was created on a separate Media Server a
Media Server with the same name is required for the catalog recovery.
3. Deactivate all backup policies to prevent backups from starting automatically. This must
be done manually either via the GUI or the CLI (bpplinfo <policy> -set -inactive).
4. Shut down NetBackup
5. On UNIX and Linux Master Servers replace the bp.conf and vm.conf files restored from
the catalog backup with the copies created in step 1 above.
6. Start NetBackup
7. Inventory all the tape libraries to ensure that the tapes have been added to the non-scratch
media pool. This step prevents tapes from being accidentally overwritten by
active backup policies at a later time.
It should now be possible to start restore and recovery operations of client data backed up at the
production data center.
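Steps 1 and 5 above (saving the DR site's bp.conf and vm.conf before the recovery and putting them back afterwards) can be sketched as a pair of shell functions. This is an illustration, not a supported tool: the function names and the SAVE_DIR holding directory are hypothetical, and the default file locations are the usual UNIX/Linux paths, which should be checked against your installation.

```shell
#!/bin/sh
# Usual UNIX/Linux locations; override these variables for non-standard installs.
BP_CONF=${BP_CONF:-/usr/openv/netbackup/bp.conf}
VM_CONF=${VM_CONF:-/usr/openv/volmgr/vm.conf}
# Hypothetical holding directory, outside the paths touched by catalog recovery.
SAVE_DIR=${SAVE_DIR:-/var/tmp/nbu_dr_conf}

# Step 1: copy the DR site's configuration files aside before running bprecover.
save_conf() {
    mkdir -p "$SAVE_DIR" || return 1
    for f in "$BP_CONF" "$VM_CONF"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        cp -p "$f" "$SAVE_DIR/" || return 1
    done
}

# Step 5: with NetBackup shut down, put the saved copies back over the
# production versions restored from the catalog backup.
restore_conf() {
    for f in "$BP_CONF" "$VM_CONF"; do
        saved=$SAVE_DIR/$(basename "$f")
        if [ ! -f "$saved" ]; then
            echo "no saved copy: $saved" >&2
            return 1
        fi
        cp -p "$saved" "$f" || return 1
    done
}
```

In use, save_conf would be run before the catalog recovery and restore_conf after NetBackup has been shut down in step 4. Both functions stop at the first missing file rather than continuing with a partial set.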