NetBackup™ for Hadoop Administrator's Guide
Release 10.4
Last updated: 2024-03-26
Legal Notice
Copyright © 2024 Veritas Technologies LLC. All rights reserved.
Veritas, the Veritas Logo, Veritas Alta, and NetBackup are trademarks or registered trademarks
of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may
be trademarks of their respective owners.
This product may contain third-party software for which Veritas is required to provide attribution
to the third party (“Third-party Programs”). Some of the Third-party Programs are available
under open source or free software licenses. The License Agreement accompanying the
Software does not alter any rights or obligations you may have under those open source or
free software licenses. Refer to the Third-party Legal Notices document accompanying this
Veritas product or available at:
https://www.veritas.com/about/legal/license-agreements
The product described in this document is distributed under licenses restricting its use, copying,
distribution, and decompilation/reverse engineering. No part of this document may be
reproduced in any form by any means without prior written authorization of Veritas Technologies
LLC and its licensors, if any.
The Licensed Software and Documentation are deemed to be commercial computer software
as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19
"Commercial Computer Software - Restricted Rights" and DFARS 227.7202, et seq.
"Commercial Computer Software and Commercial Computer Software Documentation," as
applicable, and any successor regulations, whether delivered by Veritas as on premises or
hosted services. Any use, modification, reproduction, release, performance, display or disclosure
of the Licensed Software and Documentation by the U.S. Government shall be solely in
accordance with the terms of this Agreement.
http://www.veritas.com
Technical Support
Technical Support maintains support centers globally. All support services will be delivered
in accordance with your support agreement and the then-current enterprise technical support
policies. For information about our support offerings and how to contact Technical Support,
visit our website:
https://www.veritas.com/support
You can manage your Veritas account information at the following URL:
https://my.veritas.com
If you have questions regarding an existing support agreement, please email the support
agreement administration team for your region as follows:
Japan [email protected]
Documentation
Make sure that you have the current version of the documentation. Each document displays
the date of the last update on page 2. The latest documentation is available on the Veritas
website:
https://sort.veritas.com/documents
Documentation feedback
Your feedback is important to us. Suggest improvements or report errors or omissions to the
documentation. Include the document title, document version, chapter title, and section title
of the text on which you are reporting. Send feedback to:
You can also see documentation information or ask a question on the Veritas community site:
http://www.veritas.com/community/
[Figure: NetBackup for Hadoop backup architecture. The NameNode and DataNodes (DataNode 1 through DataNode n) of the Hadoop cluster communicate with one or more backup hosts. The backup hosts work with the NetBackup primary server, media server, and storage. The BigData policy uses Application_Type=hadoop.]
Note: All the directories specified in NetBackup for Hadoop backup selection must
be snapshot-enabled before the backup.
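For example, a Hadoop administrator can enable snapshots on a backup selection directory by running the following HDFS command on the Hadoop cluster; the path /data/1 is illustrative:
hdfs dfsadmin -allowSnapshot /data/1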
[Figure: Backup workflow. Individual child jobs run on the backup hosts, and data from the DataNodes of the snapshot-enabled Hadoop cluster is backed up to storage in parallel streams as specified in the workload distribution files.]
4. A workload discovery file is created on the backup host. The workload discovery
file contains the details of the data that needs to be backed up from the different
DataNodes.
5. The backup host uses the workload discovery file and decides how the workload
is distributed amongst the backup hosts. Workload distribution files are created
for each backup host.
6. An individual child job is executed for each backup host. Data is backed up as specified
in the workload distribution files.
7. Data blocks are streamed simultaneously from different DataNodes to multiple
backup hosts.
The compound backup job is not completed until all the child jobs are completed.
After the child jobs are completed, NetBackup cleans up all the snapshots from the
NameNode. The compound backup job is marked complete only after the cleanup
activity finishes.
See “About backing up a NetBackup for Hadoop cluster” on page 42.
[Figure: Restore workflow for a Hadoop cluster (snapshot enabled).]
1. The restore job is triggered from the primary server.
2. The backup host connects with the NameNode. The backup host is also the
destination client.
3. The actual data restore from the storage media starts.
4. The data blocks are restored on the DataNodes.
See “About restoring a NetBackup for Hadoop cluster” on page 44.
Terminology Definition
Compound job: A backup job for NetBackup for Hadoop data is a compound job.
■ The backup job runs a discovery job for getting information of the data to be backed up.
■ Child jobs are created for each backup host that performs the actual data transfer.
■ After the backup is complete, the job cleans up the snapshots on the NameNode and is then marked complete.
Discovery job: When a backup job is executed, a discovery job is created first. The discovery job communicates with the NameNode and gathers information about the blocks that need to be backed up and the associated DataNodes. At the end of the discovery, the job populates a workload discovery file that NetBackup then uses to distribute the workload amongst the backup hosts.
Child job: For backup, a separate child job is created for each backup host to transfer data to the storage media. A child job can transfer data blocks from multiple DataNodes.
Workload discovery file: During discovery, when the backup host communicates with the NameNode, a workload discovery file is created. The file contains information about the data blocks to be backed up and the associated DataNodes.
Parallel streams: The NetBackup parallel streaming framework allows data blocks from multiple DataNodes to be backed up using multiple backup hosts simultaneously.
Backup host: The backup host acts as a proxy client. All the backup and restore operations are executed through the backup host.
Fail-over NameNode: In a high-availability scenario, the NameNodes other than the primary NameNode that are updated in the hadoop.conf file are referred to as fail-over NameNodes.
DataNode: A DataNode is responsible for storing the actual data in the Hadoop cluster.
Limitations
Review the following limitations before you deploy the NetBackup for Hadoop
plug-in:
■ Only RHEL and SUSE platforms are supported for backup hosts. For platforms
supported for Hadoop clusters, see the NetBackup Database and Application
Agent Compatibility List.
■ Delegation Token authentication method is not supported for NetBackup for
Hadoop clusters.
■ Hadoop plug-in does not capture Extended Attributes (xattrs) or Access Control
Lists (ACLs) of an object during backup and hence these are not set on the
restored files or folders.
■ For a highly available NetBackup for Hadoop cluster, if a fail-over happens during
a backup or restore operation, the job fails.
■ If you cancel a backup job manually while the discovery job for a backup
operation is in progress, the snapshot entry does not get removed from the
Hadoop web graphical user interface (GUI).
■ If the CRL expires during the backup of an HTTPS-based Hadoop cluster, the
backup runs partially.
■ If you have multiple CRL-based Hadoop clusters, ensure that you add different
backup hosts for every cluster.
■ Backup and restore operations are not supported with Kerberos authentication
if NB_FIPS_MODE is enabled in the bp.conf file.
Task Reference
Prerequisites and requirements: See "Prerequisites for the NetBackup for Hadoop plug-in" on page 16.
Preparing the Hadoop cluster: See "Preparing the NetBackup for Hadoop cluster" on page 16.
Best practices: See "Best practices for deploying the NetBackup for Hadoop plug-in" on page 17.
■ For a Hadoop cluster that uses CRL, ensure that the CRL is valid and not
expired.
■ Configuring the NetBackup for Hadoop plug-in using the NetBackup for Hadoop
configuration file
Task Reference
Configuring the NetBackup for Hadoop plug-in using the NetBackup for Hadoop configuration file:
See "Configuring the NetBackup for Hadoop plug-in using the NetBackup for Hadoop configuration file" on page 24.
See "Configuring NetBackup for a highly-available NetBackup for Hadoop cluster" on page 26.
See "Configuring number of threads for backup hosts" on page 29.
See "Configuring distribution algorithm and golden ratio for backup hosts" on page 30.
Configuring the backup hosts for NetBackup for Hadoop clusters that use Kerberos:
See "Configuration for a NetBackup for Hadoop cluster that uses Kerberos" on page 38.
Configuring NetBackup policies for NetBackup for Hadoop plug-in:
See "Create a BigData policy for Hadoop clusters" on page 39.
The backup host performs all the backup and restore operations and does not require that a
separate agent be installed on the Hadoop cluster.
The backup host must be a Linux computer. NetBackup 10.4 release supports only
RHEL and SUSE platforms as a backup host.
The backup host can be a NetBackup client or a media server or a primary server.
NetBackup recommends that you have a media server as a backup host.
Consider the following before adding a backup host:
■ For backup operations, you can add one or more backup hosts.
■ For restore operations, you can add only one backup host.
■ A primary, media, or client can perform the role of a backup host.
■ The Hadoop plug-in for NetBackup must be installed on all the backup hosts.
For UNIX:
/usr/openv/var/global/bin/admincmd/bpplinclude PolicyName -add
'Backup_Host=IP_address or hostname'
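For example, to add a backup host to a policy; the policy name BigDataPolicy and the hostname backuphost1.example.com are illustrative:
/usr/openv/var/global/bin/admincmd/bpplinclude BigDataPolicy -add 'Backup_Host=backuphost1.example.com'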
4 As a best practice, add the entries of all the NameNodes and DataNodes to
the /etc/hosts file on all the backup hosts. You must add the host name in
FQDN format.
OR
Add the appropriate DNS entries in the /etc/resolv.conf file.
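For example, with illustrative FQDNs and IP addresses, the /etc/hosts entries can look like the following:
10.20.30.10 namenode1.example.com
10.20.30.21 datanode1.example.com
10.20.30.22 datanode2.example.com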
For UNIX:
/usr/openv/var/global/bin/admincmd/bpplinclude PolicyName -delete
'Backup_Host=IP_address or hostname'
■ For Windows
The directory path to the command:
<Install_Path>\NetBackup\bin\admincmd\bpsetconfig
bpsetconfig -h primaryserver
bpsetconfig> APP_PROXY_SERVER = clientname1.domain.org
bpsetconfig> APP_PROXY_SERVER = clientname2.domain.org
bpsetconfig>
Windows systems: Press <ctl-Z> to save the entries and exit.
Adding NetBackup for Hadoop credentials in NetBackup
Consider the following when you add NetBackup for Hadoop credentials:
■ For a highly-available NetBackup for Hadoop cluster, ensure that the user for
the primary and fail-over NameNode is the same.
■ Use the credentials of the application server that you will use when configuring
the BigData policy.
■ For a NetBackup for Hadoop cluster that uses Kerberos, specify "kerberos" as
the application_server_user_id value.
■ The hostname and port of the NameNode must be the same as the values that you
specified with the http address parameter in the core-site.xml file of the NetBackup
for Hadoop cluster.
■ For password, provide any random value. For example, Hadoop.
To add Hadoop credentials in NetBackup
1 Run tpconfig command from the following directory paths:
On UNIX systems, /usr/openv/volmgr/bin/
On Windows systems, install_path\Volmgr\bin\
2 Run the tpconfig --help command. A list of options which are required to
add, update, and delete Hadoop credentials is displayed.
3 Run the tpconfig -add -application_server application_server_name
-application_server_user_id user_ID -application_type
application_type -requiredport IP_port_number [-password password
[-key encryption_key]] command by providing appropriate values for each
parameter to add Hadoop credentials.
For example, if you want to add credentials for Hadoop server which has
application_server_name as hadoop1, then run the following command using
the appropriate <user_ID> and <password> details.
tpconfig -add -application_server hadoop1 -application_type hadoop
-application_server_user_id Hadoop -requiredport 50070 -password
Hadoop
Note: You must not provide a blank value for any of the parameters, or the backup
job fails.
Ensure that you configure all the required parameters to run the backup and restore
operations successfully.
Note: For a non-HA environment, the fail-over parameters are not required.
{
"application_servers":
{
"hostname_of_the_primary_namenode":
{
"failover_namenodes":
[
{
"hostname":"hostname_of_failover_namenode",
"port":port_of_the_failover_namenode
}
],
"port":port_of_the_primary_namenode,
"distro_algo": distribution_algorithm,
"num_streams": number_of_streams
}
},
"number_of_threads":number_of_threads
}
{
"application_servers":
{
"hostname_of_primary_namenode1":
{
"failover_namenodes":
[
{
"hostname": "hostname_of_failover_namenode1",
"port": port_of_failover_namenode1
}
],
"port":port_of_primary_namenode1
}
}
}
2 If you have multiple NetBackup for Hadoop clusters, use the same hadoop.conf
file to update the details. For example,
{
"application_servers":
{
"hostname_of_primary_namenode1":
{
"failover_namenodes":
[
{
"hostname": "hostname_of_failover_namenode1",
"port": port_of_failover_namenode1
}
],
"port"::port_of_primary_namenode1
},
"hostname_of_primary_namenode2":
{
"failover_namenodes":
[
{
"hostname": "hostname_of_failover_namenode2",
"port": port_of_failover_namenode2
}
],
"port":port_of_primary_namenode2
}
}
}
3 Copy this file to the following location on all the backup hosts:
/usr/openv/var/global/
{
"application_servers": {
"hostname_of_namenode1":{
"port":port_of_namenode1
}
}
}
2 Copy this file to the following location on all the backup hosts:
/usr/openv/var/global/
{
"number_of_threads": number_of_threads
}
{
"num_streams": number_of_streams
}
Note: If you increase the number of streams, update the maximum number of jobs
per client, the storage unit (STU) setting for multiple threads, and the client timeout
to avoid abrupt failures.
Specify the golden ratio that fits your deployment. The supported range for the golden
ratio is from 1 to 100. If a value is not provided, the default of 75 is used.
{
"distro_algo": distribution_algorithm,
"golden_ratio": golden_ratio
}
{
"application_servers":
{
"hostname_of_namenode1":
{
"use_ssl":true
}
}
}
{
"application_servers":
{
"primary.host.com":
{
"use_ssl":true,
"failover_namenodes":
[
{
"hostname":"secondary.host.com",
"use_ssl":true,
"port":11111
}
]
}
}
}
ECA_TRUST_STORE_PATH Specifies the file path to the certificate bundle file that contains
all trusted root CA certificates.
If you have not configured the option, add all the required
Hadoop server CA certificates to the trust store and set the
option.
If you have not configured the option, add all the required
CRLs to the CRL cache and then set the option.
Set this value to YES when you have set the use_ssl as
true in the hadoop.conf file. The single value is applicable
to all Hadoop clusters when use_ssl is set to true.
HADOOP_CRL_CHECK Lets you validate the revocation status of the Hadoop server
certificate against the CRLs.
■ A file containing the PEM encoded certificates of the trusted root certificate
authorities that are concatenated together.
This option is mandatory for file-based certificates.
The root CA certificate in a Cloudera distribution can be obtained from the Cloudera
administrator. The Hadoop cluster may have a manual TLS configuration or Auto-TLS
enabled. In both cases, NetBackup needs the root CA certificate from the administrator.
The root CA certificate from the Hadoop cluster can validate the certificates for all
nodes and allow NetBackup to run the backup and restore process in case of the
secure (SSL) cluster. This root CA certificate is a bundle of certificates that has
been issued to all such nodes.
The certificate from the root CA must be configured under ECA_TRUST_STORE_PATH in
the case of self-signed, third-party CA, or local/intermediate CA environments.
For example, in Auto-TLS enabled Cloudera environments, you can typically find the
root CA file named cm-auto-global_cacerts.pem at the path
/var/lib/cloudera-scm-agent/agent-cert. For more details, refer to the Cloudera
documentation.
Note: For validating the revocation status of a virtualization server certificate, the
VIRTUALIZATION_CRL_CHECK option is used.
For example:
ECA_CRL_PATH = /usr/eca/crl/eca_crl_file.crl
How to use: Use the nbgetconfig and the nbsetconfig commands to view, add, or
change the option.
HADOOP_SECURE_CONNECT_ENABLED = YES
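For example, on a UNIX backup host you can set an option by piping it to nbsetconfig; this is a minimal sketch that assumes the default installation path:
echo "HADOOP_SECURE_CONNECT_ENABLED = YES" | /usr/openv/netbackup/bin/nbsetconfig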
ECA_TRUST_STORE_PATH=/tmp/cacert.pem
ECA_CRL_PATH=/tmp/backuphostdirectory
HADOOP_SECURE_CONNECT_ENABLED=YES/NO
HADOOP_CRL_CHECK=DISABLE / LEAF / CHAIN
Configuration for a NetBackup for Hadoop cluster that uses Kerberos
■ Acquire the keytab file and copy it to a secure location on the backup host.
■ Ensure that the keytab has the required principal.
■ Manually update the krb5.conf file with the appropriate KDC server and realm
details.
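For example, a minimal krb5.conf might contain entries similar to the following; the realm EXAMPLE.COM and the KDC host kdc.example.com are illustrative:
[libdefaults]
    default_realm = EXAMPLE.COM
[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }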
"application_servers": {
"punnbuucsm5b29-v14.vxindia.veritas.com": {
"port": 9000,
"distro_algo": 4,
"num_streams": 2,
"golden_ratio": 80,
"additionalBackupHosts": ["bh1.vxindia.veritas.com", "bh2.vxindia.veritas.com
}
},
"number_of_threads": 10
}
------------
num_streams: To enhance the restore performance, you can configure the number
of streams that each backup host can allow. The default value is 1.
additionalBackupHosts: To enhance the restore performance, you can configure
additional backup host details. You can specify the hostnames of additional backup
hosts.
Notes:
■ You must keep additionalBackupHosts empty if no additional backup hosts
are available.
■ The hadoop.conf configuration must be the same on all the backup hosts.
■ The num_streams configuration must be the same for the backup and restore processes.
■ Hadoop setups and NetBackup setups must be in the same time zone.
■ If you increase streams, adjust the maximum number of jobs per client, update
the storage unit (STU) setting for multiple threads, and update the client timeout to
avoid abrupt failures.
Note: The host name and port of the NameNode must be the same as the values
that you specified with the HTTP address parameter in the core-site.xml of the
NetBackup for Hadoop cluster.
Note: The directory or folder that is specified for the backup selection when
you define a BigData policy with Application_Type=hadoop must not contain
a space or a comma in its name.
8 Click Create.
For more information on using NetBackup for BigData applications, refer to the
Veritas NetBackup documentation page.
Task: After the NetBackup for Hadoop cluster and nodes are up, prepare the cluster for operations with NetBackup.
Description: Update firewall settings so that the backup hosts can communicate with the NetBackup for Hadoop cluster.
Task: The backup hosts use the hadoop.conf file to save the configuration settings of the NetBackup for Hadoop plug-in. You need to create a separate file for each backup host and copy it to /usr/openv/var/global/. You need to create the hadoop.conf file in JSON format.
Description: With this release, the following plug-in settings can be configured:
■ See "Configuring NetBackup for a highly-available NetBackup for Hadoop cluster" on page 26.
■ See "Configuring number of threads for backup hosts" on page 29.
Task: Update the BigData policy with the original NameNode name.
Description: See "Create a BigData policy for Hadoop clusters" on page 39.
Chapter 4
Performing backups and
restores of Hadoop
This chapter includes the following topics:
Task Reference
(Optional) Complete the prerequisites for Kerberos: See "Prerequisites for running backup and restore operations for a NetBackup for Hadoop cluster with Kerberos authentication" on page 43.
Best practices: See "Best practices for backing up a NetBackup for Hadoop cluster" on page 43.
Troubleshooting tips: For discovery and cleanup related logs, review the following log file on the first backup host that triggered the discovery:
/usr/openv/var/global/logs/nbaapidiscv
For data transfer related logs, search for the corresponding backup host (using the hostname) in the log files on the primary server.
Note: During the backup and restore operations, the TGT must be valid. Thus,
specify the TGT validity accordingly or renew it when required during the operation.
For example,
kinit -k -t /usr/openv/var/global/nbusers/hdfs_mykeytabfile.keytab
[email protected]
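To verify that the TGT is valid and to check its expiry time, run the klist command on the backup host:
klist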
■ Ensure that the local time on the HDFS nodes and the backup host are
synchronized with the NTP server.
■ Ensure that you have valid certificates for a Hadoop cluster that is enabled with
SSL (HTTPS).
Task Reference
Complete the prerequisites for Kerberos: See "Prerequisites for running backup and restore operations for a NetBackup for Hadoop cluster with Kerberos authentication" on page 43.
Task Reference
Restoring NetBackup for Hadoop data on the same NameNode or NetBackup for Hadoop cluster: See "Restore Hadoop data on the same Hadoop cluster" on page 46.
Restoring NetBackup for Hadoop data to an alternate NameNode or NetBackup for Hadoop cluster: See "Restoring Hadoop data on an alternate Hadoop cluster" on page 47.
Best practices: See "Best practices for restoring a Hadoop cluster" on page 45.
Troubleshooting tips: See "Troubleshooting restore issues for NetBackup for Hadoop data" on page 58.
■ From the left directory hierarchy, select the files and folders to restore.
Note: All the subsequent files and folders under the directory are displayed
in the right pane.
■ Click Next.
7 On the Review tab, verify the details and click Start recovery.
Note: Make sure that you have added the credentials for the alternate NameNode
or Hadoop cluster on the NetBackup primary server and also completed the allowlisting
tasks on the NetBackup primary server. For more information about how to add Hadoop
credentials in NetBackup and the allowlisting procedures, see "Adding NetBackup for
Hadoop credentials in NetBackup" on page 23 and "Including a NetBackup client
on NetBackup primary server allowed list" on page 22.
Parameter Value
listfile: Specifies a file (listfile) that contains a list of files to be restored and can be used instead of the file names option. In listfile, list each file path on a separate line.
-L progress_log: Specifies the name of an allowlisted file path in which to write progress information.
-t 44: Specifies the policy type. For Hadoop, the policy type is 44 (BigData).
rename_file: Specifies the name of a file with name changes for alternate-path restores. Use the following form for entries in the rename file:
change backup_filepath to restore_filepath
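For example, the following rename file entry restores /data/1 to an alternate path; both paths are illustrative:
change /data/1 to /data/1_restored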
ALT_APPLICATION_SERVER=<Application Server Name>
Note: Ensure that you have allowlisted all the file paths, such as
<rename_file_path> and <progress_log_path>, that are not already included as
part of the NetBackup install path.
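For example, a restore command that combines these parameters might look like the following. The server, client, and file names are illustrative; verify the options against the NetBackup Commands Reference Guide for your release.
/usr/openv/netbackup/bin/bprestore -S primaryserver.example.com -D backuphost1.example.com -C namenode1.example.com -t 44 -L /tmp/hadoop_restore.log -R /tmp/hadoop_rename_file -f /tmp/hadoop_restore_filelist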
Table 4-3 Example for a large number of small files and a small number of large files
Up to 1 TB 4 16 5 4 80
Up to 50 TB 5 32 5 4 80
>50 TB 6 32 5 4 80
For more details, refer to the Apache Hadoop documentation for secure mode.
Additionally for optimal performance, ensure the following:
■ Primary server is not used as a backup host.
■ In case of multiple policies scheduled to be triggered in parallel:
■ Avoid using the same discovery host in all policies.
Area References
General logging and debugging: See "About NetBackup for Hadoop debug logging" on page 53.
Backup issues: See "Troubleshooting backup issues for NetBackup for Hadoop data" on page 53.
Restore issues: See "Troubleshooting restore issues for NetBackup for Hadoop data" on page 58.
To avoid issues, also review the best practices: See "Best practices for deploying the NetBackup for Hadoop plug-in" on page 17. See "Best practices for backing up a NetBackup for Hadoop cluster" on page 43.
Extended attributes (xattrs) and Access Control Lists (ACLs) are not
backed up or restored for Hadoop
Extended attributes allow user applications to associate additional metadata with
a file or directory in Hadoop. By default, this is enabled on Hadoop Distributed File
System (HDFS).
Access Control Lists provide a way to set different permissions for specific named
users or named groups, in addition to the standard permissions. By default, this is
disabled on HDFS.
Hadoop plug-ins do not capture extended attributes or Access Control Lists (ACLs)
of an object during backup and hence these are not set on the restored files or
folders.
Workaround:
If the extended attributes are set on any of the files or directories that are backed up
using the BigData policy with Application_Type = hadoop, then you have to
explicitly set the extended attributes on the restored data.
Extended attributes can be set using the Hadoop shell commands such as hadoop fs
-getfattr and hadoop fs -setfattr.
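For example (the path and attribute name are illustrative):
hadoop fs -getfattr -d /data/1
hadoop fs -setfattr -n user.department -v finance /data/1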
If the Access Control Lists (ACLs) are enabled and set on any of the files or
directories that are backed up using the BigData policy with Application_Type =
hadoop, then you have to explicitly set the ACLs on the restored data.
ACLs can be set using the Hadoop shell commands such as hadoop fs -getfacl
and hadoop fs -setfacl.
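For example (the path, user, and permissions are illustrative):
hadoop fs -getfacl /data/1
hadoop fs -setfacl -m user:hdfsuser1:rwx /data/1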
Verify that the backup host has a valid Ticket Granting Ticket (TGT) in the case of a
Kerberos-enabled NetBackup for Hadoop cluster.
Workaround:
Renew the TGT.
Workaround:
Verify the hadoop.conf file to ensure that blank values or incorrect syntax are not
used with the parameter values.
If custom configuration files for Hadoop and HBase get deleted after a restart, you
can manually create the files at the following location:
■ Hadoop: /usr/openv/var/global/hadoop.conf
■ HBase: /usr/openv/var/global/hbase.conf
You can store the CA certificate that has signed the Hadoop or HBase SSL certificate
and CRL at the following location:
/usr/openv/var/global/
/data/1
/data/2
Workaround
To view the available data that can be restored from an incremental backup image,
select the related full backup images along with the incremental backup images.
■ See “NetBackup restore job for NetBackup for Hadoop completes partially”
on page 59.
■ See “Extended attributes (xattrs) and Access Control Lists (ACLs) are not backed
up or restored for Hadoop” on page 55.
■ See “Restore operation fails when Hadoop plug-in files are missing on the backup
host” on page 60.
■ See “Restore fails with bpbrm error 54932” on page 60.
■ See “Restore operation fails with bpbrm error 21296” on page 60.
Extended attributes (xattrs) and Access Control Lists (ACLs) are not
backed up or restored for Hadoop
For more information about this issue, See “Extended attributes (xattrs) and Access
Control Lists (ACLs) are not backed up or restored for Hadoop” on page 55.
Restore operation fails when Hadoop plug-in files are missing on the
backup host
When a restore job is triggered on a backup host which does not have Hadoop
plug-in files installed, the restore operation fails with the following error:
■ Ensure that with the current Kerberos user, it is possible to set the owners and ACLs
manually using HDFS commands, such as chown and setfacl.
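For example (the path, user, and group names are illustrative):
hdfs dfs -chown hdfsuser1:hadoopgroup /data/1
hdfs dfs -setfacl -m user:hdfsuser1:rwx /data/1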
For more information, see the NetBackup for Hadoop Administrator's Guide.
{
"application_servers":
{
"primary.host.com":
{
"use_ssl":true
"failover_namenodes":
[
{
"hostname":"secondary.host.com",
"use_ssl":true
"port":11111
}
],
"port":11111
}
},
"number_of_threads":5
}