Cluster Admin Guide
Introduction
This document summarizes the HPC solution, including the architectural diagram, configuration, and management of the AMOGH HPC Cluster implemented at ADA Bangalore.
The supercomputer AMOGH is based on a cluster of Dell servers from Dell India Private Ltd. The master and login nodes are Dell PowerEdge R640 servers, and the compute nodes are Dell PowerEdge C6420 servers with Intel Xeon Gold 6138 2.0 GHz processors. The system was implemented by Locuz Enterprise Solutions Ltd. together with the partner companies Dell and DDN.
The total setup comprises 256 nodes of Dell PowerEdge C6420 and PowerEdge R640 hardware, covering the master and login nodes as well as high-memory and normal-memory compute nodes. Two login nodes are provided for users to log in and submit jobs. Two master nodes are configured in high-availability active/passive mode. The DDN Lustre parallel file system is configured on DDN storage across 6 servers. The network consists of 7 Dell GigE switches for OS communication and hardware management, and 16 Mellanox EDR InfiniBand switches configured for MPI job communication and to serve the Lustre file system to all compute nodes.
The compute nodes differ in their architecture; the compute nodes are listed below according to their architecture.
A total of 7 Gigabit Ethernet switches are used for OS communication and hardware management. Gigabit Ethernet requires no additional modules or libraries. All Ethernet switches are interconnected over 1G Ethernet.
A total of 16 InfiniBand switches are configured in a 2:1 fat-tree topology. EDR InfiniBand is a high-performance switched fabric characterized by high throughput and low latency.
Logical Connectivity
4 Spine Switches
12 Leaf Switches
2:1 Fat-Tree Topology
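The fabric topology can be cross-checked from any node with an active InfiniBand port; a minimal sketch, assuming the standard infiniband-diags tools are installed (part of the OFED/MLNX_OFED stack):
$ ibswitches -> list all switch nodes discovered on the fabric
$ iblinkinfo -> show per-port link state, width and speed
$ ibnetdiscover > topology.out -> dump the full subnet topology for comparison against the spine/leaf design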
Management IP 10.1.129.3/23
InfiniBand IP 10.3.2.3/23
Hostname ada002
Management IP 10.1.129.4/23
InfiniBand IP 10.3.2.4/23
Partition name   Size
/boot            1 GB
/                1.3 TB
swap             64 GB
/var             600 GB
/tmp             300 GB
/shared          4.4 TB (iSCSI storage in active/passive mode for HA)
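The layout in the table above can be verified on the node itself; a minimal sketch using standard Linux tools (device names in the output are site-specific):
$ lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT -> show block devices, sizes and mount points
$ df -hT / /boot /var /tmp -> confirm sizes and file system types of the listed partitions
$ swapon --show -> confirm the 64 GB swap area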
Management IP 10.1.129.5/16
InfiniBand IP 10.3.2.5/16
Hostname ada004
InfiniBand IP 10.3.2.6/16
Partition name   Size
/boot            1 GB
/                1.3 TB
swap             64 GB
/var             600 GB
/tmp             300 GB
/shared          4.4 TB (iSCSI storage in active/passive mode for HA)
(Note: the management IP is configured in shared mode in the BIOS for all compute nodes.)
IB HA (mgmt0) 10.1.129.40
IB Management 1 10.1.129.41
IB Management 2 10.1.129.42
Name Management IP
Gig switch1 (Gsw01) 10.1.130.1
Gig switch2 (Gsw02) 10.1.130.2
Gig switch3 (Gsw03) 10.1.130.3
Gig switch4 (Gsw04) 10.1.130.4
Gig switch5 (Gsw05) 10.1.130.5
Gig switch6 (Gsw06) 10.1.130.6
Gig switch7 (Gsw07) 10.1.130.7
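Reachability of the switch management interfaces can be checked from a master node with a simple ping sweep over the addresses listed above; a minimal sketch:
$ for i in 1 2 3 4 5 6 7; do ping -c1 -W1 10.1.130.$i > /dev/null && echo "Gsw0$i up" || echo "Gsw0$i DOWN"; done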
Ganana HPC Cluster Manager makes it easier for administrators to build Linux-based HPC clusters and to manage them on any x64 hardware. Its web-based portal gives administrators a flexible, feature-rich way to interact with their HPC cluster or grid. By reducing the building and management of compute node images, management and monitoring packages, middleware software, and post-installation activities to the click of a button, it saves a large amount of time on repeated tasks across all kinds of HPC environments.
The following key services for cluster operations are always running on both head nodes:
ganana: the cluster manager daemon service, which should be running in active/passive mode on both master nodes.
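The daemon and the surrounding HA stack can be checked on either master; a minimal sketch, assuming the daemon is exposed as a systemd unit and a PCS-managed resource named ganana (the exact unit and resource names may differ on this cluster):
$ systemctl status ganana -> daemon state on the local master (assumed unit name)
$ pcs status -> overall cluster, node and resource state
$ pcs resource status -> which master currently holds the active resources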
Figure: OS images
Figure: iSCSI storage with multipath, HA-LVM, and PCS
We use iSCSI block storage for sharing data in active/passive mode between the two master nodes.
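The iSCSI and multipath layers underneath the HA storage can be inspected on the active master; a minimal sketch (session, device and volume names are site-specific):
$ iscsiadm -m session -> list active iSCSI sessions to the storage
$ multipath -ll -> show multipath devices and path health
$ lvs -> list the HA-LVM logical volumes
$ df -h /shared -> confirm /shared is mounted from the iSCSI LUN on the active master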
Web Interface
The /shared directory is the common directory containing the ganana cluster configuration required on both nodes for HA.
Authentication
The NIS server is configured with primary and secondary server roles to meet the high-availability requirement.
NIS Server:
10.1.2.20 Floating IP Address
10.1.2.3 Primary Server
10.1.2.4 Secondary Server
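NIS binding on a login or compute node can be verified with the standard ypbind client tools; a minimal sketch:
$ nisdomainname -> show the configured NIS domain
$ ypwhich -> show which NIS server the node is currently bound to
$ ypcat passwd | head -> confirm that user maps are being served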
InfiniBand is a special type of networking fabric with very low latency compared to standard Ethernet-based networks. It enables larger-scale MPI jobs that can spread over several nodes.
The Mellanox EDR 100 Gb/s InfiniBand switch provides the highest-performing fabric solution in a 1U form factor, delivering up to 7.2 Tb/s of non-blocking bandwidth with 90 ns port-to-port latency.
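On each node, the HCA link state and the EDR 100 Gb/s rate can be confirmed as follows; a minimal sketch, assuming the Mellanox OFED utilities are installed:
$ ibstat -> port State should be Active and Rate should be 100 for an EDR link
$ ibv_devinfo -v | grep -E 'state|active_width|active_speed' -> same information from the verbs layer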
A total of 7 Dell X1052 Ethernet switches are set up and configured in a bus topology.
Specification:
48 GbE ports per switch
4 x 10Gb SFP+ ports per switch
Gsw01 10.1.130.1
Gsw02 10.1.130.2
Gsw03 10.1.130.3
Gsw04 10.1.130.4
Gsw05 10.1.130.5
Gsw06 10.1.130.6
Gsw07 10.1.130.7
Document: https://www.dell.com/support/home/in/en/inbsd1/product-support/product/networking-x1000-series/docs
Intel MKL
Intel IPP
Intel VTune Analyzer
$ module load <modulename>
$ icc -v -> check icc version
$ which icc -> check icc path
$ which mpirun -> check mpirun path
Intel MPI
module
  (no arguments)        print usage instructions
  avail or av           list available software modules
  whatis                as above, with brief descriptions
  load <modulename>     add a module to your environment
  unload <modulename>   remove a module
  purge                 remove all modules
The modules loaded into the user's environment can be seen with:
$ module list
To check available modules:
$ module avail
To use the MPICH implementation built with GCC:
$ module add mpich/ge/gcc
To use an application, the correct module needs to be loaded in the current working shell or in the PBS Pro job submission script.
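A minimal PBS Pro submission script illustrating this is sketched below; the job name, queue name, and resource request are placeholders and must be adapted to the actual cluster configuration:
#!/bin/bash
# Example PBS Pro job script (queue and resource request are placeholders)
#PBS -N mpich_test
#PBS -q workq
#PBS -l select=2:ncpus=40:mpiprocs=40
# Load the required module inside the job's shell
module load mpich/ge/gcc
# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR
mpirun ./a.out
Submit the script with:
$ qsub job.sh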
Cluster Power ON Procedure
1. All PDUs, switches, and DDN storage should be powered on without any errors before powering on the master node.
2. Power on master node ada001 manually; wait 5 minutes for it to power on properly and 2 additional minutes to allow Lustre and NFS to mount automatically.
4. Check that all the license servers are up and running (lmgrd and flexlm).
5. Power on the secondary master node ada002; wait 5 minutes for it to power on properly and 2 additional minutes to allow Lustre and NFS to mount automatically.
6. Check pcs status on node ada001; both nodes should be in standby mode and resources should be in the disabled state.
7. Un-standby ada001 and ada002 with a time difference of 2 minutes: execute "pcs cluster unstandby ada001-eth2; sleep 120; pcs cluster unstandby ada002-eth2" on ada001.
9. Power on all login nodes and compute nodes with racadm or manually, and wait until all nodes have booted properly.
10. Verify the Lustre and NFS mounts on the compute and login nodes using the commands below.
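The verification referred to in step 10 can be done with standard mount and df checks; a minimal sketch (the exact Lustre and NFS mount points are site-specific):
$ mount -t lustre -> list Lustre mounts on the node
$ mount -t nfs,nfs4 -> list NFS mounts on the node
$ df -hT | grep -E 'lustre|nfs' -> confirm the file systems are mounted with the expected sizes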
Cluster Power OFF Procedure
6. Power off all login nodes, except the ada001 and ada002 nodes.
7. Perform the standby action for ada002 and ada001 with a gap of 1 minute.
8. Stop the PBS service on ada002 and ada001 with a gap of 1 minute.
9. Sync all the I/O operations on the system.
10. Drop the server caches by issuing "echo 3 > /proc/sys/vm/drop_caches".
11. Kill all the PIDs linked to the mounted file systems.
12. Unmount the NFS and Lustre file systems.
13. Remove the Lustre modules with lustre_rmmod.
14. Power off ada002 and ada001.
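Steps 9 to 13 can be carried out on each master with a command sequence along the following lines; this is a hedged sketch, and the /home and /lustre mount points are placeholders for the actual NFS and Lustre mount points:
# flush dirty pages and drop caches (steps 9 and 10)
sync
echo 3 > /proc/sys/vm/drop_caches
# kill any processes still using the mounted file systems (step 11); paths are placeholders
fuser -km /home /lustre
# unmount the NFS and Lustre file systems (step 12)
umount /home
umount /lustre
# unload the Lustre kernel modules (step 13)
lustre_rmmod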