
ECS™ Procedure Generator

Solution for Validating your engagement

ECS 2.0.x.x to 2.1.x.x Operating System Offline Update

Topic
ECS Upgrade Procedures
Selections
What ECS Version Are You Upgrading To?: ECS 3.0.x.x or below
Select Type of ECS Upgrade Being Performed: ECS OS Upgrade - Offline/Online Procedures
Select ECS OS Upgrade Version/Procedure: 2.0.x.x to 2.1.x.x Upgrade
Select ECS OS Upgrade Type: OS - Offline

Generated: July 5, 2022 6:02 PM GMT

REPORT PROBLEMS

If you find any errors in this procedure or have comments regarding this application, send email to
[email protected]

Copyright © 2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell
EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be
trademarks of their respective owners.

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of
any kind with respect to the information in this publication, and specifically disclaims implied warranties of
merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable
software license.

This document may contain certain words that are not consistent with Dell's current language guidelines.
Dell plans to update the document over subsequent future releases to revise these words accordingly.

This document may contain language from third party content that is not under Dell's control and is not
consistent with Dell's current guidelines for Dell's own content. When such third party content is updated
by the relevant third parties, this document will be revised accordingly.

Publication Date: July, 2022

Dell Technologies Confidential Information version: 2.3.6.90

Contents

Preliminary Activity Tasks
    Read, understand, and perform these tasks
ECS Core Operating System Offline Update
    Connecting a service Laptop to the ECS appliance
    Before you start the update
    Identify the internal IP addresses for ECS nodes
    Create MACHINES file and display ECS OS currently installed
    Preserve previous refit run bundle and history
    Upload the OS Update file to Node 1
    Staging the OS Update files
    Prepare and Distribute OS Update files to all nodes
Offline Update Procedure
    Prepare Nodes for OS Update
    Perform the OS Update on all Nodes
    Enable specific ECS services to start automatically on all nodes
    Restart specific ECS services to sync update on all nodes
    Disable and validate that master has been configured to not allow PXE boot
    Power down all nodes except Node 1
    Validate power state on all nodes "Chassis Power is off" except Node 1
    Validate that master has been configured to not allow PXE boot
    Power on all nodes except Node 1
    Validate power state on all nodes "Chassis Power is on" except Node 1
    Power Down Node 1 and validate that master has been configured to not allow PXE boot
    Validate that master has been configured to not allow PXE boot
    Power up Node 1 to Complete OS Update
    Exit maintenance mode to restart containers on all Nodes
    Final Node Validations
    All Node Updates are Complete
Post-Update Tasks
    ECS Core Operating System Offline Update: Completed
    Check that the object service initialization process is complete
    Validate the data path is operational – Moved section
Troubleshooting
    If node does not power on cleanly
    If node does not see DAE disks

Preliminary Activity Tasks
This section may contain tasks that you must complete before performing this procedure.

Read, understand, and perform these tasks


1. Table 1 lists tasks, cautions, warnings, notes, and/or knowledgebase (KB) solutions that you need to
be aware of before performing this activity. Read, understand, and when necessary perform any
tasks contained in this table and any tasks contained in any associated knowledgebase solution.

Table 1 List of cautions, warnings, notes, and/or KB solutions related to this activity

2. This is a link to the top trending service topics. These topics may or may not be related to this activity.
This is merely a proactive attempt to make you aware of any KB articles that may be associated with
this product.

Note: There may not be any top trending service topics for this product at any given time.

ECS Top Service Topics

ECS Core Operating System Offline Update
Use this procedure to update the ECS OS of a single appliance. This procedure brings the ECS appliance
offline and out of service. Perform this procedure only during a predefined maintenance window.
This procedure describes how to apply the OS update to all of the nodes at the same time, performing a
reboot of all nodes except one, and finally rebooting the last node to complete the OS update.
This procedure does not describe how to upgrade the ECS software or services running on the nodes.

Connecting a service Laptop to the ECS appliance


Connect your service laptop to port 24 of the Turtle switch and assign it an IP address of 192.168.219.99
with a subnet mask of 255.255.255.0 as shown below:

Figure 1 Connecting your laptop to the turtle switch
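
For example, on a Windows service laptop you might assign this address from an elevated command prompt (a sketch only; the adapter name "Ethernet" is an assumption and must be replaced with the name of the NIC that is cabled to the Turtle switch):

C:\> netsh interface ipv4 set address name="Ethernet" static 192.168.219.99 255.255.255.0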

Before you start the update


Identify the internal IP addresses for ECS nodes
Each node is assigned a number starting with “1” for node 1, “2” for node 2, and so on. The internal
network is 192.168.219.X.

Use this as a guide to SSH to the appropriate node as discussed in this documentation:
Node 1 (provo): 192.168.219.1
Node 2 (sandy): 192.168.219.2
Node 3 (orem): 192.168.219.3
Node 4 (ogden): 192.168.219.4
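
As a quick reachability check before you begin, you might ping each internal address from node 1 (a sketch assuming a four-node appliance; extend the range for eight nodes):

# for i in 1 2 3 4; do ping -c 1 -W 2 192.168.219.$i; done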

Create MACHINES file and display ECS OS currently installed


Use this procedure to create and validate the ‘MACHINES’ file. The MACHINES file enables you to use
the viprexec and viprscp distributed commands.
1. [ ] Run this command to generate the MACHINES file.
# getrackinfo -c /var/tmp/MACHINES

2. [ ] Run this command to view the contents of the MACHINES file. The output should show the
internal IP addresses for all the nodes that you are about to update.
# cat /var/tmp/MACHINES

3. [ ] Run the following command to display the current ECS OS version on all nodes.
# viprexec "rpm -qv ecs-os-base"

4. [ ] Validate the ECS OS on all nodes is the same. If the ECS OS is not the same on all nodes, open
an SR and do not continue with the procedure. The ECS OS currently installed on the nodes should
be lower than the version you are updating to.
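
One way to confirm at a glance that a single version is installed everywhere is to count the distinct package strings (a sketch; it assumes the viprexec output format shown elsewhere in this document, and the expected result is one version line with a count equal to the number of nodes):

# viprexec "rpm -qv ecs-os-base" | grep ecs-os-base | sort | uniq -c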

Preserve previous refit run bundle and history


Use the procedure in this section to archive and remove from /var/tmp any existing OS update/refit files
left over from previous OS update engagements. The files need to be archived (out of the way) to perform
the checks outlined in this guide. The remaining steps, unless directed otherwise, expect you to remain in
the SSH session on node 1 and in /var/tmp.

1. [ ] Run the following commands to move the update package zip and bundle tbz, as well as any
preupdate and postupdate logs, into a time-stamped archive directory so that they do not interfere
with the checks outlined in this procedure.
# viprexec '[ -d /var/tmp/refit.d ] && mv /var/tmp/*update* /var/tmp/refit.d/. 2>/dev/null'
# viprexec '[ -d /var/tmp/refit.d ] && mv /var/tmp/MD5SUMS /var/tmp/refit.d/. 2>/dev/null'
# viprexec '[ -d /var/tmp/refit.d ] && mv /var/tmp/refit.d /var/tmp/refit.d.$(date +"%Y%m%d-%H%M%S")'
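
To confirm that /var/tmp is now clear of old update artifacts, you might list the affected paths and check that no files are reported for any node (a sketch; only the viprexec host headers should appear in the output):

# viprexec 'ls /var/tmp/*update* /var/tmp/MD5SUMS 2>/dev/null'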

Upload the OS Update file to Node 1


1. [ ] Using pscp.exe or a similar secure copy tool, upload the OS update zip file to the “/var/tmp”
directory of node 1. The OS update zip file has the format: ecs-os-update-<version>.zip. For
example:
C:\temp> pscp ecs-os-update-<version>.zip <username>@192.168.219.1:/var/tmp/ecs-os-update-<version>.zip
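
From a Linux or macOS service laptop, a standard scp client can be used instead (same placeholders as above; substitute the account you use to log in to node 1):

$ scp ecs-os-update-<version>.zip <username>@192.168.219.1:/var/tmp/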

Staging the OS Update files
1. [ ] Using putty or a similar tool, SSH to Node 1 and prepare the OS update files by running the
following commands:
# cd /var/tmp
# unzip ecs-os-update-<version>.zip
# md5sum -c MD5SUMS
# chmod +x /var/tmp/refit
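
The md5sum -c step should report OK for every file listed in MD5SUMS. One way to surface only failures is to filter out the OK lines (a sketch; no remaining output means every checksum passed):

# md5sum -c MD5SUMS | grep -v ': OK$'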

Prepare and Distribute OS Update files to all nodes


1. [ ] Distribute the refit script to all nodes, and validate it.
The viprscp and viprexec commands enable you to distribute files to, and execute commands on, multiple
nodes. Both commands require a MACHINES file that lists the node IPs, and shared passwordless SSH
keys. When using viprexec, verify that the output from every node indicates success.
# viprscp /var/tmp/refit /usr/local/bin/

2. [ ] Run the following command and verify the md5sum matches on all nodes.
# viprexec md5sum /usr/local/bin/refit

3. [ ] Distribute the OS Update bundle to all nodes that are to be updated, and validate.
# viprscp /var/tmp/*update.tbz /var/tmp/
# viprexec md5sum /var/tmp/*update.tbz

4. [ ] Save and validate pre-update OS information


# viprexec refit preupdateversions
# viprexec cat /var/tmp/preupdateversions-*.log
# viprexec cat /var/tmp/preupdateversions-*rpmlast.log

5. [ ] Create the local repository on all nodes from the OS update bundle.
# viprexec refit deploybundle /var/tmp/ecs-os-setup-target.x86_64-*.update.tbz

The output from each node looks like this:


/ ~
Removing old PXE files
rm -f /srv/www/htdocs/image/*.{md5,xz}
Removing old repo.d files
[ -d /etc/zypp/repos.d.keep ] || mkdir -p /etc/zypp/repos.d.keep
mv /etc/zypp/repos.d/* /etc/zypp/repos.d.keep/.
Removing old repo packages
rm -f /srv/www/htdocs/repo/*.rpm
Untarring update bundle... this can take up to 5 min
tar xvf /var/tmp/ecs-os-setup-target.x86_64-1.398.58.update.tbz
~
zypper ref
Bundled deployed to host. Proceed with update.
run 'refit summary' for more instructions.

6. [ ] Run the following command to validate the repository was created in the previous command:
# viprexec zypper lr

The output from each node looks like this:
# | Alias | Name | Enabled | Refresh
--+-------+------+---------+--------
1 | repo | repo | Yes | Yes

Offline Update Procedure


Prepare Nodes for OS Update
1. [ ] Using putty or a similar tool, SSH to Node 1.
2. [ ] Put all of the nodes into maintenance mode to stop the containers.
# viprexec 'cd /opt/emc/caspian/fabric/cli && bin/fcli maintenance
https://`hostname -f`:9240 enter -force'

3. [ ] Run the watch command and wait for all containers to exit.
# watch viprexec "docker ps -a"

It will take several minutes for all of the containers to exit. Continue to wait until the “watch” command
displays no running containers, then “Ctrl+C” out of the “watch” command and continue.
4. [ ] Run the following to stop docker and to validate that it stopped.
# viprexec systemctl stop docker
# viprexec systemctl status docker | grep Active
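
An alternative way to confirm that docker has stopped everywhere is to query the unit state directly (a sketch; every node should report inactive):

# viprexec 'systemctl is-active docker'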

Perform the OS Update on all Nodes


This command can take up to 40 minutes to complete. Since it does not have a progress bar, it might
appear to be hung.

1. [ ] Create a second SSH session to node1. You can use it to monitor the progress of the update
command shown next.
2. [ ] From the first SSH session, run the following command:
# viprexec refit doupdate

3. [ ] From the second SSH session, run the following command to see if the refit doupdate
command is still running:
# ps auxwww|grep refit

Note: When the update completes, all of the refit invocations kicked off by viprexec (through its use of
'sh -c' and 'ssh') will have exited, and the only remaining match is the grep command itself.
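
To avoid matching the grep process itself, a common variant is to bracket the first character of the pattern; with this form, a completed update returns no output at all (a sketch):

# ps auxwww | grep '[r]efit'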

Enable specific ECS services to start automatically on all nodes


1. [ ] Run the following command:
# viprexec refit enableserviceslist nileHardwareManager docker nan

The following is shown for each node.


system action: enable for services: nileHardwareManager docker nan
enable nileHardwareManager
=========================================
systemctl enable nileHardwareManager

REFIT SUCCESS running systemctl enable nileHardwareManager
enable docker
=========================================
systemctl enable docker
REFIT SUCCESS running systemctl enable docker
enable nan
=========================================
systemctl enable nan
REFIT SUCCESS running systemctl enable nan

Restart specific ECS services to sync update on all nodes


For an update to ECS OS 2.1 only, restart the network services by running:
# viprexec 'ifdown public && systemctl restart nan'

For updates to earlier versions, restart the network services by running the following command (wickedd
below is not a typo):
# viprexec refit restartserviceslist wickedd wicked nan

The following is shown for each node.


system action: restart for services: wickedd wicked nan
restart wickedd
=========================================
systemctl restart wickedd
REFIT SUCCESS running systemctl restart wickedd
restart wicked
=========================================
systemctl restart wicked

To check that the service uptime is recent (confirming that nan was just restarted), run:

# viprexec systemctl status nan

Disable and validate that master has been configured to not allow PXE boot
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

You must do this each time before rebooting each node.

1. [ ] Determine what node is the master node. SSH to any node and type the following:
# ssh master

2. [ ] Run
# refit disablenanpxe

3. [ ] Run the following commands as validations to ensure that the NAN PXE has been disabled:
a. Verify the command returns no.
# getrackinfo -p RackInstallServer

b. Run the following command:

# egrep 'tftp|pxe' /etc/dnsmasq.d/private_notftp

Verify it returns:
dhcp-boot=net:priv,pxelinux.0
#enable-tftp
#tftp-root=/srv/tftpboot

4. [ ] Make sure that all the private MACs will be ignored by any rogue PXE server by creating a proper
ignore list. Run:
# for mac in $(getrackinfo -v | egrep "private[ ]+:"|awk '{print $3}'); do
setrackinfo --installer-ignore-mac $mac ; done

5. [ ] Run:
# getrackinfo -i

The command should return output similar to the following. Validate that the number of MAC
addresses returned and the number of entries match the node count. You do not need to validate the
addresses, just the count:
Rack Installer Status
=====================
Mac Name Port Ip Status
00:1e:67:96:3e:59 provo 1 none Done!
00:1e:67:96:40:75 sandy 2 none Done!
00:1e:67:96:40:1b orem 3 none Done!
00:1e:67:96:40:2f ogden 4 none Done!

6. [ ] Run:
# viprexec cat /etc/dnsmasq.dhcpignore/all | sort | uniq -c
It should return output similar to the following:
4
4 00:1e:67:96:3e:59,ignore # (port 1) provo
4 00:1e:67:96:40:1b,ignore # (port 3) orem
4 00:1e:67:96:40:2f,ignore # (port 4) ogden
4 00:1e:67:96:40:75,ignore # (port 2) sandy
1 Output from host : 192.168.219.1
1 Output from host : 192.168.219.2
1 Output from host : 192.168.219.3
1 Output from host : 192.168.219.4

7. [ ] Make sure that dnsmasq was restarted by running:


# systemctl status dnsmasq

It should return something like the following:


...
Active: active (running) since Thu 2015-05-14 21:30:35 UTC; 10s ago

8. [ ] Exit from the master node by running the following command:


# exit

Power down all nodes except Node 1


Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] SSH to node 1, and run the following command:
# refit ipmipower_all_not_node_x <starting_node> <ending_node> <node_1> off

Where
<starting_node> = The first node in the range. [2]
<ending_node> = The last node in the range. [4] or [8]
<node_1> = The node NOT being powered down. [1]
For example: On a 4 node system:
# refit ipmipower_all_not_node_x 2 4 1 off

After running this command, you are prompted with:


Running on node <x>
and you passed: <lower>:2 <upper>:4 <x>:1 <action>:off
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Validate power state on all nodes "Chassis Power is off" except Node 1
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Run the following command:


# refit ipmipower_all_not_node_x <starting_node> <ending_node> <node_1> status

Where:
<starting_node> = The first node in the range. [2]
<ending_node> = The last node in the range. [4] or [8]
<node_1> = The node NOT being checked. [1]
For example: On a 4 node system:
# refit ipmipower_all_not_node_x 2 4 1 status

After running this command, you are prompted with:


Running on node <x>
and you passed: <lower>:2 <upper>:4 <x>:1 <action>:status
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Validate that master has been configured to not allow PXE boot.
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

The IPMI power down of the peers makes node 1 the master. Perform these checks before powering the
peers up.

1. [ ] Run the following:

# egrep 'tftp|pxe' /etc/dnsmasq.d/private_notftp

Verify that it returns


dhcp-boot=net:priv,pxelinux.0
#enable-tftp
#tftp-root=/srv/tftpboot

2. [ ] Make sure all the private MACs will be ignored by any rogue PXE server with a proper ignore list
by running:
# for mac in $(getrackinfo -v | egrep "private[ ]+:"|awk '{print $3}'); do
setrackinfo --installer-ignore-mac $mac ; done

3. [ ] Run the following:


# getrackinfo -i

The command should return output similar to the following. Validate that the number of MAC
addresses returned and the number of entries match the node count. You do not need to validate the
addresses, just the count:
Rack Installer Status
=====================
Mac Name Port Ip Status
00:1e:67:96:3e:59 provo 1 none Done!
00:1e:67:96:40:75 sandy 2 none Done!
00:1e:67:96:40:1b orem 3 none Done!
00:1e:67:96:40:2f ogden 4 none Done!

4. [ ] Run the following command:


# viprexec cat /etc/dnsmasq.dhcpignore/all | sort | uniq -c

It should return something like the following:


4
4 00:1e:67:96:3e:59,ignore # (port 1) provo
4 00:1e:67:96:40:1b,ignore # (port 3) orem
4 00:1e:67:96:40:2f,ignore # (port 4) ogden
4 00:1e:67:96:40:75,ignore # (port 2) sandy
1 Output from host : 192.168.219.1
1 Output from host : 192.168.219.2
1 Output from host : 192.168.219.3
1 Output from host : 192.168.219.4

Power on all nodes except Node 1


Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Run the following command:


# refit ipmipower_all_not_node_x <starting_node> <ending_node> <node_1> on

Where:
<starting_node> = The first node in the range. [2]
<ending_node> = The last node in the range. [4] or [8]
<node_1> = The node NOT being powered up. [1]

For example, on a 4 node system:
# refit ipmipower_all_not_node_x 2 4 1 on

After running the command, you are prompted with:


Running on node <x>
and you passed: <lower>:2 <upper>:4 <x>:1 <action>:on
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Validate power state on all nodes "Chassis Power is on" except Node 1
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Run the following:


# refit ipmipower_all_not_node_x <starting_node> <ending_node> <node_1> status

Where:
<starting_node> = The first node in the range. [2]
<ending_node> = The last node in the range. [4] or [8]
<node_1> = The node NOT being checked. [1]
For example on a 4 node system:
# refit ipmipower_all_not_node_x 2 4 1 status

After running the command, you are prompted with:


Running on node <x>
and you passed: <lower>:2 <upper>:4 <x>:1 <action>:status
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Power Down Node 1 and validate that master has been configured to not allow PXE boot
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Using putty or a similar tool, SSH to Node 2, and power down Node 1:
# ssh <Node 1> 'shutdown -h now'

2. [ ] Validate power state of Node 1 "Chassis Power is off"


# refit ipmipower_node_x 1 status

After running the command, you are prompted with:


Running on host <y> not connected to, or thru, node <x>
and you passed: <x>:1 <action>:status
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

3. [ ] If Node 1 is not in “off” state, use this command to force it:
# refit ipmipower_node_x 1 off

After running the command, you are prompted with:


Running on host <y> not connected to, or thru, node <x>
and you passed: <x>:1 <action>:off
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.
4. [ ] If you forced Node 1 into the "off" state, use this command to validate the status:
# refit ipmipower_node_x 1 status

After running the command, you are prompted with:


Running on host <y> not connected to, or thru, node <x>
and you passed: <x>:1 <action>:status
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Validate that master has been configured to not allow PXE boot
Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Determine the master node. SSH to any node and run the following command:
# ssh master
2. [ ] SSH to the master node and then run:
# getrackinfo -p RackInstallServer

Validate the command returns no.


3. [ ] Run the following command:
# egrep 'tftp|pxe' /etc/dnsmasq.d/private_notftp

Validate that it returns the following:


dhcp-boot=net:priv,pxelinux.0
#enable-tftp
#tftp-root=/srv/tftpboot

4. [ ] Make sure all the private MACs will be ignored by any rogue PXE server with a proper ignore list,
by running:
# for mac in $(getrackinfo -v | egrep "private[ ]+:"|awk '{print $3}'); do
setrackinfo --installer-ignore-mac $mac ; done

5. [ ] Run:
# getrackinfo -i

The command should return output similar to the following. Validate that the number of MAC
addresses returned and the number of entries match the node count. You do not need to validate the
addresses, just the count:

Rack Installer Status
=====================
Mac Name Port Ip Status
00:1e:67:96:3e:59 provo 1 none Done!
00:1e:67:96:40:75 sandy 2 none Done!
00:1e:67:96:40:1b orem 3 none Done!
00:1e:67:96:40:2f ogden 4 none Done!

6. [ ] Run:
# viprexec cat /etc/dnsmasq.dhcpignore/all | sort | uniq -c

It should return something like the following:


4
4 00:1e:67:96:3e:59,ignore # (port 1) provo
4 00:1e:67:96:40:1b,ignore # (port 3) orem
4 00:1e:67:96:40:2f,ignore # (port 4) ogden
4 00:1e:67:96:40:75,ignore # (port 2) sandy
1 Output from host : 192.168.219.1
1 Output from host : 192.168.219.2
1 Output from host : 192.168.219.3
1 Output from host : 192.168.219.4

Power up Node 1 to Complete OS Update


Note: Execute the following section if the nodes being updated were at 2.0.0 HF3 or higher before
applying the update; otherwise skip to Exit maintenance mode to restart containers on all Nodes.

1. [ ] Power on Node 1 by running the following command:


# refit ipmipower_node_x <node_1> on

After running the command, you are prompted with:


Running on host <y> not connected to, or thru, node <x>
and you passed: <x>:1 <action>:on
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.
2. [ ] Validate power state of Node 1 "Chassis Power is on"
# refit ipmipower_node_x <node_1> status

After running the command, you are prompted with:


Running on host <y> not connected to, or thru, node <x>
and you passed: <x>:1 <action>:status
Verify with enter/return to continue - or Ctrl-C to abort

You must hit <enter> to confirm the command. It does not continue until you respond.

Exit maintenance mode to restart containers on all Nodes


1. [ ] When node 1 is fully powered up, run the following command to check that docker is running on
all nodes.
# viprexec systemctl status docker | grep active

If docker is not running on all nodes, start it by running the following command:
# viprexec systemctl start docker

2. [ ] Verify docker was started on all nodes by running the following command:
# viprexec systemctl status docker | grep active

3. [ ] Run the following command to exit maintenance mode:


# viprexec 'cd /opt/emc/caspian/fabric/cli && bin/fcli maintenance
https://`hostname -f`:9240 exit -force'
Example output:
Output from host : 192.168.219.4
Agent 0a6e7141-1678-4ab2-b8d5-f77bdf7ad1ad is in ACTIVE mode now
Exited maintenance mode.

Output from host : 192.168.219.3


Agent 2c492716-61df-4275-9871-ad9eb0e079e9 is in ACTIVE mode now
Exited maintenance mode.

Output from host : 192.168.219.2


Agent 48ab564c-a792-4a76-b388-086f8c39a7dc is in ACTIVE mode now
Exited maintenance mode.

Output from host : 192.168.219.1


Agent ecbf64fb-a11a-4a5f-a30e-771a69d1831e is in ACTIVE mode now
Exited maintenance mode.

4. [ ] Run the following command and verify that the MODE for all nodes is reported as ACTIVE and
not LOCKDOWN:

# viprexec 'cd /opt/emc/caspian/fabric/cli && bin/fcli agent GET /v1/agent/mode
https://`hostname -f`:9240'

Example output:
Output from host : 192.168.219.4
{"mode":"ACTIVE","status":"OK","etag":573}

Output from host : 192.168.219.1


{"mode":"ACTIVE","status":"OK","etag":839}

Output from host : 192.168.219.3


{"mode":"ACTIVE","status":"OK","etag":614}

Output from host : 192.168.219.2


{"mode":"ACTIVE","status":"OK","etag":614}

Final Node Validations


1. [ ] Validate that the status of the containers is ‘Up’:
# viprexec docker ps
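
One way to spot a node with missing containers is to count the running containers per node and confirm that every node reports the same non-zero number (a sketch; the expected count depends on the services installed on the rack):

# viprexec 'docker ps | grep -c " Up "'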

All Node Updates are Complete


Post-Update Tasks
1. [ ] Using putty or a similar tool, SSH to Node 1.

2. [ ] Save the post-update OS information on all nodes to create an audit record that indicates an
upgrade was performed on this node.
# viprexec refit postupdateversions
3. [ ] Run the following command to display the old and new versions.
# viprexec 'ls -alrt /var/tmp/*updateversions*.log'

ECS Core Operating System Offline Update: Completed


Check that the object service initialization process is complete
Check that the object service initialization process is complete by running the following command from
the updated node:
# curl http://localhost:9101/stats/dt/DTInitStat/

You want to see the entry <unready_dt_num>0</unready_dt_num> in the output.


Example output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<result><entry><total_dt_num>1920</total_dt_num><unready_dt_num>0</unready_dt_num></entry>

The first time you run the command, you might not see an entry for <unready_dt_num>. If that is true,
wait a few minutes and rerun the command. Do not proceed to the next step until you see the entry
<unready_dt_num>0</unready_dt_num> in the output.
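
Rather than rerunning the command by hand, you could poll until the entry appears (a sketch; the 60-second sleep interval is an arbitrary choice):

# while ! curl -s http://localhost:9101/stats/dt/DTInitStat/ | grep -q '<unready_dt_num>0</unready_dt_num>'; do sleep 60; done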

Validate the data path is operational – Moved section


Consult with the customer to determine if there is an object user, a secret key, and a bucket that you can
use to test the data path. If these do not exist, you will have to create them. Consult the ECS documentation
for instructions about how to create an object user, a secret key, and a bucket. See the ECS Documentation Index.
• Obtain the S3 browser tool.
• Create or locate a text file on your laptop that you can upload to the appliance.

1. [ ] Start the S3 Browser, and set up an account for the ECS Appliance with the following
specifications:
Option              Description
Storage Type        Select S3 Compatible Storage
REST Endpoint       The IP address of one of the ECS Appliance nodes, using port 9020 or 9021.
                    For example: 198.51.100.244:9021
Access Key ID       Enter ecsuser
Secret Access Key   The Object Secret Access Key
2. [ ] You should see the bucket you created using the ECS Portal.
3. [ ] Use the S3 browser to upload the test file from your laptop to verify that you are able to write to
the appliance
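
If an S3 browser is not available, the same write test can be scripted from the service laptop with any S3-compatible client, for example the AWS CLI (a sketch only; the endpoint address, bucket name, and test file are placeholders, and the ecsuser access key and secret key must first be supplied through aws configure):

$ aws s3 cp testfile.txt s3://<bucket>/testfile.txt --endpoint-url https://198.51.100.244:9021 --no-verify-ssl
$ aws s3 ls s3://<bucket>/ --endpoint-url https://198.51.100.244:9021 --no-verify-ssl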

Troubleshooting
If node does not power on cleanly
1. [ ] Log in to the Remote Management Module (RMM).
2. [ ] Launch the virtual console.
3. [ ] If the node hangs in the boot process, reset power using the RMM.

If node does not see DAE disks


1. [ ] Shut down the node gracefully:
# shutdown -h now
2. [ ] Validate the power state of the node being updated is "Chassis Power is off" (wait 5 minutes):
# refit ipmipower_node_x <update_node_x> status

Where:
<update_node_x> = The number of the Node being updated. [1-8]
For example:
# refit ipmipower_node_x 4 status

3. [ ] When prompted to continue, press [Enter]


4. [ ] If the node from the previous step is not in the “off” state, force it off and re-validate (optional):
# refit ipmipower_node_x <update_node_x> off
# refit ipmipower_node_x <update_node_x> status
Where:
<update_node_x> = The number of the Node being updated. [1-8]
For example:
# refit ipmipower_node_x 4 status

5. [ ] When prompted to continue, press [Enter].


6. [ ] Power on the node being updated:
# refit ipmipower_node_x <update_node_x> on

Where: <update_node_x> = The number of the Node being updated. [1-8]

For example:
# refit ipmipower_node_x 4 on

7. [ ] When prompted to continue, press [Enter].
