ECS - ECS PS Procedures-ECS Appliance Tech Refresh - 3.7.x.x


ECS ™ Procedure Generator

Solution for Validating your engagement

Topic
ECS PS Procedures
Selections
ECS Professional Services Procedures: ECS Appliance Tech Refresh - 3.7.x.x

Generated: July 7, 2022 6:26 PM GMT

REPORT PROBLEMS

If you find any errors in this procedure or have comments regarding this application, send email to
[email protected]

Copyright © 2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell
EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be
trademarks of their respective owners.

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of
any kind with respect to the information in this publication, and specifically disclaims implied warranties of
merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable
software license.

This document may contain certain words that are not consistent with Dell's current language guidelines.
Dell plans to update the document over subsequent future releases to revise these words accordingly.

This document may contain language from third party content that is not under Dell's control and is not
consistent with Dell's current guidelines for Dell's own content. When such third party content is updated
by the relevant third parties, this document will be revised accordingly.

Publication Date: July, 2022

Dell Technologies Confidential Information version: 2.3.6.91

Contents
Preliminary Activity Tasks
  Read, understand, and perform these tasks

Notes, cautions, and warnings

Overview
  Tech Refresh terminology and process overview
  Terminology
  Process overview

Extend Nodes and Rack
  ECS software extend overview
  Workflow
  Prerequisites
  ECS software extend limitations
  Software extend readiness checklist
  Connect to the ECS appliance
  Connect from a remote location
  Connect a service laptop to a U-Series or D-Series rack on site
  Connect a service laptop to the EX Series rack
  ECS software extend procedure

Migrate Data
  Migration planning
  Migration prerequisites, restrictions, and limitations
  Prerequisites
  Restrictions and limitations
  Run optional premigration health checks
  Trigger data migration
  Manage migration
  Monitor migration status using ECS Service Console
  Monitor migration status using ECS UI Grafana Dashboard
  Pause and resume migration
  Pause and resume migration using ECS Service Console
  Data migration throttling
  Tune migration throttling
  Migration alerts
  Enable migration capacity alerts

Remove a node from a cluster using ECS Service Console
  Run optional pre-node evacuation health checks
  Remove a node from a cluster using ECS Service Console
  Move licenses to new ECS system
  Carry out post-Tech Refresh checks

Document feedback

Preliminary Activity Tasks
This section may contain tasks that you must complete before performing this procedure.

Read, understand, and perform these tasks


1. Table 1 lists tasks, cautions, warnings, notes, and/or knowledgebase (KB) solutions that you need to
be aware of before performing this activity. Read, understand, and when necessary perform any
tasks contained in this table and any tasks contained in any associated knowledgebase solution.

Table 1 List of cautions, warnings, notes, and/or KB solutions related to this activity

2. This is a link to the top trending service topics. These topics may or may not be related to this
activity. The link is a proactive attempt to make you aware of any KB articles that may be associated
with this product.

Note: There may not be any top trending service topics for this product at any given time.

ECS Top Service Topics

Notes, cautions, and warnings

NOTE:
A NOTE indicates important information that helps you make better use of your product.

CAUTION:
A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.

WARNING:
A WARNING indicates a potential for property damage, personal injury, or death.

Overview
This document provides guidance on replacing ECS Gen 1 and Gen 2 hardware with Gen 3 hardware.

This document provides information about:

• Simplifying the hardware replacement process
• Extending an existing cluster
• Reducing the size of an existing cluster

The following graphic provides an overview of the process.

Figure 1. ECS Tech Refresh process

Tech Refresh terminology and process overview


Learn about the terms and processes that are referenced throughout this document.

Terminology
Source node: Node to be decommissioned

Target node: New node to receive data

Extension: Service procedure to add new nodes to an existing ECS cluster.


Migration: Object data migration from a set of source nodes to target nodes.

Node Evacuation: Service procedure to remove a node from the cluster.

Process overview
The following figures illustrate the processes that are referenced in this document.

Figure 1. Node Extension

Figure 2. Data Migration

Figure 3. Node Evacuation

Extend Nodes and Rack


Node and Rack extension is the required first step in the Tech Refresh. Ensure that you carry out the
steps documented.

ECS software extend overview


This section provides an overview and workflow for the node and rack extend procedures.

Workflow
This section provides the flow of the software expansion procedures.

• Connect to existing VDC.


• Configure and upgrade Service Console (SC).
• Run the SC health check from R1N1.
• Run the collection on each of the existing racks. You can get the script from KB 531528.
Provide the information to Customer Service (CS) for review.
• The Professional Services (PS) or partner team sends an email with UPDATE in the subject line to
[email protected] to get the latest designer spreadsheet.
• The PS team fills out the designer with the new nodes and racks information.

NOTE:
On dark sites where you cannot share the information, you must send an email to [email protected] for the
procedure.
• The PS team installs the operating system:
o Installs the operating system for the new rack.
o Installs the operating system for the new nodes in the existing rack.
• The PS team uses the extend scripts, which configure the networking on the new nodes and
rack. The scripts run the basic validations and write the extend.ini file to
/tmp/extend/extend.ini.

• The PS team runs the SC Extend process using extend.ini and waits until the fabric allocates all
the drives.
• The PS team runs health checks.
• The PS team validates that the new nodes are shown in the user interface for VDC endpoints.

WARNING:

• Run only the commands that are mentioned in the Solve or OE instructions.
• Run the commands in the order in which they are listed in the Solve or OE instructions.
• Do not run any commands that are mentioned in the output logs but are not part of the Solve or
OE instructions. Contact ECS Remote Support for any queries.

Prerequisites
Table 1. Tasks to complete before arriving onsite

Each entry lists an item to engage with the customer on, followed by the next step.

Item: Obtain the ECS Portal login credentials from the onsite contact.
Next step: Provision new storage for the extended nodes.

Item: Run the Compatibility Checker.
Next step: Run the Compatibility Checker.

Item: Assign IPs to new nodes.
Next step:
• Static: Configure on the existing VDC using the command: setrackinfo
• Dynamic host configuration protocol (DHCP): Add the nodes to the DHCP server.

Item: Set Domain Name Services (DNS) for extended nodes.
Next step: Add extended nodes to DNS (both forward and reverse lookup).

Item: Is network separation implemented? If the customer chose to use network separation, you must
have completed all prerequisites and procedures that are outlined in the ECS Networks document in
SolVe Desktop or online.
Next step: Inquire before arriving onsite.

Item: Any custom switch configurations implemented in the existing ECS appliance?
Next step: If custom switch configurations exist, then you must replicate them across the extended
nodes.

Item: Determine current ECS software version running at the customer site.
NOTE:
If the environment is running general patches code, download the production.tgz from a different location.
Next step: Connect remotely and run the command:

admin@provo-yellow:~> svc_version

Example Output:

svc_version v1.0.7 (svc_tools v1.5.0)

ECS Version: 3.2.2.1
Object Version 3.2.2.1-102513.515d86e
OS Version 3.2.2.0-1964.f8d017f.44
Fabric Version 1.5.0.0-3545.d53cc93
Fabric-agent Version 1.5.0.0-3545.d53cc93
Syslog Version <Unknown>
Zookeeper Version 3.4.9.0-82.0ecec52
Registry Version 2.3.1.0-58.3a6dfaf
Utilities Version 1.5.0.0-3545.d53cc93
SC Version 3.0.0.0-19361.2c53303a9*
xDoctor Version 4.7-49
svc_tools Version 1.5.0

*Versions differ between nodes

Patch(es) installed | Config Changes Detected | Mismatched Patch(es) | Invalid Patch(es)
--------------------
<None - running GA release>

Item: Familiarize yourself with the applicable ECS documentation.
Next step:
• ECS Release Notes
• Pertinent KnowledgeBase (KB) articles
• Available in SolVe Desktop or online:
o ECS Compatibility Checker User Guide
o xDoctor release notes
ECS software extend limitations


This section provides the limitations for performing the node and rack extend procedure.

• If there are more than five static routes configured on the existing racks, then you must configure
and validate the additional static routes manually in NAN, before performing the extend
operation.
• If the existing VDC contains SSD read cache, the extended nodes must have the SSD read
cache hardware installed before you perform the extend. Mixed configuration is not
supported for SSD read cache.
• Do not use the Service Console extend process when creating a storage pool with the extended
nodes.
• Service Console (SC) extend supports extending one rack at a time. For multiple racks, run the
SC extend command once for each rack.

• Automatic extend node imaging or configuration is not supported.

Software extend readiness checklist


Do not proceed with the software extend until the items listed below are complete.

Table 1. Final checks before initiating the Software extend

Item

Check whether the relevant ECS documentation is reviewed.

Check whether the same ECS operating system version is installed on the new hardware as on the
existing hardware.

Check whether the extend node IPs are added to DHCP or the static IP information is available.

Check whether the extend nodes are added to DNS.

Check whether the network separation IP addresses are available to configure network separation.

Check whether the ECS Public switches are configured.

Check whether the new ECS hardware is installed and validated as outlined in the ECS Capacity
Expansion section of SolVe Desktop or online.

Check whether the latest version of xDoctor is installed. See the xDoctor User's Guide, which is available
in SolVe Desktop or online.

Check whether the latest version of Service Console is installed.

Connect to the ECS appliance


This section outlines how to connect to the ECS appliance (Node1 Rack1) locally onsite or remotely.

Connect from a remote location


Use Secure Remote Services to connect remotely. See the ECS Software Installation Guide for
procedures.

Connect a service laptop to a U-Series or D-Series rack on site


Access the U-Series or D-Series rack using the private network (192.168.219.XXX) from the laptop.
Steps

1. Connect your laptop to port 24 of the 1 GbE Turtle switch.


2. Configure your laptop with the following network parameters:
o IP: 192.168.219.99
o Netmask: 255.255.255.0
o No Gateway

3. Verify connectivity by pinging node 1 of the rack that you connected to, at 192.168.219.1, so that
you can SSH to that node later in the procedure.

ping 192.168.219.1

NOTE:
If 192.168.219.1 does not answer, try 192.168.219.2. If neither responds, verify the laptop IP/subnet mask, network
connection, and switch port connection. If the service laptop is connected to the Dell VPN, a ping to 192.168.219.x does
not return a response.
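
Before troubleshooting cabling, it can help to confirm that the laptop settings place it on the same subnet as the node addresses, because no gateway is configured. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of any ECS tool.

```python
import ipaddress

# Values taken from the steps above.
LAPTOP_IP = "192.168.219.99"
NETMASK = "255.255.255.0"
NODE_CANDIDATES = ["192.168.219.1", "192.168.219.2"]


def on_same_subnet(laptop_ip: str, netmask: str, node_ip: str) -> bool:
    """Return True if node_ip falls inside the laptop's configured subnet."""
    laptop = ipaddress.ip_interface(f"{laptop_ip}/{netmask}")
    return ipaddress.ip_address(node_ip) in laptop.network


def reachable_candidates(laptop_ip: str, netmask: str) -> list[str]:
    """List the node addresses that the laptop can reach without a gateway."""
    return [ip for ip in NODE_CANDIDATES if on_same_subnet(laptop_ip, netmask, ip)]


if __name__ == "__main__":
    print(reachable_candidates(LAPTOP_IP, NETMASK))
```

If the list comes back empty, the laptop IP address or subnet mask is the likely cause, and the ping cannot succeed regardless of the switch port used.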

Connect a service laptop to the EX Series rack


Access an ECS EX-Series rack using the private (192.168.219.XXX) network from a laptop.
Prerequisites

• Access to private network IP addresses (192.168.219.1 to 16 and 192.168.219.101 to 116) is
limited to the nodes connected to the rack back-end 1/10/25GbE fox management switch.
• Private.4 (NAN) network IP addresses (169.254.x.x) of all nodes in all racks in the ECS Virtual
Data Center (VDC) are accessible from any node in the ECS VDC once you SSH in to a node
using a private IP address (192.168.219.x).
• If security lockdown is not enabled, access to public network IP addresses for all ECS racks is
available once you SSH to one of the ECS nodes.
• Two switches, Fox and Hound, are used for the private network, or Nile Area Network (NAN). For
example, node 8 must connect to Hound port 8 and Fox port 8. For more information, see the ECS
EX Series Hardware Guide.
Steps

• Connect your service laptop to the VDC.

Option: If the cabinet contains a service shelf with a red network cable...
Description: Open the service shelf, and connect the red network cable to the service laptop.
The red cable connects to port 34 on the fox switch. The fox switch is the bottom back-end switch
in a dual switch configuration.

Option: If the cabinet does not contain a service shelf with a red network cable...
Description: From the rear of the rack, connect directly to either port 34 or 36 on the fox switch,
whichever port contains a 1GB SFP.

Option: If you want to connect a service laptop to the rear of the rack...
Description: Locate port 36 on the fox switch. The fox switch is the bottom back-end switch in a
dual switch configuration. Port 36 has a 1GB SFP that you can connect your service laptop to with a
Cat6 cable.

Figure 1. Fox switch

1 - Port 34 for service tray connection

2 - Port 36 for connection from rear

• Set the network interface on the laptop to the static address 192.168.219.99, subnet mask
255.255.255.0, with no gateway required.
• Verify that the temporary network between the laptop and rack's private management network is
functioning by using the ping command.

NOTE:
If 192.168.219.1 does not answer, try 192.168.219.2. If neither responds, verify the laptop IP/subnet mask, network
connection, and switch port connection. If the service laptop is connected to Dell's VPN, ping to 192.168.219.x may not
return a response.

C:\>ping 192.168.219.1
Pinging 192.168.219.1 with 32 bytes of data:
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Ping statistics for 192.168.219.1:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
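
If you capture the ping output to a file or script, the statistics line can be checked programmatically. The following is a minimal sketch that assumes the Windows ping output format shown above; the function names are illustrative only.

```python
import re

# Matches the Windows "Packets: Sent = 4, Received = 4, Lost = 0 (0% loss)" line.
SUMMARY_RE = re.compile(
    r"Packets: Sent = (\d+), Received = (\d+), Lost = (\d+) \((\d+)% loss\)"
)


def parse_ping_summary(output: str) -> dict:
    """Extract the packet counters from captured ping output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping statistics line found")
    sent, received, lost, loss_percent = (int(group) for group in match.groups())
    return {
        "sent": sent,
        "received": received,
        "lost": lost,
        "loss_percent": loss_percent,
    }


def link_is_healthy(output: str) -> bool:
    """Treat the link as healthy only if packets were sent and none were lost."""
    stats = parse_ping_summary(output)
    return stats["sent"] > 0 and stats["lost"] == 0
```

Other ping implementations, for example on Linux or macOS, print a differently worded summary, so the pattern would need to be adapted there.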

ECS software extend procedure


For detailed node and rack software extend procedures, see the ECS 3.7 Software Extend Instructions
guide.

Migrate Data
This chapter provides information about how to migrate data.

The example outputs in the procedures may not match the exact output on your system. Use them for reference only.

Migration planning
Learn how to determine how many nodes you need to carry out a migration for a storage pool, and
about other planning activities.

Calculate the capacity needed for target hardware, and add enough new hardware to the storage pool as
target nodes.

• Estimate whether the new hardware has enough capacity for the migration to complete
successfully. Use the following approximate formula to calculate capacity:

Figure 1. Total capacity on target calculation

• Use existing capacity tools to calculate total capacity.
• Metadata update garbage is 1% of total capacity on the source.
• Estimate data injection during migration from the current load pattern and confirm with the
customer.
• The target hardware must have at least five nodes, which is the minimum requirement for an
ECS system.
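
The planning inputs above can be combined into a quick estimate. The following is a rough sketch only: the formula in Figure 1 is not reproduced in this text, so the sketch assumes that the required target capacity is the total capacity on the source, plus 1% for metadata update garbage, plus the data expected to be ingested during migration. Confirm the result against Figure 1 and the capacity tools before relying on it. All names are illustrative.

```python
MIN_ECS_NODES = 5              # minimum node count stated above
METADATA_GARBAGE_RATIO = 0.01  # 1% of total capacity on the source


def required_target_capacity_tib(source_total_tib: float,
                                 expected_ingest_tib: float) -> float:
    """Assumed estimate: source capacity + 1% metadata garbage + expected ingest."""
    return source_total_tib * (1 + METADATA_GARBAGE_RATIO) + expected_ingest_tib


def target_is_sufficient(target_capacity_tib: float,
                         target_node_count: int,
                         source_total_tib: float,
                         expected_ingest_tib: float) -> bool:
    """Check the node-count minimum and the assumed capacity estimate."""
    if target_node_count < MIN_ECS_NODES:
        return False
    needed = required_target_capacity_tib(source_total_tib, expected_ingest_tib)
    return target_capacity_tib >= needed
```

For example, with 400 TiB on the source and 20 TiB of expected ingest, this estimate calls for about 424 TiB on the target, spread over at least five nodes.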

Migration prerequisites, restrictions, and limitations


Learn about the prerequisites, restrictions, and limitations for migrating data.

Prerequisites
• The ECS system must be running ECS 3.5 or later for Tech Refresh. Ensure that the source
nodes and target nodes are running the same ECS code version of 3.5 or later.
• Ensure that the target nodes are added into the ECS system storage pool before you start the
tech refresh data migration.
• Ensure that the target nodes are clean and not provisioned before you add them to the existing
system.
• Ensure that any ongoing PSO or VDC removal from RG is complete before you trigger tech
refresh data migration.
• Ensure that any ongoing CAS migration is complete before you trigger tech refresh data
migration.
• Ensure that the NTP server is working and that there is no significant NTP drift between nodes.
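
The version prerequisite can be checked from captured svc_version output. The following is a minimal sketch that assumes the "ECS Version:" line format shown in the example output earlier in this document; the function names are illustrative and are not part of svc_tools.

```python
import re

MIN_VERSION = (3, 5)  # Tech Refresh requires ECS 3.5 or later
VERSION_RE = re.compile(r"^ECS Version:\s*(\d+(?:\.\d+)*)\s*$", re.MULTILINE)


def parse_ecs_version(svc_version_output: str) -> tuple[int, ...]:
    """Return the ECS version as a tuple of integers, for example (3, 2, 2, 1)."""
    match = VERSION_RE.search(svc_version_output)
    if match is None:
        raise ValueError("no 'ECS Version:' line found")
    return tuple(int(part) for part in match.group(1).split("."))


def ready_for_tech_refresh(source_output: str, target_output: str) -> bool:
    """True if source and target report the same version and it is 3.5 or later."""
    source = parse_ecs_version(source_output)
    target = parse_ecs_version(target_output)
    return source == target and source >= MIN_VERSION
```

The example output earlier in this document reports 3.2.2.1, which this check would reject because it is earlier than 3.5.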

Restrictions and limitations


• ECS does not support upgrading to a major ECS version during Tech Refresh data migration, or
triggering Tech Refresh data migration during an ECS major version operating system upgrade. If
you need an ECS patch upgrade, contact the Dell EMC support team.
• Migration cannot be canceled or reverted once it is triggered.
• Tech Refresh does not support adding new source nodes after you trigger data migration. If the
user fails to include all the source node IP addresses in the trigger data migration command,
they must wait for the command to complete its run, and then rerun the command with any
missed source node IP addresses.
• Tech Refresh data migration does not support performing a user-initiated Planned Site Outage
(PSO) during data migration.
• If the environment has Geo Clusters configured in at least three sites with a large Delete load, or
has a large XOR DECODE task backlog, data migration may be limited or blocked by data
chunks in GEO DELETING status. Data migration continues after XOR DECODE tasks
complete.
• During data migration, the chunks with ec copy and under_transformation copy can
block Tech Refresh.
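
Because source nodes cannot be added while a triggered command is still running, it is worth checking the node list before you trigger migration. The following is a minimal sketch that assumes the source nodes are identified by their private.4 (169.254.x.x) addresses and passed as a comma-separated list, as in the trigger command later in this chapter. The helper names are hypothetical and are not part of Service Console.

```python
import ipaddress

# Private.4 (NAN) addresses are described in this document as 169.254.x.x.
NAN_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def format_node_list(node_ips: list[str]) -> str:
    """Validate the addresses and join them with commas, dropping duplicates."""
    unique: list[str] = []
    for ip in node_ips:
        if ipaddress.ip_address(ip) not in NAN_NETWORK:
            raise ValueError(f"{ip} is not a private.4 (169.254.x.x) address")
        if ip not in unique:
            unique.append(ip)
    return ",".join(unique)


def missed_source_nodes(planned: list[str], triggered: list[str]) -> list[str]:
    """Return planned source nodes that were left out of the triggered command."""
    already = set(triggered)
    return [ip for ip in planned if ip not in already]
```

If missed_source_nodes returns any addresses after a run, wait for the command to complete and then rerun it with the formatted list of missed nodes.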

Run optional premigration health checks


If you choose, use the ECS Service Console to run health checks before you trigger data migration.
There are two different commands, which run different checks.

Steps
From the ECS Service Console, run the run Health_Check command with the pre_data_migration
tag. For example:
service-console run Health_Check --tags pre_data_migration
Output such as the following appears:

service-console run Health_Check --tags pre_data_migration

Service Console is running on node 169.254.89.1 (suite 20200408_200510_Health_Check)
Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/dbg_robot.log
================================================================================
Health Check
20200408 20:05:29.577: Execute Health Checks
20200408 20:05:29.586: | Validate that all nodes are available - OS
20200408 20:05:33.640: | | PASS (4 sec)
20200408 20:05:33.641: | Validate time drift
20200408 20:05:36.099: | | PASS (2 sec)
20200408 20:05:36.101: | Validate that all partitions are under control
20200408 20:07:44.609: | | PASS (2 min 8 sec)
20200408 20:07:44.612: | Check DT status
Checking DT status (with timeout 10 min).
20200408 20:08:25.353: | | PASS (40 sec)
20200408 20:08:25.354: | Check on-going PSO or VDC removal from RG
20200408 20:08:50.959: | | PASS (25 sec)
20200408 20:08:50.961: | Validate that there are no transformation instances
20200408 20:08:55.278: | | PASS (4 sec)
20200408 20:08:55.279: | PASS (3 min 25 sec)
================================================================================
Status: PASS
Time Elapsed: 3 min 56 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/log.html
================================================================================
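
When you save the health check output to a file, you can summarize the individual results instead of reading the log by eye. The following is a minimal sketch that assumes the line layout shown in the example output above: a timestamp, then "| <check name>", then "| | <RESULT>". Result values other than PASS are not shown in this document, so the sketch treats any uppercase first word as a result. The function names are illustrative only.

```python
import re

TIMESTAMPED = re.compile(r"^\d{8} \d{2}:\d{2}:\d{2}\.\d{3}: (.*)$")
OVERALL = re.compile(r"^Status:\s*(\w+)", re.MULTILINE)


def parse_health_checks(output: str) -> dict[str, str]:
    """Map each check name to its result, for example {'Validate time drift': 'PASS'}."""
    results: dict[str, str] = {}
    pending = None
    for line in output.splitlines():
        match = TIMESTAMPED.match(line.strip())
        if match is None:
            continue  # untimestamped lines such as "Checking DT status ..."
        rest = match.group(1)
        if rest.startswith("| | "):
            tokens = rest[4:].split()
            if pending is not None and tokens:
                results[pending] = tokens[0]
                pending = None
        elif rest.startswith("| "):
            name = rest[2:].strip()
            # A first-level line that starts with an uppercase word is the
            # summary result of the parent step, not a new check name.
            if name and not name.split()[0].isupper():
                pending = name
    return results


def overall_status(output: str):
    """Return the value of the final 'Status:' line, or None if it is absent."""
    match = OVERALL.search(output)
    return match.group(1) if match else None
```

Any check whose result is not PASS, or an overall status other than PASS, is a reason to stop and investigate before you trigger data migration.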

Trigger data migration


Trigger data migration from source nodes to newly extended target nodes.
Steps

1. From the ECS Service Console, run the run Data_Migration command. For example:
service-console run Data_Migration --target-node <source node private.4 IPs for data migration>
Although the option is named --target-node, the IP addresses that you supply are those of the
source nodes, that is, the nodes from which the data is being migrated. Separate multiple node IP
addresses with commas. For example:

service-console run Data_Migration --target-node 169.254.89.1,169.254.89.2,169.254.89.3,169.254.89.4,169.254.89.5,169.254.89.6,169.254.89.7,169.254.89.8

2. In the ECS Service Console, look for output: Do you confirm the nodes selected for
the migration? [yes/no]:
Output such as the following appears:

Data Migration command output for confirmation:


Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-console/log/20200417_172741_run_Data_Migration/dbg_robot.log
================================================================================
Data Migration Setup
20200417 17:28:11.980: Is migration running
20200417 17:28:19.992: | PASS (8 sec)
20200417 17:28:19.996: Check data migration parameters
20200417 17:28:20.001: | PASS
20200417 17:28:20.048: Check for multiple Storage pools
20200417 17:28:20.049: | PASS
20200417 17:28:20.049: Validate number of VNEST members
20200417 17:28:21.362: | PASS (1 sec)
20200417 17:28:21.365: Check migration capacity
20200417 17:28:41.752: | PASS (20 sec)
20200417 17:28:41.753: Check source and target nodes version

20200417 17:29:09.813: | PASS (28 sec)
20200417 17:29:09.815: System is fully upgraded
20200417 17:29:18.485: | PASS (8 sec)
Data Migration Pre Check
20200417 17:29:18.737: Run health check
20200417 17:29:18.862: | Validate that all nodes are available - OS
20200417 17:29:23.026: | | PASS (4 sec)
20200417 17:29:23.028: | Validate time drift
20200417 17:29:26.503: | | PASS (3 sec)
20200417 17:29:26.505: | Validate that all partitions are under control
20200417 17:31:31.161: | | PASS (2 min 4 sec)
20200417 17:31:31.163: | Check DT status
Checking DT status (with timeout 10 min).
20200417 17:32:34.126: | | PASS (1 min 2 sec)
20200417 17:32:34.128: | Check on-going PSO or VDC removal from RG
20200417 17:32:53.221: | | PASS (19 sec)
20200417 17:32:53.222: | Validate that there are no transformation instances
20200417 17:32:58.036: | | PASS (4 sec)
20200417 17:32:58.038: | PASS (3 min 39 sec)
Data Migration Plan
20200417 17:32:58.438: Format data migration message
20200417 17:32:58.440: | PASS
We are going to start data migration from the nodes 169.254.89.1, 169.254.89.2,
169.254.89.3, 169.254.89.4, 169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
Once data migration is triggered for a node, it could not be reverted.
Do you confirm the nodes selected for the migration? [yes/no]:
20200417 17:33:56.945: Mark nodes as migration source
20200417 17:41:04.299: | PASS (7 min 7 sec)
Data Migration Trigger
20200417 17:41:04.454: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : logan-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : murray-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : sandy-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : ogden-brass.ecs.lab.emc.com

[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : lehi-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : orem-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : provo-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

20200417 17:41:05.732: | PASS (1 sec)


We are ready to start data migration.
Do you want to continue? [yes/no]:

If you select No, Service Console carries out a health check only and does not mark source
nodes for migration.
If you select Yes, Service Console output lists the source nodes and the amount of data that will
be migrated from each node.
3. Verify in the output the amount of data to be migrated and that the migration is triggered.

20200417 17:51:58.660: Stop chunk re-balance


20200417 17:52:05.742: | PASS (7 sec)
20200417 17:52:05.743: Trigger Data Migration
20200417 17:52:07.832: | PASS (2 sec)
Data Migration Main Phase
20200417 17:52:15.133: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.06TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.03TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.07TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.02TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.03TiB/43.03TiB

20200417 17:52:16.284: | PASS (1 sec)


20200417 17:52:16.287: Print data migration instruction and exit
The data migration is running.
20200417 17:52:16.299: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 24 min 38 sec
Debug log: /opt/emc/caspian/service-
console/log/20200417_172741_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200417_172741_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================
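
Step 3 asks you to confirm the amount of data to be migrated for each source node. As a convenience, the following minimal sketch parses a saved copy of the Service Console output into one record per host. It is not part of the Service Console. The names `HostProgress` and `parse_progress` are hypothetical, and reading the first TiB value as the remaining data and the second as the total is an inference from the sample outputs in this procedure.

```python
import re
from typing import List, NamedTuple


class HostProgress(NamedTuple):
    host: str
    percent: float        # value printed after the progress bar
    state: str            # for example NOT_TRIGGERED, ONGOING, COMPLETE
    remaining_tib: float  # first value of the "x/y" pair (inferred meaning)
    total_tib: float      # second value of the "x/y" pair (inferred meaning)


HOST_RE = re.compile(r"^Host\s*:\s*(\S+)")
PERCENT_RE = re.compile(r"\]\s*([\d.]+)%")
STATE_RE = re.compile(r"^([A-Z_]+)\s+([\d.]+)TiB/([\d.]+)TiB")


def parse_progress(output: str) -> List[HostProgress]:
    """Collect one HostProgress per 'Host :' block in the captured output."""
    results = []
    host = percent = None
    for raw in output.splitlines():
        line = raw.strip()
        if m := HOST_RE.match(line):
            host, percent = m.group(1), None
        elif host and (m := PERCENT_RE.search(line)):
            # The progress bar wraps over two lines; the percentage ends the bar.
            percent = float(m.group(1))
        elif host and percent is not None and (m := STATE_RE.match(line)):
            results.append(HostProgress(host, percent, m.group(1),
                                        float(m.group(2)), float(m.group(3))))
            host = percent = None
    return results


# Shortened sample in the same layout as the output above.
sample = """\
Host : layton-brass.ecs.lab.emc.com
[------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[****************--------------
---------------] 55.87%
ONGOING 18.99TiB/43.04TiB
"""
for p in parse_progress(sample):
    print(p.host, p.state, f"{p.percent:.2f}%", f"{p.remaining_tib}/{p.total_tib} TiB")
```
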

The following output shows a complete migration:

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200422_192851_Data_Migration)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/dbg_robot.log

================================================================================
Data Migration Setup
20200422 19:29:14.784: Is migration running
20200422 19:29:18.853: | PASS (4 sec)
20200422 19:29:18.860: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200422 19:29:18.862: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200422 19:29:33.417: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

20200422 19:29:33.988: | PASS


20200422 19:29:33.990: Print data migration instruction and exit
The data migration is done.
20200422 19:29:33.991: | PASS
Data Migration Post Check
20200422 19:29:38.220: Run health check
20200422 19:29:38.322: | Validate that all partitions are under control
20200422 19:31:04.722: | | PASS (1 min 26 sec)
20200422 19:31:04.724: | PASS (1 min 26 sec)
Data Migration Teardown
20200422 19:31:09.975: Start chunk re-balance
20200422 19:31:14.696: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 2 min 36 sec
Debug log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is done.
================================================================================

When the migration is complete, the Service Console output states that the migration is complete.
If you rerun the run Data_Migration command after that, the output shows a migration failure.
This behavior is tracked as problem SCONSOLE-2383
(https://asdjira.isus.emc.com:8443/browse/SCONSOLE-2383).

Manage migration
Learn how you can manage your data migration in ECS tech refresh.

Monitor migration status using ECS Service Console


Use the ECS Service Console to monitor the migration status.
Steps

1. From the ECS Service Console, run the run Data_Migration command:
service-console run Data_Migration
When the migration is complete, the Service Console shows that the migration is complete in the
output.

2. Verify that the output shows that the migration is in progress.

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200420_140658_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-
console/log/20200420_140654_run_Data_Migration/dbg_robot.log
================================================================================
Data Migration Setup
20200420 14:07:17.065: Is migration running
20200420 14:07:21.273: | PASS (4 sec)
20200420 14:07:21.281: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200420 14:07:21.284: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200420 14:07:22.127: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.82%
ONGOING 19.02TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.87%
ONGOING 18.99TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.76%
ONGOING 19.05TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.95%
ONGOING 18.96TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.13%
ONGOING 19.31TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.92%
ONGOING 18.98TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[******************************************************-------------------------------
---------------] 54.46%
ONGOING 19.59TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.63%
ONGOING 19.09TiB/43.03TiB

20200420 14:07:22.889: | PASS


20200420 14:07:22.891: Pause or resume migration
20200420 14:07:22.892: | PASS
20200420 14:07:22.893: Print data migration instruction and exit
The data migration is running.
20200420 14:07:22.895: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 31 sec
Debug log: /opt/emc/caspian/service-
console/log/20200420_140654_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200420_140654_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================

3. Verify that migration is complete.

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200422_192851_Data_Migration)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/dbg_robot.log
================================================================================
Data Migration Setup
20200422 19:29:14.784: Is migration running
20200422 19:29:18.853: | PASS (4 sec)
20200422 19:29:18.860: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200422 19:29:18.862: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200422 19:29:33.417: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

20200422 19:29:33.988: | PASS


20200422 19:29:33.990: Print data migration instruction and exit
The data migration is done.
20200422 19:29:33.991: | PASS
Data Migration Post Check
20200422 19:29:38.220: Run health check
20200422 19:29:38.322: | Validate that all partitions are under control
20200422 19:31:04.722: | | PASS (1 min 26 sec)
20200422 19:31:04.724: | PASS (1 min 26 sec)
Data Migration Teardown
20200422 19:31:09.975: Start chunk re-balance
20200422 19:31:14.696: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 2 min 36 sec
Debug log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200422_192849_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is done.
================================================================================
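
Both the in-progress run and the completed run end with Status: PASS, so the status line alone does not tell you whether the migration has finished. The following sketch, which assumes that the output was saved as text, reads the summary block and reports completion only when the Messages section contains "The data migration is done." The function names `read_summary` and `migration_done` are hypothetical.

```python
def read_summary(output: str) -> dict:
    """Return the 'Status:' value and the lines listed under 'Messages:'."""
    status = None
    messages = []
    in_messages = False
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Status:"):
            status = line.split(":", 1)[1].strip()
        elif line == "Messages:":
            in_messages = True
        elif in_messages:
            if line.startswith("===="):
                in_messages = False  # the separator closes the Messages block
            elif line:
                messages.append(line)
    return {"status": status, "messages": messages}


def migration_done(summary: dict) -> bool:
    """True only for a passing run whose messages say the migration is done."""
    return (summary["status"] == "PASS"
            and "The data migration is done." in summary["messages"])


# Shortened tail of the completed run shown above.
done_tail = """\
Status: PASS
Time Elapsed: 2 min 36 sec
========
Messages:
The data migration is done.
========
"""
print(migration_done(read_summary(done_tail)))
```
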

Monitor migration status using ECS UI Grafana Dashboard


Use the ECS UI Grafana Dashboard to monitor the migration status.
About this task

NOTE:
Allow 5 to 10 minutes for the Grafana dashboard to display all the nodes.

Steps

1. In the ECS UI, go to Advanced Monitoring to open the Grafana Dashboard.
2. Select Tech Refresh: Data Migration from the drop-down menu at the top of the page.
The dashboard provides various migration details.
For example:

Figure 1. ECS UI Grafana Dashboard Data Migration Status

Pause and resume migration
Use the ECS Service Console to pause and resume data migration.

Pause and resume migration using ECS Service Console


About this task

When data migration is paused for a node, the ongoing data migration stops, and the migration framework
periodically checks and waits until data migration is resumed.

Steps

1. From the ECS Service Console, run the run Data_Migration command with the --operation pause
option. For example:
service-console run Data_Migration --operation pause
Output that indicates that the migration is paused, such as the line "The migration is paused" in the
following example, appears:

/opt/emc/bin/service-console run Data_Migration --operation pause


Service Console is running on node 169.254.89.1 (suite 20200418_174821_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-
console/log/20200418_174817_run_Data_Migration/dbg_robot.log
================================================================================
Data Migration Setup
20200418 17:48:39.484: Is migration running
20200418 17:48:43.963: | PASS (4 sec)
20200418 17:48:43.971: Check data migration parameters
Node(s) to be paused: 169.254.89.1, 169.254.89.2, 169.254.89.3, 169.254.89.4,
169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
20200418 17:48:47.449: | PASS (3 sec)
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200418 17:48:48.072: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.83%
ONGOING 22.89TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.80%
ONGOING 22.90TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.84%
ONGOING 22.89TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.90%
ONGOING 22.86TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.12%
ONGOING 23.19TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.96%
ONGOING 22.84TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*********************************************----------------------------------------
---------------] 45.52%
ONGOING 23.44TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.68%
ONGOING 22.94TiB/43.03TiB

20200418 17:48:48.938: | PASS

20200418 17:48:48.939: Pause or resume migration


The migration is paused
20200418 17:48:51.930: | PASS (2 sec)
20200418 17:48:51.931: Print data migration instruction and exit
The data migration is running.
20200418 17:48:51.933: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 35 sec
Debug log: /opt/emc/caspian/service-
console/log/20200418_174817_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200418_174817_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================

2. Verify that the output shows that the migration is paused.

3. From the ECS Service Console, run the run Data_Migration command with the --operation resume
option. For example:
service-console run Data_Migration --operation resume
Output that indicates that the migration is resumed, such as the line "The migration is resumed" in the
following example, appears:

/opt/emc/bin/service-console run Data_Migration --operation resume


Service Console is running on node 169.254.89.1 (suite 20200419_003106_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-
console/log/20200419_003103_run_Data_Migration/dbg_robot.log
================================================================================
Data Migration Setup
20200419 00:31:23.896: Is migration running
20200419 00:31:27.679: | PASS (3 sec)
20200419 00:31:27.687: Check data migration parameters
Node(s) to be resumed: 169.254.89.1, 169.254.89.2, 169.254.89.3, 169.254.89.4,
169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
20200419 00:31:30.314: | PASS (2 sec)
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200419 00:31:30.849: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.90%
PARTIALLY_PAUSED 22.86TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.87%
PARTIALLY_PAUSED 22.87TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.91%
PARTIALLY_PAUSED 22.86TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.98%
PARTIALLY_PAUSED 22.82TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.19%
PARTIALLY_PAUSED 23.16TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[***********************************************--------------------------------------
---------------] 47.05%
PARTIALLY_PAUSED 22.80TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*********************************************----------------------------------------
---------------] 45.58%
PARTIALLY_PAUSED 23.41TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.75%
PARTIALLY_PAUSED 22.91TiB/43.03TiB

20200419 00:31:31.487: | PASS


20200419 00:31:31.489: Pause or resume migration
The migration is resumed
20200419 00:31:34.190: | PASS (2 sec)
20200419 00:31:34.192: Print data migration instruction and exit
The data migration is running.
20200419 00:31:34.194: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 34 sec
Debug log: /opt/emc/caspian/service-
console/log/20200419_003103_run_Data_Migration/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200419_003103_run_Data_Migration/log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================
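
In the sample outputs above, the run that pauses the migration still lists the hosts as ONGOING and ends with "The data migration is running.", so the confirmation line is the clearest indicator that the operation took effect. The following sketch builds the command line for either operation and checks a saved output for the matching confirmation line. The helper names are hypothetical, and the sketch uses only the --operation values shown in this procedure.

```python
import shlex

# Confirmation lines as they appear in the sample outputs above.
CONFIRMATION = {
    "pause": "The migration is paused",
    "resume": "The migration is resumed",
}


def operation_command(operation: str) -> list:
    """Return the argument list for a pause or resume run."""
    if operation not in CONFIRMATION:
        raise ValueError(f"unsupported operation: {operation!r}")
    return ["service-console", "run", "Data_Migration", "--operation", operation]


def operation_confirmed(output: str, operation: str) -> bool:
    """True when the saved output contains the confirmation line for the operation."""
    expected = CONFIRMATION[operation]
    return any(line.strip() == expected for line in output.splitlines())


print(shlex.join(operation_command("pause")))
```
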

Data migration throttling


ECS provides the ability to throttle data migration throughput, or data movement, from source nodes to
target nodes to better meet customer environment needs. You can prioritize data migration by using the
available settings.

Tune migration throttling


About this task

ECS provides three settings:

• Low - This is the default setting, where there is no throttling. Data migration runs at the fastest
possible throughput.
• Mid - Data migration is throttled. Data migration throughput is lower than with Low, and more
resources are available for front-end workload.
• High - Data migration is throttled the most. Data migration throughput is the lowest.

A high setting slows down the migration the most of all the throttling settings, and leaves more
resources for customer workloads.
A low setting allows for the fastest migration of the available settings. However, it leaves fewer
resources available for customer workloads.
In most cases, start with throttling set to high and then reduce the throttle over time. These settings can
also be used to customize the solution by throttling more during peak hours and reducing to no throttling
during off-peak hours.
Dell EMC Professional Services can review the customer requirements and determine the appropriate
level of throttling to configure. For example, an environment or configuration that is
sensitive to latency or additional workloads on the system may configure high throttling. A customer
environment that is not sensitive to latency or additional workloads, and where the migration should
complete as quickly as possible, may opt to leave the default (low) setting.
Select the appropriate setting based on customer expectation and needs.

Steps

Run the following command to change the throttling level:

service-console run Data_Migration_Throttling --set low|mid|high

Where the values are:

• low, or maximum migration throughput. This is the default setting.
• mid, or medium migration throughput.
• high, or lowest migration throughput.

For example:

service-console run Data_Migration_Throttling --set low


Service Console is running on node 169.254.89.1 (suite
20200414_154415_Data_Migration_Throttling)
Service console version: 5.0.0.0-20635.ac5b92b33
Debug log: /opt/emc/caspian/service-
console/log/20200414_154412_run_Data_Migration_Throttling/dbg_robot.log
================================================================================
Data Migration Throttling
20200414 15:44:26.856: Set or get max task number in data movement worker
The max task number in data movement worker is set to 50
20200414 15:44:30.875: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 22 sec
Debug log: /opt/emc/caspian/service-
console/log/20200414_154412_run_Data_Migration_Throttling/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20200414_154412_run_Data_Migration_Throttling/log.html
================================================================================
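
The guidance in this task mentions throttling more during peak hours and removing the throttle during off-peak hours. The following sketch shows one way to pick a level from the hour of day and produce the matching command line. The peak window of 08:00 to 18:00 and the function names are assumptions for illustration only. Agree on the real schedule with the customer.

```python
LEVELS = ("low", "mid", "high")


def throttle_for_hour(hour: int, peak_start: int = 8, peak_end: int = 18) -> str:
    """Return 'high' (most throttling) in the peak window, else 'low' (no throttling).

    The default peak window is a hypothetical example, not an ECS default.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0-23")
    return "high" if peak_start <= hour < peak_end else "low"


def throttling_command(level: str) -> str:
    """Return the command line that applies the given throttling level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return f"service-console run Data_Migration_Throttling --set {level}"


for hour in (3, 9, 20):
    print(hour, throttling_command(throttle_for_hour(hour)))
```
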

NOTE:
If you want to check the current migration speed before changing the throttle, run this command:
service-console run Data_Migration

For example:

Service console version: 6.7.0.0-21476.9955dc902e


Debug log: /opt/emc/caspian/service-
console/log/20210604_205228_run_Data_Migration_Throttling/dbg_robot.log
================================================================================
Data Migration Throttling
20210604 20:52:38.939: Set or get max task number in data movement worker
The max task number in data movement worker is 50
20210604 20:52:42.053: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 15 sec
Debug log: /opt/emc/caspian/service-
console/log/20210604_205228_run_Data_Migration_Throttling/dbg_robot.log
HTML log: /opt/emc/caspian/service-
console/log/20210604_205228_run_Data_Migration_Throttling/log.html
================================================================================

Migration alerts
Learn about migration alerting.
The ECS UI provides alerts for the following circumstances:

• Data migration for each node has two alert levels: Level-1 and Level-2.
• Alerts are reported when the Level-1 or Level-2 data for each node is migrated.
• Data migration for each node is complete when both Level-1 data and Level-2 data are migrated.
• Data migration has had no progress for six hours.
• Capacity on target nodes has reached a certain threshold.
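
To make the "no progress for six hours" condition concrete, the following sketch applies the same rule to remaining-data samples that you have collected yourself, for example from repeated Service Console runs. It illustrates the condition only and is not how ECS implements the alert. The function name, the sample values, and the sample format are hypothetical.

```python
from datetime import datetime, timedelta

STALL_WINDOW = timedelta(hours=6)  # threshold named in the alert list above


def is_stalled(samples, now, window=STALL_WINDOW) -> bool:
    """samples: chronological list of (datetime, remaining_tib) pairs.

    Returns True when the remaining amount has stayed at its latest value
    for at least `window` and there is still data left to migrate.
    """
    if not samples:
        return False
    latest = samples[-1][1]
    if latest == 0:
        return False  # nothing left to migrate, so this is not a stall
    flat_since = samples[-1][0]
    for when, remaining in reversed(samples):
        if remaining != latest:
            break
        flat_since = when  # earliest sample of the trailing unchanged run
    return now - flat_since >= window


# Hypothetical samples for one source node.
start = datetime(2020, 4, 18, 12, 0)
history = [
    (start, 23.00),
    (start + timedelta(hours=1), 22.90),
    (start + timedelta(hours=4), 22.90),
    (start + timedelta(hours=8), 22.90),
]
print(is_stalled(history, now=start + timedelta(hours=8)))
```
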

Enable migration capacity alerts


You can enable migration capacity alerts in the ECS UI Storage Pool Management section.
Steps

1. In the ECS UI, go to Manage > Storage Pools.
The Storage Pool Management section opens.
2. Select the storage pool for which you want to enable migration capacity alerts, and click Edit.
The Edit Storage Pool window for the selected storage pool opens.
3. Set the alert thresholds to the desired values, and click Save.

Remove a node from a cluster using ECS Service Console


This chapter describes how to remove a node from a cluster by using the ECS Service Console node
evacuation commands.
The example outputs in the procedures may not represent the exact output. Use them for reference only.

NOTE:
Remove the source nodes from the load balancers after completing the data migration and before initiating the node evacuation.

Run optional pre-node evacuation health checks


If you choose, use the ECS Service Console to run health checks before you remove a node from a
cluster. There are two different commands, which run different checks.

Steps

1. From the ECS Service Console, run the run Health_Check command with the pre_node_evacuation
tag. For example:
service-console run Health_Check --tags pre_node_evacuation
Output such as the following appears:

Service Console is running on node 169.254.89.1 (suite 20200408_201513_Health_Check)
Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/dbg_robot.log
================================================================================
Health Check
20200408 20:15:35.961: Execute Health Checks
20200408 20:15:35.975: | Validate source nodes availability
20200408 20:15:39.431: | | PASS (3 sec)
20200408 20:15:39.432: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 38 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/log.html
================================================================================
2. Run the run Health_Check command with the pre_evacuation tag. For example:
service-console run Health_Check --tags pre_evacuation

Output such as the following appears:

service-console run Health_Check --tags pre_evacuation

Service Console is running on node 169.254.89.1 (suite 20200408_201709_Health_Check)
Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/dbg_robot.log
================================================================================
Health Check
20200408 20:17:31.785: Execute Health Checks
20200408 20:17:31.804: | Validate time drift
20200408 20:17:37.685: | | PASS (5 sec)
20200408 20:17:37.687: | Check DT status
Checking DT status (with timeout 10 min).
20200408 20:18:10.573: | | PASS (32 sec)
20200408 20:18:10.576: | Check on-going PSO or VDC removal from RG
20200408 20:18:33.316: | | PASS (22 sec)
20200408 20:18:33.317: | Validate that there are no transformation instances
20200408 20:18:37.117: | | PASS (3 sec)
20200408 20:18:37.118: | PASS (1 min 5 sec)
================================================================================
Status: PASS
Time Elapsed: 1 min 37 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/log.html
================================================================================
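
When a health check run contains many checks, it can help to list each check next to its result. The following sketch pairs each check name with the result line nested one level below it. It assumes that the output was saved with one log entry per line, and it treats any all-capitals word, optionally followed by a duration in parentheses, as a result. Only PASS appears in this procedure, so other result words are an assumption. The name `parse_checks` is hypothetical.

```python
import re

# Timestamped entry: "YYYYMMDD HH:MM:SS.mmm: " + nesting bars + text.
ENTRY_RE = re.compile(r"^\d{8} \d{2}:\d{2}:\d{2}\.\d{3}: ((?:\| )*)(.*)$")
# Result line: an all-capitals word, optionally followed by "(duration)".
RESULT_RE = re.compile(r"^([A-Z]+)(?: \((.*)\))?$")


def parse_checks(output: str) -> list:
    """Return (check name, result, duration) tuples in completion order."""
    open_checks = {}  # nesting depth -> name of the check awaiting a result
    finished = []
    for raw in output.splitlines():
        m = ENTRY_RE.match(raw.strip())
        if not m:
            continue  # untimestamped lines carry no check or result
        depth = m.group(1).count("|")
        text = m.group(2).strip()
        result = RESULT_RE.match(text)
        if result and (depth - 1) in open_checks:
            name = open_checks.pop(depth - 1)
            finished.append((name, result.group(1), result.group(2)))
        else:
            open_checks[depth] = text
    return finished


sample_checks = """\
20200408 20:17:31.785: Execute Health Checks
20200408 20:17:31.804: | Validate time drift
20200408 20:17:37.685: | | PASS (5 sec)
20200408 20:17:37.687: | Check DT status
Checking DT status (with timeout 10 min).
20200408 20:18:10.573: | | PASS (32 sec)
20200408 20:18:37.118: | PASS (1 min 5 sec)
"""
for name, result, duration in parse_checks(sample_checks):
    print(f"{result:5} {name} ({duration})")
```
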

Remove a node from a cluster using ECS Service Console


Use the ECS Service Console run Node_Evacuation command on the current installer node to
configure the new installer node. This command relocates the ECS installer and Service Console to the
first target node (target node 1).
Prerequisites

• Ensure that the cluster is in a healthy state before you run the run Node_Evacuation
command. If there are any failed source nodes, replace them before starting this procedure.
• Ensure that the version of the production package used in the node evacuation operation is the
same as the one running on the source nodes.
• Ensure that you copy the production package and Service Console bundle to the proper
locations on the installer node. Ensure that you select the first target node as the new installer
node.

Steps

• From the ECS Service Console, run the run Node_Evacuation command:

service-console run Node_Evacuation --target-node <private.4 IP
addresses of nodes to be evacuated (source nodes)> --production-package
/home/admin/install/production.tgz --service-console-bundle
/tmp/sc/service-console.tgz
For example:

service-console run Node_Evacuation --target-node
169.254.19.9,169.254.19.10,169.254.19.11,169.254.19.12 --production-package
/tmp/install/production.tgz --service-console-bundle
/tmp/service_console/service-console.tgz
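
Because this command takes a comma-separated address list and two file paths, a missing space or a mistyped address is easy to introduce. The following sketch assembles the command from its parts and rejects malformed IP addresses before anything is run. It uses only the options shown in this step. The function name is hypothetical, and the sketch does not check that the addresses belong to source nodes or that the files exist.

```python
import ipaddress
import shlex


def node_evacuation_command(source_nodes, production_package, service_console_bundle):
    """Return the argument list for the first Node_Evacuation run."""
    if not source_nodes:
        raise ValueError("at least one source node address is required")
    for node in source_nodes:
        ipaddress.ip_address(node)  # raises ValueError on a malformed address
    return [
        "service-console", "run", "Node_Evacuation",
        "--target-node", ",".join(source_nodes),
        "--production-package", production_package,
        "--service-console-bundle", service_console_bundle,
    ]


# Values taken from the example in this step.
cmd = node_evacuation_command(
    ["169.254.19.9", "169.254.19.10", "169.254.19.11", "169.254.19.12"],
    "/tmp/install/production.tgz",
    "/tmp/service_console/service-console.tgz",
)
print(shlex.join(cmd))
```
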

• Verify that the output shows that the node is being evacuated and that the Service Console is
being relocated.

admin@boston-auburn:~> service-console run Node_Evacuation --target-node
169.254.19.9,169.254.19.10,169.254.19.11,169.254.19.12 --production-package
/tmp/install/production.tgz --service-console-bundle /tmp/service_console/service-console.tgz
Service console version: 6.0.0.0-20939.4fee7380c
Debug log: /opt/emc/caspian/service-
console/log/20200908_170939_run_Node_Evacuation/dbg_robot.log
================================================================================
Node Evacuation Setup
20200908 17:09:56.495: Run health check
20200908 17:09:56.573: | Validate source nodes availability
20200908 17:09:58.213: | | PASS (1 sec)
20200908 17:09:58.214: | PASS (1 sec)
20200908 17:09:58.218: Get new installer node for node evacuation
The installer node 169.254.19.9 is going to be evacuated
New installer node: 169.254.104.1
20200908 17:09:58.219: | PASS
Node Evacuation Installer Check
20200908 17:10:03.320: Relocate installer
Extracting /tmp/install/production.tgz to
169.254.104.1:/tmp/service_console_production_package
20200908 17:12:01.528: | PASS (1 min 58 sec)
Node Evacuation SC Check
20200908 17:12:05.655: Relocate SC
Service Console was installed on node 169.254.104.1
20200908 17:12:31.237: | PASS (25 sec)
Node Evacuation SC Configuration Check
20200908 17:12:35.360: Check cluster.ini after SC relocation
20200908 17:15:03.100: | PASS (2 min 27 sec)
Node Evacuation Next Steps
Installer and SC relocation is done.
Node Evacuation Pre Check
Node Evacuation VNEST Data Migration
Pending
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration
Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending
Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 5 min 31 sec
Debug log: /opt/emc/caspian/service-console/log/20200908_170939_run_Node_Evacuation/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200908_170939_run_Node_Evacuation/log.html
================================================================================
Messages:
To proceed, run the following command from node 169.254.104.1:
service-console run Node_Evacuation
================================================================================
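
The Messages section at the end of this output names the node from which the next run must be started. As a rough sketch, assuming the output has been saved as text, that node and command could be extracted as follows. The function name next_run_instruction is hypothetical, and the pattern is based only on the message wording shown in this example.

```python
import re

# Matches the "To proceed" message and the command printed on the following line.
PROCEED_RE = re.compile(
    r"To proceed, run the following command from node\s+"
    r"(\d{1,3}(?:\.\d{1,3}){3}):\s*\n\s*(\S.*)"
)


def next_run_instruction(output):
    """Return (node_ip, command) from the Messages section, or None if absent."""
    match = PROCEED_RE.search(output)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    sample = (
        "Messages:\n"
        "To proceed, run the following command from node 169.254.104.1:\n"
        "service-console run Node_Evacuation\n"
    )
    print(next_run_instruction(sample))
```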

• Rerun the Node_Evacuation command from the new installer node:

service-console run Node_Evacuation

Output such as the following appears:

service-console run Node_Evacuation
Service Console is running on node 169.254.97.1 (suite
20200422_201550_Node_Evacuation)
Cannot write to ZK server, the operation Node_Evacuation is running locally (suite
20200422_201620_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/dbg_robot.log
================================================================================
Node Evacuation Setup
20200422 20:16:38.361: Run health check
20200422 20:16:38.439: | Validate source nodes availability
20200422 20:16:41.653: | | PASS (3 sec)
20200422 20:16:41.654: | PASS (3 sec)
20200422 20:16:41.656: Get new installer node for node evacuation
20200422 20:16:41.658: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Node Evacuation Pre Check
20200422 20:16:57.203: Check target nodes data migration status
Data migration status for node 169.254.89.1 is COMPLETE
Data migration status for node 169.254.89.2 is COMPLETE
Data migration status for node 169.254.89.3 is COMPLETE
Data migration status for node 169.254.89.4 is COMPLETE
Data migration status for node 169.254.89.5 is COMPLETE
Data migration status for node 169.254.89.6 is COMPLETE
Data migration status for node 169.254.89.7 is COMPLETE
Data migration status for node 169.254.89.8 is COMPLETE
20200422 20:17:01.838: | PASS (4 sec)
20200422 20:17:01.839: Run health check
20200422 20:17:01.846: | Validate time drift
20200422 20:17:07.661: | | PASS (5 sec)
20200422 20:17:07.663: | Check DT status
Checking DT status (with timeout 10 min).
20200422 20:17:36.470: | | PASS (28 sec)
20200422 20:17:36.471: | Check on-going PSO or VDC removal from RG
20200422 20:17:50.986: | | PASS (14 sec)
20200422 20:17:50.987: | Validate that there are no transformation instances
20200422 20:17:54.639: | | PASS (3 sec)
20200422 20:17:54.640: | PASS (52 sec)
20200422 20:17:54.641: Validate DT Table Rebalancing
20200422 20:18:10.956: | PASS (16 sec)
Node Evacuation VNEST Data Migration
20200422 20:18:14.338: VNEST data migration status
20200422 20:18:15.764: | PASS (1 sec)
Number of remaining VNEST member nodes to migrate: 5
Starting migration of VNEST data from node 169.254.89.2 to node 169.254.97.1
20200422 20:18:16.409: Migrate VNEST data
20200422 20:18:17.805: | PASS (1 sec)
In Progress
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration
Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending
Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 2 min 1 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/log.html
================================================================================
Messages:
Cluster.ini check is not needed
VNEST data migration is in progress, please wait some time and re-run the operation.
======================================================================
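
In the output above, each stage heading ("Node Evacuation ...") is followed by either timestamped steps or a one-line state such as "Pending", "In Progress", or "Completed on previous run". As an illustrative sketch only, saved output could be summarized per stage as follows. The function name stage_statuses is hypothetical, and the three state strings are the only ones assumed, because they are the only ones that appear in these examples.

```python
STAGE_PREFIX = "Node Evacuation "
STATE_LINES = {"Pending", "In Progress", "Completed on previous run"}


def stage_statuses(output):
    """Map each stage name to its one-line state, or None if no state line follows it."""
    statuses = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith(STAGE_PREFIX):
            # A new stage heading: remember it until a state line is seen.
            current = line[len(STAGE_PREFIX):]
            statuses[current] = None
        elif line in STATE_LINES and current is not None:
            statuses[current] = line
    return statuses


if __name__ == "__main__":
    sample = """Node Evacuation SC Check
Completed on previous run
Node Evacuation VNEST Data Migration
20200422 20:18:16.409: Migrate VNEST data
20200422 20:18:17.805: | PASS (1 sec)
In Progress
Node Evacuation Disks Removal
Pending
"""
    for stage, state in stage_statuses(sample).items():
        print(f"{stage}: {state}")
```

A stage that ran in the current invocation and printed only timestamped steps is reported with no state (None).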

• Rerun the command on the same node to check the status.

A single run does not complete the evacuation, so you must monitor it. The more clusters you have,
the more often you must run the command. You can rerun the command as often as your cycles and
bandwidth permit.
Output such as the following appears:

service-console run Node_Evacuation


Service Console is running on node 169.254.97.1 (suite
20200422_203316_Node_Evacuation)
Cannot write to ZK server, the operation Node_Evacuation is running locally (suite
20200422_203346_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/dbg_robot.log
================================================================================
Node Evacuation Setup
20200422 20:34:04.269: Run health check
20200422 20:34:04.360: | Validate source nodes availability
20200422 20:34:07.692: | | PASS (3 sec)
20200422 20:34:07.694: | PASS (3 sec)
20200422 20:34:07.696: Get new installer node for node evacuation
20200422 20:34:07.697: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Completed on previous run
Node Evacuation Pre Check
Completed on previous run
Node Evacuation VNEST Data Migration
20200422 20:34:26.656: VNEST data migration status
20200422 20:34:28.109: | PASS (1 sec)
20200422 20:34:28.127: Finalize VNEST data migration
20200422 20:34:29.824: | PASS (1 sec)
Number of remaining VNEST member nodes to migrate: 3
Starting migration of VNEST data from node 169.254.89.1 to node 169.254.97.5
20200422 20:36:30.683: Migrate VNEST data
20200422 20:36:32.131: | PASS (1 sec)
In Progress
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration
Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending
Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 2 min 52 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/log.html
================================================================================
Messages:
VNEST data migration is in progress, please wait some time and re-run the operation.
================================================================================
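
Successive runs report a decreasing count of VNEST member nodes still to be migrated (5, then 3, in these examples), and the Messages section asks for another run while migration is in progress. A minimal sketch of reading both signals from saved output follows. The function name vnest_progress is hypothetical, and the two text patterns are taken from the example output only.

```python
import re

REMAINING_RE = re.compile(
    r"Number of remaining VNEST member nodes to migrate:\s*(\d+)"
)
IN_PROGRESS_MESSAGE = "VNEST data migration is in progress"


def vnest_progress(output):
    """Return (remaining_nodes_or_None, rerun_needed) for one run's output."""
    match = REMAINING_RE.search(output)
    remaining = int(match.group(1)) if match else None
    # The Messages section asks for a rerun while migration is still in progress.
    return remaining, IN_PROGRESS_MESSAGE in output


if __name__ == "__main__":
    sample = (
        "Number of remaining VNEST member nodes to migrate: 3\n"
        "Messages:\n"
        "VNEST data migration is in progress, please wait some time and "
        "re-run the operation.\n"
    )
    print(vnest_progress(sample))
```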

• Verify that the output shows that the VNEST data migration, node evacuation disks removal, and
fabric migration are complete.
Output such as the following appears:

service-console run Node_Evacuation


Service Console is running on node 169.254.97.1 (suite
20200423_131352_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/dbg_robot.log
================================================================================
Node Evacuation Setup
20200423 13:14:12.800: Run health check
20200423 13:14:12.890: | Validate source nodes availability
20200423 13:14:16.417: | | PASS (3 sec)
20200423 13:14:16.418: | PASS (3 sec)
20200423 13:14:16.421: Get new installer node for node evacuation
20200423 13:14:16.422: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Completed on previous run
Node Evacuation Pre Check
Completed on previous run
Node Evacuation VNEST Data Migration
Completed on previous run
Node Evacuation Disks Removal
Completed on previous run
Node Evacuation Store Nodes Mapping
Completed on previous run
Node Evacuation Stat Migrate Totals
Completed on previous run
Node Evacuation Initiate Fabric Migration
Completed on previous run
Node Evacuation Proceed Fabric Migration
Completed on previous run
Node Evacuation DC Node Removal
Completed on previous run
Node Evacuation Endpoints Removal
Completed on previous run
Node Evacuation Node Removal
20200423 13:15:01.075: Delete nodes for evacuation
20200423 18:17:00.881: | PASS (5 hour(s) 1 min 59 sec)
Node Evacuation VNEST Node Removal
20200423 18:17:05.170: Delete nodes from VNEST
20200423 18:17:08.180: | PASS (3 sec)
Node Evacuation Monitoring Node Removal
20200423 18:17:11.438: Delete nodes from Monitoring
20200423 18:17:13.163: | PASS (1 sec)
Node Evacuation Post Check
20200423 18:17:16.414: Validate Fabric nodes evacuation
20200423 18:17:22.613: | PASS (6 sec)
20200423 18:17:22.614: Get nodes from API
20200423 18:17:35.510: | PASS (12 sec)
20200423 18:17:35.511: Validate SSM nodes evacuation
20200423 18:17:35.516: | PASS
Node Evacuation Teardown
20200423 18:17:38.828: Generate var files templates
20200423 18:17:39.825: | PASS
20200423 18:17:39.827: Generate cluster.ini
Saved existing cluster.ini to /opt/emc/config/local/cluster.ini.back
Preserved files:
/opt/emc/config/local/host_vars
/opt/emc/config/local/host_vars/169.254.89.1
/opt/emc/config/local/host_vars/169.254.97.1
/opt/emc/config/local/group_vars
/opt/emc/config/local/group_vars/datanodes
[INFO] generated cluster.ini file with the content:

######
# This file was automatically generated by the Service Console.
# Please verify that it reflects the actual cluster topology.
# Credentials (BMC, Mgmt API, etc) should be set in separate files.
# Use file group_vars/datanodes to set cluster-wide variables.
# Use file host_vars/HOST_IP to set node-specific variables.
######

[datanodes:children]
vdc2

[vdc2:children]
orchid

[orchid:vars]
rack_id=97
rack_name=orchid
rack_psnt=psnt2
rack_dns_server=10.249.255.254
rack_dns_search=ecs.lab.emc.com,lss.emc.com,isus.emc.com,centera.emc.com,corp.emc.com,emc.com
rack_ntp_server=10.249.255.254,10.243.84.254
rack_ns_switch=files,mdns4_minimal,[NOTFOUND=return],dns,mdns4
sc_collected=True

[orchid:children]
node_169_254_97_1 # Installer / SC node
node_169_254_97_2
node_169_254_97_3
node_169_254_97_4
node_169_254_97_5
node_169_254_97_6
node_169_254_97_7
node_169_254_97_8

[node_169_254_97_1]
169.254.97.1

[node_169_254_97_1:vars]
bmc_ip=10.249.252.201

[node_169_254_97_2]
169.254.97.2

[node_169_254_97_2:vars]
bmc_ip=10.249.252.202

[node_169_254_97_3]
169.254.97.3

[node_169_254_97_3:vars]
bmc_ip=10.249.252.203

[node_169_254_97_4]
169.254.97.4

[node_169_254_97_4:vars]
bmc_ip=10.249.252.204

[node_169_254_97_5]
169.254.97.5

[node_169_254_97_5:vars]
bmc_ip=10.249.252.211

[node_169_254_97_6]
169.254.97.6

[node_169_254_97_6:vars]
bmc_ip=10.249.252.212

[node_169_254_97_7]
169.254.97.7

[node_169_254_97_7:vars]
bmc_ip=10.249.252.213

[node_169_254_97_8]
169.254.97.8

[node_169_254_97_8:vars]
bmc_ip=10.249.252.214

20200423 18:20:44.066: | PASS (3 min 4 sec)


20200423 18:20:44.073: Show SRS notification
20200423 18:20:47.086: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 5 hour(s) 7 min 4 sec
Debug log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/log.html
================================================================================
Found private.4 IPs from zookeeper: 169.254.97.8, 169.254.97.1, 169.254.97.5,
169.254.97.4, 169.254.97.3, 169.254.97.7, 169.254.97.6, 169.254.97.2
Found private.4 IPs from zookeeper: 169.254.97.1, 169.254.97.2, 169.254.97.3,
169.254.97.4, 169.254.97.5, 169.254.97.6, 169.254.97.7, 169.254.97.8
Created cluster configuration file /opt/emc/config/local/cluster.ini at 169.254.97.1
Successfully generated cluster.ini.
However it is suggested to verify the generated cluster.ini before proceeding with
other service procedures.
If the cluster.ini contains WARNING comment or has <vdc>_unknown section(s),
it is mandatory to correct the cluster.ini manually before proceeding with other
service procedures.
================================================================================

If the node fails during node evacuation before fabric migration completes, replace the node and
return to the node evacuation procedure.
If the node fails during node evacuation after fabric migration completes, the VNEST data is
already migrated and the failure does not affect it.
The ?endpoint request lists the available nodes only after the node evacuation process.
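
The final run's Messages section states that a generated cluster.ini containing a WARNING comment or a <vdc>_unknown section must be corrected manually. As a hedged sketch, a copy of the file could be screened for both conditions as follows. The function name cluster_ini_problems is hypothetical, and the check assumes that WARNING comments follow a "#" character, which is how comments appear in the generated file shown above.

```python
import re

SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$")


def cluster_ini_problems(text):
    """Return (line_number, description) pairs for content that needs manual review."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        section = SECTION_RE.match(line)
        if section:
            # Drop suffixes such as ":children" or ":vars" before checking the name.
            name = section.group(1).split(":", 1)[0]
            if name.endswith("_unknown"):
                problems.append((number, f"unknown section [{section.group(1)}]"))
        # Assumption: WARNING markers appear inside "#" comments.
        if "#" in line and "WARNING" in line.split("#", 1)[1]:
            problems.append((number, "WARNING comment"))
    return problems


if __name__ == "__main__":
    sample = "[datanodes:children]\nvdc2\n\n[vdc2:children]\norchid\n"
    print(cluster_ini_problems(sample) or "no WARNING comments or unknown sections")
```

An empty result does not replace the manual review of the topology that the message recommends; it only rules out the two mandatory-correction cases.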

Move licenses to new ECS system
Steps

1. Go to eLicensing Central at https://powerlinklicensing.emc.com/, regenerate the license, and add
   the capacity as required to the existing license.
   1. Regenerate the license to remove decommissioned product serial number tags (PSNTs) and
      add new PSNTs.
      The license maintains the same software ID (SWID).
   2. Check and add unstructured storage capacity to the license as required.

   NOTE:
   Do not use a license file with a different SWID unless directed to do so.
2. In the ECS UI, follow these steps to reapply the license to the tech-refreshed ECS VDC.
   1. Go to Settings > Licensing and note the VDC Serial Number before applying the
      regenerated license.
   2. Select Settings > Licensing > New License.
   3. Browse to the xxxxx.lic file and click Open.
   4. Click UPLOAD to upload the license file.
   5. Verify that the VDC Serial Number has not changed.
   6. Click the arrow mark on any row and check the PSNT column. Verify the number of
      racks and the PSNT serial numbers.
3. In the ECS UI, follow these steps to delete the SRS entries and re-create them as required.
   1. In Settings > ESRS, note the IP and PORT number of each SRS entry.
   2. Under Settings > ESRS > Actions, delete the entry.
   3. Click the New Server option to re-create the entry.
   4. After the entries are created and the status shows as connected, click Action > Test Dial
      Home for each entry.
   5. Verify that the Test Dial Home status is Passed for each entry.
4. File an IBG report requesting that the decommissioned PSNTs be set to uninstalled.

Carry out post-Tech Refresh checks

After completing the Tech Refresh, verify the result by using the Service Console Health Check and
the ECS UI Storage Pool page.
About this task

Run the Health Check from the Service Console. Check the ECS UI Storage Pool page for evacuated
nodes.
Steps

1. From the ECS Service Console, run the run Health_Check command:
service-console run Health_Check

Output such as the following appears:

service-console run Health_Check


Service console version: 5.0.0.0-20670.6cb91f5d6
Debug log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/dbg_robot.log
================================================================================
Health Check
20200504 20:20:37.490: Execute Health Checks
20200504 20:20:37.498: | Check DNS settings
20200504 20:20:45.763: | | PASS (8 sec)
20200504 20:20:45.765: | Check swap memory
20200504 20:20:47.228: | | PASS (1 sec)
20200504 20:20:47.230: | Check MAC 3A patch
20200504 20:20:48.865: | | PASS (1 sec)
20200504 20:20:48.867: | Check network interfaces
20200504 20:20:50.318: | | PASS (1 sec)
20200504 20:20:50.320: | Check PBR consistency
20200504 20:20:53.122: | | PASS (2 sec)
20200504 20:20:53.124: | Check preset.cfg file
20200504 20:21:00.990: | | PASS (7 sec)
20200504 20:21:00.992: | Check static routes config files consistency for racks
20200504 20:21:00.999: | | PASS
20200504 20:21:01.000: | Check that no nodes need a cold or warm power cycle
20200504 20:21:02.701: | | PASS (1 sec)
20200504 20:21:02.703: | DOM disks
Skip on current hardware.
20200504 20:21:04.005: | | PASS (1 sec)
20200504 20:21:04.007: | Check LVRoot size
20200504 20:21:05.503: | | PASS (1 sec)
20200504 20:21:05.505: | NAN Vlan
20200504 20:21:07.889: | | PASS (2 sec)
20200504 20:21:07.891: | Check SATA DOM OS installation
20200504 20:21:36.623: | | PASS (28 sec)
20200504 20:21:36.625: | Static routes validation
20200504 20:21:36.638: | | PASS
20200504 20:21:36.639: | Validate BE switch OS version
20200504 20:22:07.810: | | PASS (31 sec)
20200504 20:22:07.811: | Validate BMC availability
20200504 20:22:17.619: | | PASS (9 sec)
20200504 20:22:17.622: | Validate BMC settings
20200504 20:22:22.733: | | PASS (5 sec)
20200504 20:22:22.735: | Validate that disk SMART self-test is enabled in
/etc/cron.daily
20200504 20:22:23.324: | | PASS
20200504 20:22:23.326: | Validate EX300 NIC FW version
Skip: Validate only EX300 NIC FW version
20200504 20:22:23.332: | | PASS
20200504 20:22:23.333: | Validate FE switch OS version
20200504 20:22:48.205: | | PASS (24 sec)
20200504 20:22:48.207: | Validate FE switches uplink status
20200504 20:24:00.419: | | PASS (1 min 12 sec)
20200504 20:24:00.421: | Validate the kernel version
20200504 20:24:04.323: | | PASS (3 sec)
20200504 20:24:04.325: | Validate NAN config
20200504 20:24:07.061: | | PASS (2 sec)
20200504 20:24:07.063: | Check NIC FW versions for consistency
20200504 20:24:14.563: | | PASS (7 sec)
20200504 20:24:14.565: | Check NIC
[WARN] Check corresponding ECS EX-Series Firmware Matrix along with Firmware Update
Guide for minimum required or latest recommended versions of NIC and other server
firmware. This documentation can be found on Support site or ECS SolVe Desktop or
SolVe online
20200504 20:24:14.567: | | PASS
20200504 20:24:14.568: | Validate root FS free space
20200504 20:24:16.146: | | PASS (1 sec)
20200504 20:24:16.148: | Validate source nodes availability
20200504 20:24:16.150: | | PASS
20200504 20:24:16.151: | Validate that SSH banner is not added on the nodes
Skip - this check is for ECS version below 3.2.1
20200504 20:24:16.523: | | PASS
20200504 20:24:16.525: | Validate that STIG rules were applied
20200504 20:24:16.531: | | PASS
20200504 20:24:16.532: | Check for stuck disk subsystem processes
20200504 20:24:18.866: | | PASS (2 sec)
20200504 20:24:18.868: | Validate that all nodes are available - OS
20200504 20:24:18.870: | | PASS
20200504 20:24:18.871: | Validate that OS version is equal between nodes
20200504 20:24:24.960: | | PASS (6 sec)
20200504 20:24:24.962: | Validate time drift
20200504 20:24:26.464: | | PASS (1 sec)
20200504 20:24:26.467: | Validate swap space consumers
20200504 20:24:39.964: | | PASS (13 sec)
20200504 20:24:39.966: | Confirm that docker health is GOOD and docker exec works on
all nodes
20200504 20:24:53.369: | | PASS (13 sec)
20200504 20:24:53.371: | Validate agent version
20200504 20:24:54.949: | | PASS (1 sec)
20200504 20:24:54.952: | Validate application role operational mode
Skipped: unnecessary for ECS version 3.5.0.0.120936.e86d8252415
20200504 20:24:59.466: | | PASS (4 sec)
20200504 20:24:59.468: | Validate applications and services health
20200504 20:25:10.662: | | PASS (11 sec)
20200504 20:25:10.663: | Validate cluster compliance status
20200504 20:25:17.191: | | Pass (6 sec)
20200504 20:25:17.194: | Validate that there is cluster master
20200504 20:25:21.204: | | PASS (4 sec)
20200504 20:25:21.206: | Validate docker containers are running where they should be
20200504 20:25:27.335: | | PASS (6 sec)
20200504 20:25:27.337: | Validate event streams by sending agent health
20200504 20:25:45.137: | | PASS (17 sec)
20200504 20:25:45.139: | Validate API availability of fabric services
20200504 20:25:53.483: | | PASS (8 sec)
20200504 20:25:53.484: | Validate that ports for fabric services are open
20200504 20:26:00.794: | | PASS (7 sec)
20200504 20:26:00.795: | Validate services owner and that goalstates are equal on LM
and agents
20200504 20:26:28.153: | | PASS (27 sec)
20200504 20:26:28.155: | Validate that expected number of drives is formatted for
Object
20200504 20:26:55.343: | | PASS (27 sec)
20200504 20:26:55.345: | Validate that all lifecycles are active
20200504 20:26:57.130: | | PASS (1 sec)
20200504 20:26:57.132: | Validate that the correct number of disks are mounted
inside the object container
20200504 20:27:17.471: | | PASS (20 sec)
20200504 20:27:17.473: | Validate number of nodes with zookeeper
20200504 20:27:21.489: | | PASS (4 sec)
20200504 20:27:21.490: | Validate object configuration files between nodes
20200504 20:27:24.617: | | PASS (3 sec)
20200504 20:27:24.619: | Validate that all partitions are under control
20200504 20:28:34.955: | | PASS (1 min 10 sec)
20200504 20:28:34.957: | Validate that provisioned drives are GOOD
20200504 20:28:34.962: | | PASS
20200504 20:28:34.963: | Validate services owner and that realized goalstates are
equal on LM and agents
20200504 20:29:03.358: | | PASS (28 sec)
20200504 20:29:03.360: | Validate SSD disks consistency
20200504 20:29:09.933: | | PASS (6 sec)
20200504 20:29:09.935: | Validate that all nodes are available - Fabric
20200504 20:29:09.936: | | PASS
20200504 20:29:09.937: | Validate that diskset is the same for all disks and cache
files
20200504 20:29:14.500: | | PASS (4 sec)
20200504 20:29:14.502: | Validate zookeeper
20200504 20:29:14.503: | | PASS
20200504 20:29:14.504: | Verify BIOS version
20200504 20:29:16.242: | | PASS (1 sec)
20200504 20:29:16.244: | Check that BTree GC is enabled
Checking BTree GC parameters...
com.emc.ecs.chunk.gc.repo.enabled = true
com.emc.ecs.chunk.gc.repo.verification.enabled = true
com.emc.ecs.chunk.gc.btree.scanner.verification.enabled = true
com.emc.ecs.chunk.gc.btree.scanner.copy.enabled = true
com.emc.ecs.chunk.gc.btree.enabled = true
BTree GC is enabled
20200504 20:29:29.283: | | PASS (13 sec)
20200504 20:29:29.285: | Check Upgrade Completion flags
version across the cluster: 3.5.0.0.120936.e86d8252415
Checking flags on VDC vdc_mantis_a-acid...
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_3_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_5_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_4_0_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.2_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.2_2_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.timeFormat.rfc822_date_time_format' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_0_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_4_upgrade_complete' is 'true'
20200504 20:29:44.777: | | PASS (15 sec)
20200504 20:29:44.779: | Check counts of replication groups
20200504 20:29:47.938: | | PASS (3 sec)
20200504 20:29:47.940: | Check DT Load Balancing
20200504 20:29:47.942: | | Check Lb Enabled
Checking LB parameters...
com.emc.ecs.ownership.LoadBalanceEnabled = true
LB is enabled
20200504 20:29:50.758: | | | PASS (2 sec)
20200504 20:29:50.759: | | PASS (2 sec)
20200504 20:29:50.761: | Check DT status
Checking DT status (with timeout 10 min).
20200504 20:30:01.718: | | PASS (10 sec)
20200504 20:30:01.719: | Check that rejoin task keys not present in LS table
20200504 20:30:01.723: | | PASS
20200504 20:30:01.724: | Check that the system is not in TSO state
20200504 20:30:01.726: | | PASS
20200504 20:30:01.726: | Check Journal GC
20200504 20:30:05.051: | | PASS (3 sec)
20200504 20:30:05.053: | Check Object version across the cluster
20200504 20:30:13.566: | | PASS (8 sec)
20200504 20:30:13.569: | Check whether each node has reserve SSD
20200504 20:30:19.663: | | PASS (6 sec)
20200504 20:30:19.665: | Validate all OB and LS tables FPP
20200504 20:30:19.667: | | PASS
20200504 20:30:19.668: | Validate BE ECS UI availability
Private IP/Port 192.168.219.254:443 is disabled on installer rack 1
20200504 20:30:20.234: | | PASS
20200504 20:30:20.236: | Validate that all nodes are available - Object
20200504 20:30:20.238: | | PASS
20200504 20:30:20.239: | Validate that data recovery is enabled for all nodes
20200504 20:30:23.461: | | PASS (3 sec)
20200504 20:30:23.463: | Validate that nginx is listening on all nodes
20200504 20:30:27.826: | | PASS (4 sec)
20200504 20:30:27.834: | PASS (9 min 50 sec)
================================================================================
Status: PASS
Time Elapsed: 10 min 10 sec
Debug log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/log.html
================================================================================

2. Verify that the output shows that all checks are passed.
3. In the ECS UI, go to Storage Pools Management.
4. Verify that the old cluster nodes are no longer listed, and that the list displays a new storage
pool that is ready to use.

Figure 1. ECS UI Storage Pools Management
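
Because the Health Check output runs to several pages, a short script can help with step 2. The sketch below, which assumes the output has been saved as text, reports the overall Status line, counts the PASS results, and collects any [WARN] lines for review. The function name health_check_summary is hypothetical, and the patterns are based only on the line formats shown in the example output.

```python
import re

STATUS_RE = re.compile(r"^Status:\s*(\S+)", re.MULTILINE)
# A result line is a timestamp, one or more "| " nesting markers, then PASS.
# Case is ignored because the example output contains both "PASS" and "Pass".
PASS_RE = re.compile(
    r"^\d{8} \d{2}:\d{2}:\d{2}\.\d{3}: (?:\| )+PASS\b",
    re.MULTILINE | re.IGNORECASE,
)


def health_check_summary(output):
    """Summarize saved Health_Check output: overall status, PASS count, warnings."""
    status = STATUS_RE.search(output)
    warnings = [
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith("[WARN]")
    ]
    return {
        "status": status.group(1) if status else None,
        "pass_results": len(PASS_RE.findall(output)),
        "warnings": warnings,
    }


if __name__ == "__main__":
    sample = """20200504 20:20:37.498: | Check DNS settings
20200504 20:20:45.763: | | PASS (8 sec)
20200504 20:24:14.565: | Check NIC
[WARN] Check corresponding ECS EX-Series Firmware Matrix
20200504 20:24:14.567: | | PASS
Status: PASS
"""
    print(health_check_summary(sample))
```

A check can pass and still print a [WARN] line, as the NIC check does in the example, so the warnings are worth reading even when the status is PASS.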

Next steps

• Reconfigure the Secure Remote Services.
  o Delete the Secure Remote Services server from the ECS UI and add it back. This
    reconfiguration ensures that the old (evacuated) nodes are removed from Secure
    Remote Services and the new nodes are added.
  o If this step is not performed, Secure Remote Services is disconnected after
    evacuation.
• Apply an updated ECS license.
  o Any reference to product serial number tags (PSNTs) of old (evacuated) racks should be
    removed in the updated license.
  o The capacity of the old rack should be subtracted from the total capacity.
• Disable the rack interconnect. This step ensures that:
  o The old (evacuated) rack does not display in getclusterinfo.
  o All references to the old (evacuated) rack are removed from NAN.

Document feedback
To provide any feedback or suggestions on the document, go to Content Feedback Router portal. For
more information, see Content Feedback Router - Support.
