02 - PowerScale Hardware Maintenance-SSP - Participant Guide

The PowerScale Hardware Maintenance-SSP participant guide provides essential information on hardware maintenance procedures, including the distinction between Customer Replaceable Units (CRUs) and Field Replaceable Units (FRUs), as well as safety precautions and compatibility considerations for PowerScale nodes. It covers maintenance basics, hardware monitoring, and indicator lights for various components, ensuring users can effectively manage hardware issues. The guide also includes knowledge checks and assessment questions to reinforce learning and understanding of the material presented.

POWERSCALE HARDWARE MAINTENANCE-SSP

PARTICIPANT GUIDE

PowerScale Hardware Maintenance-SSP

© Copyright 2024 Dell Inc Page 2


Table of Contents

PowerScale Hardware Maintenance

Maintenance Considerations
    Hardware Maintenance Basics
    Electrostatic Discharge
    Minimize Tool Use
    PowerScale Nodes Compatibility
    Drives and Drive Sleds
    Knowledge Check
    Hardware Monitoring
    Indicator Lights
    Knowledge Check
    Chassis Replacement Procedure
    Recommended Tools
    Safety Precautions and Considerations
    Knowledge Check

Field Replaceable Units
    Field Replaceable Units (FRU) Overview
    Prepare the System for FRU Replacement
    FRU Steps for Internal Components
    Gen6, All-Flash, and Accelerator Nodes
    F200 and F600 Nodes
    F900 Nodes
    Knowledge Check

Customer Replaceable Units
    Customer Replaceable Units (CRU) Overview
    Gen6 and Newer
    F200, F600, and Accelerator Nodes
    F900
    Knowledge Check

Course Assessment (scored)
    Assessment Introduction
    Assessment Questions (12 questions)

Course Completion
    You Have Completed This Content

PowerScale Hardware Maintenance

Maintenance Considerations

Hardware Maintenance Basics

The graphic shows a few basic reminders that are common to all
hardware maintenance procedures. If you encounter any difficulties while
performing this task, immediately contact Dell Technical Support.

1: Customer Replaceable Units (CRUs) are designated hardware
components that customers can replace themselves. Many CRUs can be
replaced without shutting down the node, provided the correct
procedure is followed. Field Replaceable Units (FRUs) require the node
to be powered off and need the assistance of a customer engineer. If you
must power off a node, always shut it down properly as described in the
replacement guide.

2: Before disconnecting any cables, ensure that the Do Not Remove LED
on the compute module is off. When the LED is on (white), the journal
of the node is still active. The Do Not Remove LED is on the right side
of the compute module and looks like a hand with a line through it. Do
not disconnect any cables until this LED is off.

3: When performing part replacements on multiple nodes, always work on
one node at a time to help prevent cluster outages and risk of data loss.

4: SolVe Online is a knowledge management-led standard procedure for Dell
field personnel, service partners, and customers. Use the SolVe Online
tool to get the most recent, complete instructions for the procedure.
These instructions are often updated based on feedback from the field.
Consult the instruction documents before every engagement, even if you
have previously performed the service that is requested.

5: Save the packaging from the replacement part. Use this packaging to
return the failed part to Dell Technologies. A return label is included with
the replacement part.

6: If the customer or Dell technical support requests a Failure Analysis
(FA) on the replaced part, attach a filled-out FA label to the return box.
Complete an FA request ticket in the Worldwide Failure Analysis (WWFA)
system. The purpose of WWFA is to provide tracking and management of
product failure analysis requests. Provide the FA ticket number to your
support contact or add it to the SR in a comment.

To learn more, see the article "What are the required fields and data
items entered into internal FA reports that are written per CS request?"

Use the Worldwide Failure Analysis Database portal to create an FA report
and track the progress.

7: After the work is complete, all service personnel, including partners,
must update the Install Database. Go to the SolVe Online > Tools and
Forms section and select Install Base Ticket. Complete the form and
submit the case. If you encounter any difficulties, contact Dell
Technical Support.

Electrostatic Discharge

Electrostatic Discharge (ESD) is a major cause of damage to electronic
components and is potentially dangerous to the installer. To avoid ESD
damage, review ESD procedures before arriving at the customer site and
adhere to the precautions when onsite.

Clean work area

• Clear the work area of items that naturally build up electrostatic
charge.

Anti-static packaging

• Leave components in anti-static packaging until time to install.

ESD kit

• Always use an ESD kit when handling components.

No ESD kit available

• Before touching a component, put one hand firmly on a bare metal
surface.
• After removing components from the anti-static bag, do not move
around the room or touch furnishings, personnel, or surfaces.
• If moving or touching something is necessary, first put the component
back in the anti-static bag.

Avoid movement

• Minimize movement to avoid buildup of electrostatic charge.

Tip: Always follow ESD procedures when handling components.

Minimize Tool Use

A design goal for maintenance of the hardware is to make the hardware
as accessible as possible without tools. Demounting handles are
color-coded to show hot or cold serviceability. Terracotta handles
indicate that the component can be removed without first taking the
node down. Blue handles indicate that the node should be shut down for
the maintenance procedure.

PowerScale Nodes Compatibility

In PowerScale, node compatibility refers to differences in SSD count or
size and in RAM size between different node types in a cluster. With the
introduction of Gen6 nodes, compatibility can be extended to include
back-end network technology and the node types that can be included in a
cluster.

Node compatibility creates an equivalence association between older and
newer generation nodes from the same performance series. Compatible
nodes can combine into a single node pool. If no node compatibility is
created, nodes cannot be merged into the same node pool. This is
important for a few reasons [1]. Nodes in the same node pool may have
different size SSDs [2] and different size RAM.

[1] A customer can transition to new hardware gradually, without a
forklift upgrade, by adding fewer nodes at a time to an existing node
pool. This is more cost effective than adding the node minimum to start
a new node pool with all new hardware. When a customer has enough new
nodes, node compatibility can be disabled on an individual node pool.

[2] Enabling SSD compatibility allows customers to replace older, smaller
SSDs with new, larger SSDs to allow more L3 cache space. This lets
customers better utilize storage resources. Every node in the pool must be
the same model or of the same series or family. If the OneFS version is
earlier than OneFS 8.0, every node in the node pool must have the same
number of SSDs.

At the back of the chassis, the compute modules are labeled one to four
from right to left, as shown. Because compute modules are installed in
pairs, called node-pairs, the minimum cluster size is four nodes, and
more nodes must be added in node-pairs.

The graphic shows that the node pairs are either the left half or right half of the chassis.
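
The node-pair rule described above (a four-node minimum, with growth in pairs) can be expressed as a small check. This is only an illustrative sketch: the function and constant names are hypothetical and are not part of OneFS or any Dell tool.

```python
MIN_NODES = 4   # minimum cluster size stated in the text
PAIR_SIZE = 2   # compute modules are installed in node-pairs


def is_valid_node_count(nodes: int) -> bool:
    """Return True if the count meets the minimum and is a whole number of node-pairs."""
    return nodes >= MIN_NODES and nodes % PAIR_SIZE == 0


def nodes_to_next_valid(nodes: int) -> int:
    """Return how many nodes must be added to reach a valid count (0 if already valid)."""
    if nodes < MIN_NODES:
        return MIN_NODES - nodes
    return nodes % PAIR_SIZE
```

For example, a planned cluster of five nodes fails the check, and one more node is needed to complete the third node-pair.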

Drives and Drive Sleds

Drives

Internal to the 2.5" sled, there are individual fault lights for each drive. The
yellow LED associated with each drive is visible through holes in the top
cover. A supercapacitor can keep one light on for around 10 minutes while
the sled is out of the chassis. If more than one light is on (showing multiple
drive failures), the time the LED remains on is correspondingly reduced.
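
As a rough worked example of the reduced hold-up time, the sketch below assumes that the time scales inversely with the number of lit LEDs. That scaling is an assumption made for illustration only; the guide states just that one LED stays lit for around 10 minutes and that the time is "correspondingly reduced" when more are lit. The function name is hypothetical.

```python
SINGLE_LED_MINUTES = 10.0  # "around 10 minutes" for one lit fault LED, per the text


def estimated_led_minutes(lit_leds: int) -> float:
    """Estimate how long fault LEDs stay lit on a removed sled.

    ASSUMPTION: hold-up time is inversely proportional to the number of
    lit LEDs. The guide does not give an exact formula.
    """
    if lit_leds < 1:
        raise ValueError("at least one LED must be lit")
    return SINGLE_LED_MINUTES / lit_leds
```

Under this assumption, a sled showing two failed drives would keep its LEDs lit for roughly half as long as a sled showing one, so identify the failed drives promptly after removing the sled.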

In the 3.5" drive sleds, the yellow drive fault LEDs are on the paddle cards,
and they are visible through the cover of the drive sled identifying which
drive, if any, needs replacement. The graphic shows the 3.5” short drive
sled; the 3.5” long drive sled has four LED viewing locations.

Drive Sled

All twenty sleds can be individually serviced. On running nodes, remove
only one sled per node at a time. The typical procedure is:

• Identify the chassis with the fault.
• Find the drive sled with the fault light.
• Press the service request button [3] and wait until the LED stops
blinking and goes dark.
• Remove the sled, replace the drive, and reinsert the sled.

[3] The service request button tells the node that the sled will be
removed. The node prepares for the sled removal by moving key boot
information away from the drives on the sled. The node suspends the
drives in the sled from the cluster file system and then spins them
down. This maximizes survivability during any further failures and
prevents cluster file system issues caused by multiple drives becoming
temporarily unavailable.

The node automatically detects and configures the replacement drive.

The graphic shows the lights and their information for the drive sleds.

Tip: If the suspend button is pressed and drives are detected, the node
tries to rediscover the sled and rejoin the drives after 1 hour. If the
suspend button is pressed and drives are not detected or the sled is
still removed, the node automatically smartfails the drives after 15
minutes.
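
The two timers in the tip can be summarized as a small decision function. This is a sketch of the behavior as described in the tip, not OneFS code; the function name and the returned labels are hypothetical.

```python
REJOIN_AFTER_MIN = 60     # drives detected: rediscover and rejoin after 1 hour
SMARTFAIL_AFTER_MIN = 15  # drives not detected or sled removed: smartfail after 15 minutes


def sled_action(drives_detected: bool, minutes_since_press: float) -> str:
    """Return the node behavior the tip describes for a suspended sled."""
    if drives_detected:
        return "rejoin" if minutes_since_press >= REJOIN_AFTER_MIN else "wait"
    return "smartfail" if minutes_since_press >= SMARTFAIL_AFTER_MIN else "wait"
```

The practical point is the shorter timer: once the button is pressed and the sled is out, there are about 15 minutes to complete the swap before the drives are smartfailed.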

Knowledge Check

1. An IT administrator wants to upgrade 20 drive sleds on some
PowerScale nodes. The admin wants to mix and match the 3.5" drive
type with the 2.5" drive type. As a field engineer, what will you
recommend that the administrator consider for the drive sleds?
a. Attach the paddle card for the 3.5" drive and connect the 2.5"
drives directly to the sled.
b. You cannot add an inconsistent set of sled types.
c. Ensure that the yellow lights are lit for each drive.
d. Attach 10 of the 3.5" drive sleds and 10 of the 2.5" drive sleds to
the node.

Hardware Monitoring

isi_hwmon

OneFS uses isi_hwmon for hardware monitoring [4], except for drives. It
generates alerts when the system is degraded or when you must replace
the hardware.

[4] On the F200 and F600, isi_hwmon monitors iDRAC Services, IDSDM,
NVDIMM battery, NVDIMM persistence, chassis fans, DIMM health, chassis
intrusion, system thermals, system power supplies, and system sensors.

Improvements to isi_hwmon include:

• Managing new event creation and event ending
• Command line to dump monitoring blocks diagnostic data
• Monitoring block dependencies
• Logging and debugging characteristics
• Logs at /var/log/isi_hwmon.log

OneFS has integrated support for the PowerTools Agent (PTA) and the
iDRAC Service Module (iSM) to support hardware monitoring in the
F200/F600.

CELOG

The OneFS Cluster Event Log (CELOG) provides a single source for
the logging of events that occur in an Isilon cluster. Events are used to
communicate a picture of cluster health for various components. CELOG
also provides a single point from which notifications about the events
are generated, including sending alert email and SNMP traps. CELOG helps
avoid receiving alerts or triggering Dial-Home service requests while
tests or planned activities are being performed on the PowerScale
cluster.

Cluster events can be easily viewed from the WebUI [5] or the CLI, using
the isi event events view command.

CELOG uses /var/log/isi_celog_monitor.log log files for system monitoring
and event creation.
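
When reviewing the monitoring logs named in this section, a simple keyword filter can help narrow down relevant lines. The sketch below is an illustration only: the two log paths come from this guide, but the helper names are hypothetical and the log line format is not documented here, so lines are treated as plain text.

```python
from pathlib import Path
from typing import Iterable, List

# Log paths named in this guide.
HWMON_LOG = Path("/var/log/isi_hwmon.log")
CELOG_MONITOR_LOG = Path("/var/log/isi_celog_monitor.log")


def matching_lines(lines: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Return the lines that contain any keyword, compared case-insensitively."""
    wanted = [k.lower() for k in keywords]
    return [ln.rstrip("\n") for ln in lines if any(k in ln.lower() for k in wanted)]


def scan_log(path: Path, keywords: Iterable[str]) -> List[str]:
    """Scan one log file; return an empty list if the file is not present."""
    if not path.exists():
        return []
    with path.open(errors="replace") as handle:
        return matching_lines(handle, keywords)
```

A call such as scan_log(HWMON_LOG, ["fan", "psu"]) would return only the lines that mention those words; the keywords to search for are a matter of judgment for the issue at hand.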

Placing the CELOG in maintenance mode does not affect client activity or
performance. The maintenance or test activity may affect client activity or
performance, depending on the type of activity. Upon the expiration of the
maintenance window specified, the CELOG is automatically removed from
maintenance mode.

HealthCheck

The isi healthcheck command enables you to evaluate the status of
specific software and hardware components of the cluster and the cluster
environment. To run the isi healthcheck command, you must log in to the
cluster with an account that has been assigned the ISI_PRIV_SYS_SUPPORT
role-based access control (RBAC) privilege.

The following terms are associated with the HealthCheck service:

• Items - An item is an aspect of the cluster or its environment that can
be evaluated [6].

[5] Go to Cluster Management > Events and Alerts > Events

• Checklists - A checklist defines a list of one or more items to be
evaluated.
• Parameters - Parameters are elements of items to which you can
assign specific values.
• Evaluations - Items and checklists can be evaluated. When an item or
checklist is evaluated, the results are saved in the
/ifs/.ifsvar/modules/health-check/results/evaluations directory.
• Freshness - Each item has a default Freshness value. The Freshness
value defines whether a new value is retrieved for an item being
evaluated, or a cached value from a previous evaluation is retrieved.
• Schedules - A schedule determines the period or duration that is
specified for performing health evaluation of clusters.
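
The Freshness idea (reuse a cached result or retrieve a new one) can be illustrated with a toy cache. This is a conceptual sketch only, not how OneFS implements HealthCheck: treating Freshness as a maximum age in seconds is an assumption, and the class and parameter names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ItemCache:
    """Toy cache: reuse a result while it is younger than the item's freshness window."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, object]] = {}

    def evaluate(self, item: str, freshness_s: float,
                 fetch: Callable[[], object],
                 now: Optional[float] = None) -> object:
        # ASSUMPTION: freshness is modeled as a maximum age in seconds.
        now = time.monotonic() if now is None else now
        cached = self._store.get(item)
        if cached is not None and now - cached[0] < freshness_s:
            return cached[1]   # cached value is still fresh
        value = fetch()        # stale or missing: retrieve a new value
        self._store[item] = (now, value)
        return value
```

The optional now argument exists only so the behavior can be exercised without waiting for real time to pass.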

Go to: The Dell Isilon HealthCheck - Isilon Info Hub and the
OneFS isi healthcheck guide for more information.

[6] Depending on the nature of the item, when the item is evaluated, either
each node in the cluster is checked or the cluster as a whole is checked.

Indicator Lights

Disk Drive

Hard Drive - indicators on left side of carrier.

The hard drive carrier has an activity LED indicator and a status LED
indicator that provide information about the hard drive status. The
activity LED indicator shows whether the drive is in use. The status LED
indicator shows the power condition of the drive.

Status Indicator                    Condition

Flash green twice per second        Identifying drive or preparing for removal
Off                                 Ready for removal
Flash amber four times per second   Drive failed
Solid green                         Drive online
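
The drive status table above maps directly onto a lookup structure, which can be handy in a field checklist or a quick reference script. The sketch below copies the four rows from the table; the dictionary and function names are hypothetical.

```python
# Rows copied from the drive status indicator table.
DRIVE_STATUS = {
    "flash green twice per second": "Identifying drive or preparing for removal",
    "off": "Ready for removal",
    "flash amber four times per second": "Drive failed",
    "solid green": "Drive online",
}


def drive_condition(indicator: str) -> str:
    """Look up the condition for an observed indicator; unknown patterns are reported as such."""
    return DRIVE_STATUS.get(indicator.strip().lower(), "Unknown indicator")
```

Any pattern that is not in the table returns "Unknown indicator", which is a prompt to consult the replacement guide rather than guess.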

Power Supply Unit

PSU indicator light on the handle.

The table describes the AC PSU status indicators.

Indicator                      Condition

Green                          PSU operational
Blinking amber                 Problem with PSU
No light                       PSU not receiving power
Blinking green                 PSU firmware update; interrupting power
                               can cause the PSU to lose functionality.
Blinking green and turns off   • Hot plugging - blinks green five times at
                                 a rate of 4 Hz and turns off.
                               • PSU mismatch of efficiency, feature set,
                                 health status, or supported voltage.

Deep Dive: See the SolVe Online Guide and search for the
proper node. Select the relevant HDD, SSD, or PSU
replacement procedure.

Knowledge Check

2. The IT administrator visits the data center once a week to physically
check the nodes of the cluster. One of the drives that was functioning
last week has failed and should be removed. Arrange the indicator
lights in the order in which they should appear for the drive that has
failed and must be removed.
Flash green twice per second
Solid green
Flash amber 4 times per second
Light Off

Chassis Replacement Procedure

A PowerScale chassis replacement involves moving two or four compute
modules and their companion drives to the new chassis because each
chassis has either two or four nodes. This procedure allows you to avoid
SmartFailing one or more nodes while maintaining data integrity. This
procedure requires a work area large enough to place two PowerScale
nodes side by side. To avoid losing the data stored on the node, replacing
a chassis involves moving the node's internal components from the failed
chassis to a new chassis. The process requires following the steps exactly
and in order. Perform this procedure only when directed by Dell
Technologies PowerScale Technical Support.

Click each tab to learn how to generate the procedure through SolVe
Online.

Step 1

• Go to the SolVe Online Isilon - PowerScale Product page, click
Replacement Procedures, and select the node for which you want to
generate the procedure.
• Then select Replace Node Chassis and click Next.

SolVe pop-up selection window step 1

Step 2

Select Replace a Chassis and click Next.

SolVe pop-up selection window step 2

Step 3

Enter the Usage Information and click Next.

The procedure is still generated even if no information is entered.

SolVe pop-up selection window step 3

Step 4

Click GENERATE.

A PDF document of the procedure downloads shortly. It can be viewed in
the browser and saved.

SolVe pop-up selection window step 4

Recommended Tools

Recommended tools to perform node maintenance, removal, and
installation procedures:

• Bezel key (required if the system includes a bezel)
• Phillips #1 and #2 screwdrivers
• 1/4-inch flat blade screwdriver
• Torx T6, T8, and T30 screwdrivers
• 5 mm hex nut driver
• Plastic scribe
• Needle-nose pliers to disconnect cables and connectors in
hard-to-reach locations
• ESD mat with connected wrist grounding strap

Safety Precautions and Considerations

When working with PowerScale equipment, it is critical to adhere to the
following precautions.

Do not try to lift a system without help. Using a server lift is
the recommended best practice.

Opening or removing the system cover while the system is
powered on may expose you to a risk of electric shock.

Do not operate the system without the cover longer than five
minutes. Operating the system without the system cover can
result in component damage.

Keep hands clear of the rotating fan blades of the
high-performance fans. Ensure that the system is powered off
before servicing.

Use an anti-static mat and an anti-static wristband while
handling, replacing, or working on components inside the
system.
In an emergency, when an ESD kit is not available:
• Before touching, removing, or moving any FRU, touch a
bare metal surface of the enclosure, and at the same
time, pick up the FRU while it is still sealed in the
antistatic bag.

All system bays and system fans must be populated with a
part or a blank.

Warning: Many repairs may be performed only by a certified
technician. Perform troubleshooting and repairs as authorized
in the product documentation, or as directed by the online or
telephone service and support team. Damage due to
servicing that is not authorized by Dell is not covered under
warranty. Read and follow the safety instructions that are
shipped with your product.

Knowledge Check

3. A field engineer is assigned to replace a fan on a PowerScale F210
node on site. The engineer has replaced an F200 fan previously and
feels comfortable with this process but will check the procedure when
on site. The engineer learns that the site does not allow internet
access and therefore cannot confirm the procedure. Generate the fan
replacement procedure from SolVe Online and guide the engineer
through the replacement process.
a. Perform the following tasks in the given order:
1. Gather logs, power off the F210 node, disconnect electrical cables and
peripherals, and extend or remove the system from the cabinet.
2. Remove the system cover, air shroud, and cooling fan.
3. Install the cooling fan, air shroud, and system cover.
4. Slide the system into the cabinet and reconnect any cables.
5. Run a HealthCheck, gather logs, and update the install database.

b. Perform the following tasks in the given order:
1. Gather logs, power off the F210 node, disconnect electrical cables and
peripherals, and extend or remove the system from the cabinet.
2. Remove the system cover, air shroud, and cooling fan.
3. Remove the power supply unit for safety measures.
4. Install the cooling fan, air shroud, and system cover.
5. Gather logs and update the install database.

c. Perform the following tasks in the given order:
1. Gather logs, power off the F210 node, disconnect electrical cables and
peripherals, and extend or remove the system from the cabinet.
2. Remove the system cover and drive backplane cover.
3. Install the cooling fan, air shroud, and system cover.
4. Slide or re-install the system into the cabinet.
5. Run a HealthCheck and update the install database.

Field Replaceable Units

Field Replaceable Units (FRU) Overview

Dell PowerScale nodes hold several components (Field Replaceable
Units, or FRUs) that are easily serviced in the field by trained service
personnel. As a supplement to SolVe Online, the Dell Support Video
Library has several video guides with written instructions on how to
replace FRUs. Replacing FRUs requires opening a node. Only Dell
partners and field personnel can perform the replacements. Customers
should never replace a FRU themselves.

Watch the videos of replacement procedures for all PowerScale FRUs in
the next few slides.

Prepare the System for FRU Replacement

To prepare a system for FRU replacement, follow the given steps:

1. Identify the system service tag. [7]
2. Prepare the system for FRU replacement. [8]
3. Gather logs. [9]
4. Perform the component replacement or maintenance.
5. Update node firmware. [10]
6. Gather logs post-maintenance. [11]
7. Update the install database. [12]

[7] Identify your system by pulling out the information tag in front of the
system to view the Express Service Code and Service Tag. Alternatively,
the information may be on a sticker on the chassis of the system. The
mini–Enterprise Service Tag (EST) is found on the back of the system.
This information is used by Dell to route support calls to the proper
personnel.

[8] Using the management software, prepare the system for replacement of
a FRU component. If additional components, such as rack ears, are
included when the system board is received onsite, replace only the
system board unless instructed otherwise by Dell Technical Support.

[9] Collect cluster logs before all maintenance procedures. Cluster logs
provide snapshots of the cluster, which you can review to ensure that
maintenance is successful. The isi_gather_info command collects
configuration and log information from a cluster and automatically uploads
it to Dell for processing. If the environment has a firewall in place that
blocks the uploading of the log information, contact a PowerScale Support
Engineer for a temporary FTP link to upload the logs.

[10] It is recommended to update the firmware on the replacement nodes
with the latest node firmware package. Node firmware updates reboot only
one node at a time. If the customer cannot tolerate node reboots at the
time, schedule a time with the customer to update the whole cluster.

[11] After completing maintenance on a cluster, gather the cluster logs.

[12] After all work is complete, update the install database through the
Business Services portal.

Deep Dive: See the PowerScale: Isilon: RPS General
Procedure: Isilon/PowerScale-Upgrade ---Customer-Preparation-Guide
to learn more. Use the SolVe Online > Tools and Forms page and
select Install Base Ticket to update the install database.

FRU Steps for Internal Components

The tabs show the high-level steps for node part replacements. Always
reference the appropriate PowerScale replacement guide for detailed
procedures.

Pre-Replacement Steps

Three steps must be performed before proceeding with the replacement
procedure: SmartFail the node, gather logs, and shut down the node.

Use the following CLI activity to perform the pre-replacement steps.

Replacement Steps

There are three steps that must be performed while replacing a node
component: cabling, extend node, and replace component.

• Cabling: Ensure that the node is powered off, and then label the
cables. Disconnect the power and I/O cables from the node.
• Extend node: Extend the node using the two slam latches on the left
and right sides. Once you extend the node, you can remove the cover
to get access to the internal FRU components.
• Replace component: The first step is to remove the cover. See the
replacement guide and the demonstration videos for detailed
procedures.

Post-Replacement Steps

There are four steps that must be performed after replacing a node
component: slide node into rack, cabling, HealthCheck and gather logs,
and update database.

• Slide node into rack: Once you replace the component and install the
cover, push the node inward until it locks into place. If necessary,
install the PSUs.
• Cabling: Reconnect the I/O cables and then the power cables
according to the labeling.
• HealthCheck and gather logs: As a best practice to ensure that the
system is healthy before declaring the node or cluster ready for
production, run health checks. Collect cluster logs after all
maintenance. Cluster logs provide snapshots of the cluster that you
can review to ensure that maintenance is successful.
• Update database: After all work is complete, update the install
database using the Business Services portal.
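
The three phases described above form a fixed sequence, which can be captured as an ordered checklist. The sketch below is illustrative only: the step names are taken from this section, while the data structure and function names are hypothetical. It does not replace the SolVe Online replacement guide for the specific node.

```python
# High-level FRU steps in the order given in this section.
PHASES = {
    "pre-replacement": ["SmartFail node", "gather logs", "shut down node"],
    "replacement": ["cabling", "extend node", "replace component"],
    "post-replacement": ["slide node into rack", "cabling",
                         "HealthCheck and gather logs", "update database"],
}


def ordered_steps():
    """Flatten the phases into one ordered list of (phase, step) pairs."""
    return [(phase, step) for phase, steps in PHASES.items() for step in steps]


def next_step(completed: int):
    """Return the next (phase, step) after `completed` finished steps, or None when done."""
    if completed < 0:
        raise ValueError("completed must be non-negative")
    steps = ordered_steps()
    return steps[completed] if completed < len(steps) else None
```

Tracking progress as a single counter makes it hard to skip ahead, which mirrors the guidance to follow the procedure steps in order.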

Important: The FRU Steps simulator is not a full simulator
and does not offer all the configuration options.

Important: To begin the activity, select inside the command
prompt window and type the isi command.

The web version of this content contains an interactive activity.

Gen6, All-Flash, and Accelerator Nodes

Select each video to learn how to replace PowerScale/Isilon Gen6 FRU
components. A downloadable PDF containing all video transcripts is
available by selecting the "+Show Course Details" button and then the
Course Materials icon.

Movie:

The web version of this content contains a movie.

F200 and F600 Nodes

Select each video to learn how to replace PowerScale F200 and F600
FRU components. The demos have no narration, so there are no
downloadable scripts.

Movie:

The web version of this content contains a movie.

FRU videos for F200 & F600 nodes.

F900 Nodes

Select each video to watch the PowerScale F900 FRU replacement
procedure videos. The demos have no narration, so there are no
downloadable scripts.

Movie:

The web version of this content contains a movie.

FRU videos for the F900 node.

Knowledge Check

4. A field engineer is requested to replace a fan on a PowerScale F900
node onsite. The engineer previously performed a memory
replacement procedure on F900 PowerScale nodes using the
replacement guide generated from SolVe Online. The engineer used
replacement DIMMs provided by the customer from an alternate
source and installed the cooling shroud with the securing slots on the
chassis. Even after the replacement procedures, the customer is still
facing performance issues and erratic system behavior. Generate a
memory module replacement guide from SolVe Online and help the
engineer perform the right procedure to replace the memory module.
Which of the following can be a potential reason for performance
issues and erratic system behavior?
a. Use of DIMMs from an alternate source.
b. Installing the cooling shroud with the securing slots on the
chassis.
c. Did not install the cooling fan.
d. The LEDs on the NVDIMM-N battery were turned on before
installation.

Customer Replaceable Units

Customer Replaceable Units (CRU) Overview

CRUs are designed for easy customer removal and replacement.
Customers can perform the replacement of dedicated hardware
components independently.

Customers with systems that are configured for dial-home (SRS or email)
have a service request (SR) opened automatically. A customer without
dial-home capability can request a hardware replacement part by opening
an SR with Dell Technologies Customer Service.

The customer has the option to sign up for AutoCRU. Enrollment in the
AutoCRU program indicates that the customer can always replace a
customer replaceable part, so no communication is needed before a part
work order is created.

Replacement procedure videos for PowerScale CRUs are covered in this
topic.

Deep Dive: See the Dell Customer Replaceable Unit ("CRU") Program
website for more information. The resource file, Dell Warranty and
Maintenance, lists the hardware components that are CRUs for specific
hardware systems.

Gen6 and Newer

Click each video to learn how to replace PowerScale/Isilon Gen6 CRU
components. A downloadable PDF containing all video transcripts is
available by clicking the "+Show Course Details" button and selecting
the Course Materials icon.

Movie:

The web version of this content contains a movie.

F200, F600, and Accelerator Nodes

Select each video to learn how to replace PowerScale F200, F600, and
Accelerator Node CRU components. The demos have no narration, so no
downloadable scripts are available.

Movie:

The web version of this content contains a movie.

F900

Select each component to watch its PowerScale F900 CRU replacement
procedure video. The demos have no narration, so no downloadable
scripts are available.

Movie:

The web version of this content contains a movie.

Knowledge Check

5. A customer wants to replace the front-end network interface card
(NIC) of an accelerator node. The customer has generated a
replacement procedure through SolVe Online and is not able to
perform the replacement procedure effectively in slot 2. From the
SolVe Online portal, generate the front-end NIC replacement guide
and help the customer perform the right procedure to replace the
front-end NIC card. Which of the following could be a potential reason
for the issue that the customer is facing?
a. The front-end NIC replacement is a FRU procedure that must be
performed by a Dell field engineer.
b. The front-end NIC is in slot 1.
c. The customer did not perform a back-end NIC replacement before
replacing the front-end NIC card.
d. Logs must be gathered using the isi_gather_info command before
performing the replacement procedure.

Course Assessment (scored)

Assessment Introduction

The assessment is set up as follows:


• The assessment contains 12 questions.
• To answer a question, select your response, click Submit, and then click Next.
• The passing score is 80%.
• At the end of the assessment, a scorecard is displayed that shows
your score and a message that is based on your score. If you want to
improve your score, you must be on this page to click the Retake
button.
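
As a worked example of the passing threshold above, the following minimal
sketch computes how many correct answers an 80% passing score implies for
12 questions. It assumes that every question carries equal weight and is
scored as entirely right or wrong, which this guide does not state; the
function name min_correct is hypothetical and used only for illustration.

```python
def min_correct(total_questions: int, passing_percent: int) -> int:
    """Smallest number of correct answers that meets the passing percentage.

    Assumption (not stated in the guide): all questions are weighted equally.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_questions * passing_percent + 99) // 100


# 12 questions with an 80% passing score
print(min_correct(12, 80))
```

Under these assumptions, 80% of 12 is 9.6, so a participant would need at
least 10 correct answers to pass.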

Assessment Questions

Question

1. Which of the following statements are true? Select all that apply.
a. Always review the replacement guide on SolVe Online for full
instructions.
b. Follow ESD procedures when handling components.
c. When performing replacements on multiple nodes, you should
perform each step on every node before moving to the next step.
d. To power off a node, press and hold the power button.

Question

2. What do the blue touchpoints on components indicate?
a. Phillips screwdriver required
b. Hot serviceable component
c. Cold serviceable component
d. No tools required

Question

3. What is a crucial step that needs to be performed before replacing any
FRU component on a node?
a. The node must be powered off.
b. The node must be SmartFailed and then powered off.
c. All nodes of the associated nodepool should be powered off.
d. The cluster should be powered off.

Question

4. Which of the following are PowerScale Field Replaceable Units
(FRUs)? Select all that apply.
a. Fans
b. Batteries
c. DIMMs
d. Power Supply
e. Drives

Question

5. What best describes a node-pair?
a. A node-pair is a CRU if adding to an existing chassis.
b. Gen6 nodes can only be added per chassis, four nodes at a time.
c. After adding a node, it must be paired with another node using the
CLI.
d. A node can be paired with any other node in the same chassis.

Question

6. The PSU of a node in an H700 cluster fails and needs to be replaced.
What is an important consideration that the field engineer should know
before replacing the PSU?
a. The node need not be powered off while performing the
replacement procedure.
b. The node should be powered off before performing the
replacement.
c. The node should be SmartFailed before performing the
replacement.
d. The node and its peer-node should be shut down before
performing the replacement.

Question

7. About how many minutes can the drive fault LED in a Gen6 drive sled
stay illuminated when the sled is removed from the chassis?
a. 1
b. 5
c. 10
d. 15

Question

8. What should you do if a node refuses to power off?
a. Contact Dell Support.
b. Press and hold the power button for 5 seconds.
c. Disconnect the power cables.
d. Remove the power supplies.

Question

9. What does it mean when the status indicator on a disk drive is off?
a. Drive is ready for removal.
b. Node is powering off.
c. Drive rebuild is complete.
d. Impending drive failure

Question

10. What does it mean when the status indicator on a PSU is flashing
amber?
a. Problem with the PSU
b. PSU is updating firmware.
c. Hot spare feature activated
d. Peer PSU failure pending

Question

11. What is a key difference between a CRU and a FRU?
a. Typically, a CRU is hot swappable and a FRU is not.
b. Typically, a FRU is hot swappable and a CRU is not.
c. Only Dell partners and field personnel can perform a CRU
replacement.
d. Any individual familiar with the F900 node can perform a FRU
replacement.

Question

12. When replacing a node component, which statement is true?
a. Do not attempt to lift the system without help.
b. Remove the system cover while the system is powered on.
c. Ensure that the system is powered on before servicing.
d. System can be operated without a system cover for a duration of
10 minutes.

Course Completion

You Have Completed This Content

Click the Save Progress and Exit button in the course menu or
below to record this content as complete.
Go to the next learning or assessment, if applicable.
