108nl Replace Boot Drive
Node
Attention If this procedure is not followed accurately, data loss and severe disruption of cluster operations can
occur. Perform every step in this procedure; if the system does not respond as expected, contact Isilon Product
Support.
Procedure
◆ Write a sentinel file to the root partition, using the output of isi devices as the content of the file, by typing
the following command:
isi devices > /sentinel.txt
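The sentinel step can be wrapped in a small shell helper that stops if the file comes out empty. This is a minimal sketch, not part of OneFS: the function name save_sentinel is hypothetical, and a POSIX shell is assumed.

```shell
# Hypothetical helper (not part of OneFS): save stdin to a sentinel file
# and fail if the resulting file is empty.
save_sentinel() {
    sentinel_path="$1"
    cat > "$sentinel_path" || return 1
    # -s is true only when the file exists and has a size greater than zero.
    [ -s "$sentinel_path" ]
}

# Intended use on the node, as root:
#   isi devices | save_sentinel /sentinel.txt || echo "sentinel not written"
```

Checking for a non-empty file guards against a sentinel that exists but recorded nothing, which would make the later comparison meaningless.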
Field replace a boot drive
Procedure
1. Using a serial cable, connect to the node you are going to work on.
2. View boot drive information by typing the following command:
atacontrol list
The following information appears:
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 1:
Master: ad2 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
Slave: ad3 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
The boot drives are listed under ATA channel 1. In the previous example, both boot drives are healthy. If one
of the boot drives has failed, the display reads no device present for that drive.
Determine whether the failed boot drive is the ad2 or ad3 device, and then use the following table to determine
the location of the boot drive inside the node.
ATA position   Device   Board drive slot
Master         ad2      J3
Slave          ad3      J4
Make note of the board drive slot that contains the failed boot drive.
! Caution If both drives appear to have failed, do not continue. Contact Isilon Product Support immediately.
3. If both drives appear to be healthy, one of the drives may have partially failed. To identify a partially failed drive,
check the status of the individual partition mirrors by typing the following command:
gmirror status
From left to right, the output displays the name of each mirror, the status of the mirror relationship, and the
component IDs for each boot drive.
The following example shows the boot drive partition layout in a healthy node. The mirrors for each partition
show:
■ a value of COMPLETE in the Status column.
■ the component IDs for both boot drives in the Components column. The component IDs are a combination of
the OneFS drive ID and the partition number (the number following the letter p). Both boot drives are listed
for each mirror, with the exception of the var-crash mirror, which lists only the slave drive.
Note The partition numbers in the display may differ from the following example.
                        ad2p8
mirror/var1   COMPLETE  ad3p7
                        ad2p7
mirror/var0   COMPLETE  ad3p6
                        ad2p6
mirror/root1  COMPLETE  ad3p5
                        ad2p5
The following example shows the boot drive partition layout as it appears in the event of a failed boot drive. A
failed boot drive forces the mirrors for a partition to show:
■ a value of DEGRADED in the Status column.
■ only the component ID of the healthy boot drive in the Components column. The failed boot drive does not
appear.
Attention DEGRADED does not refer to a specific drive, but to the mirror relationship between the drives. If a
drive appears in the Components column next to the DEGRADED status, it is healthy and should not be removed.
In the previous example, ad3p4 is missing from the degraded partition mirror/root0, and ad3p6 is missing
from the degraded partition mirror/var0. The missing drive, ad3, is the failed drive.
Determine which drive has failed. Use the previous table to determine which board drive slot contains the failed
boot drive and make a note of the number (J3 or J4).
Attention If both drives have failed, do not continue. Contact Isilon Product Support.
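The rule for reading gmirror status (the failed drive is the one missing from the DEGRADED mirrors) can be sketched as a shell function. This is an illustration under stated assumptions, not an Isilon tool: the function name find_failed_boot_drive is hypothetical, the boot drives are assumed to be ad2 and ad3 as in the table above, and the output is assumed to use the three-column layout described in this procedure, with additional components on lines of their own.

```shell
# Hypothetical helper: read `gmirror status` output on stdin and print
# which boot drive is missing from the DEGRADED mirrors, with its slot.
# Prints one of: "ad2 J3", "ad3 J4", "none", "both".
find_failed_boot_drive() {
    awk '
        # Record that a drive (component minus its partition suffix)
        # is present in the given mirror, for example ad3p6 -> ad3.
        function note(mirror, component) {
            drive = component
            sub(/p[0-9]+$/, "", drive)
            seen[mirror, drive] = 1
        }
        # Three fields starting with mirror/: name, status, first component.
        NF == 3 && $1 ~ /^mirror\// { cur = $1; state[cur] = $2; note(cur, $3); next }
        # One field: an additional component of the current mirror.
        NF == 1 && cur != "" { note(cur, $1) }
        END {
            for (m in state) {
                if (state[m] != "DEGRADED") continue
                if (!((m, "ad2") in seen)) miss2 = 1
                if (!((m, "ad3") in seen)) miss3 = 1
            }
            if (miss2 && miss3) print "both"
            else if (miss2)     print "ad2 J3"
            else if (miss3)     print "ad3 J4"
            else                print "none"
        }
    '
}

# Intended use on the node:
#   gmirror status | find_failed_boot_drive
```

A result of both means different mirrors are missing different drives. In that case, stop and contact Isilon Product Support, as the Attention notice above requires.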
Procedure
1. Connect to an available node in the cluster with a serial cable or network drop.
2. From the node that you connected to, open a secure shell (SSH) connection to the node that is to be shut down by
typing the command:
ssh [node IP address]
To determine the IP address of a node, type the command isi status.
3. Shut down the node by typing the following command:
shutdown -p now
4. Verify that the node is shut down by typing the following command:
isi status
Confirm that the node has a D (Down) status. See node 3 in the following example.
ID |IP Address |DASR| In Out Total| Used / Size |Used / Size
---+---------------+----+-----+-----+-----+------------------+-
1|10.53.217.201 | OK | 48M| 0| 48M| 19G/ 6.2T(< 1%)|(No SSDs)
Procedure
[Figure callouts: 1. J3 connector; 2. J4 connector]
3. Insert the replacement boot drive into the empty boot drive slot and gently press down to secure the drive.
4. Replace the boot drive retainer to secure the drives.
Procedure
1. Place the top panel on the node so that the front edge of the top panel is about one inch behind the drive bays and
then slide the top panel forward into place.
! Caution The chassis intrusion switch can be damaged if the top panel is slid too far back on the node.
2. Tighten the captive top panel screw to secure the top panel to the node.
Procedure
1. Label the InfiniBand cables to ensure that they are reconnected properly later.
2. Disconnect the InfiniBand cables from the back of the node.
3. Connect directly to the node using a serial cable.
Procedure
◆ Power on the node by pressing the ON/OFF button on the back panel of the node. It is located just left of center,
toward the upper part of the back panel.
Procedure
1. Locate the sentinel file in the root partition by typing the following command:
cat /sentinel.txt
If the sentinel file appears, you replaced the correct boot drive. If the file is missing, do not continue. Contact
Isilon Product Support.
2. Remove the file by typing the following command:
rm /sentinel.txt
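The two sentinel steps above can be combined so that the file is removed only after it has been found and displayed. This is a minimal sketch; the function name check_and_remove_sentinel is hypothetical and not part of OneFS.

```shell
# Hypothetical helper: confirm the sentinel file survived the drive swap,
# print it, then remove it. Returns non-zero if the file is missing.
check_and_remove_sentinel() {
    sentinel_path="$1"
    if [ ! -f "$sentinel_path" ]; then
        echo "sentinel missing: do not continue, contact support" >&2
        return 1
    fi
    # Remove the file only if it could be read successfully.
    cat "$sentinel_path" && rm "$sentinel_path"
}

# Intended use on the node:
#   check_and_remove_sentinel /sentinel.txt
```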
Procedure
1. Verify that the boot drives are healthy by typing the following command:
gmirror status
The following information appears:
Name              Status    Components
mirror/root0      COMPLETE  ad3p3
                            ad2p4
mirror/var-crash  COMPLETE  ad3p9
Confirm that the values in the Status column all read COMPLETE.
2. Verify boot drive information by typing the following command:
atacontrol list
The following information appears:
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 1:
Master: ad2 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
Slave: ad3 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
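The two verification checks can be expressed as shell functions that read the command output on standard input. This is a sketch under stated assumptions: both function names are hypothetical, the boot drives are assumed to be ad2 (master) and ad3 (slave) as shown above, and the output formats are assumed to match the examples in this procedure.

```shell
# Hypothetical helper: succeed only if `gmirror status` output on stdin
# lists at least one mirror and every mirror reads COMPLETE.
all_mirrors_complete() {
    awk '
        $1 ~ /^mirror\// { total++; if ($2 != "COMPLETE") bad++ }
        END { if (total > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Hypothetical helper: succeed only if `atacontrol list` output on stdin
# shows ad2 as a master device and ad3 as a slave device.
both_boot_drives_present() {
    awk '
        $1 == "Master:" && $2 == "ad2" { master = 1 }
        $1 == "Slave:"  && $2 == "ad3" { slave = 1 }
        END { if (master && slave) exit 0; exit 1 }
    '
}

# Intended use on the node:
#   gmirror status  | all_mirrors_complete     || echo "mirror not COMPLETE"
#   atacontrol list | both_boot_drives_present || echo "boot drive missing"
```

A line reading no device present puts the word no in the device field, so neither function counts that drive as present.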
Procedure
1. Contact Isilon Product Support to notify them that you are returning a failed part.
2. Package the failed part in the packaging materials provided with the replacement part.
3. Attach the return label that was included with the replacement part.
4. When filling in the RMA number, use the support case number provided by Isilon Product Support.
5. Ship the failed part to the address specified on the return label.
Local: 1-206-777-7970
Toll Free: 1-866-276-0723
Email: [email protected]
Online: isilon.com/support
Japan Support: 03-5358-7180