controller_replace
Replacement process
1 Preparing the system for the replacement
Pre-replacement tasks for SAN configurations in an HA pair
Shutting down a node operating in 7-Mode or clustered Data ONTAP
Verifying that the new controller module has no content in NVMEM
Resetting storage encryption disk authentication keys to their MSID
(default security ID set by the manufacturer)
215-07841_G0 October 2016 Copyright © 2016 NetApp, Inc. All rights reserved. 1
Web: www.netapp.com
Replacing the controller module
You must review the prerequisites for the replacement procedure and select the correct one for your Data ONTAP operating
system.
• This procedure is for systems running Data ONTAP 8.2 and later only.
• This procedure includes steps for automatically or manually reassigning disks to the replacement node, depending on your
system's configuration.
You should perform the disk reassignment as directed in the procedure.
• You must replace the failed component with a replacement FRU component you received from your provider.
• You must be replacing a controller module with a controller module of the same model type; you cannot upgrade your
system by just replacing the controller module.
• You cannot change any disks or disk shelves as part of this procedure.
• In this procedure, the boot device is moved from the impaired node to the replacement node so that the replacement node
boots up in the same version of ONTAP as the old controller.
• Any PCIe cards moved from the old controller module to the new controller module or added from existing customer site
inventory must be supported by the replacement controller module.
NetApp Hardware Universe
• The term system refers to FAS, AFF, V-Series, and SA (FlexCache) systems within this platform family. The procedures
apply to all platforms, unless otherwise indicated, except that clustered Data ONTAP procedures do not apply to SA systems.
• It is important that you apply the commands in these steps on the correct systems:
◦ The replacement node is the new node that is replacing the impaired node.
• You must always capture the node's console output to a text file.
This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the
replacement process.
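If the console session is scripted rather than logged by a terminal emulator, the capture requirement above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name, the log path, and the timestamp format are hypothetical and are not part of any NetApp tool.

```python
from datetime import datetime, timezone


def append_console_lines(path, lines):
    """Append console lines to a text file, each prefixed with a UTC timestamp.

    path and lines are supplied by the caller; nothing here talks to the
    storage system itself.
    """
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
            # Strip the trailing newline so every record occupies one line.
            log.write(f"{stamp} {line.rstrip()}\n")
```

Opening the file in append mode keeps output from earlier steps intact, so the file remains a single record of the whole procedure.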
Choices
Steps
1. Preparing the system for the replacement
2. Replacing the controller module hardware
3. Restoring and verifying the system configuration after hardware replacement
4. Running diagnostics tests after replacing a controller module
5. Completing the recabling and final restoration of operations
6. Completing the replacement process
[Flowchart: Storage encryption? If yes, reset authentication key to MSID. Shutting down the impaired controller module (HA or stand-alone). Shut down power through SP. Go to next step.]
b. Copy and save the screen display to a safe location for later reuse.
Note: If the impaired node is taken over by its partner, you can boot it to Maintenance
mode and run the fcadmin config command in Maintenance mode.
To boot the impaired node to Maintenance mode, restart the impaired node, and then press Ctrl-C
to interrupt the boot process when you see the message Press Ctrl-C for the Boot Menu.
From the Boot Menu, enter the option for Maintenance mode.
c. Enter the Cluster-Mode command to save the port configuration information for the
impaired node:
fcadmin config
Steps
1. Display the key ID for each self-encrypting disk on the original system:
disk encrypt show
Example
The first disk in the example is associated with an MSID; the others are associated with a non-MSID.
2. Examine the output of the disk encrypt show command, and if any disks are associated with a non-MSID key, rekey
them to an MSID key by taking one of the following actions:
• Rekey the disks individually, once for each disk:
disk encrypt rekey 0x0 disk_name
3. Verify that all the self-encrypting disks are associated with an MSID:
disk encrypt show
Example
The following example shows the output of the disk encrypt show command when all self-encrypting disks are
associated with an MSID:
• If you have one controller module in the chassis that is either part of an HA pair or in a stand-alone configuration, you must
turn off the power supplies in the impaired node chassis.
Steps
1. Check the HA status of the impaired node from either node in the HA pair that is displaying the ONTAP prompt:
cf status
2. Take the appropriate action based on the takeover status of the node.
3. Wait at least two minutes after takeover of the impaired node to ensure that the takeover was completed successfully.
4. With the impaired node showing the Waiting for giveback message or halted, shut it down, depending on your
configuration:
5. If the nodes are in a dual-chassis HA pair, unplug the impaired node power cords from the power source.
Steps
2. Shut down the power supplies, and then unplug both power cords from the source.
The system is ready for maintenance.
Steps
The NVMEM LED is located on the controller module to the right of the network ports, marked with a battery symbol. If the
NVMEM LED is flashing, there is content in the NVMEM.
[Figure: location of the NVMEM LED on the controller module]
2. If the NVMEM LED is not flashing, there is no content in the NVMEM; you can skip the following steps and proceed to the
next task in this procedure.
3. If the NVMEM LED is flashing, there is data in the NVMEM and you must disconnect the battery to clear the memory:
b. Locate the battery and squeeze the clip on the face of the battery plug to release the plug from the socket.
1 CPU air duct
2 NVMEM battery
3 NVMEM battery plug
4 NVMEM battery locking tab
Steps
1. Opening the system
2. Moving the PCIe cards to the new controller module
3. Moving the boot device
4. Moving the NVMEM battery
5. Moving the DIMMs to the new controller module
Steps
2. Loosen the hook and loop strap binding the cables to the cable management arm, and then unplug the system cables and
SFPs (if needed) from the controller module, and keep track of where the cables were connected.
Leave the cables in the cable management arm so that when you reinstall the cable management arm, the cables are
organized.
3. Remove the cable management arms from the left and right sides of the controller module and set them aside.
4. Pull the cam handle downward and slide the controller module out of the system.
Steps
2. Swing the side panel open until it comes off the controller module.
1 Side panel
2 PCIe card
3. Carefully remove the PCIe card from the controller module and set it aside.
You must keep track of which slot the PCIe card was in.
4. Repeat the preceding steps for the remaining PCIe cards in the old controller module.
5. Open the new controller module side panel, if necessary, slide off the PCIe card filler plate, as needed, and carefully install
the PCIe card.
You must properly align the card in the slot and exert even pressure on the card when seating it in the socket. The card must
be fully and evenly seated in the slot.
Steps
1. Locate the boot device using the following illustration or the FRU map on the controller module:
1 Boot device holder; not removable
2 Boot device
2. Open the boot device cover, hold the boot device by its edges at the notches in the boot device housing, and then gently
lift it straight up and out of the housing.
Attention: Always lift the boot device straight up out of the housing. Lifting it out at an angle can bend or break the
connector pins in the boot device.
4. Align the boot device with the boot device socket or connector, and then firmly push the boot device straight down into the
socket or connector.
Important: Always install the boot device by aligning the front of the boot device squarely over the pins in the socket at
the front of the boot device housing. Installing the boot device at an angle or over the rear plastic pin first can bend or
damage the pins in the boot device connector.
Steps
2. Locate the battery, press the clip on the face of the battery plug to release the lock clip from the plug socket, and then unplug
the battery cable from the socket.
1 CPU air duct
2 NVMEM battery
3 NVMEM battery plug
3. Gently pull the tab on the battery housing away from the controller module side.
The tab is found near the controller module side, near the plug.
4. Place your forefinger at the far end of the battery housing, and then gently push it toward the CPU air duct.
1 NVMEM battery
2 Battery tabs
3 Notch on chassis with alignment arrow
5. Gently pull the battery housing toward the center of the controller module, and then lift the battery out of the controller
module.
6. Align the tabs on the battery holder with the notches in the controller module side, and then gently push the battery housing
so that the notches are under the lip of the controller module side.
7. While gently pushing the battery against the sheet metal on the chassis to hold it in the battery guide, place the forefinger of
your free hand against the battery housing behind the locking tab on the battery, and then gently push the battery housing
away from the CPU air duct.
If it is properly aligned, the battery snaps into place on the side of the controller module. If it does not, repeat these steps.
Steps
1. Verify that the NVMEM battery cable connector is not plugged into the socket.
1 NVMEM DIMMs 1 and 2
Note: See Replacing an NVMEM battery and NVMEM DIMMs in a 32xx system for information about
removing these two DIMMs.
2 System DIMMs 1 through 4
The number of DIMMs in your system will vary:
• In the 3210 and 3240 models, only DIMM sockets 1 and 2 are populated.
3. Note the location and orientation of the DIMM in the socket so that you can insert it in the new controller module in the
proper orientation.
4. Slowly press down on the two DIMM ejector tabs, one at a time, to eject the DIMM from its slot, and then lift it out of the
slot.
Caution: The DIMMs are located very close to the CPU heat sink, which might still be hot. Avoid touching the CPU heat
sink when removing the DIMM.
Attention: Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
5. Locate the corresponding slot for the DIMM in the new controller module, align the DIMM over the slot, and then insert the
DIMM into the slot.
The notch among the pins on the DIMM should align with the tab in the socket. The DIMM fits tightly in the slot but should
go in easily. If not, you should realign the DIMM with the slot and reinsert it.
Important: You must install the NVMEM DIMMs only in the NVMEM DIMM slots.
6. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
The edge connector on the DIMM must make complete contact with the slot.
7. Push carefully, but firmly, on the top edge of the DIMM until the latches snap into place over the notches at the ends of the
DIMM.
9. In the new controller module, orient the NVMEM battery cable connector to the socket on the controller module and plug
the cable into the socket.
You must ensure that the plug locks down onto the socket on the controller module.
Steps
1. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway
into the system.
Note: Do not completely insert the controller module in the chassis until instructed to do so.
2. Recable the management port so that you can access the system to perform the tasks in the following sections.
b. With the cam handle in the open position, firmly push the controller module in until it
meets the midplane and is fully seated, and then close the cam handle to the locked
position.
Attention: Do not use excessive force when sliding the controller module into the
chassis; you might damage the connectors.
• If you are running Data ONTAP 8.2.1 and earlier, enter boot_ontap, and press
Ctrl-C when prompted to go to the boot menu, and then select Maintenance mode
from the menu.
• If you are running Data ONTAP 8.2.2 and later, enter boot_ontap maint at the
LOADER prompt.
d. If you have not already done so, reinstall the cable management device, and then tighten the
thumbscrew on the cam handle on the back of the controller module.
e. Bind the cables to the cable management device with the hook and loop strap.
b. Reconnect the power cables to the power supplies and to the power sources, turn on the
power to start the boot process, and then press Ctrl-C to interrupt the boot process when
you see the message Press Ctrl-C for Boot Menu.
Note: If you miss the prompt and the controller module boots to Data ONTAP, enter
halt and at the LOADER prompt enter boot_ontap, and press Ctrl-C when
prompted, and then repeat this step.
c. From the boot menu, select the option for Maintenance mode.
d. If you have not already done so, reinstall the cable management device, and then tighten the
thumbscrew on the cam handle on the back of the controller module.
e. Bind the cables to the cable management device with the hook and loop strap.
Important: During the boot process, you might see the following prompts:
• A prompt warning of a system ID mismatch and asking to override the system ID.
• A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy node
remains down.
[Flowchart: Verify that the HA state (ha-config show) matches your configuration. Fibre Channel? If yes, restore the FC configuration. Go to next step.]
Steps
1. In Maintenance mode, display the HA state of the new controller module and chassis:
ha-config show
If your system is...    The HA state for all components should be...
In an HA pair           ha
Stand-alone             non-ha
2. If the displayed system state of the controller does not match your system configuration, set the HA state for the controller
module:
ha-config modify controller [ha | non-ha]
3. If the displayed system state of the chassis does not match your system configuration, set the HA state for the chassis:
ha-config modify chassis [ha | non-ha]
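The comparison in steps 1 through 3 can be sketched in Python. This is a minimal illustration, not NetApp tooling: the function name and the `shown_states` dictionary are hypothetical, and the states are assumed to have been read by hand from the ha-config show output.

```python
# Expected HA state per configuration, taken from the table in step 1.
EXPECTED_STATE = {"In an HA pair": "ha", "Stand-alone": "non-ha"}


def ha_config_fixes(system_type, shown_states):
    """Return the ha-config modify commands needed to correct the HA state.

    shown_states maps "controller" and "chassis" to the state displayed by
    ha-config show. A missing entry is treated as a mismatch.
    """
    expected = EXPECTED_STATE[system_type]
    return [
        f"ha-config modify {component} {expected}"
        for component in ("controller", "chassis")
        if shown_states.get(component) != expected
    ]


# Hypothetical reading: the chassis still shows non-ha in an HA pair.
for command in ha_config_fixes("In an HA pair", {"controller": "ha", "chassis": "non-ha"}):
    print(command)
```

An empty list means that both components already match the configuration and no change is needed.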
Steps
1. From the healthy node, verify the values of the FC configuration on the replacement node: partner fcadmin config
2. Compare the default FC variable settings with the list you saved earlier.
d. Enter one of the following commands, depending on what you need to do:
• To unconfigure ports:
fcadmin config -t unconfigure adapter_name
Verifying the system time after replacing the controller module in an HA pair
If your system is in an HA pair, you must set the time on the replacement node to that of the healthy node to prevent possible
outages on clients due to time differences.
• The replacement node is the new node that replaced the impaired node as part of this procedure.
When setting the date and time at the LOADER prompt, verify that all times are set to GMT.
Steps
1. If you have not already done so, halt the replacement node to display the LOADER prompt.
3. At the LOADER prompt, check the date and time on the replacement node:
show date
6. At the LOADER prompt, confirm the date and time on the replacement node:
show date
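Because the LOADER times must be in GMT, converting a local reading before comparing the two nodes can help. The following Python sketch shows the conversion and the difference between two readings. The function names are hypothetical, and this document does not state an acceptable tolerance, so none is assumed here.

```python
from datetime import datetime, timedelta, timezone


def to_gmt(local_time):
    """Convert a timezone-aware datetime to GMT (UTC).

    A naive datetime would be interpreted in the local zone of the machine
    running this code, so pass an aware value.
    """
    return local_time.astimezone(timezone.utc)


def skew_seconds(healthy_gmt, replacement_gmt):
    """Return the absolute difference between two readings, in seconds."""
    return abs((healthy_gmt - replacement_gmt).total_seconds())


# Hypothetical readings: the healthy node is read in a UTC-5 zone.
healthy = to_gmt(datetime(2016, 10, 1, 9, 0, 0, tzinfo=timezone(timedelta(hours=-5))))
replacement = datetime(2016, 10, 1, 14, 1, 30, tzinfo=timezone.utc)
print(healthy.isoformat(), skew_seconds(healthy, replacement))
```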
• For ONTAP 8.2 and later, you do not require loopback plugs to run tests on storage interfaces.
Steps
1. If the node to be serviced is not at the LOADER prompt, bring it to the LOADER prompt.
Important: During the boot_diags process, you might see a prompt warning that when entering Maintenance mode in
an HA configuration, you must confirm that the partner remains down.
To continue to Maintenance mode, you should enter y
4. Display and note the available devices on the controller module: sldiag device show -dev mb
The controller module devices and ports that are displayed can be any one or more of the following:
• fcal is a Fibre Channel-Arbitrated Loop device that is not connected to a storage device or Fibre Channel network.
• sas is a Serial Attached SCSI device that is not connected to a disk shelf.
5. How you proceed depends on how you want to run diagnostics on your system.
Choices
• Running diagnostics tests concurrently after replacing the controller module
• Running diagnostics tests individually after replacing the controller module
Related information
FAS System Level Diagnostics Guide
Steps
1. Display and note the available devices on the controller module: sldiag device show -dev mb
The controller module devices and ports that are displayed can be any one or more of the following:
• fcal is a Fibre Channel-Arbitrated Loop device that is not connected to a storage device or Fibre Channel network.
• sas is a Serial Attached SCSI device that is not connected to a disk shelf.
2. Review the enabled and disabled devices in the output from step 1 and then determine which tests you want to
run concurrently.
4. Examine the output and, if applicable, enable the tests that you want to run for the device:
sldiag device modify -dev dev_name -index test_index_number -selection enable
test_index_number can be an individual number, a series of numbers separated by commas, or a range of numbers.
5. Examine the output and, if applicable, disable the tests that you do not want to run for the device by selecting only the tests
that you want to run:
sldiag device modify -dev dev_name -index test_index_number -selection disable
*> <SLDIAG:_ALL_TESTS_COMPLETED>
9. After the tests are complete, verify that there are no hardware problems on your storage system:
sldiag device status -long -state failed
10. Correct any issues that are found, and repeat this procedure.
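The sldiag device modify commands in the steps above follow one pattern, which can be sketched in Python for anyone who prepares the commands ahead of time. This is an illustration only: the function name is hypothetical, and the hyphen used for a range of test numbers is an assumption, because this document does not show the range syntax.

```python
import re

# Assumption: a range is written with a hyphen, for example 5-7.
INDEX_PATTERN = re.compile(r"\d+(-\d+)?(,\d+(-\d+)?)*")

# Selection values that appear in these procedures.
SELECTIONS = ("enable", "disable", "only")


def sldiag_modify(dev_name, test_index_number, selection):
    """Build one sldiag device modify command line after basic validation."""
    if selection not in SELECTIONS:
        raise ValueError(f"unsupported selection: {selection}")
    if not INDEX_PATTERN.fullmatch(test_index_number):
        raise ValueError(f"malformed test index: {test_index_number}")
    return (
        f"sldiag device modify -dev {dev_name} "
        f"-index {test_index_number} -selection {selection}"
    )


# mb is the device name used with sldiag device show in this procedure.
print(sldiag_modify("mb", "1,3,5-7", "enable"))
```

Validating the index string before building the command catches typing errors such as a trailing comma before the command reaches the console.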
Steps
3. Examine the output and, if applicable, enable the tests that you want to run for the device:
sldiag device modify -dev dev_name -index test_index_number -selection enable
test_index_number can be an individual number, a series of numbers separated by commas, or a range of numbers.
4. Examine the output and, if applicable, disable the tests that you do not want to run for the device by selecting only the tests
that you want to run:
sldiag device modify -dev dev_name -index test_index_number -selection only
<SLDIAG:_ALL_TESTS_COMPLETED>
boot media sldiag device status -dev bootmedia -long -state failed
b. Turn off or leave on the power supplies, depending on how many controller modules are in
the chassis:
• If you have two controller modules in the chassis, leave the power supplies turned on
to provide power to the other controller module.
• If you have one controller module in the chassis, turn off the power supplies and
unplug them from the power sources.
c. Check the controller module you are servicing and verify that you have observed all the
considerations identified for running system-level diagnostics, that cables are securely
connected, and that hardware components are properly installed in the storage system.
d. Boot the controller module you are servicing, interrupting the boot by pressing Ctrl-C
when prompted.
This takes you to the Boot menu:
• If you have two controller modules in the chassis, fully seat the controller module you
are servicing in the chassis.
The controller module boots up when fully seated.
• If you have one controller module in the chassis, connect the power supplies and turn
them on.
g. Enter boot_diags at the prompt and rerun the system-level diagnostic test.
8. Continue to the next device that you want to test, or exit system-level diagnostics and continue with the procedure.
Steps
b. Enter the information for the target system, and then click Collect Data.
d. Check other cabling by clicking the appropriate tab, and examining the output from Config Advisor.
Reassigning disks
If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when
the giveback occurs at the end of the procedure. In a stand-alone system, you must manually reassign the ID to the disks.
Steps
1. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode:
halt
After you issue the command, you must wait until the system stops at the LOADER prompt.
2. From the LOADER prompt on the replacement node, display the Boot menu:
3. Wait until the Waiting for giveback... message is displayed on the console of the replacement node and then, on the
healthy node, verify that the controller module replacement has been detected and that the new partner system ID has been
automatically assigned:
cf status
You should see a message similar to the following, which indicates that the system ID change has been detected:
HA mode.
System ID changed on partner (Old: 1873774576, New: 1873774574).
partner_node has taken over target_node.
target_node is ready for giveback.
The message shows the new system ID of the replacement node. In this example, the new system ID is 1873774574.
4. From the healthy node, verify that all coredumps are saved: partner savecore
If the command output indicates that savecore is in progress, you must wait for savecore to finish before initiating the
giveback operation. You can monitor the progress of the savecore: partner savecore -s
5. Initiate the giveback operation after the replacement node displays the Waiting for Giveback... message:
cf giveback
You should see a message similar to the following noting the system ID change and prompting you to continue:
System ID changed on partner. Giveback will update the ownership of partner disks with
system ID: 1873774574.
Do you wish to continue {y|n}?
You must enter y to proceed. If the giveback is vetoed, you can consider overriding the veto.
Find the High-Availability Configuration Guide for your version of Data ONTAP 8
Find the Active/Active Configuration Guide for your version of Data ONTAP 7G
6. Verify that the disks were assigned correctly:
disk show
You must verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the
following example, the disks owned by node2 now show the new system ID, 1873774574:
Example
7. Verify that the expected volumes are present and are online for each node:
vol status
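When the console output is captured to a text file, the system ID change reported by cf status in step 3 can be extracted with a short Python sketch. The regular expression is based only on the sample message shown in that step; the exact wording on other releases is an assumption, and the function name is hypothetical.

```python
import re

# Pattern taken from the sample message in step 3.
ID_CHANGE = re.compile(r"System ID changed on partner \(Old: (\d+), New: (\d+)\)")


def parse_system_id_change(cf_status_output):
    """Return (old_id, new_id) as strings, or None if no change is reported."""
    match = ID_CHANGE.search(cf_status_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = (
    "HA mode.\n"
    "System ID changed on partner (Old: 1873774576, New: 1873774574).\n"
    "partner_node has taken over target_node.\n"
    "target_node is ready for giveback.\n"
)
print(parse_system_id_change(sample))
```

The new ID returned here is the value that the disks of the replacement node should show in the disk show output after the giveback.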
Steps
1. If you have not already done so, reboot the replacement node, interrupt the boot process by entering Ctrl-C, and then select
the option to boot to Maintenance mode from the displayed menu.
You must enter Y when prompted to override the system ID due to a system ID mismatch.
Note: Make a note of the old system ID, which is displayed as part of the disk owner column.
Example
The following example shows the old system ID of 118073209:
3. Reassign disk ownership by using the system ID information obtained from the disk show command:
disk reassign -s old system ID
In the case of the preceding example, the command is: disk reassign -s 118073209
You can respond Y when prompted to continue.
You must verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the
following example, the disks owned by system-1 now show the new system ID, 118065481:
Example
5. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode:
halt
After you issue the command, you must wait until the system stops at the LOADER prompt.
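The stand-alone reassignment can be sketched in Python as two small helpers: one builds the disk reassign command from the old system ID, and one lists the disks that do not yet show the new ID. This is an illustration only; the function names, the disk names, and the `owner_ids` dictionary (transcribed by hand from the disk show output) are hypothetical.

```python
def disk_reassign_command(old_system_id):
    """Build the disk reassign command for the old system ID."""
    old_system_id = str(old_system_id)
    if not old_system_id.isdigit():
        raise ValueError("the system ID must be numeric")
    return f"disk reassign -s {old_system_id}"


def disks_with_wrong_owner(owner_ids, new_system_id):
    """Return the disks whose owner ID is not the new system ID.

    owner_ids maps a disk name to the system ID shown for it by disk show
    and should contain only the disks that belong to the replacement node.
    """
    return sorted(
        disk for disk, system_id in owner_ids.items()
        if str(system_id) != str(new_system_id)
    )


# System IDs from the examples in this procedure; disk names are hypothetical.
print(disk_reassign_command(118073209))
print(disks_with_wrong_owner({"0a.16": "118065481", "0a.17": "118073209"}, 118065481))
```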
Steps
4. On each node, verify that all disks are rekeyed: disk encrypt show
None of the disks should list a key ID of 0x0.
6. On each node, verify that all keys are stored on their key management servers: key_manager query
None of the key IDs should have an asterisk next to it.
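The two verification checks above can be sketched in Python. This is a minimal illustration under stated assumptions: the inputs are hypothetical hand-transcribed copies of the disk encrypt show and key_manager query output, because this document does not show the layout of either.

```python
def encryption_problems(disk_key_ids, key_manager_key_ids):
    """List the findings that should block completion of the procedure.

    disk_key_ids maps a disk name to its key ID from disk encrypt show.
    key_manager_key_ids lists the key IDs exactly as shown by
    key_manager query, including any asterisk marker.
    """
    problems = []
    for disk, key_id in sorted(disk_key_ids.items()):
        # After rekeying, no disk should still list a key ID of 0x0.
        if key_id.strip().lower() == "0x0":
            problems.append(f"{disk}: still lists key ID 0x0")
    for key_id in key_manager_key_ids:
        # An asterisk next to a key ID is a finding, per the step above.
        if "*" in key_id:
            problems.append(f"{key_id.strip()}: marked with an asterisk")
    return problems


# Hypothetical disk names and key IDs, for illustration only.
print(encryption_problems({"0c.00.0": "0xABC123", "0c.00.1": "0x0"}, ["0xABC123", "0xDEF456 *"]))
```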
1. If you require new license keys in the Data ONTAP 8.2 format, obtain replacement license keys on the NetApp Support Site
in the My Support section under Software licenses.
Note: The new license keys that you require are auto-generated and sent to the email address on file. If you fail to receive
the email with the license keys within 30 days, you should contact technical support.
2. You must wait until the ONTAP command-line interface has been up for at least five minutes and then confirm that the
license database is running.
You can add one license or multiple licenses simultaneously, with each license key separated by a comma or a space.
If the ONTAP command-line interface was not up for a sufficient amount of time, you might receive a message indicating
that the license database is unavailable.
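Because license keys can be separated by a comma or a space, a pasted list can be normalized before use. The Python sketch below splits such a list into individual keys; the function name and the sample keys are hypothetical placeholders, not real license keys.

```python
import re


def split_license_keys(text):
    """Split a pasted string of license keys on commas and whitespace."""
    return [key for key in re.split(r"[,\s]+", text.strip()) if key]


# Placeholder keys, for illustration only.
keys = split_license_keys("AAAAAAAA, BBBBBBBB CCCCCCCC")
print(",".join(keys))
```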
Related information
NetApp Support
Related information
https://library.netapp.com/ecm/ecm_download_file/ECMP12475945
Steps
1. Preparing the system for controller replacement
2. Replacing the controller module hardware
3. Restoring and verifying the system configuration after hardware replacement
4. Running diagnostics tests after replacing a controller module
5. Completing the recabling and final restoration of operations
6. Completing the replacement process
[Flowchart: Storage encryption? If yes, reset authentication key to MSID. Shutting down the impaired controller module (HA or stand-alone). Shut down power through SP. Go to next step.]
Steps
1. Preparing for SAN configurations
2. Checking quorum on the SCSI blade
3. Preparing for Storage Encryption configurations
4. Shutting down the target controller
5. Verifying the new controller module has no content in NVMEM
Steps
b. Run the following Cluster-Mode command on the console of the impaired node:
system node hardware unified-connect modify
2. Copy and save the information displayed on the screen to a safe location for later reuse.
Steps
1. Verify that the internal SCSI blade is operational and in quorum on the impaired node:
event log show -node impaired-node-name -messagename scsiblade.*
You should see messages similar to the following, indicating that the SCSI-blade process is in quorum with the other nodes
in the cluster:
2. If you do not see the quorum messages, check the health of the SAN processes and resolve any issues before proceeding
with the replacement.
Steps
2. Display the key ID for each self-encrypting disk on the original system:
disk encrypt show
Example
The first disk in the example is associated with an MSID key; the other disks are associated with a non-MSID key.
3. Examine the output of the disk encrypt show command, and if any disks are associated with a non-MSID key, rekey the
disks to an MSID key by taking one of the following actions:
4. Verify that all of the self-encrypting disks are associated with an MSID key:
disk encrypt show
Example
The following example shows the output of the disk encrypt show command when all self-encrypting disks are
associated with an MSID key:
6. Repeat step 1 through step 5 for each individual node or HA pair.
Choices
• Shutting down a node running ONTAP
Steps
1. If the system is running clustered Data ONTAP, check the status of the nodes in the cluster:
Note: In a cluster with a single HA pair, Epsilon will not be assigned to either node.
c. Take one of the following actions, depending on the result of the command:
If all nodes show true for both health and eligibility and Epsilon is not assigned to the impaired node, then:
a. Exit advanced mode:
set -privilege admin
b. Proceed to Step 3.
d. Go to Step 3.
If the impaired node shows false for health and is the Epsilon node, then complete the following steps:
a. Change to the advanced privilege level:
set -privilege advanced
2. If the impaired node is part of an HA pair, disable the auto-giveback option from the console of the healthy node:
storage failover modify -node local -auto-giveback false
• If the impaired node is showing the ONTAP prompt, take over the impaired node from the
healthy node and be prepared to interrupt the reboot:
storage failover takeover -ofnode impaired_node_name
When prompted to interrupt the reboot, you must press Ctrl-C to go to the LOADER
prompt.
Note: In a two-node cluster, if Epsilon is assigned to the impaired node, you must
move Epsilon to the healthy node before halting the impaired node.
• If the display of the impaired node is showing the Waiting for giveback message,
press Ctrl-C and respond Y to take the node to the LOADER prompt.
• If the impaired node does not show either the Waiting for giveback message or an
ONTAP prompt, power-cycle the node.
You should contact technical support if the node does not respond to the power cycle.
a. From the LOADER prompt, go to the Service Processor (SP) by entering ^G.
The method that you use to shut down the node depends on whether remote management through a Service Processor (SP) is
used, and whether the system is in a dual-chassis configuration or single-chassis configuration.
6. If the system is in a dual-chassis HA pair or stand-alone configuration, turn off the power supplies, and then unplug the
power cords of the impaired node from the power source.
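The choice of how to bring the impaired node to the LOADER prompt depends on what its console shows. The Python sketch below restates that choice as a lookup; the state labels and the function name are hypothetical, and the returned text only repeats the actions described above.

```python
def steps_to_loader(console_state, impaired_node_name):
    """Return the actions for bringing the impaired node to the LOADER prompt.

    console_state is one of the hypothetical labels "ontap_prompt" or
    "waiting_for_giveback"; any other value is treated as unresponsive.
    """
    if console_state == "ontap_prompt":
        return [
            # Run the takeover from the healthy node.
            f"storage failover takeover -ofnode {impaired_node_name}",
            "Press Ctrl-C when prompted to interrupt the reboot",
        ]
    if console_state == "waiting_for_giveback":
        return ["Press Ctrl-C and respond Y"]
    return [
        "Power-cycle the node",
        "Contact technical support if the node does not respond",
    ]


for action in steps_to_loader("ontap_prompt", "impaired_node_name"):
    print(action)
```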
Steps
The NVMEM LED is located on the controller module to the right of the network ports, marked with a battery symbol. If the
NVMEM LED is flashing, there is content in the NVMEM.
[Figure: location of the NVMEM LED on the controller module]
1 NVMEM LED
2. If the NVMEM LED is not flashing, there is no content in the NVMEM; you can skip the following steps and proceed to the
next task in this procedure.
3. If the NVMEM LED is flashing, there is data in the NVMEM and you must disconnect the battery to clear the memory:
b. Locate the battery and squeeze the clip on the face of the battery plug to release the plug from the socket.
1 CPU air duct
2 NVMEM battery
3 NVMEM battery plug
4 NVMEM battery locking tab
Steps
1. Opening the system
2. Moving the PCIe cards to the new controller module
3. Moving the boot device
4. Moving the NVMEM battery
5. Moving the DIMMs to the new controller module
Steps
2. Loosen the hook and loop strap binding the cables to the cable management arm, and then unplug the system cables and
SFPs (if needed) from the controller module, and keep track of where the cables were connected.
Leave the cables in the cable management arm so that when you reinstall the cable management arm, the cables are
organized.
3. Remove the cable management arms from the left and right sides of the controller module and set them aside.
4. Pull the cam handle downward and slide the controller module out of the system.
Steps
2. Swing the side panel open until it comes off the controller module.
1 Side panel
2 PCIe card
3. Carefully remove the PCIe card from the controller module and set it aside.
You must keep track of which slot the PCIe card was in.
4. Repeat the preceding steps for the remaining PCIe cards in the old controller module.
5. Open the new controller module side panel, if necessary, slide off the PCIe card filler plate, as needed, and carefully install
the PCIe card.
You must properly align the card in the slot and exert even pressure on the card when seating it in the socket. The card must
be fully and evenly seated in the slot.
Steps
1. Locate the boot device using the following illustration or the FRU map on the controller module:
1 Boot device holder; not removable
2 Boot device
2. Open the boot device cover, hold the boot device by its edges at the notches in the boot device housing, and then gently
lift it straight up and out of the housing.
Attention: Always lift the boot device straight up out of the housing. Lifting it out at an angle can bend or break the
connector pins in the boot device.
4. Align the boot device with the boot device socket or connector, and then firmly push the boot device straight down into the
socket or connector.
Important: Always install the boot device by aligning the front of the boot device squarely over the pins in the socket at
the front of the boot device housing. Installing the boot device at an angle or over the rear plastic pin first can bend or
damage the pins in the boot device connector.
Steps
2. Locate the battery, press the clip on the face of the battery plug to release the lock clip from the plug socket, and then unplug
the battery cable from the socket.
1  CPU air duct
2  NVMEM battery
3  NVMEM battery plug
3. Gently pull the tab on the battery housing away from the controller module side.
The tab is found near the controller module side, near the plug.
4. Place your forefinger at the far end of the battery housing, and then gently push it toward the CPU air duct.
1  NVMEM battery
2  Battery tabs
3  Notch on chassis with alignment arrow
5. Gently pull the battery housing toward the center of the controller module, and then lift the battery out of the controller
module.
6. Align the tabs on the battery holder with the notches in the controller module side, and then gently push the battery housing
so that the notches are under the lip of the controller module side.
7. While gently pushing the battery against the sheet metal on the chassis to hold it in the battery guide, place the forefinger of
your free hand against the battery housing behind the locking tab on the battery, and then gently push the battery housing
away from the CPU air duct.
If it is properly aligned, the battery snaps into place on the side of the controller module. If it does not, repeat these steps.
Steps
1. Verify that the NVMEM battery cable connector is not plugged into the socket.
1
NVMEM DIMMs 1 and 2
Note: See Replacing an NVMEM battery and NVMEM DIMMs in a 32xx system for information about
removing these two DIMMs.
2
System DIMMs 1 through 4
The number of DIMMs in your system will vary:
• In the 3210 and 3240 models, only DIMM sockets 1 and 2 are populated.
3. Note the location and orientation of the DIMM in the socket so that you can insert it in the new controller module in the
proper orientation.
4. Slowly press down on the two DIMM ejector tabs, one at a time, to eject the DIMM from its slot, and then lift it out of the
slot.
Caution: The DIMMs are located very close to the CPU heat sink, which might still be hot. Avoid touching the CPU heat
sink when removing the DIMM.
Attention: Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
5. Locate the corresponding slot for the DIMM in the new controller module, align the DIMM over the slot, and then insert the
DIMM into the slot.
The notch among the pins on the DIMM should align with the tab in the socket. The DIMM fits tightly in the slot but should
go in easily. If not, you should realign the DIMM with the slot and reinsert it.
Important: You must install the NVMEM DIMMs only in the NVMEM DIMM slots.
6. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
The edge connector on the DIMM must make complete contact with the slot.
7. Push carefully, but firmly, on the top edge of the DIMM until the latches snap into place over the notches at the ends of the
DIMM.
9. In the new controller module, orient the NVMEM battery cable connector to the socket on the controller module and plug
the cable into the socket.
You must ensure that the plug locks down onto the socket on the controller module.
Steps
1. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway
into the system.
Note: Do not completely insert the controller module in the chassis until instructed to do so.
2. Recable the management port so that you can access the system to perform the tasks in the following sections.
b. With the cam handle in the open position, firmly push the controller module in until it
meets the midplane and is fully seated, and then close the cam handle to the locked
position.
Attention: Do not use excessive force when sliding the controller module into the
chassis; you might damage the connectors.
• If you are running Data ONTAP 8.2.1 or earlier, enter boot_ontap, press
Ctrl-C when prompted to go to the boot menu, and then select Maintenance mode
from the menu.
• If you are running Data ONTAP 8.2.2 or later, enter boot_ontap maint at the
LOADER prompt.
d. If you have not already done so, reinstall the cable management device, and then tighten the
thumbscrew on the cam handle on the back of the controller module.
e. Bind the cables to the cable management device with the hook and loop strap.
b. Reconnect the power cables to the power supplies and to the power sources, turn on the
power to start the boot process, and then press Ctrl-C to interrupt the boot process when
you see the message Press Ctrl-C for Boot Menu.
Note: If you miss the prompt and the controller module boots to Data ONTAP, enter
halt, enter boot_ontap at the LOADER prompt, press Ctrl-C when
prompted, and then repeat this step.
c. From the boot menu, select the option for Maintenance mode.
d. If you have not already done so, reinstall the cable management device, and then tighten the
thumbscrew on the cam handle on the back of the controller module.
e. Bind the cables to the cable management device with the hook and loop strap.
Important: During the boot process, you might see the following prompts:
• A prompt warning of a system ID mismatch and asking to override the system ID.
• A prompt warning that when entering Maintenance mode in a HA configuration you must ensure that the healthy node
remains down.
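The version-dependent choice of boot command described above can be sketched as a small helper. This is only an illustration under stated assumptions: the function names are hypothetical, and the version string is assumed to be a plain dotted number such as 8.2.1.

```python
def parse_version(text):
    """Turn a dotted version string such as '8.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def maintenance_boot_plan(ontap_version):
    """Return the actions for reaching Maintenance mode from the LOADER prompt.

    Mirrors the two cases in the procedure: Data ONTAP 8.2.1 or earlier goes
    through the boot menu, and 8.2.2 or later uses 'boot_ontap maint'.
    """
    if parse_version(ontap_version) >= (8, 2, 2):
        return ["boot_ontap maint"]
    return [
        "boot_ontap",
        "press Ctrl-C when prompted",
        "select Maintenance mode from the boot menu",
    ]


if __name__ == "__main__":
    for version in ("8.2.1", "8.2.2", "8.3"):
        print(version, "->", maintenance_boot_plan(version))
```

Comparing tuples of integers rather than strings keeps a release such as 8.10 ordered after 8.2.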
Flowchart (steps 1 through 6): verify that the HA state (ha-config show) matches your configuration; if Fibre Channel is
used, restore the FC configuration; then go to the next step.
Steps
1. In Maintenance mode, display the HA state of the new controller module and chassis:
ha-config show
If your system is...    The HA state for all components should be...
In an HA pair           ha
Stand-alone             non-ha
2. If the displayed system state of the controller does not match your system configuration, set the HA state for the controller
module:
ha-config modify controller ha-state
3. If the displayed system state of the chassis does not match your system configuration, set the HA state for the chassis:
ha-config modify chassis ha-state
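The comparison in steps 1 through 3 can be sketched as follows. This is a minimal illustration, not a NetApp tool: the function names are hypothetical, and the states reported by ha-config show are assumed to have been read off the console and passed in as the strings ha or non-ha.

```python
def expected_ha_state(in_ha_pair):
    """Expected HA state for all components, per the table above."""
    return "ha" if in_ha_pair else "non-ha"


def ha_config_fixes(in_ha_pair, controller_state, chassis_state):
    """List the ha-config modify commands needed to match the configuration.

    controller_state and chassis_state are the values displayed by
    'ha-config show', transcribed by the operator (an assumption here).
    """
    expected = expected_ha_state(in_ha_pair)
    commands = []
    if controller_state != expected:
        commands.append(f"ha-config modify controller {expected}")
    if chassis_state != expected:
        commands.append(f"ha-config modify chassis {expected}")
    return commands


if __name__ == "__main__":
    # Stand-alone system whose replacement controller reports 'ha'.
    print(ha_config_fixes(False, "ha", "non-ha"))
```

An empty list means that both components already match and no modify command is needed.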
Steps
1. From the healthy node, verify the values of the FC configuration on the replacement node:
2. Compare the default FC variable settings with the list you saved earlier.
d. Enter one of the following commands, depending on what you need to do:
• To unconfigure ports:
fcadmin config -t unconfigure adapter_name
Steps
2. Because modifying one port in a port pair modifies the other port, answer y when prompted by the system.
After you issue the command, wait until the system stops at the LOADER prompt.
4. Boot the node back into Maintenance mode for the configuration changes to take effect.
• The replacement node is the new node that replaced the impaired node as part of this procedure.
When setting the date and time at the LOADER prompt, verify that all times are set to GMT.
Steps
1. If you have not already done so, halt the replacement node to display the LOADER prompt.
3. At the LOADER prompt, check the date and time on the replacement node:
show date
6. At the LOADER prompt, confirm the date and time on the replacement node:
show date
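Because the times at the LOADER prompt are expected to be in GMT, it can help to convert a local reference time to GMT before comparing it with what show date reports. The following is a rough sketch under stated assumptions: the function names, the output format string, and the one-minute tolerance are all illustrative choices, not values taken from this procedure.

```python
from datetime import datetime, timedelta, timezone


def to_gmt_text(moment):
    """Format a time-zone-aware datetime as a GMT date and time string."""
    if moment.tzinfo is None:
        raise ValueError("moment must carry a time zone")
    gmt = moment.astimezone(timezone.utc)
    # The layout below is an arbitrary choice for display only.
    return gmt.strftime("%m/%d/%Y %H:%M:%S")


def clock_is_close(reported_gmt, reference_gmt, tolerance=timedelta(minutes=1)):
    """Return True when two aware datetimes differ by at most the tolerance."""
    return abs(reported_gmt - reference_gmt) <= tolerance


if __name__ == "__main__":
    # A wall-clock reading taken in a UTC+2 time zone.
    local = datetime(2016, 10, 1, 14, 30, 0, tzinfo=timezone(timedelta(hours=2)))
    print(to_gmt_text(local))
```

Requiring time-zone-aware values avoids silently treating a local wall-clock time as GMT.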
• For ONTAP 8.2 and later, you do not require loopback plugs to run tests on storage interfaces.
Steps
1. If the node to be serviced is not at the LOADER prompt, bring it to the LOADER prompt.
2. On the node with the replaced component, run the system-level diagnostic test: boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to function properly. The
boot_diags command starts special drivers that are designed specifically for system-level diagnostics.
Important: During the boot_diags process, you might see a prompt warning that when entering Maintenance mode in
an HA configuration, you must confirm that the partner remains down.
To continue to Maintenance mode, you should enter y.
4. Display and note the available devices on the controller module: sldiag device show -dev mb
The controller module devices and ports that are displayed can be any one or more of the following:
• fcal is a Fibre Channel-Arbitrated Loop device that is not connected to a storage device or Fibre Channel network.
• sas is a Serial Attached SCSI device that is not connected to a disk shelf.
5. How you proceed depends on how you want to run diagnostics on your system.
Choices
• Running diagnostics tests concurrently after replacing the controller module
• Running diagnostics tests individually after replacing the controller module
Related information
FAS System Level Diagnostics Guide
Steps
1. Display and note the available devices on the controller module: sldiag device show -dev mb
The controller module devices and ports that are displayed can be any one or more of the following:
• bootmedia is the system booting device.
• fcal is a Fibre Channel-Arbitrated Loop device that is not connected to a storage device or Fibre Channel network.
• sas is a Serial Attached SCSI device that is not connected to a disk shelf.
2. Review the enabled and disabled devices in the output from step 1 and then determine which tests you want to
run concurrently.
4. Examine the output and, if applicable, enable the tests that you want to run for the device:
test_index_number can be an individual number, a series of numbers separated by commas, or a range of numbers.
5. Examine the output and, if applicable, disable the tests that you do not want to run for the device by selecting only the tests
that you want to run:
sldiag device modify -dev dev_name -index test_index_number -selection disable
*> <SLDIAG:_ALL_TESTS_COMPLETED>
9. After the tests are complete, verify that there are no hardware problems on your storage system:
sldiag device status -long -state failed
10. Correct any issues that are found, and repeat this procedure.
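The steps above state that test_index_number can be an individual number, a series of numbers separated by commas, or a range of numbers. A helper that expands such a value into individual indexes might look like the sketch below. The function name is hypothetical, and writing a range with a hyphen (for example 4-6) is an assumption, because the procedure does not show the range syntax.

```python
def expand_test_indexes(spec):
    """Expand a test index value into a sorted list of unique integers.

    Accepts a single number ('3'), a comma-separated series ('1,3,5'),
    or a range written with a hyphen ('4-6'); the hyphen form is assumed.
    """
    indexes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            indexes.update(range(start, end + 1))
        else:
            indexes.add(int(part))
    return sorted(indexes)


if __name__ == "__main__":
    print(expand_test_indexes("1, 4-6, 4"))
```

Expanding the value before issuing commands makes it easier to review exactly which tests will be enabled or disabled.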
Steps
3. Examine the output and, if applicable, enable the tests that you want to run for the device:
sldiag device modify -dev dev_name -index test_index_number -selection enable
test_index_number can be an individual number, a series of numbers separated by commas, or a range of numbers.
4. Examine the output and, if applicable, disable the tests that you do not want to run for the device by selecting only the tests
that you want to run:
sldiag device modify -dev dev_name -index test_index_number -selection only
<SLDIAG:_ALL_TESTS_COMPLETED>
boot media sldiag device status -dev bootmedia -long -state failed
b. Turn off or leave on the power supplies, depending on how many controller modules are in
the chassis:
• If you have two controller modules in the chassis, leave the power supplies turned on
to provide power to the other controller module.
• If you have one controller module in the chassis, turn off the power supplies and
unplug them from the power sources.
c. Check the controller module you are servicing and verify that you have observed all the
considerations identified for running system-level diagnostics, that cables are securely
connected, and that hardware components are properly installed in the storage system.
d. Boot the controller module you are servicing, interrupting the boot by pressing Ctrl-C
when prompted.
This takes you to the Boot menu:
• If you have two controller modules in the chassis, fully seat the controller module you
are servicing in the chassis.
The controller module boots up when fully seated.
• If you have one controller module in the chassis, connect the power supplies and turn
them on.
g. Enter boot_diags at the prompt and rerun the system-level diagnostic test.
8. Continue to the next device that you want to test, or exit system-level diagnostics and continue with the procedure.
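When you work through several devices, the sldiag command lines used in this procedure can be assembled consistently with a small helper. This is a sketch only: the function names are hypothetical, and the command templates are copied from the steps above rather than from a complete sldiag reference.

```python
SELECTIONS = ("enable", "disable", "only")


def sldiag_modify(dev_name, test_indexes, selection):
    """Build an 'sldiag device modify' command line for one device.

    test_indexes is a list of integers; they are joined with commas,
    which is one of the forms the procedure allows for test_index_number.
    """
    if selection not in SELECTIONS:
        raise ValueError(f"selection must be one of {SELECTIONS}")
    if not test_indexes:
        raise ValueError("at least one test index is required")
    index_arg = ",".join(str(index) for index in test_indexes)
    return (
        f"sldiag device modify -dev {dev_name} "
        f"-index {index_arg} -selection {selection}"
    )


def sldiag_failed_status(dev_name):
    """Build the status command that lists failed tests for one device."""
    return f"sldiag device status -dev {dev_name} -long -state failed"


if __name__ == "__main__":
    print(sldiag_modify("sas", [1, 2], "only"))
    print(sldiag_failed_status("bootmedia"))
```

The helper only builds text; the commands still have to be entered at the Maintenance mode prompt.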
Steps
b. Enter the information for the target system, and then click Collect Data.
d. Check other cabling by clicking the appropriate tab, and examining the output from Config Advisor.
Reassigning disks
If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when
the giveback occurs at the end of the procedure. In a stand-alone system, you must manually reassign the ID to the disks.
Steps
1. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode:
halt
After you issue the command, you must wait until the system stops at the LOADER prompt.
If you are prompted to override the system ID due to a system ID mismatch, enter y.
4. Wait until the Waiting for giveback... message is displayed on the replacement node console and then, on the healthy
node, verify that the controller module replacement has been detected and the new partner system ID has been automatically
assigned.
Example
5. From the healthy node, verify that any coredumps are saved:
You can respond Y when prompted to continue into advanced mode. The advanced mode prompt appears (*>).
6. Your next step depends on the version of ONTAP your system is running.
b. Once the node displays Waiting for Giveback..., give back the node:
storage failover giveback -ofnode replacement_node_name
As the replacement node boots up, it might again display the prompt warning of a system
ID mismatch and asking to override the system ID. You can respond Y.
The replacement node takes back its storage and completes booting up to the ONTAP
prompt.
Note: If the giveback is vetoed, you can consider overriding the vetoes.
Find the High-Availability Configuration Guide for your version of Data ONTAP 8
d. Wait until the storage failover show-giveback command output indicates that
the giveback operation is complete.
e. Confirm that the HA pair is healthy and takeover is possible: storage failover
show
The output from the storage failover show command should not include the
"System ID changed on partner" message.
e. Wait until the storage failover show-giveback command output indicates that
the giveback operation is complete.
f. Confirm that the HA pair is healthy and takeover is possible: storage failover
show
The output from the storage failover show command should not include the
"System ID changed on partner" message.
Example
Verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the
following example, the disks owned by node1 now show the new system ID, 1873775277:
Disk  Aggregate Home  Owner DR Home Home ID    Owner ID   DR Home ID Reserver   Pool
----- --------- ----- ----- ------- ---------- ---------- ---------- ---------- -----
1.0.0 aggr0_1   node1 node1 -       1873775277 1873775277 -          1873775277 Pool0
1.0.1 aggr0_1   node1 node1         1873775277 1873775277 -          1873775277 Pool0
.
.
.
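Checking a long ownership listing by eye is error-prone, so the comparison in the example above can be sketched in code. This is an illustration under stated assumptions: the function name is hypothetical, rows are assumed to be whitespace-separated text copied from the console, and every all-digit field in a row is treated as a system ID.

```python
def stale_disks(ownership_rows, new_system_id):
    """Return the disks whose rows still show an ID other than the new one.

    Rows with no all-digit field (headers, separator lines, ellipsis lines)
    are skipped. The first field of a data row is taken as the disk name.
    """
    stale = []
    for row in ownership_rows:
        tokens = row.split()
        ids = [token for token in tokens[1:] if token.isdigit()]
        if not ids:
            continue
        if any(token != new_system_id for token in ids):
            stale.append(tokens[0])
    return stale


if __name__ == "__main__":
    rows = [
        "1.0.0 aggr0_1 node1 node1 - 1873775277 1873775277 - 1873775277 Pool0",
        "1.0.1 aggr0_1 node1 node1 1873775277 1873775277 - 1873775277 Pool0",
    ]
    print(stale_disks(rows, "1873775277"))
```

An empty result means that every listed disk shows only the new system ID.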
9. Verify that the expected volumes are present for each node:
vol show -node node-name
Steps
1. If you have not already done so, reboot the replacement node, interrupt the boot process by entering Ctrl-C, and then select
the option to boot to Maintenance mode from the displayed menu.
You must enter Y when prompted to override the system ID due to a system ID mismatch.
Note: Make a note of the old system ID, which is displayed as part of the disk owner column.
Example
The following example shows the old system ID of 118073209:
3. Reassign disk ownership by using the system ID information obtained from the disk show command:
disk reassign -s old system ID
In the case of the preceding example, the command is: disk reassign -s 118073209
You can respond Y when prompted to continue.
You must verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the
following example, the disks owned by system-1 now show the new system ID, 118065481:
Example
5. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode:
halt
After you issue the command, you must wait until the system stops at the LOADER prompt.
6. If you are running Data ONTAP 8.2.2 or earlier, on the replacement node at the prompt, confirm that the new controller
module boots in clustered Data ONTAP:
setenv bootarg.init.boot_clustered true
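The disk reassign command in step 3 takes the old system ID noted from the disk show output. A small guard against pasting the wrong field can be sketched as follows. The function name is hypothetical, and the check that the ID contains only digits is an assumption based on the example IDs in this procedure.

```python
def disk_reassign_command(old_system_id):
    """Build the 'disk reassign -s' command for a noted old system ID.

    Rejects anything that is not a string of digits, to catch accidental
    pastes of a node name or a disk name.
    """
    text = str(old_system_id).strip()
    if not text.isdigit():
        raise ValueError(f"system ID should contain only digits: {text!r}")
    return f"disk reassign -s {text}"


if __name__ == "__main__":
    # Old system ID from the example in this procedure.
    print(disk_reassign_command("118073209"))
```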
Steps
1. If you need new license keys in the Data ONTAP 8.2 format, obtain replacement license keys on the NetApp Support Site in
the My Support section under Software licenses.
Note: The new license keys that you require are auto-generated and sent to the email address on file. If you fail to receive
the email with the license keys within 30 days, contact technical support.
3. If you want to remove the old licenses, complete the following substeps:
Related information
Find a System Administration Guide for your version of Data ONTAP 8
NetApp KB Article 3013749: Data ONTAP 8.2 and 8.3 Licensing Overview and References
Steps
7. On each node, verify that all keys are stored on their key management servers:
key_manager query
Steps
1. Verify that the logical interfaces are reporting to their home server and ports:
network interface show -is-home false
If any LIFs are listed as false, revert them to their home ports:
network interface revert *
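The decision in this step can be sketched as follows. This is only an illustration: the record layout (a name and an is_home flag for each LIF) and the function names are hypothetical, and the records are assumed to have been transcribed from the network interface show output.

```python
def lifs_to_revert(lifs):
    """Return the names of LIFs that are not on their home port."""
    return [lif["name"] for lif in lifs if not lif["is_home"]]


def revert_plan(lifs):
    """Return the revert command when at least one LIF is away from home."""
    if lifs_to_revert(lifs):
        return ["network interface revert *"]
    return []


if __name__ == "__main__":
    # Hypothetical LIF records for illustration.
    sample = [
        {"name": "lif1", "is_home": True},
        {"name": "lif2", "is_home": False},
    ]
    print(lifs_to_revert(sample))
    print(revert_plan(sample))
```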
If... Then...
AutoSupport is enabled Send an AutoSupport message to register the serial number.
AutoSupport is not enabled Call NetApp Support to register the serial number.
Related information
NetApp Support
Disposing of batteries
You must dispose of batteries according to the local regulations regarding battery recycling or disposal. If you cannot properly
dispose of batteries, you must return the batteries to NetApp, as described in the RMA instructions that are shipped with the kit.
Related information
https://library.netapp.com/ecm/ecm_download_file/ECMP12475945
Copyright information
Copyright © 1994–2016 NetApp, Inc. All rights reserved. Printed in the U.S.
No part of this document covered by copyright may be reproduced in any form or by any means—graphic, electronic, or
mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system—without prior written
permission of the copyright owner.
Trademark information
NetApp, the NetApp logo, Go Further, Faster, AltaVault, ASUP, AutoSupport, Campaign Express, Cloud ONTAP, Clustered
Data ONTAP, Customer Fitness, Data ONTAP, DataMotion, Fitness, Flash Accel, Flash Cache, Flash Pool, FlashRay,
FlexArray, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexVol, FPolicy, GetSuccessful, LockVault, Manage
ONTAP, Mars, MetroCluster, MultiStore, NetApp Insight, OnCommand, ONTAP, ONTAPI, RAID DP, RAID-TEC, SANtricity,
SecureShare, Simplicity, Simulate ONTAP, Snap Creator, SnapCenter, SnapCopy, SnapDrive, SnapIntegrator, SnapLock,
SnapManager, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapValidator, SnapVault, StorageGRID, Tech
OnTap, Unbound Cloud, and WAFL and other names are trademarks or registered trademarks of NetApp, Inc., in the United
States, and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders
and should be treated as such. A current list of NetApp trademarks is available on the web at
http://www.netapp.com/us/legal/netapptmlist.aspx.