Processor Controller Module (PCM) Replacement For The Fas2040 For Netapp Authorized Service Engineers

This document provides instructions for replacing the Processor Controller Module (PCM) on a NetApp FAS2040 appliance. It outlines visual checks, shutting down the node, removing and replacing the PCM, checking the battery, running diagnostics, and rebooting the system. Additional steps are included for configurations using NetApp Storage Encryption.

Processor Controller Module (PCM) Replacement for the FAS2040

For NetApp Authorized Service Engineers


README FIRST (1 of 2) Doc Rev -010a

README FIRST
New Battery Process for Processor Controller Module (PCM) Replacement:
1) The replacement PCM comes with a pre-installed NVMEM battery that may be discharged.
2) Effective 8-May-2015, new process is in place to ship a new NVMEM battery (p/n X1845A-R6) in a separate box
to replace the battery that is pre-installed on the rev "A", 3244A-R5 replacement PCM.
- No separate battery is shipped for a rev "B", 3244B PCM.
3) Check your dispatch data to see if a battery is mentioned or the p/n X1845A-R6 is listed. On rare occasions a
separate NVMEM battery might not have shipped. In this case, the battery from the original PCM should be
moved to the replacement PCM.

➢ No "Failed" disks can exist in the target node in an HA config, or the disk reassign will not execute. The AP covers this.
➢ If this system has SAN-attached tape drives, confirm a storage admin is available to remap the switch if the SAN
tape is using the on-board FC Adapter.
➢ Known Bugs/Issues - Bug Table and Notes Below
Bug          Note  Description                                                        First Fixed Release
-----------  ----  -----------------------------------------------------------------  -------------------
TSB-1110-04  1     TSB-1110-04 is an internal Bulletin: When the disk reassign is     See Note 1
                   performed on the partner (HA-takeover) the GB must be immediately
                   performed; a TO/GB from the repaired node is required to sync the
                   system-IDs.
590488       2     In "disruptive" MB w/NVMEM replacements, a TO/GB from the          See Note 2
                   repaired node is req'd.
489060       3     NDMP, Qtree-SnapMirror, Vol-SnapMirror or SnapVault processes      See Note 3
                   can hang TO/GB.
Note (TSB-1110-04): For ONTAP 8.0.5 or higher and 8.1.3 or higher the console message to perform an additional TO/GB can be ignored.
(The bugs listed in the TSB are fixed in ONTAP releases 8.0.5 and 8.1.3 and higher, but the console message was not
removed.)

Bug Notes:
1 In some versions of ONTAP, when the 'disk reassign' command is executed from the partner, ONTAP may print a
warning that states two things:
(i) The giveback must be done right away. IF a GB will not be performed immediately, the disk reassign needs to be
postponed.
(ii) A second TO/GB should be performed from the repaired node. This is covered in the AP. (TSB-1110-04)

disk reassign: A giveback must be done immediately following a reassign of partner
disks. After the partner node becomes operational, do a takeover and giveback of
this node to complete the disk reassign process.
Do you want to continue (y/n)?

2 IF this system has a partner AND the partner did NOT take over this controller, it is still necessary to sync the new system-IDs
by executing a TO/GB from the repaired node, although no console message is displayed. This is covered in the AP.
3 The AP covers asking the customer whether they are running these processes. If so, the AP links to instructions for disabling them.
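
The version rule in the TSB-1110-04 note can be sketched as a small check. This is a minimal Python illustration, not a NetApp tool: the function names are hypothetical, and reading "8.1.3 and higher" as also covering later release families (for example 8.2) is an assumption.

```python
import re


def parse_ontap_version(text: str) -> tuple:
    """Parse the leading 'major.minor[.patch]' of a version string such as '8.1.3P2'."""
    match = re.match(r"\s*(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"not an ONTAP version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def extra_to_gb_message_can_be_ignored(version_text: str) -> bool:
    """True when the release already contains the fixes named in the note.

    Assumption: the 8.0 family is fixed from 8.0.5, and everything from
    8.1.3 upward (including later families) is fixed.
    """
    version = parse_ontap_version(version_text)
    if version[:2] == (8, 0):
        return version >= (8, 0, 5)
    return version >= (8, 1, 3)
```

For releases where this returns False, follow the console message and the AP steps for the additional TO/GB.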

➢ Link to Statement of Volatility by Platform: http://support.netapp.com/info/web/ECMP1132988.html

➢ AP doc rev is at top of page. If using a hard copy for a secure site, be sure to print all the linked documents in this AP.
README FIRST (2 of 2)

README FIRST
➢ This AP has been updated to include commands for systems running "Cluster-Mode" (C-Mode) ONTAP.
● The login name for C-Mode systems is "admin", not "root".
● The ONTAP version and mode is listed in your dispatch!
● C-Mode: Has two console command shells, clustershell and nodeshell. The default shell is clustershell.
IF clustershell, the console prompt includes a double colon ( :: ). Ex(1): cluster::> Ex(2): cluster::storage>
● To switch from clustershell to nodeshell, enter 'run local' at the ::> prompt; the double colon (::) is then
removed from the prompt. To exit nodeshell, enter 'exit' or Ctrl-D.
● From clustershell, nodeshell commands can be entered by prefacing the 7-Mode command with "run local".
Ex: cluster::> run local sysconfig -v Note: not all 7-Mode commands are supported in C-Mode.
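
The prompt rule above (a double colon means clustershell) can be sketched in a few lines of Python, for example when post-processing a captured console log. The function names are hypothetical and the sketch assumes the prompt text has already been extracted from the log.

```python
def is_clustershell(prompt: str) -> bool:
    """Clustershell prompts contain a double colon, e.g. 'cluster::>'."""
    # Spaces are removed so that 'cluster ::>' is recognized as well.
    return "::" in prompt.replace(" ", "")


def command_for_prompt(prompt: str, seven_mode_command: str) -> str:
    """Prefix a 7-Mode (nodeshell) command with 'run local' when in clustershell."""
    command = seven_mode_command.strip()
    if is_clustershell(prompt):
        return f"run local {command}"
    return command
```

Remember that not every 7-Mode command is supported in C-Mode, so the prefixed command may still be rejected.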

➢ This AP has been updated to include additional commands and procedures for a system configured with NSE
(NetApp Storage Encryption) disks.
● NSE is supported in DOT 8.1 and higher, 7-mode only - No cluster mode support at this time.
● The DOT version and mode is listed in your dispatch!
● The "badging" on the NSE disk canister is embossed as compared to standard disk badging. See picture >> here
● Presently all NSE systems are HA configured, and all disks in all shelves must be NSE disks.

SECTION OUTLINE of a FAS2040 Appliance Processor Controller Module (PCM) Replacement


This procedure will take 60-90 minutes
I. Appliance / PCM Visual Checks
II. Node Pre-Checks
III. Node State Check and Shutdown Procedure
IV. Capture the Current System Configuration
V. Remove the cables and extract the PCM
VI. Move the SFPs - Exchange the CF Cards
VII. Replace the NVMEM Battery
VIII. Partially Reinsert the Replacement PCM and Reconnect the cables
IX. Set date and time on the RTC
X. Verify Battery Status
XI. Run Diagnostics (20-30 min)
XII. Verify FC Adapter Configuration
XIII. Capture new System-ID on replacement Controller
XIV. Disk Reassign
XV. Boot PROM Variable Checks
XVI. Boot the Operating System - 'cf giveback' if applicable
XVII. NetApp Storage Encryption (NSE) System?
XVIII. Configure the BMC if Necessary
XIX. Controller Reg., Enable options, Submit logs, Part Return

I. FAS2040: Appliance / PCM Visual Checks


Step Action Description
1 Visually verify that you are working on the correct model, and READ the STOP box and the other note boxes below.
The FAS2040 Appliance has 1 or 2 Processor Controller Module(s) (PCM) integrated into a 2U, 12-bay shelf.
[Figs 1-3: chassis views and PCM faceplate (images not reproduced). Callouts:]
- FAS model number.
- PS-1 and PS-2 power supplies, each with an AC switch.
- One orange thumbscrew and cam handle to extract each PCM.
- " ! " LED is ON when hardware failures are detected or if controller failover is disabled.
- Controller activity LEDs: if an LED flashes GREEN, that controller is online.
- Rear view (Fig 2): HA (Active-Active) configurations have 2 PCMs (A and B); non-HA configurations have 1 PCM in the bottom slot.
- Each Controller Module (A or B slot) has its own System Serial Number.
- Fibre Channel ports: 0a, 0b. Ethernet ports: e0a, e0b, e0c, e0d.
- SAS adapter port 0d, console port (IOIOI) and BMC port.
- The Status "!" LED will be ON if the PCM is faulted or if HA is disabled.
STOP !! The NVMEM LED on the faceplate will start flashing when power is removed from the controller if the system is
"waiting for giveback", or if the system was not shut down properly (uncommitted data). Follow the steps in Section V carefully.

2 Continue with Section I on next page.



I. FAS2040: Appliance / PCM Visual Checks (cont.)


Step Action Description
3 Notes: (Fig 4: FAS2040 PCM)
1. This Action Plan covers controllers running ONTAP 7-Mode or Cluster-Mode.
2. The procedure will take 60-90 minutes, or 90-120 minutes if the system has NSE disks.
3. Note the Caution on NVMEM LEDs in Section V.
4. This Action Plan needs to be followed in step order.
5. FC port configuration, disk list and the system date are captured prior to removing the original Controller.
6. The Compact Flash (CF) Card needs to be moved from the Original PCM to the Replacement PCM.
7. System variables (date-time, disk reassignment and FC port configuration) must be verified before rebooting the system.
8. In an HA configuration running ONTAP 8, the console may report that you must perform a final 'cf takeover' and
'cf giveback' from the "partner node" (the node that was repaired) to complete the 'disk reassign' process. Follow the
new steps in the 'Disk Reassign' and 'Boot the OS' sections carefully.

II. FAS2040: Node Pre-Checks


Step Action Description
1 Verify that the "Order Reference" 8xxxxxxxxx number on the RMA packing slip is the same as the Part Request (PREQ) number
listed in your dispatch notes.
2 Adhere to anti-static precautions. (A paper ESD strap is included inside the RMA box if you don't have your own)
3 Remove the replacement PCM from the anti-static bag and examine the housing and connector for damage.
4 Go to Section III "Node State Check and Shutdown" on next page.

III. FAS2040: Node State Check and Shutdown Procedure


Step Action Description
1 Always capture the node’s console output to a text file, ex: “NetApp-dispatch-num.txt”, even if using the end-user's computer.
To review the Job Aid on how to connect to console (IOIOI) port and serial emulator options, click > Console Attach Aid
NOTE Visual Chassis Checks
FRONT: Look for an Amber Status ( ! ) LED (Fig 5a), then observe which Activity LED is flashing and which is OFF. A controller
whose Activity LED is not flashing is not running Data ONTAP, or the controller is not installed.
REAR: Look for the controller that has the Status ( ! ) LED ON (Fig 5b). Both could be on; verify which Activity LED is not flashing.
Continue with console response checks in step 2.
Chassis Check: To see if two controllers are installed, reference the HA figures here > HA Figs

[Fig 5a: Front OPS LEDs. AC Power LED. The " ! " LED is ON when hardware failures are detected, when a controller failover has
occurred, or when HA is disabled. Controller Activity LEDs (A, B): if an LED actively flashes GREEN, that controller is online.
"A" is the top PCM, "B" is the bottom PCM.]
[Fig 5b: Controller Fault ( ! ) LED on rear. In the example, the Fault ( ! ) LED on the "B" PCM is ON and the "A" (top) LED is OFF;
"A" is online.]

2 Check the state of the node by viewing the console port responses from the controller (from each controller in an HA (Active-Active)
configuration). An HA config requires two controller assemblies installed in the same physical chassis. Detailed messages here> Appliance Check
3 Non-HA Controller Configuration: If the console response is "login" or "password" or the <system prompt>, the end-user
will have to issue a 'halt' on the system for proper shutdown. Work with NGS if you have questions.
NOTE The "LOADER" prompt will include -A if attached to the top controller or -B if attached to the bottom controller.
NOTE HA-config Status Command: After logging in, "cf status" will display the state of the HA pair. Example of >> cf status cmd
WARNING for HA configurations:
STOP! If the failure has caused an HA failover, you may have been dispatched on the surviving controller's serial number, not the
failed one.
4 Dual Controller Configuration
a) If both controllers are UP and Online: the end-user will have to issue a cf takeover from the partner node if
controller failover is active, or halt the node if controller failover is disabled. Work with NGS if you have questions.
5 If the console response is "LOADER-A|B>", go to Section IV.
6 Continue with Section III on next page.

III. FAS2040: Node State Check and Shutdown Procedure (cont.)


Step Action Description
7 If the console response is "Waiting for giveback...", follow steps 7a-7c. If the console response is LOADER, skip to the next Section.
a) At the "Waiting for giveback..." prompt, enter: Ctrl-C
b) At the message "Do you wish to halt this node rather than wait [y/n]?", enter: y
c) After the system drops to the LOADER-A|B> prompt, continue with Section IV.

Waiting for giveback...(Press Ctrl-C to abort wait)    <-- Step 7: Hitting Enter displays this prompt
^C    <-- Step 7a): Enter: CTRL-C
This node was previously declared dead.
.....
The HA partner is currently operational and in takeover mode.    <-- Information on Partner Status
.....
.....
Do you wish to halt this node rather than wait [y/n]? y    <-- Step 7b): Halt the node
System halting...

LOADER-A>
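
The console checks in Section III can be summarized as a lookup from console response to next action. The following Python sketch is illustrative only: the state labels and function name are hypothetical, and any response that is not a LOADER prompt or a giveback wait is assumed to be a login, password or system prompt.

```python
import re

# Matches 'LOADER>', 'LOADER-A>' and 'LOADER-B>' at the start of a line.
LOADER_PROMPT = re.compile(r"^LOADER(-[AB])?>")

NEXT_ACTION = {
    "loader": "Go to Section IV.",
    "waiting_for_giveback": "Press Ctrl-C, answer y to halt, then go to Section IV.",
    "login_or_system_prompt": "Have the end-user shut the node down as described in steps 3 and 4.",
}


def classify_console(line: str) -> str:
    """Classify one console line into the states used in Section III."""
    text = line.strip()
    if LOADER_PROMPT.match(text):
        return "loader"
    if text.lower().startswith("waiting for giveback"):
        return "waiting_for_giveback"
    # Assumption: everything else is a login, password or system prompt.
    return "login_or_system_prompt"
```

For example, `NEXT_ACTION[classify_console("LOADER-B>")]` yields the instruction to go to Section IV.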

IV FAS2040: Capture the Current System Configuration


Step Action Description
NOTE Confirm the "console" output is being saved to a text file. It will be needed later in this action plan.
1 IF Cluster-Mode, continue with next step otherwise skip to step 2.
a) After the target system drops to the LOADER-A|B> prompt, login to the partner and check if the auto-giveback option is
enabled by entering the following command: You can copy-n-paste the command syntax.

Cluster-Mode (Run in clustershell)


cluster::> sto fa show -node local -fields auto-giveback

IF enabled C-Mode will show:


node auto-giveback
-------------- -------------
Node-B true

b) Disable the auto-giveback option if enabled from the partner node. (copy-n-paste)

Cluster-Mode (Run in clustershell)


cluster::> sto fa modify -node local -auto-giveback false

2 The date and time are stored in the system PROM in Greenwich Mean Time (GMT), also known as Coordinated Universal Time (UTC).
At the LOADER> prompt, enter: show date. Record on paper the system's GMT time and the local time to determine the
number of hours (and minutes) the local time is ahead of or behind GMT.

LOADER-A> show date    <-- Step 2): Enter: show date
Current date & time is: 06/12/2011 15:59:10
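
The offset between local time and GMT can also be worked out with a short script instead of on paper. This is a minimal Python sketch; the function names are hypothetical, and it assumes the 'show date' line uses the mm/dd/yyyy hh:mm:ss layout seen in the example.

```python
from datetime import datetime, timedelta

LOADER_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumed layout of the 'show date' output
PREFIX = "Current date & time is:"


def parse_show_date(line: str) -> datetime:
    """Extract the GMT timestamp from a 'show date' output line."""
    text = line.strip()
    if not text.startswith(PREFIX):
        raise ValueError(f"unexpected 'show date' line: {line!r}")
    return datetime.strptime(text[len(PREFIX):].strip(), LOADER_FORMAT)


def format_offset(gmt: datetime, local: datetime) -> str:
    """Return how far local time is ahead of (+) or behind (-) GMT as +/-HH:MM."""
    offset: timedelta = local - gmt
    total_minutes = int(round(offset.total_seconds() / 60))
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"
```

Read the local wall-clock time at the same moment the LOADER time is displayed, so that the difference reflects only the time-zone offset.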

3 Enter: printenv This command displays (and captures) all boot environmental variables.

LOADER-A> printenv STEP 3): Enter: printenv

An example of a "printenv" output is here > printenv-C-no-V-MC.pdf


4 Continue with Section IV on next page.

IV FAS2040: Capture the Current System Configuration (cont.)


Step Action Description
5 Enter: ^G (Ctrl-G) to enter the BMC shell. The prompt changes to: "bmc shell ->".

LOADER> ^G (CTRL-G)    <-- STEP 5: Enter: (CTRL-G) to enter the BMC shell
=== OEMCLP v1.0.0 BMC v1.5 ===
bmc shell ->

6 At the bmc shell -> prompt:
a) Enter: 'bmc config' to capture the BMC configuration. Read "STOP" below!

Example Only
bmc shell -> bmc config    <-- STEP 6(a): Capture the BMC configuration
ipaddr :10.61.69.90
netmask :255.255.255.0
gateway :10.61.69.1
mac :00:a0:98:0d:49:76
dhcp :off
link :up
autoneg :enabled
speed :100
duplex :full
bmc shell ->

STOP From the output, verify how the "ipaddr" is configured. If it shows a valid IP address, you will need to run "BMC
setup" on the replacement MB later in this AP.
No "BMC setup" will be required if the "ipaddr" value is "0.0.0.0", which means the BMC is not configured in this system.
b) Enter: 'exit' to return to the "LOADER>" prompt.
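
The "ipaddr" check in the STOP box can be applied to the captured console text with a short script. This is a minimal Python sketch with hypothetical function names; it assumes the 'bmc config' output uses the "key :value" layout shown in the example.

```python
def parse_bmc_config(output: str) -> dict:
    """Parse 'key :value' lines from captured 'bmc config' output."""
    config = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip prompts and annotations: real keys are single words.
        if sep and key and " " not in key:
            config[key] = value.strip()
    return config


def bmc_setup_required(config: dict) -> bool:
    """True when the BMC had an IP address, i.e. 'ipaddr' is not 0.0.0.0."""
    if "ipaddr" not in config:
        raise ValueError("no 'ipaddr' line found in the captured output")
    return config["ipaddr"] != "0.0.0.0"
```

Splitting on the first colon only keeps values such as the MAC address intact.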
7 Continue with Section IV on next page.

IV FAS2040: Capture the Current System Configuration (cont.)


Step Action Description
8 From the LOADER-A|B> prompt enter autoboot to initiate a prom bootstrap.
a) When this message appears: "Press CTRL-C for Boot Menu" , press CTRL-C (^C) to load the "Boot Menu". After about
30-40 seconds, the "Maintenance menu" will appear.
NOTE If the original MB fails to boot to the Maintenance menu due to an error, skip to Section V.
b) For ONTAP 7.x, ONTAP 8.0.x 7-Mode and ONTAP 8.1 (7-Mode, C-Mode), refer to the first console example and enter 5 for
"Maintenance mode boot". For ONTAP 8.0.x Cluster-Mode only, refer to the second console example and enter maint.
c) If asked "Continue with boot?" Answer: y

Console example: ONTAP 7.x, 8.0.x 7-Mode and ONTAP 8.1 (7-Mode, C-Mode)

LOADER-A> autoboot    <-- Step 8: Enter: autoboot
Loading X86_64/freebsd/image1/kernel:0x100000/3375736 0x538280/3221872
.....
.....
Copyright (C) 1992-2010 NetApp.
All rights reserved.
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *    <-- Step 8a): Wait for this message, then hit ^C (CTRL-C)
*                             *
*******************************
^CBoot Menu will be available.

Please choose one of the following:

(1) Normal Boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Clean configuration and initialize all disks.
(5) Maintenance mode boot.
(6) Update flash from backup config.
(7) Install new software first.
(8) Reboot node.
Selection (1-8)? 5    <-- Step 8b): Enter: 5

You have selected the maintenance boot option:
.....
.....
In a High Availability configuration, you MUST ensure that the partner node is (and remains) down,
or that takeover is manually disabled on the partner node, because High Availability software is
not started or fully enabled in Maintenance mode.

FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED

NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance
mode while the partner is up

Continue with boot? yes    <-- Step 8c): If this node has a partner node this message will be displayed. Answer: y to the "Continue with boot?" question.
.....
.....
*>    <-- maintenance mode console prompt

Console example: ONTAP 8.0.x Cluster-Mode Only

LOADER-A> autoboot    <-- Step 8: Enter: autoboot
Loading x86_64/freebsd/image2/kernel:....0x100000/3386664 0x53b000/3222096 0x84da50/1190096
.....
.....
NetApp Data ONTAP 8.0.1 Cluster-Mode
Copyright (C) 1992-2010 NetApp.
All rights reserved.
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *    <-- Step 8a): Wait for this message, then hit ^C (CTRL-C)
*                             *
*******************************
^CBoot Menu will be available.

How would you like to continue booting?
(normal) Normally
(install) Install new software first
(password [<user>]) Change user password
(setup) Run setup first
(init) Initialize disks and create flexvol
(maint) Boot into maintenance mode
(syncflash) Update flash from backup config
(reboot) Reboot node
Please make a selection: maint    <-- Step 8b): Enter: maint
.....
.....
In a High Availability configuration, you MUST ensure that the partner node is (and remains) down,
or that takeover is manually disabled on the partner node, because High Availability software is
not started or fully enabled in Maintenance mode.

FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED

NOTE: It is okay to use 'show/status' sub-commands such as 'disk show or aggr status' in Maintenance
mode while the partner is up
.....
*>    <-- maintenance mode console prompt

9 From the *> prompt enter: fcadmin config to log the configuration of the integrated FC host adapters.
a) Check if "0a" and "0b" Adapter ports are configured as a "target". If so, it will need to be verified later.

Example Only
*> fcadmin config    <-- STEP 9): Enter: fcadmin config
Local
Adapter Type State Status
---------------------------------------------------
0a target CONFIGURED offline
0b target CONFIGURED offline

STEP 9a): Log all the adapters listed as "target" adapters. In our example, adapters 0a and 0b are targets.
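
The "target" adapters can be pulled out of the captured 'fcadmin config' text with a short script. This is a minimal Python sketch with a hypothetical function name; it assumes each adapter row has the four columns shown in the example (adapter, type, state, status).

```python
import re

# One adapter row: port name such as '0a', then type, state and status.
ADAPTER_ROW = re.compile(r"^\s*(\d+[a-z])\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def target_adapters(output: str) -> list:
    """Return the adapter ports whose Type column reads 'target'."""
    targets = []
    for line in output.splitlines():
        row = ADAPTER_ROW.match(line)
        if row and row.group(2).lower() == "target":
            targets.append(row.group(1))
    return targets
```

Keep the returned list with the console log so the same ports can be verified on the replacement PCM later in the AP.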

10 Continue with Section IV on next page.



IV FAS2040: Capture the Current System Configuration (cont.)


Step Action Description
11 Ask the customer whether this system has SAN-attached tape drives. If so, follow step 11(a). If not, go to step 12.
a) Enter the 'fcadmin channels' command to capture current WWPNs for the SAN-attached tape.
12 Enter: disk_list to capture disk models.
13 Enter: storage show disk -p to capture multipathing information.
14 Next, from the *> prompt enter: disk show -v to view which SAS and FC Adapter ports are driving disks. See Text Box 14.
NOTE The "disk show -v" sample output below is abbreviated console output. If DOT 8.x, a HOME column is also listed with
OWNER for each disk, which displays the node's system name (and system-ID). After the controller is replaced it is
necessary to confirm each SAS/FC Adapter port is seeing its storage.
15 Take note of all the "unique" Adapter port numbers displayed. See Text Box STEP 15. In this example: SAS Adapters 0c, 0d and
FC Adapters 0a, 0b are displayed.
16 At the *> prompt enter: halt (after PROM initialization the console will display the LOADER-A|B> prompt)

Example Only
*> disk show -v
Local System ID: 122217803

DISK      OWNER               POOL   SERIAL NUMBER         HOME
--------  ------------------  -----  --------------------  ------------------
0c.00.0   tsst-2 (142217816)  Pool0  3LM17RW900009750Q6SF  tsst-2 (142217816)
0c.00.1   tsst-2 (142217816)  Pool0  3LM1623E00009750Q7YT  tsst-2 (142217816)
0c.00.10  tsst-2 (142217816)  Pool0  3LM18TSE00009751QMQU  tsst-2 (142217816)
0c.00.4   tsst-2 (142217816)  Pool0  3LM19W4J00009751QPD1  tsst-2 (142217816)
0c.00.6   tsst-2 (142217816)  Pool0  3LM185P700009750FB2K  tsst-2 (142217816)
0c.00.9   tsst-2 (142217816)  Pool0  3LM194HA000097510H4Q  tsst-2 (142217816)
0c.00.7   tsst-2 (142217816)  Pool0  3LM1BQTG00009752XK3D  tsst-2 (142217816)
0c.00.5   tsst-2 (142217816)  Pool0  3LM1BQLG00009801JB5M  tsst-2 (142217816)
.....
.....
0a.41     tsst-2 (142217816)  Pool0  JLVT29GC              tsst-2 (142217816)
0a.43     tsst-2 (142217816)  Pool0  JLVT7BUC              tsst-2 (142217816)
0a.33     tsst-2 (142217816)  Pool0  JLVS4EHC              tsst-2 (142217816)
.....
.....
0b.21     tsst-1 (122217803)  Pool0  JLVT0KDC              tsst-1 (122217803)
0b.18     tsst-1 (122217803)  Pool0  JLVT2HZC              tsst-1 (122217803)
0b.28     tsst-1 (122217803)  Pool0  JLVS585C              tsst-1 (122217803)
....
....
0d.01.11  tsst-1 (122217803)  Pool0  9QJ75555              tsst-1 (122217803)
0d.01.0   tsst-1 (122217803)  Pool0  9QJ756DN              tsst-1 (122217803)
0d.01.3   tsst-1 (122217803)  Pool0  9QJ758RZ              tsst-1 (122217803)
0d.01.7   tsst-1 (122217803)  Pool0  9QJ754ST              tsst-1 (122217803)
0d.01.6   tsst-1 (122217803)  Pool0  9QJ75925              tsst-1 (122217803)
0d.01.10  tsst-1 (122217803)  Pool0  9QJ74TQG              tsst-1 (122217803)
0d.01.9   tsst-1 (122217803)  Pool0  9QJ758NQ              tsst-1 (122217803)
0d.01.5   tsst-1 (122217803)  Pool0  9QJ74VNZ              tsst-1 (122217803)
.....
.....
*>
*> halt    <-- Step 16: Enter halt to exit to the LOADER-A|B> prompt

STEP 14: The disk show -v command prints out the System ID of the Local System (122217803). It also prints the owner of
each disk under the HOME heading, which lists the node's system name. This system name is (tsst-1) and it owns disks:
0b.21, 0b.18, 0b.28, 0d.01.11, 0d.01.0, etc.
NOTE: Partner-owned disks are intermixed in the output. The partner hostname is 'tsst-2' and its System ID is (142217816).
STEP 15: Under the DISK heading, all SAS & FC Adapters are listed. In this example SAS adapters 0c and 0d and FC adapters
0a and 0b are seen, but typically there are more. After the controller is replaced, confirm the same adapters are listed,
meaning there is an active SAS/FC path to the disks.
A typical listing will display many more disks and FC/SAS adapters than this partial listing.
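
Collecting the unique adapter ports from a long 'disk show -v' capture is easy to get wrong by eye. The following Python sketch is one way to do it; the function names are hypothetical, and it assumes disk names follow the adapter.shelf.bay or adapter.id pattern seen in the example.

```python
import re

# A disk row starts with a name such as '0c.00.10' or '0a.41'; the adapter is the first field.
DISK_ROW = re.compile(r"^\s*(\d+[a-z])\.\d+(?:\.\d+)?\s+\S+")


def adapter_ports(output: str) -> list:
    """Return the sorted, unique adapter ports that appear under the DISK heading."""
    ports = set()
    for line in output.splitlines():
        row = DISK_ROW.match(line)
        if row:
            ports.add(row.group(1))
    return sorted(ports)


def missing_ports(before: str, after: str) -> list:
    """Ports seen before the replacement that are absent from the later capture."""
    return sorted(set(adapter_ports(before)) - set(adapter_ports(after)))
```

An empty result from `missing_ports` suggests every adapter that was driving disks before the swap still has a path afterwards; it does not replace the checks in the AP.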

17 Go to Section V, "Remove the cables and extract the PCM" on next page.

V. FAS2040: Remove the cables and extract the PCM


Step Action Description
STOP DO NOT turn off the power supplies because disks are spinning in the chassis.
1 On the node to be serviced, loosen the orange thumbscrew, ref Fig 3. Pull down on the cam lever and slide the PCM towards
you halfway out of the chassis.
STOP! READ this CAUTION
* HA (Active-Active) Configuration: If the NVMEM Status LED (ref Fig 3) starts flashing when the PCM is extracted
from the chassis:
(i) Confirm with the end-user or NGS that the partner controller had a clean takeover or that this controller was
"waiting for giveback"; in either case, the flashing LED can be ignored.
(ii) If the takeover was not successful, the flashing LED indicates uncommitted customer data. Contact NGS.
* Non-HA Configuration: If the NVMEM Status LED is flashing, the system was not 'halted' properly:
(i) Ask the end-user if the controller was properly "halted". If not, re-insert the controller and, if the system does
not autoboot, enter: bye at the LOADER-A|B> prompt. If the system boots to the login prompt, log in and then enter: halt
to shut down properly. Engage NGS if you have questions.
* The node configuration should have been determined by following Section III.
2 Before proceeding further, resolve the state of the NVMEM LED, if it applies, by following the caution above.
3 Label each cable connector with its port number and then unplug the cabling from the connector.
4 Pull the cam handle downward and slide the controller module out of the system.

VI. FAS2040: Move the SFPs - Exchange the CF Cards


Step Action Description
1 Remove the SFPs/GBICs installed in the Ethernet and FC ports of the original Controller Module one at a time, and fully insert
each one into the same port location in the replacement Module. (Do not mix them up!)
2 Turn the original PCM upside down to reveal the Compact Flash (CF) cover.
3 Slide the CF cover up, carefully slide the CF card from its connector, and mark it with an "O" for original. Ref Fig 6.

[Fig 6: FAS2040 PCM, bottom view (the NetApp label is on the top side). Slide the CF cover to expose the CF card, then slide
the CF card to disengage it from the connector.]

4 Exchange the CF cards between the PCMs. The one marked "O" should now be in the replacement PCM.
5 Go to Section VII, "Replace the NVMEM Battery" on next page.

VII. FAS2040: Replace the NVMEM Battery


Step Action Description
STOP READ your dispatch instructions and check what part(s) are to be replaced from the two options below:
1A Dispatch is to replace the PCM and Battery shipped in a separate box
Effective 8-May-2015: Have both the PCM (Processor Controller Module) and the fresh Battery in hand for a
rev "A" 3244A PCM before proceeding. - No battery is shipped for a rev "B" 3244B PCM.
The fresh battery, p/n (X1845A-R6), is shipped in a separate box. (Properly discard the removed battery.)
Note: "Parts Detail" can also be viewed using the "FSO Lookup" website > here.
IF a battery is shipped, install that battery on the replacement PCM first - Skip to step 1A.
1B Dispatch is to replace the PCM (No fresh Battery dispatched)
IF only a PCM was sent and no Battery was shipped, the battery on the original PCM is to be moved and installed on the
replacement PCM. (Properly discard the removed battery.) - Skip to step 1B.

1 A. Dispatch is to replace the PCM and Battery shipped in a separate box


A1. Follow steps (i to iv) to install the new Battery shipped for the rev A replacement PCM.
i) Remove the top cover on the replacement PCM.
ii) Mark the NVMEM battery pre-installed (if any) on the replacement PCM with "D" for
discarding. Disconnect the cable and remove the Battery - Fig 7.
iii) Install the replacement battery pack that was shipped in a separate box onto the
replacement PCM. Connect battery cable - Fig 7.
iv) Skip to step 2.

B. Dispatch is to replace the PCM (No fresh Battery dispatched)


B1 Follow steps (i to v) to move the Battery from the original PCM onto the replacement PCM.
i) Remove the top cover on both the original and replacement PCMs.
ii) Mark the "original" PCM and NVMEM battery with "O" for original.
iii) Remove the battery from both the original and replacement PCMs - Fig 7.
iv) Install the original battery into the replacement PCM. Connect battery cable - Fig 7.
v) Skip to step 2.

[Fig 7: FAS2040 PCM, battery and cable connector. Press the tab to release the connector. The battery is held in the module
by Velcro tape. The connector is next to the heat sink, which may be hot; let it cool.]

2 Close the PCM top cover(s).


3 Go to Section VIII, "Partially Reinsert the Replacement PCM and Reconnect the cables" on next page.

VIII. FAS2040: Partially Reinsert the Replacement PCM and Reconnect the cables
Step Action Description
1 Partially insert the PCM into the slot so that the cables can be attached- DO NOT engage the backplane yet.
2 Cables: Fully insert each cable that was removed to its proper port until it clicks in. Test by pulling on them. Especially the
FC and SAS ports!

IX. FAS2040: Set date and time on the RTC


Step Action Description
1 Re-attach laptop to the console port and capture the display output even if using the end user's computer.
2 Fully insert the PCM into the slot, raise the cam lever and secure it with the orange thumbscrew.
3 IMMEDIATELY after the console message "Starting AUTOBOOT press Ctrl-C to abort…" is displayed, press the Ctrl-C
(^C) key a couple of times to abort the autoboot. See the console output example below ("......." = lines deleted to save space).

Phoenix TrustedCore(tm) Server
Copyright 1985-2006 Phoenix Technologies Ltd.
.......
.......
Portions Copyright (C) 2002-2008 NetApp

CPU Type: Intel(R) Xeon(R) CPU L5410 @ 2.33GHz

Starting AUTOBOOT press Ctrl-C to abort...    <-- STEP 3: Press "CTRL-C"
Loading x86_64/freebsd/image1/kernel:....0x100000/3386728 0x53b000/3222096
0x84da50/1190096
Autoboot of PRIMARY image aborted by user.

LOADER-A>    <-- Prompt example is from the top controller

4 IF you miss the window to abort the autoboot, look for this message: "Press CTRL-C for boot menu" and complete steps
4a-4c, otherwise if at the "LOADER" prompt, skip to step 5.
a. Immediately Press ^C (CTRL-C) to access the "Boot menu".
b. If a 'System ID mismatch' warning message below is displayed, answer : y
.......
.......
*******************************
*                             *
* Press Ctrl-C for Boot Menu. *
*                             *
*******************************
^C    <-- STEP 4a: Press "CTRL-C"
Boot Menu will be available.
Restoring /var from /cfcard/x86/freebsd/varfs.tgz
WARNING: System id mismatch. This usually occurs when replacing CF or NVRAM cards!
Override system id? {y|n} [n] y    <-- STEP 4b: Enter: y

c. Next, drop to the LOADER prompt from the Boot Menu by following the linked process > here
5 Continue with Section IX on next page.
IX. FAS2040: Set date and time on the RTC (cont.)
Step Action Description
6 At the LOADER-A|B> prompt enter: show date to display the date and time in GMT on the new PCM

LOADER-A> show date
Current date & time is: 10/14/2010 16:36:50    <-- Time is displayed in 24hr mode
LOADER-A>

7 The original motherboard's GMT time and local time should have been recorded in Section IV. If you don't have it, you can
obtain the GMT time from the partner node, or another NetApp appliance or any Unix Server using: date -u (The "-u" option
displays the time in GMT/UTC) The new motherboard's Real Time Clock (RTC) must be set within 2 minutes of the time
displayed (which is GMT time) for users to be able to re-connect to this appliance.
NOTE Detailed instructions for another method of obtaining the time in GMT and setting the date and time is here> RTC Check
8 To set the time issue: set time hh:mm:ss Set the time in GMT using 24 hour format - Do not set the time to local time.
NOTE If this maintenance period spans across the midnight hour in GMT time, the DATE will also need to be set.
9 To change the date, issue: set date mm/dd/yyyy (mm = 2-digit month, dd = 2-digit Day, yyyy = 4-digit Year)
10 If the date or time was changed, issue: show date again to verify the GMT date and time are correct.
11 Go to Section X, "Verify Battery Status" on next page.
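The date and time rules in steps 7-10 can be sketched in Python. This is only an illustration run on a laptop, not a NetApp tool: the helper names `loader_clock_commands` and `rtc_within_tolerance` are hypothetical, and the `show date` format is assumed to be the `mm/dd/yyyy hh:mm:ss` layout shown in the example above.

```python
from datetime import datetime, timedelta, timezone


def loader_clock_commands(now_utc):
    """Return the LOADER 'set date' and 'set time' strings for a GMT/UTC datetime."""
    return (now_utc.strftime("set date %m/%d/%Y"),
            now_utc.strftime("set time %H:%M:%S"))


def rtc_within_tolerance(rtc_shown, reference_utc, tolerance=timedelta(minutes=2)):
    """True if the value printed by 'show date' is within 2 minutes of GMT."""
    shown = datetime.strptime(rtc_shown, "%m/%d/%Y %H:%M:%S")
    shown = shown.replace(tzinfo=timezone.utc)  # the RTC is kept in GMT
    return abs(shown - reference_utc) <= tolerance


if __name__ == "__main__":
    # Reference GMT time, e.g. taken from 'date -u' on the partner node.
    reference = datetime(2010, 10, 14, 16, 38, 0, tzinfo=timezone.utc)
    print(loader_clock_commands(reference))
    print(rtc_within_tolerance("10/14/2010 16:36:50", reference))
```

If the check returns False, the RTC is outside the 2-minute window and the printed `set time` (and, across GMT midnight, `set date`) strings are the ones to enter.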

X. FAS2040: Verify Battery Status


Step Action Description
1 At the LOADER-A|B> prompt, enter: ^G (Ctrl-G) to enter the BMC-shell. The prompt changes to: "bmc shell ->".

LOADER-A> ^G (CTRL-G)
=== OEMCLP v1.0.0 BMC v1.5 ===
bmc shell ->
STEP 1: Enter: (CTRL-G) to enter the BMC-shell.

2 Check the NVMEM Battery:


At the bmc shell -> prompt:
a) Enter: priv set advanced to change to "bmc shell * ->". (Has Asterisk)
b) Enter: battery show to display the NVMEM battery status.
c) Confirm "status" indicates "ready", "charging" or "full".
d) Enter: exit to return to "LOADER-A|B>" prompt.

bmc shell -> priv set advanced


Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by Network Appliance
personnel.
bmc shell*->
STEP 2a): Set advanced shell

bmc shell*-> battery show


chemistry :LION
device-name :bq20z80
expected-load-mw:81
id :27100010
manufacturer :AVT
manufacture-date:4/9/2007
rev_cell :2
rev_firmware :200
rev_hardware :F0
serial :03dc
status :charging
test-capacity :disabled
bmc shell*->
bmc shell*-> exit
Press ^G to enter BMC command shell
LOADER-A>

STEP 2b): Display the battery specs. The text above displays if the battery is "detected". If nothing is displayed, make sure the battery connector is fully seated. If there is still no output, call NGS; otherwise continue.
STEP 2c): "status" must indicate "ready", "charging" or "full".
STEP 2d): Enter: exit to return to the "LOADER-A|B>" prompt.

3 Go to Section XI, "Run Diagnostics" on next page.
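The check in step 2c can be expressed as a small parser over a saved console log. This is a sketch only: `parse_battery_show` and `battery_ok` are hypothetical helper names, and the field layout is assumed to match the `battery show` example above.

```python
# Statuses that step 2c accepts.
ACCEPTABLE = {"ready", "charging", "full"}


def parse_battery_show(text):
    """Split 'name :value' lines from 'battery show' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def battery_ok(text):
    """True only if a status line is present and is ready, charging or full."""
    status = parse_battery_show(text).get("status", "")
    return status.lower() in ACCEPTABLE


if __name__ == "__main__":
    sample = "chemistry :LION\nserial :03dc\nstatus :charging\n"
    print(battery_ok(sample))
```

Empty output (battery not detected) yields False, which corresponds to the "reseat the connector, then call NGS" branch of step 2b.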



XI. FAS2040: Run Diagnostics (20-30 minutes)


Step Action Description
1 Test the Replacement Tray with diagnostics by entering boot_diags at the "LOADER-A|B>" prompt.
2 In the Diagnostic Menu, enter: run mb mem cf-card
These diagnostics tests are basic confidence tests for the new motherboard, memory and CompactFlash. IF any single test
FAILs, then the next diag test will not be started. Contact NGS for test failure. IF error can be skipped, run remaining test(s)
individually, ex: run mem or run cf-card, etc
3 If asked "OK to run NVMEM diagnostic (yes/no)?" Answer: yes

LOADER-A> boot_diags
Loading X86_ELF/diag/diag.krn:..0x200000/12629600 0xe0b660/4226832 0x1213570/8 Entry at
0x00200000
Starting program at 0x00200000
STEP 1: Enter: boot_diags

Copyright (c) 1992-2009 NetApp.


hat_fill_pd_pae: page already VALID addr 0xd8000000

Diagnostic Monitor
version: 5.4.3
built: Tue Oct 6 15:00:13 PDT 2009
--------------------------------------
all Run all system diagnostics
mb FAS2040 motherboard diagnostic
mem Main memory diagnostic
cf-card CompactFlash controller diagnostic
stress System wide stress diagnostic

Commands:
Config (print a list of configured PCI devices)
Default (restore all options to default settings)
Exit (exit diagnostics)
Help (print this commands list)
Options (print current option settings)
Version (print the diagnostic version)
Run <diag ... diag> (run selected diagnostics)

Options:
Count <number> (loop selected diagnostic(s) (number) of passes)
Loop <yes|no> (loop selected diagnostic(s))
Status <yes|no> (print status messages)
Stop <yes|no> (stop-on-error / keep running)
Xtnd <yes|no> (extended tests / regular tests)
Mchk <auto|off|on|halt> (machine check control)
Cpu <0|1> (default cpu)
Seed <number> (random seed (0:use machine generated number))
NOTE: New RUN Command options

Enter Diag, Command or Option:

Bad input! At prompt type help to see commands menu.

Enter Diag, Command or Option: run mb mem cf-card
STEP 2: Enter: run mb mem cf-card

WARNING! Do not run the NVMEM diagnostic immediately after a
system crash or if there is a possibility that log
data is stored. Run only on new boards, or after a
normal system shutdown, or if there is no chance of
preserving customer data.

OK to run NVMEM diagnostic (yes/no)? yes
STEP 3: Enter: yes
NOTE: IF any Comprehensive test FAILs, then the next diag test will not be started.

4 Continue with Section XI on next page.



XI. FAS2040: Run Diagnostics (cont.)


Step Action Description
5 The test output below only includes the test suite summary line. Look to see that all these show as PASSED. If any state
FAILED, scroll back through your test output to see which test FAILED and call NGS to report the test failure. Read all the
NOTE: Text box information below.

DIAGNOSTIC RESULTS CONFIRMATION CHECKS
Confirm all Comprehensive Tests state: PASSED or SKIPPED.
No test should indicate FAILED. If so STOP - call NGS!

FAS2040 Motherboard Diagnostic
------------------------------
Performing comprehensive motherboard diagnostic
.....
Performing comprehensive GBE test on e0d
.....
****** Comprehensive GBE test ................... PASSED
Performing comprehensive GBE test on e0c
.....
****** Comprehensive GBE test ................... PASSED
Performing comprehensive GBE test on e0b
.....
****** Comprehensive GBE test ................... PASSED
Performing comprehensive GBE test on e0a
.....
****** Comprehensive GBE test ................... PASSED
NOTE: The GBE tests pass for all 4 onboard Ethernet ports e0a-e0d.

Performing comprehensive BGE test on e0P


.....
****** Comprehensive BGE test ................... PASSED

Testing FCAL card on channel 0a
Performing comprehensive FCAL test on channel 0a
.....
****** Comprehensive FCAL test .................. PASSED
Testing FCAL card on channel 0b
Performing comprehensive FCAL test on channel 0b
.....
****** Comprehensive FCAL test .................. PASSED
ONBOARD SAS present:
Slot 0 58 Dual Channels [Lsi Rev 0x8]
Testing SAS card on channel 0c
Performing comprehensive SAS test on channel 0c
.....
****** Comprehensive SAS test ................... PASSED
Testing SAS card on channel 0d
.....
****** Comprehensive SAS test ................... PASSED
NOTE: The FCAL tests pass for ports 0a, 0b and both onboard SAS tests pass for ports 0c, 0d.
NOTE: If an FC test fails, remove the cable for that port if attached and retest the "mb" test only.
Internal loopback test ...................... PASSED
Link test(xtnd only) ........................ SKIPPED
****** Comprehensive IB test .................... PASSED
Performing comprehensive BMC test
.....
****** Comprehensive BMC Test ................... PASSED
.....
Environmental check, subsystem: SES ......... PASSED
****** Comprehensive mb test .................... PASSED
Confirm: The BMC and SES tests passed.
NOTE: Confirm that the Comprehensive mb test states "PASSED". If it states "FAILED", notify NGS and run the remaining tests individually (see step 2).

6 Continue with Section XI on next page.



XI. FAS2040: Run Diagnostics (cont.)


Step Action Description
7 In text box 'STEP 7' below, verify all the memory was discovered: FAS2040 ~ 4GB
8 If all tests show PASSED or SKIPPED, enter: exit to exit the main diagnostic menu. If any tests listed as FAILED, report
failure to NGS.

Testing : 3188 MB (start=10c00000, end=d8000000)
Total Memory Size : 3456 MB
STEP 7: PLEASE CONFIRM the FAS2040 total is ~4GB.

Main Memory Diagnostic
----------------------
Performing comprehensive main memory test
.....
****** Comprehensive Memory test ................ PASSED

CompactFlash Diagnostic
------------------------
.....
****** Comprehensive CompactFlash test .......... PASSED
Note: Confirm that the Comprehensive Memory test and Comprehensive CompactFlash test "PASSED".

Pass = 1, Current date = Saturday Jul 15 09:46:39 2011
--- Completed pass 1.
Note: Test Suite Complete message.

Enter Diag, Command or Option: exit
STEP 8: Enter: exit to exit the Diags. The system will display many messages. After about 10-20 seconds, it will drop to the LOADER-A|B> prompt.

AMI BIOS8 Modular BIOS
Copyright (C) 1985-2009, American Megatrends, Inc. All Rights Reserved
.....
CPU Type: Intel(R) Xeon(R) CPU @ 1.66GHz
LOADER-A>

9 Go to Section XII, "Verify FC Adapter Configuration" on next page.
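Scrolling back through a long capture to find a FAILED summary line (steps 5 and 8) can be backed up by a short script run against the saved console log. This is a sketch under assumptions: the function names are hypothetical, and the pattern assumes summary lines look like the `****** Comprehensive ... test .... PASSED` lines shown above.

```python
import re

# Six asterisks, the test name, a run of dots, then the verdict.
SUMMARY = re.compile(r"^\*{6}\s+(.+?)\s+\.{3,}\s+(PASSED|SKIPPED|FAILED)$")


def summarize(log):
    """Return (test name, verdict) for every suite summary line in the log."""
    results = []
    for line in log.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            results.append((match.group(1), match.group(2)))
    return results


def failed_tests(log):
    """Names of the suites whose summary line says FAILED."""
    return [name for name, verdict in summarize(log) if verdict == "FAILED"]


if __name__ == "__main__":
    log = (
        "Performing comprehensive GBE test on e0d\n"
        ".....\n"
        "****** Comprehensive GBE test ................... PASSED\n"
        "****** Comprehensive Memory test ................ PASSED\n"
    )
    print(summarize(log))
    print(failed_tests(log) or "no FAILED summary lines")
```

A non-empty result from `failed_tests` is the "STOP - call NGS" condition; sub-test lines without the leading asterisks are deliberately ignored.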



XII. FAS2040: Verify FC Adapter Configuration


Step Action Description
1 Boot into maintenance mode by following steps here
a) If a 'System ID mismatch' warning message is displayed due to the new Controller Module, answer: y

NOTE: If the NVRAM FW is down rev, an auto-update will start and the controller will reboot.

......
......
nvram: Need to update primary image on flash from version 49 to 2
nvram: Need to update secondary image on flash from version 49 to 2
Updating nvram firmware, battery is off. The system will automatically reboot
when the update is complete.

WARNING: System id mismatch. This usually occurs when replacing CF or NVRAM cards!
Override system id? {y|n} [n] y
STEP 1a): Enter: y
.....

NOTE If the replacement PCM fails to boot to the Maintenance menu, confirm the original Boot Device (CF Card) was moved from the original MB to the replacement. Engage NGS for assistance.
STOP If the system reports the battery voltage is too low or a critical failure, do NOT proceed - Do NOT bypass the system stop. Engage NGS for assistance.
Under NO CIRCUMSTANCES bypass the system halt to "boot" the system on an NVMEM battery voltage issue.
2 Review the fcadmin config output from Section IV. If any onboard Adapters (0a, 0b) were configured as "target", verify they are still configured by entering: fcadmin config. If one or more adapters need to be set as a "target", follow steps 2a-2b. If all are OK, skip to step 3.
a) For each Adapter to be configured as a target enter: fcadmin config -t target <HA> Issue one command per adapter. The example below configures Adapter ports '0a' and '0b' as targets.
NOTE If the adapter that needs to be changed to a target is listed as "online", it must be off-lined first before it can be changed. Issue: fcadmin offline <HA>
b) Enter: fcadmin config to confirm the changed FC Adapters are displaying as PENDING (target) ports.

*> fcadmin config


Local
Adapter Type State Status
---------------------------------------------------
0a initiator CONFIGURED offline
0b initiator CONFIGURED offline
STEP 2: List the FC Adapter configuration (Example Only)

*> fcadmin config -t target 0a
Tue Oct 28 07:19:05 GMT [fci.config.state:info]: Fibre channel initiator adapter
0a is in the PENDING (target) state.
A reboot is required for the new adapter configuration to take effect.
*> fcadmin config -t target 0b
... Fibre channel initiator adapter 0b is in the PENDING (target) state.
A reboot is required for the new adapter configuration to take effect.
STEP 2a: Examples to configure a port to be a target.

*> fcadmin config
Local
Adapter Type State Status
---------------------------------------------------
0a initiator PENDING (target) offline
0b initiator PENDING (target) offline
STEP 2b: Enter: fcadmin config to confirm each target port is shown as PENDING

3 If any FC cables were disconnected from adapters '0a' or '0b' due to boot issue, firmly reconnect them now. Must click in.
4 Continue with Section XII on next page.
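The comparison in step 2, between the adapters recorded as targets in Section IV and the current `fcadmin config` listing, can be sketched as follows. The helper names are hypothetical and the parser assumes the four-column layout (Adapter, Type, State, Status) shown in the example output; it only prints the commands, which are still entered by hand at the console.

```python
def parse_fcadmin_config(text):
    """Map adapter name -> (type, state) from 'fcadmin config' output."""
    adapters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[1] in ("initiator", "target"):
            # State may be two words, e.g. "PENDING (target)"; status is last.
            adapters[parts[0]] = (parts[1], " ".join(parts[2:-1]))
    return adapters


def adapters_needing_target(adapters, recorded_targets):
    """Adapters recorded as targets that are neither target nor pending target."""
    needs = []
    for name in sorted(recorded_targets):
        kind, state = adapters.get(name, ("", ""))
        if kind != "target" and state != "PENDING (target)":
            needs.append(name)
    return needs


if __name__ == "__main__":
    before = ("Adapter Type State Status\n"
              "0a initiator CONFIGURED offline\n"
              "0b initiator CONFIGURED offline\n")
    for name in adapters_needing_target(parse_fcadmin_config(before), {"0a", "0b"}):
        print(f"fcadmin config -t target {name}")
```

After the commands are entered, running the same check on the new listing should return an empty list, which matches the PENDING (target) confirmation in step 2b.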

XII. FAS2040: Verify FC Adapter Configuration (cont.)


Step Action Description
5 Follow Steps 5a-5c if this system has SAN-attached Tape Drives (ask the customer). If not, go to step 6.
a) Enter the command: 'fcadmin channels' to list the new WWPNs of on-board ports 0a and 0b.
b) Provide the output of fcadmin channels to the end-user administrator to remap the switch if the SAN Tape is using on-board FC Adapter 0a or 0b. The end-user may see the new WWPNs through the SAN switch. An example of "fcadmin channels" output highlighting the WWPNs changing before and after a MB swap is here.
c) Before continuing, wait until the end-user administrator verifies that the SAN Fabric zoning has been changed if necessary, that the host-group on the array has been updated with the new WWPNs, and that the array can see the NetApp FC WWPNs.

XIII. FAS2040: Capture new System-ID on replacement Controller


Step Action Description
NOTE FAS2040 systems have NVMEM integrated into the controller and so when replacing its controller, the disks need to be reassigned to the new System-ID.
1 Enter: disk_list to force some disk I/O for the primary and secondary path check in next step.
2 Enter: storage show disk -p to confirm all adapters list a PRIMARY and SECONDARY path. No? Re-check cable seating.
3 Enter: disk show -v This shows disk ownership by system-ID. Confirm all disks are listed as originally captured.
NOTE: The primary and secondary path, to the SAS and FC Adapter under the Disk heading, can reverse.
4 Compare the new system ID to the old system ID. The old system-ID is on the disk show -v output that was captured in Section IV. For DOT 8, always use the HOME column, not OWNER.

In this example, the local System ID for the new Controller is 1743755272. The old MB System ID was 122217803 (disk show -v from Section IV). The disks need to be reassigned to the local System ID.
Example Only
*> disk show -v
Local System ID: 1743755272
DISK OWNER POOL SERIAL NUMBER HOME
-------- ------------------ ----- -------------------- ------------------
0c.00.0 tsst-2 (142217816) Pool0 3LM17RW900009750Q6SF tsst-2 (142217816)
0c.00.1 tsst-2 (142217816) Pool0 3LM1623E00009750Q7YT tsst-2 (142217816)
.....
.....
0a.41 tsst-2 (142217816) Pool0 JLVT29GC tsst-2 (142217816)
0a.43 tsst-2 (142217816) Pool0 JLVT7BUC tsst-2 (142217816)
.....
0b.21 tsst-1 (122217803) Pool0 JLVT0KDC tsst-1 (122217803)
0b.18 tsst-1 (122217803) Pool0 JLVT2HZC tsst-1 (122217803)
.....
.....
0d.01.6 tsst-1 (122217803) Pool0 9QJ75925 tsst-1 (122217803)
0d.01.10 tsst-1 (122217803) Pool0 9QJ74TQG tsst-1 (122217803)
0d.01.9 tsst-1 (122217803) Pool0 9QJ758NQ tsst-1 (122217803)
0d.01.5 tsst-1 (122217803) Pool0 9QJ74VNZ tsst-1 (122217803)

5 Go to Section XIV, "Disk Reassign" on next page.
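The comparison in step 4 can be illustrated by tallying the HOME system IDs in a saved `disk show -v` capture and checking whether the Local System ID appears among them. This is a sketch with hypothetical helper names, and it assumes every disk row ends with the HOME value in the `name (system-id)` form shown above.

```python
import re
from collections import Counter

# A disk row ends with the HOME system ID in parentheses.
ROW = re.compile(r"^(\S+)\s+.*\((\d+)\)\s*$")


def local_system_id(output):
    """Return the 'Local System ID' value as a string, or None if absent."""
    match = re.search(r"Local System ID:\s*(\d+)", output)
    return match.group(1) if match else None


def home_ids(output):
    """Count disks per HOME system ID."""
    counts = Counter()
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    capture = (
        "Local System ID: 1743755272\n"
        "DISK OWNER POOL SERIAL NUMBER HOME\n"
        "0c.00.0 tsst-2 (142217816) Pool0 3LM17RW900009750Q6SF tsst-2 (142217816)\n"
        "0b.21 tsst-1 (122217803) Pool0 JLVT0KDC tsst-1 (122217803)\n"
    )
    print(local_system_id(capture), dict(home_ids(capture)))
```

When the Local System ID does not appear as any disk's HOME ID, as in this example, the disks still belong to the old System ID and the reassignment in Section XIV is needed.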



XIV. FAS2040: Disk Reassign


Step Action Description
NOTE ONTAP 8.2.x introduced a new automatic 'disk reassign' process for "HA" systems provided the partner took over the target node. Follow step 1 CAREFULLY!
1 IF dual controllers, the system is HA. If not dual controllers, check the HA config > here. Also, the dispatch includes this text: HA System: IF YES, this system is HA and has a partner controller. Read STOP (i) and (ii) to follow the correct process.
STOP Target Node Status and Procedure to be followed:
(i) Target node status: "HA" and the node was successfully taken over by its partner.
- IF ONTAP version is 8.2.x or higher, follow the linked process here
- IF ONTAP version is less than 8.2, follow Procedure-A on next page.
(ii) Target node status: Node does not have a partner (non-HA) OR the partner did not takeover.
- Follow Procedure-B on second next page.
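The STOP table above is a three-way decision, which can be written out as a small function to make the branches explicit. This is a sketch: `choose_procedure` and `parse_version` are hypothetical names, and the version parser assumes a plain dotted numeric string such as "8.1.3".

```python
def parse_version(version):
    """Turn a dotted ONTAP version such as '8.1.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def choose_procedure(is_ha, partner_took_over, ontap_version):
    """Pick the disk reassign path from the STOP (i)/(ii) table."""
    if is_ha and partner_took_over:
        if parse_version(ontap_version) >= (8, 2):
            return "automatic reassign (linked 8.2.x process)"
        return "Procedure-A"
    # Non-HA node, or HA node whose partner did not take over.
    return "Procedure-B"


if __name__ == "__main__":
    print(choose_procedure(True, True, "8.1.3"))
    print(choose_procedure(True, False, "8.2.1"))
```

Tuple comparison keeps the version test simple: (8, 1, 3) sorts below (8, 2), so an 8.1.3 system that was taken over lands on Procedure-A.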

XIV. FAS2040: Disk Reassign (cont.)


Procedure-A Partner has taken over target controller running ONTAP less than version 8.2
Execute all of these steps on the PARTNER node.
Bug 537799 ONTAP < 8.0.3 will give an error message if an attempt is made to reassign more than 500 disks from the partner. If the system is exposed, a system outage is required to do the disk reassign - Read this link for options > here
A1 Move console cable to the PARTNER node, login as: (7-Mode=root , C-Mode=admin ). Ask end-user for password.
A2 C-Mode only: Enter: run local to enter the nodeshell.
A3 Partner "takeover" Verification Check: Check the console prompt as follows:
Case1: IF the prompt has the word "(takeover)" in it (Example: nodeB (takeover)> ), continue with step A4.
Case2: IF the prompt has a "/" in it (Example: nodeA / nodeB> ), enter: partner and then press the enter key twice to exit the partner shell. IF the prompt has "(takeover)" in it, continue with step A4; otherwise call NGS.
Case3: IF neither case1 nor case2 matches your console prompt, verify with cf status.
IF no "takeover", follow Procedure-B on next page. For questions or if "partial takeover", call NGS.
A4 Enter: partner aggr status -f IF any FAILED disk exists, inform customer and NGS about that.
FAILED disk(s) must be physically dis-engaged before the disk reassignment and giveback (Leave the disk in
the slot until replacement is received). If the system reports " partner: Not in takeover mode" or "partner
not found" you are entering the command from the wrong controller!! - Restart at step 1 above.
A5 Enter: priv set advanced at the prompt for the following command to work. Prompt will include " * ".
A6 Enter: partner savecore to ensure any coredumps on the repaired node are saved.
(i) IF console reports, " No core needs to be saved" skip to step A7, otherwise enter: partner savecore -s
to monitor the progress of the savecore command. Note: Before proceeding to next step, you must verify that
any core dumps on the partner (repaired node) are saved.
A7 Reconfirm the partner console prompt has the word "takeover" in it (see Command Example below) and then
enter: disk reassign -s <old_system_ID> -d <new_system_ID>
Cut and paste the old and new System IDs from the console Log. Read "CAUTION" steps 1 and 2.
Example Only partner-system name(takeover)*> disk reassign -s 122217803 -d 1743755272
CAUTION
1) IF the system reports: "Partner node must not be in Takeover mode during disk reassignment from maintenance mode. Serious problems could result!!"
* The above message indicates the system is "HA" and the partner controller needs to be checked! *
Enter the appropriate response to "Abort/Cancel" the disk reassignment and restart this process at step A1.
2) IF the following highlighted console message is displayed:
- For ONTAP versions 8.0.5 or higher and 8.1.3 or higher:
  - Ignore the additional takeover/giveback request in the console message highlighted below.
  - Enter y to the question "Do you want to continue (y/n)?" and then skip to step A9.
- For ONTAP version less than 8.0.5 or 8.1.3:
  - (i) The giveback cannot be postponed.
  - (ii) A second takeover/giveback from the "target" (repaired) node must be executed later in this AP.
  Ask the customer: "Are there any Windows applications running that would inhibit a 'cf giveback' at this time? (open cifs sessions)" If the customer states the giveback cannot be performed now, answer n to cancel the 'disk reassignment' and follow the steps > here.
  If re-dispatched for the disk reassign, start at the beginning of this Procedure-A.
disk reassign: A giveback must be done immediately following a reassign of partner
disks. After the partner node becomes operational, do a takeover and giveback of
this node to complete the disk reassign process.
Do you want to continue (y/n)?
Note: Ignore this message if the ONTAP version is 8.0.5/8.1.3 or higher.
A8 If the giveback can be performed now, enter y and continue.
A9 The next console message confirms the disk ownership update to the new system-ID. Enter y to the question.

Disk ownership will be updated on all disks previously belonging to Filer with
sysid 122217803.
Would you like to continue (y/n)? y
Enter: y
7-Mode only: A console message will be displayed for each disk changing ownership (System ID)

STOP If the console messages stated that the giveback must be completed immediately, do not enter any other commands on the partner node until "after" the disk ownership on the down node is verified and the giveback is completed.
A10 Continue with step 2 on next page.

XIV. FAS2040: Disk Reassign (cont.)


Step Action Description
Use for Node that does not have a partner (non-HA) OR the partner did not takeover
Procedure-B Execute all of these steps from Maintenance mode on the repaired node.
B1 At the maintenance mode " * > " prompt enter:
disk reassign -s <old_system_ID> -d <new_system_ID> Read "CAUTION" below.
Cut and paste the old and new System IDs from the console Log.

Command Example Only:


*> disk reassign -s 122217803 -d 1743755272

CAUTION IF the system reports: "Partner node must not be in Takeover mode during disk reassignment from maintenance mode. Serious problems could result!!"
The system sees the controller as an HA configuration.
(i) Confirm with the end-user that the partner did not takeover and continue with step B3 OR
(ii) IF the partner did takeover, enter the appropriate response to "Abort/Cancel" the disk reassignment and then follow Procedure-A on previous page.
B2 IF Single Controller configuration, follow steps (i)-(ii). IF the partner did NOT takeover, skip to step B3.
(i) Enter: y to question "Would you like to continue (y/n)?"
(ii) Skip to step 2.

Command Example Only:


Disk ownership will be updated on all disks previously belonging to Filer with
sysid 122217803.
Would you like to continue (y/n)? y
Enter: y
7-Mode only: A console message will be displayed for each disk changing ownership (System ID)

B3 IF the partner did NOT takeover, follow steps (i)-(iii).


(i) Enter the appropriate response (y/n) to "Proceed" with disk reassignment and then continue with next step.
(ii) Enter: y to question to continue with the disk ownership update.
(iii) Continue with step 2.

Command Example Only:


...
Disk ownership will be updated on all disks previously belonging to Filer with
sysid 122217803.
Would you like to continue (y/n)? y
Enter: y
7-Mode only: A console message will be displayed for each disk changing ownership (System ID)

2 From the console port on "target" controller on which you replaced the MB (in maintenance mode):
a) Enter: disk show -s <old-sysID> No disks or V-Series LUNs should be listed, as shown in the console window below. (The "-s old-sysID" was specified in the disk reassign in step 1.)
IF any disks/V-LUNs are listed, a reservation may not have released; continue with step 2(b). IF no output, skip to step 3.

Example Only
*> disk show -s 122217803
Local System ID: 1743755272

DISK OWNER POOL SERIAL NUMBER HOME
---------- ------------- ----- ------------- -------------
*>
IF all disks were properly reassigned, there should be no disks/V-LUNs listed in the output.

b) Not all disks re-assigned: Re-issue the disk reassign, from the node it was entered on, to see if the reservation releases. Then repeat the disk show -s command in step 2(a). IF disks/V-LUNs are still listed in the output, call Support.
3 Continue with Section XIV on next page.
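Two points in this section are easy to get wrong when cutting and pasting: swapping or mistyping the system IDs, and missing a leftover row in the `disk show -s` output. The sketch below illustrates both checks; the function names are hypothetical, and the second helper assumes disk rows appear after the dashed header rule as in the example above.

```python
def disk_reassign_command(old_id, new_id):
    """Build the 'disk reassign' command string after basic sanity checks."""
    if not (old_id.isdigit() and new_id.isdigit()):
        raise ValueError("system IDs must be numeric")
    if old_id == new_id:
        raise ValueError("old and new system IDs are identical")
    return f"disk reassign -s {old_id} -d {new_id}"


def disks_still_owned(disk_show_output):
    """Disk names listed under the header rule of 'disk show -s <old-sysID>'."""
    rows, seen_rule = [], False
    for line in disk_show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            seen_rule = True
        elif seen_rule and stripped and not stripped.startswith("*>"):
            rows.append(stripped.split()[0])
    return rows


if __name__ == "__main__":
    print(disk_reassign_command("122217803", "1743755272"))
    clean = ("Local System ID: 1743755272\n"
             "DISK OWNER POOL SERIAL NUMBER HOME\n"
             "---------- ------------- ----- ------------- -------------\n"
             "*>\n")
    print(disks_still_owned(clean))
```

An empty list corresponds to "skip to step 3"; any disk names returned correspond to the reservation case in step 2(b).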

XIV. FAS2040: Disk Reassign (cont.)


Step Action Description
4 Enter: disk show -v Verify the "Local System ID" matches this node's disks listed in the "HOME" column. READ STOP below.
STOP! BEFORE the "giveback" is executed, you must verify that the system-id for this node's disks listed under "HOME" (if the column exists; use OWNER if not) and the new "Local System ID" are the same. If not, confirm the correct system-ids were entered on the 'disk reassign' command. If problems, do NOT proceed, call NGS for assistance.
The new local System ID for the Controller is 1743755272. The owner name (tsst-1) may or may not be shown. But those disks should reflect the new local System ID.
Example Only
*> disk show -v
Local System ID: 1743755272
DISK OWNER POOL SERIAL NUMBER HOME
-------- ------------------ ----- -------------------- ------------------
0c.00.0 tsst-2 (142217816) Pool0 3LM17RW900009750Q6SF tsst-2 (142217816)
0c.00.1 tsst-2 (142217816) Pool0 3LM1623E00009750Q7YT tsst-2 (142217816)
.....
.....
0a.41 tsst-2 (142217816) Pool0 JLVT29GC tsst-2 (142217816)
0a.43 tsst-2 (142217816) Pool0 JLVT7BUC tsst-2 (142217816)
.....
0b.21 ------------------ Pool0 JLVT0KDC (1743755272)
0b.18 ------------------ Pool0 JLVT2HZC (1743755272)
.....
.....
0d.01.6 ------------------ Pool0 9QJ75925 (1743755272)
0d.01.10 ------------------ Pool0 9QJ74TQG (1743755272)
0d.01.9 ------------------ Pool0 9QJ758NQ (1743755272)
0d.01.5 ------------------ Pool0 9QJ74VNZ (1743755272)

5 At the maintenance mode prompt: " * >", enter: halt to exit to LOADER-A|B.

XV. FAS2040: Boot PROM Variable Checks


Step Action Description
1 IF ONTAP version is < 8.0.2 (ONTAP 8.1 and later are not affected), unset the variable bootarg.init.wipeclean. (copy-n-paste)
LOADER-A> unsetenv bootarg.init.wipeclean

2 Go to Section XVI, "Boot the Operating System - 'giveback' if applicable" on next page.

XVI. FAS2040: Boot the Operating System - 'giveback' if applicable


Step Action Description
1 At the LOADER-A|B prompt, enter: autoboot to boot ONTAP.
2 After the console stops printing messages, press the <enter> key.
a) If the system booted up to a "login>" prompt, example below, continue with step 2b, otherwise skip to step 3.

Loading X86_64/freebsd/image1/kernel:0x100000/3375736 0x538280/3221872
....
.....
.....
*******************************
* Press Ctrl-C for Boot Menu. *
*******************************
.....
login:

Many typical system startup messages removed for clarity. These are typical bootstrap console messages. If the partner did not takeover OR this is a stand-alone system, you should eventually get a "login" prompt when you hit <enter>.

b) Is this system HA Configured?
- Yes, but the partner did not takeover: Go to the STOP under Step 10 on next page.
- No, the system is stand-alone (non-HA head): Skip to Section XVII.
3 If the system booted up to a "Waiting for giveback>" prompt (press the <enter> key) , example below, the node was part of an
HA configuration and was taken over by its partner.

Phoenix TrustedCore(tm) Server
.....
.....
*******************************
* Press Ctrl-C for Boot Menu. *
*******************************
.....
.....
Waiting for giveback...(Press Ctrl-C to abort wait)

("....." = Deleted lines to save space)
NOTE 3.1: If you see this message, this node is part of an HA configuration and the partner node took over for it.

4 Log in to the PARTNER node (7-Mode=root, C-Mode=admin). Engage end-user for password.
5 Check takeover status by entering the appropriate command shown for the specified ONTAP Mode. If "partner not ready" is reported, you may have to wait 2-4 minutes for the NVRAMs to synchronize.
7-Mode: partner(takeover)> cf status
Cluster-Mode: cluster::> run local cf status

6 IF Pre-ONTAP 8.2 and 7-Mode: Ask the customer if there are any heavy NDMP, SnapMirror or SnapVault processes running. If
Yes, they should be disabled due to bug 489060. The procedure to disable the processes is here.
7 Enter the proper controller giveback command(s) based on the mode running as follows:
7-Mode: partner(takeover)> cf giveback
Cluster-Mode: cluster::> storage failover giveback -fromnode local
NOTE A giveback cannot be completed due to: "a failed disk" or "Open CIFS sessions" or "partner not ready".
IF FAILED disk: Physically dis-engage the failed disk (Leave the disk in the slot till replacement is received).
IF Open CIFS sessions: Check with customer how to close out CIFS sessions. Terminating CIFS can cause loss of data.
IF partner "not ready": Wait 5 minutes for the NVMEMs to synchronize.
Giveback fails due to any other reason? Contact NGS.

8 Continue with Section XVI on next page.



XVI. FAS2040: Boot the Operating System - 'giveback' if applicable (cont.)


Step Action Description
9 Wait! 90 seconds for 7-Mode or 3 minutes for Cluster-Mode after giveback reported complete.
Check controller failover status by entering the appropriate command shown for the specified ONTAP Mode.
7-Mode: partner> cf status
  Console reports: "Controller Failover enabled, XYZ is up." Look for failover enabled.
Cluster-Mode: cluster::> storage failover show
  Follow step (a).

a) IF Cluster-Mode: Confirm the "giveback" status of the storage, refer this doc > ONTAP 8 failover show
NOTE If the "giveback" is incomplete, wait 2 minutes and re-check. If still not complete after 10 minutes, contact Support. Do not proceed to next step if 'incomplete or partial giveback'!
10 IF Motherboard was Replaced and the partner printed out the following highlighted message after the "disk reassign" command
was executed, go to Step 11. If no message reported, skip to step 12 on next page.

disk reassign: A giveback must be done immediately following a reassign of partner disks.
After the partner node becomes operational, do a takeover and giveback of
this node to complete the disk reassign process.

STOP IF this system has a partner controller, but the partner did not takeover (disks were assigned in maintenance mode), continue with step 10a (Ref Internal TSB-1209-02). Note, the console message above is not displayed when disks are reassigned in maintenance mode.
a) Login to the "repaired node" (target) and re-enable "controller failover" using proper command syntax below (copy-and-paste).

7-Mode:
target> cf enable
Cluster-Mode (run from clustershell; 1st cmd is for 2-node clusters ONLY, 2nd cmd is for 3 or more node clusters):
cluster::> cluster ha modify -configured true
OR
cluster::> storage failover modify -node local -enabled true

11 From the "repaired node", execute a takeover using the proper command below to sync the sys-IDs.
7-Mode: target> cf takeover
Cluster-Mode (run from clustershell): cluster::> storage failover takeover -bynode local

a) Wait! 60 seconds for 7-Mode or 90 seconds for Cluster-Mode after takeover reports complete. Then check takeover status by entering the appropriate command shown for the specified ONTAP Mode.
7-Mode: target(takeover)> cf status
Cluster-Mode: cluster::> run local cf status

b) After the appropriate Wait period in step 11a) and the cf status reports: "Ready for giveback", enter the proper "giveback" command below. This is the final synchronization of the system-IDs across the HA pair.
7-Mode: target(takeover)> cf giveback
Cluster-Mode: cluster::> storage failover giveback -fromnode local

c) Continue with Section XVI on next page.



XVI. FAS2040: Boot the Operating System - 'giveback' if applicable (cont.)


Step Action Description
d) Wait Again! This time 90 seconds for 7-Mode or 3 minutes for Cluster-Mode and then check giveback status by entering the appropriate command below for the proper ONTAP mode. For 7-Mode look for failover enabled; for Cluster-Mode follow step (i).
7-Mode: target> cf status
  Console reports: "Controller Failover enabled, XYZ is up."
Cluster-Mode: cluster::> storage failover show
  Follow step (i).

(i) For ONTAP Cluster Mode, storage failover show should not show any "partial" givebacks. If there are, wait another 60 seconds and recheck. Some large systems may take up to 10 minutes to complete.
Click > ONTAP 8 failover show to see examples of output. Issues? Call NGS.
12 IF Cluster-Mode: Follow steps (a-b) below, otherwise skip to Step 13.
a) From the clustershell on each node, enter the command below to list the logical interfaces that are not on their home server and port.
Cluster-Mode: cluster::> net int show -is-home false
Example of output here > net int show

b) If any interfaces are listed as "false" in the above command, enter the command below to revert them back to their home port. Issues? Call NGS.
Cluster-Mode: cluster::> net int revert *
Example of output here > net int show


13 Go to Section XVII, "NetApp Storage Encryption (NSE) System?" on next page.
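
The wait-then-check pattern in steps 11a through 11d(i) can be sketched as a small polling helper. This is only an illustrative sketch, not part of the NetApp procedure: `get_status` is a hypothetical callable that returns the console text of the relevant status command (however that text is captured), and the two checker functions simply look for the phrases quoted in the steps above ("Ready for giveback", "partial"). The 60-second recheck and 10-minute ceiling are taken from step (i).

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],   # hypothetical: returns status command output as text
    is_done: Callable[[str], bool],  # decides whether the output shows the expected state
    initial_wait_s: float,           # e.g. 60 (7-Mode) or 90 (Cluster-Mode) after takeover
    recheck_s: float = 60.0,         # "wait another 60 seconds and recheck"
    timeout_s: float = 600.0,        # "may take up to 10 minutes to complete"
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait, check the status text, and recheck until done or the timeout is reached."""
    sleep(initial_wait_s)
    waited = 0.0
    while True:
        if is_done(get_status()):
            return True
        if waited >= timeout_s:
            return False  # still not in the expected state: escalate to NGS
        sleep(recheck_s)
        waited += recheck_s


def ready_for_giveback(text: str) -> bool:
    """True when the cf status output reports 'Ready for giveback'."""
    return "ready for giveback" in text.lower()


def no_partial_giveback(text: str) -> bool:
    """True when the storage failover show output mentions no 'partial' giveback."""
    return "partial" not in text.lower()
```

For example, `wait_for_status(get_status, ready_for_giveback, initial_wait_s=60)` corresponds to the 7-Mode wait in step 11a, and `wait_for_status(get_status, no_partial_giveback, initial_wait_s=180)` to the Cluster-Mode wait in step 11d. A `False` result means the expected state never appeared and the case should go to NGS rather than being retried indefinitely.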

XVII. FAS2040: NetApp Storage Encryption (NSE) System?


Step Action Description
STOP: IF this system is using NSE disks, continue with step 1; otherwise skip to Section XVIII.
1 Confirm with customer that they have not power cycled the controller chassis or any attached shelves with NSE drives.
- IF not power cycled, run "key_manager setup" by following step 2 below.
- IF power cycled, do NOT proceed - Contact NGS.
2 Enter: key_manager setup on the target (repaired) node to update the boot PROM variables and to regenerate the key for
the new system-ID. Follow these additional steps here.
a) Next, login to the partner node and enter: key_manager setup to update its boot parameters.

XVIII. FAS2040: Configure the BMC if Necessary


Step Action Description
1 IF the BMC "IP Address" captured in step 6(a) of section IV was "0.0.0.0", skip to section XIX. Otherwise, continue with step 2 to
configure the BMC.
2 Log in to the TARGET node as "root". Engage end-user for password.
3 Enter: 'bmc setup' and follow steps 3a-3c in the console output below.

fas2050cl1-rtp> bmc setup                                                  <-- Step 3)

The Baseboard Management Controller (BMC) provides remote management
capabilities including console redirection, logging and power
control. It also extends autosupport by sending down filer event alerts

Would you like to configure the BMC? (y/n)? y                              <-- Step 3a)
Would you like to enable DHCP on BMC LAN interface? (y/n)?                 <-- Step 3b)
Please enter the IP address for the BMC [0.0.0.0]:                         <-- Step 3b)
Please enter the netmask for the BMC [0.0.0.0]:                            <-- Step 3b)
Please enter the IP address for the BMC gateway [0.0.0.0]:                 <-- Step 3b)
Please enter the gratuitous ARP Interval for the BMC [10 sec (max 60)]:    <-- Step 3c)

The BMC is setup successfully.

Step 3): Enter 'bmc setup' to configure the BMC.
Step 3a): Enter 'y'.
Step 3b): Enter the values based on the 'bmc config' output captured in section IV. If the default value shown in
brackets is the same as the actual value, just press Enter.
Step 3c): Press the Enter key to accept the default value.

4 Enter: 'bmc status' and verify it is configured correctly.


5 Go to Section XIX, "Controller registration, Enable options, Submit logs and Part Return" on the next page.
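
Before typing the captured 'bmc config' values into the prompts in step 3b, it can help to sanity-check them. The sketch below is an illustration only and not part of the NetApp procedure: the function name is hypothetical, and the rule that the gateway should sit in the same subnet as the BMC address is an assumption about a typical network layout, not something the procedure states.

```python
import ipaddress
from typing import List


def check_bmc_values(ip: str, netmask: str, gateway: str) -> List[str]:
    """Return a list of problems found in the captured BMC network values (empty if none)."""
    try:
        # Accepts dotted netmask notation, e.g. "10.1.1.20/255.255.255.0"
        iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    except ValueError as exc:
        return [f"invalid IP address or netmask: {exc}"]
    try:
        gw = ipaddress.IPv4Address(gateway)
    except ValueError as exc:
        return [f"invalid gateway: {exc}"]

    problems = []
    # Assumption: the gateway is expected to be reachable on the BMC's own subnet.
    if gw not in iface.network:
        problems.append(f"gateway {gw} is not in subnet {iface.network}")
    return problems
```

An empty list means the three values are at least well formed and mutually consistent under that assumption; it does not replace the 'bmc status' verification in step 4.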

XIX. FAS2040: Controller registration, Enable options, Submit logs and Part Return
Step Action Description
NOTE Service entitlements break when the MB is swapped because the new motherboard changes the system serial number.
1 Ask the end-user whether the system uses AutoSupport (ASUP). If YES, perform step 1(a). If NO, perform step 1(b).
a) ASUP system: Request end-user to send NetApp an ASUP Message from the target (repaired) node so the configuration
setup can be verified and the new system serial number can be registered by NGS. If the target system is not UP, send
ASUP from its partner. Use the corresponding command for the version of ONTAP running. Enter your dispatch's 7-digit
FSO number (begins with 5).

DOT 7 and DOT 8 7-Mode:  filer> options autosupport.doit 5xxxxxx
Cluster-Mode:            cluster::> invoke * -type all -message 5xxxxxx

b) If ASUP is disabled: Call NGS CSR and provide the new MB serial number so they can register it as the new system s/n.
2 IF NDMP, SnapMirror or SnapVault options were disabled, enable them now. Refer to page 2 of doc > > here
3 Ask the customer whether they use Operations Manager. If so, confirm that they can still access the controllers; if they cannot, see bug 583160.
4 C-Mode Only: Re-enable "auto-giveback" options if they were disabled on either node. C-Mode command here
5 Email the console log with the NetApp Reference Number in the Subject Line to [email protected]
6 Place the defective part in the antistatic bag and seal the box.
7 Follow the return shipping instructions on the box to ship the part(s) back to NetApp’s RMA processing center. If the
shipping label is missing, follow the "Missing Shipping Label?" process to obtain one.
8 Verify with customer that the system is OK and if working with NGS ask them if it is OK to be released.
9 Close dispatch per Rules of Engagement.
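
Step 1(a) asks for the dispatch's 7-digit FSO number, which begins with 5. A minimal sketch of that check, and of building the 7-Mode command shown in step 1(a), is given below. The function names are hypothetical and used only for illustration; the Cluster-Mode command is deliberately left out of the sketch.

```python
import re

# FSO number per step 1(a): exactly 7 digits, the first of which is 5.
FSO_PATTERN = re.compile(r"5\d{6}")


def is_valid_fso(fso: str) -> bool:
    """True if the text is a 7-digit FSO number beginning with 5."""
    return FSO_PATTERN.fullmatch(fso.strip()) is not None


def asup_command_7mode(fso: str) -> str:
    """Build the 7-Mode AutoSupport command from step 1(a) for a validated FSO number."""
    if not is_valid_fso(fso):
        raise ValueError(f"not a 7-digit FSO number beginning with 5: {fso!r}")
    return f"options autosupport.doit {fso.strip()}"
```

For instance, `asup_command_7mode("5123456")` yields the text to enter at the `filer>` prompt, while a mistyped number such as `"512345"` raises an error before anything is sent.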
