NetApp Hardware Diagnostics Guide PDF
The sections in this guide provide the following information:
Overview of the Diagnostics Guide gives a high-level overview of what diagnostics are available for your NetApp storage systems and gives some examples of when to run them.
Running Diagnostics describes the Diagnostic Monitor and how to run diagnostics on your system.
Diagnostics Menus lists and defines the menu options of the Diagnostic Monitor's individual diagnostic tests.
Error Messages defines the coding conventions used, lists and defines the error messages generated by the diagnostic tests, and recommends the corrective action to address errors you encounter.
Environmental Error Messages lists and defines the environmental error messages generated when you run the environmental status test in the miscellaneous motherboard test menu. The error messages are listed according to the platform in which the motherboard and any related daughterboard resides and are described according to the type of sensor that is reporting the error condition. This section also recommends the corrective action to address errors you encounter.
Part Number: 215-06426_A0 ur002 Last updated: December 10, 2012
Legal Information
This section describes the following topics: Copyright Trademarks Support note Communications regulations
Copyright
Copyright 1994-2012 NetApp, Inc. All rights reserved. Printed in the U.S.A.
No part of this document covered by copyright may be reproduced in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system) without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP AS IS AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S.A. patents, foreign patents, or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).
Trademarks
NetApp, the NetApp logo, Network Appliance, the Network Appliance logo, Akorri, ApplianceWatch, ASUP, AutoSupport, BalancePoint, BalancePoint Predictor, Bycast, Campaign Express, ComplianceClock, Cryptainer, CryptoShred, Data ONTAP, DataFabric, DataFort, Decru, Decru DataFort, DenseStak, Engenio, Engenio logo, EStack, FAServer, FastStak, FilerView, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexSuite, FlexVol, FPolicy, GetSuccessful, gFiler, Go further, faster, Imagine Virtually Anything, Lifetime Key Management, LockVault, Manage ONTAP, MetroCluster, MultiStore, NearStore, NetCache, NOW (NetApp on the Web), Onaro, OnCommand, ONTAPI, OpenKey, PerformanceStak, RAID-DP, ReplicatorX, SANscreen, SANshare, SANtricity, SecureAdmin, SecureShare, Select, Service Builder, Shadow Tape, Simplicity, Simulate ONTAP, SnapCopy, SnapDirector, SnapDrive, SnapFilter, SnapLock, SnapManager, SnapMigrator, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapSuite, SnapValidator, SnapVault, StorageGRID, StoreVault, the StoreVault logo, SyncMirror, Tech OnTap, The evolution of storage, Topio, vFiler, VFM, Virtual File Manager, VPolicy, WAFL, Web Filer, and XBB are trademarks or registered trademarks of NetApp, Inc. in the United States, other countries, or both.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. A complete and current list of other IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml.
Apple is a registered trademark and QuickTime is a trademark of Apple, Inc. in the United States and/or other countries. Microsoft is a registered trademark and Windows Media is a trademark of Microsoft Corporation in the United States and/or other countries. RealAudio, RealNetworks, RealPlayer, RealSystem, RealText, and RealVideo are registered trademarks and RealMedia, RealProxy, and SureStream are trademarks of RealNetworks, Inc. in the United States and/or other countries.
All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.
NetApp, Inc. is a licensee of the CompactFlash and CF Logo trademarks. NetApp, Inc. NetCache is certified RealSystem compatible.
Support note
Microsoft has not established a commitment to support SnapManager for Exchange and storage systems used in an Exchange configuration. There can be no assurance that Microsoft will provide support for this usage. NetApp supports SnapManager for Exchange and NetApp storage systems used in an Exchange environment and has invested resources in third-party programs to provide the highest quality support possible to our customers.
Communications regulations
FCC notices (U.S. only)
NetApp devices are designed for a CFR 47 (Code of Federal Regulations) Part 15 Class A environment. The FCC and NetApp guarantee the user's rights to operate this equipment only if the user complies with the following rules and regulations:
Install and operate this equipment in accordance with the specifications and instructions in this guide.
Modify this equipment only in the ways specified by NetApp.
Use shielded cables with metallic RFI/EMI connector hoods to maintain compliance with applicable emissions standards.
If the system has nine or more Fibre Channel disk shelves, install the system in two or three NetApp System Cabinets to maintain performance within Part 15 of CFR 47 regulations.
Translation of the BSMI notice: Warning: This is a Class A product. In a domestic environment this product may cause radio interference, in which case the user may be required to take adequate measures.
Voluntary Control Council for Interference by Information Technology Equipment (VCCI, Japan)
Translation of the VCCI-A notice: This is a Class A product based on the standard of the Voluntary Control Council for Interference by Information Technology Equipment (VCCI). If this equipment is used in a domestic environment, radio disturbance may arise. If such trouble occurs, the user may be required to take corrective actions.
Contact Information
NetApp, Inc.
495 East Java Drive
Sunnyvale, CA 94089
Telephone: +1 (408) 822-6000
Fax: +1 (408) 822-4501
Support telephone: +1 (888) 4-NETAPP
Documentation comments: [email protected]
Information Web: http://www.netapp.com
Preface
About this guide
This document describes how to boot and operate the diagnostics available for NetApp storage systems.
Audience
This guide is for qualified system administrators and service personnel who are familiar with NetApp storage systems. The procedures in this guide describe replacement, upgrade, and maintenance tasks for personnel with the following skills and experience: Working familiarity with small computer system hardware and operation Basic understanding of common networking concepts and practices Working familiarity with accepted tools and procedures for installing and operating sensitive electronic equipment
Command conventions
You can enter storage system commands on the system console or from any client that can obtain access to the storage system using Telnet. This guide uses the command syntax and output of SunOS 4.1x in examples of commands run on a UNIX workstation. If you use a different version of UNIX, the command syntax and output might be different.
Formatting conventions
The following table lists different character formats used in this guide to offset special information:
Formatting convention / Type of information
Italic type
   Words or characters that require special attention.
   Placeholders for information you must supply. For example, if the guide requires you to enter the fctest adaptername command, you enter the characters "fctest" followed by the actual name of the adapter.
   Man page names.
   Book titles in cross-references.
Monospaced font
   Command and daemon names.
   Information displayed on the system console or other computer monitors.
   Contents of files.
Bold monospaced font
   Words or characters you type. What you type is always shown in lowercase letters, unless your program is case-sensitive and uppercase letters are necessary for it to work properly.
Keyboard conventions
This guide uses capitalization and some abbreviations to refer to the keys on the keyboard. The keys on your keyboard might not be labeled exactly as they are in this guide:
What is in this guide... / What it means...
hyphen (-)
   Used to separate individual keys. For example, Ctrl-D means holding down the Ctrl key while pressing the D key.
Enter
   Used to refer to the key that generates a carriage return, although the key is named Return on some keyboards.
type
   Used to mean pressing one or more keys on the keyboard.
enter
   Used to mean pressing one or more keys and then pressing the Enter key.
Special messages
This guide contains special messages that are described as follows:
Note: A note contains important information that helps you install or operate the system efficiently.
Caution: A caution contains instructions that you must follow to avoid damage to the equipment, a system crash, or loss of data.
WARNING: A warning contains instructions that you must follow to avoid personal injury.
Release history
For release information and history, see the NetApp Support site at http://support.netapp.com/.
If your storage system or disk shelf has multiple power cords and you need to turn the unit off, heed the following warning: WARNING: This unit has more than one power supply cord. To reduce the risk of electrical shock, disconnect all power supply cords before servicing.
Safety information
All products are Class 1 laser devices. The following safety instructions must be observed when operating the device:
CAUTION: Failure to follow these instructions can result in serious bodily injury or death.
When mounting disk shelves and storage appliances, the NetCache appliance, or the NearStore system in movable cabinets or racks, install the devices from the bottom up to ensure optimal stability.
DC systems must be installed in a restricted-access location, and the two input power terminals for the DC power supply must be connected to separate and isolated branch circuits.
To protect against personal injury or damage to the device, always let the internal components cool before touching them.
Make sure that the device is properly supported or standing firmly upright before you install new accessories.
This device is designed to be powered from a grounded power connection. The grounding-type power plug is an important safety feature. To protect against electric shock or damage to the device, do not disable the grounding.
The device is equipped with one or more replaceable batteries. There is a risk of explosion if a battery is replaced incorrectly. Replace batteries only with the type recommended by the manufacturer or an equivalent type. Dispose of used batteries according to the manufacturer's instructions.
If your storage appliance, NetCache appliance, NearStore system, or disk shelf has multiple power cords and you want to turn the unit off, heed the following warning.
WARNING: This unit has two power supply cords. Disconnect all connections from the power source before servicing.
About diagnostics
The Diagnostic Monitor
The Diagnostic Monitor is a set of diagnostics tools and tests that is used to search for and determine hardware problems. It is used as part of system troubleshooting to help isolate and identify a faulty component or to confirm that a specific component is operating properly.
Optional materials
Optional tools and equipment
You might need the following tools and equipment to run diagnostics, if you plan on correcting any system or component problems you might find:
Tools and equipment / Where needed
#1 and #2 Phillips screwdriver
   Opening the storage system, removing cabinet components, and replacing cards and adapters in the system.
Loopback plugs
   Needed by some diagnostic tests that run in extended mode. The plugs close data transmission loops of some system cards, such as Ethernet cards. Make sure that you have the appropriate loopback plugs for the specific card or adapter.
Antistatic wrist strap and grounding leash
   Used for grounding yourself during equipment replacement.
Reference guides
You might need the following supporting guides to assist you in replacing system components:
Manuals / Reasons
Appropriate hardware guide, hardware service guide, or field service guide for your storage system
   These guides contain information for installing or replacing components in your storage system.
Result The diagnostics program starts to boot. When booting is complete, the top-level user interface and the Diagnostic Monitor appear, listing all available features.
Where to go next
After the Diagnostic Monitor loads, you can begin running either all diagnostic tests or specific tests. See Running Diagnostics for more information about the specific tests you can perform with the Diagnostic Monitor.
Running Diagnostics
About this section
This section describes the Diagnostic Monitor and how to run diagnostics on your system.
Commands
Config              (print a list of configured PCI devices)
Default             (restore all options to default settings)
Exit                (exit diagnostics and return to firmware OK prompt)
Help                (print this commands list)
Options             (print current option settings)
Run <diag...diag>   (run selected diagnostics)
Options
Count <number>                  (loop selected diagnostic(s) <number> of passes)
Loop <yes | no>                 (loop selected diagnostic(s))
Status <yes | no>               (print status messages)
Stop <yes | no>                 (stop-on-error / keep running)
Xtnd <yes | no>                 (extended tests / regular tests)
Mchk <auto | off | on | halt>   (machine check control)
CPU <0 | 1>                     (run diagnostic with CPU0 | run diagnostic with CPU1)
Seed <number>                   (random seed (0:use machine generated number))
Enter Diag, Command or Option:
40: Port-port 10B test (Xtnd)
41: Port-port 100B test (Xtnd)
42: Port-port 1 G test (Xtnd)
43: Cluster diag-diag test
70: Display MAC address
71: Display all registers
72: Display Counters
73: Set MAC address [Factory]
90: GBE card selection
91: Enable/disable looping
92: Stop/continue looping on error
93: Extended/normal test mode
99: Exit
Config              (print a list of configured PCI devices)
Default             (restore all options to default settings)
Exit                (exit diagnostics and return to firmware OK prompt)
Help                (print this commands list)
Options             (print current option settings)
Run <diag...diag>   (run selected diagnostic)
Config command
The config command enables you to learn what Peripheral Component Interconnect (PCI) devices you have on your system.
Default and options commands
The default and options commands are closely related. They are compared in the following table:
Command / Enables you to...
default
   Return all test option settings to default values, which are:
   loop no
   status yes
   stop yes
   xtnd no
   mchk auto
options
   Display the current test option settings.
When test options are set to default values, the system displays the following output after the default command:
--Tests will stop on error
--Diagnostic looping disabled
--Status messages enabled
--Normal testing enabled
--Automatically select action on machine checks (Halt on most machine checks)
For example, suppose you change an option to the setting you want by entering the following at the Enter Diag, Command or Option prompt:
loop yes
The system response in this example shows that all settings but one are set to default:
--Tests will stop on error
**Diagnostic looping enabled
--Status messages enabled
--Normal testing enabled
--Automatically select action on machine checks (Halt on most machine checks)
Note: The asterisks before an option setting indicate a non-default value. The count option is not listed because it does not have a default setting.
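The prefix convention described in the note can be checked mechanically when console output is captured to a log. The following is a minimal Python sketch, not part of the Diagnostic Monitor; the function names are hypothetical, and it assumes each setting appears on its own line beginning with either -- (default) or ** (non-default), as in the sample output above.

```python
def parse_option_status(output):
    """Split option-status lines into (is_default, text) pairs.

    Assumption: '--' marks a default setting and '**' a non-default one.
    Lines with neither prefix are ignored.
    """
    settings = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("--"):
            settings.append((True, line[2:]))
        elif line.startswith("**"):
            settings.append((False, line[2:]))
    return settings


def non_default_settings(output):
    """Return only the settings that differ from their defaults."""
    return [text for is_default, text in parse_option_status(output) if not is_default]


sample = """\
--Tests will stop on error
**Diagnostic looping enabled
--Status messages enabled
--Normal testing enabled
--Automatically select action on machine checks (Halt on most machine checks)
"""

print(non_default_settings(sample))
```

With the sample shown, only the looping line is reported as non-default.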
Exit command
The exit command exits the Diagnostics program and returns you to the firmware prompt. Following this, you can reboot the system without power-cycling the machine. If you need to stop a diagnostic session while it is running, you can use the Ctrl-C command.
Help command
Online help is available for the Diagnostic Monitor through the help command. The help command lists what is available through the diagnostics, commands, and options menus. It also identifies the version of Diagnostics that is being run.
Run command
The run command enables you to run several diagnostic sessions in sequence. Enter run followed by the names of the diagnostics you want to run. Each session runs without interactive test selection menus. The following example runs the mb (motherboard) diagnostic and the mem (memory) diagnostic:
run mb mem
count <number>
   You can control how many loop passes are executed. The count option works only when looping is enabled.
Example: To limit an internal or external loopback test to six loop passes, you would enter:
count 6
loop no (default)
   Looping is disabled. The session terminates at the end of a pass and does not loop continuously.
loop yes
   Looping is enabled. The test run loops continuously or for the specified number of loop passes, if you set the count option. Enabled looping applies to the all and run commands. When you enable looping with loop yes, you can also specify the number of loop passes with count <number>.
Example: To enable looping, you would enter the following command:
loop yes
**Diagnostic looping enabled
Example count and loop options
The following example enables looping and sets the number of loop passes to six:
loop yes
**Diagnostic looping enabled
count 6
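If the console is driven from a script (for example over a serial session), the loop and count sequence above can be generated rather than typed. This is a minimal Python sketch under that assumption; looping_commands is a hypothetical helper that only builds the command strings shown in this section and does not send them anywhere.

```python
def looping_commands(passes=None):
    """Return the console commands that enable looping.

    passes: optional number of loop passes. The count option only has an
    effect when looping is enabled, so 'loop yes' is always emitted first.
    """
    commands = ["loop yes"]
    if passes is not None:
        if passes < 1:
            raise ValueError("passes must be a positive integer")
        commands.append(f"count {passes}")
    return commands


print("\n".join(looping_commands(6)))
```

Emitting loop yes before count mirrors the order used in the example and keeps the count setting from being entered while looping is still disabled.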
Status option
The following table lists the status option settings:
Status option / Description
status yes (default)
   Displays the diagnostic status in detail.
status no
   Displays the diagnostic status in a brief sentence.
Stop option
The following table lists the stop option settings:
Stop option / Description
stop yes (default)
   When diagnostics discovers an error, it stops at the end of a complete loop pass. The error is logged to the console terminal. If the stop option is enabled, the diagnostic stops execution at the end of a complete test pass.
stop no
   When diagnostics discovers an error, it continues running. You can run additional tests and continue to encounter additional errors.
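The interaction of the loop, count, and stop options can be easier to follow as a small model. The Python sketch below is an illustration only and is not how the Diagnostic Monitor is implemented; run_session and its parameters are hypothetical names, and the model assumes that a pass always runs to completion before the stop option is evaluated.

```python
def run_session(pass_results, loop=False, count=None, stop=True):
    """Return how many passes a modeled diagnostic session executes.

    pass_results: iterable of booleans, True meaning the pass had no error.
    loop:  models 'loop yes' / 'loop no'.
    count: models 'count <number>'; ignored unless loop is True.
    stop:  models 'stop yes' / 'stop no'.
    """
    executed = 0
    for passed in pass_results:
        executed += 1
        if not passed and stop:
            break  # stop yes: end after the pass that found the error
        if not loop:
            break  # loop no: a session is a single pass
        if count is not None and executed >= count:
            break  # loop yes with count: stop after the requested passes
    return executed


# Looping for up to four passes, with an error in the second pass.
results = [True, False, True, True]
print(run_session(results, loop=True, count=4, stop=True))
print(run_session(results, loop=True, count=4, stop=False))
```

In this model the first call ends after the failing pass, while the second runs all four requested passes.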
Xtnd option
Extended mode applies only to tests that are marked with the Xtnd label. There are two possible settings, described in the following table:
Xtnd option / Description
xtnd no (default)
   In this test mode, called normal test mode, you are testing the system component within the inner boundaries of the unit.
xtnd yes
   In this test mode, called extended test mode, you are testing the physical media outside the unit. With NICs, you are required to disconnect the unit and put special loopback connectors or plugs on the card.
Note: Loopback plugs are required to run some FC-AL diagnostic tests. They are not required when the Fibre Channel loop has its own terminator.
Example of xtnd yes
This example shows xtnd yes and the system reminding you that you might need loopback plugs.
xtnd yes
**Extended testing enabled
NOTE: Some diagnostics require loopback plugs for complete test operation and will indicate
Example of a test failure
This example shows a test failure when you have done the following:
Failed to prepare the FC-AL adapter with loopback plugs
Failed to set the xtnd yes test option
Selected 11--Loop integrity LRC test [Xtnd] in the FCAL test menu
ERROR DLH0020: FCAL loop is open. Check cables and associated hardware
FCAL loop test...........................................FAILED
Run the comprehensive test or the specific loop test. Remove the loopback plugs after the test is completed.
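When console output is saved to a log, failure lines like the ones in the example above can be extracted for later lookup in Error Messages. The following Python sketch is an assumption-laden illustration: the regular expressions are inferred from the single sample in this section (an ERROR prefix, a code of letters followed by digits, and a dotted leader before the result word), and parse_console is a hypothetical name.

```python
import re

# Assumed shapes, inferred from the sample output only.
ERROR_LINE = re.compile(r"^ERROR\s+(?P<code>[A-Z]+\d+):\s*(?P<message>.+)$")
RESULT_LINE = re.compile(r"^(?P<test>.+?)\.{3,}(?P<result>[A-Z]+)$")


def parse_console(lines):
    """Return (errors, results) extracted from console lines.

    errors:  list of (code, message) tuples.
    results: dict mapping test name to its result word.
    """
    errors, results = [], {}
    for line in lines:
        line = line.strip()
        match = ERROR_LINE.match(line)
        if match:
            errors.append((match.group("code"), match.group("message")))
            continue
        match = RESULT_LINE.match(line)
        if match:
            results[match.group("test").strip()] = match.group("result")
    return errors, results


sample = [
    "ERROR DLH0020: FCAL loop is open. Check cables and associated hardware",
    "FCAL loop test...........................................FAILED",
]
errors, results = parse_console(sample)
print(errors)
print(results)
```

The extracted code (DLH0020 in the sample) is the key to look up in the Error Messages section.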
Mchk option
The mchk (machine check) option enables you to control system behavior when the hardware detects a machine check error. The four mchk settings are as follows:
Mchk option / When a machine check is detected, the system...
mchk auto (default)
   Automatically chooses the best machine check control for the diagnostic. Usually it halts the diagnostic session. You can use non-default machine check settings in certain memory testing circumstances to aid in diagnosing hardware problems.
mchk halt
   Halts the system immediately, going into a panic state. Reboot the system to continue running diagnostics.
mchk off
   Silently ignores the error, unless it is fatal.
mchk on
   Does not halt diagnostic execution if memory parity/ECC errors or similar errors are detected. The system reports the machine check and resumes the diagnostic execution. The diagnostic can continue testing and analyzing all errors in the test pass, possibly providing a more accurate callout of memory DIMM failures.
Example:
In the following example, you enable machine check with the mchk yes option.
mchk yes
**Machine checks enabled (Display memory machine checks and continue)
You can also enable looping on the card by entering the number for the option:
91
Seed option
The following table lists the seed option settings:
Seed option / Description
seed <number>
   Enables the user to feed the Memory, NVRAM, and Cache diagnostics tests with a user-defined seed. Even if the test is random, this option recreates a test scenario and the value of the seed is displayed at the beginning of the test.
seed 0 (default)
   The diagnostics tests will use a machine generated seed number.
See Diagnostics Menus for more information about individual diagnostic test menus.
Note You can set the all option to run diagnostic testing without stopping when an error is detected. Use the stop no option from the Diagnostic Monitor. See Stop option for more information about setting this option.
Results
As each test starts, its name and test result appear on the console. By default, diagnostic testing stops when an error is encountered. The error is displayed on the screen, so you can identify the problem. See Error Messages for more information about error messages.
C1300 only: Running the all option generates a System Event Log that must be cleared. Failure to clear this log causes the Alarm LED to blink amber. To clear the log, enter the following command:
environment chassis bmc sel-clear
Result The Gigabit diagnostic test menu appears. Enter the number of the test you want to run or enter 1 to run a comprehensive test.
Test results
Example test output
When you run a test, its name, results, and error messages, if any, appear on the screen and you are returned to the test menu.
Where to go next
After the Diagnostic Monitor is loaded, you can run diagnostics on all system components or individual components.
See Diagnostics Menus for a list and description of the tests you can run.
See Error Messages for a list and description of all diagnostic error messages, along with the suggested corrective action.
See Environmental Error Messages for a list and description of all environmental error messages, along with the suggested corrective action.
Diagnostics Menus
About this section
This section lists and defines the menu options of the Diagnostic Monitor's individual diagnostic tests. If you receive an error message during a particular test, go to Error Messages to determine what the message means and how to correct the problem encountered by the test.
Motherboard diagnostics
About motherboard diagnostics
The motherboard diagnostics test the integrity of a variety of components on the motherboard or system backplane. The data you retrieve from these tests helps you determine what component is causing an error. For example, if you want to check the PCI devices and slots on the motherboard, you select the Misc. board component menu option, then select the appropriate test from the Miscellaneous board component tests submenu.
Motherboard menu
This section describes the Motherboard menu.
FAS270c only: If you are running diagnostics on system module B and you responded that system module A is running Data ONTAP or Diagnostics, then only a limited set of FC-AL tests or options are available for running:
Test no, Test / Description
1 Comprehensive motherboard diag
   Runs all tests in this menu in current mode.
2 Misc. board test menu
   Accesses the miscellaneous motherboard test menu.
3 Cache test menu
   Accesses the CPU cache tests. For more information, see the Cache test menu.
4 Onboard Ethernet test menu
   Accesses the onboard Gigabit Ethernet test menu.
5 Onboard FCAL test menu
   Accesses the onboard FC-AL test menu.
71 Show PCI configuration
   Lists the contents of all adapters in the PCI slots on the motherboard.
72 Show detailed PCI info
   Displays detailed information about the contents and settings of the cards in the PCI slots.
73 Initialize real-time clock
   Initializes the onboard real-time clock to user-defined settings.
74 Show system info
   Displays information about the system.
75 Serial info setup menu [Factory only]
   Option not available.
76 Show all disks
   Displays information about the disks.
91 Enable/disable looping
   Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 Stop/continue looping on error
   Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 Extended/normal test mode
   Enables or disables extended mode on tests where extended mode is an available option.
99 Exit
   Exits this diagnostics menu.
4 Check boot flash
   This test verifies that the boot flash can be accessed reliably by software access.
5 Real-time clock test
   This test accesses the Real Time Clock (RTC) and tests its ability to count seconds. The RTC is initialized and then the battery register is accessed to make sure that the correct status is read. Then the seconds register is accessed and the data is saved. The test waits for about one second and then accesses the seconds register again to make sure that it has changed. A second check accesses the days register to make sure it is within the correct bounds (1-7). In short, the test verifies that the Real Time Clock is incrementing correctly and that its battery is in a good state.
Check Environmental Status
   Checks the Environmental Status Register (ESR) for fault conditions, such as fan failure and high temperature.
7 Front panel LED exercise
   Exercises the front panel LEDs by changing patterns in the displays. You need to observe the LEDs blinking to verify that they are working.
8 Test PCI slots
   Tests the PCI devices.
9 Check watchdog interrupt
   Checks that the watchdog interrupt is working.
71 Show PCI configuration
   Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72 Show detailed PCI info
   Shows detailed information about the PCI devices on the various PCI buses.
73 Initialize real-time clock
   Initializes the battery powered, real-time clock.
91 Enable/disable looping
   Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 Stop/continue looping on error
   Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 Extended/normal test mode
   Enables or disables extended mode on tests where extended mode is an available option.
99 Exit
   Exits this diagnostics menu.
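The sequence of checks in the real-time clock test (battery status, seconds register changing after about one second, days register within 1-7) can be sketched as follows. This Python example is a model of the described logic only: FakeRtc is a hypothetical stand-in for the hardware registers, and the real diagnostic reads actual RTC registers rather than Python attributes.

```python
import time


class FakeRtc:
    """Stand-in for the real-time clock registers (hypothetical interface)."""

    def __init__(self, seconds=58, day=3, battery_ok=True, ticking=True):
        self.seconds = seconds
        self.day = day
        self.battery_ok = battery_ok
        self.ticking = ticking

    def advance(self, elapsed):
        # Real hardware advances by itself; the fake needs an explicit tick.
        if self.ticking:
            self.seconds = (self.seconds + int(elapsed)) % 60


def rtc_test(rtc, wait=time.sleep):
    """Return a list of failure descriptions; an empty list means a pass."""
    failures = []
    if not rtc.battery_ok:
        failures.append("battery status bad")
    before = rtc.seconds      # save the seconds register
    wait(1)                   # wait about one second
    rtc.advance(1)
    if rtc.seconds == before:
        failures.append("seconds register did not change")
    if not 1 <= rtc.day <= 7:
        failures.append("day register out of range 1-7")
    return failures


def no_wait(_seconds):
    """Skip the real delay so the example finishes immediately."""


print(rtc_test(FakeRtc(), wait=no_wait))
print(rtc_test(FakeRtc(ticking=False), wait=no_wait))
```

Passing the wait function as a parameter is a design choice for the sketch: it lets the one-second delay be skipped when the clock is simulated.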
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode.
Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices:
Test no, Test / Description
1 Comprehensive GBE test
   Runs all tests in this menu in current mode.
2 Reset test
   Runs a test that verifies that the registers have the specified default values after a reset of the Intel GBE card.
4 Link test
   Verifies the external link condition. Requires loopback plug or Ethernet connection.
5 Internal Mac lp test 10B
6 Internal Mac lp test 100B
7 Internal Mac lp test 1G
   Tests movement of data through the MAC.
8 Internal Tcvr lp test 10B
9 Internal Tcvr lp test 100B
10 Internal Tcvr lp test 1G
   Tests movement of data through the transceiver.
11 External lp test 10B (Xtnd)
12 External lp test 100B (Xtnd)
13 External lp test 1G (Xtnd)
   Extended test mode: Tests card functionality and data movement between memory and the Ethernet cable. Requires a loopback plug.
14 Interrupt test
   Performs the internal loopback test in Interrupt mode to test and verify that the DMA/data transfers work in Interrupt mode.
40 Port-port 10B test (Xtnd)
41 Port-port 100B test (Xtnd)
42 Port-port 1 G test (Xtnd)
   Tests the data path from one channel to another for the dual-channel network interfaces. It requires a twisted-pair network cable to be connected between the two ports.
43 Cluster diag-diag test
   FAS270c only: Tests the third Ethernet interface which is on the backplane and functions as the interconnect interface between the two system modules.
70 Display MAC address
   Verifies and displays the MAC address of the interface.
71 Display all registers
   Displays all the memory registers.
72 Display Counters
   Displays the date counters.
73 Set MAC address [Factory]
   Option not available.
90 GBE card selection
   Enables the selection of a specific GbE interface in the system.
91 Enable/disable looping
   Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 Stop/continue looping on error
   Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 Extended/normal test mode
   Enables or disables extended mode on tests where extended mode is an available option.
99 Exit
   Exits this diagnostics menu.
4 Int loop test
   Tests data movement between main memory and the FC-AL chip, using on-chip loopback capability for 10 bit and 1 bit.
5 Bus reset test [Xtnd]
   Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6 Ext loop test [Xtnd]
   Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7 Read-only bus test
   Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8 Read/write bus test [Mfg]
   Option not available.
9 Disk read test (FCTEST)
   Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10 Disk read/write test [Mfg]
   Option not available.
41 Scan all disks on all FC-AL adapters
   Lists the status of all the disks on all FC-AL adapters on the storage system. Requires disks attached to the FC host adapter.
42 Scan and show disks on selected FC-AL adapters
   Lists the status of all the disks on the specified FC-AL adapters. Requires disks attached to the FC host adapter.
43 FC-AL adapter LED test
   Tests the external LEDs on the FC-AL card.
71 Show ISP FC chip info
   Displays information about the ISP Fibre Channel chip.
72 Show attached FC-AL devices
   Displays all devices attached to a specific FC-AL adapter.
73 Show all disks (probe-scsi-all)
   Lists disk information for all disks attached to the system.
74 Reset FC-AL adapter
   Resets the selected FC-AL adapter to its original state.
75 Show serial EEPROM data
   Displays the serial EEPROM data.
76 Program serial EEPROM data [Factory]
   Option not available.
77 Display fcstat link_status
   Displays the link statistics maintained for all drives on a Fibre Channel loop.
80 Go to disk diagnostic menu
   Accesses the disk bus pattern diagnostics submenu.
81 Go to shelf diagnostics menu
   Accesses the disk shelf diagnostics submenu.
90 FC-AL channel selection
   Enables you to select a specific FC-AL interface for testing.
91 Enable/disable looping
   Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 Stop/continue looping on error
   Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 Extended/normal test mode
   Enables or disables extended mode on tests where extended mode is an available option.
99 Exit
   Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Motherboard menu and submenus
Motherboard menu
This section describes the Motherboard menu:
Test no  Test: Description
1  Comprehensive motherboard diag: Runs all tests in this menu in current mode.
2  Misc. board test menu: Accesses the miscellaneous motherboard test menu.
3  Cache test menu: Accesses the CPU cache tests. For more information, see the Cache test menu.
4  Onboard Gigabit Ethernet test menu: Accesses the onboard Gigabit Ethernet test menu.
5  Onboard FCAL test menu: Accesses the onboard FC-AL test menu.
6  SAS test menu: Accesses the SAS test menu.
7  IB test menu: Accesses the InfiniBand test menu.
8  BMC test menu: Accesses the baseboard management controller test menu.
9  NVMEM test menu: Accesses the NVMEM test menu.
71  Show PCI configuration: Lists the contents of all adapters in the PCI slots on the motherboard.
72  Show detailed PCI info: Displays detailed information about the contents and settings of the cards in the PCI slots.
73  Initialize real-time clock: Initializes the onboard real-time clock to user-defined settings.
75  Serial info setup menu [Mfg only]: Option not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
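The interaction between options 91 and 92 can be illustrated with a minimal Python sketch. This is not the Diagnostic Monitor's code: the function name `run_with_looping`, the boolean test result, and the `max_passes` cap (added only so the sketch always terminates) are assumptions made for illustration.

```python
from typing import Callable, Tuple


def run_with_looping(test: Callable[[], bool],
                     looping: bool,
                     stop_on_error: bool,
                     max_passes: int = 1000) -> Tuple[int, int]:
    """Run `test` once, or repeatedly when looping (option 91) is enabled.

    Returns (passes_started, errors_seen). `max_passes` is a hypothetical
    safety cap for this sketch, not a documented limit.
    """
    passes = 0
    errors = 0
    try:
        while passes < max_passes:
            passes += 1
            if not test():
                errors += 1
                # Option 92 active: stop at the first error.
                if stop_on_error:
                    break
            # Option 91 disabled: a single pass only.
            if not looping:
                break
    except KeyboardInterrupt:
        # Ctrl-C stops a looping test.
        pass
    return passes, errors
```

With looping enabled and stop-on-error inactive, the sketch keeps counting errors and continues, which matches the description of option 92.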
4  Check PCI devices and slots: Verifies that all available PCI devices are valid and are located in valid slots.
5  Check memory interface: Verifies the interface and data path integrity between the CPU and the memory DIMMs. This is a very small subset of the memory diagnostics and is not intended to be a comprehensive test. Performs a sliding 0 and 1 test to fixed locations of memory. Cache is disabled prior to running and then re-enabled at the end of the test.
6  Check boot flash access: Verifies that the boot flash can be accessed reliably by software.
7  Real-time clock test: Accesses the real-time clock (RTC) and tests its ability to count seconds. The RTC is initialized, and then the battery register is accessed to make sure that the correct status is read. The seconds register is then read and the value is saved. The test waits about one second and reads the seconds register again to make sure that it has changed. A second check reads the days register to make sure that it is within the correct bounds (1-7). In short, the test verifies that the RTC is incrementing correctly and that its battery is in a good state.
8  Check environmental status: Checks the Environmental Status Register (ESR) for fault conditions, such as fan failure and high temperature.
71  Show PCI configuration: Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72  Show detailed PCI info: Shows detailed information about the PCI devices on the various PCI buses.
73  Initialize real-time clock: Initializes the battery-powered real-time clock.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
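The "sliding 0 and 1" check in the memory interface test can be pictured with a short Python sketch. The guide does not give the word width, the addresses, or the exact pattern order, so the 32-bit default, the `write`/`read` callables, and the function names below are all assumptions for illustration.

```python
def walking_patterns(width=32):
    """Yield a sliding-1 and a sliding-0 pattern for each bit position."""
    mask = (1 << width) - 1
    for bit in range(width):
        one = 1 << bit
        yield one          # a single 1 sliding through a field of 0s
        yield ~one & mask  # a single 0 sliding through a field of 1s


def check_location(write, read, address, width=32):
    """Write each pattern to one fixed address and read it back.

    `write(address, value)` and `read(address)` are hypothetical accessors.
    Returns a list of (pattern_written, value_read) mismatches.
    """
    failures = []
    for pattern in walking_patterns(width):
        write(address, pattern)
        got = read(address)
        if got != pattern:
            failures.append((pattern, got))
    return failures
```

A data line that is stuck at 0 shows up both when the sliding 1 reaches that bit and whenever a sliding-0 pattern expects that bit to read as 1, which is why this kind of pattern is a common quick check of a data path.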
Gigabit diagnostics
About the Gigabit diagnostic tests
This section describes the onboard Gigabit Ethernet (GbE) test submenu. The GbE diagnostic tests can generate error messages associated with the hardware and software. Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode. Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
42  Port-port 10B test (Xtnd)
43  Port-port 100B test (Xtnd)
44  Port-port 1 G test (Xtnd)
    Note: The standard data packet size is 1522 bytes. Extended test mode: Tests the data path from one channel to another on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
70  Display MAC address: Verifies and displays the MAC address of the card.
71  Display all registers: Displays all the memory registers.
72  Display EEPROM: Displays the entire contents of the Ethernet device's EEPROM.
73  Set MAC address [Factory]: This test is unavailable.
90  GBE card selection: Enables the selection of a specific GbE card in the system.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
... correctly.
4  Int loop test: Tests data movement between main memory and the FC-AL chip, using on-chip loopback capability for 10 bit and 1 bit.
5  Bus reset test [Xtnd]: Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6  Ext loop test [Xtnd]: Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7  Read-only bus test: Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8  Read/write bus test [Mfg]: Option not available.
9  Disk read test (FCTEST): Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all FC-AL: Lists the status of all the disks on all FC-AL interfaces on the storage system. Requires disks attached to the FC host interface.
42  Scan and show disks on selected FC-AL: Lists the status of all the disks on the specified FC-AL interface. Requires disks attached to the FC host interface.
43  FC-AL adapter LED test: Tests the external LEDs on the FC-AL card.
44  FC initiator-target test: The test is not supported.
71  Show ISP FC chip info: Displays information about the ISP Fibre Channel chip.
72  Show attached FC-AL devices: Displays all devices attached to a specific FC-AL adapter.
73  Show all disks (probe-scsi-all): Lists disk information for all disks attached to the system.
74  Reset FC-AL adapter: Resets the selected FC-AL adapter to its original state.
75  Show serial EEPROM data: Displays the serial EEPROM data.
76  Program serial EEPROM data [Factory]: Option not available.
77  Display fcstat link_status: Displays the link statistics maintained for all drives on a Fibre Channel loop.
80  Go to disk diagnostic menu: Accesses the disk bus pattern diagnostics submenu.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
90  FC-AL card selection: Enables you to select a specific FC-AL interface for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Displays the expander status. Displays the sector size for the drives on the disk shelves. Enables or disables extended mode on tests where extended mode is an available option. Returns the user to the main FC-AL menu.
SAS diagnostics
About the SAS diagnostic tests
The SAS (Serial Attached SCSI) group of diagnostics tests the functioning of the SAS interfaces that are in your system. The tests range from EEPROM data verification through data transfer integrity testing. The SAS diagnostic tests can generate error messages associated with the interface and disk shelf. To perform disk or shelf diagnostics, select test 90 and identify the channel. This returns you to the main SAS menu. Then select test 80 or 81. Note Altering disks or cabling in a loop adapter requires you to perform either Test 41 or Test 42 before running any SAS test.
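The selection order described above (choose a channel with test 90 before entering submenu 80 or 81, and rescan with test 41 or 42 after altering disks or cabling) can be expressed as a small checker. This Python sketch is only an illustration: the function name, the message wording, and the decision to treat options 90 through 99 as non-test selections are assumptions, not behavior documented for the Diagnostic Monitor.

```python
# Menu numbers come from the SAS menu; the grouping is an assumption.
SCAN_OPTIONS = {41, 42}
SUBMENU_OPTIONS = {80, 81}
CHANNEL_OPTION = 90
NON_TEST_OPTIONS = {90, 91, 92, 93, 99}


def check_sas_sequence(selections, cabling_changed=False):
    """Return a list of ordering problems for a planned list of selections."""
    problems = []
    channel_selected = False
    rescanned = not cabling_changed
    for number in selections:
        if number == CHANNEL_OPTION:
            channel_selected = True
        elif number in SCAN_OPTIONS:
            rescanned = True
        elif number in NON_TEST_OPTIONS:
            continue
        else:
            if not rescanned:
                problems.append(
                    f"{number}: run 41 or 42 first after altering disks or cabling")
            if number in SUBMENU_OPTIONS and not channel_selected:
                problems.append(f"{number}: select a channel with 90 first")
    return problems
```

For example, `check_sas_sequence([42, 90, 81], cabling_changed=True)` reports no problems, while `check_sas_sequence([81])` flags the missing channel selection.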
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all SAS: Lists the status of all the disks on all SAS interfaces on the storage system. Requires the presence of disks.
42  Scan and show disks on selected SAS: Lists the status of all the disks on the specified SAS interface. Requires the presence of disks.
71  Show ISP SAS chip info: Displays information about the ISP SAS chip.
72  Show attached SAS devices: Displays all devices attached to a specific SAS interface.
74  Reset SAS interface: Resets the selected SAS interface to its original state.
76  Program onboard WWN [Factory]: Option not available.
78  Zeroing disk test [Mfg]: Option not available.
80  Go to disk diagnostic menu: Select a node, then go to the disk write test patterns submenu. From this menu, you can select a specific pattern test to run.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
90  SAS card selection: Enables you to select a specific SAS interface for testing. At present, the only possible interface is 0c.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the SAS host adapter. Returns the user to the main SAS menu.
Displays the expander status. Displays the sector size for the drives on the disk shelves. Enables or disables extended mode on tests where extended mode is an available option. Returns the user to the main SAS menu.
IB diagnostics
The following table describes the tests in the IB diagnostic test menu.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode.
Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
Test no  Test: Description
1  Comprehensive test: Runs all tests in this menu in current mode.
6  Internal loopback test: Tests data transfer between host memory and the IB card, using on-chip loopback.
7  Link test [Xtnd]: Extended test mode: Both heads must be in the IB test menu to perform the link test.
70  Display card information: Displays the card vendor ID, device ID, revision ID, class code, and GUID base.
71  Reset chip [Xtnd]: Extended test mode: Resets the controller chip.
72  Display firmware information: Displays the IB firmware information.
73  Download firmware: Downloads the IB firmware.
74  Read GUID: Reads and displays the GUID information.
75  Write GUID: Allows a user to enter a new GUID number.
90  IB card selection: Option is logically not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
BMC menu
This section describes the BMC menu. The BMC diagnostic tests can generate error messages associated with the hardware and software.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs.
Caution: Disconnect all network connections prior to running network diagnostics in [Xtnd] mode. Running with attached networks can adversely affect other attached devices.
Type xtnd n to cancel Extended test mode.
Test no  Test: Description
1  Comprehensive test: Runs all tests in this menu in current mode.
2  BMC self test: Issues a self-test command to the BMC and waits a specified amount of time for a pass or fail response code. If the test times out without a response from the BMC, a timeout error is displayed.
3  Environment test: Verifies that the environmental hardware can successfully report extraordinary environmental events.
4  SDR read test: Verifies that the sensor data repository (SDR) is readable.
5  SEL read test: Verifies that the system event log (SEL) is readable.
6  LCD exercise: Option not available.
7  BMC timer test: Verifies that the SEL timer increments correctly.
10  Show BMC SSH Keys: Displays the BMC SSH Key.
41  BMC NMI test: This platform does not support this selection.
42  BMC Front Panel Button Test
43  SEL Write Test (Xtnd): Extended test mode: Verifies that the SEL can be written to by software. This test is available only in XTND mode. It writes a dummy record into the SEL and checks whether the record was written correctly.
71  Show BMC SEL time: Displays the current time as measured by the BMC's SEL timer.
72  Get reason for restart: Identifies the reason for the previous reboot.
73  Show device info: Displays device information about the BMC.
74  Show SDR info: Displays information held in the BMC's SDR.
75  Show SEL info: Displays information held in the BMC's SEL.
76  Clear SEL (Mfg): Option not available.
77  Emergency shutdown (Mfg)
78  BMC update menu (Xtnd): This platform is unable to perform this selection.
79  Dump SEL Records: Displays all the BMC SEL records in a user-readable format.
80  Dump Raw SEL Records: Displays all the BMC SEL records in the raw format.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
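The BMC self test described above follows a command-then-wait pattern with three outcomes: pass, fail, or timeout. The Python sketch below shows that pattern only. The callables `send_command` and `poll_response`, the default timeout, and the convention that response code 0 means pass are hypothetical placeholders; the guide does not specify the actual BMC interface or codes.

```python
import time


def bmc_self_test(send_command, poll_response,
                  timeout_s=5.0, interval_s=0.1,
                  clock=time.monotonic, sleep=time.sleep):
    """Issue a self-test command and wait for a response code.

    `poll_response()` returns None while no response is available.
    Returns "pass", "fail", or "timeout".
    """
    send_command()
    deadline = clock() + timeout_s
    while clock() < deadline:
        code = poll_response()
        if code is not None:
            # Assumption for this sketch: 0 means pass, anything else fail.
            return "pass" if code == 0 else "fail"
        sleep(interval_s)
    return "timeout"
```

Passing `clock` and `sleep` as parameters keeps the sketch testable without real delays.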
NVMEM diagnostics
The following table describes the NVMEM test menu for the FAS20xx:
Test no  Test: Description
1  Comprehensive NVMEM test: Runs all tests in current mode.
2  Battery test: Tests the battery.
71  Set battery armed: Toggles between arming and disarming the battery.
75  Fill for power cycle test, burst write: Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76  Fill for power cycle test, burst read: Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77  Fill for power cycle test: Fills NVRAM memory with data patterns for power cycle test.
78  Verify data retention: Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
82  Display memory by address: Displays the contents of a memory address location.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
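The fill (test 75) and verify (test 78) pair follows a common retention-test pattern: write a reproducible pattern, power cycle, then regenerate the same pattern and compare. The Python sketch below models that idea with a `bytearray` standing in for NVRAM. The pattern formula and function names are invented for illustration; the guide does not describe the actual data patterns.

```python
def pattern_byte(offset):
    """Hypothetical deterministic pattern: depends only on the offset."""
    return (offset * 0x9D + 0x5A) & 0xFF


def fill(memory):
    """Fill the buffer with the pattern (the step before the power cycle)."""
    for offset in range(len(memory)):
        memory[offset] = pattern_byte(offset)


def verify(memory):
    """Return the offsets whose contents no longer match the pattern."""
    return [offset for offset, value in enumerate(memory)
            if value != pattern_byte(offset)]
```

Because the pattern can be recomputed from the offset alone, the verify step needs no stored copy of the data, which is what allows it to run after a power cycle.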
Motherboard menu
This section describes the Motherboard menu:
Test no  Test: Description
1  Comprehensive motherboard diag: Runs all tests in this menu in current mode.
2  Misc. board test menu: Accesses the miscellaneous motherboard test menu.
3  Cache test menu: Accesses the CPU cache tests. For more information, see the Cache test menu.
4  Onboard Gigabit Ethernet test menu: Accesses the onboard Gigabit Ethernet test menu.
5  Onboard FCAL test menu: Accesses the onboard FC-AL test menu.
6  Onboard SCSI test menu: Accesses the onboard SCSI test menu.
71  Show PCI configuration: Lists the contents of all adapters in the PCI slots on the motherboard.
72  Show detailed PCI info: Displays detailed information about the contents and settings of the cards in the PCI slots.
73  Initialize real-time clock: Initializes the onboard real-time clock to user-defined settings.
74  Show system info: Displays information about the system.
75  Serial info setup menu [Factory only]: Option not available.
76  Show Adapter card info [Mfg only]: Option not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
9  Check Super I/O status: Verifies that the Super I/O chip is alive and responding normally.
11  Front panel LED exercise: Exercises the front panel LEDs by changing patterns in the displays. You need to observe the LEDs blinking to verify that they are working.
12  Front panel LCD exercise: Exercises the front panel LCD by changing patterns in the display. You need to observe the LCDs to verify that they are working.
13  Test PCI devices [Factory only]: Option is unavailable.
14  Check on-board 8K nvsram: Verifies that the onboard 8K NVSRAM is working correctly.
41  Check watchdog interrupt: Checks that the watchdog interrupt is working.
71  Show PCI configuration: Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72  Show detailed PCI info: Shows detailed information about the PCI devices on the various PCI buses.
73  Initialize real-time clock: Initializes the battery-powered real-time clock.
74  Toggle front panel LEDs: Verifies that the front panel activity and status LEDs are working by turning them ON/OFF or changing colors.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Gigabit diagnostics
About the Gigabit diagnostic tests
This section describes the onboard Gigabit Ethernet (GbE) test submenu. The GbE diagnostic tests can generate error messages associated with the hardware and software. Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode. Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
Extended test mode: Tests card functionality and data movement between memory and the Ethernet cable. Requires loopback plug.
Tests the transmit and receive interrupts to verify the device's ability to generate interrupts and the system's ability to handle interrupts correctly.
Tests and verifies that all the device interrupts are working. Data is not transferred during this test.
Tests data from the transmitter to the receiver before it goes to the MAC.
43  Port-port 1 G test (Xtnd): Tests the data path from one channel to another on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
70  Display MAC address: Verifies and displays the MAC address of the card.
71  Display all registers: Displays all the card memory registers.
72  Display EEPROM: Displays the EEPROM data on the GbE card.
73  Set MAC address [Factory]: This test is unavailable.
90  GbE card selection: Enables the selection of a specific GbE card in the system.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
5  Bus reset test [Xtnd]: Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6  Ext loop test [Xtnd]: Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7  Read-only bus test: Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8  Read/write bus test [Mfg]: Option not available.
9  Disk read test (FCTEST): Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all FC-AL adapters: Lists the status of all the disks on all FC-AL adapters on the storage system. Requires disks attached to the FC host adapter.
42  Scan and show disks on selected FC-AL adapters: Lists the status of all the disks on the specified FC-AL adapters. Requires disks attached to the FC host adapter.
43  FC-AL adapter LED test: Tests the external LEDs on the FC-AL card.
71  Show ISP FC chip info: Displays information about the ISP Fibre Channel chip.
72  Show attached FC-AL devices: Displays all devices attached to a specific FC-AL adapter.
73  Show all disks (probe-scsi-all): Lists disk information for all disks attached to the system.
74  Reset FC-AL adapter: Resets the selected FC-AL adapter to its original state.
75  Show serial EEPROM data: Displays the serial EEPROM data.
76  Program serial EEPROM data [Factory]: Option not available.
77  Display fcstat link_status: Displays the link statistics maintained for all drives on a Fibre Channel loop.
80  Go to disk diagnostic menu: Accesses the disk bus pattern diagnostics submenu.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
90  FC-AL channel selection: Enables you to select a specific FC-AL interface for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
3  SCSI interrupt test
4  Read-only bus test [Xtnd]
5  Read/write bus test [Mfg]
6  Disk read test (FCTEST)
7  Disk read/write test
42  Scan and show disks (R100)
71  Show ISP chip info
72  Show attached SCSI devices
73  Show all disks (probe-scsi-all)
74  Reset SCSI adapter
75  Show serial EEPROM data
76  Program serial EEPROM data [Factory]
77  Go to shelf Diagnostics menu: Option not available.
78  Set serial # and revision [Factory]: Option not available.
79  Zero disk test area [Factory]: Option not available.
90  SCSI card selection: Enables you to select a specific SCSI card for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Motherboard menu
This section describes the Motherboard menu:
Test no  Test: Description
1  Comprehensive motherboard diag: Runs all tests in this menu in current mode.
2  Misc. board test menu: Accesses the miscellaneous motherboard test menu.
3  Cache test menu: Accesses the CPU cache tests. For more information, see the Cache test menu.
4  Onboard Gigabit Ethernet test menu: Accesses the onboard Gigabit Ethernet test menu.
5  Onboard FCAL test menu: Accesses the onboard FC-AL test menu.
71  Show PCI configuration: Lists the contents of all adapters in the PCI slots on the motherboard.
72  Show detailed PCI info: Displays detailed information about the contents and settings of the cards in the PCI slots.
74  Show system info: Displays information about the system.
75  Serial info setup menu [Factory only]: Option not available.
76  Show Adapter card info [Mfg only]: Option not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
7  Real-time clock test: Accesses the real-time clock (RTC) and tests its ability to count seconds. The RTC is initialized, and then the battery register is accessed to make sure that the correct status is read. The seconds register is then read and the value is saved. The test waits about one second and reads the seconds register again to make sure that it has changed. A second check reads the days register to make sure that it is within the correct bounds (1-7). In short, the test verifies that the RTC is incrementing correctly and that its battery is in a good state.
8  Check environmental status: Checks the Environmental Status Register (ESR) for fault conditions, such as fan failure and high temperature.
9  Check Super I/O status: Verifies that the Super I/O chip is alive and responding normally.
11  Front panel LED exercise: Exercises the front panel LEDs by changing patterns in the displays. You need to observe the LEDs blinking to verify that they are working.
12  Front panel LCD exercise: Exercises the front panel LCD by changing patterns in the display. You need to observe the LCDs to verify that they are working.
41  Check watchdog interrupt: Checks that the watchdog interrupt is working.
42  NMI Dump Switch Test: Within two minutes of selecting this test, you must press the NMI switch below the front panel. You will then get a confirmation message.
43  Check HT link speed: Verifies that the HT link frequency and width are the same as the factory settings.
71  Show PCI configuration: Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72  Show detailed PCI info: Shows detailed information about the PCI devices on the various PCI buses.
73  Initialize real-time clock: Initializes the battery-powered real-time clock.
74  Toggle front panel LEDs: Verifies that the front panel activity and status LEDs are working by turning them ON/OFF or changing colors.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
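The sequence of checks in the real-time clock test can be sketched in Python as follows. The `FakeRtc` class is a hypothetical stand-in for the RTC registers, and the result strings are invented for illustration; only the order of the checks (initialize, battery status, seconds before and after a wait, day within 1-7) comes from the description above.

```python
class FakeRtc:
    """Hypothetical stand-in for the RTC registers, for illustration only."""

    def __init__(self, running=True, battery_ok=True, day=3):
        self.running = running
        self.battery_ok = battery_ok
        self.day = day
        self.seconds = 0

    def initialize(self):
        self.seconds = 0

    def advance(self):
        # Stands in for waiting about one second on real hardware.
        if self.running:
            self.seconds = (self.seconds + 1) % 60


def rtc_test(rtc, wait):
    """Apply the checks in the order the test description gives them."""
    rtc.initialize()
    if not rtc.battery_ok:
        return "battery status error"
    before = rtc.seconds
    wait()
    if rtc.seconds == before:
        return "seconds register did not change"
    if not 1 <= rtc.day <= 7:
        return "day register out of bounds"
    return "pass"
```

For example, `rtc = FakeRtc(); rtc_test(rtc, rtc.advance)` exercises the passing path; on real hardware the `wait` callable would be a delay of a little over one second.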
Motherboard menu and submenus
Gigabit diagnostics
About the Gigabit diagnostic tests
This section describes the onboard Gigabit Ethernet (GbE) test submenu. The GbE diagnostic tests can generate error messages associated with the hardware and software.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode.
Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
Extended test mode: Tests data transfer between memory and the Ethernet chip on the 10Base-T/100Base-TX interface, involving loopback over connected wire. Also tests overall Ethernet functionality. Requires loopback plug.
Tests the data path from one channel to the other on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
70  Display MAC address: Verifies and displays the MAC address of the card.
71  Display all registers: Displays all the card memory registers.
72  Display all stats counters: Displays all the card statistics.
73  Dump EEPROM: Displays the EEPROM data.
74  Set MAC address [Factory only]: Option is unavailable.
75  EEPROM firmware update [Factory only]: Option is unavailable.
90  BGE card selection: Enables you to select the onboard Ethernet port for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Return to main menu: Returns you to the main Diagnostics menu.
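The way the looping options (91 and 92) interact can be pictured with a small Python sketch. It is an interpretation of the descriptions in this menu, not the Diagnostic Monitor's implementation: run_with_looping, its parameters, and the max_passes safety limit are hypothetical, and Ctrl-C is modeled with Python's KeyboardInterrupt.

```python
from typing import Callable, Dict


def run_with_looping(test: Callable[[], bool], looping: bool,
                     stop_on_error: bool, max_passes: int = 1000) -> Dict[str, int]:
    """Run a test once, or repeatedly when looping is enabled.

    test returns True on success and False on error.
    looping models option 91; stop_on_error models option 92.
    max_passes is an assumption so that the sketch always terminates.
    """
    passes = 0
    errors = 0
    try:
        while True:
            passes += 1
            if not test():
                errors += 1
                if stop_on_error:
                    break  # an error ends the run when stop-on-error is active
            if not looping or passes >= max_passes:
                break      # a single pass, or the safety limit was reached
    except KeyboardInterrupt:
        pass               # Ctrl-C stops a looping test
    return {"passes": passes, "errors": errors}


# Simulated outcomes: two good passes followed by a failure.
outcomes = iter([True, True, False, True])
print(run_with_looping(lambda: next(outcomes), looping=True, stop_on_error=True))
```

With looping enabled and stop-on-error inactive, the same loop keeps running after a failure and only counts it, which matches the description of option 92.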
Motherboard menu and submenus
5  Bus reset test [Xtnd]: Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6  Ext loop test [Xtnd]: Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7  Read-only bus test: Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8  Read/write bus test [Mfg]: Option not available.
9  Disk read test (FCTEST): Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all FC-AL adapters: Lists the status of all the disks on all FC-AL adapters on the storage system. Requires disks attached to the FC host adapter.
42  Scan and show disks on selected FC-AL adapters: Lists the status of all the disks on the specified FC-AL adapters. Requires disks attached to the FC host adapter.
43  FC-AL adapter LED test: Tests the external LEDs on the FC-AL card.
44  FC-AL initiator/target test: Tests the mode (target or initiator) of the FC-AL.
71  Show ISP FC chip info: Displays information about the ISP Fibre Channel chip.
72  Show attached FC-AL devices: Displays all devices attached to a specific FC-AL adapter.
73  Show all disks (probe-scsi-all): Lists disk information for all disks attached to the system.
74  Reset FC-AL adapter: Resets the selected FC-AL adapter to its original state.
75  Show serial EEPROM data: Displays the serial EEPROM data.
76  Program serial EEPROM data [Factory]: Option not available.
77  Display fcstat link_status: Displays the link statistics maintained for all drives on a Fibre Channel loop.
80  Go to disk diagnostic menu: Accesses the disk bus pattern diagnostics submenu.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
85  Show onboard Fcal WWN: Displays the onboard Fibre Channel port's World Wide Name.
90  FC-AL channel selection: Enables you to select a specific FC-AL interface for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Motherboard menu and submenus
Motherboard menu
This section describes the Motherboard menu:
Test no.  Test: Description
1  Comprehensive motherboard diag: Runs all tests in this menu in current mode.
2  Misc. board test menu: Accesses the miscellaneous motherboard test menu.
3  Cache test menu: Accesses the CPU cache tests. For more information, see the Cache test menu.
4  Onboard Gigabit Ethernet test menu: Accesses the onboard Gigabit Ethernet test menu.
5  Onboard FCAL test menu: Accesses the onboard FC-AL test menu.
6  NVRAM test menu: Accesses the NVRAM test menu.
7  Ethernet Switch test menu: Accesses the menu that tests the failover assistance switch.
71  Show PCI configuration: Lists the contents of all adapters in the PCI slots on the motherboard.
72  Show detailed PCI info: Displays detailed information about the contents and settings of the cards in the PCI slots.
74  Show system info: Displays information about the system.
75  Serial info setup menu [Factory only]: Option not available.
76  Show Adapter card info [Mfg only]: Option not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Motherboard menu and submenus
4  Check PCI devices and slots: Verifies that all available PCI devices are valid and are located in valid slots.
5  Check memory interface: Verifies the interface and data path integrity between the CPU and the memory DIMMs. This is a very small subset of the memory diagnostics and is not intended to be a comprehensive test. Performs a sliding 0 and 1 test on fixed memory locations. The cache is disabled before the test runs and re-enabled at the end of the test.
6  Check boot flash access: Verifies that the boot flash can be accessed reliably by software.
7  Real-time clock test: Accesses the real-time clock (RTC) and tests its ability to count seconds. The RTC is initialized, and then the battery register is read to make sure that it reports the correct status. The seconds register is then read and its value is saved. The test waits about one second and reads the seconds register again to make sure that the value has changed. A second check reads the days register to make sure that it is within the correct bounds (1-7). Together, these checks verify that the RTC is incrementing correctly and that its battery is in a good state.
8  Check environmental status: Checks the Environmental Status Register (ESR) for fault conditions, such as fan failure and high temperature.
71  Show PCI configuration: Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72  Show detailed PCI info: Shows detailed information about the PCI devices on the various PCI buses.
73  Initialize realtime clock: Initializes the battery-powered real-time clock.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Motherboard menu and submenus
Gigabit diagnostics
About the Gigabit diagnostic tests
This section describes the onboard Gigabit Ethernet (GbE) test submenu. The GbE diagnostic tests can generate error messages associated with the hardware and software.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode.
Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
Extended test mode: Tests data transfer between memory and the Ethernet chip on the 10Base-T/100Base-TX interface, involving loopback over connected wire. Also tests overall Ethernet functionality. Requires loopback plug.
Tests the data path from one channel to the other on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
70  Display MAC address: Verifies and displays the MAC address of the card.
71  Display all registers: Displays all the card memory registers.
72  Display all stats counters: Displays all the card statistics.
73  Dump EEPROM: Displays the EEPROM data.
74  Set MAC address [Factory only]: Option is unavailable.
75  EEPROM firmware update [Factory only]: Option is unavailable.
76  Set IO board FRU information [Factory only]: Option is unavailable.
77  Show IO board FRU information: Displays the IO board FRU information.
90  BGE card selection: Enables you to select the onboard Ethernet port for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Return to main menu: Returns you to the main Diagnostics menu.
Motherboard menu and submenus
4  Int loop test: Tests data movement between main memory and the FC-AL chip, using on-chip loopback capability for 10 bit and 1 bit.
5  Bus reset test [Xtnd]: Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6  Ext loop test [Xtnd]: Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7  Read-only bus test: Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8  Read/write bus test [Mfg]: Option not available.
9  Disk read test (FCTEST): Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all FC-AL: Lists the status of all the disks on all FC-AL interfaces on the storage system. Requires disks attached to the FC host interface.
42  Scan and show disks on selected FC-AL: Lists the status of all the disks on the specified FC-AL interface. Requires disks attached to the FC host interface.
43  FC-AL adapter LED test: Tests the external LEDs on the FC-AL card.
44  FC initiator/target test: The test is not supported.
71  Show ISP FC chip info: Displays information about the ISP Fibre Channel chip.
72  Show attached FC-AL devices: Displays all devices attached to a specific FC-AL adapter.
73  Show all disks (probe-scsi-all): Lists disk information for all disks attached to the system.
74  Reset FC-AL adapter: Resets the selected FC-AL adapter to its original state.
75  Show serial EEPROM data: Displays the serial EEPROM data.
76  Program serial EEPROM data [Factory]: Option not available.
77  Display fcstat link_status: Displays the link statistics maintained for all drives on a Fibre Channel loop.
80  Go to disk diagnostic menu: Accesses the disk bus pattern diagnostics submenu.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
90  FC-AL card selection: Enables you to select a specific FC-AL interface for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Displays the expander status. Displays the sector size for the drives on the disk shelves. Enables or disables extended mode on tests where extended mode is an available option. Returns the user to the main FC-AL menu.
Motherboard menu and submenus
NVRAM diagnostics
The following table describes the NVRAM test menu for the FAS31xx/V31xx:
Test no.  Test: Description
1  Comprehensive NVRAM test: Runs all tests in current mode.
2  NVRAM memory menu: Accesses the NVRAM memory menu.
3  NVRAM IB menu: Accesses the IB menu, which tests the part of the adapter associated with clustering.
5  NVRAM environmental test: Accesses the environmental test menu.
6  NVRAM EEPROM test: Tests the NVRAM EEPROM subcomponent.
7  NVRAM FLASH test: Tests the NVRAM FLASH subcomponent.
8  NVRAM i2c test: Tests the NVRAM i2c bus.
70  Set NVRAM properties [Mfg only]: Option not available.
71  Display NVRAM properties: Displays information about the NVRAM7 adapter.
72  Display NVRAM EEPROM: Displays information about the NVRAM7 Electrically Erasable Programmable Read Only Memory (EEPROM) contents.
73  Display NVRAM status: Displays information about the NVRAM7 status.
74  Display NVRAM config space: Displays information about the NVRAM7 configuration space.
76  Upgrade NVRAM firmware [Xtnd]: Extended test mode: Updates the firmware on the NVRAM7.
77  Clear NVRAM properties [Mfg only]: Option not available.
90  NVRAM card selection: Enables the selection of a specific NVRAM card for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Memory diagnostics
The following table describes the NVRAM7 memory test menu:
Test no.  Test: Description
1  Comprehensive NVRAM memory test: Runs all tests in current mode.
2  NVRAM memory walking data test: Runs quick test of data lines.
3  NVRAM memory walking address test: Runs quick test of all address lines to verify address paths in NVRAM memory.
4  NVRAM memory partial word test: Tests intermixed data sizes.
5  NVRAM memory random data test: Runs longer test by writing and reading random data to all NVRAM locations.
6  NVRAM memory random address test: Runs longer test using random addresses.
10  NVRAM DIMM SPD test: Compares NVRAM DIMM properties (SPD) against supported values.
11  Display NVRAM DIMM SPD: Displays NVRAM DIMM properties (SPD) as field-value pairs.
12  Dump NVRAM DIMM SPD: Displays NVRAM DIMM properties (SPD) as a hexadecimal dump.
20  Inject ECC errors [Xtnd only]: Extended test mode: Injects ECC errors into the NVRAM DIMM, without triggering detection.
21  Inject/read ECC errors [Xtnd only]: Extended test mode: Injects ECC errors into the NVRAM DIMM, and then triggers detection.
50  NVRAM DMA Write-Read-Verify: Fills system memory with a random data pattern, and then DMA transfers this pattern back and forth from NVRAM memory.
51  NVRAM DMA Write-only: Fills system memory with a random data pattern, and then DMA transfers this pattern to NVRAM memory.
52  NVRAM DMA Read-only: Fills NVRAM memory with a random data pattern, and then DMA transfers this pattern to system memory.
70  NVRAM memory dump: Allows the user to dump a region of memory.
71  NVRAM memory poke: Allows the user to write to a region of memory.
72  NVRAM memory custom pattern: Fills NVRAM memory with a user-specified data pattern.
74  Memory fill power cycle test: Fills NVRAM memory with data patterns for power cycle test.
75  Memory write power cycle test: Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76  Memory read power cycle test: Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77  Memory DMA write power cycle test: Fills NVRAM memory with data patterns for power cycle test, which does DMA writes.
78  Verify data retention: Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
80  Memory class change [Mfg only]: Option not available.
90  NVRAM card selection: Enables the selection of a specific NVRAM card for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
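The idea behind the power cycle tests, filling memory with a known pattern and later checking that it was retained, can be sketched in Python. This is a conceptual illustration only: NVRAM is modeled as a bytearray, the pattern is regenerated from a seed (an assumption; the guide does not say how the real patterns are produced), and make_pattern, fill, and verify_retention are hypothetical names.

```python
import random
from typing import List


def make_pattern(size: int, seed: int) -> bytes:
    """Build a reproducible pseudo-random pattern from a seed (assumption)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))


def fill(memory: bytearray, seed: int) -> None:
    """Write the pattern into the simulated NVRAM before the power cycle."""
    memory[:] = make_pattern(len(memory), seed)


def verify_retention(memory: bytearray, seed: int) -> List[int]:
    """Return the offsets whose contents no longer match the pattern."""
    expected = make_pattern(len(memory), seed)
    return [i for i, (got, want) in enumerate(zip(memory, expected)) if got != want]


nvram = bytearray(64)                  # simulated NVRAM region
fill(nvram, seed=75)
print(verify_retention(nvram, seed=75))  # no mismatches while data is intact
nvram[10] ^= 0xFF                      # simulate one byte lost across the power cycle
print(verify_retention(nvram, seed=75))
```

Regenerating the pattern from a seed means the verification step needs no stored copy of the data, which is convenient when the check runs after the system has been powered off and on again.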
NVRAM7 IB diagnostics
The following table describes the tests in the NVRAM7 IB diagnostic test:
Test no.  Test: Description
1  Comprehensive NVRAM cluster test: Runs all tests in current mode.
2  Internal loopback RDMAW test: Tests remote direct memory access write (RDMAW) between host memory and the NVRAM7 card, using on-chip loopback.
3  Internal loopback send test: Tests data transfer between host memory and the NVRAM7 card, using on-chip loopback.
4  Link test [Xtnd]: Extended test mode: Verifies external link status. Point-to-point cable needed.
5  External loopback RDMAW test [Xtnd]: Extended test mode: Tests remote direct memory access write (RDMAW) between host memory and the NVRAM7 card, using external loopback. Point-to-point cable needed.
6  External loopback send test [Xtnd]: Extended test mode: Tests data transfer between host memory and the NVRAM7 card, using external loopback. Point-to-point cable needed.
70  Reset port performance counter: Resets the counter on the performance of the cluster ports.
71  Display port performance counter: Displays information about the performance of the cluster ports.
90  NVRAM card selection: Enables the selection of a specific NVRAM card for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Environmental diagnostics
The following table describes the NVRAM7 environmental test menu:
Test no.  Test: Description
1  Comprehensive NVRAM env test: Runs all tests in current mode.
2  NVRAM env subsystem test: Tests the interrupt conditions for each sensor.
3  NVRAM battery test: Tests the battery.
4  NVRAM charger test: Tests the battery charger.
70  GPIO bit control: Allows the user to toggle the general purpose IO lines.
71  GPIO dump: Dumps the settings of the general purpose IO lines.
72  Turn battery on: Turns on the battery.
73  Turn charger on: Turns on the battery charger.
74  LM81 I2C dump: Allows the user to read the devices on the NVRAM board.
75  LM81 I2C write: Allows the user to write to the devices on the NVRAM board.
76  Force GPIO interrupt: Forces an interrupt from the NVRAM board through the general purpose IO line.
77  Charge Battery: Charges the NVRAM battery to a user-specified voltage.
78  Discharge Battery: Discharges the NVRAM battery to a user-specified voltage.
90  NVRAM card selection: Enables the selection of a specific NVRAM card for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
The NVRAM7 environmental test can generate environmental error messages associated with the battery or the temperature sensors. The corrective action for this error message grouping is below the error message description:
ENV01960x  NVRAM7-battery-0
ENV01961x  NVRAM7-temperature-0
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:
Error code range  Corrective action
ENV01960x  1. Verify that the NVRAM7 battery is connected. 2. Call NetApp Technical Support if the error is not corrected.
ENV01961x  Call NetApp Technical Support if the error is not corrected.
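For scripted log review, the error code ranges above can be held in a small lookup table. The Python sketch below is an illustration, not a NetApp tool: it assumes the trailing x in each range stands for any single final character, and lookup, CORRECTIVE_ACTIONS, and the sample code ENV019603 are hypothetical.

```python
from typing import List, Optional, Tuple

# Range -> (sensor, corrective actions), copied from the tables in this section.
CORRECTIVE_ACTIONS = {
    "ENV01960x": ("NVRAM7-battery-0", [
        "Verify that the NVRAM7 battery is connected.",
        "Call NetApp Technical Support if the error is not corrected.",
    ]),
    "ENV01961x": ("NVRAM7-temperature-0", [
        "Call NetApp Technical Support if the error is not corrected.",
    ]),
}


def lookup(code: str) -> Optional[Tuple[str, List[str]]]:
    """Match a full error code against the ranges.

    Assumption: the final 'x' in a range is a one-character wildcard.
    """
    for pattern, entry in CORRECTIVE_ACTIONS.items():
        prefix = pattern[:-1]  # drop the wildcard position
        if len(code) == len(pattern) and code.startswith(prefix):
            return entry
    return None


result = lookup("ENV019603")  # hypothetical sample code
if result is not None:
    sensor, actions = result
    print(sensor)
    for step, action in enumerate(actions, start=1):
        print(step, action)
```

Codes that fall outside both ranges return None, so a caller can distinguish an unknown code from one that has a documented corrective action.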
Motherboard menu and submenus
This option dumps the content of the switch config EEPROM. This option dumps the content of the switch config registers. This option dumps the content of the switch's receive and transmit counters for each port. Option not available. Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active. Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered. Enables or disables extended mode on tests where extended mode is an available option. Exits this diagnostics menu.
Motherboard menu and submenus
Motherboard menu
This section describes the Motherboard menu:
Test no.  Test: Description
1  Comprehensive motherboard diag: Runs all tests in this menu in current mode.
2  Misc. board test menu: Accesses the miscellaneous motherboard test menu.
3  Cache test menu: Accesses the CPU cache tests. For more information, see the Cache test menu.
4  Onboard Gigabit Ethernet test menu: Accesses the onboard Gigabit Ethernet test menu.
5  Onboard FCAL test menu: Accesses the onboard FC-AL test menu.
71  Show PCI configuration: Lists the contents of all adapters in the PCI slots on the motherboard.
72  Show detailed PCI info: Displays detailed information about the contents and settings of the cards in the PCI slots.
73  Initialize real-time clock: Initializes the onboard real-time clock to user-defined settings.
74  Show system info: Displays information about the system.
75  Serial info setup menu [Factory only]: Option not available.
76  Show Adapter card info [Mfg only]: Option not available.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Motherboard menu and submenus
6  Check boot flash access: Verifies that the boot flash can be accessed reliably by software.
7  Real-time clock test: Accesses the real-time clock (RTC) and tests its ability to count seconds. The RTC is initialized, and then the battery register is read to make sure that it reports the correct status. The seconds register is then read and its value is saved. The test waits about one second and reads the seconds register again to make sure that the value has changed. A second check reads the days register to make sure that it is within the correct bounds (1-7). Together, these checks verify that the RTC is incrementing correctly and that its battery is in a good state.
8  Check environmental status: Checks the Environmental Status Register (ESR) for fault conditions, such as fan failure and high temperature.
9  Check Super I/O status: Verifies that the Super I/O chip is alive and responding normally.
10  Change the SYSTEM fan speeds [Factory only]: Option is unavailable.
11  Front panel LED exercise: Exercises the front panel LEDs by changing patterns in the displays. You need to observe the LEDs blinking to verify that they are working.
12  Front panel LCD exercise: Exercises the front panel LCD by changing patterns in the display. You need to observe the LCD to verify that it is working.
13  Test PCI devices [Factory only]: Option is unavailable.
14  FRU LED exercise: Exercises the front panel LEDs by changing patterns in the displays. You need to observe the LEDs blinking to verify that they are working.
41  Check watchdog interrupt: Checks that the watchdog interrupt is working.
42  NMI Dump Switch Test: Within two minutes of selecting this test, you must press the NMI switch below the front panel. You will then get a confirmation message.
71  Show PCI configuration: Shows the configuration of the Peripheral Component Interconnect (PCI), a peripheral bus.
72  Show detailed PCI info: Shows detailed information about the PCI devices on the various PCI buses.
73  Initialize realtime clock: Initializes the battery-powered real-time clock.
74  Toggle front panel LEDs: Verifies that the front panel activity and status LEDs are working by turning them ON/OFF or changing colors.
75  Margins menu [Factory only]: Option is unavailable.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Gigabit diagnostics
About the Gigabit diagnostic tests
This section describes the onboard Gigabit Ethernet (GbE) test submenu. The GbE diagnostic tests can generate error messages associated with the hardware and software.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode.
Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
Extended test mode: Tests data transfer between memory and the Ethernet chip on the 10Base-T/100Base-TX interface, involving loopback over connected wire. Also tests overall Ethernet functionality. Requires loopback plug.
Tests the data path from one channel to the other on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
70  Display MAC address: Verifies and displays the MAC address of the card.
71  Display all registers: Displays all the card memory registers.
72  Display all stats counters: Displays all the card statistics.
73  Dump EEPROM: Displays the EEPROM data.
74  Set MAC address [Factory only]: Option is unavailable.
75  EEPROM firmware update [Factory only]: Option is unavailable.
76  Set IO board FRU information [Factory only]: Option is unavailable.
77  Show IO board FRU information: Displays the IO board FRU information.
90  BGE card selection: Enables you to select the onboard Ethernet port for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Return to main menu: Returns you to the main Diagnostics menu.
Motherboard menu and submenus
5  Bus reset test [Xtnd]: Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6  Ext loop test [Xtnd]: Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7  Read-only bus test: Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8  Read/write bus test [Mfg]: Option not available.
9  Disk read test (FCTEST): Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10  Disk read/write test [Mfg]: Option not available.
41  Scan all disks on all FC-AL adapters: Lists the status of all the disks on all FC-AL adapters on the storage system. Requires disks attached to the FC host adapter.
42  Scan and show disks on selected FC-AL adapters: Lists the status of all the disks on the specified FC-AL adapters. Requires disks attached to the FC host adapter.
43  FC-AL adapter LED test: Tests the external LEDs on the FC-AL card.
44  FC-AL initiator/target test: Tests the mode (target or initiator) of the FC-AL.
71  Show ISP FC chip info: Displays information about the ISP Fibre Channel chip.
72  Show attached FC-AL devices: Displays all devices attached to a specific FC-AL adapter.
73  Show all disks (probe-scsi-all): Lists disk information for all disks attached to the system.
74  Reset FC-AL adapter: Resets the selected FC-AL adapter to its original state.
75  Show serial EEPROM data: Displays the serial EEPROM data.
76  Program serial EEPROM data [Factory]: Option not available.
77  Display fcstat link_status: Displays the link statistics maintained for all drives on a Fibre Channel loop.
80  Go to disk diagnostic menu: Accesses the disk bus pattern diagnostics submenu.
81  Go to shelf diagnostics menu: Accesses the disk shelf diagnostics submenu.
83  Set onboard Fcal FRU information [Factory only]: Option is unavailable.
84  Show onboard Fcal FRU information
85  Show onboard Fcal WWN: Displays the onboard Fibre Channel port's World Wide Name.
90  FC-AL channel selection: Enables you to select a specific FC-AL interface for testing.
91  Enable/disable looping: Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92  Stop/continue looping on error: Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93  Extended/normal test mode: Enables or disables extended mode on tests where extended mode is an available option.
99  Exit: Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Motherboard diagnostics
Main memory diagnostics
Test no | Test | Description
 | | Runs a short test, walking a byte of ones through a field of 64 words of zeros. Test is repeated with complemented data.
 | | Runs a short test, walking a byte of ones through a field of 64 words of zeros. Test is repeated with complemented data.
 | | Tests intermixed words, half-words, and bytes to verify ability of memory/CPU to merge data.
 | Byte patterns test (0 sec) | Spins through all 256 possible data patterns within each byte of a long word, one byte at a time.
 | RAS/CAS corners test (0 sec) | Runs a quick test between several locations that cause maximum change in the Row Address, Column Address, and RAS/CAS line.
 | Random read/write test (128 sec) | Randomly reads or writes memory locations and tests memory controller sequencing.
 | Alternating address test (256 sec) | Tests even and odd addresses, stressing PC byte marks.
 | Random data test (0 sec) | Runs a longer test, placing random data in every location. Tests DRAM cell verification.
 | Random address test (768 sec) | Runs a longer test, generating random addresses for reading and writing. Stresses DRAM addressing. Longer option also available for a test that quietly reads all memory locations.
 | MP memory test (2176 sec) | Multiprocessor memory test.
 | Large memory VM test (0 sec) | A fixed pattern test that is performed and verified on platforms with memory equal to or greater than 4 GB.
 | Fill memory with data pattern | Enables you to input data pattern and memory range.
43 | Check memory with data pattern | Verifies the data pattern and memory range specified in Test 42.
44 | Log2 patterns test (640 sec) | Runs a longer test of a set of log2-based (binary) data patterns.
45 | Parity/ECC bits test (3200 sec) | Runs a longer test to verify that each bit of a byte can propagate into the parity/ECC term.
49 | Qualification scope loop | Initializes a memory region with a data pattern.
71 | Read all locations | Reads through all memory locations, looking for errors. Gives a checksum at the end. You can run this test twice to compare the checksums.
72 | Dump from specified address | Enables you to set hexadecimal base addresses for the memory tests. You can repeat this test to confirm whether checksums for both tests are the same.
73 | Set test address range | Enables you to set the memory range for testing. The default range is the entire testable address space.
74 | Show memory size and test range | Displays memory size and test range.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
95 | Enable/disable cache | Enables or disables caching on the system.
99 | Exit | Exits this diagnostics menu.
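The walking-byte tests in this menu can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the Diagnostic Monitor's implementation: memory is modeled as a Python list, the 32-bit word width is assumed, and the name walking_byte_test is hypothetical. Only the field size of 64 words and the repeat with complemented data come from the test description.

```python
WORD_BITS = 32                      # assumed word width, for illustration only
WORD_MASK = (1 << WORD_BITS) - 1
FIELD_WORDS = 64                    # field size taken from the test description


def walking_byte_test(memory, complemented=False):
    """Walk a byte of ones through a field of zeros and return mismatches.

    `memory` is any mutable sequence holding at least FIELD_WORDS integers.
    With complemented=True, a byte of zeros walks through a field of ones.
    Each mismatch is reported as (address, expected, actual).
    """
    background = WORD_MASK if complemented else 0
    errors = []
    for word in range(FIELD_WORDS):
        for byte in range(WORD_BITS // 8):
            pattern = 0xFF << (8 * byte)
            if complemented:
                pattern = ~pattern & WORD_MASK
            # Reset the whole field, then place the walking byte.
            for addr in range(FIELD_WORDS):
                memory[addr] = background
            memory[word] = pattern
            # Read back every word; only the target word may differ.
            for addr in range(FIELD_WORDS):
                expected = pattern if addr == word else background
                if memory[addr] != expected:
                    errors.append((addr, expected, memory[addr]))
    return errors
```

Running the function once with complemented=False and once with complemented=True mirrors the "repeated with complemented data" step. An empty result means every word read back as written.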
Main memory diagnostics
For FAS20xx/SA200
Main memory diagnostic menu
The following table describes the tests in the menu:
Test no | Test | Description
1 | Comprehensive memory test | Runs all tests in this menu in current mode. This test takes approximately 750 seconds on the FAS2020 and approximately 2016 seconds on the FAS2050.
2 | Walking data bits test | Verifies the data path between the CPU and memory. Runs a quick check of all data lines.
3 | Walking address test | Verifies address paths in memory. Runs a quick test of all address lines, up to size of memory.
4 | Stuck faults test | Scans memory to check for stuck bits, either 1 or 0.
5 | Walking data words test | Runs a short test, walking a byte of ones through a field of 64 words of zeros. Test is repeated with complemented data.
6 | Walking data bytes test | Runs a short test, walking a byte of ones through a field of 64 words of zeros. Test is repeated with complemented data.
7 | Partial words test | Tests intermixed words, half-words, and bytes to verify ability of memory/CPU to merge data.
8 | Byte patterns test | Spins through all 256 possible data patterns within each byte of a long word, one byte at a time.
9 | RAS/CAS corners test | Runs a quick test between several locations that cause maximum change in the Row Address, Column Address, and RAS/CAS line.
10 | Random read/write test | Randomly reads or writes memory locations and tests memory controller sequencing.
11 | Alternating address test | Tests even and odd addresses, stressing PC byte marks.
12 | Random data test | Runs a longer test, placing random data in every location. Tests DRAM cell verification.
13 | Random address test | Runs a longer test, generating random addresses for reading and writing. Stresses DRAM addressing. Longer option also available for a test that quietly reads all memory locations.
42 | Fill memory with data pattern | Enables you to input data pattern and memory range.
43 | Check memory with data pattern | Verifies the data pattern and memory range specified in Test 42.
44 | Log2 patterns test | Runs a longer test of a set of log2-based (binary) data patterns.
45 | Parity/ECC bits test | Reads all of memory looking for ECC errors.
49 | Qualification scope loop | Initializes a memory region with a data pattern.
71 | Read all locations | Reads through all memory locations, looking for errors. Gives a checksum at the end. You can run this test twice to compare the checksums.
72 | Dump from specified address | Enables you to set hexadecimal base addresses for the memory tests. You can repeat this test to confirm whether checksums for both tests are the same.
73 | Set test address range | Enables you to set the memory range for testing. The default range is the entire testable address space.
74 | Show memory size and test range | Displays memory size and test range.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
95 | Enable/disable cache | Enables or disables caching on the system.
99 | Exit | Exits this diagnostics menu.
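The interaction between options 91 (looping) and 92 (stop or continue on error) can be easier to follow as control flow. The sketch below is one reading of the option descriptions, not the monitor's actual logic: the function name run_diagnostic, the boolean test callable, and the max_passes cap (added so the example terminates) are all assumptions.

```python
def run_diagnostic(test, looping=False, stop_on_error=True, max_passes=1000):
    """Run `test` once, or repeatedly when looping is enabled.

    `test` is a callable that returns True on pass and False on error.
    Returns a (passes_run, errors_seen) tuple.
    """
    passes = 0
    errors = 0
    try:
        while True:
            passes += 1
            if not test():
                errors += 1
                if stop_on_error:
                    break           # option 92 set to stop on error
            if not looping or passes >= max_passes:
                break               # option 91 disabled, or illustrative cap hit
    except KeyboardInterrupt:
        pass                        # Ctrl-C ends a looping run
    return passes, errors
```

With looping enabled and stop_on_error disabled, the run keeps going after a failure and only counts it, which matches the statement that looping continues after an error is encountered.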
Main memory diagnostics
FAS200 series
Rules for running main memory diagnostics with the FAS200 series
Observe the following rules when you run the mem diagnostic on the FAS200 series:
Do not run this diagnostic immediately after a system crash.
Be aware that mem diagnostics overwrite all contents of main memory and NVRAM.
Before you run mem diagnostics, reboot and shut down the system.
Heed the following warning:
WARNING! Do not run the NVMEM diagnostic immediately after a system crash or if there is a possibility that log data is stored. Run only on new boards, or after a normal system shutdown, or if there is no chance of preserving customer data.
Test no | Test | Description
 | test (23 sec) |
12 | Random data test (34 sec) | Runs a longer test, placing random data in every location. Tests DRAM cell verification.
13 | Random address test (13 sec) | Runs a longer test, generating random addresses for reading and writing. Stresses DRAM addressing. Longer option also available for a test that quietly reads all memory locations.
14 | MP memory test (14 sec) | Option not available.
42 | Fill memory with data pattern | Enables you to input data pattern and memory range.
43 | Check memory with data pattern | Verifies the data pattern and memory range specified in Test 42.
44 | Log2 patterns test (28 sec) | Runs a longer test of a set of log2-based (binary) data patterns.
45 | Parity/ECC bits test (90 sec) | Runs a longer test to verify that each bit of a byte can propagate into the parity/ECC term.
71 | Read all locations | Reads through all memory locations, looking for errors. Gives a checksum at the end. You can run this test twice to compare the checksums.
 | | Enables you to set hexadecimal base addresses for the memory tests. You can repeat this test to confirm whether checksums for both tests are the same.
 | | Enables you to set the memory range for testing. The default range is the entire testable address space.
 | Show memory size and test range | Displays memory size and test range.
 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
 | Enable/disable cache | Enables or disables caching on the system.
 | Exit | Exits this diagnostics menu.
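Tests 42 (fill memory with data pattern) and 43 (check memory with data pattern) work as a pair: one writes a chosen pattern over a chosen range, and the other verifies it. The sketch below illustrates that idea on a Python list standing in for memory. The function names, the half-open address range, and the list model are assumptions for illustration, not the diagnostic's interface.

```python
def fill_pattern(memory, start, end, pattern):
    """Write `pattern` to every word in the half-open range [start, end)."""
    for addr in range(start, end):
        memory[addr] = pattern


def check_pattern(memory, start, end, pattern):
    """Return the addresses in [start, end) that no longer hold `pattern`."""
    return [addr for addr in range(start, end) if memory[addr] != pattern]


# Example: fill words 4..11 of a 16-word model, then verify the same range.
memory = [0] * 16
fill_pattern(memory, 4, 12, 0xA5A5A5A5)
mismatches = check_pattern(memory, 4, 12, 0xA5A5A5A5)
```

The check only makes sense when it uses the same pattern and range as the preceding fill, which is why the description of Test 43 refers back to the values specified in Test 42.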
Card diagnostics
About card diagnostics
The card diagnostics are a collection of tests of the different cards that you can install in your storage system.
Card diagnostics
Test no | Test | Description
 | buffer info |
71 | Show RLM info | Displays the RLM serial number, revision, part number, and MAC address.
72 | Show Restart reason | Displays the reason the system was rebooted.
73 | Delete SEL [Mfg] | Option not available.
75 | Show Agent info | Displays information about the RLM agent ID, the firmware revision, the FIFO depth, the ring depth, and the maximum number of power supplies supported.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Continue/stop looping on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
Card diagnostics
71 74 90 91 92 93 99
Ethernet diagnostics
About the Ethernet diagnostic tests
This group of diagnostics tests the Ethernet functionality of the CNAs that are in your system. These diagnostic tests can generate error messages associated with the hardware, software, and user input.
FC-AL diagnostics
About the FC-AL diagnostic tests
The FC-AL (Fibre Channel Arbitrated Loop) group of diagnostics tests the functioning of the Fibre Channel arbitrated loop adapters that are in your system. The tests range from EEPROM data verification through data transfer integrity testing. The FC-AL diagnostic tests can generate error messages associated with the interface and disk shelf. To perform disk or shelf diagnostics, select test 90 and identify the channel. This returns you to the main FC-AL menu. Then select test 80 or 81.
Note: If you alter disks or cabling on a loop adapter, you must run either Test 41 or Test 42 before running any FC-AL test. If you change a multiple loop adapter, run Test 41. If you change a single loop adapter, run Test 42.
Caution: Running Fibre Channel diagnostics has the following limitations:
The tests do not support switches, media changers, connections to the SAN environment, or Fibre Channel adapters functioning in target mode.
The external loopback test does not detect the presence or absence of a loopback device before running, nor does it distinguish an open loop from a broken cable. For example, the external loopback test fails if no loopback is attached. You are responsible for correctly attaching the loopback device and enabling external loopback tests.
Running diagnostics on Multipath High Availability nodes: If you are running tests 41, 42, 73, or option 4 of test 81 on a node in a Multipath High Availability cabled pair of nodes, verify that one of the following is true:
The partner node is at the CFE or Loader boot prompt.
The partner node is powered off.
The multipath high availability cabling for both nodes has been removed such that the appliance is only responsible for its own storage and not that of its partner.
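The rescan rule and the Multipath High Availability precondition above amount to two small decisions, which the following sketch writes out as a pre-flight checklist. It is only an illustration: the names PartnerState, rescan_test, and safe_on_multipath_ha are hypothetical, the "81:4" label for option 4 of test 81 is an invented shorthand, and nothing here queries real hardware.

```python
from enum import Enum


class PartnerState(Enum):
    BOOT_PROMPT = "at the CFE or Loader boot prompt"
    POWERED_OFF = "powered off"
    RUNNING = "running"


# Tests the guide singles out for Multipath HA pairs ("81:4" = option 4 of test 81).
MULTIPATH_SENSITIVE = {"41", "42", "73", "81:4"}


def rescan_test(multiple_loop_adapter):
    """Pick the scan to run after disks or cabling change on a loop adapter."""
    return 41 if multiple_loop_adapter else 42


def safe_on_multipath_ha(test_id, partner, cabling_removed=False):
    """Return True if the stated precondition for `test_id` is satisfied."""
    if test_id not in MULTIPATH_SENSITIVE:
        return True  # the rule above places no condition on other tests
    return cabling_removed or partner in (
        PartnerState.BOOT_PROMPT,
        PartnerState.POWERED_OFF,
    )
```

For example, safe_on_multipath_ha("73", PartnerState.RUNNING) returns False, signaling that the partner node should be halted or powered off, or the multipath cabling removed, before test 73 is run.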
Test no | Test | Description
4 | Int loop test | Tests data movement between main memory and the FC-AL chip, using on-chip loopback capability for 10 bit and 1 bit.
5 | Bus reset test [Xtnd] | Extended test mode: Tests the FC-AL loop integrity and LRC functionality by resetting the bus.
6 | Ext loop test [Xtnd] | Extended test mode: Tests the functionality and data movement between memory and FC-AL cable. Requires loopback plug.
7 | Read-only bus test | Tests the FC-AL loop integrity by reading from each disk attached to the FC-AL interface.
8 | Read/write bus test [Mfg] | Option not available.
9 | Disk read test (FCTEST) | Tests the FC-AL adapter loop integrity by reading from each disk attached to the FC-AL onboard interface. This test has optional parameters. Requires disks attached to the FC host adapter.
10 | Disk read/write test [Mfg] | Option not available.
41 | Scan all disks on all FC-AL adapters | Lists the status of all the disks on all FC-AL adapters on the storage system. Requires disks attached to the FC host adapter.
42 | Scan and show disks on selected FC-AL adapters | Lists the status of all the disks on the specified FC-AL adapters. Requires disks attached to the FC host adapter.
43 | FC-AL adapter LED test | Tests the external LEDs on the FC-AL card.
71 | Show ISP FC chip info | Displays information about the ISP Fibre Channel chip.
72 | Show attached FC-AL devices | Displays all devices attached to a specific FC-AL adapter.
73 | Show all disks (probe-scsi-all) | Lists disk information for all disks attached to the system.
74 | Reset FC-AL adapter | Resets the selected FC-AL adapter to its original state.
75 | Show serial EEPROM data | Displays the serial EEPROM data.
76 | Program serial EEPROM data [Factory] | Option not available.
77 | Display fcstat link_status | Displays the link statistics maintained for all drives on a Fibre Channel loop.
80 | Go to disk diagnostic menu | Accesses the disk bus pattern diagnostics submenu.
81 | Go to shelf diagnostics menu | Accesses the disk shelf diagnostics submenu.
90 | FC-AL channel selection | Enables you to select a specific FC-AL card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue looping on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Test | Description
 | Turns on the drive LEDs on the target disk shelf.
 | Turns off the drive LEDs on the target disk shelf.
 | Displays the list of disk shelves and their firmware revisions on the target FC-AL card.
Get shelf drive map | Displays the list of drives on the disk shelves of the target FC-AL card.
Get shelf environment information | Displays the environmental parameters for the disk shelves on the target FC-AL card.
Check SES temperature sensors | Checks SES temperature sensors against threshold value.
Check SES FANs | Checks SES fan status.
Check SES Power Supply | Checks SES power supply status.
Check SES ESH (HUB) | Checks SES HUB status on the ESH.
Check all SES elements | Checks status of all SES elements in the shelf.
Loop integrity/LRC test [Xtnd] | Extended test mode: Tests the FC-AL loop integrity and LRC functionality.
Show HUB status | Displays status of each port in the HUB for each ESH module.
Display sector size for FC-AL devices | Displays the sector size for the drives on the disk shelves.
Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
Exit this menu | Returns the user to the main FC-AL menu.
Gigabit diagnostics
About the Gigabit diagnostic tests
The Gigabit group of diagnostics tests the functioning of the Gigabit Ethernet (GbE) cards that are in your system. The tests range from a status check of the card to the testing of data movement through the system while the GbE card is being used. The GbE diagnostic tests can generate error messages associated with the hardware and software. Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Type xtnd n to cancel Extended test mode. Caution: Disconnect all network connections prior to running network diagnostics. Running with attached networks can adversely affect other attached devices.
5 6 7 8 9 10
External lp test 1G (Xtnd) Internal lp test 10B Internal lp test 100B External lp test 10B (Xtnd) External lp test 100B (Xtnd) Interrupt test
Extended test mode: Tests card functionality and data movement between memory and the Ethernet cable. Requires loopback plug.
Tests the transmit and receive interrupts to verify the device's ability to generate interrupts and the system's ability to handle interrupts correctly. Tests and verifies that all the device interrupts are working. Data is not transferred during this test. Tests the data path from the transmitter to the receiver before the data reaches the MAC. Note: If your system is running an Intel Copper GbE card, it requires a loopback plug.
41 Port-port 10B test (Xtnd)
42 Port-port 100B test (Xtnd)
43 Port-port 1G test (Xtnd)
Tests the data path from one channel to another on dual-channel NICs. Requires a twisted-pair network cable connected between the two ports.
Test no | Test | Description
70 | Display MAC address | Verifies and displays the MAC address of the card.
71 | Display all registers | Displays all the card memory registers.
72 | Display EEPROM | Displays the EEPROM data on the GbE card.
73 | Set MAC address [Factory] | This test is unavailable.
90 | GbE card selection | Enables the selection of a specific GbE card in the system.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue looping on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
IPSec diagnostics
About the IPSec diagnostic tests
The IPSec group of diagnostics tests the functioning of the Internet Protocol Security (IPSec) card that is in your system. The IPSec diagnostic tests can generate error messages associated with the hardware.
iSCSI diagnostics
About the iSCSI diagnostic tests
The iSCSI group of diagnostics tests the functioning of the iSCSI cards that are in your system. The tests range from a status check of the card to the testing of data movement through the system while the iSCSI card is being used. The iSCSI diagnostic tests can generate error messages associated with the hardware.
Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs.
Caution: Do not run [Xtnd] mode diagnostics on network adapter cards with live network connections. Disconnect all network connections prior to running network diagnostics in [Xtnd] mode. Running with attached networks can adversely affect other attached devices.
Type xtnd n to cancel Extended test mode.
Extended test mode: Tests card functionality and data movement between memory and the Ethernet cable. Requires loopback plug.
Displays information about the iSCSI chip. Resets the selected iSCSI adapter to its original state. Enables the selection of a specific iSCSI card in the system. Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active. Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered. Enables or disables extended mode on tests where extended mode is an available option. Exits this diagnostics menu.
NVRAM diagnostics
About the NVRAM diagnostic test menu
The NVRAM group of diagnostics tests the functioning of the NVRAM cards in the system, including PCI connectivity, data transfer, and data registers. In addition, the NVRAM diagnostics, together with other tests, run a set of memory tests on the NVRAM board. These memory tests focus on the memory module strips plugged into the cards. The NVRAM diagnostic tests can generate error messages associated with the hardware and user input.
NVRAM6 diagnostics
The following table describes the NVRAM6 test menu:
Test no | Test | Description
1 | Comprehensive NVRAM test | Runs all tests in current mode.
2 | NVRAM memory menu | Accesses the NVRAM memory menu.
3 | NVRAM IB menu | Accesses the IB menu, which tests the part of the adapter associated with clustering.
4 | NVRAM ECC menu [Xtnd] | Accesses the error correction code menu.
5 | NVRAM environmental test | Accesses the environmental test menu.
6 | NVRAM EEPROM test | Tests the NVRAM EEPROM subcomponent.
7 | NVRAM FLASH test | Tests the NVRAM FLASH subcomponent.
8 | NVRAM i2c test | Tests the NVRAM i2c bus.
70 | Set NVRAM properties [Mfg only] | Option not available.
71 | Display NVRAM properties | Displays information about the NVRAM6 adapter.
72 | Display NVRAM EEPROM | Displays information about the NVRAM6 Electrically Erasable Programmable Read Only Memory (EEPROM) contents.
73 | Display NVRAM status | Displays information about the NVRAM6 status.
74 | Display NVRAM config space | Displays information about the NVRAM6 configuration space.
76 | Upgrade NVRAM firmware [Xtnd] | Extended test mode: Updates the firmware on the NVRAM6.
77 | Clear NVRAM properties [Mfg only] | Option not available.
90 | NVRAM card selection | Enables the selection of a specific NVRAM card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
NVRAM6 diagnostics
Memory diagnostics
The following table describes the NVRAM6 memory test menu:
Test no | Test | Description
1 | Comprehensive NVRAM memory test | Runs all tests in current mode.
2 | NVRAM memory walking data test | Runs quick test of data lines.
3 | NVRAM memory walking address test | Runs quick test of all address lines to verify address paths in NVRAM memory.
4 | NVRAM memory partial word test | Tests intermixed data sizes.
5 | NVRAM memory random data test | Runs longer test by writing and reading random data to all NVRAM locations.
6 | NVRAM memory random address test | Runs longer test using random addresses.
10 | NVRAM DIMM SPD test | Compares NVRAM DIMM properties (SPD) against supported values.
11 | Display NVRAM DIMM SPD | Displays NVRAM DIMM properties (SPD) as field-value pairs.
12 | Dump NVRAM DIMM SPD | Displays NVRAM DIMM properties (SPD) as a hexadecimal dump.
20 | Inject ECC errors [Xtnd only] | Extended test mode: Injects ECC errors into the NVRAM DIMM, without triggering detection.
21 | Inject/read ECC errors [Xtnd only] | Extended test mode: Injects ECC errors into the NVRAM DIMM, and then triggers detection.
50 | NVRAM DMA Write-Read-Verify | Fills system memory with a random data pattern, and then DMA transfers this pattern back-and-forth from NVRAM memory.
51 | NVRAM DMA Write-only | Fills system memory with a random data pattern, and then DMA transfers this pattern to NVRAM memory.
52 | NVRAM DMA Read-only | Fills NVRAM memory with a random data pattern, and then DMA transfers this pattern to system memory.
70 | NVRAM memory dump | Allows the user to dump a region of memory.
71 | NVRAM memory poke | Allows the user to write to a region of memory.
72 | NVRAM memory custom pattern | Fills NVRAM memory with a user-specified data pattern.
74 | Memory fill power cycle test | Fills NVRAM memory with data patterns for power cycle test.
75 | Memory write power cycle test | Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76 | Memory read power cycle test | Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77 | Memory DMA write power cycle test | Fills NVRAM memory with data patterns for power cycle test, which does DMA writes.
78 | Verify data retention | Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
80 | Memory class change [Mfg only] | Option not available.
90 | NVRAM card selection | Enables the selection of a specific NVRAM card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
NVRAM6 IB diagnostics
The following table describes the tests in the NVRAM6 IB diagnostic test menu:
Test no | Test | Description
1 | Comprehensive NVRAM cluster test | Runs all tests in current mode.
2 | Internal loopback RDMAW test | Tests remote direct memory access write (RDMAW) between host memory and NVRAM6 card, using on-chip loopback.
3 | Internal loopback send test | Tests data transfer between host memory and NVRAM6 card, using on-chip loopback.
4 | Link test [Xtnd] | Extended test mode: Verifies external link status. Point-to-point cable needed.
5 | External loopback RDMAW test [Xtnd] | Extended test mode: Tests remote direct memory access write (RDMAW) between host memory and NVRAM6 card, using external loopback. Point-to-point cable needed.
6 | External loopback send test [Xtnd] | Extended test mode: Tests data transfer between host memory and NVRAM6 card, using external loopback. Point-to-point cable needed.
70 | Reset port performance counter | Resets the counter on the performance of the cluster ports.
71 | Display port performance counter | Displays information about the performance of the cluster ports.
90 | NVRAM card selection | Enables the selection of a specific NVRAM card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
NVRAM6 diagnostics
Environmental diagnostics
The following table describes the NVRAM6 environmental test menu. The NVRAM6 environmental test can generate environmental error messages associated with the battery or the temperature sensors. The corrective action for this error message grouping follows the error message descriptions:
Test no | Test | Description
1 | Comprehensive NVRAM env test | Runs all tests in current mode.
2 | NVRAM env subsystem test | Tests the interrupt conditions for each sensor.
3 | NVRAM battery test | Tests the battery.
4 | NVRAM charger test | Tests the battery charger.
70 | GPIO bit control | Allows the user to toggle the general purpose IO lines.
71 | GPIO dump | Dumps the settings of the general purpose IO lines.
72 | Turn battery on | Turns on the battery.
73 | Turn charger on | Turns on the battery charger.
74 | LM81 I2C dump | Allows the user to read the devices on the NVRAM board.
75 | LM81 I2C write | Allows the user to write to the devices on the NVRAM board.
76 | Force GPIO interrupt | Forces an interrupt from the NVRAM board through the general purpose IO line.
77 | Charge Battery | Charges the NVRAM battery to a user-specified voltage.
78 | Discharge Battery | Discharges the NVRAM battery to a user-specified voltage.
90 | NVRAM card selection | Enables the selection of a specific NVRAM card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
ENV01501x | NVRAM6-battery2-1
ENV01502x | NVRAM6-temperature-1
ENV01503x | NVRAM6-battery-2
ENV01504x | NVRAM6-battery2-2
ENV01505x | NVRAM6-temperature-2
ENV01506x | NVRAM6-battery-5
ENV01507x | NVRAM6-battery2-5
ENV01508x | NVRAM6-temperature-5
ENV01509x | NVRAM6-battery-6
ENV01510x | NVRAM6-battery2-6
ENV01511x | NVRAM6-temperature-6
ENV01512x | NVRAM6-battery-7
ENV01513x | NVRAM6-battery2-7
ENV01514x | NVRAM6-temperature-7
ENV01515x | NVRAM6-battery-8
ENV01516x | NVRAM6-battery2-8
ENV01517x | NVRAM6-temperature-8
Corrective action
The following table lists the error message groupings and the corrective action that can be taken for each error code range:
Error code range
ENV01500x - ENV01501x, ENV01503x - ENV01504x, ENV01506x - ENV01507x, ENV01509x - ENV01510x, ENV01512x - ENV01513x, ENV01515x - ENV01516x
ENV01502x, ENV01505x, ENV01508x, ENV01511x, ENV01514x, ENV01517x
Corrective action
1. Verify that the NVRAM6 battery is connected.
2. Call NetApp Technical Support if the error is not corrected.
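The NVRAM6 environmental error codes listed above follow a regular pattern: the two digits after ENV015 cycle through the battery, battery2, and temperature sensors, and every third code moves to the next sensor suffix (1, 2, 5, 6, 7, 8). The sketch below decodes a code using that observed pattern. It is an illustration only: the function name nvram6_sensor is hypothetical, the trailing character is treated as an opaque placeholder, and the result for ENV01500x (NVRAM6-battery-1) is extrapolated from the pattern rather than listed in this section.

```python
import re

SENSORS = ("battery", "battery2", "temperature")
SUFFIXES = (1, 2, 5, 6, 7, 8)   # sensor suffixes as they appear in the listing


def nvram6_sensor(code):
    """Map an ENV015nn error code to the NVRAM6 sensor name it appears to use."""
    match = re.fullmatch(r"ENV015(\d{2})\w?", code)
    if match is None:
        raise ValueError(f"not an NVRAM6 environmental code: {code!r}")
    index = int(match.group(1))
    if index >= len(SENSORS) * len(SUFFIXES):
        raise ValueError(f"code outside the listed range: {code!r}")
    sensor = SENSORS[index % len(SENSORS)]
    suffix = SUFFIXES[index // len(SENSORS)]
    return f"NVRAM6-{sensor}-{suffix}"


def is_temperature_code(code):
    """True for the codes that resolve to a temperature sensor."""
    return "-temperature-" in nvram6_sensor(code)
```

The temperature codes picked out by is_temperature_code are ENV01502x, ENV01505x, ENV01508x, ENV01511x, ENV01514x, and ENV01517x, the same set that forms the second error code grouping above.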
NVRAM5 diagnostics
The following table describes the NVRAM5 test menu:
Test no | Test | Description
1 | Comprehensive NVRAM test | Runs all tests in current mode.
2 | NVRAM memory menu | Accesses the NVRAM memory menu.
3 | NVRAM IB menu | Accesses the IB menu, which tests the part of the adapter associated with clustering.
4 | NVRAM ECC menu [Xtnd] | Accesses the error correction code menu.
5 | NVRAM environmental test | Accesses the environmental test menu.
6 | NVRAM EEPROM test | Tests the contents of the NVRAM5 EEPROM.
70 | Set NVRAM properties [Mfg only] | Menu not available.
71 | Display NVRAM properties | Displays information about the NVRAM5 adapter.
72 | Display NVRAM EEPROM | Displays information about the NVRAM5 Electrically Erasable Programmable Read Only Memory (EEPROM) contents.
73 | Display NVRAM status | Displays information about the NVRAM5 status.
74 | Display NVRAM config space | Displays information about the NVRAM5 configuration space.
76 | Upgrade NVRAM firmware [Xtnd] | Extended test mode: Updates the firmware on the NVRAM5.
90 | NVRAM card selection | Enables the selection of a specific NVRAM card for testing.
91 | Enable/disable looping | Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92 | Stop/continue on error | Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93 | Extended/normal test mode | Enables or disables extended mode on tests where extended mode is an available option.
99 | Exit | Exits this diagnostics menu.
NVRAM5 diagnostics
Memory diagnostics
The following table describes the NVRAM5 memory test menu:

Test no.  Test                               Description
1         Comprehensive NVRAM memory test    Runs all tests in current mode.
2         NVRAM memory walking data test     Runs quick test of data lines.
3         NVRAM memory walking address test  Runs quick test of all address lines to verify address paths in NVRAM memory.
4         NVRAM memory partial word test     Tests intermixed data sizes.
5         NVRAM memory random data test      Runs longer test by writing and reading random data to all NVRAM locations.
6         NVRAM memory random address test   Runs longer test using random addresses.
70        NVRAM memory dump                  Allows the user to dump a region of memory.
71        NVRAM memory poke                  Allows the user to write to a region of memory.
74        Memory fill power cycle test       Fills NVRAM memory with data patterns for power cycle test.
75        Memory write power cycle test      Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76        Memory read power cycle test       Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77        Memory DMA write power cycle test  Fills NVRAM memory with data patterns for power cycle test, which does DMA writes.
78        Verify data retention              Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
90        NVRAM card selection               Enables the selection of a specific NVRAM card for testing.
91        Enable/disable looping             Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error             Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode          Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                               Exits this diagnostics menu.
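This guide does not state which data patterns the walking data test uses. As an illustration of the general technique, the following Python sketch runs a common "walking ones" pattern against a simulated memory location. The function names, the 32-bit width, and the `write`/`read` callbacks are assumptions made for this example, not the diagnostic's actual implementation.

```python
from typing import Callable, Iterator, List, Tuple


def walking_ones(width: int = 32) -> Iterator[int]:
    """Yield patterns with a single 1 bit walking across the data lines."""
    for bit in range(width):
        yield 1 << bit


def walking_data_test(write: Callable[[int, int], None],
                      read: Callable[[int], int],
                      address: int = 0,
                      width: int = 32) -> List[Tuple[int, int]]:
    """Write each pattern to one address and read it back.

    Returns a list of (pattern_written, value_read) for every mismatch.
    """
    failures: List[Tuple[int, int]] = []
    for pattern in walking_ones(width):
        write(address, pattern)
        got = read(address)
        if got != pattern:
            failures.append((pattern, got))
    return failures
```

A data line that is stuck low shows up as exactly one failing pattern, which identifies the faulty bit position.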
NVRAM5 IB diagnostics
The following table describes the tests in the NVRAM5 IB diagnostic test:

Test no.  Test                                 Description
1         Comprehensive NVRAM cluster test     Runs all tests in current mode.
2         Internal loopback RDMAW test         Test remote direct memory access write (RDMAW) between host memory and NVRAM5 card, using on-chip loopback.
3         Internal loopback send test          Test data transfer between host memory and NVRAM5 card, using on-chip loopback.
4         Link test [Xtnd]                     Extended test mode: Verify external link status. Point to point cable needed.
5         External loopback RDMAW test [Xtnd]  Extended test mode: Test remote direct memory access write (RDMAW) between host memory and NVRAM5 card, using external loopback. Point to point cable needed.
6         External loopback send test [Xtnd]   Extended test mode: Test data transfer between host memory and NVRAM5 card, using external loopback. Point to point cable needed.
70        Reset port performance counter       Resets the counter on the performance of the cluster ports.
71        Display port performance counter     Displays information about the performance of the cluster ports.
90        NVRAM card selection                 Enables the selection of a specific NVRAM card for testing.
91        Enable/disable looping               Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error               Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode            Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                                 Exits this diagnostics menu.
NVRAM5 diagnostics
Environmental diagnostics
The following table describes the NVRAM5 environmental test menu. The NVRAM5 environmental test can generate environmental error messages associated with the battery or the temperature sensors. The corrective action for this error message grouping is below the error message description:

Test no.  Test                          Description
1         Comprehensive NVRAM env test  Runs all tests in current mode.
2         NVRAM env subsystem test      Tests the interrupt conditions for each sensor.
3         NVRAM battery test            Tests the battery.
4         NVRAM charger test            Tests the battery charger.
70        GPIO bit control              Allows the user to toggle the general purpose IO lines.
71        GPIO dump                     Dumps the settings of the general purpose IO lines.
72        Turn battery on               Turns on the battery.
73        Turn charger on               Turns on the battery charger.
74        LM81 I2C dump                 Allows the user to read the devices on the NVRAM board.
75        LM81 I2C write                Allows the user to write to the devices on the NVRAM board.
76        Force GPIO interrupt          Force an interrupt from the NVRAM board through the general purpose IO line.
90        NVRAM card selection          Enables the selection of a specific NVRAM card for testing.
91        Enable/disable looping        Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error        Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode     Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                          Exits this diagnostics menu.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:

Error code range: ENV011080 through ENV011089, ENV011100 through ENV011109
Corrective action:
1. Verify that the NVRAM5 battery is connected.
2. Call NetApp Technical Support if the error is not corrected.

Error code range: ENV011090 through ENV011099, ENV011090 through ENV011099
Corrective action: Call NetApp Technical Support if the error is not corrected.
NVRAM diagnostics
NVMEM diagnostics
The following table describes the NVMEM test menu for the FAS20xx:

Test no.  Test                                    Description
1         Comprehensive NVMEM test                Runs all tests in current mode.
2         Battery test                            Tests the battery.
71        Set battery armed                       Toggles between arming and disarming the battery.
75        Fill for power cycle test, burst write  Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76        Fill for power cycle test, burst read   Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77        Fill for power cycle test               Fills NVRAM memory with data patterns for power cycle test.
78        Verify data retention                   Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
82        Display from given address              Displays the contents of a memory address location.
91        Enable/disable looping                  Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error                  Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode               Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                                    Exits this diagnostics menu.
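The power cycle tests follow a fill-then-verify pattern: a fill test writes known data, the system is power cycled, and Test 78 checks that the data survived. The following Python sketch models that idea on a `bytearray`. The address-dependent pattern, the seed value, and the function names are hypothetical; the guide does not document the patterns the diagnostics actually write.

```python
from typing import List


def pattern_at(offset: int, seed: int = 0xA5) -> int:
    """Hypothetical address-dependent byte pattern (not the real one)."""
    return (offset ^ seed) & 0xFF


def fill(memory: bytearray, seed: int = 0xA5) -> None:
    """Write the pattern to every byte, as a fill test would before a power cycle."""
    for offset in range(len(memory)):
        memory[offset] = pattern_at(offset, seed)


def verify_retention(memory: bytearray, seed: int = 0xA5) -> List[int]:
    """Return the offsets whose contents no longer match the fill pattern."""
    return [offset for offset in range(len(memory))
            if memory[offset] != pattern_at(offset, seed)]
```

An empty result from `verify_retention` corresponds to a passing retention check; any returned offsets mark locations whose contents changed across the power cycle.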
The following table describes the NVMEM test menu for the FAS200:

Test no.  Test                                    Description
1         Comprehensive NVMEM test                Runs all tests in current mode.
2         Battery test                            Tests the battery.
71        Turn battery off                        Turns off the battery.
72        Turn charger on                         Turns on the battery charger.
75        Fill for power cycle test, burst write  Fills NVRAM memory with data patterns for power cycle test, which does burst writes.
76        Fill for power cycle test, burst read   Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
77        Fill for power cycle test               Fills NVRAM memory with data patterns for power cycle test.
78        Verify data retention                   Checks the retention of data in NVRAM after a power cycle. Data comes from data patterns entered in Test 75.
82        Display from given address              Displays the contents of a memory address location.
91        Enable/disable looping                  Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error                  Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode               Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                                    Exits this diagnostics menu.
Card diagnostics
64 Show FRU information 66 Show ECC Stats and Provides a summary of the ECC status. Summary 68 Show pass/fail status Provides a summary of the status. 69 Toggle Training Flag By default, after each comprehensive test, the DMA controller is retrained. Use this option to enable or disable this feature. 70 Toggle Voltage By default, after each comprehensive test, the voltage margin is cycled. Use this option
73 74 81 86
88 89 90 91 92 93 93 99
Enables the selection of a specific card for testing. Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active. Stop/continue on error Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered. Extended/normal test Enables or disables extended mode on tests where extended mode is an available option. mode Toggle detail mode Use this option to define the amount of error information detail required. Exit Exits this diagnostics menu.
Margining Set Bit Toggle Flag Show memory Set Address Range Toggle Show Summary Flag (global) Set Comprehensive Test Mask (global) Set Detail Dump Options (global) Memory card selection Enable/disable looping
to enable or disable this feature. Use this option to control the pattern sequencing in option 6. Displays the memory contents. Sets the memory address range for subsequent tests. Enables or disables the display of test summary information after each comprehensive loop. Enables the definition of which tests to be run in option 1. Fills NVRAM memory with data patterns for power cycle test, which does burst reads.
Card diagnostics
Displays the sensor and FPGA temperature. Displays the DIMM size, part number, and serial number. Extended mode: Updates the Primary EEPROM. The new FPGA bits will be loaded after a reboot. Displays the contents of firmware directory and allows the user the option to choose the FPGA file to load. Reload backup FPGA Extended mode: Sets up backup FPGA bits. The backup FPGA will be loaded after a [Xtnd only] reboot. Show FRU Displays the module name, part number, serial number, and revision. information Show BCH Stats and Provides a summary of the BCH status. Summary Clear Flash [Xtnd Extended mode: Disables the adapter and all adapter channels. Clears all elements of only] Flash cache. Show Flash Displays the flash memory hardware location. Designator Map
74        Show memory                           Displays the memory contents.
80        Show Address Range                    Displays the memory address range for subsequent tests.
81        Set Address Range                     Sets the memory address range for subsequent tests.
86        Toggle Show Summary Flag (global)     Enables or disables the display of test summary information after each comprehensive loop.
88        Set Comprehensive Test Mask (global)  Enables the definition of which tests to be run in option 1.
90        Memory card selection                 Enables the selection of a specific card for testing.
91        Enable/disable looping                Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue on error                Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode             Enables or disables extended mode on tests where extended mode is an available option.
93        Toggle detail mode                    Use this option to define the amount of error information detail required.
99        Exit                                  Exits this diagnostics menu.
Card diagnostics
SAS diagnostics
About the SAS diagnostic tests
The SAS (Serial Attached SCSI) group of diagnostics tests the functioning of the SAS interfaces that are in your system. The tests range from EEPROM data verification through data transfer integrity testing. The SAS diagnostic tests can generate error messages associated with the interface and disk shelf.

To perform disk or shelf diagnostics, select test 90 and identify the channel. This returns you to the main SAS menu. Then select test 80 or 81.

Note: Altering disks or cabling in a loop adapter requires you to perform either Test 41 or Test 42 before running any SAS test. If you change a multiple loop adapter, run Test 41. If you change a single loop adapter, run Test 42.
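The menu choices described above can be summarized as two small lookups. This Python sketch only restates the test numbers given in this section; the function names and the "multiple"/"single" and "disk"/"shelf" labels are hypothetical, and the sketch does not drive the Diagnostic Monitor itself.

```python
from typing import List

# Test to run after altering disks or cabling, keyed by adapter kind.
RESCAN_TEST = {"multiple": 41, "single": 42}

# Submenu entries on the main SAS menu.
SUBMENU = {"disk": 80, "shelf": 81}


def rescan_test(adapter_kind: str) -> int:
    """Return the test required before any SAS test after a hardware change."""
    return RESCAN_TEST[adapter_kind]


def submenu_sequence(target: str) -> List[int]:
    """Return the menu selections for disk or shelf diagnostics.

    Test 90 selects the SAS interface (you identify the channel when
    prompted), then test 80 or 81 opens the corresponding submenu.
    """
    return [90, SUBMENU[target]]
```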
          all SAS                              presence of disks.
42        Scan and show disks on selected SAS  Lists the status of all the disks on the specified SAS interface. Requires the presence of disks.
71        Show ISP SAS chip info               Displays information about the ISP SAS chip.
72        Show attached SAS devices            Displays all devices attached to a specific SAS interface.
73        Show all disks (probe-scsi-all)      Scan and List disk information for all disks attached to the system.
74        Reset SAS interface                  Resets the selected SAS interface to its original state.
76        Program onboard WWN [Factory]        Option not available.
78        Zeroing disk test [Mfg]              Option not available.
80        Go to disk diagnostic menu           Accesses the disk bus pattern diagnostics submenu.
81        Go to shelf diagnostics menu         Accesses the disk shelf diagnostics submenu.
90        SAS interface selection              Enables you to select a specific SAS interface for testing.
91        Enable/disable looping               Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active.
92        Stop/continue looping on error       Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered.
93        Extended/normal test mode            Enables or disables extended mode on tests where extended mode is an available option.
99        Exit                                 Exits this diagnostics menu.
Displays primary and grown defect list. This test has optional parameters. Requires disks attached to the FC host adapter. Returns the user to the main FC-AL menu.
Turns on the drive LEDs on the target disk shelf. Turns off the drive LEDs on the target disk shelf. Displays the list of disk shelves and their firmware revisions on the target FC-AL card. Get shelf drive map Displays the list of drives on the disk shelves of the target FC-AL card. Get shelf environment Displays the environmental parameters for the disk shelves on the target FCinformation AL card. Check SES temperature sensors Check SES temperature sensors against threshold value. Check SES FANs Check SES fan status. Check SES Power Supply Check SES Power Supply status. Check SES ESH (HUB) Check SES HUB status on the ESH. Check all SES elements Check status of all SES elements in the shelf. Loop integrity/LRC test [Xtnd] Extended test mode: Tests the FC-AL loop integrity and LRC functionality. Show HUB status Display status of each port in the HUB for each ESH module. Display sector size for FC-AL Displays the sector size for the drives on the disk shelves. devices Extended/normal test mode Enables or disables extended mode on tests where extended mode is an available option. Exit this menu Returns the user to the main FC-AL menu.
Card diagnostics
SCSI diagnostics
About the SCSI diagnostic tests
The Small Computer System Interface (SCSI) group of diagnostics tests the functioning of the SCSI adapters that are in your system. The tests range from checking firmware versions and disk access through Static Read Random Access Memory (SSRAM) and data transfer integrity. The SCSI diagnostic tests generate error messages associated with the adapter. Note: Tests that are labeled [Xtnd] often require loopback plugs for complete test operation and will indicate failures without these plugs. Caution: Do not run [Xtnd] mode diagnostics on network adapter cards with live network connections. Disconnect all network connections prior to running network diagnostics in [Xtnd] mode. Running with attached networks can adversely affect other attached devices. Type xtnd n to cancel Extended test mode.
3 4 5 6 7 71 72 74 75
SCSI interrupt test Read-only bus test [Xtnd] Read/write bus test [Mfg] Disk read test (FCTEST) Disk read/write test Show ISP chip info Show attached SCSI devices Reset SCSI adapter Show serial EEPROM data
76 Program serial EEPROM data [Factory] 78 Set serial # and revision [Factory] 79 Zero disk test area [Factory] 90 SCSI card selection 91 Enable/disable looping 92 Stop/continue looping on error 93 Extended/normal test mode 99 Exit
Option not available. Option not available. Option not available. Enables you to select a specific SCSI card for testing. Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active. Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered. Enables or disables extended mode on tests where extended mode is an available option. Exits this diagnostics menu.
Card diagnostics
8 70 90 91 92 93 99
External lp test (Xtnd) Dump Registers TOE card selection Enable/disable looping Stop/continue looping on error Extended/normal test mode Exit
CF card diagnostics
About the CF card diagnostic tests
The CF (CompactFlash) card group of diagnostics tests the functionality of the CompactFlash card that is in your system. Use these diagnostics for testing and verifying card data. The CF card diagnostic tests can generate error messages associated with the hardware.
Displays CF card information. Displays the contents of specific registers. Displays the contents of individual sectors that the user selects. Displays the value of checksum info. Enables or disables continuous running of a diagnostic test. The test is stopped when Ctrl-C is pressed or when an error is encountered if option 92 is active. Starts or stops running a diagnostic test on an error. If looping is enabled, as set by option 91, looping continues after an error is encountered. Exits this diagnostics menu.
Stress diagnostics
About the stress diagnostics
This section describes the stress diagnostic tests. They simulate heavy traffic on the storage system to identify malfunctioning components or those that might malfunction in the near future. The stressable devices displayed depend on the cards in the system.

FAS270c only: If you are running diagnostics on system module B and you responded that system module A is running Data ONTAP or Diagnostics, then only tests 1 and 3 are available for running.

Running diagnostics on Multipath High Availability nodes: If you are running diagnostics on a node in a Multipath High Availability cabled pair of nodes, verify that one of the following is true:
- The partner node is at the CFE or Loader boot prompt.
- The partner node is powered off.
- The multipath high availability cabling for both nodes has been removed such that the appliance is only responsible for its own storage and not that of its partner.
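The Multipath High Availability precondition is an "any one of three" check. The following Python sketch expresses it as a predicate; the `PartnerState` class and its field names are hypothetical and exist only to make the rule explicit.

```python
from dataclasses import dataclass


@dataclass
class PartnerState:
    """Hypothetical summary of the partner node's condition."""
    at_boot_prompt: bool        # partner is at the CFE or Loader boot prompt
    powered_off: bool           # partner is powered off
    mpha_cabling_removed: bool  # MPHA cabling removed for both nodes


def safe_to_run_diagnostics(partner: PartnerState) -> bool:
    """True when at least one of the three listed conditions holds."""
    return (partner.at_boot_prompt
            or partner.powered_off
            or partner.mpha_cabling_removed)
```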
Error Messages
About this section
This section defines the coding conventions used, lists and defines the error messages generated by the diagnostic tests, and recommends the corrective action to address errors you encounter.
Note: Error codes of the type "ENVxxxxxx" indicate that an environmental error code was generated. These codes, along with the corrective action, are listed in Environmental Error Messages.

Module code
The module code identifies the software driver, hardware adapter, or firmware for which the error is generated. Typically, the hardware error messages generated by the diagnostic tool are associated with the diagnostic kernel system code. Also generated by the diagnostics are Data ONTAP kernel and firmware error messages. Only the diagnostic kernel messages are documented in this section. The types of diagnostic kernel module codes are shown in the following table.

Module code letter  Module generating the error
B                   Hardware bridges
C                   CNA errors
G                   GbE adapters, iSCSI adapters, and TCP Offload Engine (TOE) adapter
H                   Disk shelf
I                   Performance Accelerator module and Flash Cache module
L                   FC-AL adapters
M                   Memory and onboard SIMMs
N                   NVRAM
P                   CompactFlash unit
R                   Remote management card (RMC) and Remote LAN Management
S                   SCSI adapters
T                   Baseboard management controller
Z                   Motherboard and backplane adapters

Type code
The last letter of the error message code identifies the probable error type; what caused the error to be generated. The types of probable type codes are shown in the following table:

Type code letter  Type generating the error
H                 Hardware card or adapter
S                 Software error
U                 User error
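The module and type letters can be decoded mechanically. The following Python sketch assumes, based on the codes that appear in this section (for example DGH0147 or DCU0500), that a diagnostic kernel code is the letter D, a module code letter, a type code letter, and four digits. That layout and the `decode` function name are assumptions for illustration; only the letter tables come from this guide.

```python
import re

MODULES = {
    "B": "Hardware bridges",
    "C": "CNA errors",
    "G": "GbE adapters, iSCSI adapters, and TCP Offload Engine (TOE) adapter",
    "H": "Disk shelf",
    "I": "Performance Accelerator module and Flash Cache module",
    "L": "FC-AL adapters",
    "M": "Memory and onboard SIMMs",
    "N": "NVRAM",
    "P": "CompactFlash unit",
    "R": "Remote management card (RMC) and Remote LAN Management",
    "S": "SCSI adapters",
    "T": "Baseboard management controller",
    "Z": "Motherboard and backplane adapters",
}

TYPES = {
    "H": "Hardware card or adapter",
    "S": "Software error",
    "U": "User error",
}

# Assumed layout: "D" + module letter + type letter + four digits.
CODE_RE = re.compile(r"^D([A-Z])([A-Z])(\d{4})$")


def decode(code: str) -> dict:
    """Split an error code into its module, type, and number."""
    if code.startswith("ENV"):
        # Environmental codes are documented in Environmental Error Messages.
        return {"kind": "environmental", "code": code}
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognized error code: {code!r}")
    module, type_letter, number = match.groups()
    return {
        "kind": "diagnostic",
        "module": MODULES.get(module, "unknown"),
        "type": TYPES.get(type_letter, "unknown"),
        "number": int(number),
    }
```

For example, `decode("DCU0500")` reports a CNA module with a user error type, which matches the corrective action for that code (entering a valid slot number).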
This error message can appear with a variety of cards. Check the test that yielded this error message to determine which bridge is faulty. Device ID incorrect Incorrect bridge chip device ID found during testing. Base class incorrect Incorrect bridge chip base class detected during testing. Subclass incorrect Incorrect bridge chip subclass detected during testing. Incorrect revision number Incorrect bridge chip revision number detected during testing. Bridge at bus hex_value, slot Might indicate an error in the bridge chip, or more likely, a problem with a hex_value has error device on the bus managed by the bridge. The following bridge error Internal bridging software error detected during testing. bits could not be located No CIOB found on The motherboard does not have the CIOB. motherboard.
Corrective action
To correct the displayed error, replace the card or contact NetApp Technical Support.
DBS0307
Message type
This error message grouping covers software errors associated with the bridge cards that are in the storage system. These errors are generated when you run the Option 2: Check CPU/Hostbridge CNB20HE status option and Option 3: Check SIO (OSb4) status option from the Motherboard menu.
Corrective action
Report this error to NetApp Technical Support for analysis.
Adapter link initialization failed Adapter interrupt timed out Invalid adapter state Adapter reset failed Command request failed Command request timed out Adapter self test failed Loop is open Adapter mailbox command register test failed DCH0010 Loopback pattern match failed DCH0011 Adapter login timed out DCH0012 IDC AE timed out
Failed to match incoming and outgoing patterns during loopback test. Adapter FW failed to perform internal login. Adapter FW failed to process Internal Driver Communication Asynchronous Event. DCH0013 FW DCBX AE timed out Adapter FW failed to process internal Data Center Bridging Exchange Asynchronous Event. DCH0014 Specified adapter not found The adapter was functioning but is no longer accessible. DCH0015 Unexpected number of ports found Potential adapter FW issue with onboard ports.
3. Repeat the test. DCH0005-DCH0007 DCH0011-DCH0012 DCH0010, DCH0013 1. Reset the adapter . 2. Repeat the test. 1. Check SFP(s) and loopback adapters are connected. 2. Reset the adapter. 3. Repeat the test. DCH0015 1. Verify a supported FCoE adapter is available. 2. Reset the adapter. 3. Repeat the test.
Corrective action
Check the cable, make sure there is loopback plug when running external loopback tests. Call technical support if the error is not corrected.
Corrective action
Reset the adapter and repeat the test.
Corrective action
Replace the CNA card. Call technical support if the error is not corrected because there is a possible software issue.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group  Corrective action
DCU0500              Enter valid slot number.
DCU0501              Enter valid port number.
Selected slot does not have a GbE card. Number of cards initialized does not match those found. Reported by the stress diagnostics. Initialization of the card failed.
Corrective action
Replace the GbE card or contact NetApp Technical Support.
Transmit error.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group               Corrective action
DGH0140-DGH0146, DGH0148-DGH0149  Replace the GbE card or contact NetApp Technical Support.
DGH0147                           1. Check that the external loopback plug is connected. 2. If it is connected and the GbE card still fails, call NetApp Technical Support.
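A grouping table like the one for DGH0140-DGH0149 maps naturally onto a range lookup. The following Python sketch encodes only those groupings; the `corrective_action` function name and the data layout are hypothetical, and the action text is copied from the table.

```python
from typing import Optional

REPLACE = "Replace the GbE card or contact NetApp Technical Support."
LOOPBACK = ("1. Check that the external loopback plug is connected. "
            "2. If it is connected and the GbE card still fails, "
            "call NetApp Technical Support.")

# (inclusive range of code numbers, corrective action)
ACTIONS = [
    (range(140, 147), REPLACE),   # DGH0140-DGH0146
    (range(147, 148), LOOPBACK),  # DGH0147
    (range(148, 150), REPLACE),   # DGH0148-DGH0149
]


def corrective_action(code: str) -> Optional[str]:
    """Return the corrective action for a DGH014x code, or None if not listed."""
    if not code.startswith("DGH"):
        raise ValueError(f"not a DGH code: {code!r}")
    number = int(code[3:])
    for numbers, action in ACTIONS:
        if number in numbers:
            return action
    return None
```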
Read and expected checksum do not match. Return error code from card. Host was expecting a reply message, but did not get it. Host was expecting a notify message, but did not get it. Return error code from card during loopback test and DMA test.
Host was waiting for interrupt, but timed out. Card failed to reboot during warm reset. Return error code from card during warm reset.
DGH0281 Flash checksum mismatch DGH0282 Could not initialize the registers DGH0283 Flash manufacture ID and device ID do not match DGH0284 8M PCI memory, i960 switches 1 & 2 must be on DGH0285 hex_value PCI memory should be 16M
Failed to initialize device registers. Read out manufacture ID and device ID of the flash are not what was expected. Unless 16M is mapped in, switches 1 and 2 are not set to On. 16M should be mapped in.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group               Corrective action
DGH0250-DGH0268, DGH0270-DGH0285  Replace the GbE card.
DGH0269                           1. Check that the external loopback plug is connected. 2. If it is connected and the GbE card still fails, call NetApp Technical Support.
DGH0407 DGH0408 DGH0409 DGH0410 DGH0411 DGH0412 Did not get an interrupt-status hex_value DGH0413 Did not reset sbm_macenable val DGH0414 No on-board ethernet detected\n" DGH0415 DGH0416 DGH0417 status 0x%llx\n"
Invalid slot selected to run the test. The command issued to the common firmware environment (CFE) did not respond; the command status is returned. Slot dec_value receive error send(hex_value) Did not receive any data; displays the sent and the 0x%x recv(hex_value) 0x%x buf dec_value offset received data and offset at which the data mismatch dec_value occurred. Receive failed dsc dec_value Failed the receive operation and shows the descriptor at which the receive failed. Transmit failed dec_value Failed the transmit operation and shows the descriptor at which the receive failed. Ring is full The buffer ring is full trying to allocate more buffers than available. Receive ring to allocate is full dsc hex_value\n", The receive buffer is full trying to allocate more dsc); buffers than available. No 10B link Did not detect a 10Bt link. No 100B link Did not detect a 100Bt link. No 1G link Did not detect a 1G link. status 0x%llx loop dec_value\n" Displays the transmit and receive status in case of an error. Did not get the expected interrupt. The status of the expected interrupt is also shown. Did not reset the mac enable register. Failed to detect an onboard Ethernet interface.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group: Error message group DGH0400-DGH0406 Corrective action Contact NetApp Technical Support.
DGH0410-DGH0417 DGH0407-DGH0409
1. Check that the external loopback plug is connected. 2. If it is connected and the GbE card still fails, call NetApp Technical Support.
Corrective action
Replace the card or contact NetApp Technical Support.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group               Corrective action
DGH0600-DGH0603, DGH0608-DGH0609  Replace the TOE card or contact NetApp Technical Support.
DGH0604-DGH0607                   1. Check that the external loopback plug is connected. 2. If it is connected and the TOE card still fails, call NetApp Technical Support.
Card has bad SRAM. Card has bad DRAM. Card has bad SRAM. Card has bad DRAM. Card failed to execute boot firmware. Unable to allocate memory. Card fail to set an interrupt. Card fail to reset interrupt. Unable to get card system configuration information. Unable to get card revision. Card is not ready. Card failed to initialize the ports. Unable to allocate memory. Unable to set the card to promiscuous mode. Unable to activate the port. Unable to configure the port.
Fail get link up Fail xmt request Fail send Fail receive Frame drop Uncorrect receive length Data compare error
Failed to get the link up. Unable to send a transmit request. Unable to send data. Unable to receive data. A frame was dropped. The received data length is incorrect. The received data is incorrect.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group  Corrective action
DGH0800-DGH0820      Replace iSCSI card or contact NetApp Technical Support.
DGH0821-DGH0827      1. Check that the external loopback plug is connected. 2. If it is connected and the iSCSI card still fails, call NetApp Technical Support.
DGS0006
Message type
This error message grouping covers software errors associated with GbE cards that are in the storage system.
Corrective action
Replace the card or contact NetApp Technical Support.
DHH0001
Message type
This error message grouping covers hardware errors associated with the disk shelves that are connected to the storage system or with the Fibre Channel or SAS cards that are in the storage system.
Corrective action
To correct this error, complete the following steps: Step 1 2 3 4 Action Make sure that the drive bays for SES monitoring on the target disk shelf have disk drives. Check the FC-AL or SAS connection. If the connection is good, replace the FC-AL or SAS adapter. Contact NetApp Technical Support.
DIH0004 Board is running backup FPGA Image DIH0005 [IOmem only] Build data code is earlier than required DIH0006 Train bit still set DIH0007 Train bit NOT set, train pattern miscompare DIH0008 Unsupported storage system platform DIH0009 Wider PCIe width expected_min: dec_value got:dec_value DIH0010 Narrower PCIe width expected_max:dec_value got:dec_value DIH0011 Unexpected PCIe width detected DIH0012 ZDI_IOCTL_GET_TEMP_INFO DIH0013 SET_VMARGIN_STATE DIH0014 UECC DIMM dec_value and/or DIMM dec_value addr:hex_value engine: dec_value synd:hex_value col:hex_value row: hex_value bank:hex_value rank:hex_value rd_buffer_count=hex_value DIH0015 CECC addr:hex_value dimm:dec_value bit: dec_value engine: dec_value synd:hex_value col:hex_value row: hex_value bank:hex_value rank:hex_value rd_buffer_count=dec_value DIH0016 data miscompare addr= hex_value exp= hex_value got= hex_value error_bits= hex_value
Unable to get sensor temperature information. Unable to set voltage margin. An uncorrectable error occurred and, because the occurrence could not be isolated to a single DIMM, the message displays the DIMM pair, address, and read buffer count. A correctable error occurred and the message displays the address, the DIMM number, bit, DMA controller, syndrome, column, row, bank, rank, and read buffer count. A data "miscompare" occurred and the message displays the address, the value expected, the value read, and the bits in error. DIH0018 PIO Read miscompare, win= hex_value loc= hex_value A data "miscompare" occurred during the PIO test rd= hex_value exp= hex_value and the message displays the window, the address
DIH0019 PIO_UCECC = dec_value, PIO_CECC = dec_value DIH0020 SET_FRU_INFO DIH0021 Set_Memory data miscompare, addr:hex_value expected: hex_value got:hex_value error_bit: hex_value DIH0022 [Flash Cache only] Build data code is earlier than required DIH0023 [Flash Cache only] Build data code is earlier than required
within the window,the value read, and the value expected. An uncorrectable PIO error was detected. Unable to set FRU information. Unable to set local memory. FGPA image is out of date. FGPA image is out of date.
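The miscompare messages above report an expected value, the value read, and the bits in error. The following is a minimal sketch of how those three fields could relate, assuming (the guide does not state this) that the bits-in-error field is the bitwise XOR of the expected and read values. The function names are hypothetical and are not part of the diagnostics software.

```python
def error_bits(expected: int, actual: int) -> int:
    """Return a mask with a 1 in every bit position where the values differ.

    Assumption: the bits-in-error field is the XOR of expected and read values.
    """
    return expected ^ actual


def differing_bit_positions(expected: int, actual: int) -> list[int]:
    """List the zero-based positions of the differing bits, lowest first."""
    mask = error_bits(expected, actual)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Hypothetical values, not taken from a real diagnostic run.
    exp, got = 0xA5A5A5A5, 0xA5A5A1A5
    print(hex(error_bits(exp, got)))
    print(differing_bit_positions(exp, got))
```

A single differing bit, as in the hypothetical values above, points to one data line or one memory cell, while many differing bits suggest a wider fault; either way, follow the corrective action listed for the reported message.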
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DIH0001-DIH0002, DIH0004, DIH0006-DIH0007, DIH0009-DIH0021, DIS0023
Corrective action:
1. Replace the card.
2. Call technical support if the error is not corrected.

Error message group: DIH0003
Corrective action:
1. Check the CompactFlash card.
2. Reinstall the software.

Error message group: DIH0005
Corrective action:
1. Update to the most current Service Image release.
2. Boot the diagnostics program.
3. Enter:
Xtnd yes
4. Enter:
iomem

Error message group: DIH0008
Corrective action:
Install the Performance Accelerator module (IOmem adapter) in a supported storage system.

Error message group: DIH0022
Corrective action:
1. Update to the most current Service Image release.
2. Boot the diagnostics program.
3. Enter:
Xtnd yes
4. Enter:
pam2
DIS0028 Unsupported bit_toggle test pattern option
    The selection of an unsupported bit toggle test pattern option was detected.
DIS0029 Failed to get initialization state
    Unable to get the initialization state.
DIS0030 No new address buffer allocated
    The system ran out of memory.
DIS0031 Failed to allocate Zodiac I/O message
DIS0032 Failed to allocate data buffer
DIS0033 GET_DMA_INFO
    Unable to get DMA information.
DIS0034 ZDI_IOCTL_GET_INDICATOR_STATE
    Unable to get LED information.
DIS0035 DMA erase error
    DMA erase failed.
DIS0036 bad block module is not initialized
    Card failed to initialize.
DIS0037 can not retrieve bad block
    Failed to get bad block information.
DIS0038 Card is not ready for sanitizing
    Card state is not ok.
DIS0039 Can not clear card
    Failed to clear data in flash card.
DIS0040 Clearing fail
    Cannot clear the card.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DIS0002-DIS0006, DIS0008, DIS0010-DIS0011, DIS0014-DIS0022, DIS0024-DIS0025, DIS0027, DIS0029, DIS0033-DIS0037
Corrective action:
1. Replace the adapter.
2. Call technical support if the error is not corrected.

Error message group: DIS0012
Corrective action:
Enter a valid voltage margin.

Error message group: DIS0001, DIS0007, DIS0009, DIS0013, DIS0023, DIS0026, DIS0028, DIS0030-DIS0032, DIS0038-DIS0040
Corrective action:
Call technical support as there is a possible software issue.
DIU0013 Limiting input address to 9 hex chars...
DIU0014 Detected illegal hex char. 'ascii_value' char_loc= dec_value
DIU0015 Exceeded number of retries ( dec_value). Aborting...
    Exceeded number of retries.
DIU0016 Limiting input address to 16 hex chars...
    More hex characters were entered than expected.
DIU0017 Addr > BOARD_SIZE ( hex_value)
    Requested memory address is greater than memory available on board.
DIU0018 Invalid start_addr value. (tries=dec_value)
    Detected invalid start address.
DIU0019 Aborting set_addr after dec_value unsuccessful tries...
    Exceeded max number of retries while setting address.
DIU0020 Getline failed
    Unable to get valid input.
DIU0022 FPGA filename not set
    Unable to set FPGA filename.
DIU0023 Choose_FPGA_file failed
    Unable to choose FPGA file.
DIU0024 Failed to get FPGA info
    Firmware packaging error.
DIU0025 Sampling factor cannot be zero.
    Detected illegal value for set sampling factor.
DIU0027 Detected illegal value for bit_toggle flag dec_value
    Detected illegal value for set bit toggle flag.
DIU0028 Detected illegal value for bit_toggle pattern dec_value
    Detected unsupported option in set bit toggle value.
DIU0029 Detected illegal value for reset DIMM log chc= dec_value
    Detected unsupported option in reset DIMM Log.
DIU0031 Failed to sync DIMM Logs
    Unable to save DIMM Logs.
DIU0032 Invalid address interface value
    Detected invalid address interface value.
DIU0033 Invalid address bank value
    Detected invalid address bank value.
DIU0034 Invalid address block value
    Detected invalid address block value.
DIU0035 Invalid address page value
    Detected invalid address page value.
DIU0036 Invalid address chip value
    Detected invalid address chip value.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group: Error message group DIU0001, DIU0003 DIU0022-DIU0023 DIU0002 DIU0004 DIU0005-DIU0007 DIU0013-DIU0016 DIU0008-DIU0010 DIU0018-DIU0019 DIU0011-DIU0012 DIU0017 DIU0020 DIU0021 DIU0024 DIU0025 DIU0026 DIU0027 DIU0028 DIU0029 DIU0031 DIU0032-DIU0036 Corrective action Install valid ZDI firmware package Choose valid FPGA file from firmware package. Choose valid option. Enter valid hexadecimal string value. Enter valid start address value. Enter valid voltage margin string. Choose valid memory address. Enter valid characters. Choose valid slot number. Choose valid end address value. Choose valid sampling factor value. Enter valid value for check status flag. Enter valid value for bit toggle flag. Enter valid bit toggle value. Choose supported option in reset DIMM log. Call technical support as there is a possible software issue. Enter valid address values.
DLH0005 ISP VID is hex_value but should be hex_value
DLH0006 ISP DID is hex_value but should be either hex_value or hex_value
DLH0007 RISC status was hex_value, but should be hex_value
DLH0009 ISP firmware simple command
    Card failed to execute a simple command (NOP operation).
DLH0010 ISP firmware bad command test
    Card failed to execute an invalid command.
DLH0011 ISP firmware wraparound failed
    Card failed to execute a wraparound mailbox command.
DLH0012 ISP firmware wraparound
    The data transmitted to and received by the mailbox does not match.
DLH0013 Expected dec_value ISP2100 controllers, but only found 1
    Card received an unexpected number of ISP chips.
DLH0014 Copy to SSRAM on channel dec_value failed
    Card failed to write to Synchronous Static Random Access Memory (SSRAM).
DLH0015 Read from SSRAM on channel dec_value failed
    Card failed to read from SSRAM.
DLH0016 Data mismatch at SSRAM word dec_value, channel dec_value, read hex_value, expected hex_value, dest hex_value, source hex_value
    Read and written data do not match.
    Note: The word is the address offset from the starting address in the word.
DLH0019 RISC checksum failed
    Card failed when verifying the checksum of the downloaded firmware code.
DLH0020 FCAL loop is open, channel dec_value
    Card failed to reconnect to the loop. Check the cable, disk, and terminator plug.
DLH0021 Could not save new ISP 2100 settings to EEPROM; giving up after 2 retries
    Card failed to download the EEPROM.
DLH0023 Unable to execute firmware: error code 0004
    Card failed to execute the downloaded firmware.
DLH0025 FCAL loop is open, channel dec_value
    Card failed to reconnect to the loop. Check the cable, disk, and terminator plug.
DLH0026 No FCAL in slot dec_value
    No card was found in designated slot.
DLH0030 isp2100_diag_reset_isp: while resetting ISP, ISP never came ready on adapter dec_value
    Card failed to come back after reset.
DLH0032 FCAL ISP POST test failed: error code dec_value, count dec_value failing FIFO: hex_value, FIFO addr: hex_value
    Card failed to execute the POST code given by the FC-AL vendor.
DLH0033 NOP command failed execution
    Card failed to execute the NOP command.
DLH0034 Unexpected number of ISP 2100s; found dec_value
    Chip number is incorrect.
DLH0035 The HCCR_INTR bit was not reset
    Test failed to flush the previous command.
DLH0036 FCAL interrupt test failed, the interrupt test never got set
    Card/test failed to set the interrupt bit to the main CPU.
DLH0037 FCAL interrupt bit either never got reset or it regenerated an interrupt
    Test either failed to flush the previous command or the interrupt bit was reset.
DLH0038 There is a link failure or loss of sync or invalid CRC
    System failed to receive the link status from the Fibre Channel chip.
DLH0039 FCTEST confidence factor is < 95
    The fctest has a confidence factor of < 95%.
DLH0040 ISP internal loop test 10 bit failed during mail, channel dec_value
    Applies to ISP2200 card only: Card failed to execute the internal loop test (before the serial transceiver).
DLH0041 ISP internal loop test 1 bit failed during mail, channel dec_value
DLH0042 ISP external loop test failed during mail, channel dec_value
    Applies to ISP2200 card only: Card failed to execute the external loop test.
DLH0043 Data mismatch doing dec_value at word dec_value, channel dec_value, received hex_value, send hex_value
    Card has a data mismatch when executing an internal or external loop test.
DLH0044 ISP failed to get device link status at channel dec_value, device dec_value
    Card failed to get device link status before the fctest.
DLH0045 ISP failed to get adapter link status at channel dec_value
    Card failed to get adapter link status before the fctest.
DLH0046 ISP failed to execute fctest at channel dec_value
    Card failed to execute the fctest.
DLH0047 ISP failed to get device link status at channel dec_value, device dec_value
    Card failed to get device link status after the fctest.
DLH0070 Unrecognized signature
    The saved EEPROM data has an invalid signature.
DLH0071 Invalid NVRAM minimum version
    The saved EEPROM data has an invalid NVRAM version.
DLH0072 EEPROM data checksum error
    The saved EEPROM data has an invalid checksum.
DLH0073 Serial number in EEPROM is not equal to the one in FLASH
    Serial numbers saved in EEPROM and in FLASH do not match.
DLH0074 Never saw LIP occur after executing internal loopback test
    Card never saw the loop initialization process (LIP) back up after executing the internal loopback test.
DLH0100 LED test failed
    LED test failed.
DLH1000 Self test failed with error of class dec_value, subclass dec_value, info dec_value
    Card self test failed.
DLH1001 Interrupt test failed with error of class dec_value, subclass dec_value, info dec_value
    Failed to get interrupt from the card.
DLH1002 External loopback test failed with error of class dec_value, subclass dec_value, info dec_value
    The card failed to execute external loopback test.
DLH1003 Failed to relip with error of class dec_value, subclass dec_value, info dec_value
    The card failed to generate a LIP or close the loop.
DLH1004 Internal loopback test failed with error
    Inconclusive test or transient error.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DLH0001-DLH0007, DLH0009-DLH0016, DLH0019, DLH0021, DLH0023, DLH0026-DLH0030, DLH0032-DLH0041, DLH0070-DLH0074, DLH0100, DLH1000-DLH1003
Corrective action:
Replace the FC-AL card or contact NetApp Technical Support.

Error message group: DLH0020, DLH0025, DLH0042-DLH0043
Corrective action:
1. Check the external connection.
2. If the FC-AL card still fails, replace the card or contact NetApp Technical Support.

Error message group: DLH0044-DLH0047
Corrective action:
1. Check the external connection, disk, and disk shelf.
2. If the FC-AL card still fails, replace the card or contact NetApp Technical Support.

Error message group: DLH1004
Corrective action:
Rerun the test.
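Because the corrective action tables group codes into ranges such as DLH0009-DLH0016, finding the action for one reported code means checking which range contains it. The following is a minimal sketch of that lookup, using the DLH groups as read from the table above. The names `DLH_ACTIONS`, `expand_group`, and `corrective_action` are hypothetical helpers for illustration, not part of any NetApp tool.

```python
import re

# Groups and actions as read from the DLH corrective action table above.
DLH_ACTIONS = [
    (
        "DLH0001-DLH0007, DLH0009-DLH0016, DLH0019, DLH0021, DLH0023, "
        "DLH0026-DLH0030, DLH0032-DLH0041, DLH0070-DLH0074, DLH0100, "
        "DLH1000-DLH1003",
        "Replace the FC-AL card or contact NetApp Technical Support.",
    ),
    (
        "DLH0020, DLH0025, DLH0042-DLH0043",
        "1. Check the external connection. 2. If the FC-AL card still fails, "
        "replace the card or contact NetApp Technical Support.",
    ),
    (
        "DLH0044-DLH0047",
        "1. Check the external connection, disk, and disk shelf. 2. If the "
        "FC-AL card still fails, replace the card or contact NetApp Technical "
        "Support.",
    ),
    ("DLH1004", "Rerun the test."),
]

# A code is three uppercase letters followed by four digits, for example DLH0042.
CODE_RE = re.compile(r"([A-Z]{3})(\d{4})")


def expand_group(group: str) -> set[str]:
    """Expand a comma-separated list of codes and code ranges into single codes."""
    codes = set()
    for part in group.split(","):
        found = CODE_RE.findall(part)
        if not found:
            continue
        # A single code yields one match, so first and last are the same.
        (prefix, first), (_, last) = found[0], found[-1]
        for number in range(int(first), int(last) + 1):
            codes.add(f"{prefix}{number:04d}")
    return codes


def corrective_action(code: str, table=DLH_ACTIONS):
    """Return the corrective action for a code, or None if no group lists it."""
    for group, action in table:
        if code in expand_group(group):
            return action
    return None


if __name__ == "__main__":
    print(corrective_action("DLH0045"))
```

A code that falls in a gap between ranges, such as DLH0008, returns `None` here; in that case, check the message table itself or contact NetApp Technical Support.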
Addr=hex_value: Exp=hex_value, Data might be corrupted and a specific DIMM is bad. Act= hex_value, Diff=hex_value Read/write error. ** SIMM banks indicating errors: hex_value Addr=hex_value Exp=hex_value, Act= hex_value, Diff=hex_value
DMH0101-DMH0106 DMH0301-DMH0352
An error in cache memory is found and identified. Cache errors require replacement of the motherboard.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DMH0001-DMH0059
Corrective action:
Replace the DIMM/SIMM for DIMM/SIMM errors. Call NetApp Technical Support for cache errors.

Error message group: DMH0101-DMH0352
Corrective action:
1. Replace the motherboard.
2. Call NetApp Technical Support if the error is not corrected.
Corrective action
Contact NetApp Technical Support.
DNH0143 Addr=hex_value, Exp=hex_value, Act= hex_value, Diff=hex_value DNH0145 Majority vote for address not reached ( hex_value, hex_value, Indicates that the address of the location hex_value) being written to when power was lost could not be obtained. DNH0106- Addr=hex_value Exp=hex_value, Act= hex_value, A read/write error was encountered. DNH0109 Diff=hex_value DNH0112DNH0143 DNH0301DNH0302 DNH0311DNH0312 Test forced an ECC error. Status register is DNH0321 Soft error register value is not correct: Exp=dec_value, Act= dec_value not correct. DNH0322 Soft error count not generated The status register did not count the errors. DNH0323- No ACK received I2C write failed. DNH0324 An EEPROM read/write error occurred. DNH0325 EEPROM byte=dec_value, val=hex_value, exp= hex_value DNH0326 Soft error register value shows error = hex_value Status register shows an unexpected error. DNH0327 Cannot clear soft error register = hex_value Test could not clear the status register bit. DNH0328 Hard error register value shows error = hex_value Status register shows an unexpected error. DNH0329 Cannot clear hard error register = hex_value Test could not clear the status register bit. DNH0330 NVRAM battery needs to be charged NVRAM battery voltage is low and needs charging or replacing.
DNH0331 NVRAM battery voltage too high DNH0332 DNH0402 DNH0416 DNH0417 DNH0423 DNH0424 DNH0425 DNH0426 DNH0435 NVRAM battery in the chassis is *missing or dead* Command status reads as busy Expected interrupt hex_value did not occur Unexpected interrupt hex_value Clear command did not clear memory Incorrect number of unlogged ECC corrections dec_value ECC log 0 incorrect mask=hex_value, addr= hex_value ECC log 1 incorrect mask=hex_value, addr= hex_value Unable to read the flash ID
NVRAM adapter is bad, incorrect voltage read. NVRAM battery is missing or discharged. Previous command was not completed. Missing interrupt. Unexpected interrupt occurred. Memory was supposed to be cleared, but was not. Log data shows memory errors, or expected errors not logged. Flash that stores NVRAM microcode is not responding properly. A sector of flash memory that stores NVRAM microcode could not be written to. Test encountered an invalid serial number for the storage system type or NetCache appliance. The test encountered an invalid revision number for the storage system type or NetCache appliance. The test encountered an invalid memory size for the storage system type or NetCache appliance. Memory errors read from the NVRAM card have not been corrected. Single-bit ECC error not corrected. NVRAM installed with wrong memory size. Unable to communicate with the NVRAM flash. NVRAM programmed with a bad part number, or unable to read part number. Unlogged ECC correction is incorrect.
DNH0436 Flash write error address = hex_value DNH0440 Invalid nvram serial number dec_value DNH0441 Invalid nvram revision number hex_value DNH0442 Board part number ( hex_value) does not match DIMM size DNH0443 ECC PCI correction DNH0444 ECC silent correction Loc=hex_value, Exp=hex_value, Act= hex_value DNH0445 Wrong size DIMM ( dec_value MB) for this platform DNH0446 A front panel is hex_value detected on this system DNH0447 Unrecognized part number (string-value)
DNH0448 ECC unlogged correction Adr=hex_value, Exp=hex_value, Act= hex_value Odd ECC cacheline correction is incorrect. DNH0449 ECC odd cacheline correction Addr=hex_value, Exp=hex_value, Act= hex_value DMA memory transfer shows unexpected DNH0461 DMA failed: Engine= dec_value, Ctrl=hex_value Addr=hex_value, Exp=hex_value, Act= hex_value, data. Diff=hex_value DNH0462 DMA ECC: Engine= dec_value, Exp=hex_value, hex_value, hex_value, hex_value Act= hex_value, hex_value, hex_value, hex_value DNH0463 DMA time out: Engine= dec_value, Desc Exp=hex_value, Desc Act= hex_value
DNH0471 Vendor ID incorrect - Expected hex_value, Actual hex_value
    This card has a different vendor than what testing reads.
DNH0472 Device ID incorrect - Expected hex_value, Actual hex_value
    This card is of a different type than what testing reads.
DNH0473 Class incorrect - Expected hex_value, Actual hex_value
    This card is of a different class than what testing reads.
DNH0474 Completion buffer timeout
    Command issued to NVRAM, but NVRAM did not reply.
DNH0490 NVRAM front panel EEPROM wrote hex_value, read hex_value
    EEPROM read and/or write failed.
DNH0500 NVRAM5 IB fail create CQ
    NVRAM5 IB failed to create the completion queue.
DNH0501 NVRAM5 IB fail QP prep
    NVRAM5 IB failed the queue pair preparation.
DNH0502 NVRAM5 IB fail create QP
    NVRAM5 IB failed to create the queue pair.
DNH0503 NVRAM5 IB fail transit QP from reset to init
    NVRAM5 IB failed to transition the queue pair from reset to initialized state.
DNH0504 NVRAM5 IB fail transit QP from init to rtr
    NVRAM5 IB failed to transition the queue pair from initialized state to ready-to-receive.
DNH0505 NVRAM5 IB fail transit QP from rtr to rts
    NVRAM5 IB failed to transition the queue pair from ready-to-receive to ready-to-send.
DNH0506 NVRAM5 IB fail memory registration
    NVRAM5 IB failed memory region registration.
DNH0507 NVRAM5 IB fail post send request
    NVRAM5 IB failed post send request.
DNH0508 NVRAM5 IB fail post rcv request
    NVRAM5 IB failed post receive request.
DNH0509 NVRAM5 IB fail completion poll
    NVRAM5 IB failed completion poll.
DNH0510 NVRAM5 IB error verify data
    NVRAM5 IB error in verifying data.
DNH0511 NVRAM5 IB fail link up on port ( dec_value)
    NVRAM5 IB failed to get the link up on the identified port.
DNH0512 NVRAM5 IB slot ( dec_value) failed initialization
    The identified slot for NVRAM5 IB failed to initialize.
DNH0550 Timeout waiting for ECC correction
    ECC errors not corrected or not recorded in logs.
DNH0551 NVRAM5 did not receive expected ECC error
    NVRAM5 failed to receive the expected error correction code.
DNH0552 NVRAM5 EEPROM write failed: exp hex_value got hex_value
    NVRAM5 read and expected EEPROM write do not match.
DNH0553 NVRAM5 received wrong ECC error: dec_value
    NVRAM5 received the wrong error correction code.
DNH0554 NVRAM5 received too many ECC errors
    NVRAM5 received too many error correction codes.
DNH0555 NVRAM5 ECC did not correct data: exp hex_value got hex_value
    NVRAM5 error correction code did not correct the data.
DNH0556 NVRAM5 battery is too low or disconnected at 4590 mV
    NVRAM5 battery power is below normal.
DNH0600 NVRAM DMA mismatch: Addr1: hex_value Data1: hex_value Addr2: hex_value Data2: hex_value
    DMA memory transfer shows unexpected data.
DNH0601 NVRAM SPD byte dec_value unsupported: dec_value
    The DIMM in the NVRAM adapter shows unsupported properties (SPD).
DNH0602 NVRAM battery dec_value is too low or disconnected at dec_value mV
    NVRAM battery power is below normal.
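The message texts above use hex_value and dec_value as placeholders for the numbers that appear in the actual output. The following is a minimal sketch of pulling those numbers out of a captured message line by turning the documented template into a regular expression. It assumes that hexadecimal values may or may not carry a 0x prefix and that spacing may vary; the guide does not specify either point. The helper names and the sample line are hypothetical.

```python
import re

# Assumed value formats: hex digits with an optional 0x prefix, plain decimal digits.
PLACEHOLDERS = {
    "hex_value": r"(?:0[xX])?[0-9a-fA-F]+",
    "dec_value": r"\d+",
}


def _literal(text: str) -> str:
    """Escape literal template text, tolerating variable whitespace."""
    body = r"\s+".join(re.escape(token) for token in text.split())
    lead = r"\s*" if text[:1].isspace() else ""
    trail = r"\s*" if text[-1:].isspace() else ""
    return lead + body + trail


def template_to_regex(template: str) -> re.Pattern:
    """Build a regex with one capture group per placeholder in the template."""
    pattern = ""
    for part in re.split(r"(hex_value|dec_value)", template):
        if part in PLACEHOLDERS:
            pattern += "(" + PLACEHOLDERS[part] + ")"
        else:
            pattern += _literal(part)
    return re.compile(pattern)


def parse_message(template: str, line: str):
    """Return the placeholder values as integers, or None if the line does not match."""
    match = template_to_regex(template).search(line)
    if match is None:
        return None
    kinds = re.findall(r"hex_value|dec_value", template)
    return [
        int(value, 16 if kind == "hex_value" else 10)
        for kind, value in zip(kinds, match.groups())
    ]


if __name__ == "__main__":
    template = "NVRAM battery dec_value is too low or disconnected at dec_value mV"
    # Hypothetical captured line, for illustration only.
    line = "DNH0602 NVRAM battery 1 is too low or disconnected at 4590 mV"
    print(parse_message(template, line))
```

The same approach applies to any message in this guide whose template contains only these two placeholders; messages with other placeholders, such as ascii_value or string-value, would need additional patterns.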
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group. Error message group DNH0101-DNH0330 DNH0471-DNH0473 Corrective action 1. Replace the NVRAM adapter for platforms with an NVRAM adapter or Replace the SDRAM DIMM in the FAS250. 2. Call NetApp Technical Support if the error is not corrected. DNH0331-DNH0332 DNH0430, DNH0432, DNH0556, DNH0602 DNH0402-DNH0426 DNH0435-DNH0436 DNH0442 DNH0445-DNH0446 DNH0473, DNH0550 DNH0601 DNH0440-DNH0441 DNH0447 DNH0443 DNH0461-DNH0463 DNH0600 DNH0490 DNH0500- DNH0512 Replace the NVRAM battery.
Replace the storage system head. 1. Replace the NVRAM adapter. 2. Replace the storage system head. Replace the NVRAM adapter and the attached front panel. 1. Reseat the cables. 2. Reseat the adapter. 3. Replace the adapter. 1. Reseat the DIMM. 2. Reseat the adapter. 3. Replace the adapter.
DNH0551- DNH0555
Corrective action
Check the storage system for the last bank of SIMMs. If it is there, verify that it is seated properly, then rerun the diagnostic test. If the same error occurs, call NetApp Technical Support.
Corrective action
1. Replace CompactFlash card. 2. Call NetApp Technical Support.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group: Error message group DRH0001-DRH0002 DRH0005 DRH0007 DRH0011-DRH0012 DRH0003 DRH0004 DRH0006 DRH0008 DRH0021-DRH0022 DRH0026 DRH0034 DRH0023 DRH0027 Corrective action Replace the remote management card.
1. Check the external power source. 2. Replace the remote management card.
1. Check the cable connecting the remote management card to the motherboard. 2. Replace the remote management card.
Replace the motherboard.
1. Check the LAN cable. 2. Replace the remote management card.
1. Replace the RLM card. 2. Call NetApp Technical Support.
1. Verify that the actual temperature in the environment is not too high or too low. 2. Replace the RLM card.
1. Verify that all the power supplies are present before rerunning the test. 2. If the error continues to occur, check the agent on the motherboard. 3. If the error continues to occur, check the power supply.
1. Verify that all the power supplies are on before rerunning the test. 2. If the error continues to occur, check the agent on the motherboard. 3. If the error continues to occur, check the power supply.
1. Check the agent on the motherboard. 2. If the error continues to occur, check the power supply.
1. Reseat the RLM card. 2. If the error continues to occur, replace the RLM card. 3. If the error continues to occur, check the agent on the motherboard. 4. Call NetApp Technical Support.
DRH0028
DRS0010
Message type
This error message grouping covers software errors associated with the remote management card (RMC).
Corrective action
1. Update the remote management card firmware. 2. Replace the remote management card.
DRU0009
Message type
This error message grouping covers user errors associated with the remote management card (RMC) in the storage system.
Corrective action
1. Make sure there is only one remote management card in the system. 2. Replace the remote management card.
SCSI adapter failed to execute NOP operation. Invalid command entered; the SCSI adapter responds with invalid return status. DSH0018 ISP firmware wraparound failed SCSI adapter failed to execute the wraparound mailbox command. DSH0019 ISP firmware wraparound mailbox # During execution of the wraparound mailbox command, dec_value: read hex_value, expected the value put on the incoming mailbox does not match the hex_value value that was received in the outgoing mailbox. DSH0020 Expected dec_value ISP controllers, but found Number of ISP chips found does not match the number
dec_value Copy to SSRAM on slot dec_value failed Read from SSRAM on slot dec_value failed Data mismatch at SSRAM word dec_value, slot dec_value, read hex_value, expected hex_value, dest hex_value, source hex_value Read of firmware from SSRAM failed
recorded. SCSI adapter failed to copy DMA data to the SSRAM. SCSI failed to read DMA data to the host. Value read from SSRAM does not match the value written to SSRAM. SCSI adapter failed to dump the firmware value written to the host. Firmware data written to SSRAM does not match the firmware data that was read from SSRAM. SCSI adapter failed the firmware checksum. HCCR interrupt bit has not cleared existing data. SCSI adapter interrupt is not set. SCSI adapter interrupt is not set or the interrupt was set again. SCSI adapter failed to execute the loaded firmware. Vendor ID numbers on the device and in the device database do not match. Cannot find SCSI adapter in the specified slot. SCSI adapter failed to copy the firmware to SSRAM. SCSI adapter failed to copy the Qlogic stress code to SSRAM. SCSI adapter failed when doing a checksum for a given Qlogic stress code. Cannot execute Qlogic stress code. SCSI device failed to read the SSRAM in the identified slot. SCSI device failed to read the SSRAM in the identified slot during the firmware test. Failed to reset SCSI adapter card. Failed to reset SCSI chip. Value read from SSRAM does not match the value written to SSRAM.
DSH0025 Firmware data mismatch at word dec_value, read hex_value, expected hex_value DSH0026 Firmware checksum failed DSH0027 The HCCR_INTR bit was not reset DSH0028 SCSI interrupt test failed, the interrupt test never got set DSH0029 SCSI interrupt bit either never got reset or it regenerated DSH0030 Unable to execute firmware: error code hex_value DSH0031 Expected vendor hex_value, device dec_value saw vendor hex_value, device dec_value DSH0032 No SCSI in slot dec_value DSH0033 Copy of firmware to SSRAM failed DSH0034 Copy of stress Qlogic code to SSRAM failed DSH0035 Qlogic stress code checksum failed DSH0036 Unable to execute Qlogic stress code DSH0037 Read from SSRAM on slot dec_value failed DSH0038 Read from firmware from SSRAM in slot dec_value failed DSH0039 Failed to reset adapter card DSH0040 Failed to reset ISP DSH0041 Data mismatch at SSRAM word hex_value, slot dec_value, read hex_value, expected hex_value DSH0042 Firmware data mismatch at word dec_value, read hex_value, expected hex_value DSH0050 Failed to flush previous pending mailbox command DSH0051 Mailbox command failed to finish
Firmware data written to SSRAM does not match the firmware data that was read from SSRAM. SCSI adapter failed to flush the previous pending mailbox command. SCSI adapter found a timeout when executing a mailbox command. DSH1000 SCSI adapter in slot dec_value, port dec_value Adapter is marked dead. is dead DSH1001 SCSI adapter in slot dec_value, port dec_value Adapter is busy executing the OSM event and cannot be is currently in OSM event mode disturbed.
DSH1002 Failed to initialize SCSI adapter in slot dec_value, port dec_value
    Adapter failed to do hardware initialization.
DSH1003 Timeout when initializing SCSI adapter in slot dec_value, port dec_value
    A timeout occurred during the hardware initialization.
DSH1004 Failed to reset SCSI adapter in slot dec_value, port dec_value
    Resetting of the SCSI adapter failed.
DSH1005 Timeout when resetting SCSI adapter in slot dec_value, port dec_value
    A timeout occurred while the SCSI adapter was being reset.
DSH1006 Failed to reset SCSI adapter bus in slot dec_value, port dec_value
    Adapter failed to do a bus reset.
DSH1007 Timeout when resetting SCSI adapter bus in slot dec_value, port dec_value
    A timeout occurred while the adapter bus was being reset.
DSH1008 Failed to reset target in slot dec_value, port dec_value
    Adapter failed to do specific disk reset.
DSH1009 Timeout when resetting target in slot dec_value, port dec_value
    A timeout occurred while a specific disk was being reset.
DSH1010 Failed to rescan SCSI adapter in slot dec_value, port dec_value
    Adapter failed to do a rescan.
DSH1011 Timeout when rescanning SCSI adapter in slot dec_value, port dec_value
    A timeout occurred during rescanning.
DSH1012 Timeout from SCSI during disk init in slot dec_value, port dec_value
    A timeout occurred during disk initialization through this adapter.
DSH1013 OSM event happened for SCSI card in slot dec_value, port dec_value and failed to handle it
    An OSM event happened during the task and the adapter failed to handle it.
DSH1014 ISP VID is 0x%x but should be 0x%x
    Wrong vendor ID.
DSH1015 ISP DID is 0x%x but should be 0x%x
    Wrong device ID.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DSH0001-DSH0038, DSH0040-DSH0051, DSH1000-DSH1007, DSH1013-DSH1015
Corrective action:
Replace the SCSI card or contact NetApp Technical Support.

Error message group: DSH0039
Corrective action:
1. Check that the external loopback plug is connected.
2. If it is connected, and the SCSI card still fails, call NetApp Technical Support.

Error message group: DSH1008
Corrective action:
1. Replace the bad disk.
2. Call NetApp Technical Support.

Error message group: DSH1009-DSH1012
Corrective action:
1. Replace the bad disk.
2. Replace the SCSI card.
3. Call NetApp Technical Support.
DTH0025 Failed to get the RTC Time. This is not a BMC Error.
    The real-time clock time could not be read.
DTH0026 Failed to Set the BMC Watchdog Timer
    Could not set the BMC watchdog timer.
DTH0027 Failed to Start the BMC Watchdog Timer
    Could not start the BMC watchdog timer.
DTH0028 Failed to Enable the BMC's NMI generation capability
    Could not enable the BMC's NMI generation capability.
DTH0029 BMC should have generated an NMI but did not
    BMC should have generated an NMI but did not.
DTH0030 Failed to Get the BMC's Device ID
    Could not get the BMC's device identification.
DTH0031 Failed to Get the reason for System Restart
    Could not get the reason for system restart.
DTH0032 Failed to Display text successfully on the LCD
    Could not display text successfully on the LCD.
DTH0033 Failed to Retrieve the BMC's Self Test Information
    Could not retrieve the BMC's self test information.
DTH0034 Failed to Get the System GUID
    Could not get the system GUID.
DTH0035 Failed to Set the System GUID
    Could not set the system GUID.
DTH0036 BMC does not support a Self Test option
    BMC does not support a self test option.
DTH0037 Sensor Data Repository Empty
    Sensor data repository was found to be empty.
DTH0039 BMC Boot Firmware Code is Corrupted.
    The BMC boot firmware was found to be corrupted.
DTH0040 BMC FRU internal use area is Corrupted.
    The BMC internal FRU area was found to be corrupted.
DTH0041 Sensor Data Repository is Corrupted.
    The Sensor Data Repository was found to be corrupted.
DTH0042 System Event Log is Corrupted.
    The System Event Log was found to be corrupted.
DTH0043 Platform Information Area is Corrupted.
    The BMC Platform Information Area was found to be corrupted.
DTH0044 BMC FRU device is Inaccessible.
    The BMC FRU device could not be accessed.
DTH0045 BMC Sensor Data Repository is Inaccessible.
    BMC Sensor Data Repository could not be accessed.
DTH0046 BMC System Event Log is Inaccessible.
    BMC System Event Log could not be accessed.
DTH0047 IPMB Signal Error.
    There was a signal error on the BMC private bus.
DTH0048 BMC RAM test error.
    BMC RAM had errors during self test.
DTH0049 BMC fatal hardware error.
    The BMC had a fatal internal hardware error.
DTH0050 Management controller error.
    The BMC had a management controller error during self test.
DTH0051 Private I2C bus error.
    A BMC private I2C bus had an error.
DTH0052 BMC internal exception.
    The BMC had an internal error.
DTH0053 BMC A/D timeout error.
    The BMC analog-to-digital converter failed to respond.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group:

Error message group: DTH0001
Corrective action:
1. Update the BMC firmware.
2. Replace the motherboard.

Error message group: DTH0009
Corrective action:
1. Update the BMC sensor data repository.
2. Replace the motherboard.

Error message group: DTH0004-DTH0008, DTH0011-DTH0053
Corrective action:
Replace the motherboard or contact NetApp Technical Support.
Corrective action
Replace the CompactFlash card.
Motherboard was not ready to receive data from the serial port.
DZH0154 DZH0155 DZH0160 DZH0161 DZH0162 DZH0163 DZH0164 DZH0158 Unable to read backplane SEEPROM SEEPROM, error code = dec_value DZH0159 Unable to read motherboard SEEPROM SEEPROM, error code = dec_value DZH0136 Unable to read motherboard SEEPROM, error code = DZH0165 dec_value
A data mismatch was received from the serial port. Burst data transfer hung Motherboard was not ready to receive data from the serial port. Com port dec_value burst data received does not match data A data mismatch was received from the sent serial port. Can't program backplane SEEPROM, error code = dec_value Failed to program the backplane SEEPROM. Can't program motherboard SEEPROM, error code = Failed to program the motherboard dec_value SEEPROM. Super I/O config error; config = hex_value, expected hex_value An invalid device ID was read from the Super I/O. Unable to read backplane SEEPROM, error code = dec_value Failed to read the backplane SEEPROM.
Failed to program the backplane SEEPROM. Failed to program the backplane SEEPROM. Failed to read the motherboard SEEPROM.
DZH0166 DZH0167 DZH0168 DZH0169 Can't program backplane SEEPROM SEEPROM, error code = dec_value DZH0170 Unrecognized device (ID = hex_value, hex_value) in slot dec_value DZH0171 No card detected in slot dec_value DZH0172 Card detected in (nonexistent) slot dec_value DZH0175 Unable to read backplane SEEPROM, error code = dec_value DZH0180 Unable to read Front Panel SEEPROM, error code = dec_value DZH0194 DZH0191 Unable to read IO Board SEEPROM, error code = dec_value DZH0197 DZH0195 Invalid CPU dec_value installed DZH0196 Incorrect sensor ASCII_value DZH0192 Can't program onboard FC-AL SEEPROM SEEPROM, error code = dec_value DZH0193 Can't program Front Panel SEEPROM SEEPROM, error code = dec_value DZH0198 Can't program IO Board SEEPROM SEEPROM, error code = dec_value DZH0199 Unable to read onboard FC-AL SEEPROM, error code = dec_value DZH0215 Invalid CPU dec_value microcode revision DZH0216 Invalid CPU microcode revision DZH0218 Invalid CPU dec_value MHz DZH0219 UNKNOWN model DZH0301 DZH0302 DZH0303 DZH0304 DZH0305 DZH0306 DZH0307 DZH0308 DZH0309 DZH0310 DZH0311 DZH0312 DZH0313 DZH0314 21071 - CA GCR register wrong 21071 - CA TENR register wrong 21071 - DA DCSR register wrong 21071 - DA PCI base address register wrong 21071 - DA PCI mask register wrong 21071 - DA host address extension register 0 wrong 21071 - CA error detected - hex_value Host chipset errors detected Corrected 1-bit ECC error Uncorrectable ECC error System bus parity error Attempt to access nonexistent memory PCI bus system error PCI bus data parity error
Failed to program the backplane SEEPROM. Unrecognized PCI device. No PCI device found in indicated slot. Invalid PCI card found in indicated slot. Failed to read backplane SEEPROM. Failed to read the front panel SEEPROM. Failed to read the I/O board SEEPROM. Invalid CPU slot installed. An incorrect sensor type was found. Failed to program the onboard FC-AL SEEPROM. Failed to program the Front Panel SEEPROM. Failed to program the I/O Board SEEPROM. Failed to read the onboard FC-AL SEEPROM. CPU has unsupported microcode. CPU speed is unsupported. Model number is incorrect (for example, storage system or NetCache model number). Invalid register read.
Error in reading the register. Chipset error is detected. 1-bit ECC error is detected. Unknown ECC error is detected. System bus error is detected. Memory access out of bounds. PCI bus error is detected. PCI bus data parity error is detected.
PCI bus address parity error PCI master abort PCI target abort Invalid PTE on scatter-gather
DZH0319 FLASH not write enabled error DZH0320 I/O timeout occurred (R/W > 1) DZH0321 Correctable ECC error occurred while error register locked DZH0322 Uncorrectable ECC error occurred DZH0323 System bus parity error occurred DZH0324 Access to nonexistent memory occurred DZH0325 PCI bus system error occurred while error register locked DZH0326 PCI bus address parity error occurred DZH0327 PCI master abort occurred while error register locked DZH0328 PCI target abort occurred while error register locked DZH0329 Invalid PTE error on scatter/gather occurred DZH0330 FLASH not write-enabled error DZH0331 I/O timeout occurred while error register locked DZH0332 An error occurred while error register locked DZH0333 DZH0334 DZH0335 DZH0336 DZH0337 DZH0338 DZH0339 DZH0340 DZH0341 DZH0342 DZH0343 DZH0344 DZH0345 DZH0346 Tsunami error detected P0 - hex_value Tsunami error detected P1 - hex_value Unknown system; cannot check ISA bridge Unknown system; cannot check PCI bridge Conflicting CRCs (FLASH half = dec_value) First CRC has garbage (FLASH half = dec_value) Second CRC has garbage (FLASH half = dec_value) Conflicting CRCs; CRC1 = hex_value, CRC2 = hex_value First CRC has garbage CRC1 = hex_value Second CRC has garbage CRC2 = hex_value Conflicting CRCs (FLASH half = dec_value) First CRC has garbage (FLASH half = dec_value) Second CRC has garbage (FLASH half = dec_value) System info checksum error
PCI bus address parity error is detected. PCI bus master abort is detected. PCI bus target abort is detected. Invalid PTE on scatter-gather access is detected. Could not write the FLASH. One-second I/O timeout occurred. Correctable ECC error with error register locked. Uncorrectable ECC error with error register locked. System bus parity error with error register locked. Memory access out of bounds with error register locked. PCI bus system error with error register locked. PCI bus address parity error with error register locked. PCI bus master abort with error register locked. PCI bus target abort with error register locked. Invalid PTE on scatter-gather access with error register locked. Cannot write FLASH error with error register locked. One-second I/O timeout with error register locked. Errors occurred while the error register was locked. Chipset error occurred. Error in chipset register. Invalid system type. Incorrect CRC. CRC in first half is incorrect. CRC in second half is incorrect. Incorrect CRC. CRC in first half is incorrect. CRC in second half is incorrect. Incorrect CRC. CRC in first half is incorrect. CRC in second half is incorrect. Checksum is not correct.
DZH0347 System information missing DZH0348 DZH0349 DZH0350 DZH0351 DZH0352 DZH0353 DZH0354 DZH0355 DZH0356 DZH0357 DZH0358 DZH0359 DZH0360 DZH0361 DZH0362 DZH0363 DZH0364 DZH0365 DZH0366 DZH0367 DZH0368 DZH0369 DZH0370 DZH0371 DZH0372 DZH0373 DZH0374 DZH0375 DZH0376 DZH0377 DZH0378 DZH0379 DZH0415 DZH0416 DZH0419 Cache size ( hex_value) mismatch with model ( ASCII_value) Conflicting CRCs (FLASH half = dec_value) First CRC has garbage (FLASH half = dec_value) Second CRC has garbage (FLASH half = dec_value) Conflicting CRCs; CRC1 = hex_value, CRC2 = hex_value CRC has garbage; CRC = hex_value R/W test, address = hex_value expected = hex_value Readback, address = hex_value expected = hex_value Conflicting CRCs; CRC1 = hex_value, CRC3 = hex_value Conflicting CRCs; CRC1 = hex_value, CRC2 = hex_value CRC has garbage; CRC = hex_value R/W test; address = hex_value expected = hex_value Readback; address = hex_value expected = hex_value Conflicting CRCs; CRC1 = hex_value, CRC3 = hex_value Battery dead; RTC not functional Update-busy signal never cleared Seconds not counting properly Day-of-week not in proper range Tiny NVRAM; address = hex_value expected = hex_value Tiny NVRAM; address = hex_value expected = hex_value Super I/O config reg 0 error Super I/O config reg 1 error Super I/O config reg 2 error Invalid super I/O chip ID; Read = hex_value, expected = hex_value Super I/O device ID error Super I/O revision error Super I/O power control error Noisy com port dec_value Com Port dec_value hung Com Port dec_value data received does not match Burst data transfer hung Com Port dec_value burst data received does not match Expected overtemp signal missing (sensor) Can't write FLASH
System information is not programmed correctly. Incorrect cache size found. Incorrect CRC. CRC in first half is incorrect. CRC in second half is incorrect. Incorrect CRC. Onboard NVRAM has an incorrect value. Incorrect value read back from NVRAM. Incorrect CRC.
Onboard NVRAM has an incorrect value. Incorrect value read back from onboard NVRAM. Incorrect CRC. RTC battery is not working. Signal refresh did not take place. RTC seconds value is incorrect. RTC day of week is incorrect. Onboard NVRAM test failed on data mismatch. NVRAM failed the data compare check. Incorrect register configuration.
Incorrect device ID. Incorrect revision. Incorrect power control settings. Comm port signal error detected. Comm port stuck. Comm port failed on data mismatch. Comm port failed on data transfer. Comm port failed on data comparison. No over-temperature signal detected. Cannot program the FLASH.
DZH0428 Unrecognized device (ID = hex_value, hex_value) in slot dec_value DZH0431 No card detected in slot dec_value DZH0432 Card detected in (nonexistent) slot dec_value DZH0435 DZH0436 DZH0437 DZH0438 DZH0446 DZH0447 DZH0448 DZH0449 DZH0450 DZH0451 Expected overtemp interrupt A did not occur Expected overtemp interrupt B did not occur Expected overtemp interrupt C did not occur Bad CRC in PS_dec_value; got hex_value, expected hex_value Backplane over temperature Read i2c failed Write EEPROM failed Read EEPROM failed Failed to read PS status remove Jumper J1 (backplane)
Incorrect device is detected. No card is in the selected slot. A card is found in a slot that does not exist. Expected interrupt signal did not occur.
Incorrect CRC on power supply EEPROM. Backplane temperature is beyond range. i2c cannot be read. Cannot write to the EEPROM. Cannot read from the EEPROM. Need to remove the jumper to read the power supply status. Incorrect CRC. Onboard NVRAM read/write test failed. Onboard NVRAM test failed on data comparison. Incorrect CRC. Cache data error. Unexpected data read. CPU failed MP cache test. Expected voltage interrupt did not occur.
DZH0452 Conflicting CRCs; CRC1 = hex_value, CRC2 = hex_value DZH0453 CRC has garbage; CRC = %08x DZH0454 R/W test; address = hex_value expected = hex_value, actual = hex_value\n DZH0455 Readback; address = hex_value expected = hex_value, actual = hex_value DZH0456 Conflicting CRCs; CRC1 = hex_value, CRC3 = hex_value DZH0507 MP table checksum bad DZH0508 DZH0601 Data error in cache tag test DZH0602 MP cache test CPU dec_value is too slow DZH0801 Expected ASCII_value (i.e., voltage) overvoltage interrupt did not occur DZH0802 Expected ASCII_value (i.e., voltage) undervoltage interrupt did not occur DZH2000 Unable to write Xstorage system PEF Table entry DZH2001 DZH2002 DZH3000
DZH3001 DZH3002
Unable to add the Platform Event Filter entry into the table, which controls the watchdog functionality. Watchdog did not bite The expected watchdog interrupt did not occur. Incorrect CIOB installed. The motherboard requires revision The wrong revision of CIOB was found on %d, but revision %d was found. the motherboard. PCI Express Correctable Error from HT2000 ( %d ): EXB( %d, The chipset detected an error on a PCI %d, %d ): RootErr(hex_value(s)); Br[ %d ](%d, %d, %d ): Express bus, but the hardware has already DevStatus( hex_value(s)); Br[ %d ](%d, %d, %d ): corrected it. DevStatus( hex_value(s)). Unexpected watchdog The watchdog hardware is faulty. Unexpected NMI: <Message string will identify either the If the front panel is identified, then the
error could be due to a faulty front panel or a faulty front panel-to-motherboard connection. If the RLM is identified, then the error could be due to a faulty RLM or a faulty front panel-to-RLM connection.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error message group: Error message group DZH0101-DZH0106 DZH0123-DZH0127 DZH0136-DZH0138 DZH0150 DZH0154-DZH0168 DZH0175-DZH0194 DZH0196 DZH0197-DZH0198 DZH0301-DZH0308 DZH0319 DZH0330 DZH0354-DZH0355 DZH0359 DZH0366-DZH0379 DZH0416, DZH0419 DZH0454-DZH0455 DZH0415 DZH0435-DZH0437 DZH0446 DZH0195 DZH0309-DZH0312 DZH0318 DZH0320-DZH0324 DZH0329 DZH0330-DZH0353 DZH0356-DZH0358 DZH0360-DZH0365 DZH0447-DZH0450 DZH0452-DZH0453 DZH0456 DZH0170-DZH0172 DZH0313-DZH0317 DZH0325-DZH0328 DZH0394-DZH0398 DZH0438 Corrective action Motherboard error, call NetApp Technical Support.
1. Check that the fans are running. Replace the defective fans.
2. Check that the air vents are clear of dirt or debris. Clean the vents if clogged.
3. Replace the motherboard if the fans are working and the vents are clear.
4. Call NetApp Technical Support if the error is not corrected.

1. Check that the correct PCI device is in the correct slot.
2. Replace the PCI device.
3. Replace the motherboard if the PCI device is not working.
4. Call NetApp Technical Support if the error is not corrected.
1. Check that the power supplies are connected and running. Replace the defective power supply.
2. Replace the motherboard if power supplies are good.
3. Call NetApp Technical Support for instructions if the error is not corrected.

DZH0442-DZH0445
1. Check the battery connections.
2. Replace the battery if connections are good.
3. Call NetApp Technical Support for instructions if the error is not corrected.

1. Replace the CPU.
2. If the CPU is good, replace the motherboard.
3. Call NetApp Technical Support if the error is not corrected.

1. Replace the motherboard.
2. Call NetApp Technical Support if the error is not corrected.

1. Check the device at the indicated slot and replace it with the correct device.
2. If the device is correct, replace the motherboard.
3. Call NetApp Technical Support if the error is not corrected.

1. Check the jumper location and move it to the correct location.
2. If this does not solve the error, call NetApp Technical Support.

1. Check power supplies and replace defective units.
2. Call NetApp Technical Support if the error is not corrected.

1. Ignore this message if it appears only once, because the hardware has already corrected it.
2. Call NetApp Technical Support if the message persists.

1. Ignore this message if it appears only once, because the hardware has already corrected it.
2. If the message persists, replace the motherboard.

1. Ignore this message if it appears only once, because the hardware has already corrected it.
2. If the message persists, replace the identified hardware component.
3. If the message still persists, replace the motherboard.
DZH0215-DZH0216
DZH3001
DZH3002
Corrective action
Call NetApp Technical Support.
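Because the corrective actions are keyed on error message groups such as DZH0154-DZH0168 or "DZH0416, DZH0419", a script that post-processes diagnostic output first needs to decide whether a reported code belongs to a group. The following is a minimal sketch of that check, not part of the Diagnostic Monitor. The function names parse_code and in_group are hypothetical, and the sketch assumes every code has the form DZH followed by four digits, as in the tables above.

```python
import re


def parse_code(code):
    """Return the numeric part of a code such as 'DZH0160' (assumed format)."""
    match = re.fullmatch(r"DZH(\d{4})", code.strip().upper())
    if match is None:
        raise ValueError(f"not a DZH error code: {code!r}")
    return int(match.group(1))


def in_group(code, group):
    """Check a code against a group written as in the corrective action table.

    A group is a comma-separated list of single codes or first-last ranges,
    for example 'DZH0154-DZH0168' or 'DZH0416, DZH0419'.
    """
    number = parse_code(code)
    for part in group.split(","):
        ends = [parse_code(piece) for piece in part.split("-")]
        # A single code is treated as a range whose first and last are equal.
        if ends[0] <= number <= ends[-1]:
            return True
    return False


if __name__ == "__main__":
    print(in_group("DZH0160", "DZH0154-DZH0168"))
    print(in_group("DZH0417", "DZH0416, DZH0419"))
```

The sketch deliberately stops at group membership; which corrective action applies to a group must still be read from the table.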
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01030x | Motherboard temperature (motherboard temp).
ENV01031x | Front panel temperature (Front panel temp).
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this error message grouping is below the error message description.
Note: "[d]" in the sample error message represents one of the four temperature sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
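The temperature tables above combine a sensor code with a trailing condition value "x". As an illustration only, the sketch below decodes such a code into the sensor and the matching description. It assumes a reported code is the sensor code with "x" replaced by a single digit (for example ENV010303); that literal form, and the name decode_env_code, are assumptions made for this example rather than something the guide states.

```python
import re

# Sensor codes and descriptions copied from the sensor table above.
SENSORS = {
    "ENV01030": "Motherboard temperature",
    "ENV01031": "Front panel temperature",
}

# Condition values and descriptions copied from the error message table above.
CONDITIONS = {
    1: "Cannot read [d].",
    2: "[d] exceeds the critical high threshold.",
    3: "[d] exceeds the warning high threshold.",
    4: "[d] falls below the warning low threshold.",
    5: "[d] falls below the critical low threshold.",
    6: "Missing interrupt when [d] exceeds the warning high threshold.",
    7: "Missing interrupt when [d] falls below the warning low threshold.",
}


def decode_env_code(code):
    """Split a code such as 'ENV010303' into (sensor, description).

    Assumes the last digit is the "x" condition value (hypothetical format).
    """
    match = re.fullmatch(r"(ENV\d{5})(\d)", code.strip().upper())
    if match is None:
        raise ValueError(f"not an environmental error code: {code!r}")
    sensor = SENSORS.get(match.group(1))
    template = CONDITIONS.get(int(match.group(2)))
    if sensor is None or template is None:
        raise KeyError(f"code not covered by these tables: {code!r}")
    return sensor, template.replace("[d]", sensor)


if __name__ == "__main__":
    print(decode_env_code("ENV010303"))
```

The same approach would extend to the other sensor tables by adding their codes and condition rows to the two dictionaries.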
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status error messages can be generated by the following power supply fans within each power supply module.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01038x | PSU 1 Fan 1 (Power supply unit 1 fan 1).
ENV01039x | PSU 1 Fan 2 (Power supply unit 1 fan 2).
ENV01040x | PSU 2 Fan 1 (Power supply unit 2 fan 1).
ENV01041x | PSU 2 Fan 2 (Power supply unit 2 fan 2).
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the six baseboard fan sensors or one of the four power supply fan sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
Replace the power supply unit.
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01060x | Motherboard temperature (motherboard temp).
ENV01061x | Front panel temperature (Front panel temp).
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this error message grouping is below the error message description.
Note: "[d]" in the sample error message represents one of the four temperature sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status error messages can be generated by the following power supply fans within each power supply module.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01068x | PSU 1 Fan 1 (Power supply unit 1 fan 1).
ENV01069x | PSU 1 Fan 2 (Power supply unit 1 fan 2).
ENV01070x | PSU 2 Fan 1 (Power supply unit 2 fan 1).
ENV01071x | PSU 2 Fan 2 (Power supply unit 2 fan 2).
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the six baseboard fan sensors or one of the four power supply fan sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
Replace the power supply unit.
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01538x | Board temperature top
ENV01539x | Board temperature bottom
ENV01540x | CPU temperature
ENV01547x | Battery temperature
ENV01552x | Board top temperature
ENV01553x | Board bottom temperature
ENV01554x | PSU starboard temperature
ENV01555x | PSU port temperature
The following table lists the error messages that can be generated by the temperature sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the temperature sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in a high warning state | [d] exceeds the warning high threshold.
4 | [d] is in a low warning state | [d] falls below the warning low threshold.
5 | [d] is in a critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
8 | [d] failed to return to normal | Sensor [d] cannot be set to the normal state.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:
Error code range | Corrective action
ENV01538x-ENV01540x | Replace the motherboard.
ENV01547x | 1. Replace the motherboard battery. 2. If the problem remains, replace the motherboard.
Replace the top PCM motherboard in chassis. Replace the bottom PCM motherboard in chassis. Replace PSU 2. Replace PSU 1.
Voltage sensors
Voltage power sensors error message description
Error messages can be generated by the voltage power sensors on the platform.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01530x | Board 1.1V sensor
ENV01531x | Board 1.2V sensor
ENV01532x | Board 1.5V sensor
ENV01533x | Board 1.8V sensor
ENV01534x | Board 2.5V sensor
ENV01535x | Board 3.3V sensor
ENV01536x | CPU 1.2V sensor
ENV01537x | 12V sensor
ENV01546x | Charger voltage
ENV01549x | Battery 8.0 voltage
ENV01550x | NVMEM 1.8V sensor
ENV01551x | NVMEM 8.0V sensor
ENV01558x | PSU starboard 12V sensor
ENV01559x | PSU starboard 5V sensor
ENV01560x | PSU starboard 3.3V sensor
ENV01561x | PSU port 12V sensor
ENV01562x | PSU port 5V sensor
ENV01563x | PSU port 3.3V sensor
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the voltage sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in a high warning state | [d] exceeds the warning high threshold.
4 | [d] is in a low warning state | [d] falls below the warning low threshold.
5 | [d] is in a critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:
Error code range | Corrective action
ENV01530x-ENV01537x, ENV01550x | Replace the motherboard.
ENV01546x, ENV01549x, ENV01551x | 1. Replace the motherboard battery. 2. If the problem remains, replace the motherboard.
ENV01558x-ENV01560x | Replace PSU 2.
ENV01561x-ENV01563x | Replace PSU 1.
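The voltage sensor corrective actions above can be expressed as a range lookup. The sketch below is an illustration under stated assumptions, not a NetApp tool: it assumes a reported code is the sensor code with "x" replaced by one digit, and that every sensor code has the five-digit form used in the sensor list (for example ENV01550x). The names ACTION_TABLE and corrective_action are hypothetical.

```python
import re

BATTERY_STEPS = [
    "Replace the motherboard battery.",
    "If the problem remains, replace the motherboard.",
]

# (first sensor number, last sensor number, corrective action steps),
# transcribed from the voltage sensor corrective action table.
ACTION_TABLE = [
    (1530, 1537, ["Replace the motherboard."]),
    (1550, 1550, ["Replace the motherboard."]),
    (1546, 1546, BATTERY_STEPS),
    (1549, 1549, BATTERY_STEPS),
    (1551, 1551, BATTERY_STEPS),
    (1558, 1560, ["Replace PSU 2."]),
    (1561, 1563, ["Replace PSU 1."]),
]


def corrective_action(code):
    """Return the corrective action steps for a voltage sensor error code.

    The trailing digit (the "x" condition value) is ignored, because the
    table assigns actions per sensor code, not per condition.
    """
    match = re.fullmatch(r"ENV(\d{5})\d", code.strip().upper())
    if match is None:
        raise ValueError(f"not an environmental error code: {code!r}")
    sensor = int(match.group(1))
    for first, last, steps in ACTION_TABLE:
        if first <= sensor <= last:
            return steps
    raise KeyError(f"no voltage sensor entry for {code!r}")


if __name__ == "__main__":
    for step in corrective_action("ENV015492"):
        print(step)
```

Keeping the ranges as data rather than as chained conditionals makes it straightforward to check the script against the printed table row by row.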
Current sensors
Current sensors error message description
Error messages can be generated by the current sensors on the power supplies.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01542x | Battery Amp
ENV01544x | Charger Amp
ENV01564x | PSU starboard current 12
ENV01565x | PSU starboard current 5
ENV01566x | PSU port current 12
ENV01567x | PSU port current 5
The following table lists the error messages that can be generated by the current sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the current sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in a high warning state | [d] exceeds the warning high threshold.
4 | [d] is in a low warning state | [d] falls below the warning low threshold.
5 | [d] is in a critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
8 | [d] failed to return to normal | Sensor [d] cannot be set to the normal state.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:
Error code range | Corrective action
ENV01542x, ENV01544x | 1. Replace the motherboard battery. 2. If the problem remains, replace the motherboard.
ENV01564x-ENV01565x | Replace PSU 2.
ENV01566x-ENV01567x | Replace PSU 1.
Battery sensor
Battery sensor error message description
Error messages can be generated by the following battery sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01543x | Battery capacity
ENV01545x | Charger cycles
ENV01548x | Battery run time
The following table lists the error messages that can be generated by the battery sensors. The corrective action for this error message grouping is below the error message description.
Note: "[d]" in the sample error message represents the status of the battery sensor:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in a high warning state | [d] exceeds the warning high threshold.
4 | [d] is in a low warning state | [d] falls below the warning low threshold.
5 | [d] is in a critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
8 | [d] failed to return to normal | Sensor [d] cannot be set to the normal state.
Corrective action
1. Replace the motherboard battery.
2. If the problem remains, replace the motherboard.
Fan sensors
Fan sensors error message description
Error messages can be generated by the following fan sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01556x | PSU starboard fan
ENV01557x | PSU port fan
The following table lists the error messages that can be generated by the fan sensors. The corrective action for this error message grouping is below the error message description.
Note: "[d]" in the sample error message represents one of the two fan sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in a high warning state | [d] exceeds the warning high threshold.
4 | [d] is in a low warning state | [d] falls below the warning low threshold.
5 | [d] is in a critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
8 | [d] failed to return to normal | Sensor [d] cannot be set to the normal state.
Corrective action
The following table lists the error message groupings and corrective action that can be taken for the error code range:
Error code range | Corrective action
ENV01556x | Replace PSU 2.
ENV01557x | Replace PSU 1.
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
1 | [d] does not read | [d] is not responding.
2 | [d] is in bad state | [d] is not functioning.
4 | [d] expected interrupt-to-bad did not occur | The interrupt indicating that the [d] sensor is malfunctioning did not occur.
5 | [d] expected interrupt-to-normal did not occur | The interrupt indicating that the [d] sensor is back to normal did not occur.
Corrective action
Replace the motherboard.
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01240x | CPU 1 temperature (central processing unit 1 temp).
ENV01241x | CPU 2 temperature (central processing unit 2 temp).
ENV01247x | PSU 1 temperature (power supply 1 temp).
ENV01248x | PSU 2 temperature (power supply 2 temp).
ENV01251x | Backplane MB temperature (backplane motherboard temp).
ENV01252x | Backplane HDD temperature (backplane hard disk drive temp).
ENV01253x | Front panel temperature (Front panel temp).
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this error message grouping is below the error message description.
Note: "[d]" in the sample error message represents one of the four temperature sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
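The three steps above amount to a small decision procedure. The following Python sketch is illustrative only: the function name and its boolean input are assumptions made for this example and are not part of the Diagnostic Monitor.

```python
def temperature_corrective_action(psu_fans_ok: bool) -> str:
    """Return the recommended action for a temperature sensor error.

    psu_fans_ok is a hypothetical input: the outcome of step 1, that is,
    checking the PSU fans from the Diagnostics menu and by looking at them.
    """
    if not psu_fans_ok:
        return "Replace the PSUs."        # step 2: the fans are bad
    return "Replace the motherboard."     # step 3: the fans are good


print(temperature_corrective_action(psu_fans_ok=False))  # Replace the PSUs.
```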
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status error messages can be generated by the following chassis fans and power supply fans.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01254x | Sys 1 Fan 1 (Chassis fan 1 unit 1).
ENV01255x | Sys 1 Fan 2 (Chassis fan 1 unit 2).
ENV01256x | Sys 2 Fan 1 (Chassis fan 2 unit 1).
ENV01257x | Sys 2 Fan 2 (Chassis fan 2 unit 2).
ENV01258x | PSU 1 Fan 1 (Power supply unit 1 fan 1).
ENV01259x | PSU 1 Fan 2 (Power supply unit 1 fan 2).
ENV01260x | PSU 2 Fan 1 (Power supply unit 2 fan 1).
ENV01261x | PSU 2 Fan 2 (Power supply unit 2 fan 2).
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the fan sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.
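As a rough illustration, the existence conditions and corrective actions above can be kept in one lookup table. This Python sketch assumes that the three corrective actions correspond, in order, to conditions 2, 3, and 4; the names PSU_EXISTENCE_ACTIONS and psu_existence_action, and the sensor label "PSU 1", are hypothetical.

```python
# Condition digit "x" -> (description, corrective action).
# Assumption: the three corrective actions map in order to conditions 2, 3, 4.
PSU_EXISTENCE_ACTIONS = {
    2: ("[d] is missing.", "Install the power supply."),
    3: ("[d] is off.", "Turn the power supply on."),
    4: ("[d] is not functioning.", "Replace the power supply."),
}


def psu_existence_action(condition: int, sensor: str) -> str:
    """Describe an existence error and the matching corrective action."""
    if condition not in PSU_EXISTENCE_ACTIONS:
        raise ValueError(f"unknown existence condition: {condition}")
    description, action = PSU_EXISTENCE_ACTIONS[condition]
    # Substitute the sensor name for the "[d]" placeholder.
    return f"{description.replace('[d]', sensor)} {action}"


print(psu_existence_action(3, "PSU 1"))  # PSU 1 is off. Turn the power supply on.
```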
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01612x | CPU 1 temperature (central processing unit 1 temp).
ENV01613x | CPU 2 temperature (central processing unit 2 temp).
ENV01614x | PSU 1 temperature (power supply 1 temp).
ENV01615x | PSU 2 temperature (power supply 2 temp).
ENV01616x | LCD board temperature (platform's LCD board temp).
ENV01281x | MB front zone temperature (motherboard temp in the front).
ENV01282x | MB rear zone temperature (motherboard temp in the rear).
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this group of error messages follows the table.
Note: "[d]" in the sample error message represents one of the temperature sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
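To show how a full temperature error code for this platform breaks down, the following Python sketch splits a code into its sensor part and its condition digit and builds the matching sample message. It assumes the condition digit simply replaces the trailing "x" of the sensor code; the dictionary and function names are hypothetical and not part of any NetApp tool.

```python
import re

# Sensor codes (without the trailing "x") for this platform's temperature sensors.
TEMPERATURE_SENSORS = {
    "ENV01612": "CPU 1 temperature",
    "ENV01613": "CPU 2 temperature",
    "ENV01614": "PSU 1 temperature",
    "ENV01615": "PSU 2 temperature",
    "ENV01616": "LCD board temperature",
    "ENV01281": "MB front zone temperature",
    "ENV01282": "MB rear zone temperature",
}

# Condition digit "x" -> sample error message template.
TEMPERATURE_CONDITIONS = {
    "1": "[d] does not read",
    "2": "[d] is in critical high state",
    "3": "[d] is in warning high state",
    "4": "[d] is in warning low state",
    "5": "[d] is in critical low state",
    "6": "[d] expected high interrupt did not occur",
    "7": "[d] expected low interrupt did not occur",
}


def decode_temperature_code(code: str) -> str:
    """Turn a code such as 'ENV016122' into its sample error message."""
    # Assumption: "ENV" + five digits identify the sensor, one more digit is "x".
    match = re.fullmatch(r"(ENV\d{5})(\d)", code)
    if match is None:
        raise ValueError(f"not a sensor error code: {code!r}")
    sensor_code, condition = match.groups()
    if sensor_code not in TEMPERATURE_SENSORS:
        raise ValueError(f"unknown temperature sensor: {sensor_code}")
    if condition not in TEMPERATURE_CONDITIONS:
        raise ValueError(f"unknown error condition: {condition}")
    sensor = TEMPERATURE_SENSORS[sensor_code]
    return TEMPERATURE_CONDITIONS[condition].replace("[d]", sensor)


print(decode_temperature_code("ENV016122"))  # CPU 1 temperature is in critical high state
```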
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status and existence error messages can be generated by the following chassis fan and power supply fan sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01600x | Sys 1 Fan 1 (Chassis fan 1 unit 1).
ENV01601x | Sys 1 Fan 2 (Chassis fan 1 unit 2).
ENV01602x | Sys 2 Fan 1 (Chassis fan 2 unit 1).
ENV01603x | Sys 2 Fan 2 (Chassis fan 2 unit 2).
ENV01604x | PSU 1 Fan 1 (Power supply unit 1 fan 1).
ENV01605x | PSU 1 Fan 2 (Power supply unit 1 fan 2).
ENV01606x | PSU 2 Fan 1 (Power supply unit 2 fan 1).
ENV01607x | PSU 2 Fan 2 (Power supply unit 2 fan 2).
ENV01627x | System Fan FRU 1.
ENV01628x | System Fan FRU 2.
ENV01640x | SYS_FAN_1 present.
ENV01641x | SYS_FAN_2 present.
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the fan sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
Replace the power supply unit.
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01856x | CPU 1 temperature (central processing unit 1 temp).
ENV01857x | CPU 2 temperature (central processing unit 2 temp).
ENV01858x | Front CPU temperature.
ENV01859x | MB front temperature (motherboard temp in the front).
ENV01860x | Rear CPU temperature.
ENV01861x | MB central temperature (motherboard temp in the center).
ENV01862x | MB Rear PCI temperature (motherboard temp in the rear).
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this group of error messages follows the table.
Note: "[d]" in the sample error message represents one of the temperature sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status and existence error messages can be generated by the following sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01840x | SYS_FAN_A1 Fan 1 (Chassis fan A1 unit 1).
ENV01841x | SYS_FAN_A1 Fan 2 (Chassis fan A1 unit 2).
ENV01842x | SYS_FAN_A2 Fan 1 (Chassis fan A2 unit 1).
ENV01843x | SYS_FAN_A2 Fan 2 (Chassis fan A2 unit 2).
ENV01844x | SYS_FAN_A3 Fan 1 (Chassis fan A3 unit 1).
ENV01845x | SYS_FAN_A3 Fan 2 (Chassis fan A3 unit 2).
ENV01846x | SYS_FAN_B1 Fan 1 (Chassis fan B1 unit 1).
ENV01847x | SYS_FAN_B1 Fan 2 (Chassis fan B1 unit 2).
ENV01848x | SYS_FAN_B2 Fan 1 (Chassis fan B2 unit 1).
ENV01849x | SYS_FAN_B2 Fan 2 (Chassis fan B2 unit 2).
ENV01850x | SYS_FAN_B3 Fan 1 (Chassis fan B3 unit 1).
ENV01851x | SYS_FAN_B3 Fan 2 (Chassis fan B3 unit 2).
ENV01852x | CPU 3.3V Active
ENV01853x | CPU 3.3V Standby
ENV01854x | CPU 5V
ENV01855x | CPU 12V
ENV01883x | Partner controller present
ENV01884x | SYS_FAN_A1 present
ENV01885x | SYS_FAN_A2 present
ENV01886x | SYS_FAN_A3 present
ENV01887x | SYS_FAN_B1 present
ENV01888x | SYS_FAN_B2 present
ENV01889x | SYS_FAN_B3 present
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the fan sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
Replace the power supply unit.
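The sensor codes listed for this platform fall into contiguous numeric ranges, so a code can be sorted into a sensor group before its condition digit is looked up. The following Python sketch is illustrative only: the group labels and the function name sensor_group are assumptions, and the ranges are read directly from the platform's sensor code table.

```python
def sensor_group(sensor_code: str) -> str:
    """Classify a sensor code such as 'ENV01852x' by its numeric range."""
    if not sensor_code.startswith("ENV") or len(sensor_code) < 8:
        raise ValueError(f"not a sensor code: {sensor_code!r}")
    # The five digits after "ENV" identify the sensor; the final character
    # (the "x" or a condition digit) is ignored here.
    number = int(sensor_code[3:8])
    if 1840 <= number <= 1851:
        return "chassis fan status"           # SYS_FAN_A1..B3, Fan 1 and Fan 2
    if 1852 <= number <= 1855:
        return "CPU voltage"                  # 3.3V Active/Standby, 5V, 12V
    if number == 1883:
        return "partner controller presence"
    if 1884 <= number <= 1889:
        return "chassis fan presence"         # SYS_FAN_A1..B3 present
    raise ValueError(f"code not in this platform's table: {sensor_code!r}")


print(sensor_group("ENV01852x"))  # CPU voltage
```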
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.
Temperature sensors
Temperature sensor error message description
Error messages can be generated by the following temperature sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01393x | I/O board temperature.
ENV01394x | Front panel temperature.
ENV01411x | CPU 0 temperature (central processing unit 0 temp).
ENV01412x | CPU 1 temperature (central processing unit 1 temp).
ENV01413x | CPU 2 temperature (central processing unit 2 temp).
ENV01414x | CPU 3 temperature (central processing unit 3 temp).
ENV01415x | MB Zone 1 temperature.
ENV01416x | MB Zone 2 temperature.
ENV01425x | PSU 1 temperature.
ENV01426x | PSU 2 temperature.
The following table lists the error messages that can be generated by the temperature sensors on the motherboard. The corrective action for this group of error messages follows the table.
Note: "[d]" in the sample error message represents one of the temperature sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d].
2 | [d] is in critical high state | [d] exceeds the critical high threshold.
3 | [d] is in warning high state | [d] exceeds the warning high threshold.
4 | [d] is in warning low state | [d] falls below the warning low threshold.
5 | [d] is in critical low state | [d] falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] falls below the warning low threshold.
Corrective action
1. Check to see whether the PSU fans are working properly (from the Diagnostics menu, as well as by physically looking at them).
2. If the fans are bad, replace the PSUs.
3. If the fans are good, replace the motherboard.
PSU 2 AC voltage PSU 1 12 Volt PSU 2 12 Volt PSU 1 5 Volt PSU 2 5 Volt
The following table lists the error messages that can be generated by the voltage power sensors. The corrective action for these error messages is below the error message description.
Note: "[d]" in the sample error message represents one of the seven voltage power sensors:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read the [d] power sensor.
2 | [d] is in critical high state | [d] power sensor exceeds the critical high threshold.
3 | [d] is in warning high state | [d] power sensor exceeds the warning high threshold.
4 | [d] is in warning low state | [d] power sensor falls below the warning low threshold.
5 | [d] is in critical low state | [d] power sensor falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when [d] power sensor exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when [d] power sensor falls below the warning low threshold.
Corrective action
Replace the power supply. If the problem remains, replace the motherboard.
Fan sensors
Fan sensor error message description
Error messages can be generated by the fan sensors for existence and status. The corrective action for all fan sensor error messages is below all the error message descriptions.
Fan sensors
Status and existence error messages can be generated by the following power supply fan and chassis fan sensors.
Note: The "x" in the code represents the actual error condition:
Platform and sensor code | Sensor description
ENV01419x | PSU 1 Fan 1 (Power supply unit 1 fan 1).
ENV01420x | PSU 1 Fan 2 (Power supply unit 1 fan 2).
ENV01421x | PSU 2 Fan 1 (Power supply unit 2 fan 1).
ENV01422x | PSU 2 Fan 2 (Power supply unit 2 fan 2).
ENV01431x | Sys Fan 0 (Chassis fan 0).
ENV01432x | Sys Fan 1 (Chassis fan 1).
ENV01433x | Sys Fan 2 (Chassis fan 2).
ENV01434x | Sys Fan 3 (Chassis fan 3).
ENV01435x | Sys Fan 4 (Chassis fan 4).
ENV01436x | Sys Fan 5 (Chassis fan 5).
ENV01437x | Sys Fan 6 (Chassis fan 6).
ENV01438x | Sys Fan 7 (Chassis fan 7).
ENV01439x | Sys Fan 8 (Chassis fan 8).
ENV01440x | Sys Fan 9 (Chassis fan 9).
ENV01443x | Fan FRU 1 present.
ENV01444x | Fan FRU 2 present.
ENV01445x | Fan FRU 3 present.
ENV01446x | Fan FRU 4 present.
ENV01447x | Fan FRU 5 present.
The following table lists the error messages that can be generated by the baseboard and power supply fan sensors.
Note: "[d]" in the sample error message represents one of the fan sensors listed above:
If "x" is... | Sample error message | Description
1 | [d] does not read | Cannot read [d] sensor.
2 | [d] is in critical high state | [d] speed read exceeds the critical high threshold.
3 | [d] is in warning high state | [d] speed read exceeds the warning high threshold.
4 | [d] is in warning low state | [d] speed read falls below the warning low threshold.
5 | [d] is in critical low state | [d] speed read falls below the critical low threshold.
6 | [d] expected high interrupt did not occur | Missing interrupt when the [d] speed exceeds the warning high threshold.
7 | [d] expected low interrupt did not occur | Missing interrupt when the [d] speed falls below the warning low threshold.
8 | [d] can't be speeded up | [d] cannot be sped up by the system.
9 | [d] can't be slowed down | [d] cannot be slowed down by the system.
Corrective action
Replace the power supply unit.
The following table lists the error messages that can be generated for the power supply existence sensors.
Note: "[d]" in the sample error message represents one of the two sensors indicating the existence of the power supplies:
If "x" is... | Sample error message | Description
2 | [d] is not installed | [d] is missing.
3 | [d] is installed, but powered off | [d] is off.
4 | [d] is installed and powered on, but not functioning | [d] is not functioning.
Corrective action
1. Install the power supply.
2. Turn the power supply on.
3. Replace the power supply.