Cisco MDS 9000 Family Troubleshooting Guide, Release 2.x
December 2005
Corporate Headquarters
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134-1706
USA
http://www.cisco.com
Tel: 408 526-4000
800 553-NETS (6387)
Fax: 408 526-4100
THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL
STATEMENTS, INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT
WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS.
THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT
SHIPPED WITH THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE
OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY.
The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB’s public
domain version of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California.
NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED “AS IS” WITH
ALL FAULTS. CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT
LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF
DEALING, USAGE, OR TRADE PRACTICE.
IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING,
WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO
OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
CCSP, CCVP, the Cisco Square Bridge logo, Follow Me Browsing, and StackWise are trademarks of Cisco Systems, Inc.; Changing the Way We Work, Live, Play, and Learn, and
iQuick Study are service marks of Cisco Systems, Inc.; and Access Registrar, Aironet, ASIST, BPX, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, Cisco, the Cisco
Certified Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Cisco Unity, Empowering the Internet Generation,
Enterprise/Solver, EtherChannel, EtherFast, EtherSwitch, Fast Step, FormShare, GigaDrive, GigaStack, HomeLink, Internet Quotient, IOS, IP/TV, iQ Expertise, the iQ logo, iQ
Net Readiness Scorecard, LightStream, Linksys, MeetingPlace, MGX, the Networkers logo, Networking Academy, Network Registrar, Packet, PIX, Post-Routing, Pre-Routing,
ProConnect, RateMUX, ScriptShare, SlideCast, SMARTnet, StrataView Plus, TeleRouter, The Fastest Way to Increase Your Internet Quotient, and TransPath are registered
trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.
All other trademarks mentioned in this document or Website are the property of their respective owners. The use of the word partner does not imply a partnership relationship
between Cisco and any other company. (0502R)
CONTENTS
Preface xix
Document Conventions xx
Send documentation comments to mdsfeedback-doc@cisco.com
Overview 2-1
Overview 3-1
Overview 5-1
Overview 6-1
Overview 8-1
Overview 10-1
Verifying the Establishment of the FCIP Tunnel with the CLI 10-10
Verifying the Establishment of Default TCP Connections for Each Configured FCIP Tunnel with the CLI 10-12
Verifying the Statistics of the ASIC Chip on Each Gigabit Ethernet Port with the CLI 10-13
Ethereal Screen Captures of the TCP Connection and FCIP Tunnels 10-13
One-to-Three FCIP Tunnel Creation and Monitoring 10-15
Displaying the Configuration of the First Switch with the CLI 10-16
Creating the FCIP Interface for the Second Tunnel with the CLI 10-16
FCIP Profile Misconfiguration Examples 10-17
Displaying Incorrect or Non-existent IP Address for Use with FCIP Profile with the CLI 10-17
Displaying Configuration Errors when Bringing Up a Tunnel on a Selected Port with the CLI 10-18
Interface FCIP Misconfiguration Examples 10-20
Displaying FCIP Misconfiguration Examples with the CLI 10-20
Displaying the Interface FCIP Shut Down Administratively with the CLI 10-20
Displaying the Debug Output from the Second Switch with the CLI 10-22
Displaying Passive Mode Set on Both Sides of the FCIP Tunnel with the CLI 10-23
Displaying a Time Stamp Acceptable Difference Failure with the CLI 10-24
FCIP Special Frame Tunnel Creation and Monitoring 10-26
Configuring and Displaying an FCIP Tunnel with a Special Frame with the CLI 10-27
Special Frame Misconfiguration Examples 10-29
Displaying Incorrect Peer WWN when Using Special Frame with the CLI 10-29
Troubleshooting iSCSI Issues 10-31
Troubleshooting iSCSI Authentication 10-31
Displaying iSCSI Authentication with the CLI 10-33
Username/Password Configuration Troubleshooting 10-33
Verifying iSCSI Users Account Configuration with the CLI 10-33
RADIUS Configuration Troubleshooting 10-33
Verifying Matching RADIUS Key and Port for Authentication and Accounting with the CLI 10-34
Troubleshooting RADIUS Routing Configuration 10-36
Displaying the Debug Output for RADIUS Authentication Request Routing with the CLI 10-36
Troubleshooting Dynamic iSCSI Configuration 10-36
Checking the Configuration 10-37
Performing Basic Dynamic iSCSI Troubleshooting 10-37
Useful show Commands for Debugging Dynamic iSCSI Configuration 10-37
Virtual Target Access Control 10-39
Useful show Commands for Debugging Static iSCSI Configuration with the CLI 10-39
Fine Tuning/Troubleshooting iSCSI TCP Performance 10-44
Commands Used to Access Performance Data with the CLI 10-44
Understanding TCP Parameters for iSCSI 10-44
Overview 11-1
INDEX
This chapter provides release-specific information for each new and changed troubleshooting guideline
for the Cisco MDS SAN-OS Release 2.x software. The Cisco MDS 9000 Family Troubleshooting Guide,
Release 2.x is updated to address each new and changed guideline in the Cisco MDS SAN-OS Release
2.x software. The latest version of this document is available at the following Cisco Systems website:
http://www.cisco.com/en/US/products/ps5989/prod_troubleshooting_guides_list.html
Tip The troubleshooting guides created for previous releases are also listed on the website mentioned above.
Each guide addresses the features introduced in or available in those releases. Select and view the
troubleshooting guide pertinent to the software installed on your switch.
To check for additional information about Cisco MDS SAN-OS Release 2.x, refer to the Cisco MDS
9000 Family Release Notes available at the following Cisco Systems website:
http://www.cisco.com/en/US/products/hw/ps4159/ps4358/prod_release_notes_list.html
Table 1 summarizes the new and changed features for the Cisco MDS 9000 Family Troubleshooting
Guide, Release 2.x, and tells you where they are documented. The table includes a brief description of
each new feature and the release in which the change occurred.
Note This updated version of the Cisco MDS 9000 Family Troubleshooting Guide, Release 2.x has been
reorganized from earlier versions to better address the most common troubleshooting issues in Cisco
SAN-OS Release 2.x.
Feature | Description | Changed in Release | Where Documented
Upgrades/Downgrades | Added troubleshooting Cisco SAN-OS upgrades, downgrades and reboots and bootflash recovery. | All releases | Chapter 2, “Troubleshooting Installs, Upgrades, and Reboots”
Hardware | Added troubleshooting fans, power supplies and clock modules. | All releases | Chapter 3, “Troubleshooting Hardware”
Licenses | Added troubleshooting license issues. | 1.3(1) | Chapter 4, “Troubleshooting Licensing”
Domains, FSPF | Added troubleshooting options for domains and FSPF. | All releases | Chapter 6, “Troubleshooting Ports”
Inter-VSAN Routing (IVR) Enhancements | Describes troubleshooting IVR, including IVR NAT and IVR auto topology. | 2.1(2b) | Chapter 8, “Troubleshooting IVR”
Preface
This document is intended to provide guidance for troubleshooting issues that may appear when
deploying a storage area network (SAN) using the Cisco MDS 9000 Family of switches. This document
introduces tools and methodologies to recognize a problem, determine its cause, and find possible
solutions.
Document Organization
This document is organized into the following chapters:
Document Conventions
Command descriptions use these conventions:
screen font Terminal sessions and information the switch displays are in screen font.
boldface screen font Information you must enter is in boldface screen font.
italic screen font Arguments for which you supply values are in italic screen font.
< > Nonprinting characters, such as passwords, are in angle brackets.
[ ] Default responses to system prompts are in square brackets.
!, # An exclamation point (!) or a pound sign (#) at the beginning of a line of code
indicates a comment line.
Note Means reader take note. Notes contain helpful suggestions or references to material not covered in the
manual.
Caution Means reader be careful. In this situation, you might do something that could result in equipment
damage or loss of data.
Related Documentation
The documentation set for the Cisco MDS 9000 Family includes the following documents. To find a
document online, use the Cisco MDS SAN-OS Documentation Locator at:
http://www.cisco.com/en/US/products/ps5989/products_documentation_roadmap09186a00804500c1.html.
For information on IBM TotalStorage SAN Volume Controller Storage Software for the Cisco MDS
9000 Family, refer to the IBM TotalStorage Support website:
http://www.ibm.com/storage/support/2062-2300/
Release Notes
• Cisco MDS 9000 Family Release Notes for Cisco MDS SAN-OS Releases
• Cisco MDS 9000 Family Release Notes for Storage Services Interface Images
• Cisco MDS 9000 Family Release Notes for Cisco MDS SVC Releases
• Cisco MDS 9000 Family Release Notes for Cisco MDS 9000 EPLD Images
Compatibility Information
• Cisco MDS 9000 SAN-OS Hardware and Software Compatibility Information
• Cisco MDS 9000 Family Interoperability Support Matrix
• Cisco MDS SAN-OS Release Compatibility Matrix for IBM SAN Volume Controller Software for
Cisco MDS 9000
• Cisco MDS SAN-OS Release Compatibility Matrix for Storage Service Interface Images
Hardware Installation
• Cisco MDS 9500 Series Hardware Installation Guide
• Cisco MDS 9200 Series Hardware Installation Guide
• Cisco MDS 9216 Switch Hardware Installation Guide
• Cisco MDS 9100 Series Hardware Installation Guide
• Cisco MDS 9020 Fabric Switch Hardware Installation Guide
Command-Line Interface
• Cisco MDS 9000 Family Software Upgrade and Downgrade Guide
• Cisco MDS 9000 Family CLI Quick Configuration Guide
• Cisco MDS 9000 Family CLI Configuration Guide
• Cisco MDS 9000 Family Command Reference
• Cisco MDS 9000 Family Quick Command Reference
• Cisco MDS 9020 Fabric Switch Configuration Guide and Command Reference
• Cisco MDS 9000 Family SAN Volume Controller Configuration Guide
Obtaining Documentation
Cisco documentation and additional literature are available on Cisco.com. Cisco also provides several
ways to obtain technical assistance and other technical resources. These sections explain how to obtain
technical information from Cisco Systems.
Cisco.com
You can access the most current Cisco documentation at this URL:
http://www.cisco.com/techsupport
You can access the Cisco website at this URL:
http://www.cisco.com
You can access international Cisco websites at this URL:
http://www.cisco.com/public/countries_languages.shtml
Ordering Documentation
Beginning June 30, 2005, registered Cisco.com users may order Cisco documentation at the Product
Documentation Store in the Cisco Marketplace at this URL:
http://www.cisco.com/go/marketplace/
Nonregistered Cisco.com users can order technical documentation from 8:00 a.m. to 5:00 p.m.
(0800 to 1700) PDT by calling 1 866 463-3487 in the United States and Canada, or elsewhere by
calling 011 408 519-5055. You can also order documentation by e-mail at
[email protected] or by fax at 1 408 519-5001 in the United States and Canada,
or elsewhere at 011 408 519-5001.
Documentation Feedback
You can rate and provide feedback about Cisco technical documents by completing the online feedback
form that appears with the technical documents on Cisco.com.
You can send comments about Cisco documentation to [email protected].
You can submit comments by using the response card (if present) behind the front cover of your
document or by writing to the following address:
Cisco Systems
Attn: Customer Document Ordering
170 West Tasman Drive
San Jose, CA 95134-9883
We appreciate your comments.
Tip We encourage you to use Pretty Good Privacy (PGP) or a compatible product to encrypt any sensitive
information that you send to Cisco. PSIRT can work from encrypted information that is compatible with
PGP versions 2.x through 8.x.
Never use a revoked or an expired encryption key. The correct public key to use in your correspondence
with PSIRT is the one linked in the Contact Summary section of the Security Vulnerability Policy page
at this URL:
http://www.cisco.com/en/US/products/products_security_vulnerability_policy.html
The link on this page has the current PGP key ID in use.
Note Use the Cisco Product Identification (CPI) tool to locate your product serial number before submitting
a web or phone request for service. You can access the CPI tool from the Cisco Technical Support &
Documentation website by clicking the Tools & Resources link under Documentation & Tools. Choose
Cisco Product Identification Tool from the Alphabetical Index drop-down list, or click the Cisco
Product Identification Tool link under Alerts & RMAs. The CPI tool offers three search options: by
product ID or model name; by tree view; or for certain products, by copying and pasting show command
output. Search results show an illustration of your product with the serial number label location
highlighted. Locate the serial number label on your product and record the information before placing a
service call.
• Cisco Press publishes a wide range of general networking, training and certification titles. Both new
and experienced users will benefit from these publications. For current Cisco Press titles and other
information, go to Cisco Press at this URL:
http://www.ciscopress.com
• Packet magazine is the Cisco Systems technical user magazine for maximizing Internet and
networking investments. Each quarter, Packet delivers coverage of the latest industry trends,
technology breakthroughs, and Cisco products and solutions, as well as network deployment and
troubleshooting tips, configuration examples, customer case studies, certification and training
information, and links to scores of in-depth online resources. You can access Packet magazine at
this URL:
http://www.cisco.com/packet
• iQ Magazine is the quarterly publication from Cisco Systems designed to help growing companies
learn how they can use technology to increase revenue, streamline their business, and expand
services. The publication identifies the challenges facing these companies and the technologies to
help solve them, using real-world case studies and business strategies to help readers make sound
technology investment decisions. You can access iQ Magazine at this URL:
http://www.cisco.com/go/iqmagazine
or view the digital edition at this URL:
http://ciscoiq.texterity.com/ciscoiq/sample/
• Internet Protocol Journal is a quarterly journal published by Cisco Systems for engineering
professionals involved in designing, developing, and operating public and private internets and
intranets. You can access the Internet Protocol Journal at this URL:
http://www.cisco.com/ipj
• Networking products offered by Cisco Systems, as well as customer support services, can be
obtained at this URL:
http://www.cisco.com/en/US/products/index.html
• Networking Professionals Connection is an interactive website for networking professionals to share
questions, suggestions, and information about networking products and technologies with Cisco
experts and other networking professionals. Join a discussion at this URL:
http://www.cisco.com/discuss/networking
• World-class networking training is available from Cisco. You can view current offerings at
this URL:
http://www.cisco.com/en/US/learning/index.html
CHAPTER 1
Troubleshooting Overview
This chapter introduces the basic concepts, methodology, and general troubleshooting guidelines for
problems that may occur when configuring and using the Cisco MDS 9000 Family of multilayer
directors and fabric switches.
This chapter includes the following sections:
• Overview of the Troubleshooting Process, page 1-1
• Overview of Best Practices, page 1-2
• Troubleshooting Basics, page 1-2
• Primary Troubleshooting Flowchart, page 1-8
• Overview of Symptoms, page 1-8
• System Messages, page 1-9
• Troubleshooting with Logs, page 1-12
• Contacting Customer Support, page 1-14
To identify the possible problems, you need to use a variety of tools and understand the overall storage
environment. For this reason, this guide describes a number of general troubleshooting tools in
Appendix B, “Troubleshooting Tools and Methodology,” including those that are specific to the Cisco
MDS 9000 Family. This chapter also provides a plan for investigating storage issues. See other chapters
in this book for detailed explanations of specific issues.
Troubleshooting Basics
This section provides a series of questions that may be useful when troubleshooting a problem with a
Cisco MDS 9000 Family switch or connected devices. Use the answers to these questions to plan a
course of action and to determine the scope of the problem. For example, if a host can only access some,
but not all, of the logical unit numbers (LUNs) on an existing subsystem, then fabric-specific issues
(such as FSPF, ISLs, or FCNS) do not need to be investigated. The fabric components can therefore be
eliminated from possible causes of the problem.
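The scoping rule in the example above can be sketched as a small helper. This is only an illustration of the reasoning, not a Cisco tool: the function name, the return labels, and the use of plain LUN numbers are assumptions made for the sketch.

```python
def triage_scope(expected_luns, visible_luns):
    """Suggest where to look first, based on which LUNs the host can see.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    expected = set(expected_luns)
    visible = set(visible_luns) & expected
    if visible == expected:
        # Every allocated LUN is visible, so there is no visibility problem.
        return "no-visibility-problem"
    if visible:
        # Partial visibility: the host already reaches the subsystem through
        # the fabric, so FSPF, ISLs, and FCNS are unlikely causes.
        return "check-subsystem-and-host"
    # Nothing is visible: the fabric cannot be ruled out yet.
    return "check-fabric-and-end-devices"


print(triage_scope([0, 1, 2, 3], [0, 1]))
```

A host that sees LUNs 0 and 1 out of 0 through 3 falls into the partial-visibility case, so the fabric components can be set aside and attention moves to the subsystem and host configuration.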
This section contains the following topics:
• Troubleshooting Guidelines, page 1-2
• Gathering Information Using Common Fabric Manager Tools and CLI Commands, page 1-3
• Verifying Basic Connectivity, page 1-4
• Verifying SAN Element Registration, page 1-5
• Fibre Channel End-to-End Connectivity, page 1-5
Troubleshooting Guidelines
The two most common symptoms of problems occurring in a storage network are:
• A host not accessing its allocated storage
• An application not responding after attempting to access the allocated storage
By answering the questions in the following subsections, you can determine the paths you need to follow
and the components that you should investigate further. These questions are independent of host, switch,
or subsystem vendor.
Answer the following questions to determine the status of your installation:
• Is this a newly installed system or an existing installation? (It could be a new SAN, host, or
subsystem, or new LUNs exported to an existing host.)
• Has the host ever been able to see its storage?
• Does the host recognize any LUNs in the subsystem?
• Are you trying to solve an existing application problem (too slow, too high latency, excessively long
response time) or did the problem show up recently?
• What changed in the configuration or in the overall infrastructure immediately before the
applications started to have problems?
To discover a SAN problem, use the following general SAN troubleshooting steps:
Step 1 Gather information on problems in your fabric. See the “Gathering Information Using Common Fabric
Manager Tools and CLI Commands” section on page 1-3.
Step 2 Verify physical connectivity between your switches and end devices. See the “Verifying Basic
Connectivity” section on page 1-4.
Step 3 Verify registration to your fabric for all SAN elements. See the “Verifying SAN Element Registration”
section on page 1-5.
Step 4 Verify the configuration for your end devices (storage subsystems and servers).
Step 5 Verify end-to-end connectivity and fabric configuration. See the “Fibre Channel End-to-End
Connectivity” section on page 1-5.
Fabric Manager and Device Manager also provide the following tools to proactively monitor your fabric:
• ISL performance—In Fabric Manager, click the ISL Performance icon.
• Network monitoring—In Device Manager, click the Summary tab.
• Performance monitoring—In Fabric Manager, choose Performance > Start Collection.
Note To issue commands with the internal keyword, you must have an account that is a member of the
network-admin group.
Fabric Issues
Answering the following questions will help to determine the status of the fabric configuration:
• Are both the HBA and the subsystem port successfully registered with the fabric name server?
• Does the correct pWWN for the server HBA and the storage subsystem port show up on the correct
port in the FLOGI database? In other words, is the device plugged into the correct port?
• Does any single zone contain both devices? The zone members can be WWNs or FC IDs.
• Is the zone correctly configured and part of the active configuration or zone set within the same
VSAN?
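The zoning questions above amount to a membership test: both devices must appear in at least one zone of the active zone set for their VSAN. The following is a minimal sketch of that test, assuming the zoning data has already been collected into simple Python objects. The class names, the zone names, and the two pWWNs are hypothetical, and the sketch ignores IVR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    members: frozenset  # pWWNs or FC IDs, stored as strings


@dataclass(frozen=True)
class ActiveZoneSet:
    name: str
    vsan: int
    zones: tuple


def shared_zones(zoneset, vsan, member_a, member_b):
    """Return the names of active zones in this VSAN that contain both members."""
    if vsan != zoneset.vsan:
        # The active zone set belongs to a different VSAN, so nothing matches.
        return []
    a, b = member_a.lower(), member_b.lower()
    hits = []
    for zone in zoneset.zones:
        members = {m.lower() for m in zone.members}
        if a in members and b in members:
            hits.append(zone.name)
    return hits


# Hypothetical example data, not taken from a real fabric.
HOST = "21:00:00:e0:8b:05:05:04"
DISK = "50:06:04:82:bf:d0:54:52"
active = ActiveZoneSet(
    name="ZoneSet1",
    vsan=10,
    zones=(
        Zone("Zone_A", frozenset({HOST, DISK})),
        Zone("Zone_B", frozenset({HOST})),
    ),
)

print(shared_zones(active, 10, HOST, DISK))  # prints ['Zone_A']
```

An empty result means that no single zone in the active zone set contains both devices, which is the condition the third and fourth questions are designed to catch.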
Port Issues
Initial tasks to perform while investigating port connectivity issues include:
• Verify correct media: copper or optical; single-mode (SM) or multimode (MM).
• Is the media broken or damaged?
• Is the LED on the switch green?
• Is the active LED on the HBA for the connected device on?
Basic port monitoring using Device Manager begins with the visual display in the Device View. (See
Figure 1-1.) Port display descriptions include:
• Green box: A successful fabric login has occurred; the connection is active.
• Red X: A small form-factor pluggable (SFP) transceiver is present but there is no connection. This
could indicate a disconnected or faulty cable, or no active device connection.
• Red box: An SFP is present but fabric login (FLOGI) has failed. Typically there is a mismatch in
port or fabric parameters with the neighboring device. For example, a port parameter mismatch
would occur if a node device were connected to a port configured as an E port. An example of a
fabric parameter mismatch would be differing timeout values.
• Yellow box: In Device Manager, a port has been selected.
• Gray box: The port is administratively disabled.
• Black box: An SFP is not present.
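The legend above can be kept as a small lookup table when documenting or scripting a port check. The sketch below restates the Device View meanings from this section. The suggested next steps are paraphrased from the same descriptions, and the table and function names are assumptions made for illustration, not part of Device Manager.

```python
# Device View indicator -> (meaning, suggested next step or None).
# The meanings restate the legend above; the next steps are this sketch's
# own paraphrase, not Device Manager output.
PORT_INDICATORS = {
    "green box": ("Fabric login succeeded; connection active.", None),
    "red x": (
        "SFP present but no connection.",
        "Check for a disconnected or faulty cable, or an inactive device.",
    ),
    "red box": (
        "SFP present but FLOGI failed.",
        "Compare port mode and fabric parameters (for example, timeout "
        "values) with the neighboring device.",
    ),
    "yellow box": ("Port is selected in Device Manager.", None),
    "gray box": (
        "Port is administratively disabled.",
        "Enable the port if it should carry traffic.",
    ),
    "black box": (
        "No SFP present.",
        "Install an SFP if the port should be in use.",
    ),
}


def describe_port(indicator):
    """Return the meaning of an indicator, plus a next step when one applies."""
    key = indicator.strip().lower()
    if key not in PORT_INDICATORS:
        raise ValueError(f"unknown indicator: {indicator!r}")
    meaning, next_step = PORT_INDICATORS[key]
    return meaning if next_step is None else f"{meaning} Next: {next_step}"


print(describe_port("Red box"))
```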
In Device Manager, selecting the Summary View expands the information available for port monitoring.
(See Figure 1-2.) The display includes:
• VSAN assignment
• For N ports, the port World Wide Name (pWWN) and Fibre Channel ID (FC ID) of the connected
device
• For ISLs, the IP address of the connected switch
• Speed
• Frames transmitted and received
• Percentage utilization for the CPU, dynamic memory, and Flash memory
To drill down for additional port information, use the Device View or Summary View. Select and
double-click any port. The initial display shows administrative settings for Mode, Speed, and Status, plus
current operational status, failure cause, and date of the last configuration change.
Additional tabs include:
• Rx BB Credit—Configure and view buffer-to-buffer credits (BB_credits).
• Other—View PortChannel ID, WWN, and maximum transmission unit (MTU), and configure
maximum receive buffer size.
• FLOGI—View FC ID, pWWN, nWWN, BB_credits, and class of service for N port connections.
• ELP—View pWWN, nWWN, BB_credits, and supported classes of service for ISLs.
• Trunk Config—View and configure trunk mode and allowed VSANs.
• Trunk Failure—View the failure cause for ISLs.
• Physical—Configure beaconing; view SFP information.
• Capability—View current port capability for hold-down timers, BB credits, maximum receive buffer
size.
[Figure: troubleshooting flowchart (image not reproduced). Labels recovered from the figure: Hardware, Power, Host HBA, EISL, PortChannel, FSPF, Software image issues, Domain Manager.]
Overview of Symptoms
The symptom-based troubleshooting approach provides multiple ways to diagnose and resolve
problems. By using multiple entry points with links to solutions, this guide best serves users who may
have identical problems that are perceived by different indicators. Search this guide in PDF form, use
the index, or rely on the symptoms and diagnostics listed in each chapter as entry points to access
necessary information in an efficient manner.
Given a set of observable symptoms on a Fibre Channel SAN, it is important to be able to
diagnose and correct software configuration issues and inoperable hardware components so that the
problems are resolved with minimal disruption to the SAN environment. The diagnostic and corrective
actions include:
• Identify key Cisco MDS troubleshooting tools.
• Obtain and analyze Fibre Channel protocol traces using RSPAN on the CLI.
• Identify or rule out physical port issues.
• Identify or rule out switch module issues.
• Diagnose and correct Fx port issues.
• Diagnose and correct issues on the data path.
• Diagnose and correct advanced services issues.
• Recover from switch upgrade failures.
• Diagnose and resolve Fabric Manager and Device Manager configuration problems.
• Obtain core dumps and other diagnostic data for use by the TAC.
System Messages
The system software sends these syslog (system) messages to the console (and, optionally, to a logging
server on another system) during operation. Not all messages indicate a problem with your system. Some
messages are purely informational, while others might help diagnose problems with links, internal
hardware, or the system software.
This section contains the following topics:
• System Message Text, page 1-9
• Syslog Server Implementation, page 1-10
• Implementing Syslog with Fabric Manager, page 1-10
• Implementing Syslog with the CLI, page 1-11
Use this string to find the matching system message in the Cisco MDS 9000 Family System Messages
Reference.
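When many log lines need to be looked up, the lookup string can be extracted automatically. The sketch below assumes the conventional Cisco layout FACILITY-SEVERITY-MNEMONIC: text, optionally preceded by a percent sign. That layout, the sample log line, and the function name are assumptions made for this sketch, so check them against messages from your own switch.

```python
import re

# Assumed layout: [%]FACILITY-SEVERITY-MNEMONIC: message text
_MSG_RE = re.compile(
    r"%?(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):"
    r"\s*(?P<text>.*)"
)


def parse_system_message(line):
    """Split a log line into its parts, or return None if nothing matches."""
    match = _MSG_RE.search(line)
    if match is None:
        return None
    parts = match.groupdict()
    parts["severity"] = int(parts["severity"])
    # The key is the string to search for in the System Messages Reference.
    parts["key"] = "{facility}-{severity}-{mnemonic}".format(**parts)
    return parts


# Illustrative log line only; the timestamp and switch name are made up.
sample = (
    "2005 Dec 1 12:00:00 switch1 %PORT-3-IF_UNSUPPORTED_TRANSCEIVER: "
    "Transceiver for interface fc1/1 is not supported"
)
parsed = parse_system_message(sample)
print(parsed["key"])
```

The regular expression uses search rather than match so that a leading timestamp and switch name do not prevent the identifier from being found.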
Each system message is followed by an explanation and recommended action. The action may be as
simple as “No action required.” It may involve a fix or a recommendation to contact technical support
as shown in the following example:
Recommended Action Enter the show interface transceiver CLI command or similar Fabric
Manager/Device Manager command to determine the transceiver being used. Please contact your
customer support representative for a list of authorized transceiver vendors.
Note The Cisco MDS messages should be logged to a different file from the standard syslog file so that they
cannot be confused with other non-Cisco syslog messages. The logfile should not be located on the / file
system, to prevent log messages from filling up the / file system.
Syslog Client: switch1
Syslog Server: 172.22.36.211 (Solaris)
Syslog facility: local1
Syslog severity: notifications (level 5, the default)
File to log MDS messages to: /var/adm/MDS_logs
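The facility and severity in this example determine both the priority value carried in each syslog packet and the selector that the server uses to route messages to a file. The sketch below works through that arithmetic using the standard BSD syslog encoding (priority = facility code x 8 + severity, with local0 through local7 numbered 16 through 23). Only "notifications = level 5" comes from this guide; the other severity names follow common Cisco naming and are an assumption, as are the function names.

```python
# Severity names indexed by level. Only "notifications" (5) is confirmed by
# this guide; the other names are assumed from common Cisco naming.
SEVERITIES = [
    "emergencies", "alerts", "critical", "errors",
    "warnings", "notifications", "informational", "debugging",
]
# Level keywords used in a classic syslog.conf selector, same order.
CONF_LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def facility_code(name):
    """Return the numeric code for local0..local7 (16..23)."""
    suffix = name[5:]
    if not (name.startswith("local") and suffix.isdigit() and int(suffix) <= 7):
        raise ValueError(f"unsupported facility: {name!r}")
    return 16 + int(suffix)


def priority(facility, severity):
    """Priority value placed at the start of each syslog packet."""
    return facility_code(facility) * 8 + SEVERITIES.index(severity)


def syslog_conf_line(facility, severity, path):
    """Build a syslog.conf entry: selector, a tab, then the log file."""
    facility_code(facility)  # validate the facility name
    level = CONF_LEVELS[SEVERITIES.index(severity)]
    return f"{facility}.{level}\t{path}"


print(priority("local1", "notifications"))  # prints 141
print(syslog_conf_line("local1", "notifications", "/var/adm/MDS_logs"))
```

For the values listed above, local1 has code 17, so the priority is 17 x 8 + 5 = 141, and the matching selector is local1.notice. The tab between selector and file name matters on servers whose syslog.conf parser does not accept spaces as a separator.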
Step 1 In Fabric Manager, choose Switches > Events > Syslog and click the Servers tab in the Information
pane.
In Device Manager, choose Logs > Syslog > Setup and click the Servers tab in the Syslog dialog box.
Step 2 Click Create Row in Fabric Manager or Create in Device Manager to add a new syslog server.
Step 3 Enter the name or IP address in dotted decimal notation (for example, 192.168.2.12) of the syslog server
in the Name or IP Address field.
Step 4 Set the message severity threshold by clicking the MsgSeverity radio button and set the facility by
clicking the Facility radio button.
Step 5 Click Apply Changes in Fabric Manager or click Create in Device Manager to save and apply your
changes.
Step 6 If CFS is enabled in Fabric Manager for the syslog feature, click CFS and commit these changes to
propagate the configuration through the fabric.
Device Manager allows you to view event logs on your local PC as well as those on the switch. For a
permanent record of all events that occur on the switch, you should store these messages off the switch.
To do this, the Cisco MDS switch must be configured to send syslog messages to your local PC, and a
syslog server must be running on that PC to receive those messages. These messages can be categorized
into four classes:
• Hardware—Line card or power supply problems
• Link incidents—FICON port condition changes
• Accounting—User change events
• Events—All other events
Note You should avoid using PCs that have IP addresses randomly assigned to them by DHCP. The switch
continues to use the old IP address unless you manually change it; however the Device Manager prompts
you if it does detect this situation. UNIX workstations have a built-in syslog server. You must have root
access (or run the Cisco syslog server as setuid to root) to stop the built-in syslog daemon and start the
Cisco syslog server.
# touch /var/adm/MDS_logs
c. Restart syslog.
# /etc/init.d/syslog stop
# /etc/init.d/syslog start
syslog service starting.
Example 1-1 shows an example of the show logging CLI command output.
CHAPTER 2
Troubleshooting Installs, Upgrades, and Reboots
This chapter describes how to identify and resolve problems that might occur when installing, upgrading,
or restarting Cisco MDS 9000 Family products. It includes the following sections:
• Overview, page 2-1
• Best Practices, page 2-2
• Disruptive Module Upgrades, page 2-4
• Troubleshooting Fabric Manager Installations, page 2-4
• Verifying Cisco SAN-OS Software Installations, page 2-5
• Troubleshooting Cisco SAN-OS Software Upgrades and Downgrades, page 2-6
• Troubleshooting Cisco SAN-OS Software System Reboots, page 2-12
• Recovering the Administrator Password, page 2-30
• Miscellaneous Software Image Issues, page 2-30
Overview
Each Cisco MDS 9000 switch ships with an operating system (Cisco SAN-OS) that consists of two
images—the kickstart image and the system image. There is also a module image if the Storage Services
Module (SSM) is present.
Installations, upgrades, and reboots are ongoing parts of SAN maintenance activities. It is important to
minimize the risk of disrupting ongoing operations when performing these operations in production
environments, and to know how to recover quickly when something does go wrong.
Note For documentation purposes, we use the term upgrade in this document. However, upgrade refers to both
upgrading and downgrading your switch, depending on your needs.
Best Practices
This section lists the best practices for Cisco SAN-OS software installations, image upgrade and
downgrade procedures, and reboots and includes the following topics:
• Best Practices for Installations, page 2-2
• Best Practices for Upgrading, page 2-2
• Best Practices for Reboots, page 2-3
Checklist Checkoff
Copy the new Cisco SAN-OS image onto your supervisor modules in bootflash: or slot0:.
Save your running configuration to the startup configuration.
Back up a copy of your configuration to a remote TFTP server.
Schedule your upgrade during an appropriate maintenance window for your fabric.
After you have completed the checklist, you are ready to upgrade the switches in your fabric.
Note It is normal for the active supervisor to become the standby supervisor during an upgrade.
Follow these best practices guidelines for upgrading and downgrading Cisco SAN-OS software images:
• Read the Cisco SAN-OS Release Notes for the release you are upgrading or downgrading to. Cisco
SAN-OS Release Notes are available at the following website:
http://cisco.com/en/US/products/ps5989/prod_release_notes_list.html
• Ensure that an FTP or TFTP server is available.
• Copy the startup-config to a snapshot config in NVRAM. This creates a backup copy of the
startup-config.
– In Device Manager, choose Admin > Copy Configuration and select the startupConfig radio
button for the From: field and the serverFile radio button for the To: field. Set the other fields
and click Apply.
– From the CLI, use the copy nvram:startup-config nvram:snapshot-config CLI command.
• Where possible, choose to do a nondisruptive upgrade. You can nondisruptively upgrade to Cisco
SAN-OS Release 2.x from any Cisco SAN-OS software release beginning with Release 1.3(x). If
you are running an older version of Cisco SAN-OS, upgrade to Release 1.3(x) and then Release 2.x.
• Establish a PC serial connection to each supervisor console to record upgrade activity to a file. This
catches any error messages or problems during bootup.
• In Fabric Manager, choose Tools > Other > Software Install or click the Software Install icon on
the toolbar to use the Software Install Wizard.
• From the CLI, use the install all [{asm-sfn | kickstart | ssi | system} URL] command to run a
complete script, test the images, and verify the compatibility with the hardware. See the “Installing
Cisco SAN-OS Software from the CLI” section on page 2-10. Using the install all command offers
the following advantages:
– You can upgrade the entire switch using the least disruptive procedure with just one command.
– You can receive descriptive information on the intended changes to your system before you
continue with the command.
– You have the option to cancel the command. Once the effects of the command are presented,
you can continue or cancel when you see this question (the default is no):
Do you want to continue (y/n) [n] :y
– You can view the progress of this command on the console, Telnet, and SSH screens.
– The image integrity is automatically checked, including the running kickstart and system
images.
– The command performs a platform validity check to verify that a wrong image is not used. For
example, the command verifies that an MDS 9500 Series image is not used inadvertently to
upgrade an MDS 9200 Series switch.
– After issuing the install all command, if any step in the sequence fails, the command completes
the step in progress and ends.
For example, if a switching module fails to be updated for any reason (for example, due to an
unstable fabric state), then the command sequence disruptively updates that module and ends.
In such cases, you can verify the problem on the affected switching module and upgrade the
other switching modules.
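The nondisruptive upgrade rule in the guidelines above (Release 1.3(x) or later can go directly to Release 2.x, while older releases must first move to Release 1.3(x)) can be expressed as a small check. This is a sketch only: the function name upgrade_path and the release-string pattern are assumptions made for illustration, and the release notes remain the authoritative source for supported paths.

```python
import re
from typing import List

# Matches release strings such as "1.2(2a)" or "2.0(2b)" (assumed format).
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def upgrade_path(current: str) -> List[str]:
    """Return the release families to step through to reach Release 2.x.

    Hypothetical helper: encodes only the rule stated in the text above.
    """
    match = VERSION_RE.match(current.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {current!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) < (1, 3):
        # Releases older than 1.3(x) need the intermediate step.
        return ["1.3(x)", "2.x"]
    return ["2.x"]


if __name__ == "__main__":
    for release in ("1.2(2a)", "1.3(4a)", "2.0(2b)"):
        print(release, "->", upgrade_path(release))
```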
Schedule the reboot to avoid possible disruption of services during critical business hours.
Note Log messages are not saved across system reboots. However, a maximum of 100 log messages with a
severity level of critical and below (levels 0, 1, and 2) are saved in NVRAM. You can view this log at
any time with the show logging nvram CLI command.
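To illustrate which messages survive a reboot under this rule, the following sketch filters a message stream the way the note describes: only severity levels 0 through 2 are kept, up to 100 entries. It assumes that the oldest entries are discarded first once the limit is reached, which the note does not state. The name nvram_view is hypothetical.

```python
from collections import deque

NVRAM_MAX_MESSAGES = 100  # limit stated in the note above
NVRAM_MAX_SEVERITY = 2    # critical; levels 0, 1, and 2 are retained


def nvram_view(messages):
    """Return the messages that would remain in the NVRAM log.

    messages: iterable of (severity, text) tuples in arrival order.
    Assumption: when more than 100 qualify, the oldest are dropped.
    """
    kept = deque(maxlen=NVRAM_MAX_MESSAGES)
    for severity, text in messages:
        if severity <= NVRAM_MAX_SEVERITY:
            kept.append((severity, text))
    return list(kept)


if __name__ == "__main__":
    sample = [(5, "IF_DOWN"), (2, "ZS_MERGE_FAILED"), (3, "SERVICE_TERMINATED")]
    print(nvram_view(sample))
```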
Fabric Manager and Device Manager do not operate properly with JRE 1.4.2_03 on Windows 2003.
-- SUCCESS
Extracting “loader” version from image bootflash:/b-1.3.0.104.
-- SUCCESS
switch# show install all status
This is the log of last installation. <----------------- log of last install
Verifying image bootflash:/b-1.3.0.104
-- SUCCESS
Verifying image bootflash:/i-1.3.0.104
-- SUCCESS
Extracting “system” version from image bootflash:/i-1.3.0.104.
-- SUCCESS
Extracting “kickstart” version from image bootflash:/b-1.3.0.104.
-- SUCCESS
Extracting “loader” version from image bootflash:/b-1.3.0.104.
-- SUCCESS
Use the show incompatibility CLI command for diagnosis when the install all CLI command warns of
compatibility issues.
During an attempted upgrade, the install all CLI command may return the following warning:
Warning: The startup config contains commands not supported by the system image; as a
result, some resources might become unavailable after an install.
Do you wish to continue? (y/ n) [y]: n
Message 2 indicates that the Fibre Channel tunnel feature is not supported in the new image. The RSPAN
feature uses Fibre Channel tunnels.
2) Feature Index : 119 , Capability : CAP_FEATURE_FC_TUNNEL_CFG
Description : fc-tunnel is enabled
Capability requirement : STRICT
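When the show incompatibility output is long, it can help to extract the entries programmatically and list only those marked STRICT. The sketch below parses entries shaped like message 2 above. The field layout is taken from that single sample, so the regular expression is an assumption and may need adjusting for other releases. The function names are hypothetical.

```python
import re

# Pattern derived from the one sample entry shown in this section (assumption).
ENTRY_RE = re.compile(
    r"(?P<num>\d+)\)\s*Feature Index\s*:\s*(?P<index>\d+)\s*,\s*"
    r"Capability\s*:\s*(?P<capability>\S+)\s*"
    r"Description\s*:\s*(?P<description>.+?)\s*"
    r"Capability requirement\s*:\s*(?P<requirement>\S+)"
)

SAMPLE = """\
2) Feature Index : 119 , Capability : CAP_FEATURE_FC_TUNNEL_CFG
Description : fc-tunnel is enabled
Capability requirement : STRICT
"""


def parse_incompatibilities(text):
    """Return one dict per incompatibility entry found in text."""
    return [m.groupdict() for m in ENTRY_RE.finditer(text)]


def strict_entries(text):
    """Return only the entries whose capability requirement is STRICT."""
    return [e for e in parse_incompatibilities(text) if e["requirement"] == "STRICT"]


if __name__ == "__main__":
    for entry in strict_entries(SAMPLE):
        print(entry["capability"], "-", entry["description"])
```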
Step 1 Open the Software Install Wizard by clicking its icon in the toolbar (see Figure 2-1).
Note There is no limit on the number of switches you can upgrade. However, the upgrade is a serial
process; that is, only a single switch is upgraded at a time.
Note On hosts where the TFTP server cannot be started, a warning is displayed. The TFTP server may
not start because an existing TFTP server is running or because access to the TFTP port 69 has
been denied for security reasons (the default setting on Linux). In these cases, you cannot
transfer files from the local host to the switch.
Note Before exiting the session, be sure the upgrade process is complete. The wizard will display a
status as it goes along. Check the lower left-hand corner of the wizard for the status message
Upgrade Finished. First, the wizard displays the message Success followed a few seconds later
by InProgress Polling. Then the wizard displays a second message Success before displaying
the final Upgrade Finished.
Step 1 Log in to the switch through the console, Telnet, or SSH port of the active supervisor.
Step 2 Create a backup of your existing configuration file, if required.
Step 3 Perform the upgrade by issuing the install all command.
The example below demonstrates upgrading from SAN-OS 2.0(2b) to 2.1(1a) using the install all
command with the source images located on an SCP server.
Tip Always carefully read the output of install all’s compatibility check. This tells you exactly what
needs to be upgraded (BIOS, loader, firmware) and what modules are not hitless. If there are any
questions or concerns about the results of the output, select ‘n’ to stop the installation and
contact the next level of support.
Step 4 Exit the switch console and open a new terminal session to view the upgraded supervisor module using
the show module command.
If the configuration meets all guidelines when the install all command is issued, all modules (supervisor
and switching) are upgraded. This is true for any switch in the Cisco MDS 9000 Family.
[Figure: boot sequence, stages 3 and 4. 3. Kickstart image: loads kernel, basic drivers, and SAN-OS image. 4. System image: login prompt.]
If the images on your switch are corrupted and you cannot proceed (error state), you can interrupt the
switch boot sequence and recover the image by entering the BIOS configuration utility described in the
following section. Access this utility only when needed to recover a corrupted internal disk.
Caution The BIOS changes explained in this section are only required to recover a corrupted bootflash.
Recovery procedures require the regular sequence to be interrupted. The internal switch sequence goes
through four phases between the time you turn the switch on and the time the switch prompt appears on
your terminal—BIOS, boot loader, kickstart, and system (see Table 2-6 and Figure 2-3).
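The prompt that appears on the console indicates how far the boot sequence progressed, which in turn determines where recovery must start. The following sketch maps each prompt to the phase that produced it and lists the phases still to complete. It assumes the default host name "switch" in the prompts. The names BOOT_PHASES, phase_for_prompt, and remaining_phases are hypothetical.

```python
# Boot phases in the order described above.
BOOT_PHASES = ["BIOS", "boot loader", "kickstart", "system"]

# Prompt displayed when a phase hands control to the operator.
# Assumes the default host name "switch".
PROMPT_TO_PHASE = {
    "loader>": "boot loader",
    "switch(boot)#": "kickstart",
    "switch#": "system",
}


def phase_for_prompt(prompt):
    """Return the boot phase associated with a console prompt, or None."""
    return PROMPT_TO_PHASE.get(prompt.strip())


def remaining_phases(prompt):
    """Return the phases that must still complete after the given prompt."""
    phase = phase_for_prompt(prompt)
    if phase is None:
        raise ValueError(f"unrecognized prompt: {prompt!r}")
    return BOOT_PHASES[BOOT_PHASES.index(phase) + 1:]


if __name__ == "__main__":
    for prompt in ("loader>", "switch(boot)#", "switch#"):
        print(prompt, "->", remaining_phases(prompt))
```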
[Figure: regular and recovery sequences. Regular sequence: Power on, 1 BIOS, 2 Bootloader, 3 Kickstart image, 4 SAN-OS image, Access switch. Recovery entry points: Power on and Ctrl-C; Power on and Esc.]
Note Your navigating options are provided at the bottom of the screen.
Tab = Jump to next field
Ctrl-E = Down arrow
Ctrl-X = Up arrow
Ctrl-H = Erase (Backspace might not work if your terminal is not configured properly.)
Step 4 Press the Tab key to select the Basic CMOS Configuration.
You see the System BIOS Setup - Basic CMOS Configuration screen (see Figure 2-5).
Caution The file name must be entered exactly as it is displayed on your TFTP server. For example, if you have
a file named MDS9500-kiskstart_mzg.10, then enter this name using the exact uppercase characters and
file extensions as shown on your TFTP server.
Caution The switch must have IP connectivity to reboot using the newly configured values.
Step 14 Enter the init system command at the switch(boot)# prompt, and press Enter to reformat the file
system.
switch(boot)# init system
Note The init system command also installs a new loader from the existing (running) kickstart image.
Step 15 Follow the procedure specified in the “Recovery from the switch(boot)# Prompt” section on page 2-20.
Note The loader> prompt is different from the regular switch# or switch(boot)# prompt. The CLI
command completion feature does not work at this prompt and may result in undesired errors. You must
type the command exactly as you want the command to appear.
Tip Use the help command at the loader> prompt to display a list of commands available at this prompt or
to obtain more information about a specific command in that list.
To recover a corrupted kickstart image (system error state) for a switch with a single supervisor module,
follow these steps:
Step 1 Enter the local IP address and the subnet mask for the switch at the loader> prompt, and press Enter.
loader> ip address 172.16.1.2 255.255.255.0
Found Intel EtherExpressPro100 82559ER at 0xe800, ROM address 0xc000
Probing...[Intel EtherExpressPro100 82559ER]Ethernet addr: 00:05:30:00:52:27
Address: 172.16.1.2
Netmask: 255.255.255.0
Server: 0.0.0.0
Gateway: 0.0.0.0
Step 3 Boot the kickstart image file from the required server.
loader> boot tftp://172.16.10.100/kickstart-image1
Address: 172.16.1.2
Netmask: 255.255.255.0
Server: 172.16.10.100
Gateway: 172.16.1.1
Booting: /kick-282 console=ttyS0,9600n8nn quiet loader_ver= “2.1(2)”....
............................................Image verification OK
Starting kernel...
INIT: version 2.78 booting
Checking all filesystems..... done.
Loading system software
INIT: Sending processes the TERM signal
Sending all processes the TERM signal... done.
Sending all processes the KILL signal... done.
Entering single-user mode...
INIT: Going single user
INIT: Sending processes the TERM signal
switch(boot)#
The switch(boot)# prompt indicates that you have a usable kickstart image.
Step 5 Follow the procedure specified in the “Recovery from the switch(boot)# Prompt” section on page 2-20.
Step 1 Change to configuration mode and configure the IP address of the mgmt0 interface.
switch(boot)# config t
switch(boot)(config)# interface mgmt0
Step 2 Follow this step if you issued an init system command. Otherwise, skip to Step 3.
a. Issue the ip address command to configure the local IP address and the subnet mask for the switch.
switch(boot)(config-mgmt0)# ip address 172.16.1.2 255.255.255.0
b. Issue the ip default-gateway command to configure the IP address of the default gateway.
switch(boot)(config-mgmt0)# ip default-gateway 172.16.1.1
Step 3 Issue the no shutdown command to enable the mgmt0 interface on the switch.
switch(boot)(config-mgmt0)# no shutdown
Step 5 If you believe there are file system problems, issue the init system check-filesystem command. As of
Cisco MDS SAN-OS Release 2.1(1a), this command checks all the internal file systems and fixes any
errors that are encountered. This command takes considerable time to complete.
switch(boot)# init system check-filesystem
Step 6 Copy the system image from the required TFTP server.
switch(boot)# copy tftp://172.16.10.100/system-image1 bootflash:system-image1
Step 7 Copy the kickstart image from the required TFTP server.
switch(boot)# copy tftp://172.16.10.100/kickstart-image1 bootflash:kickstart-image1
Step 8 Verify that the system and kickstart image files are copied to your bootflash: file system.
switch(boot)# dir bootflash:
12456448 Jul 30 23:05:28 1980 kickstart-image1
12288 Jun 23 14:58:44 1980 lost+found/
27602159 Jul 30 23:05:16 1980 system-image1
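The verification in Step 8 can also be scripted when the dir output is captured from a console log. The sketch below parses output in the format shown above and reports any expected image that is missing or has zero size. The column layout is inferred from this one sample, and the function names are hypothetical.

```python
SAMPLE = """\
12456448 Jul 30 23:05:28 1980 kickstart-image1
12288 Jun 23 14:58:44 1980 lost+found/
27602159 Jul 30 23:05:16 1980 system-image1
"""


def parse_dir(output):
    """Map file name to size in bytes for each listing row.

    Expects rows of the form: size month day time year name
    (layout assumed from the sample output above).
    """
    files = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[0].isdigit():
            files[parts[-1]] = int(parts[0])
    return files


def missing_images(output, expected):
    """Return the expected image names that are absent or empty."""
    files = parse_dir(output)
    return [name for name in expected if files.get(name, 0) <= 0]


if __name__ == "__main__":
    print(missing_images(SAMPLE, ["kickstart-image1", "system-image1"]))
```

An empty list means that both images are present with a nonzero size. This check does not verify image integrity.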
Step 9 Load the system image from the bootflash: file system.
switch(boot)# load bootflash:system-image1
Uncompressing system image: bootflash:/system-image1
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
Would you like to enter the initial configuration mode? (yes/no): yes
Note If you enter no at this point, you will return to the switch# login prompt, and you must manually
configure the switch.
Step 1 Boot the functioning supervisor module and log on to the switch.
Step 2 At the switch# prompt on the booted supervisor module, issue the reload module slot force-dnld
command, where slot is the slot number of the supervisor module with the corrupted bootflash.
The supervisor module with the corrupted bootflash performs a netboot and checks the bootflash for
corruption. When the bootup script discovers that the bootflash is corrupted, it performs an init system,
which fixes the corrupt bootflash. The supervisor boots up as the HA Standby.
Step 1 Boot up the switch and press the Esc key after the BIOS memory test to interrupt the boot loader.
Note Press Esc immediately after you see the following message:
00000589K Low Memory Passed
00000000K Ext Memory Passed
Hit ^C if you want to run SETUP....
Wait.....
If you wait too long, you will skip the boot loader phase and enter the kickstart phase.
Caution The loader> prompt is different from the regular switch# or switch(boot)# prompt. The
CLI command completion feature does not work at this prompt and may result in undesired
errors. You must type the command exactly as you want the command to appear.
Tip Use the help command at the loader> prompt to display a list of commands available at this
prompt or to obtain more information about a specific command in that list.
Step 2 Specify the local IP address and the subnet mask for the switch.
loader> ip address 172.16.1.2 255.255.255.0
Found Intel EtherExpressPro100 82559ER at 0xe800, ROM address 0xc000
Probing...[Intel EtherExpressPro100 82559ER]Ethernet addr: 00:05:30:00:52:27
Address: 172.16.1.2
Netmask: 255.255.255.0
Server: 0.0.0.0
Gateway: 0.0.0.0
Step 4 Boot the kickstart image file from the required server.
loader> boot tftp://172.16.10.100/kickstart-latest
Address: 172.16.1.2
Netmask: 255.255.255.0
Server: 172.16.10.100
Gateway: 172.16.1.1
Booting: /kick-282 console=ttyS0,9600n8nn quiet loader_ver= “2.1(2)”....
............................................Image verification OK
Starting kernel...
The switch(boot)# prompt indicates that you have a usable kickstart image.
Step 5 Issue the init system command to repartition and format the bootflash.
Step 6 Perform the procedure specified in the “Recovery from the switch(boot)# Prompt” section on page 2-20.
Step 7 Perform the procedure specified in the “Recovering One Supervisor Module With Corrupted Bootflash”
section on page 2-21 to recover the other supervisor module.
Note If you do not issue the reload module command when a boot failure has occurred, the active supervisor
module automatically reloads the standby supervisor module within 3 to 6 minutes after the failure.
Step 1 Enter the following command to check the syslog file to see which process restarted and why it restarted.
switch# show logging logfile | include error
For information about the meaning of each message, refer to the Cisco MDS 9000 Family System
Messages Reference.
The system output looks like the following:
Sep 10 23:31:31 dot-6 % LOG_SYSMGR-3-SERVICE_TERMINATED: Service "sensor" (PID 704) has
finished with error code SYSMGR_EXITCODE_SY.
switch# show logging logfile | include fail
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 0.0.0.0, in_classd=0 flags=1 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 127.0.0.1, in_classd=0 flags=0 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 127.1.1.1, in_classd=0 flags=1 fails: Address already in use
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 172.22.93.88, in_classd=0 flags=1 fails: Address already in use
Jan 27 23:18:59 88 % LOG_PORT-5-IF_DOWN: Interface fc1/13 is down (Link failure or not-connected)
Jan 27 23:18:59 88 % LOG_PORT-5-IF_DOWN: Interface fc1/14 is down (Link failure or not-connected)
Jan 28 00:55:12 88 % LOG_PORT-5-IF_DOWN: Interface fc1/1 is down (Link failure or not-connected)
Jan 28 00:58:06 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 28 00:58:44 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 28 03:26:38 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 29 19:01:34 88 % LOG_PORT-5-IF_DOWN: Interface fc1/1 is down (Link failure or not-connected)
switch#
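When the log file is large, a quick tally by message type shows which failures repeat. The following sketch counts messages by facility, severity, and mnemonic, based on the %FACILITY-SEVERITY-MNEMONIC header visible in the output above. The regular expression is an assumption derived from those samples, and the function name summarize is hypothetical.

```python
import re
from collections import Counter

# Header format inferred from the sample log lines (assumption).
MSG_RE = re.compile(
    r"%\s*(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):"
)

SAMPLE = """\
Jan 27 04:08:42 88 %LOG_DAEMON-3-SYSTEM_MSG: bind() fd 4, family 2, port 123, addr 0.0.0.0, in_classd=0 flags=1 fails: Address already in use
Jan 28 00:58:06 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
Jan 28 00:58:44 88 % LOG_ZONE-2-ZS_MERGE_FAILED: Zone merge failure, Isolating port fc1/1 (VSAN 100)
"""


def summarize(log_text):
    """Count messages keyed by (facility, severity, mnemonic)."""
    counts = Counter()
    for match in MSG_RE.finditer(log_text):
        key = (match["facility"], int(match["severity"]), match["mnemonic"])
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for key, count in summarize(SAMPLE).most_common():
        print(count, key)
```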
Step 2 Enter the following command to identify the processes that are running and the status of each process.
switch# show processes
The following codes are used in the system output for the State (process state):
• D = uninterruptible sleep (usually I/O)
• R = runnable (on run queue)
• S = sleeping
• T = traced or stopped
• Z = defunct (“zombie”) process
• NR = not running
• ER = should be running but currently not running
Note ER usually is the state a process enters if it has been restarted too many times and has been detected as
faulty by the system and disabled.
The system output looks like the following example. (The output has been abbreviated to be more
concise.)
PID State PC Start_cnt TTY Process
----- ----- -------- ----------- ---- -------------
1 S 2ab8e33e 1 - init
2 S 0 1 - keventd
3 S 0 1 - ksoftirqd_CPU0
4 S 0 1 - kswapd
5 S 0 1 - bdflush
6 S 0 1 - kupdated
71 S 0 1 - kjournald
136 S 0 1 - kjournald
140 S 0 1 - kjournald
431 S 2abe333e 1 - httpd
443 S 2abfd33e 1 - xinetd
446 S 2ac1e33e 1 - sysmgr
452 S 2abe91a2 1 - httpd
453 S 2abe91a2 1 - httpd
456 S 2ac73419 1 S0 vsh
469 S 2abe91a2 1 - httpd
470 S 2abe91a2 1 - httpd
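To scan a long show processes listing for likely problems, the output can be filtered for processes in the ER or Z state, or with a start count above 1. The sketch below makes several assumptions: the six-column layout shown above, that Start_cnt above 1 indicates at least one restart, and that a process that is not running may show "-" in place of a PID. The last row of the sample data is invented for illustration and does not come from a real switch. The function name suspect_processes is hypothetical.

```python
SUSPECT_STATES = {"ER", "Z"}

SAMPLE = """\
PID State PC Start_cnt TTY Process
----- ----- -------- ----------- ---- -------------
1 S 2ab8e33e 1 - init
446 S 2ac1e33e 1 - sysmgr
456 S 2ac73419 1 S0 vsh
- ER - 3 - example_svc
"""  # the final row is a hypothetical example, not real switch output


def suspect_processes(output):
    """Return (pid, state, start_cnt, name) for rows that look unhealthy."""
    suspects = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 6 or not parts[3].isdigit():
            continue  # skip header, separator, and malformed lines
        pid_text, state, _pc, start_cnt, _tty, name = parts
        pid = int(pid_text) if pid_text.isdigit() else None
        if state in SUSPECT_STATES or int(start_cnt) > 1:
            suspects.append((pid, state, int(start_cnt), name))
    return suspects


if __name__ == "__main__":
    for row in suspect_processes(SAMPLE):
        print(row)
```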
Step 3 Enter the following command to show the processes that have exited abnormally and whether a
stack trace or core dump exists for each.
switch# show process log
Process PID Normal-exit Stack-trace Core Log-create-time
---------------- ------ ----------- ----------- ------- ---------------
ntp 919 N N N Jan 27 04:08
snsm 972 N Y N Jan 24 20:50
Step 4 Enter the following command to show detailed information about a specific process that has restarted.
switch# show processes log pid 898
Service: idehsd
Description: ide hotswap handler Daemon
Started at Mon Sep 16 14:56:04 2002 (390923 us)
Stopped at Thu Sep 19 14:18:42 2002 (639239 us)
Uptime: 2 days 23 hours 22 minutes 22 seconds
Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGTERM (3)
Exit code: signal 15 (no core)
CWD: /var/sysmgr/work
Virtual Memory:
CODE 08048000 - 0804D660
DATA 0804E660 - 0804E824
BRK 0804E9A0 - 08050000
STACK 7FFFFD10
Register Set:
EBX 00000003 ECX 0804E994 EDX 00000008
ESI 00000005 EDI 7FFFFC9C EBP 7FFFFCAC
EAX 00000008 XDS 0000002B XES 0000002B
EAX 00000003 (orig) EIP 2ABF5EF4 XCS 00000023
EFL 00000246 ESP 7FFFFC5C XSS 0000002B
Stack: 128 bytes. ESP 7FFFFC5C, TOP 7FFFFD10
0x7FFFFC5C: 0804F990 0804C416 00000003 0804E994 ................
0x7FFFFC6C: 00000008 0804BF95 2AC451E0 2AAC24A4 .........Q.*.$.*
0x7FFFFC7C: 7FFFFD14 2AC2C581 0804E6BC 7FFFFCA8 .......*........
Step 5 Enter the following command to determine if the restart recently occurred.
switch# show system uptime
Start Time: Fri Sep 13 12:38:39 2002
Up Time: 0 days, 1 hours, 16 minutes, 22 seconds
To determine if the restart is repetitive or a one-time occurrence, compare the length of time that the
system has been up with the timestamp of each restart.
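The comparison described above can be sketched as follows. The code takes the Start Time from show system uptime and a list of process stop times (for example, the "Stopped at" values from show processes log, with the trailing microsecond field removed) and returns those that fall after the system start. The timestamp format is taken from the sample output, the function names are hypothetical, and the sketch assumes an English locale for day and month names.

```python
from datetime import datetime

# Format of timestamps such as "Fri Sep 13 12:38:39 2002" (from the samples).
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_ts(text):
    """Parse a timestamp in the format used by the sample output."""
    return datetime.strptime(text.strip(), TS_FORMAT)


def restarts_since_boot(system_start, process_stops):
    """Return the stop timestamps that occurred at or after system start."""
    boot = parse_ts(system_start)
    return [stop for stop in process_stops if parse_ts(stop) >= boot]


if __name__ == "__main__":
    stops = ["Thu Sep 19 14:18:42 2002", "Mon Sep 9 10:00:00 2002"]
    print(restarts_since_boot("Fri Sep 13 12:38:39 2002", stops))
```

Several entries in the result within a short uptime suggest a repetitive restart. A single entry suggests a one-time occurrence.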
Step 6 Enter the following command to view the core files.
switch# show cores
Module-num Process-name PID Core-create-time
---------- ------------ --- ----------------
5 fspf 1524 Jan 9 03:11
6 fcc 919 Jan 9 03:09
8 acltcam 285 Jan 9 03:09
8 fib 283 Jan 9 03:08
This output shows all the cores presently available for upload from the active supervisor. The
module-num column shows the slot number on which the core was generated. In the previous example,
an FSPF core was generated on the active supervisor module in slot 5. An FCC core was generated on
the standby supervisor module in slot 6. Core dumps generated on the module in slot 8 include
ACLTCAM and FIB.
To copy the FSPF core dump in this example to a TFTP server with the IP address 1.1.1.1, enter the
following command:
switch# copy core://5/1524 tftp://1.1.1.1/abcd
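When several cores are listed, the copy commands can be generated from captured show cores output rather than typed one at a time. The sketch below builds one command per core using the core://module/PID source form shown above and the usual tftp://server/path destination form. The destination file naming (prefix, process name, PID) is a hypothetical convention chosen for illustration, as is the function name.

```python
SAMPLE = """\
Module-num Process-name PID Core-create-time
---------- ------------ --- ----------------
5 fspf 1524 Jan 9 03:11
6 fcc 919 Jan 9 03:09
8 acltcam 285 Jan 9 03:09
8 fib 283 Jan 9 03:08
"""


def core_copy_commands(show_cores_output, tftp_server, prefix="core"):
    """Return one copy command per core listed in the output.

    Destination names follow a hypothetical prefix_process_pid pattern.
    """
    commands = []
    for line in show_cores_output.splitlines():
        parts = line.split()
        # Data rows start with a numeric module number and have a numeric PID.
        if len(parts) >= 3 and parts[0].isdigit() and parts[2].isdigit():
            module, process, pid = parts[0], parts[1], parts[2]
            commands.append(
                f"copy core://{module}/{pid} "
                f"tftp://{tftp_server}/{prefix}_{process}_{pid}"
            )
    return commands


if __name__ == "__main__":
    for command in core_copy_commands(SAMPLE, "1.1.1.1"):
        print(command)
```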
The following command displays the file named zone_server_log.889 in the log directory.
switch# show pro log pid 1473
======================================================
Service: ips
Description: IPS Manager
Virtual Memory:
Register Set:
Step 7 Enter the following command to configure the switch to use TFTP to send the core dump to a TFTP
server.
system cores tftp:[//servername][/path]
This command causes the switch to enable the automatic copy of core files to a TFTP server. For
example, the following command sends the core files to the TFTP server with the IP address 10.1.1.1.
switch(config)# system cores tftp://10.1.1.1/cores
See also the “Troubleshooting Supervisor Issues” section on page 3-15 or the “Troubleshooting
Switching and Services Modules” section on page 3-22.
Problem: You forgot the administrator password for accessing a Cisco MDS 9000 Family switch.
Solution: You can recover the password using a local console connection. For the latest instructions on
password recovery, refer to the Cisco MDS 9000 Family Configuration Guide at the following website:
http://cisco.com/en/US/products/ps5989/products_installation_and_configuration_guides_list.html
Symptom Console reports all ports on a module are down because of a system health failure.
Table 2-9 All Ports are Down Because of a System Health Failure.
Symptom Switch rebooted after FCIP module was reloaded, upgraded or downgraded.
Symptom A newly configured FCIP link may fail to come up when running on an MPS-14/2 module.
Symptom Switch displays the wrong user with the show running-config CLI command.
CHAPTER 3
Troubleshooting Hardware
This chapter describes how to identify and resolve problems that might occur in the hardware
components of the Cisco MDS 9000 Family. It includes the following sections:
• Overview, page 3-1
• Best Practices, page 3-2
• Troubleshooting Startup Issues, page 3-3
• Troubleshooting Power Supply Issues, page 3-4
• Troubleshooting Fan Issues, page 3-9
• Temperature Threshold Violations, page 3-12
• Troubleshooting Clock Module Issues, page 3-13
• Troubleshooting Other Hardware Issues, page 3-14
• Troubleshooting Supervisor Issues, page 3-15
• Troubleshooting Switching and Services Modules, page 3-22
Overview
The key to success when troubleshooting the system hardware is to isolate the problem to a specific
system component. The first step is to compare what the system is doing to what it should be doing.
Because a startup problem can usually be attributed to a single component, it is more efficient to isolate
the problem to a subsystem rather than troubleshoot each separate component in the system.
Problems with the initial power up are often caused by a module that is not firmly connected to the
backplane or a power supply that has been disconnected from the power cord connector.
Overheating can also cause problems with the system, though typically only after the system has been
operating for an extended period of time. The most common cause of overheating is the failure of a fan
module.
The Cisco MDS 9000 Family includes the following subsystems on most chassis:
• Power supply— This includes the power supply fans.
• Fan module—The chassis fan module should operate whenever system power is on. You should see
the Fan LED turn green and hear the fan module running; together these confirm that it is operating.
If the Fan LED is red, one or more fans in the fan module are not operating. You
should immediately contact your customer service representative (see the “Steps to Perform Before
Calling TAC” section on page A-1). There are no installation adjustments that you can make if the
fan module does not function properly at initial startup.
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you
purchased support directly from Cisco, contact Cisco Technical Support at this website:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtm
• Supervisor module—The supervisor module contains the operating system software, so check your
supervisor module if you have trouble with the system software. Status LEDs on the supervisor
module indicate whether or not the supervisor module can initialize a switching or services module.
If you have a redundant supervisor module, refer to the following website for the latest Cisco MDS
9000 Family configuration guides for descriptions of how the redundant supervisor module comes
online and how the software images are handled:
http://www.cisco.com/univercd/cc/td/doc/product/sn5000/mds9000/index.htm.
• Switching or services module—Status LEDs on each module indicate if it has been initialized by the
supervisor module. A module that is partially installed in the backplane can cause the system to halt.
Best Practices
You should consider the best practices recommended in this section to ensure the proper installation,
initialization, and operation of your switch. This section includes the following topics:
• Best Practices for Switch Installation
• Best Practices for System Initialization
• Best Practices for Supervisor Modules
• The system software boots successfully. Refer to the following website for the latest Cisco MDS
9000 Family configuration guides containing information on booting the system and initial
configuration tasks:
http://www.cisco.com/univercd/cc/td/doc/product/sn5000/mds9000/index.htm.
• The supervisor module and all switching or services modules are installed correctly and each one
initialized without problems. See the “Troubleshooting Supervisor Issues” section on page 3-15.
If all of these conditions are met and the hardware installation is complete, see the rest of this document
to troubleshoot any other software issues.
If any of these conditions are not met, use the procedures in this chapter to isolate and, if possible,
resolve the problem.
Step 1 Turn on the power supplies by turning or pressing the switch on (|). You should immediately hear the
system fan module begin to operate. If not, see the “Troubleshooting Power Supply Issues” section on
page 3-4.
Step 2 If you determine that the power supplies are functioning normally and the fan module is faulty, see the
“Troubleshooting Fan Issues” section on page 3-9.
Step 3 Verify that the LEDs on the supervisor module display as follows:
a. The Status LED flashes orange once and stays orange during diagnostic boot tests. It turns green
when the module is operational (online). If the system software cannot start up, this LED stays
orange.
b. The System LED turns green, indicating that all chassis environmental monitors are reporting that
the system is operational. If one or more environmental monitors reports a problem, the System LED
is orange or red.
c. The Active LED turns green, indicating that the supervisor module is operational and active. If the
supervisor module is in standby mode, the Active LED is orange.
d. Each Link LED flashes orange once and stays orange during diagnostic boot tests, and turns green
when the module is operational (online). If no signal is detected, the Link LED turns off. The Link
LED blinks orange if the port is bad.
If any LEDs on the supervisor module front panel are red or orange after the initialization time, see the
“Troubleshooting Supervisor Issues” section on page 3-15. If you have a redundant supervisor module,
refer to the following website for the latest Cisco MDS 9000 Family configuration guides for
descriptions of the supervisor module LEDs, how the redundant supervisor module comes online, and
how the software images are handled:
http://www.cisco.com/univercd/cc/td/doc/product/sn5000/mds9000/index.htm.
Step 4 Verify that the Status LEDs on the supervisor module and on each switching or services module are green
when the supervisor module completes initialization. This LED indicates that the modules are receiving
power, have been recognized by the supervisor module, and contain a valid Flash code version. This LED
does not indicate the state of the individual interfaces on the switching modules. If a Status LED is red
or orange, see the “Troubleshooting Supervisor Issues” section on page 3-15.
Step 5 If the boot information and system banner are not displayed, verify that the terminal is set correctly
and that it is properly connected to the supervisor module console port.
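The LED checks in Step 3 can be summarized as a lookup table. The following Python sketch is illustrative only: it paraphrases the LED descriptions above, and the names SUPERVISOR_LED_MEANINGS, describe_led, and needs_troubleshooting are hypothetical and are not part of any Cisco tool.

```python
# Illustrative sketch: meanings paraphrased from the LED descriptions in Step 3.
# All names here are hypothetical and are not part of any Cisco software.
SUPERVISOR_LED_MEANINGS = {
    ("Status", "orange"): "diagnostic boot tests are running, or the system software cannot start up",
    ("Status", "green"): "the module is operational (online)",
    ("System", "green"): "all chassis environmental monitors report that the system is operational",
    ("System", "orange"): "one or more environmental monitors report a problem",
    ("System", "red"): "one or more environmental monitors report a problem",
    ("Active", "green"): "the supervisor module is operational and active",
    ("Active", "orange"): "the supervisor module is in standby mode",
    ("Link", "orange"): "diagnostic boot tests are running",
    ("Link", "green"): "the module is operational (online)",
    ("Link", "off"): "no signal is detected",
    ("Link", "blinking orange"): "the port is bad",
}


def describe_led(name, color):
    """Return the documented meaning of an LED state, or a fallback string."""
    return SUPERVISOR_LED_MEANINGS.get((name, color), "not described in this section")


def needs_troubleshooting(color, initialization_complete=True):
    """Red or orange after initialization points to the supervisor troubleshooting section."""
    return initialization_complete and ("red" in color or "orange" in color)


print(describe_led("Active", "orange"))
print(needs_troubleshooting("orange"))
```

The second function follows the text literally: any red or orange LED that persists after initialization is treated as a reason to consult the supervisor troubleshooting section.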
Error Message PLATFORM-2-PS_FAIL: Power supply [dec] failed or shutdown (Serial No.
[chars]).
Recommended Action Enter the show environment power and show platform internal info CLI
commands or similar Fabric Manager or Device Manager commands to collect more information.
Refer to the power supply documentation in the relevant hardware installation guide to learn more about
increasing or decreasing power supply capacity and configuring power supplies.
Explanation Detected a new power supply that has reduced capacity compared to an existing power
supply.
Recommended Action Refer to the power supply documentation on increasing or decreasing power supply
capacity and configuring power supplies. Enter the show environment power and show platform
internal info CLI commands or similar Fabric Manager/Device Manager commands to collect more
information.
Error Message PLATFORM-5-PS_REMOVE: Power supply [dec] removed (Serial No. [chars]).
Recommended Action Enter the show environment power and show platform internal info CLI
commands or similar Fabric Manager/Device Manager commands to collect more information.
Introduced Cisco MDS SAN-OS Release 1.3(1).
Step 1 Verify that the Input Ok LED on the power supply is green. If the Input Ok LED is green, the AC or DC
source is good and the power supply is functional.
Step 2 If the Input Ok LED is off, first ensure that the power supply is flush with the chassis. Turn the power
switch off, tighten the captive screw(s), and then turn the power switch on (|). If the Input Ok LED
remains off, there might be a problem with the AC source or the DC source, or the power cable.
a. Turn off the power to the switch by pressing or turning both power switches to 0, connect the power
cord to another power source if one is available, and turn the power on. If the Input Ok LED is now
green, the problem was the first power source.
b. If the Input Ok LED fails to light after you connect the power supply to a new power source, replace
the power cord and turn the switch on. If the Input Ok LED lights at this point, return the first power
cord for replacement.
c. If the Input Ok LED still fails to light when the switch is connected to a different power source with
a new power cord, the power supply is probably faulty. If a second power supply is available, install
it in the second power supply bay and contact your customer service representative for further
instructions.
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you
purchased support directly from Cisco, contact Cisco Technical Support at this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtm
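The Input Ok LED checks in Steps 1 and 2 form a short decision sequence. The Python sketch below restates that sequence under the assumption that each check is recorded as a boolean observation. The function name diagnose_input_ok_led and its parameters are hypothetical. The conclusion for a successful reseat is an inference, because the text does not state it explicitly.

```python
def diagnose_input_ok_led(green_initially, green_after_reseating=False,
                          green_on_other_source=False, green_with_new_cord=False):
    """Walk the Input Ok LED checks in order and return the first conclusion reached.

    Each argument is a hypothetical observation: True if the Input Ok LED was
    green at that point in the procedure.
    """
    if green_initially:
        return "source is good and the power supply is functional"
    if green_after_reseating:
        # Inference only: the text does not name this outcome explicitly.
        return "power supply was not seated flush with the chassis"
    if green_on_other_source:
        return "the problem was the first power source"
    if green_with_new_cord:
        return "return the first power cord for replacement"
    return "power supply is probably faulty; contact your customer service representative"


print(diagnose_input_ok_led(False, green_on_other_source=True))
```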
If you are unable to resolve the problem or if you determine that either a power supply or backplane
connector is faulty, contact your customer support representative.
Step 1 Choose Physical > Fan. You see the Fan Status dialog box.
Step 2 If the OperStatus is failure, one or more fans are not operational. Replace the failed fan module before
your switch overheats. You should see the following system message in the switch log:
Explanation Fan module failed and needs to be replaced. This can lead to overheating and
temperature alarms.
Recommended Action Enter the show platform internal info CLI command or similar Fabric
Manager/Device Manager command to collect more information.
Step 3 If the OperStatus is absent, the fan module has been removed. As soon as the fan module is removed, Cisco
SAN-OS starts a five-minute countdown.
Caution If the fan module is not reinserted within five minutes, the entire switch is shut down.
Software reads a byte on the SEEPROM to determine if the fan module is present. If the fan module is
partially inserted or software is unable to access the SEEPROM on the fan module for any other reason,
then Cisco SAN-OS cannot distinguish this case from a real fan module removal. The switch will be shut
down in five minutes. The following priority 0 syslog messages are printed every five seconds:
Error Message PLATFORM-0-FAIL_REMOVED: Fan module removed. Fan module has been
absent for [dec] seconds.
Explanation Fan module was removed. This could lead to temperature alarms.
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you
purchased support directly from Cisco, contact Cisco Technical Support at this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtm
Step 1 Use the show environment fan CLI command and verify the status of each fan type. (See Example 3-2.)
Step 2 If the fan status is failure, one or more fans are not operational. Replace the failed fan module before
your switch overheats. You should see the following system message in the log:
Explanation Fan module failed and needs to be replaced. This can lead to overheating and
temperature alarms.
Recommended Action Enter the show platform internal info CLI command to collect more
information.
Step 3 If the fan status is absent, the fan module has been removed. As soon as the fan module is removed, Cisco
SAN-OS starts a five-minute countdown.
Caution If the fan module is not reinserted within five minutes, the entire switch is shut down.
Software reads a byte on the SEEPROM to determine if the fan module is present. If the fan module is
partially inserted or software is unable to access the SEEPROM on the fan module for any other reason,
then Cisco SAN-OS cannot distinguish this case from a real fan module removal. The switch will be shut
down in five minutes. The following priority 0 syslog messages are printed every five seconds:
Error Message PLATFORM-0-FAIL_REMOVED: Fan module removed. Fan module has been
absent for [dec] seconds.
Explanation Fan module was removed. This could lead to temperature alarms.
Step 4 Remove and reinstall or replace the fan module. If the Fan LED is still red, the system detects a fan
module failure. Contact your customer service representative for instructions.
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you
purchased support directly from Cisco, contact Cisco Technical Support at this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtm
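Because the fan-removal message in Step 3 reports how long the fan module has been absent, the time left before shutdown can be estimated from any one message. The Python sketch below assumes the message text matches the template shown above, with a number in place of [dec]. The sample line is constructed from that template and is not captured switch output, and the function name seconds_until_shutdown is hypothetical.

```python
import re

# Five-minute countdown described in Step 3.
SHUTDOWN_AFTER_SECONDS = 5 * 60

# Pattern built from the message template in this section; [dec] becomes (\d+).
FAN_REMOVED = re.compile(
    r"PLATFORM-0-FAIL_REMOVED: Fan module removed\. "
    r"Fan module has been\s+absent for (\d+) seconds"
)


def seconds_until_shutdown(syslog_line):
    """Return the seconds left in the countdown, or None if the line does not match."""
    match = FAN_REMOVED.search(syslog_line)
    if match is None:
        return None
    absent_seconds = int(match.group(1))
    return max(0, SHUTDOWN_AFTER_SECONDS - absent_seconds)


# Constructed sample line (not real switch output).
sample = ("PLATFORM-0-FAIL_REMOVED: Fan module removed. "
          "Fan module has been absent for 45 seconds.")
print(seconds_until_shutdown(sample))
```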
5 Outlet 75 60 35 ok
5 Intake 65 50 34 ok
6 Outlet 75 60 35 ok
6 Intake 65 50 34 ok
9 Outlet 75 60 45 ok
9 Intake 65 50 40 ok
The intake sensor is placed at the airflow intake and is the most critical indicator of module temperature.
All Cisco SAN-OS actions are taken when the major threshold of an intake sensor is exceeded.
A minor threshold violation or a major threshold violation on an outlet sensor results in the following
system message:
Recommended Action Enter the show environment temperature CLI command or choose Physical
> Temperature Sensors on Device Manager to collect more information.
This also generates a Call Home event and an SNMP notification.
A major temperature threshold violation on a module intake sensor results in the following system
message:
Explanation System shutdown in the number of seconds shown in the error message.
Recommended Action Enter the show environment temperature CLI command or similar Fabric
Manager/Device Manager command to collect more information.
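The threshold rules above can be expressed as a small classifier over the sensor rows. The Python sketch below assumes that each row holds, in order, the module number, sensor name, major threshold, minor threshold, current temperature, and status. That column order is an assumption, because the header row is not reproduced here. The names SensorReading, parse_row, and classify are hypothetical, and "exceeded" is interpreted as strictly greater than.

```python
from typing import NamedTuple


class SensorReading(NamedTuple):
    module: int
    sensor: str    # "Intake" or "Outlet"
    major: int     # assumed: major threshold
    minor: int     # assumed: minor threshold
    current: int   # assumed: current temperature
    status: str


def parse_row(row):
    """Split one whitespace-separated sensor row into a SensorReading."""
    module, sensor, major, minor, current, status = row.split()
    return SensorReading(int(module), sensor, int(major), int(minor), int(current), status)


def classify(reading):
    """Apply the rules above: an intake major violation leads to shutdown;
    a minor violation, or a major violation on an outlet sensor, raises an alarm message."""
    if reading.sensor == "Intake" and reading.current > reading.major:
        return "shutdown"
    if reading.current > reading.minor:
        return "alarm"
    return "ok"


for row in ("5 Outlet 75 60 35 ok", "9 Intake 65 50 40 ok"):
    print(row, "->", classify(parse_row(row)))
```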
On a clock module failure, the system switches over to the redundant clock module automatically. This
also results in a hardware reset of the switch. When the switch reboots, it displays the current active
clock module. The following syslog message is printed at switch boot-up time, indicating the current
active clock module.
Explanation Chassis clock source has failed and system will be reset. System will automatically start
using the redundant clock module.
Recommended Action Replace the failed clock module during the next maintenance window.
Typically, clock module A is the active clock and on a failure of clock module A, clock module B
becomes the active clock. Refer to the hardware installation guide for your platform at the following
website to replace a clock module.
http://www.cisco.com/en/US/products/hw/ps4159/ps4358/prod_installation_guides_list.html
To identify a hardware issue with a module using the CLI, follow these steps:
device id: 85
device errorcode: 0xc550120c
system time: (1127748710 ticks) Mon Sep 26 15:31:50 2005
Step 2 View the error statistics from the show hardware internal errors command output.
Some error statistics reported under FC-MAC are not necessarily errors, but those counters normally
do not increment for a port that is in an up state.
Step 3 View the interrupt counts in the show hardware internal errors command output.
Note the following:
• Some interrupts are not necessarily error interrupts.
• Some interrupts have a threshold before the corresponding ports are declared as bad. Do not
conclude that the hardware is bad because of some interrupt counts. However, these commands are
useful for your customer support representative when debugging the problems.
• Some interrupt counts may show up under the UP-XBAR and DOWN-XBAR ASICs when one of
the supervisors is pulled out or restarted.
This section describes how to diagnose when an active or standby supervisor fails to initialize properly.
This section includes the following topics:
• Active Supervisor Reboots
• Standby Supervisor Not Recognized by Active Supervisor
• Standby Supervisor Stays in Powered-Up State
Example 3-6 displays the reset reason when a supervisor rebooted because of a process crash.
Example 3-6 Reset Reason for Supervisor Reboot Caused by Failed Process
Example 3-7 displays the system messages on the standby supervisor when a supervisor rebooted
because of a process crash.
Example 3-7 System Messages for Supervisor Reboot Caused by Failed Process
Example 3-8 displays the exception log when a supervisor rebooted because of a runtime diagnostic
failure.
Example 3-8 Exception Log for Supervisor Reboot Caused by Runtime Diagnostic Failure
error type: FATAL error <--------------------- exception that caused the reboot
Number Ports went bad:
1,2,3,4,5,6
Example 3-9 displays the system messages on the standby supervisor when a supervisor rebooted
because of a runtime diagnostic failure.
Example 3-9 System Messages for Supervisor Reboot Caused by Runtime Diagnostic Failure
Step 1 Use the show module command on the active supervisor to verify that the active supervisor does not
detect the standby supervisor. (See Example 3-10.)
Step 2 Telnet to the standby supervisor console port and verify that it is in standby mode. (See Example 3-11.)
MDS Switch
login: admin
Password:
Cisco Storage Area Networking Operating System (SAN-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2005, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.
switch(standby)#
Step 3 Use the show system redundancy status CLI command on the active supervisor to verify that the
standby supervisor did not complete the synchronization phase with the active supervisor.
switch# show system redundancy status
Redundancy mode
---------------
administrative: HA
operational: None
The most likely reason for the synchronization to stall is that one of the software components on the
standby supervisor failed to synchronize its state with the active supervisor.
Step 4 Use the show system internal sysmgr gsyncstats CLI command on the active supervisor to determine which
processes did not synchronize on the standby supervisor.
switch# show system internal sysmgr gsyncstats
Name             Gsync done Gsync time(sec)
---------------- ---------- -------------
aaa              1          0
ExceptionLog     1          0
platform         1          1
radius           1          0
securityd        1          0
SystemHealth     1          0
tacacs           0          N/A
acl              1          0
ascii-cfg        1          1
bios_daemon      0          N/A
bootvar          1          0
callhome         1          0
capability       1          0
cdp              1          0
cfs              1          0
cimserver        1          0
cimxmlserver     0          N/A
confcheck        1          0
core-dmon        1          0
core-client      0          N/A
device-alias     1          0
dpvm             0          N/A
dstats           1          0
epld_upgrade     0          N/A
epp              1          1
Step 5 Use the show system internal sysmgr service all CLI command on the standby supervisor to determine
if any process is experiencing excessive restarts. (See Example 3-12.)
Note This command may not be available if the standby supervisor is at the loader> prompt.
Looking at the standby supervisor in Example 3-12 shows that the crossbar (xbar) software component
has been restarted 23 times. This has probably prevented the standby from initializing properly.
Step 6 Use the reload module CLI command to restart the standby supervisor. If this fails, use the reload
module 6 force-dlnd command from the active supervisor to force the standby supervisor to netboot from
the active supervisor.
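When the output in Step 4 is long, the processes that did not synchronize can be picked out programmatically. The Python sketch below assumes the output has been captured as text and that each data row has three whitespace-separated fields (name, Gsync done, Gsync time). The function name unsynchronized_processes is hypothetical, and the sample is a short excerpt of the rows shown in Step 4.

```python
def unsynchronized_processes(gsyncstats_output):
    """Return the names of processes whose Gsync done column is 0."""
    pending = []
    for line in gsyncstats_output.splitlines():
        fields = line.split()
        # Data rows have exactly three fields; the header and separator rows
        # are skipped because their second field is not 0 or 1.
        if len(fields) == 3 and fields[1] in ("0", "1"):
            if fields[1] == "0":
                pending.append(fields[0])
    return pending


# Excerpt of the rows shown in Step 4.
sample_output = """\
Name Gsync done Gsync time(sec)
---------------- ---------- -------------
aaa 1 0
tacacs 0 N/A
acl 1 0
bios_daemon 0 N/A
"""
print(unsynchronized_processes(sample_output))
```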
Table 3-9
Verifying That a Standby Supervisor Is in the Powered-Up State Using Device Manager
To verify that a standby supervisor is in the powered-up state using Device Manager, follow these steps:
Step 1 Choose Physical > Modules... and verify that the operational status of the standby supervisor
(OperStatus) is PoweredUp.
Step 2 Right-click the standby supervisor and select Reset from the drop-down menu to restart the standby
supervisor.
Step 1 Use the show module command on the active supervisor to verify that the standby supervisor is in the
powered-up state. (See Example 3-13.)
Step 2 Use the show module internal event-history module CLI command to determine what component may
have failed.
Step 3 Use the reload module CLI command to restart the standby supervisor.
Note If only one supervisor module is installed, ensure that automatic synchronization is off before servicing
the other module. This prevents the switch from attempting to fail over to an unavailable module.
This section provides a workaround for a failed supervisor under certain conditions. An example
situation is used to describe the problem and the workaround.
In this case, the supervisor failed when the standby was reloaded, or when the supervisor was replaced
with a new one. It was discovered that the failed supervisor either had its version of code changed, or
the running configuration on the active supervisor was not saved with the appropriate boot parameters.
In either case, the problem was mismatched code on the active and standby supervisors. One clue that
indicated the mismatched code was a heartbeat error on the active supervisor. Because of this error, the
current Flash images could not be copied from the active supervisor to the standby.
The workaround was to copy the images to CompactFlash, switch consoles, and load code from
CompactFlash onto the second supervisor. The second supervisor was at a loader prompt, which is
indicative of missing boot statements. When a dir slot0: CLI command was executed, none of the images
appeared. This may have been the result of mismatched images on the supervisors or of not having current
images in Flash memory on the supervisor. Performing a copy slot0: bootflash: CLI command copied
the images anyway. Once the images were loaded on the second supervisor and the boot statements were
confirmed and saved on the active supervisor, the supervisor loaded and came up in standby-ha mode.
The module status gives a good indication of the state of the module. Table 3-10 lists the
states that a module can experience, with a brief description of each.
Module Status   Condition   Description
OK              Good        The module is up and running.
powered-down    Good        The module has been powered down because of user configuration or an
err-pwd-dn      Failed      error. Use the show running-config | include poweroff CLI command to
                            determine if the module has been configured as powered-down.
                            Otherwise, the module was powered down because of a failure.
                            If a module reports a FATAL error, the supervisor logs an exception
                            and reboots the module. If the supervisor reboots the module for
                            errors three times in a one-hour interval, the supervisor keeps the
                            module permanently powered down.
pwr-denied      Failed      The chassis does not have enough remaining power to power up the
                            module. Use the show environment power CLI command to show the
                            current power status of the switch.
powered-up      Transient   The module powered up and the supervisor is waiting for the module
                            to initialize.
pwr-cycled      Transient   The module reloaded.
testing         Transient   The module has powered up and is running runtime diagnostics.
initializing    Transient   The module is receiving configuration from the supervisor.
upgrading       Transient   The module is in the process of a nondisruptive upgrade.
failure         Failed      The module has experienced a failure, but the module has not been
                            power cycled because the debug flag was configured. Use the debug
                            flag to collect debug information from the module as required by
                            your customer support representative. Once all necessary data is
                            collected, reload the module by using the reload module CLI command.
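The rule that a module is kept powered down after three error reboots in a one-hour interval can be checked against reboot timestamps taken from the logs. The Python sketch below is one possible reading of that rule, using a sliding one-hour window. The exact windowing that Cisco SAN-OS applies is not described here, so treat this as an assumption. The function name is_permanently_powered_down is hypothetical.

```python
from datetime import datetime, timedelta

MAX_ERROR_REBOOTS = 3          # "three times" in the table above
WINDOW = timedelta(hours=1)    # "one-hour interval" in the table above


def is_permanently_powered_down(error_reboot_times):
    """Return True if any MAX_ERROR_REBOOTS reboots fall within one WINDOW."""
    times = sorted(error_reboot_times)
    for i in range(len(times) - MAX_ERROR_REBOOTS + 1):
        # Compare the first and last reboot in each run of three.
        if times[i + MAX_ERROR_REBOOTS - 1] - times[i] <= WINDOW:
            return True
    return False


start = datetime(2005, 9, 27, 15, 0)
reboots = [start, start + timedelta(minutes=20), start + timedelta(minutes=40)]
print(is_permanently_powered_down(reboots))
```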
• show logging
• show module internal exception-log
• show module internal event-history module
• show module internal event-history errors
• show platform internal event-history errors
• show platform internal event-history module
Module Bootup
When a module is inserted into the switch, the supervisor puts the module in the powered-up state. In this
state, the supervisor waits for the module to boot up and send its identification to the active supervisor.
If the supervisor does not receive the registration from the module within a given time frame, it power
cycles the module. This failure is called a boot-up failure. The failure codes for boot-up failure can be
obtained using the show platform internal event-history errors CLI command. (See Example 3-15.)
Image Download
Once the supervisor receives the registration message, it checks the image compatibility matrix. The
image compatibility matrix determines whether the version of code running on the supervisor is compatible
with the version of code running on the module. If they are not compatible, the module downloads an updated
version of the code, reboots, and sends a registration message again with the updated parameters.
If the module is unable to download the code, the supervisor generates the following system message:
Explanation The module failed to download a new image from the supervisor module.
Recommended Action Collect module information by entering the show module internal all module
<dec> command.
In addition, the module generates a system message indicating the exact reason why the image download
failed:
Explanation The add-on image download to the module failed. This module is not operational until
an add-on image has been successfully installed.
Recommended Action Verify the location and version of your module image. Enter the install module
CLI command or similar Fabric Manager/Device Manager command to download a new module
image.
If the image download fails, the supervisor power cycles the module. Choose Logs > Switch Resident
> Syslog > Since Reboot in Device Manager or use the show logging CLI command to view the failure
messages.
Runtime Diagnostics
After the module successfully registers with the supervisor, the module checks the hardware. If this check fails,
the module reports the error to the supervisor and generates the following system message:
Explanation The module reported a failure in the runtime diagnostic. Module manager is going to
power cycle the module.
Recommended Action Collect information about the module by entering the show module internal
all module CLI command.
In addition, this information is stored in the exception log (which is persistent across reboots). The
supervisor then power cycles the module. Choose Logs > Switch Resident > Syslog > Since Reboot in
Device Manager or use the show logging and show module internal exception-log module CLI
commands to retrieve failure information.
Runtime Configuration
After the runtime diagnostics complete successfully, the module informs the supervisor that it is ready
for configuration. Individual supervisor components configure the module. If any component reports a
problem during this stage, the supervisor reboots the module. Use the show module internal
event-history module CLI command to determine which component reported the problem.
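The four bring-up stages described above each point to different CLI commands when they fail. The Python sketch below collects those pairings in one place. The stage names and command strings are taken from the headings and text above, while the names BRINGUP_STAGES, commands_after_failure, and stages_completed_before are hypothetical.

```python
# Bring-up stages in the order described above, with the CLI commands the
# text suggests for investigating a failure at each stage.
BRINGUP_STAGES = {
    "Module Bootup": ["show platform internal event-history errors"],
    "Image Download": ["show logging"],
    "Runtime Diagnostics": ["show logging", "show module internal exception-log module"],
    "Runtime Configuration": ["show module internal event-history module"],
}


def commands_after_failure(stage):
    """Return the CLI commands suggested for a failure at the given stage."""
    return BRINGUP_STAGES[stage]


def stages_completed_before(stage):
    """Return the stages that must have succeeded before the given stage runs."""
    order = list(BRINGUP_STAGES)  # dicts preserve insertion order
    return order[:order.index(stage)]


print(commands_after_failure("Runtime Diagnostics"))
print(stages_completed_before("Runtime Diagnostics"))
```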
• Online health management (OHMS)—Sent from the supervisor to all the ports in the module to
verify that traffic is flowing properly.
In addition, the module monitors itself and generates an exception if it detects an anomalous condition.
If the exception is a FATAL error, the module is power cycled. Use the following CLI commands to view
the conditions leading up to the problem:
• show logging
• show module diag
• show module internal exception-log module
• show module internal event-history module
• show hardware internal errors
7) FSM:<ID(2): Slot 8, node 0x0800> Transition at 14258 usecs after Mon Sep 26 17:50:56
2005
Previous state: [LCM_ST_LC_POWERED_UP]
Triggered event: [LCM_EV_PFM_LC_STATUS_POWERED_DOWN]
Next state: [LCM_ST_LC_NOT_PRESENT]
Based on the above state transition you can infer that when the module was in the powered-up state, an
event from PFM to power down the module was triggered. This trigger caused the state machine to go
to the not present state.
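State transitions like the one above can be extracted from saved event-history output so that they are easier to scan. The Python sketch below assumes the three fields appear in the order shown (Previous state, Triggered event, Next state), each in square brackets. The names parse_transitions and failed_transitions are hypothetical, and flagging events whose name contains FAILED is only a heuristic based on the event names that appear in this chapter.

```python
import re

# Matches one transition block in the layout shown above.
TRANSITION = re.compile(
    r"Previous state: \[(\w+)\]\s+"
    r"Triggered event: \[(\w+)\]\s+"
    r"Next state: \[(\w+)\]"
)


def parse_transitions(event_history_text):
    """Return a list of (previous_state, event, next_state) tuples."""
    return TRANSITION.findall(event_history_text)


def failed_transitions(event_history_text):
    """Heuristic: keep transitions whose triggering event name contains FAILED."""
    return [t for t in parse_transitions(event_history_text) if "FAILED" in t[1]]


sample_history = """\
Previous state: [LCM_ST_LC_POWERED_UP]
Triggered event: [LCM_EV_PFM_LC_STATUS_POWERED_DOWN]
Next state: [LCM_ST_LC_NOT_PRESENT]
"""
for previous, event, following in parse_transitions(sample_history):
    print(previous, "--", event, "->", following)
```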
Step 1 Verify that all Status LEDs are green. If any status LED is red or off, the module might have shifted out
of its slot.
Step 2 Reseat the module until both ejector levers are at 90 degrees to the rear of the chassis.
Step 3 Tighten the captive screws at the left and right of the module front panel.
Step 4 Restart the system.
If the Status LED on a switching module is orange, the module might be busy or disabled. Refer to the
following website for the latest Cisco MDS 9000 Family configuration guides to configure or enable the
interfaces:
http://www.cisco.com/univercd/cc/td/doc/product/sn5000/mds9000/index.htm.
After the system reinitializes the interfaces, the Status LED on the module should be green.
Step 5 If the module does not transition into the online state, see the symptoms listed in this section.
If you are unable to resolve a problem with the startup, gather the information listed under Appendix A,
“Before Contacting Technical Support” and contact your technical support representative for assistance
as directed in the “Obtaining Technical Assistance” section on page xxv.
Recommended Action Replace the bootflash in the module and try again.
Recommended Action Replace the BIOS in the module. See the “Troubleshooting Cisco SAN-OS
Software System Reboots” section on page 2-12.
Recommended Action Enter the show platform internal all module [dec] CLI command to collect
more information.
Introduced Cisco MDS SAN-OS Release 1.2(2a).
Recommended Action Enter the show platform internal all module [dec], show module
internal all module [dec], and show sprom module [dec][dec] CLI commands to read the module IDPROM
contents and collect more information.
Error Message PLATFORM-5-MOD_PWRDN: Module [dec] powered down (Serial No. [chars]).
to verify that the module did not register. Right-click the module in Device
Manager and select Reset or use the reload module CLI command to restart
the module. See the “Reinitializing a Failed Module Using Fabric Manager”
section on page 3-38 or the “Reinitializing a Failed Module Using the CLI”
section on page 3-39.
Module failed to connect to fabric.
Use the show system internal xbar internal event-history module CLI command and look for:
Triggered event: [XBM_MOD_EV_SYNC_FAILED]
to verify that the module could not connect to the fabric. Right-click the
module in Device Manager and select Reset or use the reload module CLI
command to restart the module. See the “Reinitializing a Failed Module Using
Fabric Manager” section on page 3-38 or the “Reinitializing a Failed Module
Using the CLI” section on page 3-39.
Supervisor failed to configure the module.
Verify the cause of the failure. See the “Diagnosing a Powered-Down Module”
section on page 3-29. Right-click the module in Device Manager and select
Reset or use the reload module CLI command to restart the module. See the
“Reinitializing a Failed Module Using Fabric Manager” section on page 3-38
or the “Reinitializing a Failed Module Using the CLI” section on page 3-39.
Step 1 Use the show module CLI command to verify the status of the module.
switch# show module
Mod Ports Module-Type Model Status
--- ----- -------------------------------- ------------------ ------------
5 0 Supervisor/Fabric-1 DS-X9530-SF1-K9 ha-standby
6 0 Supervisor/Fabric-1 DS-X9530-SF1-K9 active *
8 8 IP Storage Services Module powered-dn
Step 2 Use the show logging CLI command to see what events occurred on this module.
Switch# show logging
Note that module 8 powered up and reinitialized three times. This indicates that the module was never
able to go online. The supervisor powered down the module.
Step 3 Use the show module internal exceptionlog module CLI command to view the exception log.
switch# show module internal exceptionlog module 8
********* Exception info for module 8 ********
Note that the time when the module was reinitialized (from system messages) and the time when the
exceptions were raised (in the exception log) are correlated. This means that device ID:8 had errors
while bringing the module up.
Step 4 Use the show module internal event-history module CLI command to gather more information.
Switch# show module internal event-history module 8
79) Event:ESQ_START length:32, at 665931 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x2710, Ret:success
Seq Type:SERIAL
80) Event:ESQ_REQ length:32, at 667362 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x1, Ret:success
[E_MTS_TX] Dst:MTS_SAP_ILC_HELPER(125), Opc:MTS_OPC_LC_IS_MODULE_SAME(2810)
81) Event:ESQ_REQ length:32, at 667643 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x2, Ret:success
[E_MTS_TX] Dst:MTS_SAP_MIGUTILS_DAEMON(949), Opc:MTS_OPC_LC_INSERTED(1081)
82) Event:ESQ_RSP length:32, at 673004 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x2, Ret:success
[E_MTS_RX] Src:MTS_SAP_MIGUTILS_DAEMON(949), Opc:MTS_OPC_LC_INSERTED(1081)
83) Event:ESQ_REQ length:32, at 673265 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x3, Ret:success
[E_MTS_TX] Dst:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_INSERTED(1081)
85) Event:ESQ_RSP length:32, at 692394 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x3, Ret:(null)
[E_MTS_RX] Src:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_INSERTED(1081)
86) FSM:<ID(3): Slot 8, node 0x0802> Transition at 692410 usecs after Tue Sep 27
15:30:23 2005
Previous state: [LCM_ST_CHECK_INSERT_SEQUENCE]
Triggered event: [LCM_EV_LC_INSERTED_SEQ_FAILED]
Next state: [LCM_ST_CHECK_REMOVAL_SEQUENCE]
87) Event:ESQ_START length:32, at 692688 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x2710, Ret:success
Seq Type:SERIAL
88) Event:ESQ_REQ length:32, at 696483 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x1, Ret:success
[E_MTS_TX] Dst:MTS_SAP_MIGUTILS_DAEMON(949), Opc:MTS_OPC_LC_REMOVED(1082)
89) Event:ESQ_RSP length:32, at 698390 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x1, Ret:success
[E_MTS_RX] Src:MTS_SAP_MIGUTILS_DAEMON(949), Opc:MTS_OPC_LC_REMOVED(1082)
108) Event:ESQ_REQ length:32, at 715171 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0xc, Ret:success
[E_MTS_TX] Dst:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_REMOVED(1082)
109) Event:ESQ_RSP length:32, at 716623 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0xc, Ret:success
[E_MTS_RX] Src:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_REMOVED(1082)
110) FSM:<ID(3): Slot 8, node 0x0802> Transition at 716643 usecs after Tue Sep 27
15:30:23 2005
Previous state: [LCM_ST_CHECK_REMOVAL_SEQUENCE]
Triggered event: [LCM_EV_ALL_LC_REMOVED_RESP_RECEIVED]
Next state: [LCM_ST_LC_FAILURE]
111) FSM:<ID(3): Slot 8, node 0x0802> Transition at 716886 usecs after Tue Sep 27
15:30:23 2005
Previous state: [LCM_ST_LC_FAILURE]
Triggered event: [LCM_EV_LC_INSERTED_SEQ_FAILED]
Next state: [LCM_ST_LC_FAILURE]
112) FSM:<ID(3): Slot 8, node 0x0802> Transition at 717250 usecs after Tue Sep 27
15:30:23 2005
Previous state: [LCM_ST_LC_FAILURE]
Triggered event: [LCM_EV_FAILED_MORE3TIMES]
Next state: [LCM_ST_LC_NOT_PRESENT]
113) FSM:<ID(3): Slot 8, node 0x0802> Transition at 21633 usecs after Tue Sep 27
15:30:24 2005
Previous state: [LCM_ST_LC_NOT_PRESENT]
Triggered event: [LCM_EV_MODULE_POWERED_DOWN]
Next state: [LCM_ST_LC_NOT_PRESENT]
Step 5 Starting with the most recent time (end of the log) and moving backwards in this example, you can infer
the following:
Curr state: [LCM_ST_LC_NOT_PRESENT]<---- Indicates that the module is not present.
Index 85) Event:ESQ_RSP length:32, at 692394 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x3, Ret:(null)
[E_MTS_RX] Src:MTS_SAP_XBAR_MANAGER(48),
Opc:MTS_OPC_LC_INSERTED(1081) <---Indicates the event that caused the module insertion
to fail. This indicates that xbar_manager failed.
In this example, you can conclude that the module is not coming up because the XBAR Manager is failing
during the insertion of the module.
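The backward scan described in Step 5 can be sketched as a short script. This is a minimal sketch in Python, not a Cisco tool: it assumes the event-history output has been captured as text in the format shown above, and the names split_records and find_failed_events are hypothetical. It reports every ESQ event whose Ret value is anything other than success, along with the MTS SAP named on that record.

```python
import re

# A record starts with an index and a closing parenthesis, e.g. "85) Event:ESQ_RSP ..."
RECORD_RE = re.compile(r"^\s*(\d+)\)\s+(.*)$")
# Return status of the event, e.g. "Ret:success" or "Ret:(null)"
RET_RE = re.compile(r"Ret:(\S+)")
# Peer service on the record, e.g. "Src:MTS_SAP_XBAR_MANAGER(48)"
SAP_RE = re.compile(r"(?:Src|Dst):(\w+)\(\d+\)")


def split_records(log_text):
    """Group the log into [index, text] records, joining continuation lines."""
    records = []
    for line in log_text.splitlines():
        match = RECORD_RE.match(line)
        if match:
            records.append([int(match.group(1)), match.group(2)])
        elif records and line.strip():
            records[-1][1] += " " + line.strip()
    return records


def find_failed_events(log_text):
    """Return (index, ret, sap) for every ESQ event whose Ret is not 'success'."""
    failures = []
    for index, body in split_records(log_text):
        if not body.startswith("Event:ESQ_"):
            continue  # skip FSM transition records
        ret = RET_RE.search(body)
        if ret and ret.group(1) != "success":
            sap = SAP_RE.search(body)
            failures.append((index, ret.group(1), sap.group(1) if sap else None))
    return failures


# Two records copied from the example output in this procedure.
SAMPLE = """\
83) Event:ESQ_REQ length:32, at 673265 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x3, Ret:success
[E_MTS_TX] Dst:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_INSERTED(1081)
85) Event:ESQ_RSP length:32, at 692394 usecs after Tue Sep 27 15:30:23 2005
Instance:3, Seq Id:0x3, Ret:(null)
[E_MTS_RX] Src:MTS_SAP_XBAR_MANAGER(48), Opc:MTS_OPC_LC_INSERTED(1081)
"""

if __name__ == "__main__":
    for index, ret, sap in find_failed_events(SAMPLE):
        print(f"record {index}: Ret={ret} peer={sap}")
```

For the two sample records, the sketch flags only record 85, the response from MTS_SAP_XBAR_MANAGER, which matches the conclusion drawn in Step 5.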
Explanation The module is not replying to the hello message. The module manager will reset the
module.
Explanation Module reported a failure in the runtime diagnostic because of a failure in some of the
ports.
Recommended Action Collect module information by entering the show module internal all module
CLI command.
Explanation The module reported a failure in the runtime diagnostic. Module manager is going to
power cycle the module.
Recommended Action Collect information about the module by entering the show module internal
all module CLI command.
Step 1 Right-click the module and select Module on Device Manager or use the show module CLI command
to verify the status of the module.
Step 2 Choose Logs > Switch Resident > Syslog > Sever Events on Device Manager or use the show logging
CLI command to search for common reload problems.
Step 3 Use the show module internal exceptionlog module CLI command to view the exception log.
switch# show module internal exceptionlog module 8
********* Exception info for module 8 ********
exception information --- exception instance 3 ----
device id: 0
device errorcode: 0x40730017
system time: (1127843486 ticks) Tue Sep 27 17:51:26 2005
85) Event:ESQ_START length:32, at 755279 usecs after Tue Sep 27 17:51:26 2005
Instance:3, Seq Id:0x2710, Ret:success
Seq Type:SERIAL
Step 1 Right-click the module and select Module on Device Manager or use the show module CLI command
to verify the status of the module.
Step 2 Choose Logs > Switch Resident > Syslog > Sever Events on Device Manager or use the show logging
CLI command to search for common problems.
Step 3 Use the show platform internal event-history errors CLI command to view possible causes for the
unknown state.
switch# show platform internal event-history errors
1) Event:E_DEBUG, length:37, at 370073 usecs after Thu Sep 29 17:22:48 2005
[103] unable to init lc sprom 0 mod 8
1) FSM:<Slot 8> Transition at 500219 usecs after Thu Sep 29 17:22:43 2005
Previous state: [PLTFRM_STATE_MODULE_ABSENT]
Triggered event: [PLTFRM_EVENT_MODULE_INSERTED]
Next state: [PLTFRM_STATE_MODULE_PRESENT]
2) FSM:<Slot 8> Transition at 370112 usecs after Thu Sep 29 17:22:48 2005
Previous state: [PLTFRM_STATE_MODULE_PRESENT]
Triggered event: [PLTFRM_EVENT_MODULE_BOOTUP_ERROR]
Next state: [PLTFRM_STATE_MODULE_UNRECOVERABLE_ERROR]
Step 1 Right-click the module and select Module on Device Manager or use the show module CLI command
to verify the status of the module.
Step 2 Choose Logs > Switch Resident > Syslog > Server Events on Device Manager or use the show logging
CLI command to search for common problems.
Step 3 Use the show platform internal event-history errors CLI command to view possible causes.
switch# show platform internal event-history errors
1) Event:E_DEBUG, length:42, at 703984 usecs after Thu Sep 29 17:46:20 2005
[103] Module 8 pwr mgmt I/O cntrl reg 0x74
1) FSM:<Slot 8> Transition at 370299 usecs after Thu Sep 29 17:46:12 2005
Previous state: [PLTFRM_STATE_MODULE_ABSENT]
Triggered event: [PLTFRM_EVENT_MODULE_INSERTED]
Next state: [PLTFRM_STATE_MODULE_PRESENT]
2) FSM:<Slot 8> Transition at 698894 usecs after Thu Sep 29 17:46:17 2005
Previous state: [PLTFRM_STATE_MODULE_PRESENT]
Triggered event: [PLTFRM_EVENT_MODULE_SPROM_READ]
Next state: [PLTFRM_STATE_MODULE_POWER_EVAL]
3) FSM:<Slot 8> Transition at 705551 usecs after Thu Sep 29 17:46:17 2005
4) FSM:<Slot 8> Transition at 110120 usecs after Thu Sep 29 17:46:20 2005
Previous state: [PLTFRM_STATE_MODULE_START_POWER_UP]
Triggered event: [PLTFRM_EVENT_MOD_END_POWER_UP]
Next state: [PLTFRM_STATE_MODULE_POWERED_UP]
5) FSM:<Slot 8> Transition at 704067 usecs after Thu Sep 29 17:46:20 2005
Previous state: [PLTFRM_STATE_MODULE_POWERED_UP]
Triggered event: [PLTFRM_EVENT_MODULE_REMOVED]
Next state: [PLTFRM_STATE_MODULE_ABSENT]
When a module is inserted into the switch, the supervisor reads the SPROM contents of the module. If
the module is supported by the current version of Cisco SAN-OS, the supervisor powers up the module.
If the power status does not come up OK, the module information is not relayed to the
supervisor.
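The FSM transition records in the event-history output above can be reduced to a state path so that the last transition stands out. The following is a minimal sketch in Python, assuming the output has been saved as text in the format shown; the function name fsm_transitions is hypothetical and is not part of any Cisco tool. Records that are truncated (no Previous state line before the next FSM header) are skipped.

```python
import re

# One FSM record: header line, then Previous state / Triggered event / Next state.
# The tempered pattern (?:(?!FSM:<).)*? stops a truncated record from
# borrowing the state lines of the record that follows it.
FSM_RE = re.compile(
    r"FSM:<(?P<slot>[^>]+)>(?:(?!FSM:<).)*?"
    r"Previous state: \[(?P<prev>\w+)\]\s*"
    r"Triggered event: \[(?P<event>\w+)\]\s*"
    r"Next state: \[(?P<next>\w+)\]",
    re.DOTALL,
)


def fsm_transitions(log_text):
    """Return (slot, previous_state, event, next_state) tuples in log order."""
    return [
        (m["slot"], m["prev"], m["event"], m["next"])
        for m in FSM_RE.finditer(log_text)
    ]


# Records 4 and 5 copied from the example output above.
SAMPLE = """\
4) FSM:<Slot 8> Transition at 110120 usecs after Thu Sep 29 17:46:20 2005
Previous state: [PLTFRM_STATE_MODULE_START_POWER_UP]
Triggered event: [PLTFRM_EVENT_MOD_END_POWER_UP]
Next state: [PLTFRM_STATE_MODULE_POWERED_UP]
5) FSM:<Slot 8> Transition at 704067 usecs after Thu Sep 29 17:46:20 2005
Previous state: [PLTFRM_STATE_MODULE_POWERED_UP]
Triggered event: [PLTFRM_EVENT_MODULE_REMOVED]
Next state: [PLTFRM_STATE_MODULE_ABSENT]
"""

if __name__ == "__main__":
    for slot, prev, event, nxt in fsm_transitions(SAMPLE):
        print(f"{slot}: {prev} --{event}--> {nxt}")
```

Reading the last tuple gives the most recent state of the slot, which is the natural starting point when working backwards through the log.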
Step 1 Choose Switches > Copy Configuration to save the running configuration to the startup configuration.
Step 2 Choose Switches > Hardware. Then select the Module Status tab in the Information pane and check
the Reset check box to reload the module. Click the Apply Changes icon.
Step 3 If the module is not up, choose Switches > Hardware and check the S/W Rev column to verify the
software image on the module.
Step 4 If the software image on the module is not the latest, choose Tools > Other > Software Install to
download the latest image to supervisor bootflash memory.
Step 5 Use the CLI to force-download the software image from the supervisor to the module.
switch# reload module 2 force-dnld
Step 6 If the module is still not up, choose Switches > Hardware and view the Power Admin column to verify
the power status for the module.
Step 7 If the module is not powered on, remove and reseat the module and select on from the Power Admin
drop-down menu to power on the module.
Step 8 If the module is still not up, right-click on the switch in the map pane and select Reset to reload the
entire switch.
Step 3 If the module is not up, verify the software image on the module.
switch# show module
Step 4 If the software image on the module is not the latest, download the latest image to supervisor bootflash
memory.
switch# copy tftp: bootflash:
Step 5 Force-download the software image from the supervisor to the module.
switch# reload module 2 force-dnld
Step 6 If the module is still not up, verify the power status for the module.
switch# show environment power
Step 7 If the module is not powered on, remove and reseat the module and then power on the module.
switch# config t
switch(config)# no poweroff module 2
switch(config)# exit
switch#
Step 8 If the module is still not up, reload the entire switch.
switch# reload
Module Resets
Resets and reboots of modules are covered in detail in the “Troubleshooting Cisco SAN-OS Software
System Reboots” section on page 2-12. If you use the module reset-reason CLI command and the
output has an “unknown” reset reason, this may indicate a hardware problem. Some of the conditions
that may cause this include:
• The switch experienced a power reset. This may be because you reset the power supplies, or because
of a power interruption or failure.
• The front panel reset button on the supervisor module was pressed.
• Any hardware failure that caused the processor, dynamic memory, or I/O to reset or hang.
CHAPTER 4
Troubleshooting Licensing
Licensing functionality is available in all switches in the Cisco MDS 9000 Family. This functionality
allows you to access specified premium features on the switch after you install the appropriate license
for that feature. Licenses are supported and enforced in Cisco MDS SAN-OS Release 1.3(1) and later.
This chapter includes the following topics:
• License Overview, page 4-1
• Best Practices, page 4-3
• Initial Troubleshooting Checklist, page 4-4
• Licensing Installation Issues, page 4-6
License Overview
Cisco SAN-OS requires licenses for advanced features. These licenses have two options:
• Feature-based licensing—Features that are applicable to the entire switch. You need to purchase and
install a license for each switch that uses the features you are interested in. The Enterprise license
is an example of a feature-based license.
• Module-based licensing—Features that require additional hardware modules. You need to purchase
and install a license for each module that uses the features you are interested in. The SAN Extension
over IP license is an example of a module-based license.
Note The Cisco MDS 9216i switch enables SAN Extension features on the two fixed IP services ports only.
The features enabled on these ports are identical to the features enabled by the SAN Extension over IP
license on the 14/2-port Multiprotocol Services (MPS-14/2) module. If you install a module with IP ports
in the empty slot on the Cisco MDS 9216i, a separate SAN Extension over IP license is required to enable
related features on the IP ports of the additional module.
Explanation The feature has a license with an invalid license Host ID. This can happen if a
supervisor module with licensed features for one switch is installed on another switch.
Recommended Action Reinstall the correct license for the chassis where the supervisor module is
installed.
Grace Period
If you use a feature that requires a license but have not installed a license for that feature, you are given
a 120-day grace period to evaluate the feature. You must purchase and install the number of licenses
required for that feature before the grace period ends or Cisco SAN-OS will disable the feature at the
end of the grace period. If you try to use an unlicensed feature, you may see the following system
messages:
Explanation The unlicensed feature has exceeded its grace time period. Applications using this
license will be shut down immediately.
Recommended Action Please install the license file to continue using the feature.
Explanation The Application [chars1] has not been licensed. The application will work for a grace
period of [dec] days after which it will be shut down unless a license file for the feature is installed.
Explanation The feature has exceeded its evaluation time period. The feature will be shut down
after a grace period.
Explanation The feature has not been licensed. The feature will work for a grace period, after which
the application(s) using the feature will be shut down.
Explanation The application will exceed its evaluation time period in the listed number of days and
will be shut down unless a permanent license for the feature is installed.
Recommended Action Install the license file to continue using the feature.
License packages can contain several features. If you disable a feature during the grace period and there
are other features in that license package that are still enabled, the clock does not stop for that license
package. To suspend the grace period countdown for a licensed feature, you must disable every feature
in that license package. Choose Switches > Licenses and select the Usage tab in Fabric Manager or use
the show license usage CLI command to determine which features are enabled for a license package.
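The grace period rule above amounts to one condition: the countdown for a license package keeps running while any feature in that package is enabled. The following is a minimal sketch of that rule in Python. The function names and the feature names in the usage example are placeholders for illustration, not actual Cisco SAN-OS feature or package names.

```python
def features_keeping_clock_running(package_features):
    """List the features in a license package that are still enabled.

    package_features maps a feature name to True (enabled) or False (disabled).
    """
    return sorted(name for name, enabled in package_features.items() if enabled)


def grace_clock_running(package_features):
    """The grace period countdown stops only when every feature is disabled."""
    return bool(features_keeping_clock_running(package_features))


if __name__ == "__main__":
    # Placeholder feature names for one hypothetical license package.
    package = {"feature_a": False, "feature_b": True, "feature_c": False}
    print(grace_clock_running(package))
    print(features_keeping_clock_running(package))
```

In practice, the enabled state of each feature would come from the Usage tab or from the show license usage output rather than from a hand-built dictionary.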
Best Practices
This section provides the best practices when dealing with licenses for Cisco SAN-OS products.
• Do not ignore grace period expiration warnings. Allow 60 days before the grace period expires to
allow time for ordering, shipping, and installation.
• Carefully determine the license(s) you require based on the features and modules that require a
license. Remember that you need one license per chassis for feature-based licenses and one per
module for module-based licenses.
• Order your license accurately:
– Enter the Product Authorization Key that appears in the Proof of Purchase document that comes
with your switch.
– Enter the correct chassis serial number when ordering the license. The serial number must be
for the same chassis that you plan to install the license on. Choose Switches > Hardware and
check the SerialNo Primary for the switch chassis in Fabric Manager or use the show license
host-id CLI command.
– Enter serial numbers accurately. The serial number contains zeros, but no letter “O”.
– Order the license specific to your chassis or module type. An MDS 9200 Series license will not
work on an MDS 9500 Series switch. Similarly, the SAN_EXTENSION_OVER_IP2 license
works for an MPS-14/2 module, but will not work for an IPS-4 module. See Table 8-3 on
page 8-7 for details on the SAN Extension over IP licenses available.
• Install licenses using the one-click method in Fabric Manager.
• Back up the license file to a remote, secure place. Archiving your license files ensures that you will
not lose the licenses in the case of a failure on your switch.
• Install the correct licenses on each switch, using the licenses that were ordered using that switch’s
serial number. Licenses are serial-number specific and platform or module type specific.
• Choose Switches > Licenses and select the Usage tab in Fabric Manager or use the show license
usage CLI command to verify the license installation.
• Never modify a license file or attempt to use it on a switch that it was not ordered for. If you RMA
a chassis, contact your customer support representative to order a replacement license for the new
chassis.
Checklist Checkoff
• Verify the chassis serial number for all licenses ordered.
• Verify the platform or module type for all licenses ordered.
• Verify that the Product Authorization Key you used to order the licenses comes from the
same chassis that you retrieved the chassis serial number on.
• Verify that you have installed all licenses on all switches that require the licenses for the
features you enable.
Step 1 Select Switches > Licenses from the Physical Attributes pane. You see the license information in the
Information pane, one line per feature.
Step 2 Click the Feature Usage tab to see the switch, name of the feature package, the type of license installed,
the number of licenses used (Installed Count), the expiration date, the grace period (if you do not have
a license for a particular feature), and any errors (for example, if you have a missing license). Click the
Keys tab to display information about each of the License Key files installed on your switches.
Step 3 Click the Usage tab to see the applications using the feature package on each switch. Use this tab to
determine which applications depend on each license you have installed.
Note Use the entire ID that appears after the colon (:). The VHD is the Vendor Host ID.
Example 4-4 Displays All Installed License Key Files and Contents
Symptom One-click license install fails or cannot connect to the licensing website.
Table 4-1 One-Click License Install Fails or Cannot Connect to License Website
Step 1 Choose Switches > Hardware and select the Inventory tab.
Step 2 Copy down the SerialNo Primary field for the chassis that matches where you want to install a new
license.
Note If you are ordering a module-based license, such as the SAN Extension over IP license package,
you still use the chassis serial number for the chassis where the module resides, not the module
serial number.
Use the show license host-id CLI command to obtain the correct chassis serial number for your switch
using the CLI.
When entering the chassis serial number during the license ordering process, do not use the letter “O”
in place of any zeros in the serial number.
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you purchased
support directly from Cisco Systems, contact Cisco Technical Support at this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtml
This error message is generated because the license grace period is only applicable when no licenses are
installed. The installation of one license terminates the grace period and will arbitrarily cause the second
module to shut down, because this is not allowed by licensing.
The workaround for this scenario includes doing one of the following:
• Concatenate both licenses into one license file.
• Manually reduce the usage count by one.
To concatenate both licenses into one license file, follow these steps:
Step 1 Bring down one of the modules manually to reduce the usage count by one.
Step 2 Reinsert the module after installing both licenses.
Beyond that, the frequency of these messages becomes hourly during the last seven days of the grace
period. The following example uses the FICON feature. On January 30th, you enabled the FICON
feature, using the 120-day grace period. You will receive grace period ending messages as:
• Daily alerts from January 30th to May 21st.
• Hourly alerts from May 22nd to May 30th.
On May 31st, the grace period ends, and the FICON feature is automatically disabled. You will not be
allowed to use FICON until you purchase a valid license.
Note You cannot modify the frequency of the grace period messages.
Caution After the final seven days of the grace period, the feature is turned off and your network traffic may be
disrupted. Any future upgrade will enforce license requirements and the 120-day grace period.
Step 1 Choose Admin > Licenses and select the Features tab.
Step 2 Click Check In FM.
Note Because of Caveat CSCeg23889, you might still receive Call Home or system messages for an unused
FM_SERVER_PKG license. This caveat describes how extraneous messages are sent after a Fabric
Manager Server license is checked in.
CHAPTER 5
Troubleshooting Cisco Fabric Services
This chapter describes procedures used to troubleshoot Cisco Fabric Services (CFS) problems in the
Cisco MDS 9000 Family multilayer directors and fabric switches. It includes the following sections:
• Overview, page 5-1
• Best Practices, page 5-2
• Initial Troubleshooting Checklist, page 5-3
• Merge Failure Troubleshooting, page 5-6
• Lock Failure Troubleshooting, page 5-7
• Distribution Status Verification, page 5-9
Overview
Many features in the Cisco MDS 9000 Family switches require configuration synchronization in all
switches in the fabric. It is important to maintain configuration synchronization across a fabric for
consistency. In the absence of a common infrastructure, such synchronization is achieved through
manual configuration at each switch in the fabric. This process is tedious and error prone.
As of Cisco MDS SAN-OS Release 2.0(1b), Cisco Fabric Services (CFS) provides a common
infrastructure for automatic configuration synchronization in the fabric. It provides the transport
function as well as a rich set of common services to the applications. CFS can discover CFS-capable
switches in the fabric as well as their application capabilities. Applications that can be synchronized using
CFS include:
• IVR
• NTP
• DPVM
• user roles
• AAA server addresses
• syslog
• call home
Applications may be added to this list in future releases.
All switches in the fabric must be CFS capable. A Cisco MDS 9000 Family switch is CFS capable if it
is running Cisco SAN-OS Release 2.0(1b) or later. Switches that are not CFS capable do not receive
distributions and result in part of the fabric not receiving the intended distribution.
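The CFS capability rule above is a release comparison, which can be sketched as follows. This is a minimal sketch in Python, assuming release strings of the simple form major.minor(maintenance) with an optional trailing letter, such as 2.0(1b); other release string formats are not handled, and the function names are hypothetical.

```python
import re

# Matches releases such as "2.0(1b)" or "1.3(1)"; other formats are rejected.
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_release(text):
    """Turn '2.0(1b)' into a comparable tuple (2, 0, 1, 'b')."""
    match = RELEASE_RE.match(text.strip())
    if not match:
        raise ValueError(f"unrecognized release string: {text!r}")
    major, minor, maintenance, letter = match.groups()
    return (int(major), int(minor), int(maintenance), letter)


# Minimum release for CFS, as stated in this chapter.
CFS_MINIMUM = parse_release("2.0(1b)")


def is_cfs_capable(release):
    """True if the release is 2.0(1b) or later."""
    return parse_release(release) >= CFS_MINIMUM


if __name__ == "__main__":
    # Hypothetical inventory: switch name mapped to its running release.
    fabric = {"switch-a": "2.0(1b)", "switch-b": "1.3(1)", "switch-c": "2.1(2)"}
    for name, release in fabric.items():
        status = "CFS capable" if is_cfs_capable(release) else "not CFS capable"
        print(f"{name}: {release} is {status}")
```

A check like this, run against the release reported by each switch, points to the switches that would not receive a CFS distribution.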
Best Practices
You can avoid problems when configuring CFS if you observe the following best practices:
• Make sure that all the applications that you are using are enabled for CFS distribution on all the
switches. By doing so, you ensure that application specific configurations will be in sync across the
fabric.
• Do not simultaneously acquire a lock by configuring CFS from two different switches for the same
application, even though the CFS module is capable of handling this type of activity. Applications
on both sides might try to take the lock and might take a while to come out of the deadlock.
• If the CFS distribution for an application is enabled, then ensure that you either commit, abort, or
clear the changes once you start the configuration. Applications take the lock on all the switches
that come under the scope of the application’s distribution. Once the lock is taken, if there is an ISL
flap or a new switch joins the fabric, then the merge for that application goes into the waiting/in
progress state until the lock is released.
Checklist Checkoff
• Verify that CFS is enabled for the same applications on all affected switches.
• Verify that CFS distribution is enabled for the same applications on all affected switches.
• Verify that there are no pending changes for an application and that a CFS commit was
issued for any configuration changes in a CFS enabled application.
• Verify that there are no unexpected CFS locked sessions. Clear any unexpected locked
sessions.
Step 1 Choose Admin > CFS on Device Manager to verify that an application is listed and enabled. Repeat this
on all switches.
Step 2 To list the set of switches in which an application is registered with CFS, choose the application
configuration menu on Fabric Manager and select the CFS tab. For example, to verify that DPVM is
enabled and global distribution is enabled on all switches, choose Fabricxx > All VSANs > DPVM and
select the CFS tab. Verify that the Oper field is enabled and the Global field is enabled for all switches
in the fabric.
Step 3 To determine if all the switches in the fabric constitute one CFS fabric, or a multitude of partitioned CFS
fabrics using Device Manager, follow these steps:
a. Choose Admin > CFS and highlight the application that you want to verify CFS on.
b. Click Details and select the Merge tab in the Details dialog box.
c. If you see multiple rows in the Merge status table, then the fabric is partitioned into multiple CFS
fabrics. Some features enable CFS per VSAN and this is expected. If the selected feature should be
fabric wide but you see multiple rows in the Merge status table, then the fabric may be partitioned,
and the merge status may show that the merge has failed, is pending, or is waiting.
Step 1 To verify that an application is listed and enabled, issue the show cfs application command on all
switches. An example of the show cfs application command follows:
Switch# show cfs application
-------------------------------------------
Application Enabled Scope
-------------------------------------------
ivr Yes Physical
ntp No Physical
dpvm Yes Physical
fscm Yes Physical
role Yes Physical
radius Yes Physical
fctimer No Physical
syslogd No Physical
callhome No Physical
device-alias Yes Physical
port-security Yes Logical
The Physical scope means that CFS applies the configuration for that application to the entire switch.
The Logical scope means that CFS applies the configuration for that application to a specific VSAN.
Step 2 Verify the set of switches in which an application is registered with CFS, using the show cfs peers name
application-name for physical scope applications, and the show cfs peers name application-name vsan
vsan-id for logical scope applications.
An example command output for a physical scope application follows:
Switch# show cfs peers name dpvm
Scope : Physical
--------------------------------------------------
Switch WWN IP Address
--------------------------------------------------
20:00:00:0e:d7:0e:bf:c0 10.76.100.51 [Local]
20:00:00:0e:d7:00:3c:9e 10.76.100.52
Note The show cfs peers name application-name command displays the peers for all VSANs when applied
to a logical application.
Step 3 To determine if all the switches in the fabric constitute one CFS fabric, or a multitude of partitioned CFS
fabrics, issue the show cfs merge status name application-name command and the show cfs peers
name application-name command and compare the outputs. If the outputs contain the same list of
switches, the entire set of switches constitutes one CFS fabric. When this is the case, the merge status
should always show success at all switches. Example command outputs follow:
Switch# show cfs merge status name dpvm
Physical Merge Status: Success [ Sat Nov 20 11:59:36 2004 ]
Local Fabric
---------------------------------------------------------
Switch WWN IP Address
---------------------------------------------------------
20:00:00:05:30:00:4a:de 10.76.100.51 [Merge Master]
20:00:00:0d:ec:0c:f1:40 10.76.100.204
Switch# show cfs peers name dpvm
Scope : Physical
--------------------------------------------------
Switch WWN IP Address
--------------------------------------------------
20:00:00:0d:ec:0c:f1:40 10.76.100.204 [Local]
20:00:00:05:30:00:4a:de 10.76.100.51
If the list of switches in the show cfs merge status name command output is shorter than that of the
show cfs peers name command output, the fabric is partitioned into multiple CFS fabrics and the merge
status may show that the merge has failed, is pending, or is waiting.
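The comparison in Step 3 can be automated by extracting the switch WWNs from both outputs and comparing the two sets. The following is a minimal sketch in Python, assuming both outputs have been captured as text and that the merge status text contains only the Local Fabric table. The function names are hypothetical.

```python
import re

# A switch WWN is eight colon-separated hex bytes, e.g. 20:00:00:05:30:00:4a:de
WWN_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}\b")


def switch_wwns(output):
    """Return the set of switch WWNs that appear in a command output."""
    return {wwn.lower() for wwn in WWN_RE.findall(output)}


def missing_from_merge(merge_status_output, peers_output):
    """Return the peers that are absent from the merge status list.

    An empty set means both lists agree (one CFS fabric); a non-empty set
    suggests the fabric is partitioned into multiple CFS fabrics.
    """
    return switch_wwns(peers_output) - switch_wwns(merge_status_output)


# Outputs copied from the example in Step 3.
MERGE_STATUS = """\
Physical Merge Status: Success [ Sat Nov 20 11:59:36 2004 ]
Local Fabric
20:00:00:05:30:00:4a:de 10.76.100.51 [Merge Master]
20:00:00:0d:ec:0c:f1:40 10.76.100.204
"""

PEERS = """\
Scope : Physical
20:00:00:0d:ec:0c:f1:40 10.76.100.204 [Local]
20:00:00:05:30:00:4a:de 10.76.100.51
"""

if __name__ == "__main__":
    missing = missing_from_merge(MERGE_STATUS, PEERS)
    if missing:
        print("Possible CFS partition; peers not in merge list:", sorted(missing))
    else:
        print("Merge status and peers list the same switches.")
```

Comparing sets rather than lines ignores ordering and the [Local] and [Merge Master] annotations, which differ between the two commands.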
Step 1 Select the CFS tab for the application that you are configuring and check the merge field to identify a
switch that shows a merge failure. For example, choose Fabricxx > All VSANS > DPVM and select the
CFS tab to determine if there is a merge failure for DPVM.
Step 2 Set the Config Action drop-down menu to commit and click Apply Changes to restore all peers in the
fabric to the same configuration database.
Step 1 To identify a switch that shows a merge failure, issue the show cfs merge status name application-name
command. Example command output follows:
Switch# show cfs merge status name ntp
Remote Fabric
---------------------------------------------------------
Switch WWN IP Address
---------------------------------------------------------
20:00:00:0d:ec:06:55:c0 10.76.100.205 [Merge Master]
Step 2 Enter configuration mode and issue the application-name commit command to restore all peers in the
fabric to the same configuration database. Example command output follows:
Switch# config terminal
Switch(config)# ntp commit
Switch(config)#
Step 1 Select the CFS tab for the application that you are configuring and view the Master check box to identify
the master switch for that CFS application. For example, choose Fabricxx > All VSANS > DPVM and
select the CFS tab.
Step 2 Set the Config Action drop-down menu on the master switch to commit or abort and click Apply
Changes to restore all peers in the fabric to the same configuration database and free the CFS lock.
Step 1 Issue a show cfs lock name command to determine the lock holder. An example of the show cfs lock
name command follows:
Switch# show cfs lock name ntp
Application:ntp
Scope :Physical
--------------------------------------------------------------------
Switch WWN IP Address User Name User Type
--------------------------------------------------------------------
20:00:00:05:30:00:6b:9e 10.76.100.167 admin CLI/SNMP v3
Step 2 If the lock is being held by a remote peer, an application-name commit command or an
application-name abort command must be executed at that switch. An example of the application-name
commit command follows:
Switch# config terminal
Switch(config)# ntp commit
Switch(config)#
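To decide where the commit or abort must be run, the lock holder from Step 1 can be compared with the local switch. The sketch below is a hedged Python example that assumes the table layout of the sample output above. LOCAL_WWN is a hypothetical value standing in for the WWN of the switch you are logged in to.

```python
import re

# Sample copied from the Step 1 output above.
SAMPLE = """\
Application:ntp
Scope :Physical
--------------------------------------------------------------------
Switch WWN IP Address User Name User Type
--------------------------------------------------------------------
20:00:00:05:30:00:6b:9e 10.76.100.167 admin CLI/SNMP v3
"""

# One row: switch WWN, IP address, user name, then the user type (rest of line).
ROW = re.compile(
    r"^((?:[0-9a-f]{2}:){7}[0-9a-f]{2})\s+(\S+)\s+(\S+)\s+(.+?)\s*$",
    re.IGNORECASE,
)

# Hypothetical WWN of the local switch, used only for this illustration.
LOCAL_WWN = "20:00:00:0d:ec:06:55:c0"


def parse_lock(text):
    """Return the application name and the list of lock holders."""
    info = {"application": None, "holders": []}
    for raw in text.splitlines():
        line = raw.strip()
        if line.lower().startswith("application"):
            info["application"] = line.split(":", 1)[1].strip()
            continue
        match = ROW.match(line)
        if match:
            wwn, ip, user, user_type = match.groups()
            info["holders"].append(
                {"wwn": wwn.lower(), "ip": ip, "user": user, "user_type": user_type}
            )
    return info


info = parse_lock(SAMPLE)
remote = [h for h in info["holders"] if h["wwn"] != LOCAL_WWN]
for holder in remote:
    print(f"{info['application']} lock held remotely at {holder['ip']} by {holder['user']}")
```

If the remote list is not empty, the commit or abort in Step 2 has to be issued at the listed switch rather than locally.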
Step 1 Select the CFS tab for the application that you are configuring and view the Master check box to identify
the master switch for that CFS application. For example, choose Fabricxx > All VSANs > DPVM and
select the CFS tab.
Step 2 Set the Config Action drop-down menu on the master switch to clear and click Apply Changes to free
the CFS lock.
Scope :Physical
--------------------------------------------------------------------
Switch WWN IP Address User Name User Type
--------------------------------------------------------------------
20:00:00:05:30:00:6b:9e 10.76.100.167 admin CLI/SNMP v3
CHAPTER 6
Troubleshooting Ports
This chapter describes how to identify and resolve problems that can occur with ports in the Cisco MDS
9000 Family of multilayer directors and fabric switches. It includes the following sections:
• Overview
• Best Practices
• Initial Troubleshooting Checklist
• Overview of the FC-MAC Driver and the Port Manager
• Common Problems with Port Interfaces
Overview
Before a switch can relay frames from one data link to another, the characteristics of the interfaces
through which the frames are received and sent must be defined. The configured interfaces can be Fibre
Channel interfaces, Gigabit Ethernet interfaces, the management interface (mgmt0), or VSAN interfaces
(IPFC).
Each physical Fibre Channel interface in a switch may operate in one of several port modes: E port, F
port, FL port, TL port, TE port, SD port, and B port. In addition to these modes, each interface may be
configured in auto or Fx port modes. These modes determine the port type during interface initialization.
Each interface has an associated administrative configuration and operational status:
• The administrative configuration does not change unless you modify it. This configuration has
various attributes that you can configure in administrative mode.
• The operational status represents the current status of a specified attribute, such as the interface
speed. This status is read-only and cannot be changed. Some values (such as the operational
speed) may not be valid when the interface is down.
For a complete description of port modes, administrative states, and operational states, refer to the Cisco
MDS 9000 Family Configuration Guide and the Cisco MDS 9000 Fabric Manager Configuration Guide.
Best Practices
You can avoid potential problems by following best practices when you configure a port interface.
• Before you begin configuring a switch, make sure that the modules in the chassis are functioning as
designed. Choose Switches > Hardware in Fabric Manager or use the show module CLI command
to verify that a module is OK or active before continuing the configuration.
• Ensure that a Fibre Channel port is configured to the appropriate port mode for your configuration.
The default port mode is auto on the 16-port 2-Gbps Fibre Channel switching modules and Fx on
the 32-port 2-Gbps Fibre Channel switching modules.
• Configure devices attached to TL ports in zones.
• Observe the following guidelines when configuring a 32-port 2-Gbps Fibre Channel switching
module or the Cisco MDS 9100 Series. When configuring these host-optimized ports, the following
port mode guidelines apply:
– You can configure only the first port in each 4-port group as an E port (for example, port 1 from
ports 1-4, port 5 from ports 5-8, and so on). If the first port in the group is configured as an E
port, the other three ports in each group (ports 2-4, 6-8, and so on) are not usable and remain
shutdown.
– If any of the other three ports are enabled, you cannot configure the first port as an E port. The
other three ports continue to remain enabled.
– The auto mode is not allowed in a 32-port 2-Gbps Fibre Channel switching module or the
host-optimized ports in the Cisco MDS 9100 Series (16 host-optimized Fibre Channel ports in the
Cisco MDS 9120 switch and 32 host-optimized Fibre Channel ports in the Cisco MDS 9140
switch).
– The default port mode is Fx (Fx negotiates to F or FL) for 32-port 2-Gbps Fibre Channel
switching modules and the host-optimized Fibre Channel ports in the Cisco MDS 9100 Series.
– The 32-port 2-Gbps Fibre Channel switching module has not been qualified for FICON.
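The port mode guidelines above for host-optimized ports can be expressed as a small validation routine. The following Python sketch is illustrative only: the config dictionary (port number mapped to a mode string and an enabled flag) is a hypothetical data model, not something the switch exports in this form.

```python
def group_of(port):
    """Return the 1-based port numbers in the same 4-port group as `port`."""
    first = ((port - 1) // 4) * 4 + 1
    return list(range(first, first + 4))


def check_host_optimized(config):
    """Check a {port: (mode, enabled)} mapping against the guidelines.

    Ports missing from the mapping are assumed to be shut down.
    Returns a list of human-readable violations (empty if none).
    """
    problems = []
    for port, (mode, enabled) in sorted(config.items()):
        group = group_of(port)
        if mode == "auto":
            problems.append(f"port {port}: auto mode is not allowed")
        if mode == "E":
            if port != group[0]:
                problems.append(
                    f"port {port}: only port {group[0]} in its group may be an E port"
                )
            else:
                # The other three ports in the group must remain shut down.
                busy = [p for p in group[1:] if config.get(p, ("Fx", False))[1]]
                if busy:
                    problems.append(
                        f"port {port}: E port requires ports {busy} to be shut down"
                    )
    return problems


config = {
    1: ("E", True),     # first port of group 1-4, but port 3 is enabled
    2: ("Fx", False),
    3: ("Fx", True),
    6: ("E", True),     # not the first port of group 5-8
    9: ("auto", True),  # auto mode is not allowed on these ports
}
problems = check_host_optimized(config)
for problem in problems:
    print(problem)
```

The group arithmetic follows the examples in the guidelines: port 1 leads ports 1-4, port 5 leads ports 5-8, and so on.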
Checklist Checkoff
Check the physical media to ensure there are no damaged parts.
Verify that the SFP (small form-factor pluggable) devices in use are those authorized by
Cisco and that they are not faulty.
Verify that you have enabled the port by right-clicking the port in Device Manager and
selecting enable or by using the no shut CLI command.
Reason Code                     Description                                          Applicable Mode
Link failure or not connected   The physical layer link is not operational.          All
SFP not present                 The small form-factor pluggable (SFP) hardware
                                is not plugged in.
Initializing                    The physical layer link is operational and the
                                protocol initialization is in progress.
Reconfigure fabric in progress  The fabric is currently being reconfigured.
Offline                         The Cisco SAN-OS software waits for the
                                specified R_A_TOV time before retrying
                                initialization.
Inactive                        The interface VSAN is deleted or is in a suspended
                                state.
                                To make the interface operational, assign that port
                                to a configured and active VSAN.
Hardware failure                A hardware failure is detected.
Error disabled                  Error conditions require administrative attention.
                                Interfaces may be error-disabled for various
                                reasons. For example:
                                • Configuration failure.
                                • Incompatible buffer-to-buffer credit
                                  configuration.
                                To make the interface operational, you must first
                                fix the error conditions causing this state; then,
                                administratively shut down and reenable the
                                interface.
Reason Code                     Description                     Applicable Mode
Isolation due to ELP failure    The port negotiation failed.    Only E ports and TE ports
Isolation due to ESC failure    The port negotiation failed.    Only E ports and TE ports
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the VSANs
crossing the EISL instead of just the VSAN experiencing the isolation problem.
• E or TE port initialization.
• SFP validation.
The FC-MAC detects the port is in one of the following states:
• Disable—The port is administratively disabled.
• Enable—The port is administratively enabled. In this state, the port may be in speed initialization,
loop-initialization, link (point-to-point connection) initialization, or the link-up state.
• HW Failure—The port has been declared bad due to a hardware failure.
• Pause—An intermediate state after the link is down and subsequent enabling of the port to start the
port initialization.
You can check the state of the port using the command:
show hardware internal fc-mac port slot/port port-info
The FLOGI server is a separate application that handles the FLOGI processing for Nx ports.
Device View
Basic port monitoring using Device Manager begins with the visual display in the Device View
(Figure 6-1). Port display descriptions include:
• Green box: A successful fabric login has occurred; the connection is active.
• Red X: A small form-factor pluggable transceiver (SFP) is present but there is no connection. This
could indicate a disconnected or faulty cable, or no active device connection.
• Red box: An SFP is present but fabric login (FLOGI) has failed. This typically indicates a mismatch
in port or fabric parameters with the neighboring device. For example, a port parameter mismatch
would occur if a node device were connected to a port configured as an E port. An example of a
fabric parameter mismatch would be differing timeout values.
• Yellow box: In Device Manager, a port was selected.
• Gray box: The port is administratively disabled.
• Black box: An SFP is not present.
Note To issue commands with the internal keyword, you must have an account that is a member of the
network-admin group.
Note Use the fcmac2 keyword for the MDS 9120, MDS 9140, MDS 9216i, and the MPS-14/2 module.
Note To issue CLI commands with the internal keyword, you must have an account that is a member of the
network-admin group.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the VSANs
crossing the EISL instead of just the VSAN experiencing the problem.
Note FLOGI is transparent to the MAC driver and is based on some expected configuration. The
MAC driver assumes that the FLOGI process is completed.
Note The link reinitializes after a link down event is initiated only if enable is issued by the Port
Manager.
6. Checks for the presence of SFP/GBIC. If present, FC-MAC checks for loss of signal. The loss of
signal state indicates either the physical connectivity between two end ports is bad or there is a
transmit fault in the SFP. Use the show hardware internal fc-mac port slot/port gbic-info
command to check for the transmit fault.
7. Checks for the speed and sync state of the port. If the port is in the speed initialization state, then:
– Auto speed is in progress is displayed if the port is in auto mode.
– Waiting for stable sync is displayed if the port is configured for a fixed speed.
– Sync not acquired is displayed if the MAC state indicates a loss of synchronization. In auto
mode, this state is not necessarily an error. In any case, check the speed capabilities and
configuration at both ends.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the VSANs
crossing the EISL instead of just the VSAN experiencing the problem.
Step 1 Use the show interface fc slot/port CLI command and verify that the Fibre Channel interface connected
to the device in question is up and free of any errors. (See Example 6-2.)
If the interface is not working correctly, check the cabling and the host or storage device interface for
faults. If the interface is working correctly, proceed to the next step.
Step 2 Verify that the device in question appears in the FLOGI database. To do this, enter the following
command:
show flogi database vsan vsan-id
If the device in question appears in this output, skip to Step 7. If the device does not appear in the output,
go to the next step.
Step 3 Use the shutdown CLI command in interface configuration mode to shut down the Fibre Channel
interface connected to the device in question.
switch# config terminal
switch(config)# interface fcx/x
switch(config-if)# shutdown
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the
VSANs crossing the EISL instead of just the VSAN experiencing the problem.
Step 4 Use the no shutdown CLI command on the Fibre Channel interface.
switch(config-if)# no shutdown
By shutting down the interface and bringing it back up, you can determine what happens when the
connected device tries to log in to the interface.
Use the show flogi internal event-history interface CLI command to view the events that occurred on
the interface after you enabled it again. The comments that follow each section of output explain the
meaning of the output.
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
If the device logs in successfully, proceed to the next step. Otherwise, you may have a problem with the
device or its associated software.
Step 5 Use the shutdown CLI command in interface mode to shut down the Fibre Channel interface. Then use
the no shutdown CLI command after turning on the debug described in Step 6 and Step 7.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the
VSANs crossing the EISL instead of just the VSAN experiencing the problem.
Step 6 Use the debug fcns events register vsan CLI command to watch the FLOGI process take place.
switch# debug fcns events register vsan 99
This command enables debug mode for name server registration. It generates messages on the switch
console related to FCNS events. The system output may look something like this:
switch# config t
Enter configuration commands, one per line. End with CNTL/Z.
switch(config)# interface fc3/14
switch(config-if)# no shutdown /* enable the port */
switch(config-if)# Feb 17 04:42:54 fcns: vsan 99: Created entry for port-id 27800
Feb 17 04:42:54 fcns: vsan 99: Got Entry for port-id 27800
Feb 17 04:42:54 fcns: vsan 99: Registered port-name 36a4078be0000021 for port-id 780200
Feb 17 04:42:54 fcns: vsan 99: Registered node-name 36a4078be0000020 for port-id 780200
/* The WWPN and FCID for the port. Note that the bytes in the world wide name are reversed. */
Feb 17 04:42:54 fcns: vsan 99: Registered cos 8 for port-id 780200
/* Class of Service */
Feb 17 04:42:54 fcns: vsan 99: Registered port-type 1 for port-id 780200
/* Port Type */
Feb 17 04:42:54 fcns: vsan 99: Reading configuration for entry with port-name
36a4078be0000021, node-name 36a4078be0000020
Feb 17 04:42:54 fcns: vsan 99: No configuration present for this portname
Feb 17 04:42:54 fcns: vsan 99: No configuration present for this nodename
/* Port is now registered in nameserver, will send out RSCN to it */
Feb 17 04:42:54 fcns: vsan 99: Trying to send RSCN; affected port 780200
Feb 17 04:42:54 fcns: vsan 99: rscn timer started for port 780200
Feb 17 04:42:54 fcns: vsan 99: Saving new entry into pss
Feb 17 04:42:54 fcns: vsan 99: Sending sync message to the standby
Feb 17 04:42:54 fcns: vsan 99: sending accept response to 780200
/* RSCN was received by N/NL port */
Feb 17 04:42:55 fcns: vsan 99: Registered fc4-types for port-id 780200
Feb 17 04:42:55 fcns: vsan 99: Registered fc4-features for fc4_type 8 for port-id 780200
/* FC4 Type, type 8 FCP has been registered */
Additional lines similar to these will be listed if more name server objects are registered.
Step 7 If you are managing the switch over a Telnet connection, enable terminal monitoring by entering the
terminal monitor CLI command in exec mode.
The system output looks like this:
switch# show fcns database detail vsan 99
------------------------
VSAN:99 FCID:0x780200
------------------------
port-wwn (vendor) :21:00:00:e0:8b:07:a4:36 (QLogic) /* Port world wide name */
node-wwn :20:00:00:e0:8b:07:a4:36
class :3 /* Fibrechannel class of service */
node-ip-addr :0.0.0.0 /* IP Address */
ipa :ff ff ff ff ff ff ff ff
fc4-types:fc4_features:scsi-fcp:init /* Registered FC4 Types: example SCSI and
initiator */
symbolic-port-name :
symbolic-node-name :
port-type :N /* Fibrechannel port type (F,FL) */
port-ip-addr :0.0.0.0
fabric-port-wwn :20:8e:00:05:30:00:86:9e /* wwn of the switch port */
hard-addr :0x000000
Other attribute objects of the Nx port are registered one per register operation after the FLOGI process
is complete. The Nx port performs PLOGI to the well-known address of the Name Server, 0xFFFFFC.
The FC_CT Common Transport protocol uses Request and Accept messages to conduct transactions. To
verify that additional attributes are correctly registered and recorded in the database, you can use the
SAN-OS debug facility.
Note The command show fcns database detail vsan X displays a detailed list of all devices registered in the
fabric.
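The debug output in Step 6 prints world wide names with their bytes reversed, while show fcns database detail prints them in the usual colon-separated order. The following Python sketch shows one way to cross-check the two, assuming the formats of the samples above. The function names are illustrative, and the /* ... */ annotations are stripped because they are editorial comments in this guide rather than switch output.

```python
import re


def debug_wwn_to_canonical(hex_str):
    """Reverse the byte order of a WWN as printed by the fcns debug output."""
    raw = bytes.fromhex(hex_str)
    return ":".join(f"{b:02x}" for b in reversed(raw))


def parse_fcns_detail(text):
    """Parse one entry of 'show fcns database detail' into a dict."""
    # Remove the /* ... */ annotations, which may span two lines.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    entry = {}
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"VSAN:(\d+)\s+FCID:(0x[0-9a-fA-F]+)", line)
        if header:
            entry["vsan"] = int(header.group(1))
            entry["fcid"] = header.group(2)
        elif line.startswith("fc4-types:fc4_features:"):
            entry["fc4-types:fc4_features"] = line.split(":", 2)[2]
        elif " :" in line:
            key, _, value = line.partition(" :")
            entry[key.strip()] = value.strip()
    return entry


# Lines copied from the sample output above.
SAMPLE = """\
VSAN:99 FCID:0x780200
port-wwn (vendor) :21:00:00:e0:8b:07:a4:36 (QLogic) /* Port world wide name */
node-wwn :20:00:00:e0:8b:07:a4:36
class :3 /* Fibrechannel class of service */
fc4-types:fc4_features:scsi-fcp:init /* Registered FC4 Types: example SCSI and
initiator */
port-type :N /* Fibrechannel port type (F,FL) */
fabric-port-wwn :20:8e:00:05:30:00:86:9e /* wwn of the switch port */
"""

entry = parse_fcns_detail(SAMPLE)
# port-name value taken from the Step 6 debug output.
debug_port_name = debug_wwn_to_canonical("36a4078be0000021")
port_wwn = entry["port-wwn (vendor)"].split()[0]
print("debug and name server agree:", port_wwn == debug_port_name)
```

A mismatch between the two values would suggest that the debug lines and the database entry belong to different devices.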
Table 6-5 lists possible causes and solutions for link flapping.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the VSANs
crossing the EISL instead of just the VSAN experiencing the problem.
Figure 6-4 Link Initialization Flow (diagram not reproduced). Legend:
Ordered sets of 8B/10B coding: NOS, OLS, LR = Link Reset, LRR = Link Reset Response, Idle
Port states: AC = Active State, LR = Link Recovery State (LR1, LR2, LR3), LF = Link Failure State
(LF1, LF2), OL = Offline State (OL1, OL2, OL3)
Figure 6-4 shows the link initialization flow. It displays the ordered sets transmitted between the ports
and the primary operational states of the port during the process. They include:
1. Active state.
2. Link recovery state (LR):
a. LR transmit substate (LR1)
b. LR receive substate (LR2)
c. LRR receive substate (LR3)
Table 6-6 Link Flap Reasons Initiated by a Device Connected to the Switch Port
Reason Description
Sync Loss A synchronization loss condition persisted for more than 100 milliseconds.
Look at the Invalid Transmission Word Count to check whether the physical
link is really bad and if that caused the loss of synchronization. Sometimes this
is not necessarily a problem with the physical link, but with the way some
devices initialize the link. Use attach module to connect to the module and
then use the show hardware internal debug-info interface CLI command.
See Table 6-2.
Loss of signal A signal loss condition persisted for more than 100 milliseconds. Look at the
Invalid Transmission Word Count to check whether the physical link is really
bad and if that caused the loss of synchronization. Sometimes this is not
necessarily a problem with the physical link, but with the way some devices
initialize the link. If the link does not come up after a flap, then probably the
other end is in a shutdown state or the cable is broken. You can check for the
broken or disconnected optical link by using the show hardware internal
fc-mac port slot/port gbic-info CLI command.
NOS received A NOS received condition is detected. If the other end is an MDS port, then
the NOS is transmitted by the other end in one of the following conditions:
• A signal loss or sync loss condition is detected.
• The port is administratively shut down.
• The port is operationally down.
OLS received An OLS received condition is detected.
LR received B2B Link reset (LR) failed because the receive queue (in the queue engine) was not
empty.
Cr loss Too many credit loss events occurred.
Rx queue overflow A receive queue overflow occurred in the queue engine. This can happen
under the following conditions:
• Improper credit configuration at one or both ends of the link.
• A bad link can sometimes result in extra R_RDYs. Check for invalid
transmission words at both ends.
LIP F* received A loop initialization procedure (LIP) was received.
LC port shutdown The port shutdown was invoked. Use the show process exception CLI
command to check for any other errors.
LIP received B2B An LIP was received while the Rx queue was not empty.
OPNy tmo B2B An open circuit on a loop (OPNy) timeout occurred while the Rx queue was
not empty.
OPNy Ret B2B An OPNy was returned while the Rx queue was not empty.
Cr Loss B2B Credit loss occurred while the Rx queue was not empty.
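The follow-up commands named in Table 6-6 can be collected into a small lookup for use in scripts or runbooks. This Python sketch is only a convenience wrapper around the table. The dictionary covers just the rows that name a command, and the rule that a reason ending in B2B means the Rx queue was not empty is a pattern read from the table rows, not a documented guarantee.

```python
# Follow-up commands taken from the rows of Table 6-6 that name one.
FOLLOW_UP = {
    "Sync Loss": "attach module, then show hardware internal debug-info interface",
    "Loss of signal": "show hardware internal fc-mac port slot/port gbic-info",
    "LC port shutdown": "show process exception",
}


def describe(reason):
    """Summarize what Table 6-6 says about a link flap reason string."""
    return {
        "reason": reason,
        # Every B2B row in the table describes an event that occurred
        # while the Rx queue was not empty.
        "rx_queue_not_empty": reason.endswith("B2B"),
        "follow_up": FOLLOW_UP.get(reason),
    }


for reason in ("Loss of signal", "Cr Loss B2B", "OLS received"):
    print(describe(reason))
```

Reasons without an entry return None for follow_up, which signals that the table gives a description only.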
The output in Example 6-3 also displays evidence of corrupt data on the wire if there are a high number
of CRCs and errors. Discards may or may not indicate a problem. For example, a frame can be discarded
because of an ACL violation.
Table 6-7 Port Bounces Between the Initializing and Offline States
Step 1 Use the show interface CLI command to verify E port isolation:
switch# show interface fc2/4
fc2/4 is down (Isolation due to ELP failure)
Hardware is Fibre Channel, WWN is 20:44:00:05:30:00:18:a2
vsan is 1
Beacon is turned off
1445517676 packets input, 727667035658 bytes, 0 discards
0 input errors, 0 CRC, 0 invalid transmission words
0 address id, 0 delimiter
Received 0 runts, 0 jabber, 0 too long, 0 too short
0 EOF abort, 0 fragmented, 0 unknown class
100 OLS, 67 LRR, 37 NOS, 0 loop inits
In this example the interface indicates a link isolation caused by an ELP failure on an E port. The ELP
is a frame sent between two switches to negotiate fabric parameters.
Step 2 Verify that the following parameters match on each switch in the VSAN using the show fctimer CLI
command:
• E_D_TOV timer
• R_A_TOV timer
• F_S_TOV timer
Note Because fabric parameters are configured on a per VSAN basis, they are required to be the same
for all switches within a VSAN.
This sample output shows the default settings for these timeout values.
Step 3 Optionally, use the fctimer CLI command in config mode to set these timeout values globally across all
VSANs. To override the global values for a particular VSAN, use the per-VSAN form of the command; for
example, fctimer D_S_TOV <timeout> vsan <vsan-id> sets the D_S_TOV timeout for that VSAN.
Step 4 Use the show port internal info interface fc CLI command to verify that Rx buffer size matches on both
ends of the ISL.
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
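Steps 1 and 2 of this procedure can be sketched in code: read the isolation reason from the first line of the show interface output, and if it is an ELP failure, compare the fabric timers collected from each switch. The Python below is a minimal sketch. The switch names and timer values in FABRIC_TIMERS are hypothetical sample data that you would fill in from show fctimer on each switch, and the function names are illustrative.

```python
import re

# First line of 'show interface': name, state, optional reason in parentheses.
STATUS = re.compile(r"^(fc\d+/\d+) is (\S+)(?: \((.+)\))?\s*$", re.MULTILINE)


def interface_status(show_interface_text):
    """Return interface name, state, and reason code (or None if absent)."""
    match = STATUS.search(show_interface_text)
    if match is None:
        return None
    return {
        "interface": match.group(1),
        "state": match.group(2),
        "reason": match.group(3),
    }


def timer_mismatches(timers_by_switch):
    """Return {timer: {value: [switches]}} for timers that differ in the VSAN."""
    mismatches = {}
    names = sorted({name for timers in timers_by_switch.values() for name in timers})
    for name in names:
        seen = {}
        for switch, timers in sorted(timers_by_switch.items()):
            seen.setdefault(timers.get(name), []).append(switch)
        if len(seen) > 1:
            mismatches[name] = seen
    return mismatches


# Lines copied from the Step 1 sample output.
SHOW_INTERFACE = """\
fc2/4 is down (Isolation due to ELP failure)
Hardware is Fibre Channel, WWN is 20:44:00:05:30:00:18:a2
vsan is 1
"""

# Hypothetical timer values in milliseconds, one entry per switch in the VSAN.
FABRIC_TIMERS = {
    "switch-a": {"E_D_TOV": 2000, "R_A_TOV": 10000, "F_S_TOV": 5000},
    "switch-b": {"E_D_TOV": 2000, "R_A_TOV": 10000, "F_S_TOV": 5000},
    "switch-c": {"E_D_TOV": 4000, "R_A_TOV": 10000, "F_S_TOV": 5000},
}

status = interface_status(SHOW_INTERFACE)
mismatches = {}
if status and status["reason"] == "Isolation due to ELP failure":
    mismatches = timer_mismatches(FABRIC_TIMERS)
print(status)
print(mismatches)
```

Grouping switches by value, rather than returning a simple yes or no, makes it easier to see which switch is the outlier.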
Step 1 Choose Switches > Interfaces > FC Physical to verify that the E port did not come up because of a zone
merge failure.
Note Zoning information exists on a per VSAN basis. Therefore, for a TE port, it may be necessary to verify
that the zoning information does not conflict for any allowed VSAN.
Step 2 Select Zone > Edit Local Full Zone Database to verify the zoning configuration.
Step 3 Use one of the following two approaches to resolve a zone merge failure:
• Choose File > Restore from the Edit Local Full Zone Database dialog box to overwrite the zoning
configuration of one switch with the other switch’s configuration.
The Restore option overwrites the local switch’s active zone set with that of the remote switch.
• If the zoning databases between the two switches are overwritten, you cannot use the Restore
option. To work around this, you can manually change the content of the zone database on either of
the switches using the Edit Local Full Zone Database, and then choose Switches > Interfaces > FC
Physical and select down and then up on the Admin Status drop-down menu for the isolated port.
Step 4 If the isolation is specific to one VSAN on a trunking (TE) port rather than an E port, the correct way
to cycle the VSAN down and up is to remove the VSAN from the list of allowed VSANs on that trunk port
and then add it back.
a. Choose Switches > Interfaces > FC Physical and select the Trunk Config tab.
b. Remove the VSAN from the Allowed VSAN list and click Apply Changes.
c. Add the VSAN back to the Allowed VSAN list and click Apply Changes.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the
VSANs crossing the EISL instead of just the VSAN experiencing the isolation problem.
You can use the Zone Merge Analysis tool in Fabric Manager to check the compatibility of two active
zone sets in two switches before actually merging them. Refer to the Cisco MDS 9000 Fabric
Manager Configuration Guide for more information.
Step 1 Use the show interface command output to verify that the E port did not come up because of a zone
merge failure.
Note Zoning information exists on a per VSAN basis. Therefore, for a TE port, it may be necessary to verify
that the zoning information does not conflict for any allowed VSAN.
Note We recommend that you do not disable and then enable a T or TE port. This would affect all the
VSANs crossing the EISL instead of just the VSAN experiencing the isolation problem.
Step 1 Use the show interface command to verify that the switch detected a problem and disabled the port.
Check cables, SFPs, and optics.
mds# show interface fc1/14
fc1/14 is down (errDisabled)
Step 2 Use the show port internal event-history interface command to view information about the internal
state transitions of the port. In this example, port fc1/7 entered the ErrDisabled state because of a
capability mismatch, or “CAP MISMATCH.” You might not know how to interpret this event, but you
can look for more information with other commands.
mds# show port internal event-history interface fc1/7
>>>>FSM: <fc1/7> has 86 logged transitions<<<<<
1) FSM:<fc1/7> Transition at 647054 usecs after Tue Jan 1 22:44..
Previous state: [PI_FSM_ST_IF_NOT_INIT]
Triggered event: [PI_FSM_EV_MODULE_INIT_DONE]
Next state: [PI_FSM_ST_IF_INIT_EVAL]
2) FSM:<fc1/7> Transition at 647114 usecs after Tue Jan 1 22:43..
Previous state: [PI_FSM_ST_IF_INIT_EVAL]
Triggered event: [PI_FSM_EV_IE_ERR_DISABLED_CAP_MISMATCH]
Next state: [PI_FSM_ST_IF_DOWN_STATE]
Step 3 Use the show logging logfile command to display the switch log file and view a list of port state changes.
In this example, an error was recorded when someone attempted to add port fc1/7 to PortChannel 7. The
port was not configured identically to PortChannel 7, so the attempt failed.
mds# show logging logfile
. . .
Jan 4 06:54:04 switch %PORT_CHANNEL-5-CREATED: port-channel 17 created
Jan 4 06:54:24 switch %PORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel
17 is down (No operational members)
Jan 4 06:54:40 switch %PORT_CHANNEL-5-PORT_ADDED: fc1/8 added to port-channel 7
Jan 4 06:54:56 switch %PORT-5-IF_DOWN_ADMIN_DOWN: Interface fc1/7 is down
(Administratively down)
Jan 4 06:54:59 switch %PORT_CHANNEL-3-COMPAT_CHECK_FAILURE: speed is not compatible
Jan 4 06:55:56 switch %PORT_CHANNEL-5-PORT_ADDED: fc1/7 added to port-channel 7
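The outputs from Step 2 and Step 3 can both be filtered for the lines that matter. The following Python sketch assumes the layouts shown in the samples above: the event history is scanned for triggered events containing ERR_DISABLED, and the log file is scanned for messages at severity 3 or lower, using the %FACILITY-SEVERITY-MNEMONIC prefix visible in the sample. The function names and the default severity threshold are illustrative choices.

```python
import re

# One logged FSM transition, as laid out in the Step 2 sample.
TRANSITION = re.compile(
    r"(\d+)\) FSM:<(\S+?)> Transition at [^\n]*\n"
    r"\s*Previous state: \[(\w+)\]\s*\n"
    r"\s*Triggered event: \[(\w+)\]\s*\n"
    r"\s*Next state: \[(\w+)\]"
)
# Syslog prefix: %FACILITY-SEVERITY-MNEMONIC: text
SYSLOG = re.compile(r"%([A-Z_]+)-(\d)-([A-Z0-9_]+): (.*)")


def err_disable_events(event_history):
    """Return transitions whose triggered event mentions ERR_DISABLED."""
    return [
        {"index": int(index), "interface": intf, "event": event, "next": nxt}
        for index, intf, _prev, event, nxt in TRANSITION.findall(event_history)
        if "ERR_DISABLED" in event
    ]


def severe_messages(logfile, max_severity=3):
    """Return (facility, severity, mnemonic, text) for severe log messages."""
    found = []
    for match in SYSLOG.finditer(logfile):
        facility, severity, mnemonic, text = match.groups()
        if int(severity) <= max_severity:
            found.append((facility, int(severity), mnemonic, text.strip()))
    return found


# Samples copied from Step 2 and Step 3 above.
EVENT_HISTORY = """\
>>>>FSM: <fc1/7> has 86 logged transitions<<<<<
1) FSM:<fc1/7> Transition at 647054 usecs after Tue Jan 1 22:44..
Previous state: [PI_FSM_ST_IF_NOT_INIT]
Triggered event: [PI_FSM_EV_MODULE_INIT_DONE]
Next state: [PI_FSM_ST_IF_INIT_EVAL]
2) FSM:<fc1/7> Transition at 647114 usecs after Tue Jan 1 22:43..
Previous state: [PI_FSM_ST_IF_INIT_EVAL]
Triggered event: [PI_FSM_EV_IE_ERR_DISABLED_CAP_MISMATCH]
Next state: [PI_FSM_ST_IF_DOWN_STATE]
"""

LOGFILE = """\
Jan 4 06:54:40 switch %PORT_CHANNEL-5-PORT_ADDED: fc1/8 added to port-channel 7
Jan 4 06:54:59 switch %PORT_CHANNEL-3-COMPAT_CHECK_FAILURE: speed is not compatible
Jan 4 06:55:56 switch %PORT_CHANNEL-5-PORT_ADDED: fc1/7 added to port-channel 7
"""

events = err_disable_events(EVENT_HISTORY)
messages = severe_messages(LOGFILE)
print(events)
print(messages)
```

Reading the two results together ties the CAP_MISMATCH event to the speed compatibility failure recorded in the log.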
Overview of Symptoms
An F port may be connected to a single N port, which is the mode used by peripheral devices (hosts or
storage). In all the possible cases an administrator can encounter in troubleshooting an Fx port, two
different scenarios can be recognized:
• The port does not come up (check the interface configuration, cabling and the port connected to the
switch).
• The port comes up, but the host cannot communicate with the storage subsystem (check the VSAN
and zone configurations).
Typical end-user questions that lead to Fx port troubleshooting include:
• Why is no storage visible on my newly installed server?
• Why is previously assigned storage not visible to my server after reboot?
Typical administrator questions to investigate:
• Why does the server fail to complete FLOGI to the switch?
• Why does the storage device fail to complete FLOGI to the switch?
Figure 6-5 illustrates one possible methodology for troubleshooting Fx ports.
Figure 6-5 Troubleshooting Fx Ports (flowchart not reproduced; surviving labels: Troubleshoot HBA;
Check zoning config and LUN masking)
CHAPTER 7
Troubleshooting VSANs, Domains, and FSPF
This chapter describes how to identify and resolve problems that might occur when implementing
VSANs, domains, and FSPF. This chapter includes the following sections:
• Best Practices for VSAN Implementation
• Best Practices for Domain ID Assignment
• Best Practices for FSPF
• Initial Troubleshooting Checklist
• VSAN Issues
• Dynamic Port VSAN Membership Issues
• Domain Issues
• FSPF Issues
• Use Inter-VSAN routing (IVR) only when necessary to selectively connect devices across VSANs.
– If IVR is used without NAT, ensure that domain IDs are statically configured and unique
across all VSANs.
• Place FCIP gateways in their own native VSAN.
Placing FCIP gateways in their own VSAN isolates disturbances when problems in the IP cloud
(such as flapping links) occur.
• Use VSAN-based roles to control and limit management access to your switches.
Note You cannot issue a disruptive restart for VSANs that are in any of the interop modes. Use a
nondisruptive restart as needed.
• To disable the Domain Manager, choose Fabricxx > All VSANs > Domain Manager and uncheck
the Enable check box in Fabric Manager or use the no fcdomain vsan x CLI command.
– Disable the Domain Manager to disable the principal switch selection process. This is possible
if all domains are statically assigned. Disabling principal switch selection can reduce disruption
when switches are rebooted or added to the fabric. This must be done on each switch that should
not participate in principal switch selection. A disruptive restart of the fabric is required to apply
this change.
• Keep domain ID allowed lists the same on all switches in a fabric for consistency. If the principal
switch changes, the allowed domain lists will remain the same.
• Assign domain IDs between decimal 97 and 127 if the domain may be used for standards-based
interop mode.
• Do not perform frequent changes to the Domain Manager on production fabrics. Experienced
administrators familiar with switch operations should be responsible for Domain Manager changes.
Plan your domain configuration carefully so that you avoid the need to make disruptive changes at
a later time.
• Save Domain Manager changes. When you change the configuration, be sure to save the running
configuration by choosing Switches > Copy Configuration in Fabric Manager or using the copy
running-config startup-config CLI command. The next time you reboot the switch, the saved
configuration is used. If you do not save the configuration, the previously saved startup
configuration is used.
• Enable reconfigure fabric (RCF) rejection on every ISL port if high availability is mandatory.
Choose Switches > Interfaces > FC Physical in Fabric Manager and select the Domain Manager
tab in the Information pane and then check the RcfReject check box on all ISL ports to enable
rcf-rejects. Or use the interface CLI command on a TE or E port and then use the fcdomain
rcf-reject vsan CLI command in interface configuration mode to enable the rcf-reject option. RCF
reject prevents other switches from sending an RCF and potentially causing a disruption in your
production traffic.
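The interop-mode guideline in this list (domain IDs between decimal 97 and 127) can be checked mechanically when you plan static domain IDs. The following Python sketch is illustrative only: the function name and the sample plan are hypothetical, and the range is taken directly from the guideline above.

```python
# Range recommended above for domains that may be used in
# standards-based interop mode (decimal 97-127, hex 0x61-0x7f).
INTEROP_MIN, INTEROP_MAX = 97, 127


def interop_safe(domain_id):
    """Return True if the domain ID falls inside the 97-127 range."""
    return INTEROP_MIN <= domain_id <= INTEROP_MAX


# Hypothetical plan: switch name -> statically assigned domain ID.
planned = {"switch1": 0x4A, "switch2": 0x61, "switch3": 127}

for name, domain_id in planned.items():
    status = "ok" if interop_safe(domain_id) else "outside interop range"
    print(f"{name}: {domain_id} (0x{domain_id:02x}) {status}")
```

A check like this is only a planning aid; it does not replace verifying the allowed domain ID lists on the switches themselves.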
Checklist Checkoff
Verify the FSPF parameters for switches in the VSAN.
Verify the domain parameters for switches in the VSAN.
Verify the physical connectivity for any problem ports or VSANs.
Verify that you have both devices in the name server.
Verify that you have both end devices in the same VSAN.
The following zone CLI commands may be useful to validate your configuration:
• show zoneset name zonesetName vsan-id
• show zoneset active vsan-id
Note An asterisk (*) near the device listed by the show zoneset active CLI command indicates that
the device is logged into the name server.
Note For more information on zoning issues, see Chapter 9, “Troubleshooting Zones and Zone Sets.”
VSAN Issues
This section covers the following VSAN issues:
• Host Cannot Communicate with Storage, page 7-5
• xE Port Is Isolated in a VSAN, page 7-7
• Troubleshooting Interop Mode Issues, page 7-11
Step 1 Choose Fabricxx > VSANxx and select the Host or Storage tab in the Information pane. Verify that
both devices are in the same VSAN.
Step 2 If the host and storage are in different VSANs, verify which port is not in the correct VSAN and then
follow these steps to change the port VSAN:
a. Highlight the host or storage in the Information pane. You see the link to that end device highlighted
in blue in the map pane.
b. Right-click on the highlighted link and select Interface Attributes from the pop-up menu.
c. Set the PortVSAN field to the VSAN that holds the other end device and click Apply Changes.
Step 3 Right-click any ISL between the switches and select Interface Attributes. Select the Trunk Config tab
and verify that the allowed VSAN list includes the VSAN found in Step 1.
Step 4 If the trunk is not configured for the VSAN, set the Allowed VSANs field to include the VSAN that the
host and storage devices are on and click Apply Changes.
Step 1 Use the show vsan membership command to see all the ports connected to your host and storage, and
verify that both devices are in the same VSAN. Use this command on the switches that connect to your
host or storage devices.
switch# show vsan membership
vsan 1 interfaces:
fc2/7 fc2/8 fc2/9 fc2/10 fc2/11 fc2/12 fc2/13 fc2/14
fc2/15 fc2/16 fc7/1 fc7/2 fc7/3 fc7/4 fc7/5 fc7/6
fc7/7 fc7/8 fc7/9 fc7/10 fc7/11 fc7/12 fc7/13 fc7/14
fc7/15 fc7/16 fc7/17 fc7/18 fc7/19 fc7/20 fc7/21 fc7/22
fc7/25 fc7/26 fc7/27 fc7/28 fc7/29 fc7/30 fc7/31 fc7/32
vsan 2 interfaces:
fc2/6 fc7/23 fc7/24
vsan 3 interfaces:
fc2/1 fc2/2 fc2/5
vsan 4 interfaces:
fc2/3 fc2/4
Step 2 If the host and storage are in different VSANs, use the vsan database vsan vsan-id interface CLI
command to move the interface connected to the host and storage devices into the same VSAN.
Step 3 Use the show interface command to verify that the trunks connecting the end switches are configured
to transport the VSAN found in Step 1.
switch# show interface fc2/14
fc2/14 is trunking
Hardware is Fibre Channel, WWN is 20:4e:00:05:30:00:63:9e
Port mode is TE
Speed is 2 Gbps
vsan is 2
Beacon is turned off
Trunk vsans (allowed active) (1-3,5)
Trunk vsans (operational) (1-3,5)
Trunk vsans (up) (2-3,5)
Trunk vsans (isolated) (1)
Trunk vsans (initializing) ()
475 frames input, 8982 bytes, 0 discards
0 runts, 0 jabber, 0 too long, 0 too short
0 input errors, 0 CRC, 3 invalid transmission words
0 address id, 0 delimiter
0 EOF abort, 0 fragmented, 0 unknown class
514 frames output, 7509 bytes, 16777216 discards
Received 30 OLS, 21 LRR, 18 NOS, 53 loop inits
Transmitted 68 OLS, 25 LRR, 28 NOS, 32 loop inits
Step 4 If the trunk is not configured for the VSAN, use the interface CLI command and then the switchport
trunk allowed vsan CLI command in interface mode to add the VSAN to the allowed VSAN list for
the interface that connects the host and storage devices.
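If you capture the show vsan membership output from Step 1, the comparison can be scripted. The following Python sketch is a minimal example under stated assumptions: the sample text is an abridged copy of the output shown above, and the function names are hypothetical. When the host and storage attach to different switches, parse each switch's output separately and compare the resulting VSAN numbers.

```python
import re

# Abridged copy of the sample "show vsan membership" output above.
SAMPLE = """\
vsan 1 interfaces:
    fc2/7 fc2/8 fc2/9
vsan 2 interfaces:
    fc2/6 fc7/23 fc7/24
vsan 3 interfaces:
    fc2/1 fc2/2 fc2/5
"""


def parse_vsan_membership(text):
    """Map each interface name to the VSAN it is listed under."""
    membership = {}
    current_vsan = None
    for line in text.splitlines():
        header = re.match(r"\s*vsan\s+(\d+)\s+interfaces:", line)
        if header:
            current_vsan = int(header.group(1))
        elif current_vsan is not None:
            # Every whitespace-separated token on a member line is an interface.
            for interface in line.split():
                membership[interface] = current_vsan
    return membership


def same_vsan(membership, port_a, port_b):
    """Return True if both ports belong to the same VSAN."""
    return membership[port_a] == membership[port_b]


members = parse_vsan_membership(SAMPLE)
print(same_vsan(members, "fc2/6", "fc7/24"))
print(same_vsan(members, "fc2/1", "fc2/6"))
```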
Step 1 Choose Switches > Interfaces > FC Physical and check the FailureCause column on the E port to verify
that you have a VSAN mismatch problem.
Step 2 Choose Switches > Interfaces > FC Physical and set the PortVSAN field to correct a VSAN mismatch.
Step 1 Use the show interface command to verify that the port is isolated because of a VSAN mismatch.
switch# show interface fc2/4
fc2/4 is down (isolation due to port vsan mismatch)
Step 2 Use the show vsan membership CLI command to verify that the ports are in separate VSANs.
switch# show vsan membership
vsan 3 interfaces:
fc2/1 fc2/2 fc2/3 fc2/4 fc2/6 fc2/7 fc2/8 fc2/9
fc2/10 fc2/11 fc2/12 fc2/14 fc2/15 fc2/16 fc7/1 fc7/2
fc7/3 fc7/4 fc7/5 fc7/6 fc7/7 fc7/8 fc7/9 fc7/10
fc7/11 fc7/12 fc7/13 fc7/14 fc7/15 fc7/16 fc7/17 fc7/18
fc7/19 fc7/20 fc7/21 fc7/22 fc7/23 fc7/24 fc7/25 fc7/26
fc7/27 fc7/28 fc7/29 fc7/30 fc7/31 fc7/32
vsan 4 interfaces:
fc2/5 fc2/13
This sample output shows that all the interfaces on the switch belong to VSAN 3, with the exception of
interface fc2/5 and fc2/13, which are part of VSAN 4.
Step 3 Use the vsan database vsan vsan-id interface CLI command to move the ports into the same VSAN.
Step 1 Choose Switches > Interfaces > FC Physical and check the FailureCause column on the TE port to
verify that you have trunk problems.
Step 2 Choose Switches > Interfaces > FC Physical and select the Trunk Failures tab to determine the reason
for the trunk problem.
Step 3 Correct the problem listed in the FailureCause column. See the “DPVM Config Database Not
Activating” section on page 7-16 for domain misconfiguration problems. Choose Switches > Interfaces
> FC Physical and set the PortVSAN field to correct the VSAN misconfiguration problems.
Step 4 Repeat this procedure for all isolated VSANs on this TE port.
Step 1 Use the show interface command on the TE port to verify that you have an isolated VSAN.
switch# show interface fc2/14
fc2/14 is trunking
Hardware is Fibre Channel, WWN is 20:4e:00:05:30:00:63:9e
Port mode is TE
Speed is 2 Gbps
vsan is 2
Beacon is turned off
Trunk vsans (allowed active) (1-3,5)
Trunk vsans (operational) (1-3,5)
Trunk vsans (up) (2-3,5)
Trunk vsans (isolated) (1)
Trunk vsans (initializing) ()
475 frames input, 8982 bytes, 0 discards
0 runts, 0 jabber, 0 too long, 0 too short
0 input errors, 0 CRC, 3 invalid transmission words
0 address id, 0 delimiter
0 EOF abort, 0 fragmented, 0 unknown class
514 frames output, 7509 bytes, 16777216 discards
Received 30 OLS, 21 LRR, 18 NOS, 53 loop inits
The example shows the output of the show interface command with one or more isolated VSANs. Here,
the TE port has one VSAN isolated.
Step 2 Use the show interface fc slot/port trunk vsan vsan-id command to verify the reason for VSAN
isolation.
switch# show interface fc2/14 trunk vsan 1
fc2/14 is trunking
Vsan 1 is down (Isolation due to zone merge failure)
This output shows that VSAN 1 is isolated because of a zone merge error.
Step 3 Use the show port internal info interface fc slot/port command to determine the root cause of the
VSAN isolation.
Note To issue commands with the internal keyword, you must have an account that is a member of the
network-admin group.
The last few lines of the command output provide a description of the reason for VSAN isolation for
every isolated VSAN.
In this example, VSAN 7 is up, while two VSANs are isolated. VSAN 1 is isolated because of domain
ID misconfiguration, and VSAN 8 is isolated because of VSAN misconfiguration.
Step 4 Correct the root cause. See the “DPVM Config Database Not Activating” section on page 7-16 for
domain misconfiguration problems. Use the vsan vsan-id interface CLI command to correct the
VSAN misconfiguration problems.
Step 5 Repeat this procedure for all isolated VSANs on this TE port.
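When a TE port carries many VSANs, the Trunk vsans lines from the show interface output in Step 1 can be parsed to list the VSANs that are allowed but not up. The following Python sketch is illustrative: the sample text is an abridged copy of the output shown above, and the function names are hypothetical.

```python
import re

# Abridged copy of the "show interface fc2/14" output above.
SAMPLE = """\
fc2/14 is trunking
    Port mode is TE
    Trunk vsans (allowed active)    (1-3,5)
    Trunk vsans (operational)       (1-3,5)
    Trunk vsans (up)                (2-3,5)
    Trunk vsans (isolated)          (1)
    Trunk vsans (initializing)      ()
"""


def expand_vsan_list(spec):
    """Expand a list such as '1-3,5' into the set {1, 2, 3, 5}."""
    vsans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # an empty list is printed as "()"
        if "-" in part:
            low, high = part.split("-")
            vsans.update(range(int(low), int(high) + 1))
        else:
            vsans.add(int(part))
    return vsans


def trunk_vsan_states(text):
    """Map each trunk state label (for example 'isolated') to its VSAN set."""
    states = {}
    for match in re.finditer(r"Trunk vsans \(([^)]*)\)\s*\(([^)]*)\)", text):
        states[match.group(1)] = expand_vsan_list(match.group(2))
    return states


states = trunk_vsan_states(SAMPLE)
not_up = states["allowed active"] - states["up"]
print("isolated:", sorted(states["isolated"]))
print("allowed but not up:", sorted(not_up))
```

Each VSAN reported as isolated still needs the per-VSAN checks in Step 2 and Step 3 to find the reason for the isolation.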
Step 1 Choose Fabricxx > VSANxx > VSAN Attributes to check whether the fabric timers are
inconsistent across the VSANs.
Step 2 Choose Switches > FC Services > Timers and Policies. You see the fabric timers in the Information
pane.
Step 3 Click Change Timeout Values and set the timers and click Apply.
Step 1 Use the show fctimer CLI command to check whether the fabric timers are inconsistent across the VSANs.
Step 2 Use the fctimer distribute CLI command to enable CFS distribution for the fabric timers. Repeat this
on all switches in this VSAN.
Step 3 Use the fctimer CLI command to set each timer.
Step 4 Use the fctimer commit command to save these changes and distribute them to all switches in the
VSAN.
Note The DPVM feature overrides any existing static port VSAN membership configuration. If the VSAN
corresponding to the dynamic port is deleted or suspended, the port is shut down.
Note If you copy the DPVM database and fabric distribution is enabled, you must commit the changes.
To begin configuring the DPVM feature, you must explicitly enable DPVM on the required switches in
the fabric. By default, this feature is disabled in all switches in the Cisco MDS 9000 Family.
For more information on enabling DPVM, see one of the following guides:
• Cisco MDS 9000 Family Fabric Manager Configuration Guide
• Cisco MDS 9000 Family Configuration Guide
This section contains the following topics:
• Troubleshooting DPVM Using Fabric Manager, page 7-13
• Troubleshooting DPVM Using the CLI, page 7-13
• DPVM Configuration Not Available, page 7-14
• DPVM Database Not Distributed, page 7-14
• DPVM Autolearn Not Working, page 7-14
• No Autolearn Entries in Active Database, page 7-15
• VSAN Membership not Added to Database, page 7-16
• DPVM Config Database Not Activating, page 7-16
• Cannot Copy Active to Config DPVM Database, page 7-17
• Port Suspended or Disabled after DPVM Activation, page 7-17
• DPVM Merge Failed, page 7-17
Step 1 Choose Fabricxx > All VSANs > DPVM and select the CFS tab.
Step 2 Verify that the Oper and Global columns are enabled. If not, set the Admin drop-down menu to enable
and the Global drop-down menu to enable. Then click Apply Changes.
Step 3 Select the Actions tab. Uncheck AutoLearn Enable if it is checked and click Apply Changes.
Step 4 Select the Active Database tab.
Step 5 Select Pending from the Compare To drop-down menu. You see a dialog box listing any differences
between the active DPVM database and the pending database.
Step 6 Select the CFS tab and set Config Action to commit if there are any pending changes that you want to
save. Click Apply Changes.
Step 7 Select the Actions tab and select activate from the Actions drop-down menu to activate the database.
Click Apply Changes.
Step 1 Use the show dpvm CLI command in EXEC mode to verify that CFS distribution is enabled for DPVM.
Optionally, use the dpvm distribute CLI command in config mode to enable CFS distribution if
required.
Step 2 Use the show dpvm status CLI command in EXEC mode to verify that autolearning is disabled.
Optionally, use the no dpvm auto-learn command in config mode if you need to disable autolearning
before activating the database.
Step 3 Use the show dpvm pending-diff CLI command in EXEC mode to compare the active and pending
databases.
Optionally, use the dpvm commit CLI command in config mode to commit any pending entries to the
config database.
Step 4 Use the dpvm activate CLI command in config mode to activate the database.
Note When DPVM distribution is enabled, you must do an explicit commit for DPVM activate and autolearn
to take effect.
Symptom The VSAN membership of the port is not added to the database.
Symptom Cannot copy the active DPVM database to the config database.
Symptom A port in a static VSAN that was operational goes into suspend or disabled state after DPVM
database activation.
Domain Issues
This section includes the following topics:
• Domain ID Conflict Troubleshooting, page 7-18
• Switch Cannot See Other Switches in a VSAN, page 7-19
• FC Domain ID Overlap, page 7-19
FC Domain ID Overlap
To resolve an FC domain ID overlap, you can either change the overlapping static domain ID by
manually configuring a new static domain ID for the isolated switch, or disable the static domain
assignment and allow the switch to request a new domain ID after a fabric reconfiguration.
• To assign a static domain ID, see the “Assigning a New Domain ID Using Fabric Manager” section
on page 7-19 or the “Assigning a New Domain ID Using the CLI” section on page 7-20.
• To assign a dynamic domain ID after a fabric reconfiguration, see the “Using Fabric Reconfiguration
for Domain ID Assignments” section on page 7-21.
You may see the following system message in the message log when a domain ID overlap occurs:
Recommended Action Use the show fcdomain domain-list CLI command to determine which domain IDs are
overlapping. Use the fcdomain domain domain-id [static | preferred] vsan vsan-id CLI command
or a similar Fabric Manager procedure to change the domain ID for one of the
overlapping domain IDs.
To verify FC domain ID overlap and reassign a new Domain ID using Fabric Manager, follow these
steps:
Step 1 Choose Switches > Interfaces > FC Physical and check the FailureCause column for an isolation or
domain overlap status.
Step 2 Choose Fabricxx > VSANxx > Domain Manager to view which domains are currently in the VSAN.
Step 3 Repeat Step 2 on the other switch to determine which domain IDs overlap.
Step 4 Select the Configuration tab and set Config Domain and Config Type to change the domain ID for one
of the overlapping domain IDs.
• The static option tells the switch to request that particular domain ID. If it does not get that particular
address, it will isolate itself from the fabric.
• The preferred option has the switch request a specified domain ID. If that ID is unavailable, it will
accept another ID.
Step 5 Set the Restart drop-down menu to disruptive and click Apply Changes to restart the Domain Manager.
Note While the static option can be applied to runtime after a disruptive or nondisruptive restart, the
preferred option is applied to runtime only after a disruptive restart.
Step 1 Issue the show interface command. The following example output shows the isolation error message.
switch# show interface fc2/14
fc2/14 is down (Isolation due to domain overlap)
Hardware is Fibre Channel, WWN is 20:4e:00:05:30:00:63:9e
vsan is 2
Beacon is turned off
192 frames input, 3986 bytes, 0 discards
0 runts, 0 jabber, 0 too long, 0 too short
0 input errors, 0 CRC, 3 invalid transmission words
0 address id, 0 delimiter
0 EOF abort, 0 fragmented, 0 unknown class
231 frames output, 3709 bytes, 16777216 discards
Received 28 OLS, 19 LRR, 16 NOS, 48 loop inits
Transmitted 62 OLS, 22 LRR, 25 NOS, 30 loop inits
Step 2 Use the show fcdomain domain-list vsan vsan-id command to view which domains are currently in
your fabric.
switch1# show fcdomain domain-list vsan 2
Number of domains: 2
Domain ID WWN
--------- -----------------------
0x4a(74) 20:01:00:05:30:00:13:9f [Local]
0x4b(75) 20:01:00:05:30:00:13:9e [Principal]
--------- -----------------------
Step 3 Repeat Step 2 on the other switch to determine which domain IDs overlap.
switch2# show fcdomain domain-list vsan 2
Number of domains: 1
Domain ID WWN
--------- -----------------------
0x4b(75) 20:01:00:05:30:00:13:9e [Local][Principal]
--------- -----------------------
In this example, switch 2 is isolated because of a domain ID 75 overlap.
Step 4 Use the fcdomain domain domain-id [static | preferred] vsan vsan-id CLI command to change the
domain ID for one of the overlapping domain IDs.
• The static option tells the switch to request that particular domain ID. If it does not get that particular
address, it will isolate itself from the fabric.
• The preferred option has the switch request a specified domain ID. If that ID is unavailable, it will
accept another ID.
Step 5 Use the fcdomain restart disruptive vsan CLI command to restart the Domain Manager.
Note While the static option can be applied to runtime after a disruptive or nondisruptive restart, the
preferred option is applied to runtime only after a disruptive restart.
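The comparison in Step 2 and Step 3 amounts to intersecting the domain lists reported on the two sides of the isolated link. The following Python sketch is a minimal example: the sample text copies the two outputs shown above, the function names are hypothetical, and it assumes the two lists come from switches that have not merged into one fabric (after a successful merge both switches report the same list, so an intersection would be meaningless).

```python
import re

# Copies of the "show fcdomain domain-list vsan 2" outputs above.
SWITCH1 = """\
Number of domains: 2
Domain ID              WWN
---------    -----------------------
 0x4a(74)    20:01:00:05:30:00:13:9f [Local]
 0x4b(75)    20:01:00:05:30:00:13:9e [Principal]
"""
SWITCH2 = """\
Number of domains: 1
Domain ID              WWN
---------    -----------------------
 0x4b(75)    20:01:00:05:30:00:13:9e [Local][Principal]
"""

# Matches a row such as " 0x4b(75)    20:01:00:05:30:00:13:9e".
ROW = re.compile(r"0x[0-9a-fA-F]+\((\d+)\)\s+((?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})")


def parse_domain_list(text):
    """Map each decimal domain ID to the WWN listed beside it."""
    return {int(m.group(1)): m.group(2) for m in ROW.finditer(text)}


def overlapping_domains(side_a, side_b):
    """Return the domain IDs that appear on both sides, in ascending order."""
    return sorted(set(side_a) & set(side_b))


overlap = overlapping_domains(parse_domain_list(SWITCH1), parse_domain_list(SWITCH2))
print("overlapping domain IDs:", overlap)
```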
To use fabric reconfiguration to reassign domain IDs for a particular VSAN using Fabric Manager,
follow these steps:
Step 1 Choose Switches > Interfaces > FC Physical and select the Domain Manager tab in the Information
pane.
Step 2 Uncheck the RcfReject check box and click Apply Changes to disable RCF rejection.
Step 3 Choose Fabricxx > VSANxx > Domain Manager in the Logical Domain pane.
Step 4 Click the Configuration tab in the Information pane and set the Config Type drop-down menu to
preferred to remove any static domain ID assignments.
Step 5 Check the AutoReconfigure check box to enable the auto-reconfiguration option.
Step 6 Set the Restart drop-down menu to disruptive and click Apply Changes to restart the Domain Manager.
To use fabric reconfiguration to reassign domain IDs for a particular VSAN using the CLI, follow these
steps:
Step 1 Use the show fcdomain domain-list CLI command to determine if you have statically assigned domain
IDs on the switches.
Step 2 If you have statically assigned domain IDs, use the no fcdomain domain CLI command to remove the
static assignments.
Step 3 Use the show fcdomain vsan CLI command to determine if you have the rcf-reject option enabled.
switch# show fcdomain vsan 1
The local switch is a Subordinated Switch
Step 4 If you have the rcf-reject option enabled, use the interface CLI command and then the no fcdomain
rcf-reject vsan CLI command in interface mode.
Step 5 Use the fcdomain auto-reconfigure vsan CLI command in the EXEC mode on both switches to enable
auto-reconfiguration after a Domain Manager restart.
Step 6 Use the fcdomain restart disruptive vsan CLI command to restart the Domain Manager.
FSPF Issues
The implementation of VSANs dictates that each configured VSAN support a separate set of fabric
services. One such service is the FSPF routing protocol, which can be independently configured per
VSAN. Therefore, within each VSAN topology, FSPF can be configured to provide a unique routing
configuration and resulting traffic flow. Using the traffic engineering capabilities offered by VSANs
allows a greater control over traffic within the fabric and a higher utilization of the deployed fabric
resources.
This section describes how to identify and resolve Fabric Shortest Path First (FSPF) problems. It
includes the following topics:
• Troubleshooting FSPF, page 7-24
• Loss of Two-Way Communication, page 7-27
Troubleshooting FSPF
Figure 7-1 shows a single-VSAN topology.
[Figure 7-1: Single-VSAN topology. The diagram is not reproduced here; its labels include Switch1, Switch5, Domain_ID 237, Domain_ID 238, interface indexes 0x00010000 through 0x00010003, and link metrics of 1000.]
For the purpose of this example, assume that all interfaces are located in VSAN 1.
Step 1 Choose FC > Advanced > FSPF and select the LSDB LSRs tab to verify the link state records in the
FSPF database.
• The VSANId/ DomainId column shows the domain’s view of the fabric topology.
• The AdvDomainId column shows which domain is the owner of the LSR (link state record).
• The Age value is a 16-bit counter starting at 0x0000, incremented by one for each switch during
flooding and by one for each second held in the database. This field is used as a tie-breaker if
Incarnation numbers are the same.
• The IncarnationNumber is a 32-bit value between 0x80000001 and 0x7FFFFFFF that is
incremented by one each time the originating switch transmits an LSR. It is compared first; the
Age value is used only when the incarnation numbers are equal.
Step 2 Choose FC > Advanced > FSPF and select the LSDB Links tab to verify that each path is in the FSPF
database.
Step 3 Choose FC > Advanced > FSPF and select the Interfaces tab to verify that the FSPF parameters are
correct for each interface and verify that the AdminStatus is up.
• The Cost column shows the cost of the path out of the interface.
• The Intervals column shows the configured FSPF timers for this interface, which must match on both
sides.
• The State column shows the full or adjacent state if the interface has sent and received all database
exchanges and required Acks. The port is now ready to route frames.
• The Neighbors column shows FSPF neighbor information.
Step 4 Choose FC > Advanced > FSPF and select the Statistics or InterfaceStats tab to verify that there are
no excessive errors present.
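The IncarnationNumber range given in Step 1 (0x80000001 through 0x7FFFFFFF) is ascending only when the 32-bit value is read as a signed two's-complement integer. The following Python sketch shows that interpretation. It is a sketch under an assumption: that incarnation numbers are compared as signed values, which is inferred from the stated range rather than confirmed by this guide. The function names are hypothetical, and the Age tie-breaker is not modeled because this section does not state its direction.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def newer_incarnation(a, b):
    """Return True if incarnation number a is later than b (signed comparison)."""
    return as_signed32(a) > as_signed32(b)


# Ends of the range quoted in Step 1.
print(as_signed32(0x80000001), as_signed32(0x7FFFFFFF))

# Two incarnation numbers taken from the sample LSRs in this section.
print(newer_incarnation(0x80000098, 0x8000000C))
```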
Step 1 Use the show fspf database vsan CLI command to verify that each path is in the FSPF database.
switch1# show fspf database
FSPF Link State Database for VSAN 2 Domain 1 -----1
LSR Type = 1
Advertising domain ID = 1 -----2
LSR Age = 81 -----3
LSR Incarnation number = 0x80000098 -----4
LSR Checksum = 0x2cd3
Number of links = 2
NbrDomainId IfIndex NbrIfIndex Link Type Cost
--------------------------------------------------------------------------------------
237 0x00010002 0x00010001 1 1000 -----5
238 0x00010003 0x00010002 1 1000 -----6
FSPF Link State Database for VSAN 2 Domain 237 <-----------LSR for another switch
LSR Type = 1
Advertising domain ID = 237 -----7
LSR Age = 185
LSR Incarnation number = 0x8000000c
LSR Checksum = 0xe0a2
Number of links = 2
NbrDomainId IfIndex NbrIfIndex Link Type Cost
--------------------------------------------------------------------------------------
239 0x00010000 0x00010003 1 1000 -----8
1 0x00010001 0x00010002 1 1000 -----9
FSPF Link State Database for VSAN 2 Domain 238 <-----------LSR for another switch
LSR Type = 1
Advertising domain ID = 238
LSR Age = 1052
LSR Incarnation number = 0x80000013
LSR Checksum = 0xe294
Number of links = 2
NbrDomainId IfIndex NbrIfIndex Link Type Cost
--------------------------------------------------------------------------------------
239 0x00010003 0x00010001 1 1000
FSPF Link State Database for VSAN 2 Domain 239 <-----------LSR for another switch
LSR Type = 1
Advertising domain ID = 239
LSR Age = 1061
LSR Incarnation number = 0x80000086
LSR Checksum = 0x66ac
Number of links = 4
NbrDomainId IfIndex NbrIfIndex Link Type Cost
--------------------------------------------------------------------------------------
237 0x00010003 0x00010000 1 1000
238 0x00010001 0x00010003 1 1000
This displays the number of packets; Hellos should be received every 20 seconds.
10. The cost of the path out this interface.
11. The configured FSPF timers for this interface, which must match on both sides.
12. Either Full State or Adjacent. Sent and received all database exchanges and required Acks. Port is
now ready to route frames.
13. FSPF neighbor information.
Step 3 Use the show fspf internal route vsan CLI command to verify that all Fibre Channel routes are available.
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
With the implementation of VSANs used with Cisco MDS 9000 Family switches, a separate instance of
FSPF runs within each VSAN, and each instance is independent of the others. For this reason, FSPF
issues affecting one VSAN have no effect on FSPF running in other VSANs.
Note For all FSPF configuration statements and diagnostic commands, if the vsan keyword is not specified,
VSAN 1 is used by default. When making configuration changes or issuing diagnostic commands in a
multi-VSAN environment, be sure to explicitly specify the target VSAN by including the vsan keyword
in the statement or command.
Step 1 Choose FC > Advanced > FSPF and select the Interfaces tab to verify that the FSPF parameters are
correct for each interface and check the Hello interval column and the State column.
• The Intervals column shows the configured FSPF timers for this interface, which must match on both
sides.
• The State column shows the full or adjacent state if the interface has sent and received all database
exchanges and required Acks. The port is now ready to route frames.
Step 2 Repeat Step 1 to determine the value of the hello interval on the adjacent switch.
Step 3 Fill in the Hello field to change the hello interval and click Apply.
Step 1 Use the debug fspf all CLI command and look for wrong hello interval messages.
switch1# debug fspf all
Jan 5 00:28:14 fspf: Wrong hello interval for packet on interface 100f000 in VSAN 1
Jan 5 00:28:14 fspf: Error in processing hello packet , error code = 4
Tip We recommend that you open a second Telnet or SSH session before entering any debug
commands. If the debug output overwhelms the current session, you can use the second session
to enter the undebug all command to stop the debug message output.
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
Step 4 Use the show fspf vsan vsan-id interface CLI command to view the FSPF configuration.
switch1# show fspf vsan 1 interface fc1/16
FSPF interface fc1/16 in VSAN 1
FSPF routing administrative state is active
Interface cost is 500
Timer intervals configured, Hello 5 s, Dead 80 s, Retransmit 5 s -----1
FSPF State is INIT -----2
Statistics counters :
Number of packets received : LSU 0 LSA 0 Hello 2 Error packets 1
Number of packets transmitted : LSU 0 LSA 0 Hello 4 Retransmitted LSU 0
Number of times inactivity timer expired for the interface = 0
1. The Hello timer is not set to the default, so you should check the neighbor configuration to make
sure it matches.
2. FSPF is not in FULL state, indicating a problem.
Step 5 Repeat Step 4 to determine the value of the Hello timer on the adjacent switch.
switch2# show fspf v 1 interface fc2/16
FSPF interface fc2/16 in VSAN 1
FSPF routing administrative state is active
Interface cost is 500
Timer intervals configured, Hello 20 s, Dead 80 s, Retransmit 5 s -----1
FSPF State is INIT -----2
Statistics counters :
Number of packets received : LSU 0 LSA 0 Hello 2 Error packets 1
Number of packets transmitted : LSU 0 LSA 0 Hello 4 Retransmitted LSU 0
Number of times inactivity timer expired for the interface = 0
1. The neighbor FSPF Hello interval is set to the default (20 seconds).
2. FSPF is not in full state, indicating a problem.
Step 6 Use the interface CLI command and then the fspf hello-interval CLI command in interface mode to
change the Hello interval so that it matches on both switches.
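The comparison in Step 4 and Step 5 can be scripted by extracting the Timer intervals configured line from each switch and reporting any timer that differs. The following Python sketch is illustrative: the two sample lines are copied from the outputs above, and the function names are hypothetical.

```python
import re

# Timer lines copied from the switch1 and switch2 outputs above.
SWITCH1_LINE = "Timer intervals configured, Hello 5 s, Dead 80 s, Retransmit 5 s"
SWITCH2_LINE = "Timer intervals configured, Hello 20 s, Dead 80 s, Retransmit 5 s"


def parse_timers(text):
    """Extract the Hello, Dead, and Retransmit intervals (in seconds)."""
    match = re.search(r"Hello (\d+) s, Dead (\d+) s, Retransmit (\d+) s", text)
    if match is None:
        raise ValueError("no FSPF timer line found")
    hello, dead, retransmit = (int(v) for v in match.groups())
    return {"hello": hello, "dead": dead, "retransmit": retransmit}


def timer_mismatches(local, neighbor):
    """Return {timer: (local value, neighbor value)} for timers that differ."""
    return {
        name: (local[name], neighbor[name])
        for name in local
        if local[name] != neighbor[name]
    }


mismatches = timer_mismatches(parse_timers(SWITCH1_LINE), parse_timers(SWITCH2_LINE))
for name, (mine, theirs) in mismatches.items():
    print(f"{name} interval differs: local {mine} s, neighbor {theirs} s")
```

The same check applies to the retransmit and dead intervals, because all three timers must match on both sides of the link.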
Step 1 Choose FC > Advanced > FSPF and select the Interfaces tab to verify that the FSPF parameters are
correct for each interface and check the Retransmit interval column and the State column.
• The Intervals column shows the configured FSPF timers for this interface, which must match on both
sides.
• The State column shows the full or adjacent state if the interface has sent and received all database
exchanges and required Acks. The port is now ready to route frames.
Step 2 Repeat Step 1 to determine the value of the retransmit interval on the adjacent switch.
Step 3 Fill in the Retransmit field to change the retransmit interval and click Apply.
Step 1 Use the show fspf vsan vsan-id interface CLI command to view the FSPF configuration.
switch1# show fspf vsan 1 interface fc1/16
FSPF interface fc1/16 in VSAN 1
FSPF routing administrative state is active
Interface cost is 500
Timer intervals configured, Hello 5 s, Dead 80 s, Retransmit 10 s -----1
FSPF State is INIT -----2
Statistics counters :
Number of packets received : LSU 0 LSA 0 Hello 2 Error packets 1
Number of packets transmitted : LSU 0 LSA 0 Hello 4 Retransmitted LSU 0
Number of times inactivity timer expired for the interface = 0
1. The retransmit interval is not set to the default, so you should check the neighbor configuration to
make sure it matches.
2. FSPF is not in FULL state, indicating a problem.
Step 2 Repeat Step 1 to determine the value of the retransmit interval on the adjacent switch.
switch2# show fspf v 1 interface fc2/16
FSPF interface fc2/16 in VSAN 1
FSPF routing administrative state is active
Interface cost is 500
Timer intervals configured, Hello 20 s, Dead 80 s, Retransmit 5 s -----1
FSPF State is INIT -----2
Statistics counters :
Number of packets received : LSU 0 LSA 0 Hello 2 Error packets 1
Number of packets transmitted : LSU 0 LSA 0 Hello 4 Retransmitted LSU 0
Number of times inactivity timer expired for the interface = 0
Step 1 Choose FC > Advanced > FSPF and select the Interfaces tab to verify that the FSPF parameters are
correct for each interface and check the Dead interval column and the State column.
• The Intervals column shows the configured FSPF timers for this interface, which must match on both
sides.
• The State column shows the full or adjacent state if the interface has sent and received all database
exchanges and required Acks. The port is now ready to route frames.
Step 2 Repeat Step 1 to determine the value of the dead interval on the adjacent switch.
Step 3 Fill in the Dead field to change the dead interval and click Apply.
Step 1 Use the debug fspf all CLI command and look for wrong dead interval messages.
switch1# debug fspf all
Jan 5 00:28:14 fspf: Wrong dead interval for packet on interface 100f000 in VSAN 1
Jan 5 00:28:14 fspf: Error in processing hello packet , error code = 4
Tip We recommend that you open a second Telnet or SSH session before entering any debug
commands. If the debug output overwhelms the current session, you can use the second session
to enter the undebug all command to stop the debug message output.
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
1. The dead timer is not set to the default, so you should check the neighbor configuration.
2. FSPF is not in full state, which indicates a problem.
Step 4 Use the interface CLI command and then the fspf dead-interval CLI command in interface mode to
change the dead interval.
Step 1 Choose FC > Advanced > FSPF and select the General tab to verify the RegionId.
Step 2 Repeat Step 1 to determine the value of the region on the adjacent switch.
Step 3 Fill in the RegionId field to change the region and click Apply.
Step 1 Use the show fspf vsan CLI command to display the currently configured region in a VSAN.
switch# show fspf vsan 99
Step 2 Use the debug fspf all CLI command and look for nonexistent region messages.
Tip We recommend that you open a second Telnet or SSH session before entering any debug
commands. If the debug output overwhelms the current session, you can use the second session
to enter the undebug all command to stop the debug message output.
CHAPTER 8
Troubleshooting IVR
This chapter describes how to troubleshoot and resolve inter-VSAN routing (IVR) configuration issues
in the Cisco MDS 9000 Family of multilayer directors and fabric switches. It includes the following
sections:
• Overview, page 8-1
• Best Practices, page 8-1
• Initial Troubleshooting Checklist, page 8-3
• Common IVR Problems, page 8-6
• Troubleshooting the IVR Wizard, page 8-13
Overview
Troubleshooting IVR involves checking the configuration of domain IDs, VSANs, border switches, and
zone sets. Configuration problems with IVR can prevent devices from communicating properly.
Prior to Cisco MDS SAN-OS Release 2.1(1a), IVR required unique domain IDs for all switches in the
fabric. As of Cisco MDS SAN-OS Release 2.1(1a), you can enable IVR Network Address Translation
(NAT) to allow non-unique domain IDs. This feature simplifies the deployment of IVR in an existing
fabric where non-unique domain IDs might be present.
Best Practices
This section provides the best practices for implementing IVR:
• Use Fabric Manager to configure IVR.
Using Fabric Manager to configure IVR helps avoid errors and ensures that the same IVR
configuration is applied to all IVR-enabled switches.
• Use IVR-NAT. If you do not use IVR-NAT, you must use non-overlapping domains across VSANs
associated with IVR.
Note If you are using IVR-NAT, you are not required to use non-overlapping domains across VSANs.
• For large installations, do not spread IVR zone members across many switches.
The VSAN rewrite table is limited to 4096 entries, and the entries are per-domain, not per-end
device, so it is best to minimize the number of switches that contain IVR zone members in very large
implementations.
• Use static domain IDs. This prevents changes in domain IDs that may conflict with virtual domain
ID assignments.
• Allow for multiple paths between the IVR zone members. Implement redundant path designs
whenever possible.
• Set the default zone policy to deny and avoid using the force option when activating the IVR zone
set.
In normal Fibre Channel environments, it is generally considered a best practice to set the default
zone policy to deny. Because members of IVR zones cannot exist in the default zone, activation of
an IVR zone set using the force option may lead to traffic disruption if IVR zone members
previously existed in a default zone policy of permit.
• Use IVR auto-topology. If you do not use IVR auto-topology, use CFS distribution to ensure that the
same IVR topology is applied to all IVR-enabled switches.
• Configure IVR only in the relevant border switches.
• Configure IVR-enabled VSANs in no interop (default) mode or interop 1 mode.
• Turn RDI mode on. This ensures that the switch will not assign used domain IDs and is compatible
with third-party switches. In Cisco SAN-OS Release 2.0(x) and earlier, existing domain IDs are
reserved in a local database. In Cisco SAN-OS Release 2.1(1a) and later, domain IDs are
dynamically reserved using RDI.
Note Contact your customer support representative for more information regarding this feature
(specifically for CSCei88345 and Field Notice 62187).
Transit VSANs
Follow these guidelines when configuring transit VSANs:
• Besides defining the IVR zone membership, you can choose to specify a set of transit VSANs to
provide connectivity between two edge VSANs:
– If two edge VSANs in an IVR zone overlap, then a transit VSAN is not required (though not
prohibited) to provide connectivity.
– If two edge VSANs in an IVR zone do not overlap, you may need one or more transit VSANs
to provide connectivity. Two edge VSANs in an IVR zone will not overlap if IVR is not enabled
on a switch that is a member of both the source and destination edge VSANs.
• Traffic between the edge VSANs traverses only the shortest IVR path.
• Transit VSAN information is common to all IVR zones. Sometimes a transit VSAN can also be an
edge VSAN in another IVR zone.
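The overlap rule in the guidelines above can be expressed as a small check. This is a sketch under stated assumptions: the Switch record and the edge_vsans_overlap function are hypothetical names, and the fabric data is invented for illustration. It treats two edge VSANs as overlapping when at least one IVR-enabled switch is a member of both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    # Hypothetical record: switch name, whether IVR is enabled, and its VSAN memberships.
    name: str
    ivr_enabled: bool
    vsans: frozenset


def edge_vsans_overlap(switches, vsan_a, vsan_b):
    """Return True if some IVR-enabled switch is a member of both edge VSANs."""
    return any(
        sw.ivr_enabled and vsan_a in sw.vsans and vsan_b in sw.vsans
        for sw in switches
    )


fabric = [
    Switch("border1", True, frozenset({10, 20})),
    Switch("border2", False, frozenset({20, 30})),
]
print(edge_vsans_overlap(fabric, 10, 20))  # True
print(edge_vsans_overlap(fabric, 20, 30))  # False
```

When the check returns False, as for VSANs 20 and 30 here, you may need one or more transit VSANs, or you may need to enable IVR on a switch that belongs to both VSANs.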
Border switches
Always follow these guidelines when configuring border switches:
• Border switches require Cisco SAN-OS Release 1.3(1) or higher.
• A border switch must be a member of two or more VSANs.
• A border switch that facilitates IVR communications must be IVR enabled.
• For redundant paths between active IVR zone members, IVR can (optionally) be enabled on
additional border switches.
• The VSAN topology configuration must be updated before a border switch is added or removed.
Checklist Checkoff
Verify that IVR is enabled on all border switches involved in IVR.
Verify that you have the correct license installed (SAN_EXTENSION for IVR over FCIP
or ENTERPRISE_PKG for IVR over Fibre Channel).
Verify that the IVR configuration is the same on all IVR-enabled switches.
Verify that the IVR zone is part of the active IVR zone set.
Verify that you have an active zone set or that you activate the IVR zone set using the force
option.
Verify that you have added IVR virtual domains to the allowed domain ID list if you have
a Cisco SN5428 storage router or a Cisco MDS 9020 switch in your fabric.
If you change any FSPF link cost, ensure that the FSPF path cost (that is, the sum of the link costs on
the path) of any IVR path is less than 30,000.
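The path cost rule above is simple arithmetic and can be checked before you change a link cost. The following is a minimal sketch: the function names and the example link costs are hypothetical, and only the 30,000 limit comes from this guide.

```python
MAX_IVR_PATH_COST = 30000  # limit stated in the checklist above


def fspf_path_cost(link_costs):
    """The FSPF path cost is the sum of the link costs along the path."""
    return sum(link_costs)


def ivr_path_cost_ok(link_costs):
    """Return True if the path cost stays below the IVR limit."""
    return fspf_path_cost(link_costs) < MAX_IVR_PATH_COST


example_path = [500, 500, 1000]  # hypothetical link costs for a three-hop IVR path
print(fspf_path_cost(example_path))    # 2000
print(ivr_path_cost_ok(example_path))  # True
```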
This section includes the following topics:
• Verifying IVR Configuration Using Fabric Manager, page 8-3
• Verifying IVR Configuration Using the CLI, page 8-4
• Limitations and Restrictions, page 8-5
• IVR Enhancements by Cisco SAN-OS Release, page 8-6
Step 1 Choose Fabricxx > All VSANs > IVR to verify your IVR configuration.
Step 2 Select the CFS tab to verify that the Oper column is enabled and the Global column is enabled for CFS
distribution. Check the LastResult column for the status of the last CFS action.
Step 3 Select the Action tab to determine if auto topology and IVR NAT are enabled.
Step 4 Select the Local Topology and Active Topology tabs to verify your IVR VSAN topology.
Step 5 Choose Fabricxx > All VSANs > Domain Manager to verify unique domain IDs if IVR NAT is not
enabled.
Step 6 Choose Zone > IVR > Edit Local Full Zone Database to verify your IVR zones and zone sets and to
verify that you have activated your IVR zone set. The active IVR zone set name appears in bold.
The following show internal commands can be useful for troubleshooting IVR issues.
add-rw Show ivr fcid rewrite fsm internals
adv_vsans Show IVR advertise VSANs for a native VSAN and domain
area-port-allocation Show IVR area-port allocation
capability-fsm Show IVR capability fsm internal debug information
commit-rw Show ivr fcid rewrite fsm internals
debug-log-buffer1 Show IVR debug-log buffer
del-rw Show ivr fcid rewrite fsm internals
dep Show ivr dep internals
device-list Show ivr device list
distribution Show ivr distribution internals
domain-capture-list Show ivr domain controller capture list
drav-fsm Show DRAV FSM details
event-history Show ivr internal event history
fcid-rewrite-fsm Show ivr fcid rewrite fsm internals
fcid-rewrite-list Show ivr fcid rewrite entries
fsmtca Show IVR FSM transition statistics
global-data Show ivr global data
mem-stats Show memory statistics
nhvsan-change Show ivr fcid rewrite fsm internals
plogi-captured-list Show ivr PLOGI captured
pnat Show IVR payload NAT internal information
pvm Show IVR PV Master internal information
tu-fsm Show TU FSM internal debug information
vdri-fsm Show VDRI FSM internal debug information
virtual-domains Show IVR capability fsm internal debug information
vsan-rewrite-list Show ivr vsan rewrite list
vsan-topology Show internal information on IVR VSAN topology
vsan-topology-graph Show IVR VSAN Topology graph internal debug information
zone-fsm Show ivr zone fsm internals
Table 8-1 shows the limitations to the IVR configuration based on the Cisco SAN-OS release.
Note Two VSANs with the same VSAN ID combined with a unique AFID count as two VSANs in the total
number of allowed VSANs per fabric.
IPS-4 SAN_EXTN_OVER_IPS4
1. Cisco MDS 9216i enables the SAN_EXTENSION features without a license for the two Gigabit Ethernet ports on the integrated supervisor card.
Note If you are using IVR over FCIP and Fibre Channel, you need the ENTERPRISE_PKG as well as the
appropriate SAN extension license as shown in Table 8-3.
Tip Be sure to enter the correct chassis serial number when purchasing your license packages. Choose
Switches > Hardware and check the SerialNo Primary for the switch chassis in Fabric Manager or use
the show license host-id CLI command to obtain the chassis serial number for each switch that requires
a license. Your license will not operate if the serial number used does not match the serial number of the
chassis you are installing the license on.
See Chapter 4, “Troubleshooting Licensing,” for complete details on troubleshooting licensing issues.
Link Isolated
Warning: Not All Switches Are IVR NAT Capable or Are Unmanageable
Symptom Warning: Not all switches are IVR NAT capable or are unmanageable.
Table 8-15 Not All Switches Are IVR NAT Capable or Are Unmanageable
Table 8-16 The Following Switches Do Not Have Unique Domain IDs
CHAPTER 9
Troubleshooting Zones and Zone Sets
Zoning enables access control between storage devices and user groups. Creating zones increases
network security and prevents data loss or corruption.
Zone sets consist of one or more zones. A zone set can be activated or deactivated as a single entity
across all switches in the fabric, but only one zone set can be activated at any time.
Zones can be members of more than one zone set. A zone consists of multiple zone members. Members
in a zone can access each other; members in different zones cannot access each other.
This chapter describes how to identify and resolve problems that might occur while implementing zones
and zone sets on switches in the Cisco MDS 9000 Family. It includes the following sections:
• Best Practices, page 9-1
• Troubleshooting Checklist, page 9-2
• Zone and Zone Set Issues, page 9-4
• Zone Merge Failure, page 9-12
Best Practices
This section provides the best practices for implementing zones and zone sets.
• Fibre Channel zoning should always be used.
Creating zones increases network security and prevents data loss or corruption.
• Each host bus adapter (HBA) should have its own zone.
In general, we recommend that the number of zones equal the number of HBAs communicating with
the storage device. For example, if there are two hosts each with two HBAs communicating with
three storage devices, we recommend using four zones. This type of zoning is sometimes referred
to as single initiator zoning.
• Preplan your zone configuration, keeping in mind that multiple zone sets can be configured, but only
one zone set can be active.
• Keep documented backups of zone members and zones within zone sets.
• Device aliases or FC aliases should be used to simplify management whenever possible.
It is easier to identify devices with aliases than with WWNs. In general, you should assign aliases
to WWNs.
• Use enhanced zoning whenever possible. Enhanced zoning is less disruptive, and ensures
fabric-wide consistency for your zone configuration.
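The single initiator recommendation above (one zone per HBA) can be illustrated with a short sketch. The device names and the single_initiator_zones function are hypothetical, and the example assumes that each zone holds one HBA plus the storage devices it communicates with.

```python
def single_initiator_zones(hba_targets):
    """Build one zone per HBA, containing that HBA and the storage devices it uses.

    hba_targets maps an HBA name to the list of storage device names it talks to.
    """
    return {f"zone_{hba}": [hba] + list(targets) for hba, targets in hba_targets.items()}


# Two hosts with two HBAs each, all communicating with three storage devices.
storage = ["array1", "array2", "array3"]
hbas = ["host1_hba0", "host1_hba1", "host2_hba0", "host2_hba1"]
zones = single_initiator_zones({hba: storage for hba in hbas})

print(len(zones))  # 4
print(zones["zone_host1_hba0"])  # ['host1_hba0', 'array1', 'array2', 'array3']
```

The result matches the example in the text: four HBAs yield four zones, regardless of the number of storage devices.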
Troubleshooting Checklist
The following criteria must be met for zoning to function properly:
Checklist Checkoff
Verify that you have an active zone set.
Verify that you have the correct hosts and storage devices in the same zone.
Verify that the zone is part of the active zone set.
Verify that the default zone policy is permit if you are not using zoning.
Step 1 Choose Tools > Edit Full Zone Database and select the VSAN from the drop-down menu. You see the
full zone database for that VSAN. The active zone set appears in bold. If there is no zone set in bold,
you have not activated a zone set for this VSAN.
Step 2 Expand the active zone set. You see the active zones displayed as new folders.
Step 3 Click on a zone. You see the devices belonging to the zone listed in the column on the left side of the
dialog box. They are also highlighted in the map view.
Note To issue commands with the internal keyword, you must have a network-admin group account.
The debug zone change CLI command followed by the zone name in question can help you get started
debugging zones for protocol errors, events, and packets.
Note To enable debugging for zones, use the debug zone command in EXEC mode. To disable a debug
command, use the no form of the command or use the no debug all command to turn off all debugging.
Resolving Host Not Communicating with Storage Issue Using Fabric Manager
To verify that the host is not communicating with storage using Fabric Manager, follow these steps:
Step 1 Verify that the host and storage device are in the same VSAN. See the “Verifying VSAN Membership
Using Fabric Manager” section on page 7-6.
Step 2 Configure zoning, if necessary, by choosing Fabricxx > VSANxx > Default Zone and selecting the
Policies tab to determine if the default zone policy is set to deny.
The default zone policy of permit means all nodes can see all other nodes. Deny means all nodes are
isolated when not explicitly placed in a zone.
Step 3 Optionally, select permit from the Default Zone Behavior drop-down menu to set the default zone policy
to permit if you are not using zoning. Go to Step 7.
Step 4 Choose Zone > Edit Local Full Zone Database and select the VSAN you are interested in. Click on the
zones folder and verify that the host and storage are both members of the same zone. If they are not in
the same zone, see the “Resolving Host and Storage Not in the Same Zone Using Fabric Manager”
section on page 9-5.
Step 5 Choose Zone > Edit Local Full Zone Database and select the VSAN you are interested in. Click on the
active zone folder and determine if the zone in Step 4 and the host and disk appear in the active zone set.
If the zone is not in the active zone set, see the “Resolving Zone is Not in Active Zone Set Using Fabric
Manager” section on page 9-6.
Step 6 If there is no active zone set, right-click the zone set you want to activate in the Edit Local Full Zone
Database dialog box and select Activate to activate the zone set.
Step 7 Verify that the host and storage can now communicate.
Resolving Host and Storage Not in the Same Zone Using Fabric Manager
To move the host and storage device into the same zone using Fabric Manager, follow these steps:
Step 1 Choose Zone > Edit Local Full Zone Database and select the VSAN you are interested in. Click on the
zones folder and find the zones that the host and storage are members of.
Step 2 Click on the zone that contains the host or storage that you want to move. Right-click on the row that
represents this zone member and select Delete from the pop-up menu to remove this end device from the
zone.
Step 3 Click on the zone that you want to move the end device to. Click and drag the row that represents the
end device in the bottom table and add it to the zone in the top table.
Step 4 Verify that you have an active zone set for this VSAN by selecting the zone set name that appears in bold.
If you do not have an active zone set, right-click on the zone set you want to activate in the Edit Local
Full Zone Database dialog box and select Activate to activate the zone set.
Step 5 Expand the active zone set folder to verify that the zone in Step 3 is in the active zone set. If it is not, see
the “Resolving Zone is Not in Active Zone Set Using Fabric Manager” section on page 9-6.
Step 6 Click Activate... to activate the modified zone set.
Step 7 Verify that the host and storage can now communicate.
To add a zone to the active zone set using Fabric Manager, follow these steps:
Step 1 Choose Zone > Edit Local Full Zone Database and select the VSAN you are interested in. Right-click
on the active zone set, which is in bold, and select Insert.
Step 2 Click on the zone that you want to add to this zone set and click Add.
Step 3 Click Activate... to activate the modified zone set.
Step 4 Verify that the host and storage can now communicate.
Step 1 Verify that the host and storage device are in the same VSAN. See the “Verifying VSAN Membership
Using the CLI” section on page 7-6.
Step 2 Configure zoning, if necessary, by using the show zone status vsan-id command to determine if the
default zone policy is set to deny.
switch# show zone status vsan 1
VSAN: 1 default-zone: deny distribute: active only Interop: default
mode: basic merge-control: allow session: none
hard-zoning: enabled
Default zone:
qos: low broadcast: disabled ronly: disabled
Full Zoning Database :
Zonesets:0 Zones:0 Aliases: 0
Active Zoning Database :
Name: Database Not Available
Status:
The default zone policy of permit means all nodes can see all other nodes. Deny means all nodes are
isolated when not explicitly placed in a zone.
Step 3 Optionally, use the zone default-zone permit CLI command to set the default zone policy to permit if
you are not using zoning. Go to Step 8.
Step 4 Use the show zone member CLI command for host and storage device to verify that they are both in the
same zone. If they are not in the same zone, see the “Resolving Host and Storage Not in the Same Zone
Using Fabric Manager” section on page 9-5.
Step 5 Use the show zoneset active command to determine if the zone in Step 4 and the host and disk appear
in the active zone set.
v_188# show zoneset active vsan 2
zoneset name ZoneSet3 vsan 2
zone name Zone5 vsan 2
pwwn 10:00:00:00:77:99:7a:1b [Hostalias]
pwwn 21:21:21:21:21:21:21:21 [Diskalias]
Step 6 If the zone is not in the active zone set, see the “Resolving Zone is Not in Active Zone Set Using Fabric
Manager” section on page 9-6.
Step 7 If there is no active zone set, use the zoneset activate command to activate the zone set.
switch(config)# zoneset activate name ZoneSet1 vsan 2
Step 8 Verify that the host and storage can now communicate.
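The order of the checks in the steps above can be summarized as a decision sequence. The sketch below is a simplified model, not a replacement for the CLI commands: the diagnose function, its arguments, and its result strings are hypothetical, and the inputs are assumed to be gathered by hand from the show command output.

```python
def diagnose(host, storage, vsan_of, default_zone_permit, zones, active_zones):
    """Walk the checks in the same order as the steps above and return the first finding.

    vsan_of: dict of device name -> VSAN ID
    zones: dict of zone name -> set of member names (full zone database)
    active_zones: set of zone names in the active zone set, or None if none is active
    """
    if vsan_of[host] != vsan_of[storage]:
        return "different VSANs"
    if not zones:
        # Not using zoning: only the default zone policy matters.
        return "ok" if default_zone_permit else "default zone policy is deny"
    shared = [name for name, members in zones.items()
              if host in members and storage in members]
    if not shared:
        return "not in the same zone"
    if active_zones is None:
        return "no active zone set"
    if not any(name in active_zones for name in shared):
        return "zone not in active zone set"
    return "ok"


# Values taken from the show zoneset active example above (Zone5 in VSAN 2).
print(diagnose(
    "Hostalias", "Diskalias",
    vsan_of={"Hostalias": 2, "Diskalias": 2},
    default_zone_permit=False,
    zones={"Zone5": {"Hostalias", "Diskalias"}},
    active_zones={"Zone5"},
))  # ok
```

Each result other than "ok" corresponds to one of the resolution procedures referenced in the steps.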
Resolving Host and Storage Not in the Same Zone Using the CLI
To move the host and storage device into the same zone using the CLI, follow these steps:
Step 1 Use the zone name zonename vsan-id command to create a zone in the VSAN if necessary, and add the
host or storage into this zone.
ca-9506(config)# zone name NewZoneName vsan 2
ca-9506(config-zone)# member pwwn 22:35:00:0c:85:e9:d2:c2
ca-9506(config-zone)# member pwwn 10:00:00:00:c9:32:8b:a8
Note The pWWNs for zone members can be obtained from the device or by issuing the show flogi
database vsan-id command.
Step 2 Use the show zone command to verify that host and storage are now in the same zone.
switchA# show zone
zone name NewZoneName vsan 2
pwwn 22:35:00:0c:85:e9:d2:c2
pwwn 10:00:00:00:c9:32:8b:a8
Step 3 Use the show zoneset active command to verify that you have an active zone set. If you do not have an
active zone set, use the zoneset activate command to activate the zone set.
Step 4 Use the show zoneset active command to verify that the zone in Step 2 is in the active zone set. If it is
not, use the zoneset name command to enter the zone set configuration submode, and use the member
command to add the zone to the active zone set.
switch(config)# zoneset name ZoneSet1 vsan 2
switch(config-zoneset)# member NewZoneName
Step 5 Use the zoneset activate command to activate the zone set.
switch(config)# zoneset activate name ZoneSet1 vsan 2
Step 6 Verify that the host and storage can now communicate.
To add a zone to the active zone set using the CLI, follow these steps:
Step 1 Use the show zoneset active command to verify that you have an active zone set. If you do not have an
active zone set, use the zoneset activate command to activate the zone set.
Step 2 Use the show zoneset active command to confirm that the zone you want to add is not already in the active zone set.
Step 3 Use the zoneset name command to enter the zone set configuration submode, and use the member
command to add the zone to the active zone set.
switch(config)# zoneset name ZoneSet1 vsan 2
switch(config-zoneset)# member NewZoneName
Step 4 Use the zoneset activate command to activate the zone set.
switch(config)# zoneset activate name ZoneSet1 vsan 2
Step 5 Verify that the host and storage can now communicate.
Recommended Action Use the zoneset activate CLI command or similar Fabric Manager procedure
to.
Explanation The zone server cannot activate because of the reason shown in the error message.
Explanation The zone server cannot activate because of the reason shown in the error message on the
domain.
Step 1 Choose Zone > Edit Local Full Zone Database and select the VSAN you are interested in. Click on the
active zone set, which is in bold.
Step 2 Verify that the needed zones are active. If a zone is missing from the active zone set, see the “Resolving
Zone is Not in Active Zone Set Using Fabric Manager” section on page 9-6.
Step 3 Click Activate... to activate the zone set.
Step 4 If you are still experiencing zone set activation failure, use the show zone internal change
event-history vsan <vsan-id> CLI command to determine the source of the zone set activation problem.
Step 1 Use the show zoneset active vsan-id command to display the active zones.
switchA# show zoneset active vsan 2
zoneset name ZoneSet1 vsan 2
zone name NewZoneName vsan 2
* pwwn 22:35:00:0c:85:e9:d2:c2
* pwwn 10:00:00:00:c9:32:8b:a8
Step 4 Use the zoneset activate command to activate the zone set.
switch(config)# zoneset activate name ZoneSet1 vsan 2
Step 5 If you are still experiencing zone set activation failure, use the show zone internal change
event-history vsan <vsan-id> command to determine the source of the zone set activation problem.
Step 1 Choose Fabricxx > VSANxx > zonesetname and select the Policies tab.
Step 2 Verify that the Propagation field is set to FullZoneSet. If it is not, select FullZoneSet from the
drop-down menu.
Step 3 Click Apply Changes to save these changes.
Step 1 Use the show zone status command to verify if the distribute flag is on.
switch# show zone status
VSAN: 1 default-zone: deny distribute: active only Interop: default
mode: basic merge-control: allow session: none
hard-zoning: enabled
Default zone:
qos: low broadcast: disabled ronly: disabled
Full Zoning Database :
Zonesets:3 Zones:7 Aliases: 9
Active Zoning Database :
Name: ZoneSet1 Zonesets:1 Zones:2
Status:
This example shows that only the active zone set is distributed.
Step 2 Verify that the distribute flag is on.
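If you collect the show zone status output from several switches, the fields of interest can be extracted with a short script for comparison. This is a minimal sketch: parse_zone_status is a hypothetical helper, and the patterns assume the output layout shown in the example above.

```python
import re

# Patterns follow the field labels in the show zone status example output above.
PATTERNS = {
    "vsan": r"VSAN:\s*(\d+)",
    "default_zone": r"default-zone:\s*(\w+)",
    "distribute": r"distribute:\s*(.+?)\s+Interop:",
    "mode": r"mode:\s*(\w+)",
}


def parse_zone_status(text):
    """Extract the VSAN, default zone policy, distribute flag, and zone mode."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    return fields


sample = """VSAN: 1 default-zone: deny distribute: active only Interop: default
mode: basic merge-control: allow session: none
hard-zoning: enabled"""

print(parse_zone_status(sample))
# {'vsan': '1', 'default_zone': 'deny', 'distribute': 'active only', 'mode': 'basic'}
```

Comparing the resulting dictionaries across switches in the same VSAN highlights differences in the distribute flag, the default zone policy, or the zone mode.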
Step 1 Choose Fabricxx > VSANxx > zonesetname and select the Policies tab.
Step 2 View the Default Zone Behavior field for each switch in the VSAN to determine which switches have
mismatched default zone policies.
Step 3 If you are using basic zoning, select the same value from the Default Zone Behavior drop-down menu
for each switch in the VSAN to set the same default zone policy.
Step 4 Click Apply Changes to save these changes.
Step 5 If you are using enhanced zoning, follow these steps:
a. Choose Fabricxx > VSANxx and view the Release field to verify that all switches are capable of
working in the enhanced mode.
All switches must have Cisco MDS SAN-OS Release 2.0(1b) or later. If one or more switches are
not capable of working in enhanced mode, then your request to move to enhanced mode is rejected.
b. Choose Fabricxx > VSANxx > zonesetname, select the Policies tab, and use the Default Zone
Behavior field to set the default zone policy.
c. Click Apply Changes to save these changes.
d. Select the Enhanced tab and select enhanced from the Action drop-down menu.
e. Click Apply Changes to save these changes.
By doing so, you automatically start a session, acquire a fabric-wide lock, distribute the active and
full zoning database using the enhanced zoning data structures, distribute zoning policies, and then
release the lock. All switches in the VSAN then move to the enhanced zoning mode.
Note After moving from basic zoning to enhanced zoning (or vice versa), we recommend that you save the
running configuration.
This example shows the default zone policy is deny, and the zone mode is basic.
Step 2 If you are using basic zoning, follow these steps:
a. Repeat Step 1 for all switches in the VSAN to verify that they have the same zone mode. Use the
zone mode basic command to change any switches that are not in basic mode.
b. Use the zone default-zone command on each switch in the VSAN to set the same default zone
policy.
Step 3 If you are using enhanced zoning, follow these steps:
a. Use the show version command on all switches in the VSAN to verify that all switches are capable
of working in the enhanced mode.
All switches must have Cisco MDS SAN-OS Release 2.0(1b) or later. If one or more switches are
not capable of working in enhanced mode, then your request to move to enhanced mode is rejected.
b. Use the zone default-zone command to set the default zone policy.
c. Use the zone mode enhanced vsan-id command to set the operation mode to enhanced zoning
mode.
By doing so, you automatically start a session, acquire a fabric-wide lock, distribute the active
and full zoning database using the enhanced zoning data structures, distribute zoning policies, and
then release the lock. All switches in the VSAN then move to the enhanced zoning mode.
switch(config)# zone mode enhanced vsan 3000
Note After moving from basic zoning to enhanced zoning (or vice versa), we recommend that you use
the copy running-config startup-config command to save the running configuration.
Explanation Interface on the VSAN was isolated because the adjacent switch is not responding to
zone server requests.
Recommended Action Compare the active zone set with that of the adjacent switch, or enter the zone merge
interface CLI command or similar Fabric Manager/Device Manager command.
Introduced Cisco MDS SAN-OS Release 1.2(2a).
Explanation Full zoning databases are inconsistent between two switches connected by the interface.
Databases are not merged.
Recommended Action Compare the full zoning database with that of the adjacent switch. Correct the difference
and flap the link.
Introduced Cisco MDS SAN-OS Release 1.3(1).
Explanation Full zoning databases are inconsistent between two switches connected by the
interface. Databases are not merged.
Recommended Action Compare the full zoning database with that of the adjacent switch. Correct the difference
and flap the link.
Introduced Cisco MDS SAN-OS Release 1.2(2a).
Recommended Action Set the interoperability mode to the same value on both switches.
Introduced Cisco MDS SAN-OS Release 2.0(1b).
Note Zoning information exists on a per VSAN basis. Therefore, for a TE port, it may be necessary to verify
that the zoning information does not conflict with any allowed VSAN.
Resolving a Link Isolation Because of a Failed Zone Merge Using Fabric Manager
You can use the Zone Merge Analysis tool in Fabric Manager to check the compatibility of the active zone sets in two
switches before actually merging the two zone sets. Refer to the Cisco MDS 9000 Fabric
Manager Configuration Guide for more information.
To perform a zone merge analysis using Fabric Manager, follow these steps:
Step 1 Choose Zone > Merge Analysis.
You see the Zone Merge Analysis dialog box.
Step 2 Select the first switch to be analyzed from the Check Switch 1 drop-down list.
Step 3 Select the second switch to be analyzed from the And Switch 2 drop-down list.
Step 4 Enter the VSAN ID where the zone set merge failure occurred in the For Active Zoneset Merge Problems
in VSAN Id field.
Step 5 Click Analyze to analyze the zone merge. Click Clear to clear the analysis data from the Zone Merge
Analysis dialog box.
Resolving a Link Isolation Because of a Failed Zone Merge Using the CLI
The following CLI commands are used to resolve a failed zone merge:
• zoneset import vsan-id
• zoneset export vsan-id
To resolve a link isolation because of a failed zone merge using the CLI, follow these steps:
Step 1 Use the show interface command to confirm that the port is isolated because of a zone merge failure.
switch# show interface fc2/14
fc2/14 is down (Isolation due to zone merge failure)
Hardware is Fibre Channel, WWN is 20:4e:00:05:30:00:63:9e
vsan is 1
Beacon is turned off
40 frames input, 1056 bytes, 0 discards
0 runts, 0 jabber, 0 too long, 0 too short
0 input errors, 0 CRC, 3 invalid transmission words
0 address id, 0 delimiter
0 EOF abort, 0 fragmented, 0 unknown class
79 frames output, 1234 bytes, 16777216 discards
Received 23 OLS, 14 LRR, 13 NOS, 39 loop inits
Transmitted 50 OLS, 16 LRR, 21 NOS, 25 loop inits
An E port is segmented (isolation due to zone merge failure) if the following conditions are true:
• The active zone sets on the two switches differ from each other in terms of zone membership
(provided there are zones at either side with identical names).
• The active zone sets on both switches contain a zone with the same name but with different zone
members.
Step 2 Verify the zoning information, using the following commands on each switch:
• show zone vsan vsan-id
• show zoneset vsan vsan-id
Step 3 You can use two different approaches to resolve a zone merge failure by overwriting the zoning
configuration of one switch with the other switch’s configuration. This can be done with either of the
following commands:
• zoneset import interface interface-number vsan vsan-id
• zoneset export interface interface-number vsan vsan-id
The import option of the command overwrites the local switch’s active zone set with that of the remote
switch. The export option overwrites the remote switch’s active zone set with the local switch’s active
zone set.
Step 4 If the zoning databases between the two switches are overwritten, you cannot use the import option. To
work around this, you can manually change the content of the zone database on either of the switches,
and then issue a shutdown/no shutdown command sequence on the isolated port.
Step 5 If the isolation is specific to one VSAN and not on an E port, the correct way to cycle the link
down and up is to remove the VSAN from the list of allowed VSANs on that trunk port, and reinsert it.
Note Do not simply issue a shutdown/no shutdown command sequence on the port. This would
affect all the VSANs crossing the EISL instead of just the VSAN experiencing the isolation
problem.
Resolving Mismatched Active Zone Sets Within the Same VSAN Using Fabric Manager
Mismatched active zone sets within the same VSAN result in that VSAN being segmented in Fabric
Manager. To verify a mismatched active zone set within the same VSAN using Fabric Manager, follow
these steps:
Step 1 Choose Zone > Edit Local Full Zone Database and select the segmented VSAN you are interested in.
Click on the active zone set, which is in bold, to view the list of zones and zone members for this active
zone set.
Step 2 Repeat Step 1 for the other segmented VSAN.
A mismatched active zone set may include zones with the same name but different members, or a missing
zone within the zone set.
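The two kinds of mismatch described above (same zone name with different members, or a zone missing on one side) can be checked offline once the zone sets have been copied out of each switch. The following Python sketch is illustrative only: the function name diff_zone_sets and the dictionary layout are assumptions, not part of Fabric Manager or the CLI, and the sketch only reports differences. It does not predict whether a merge succeeds.

```python
def diff_zone_sets(local, remote):
    """Compare two active zone sets given as {zone_name: set_of_members}.

    Returns three sorted lists of zone names:
    (same name but different members, only in local, only in remote).
    """
    conflicting = sorted(
        name for name in local.keys() & remote.keys()
        if local[name] != remote[name]
    )
    only_local = sorted(local.keys() - remote.keys())
    only_remote = sorted(remote.keys() - local.keys())
    return conflicting, only_local, only_remote


# Sample data for illustration: one zone per switch, members given as pWWNs.
switch1 = {"VZ1": {"22:00:00:20:37:04:ea:2b", "22:00:00:20:37:04:f8:a1"}}
switch2 = {"VZ1": {"22:00:00:20:37:04:f8:a1", "22:00:00:20:37:0e:65:44"}}

conflicting, only_local, only_remote = diff_zone_sets(switch1, switch2)
print("same name, different members:", conflicting)
print("only on first switch:", only_local)
print("only on second switch:", only_remote)
```

A zone listed in the first result is the case that needs attention first, because both switches define it with different members.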
Resolving Mismatched Active Zone Sets Within the Same VSAN Using the CLI
To verify a mismatched active zone set within the same VSAN using the CLI, follow these steps:
Step 1 Use the show zoneset active vsan vsan-id command to display the active zone set configuration of
the first switch:
Switch1# show zoneset active vsan 99
zoneset name ZoneSet1 vsan 99
zone name VZ1 vsan 99
* fcid 0x7800e2 [pwwn 22:00:00:20:37:04:ea:2b]
* fcid 0x7800d9 [pwwn 22:00:00:20:37:04:f8:a1]
Step 2 Use the show zoneset active vsan vsan-id command to display the active zone set configuration of
the second switch:
Switch2# show zoneset active vsan 99
zoneset name ZoneSet1 vsan 99
zone name VZ1 vsan 99
pwwn 22:00:00:20:37:04:f8:a1
pwwn 22:00:00:20:37:0e:65:44
Even though the zones have the same name, their respective members are different.
Step 3 Issue the show interface command to view information about the TE port and the interface.
Switch2# show interface fc1/8
fc1/8 is trunking
Hardware is Fibre Channel
Port WWN is 20:08:00:05:30:00:5f:1e
Peer port WWN is 20:05:00:05:30:00:86:9e
Admin port mode is E, trunk mode is auto
Port mode is TE
Port vsan is 1
Speed is 2 Gbps
Receive B2B Credit is 255
Receive data field size is 2112
Beacon is turned off
Trunk vsans (admin allowed and active) (1,99)
Trunk vsans (up) (1)
Trunk vsans (isolated) (99)
Trunk vsans (initializing) ()
5 minutes input rate 120 bits/sec, 15 bytes/sec, 0 frames/sec
5 minutes output rate 88 bits/sec, 11 bytes/sec, 0 frames/sec
10845 frames input, 620268 bytes, 0 discards
0 CRC, 0 unknown class
0 too long, 0 too short
Note To issue commands with the internal keyword, you must have an account that is a member of
the network-admin group.
From this output, you can see the VSAN is isolated because of a zone merge failure.
Step 5 Do one of the following to resolve the isolation problem:
• Change the membership of one of the zones to match the other zone of the same name. See the
“Resolving Host and Storage Not in the Same Zone Using Fabric Manager” section on page 9-5.
• Discard one of the zone sets completely by deactivating it using the no zoneset activate command.
If a VSAN does not have an active zone set, it automatically takes the active zone set of the other
merging switch. See the “Deactivating a Zone Set and Restarting the Zone Merge Process Using the
CLI” section on page 9-20.
• Overwrite the active zone set on one switch using the import or export commands. This method is
destructive to one of the active zone sets.
– zoneset import interface interface-number vsan vsan-id
– zoneset export interface interface-number vsan vsan-id
Step 6 Use the show interface fcx/y trunk vsan vsan-id command to verify that VSAN 99 is no longer isolated:
Switch1# show interface fc1/5 trunk vsan 99
fc1/5 is trunking
Vsan 99 is up, FCID is 0x780102
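When many trunk ports must be checked, the Trunk vsans lines of the show interface output can be parsed from a saved capture. The following Python sketch is a hedged illustration: the helper names are hypothetical, it works only on text you have already collected, and support for ranges such as 1-3 is an assumption because the sample output in this section shows only comma-separated values.

```python
import re

# Matches lines such as: Trunk vsans (isolated) (99)
TRUNK_LINE = re.compile(r"Trunk vsans \(([^)]*)\)\s+\(([^)]*)\)")


def parse_vsan_list(text):
    """Expand '1,99' (or, by assumption, '1-3,99') into a sorted list of ints."""
    vsans = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vsans.update(range(int(low), int(high) + 1))
        else:
            vsans.add(int(part))
    return sorted(vsans)


def trunk_vsan_states(show_interface_output):
    """Map each trunk state label to the list of VSANs reported for it."""
    return {
        match.group(1): parse_vsan_list(match.group(2))
        for match in TRUNK_LINE.finditer(show_interface_output)
    }


sample = """\
Trunk vsans (admin allowed and active) (1,99)
Trunk vsans (up) (1)
Trunk vsans (isolated) (99)
Trunk vsans (initializing) ()
"""

states = trunk_vsan_states(sample)
print("isolated VSANs:", states["isolated"])
```

After the isolation is resolved, the isolated list for the port is expected to be empty.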
Deactivating a Zone Set and Restarting the Zone Merge Process Using Fabric Manager
To deactivate a zone set and restart the zone merge process using Fabric Manager, follow these steps:
Step 1 Choose Zone > Deactivate Zone Set to deactivate the zone set configuration.
Caution This will disrupt traffic and cause the MDS 9000 switch to lose connectivity with the network.
Step 2 Choose Interfaces > FC Physical and select down from the Status Admin drop-down menu to shut
down the connection to the zone to be merged. You may see the following system messages:
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/14 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/15 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/16 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface
port-channel 1 is down (No operational members)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_ADMIN_DOWN: Interface port-channel 1 is down
(Administratively down)
Nov 19 10:26:10 switch4 %LOG_PORT_CHANNEL-5-FOP_CHANGED: port-channel 1: first operational
port changed from fc1/16 to none
Step 3 Choose Interfaces > FC Physical and select up from the Status Admin drop-down menu to enable the
connection to the zone to be merged. You may see the following system messages:
Nov 19 10:28:11 switch4 %LOG_PORT_CHANNEL-5-FOP_CHANGED: port-channel 1: first operational port
changed from none to fc1/15
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_UP: Interface port-channel 1 is up in mode TE
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/14, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/15, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/16, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/14, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/15, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/16, vsan 1 is up
Step 4 Choose Zone > Edit Local Full Zone Database to verify the active zone set configuration.
After deactivating the zone set on the first switch and performing a shutdown followed by a no shutdown
on the ISL that connects it to the second switch, the zone merge is processed again. Because the first
switch has no active zone set, it learns the active zone set from the second switch during the zone merge
process.
Deactivating a Zone Set and Restarting the Zone Merge Process Using the CLI
To deactivate a zone set and restart the zone merge process using the CLI, follow these steps:
Step 1 Use the no zoneset activate name zoneset-name vsan vsan-id command to deactivate the zone set
configuration from the switch.
Caution This will disrupt traffic and cause the MDS 9000 switch to lose connectivity with the network.
Step 2 Use the show zoneset active command to confirm that the zone set has been removed.
Step 3 Use the shutdown command to shut down the connection to the zone to be merged.
switch4(config)# interface port-channel 1
switch4(config-if)# shutdown
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/14 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/15 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_CHANNEL_ADMIN_DOWN: Interface fc1/16 is down
(Channel admin down)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface
port-channel 1 is down (No operational members)
Nov 19 10:26:10 switch4 %LOG_PORT-5-IF_DOWN_ADMIN_DOWN: Interface port-channel 1 is down
(Administratively down)
Nov 19 10:26:10 switch4 %LOG_PORT_CHANNEL-5-FOP_CHANGED: port-channel 1: first operational
port changed from fc1/16 to none
Step 4 Use the no shutdown command to reactivate the connection to the zone to be merged:
switch4(config-if)# no shutdown
Nov 19 10:28:11 switch4 %LOG_PORT_CHANNEL-5-FOP_CHANGED: port-channel 1: first operational port
changed from none to fc1/15
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_UP: Interface port-channel 1 is up in mode TE
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/14, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/15, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/16, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/14, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/15, vsan 1 is up
Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/16, vsan 1 is up
Step 5 Exit configuration mode and use the show zoneset active command to check the active zone
sets.
switch4# show zoneset active
zoneset name wall vsan 1
zone name excal1 vsan 1
* fcid 0x620200
fcid 0x6200ca
zone name $default_zone$ vsan 1
* fcid 0x6e00da
* fcid 0x6e00d9
* fcid 0x6e00d6
* fcid 0x6e0100
After deactivating the zone set on switch 4 and performing a shutdown followed by a no shutdown on
the ISL that connects it to switch 3, the zone merge is processed again. Because switch 4 has no active
zone set, it learns the active zone set from switch 3 during the zone merge process.
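To confirm from a saved log that every member interface came back up after the no shutdown, the IF_TRUNK_UP messages can be collected with a short script. This Python sketch is an illustration under stated assumptions: the function name is hypothetical, and the message layout is taken only from the sample messages in this procedure.

```python
import re

# %FACILITY-SEVERITY-MNEMONIC: text
SYSLOG = re.compile(
    r"%(?P<facility>[A-Z_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z_]+): (?P<text>.*)"
)
TRUNK_UP = re.compile(r"Interface (?P<ifname>[^\s,]+), vsan (?P<vsan>\d+) is up")


def trunk_up_events(lines):
    """Return the set of (interface, vsan) pairs reported by IF_TRUNK_UP messages."""
    events = set()
    for line in lines:
        message = SYSLOG.search(line)
        if message is None or message.group("mnemonic") != "IF_TRUNK_UP":
            continue
        detail = TRUNK_UP.search(message.group("text"))
        if detail:
            events.add((detail.group("ifname"), int(detail.group("vsan"))))
    return events


log = [
    "Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_UP: Interface port-channel 1 is up in mode TE",
    "Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/14, vsan 1 is up",
    "Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/15, vsan 1 is up",
    "Nov 19 10:28:21 switch4 %LOG_PORT-5-IF_TRUNK_UP: Interface fc1/16, vsan 1 is up",
]

events = trunk_up_events(log)
print(sorted(events))
```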
Step 1 Choose Fabricxx > VSANxx and select the zone set that you want to configure.
Step 2 Select the Enhanced tab from the Information pane and view the Config DB Locked By column to
determine which switch and which user holds the enhanced zoning lock for this VSAN.
Step 3 Check the Config DB Discard Changes check box and click Apply Changes to clear the enhanced
zoning lock.
Note Verify that no valid configuration change is in progress before you clear a lock.
Step 1 Use the show zone internal vsan CLI command to determine which switch has the lock for the VSAN.
switch# show zone internal vsan 16
VSAN: 16 default-zone: deny(rw) distribute: active only
E_D_TOV: 2000 R_A_TOV: 10000 D_S_TOV: 5000 F_S_TOV: 5000 F_D_TOV: 2000
Interop: default IOD: disable bcast: enable dflt-bcast: disable dflt-qos: 0
DBLock:-(F count:0) Ifindex Table Size: 2 Transit Frame Index: 0
Total Transit Frame Count: 11 Transit Discard Count: 9 Global Full Database Counters :
Zonesets: 9 Zones: 153
Aliases: 58 Attribute-groups: 15
Members: 482 LUN Members: 0
Global Active Database Counters :
Zones: 159 Members: 442 LUN Members: 0 Global Database (Active + Full) Counters :
Read-only Zones: 0 LUN Members: 0
License Info: 0x0
Full Zoning Database :
Zonesets:2 Zones:2 Aliases: 0 Attribute-groups: 1 Active Zoning Database :
Name: CX400-BLUE Zonesets:1 Zones:2 TCAM Info :
cur_seq_num : 2840, state : 0
add_reqs = 15, del_reqs = 0, entries_added = 9 Change protocol info :
local domain id = 50, ACA by 0x58 <===========domain ID 58 has the lock
State = Idle, reply_cnt = 1, req_sent_cnt = 1, req_pending =0
Remote domains :
58
Note If you see ACA by 0xff in the display, it means that no lock is known to exist on the domain in
this switch. This should be the same for all switches in the VSAN.
Step 2 Use the show zone status vsan CLI command on the switch that holds the lock to determine the lock
holder. In the example above, you use this command on the switch that has the domain ID 58.
switch# show zone status vsan 16
hard-zoning: enabled
Step 3 Use the no zone commit vsan CLI command to release the lock if you are the holder of the lock.
Step 4 Use the no zone commit vsan <vsan id> force CLI command to release the lock if another user holds
the lock.
Note Verify that no valid configuration change is in progress before you clear a lock.
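The ACA by field from Step 1 can be extracted from saved show zone internal vsan output when several switches must be compared. The following Python sketch is illustrative: the function name is hypothetical, and the only facts it relies on are the ones stated in this procedure, namely that the field is printed as a hexadecimal domain value and that 0xff means no lock is known.

```python
import re

ACA_FIELD = re.compile(r"ACA by (0x[0-9a-fA-F]+)")


def lock_holder_domain(show_zone_internal_output):
    """Return the domain value that holds the lock, or None when it is 0xff."""
    match = ACA_FIELD.search(show_zone_internal_output)
    if match is None:
        raise ValueError("no 'ACA by' field found in the output")
    domain = int(match.group(1), 16)
    return None if domain == 0xFF else domain


locked = "local domain id = 50, ACA by 0x58"
unlocked = "local domain id = 50, ACA by 0xff"

holder = lock_holder_domain(locked)
print("lock holder:", "none" if holder is None else f"{holder:#x}")
print("lock holder:", lock_holder_domain(unlocked))
```

Running the same check against every switch in the VSAN shows whether they all agree on the lock state, as the note in Step 1 requires.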
C H A P T E R 10
Troubleshooting IP Storage Services
This chapter describes how to identify and resolve problems that might occur in the IP storage services
portion of the Cisco MDS 9000 Family products. It includes the following sections:
• Overview, page 10-1
• IP Connections Troubleshooting, page 10-2
• FCIP Connections Troubleshooting, page 10-5
• Troubleshooting iSCSI Issues, page 10-31
• Fine Tuning/Troubleshooting iSCSI TCP Performance, page 10-44
Overview
Using open-standard, IP-based technology, the Cisco MDS 9000 Family IP storage module enables you
to extend the reach of Fibre Channel SANs. The switch can connect separated SAN islands together via
IP networks using FCIP, and allow IP hosts to access FC storage using the iSCSI protocol.
The IP Storage (IPS) services module allows you to use FCIP and iSCSI features. It supports the full
range of features available on other switching modules, including VSANs, security, and traffic
management. The IPS module can be used in any Cisco MDS 9000 Family switch and has eight Gigabit
Ethernet ports. Each port can run the FCIP and iSCSI protocols simultaneously.
FCIP transports Fibre Channel frames transparently over an IP network between two Cisco MDS 9000
Family switches or other FCIP standards-compliant devices (see Figure 10-1). Using the iSCSI protocol,
the IPS module provides IP hosts access to Fibre Channel storage devices. IP host-initiated iSCSI
commands are encapsulated in IP, and sent to an MDS 9000 IPS port. There, the commands are routed
from the IP network into a Fibre Channel network, and forwarded to the intended target.
[Figure 10-1: MDS1 and MDS2, each attached to Fibre Channel (FC), connected from port GE 2/8 to port GE 2/8 across an IP network.]
IP Connections Troubleshooting
If you suspect that all or part of your IP connection has failed, you can verify that by performing one or
more of the procedures in this section. Using these procedures, you can verify connectivity for IP 802.1q,
EtherChannel, and VRRP for iSCSI.
Step 1 Perform a basic check of host reachability and network connectivity using the ping command. A sample
output of the ping command follows:
switch# ping 172.18.185.121
PING 172.18.185.121 (172.18.185.121): 56 data bytes
64 bytes from 172.18.185.121: icmp_seq=0 ttl=128 time=0.3 ms
64 bytes from 172.18.185.121: icmp_seq=1 ttl=128 time=0.1 ms
64 bytes from 172.18.185.121: icmp_seq=2 ttl=128 time=0.2 ms
64 bytes from 172.18.185.121: icmp_seq=3 ttl=128 time=0.2 ms
64 bytes from 172.18.185.121: icmp_seq=4 ttl=128 time=0.1 ms
64 bytes from 172.18.185.121: icmp_seq=5 ttl=128 time=0.1 ms
Step 2 Verify the route to the remote device using the show ip route, traceroute, and show arp commands.
A sample output of the show ip route command follows:
switch# show ip route
A sample output of the traceroute command follows. The route is using interface GigE, verified using
the show arp command.
switch# traceroute 172.18.185.121
traceroute to 172.18.185.121 (172.18.185.121), 30 hops max, 38 byte packets
1 172.18.185.121 (172.18.185.121) 0.411 ms 0.150 ms 0.146 ms
Another sample output of the traceroute command follows. This route is using interface mgmt0, verified
using the show arp command.
switch# traceroute 10.82.241.17
traceroute to 10.82.241.17 (10.82.241.17), 30 hops max, 38 byte packets
1 172.18.189.129 (172.18.189.129) 0.413 ms 0.257 ms 0.249 ms
2 172.18.0.33 (172.18.0.33) 0.296 ms 0.260 ms 0.258 ms
3 10.81.254.69 (10.81.254.69) 0.300 ms 0.273 ms 0.277 ms
4 10.81.254.118 (10.81.254.118) 0.412 ms 0.292 ms 0.287 ms
5 10.83.255.81 (10.83.255.81) 0.320 ms 0.301 ms 0.310 ms
6 10.83.255.163 (10.83.255.163) 0.314 ms 0.295 ms 0.279 ms
7 10.82.241.17 (10.82.241.17) 48.152 ms 48.608 ms 48.423 ms
A sample output of the clear ips arp command follows. You clear the ARP cache to verify that the activity
you are viewing is the most current.
switch# clear ips arp interface gigabitethernet 4/7
arp clear successful
Step 3 Use the show interface command to verify that the Gigabit Ethernet interface is up. A sample output of
the show interface command follows.
GigabitEthernet4/7 is up
Hardware is GigabitEthernet, address is 0005.3000.9f58
Internet address is 172.18.189.137/26
MTU 1500 bytes, BW 1000000 Kbit
Port mode is IPS
Speed is 1 Gbps
Beacon is turned off
5 minutes input rate 688 bits/sec, 86 bytes/sec, 0 frames/sec
5 minutes output rate 312 bits/sec, 39 bytes/sec, 0 frames/sec
156643 packets input, 16859832 bytes
0 multicast frames, 0 compressed
0 input errors, 0 frame, 0 overrun 0 fifo
144401 packets output, 7805631 bytes, 0 underruns
0 output errors, 0 collisions, 0 fifo
0 carrier errors
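The reachability check in Step 1 can be summarized from captured ping output when the result has to be recorded or compared over time. This Python sketch is a minimal illustration: the function name and the summary fields are assumptions, and it parses only the reply-line format shown in the sample output of this procedure.

```python
import re

# Matches reply lines such as:
# 64 bytes from 172.18.185.121: icmp_seq=0 ttl=128 time=0.3 ms
REPLY = re.compile(r"icmp_seq=(\d+) ttl=(\d+) time=([\d.]+) ms")


def summarize_ping(output):
    """Summarize reply count, gaps in icmp_seq, and round-trip times."""
    replies = [(int(seq), float(ms)) for seq, _ttl, ms in REPLY.findall(output)]
    if not replies:
        return None
    seqs = [seq for seq, _ms in replies]
    times = [ms for _seq, ms in replies]
    expected = set(range(min(seqs), max(seqs) + 1))
    return {
        "replies": len(replies),
        "missing_seq": sorted(expected - set(seqs)),
        "min_ms": min(times),
        "avg_ms": sum(times) / len(times),
        "max_ms": max(times),
    }


sample = """\
PING 172.18.185.121 (172.18.185.121): 56 data bytes
64 bytes from 172.18.185.121: icmp_seq=0 ttl=128 time=0.3 ms
64 bytes from 172.18.185.121: icmp_seq=1 ttl=128 time=0.1 ms
64 bytes from 172.18.185.121: icmp_seq=2 ttl=128 time=0.2 ms
64 bytes from 172.18.185.121: icmp_seq=3 ttl=128 time=0.2 ms
64 bytes from 172.18.185.121: icmp_seq=4 ttl=128 time=0.1 ms
64 bytes from 172.18.185.121: icmp_seq=5 ttl=128 time=0.1 ms
"""

summary = summarize_ping(sample)
print(summary)
```

A non-empty missing_seq list points to lost replies between the first and last sequence numbers seen.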
Note The FC ID variable used in this procedure is the domain controller address; it is not a duplication of the
domain ID.
Step 1 Displays the destination switch’s domain ID. To obtain the domain controller address, concatenate
the domain ID with FFFC. For example, if the domain ID is 0xda(218), the concatenated ID is 0xfffcda.
switch# show fcdomain domain-list vsan 200
Number of domains: 7
Domain ID WWN
--------- -----------------------
0x01(1) 20:c8:00:05:30:00:59:df [Principal]
0x02(2) 20:c8:00:0b:5f:d5:9f:c1
0x6f(111) 20:c8:00:05:30:00:60:df
0xda(218) 20:c8:00:05:30:00:87:9f [Local]
0x06(6) 20:c8:00:0b:46:79:f2:41
0x04(4) 20:c8:00:05:30:00:86:5f
0x6a(106) 20:c8:00:05:30:00:f8:e3
Step 2 Verifies reachability of the destination switch by checking its end-to-end connectivity.
switch# fcping fcid 0xFFFCDA vsan 200
28 bytes from 0xFFFCDA time = 298 usec
28 bytes from 0xFFFCDA time = 260 usec
28 bytes from 0xFFFCDA time = 298 usec
28 bytes from 0xFFFCDA time = 294 usec
28 bytes from 0xFFFCDA time = 292 usec
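The domain controller address used with fcping is the domain ID prefixed with FFFC. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the 1 to 239 range check reflects the valid Fibre Channel domain ID range rather than anything stated in this procedure.

```python
def domain_controller_fcid(domain_id):
    """Build the 24-bit domain controller address 0xFFFCxx for a domain ID."""
    if not 1 <= domain_id <= 239:
        raise ValueError(f"domain ID {domain_id} is outside 1..239")
    # FFFC occupies the upper 16 bits; the domain ID fills the low byte.
    return 0xFFFC00 | domain_id


fcid = domain_controller_fcid(0xDA)
print(f"fcping fcid 0x{fcid:06x} vsan 200")
```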
[Figure: MDS1 (FC1/14, GE 2/8 at 10.10.10.2/24) and MDS2 (GE 2/8 at 10.10.11.2/24, FC1/1) connected across an IP network, with an FC HBA attached.]
The interface FCIP can be any number from 1 through 255 and does not need to be the same as the profile
number. In this example, the same number is used for simplicity.
Step 9 Specify a profile to use.
MDS1(config-if)# use-profile 28
The interface FCIP will use the Local FCIP profile. The FCIP profile binds the interface FCIP to the
physical Gigabit Ethernet port and configures the TCP settings used by the interface FCIP.
MDS1(config-if)# peer-info ipaddr 10.10.11.2
The IP address in this example indicates the remote endpoint IP address of the FCIP tunnel.
MDS1(config-if)# no shutdown
MDS1(config-if)# end
vsan database
vsan 2 name grumpy_02
interface fcip28
no shutdown
use-profile 28
peer-info ipaddr 10.10.11.2
Setting the Static Route for FCIP Tunnels with the CLI
The static route must be set for FCIP tunnels. This route could also be ip route 10.10.11.0 255.255.255.0
interface gigabitethernet 2/8.
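The network and mask in the static route follow directly from the peer address and prefix length. The following Python sketch derives them with the standard ipaddress module; the function name is hypothetical, and the command text it builds is simply the route form quoted above.

```python
import ipaddress


def static_route_command(peer_with_prefix, interface):
    """Build the ip route line for the subnet that contains the FCIP peer."""
    network = ipaddress.ip_interface(peer_with_prefix).network
    return (
        f"ip route {network.network_address} {network.netmask} "
        f"interface {interface}"
    )


route = static_route_command("10.10.11.2/24", "gigabitethernet 2/8")
print(route)
```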
ips heartbeat
ips hapreset
ips boot
interface GigabitEthernet2/8
ip address 10.10.10.2 255.255.255.0
(This is the IP address used by the FCIP profile.)
no shutdown
MDS2(config-profile)# exit
MDS2(config-if)# use-profile 28
Mar 10 21:42:23 ips: Dequeued mts msg MTS_OPC_IPS_FCIP_CMI_REQUEST(mts opc 3321, msg id
32480)
Mar 10 21:42:23 ips: FCIP28: Process tunnel configuration event
Mar 10 21:42:23 ips: FCIP28: Change Entity-id from 0 to 28
Mar 10 21:42:23 ips: FCIP: Optimal IF lookup for GigabitEthernet2/8 is GigabitEthernet2/8
Mar 10 21:42:23 ips: FCIP28: bind with GigabitEthernet2/8 (phy GigabitEthernet2/8)
Mar 10 21:42:23 ips: FCIP28: Queueing bind tunnel to src if event to tunnel FSM resource:
0
Mar 10 21:42:23 ips: Locked fcip_if_fsm for MTS_OPC_IPS_FCIP_CMI_REQUEST(msg id 32480)
Mar 10 21:42:23 ips: FCIP28: Send bind for GigabitEthernet2/8 to PM (phy
GigabitEthernet2/8)
Mar 10 21:42:23 ips: FCIP28: add to run-time pss
Mar 10 21:42:23 ips: FCIP28: log: 2087000 phy: 2087000 state: 0 syslog: 0
Mar 10 21:42:23 ips: Dequeued mts msg MTS_OPC_IPS_CFG_FCIP_IF(mts opc 1905, msg id 7304)
MDS2(config-if)#
MDS2(config-if)# no shutdown
MDS2(config-if)# Mar 10 21:43:32 ips: Dequeued mts msg
MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc 3114, msg id 32737)
Mar 10 21:43:32 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
32737)
Mar 10 21:43:32 ips: Dequeued mts msg MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc
3114, msg id 32778)
Mar 10 21:43:32 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
32778)
Mar 10 21:43:32 ips: Dequeued mts msg MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc
3114, msg id 32783)
Mar 10 21:43:32 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
32783)
Displaying the Debug Output from FCIP Tunnel Supervisor with the CLI
The following example shows the debug output from the supervisor of the FCIP tunnel.
MDS2(config)# interface fcip 28
MDS2(config-if)# no shutdown
MDS2(config-if)# Mar 10 22:59:46 ips: fu_priority_select: - setting fd[3] for select call
- found data in FU_PSEL_Q_CAT_MTS queue, fd(3), usr_q_info(1)
Mar 10 22:59:46 ips: fu_priority_select_select_queue: round credit(0)
Mar 10 22:59:46 ips: curr_q - FU_PSEL_Q_CAT_CQ, usr_q_info(3), priority(4), credit(0),
empty
Mar 10 22:59:46 ips: Starting a new round
Mar 10 22:59:46 ips: fu_priority_select: returning FU_PSEL_Q_CAT_MTS queue, fd(3),
usr_q_info(1)
Mar 10 22:59:46 ips: Dequeued mts msg MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc
3114, msg id 47540)
Mar 10 22:59:46 ips: ips_mts_hdlr_pm_logical_port_state_change_range:
Mar 10 22:59:46 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
47540)
Mar 10 22:59:46 ips: fu_fsm_execute_all: match_msg_id(0), log_already_open(0)
Mar 10 22:59:46 ips: fu_fsm_execute_all: null fsm_event_list
Mar 10 22:59:46 ips: fu_fsm_engine: mts msg
MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(msg_id 47540) dropped
Mar 10 22:59:46 ips: fu_priority_select: - setting fd[3] for select call - found data in
FU_PSEL_Q_CAT_MTS queue, fd(3), usr_q_info(1)
Mar 10 22:59:46 ips: fu_priority_select_select_queue: round credit(6)
Mar 10 22:59:46 ips: curr_q - FU_PSEL_Q_CAT_CQ, usr_q_info(3), priority(4), credit(3),
empty
Mar 10 22:59:46 ips: fu_priority_select: returning FU_PSEL_Q_CAT_MTS queue, fd(3),
usr_q_info(1)
Mar 10 22:59:46 ips: Dequeued mts msg MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc
3114, msg id 47589)
Mar 10 22:59:46 ips: ips_mts_hdlr_pm_logical_port_state_change_range:
Mar 10 22:59:46 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
47589)
Mar 10 22:59:46 ips: fu_fsm_execute_all: match_msg_id(0), log_already_open(0)
Mar 10 22:59:46 ips: fu_fsm_execute_all: null fsm_event_list
Mar 10 22:59:46 ips: fu_fsm_engine: mts msg
MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(msg_id 47589) dropped
Mar 10 22:59:46 ips: fu_priority_select: - setting fd[3] for select call - found data in
FU_PSEL_Q_CAT_MTS queue, fd(3), usr_q_info(1)
Mar 10 22:59:46 ips: fu_priority_select_select_queue: round credit(4)
Mar 10 22:59:46 ips: curr_q - FU_PSEL_Q_CAT_CQ, usr_q_info(3), priority(4), credit(2),
empty
Mar 10 22:59:46 ips: fu_priority_select: returning FU_PSEL_Q_CAT_MTS queue, fd(3),
usr_q_info(1)
Mar 10 22:59:46 ips: Dequeued mts msg MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(mts opc
3114, msg id 47602)
Mar 10 22:59:46 ips: ips_mts_hdlr_pm_logical_port_state_change_range:
Mar 10 22:59:46 ips: Hndlr MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE (mts_opc 3114 msg_id
47602)
Mar 10 22:59:46 ips: fu_fsm_execute_all: match_msg_id(0), log_already_open(0)
Mar 10 22:59:46 ips: fu_fsm_execute_all: null fsm_event_list
Mar 10 22:59:46 ips: fu_fsm_engine: mts msg
MTS_OPC_PM_LOGICAL_PORT_STATE_CHANGE_RANGE(msg_id 47602) dropped
Displaying the Debug Output from the FCIP Tunnel IPS Module with the CLI
The following example shows the debug output from the IPS module of the FCIP tunnel.
MDS2# attach module 2
-------------------------------------------------------------------------------
ProfileId Ipaddr TcpPort
-------------------------------------------------------------------------------
28 10.10.10.2 3225
MDS1# show fcip profile 28
FCIP Profile 28
Listen Port is 3225
TCP parameters
SACK is disabled
PMTU discover is enabled, reset timeout is 3600 sec
Keep alive is 60 sec
Minimum retransmission timeout is 100 ms
Maximum number of re-transmissions is 4
Advertised window size is 64 KB
-------------------------------------------------------------------------------
Interface Vsan Admin Admin Status Oper Profile Port-channel
Mode Trunk Mode
Mode
-------------------------------------------------------------------------------
fcip28 1 auto on trunking TE 28 --
-------------------------------------------------------------------------------
Interface Input (rate is 5 min avg) Output (rate is 5 min avg)
----------------------------- -----------------------------
Rate Total Rate Total
Mbits/s Frames Mbits/s Frames
-------------------------------------------------------------------------------
fcip28 18 0 18 0
(These are the frame rates averaged over 5 minutes and the total count of frames since the last clear
counters command was issued, or since the last tunnel up.)
Verifying the Establishment of Default TCP Connections for Each Configured FCIP Tunnel with the
CLI
Verify that two default TCP connections are established for each FCIP tunnel configured, one for control
traffic and one for data traffic.
MDS1# show ips stats tcp interface gigabitethernet 2/8
TCP Statistics for port GigabitEthernet2/8
Connection Stats
6 active openings, 8 accepts
6 failed attempts, 0 reset received, 8 established
Segment stats
295930 received, 1131824 sent, 109 retransmitted
(Excessive retransmits indicate possible core drops and/or that the TCP window size should be adjusted.)
0 bad segments received, 0 reset sent
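To judge whether retransmits are excessive, it helps to express them as a fraction of segments sent. The following Python sketch computes that ratio from the Segment stats line. The function name is hypothetical, and no threshold is implied: this guide does not define what counts as excessive, so any limit you apply is your own assumption.

```python
import re

SEGMENTS = re.compile(r"(\d+) received, (\d+) sent, (\d+) retransmitted")


def retransmit_ratio(output):
    """Return retransmitted segments divided by sent segments."""
    match = SEGMENTS.search(output)
    if match is None:
        raise ValueError("segment statistics line not found")
    _received, sent, retransmitted = (int(value) for value in match.groups())
    return retransmitted / sent if sent else 0.0


sample = "295930 received, 1131824 sent, 109 retransmitted"
ratio = retransmit_ratio(sample)
print(f"retransmit ratio: {ratio:.6%}")
```

Comparing the ratio across two captures taken some minutes apart is more telling than a single reading, because the counters are cumulative.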
You can use the following command to verify that traffic is incrementing on the Gigabit Ethernet port of
the FCIP tunnel.
MDS1# show ips stats mac interface gigabitethernet 2/8
Ethernet MAC statistics for port GigabitEthernet2/8
Hardware Transmit Counters
1074898 frame 1095772436 bytes
0 collisions, 0 late collisions, 0 excess collisions
0 bad frames, 0 FCS error, 0 abort, 0 runt, 0 oversize
Hardware Receive Counters
33488196 bytes, 298392 frames, 277 multicasts, 16423 broadcasts
0 bad, 0 runt, 0 CRC error, 0 length error
0 code error, 0 align error, 0 oversize error
Software Counters
298392 received frames, 1074898 transmit frames
Verifying the Statistics of the ASIC Chip on Each Gigabit Ethernet Port with the CLI
Traffic statistics can be verified on the internal ASIC chip on each Gigabit Ethernet port.
MDS1# show ips stats flamingo interface gigabitethernet 2/8
Flamingo ASIC Statistics for port GigabitEthernet2/8
Hardware Egress Counters
2312 Good, 0 bad protocol, 0 bad header cksum, 0 bad FC CRC
(Good frames and CRC error frames can be monitored.)
Hardware Ingress Counters
(Verify good increments on the active tunnel.)
2312 Good, 0 protocol error, 0 header checksum error
0 FC CRC error, 0 iSCSI CRC error, 0 parity error
Software Egress Counters
2312 good frames, 0 bad header cksum, 0 bad FIFO SOP
0 parity error, 0 FC CRC error, 0 timestamp expired error
0 unregistered port index, 0 unknown internal type
0 RDL, 0 RDL too big RDL, 0 TDL ttl_1
3957292257 idle poll count, 0 loopback, 0 FCC PQ, 0 FCC EQ
Flow Control: 0 [0], 0 [1], 0 [2], 0 [3]
Software Ingress Counters
2312 Good frames, 0 header cksum error, 0 FC CRC error
0 iSCSI CRC error, 0 descriptor SOP error, 0 parity error
0 frames soft queued, 0 current Q, 0 max Q, 0 low memory
0 out of memory drop, 0 queue full drop
0 RDL, 0 too big RDL drop
Flow Control: 0 [0], 0 [1], 0 [2], 0 [3]
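Because the ASIC output contains many counters, a script can pick out only the error-type counters that are not zero. The following Python sketch is an illustration under stated assumptions: the function name and the keyword list (error, bad, drop) are choices made for this example, and the nonzero value in the second data set is invented to show a finding.

```python
import re

COUNTER = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")
KEYWORDS = ("error", "bad", "drop")  # assumed markers of an error counter


def nonzero_error_counters(output):
    """Return (section, counter label, value) for nonzero error counters."""
    findings = []
    section = None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.endswith("Counters"):
            section = stripped  # for example "Hardware Ingress Counters"
            continue
        for piece in stripped.split(","):
            match = COUNTER.match(piece)
            if match is None:
                continue
            value, label = int(match.group(1)), match.group(2)
            if value > 0 and any(word in label.lower() for word in KEYWORDS):
                findings.append((section, label, value))
    return findings


sample = """\
Hardware Egress Counters
  2312 Good, 0 bad protocol, 0 bad header cksum, 0 bad FC CRC
Hardware Ingress Counters
  2312 Good, 0 protocol error, 0 header checksum error
  0 FC CRC error, 0 iSCSI CRC error, 0 parity error
"""
# Hypothetical variant with three FC CRC errors, for illustration only.
degraded = sample.replace("0 FC CRC error, 0 iSCSI", "3 FC CRC error, 0 iSCSI")

print(nonzero_error_counters(sample))
print(nonzero_error_counters(degraded))
```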
Figure 10-4 shows more of the trace, with frame 13 being the first FCIP frame. This frame carries the
FC Standard ELP.
Figure 10-5 shows the FC portion of the EISL initialization over the FCIP tunnel.
[Figure: MDS1 and three remote switches, each attached to the IP network through port GE 2/1. Labels recovered from the figure:]
Switch at 10.10.10.2/24: fcip profile 21, ip address 10.10.10.2
interface fcip 21: use-profile 21, peer-info ipaddr 10.10.11.2
interface fcip 23: use-profile 21, peer-info ipaddr 10.10.8.2
interface fcip 28: use-profile 21, peer-info ipaddr 10.10.7.2
Switch at 10.10.11.2/24: fcip profile 1, ip address 10.10.11.2
interface fcip 1: use-profile 1, peer-info ipaddr 10.10.10.2
Switch at 10.10.8.2: fcip profile 1, ip address 10.10.8.2
interface fcip 1: use-profile 1, peer-info ipaddr 10.10.10.2
Switch at 10.10.7.2/24: fcip profile 1, ip address 10.10.7.2
interface fcip 1: use-profile 1, peer-info ipaddr 10.10.10.2
Creating the FCIP Interface for the Second Tunnel with the CLI
Now the interface FCIP is created for the second tunnel. The same FCIP profile is used for this example.
A separate FCIP profile can be used for each interface FCIP if desired.
MDS1(config-if)#
MDS1(config-if)# interface fcip 23
MDS1(config-if)# use-profile 21
MDS1(config-if)# peer-info ipaddr 10.10.8.2
MDS1(config-if)# no shutdown
MDS1(config-if)# exit
MDS1(config)#
Displaying Incorrect or Non-existent IP Address for Use with FCIP Profile with the CLI
MDS22(config)# fcip profile 21
MDS22(config-profile)# ip addr 1.1.1.1
MDS22(config-profile)# ip addr 34.34.34.34
MDS22(config-profile)# exit
MDS22(config)# exit
MDS22# show fcip profile 21
FCIP Profile 21
Internet Address is 34.34.34.34
(In the line above, no Gigabit Ethernet interface is shown. This means the IP address is not
assigned to a Gigabit Ethernet port.)
Listen Port is 3225
TCP parameters
SACK is disabled
PMTU discover is enabled, reset timeout is 3600 sec
Keep alive is 60 sec
Minimum retransmission timeout is 300 ms
Maximum number of re-transmissions is 4
Advertised window size is 64 KB
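The check described in the annotation above (whether the Internet Address line names a Gigabit Ethernet interface) can be applied to captured output with a short script. This is a minimal Python sketch; the function name is hypothetical, and it assumes the line format shown in this section.

```python
import re


def bound_interface(show_fcip_profile: str):
    """Return the interface bound to the profile address, or None if unbound."""
    for line in show_fcip_profile.splitlines():
        match = re.search(
            r"Internet Address is (\S+)(?: \(interface (\S+)\))?", line
        )
        if match:
            # Group 2 is None when no "(interface ...)" suffix is present.
            return match.group(2)
    return None


unbound = "FCIP Profile 21\nInternet Address is 34.34.34.34\nListen Port is 3225"
bound = "FCIP Profile 21\nInternet Address is 34.34.34.34 (interface GigabitEthernet2/5)"

print(bound_interface(unbound))  # no interface bound to the address
print(bound_interface(bound))
```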
MDS22# config t
Enter configuration commands, one per line. End with CNTL/Z.
MDS22(config)# interface gigabitethernet 2/5
MDS22(config-if)# ip addr 34.34.34.34 255.255.255.0
MDS22(config-if)# no shutdown
MDS22(config-if)# end
MDS22# show fcip profile 34
error: fcip profile not found
MDS22# show fcip profile 21
FCIP Profile 21
Internet Address is 34.34.34.34 (interface GigabitEthernet2/5)
(In the line above, the Gigabit Ethernet port is now shown and the FCIP profile is bound to a physical
port.)
Listen Port is 3225
TCP parameters
SACK is disabled
PMTU discover is enabled, reset timeout is 3600 sec
Keep alive is 60 sec
Minimum retransmission timeout is 300 ms
Maximum number of re-transmissions is 4
Advertised window size is 64 KB
The following example shows a configuration error when using multiple FCIP profiles on one physical
Gigabit Ethernet port.
Displaying Configuration Errors when Bringing Up a Tunnel on a Selected Port with the CLI
The following example shows a configuration error when bringing a tunnel up on the selected port. This
could be either an FCIP profile issue or an interface FCIP issue. Both sides must be configured correctly.
MDS2(config)# fcip profile 21
MDS2(config-profile)# port 13
(Change the TCP listen port on switch MDS2.)
MDS2(config-profile)# end
MDS2(config)# interface fcip 21
MDS2(config-if)# passive-mode
(Put interface FCIP 21 in passive mode to guarantee that MDS1 initiates the TCP connection.)
module-2# debug ips fcip fsm port 1
module-2# Mar 14 23:08:02 port1: 863:FCIP21: SUP-> Set Port mode 1
Mar 14 23:08:02 port1: 864:FCIP21: SUP-> Port VSAN (1) already set to same value
Mar 14 23:08:02 port1: 865:FCIP21: SUP-> Trunk mode (1) already set to same value
Mar 14 23:08:02 port1: 866:FCIP21: SUP-> Enable tunnel ADMIN UP
Mar 14 23:08:02 port1: 867:FCIP21: Try to Bring UP the Tunnel
Mar 14 23:08:02 port1: 868:FCIP21: Start TCP listener with peer: 10.10.10.2:13
(This debug output from switch MDS2 shows that the FCIP tunnel will not come up because switch
MDS2 is listening on port 13, and switch MDS1 is trying to establish the connection on the default port
3225.)
Mar 14 23:08:02 port1: 869:FCIP: Create a new listener object for 10.10.11.2:13
Mar 14 23:08:02 port1: 870:FCIP: Create FCIP Listener with local info: 10.10.11.2:13
Displaying the Interface FCIP Shut Down Administratively with the CLI
The following example shows the interface FCIP is administratively shut down. The debug output is
from the IPS module.
module-2# debug ips fcip fsm port 1
module-2# Mar 14 21:32:27 port1: 1:FCIP21: Create tunnel with ifindex: a000014
Mar 14 21:32:27 port1: 2:FCIP21: Get the peer info from the SUP-IPS-MGR
Mar 14 21:32:27 port1: 3:FCIP21: SUP-> Disable tunnel: already in disable state
Mar 14 21:32:27 port1: 4:FCIP21: SUP-> Set Port mode 1
Mar 14 21:32:27 port1: 5:FCIP21: SUP-> Set port index: 21
Mar 14 21:32:27 port1: 6:FCIP21: Try to Bring UP the Tunnel
In the following example, the local MDS switch is trying to connect to the remote end point on port 13,
but the remote end point is set to the default listen port 3225.
MDS2# show interface fcip 21
fcip21 is down (Link failure or not-connected)
Hardware is GigabitEthernet
Port WWN is 20:42:00:0b:5f:d5:9f:c0
Admin port mode is auto, trunk mode is on
vsan is 1
Using Profile id 21 (interface GigabitEthernet2/1)
Peer Information
Peer Internet address is 10.10.10.2 and port is 13
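A port mismatch of this kind can be spotted by comparing the peer port in the show interface fcip output of one switch with the listen port in the show fcip profile output of the other switch. The Python sketch below assumes both outputs have been captured as text; the function names are hypothetical.

```python
import re


def peer_port(show_interface_fcip: str) -> int:
    """Extract the TCP port this side uses to reach its peer."""
    match = re.search(
        r"Peer Internet address is \S+ and port is (\d+)", show_interface_fcip
    )
    if match is None:
        raise ValueError("peer information line not found")
    return int(match.group(1))


def listen_port(show_fcip_profile: str) -> int:
    """Extract the TCP listen port configured in the remote FCIP profile."""
    match = re.search(r"Listen Port is (\d+)", show_fcip_profile)
    if match is None:
        raise ValueError("listen port line not found")
    return int(match.group(1))


def ports_match(local_interface_output: str, remote_profile_output: str) -> bool:
    return peer_port(local_interface_output) == listen_port(remote_profile_output)


local = "Peer Information\nPeer Internet address is 10.10.10.2 and port is 13"
remote = "FCIP Profile 21\nListen Port is 3225"
print(ports_match(local, remote))  # False: 13 does not match 3225
```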
Displaying the Debug Output from the Second Switch with the CLI
The following debug output is from switch MDS2.
Mar 14 23:26:07 port1: 1340:FCIP21: Start TCP listener with peer: 10.10.10.2:3225
Mar 14 23:26:07 port1: 1341:FCIP: Create a new listener object for 10.10.11.2:3225
Mar 14 23:26:07 port1: 1342:FCIP: Create FCIP Listener with local info: 10.10.11.2:3225
Mar 14 23:26:07 port1: 1343:FCIP21: Create a DE 0xd802d140 for this tunnel
Mar 14 23:26:07 port1: 1344:FCIP21: Bind the DE 0xd802d140 [1] to tunnel LEP 0x80111570
Mar 14 23:26:07 port1: 1345:FCIP21: Start the active connection [1] to 10.10.10.2:13
Mar 14 23:26:07 port1: 1346:FCIP21: Create a DE 0xd802cdc0 for this tunnel
Mar 14 23:26:07 port1: 1347:FCIP21: Bind the DE 0xd802cdc0 [2] to tunnel LEP 0x80111570
Mar 14 23:26:07 port1: 1348:FCIP21: Start the active connection [2] to 10.10.10.2:13
(The switch is attempting to create a TCP connection on port 13. The creation port must match the TCP
listen port on the remote end point.)
Mar 14 23:26:07 port1: 1349:FCIP21: Active Connect creation FAILED [1]
Mar 14 23:26:07 port1: 1350:FCIP21: Delete the DE [1]0xd802d140
Mar 14 23:26:07 port1: 1351:FCIP21: Delete the DE object [1] 0xd802d140
Mar 14 23:26:07 port1: 1352:FCIP21: Try 7 to bring up the tunnel
Mar 14 23:26:07 port1: 1353:FCIP21: Start the bringup tunnel timer, timeout: 64000
Mar 14 23:26:07 port1: 1354:FCIP21: Active Connect creation FAILED [2]
Mar 14 23:26:07 port1: 1355:FCIP21: Delete the DE [2]0xd802cdc0
Mar 14 23:26:07 port1: 1356:FCIP21: Set lep operation state to DOWN
Mar 14 23:26:07 port1: 1357:FCIP21: Delete the DE object [2] 0xd802cdc0
Mar 14 23:26:07 port1: 1358:FCIP21: Try 8 to bring up the tunnel
Mar 14 23:26:07 port1: 1359:FCIP21: Start the bringup tunnel timer, timeout: 128000
Displaying Passive Mode Set on Both Sides of the FCIP Tunnel with the CLI
In the following example, passive mode is set on both sides of the FCIP tunnel.
module-2# Mar 14 23:49:06 port1: 1870:FCIP21: SUP-> Set Port mode 1
Mar 14 23:49:06 port1: 1871:FCIP21: SUP-> Port VSAN (1) already set to same value
Mar 14 23:49:06 port1: 1872:FCIP21: SUP-> Trunk mode (1) already set to same value
Mar 14 23:49:06 port1: 1873:FCIP21: SUP-> Enable tunnel ADMIN UP
Mar 14 23:49:06 port1: 1874:FCIP21: Try to Bring UP the Tunnel
Mar 14 23:49:06 port1: 1875:FCIP21: Start TCP listener with peer: 10.10.10.2:3225
Mar 14 23:49:06 port1: 1876:FCIP: Create a new listener object for 10.10.11.2:3225
Mar 14 23:49:06 port1: 1877:FCIP: Create FCIP Listener with local info: 10.10.11.2:3225
Mar 14 23:49:06 port1: 1878:FCIP21: Passive mode set, don't initiate TCP connection
(A TCP connection will not be established when passive mode is set. The Gigabit Ethernet port will only
listen.)
MDS2# show interface fcip 21
fcip21 is down (Link failure or not-connected)
Hardware is GigabitEthernet
Port WWN is 20:42:00:0b:5f:d5:9f:c0
Admin port mode is auto, trunk mode is on
vsan is 1
Using Profile id 21 (interface GigabitEthernet2/1)
Peer Information
Peer Internet address is 10.10.10.2 and port is 3225
Passive mode is enabled
(Passive mode is set, so a TCP connection will not be established.)
Special Frame is disabled
MDS1# show interface fcip 21
fcip21 is down (Link failure or not-connected)
Hardware is GigabitEthernet
Port WWN is 20:42:00:05:30:00:59:de
Admin port mode is auto, trunk mode is on
vsan is 1
Using Profile id 21 (interface GigabitEthernet2/1)
Peer Information
Peer Internet address is 10.10.11.2 and port is 3225
Passive mode is enabled
(Both sides are set to passive mode. You must change one or both sides to no passive-mode under the
interface FCIP.)
Special Frame is disabled
MDS2(config)# interface fcip 21
MDS2(config-if)# no passive-mode
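The condition shown above (passive mode enabled on both end points) can be detected from the two show interface fcip outputs before troubleshooting further. This is a minimal Python sketch; the function names are hypothetical, and it relies only on the "Passive mode is enabled" line that appears in the displays above.

```python
def is_passive(show_interface_fcip: str) -> bool:
    """True if the display contains the 'Passive mode is enabled' line."""
    return any(
        line.strip() == "Passive mode is enabled"
        for line in show_interface_fcip.splitlines()
    )


def some_side_can_initiate(side_a: str, side_b: str) -> bool:
    """A TCP connection can start only if at least one side is not passive."""
    return not (is_passive(side_a) and is_passive(side_b))


mds1 = "Peer Internet address is 10.10.11.2 and port is 3225\nPassive mode is enabled"
mds2 = "Peer Internet address is 10.10.10.2 and port is 3225\nPassive mode is enabled"
print(some_side_can_initiate(mds1, mds2))  # False: both sides only listen
```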
Mar 15 00:01:56 port1: 3315:FCIP21: tunnel bring-up debounce timer set, wait for timer to
pop
(Connect the switches to an NTP server or synchronize their clocks, or increase the acceptable time difference.)
Configuring and Displaying an FCIP Tunnel with a Special Frame with the CLI
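The following example configures the Special Frame on both end points of the FCIP tunnel and shows the
resulting debug output and interface display.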
MDS2# show wwn switch
Switch WWN is 20:00:00:0b:5f:d5:9f:c0
(You’ll need the WWN of each MDS 9000 switch end point.)
MDS1(config)# interface fcip 21
MDS1(config-if)# special-frame peer-wwn 20:00:00:0b:5f:d5:9f:c0 profile-id 1
(This enables the Special Frame that is used in the creation of the FCIP tunnel.)
MDS1# show wwn switch
Switch WWN is 20:00:00:05:30:00:59:de
MDS2(config)# interface fcip 21
MDS2(config-if)# special-frame peer-wwn 20:00:00:05:30:00:59:de profile-id 1
module-2#
Jan 14 15:25:38 port1: 857314:FCIP21: SUP-> Set Port mode 1
Jan 14 15:25:38 port1: 857315:FCIP21: SUP-> Port VSAN (1) already set to same value
Jan 14 15:25:38 port1: 857316:FCIP21: SUP-> Trunk mode (1) already set to same value
Jan 14 15:25:38 port1: 857317:FCIP21: SUP-> Enable tunnel ADMIN UP
Jan 14 15:25:38 port1: 857318:FCIP21: Try to Bring UP the Tunnel
Jan 14 15:25:38 port1: 857319:FCIP21: Start TCP listener with peer: 10.10.10.2:3225
Jan 14 15:25:38 port1: 857320:FCIP: Create a new listener object for 10.10.11.2:3225
Jan 14 15:25:38 port1: 857321:FCIP: Create FCIP Listener with local info: 10.10.11.2:3225
Jan 14 15:25:38 port1: 857322:FCIP21: Create a DE 0xd802cd00 for this tunnel
Jan 14 15:25:38 port1: 857323:FCIP21: Bind the DE 0xd802cd00 [1] to tunnel LEP 0x80111570
Jan 14 15:25:38 port1: 857324:FCIP21: Start the active connection [1] to 10.10.10.2:3225
Jan 14 15:25:38 port1: 857325:FCIP21: Create a DE 0xd802db40 for this tunnel
Jan 14 15:25:38 port1: 857326:FCIP21: Bind the DE 0xd802db40 [2] to tunnel LEP 0x80111570
Jan 14 15:25:38 port1: 857327:FCIP21: Start the active connection [2] to 10.10.10.2:3225
Jan 14 15:25:38 port1: 857328:FCIP21: Active Connect creation SUCCEEDED [1]
Jan 14 15:25:38 port1: 857329:FCIP21: Bind DE 1 to TCP-hdl 0xd8072c00
Jan 14 15:25:38 port1: 857330:FCIP21: Setup for Special Frame handling: I'm Originator
(This begins the Special Frame setup of the Originator.)
Jan 14 15:25:38 port1: 857331:FCIP21: Send the SF as Originator & wait for response
(The Special Frame is sent.)
Jan 14 15:25:38 port1: 857332:FCIP21: Setup timer to wait for SF
Jan 14 15:25:38 port1: 857333:FCIP21: Active Connect creation SUCCEEDED [2]
(The Special Frame is correctly configured with the WWN of the remote MDS 9000 switch.)
Jan 14 15:25:38 port1: 857334:FCIP21: Bind DE 2 to TCP-hdl 0xd8072000
Jan 14 15:25:38 port1: 857335:FCIP21: Setup for Special Frame handling: I'm Originator
Jan 14 15:25:38 port1: 857336:FCIP21: Send the SF as Originator & wait for response
Jan 14 15:25:38 port1: 857337:FCIP21: Setup timer to wait for SF
Jan 14 15:25:38 port1: 857338:FCIP21: processing SF frame, I'm Originator
Jan 14 15:25:38 port1: 857339:FCIP21: Bind DE 1 to eport 0x80110550
Jan 14 15:25:38 port1: 857340:FCIP21: bind de 1 in eport 0x80110550, hash = 1 num-conn: 2
Jan 14 15:25:38 port1: 857341:FCIP21: processing SF frame, I'm Originator
Jan 14 15:25:38 port1: 857342:FCIP21: Bind DE 2 to eport 0x80110550
Jan 14 15:25:38 port1: 857343:FCIP21: bind de 2 in eport 0x80110550, hash = 2 num-conn: 2
Jan 14 15:25:38 port1: 857344:FCIP21: Send LINK UP to SUP
Jan 14 15:25:39 port1: 857345:FCIP21: SUP-> Set trunk mode: 2
Jan 14 15:25:39 port1: 857346:FCIP21: Change the operational mode to TRUNK
Jan 14 15:25:39 port1: 857347:FCIP21: *** Received non-eisl frame in TE mode 64 64
Port mode is TE
vsan is 1
Trunk vsans (allowed active) (1-2)
Trunk vsans (operational) (1-2)
Trunk vsans (up) (1-2)
Trunk vsans (isolated) ()
Trunk vsans (initializing) ()
Using Profile id 21 (interface GigabitEthernet2/1)
Peer Information
Peer Internet address is 10.10.10.2 and port is 3225
Special Frame is enabled
(The Special Frame is enabled. It is used for security, to verify that the remote end point of the tunnel
has the expected switch WWN.)
Peer switch WWN is 20:00:00:05:30:00:59:de
(This is the WWN of the remote switch. The WWN of a switch can be found using the show
wwn switch command.)
Maximum number of TCP connections is 2
Time Stamp is enabled, acceptable time difference 3000 ms
B-port mode disabled
TCP Connection Information
2 Active TCP connections
Control connection: Local 10.10.11.2:64792, Remote 10.10.10.2:3225
Data connection: Local 10.10.11.2:64794, Remote 10.10.10.2:3225
372 Attempts for active connections, 345 close of connections
TCP Parameters
Path MTU 1500 bytes
Current retransmission timeout is 300 ms
Round trip time: Smoothed 10 ms, Variance: 5
Advertized window: Current: 64 KB, Maximum: 64 KB, Scale: 1
Peer receive window: Current: 64 KB, Maximum: 64 KB, Scale: 1
Congestion window: Current: 2 KB, Slow start threshold: 1048560 KB
Displaying Incorrect Peer WWN when Using Special Frame with the CLI
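The following debug output shows a tunnel bring-up attempt in which the peer WWN configured for the
Special Frame is incorrect.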
module-2# Jan 14 15:14:30 port1: 855278:FCIP21: SUP-> Set Port mode 1
Jan 14 15:14:30 port1: 855279:FCIP21: SUP-> Port VSAN (1) already set to same value
Jan 14 15:14:30 port1: 855280:FCIP21: SUP-> Trunk mode (1) already set to same
Jan 14 15:14:30 port1: 855281:FCIP21: SUP-> Enable tunnel ADMIN UP
Jan 14 15:14:30 port1: 855282:FCIP21: Try to Bring UP the Tunnel
Jan 14 15:14:30 port1: 855283:FCIP21: Start TCP listener with peer: 10.10.10.2:3225
Jan 14 15:14:30 port1: 855284:FCIP: Create a new listener object for 10.10.11.2:3225
Jan 14 15:14:30 port1: 855285:FCIP: Create FCIP Listener with local info: 10.10.11.2:3225
Jan 14 15:14:30 port1: 855286:FCIP21: Create a DE 0xd802d240 for this tunnel
Jan 14 15:14:30 port1: 855287:FCIP21: Bind the DE 0xd802d240 [1] to tunnel LEP 0x80111570
Jan 14 15:14:30 port1: 855288:FCIP21: Start the active connection [1] to 10.10.10.2:3225
Jan 14 15:14:30 port1: 855289:FCIP21: Create a DE 0xd802d200 for this tunnel
Jan 14 15:14:30 port1: 855290:FCIP21: Bind the DE 0xd802d200 [2] to tunnel LEP 0x80111570
Jan 14 15:14:30 port1: 855291:FCIP21: Start the active connection [2] to 10.10.10.2:3225
Jan 14 15:14:30 port1: 855292:FCIP21: Active Connect creation SUCCEEDED [1]
Jan 14 15:14:30 port1: 855293:FCIP21: Bind DE 1 to TCP-hdl 0xd8072c00
Jan 14 15:14:30 port1: 855294:FCIP21: Setup for Special Frame handling: I'm Originator
Jan 14 15:14:30 port1: 855295:FCIP21: Send the SF as Originator & wait for response
Jan 14 15:14:30 port1: 855296:FCIP21: Setup timer to wait for SF
Jan 14 15:14:30 port1: 855297:FCIP21: Active Connect creation SUCCEEDED [2]
Jan 14 15:14:30 port1: 855298:FCIP21: Bind DE 2 to TCP-hdl 0xd8072000
Jan 14 15:14:30 port1: 855299:FCIP21: Setup for Special Frame handling: I'm Originator
Jan 14 15:14:30 port1: 855300:FCIP21: Send the SF as Originator & wait for response
Jan 14 15:14:30 port1: 855301:FCIP21: Setup timer to wait for SF
Jan 14 15:14:30 port1: 855302:FCIP21: TCP Received a close connection [1] reason 1
Jan 14 15:14:30 port1: 855303:FCIP21: Delete the DE [1]0xd802d240
Figure 10-10 shows a trace of an incorrect remote switch WWN using a Special Frame.
Figure 10-10 Trace of Incorrect Remote Switch WWN Using a Special Frame
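A WWN mismatch like the one traced here can be caught before the tunnel is brought up by comparing the peer-wwn value configured on one switch with the show wwn switch output of the other. The Python sketch below is an illustration only; the function names are hypothetical, and the sample strings are taken from the Special Frame configuration example in this chapter.

```python
import re

WWN = r"(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}"


def switch_wwn(show_wwn_switch: str) -> str:
    """Extract the WWN from 'show wwn switch' output."""
    match = re.search(r"Switch WWN is (" + WWN + r")", show_wwn_switch)
    if match is None:
        raise ValueError("switch WWN not found")
    return match.group(1).lower()


def configured_peer_wwn(config_line: str) -> str:
    """Extract the peer WWN from a 'special-frame peer-wwn' command."""
    match = re.search(r"special-frame peer-wwn (" + WWN + r")", config_line)
    if match is None:
        raise ValueError("special-frame command not found")
    return match.group(1).lower()


def special_frame_matches(local_config_line: str, remote_show_wwn: str) -> bool:
    return configured_peer_wwn(local_config_line) == switch_wwn(remote_show_wwn)


mds1_config = "special-frame peer-wwn 20:00:00:0b:5f:d5:9f:c0 profile-id 1"
mds2_wwn = "Switch WWN is 20:00:00:0b:5f:d5:9f:c0"
print(special_frame_matches(mds1_config, mds2_wwn))  # True
```

The same comparison should be made in the other direction, using the configuration of the second switch and the WWN of the first.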
Figure 10-11 shows a successful iSCSI login for the Windows 2000 driver.
On Solaris systems, a successful login is recorded in the /var/adm/messages file, and should look
similar to the following example:
Mar 14 12:53:23 ca-sun1 iscsid[12745]: [ID 702911 daemon.notice] discovery process for
172.22.91.223 finished, exiting
Mar 14 12:58:45 ca-sun1 iscsid[12802]: [ID 448557 daemon.notice] logged into
DiscoveryAddress 172.22.91.223:3260 isid 023d0040
Mar 14 12:58:45 ca-sun1 iscsid[12802]: [ID 702911 daemon.notice] iSCSI target 2 =
iqn.com.domainname.vrrp-11.gw.21000020375aff77 at0
Mar 14 12:58:45 ca-sun1 iscsid[12809]: [ID 529321 daemon.notice] logged into target
iqn.com.domainname.vrrp-11.gw.21000020375aff77 7
Mar 14 12:58:45 ca-sun1 iscsid[12802]: [ID 702911 daemon.notice] iSCSI target 3 =
iqn.com.domainname.vrrp-11.gw.21000020374baf02 at0
Mar 14 12:58:45 ca-sun1 iscsid[12810]: [ID 529321 daemon.notice] logged into target
iqn.com.domainname.vrrp-11.gw.21000020374baf02 7
Figure 10-12 shows a failed iSCSI login for the Windows 2000 driver.
On Solaris systems, a failed login is recorded in the /var/adm/messages file and should look similar
to the following example:
Mar 14 11:44:42 ca-sun1 iscsid[12561]: [ID 702911 daemon.notice] login rejected: initiator
error (01)
Mar 14 11:44:42 ca-sun1 iscsid[12561]: [ID 702911 daemon.error] Hard discovery login
failure to 172.22.91.223:3260 - exiting
Mar 14 11:44:42 ca-sun1 iscsid[12561]: [ID 702911 daemon.notice] discovery process for
172.22.91.223 finished, exiting
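The successful and failed login messages shown above can be told apart with a simple text scan. The following Python sketch assumes the message wording shown in these examples; the function name and the returned dictionary keys are hypothetical.

```python
import re


def summarize_iscsid(log_text: str) -> dict:
    """Count iscsid login outcomes in text taken from /var/adm/messages."""
    # \s+ also matches a line break, so wrapped messages are handled.
    targets = re.findall(r"logged into target\s+(\S+)", log_text)
    rejected = len(re.findall(r"login rejected", log_text))
    hard_failures = len(re.findall(r"login\s+failure", log_text))
    return {"targets": targets, "rejected": rejected, "hard_failures": hard_failures}


good = """Mar 14 12:58:45 ca-sun1 iscsid[12809]: [ID 529321 daemon.notice] logged into target
iqn.com.domainname.vrrp-11.gw.21000020375aff77 7"""

bad = """Mar 14 11:44:42 ca-sun1 iscsid[12561]: [ID 702911 daemon.notice] login rejected: initiator
error (01)
Mar 14 11:44:42 ca-sun1 iscsid[12561]: [ID 702911 daemon.error] Hard discovery login
failure to 172.22.91.223:3260 - exiting"""

print(summarize_iscsid(good))
print(summarize_iscsid(bad))
```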
username:iscsiuser
secret:1234567812345678
Verifying Matching RADIUS Key and Port for Authentication and Accounting with the CLI
Execute the show radius-server command to make sure that the RADIUS key and the ports for authentication
and accounting match exactly what is configured on the RADIUS server.
switch# show radius-server
retransmission count:3
timeout value:5
Adjust the RADIUS timeout and retransmission count as needed; their default values are 1 second and
1 retransmission.
Figure 10-13 shows a Windows-based RADIUS server configuration.
If the items shown above match, verify that the client username and password match those in the RADIUS
database.
The following example shows the results of the debug security radius command, if the iSCSI client logs
in successfully.
switch#
switch# Mar 4 23:16:20 securityd: received CHAP authentication request for user002
Mar 4 23:16:20 securityd: RADIUS is enabled, hence it will be tried first for CHAP
authentication
Mar 4 23:16:20 securityd: reading RADIUS configuration
Mar 4 23:16:20 securityd: opening radius configuration for group:default
Mar 4 23:16:20 securityd: opened the configuration successfully
Mar 4 23:16:20 securityd: GET request for RADIUS global config
Mar 4 23:16:20 securityd: got back the return value of global radius configuration
operation:success
Mar 4 23:16:20 securityd: closing RADIUS pss configuration
Mar 4 23:16:20 securityd: opening radius configuration for group:default
Mar 4 23:16:20 securityd: opened the configuration successfully
Mar 4 23:16:20 securityd: GETNEXT request for radius index:0 addr:
Mar 4 23:16:20 securityd: got some reply from 171.71.49.197
Mar 4 23:16:20 securityd: verified the response from:171.71.49.197
Mar 4 23:16:20 securityd: RADIUS server sent accept for authentication request for
user002
Mar 4 23:16:25 securityd: received CHAP authentication request for user002
Mar 4 23:16:25 securityd: RADIUS is enabled, hence it will be tried first for CHAP
authentication
Mar 4 23:16:25 securityd: reading RADIUS configuration
Mar 4 23:16:25 securityd: opening radius configuration for group:default
Mar 4 23:16:25 securityd: opened the configuration successfully
Mar 4 23:16:25 securityd: GET request for RADIUS global config
Mar 4 23:16:25 securityd: got back the return value of global radius configuration
operation:success
Mar 4 23:16:25 securityd: closing RADIUS pss configuration
Mar 4 23:16:25 securityd: opening radius configuration for group:default
Mar 4 23:16:25 securityd: opened the configuration successfully
Mar 4 23:16:25 securityd: GETNEXT request for radius index:0 addr:
Mar 4 23:16:25 securityd: got some reply from 171.71.49.197
Mar 4 23:16:25 securityd: verified the response from:171.71.49.197
Mar 4 23:16:25 securityd: RADIUS server sent accept for authentication request for
user002
Mar 4 23:16:25 securityd: got some reply from 171.71.49.197
Mar 4 23:16:25 securityd: verified the response from:171.71.49.197
Mar 4 23:16:25 securityd: RADIUS server sent accept for authentication request for
user002
The example above shows that the iSCSI client has been authenticated three times: first for the switch
login, and the second and third times for the SCSI drive login. The switch sends RADIUS attributes 1, 3,
4, 5, 6, 60, and 61 to the RADIUS server. The RADIUS server only needs to respond with an accept or a
reject.
The following example shows a radius authentication.
Displaying the Debug Output for RADIUS Authentication Request Routing with the CLI
The following example shows the output from the debug security radius command.
switch# Mar 5 00:51:13 securityd: received CHAP authentication request for user002
Mar 5 00:51:13 securityd: RADIUS is enabled, hence it will be tried first for CHAP
authentication
Mar 5 00:51:13 securityd: reading RADIUS configuration
Mar 5 00:51:13 securityd: opening radius configuration for group:default
Mar 5 00:51:13 securityd: opened the configuration successfully
Mar 5 00:51:13 securityd: GET request for RADIUS global config
Mar 5 00:51:13 securityd: got back the return value of global radius configuration
operation:success
Mar 5 00:51:13 securityd: closing RADIUS pss configuration
Mar 5 00:51:13 securityd: opening radius configuration for group:default
Mar 5 00:51:13 securityd: opened the configuration successfully
Mar 5 00:51:13 securityd: GETNEXT request for radius index:0 addr:
Mar 5 00:51:18 securityd: sending data to 171.71.49.197
Mar 5 00:51:18 securityd: waiting for response from 171.71.49.197
Mar 5 00:51:23 securityd: sending data to 171.71.49.197
Mar 5 00:51:23 securityd: waiting for response from 171.71.49.197
Mar 5 00:51:28 securityd: sending data to 171.71.49.197
Mar 5 00:51:28 securityd: waiting for response from 171.71.49.197
Mar 5 00:51:33 securityd: trying out next server
Mar 5 00:51:33 securityd: no response from RADIUS server for authentication user002
Mar 5 00:51:33 securityd: doing local chap authentication for user002
Mar 5 00:51:33 securityd: local chap authentication result for user002:user not present
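The time the switch spends waiting for an unresponsive RADIUS server before falling back to local CHAP authentication can be read from the timestamps in this debug output. The Python sketch below is a hedged illustration: the function names are hypothetical, and it assumes that the log does not cross midnight and that the message wording matches the output above.

```python
import re


def seconds_of_day(line: str) -> int:
    """Convert the HH:MM:SS timestamp of a securityd line to seconds."""
    match = re.search(r"(\d\d):(\d\d):(\d\d) securityd:", line)
    if match is None:
        raise ValueError("no securityd timestamp in line")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def radius_fallback_delay(log_text: str):
    """Seconds between the CHAP request and the 'no response' message."""
    start = end = None
    for line in log_text.splitlines():
        if start is None and "received CHAP authentication request" in line:
            start = seconds_of_day(line)
        elif "no response from RADIUS server" in line:
            end = seconds_of_day(line)
    if start is None or end is None:
        return None
    return end - start


log = """switch# Mar 5 00:51:13 securityd: received CHAP authentication request for user002
Mar 5 00:51:18 securityd: sending data to 171.71.49.197
Mar 5 00:51:23 securityd: sending data to 171.71.49.197
Mar 5 00:51:28 securityd: sending data to 171.71.49.197
Mar 5 00:51:33 securityd: no response from RADIUS server for authentication user002"""

print(radius_fallback_delay(log))  # 20 seconds in this capture
```

If this delay is longer than the iSCSI client is willing to wait for a login response, the RADIUS timeout and retransmission count are candidates for adjustment.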
VSAN 1:
-----------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
-----------------------------------------------------------------------
0x750000 N 20:23:00:a0:b8:0b:14:da (SymBios) scsi-fcp:target
0x750102 N 10:00:00:00:c9:30:ba:06 (Emulex) scsi-fcp:init
0x750105 N 20:0d:00:0b:be:77:72:42 scsi-fcp:init isc..w
0x750201 N 50:08:05:f3:00:04:96:71 scsi-fcp
0x750301 N 50:08:05:f3:00:04:96:79 scsi-fcp
0x750400 N 20:00:00:02:3d:07:05:c0 (NuSpeed) scsi-fcp:init
Useful show Commands for Debugging Static iSCSI Configuration with the CLI
The output from the following commands reflects correctly established iSCSI sessions. Execute the
same commands on your switch and compare with the output below to help identify possible issues:
• show iscsi session detail
• show iscsi stats
• show iscsi stats detail
• show fcns data vsan 5
• show flogi data vsan 5
• show iscsi remote-node iscsi-session-detail tcp-parameters
Number of connection: 1
Connection #1
Local IP address: 0xa011d64, Peer IP address: 0xa011d65
CID 0, State: LOGGED_IN
StatSN 1356, ExpStatSN 0
MaxRecvDSLength 524288, our_MaxRecvDSLength 1392
CSG 3, NSG 3, min_pdu_size 48 (w/ data 48)
AuthMethod none, HeaderDigest None (len 0), DataDigest None (len 0)
Version Min: 0, Max: 0
FC target: Up, Reorder PDU: No, Marker send: No (int 0)
Received MaxRecvDSLen key: Yes
iSCSI Stats:
Login: attempt: 16726, succeed: 114, fail: 16606, authen fail: 0
Rcvd: NOP-Out: 36164, Sent: NOP-In: 36160
NOP-In: 0, Sent: NOP-Out: 0
TMF-REQ: 28, Sent: TMF-RESP: 0
Text-REQ: 39, Sent: Text-RESP: 0
SNACK: 0
Unrecognized Opcode: 0, Bad header digest: 0
Command in window but not next: 0, exceed wait queue limit: 0
Received PDU in wrong phase: 0
FCP Stats:
iSCSI Drop:
Command: Target down 0, Task in progress 0, LUN map fail 0
CmdSeqNo not in window 0, No Exchange ID 0, Reject 0
Persistent Resv 0 Data-Out: 0, TMF-Req: 0
FCP Drop:
Xfer_rdy: 0, Data-In: 0, Response: 0
Buffer Stats:
Buffer less than header size: 48475, Partial: 2524437, Split: 3550971
Pullup give new buf: 48475, Out of contiguous buf: 0, Unaligned m_data: 0
VSAN 5:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x610002 N 20:0b:00:0b:be:77:72:42 scsi-fcp:init isc..w
0x6101e1 NL 22:00:00:20:37:c5:2d:6d (Seagate) scsi-fcp:target
0x6101e2 NL 22:00:00:20:37:c5:2e:2e (Seagate) scsi-fcp:target
0x6101e4 NL 22:00:00:20:37:c5:23:56 (Seagate) scsi-fcp:target
0x6101e8 NL 22:00:00:20:37:c5:26:0a (Seagate) scsi-fcp:target
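When comparing name server output from several switches, it can help to turn the table rows into structured records. This Python sketch parses rows in the format shown above; the function name and the dictionary keys are hypothetical, and the role is derived only from the ":init" and ":target" suffixes that appear in the FC4-TYPE:FEATURE column.

```python
import re

ROW = re.compile(
    r"^(0x[0-9a-f]{6})\s+(\S+)\s+((?:[0-9a-f]{2}:){7}[0-9a-f]{2})"
    r"\s*(?:\((\w+)\))?\s*(.*)$"
)


def parse_fcns_rows(table_text: str) -> list:
    """Parse name server rows into dictionaries; other lines are skipped."""
    entries = []
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header, separator, or VSAN line
        fcid, port_type, pwwn, vendor, features = match.groups()
        if ":init" in features:
            role = "initiator"
        elif ":target" in features:
            role = "target"
        else:
            role = "unknown"
        entries.append(
            {"fcid": fcid, "type": port_type, "pwwn": pwwn,
             "vendor": vendor, "features": features, "role": role}
        )
    return entries


table = """VSAN 5:
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
0x610002 N 20:0b:00:0b:be:77:72:42 scsi-fcp:init isc..w
0x6101e1 NL 22:00:00:20:37:c5:2d:6d (Seagate) scsi-fcp:target"""

for entry in parse_fcns_rows(table):
    print(entry["fcid"], entry["role"], entry["vendor"])
```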
Target node:
Statistics:
PDU: Command: 0, Response: 0
Bytes: TX: 0, RX: 0
Number of connection: 1
TCP parameters
Connection Local 10.1.29.100:3260, Remote 10.1.29.101:1026
Path MTU 1500 bytes
Current retransmission timeout is 310 ms
Round trip time: Smoothed 179 ms, Variance: 33
Advertized window: Current: 62 KB, Maximum: 62 KB, Scale: 0
Peer receive window: Current: 63 KB, Maximum: 63 KB, Scale: 0
Congestion window: Current: 63 KB
VSAN ID 5, FCID 0x610002
No. of FC sessions: 4
No. of iSCSI sessions: 4
If the MTU is left at its default value, the FC frame size from the target device is decreased to match
the maximum Ethernet frame size, so that packets are switched through the switch more quickly. One
point of performance tuning is therefore to increase the MTU of the IP network between the peers. In
this setup, that network is a single Catalyst switch.
Jumbo frame support was enabled for the IPS ports, and the MTU of the VLAN corresponding to these
ports was increased.
The second point is to increase the TCP window size of the iSCSI end points. Depending on the latency
between the iSCSI client and IPS, this will need fine tuning. The switch’s iSCSI configuration defines
the TCP window size in kilobytes.
Any value of 64K or more (> 65535 = 0xFFFF bytes) automatically triggers TCP window scaling
according to RFC 1323. The IPS TCP window scaling begins only when the remote peer (the iSCSI client
in this case) requests it. This means that you need to configure the TCP stack of your client to trigger
this functionality (see Figure 10-14).
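The relationship between the configured window size and the RFC 1323 scale factor can be worked out with a few lines of arithmetic. The Python sketch below computes the smallest shift count that lets a 16-bit window field represent a given window, and a bandwidth-delay product as a starting point for tuning. The function names are hypothetical, and the sketch does not claim to reproduce the selection logic of the IPS module; it is offered only because the Scale values in the displays of this chapter (1 for a 64 KB window, 4 for a 1024000-byte window) are consistent with it.

```python
MAX_SHIFT = 14  # largest window scale shift allowed by RFC 1323


def min_window_scale(window_bytes: int) -> int:
    """Smallest shift s such that (0xFFFF << s) covers window_bytes."""
    shift = 0
    while (0xFFFF << shift) < window_bytes:
        shift += 1
        if shift > MAX_SHIFT:
            raise ValueError("window too large for RFC 1323 scaling")
    return shift


def bandwidth_delay_product(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep a path of this speed and RTT full."""
    return bandwidth_bps * rtt_ms // 8000


print(min_window_scale(65535))      # fits in 16 bits, no scaling needed
print(min_window_scale(64 * 1024))  # 64 KB needs a shift of 1
print(min_window_scale(1024000))    # needs a shift of 4

# Illustrative values only: a 1 Gbps path with a 10 ms round trip time.
print(bandwidth_delay_product(1_000_000_000, 10))
```

A window smaller than the bandwidth-delay product of the path limits throughput regardless of link speed, which is why the window size needs tuning against the measured latency.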
For the FC side, depending on the direction of the traffic, the B2Bcredit of the ports corresponding to
the input interfaces (feeding/receiving traffic to/from the iSCSI side) could be increased, especially in
the case of local Gigabit Ethernet attached iSCSI clients.
Each of the above-mentioned commands is taken from the scenario in Figure 10-14. The important
sections of the displays are italicized or shown in bold.
[Figure: lab topology. Labels recovered from the figure: MDS 9216_Top and MDS 9216_Bottom (both running 1.1(0.133c)), E-ISL, Gigabit Ethernet port 2/1, FC port 1/3, Shark-nas (10.48.69.233), Catalyst 6000, HBA, ESS 2105/F20, C4.]
Lab Setup
This is the lab setup that was used in collecting the performance-related information.
The server was an IBM Pentium III server with dual CPUs at 1.13 GHz.
The tcp window-size at both ends was set to 1 MB (1024 KB).
The IBM ESS Shark had a hardcoded B2B value of 64 (not configurable).
The fcrxbbcredit on the corresponding switch port (fc1/3) was set to the same value.
C4 and C8 represent the corresponding port WWNs (pWWNs) of the IBM Shark storage
subsystem. See below for the full pWWN:
C4 = 50:05:07:63:00:c4:94:4c (in VSAN 778)
interface GigabitEthernet2/1
ip address 10.48.69.251 255.255.255.192
iscsi authentication none
no shutdown
vrrp 1
priority 110
address 10.48.69.250
(This is the iSCSI target IP address for the Windows iSCSI client.)
no shutdown
interface iscsi2/1
tcp pmtu-enable
tcp window-size 1024
(This increases the receive window size of the IPS module. The value is in kilobytes.)
tcp sack-enable
no shutdown
VSAN 777:
--------------------------------------------------------------------------
FCID TYPE PWWN (VENDOR) FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x610000 N 50:05:07:63:00:c8:94:4c (IBM) scsi-fcp:target fc..
0x610001 N 20:05:00:0c:30:6c:24:42 scsi-fcp:init isc..w
MDS_BOTTOM#
MDS_BOTTOM# show interface iscsi 2/1
iscsi2/1 is up
Hardware is GigabitEthernet
Port WWN is 20:41:00:0c:30:57:5e:c0
Admin port mode is ISCSI
Port mode is ISCSI
Speed is 1 Gbps
Number of iSCSI session: 2, Number of TCP connection: 2
Configured TCP parameters
Local Port is 3260
PMTU discover is enabled (default)
(This is especially required if there may be devices without jumbo support in the path. The initial TCP
3-way handshake will establish a session with a high MSS value (provided both the IPS module and the
iSCSI client are configured/capable) even if there are devices without jumbo frame support in the path.
Without PMTU discovery, this will create problems.)
Keepalive-timeout 60
Initial-retransmit-time 300
(If there is high delay between the peers, this is one of the parameters that can be adjusted. There is no
exact formula; use trial and error to find the optimum value for your network. Try lower values as
well as higher ones, and take hints from the show ips stats tcp display.)
Max-retransmissions 8
Window-size 1024000
Sack is enabled
Forwarding mode: pass-thru
5 minutes input rate 410824 bits/sec, 51353 bytes/sec, 1069 frames/sec
5 minutes output rate 581291520 bits/sec, 72661440 bytes/sec, 53302 frames/sec
iSCSI statistics
1072393 packets input, 51482588 bytes
1072305 Command pdus, 0 Data-out pdus, 0 Data-out bytes, 0 fragments
53430805 packets output, 72837086312 bytes
1072273 Response pdus (with sense 9), 0 R2T pdus
52358444 Data-in pdus, 70272402880 Data-in bytes
Target node:
Statistics:
PDU: Command: 0, Response: 0
Bytes: TX: 0, RX: 0
Number of connection: 1
TCP parameters
Local 10.48.69.250:3260, Remote 10.48.69.233:1026
Path MTU: 1500 bytes
Retransmission timeout: 300 ms
Round trip time: Smoothed 150 ms, Variance: 31
Advertized window: Current: 998 KB, Maximum: 1000 KB, Scale: 4
Peer receive window: Current: 1000 KB, Maximum: 1000 KB, Scale: 4
Caution Editing the registry is a very high risk operation and can render the system unusable, requiring a
reinstallation of the whole operating system. Only advanced users should perform this operation.
(Better throughput can be achieved if the MTU of both the client NIC and the IPS Gigabit
interface is raised, provided the network in between supports jumbo frames.)
Port mode is IPS
Speed is 1 Gbps
Beacon is turned off
5 minutes input rate 3957384 bits/sec, 494673 bytes/sec, 6716 frames/sec
5 minutes output rate 609420144 bits/sec, 76177518 bytes/sec, 53267 frames/sec
6979248 packets input, 514206826 bytes
0 multicast frames, 0 compressed
0 input errors, 0 frame, 0 overrun 0 fifo
55551272 packets output, 79456286344 bytes, 0 underruns
0 output errors, 0 collisions, 0 fifo
0 carrier errors
(The lower the better. Increasing values would show that the IP network in the middle has issues,
or that the TCP peer has problems ACKing the data that IPS sends to it.)
0 retransmitted while on ethernet send queue, 0 packets split
(Packets split shows IP-level fragmentation. It increases if the MTU of this interface is
higher than the MSS of the iSCSI client; for example, the client MTU is left at the default 1500
(MSS = 1460), but the IPS Gigabit MTU is changed to 2500.)
3 delayed acks sent
TCP receive stats
7068115 segments, 1061853 data packets in sequence, 54245464 bytes in sequence
0 predicted ack, 187 predicted data
0 bad checksum, 0 multi/broadcast, 0 bad offset
0 no memory drops, 0 short segments
0 duplicate bytes, 0 duplicate packets
0 partial duplicate bytes, 0 partial duplicate packets
0 out-of-order bytes, 0 out-of-order packets
0 packet after window, 0 bytes after window
0 packets after close
7067879 acks, 76746255713 ack bytes, 0 ack toomuch, 21 duplicate acks
0 ack packets left of snd_una, 0 non-4 byte aligned packets
5980106 window updates, 0 window probe
50 pcb hash miss, 0 no port, 0 bad SYN, 0 paws drops
TCP Connection Stats
0 attempts, 24 accepts, 24 established
22 closed, 2 drops, 0 conn drops
0 drop in retransmit timeout, 0 drop in keepalive timeout
0 drop in persist drops, 0 connections drained
TCP Miscellaneous Stats
7054414 segments timed, 7067879 rtt updated
0 retransmit timeout, 0 persist timeout
19 keepalive timeout, 19 keepalive probes
TCP SACK Stats
0 recovery episodes, 54218621 data packets, 77791012992 data bytes
0 data packets retransmitted, 0 data bytes retransmitted
1 connections closed, 0 retransmit timeouts
TCP SYN Cache Stats
24 entries, 24 connections completed, 0 entries timed out
0 dropped due to overflow, 0 dropped due to RST
0 dropped due to ICMP unreach, 0 dropped due to bucket overflow
0 abort due to no memory, 0 duplicate SYN, 2 no-route SYN drop
0 hash collisions, 0 retransmitted
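The packets split annotation earlier in this output relates the interface MTU to the MSS of the iSCSI client. The following is a minimal sketch of that arithmetic, assuming IPv4 and TCP headers of 20 bytes each with no options (which matches the 1500 => 1460 figure quoted above). The function names are illustrative only and are not part of SAN-OS.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
TCP_HEADER_BYTES = 20   # assumption: TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Return the largest TCP payload that fits in one IP packet of size mtu."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def split_expected(ips_mtu: int, client_mtu: int) -> bool:
    """True when the IPS side could build segments larger than the client's MSS."""
    return mss_for_mtu(ips_mtu) > mss_for_mtu(client_mtu)


print(mss_for_mtu(1500))           # client left at the default MTU
print(split_expected(2500, 1500))  # IPS MTU raised, client unchanged
```

A True result only indicates that the packets split counter is likely to climb; the counter itself remains the authoritative check.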
nWWN: 50:05:07:63:00:c8:94:4c
Session state: LOGGED_IN
1 iSCSI sessions share this FC session
Target: shark_nas
Negotiated parameters
RcvDataFieldSize 2048 our_RcvDataFieldSize 1392
MaxBurstSize 0, EMPD: FALSE
Random Relative Offset: FALSE, Sequence-in-order: Yes
Statistics:
PDU: Command: 0, Response: 1612007
MDS_BOTTOM#
Displaying the Effects of Changing the Gigabit MTU on the FC RcvDataFieldSize with the CLI
The following example shows the effect of changing the Gigabit MTU on FC RcvDataFieldSize.
interface GigabitEthernet2/1
ip address 10.48.69.249 255.255.255.192
iscsi authentication none
switchport mtu 2440
no shutdown
vrrp 1
address 10.48.69.250
no shutdown
Target node:
Statistics:
PDU: Command: 0, Response: 0
Bytes: TX: 0, RX: 0
Number of connection: 1
TCP parameters
Local 10.48.69.250:3260, Remote 10.48.69.233:1026
Path MTU: 2440 bytes
Retransmission timeout: 420 ms
Round trip time: Smoothed 94 ms, Variance: 83
Advertized window: Current: 999 KB, Maximum: 1000 KB, Scale: 4
Peer receive window: Current: 1024 KB, Maximum: 1024 KB, Scale: 4
Congestion window: Current: 11 KB
VSAN ID 777, FCID 0x700003
No. of FC sessions: 1
No. of iSCSI sessions: 1
Verifying that the Host is Configured for High MTU/MSS with the CLI
To get the real benefit of this increased MTU and higher FC Frame Size, the path between the iSCSI
client and the IPS iSCSI interface (as well as the host NIC) has to be capable of supporting this high
MTU.
If you do not have access to the host, one way to see whether the host (as well as the path in the middle)
is also configured for high MTU/MSS is to check the packets split field in the show ips stats tcp display.
However, this is a generic display for all TCP sessions. That is, if you have some hosts with high
MTU-capable NICs and some others without, it may be difficult to assess which is which.
MDS_Top# show ips stats tcp interface gigabitethernet 2/1 detail (truncated output)
TCP Statistics for port GigabitEthernet2/1
TCP send stats
10 segments, 240 bytes
5 data, 5 ack only packets
0 control (SYN/FIN/RST), 0 probes, 0 window updates
0 segments retransmitted, 0 bytes
0 retransmitted while on ethernet send queue, 0 packets split
...
Afterward, traffic starts flowing from the FC storage towards the server that is connected via iSCSI to
the IPS.
MDS_Top# show ips stats tcp interface gigabitethernet 2/1 detail
TCP Statistics for port GigabitEthernet2/1
TCP send stats
715535 segments, 943511612 bytes
712704 data, 2831 ack only packets
0 control (SYN/FIN/RST), 0 probes, 0 window updates
0 segments retransmitted, 0 bytes
0 retransmitted while on ethernet send queue, 345477 packets split
...
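When the two captures above are saved as text, the packets split counter can be compared mechanically. The following is a rough sketch, assuming the output has been logged to a string; the helper name packets_split is hypothetical, and the sample lines are copied from the two displays above.

```python
import re


def packets_split(show_output: str) -> int:
    """Extract the 'packets split' counter from 'show ips stats tcp' text."""
    match = re.search(r"(\d+) packets split", show_output)
    if match is None:
        raise ValueError("no 'packets split' counter found")
    return int(match.group(1))


# Lines taken from the displays before and after traffic started flowing.
before = "0 retransmitted while on ethernet send queue, 0 packets split"
after = "0 retransmitted while on ethernet send queue, 345477 packets split"

delta = packets_split(after) - packets_split(before)
print(f"packets split grew by {delta}")
```

A counter that grows between captures suggests that at least one host or path segment is not using the higher MTU.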
CHAPTER 11
Troubleshooting IPsec
This chapter describes procedures used to troubleshoot IP security (IPsec) and Internet Key Exchange
(IKE) encryption in the Cisco MDS 9000 Family Switch products. It includes the following sections:
• Overview
• Troubleshooting IPsec Issues
Overview
The IP security (IPsec) protocol is a framework of open standards that provides data confidentiality, data
integrity, and data authentication between participating peers. It is developed by the Internet Engineering
Task Force (IETF). IPsec provides security services at the IP layer, including protecting one or more data
flows between a pair of hosts, between a pair of security gateways, or between a security gateway and a
host. The overall IPsec implementation is per the latest version of RFC 2401. Cisco MDS SAN-OS IPsec
implements RFC 2402 through RFC 2410.
[Figure: MDS A (10.10.100.231) and MDS C (10.10.100.232) connected by an FCIP tunnel (labeled Tunnel 2)]
Step 1 Ensure that the preshared keys are identical on each switch. Issue the show crypto ike domain ipsec
key command on both switches. Example command outputs for configuration shown in Figure 11-1
follow:
MDSA# show crypto ike domain ipsec key
Step 2 Ensure that at least one matching policy that has the same encryption algorithm, hash algorithm, and
Diffie-Hellman (DH) group, is configured on each switch. Issue the show crypto ike domain ipsec
policy command on both switches. Example command outputs for the configuration shown in
Figure 11-1 follow:
MDSA# show crypto ike domain ipsec policy
Priority 1, auth pre-shared, lifetime 86300 secs, encryption 3des, hash md5, DH group 1
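The policy comparison in Step 2 can be sketched as a set intersection over the (encryption, hash, DH group) triples. This is an illustrative sketch only: the regular expression is written against the single output line shown above, and the MDSC text is assumed to be identical because its output is not shown here.

```python
import re

POLICY_RE = re.compile(
    r"encryption (?P<encryption>[^,\s]+), "
    r"hash (?P<hash>[^,\s]+), "
    r"DH group (?P<dh_group>\d+)"
)


def parse_policies(show_output: str) -> set:
    """Return the (encryption, hash, DH group) triples found in the output."""
    return {
        (m["encryption"], m["hash"], m["dh_group"])
        for m in POLICY_RE.finditer(show_output)
    }


def matching_policies(output_a: str, output_b: str) -> set:
    """Triples present on both switches; an empty set means no common policy."""
    return parse_policies(output_a) & parse_policies(output_b)


mdsa = ("Priority 1, auth pre-shared, lifetime 86300 secs, "
        "encryption 3des, hash md5, DH group 1")
# Hypothetical peer output, assumed identical for illustration.
mdsc = mdsa

print(matching_policies(mdsa, mdsc))
```

An empty result would point to the mismatch that Step 2 is meant to catch.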
Platform                                                    IPsec
--------------------------------------------------------------------------
Microsoft iSCSI initiator, Microsoft IPsec                  3DES, SHA-1
implementation on Windows 2000 platform
Cisco iSCSI initiator, FreeSwan IPsec                       3DES, MD5
implementation on Linux platform
Step 1 Issue the show crypto map domain ipsec command and the show crypto transform-set domain ipsec
command. The following example command outputs display the fields discussed in Step 2 through
Step 7.
MDSA# show crypto map domain ipsec
Crypto Map “cmap-01” 1 ipsec
Peer = 10.10.100.232
IP ACL = acl1
permit ip 10.10.100.231 255.255.255.255 10.10.100.232 255.255.255.255
Transform-sets: tfs-02,
Security Association Lifetime: 3000 gigabytes/120 seconds
PFS (Y/N): Y
PFS Group: group5
Interface using crypto map set cmap-01:
GigabitEthernet7/1
Step 2 Ensure the ACLs are compatible on the show crypto map domain ipsec command outputs of both
switches.
Step 3 Ensure the peer configuration is correct on the show crypto map domain ipsec command outputs of
both switches.
Step 4 Ensure the transform sets are compatible on the show crypto transform-set domain ipsec command
outputs of both switches.
Step 5 Ensure that the PFS settings on the show crypto map domain ipsec command outputs are configured
the same on both switches.
Step 6 Ensure the security association (SA) lifetime settings on the show crypto map domain ipsec command
outputs are large enough to avoid excessive re-keys (the default settings ensure this).
Step 7 Ensure that the crypto map set is applied to the correct interface on the show crypto map domain ipsec
command outputs of both switches.
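Several of the checks in Step 2 through Step 7 amount to comparing fields from the two show crypto map domain ipsec outputs. The sketch below covers three of them under stated assumptions: "compatible" ACLs are taken to mean mirror-image permit entries, the MDSC values are invented for illustration because its output is not shown above, and the CryptoMapEntry type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CryptoMapEntry:
    local_ip: str    # address of the interface the map is applied to
    peer: str        # "Peer =" field
    acl_source: str  # source address in the permit entry
    acl_dest: str    # destination address in the permit entry
    pfs_group: str   # "PFS Group:" field, or "" when PFS is off


def mismatches(a: CryptoMapEntry, b: CryptoMapEntry) -> list:
    """Return human-readable problems found when comparing both ends."""
    problems = []
    if a.peer != b.local_ip or b.peer != a.local_ip:
        problems.append("peer addresses do not point at each other")
    if (a.acl_source, a.acl_dest) != (b.acl_dest, b.acl_source):
        problems.append("ACL entries are not mirror images")
    if a.pfs_group != b.pfs_group:
        problems.append("PFS settings differ")
    return problems


mdsa = CryptoMapEntry("10.10.100.231", "10.10.100.232",
                      "10.10.100.231", "10.10.100.232", "group5")
# Assumed values for MDSC; its crypto map output is not shown above.
mdsc = CryptoMapEntry("10.10.100.232", "10.10.100.231",
                      "10.10.100.232", "10.10.100.231", "group5")

print(mismatches(mdsa, mdsc) or "no mismatches found")
```

Transform sets, SA lifetimes, and the interface binding still need to be checked by reading the command outputs directly.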
Step 1 Issue the show crypto spd domain ipsec command on both switches to display the SPD. The example
command outputs follow:
MDSA# show crypto spd domain ipsec
Policy Database for interface:GigabitEthernet7/1, direction:Both
# 0: deny udp any port eq 500 any <-----------Clear text policies for IKE
# 1: deny udp any any port eq 500 <-----------Clear text policies for IKE
# 2: permit ip 10.10.100.231 255.255.255.255 10.10.100.232 255.255.255.255
# 127: deny ip any any <------------Clear text policy for all other traffic
Step 2 Issue the show ipsec internal crypto-accelerator interface gigabitethernet slot/port spd inbound
command on both switches to display SPD information from the crypto-accelerator.
Note To issue commands with the internal keyword, you must have an account that is a member of the
network-admin group.
Inbound Policy 1 :
Source IP Address :*
Destination IP Address :*
Source port :*, Destination port :500 Protocol UDP
Physical port:0/0, Vlan_id:0/0
Action cleartext
Inbound Policy 2 :
Source IP Address :10.10.100.232/255.255.255.255
Destination IP Address :10.10.100.231/255.255.255.255
Source port :*, Destination port :* Protocol *
Physical port:0/1, Vlan_id:0/4095
Action ipsec
MDSC# show ipsec internal crypto-accelerator interface gigabitethernet 1/2 spd inbound
Inbound Policy 0 :
Source IP Address :*
Destination IP Address :*
Source port :500, Destination port :* Protocol UDP
Physical port:0/0, Vlan_id:0/0
Action cleartext
Inbound Policy 1 :
Source IP Address :*
Destination IP Address :*
Source port :*, Destination port :500 Protocol UDP
Physical port:0/0, Vlan_id:0/0
Action cleartext
Inbound Policy 2 :
Source IP Address :10.10.100.231/255.255.255.255
Destination IP Address :10.10.100.232/255.255.255.255
Source port :*, Destination port :* Protocol *
Physical port:1/1, Vlan_id:0/4095
Action ipsec
Step 1 Issue the show interface gigabitethernet command on both switches. Verify that the interfaces are up
and their internet addresses are correct. Issue the no shutdown command if necessary. The example
command outputs follow:
MDSA# show interface gigabitethernet 7/1
GigabitEthernet7/1 is up
Hardware is GigabitEthernet, address is 0005.3001.804e
Internet address is 10.10.100.231/24
MTU 1500 bytes
Port mode is IPS
Speed is 1 Gbps
Beacon is turned off
Auto-Negotiation is turned on
5 minutes input rate 7728 bits/sec, 966 bytes/sec, 8 frames/sec
5 minutes output rate 7968 bits/sec, 996 bytes/sec, 8 frames/sec
7175 packets input, 816924 bytes
0 multicast frames, 0 compressed
0 input errors, 0 frame, 0 overrun 0 fifo
7285 packets output, 840018 bytes, 0 underruns
0 output errors, 0 collisions, 0 fifo
0 carrier errors
Step 2 Issue the show interface fcip command on both switches. Verify that each interface is using the correct
profile, the peer internet addresses are configured correctly, and the FCIP tunnels are compatible. Issue
the no shutdown command if necessary. The example command outputs follow:
MDSA# show interface fcip 1
fcip1 is trunking
Hardware is GigabitEthernet
Port WWN is 21:90:00:0d:ec:02:64:80
Peer port WWN is 20:14:00:0d:ec:08:5f:c0
Admin port mode is auto, trunk mode is on
Port mode is TE
Port vsan is 1
Speed is 1 Gbps
Trunk vsans (admin allowed and active) (1,100,200,302-303,999,3001-3060)
Trunk vsans (up) (1)
Trunk vsans (isolated) (100,200,302-303,999,3001-3060)
Trunk vsans (initializing) ()
Using Profile id 1 (interface GigabitEthernet7/1)
Peer Information
Peer Internet address is 10.10.100.232 and port is 3225
FCIP tunnel is protected by IPSec
Write acceleration mode is off
Step 1 Issue the show crypto sad domain ipsec command to verify the current peer, mode, and the inbound
and outbound index of each switch. The example command outputs follow:
MDSA# show crypto sad domain ipsec
interface:GigabitEthernet7/1
Crypto map tag:cmap-01, local addr. 10.10.100.231
protected network:
local ident (addr/mask):(10.10.100.231/255.255.255.255)
remote ident (addr/mask):(10.10.100.232/255.255.255.255)
current_peer:10.10.100.232
local crypto endpt.:10.10.100.231, remote crypto endpt.:10.10.100.232
mode:tunnel, crypto algo:esp-3des, auth algo:esp-md5-hmac
tunnel id is:1
current outbound spi:0x822a202 (136487426), index:1
lifetimes in seconds::3600
lifetimes in bytes::483183820800
current inbound spi:0x38147002 (940863490), index:1
lifetimes in seconds::3600
lifetimes in bytes::483183820800
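The SPI and index values in this output are the ones used to look up the SA in the crypto-accelerator. The following is a small sketch that pulls them out of logged text and checks that the hexadecimal and decimal forms of each SPI agree. The regular expression is written against the two lines shown above, and parse_spis is a hypothetical helper name.

```python
import re

SPI_RE = re.compile(
    r"current (inbound|outbound) spi:(0x[0-9a-fA-F]+) \((\d+)\), index:(\d+)"
)


def parse_spis(sad_output: str) -> dict:
    """Map direction -> (spi, index), checking hex and decimal forms agree."""
    result = {}
    for direction, spi_hex, spi_dec, index in SPI_RE.findall(sad_output):
        spi = int(spi_hex, 16)
        if spi != int(spi_dec):
            raise ValueError(f"{direction} SPI hex/decimal mismatch")
        result[direction] = (spi, int(index))
    return result


sample = """\
current outbound spi:0x822a202 (136487426), index:1
current inbound spi:0x38147002 (940863490), index:1
"""

print(parse_spis(sample))
```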
Step 2 The SA index can be used to look at the SA in the crypto-accelerator. Issue the show ipsec internal
crypto-accelerator interface gigabitethernet slot/port sad [inbound | outbound] sa-index command
to display the inbound or outbound SA information. The hard limit bytes and soft limit bytes fields
display the lifetime in bytes. The hard limit expiry secs and the soft limit expiry secs fields display the
lifetime in seconds.
Note To issue commands with the internal keyword, you must have an account that is a member of the
network-admin group.
MDSC# show ipsec internal crypto-accelerator interface gigabitethernet 1/2 sad inbound 513
Inbound SA 513 :
Mode :Tunnel, flags:0x492300000000000
MDSA# show ipsec internal crypto-accelerator interface gigabitethernet 7/1 sad outbound 1
Outbound SA 1 :
SPI 136487426 (0x822a202), MTU 1400, MTU_delta 4
Mode :Tunnel, flags:0x92100000000000
IPsec mode is ESP
Tunnel options index:0, ttl:0x40, flags:0x1
Encrypt algorithm is DES/3DES
Auth algorithm is MD5
Tunnel source ip address 10.10.100.231
Tunnel destination ip address 10.10.100.232
Hard limit 483183820800 bytes
Soft limit 376883380224 bytes
SA byte count 874544 bytes <----Elapsed traffic
SA user byte count 874544 bytes <----Elapsed traffic
Packet count 7150
Hard limit expiry 1100652419 secs (since January 1, 1970), remaining 2089 secs
Soft limit expiry 1100652384 secs (since January 1, 1970), remaining 2054 secs
Outbound MAC table index:1
Sequence number:7151
MDSC# show ipsec internal crypto-accelerator interface gigabitethernet 1/2 sad outbound 513
Outbound SA 513 :
SPI 940863490 (0x38147002), MTU 1400, MTU_delta 4
Mode :Tunnel, flags:0x92100000000000
IPsec mode is ESP
Tunnel options index:0, ttl:0x40, flags:0x1
Encrypt algorithm is DES/3DES
Auth algorithm is MD5
Tunnel source ip address 10.10.100.232
Tunnel destination ip address 10.10.100.231
Hard limit 483183820800 bytes
Soft limit 449360953344 bytes
SA byte count 855648 bytes <----Elapsed traffic
SA user byte count 855648 bytes <----Elapsed traffic
Packet count 7122
Hard limit expiry 1100652419 secs (since January 1, 1970), remaining 2064 secs
Soft limit expiry 1100652397 secs (since January 1, 1970), remaining 2042 secs
Outbound MAC table index:125
Sequence number:7123
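The byte and time limits in these outputs are easier to judge once converted. The following sketch uses the values from Outbound SA 1 above. It assumes the byte limits are expressed in units of 2^30 bytes (dividing by 2^30 gives round numbers, which suggests but does not prove this), and the helper names are illustrative.

```python
from datetime import datetime, timezone

GIB = 2 ** 30  # assumption: limits are multiples of 2**30 bytes


def bytes_used_fraction(byte_count: int, hard_limit_bytes: int) -> float:
    """Fraction of the byte lifetime already consumed by the SA."""
    return byte_count / hard_limit_bytes


def expiry_as_utc(expiry_secs: int) -> datetime:
    """Convert 'secs (since January 1, 1970)' into an aware UTC datetime."""
    return datetime.fromtimestamp(expiry_secs, tz=timezone.utc)


hard_limit = 483183820800  # "Hard limit" from Outbound SA 1
soft_limit = 376883380224  # "Soft limit" from Outbound SA 1

print(hard_limit / GIB, soft_limit / GIB)
print(f"{bytes_used_fraction(874544, hard_limit):.8%} of the byte lifetime used")
print(expiry_as_utc(1100652419).isoformat())
print(1100652419 - 1100652384, "seconds between soft and hard expiry")
```

In this capture the SA is far from its byte limit, so any re-key would be driven by the time-based lifetime.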
Step 1 Verify that traffic was flowing when the soft SA lifetime expired.
Step 2 Verify that the configurations are still compatible.
The show crypto global domain ipsec interface gigabitethernet slot/port command output displays
interface level statistics. Example command output follows:
MDSA# show crypto global domain ipsec interface gigabitethernet 7/1
IPSec interface statistics:
IKE transaction stats:0 num
Inbound SA stats:1 num, 512 max
Outbound SA stats:1 num, 512 max
CHAPTER 12
Troubleshooting Fabric Manager Problems
This chapter contains some common issues you may experience while using Cisco Fabric Manager, and
provides solutions.
This chapter contains the following sections:
• Tips for Troubleshooting Fabric Manager Problems
• Tips for Using Fabric Manager
Symptom: The Map Shows Two Switches Where Only One Switch Exists
If two switches show on your map, but you only have one switch, it may be that you have two switches
in a non-contiguous VSAN with the same domain ID. The Fabric Manager uses the VSAN ID and
domain ID to look up a switch, and this can cause the fabric discovery to assign links incorrectly between
these errant switches.
The workaround is to verify that all switches use unique domain IDs within the same VSAN in a
physically connected fabric. (The fabric configuration checker will do this task.)
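The uniqueness rule above can also be checked offline against an inventory export. The following is a minimal sketch, assuming a list of (switch name, VSAN ID, domain ID) tuples gathered by some other means. The inventory values are invented for illustration, and this does not reproduce how the fabric configuration checker works internally.

```python
from collections import defaultdict


def duplicate_domains(entries):
    """entries: iterable of (switch_name, vsan_id, domain_id) tuples.

    Returns {(vsan_id, domain_id): [switch names]} for every pair that is
    claimed by more than one switch.
    """
    owners = defaultdict(list)
    for switch, vsan, domain in entries:
        owners[(vsan, domain)].append(switch)
    return {key: names for key, names in owners.items() if len(names) > 1}


# Hypothetical inventory; names and IDs are made up for illustration.
inventory = [
    ("switch-a", 777, 0x61),
    ("switch-b", 777, 0x61),
    ("switch-c", 778, 0x61),
]

print(duplicate_domains(inventory))
```

The same domain ID in a different VSAN (switch-c here) is not reported, because the lookup key is the VSAN ID and domain ID together.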
Setting the Map Layout So It Stays After Restarting the Fabric Manager
If you have configured the map layout and would like to “freeze” the map so that the objects stay as they
are even after you stop Fabric Manager and restart it again, do the following:
Step 1 Right-click on a blank space in the map. You see a pop-up menu.
Step 2 Select Layout -> Fix All Nodes from the menu.
Step 1 Double-click the Java Web Start application manager icon on your Windows desktop, or choose
Program Files > Java Web Start.
Step 2 Select File > Preferences from the Java WebStart Application Manager.
Step 3 Click the Manual radio button and enter the IP address of the proxy server in the HTTP Proxy field.
Step 4 Enter the HTTP port number used by your proxy service in the HTTP Port field.
Step 5 Click OK.
Note Any devices not currently accessible (may be offline) will be purged.
APPENDIX A
Before Contacting Technical Support
This appendix describes the steps to perform before calling for technical support for any Cisco MDS
9000 Family multilayer director and fabric switch. This appendix includes the following sections:
• Steps to Perform Before Calling TAC
• Using Core Dumps
Note If you purchased Cisco support through a Cisco reseller, contact the reseller directly. If you
purchased support directly from Cisco, contact Cisco Technical Support at this URL:
http://www.cisco.com/warp/public/687/Directory/DirTAC.shtm
Note Do not reload the module or the switch at least until you have completed Step 1 below. Some logs and
counters are kept in volatile storage and will not survive a reload.
To prepare for contacting your customer support representative, follow these steps:
Step 1 Collect switch information and configuration. This should be done before and after the issue has been
resolved. The following three methods each provide the same information:
a. Select Tools > Show Tech Support in Fabric Manager. Fabric Manager can capture switch
configuration information from multiple switches simultaneously. The file can be saved on the local
PC.
b. Configure your Telnet or SSH application to log the screen output to a text file. Use the terminal
length 0 CLI command and then use the show tech-support details CLI command.
c. Use the tac-pac <filename> CLI command to redirect the output of the show tech-support details
CLI command to a file, and then gzip the file.
switch# tac-pac bootflash://showtech.switch1
If no filename is specified, the file is created as volatile:show_tech_out.gz. The file should then
be copied from the switch using the procedure outlined in the “Copying Files to or from the Switch”
section on page A-3.
Step 2 If an error occurs in Fabric Manager, take a screen shot of the error. In Windows, press Alt+PrintScreen
to capture the active window, or press only PrintScreen to capture the entire desktop. Then paste this
into a new Microsoft Paint (or similar program) session and save the file.
Step 3 Capture the exact error codes you see in the message logs from either Fabric Manager or the CLI.
a. Select the Logs tab in the Map pane in Fabric Manager or choose Switches > Events to see the
recent list of messages generated.
b. Copy the error from the message log. Display the log using either the show logging log CLI
command or the show logging last number CLI command, which shows only the last lines of the log.
Step 4 Answer the following questions before calling for technical support:
• On which switch, host bus adapter (HBA), or storage port is the problem occurring?
• Which Cisco SAN-OS software, driver versions, operating systems versions and storage device
firmware are in your fabric?
• What is the network topology? (In Fabric Manager, go to Tools > Show Tech Support and check
the Save Map check box.)
• Were any changes being made to the environment (zoning, adding modules, upgrades) prior to or at
the time of this event?
• Are there other similarly configured devices that could have this problem, but do not?
• Where was this problematic device connected (which MDS switch and interface)?
• When did this problem first occur?
• When did this problem last occur?
• How often does this problem occur?
• How many devices have this problem?
• Were any traces or debug output captured during the problem time? What troubleshooting steps have
you attempted? Which, if any, of the following tools were used?
– FC Analyzer, PAA-2, Ethereal, local or remote SPAN
– CLI debug commands
– FC traceroute, FC ping
– Fabric Manager or Device Manager tools
Step 5 Is your problem related to a software upgrade attempt?
• What was the original Cisco SAN-OS version?
• What is the new Cisco SAN-OS version?
• Did you use Fabric Manager or the CLI to attempt this upgrade?
• Please collect the output from the following commands and forward them to your customer support
representative:
– show install all status
– show system internal log install
– show system internal log install details
– show log nvram
Step 1 Choose Admin > Copy Configuration. You see the Copy Configuration dialog box.
Step 2 Set the To field to the server to which you want to copy the configuration file.
Step 3 Set the From field to running or startup configuration.
Step 4 Select the protocol you want to use to copy the file from the switch.
Step 5 Select Apply to copy the file.
To copy files to the switch using Device Manager, follow these steps:
Step 1 Choose Admin > Flash Files. You see the list of files in the chosen device and partition.
Step 2 Select Copy to copy a file. You see the copy file dialog box.
Step 3 Select the protocol you want to use to copy the file to the switch.
Step 4 Set the server address and the file that you want to copy.
Step 5 Select Apply to copy the file.
The copy CLI command supports four transfer protocols and 12 different sources for files.
ca-9506# copy ?
bootflash: Select source filesystem
core: Select source filesystem
debug: Select source filesystem
ftp: Select source filesystem
licenses Backup license files
log: Select source filesystem
modflash: Select source filesystem
nvram: Select source filesystem
running-config Copy running configuration to destination
scp: Select source filesystem
sftp: Select source filesystem
slot0: Select source filesystem
startup-config Copy startup configuration to destination
system: Select source filesystem
tftp: Select source filesystem
volatile: Select source filesystem
Use the following syntax to use secure copy (scp) as the transfer mechanism:
"scp:[//[username@]server][/path]"
To copy /etc/hosts from 172.22.36.10 using the user user1, where the destination would be
hosts.txt, use the following command:
switch# copy scp://user1@172.22.36.10/etc/hosts bootflash:hosts.txt
user1@172.22.36.10's password:
hosts 100% |*****************************| 2035 00:00
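The scp source syntax shown above can be assembled programmatically when many files need to be fetched. The following is a small sketch; the function scp_url is a hypothetical helper that only builds the string and does not run any copy.

```python
def scp_url(server: str, path: str = "", username: str = "") -> str:
    """Build a source string following scp:[//[username@]server][/path]."""
    if path and not path.startswith("/"):
        path = "/" + path
    user_part = f"{username}@" if username else ""
    return f"scp://{user_part}{server}{path}"


# Values taken from the example above: user1 on 172.22.36.10, file /etc/hosts.
print(scp_url("172.22.36.10", "/etc/hosts", "user1"))
```

The resulting string is what would be passed as the source argument of the copy command.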
Tip Back up the startup configuration to a server daily and before making any
changes. A short script can be run on the MDS switch to save and then back up
the configuration. The script needs only two commands: copy running-config
startup-config and then copy startup-config tftp://server/name. To execute the
script, use the run-script filename command.
Note The file name (indicated by jsmith_cores) must exist in the TFTP server directory.
APPENDIX B
Troubleshooting Tools and Methodology
This appendix describes the troubleshooting tools and methodology available for the Cisco MDS 9000
Family multilayer directors and fabric switches. It includes the following sections:
• Using Cisco MDS 9000 Family Tools
• Using Cisco Network Management Products
• Using Other Troubleshooting Products
• Using Host Diagnostic Tools
CLI Debug
The Cisco MDS 9000 Family switches support an extensive debugging feature set for actively
troubleshooting a storage network. Using the CLI, you can enable debugging modes for each switch
feature and view a real-time updated activity log of the control protocol exchanges. Each log entry is
time-stamped and listed in chronological order. Access to the debug feature can be limited through the
CLI roles mechanism and can be partitioned on a per-role basis. While debug commands show real-time
information, the show commands can be used to list historical information as well as real-time information.
Note You can log debug messages to a special log file, which is more secure and easier to process than sending
the debug output to the console.
By using the '?' option, you can see the options that are available for any switch feature, such as FSPF.
A log entry is created for each entered command in addition to the actual debug output. The debug output
shows a time-stamped account of activity occurring between the local switch and other adjacent
switches.
You can use the debug facility to keep track of events, internal messages, and protocol errors. However,
be careful when using the debug utility in a production environment: some options may prevent access
to the switch by generating too many messages to the console, and very CPU-intensive options
may seriously affect switch performance.
Note We recommend that you open a second Telnet or SSH session before entering any debug commands. If
the debug session overwhelms the current output window, you can use the second session to enter the
undebug all command to stop the debug message output.
The following is an example of the output from the debug flogi event command:
switch# debug flogi event interface fc1/1
Dec 10 23:40:26 flogi: current state [FLOGI_ST_FLOGI_RECEIVED]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_VALID_FLOGI]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_GET_FCID]
Dec 10 23:40:26 flogi: fu_fsm_execute: ([1]21:00:00:e0:8b:08:96:22)
Dec 10 23:40:26 flogi: current state [FLOGI_ST_GET_FCID]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_VALID_FCID]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_PERFORM_CONFIG]
Dec 10 23:40:26 flogi: fu_fsm_execute: ([1]21:00:00:e0:8b:08:96:22)
Dec 10 23:40:26 flogi: current state [FLOGI_ST_PERFORM_CONFIG]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_CONFIG_DONE_PENDING]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_PERFORM_CONFIG]
Dec 10 23:40:26 flogi: fu_fsm_execute: ([1]21:00:00:e0:8b:08:96:22)
Dec 10 23:40:26 flogi: current state [FLOGI_ST_PERFORM_CONFIG]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_RIB_RESPOSE]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_PERFORM_CONFIG]
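Because each transition in this output is logged as a current state, current event, and next state triple, a captured log can be condensed into a list of transitions. The following is a rough sketch, assuming the debug output was logged to a text file or string. The regular expression is written against the lines shown above, and the transitions helper is hypothetical.

```python
import re

LINE_RE = re.compile(
    r"flogi: (current state|current event|next state) \[(\w+)\]"
)


def transitions(debug_output: str) -> list:
    """Return (state, event, next_state) triples in the order they were logged."""
    result, pending = [], {}
    for kind, name in LINE_RE.findall(debug_output):
        pending[kind] = name
        if kind == "next state":
            result.append((pending.get("current state"),
                           pending.get("current event"),
                           name))
            pending = {}
    return result


sample = """\
Dec 10 23:40:26 flogi: current state [FLOGI_ST_FLOGI_RECEIVED]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_VALID_FLOGI]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_GET_FCID]
Dec 10 23:40:26 flogi: fu_fsm_execute: ([1]21:00:00:e0:8b:08:96:22)
Dec 10 23:40:26 flogi: current state [FLOGI_ST_GET_FCID]
Dec 10 23:40:26 flogi: current event [FLOGI_EV_VALID_FCID]
Dec 10 23:40:26 flogi: next state [FLOGI_ST_PERFORM_CONFIG]
"""

for state, event, nxt in transitions(sample):
    print(f"{state} --{event}--> {nxt}")
```

Reading the log this way makes it easier to spot a login that stalls in one state instead of progressing.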
The following is a summary of some of the common debug commands available in Cisco SAN-OS:
Table B-1 Debug Commands
Note Use the Fibre Channel ping and Fibre Channel traceroute features to troubleshoot problems with
connectivity and path choices. Do not use them to identify or resolve performance issues.
Ping and traceroute are two of the most useful tools for troubleshooting TCP/IP networking problems.
The ping utility generates a series of echo packets to a destination across a TCP/IP internetwork. When
the echo packets arrive at the destination, they are rerouted and sent back to the source. Using ping, you
can verify connectivity and latency to a particular destination across an IP routed network.
The traceroute utility operates in a similar fashion, but can also determine the specific path that a frame
takes to its destination on a hop-by-hop basis.
These tools have been migrated to Fibre Channel for use with the Cisco MDS 9000 Family switches and
are called FC ping and FC traceroute. You can access FC ping and FC traceroute from the CLI or from
Fabric Manager.
Using FC Ping
The FC ping tool:
• Checks end-to-end connectivity.
• Uses a pWWN or FCID.
FC ping allows you to ping a Fibre Channel N port or end device. (See Example B-1.) By specifying the
FCID or Fibre Channel address, you can send a series of frames to a target N port. Once these frames
reach the target device’s N port, they are looped back to the source and a timestamp is taken. FC ping
helps you to verify the connectivity and latency to an end N port. FC ping uses the PRLI Extended Link
Service and verifies the presence of a Fibre Channel entity whether the answer is positive or negative.
The FC Ping feature verifies reachability of a node by checking its end-to-end connectivity.
• Choose Tools > Ping to access FC ping using Fabric Manager.
• Invoke the FC ping feature using the CLI by providing the FC ID or the destination port WWN
information in the following ways:
switch# fcping pwwn 20:00:00:2e:c4:91:d4:54
switch# fcping fcid 0x123abc
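If you script these checks, a small helper can choose the right form of the command from the target string. The Python sketch below is illustrative only: build_fcping_command is a hypothetical name, and the accepted formats (a colon-separated 8-byte pWWN and a 6-digit hexadecimal FCID) are assumptions based on the two examples above. The CLI may accept other forms.

```python
import re

# Formats assumed from the examples above (not an exhaustive CLI grammar).
PWWN_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){7}$")
FCID_RE = re.compile(r"^0x[0-9a-fA-F]{6}$")


def build_fcping_command(target):
    """Return the fcping command line for a pWWN or FCID target string."""
    if PWWN_RE.match(target):
        return f"fcping pwwn {target.lower()}"
    if FCID_RE.match(target):
        return f"fcping fcid {target.lower()}"
    raise ValueError(f"not a recognized pWWN or FCID: {target!r}")


print(build_fcping_command("20:00:00:2e:c4:91:d4:54"))
print(build_fcping_command("0x123abc"))
```

Rejecting malformed targets before they reach the switch keeps typing errors from being mistaken for connectivity failures.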
Using FC Traceroute
Use the FC Trace feature to:
• Trace the route followed by data traffic.
• Compute inter-switch (hop-to-hop) latency.
FC traceroute identifies the path taken on a hop-by-hop basis and includes a timestamp at each hop in
both directions. (See Example B-2.) FC ping and FC traceroute are useful tools to check for network
connectivity problems or verify the path taken toward a specific destination. You can use FC traceroute
to test the connectivity of TE ports along the path between the generating switch and the switch closest
to the destination.
Choose Tools > Traceroute on Fabric Manager or use the fctrace CLI command to access this feature.
Use FC Trace by providing the FC ID, the N port, or the NL port WWN of the destination. The frames
are routed normally as long as they are forwarded through TE ports. After the frame reaches the edge of
the fabric (the F port or FL port connected to the end node with the given port WWN or the FC ID), the
frame is looped back (swapping the source ID and the destination ID) to the originator.
If the destination cannot be reached, the path discovery starts, which traces the path up to the point of
failure.
The FC Trace feature works only on TE Ports. Make sure that only TE ports exist in the path to the
destination. If there is an E Port in the path:
• The FC Trace frame will be dropped by that switch.
• The FC Trace will time out in the originator.
• Path discovery will not start.
Note The values rendered by the FC traceroute process do not reflect the actual latency across the switches.
The actual trace value interpretation is shown in the example below.
VSAN 600
--------------------------------------------------------------------------
FCID        TYPE  PWWN              (VENDOR)     FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0xeb01e8    NL    210000203767f7a2  (Seagate)    scsi-fcp:target
0xec00e4    NL    210000203767f48a  (Seagate)    scsi-fcp
0xec00e8    NL    210000203767f507  (Seagate)    scsi-fcp
2000000c30575ec0(0xfffced) --> first hop MDS on the return path from traced FCID to
originator
switch#
Note The ER state typically designates a process that has been restarted too many times, causing the system
to classify it as faulty and disable it.
Note For detailed information about using Cisco Fabric Manager, refer to the Cisco MDS 9000 Family Fabric
Manager Configuration Guide.
Note When you click on a zone or VSAN in Fabric Manager, the members of the zone or VSAN are
highlighted on the Fabric Manager Map pane.
Device Manager provides a graphic display of a specific switch and shows the status of each port on the
switch. From Device Manager, you can drill down to get detailed statistics about a specific switch or
port.
The Summary View window lets you analyze switch performance issues, diagnose problems, and change
parameters to resolve problems or inconsistencies. This view shows aggregated statistics for the active
Supervisor Module and all switch ports. Information is presented in tabular or graphical formats, with
bar, line, area, and pie chart options. You can also use the Summary View to capture the current state of
information for export to a file or output to a printer.
The Switch Health Analysis window displays any problems affecting the selected switches.
You use a policy file to define the rules to be applied when running the Fabric Checker. When you create
a policy file, the system saves the rules selected for the selected switch.
The Zone Merge Analysis window displays any inconsistencies between the zone configuration of the
two selected switches.
You can use the following options on the Fabric Manager Tools menu to verify connectivity to a selected
object or to open other management tools:
• Traceroute—Verify connectivity between two end devices that are currently selected on the Map
pane.
• Device Manager— Launch Device Manager for the switch selected on the Map pane.
• Command Line Interface—Open a Telnet or SSH session for the switch selected on the Map pane.
Use the RMON Threshold Manager to configure event thresholds that will trigger log entries or
notifications. Use either Fabric Manager or Device Manager to:
• Identify Syslog servers that will record events.
• Configure Call Home, which can issue alerts via e-mail messages or paging when specific events
occur.
The RMON groups that have been adapted for use with Fibre Channel include the AlarmGroup and
EventGroup. The AlarmGroup provides services to set alarms. Alarms can be set on one or multiple
parameters within a device. For example, an RMON alarm can be set for a specific level of CPU
utilization or crossbar utilization on a switch. The EventGroup allows configuration of events (actions
to be taken) based on an alarm condition. Supported event types include logging, SNMP traps, and
log-and-trap.
For more information about SCSI target discovery, refer to the Cisco MDS 9000 Family Configuration
Guide.
Note This tool can be effective for determining the number of LUNs exported by a storage subsystem, but it may
be ineffective when LUN Zoning/LUN Security tools are used.
Note During initial configuration of your switch, the system prompts you to define SNMP v1 or v2
community strings and to create an SNMP v3 username and password.
Cisco MDS 9000 Family switches support over 50 different MIBs, which can be divided into the
following six categories:
• IETF Standards-based Entity MIBs (for example, RFC2737 ENTITY-MIB)
These MIBs are used to report information on the physical devices themselves, such as their physical
attributes.
• Cisco-Proprietary Entity MIBs (for example, CISCO-ENTITY-FRU-CONTROL-MIB)
These MIBs are used to report additional physical device information about Cisco-only devices such
as their configuration.
• IETF IP Transport-oriented MIBs (for example, RFC2013 UDP-MIB)
These MIBs are used to report transport-oriented statistics on such protocols as IP, TCP, and UDP.
These transports are used in the management of the Cisco MDS 9000 Family through the OOB
Ethernet interface on the Supervisor module.
• Cisco-Proprietary Storage and Storage Network MIBs (for example, NAME-SERVER-MIB)
These MIBs were written by Cisco to help expose information that is discovered within a fabric to
management applications not connected to the fabric itself. In addition to exposing configuration
details for features like zoning and Virtual SANs (VSANs) via MIBs, discovered information from
sources like the FC-GS-3 Name Server can be pulled via a MIB. Additionally, MIBs are provided
to configure/enable features within the Cisco MDS 9000 Family. There are over 20 new MIBs
provided by Cisco for this information and configuration capability.
• IETF IP Storage Working Group MIBs (for example, ISCSI-MIB)
While many of these MIBs are still work-in-progress, Cisco is helping to draft such MIBs for
protocols such as iSCSI and Fibre Channel-over-IP (FCIP) to be standardized within the IETF.
• Miscellaneous MIBs (for example, SNMP-FRAMEWORK-MIB)
There are several other MIBs provided in the Cisco MDS 9000 Family switches for tasks such as
defining the SNMP framework or creating SNMP partitioned views.
You can use SNMPv3 to assign different SNMP capabilities to specific roles.
Cisco MDS 9000 Family switches also support Remote Monitoring (RMON) for Fibre Channel. RMON
provides a standard method to monitor the basic operations of network protocols providing connectivity
between SNMP management stations and monitoring agents. RMON also provides a powerful alarm and
event mechanism for setting thresholds and sending notifications based on changes in network behavior.
The RMON groups that have been adapted for use with Fibre Channel include the AlarmGroup and the
EventGroup. The AlarmGroup provides services to set alarms. Alarms can be set on one or multiple
parameters within a device. For example, you can set an RMON alarm for a specific level of CPU
utilization or crossbar utilization on a switch. The EventGroup lets you configure events that are actions
to be taken based on an alarm condition. The types of events that are supported include logging, SNMP
traps, and log-and-trap.
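The way an alarm drives an event can be pictured with a short Python sketch. This guide does not spell out the threshold semantics, so the rising/falling hysteresis below follows the general RMON alarm model in simplified form and should be read as an assumption. The function name evaluate_alarm, the sample values, and the thresholds are hypothetical.

```python
def evaluate_alarm(samples, rising, falling):
    """Return (index, kind) for each threshold crossing that would raise an event.

    Simplified model (assumption): a rising event fires when a sample reaches
    the rising threshold, and does not fire again until the value has dropped
    to the falling threshold, and vice versa.
    """
    events = []
    armed_rising = True    # the first crossing of the rising threshold fires
    armed_falling = False  # a falling event needs a prior rising event here
    for index, value in enumerate(samples):
        if value >= rising and armed_rising:
            events.append((index, "rising"))
            armed_rising, armed_falling = False, True
        elif value <= falling and armed_falling:
            events.append((index, "falling"))
            armed_rising, armed_falling = True, False
    return events


# Hypothetical CPU utilization samples, in percent.
cpu = [20, 50, 85, 90, 70, 30, 88]
print(evaluate_alarm(cpu, rising=80, falling=40))
```

Each returned crossing is the point at which the configured event type (logging, SNMP trap, or log-and-trap) would be triggered. The gap between the two thresholds keeps a value that hovers near one threshold from generating a flood of events.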
Note To configure events within an RMON group, use the Events > Threshold Manager option from Device
Manager. See the “Device Manager: RMON Threshold Manager” section on page B-15.
Using RADIUS
RADIUS is fully supported for the Cisco MDS 9000 Family switches through the Fabric Manager and
the CLI. RADIUS is a protocol used for the exchange of attributes or credentials between a head-end
RADIUS server and a client device. These attributes relate to three classes of services:
• Authentication
• Authorization
• Accounting
Authentication refers to the authentication of users for access to a specific device. You can use RADIUS
to manage user accounts for access to Cisco MDS 9000 Family switches. When you try to log into a
switch, the switch validates you with information from a central RADIUS server.
Authorization refers to the scope of access that you have once you have been authenticated. Assigned
roles for users can be stored in a RADIUS server along with a list of actual devices that the user should
have access to. Once the user has been authenticated, the switch can then refer to the RADIUS server
to determine the extent of access the user will have within the switch network.
Accounting refers to the log information that is kept for each management session in a switch. This
information may be used to generate reports for troubleshooting purposes and user accountability.
Accounting can be implemented locally or remotely (using RADIUS).
The following is an example of accounting log entries.
switch# show accounting log
Sun Dec 15 04:02:27 2002:start:/dev/pts/0_1039924947:admin
Sun Dec 15 04:02:28 2002:stop:/dev/pts/0_1039924947:admin:vsh exited normally
Sun Dec 15 04:02:33 2002:start:/dev/pts/0_1039924953:admin
Sun Dec 15 04:02:34 2002:stop:/dev/pts/0_1039924953:admin:vsh exited normally
Sun Dec 15 05:02:08 2002:start:snmp_1039928528_172.22.95.167:public
Sun Dec 15 05:02:08 2002:update:snmp_1039928528_172.22.95.167:public:Switchname
Note The accounting log only shows the beginning and ending (start and stop) for each session.
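For reporting purposes, entries like those above can be split into fields and paired by session. The Python sketch below is a minimal illustration: the field layout (timestamp, action, session identifier, user, optional detail) is inferred from the sample only, and the names parse_entry and session_durations are hypothetical. The sample also contains an update record, so the pattern accepts that action as well.

```python
import re
from datetime import datetime

# Layout inferred from the sample entries (assumption):
#   <timestamp>:<action>:<session id>:<user>[:<detail>]
ENTRY_RE = re.compile(
    r"^(?P<when>\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
    r":(?P<action>start|stop|update)"
    r":(?P<session>[^:]+)"
    r":(?P<user>[^:]+)"
    r"(?::(?P<detail>.*))?$"
)


def parse_entry(line):
    """Return a dict of fields for one accounting entry, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["when"] = datetime.strptime(entry["when"], "%a %b %d %H:%M:%S %Y")
    return entry


def session_durations(lines):
    """Map each session id to its duration in seconds, using start/stop pairs."""
    started, durations = {}, {}
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        if entry["action"] == "start":
            started[entry["session"]] = entry["when"]
        elif entry["action"] == "stop" and entry["session"] in started:
            delta = entry["when"] - started.pop(entry["session"])
            durations[entry["session"]] = delta.total_seconds()
    return durations


SAMPLE = """\
Sun Dec 15 04:02:27 2002:start:/dev/pts/0_1039924947:admin
Sun Dec 15 04:02:28 2002:stop:/dev/pts/0_1039924947:admin:vsh exited normally
Sun Dec 15 05:02:08 2002:update:snmp_1039928528_172.22.95.167:public:Switchname
""".splitlines()

print(session_durations(SAMPLE))
```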
Using Syslog
The system message logging software saves messages in a log file or directs the messages to other
devices. This feature provides the following capabilities:
• Logging information for monitoring and troubleshooting.
• Selection of the types of logging information to be captured.
• Selection of the destination of the captured logging information.
Syslog lets you store a chronological log of system messages locally or send it to a central Syslog server.
Syslog messages can also be sent to the console for immediate use. These messages can vary in detail
depending on the configuration that you choose.
Syslog messages are categorized into eight severity levels (0 through 7), from emergency to debugging. You can limit the
severity levels that are reported for specific services within the switch. For example, you may wish only
to report debug events for the FSPF service but record all severity level events for the Zoning service.
A unique feature within the Cisco MDS 9000 Family switches is the ability to send RADIUS accounting
records to the Syslog service. The advantage of this feature is that you can consolidate both types of
messages for easier correlation. For example, when you log into a switch and change an FSPF parameter,
Syslog and RADIUS provide complementary information that will help you formulate a complete picture
of the event.
Log messages are not saved across system reboots. However, a maximum of 100 log messages with a
severity level of critical and below (levels 0, 1, and 2) are saved in NVRAM. You can view this log at
any time with the show logging nvram command.
Logging Levels
The MDS supports the following logging levels:
• 0-emergency
• 1-alert
• 2-critical
• 3-error
• 4-warning
• 5-notification
• 6-informational
• 7-debugging
By default, the switch logs normal but significant system messages to a log file and sends these messages
to the system console. Users can specify which system messages should be saved based on the type of
facility and the severity level. Messages are time-stamped to enhance real-time debugging and
management.
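The relationship between severity levels, per-facility filtering, and NVRAM retention can be sketched in Python. This is a simplified model, not switch code. It assumes that a configured level admits that level and all more severe (numerically lower) levels, that the default level is 5 (notification), and that the oldest NVRAM entry is discarded once 100 are stored. The class name LogStore and the facility names are hypothetical.

```python
from collections import deque

SEVERITY_NAMES = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notification", 6: "informational", 7: "debugging",
}

NVRAM_MAX_MESSAGES = 100  # from the text: at most 100 messages are kept
NVRAM_MAX_LEVEL = 2       # from the text: levels 0, 1, and 2 only


def should_log(message_level, configured_level):
    """Lower numbers are more severe; log if at least as severe as configured."""
    return message_level <= configured_level


class LogStore:
    def __init__(self, facility_levels, default_level=5):
        self.facility_levels = facility_levels  # e.g. {"fspf": 7}
        self.default_level = default_level      # assumed default (notification)
        self.logfile = []
        # Assumption: the oldest entry is dropped when the limit is reached.
        self.nvram = deque(maxlen=NVRAM_MAX_MESSAGES)

    def emit(self, facility, level, text):
        configured = self.facility_levels.get(facility, self.default_level)
        if not should_log(level, configured):
            return False
        record = (facility, level, SEVERITY_NAMES[level], text)
        self.logfile.append(record)
        if level <= NVRAM_MAX_LEVEL:
            self.nvram.append(record)
        return True


store = LogStore({"fspf": 7, "zone": 3})
print(store.emit("fspf", 7, "hello sent"))        # debugging allowed for fspf
print(store.emit("zone", 6, "zone set queried"))  # filtered out for zone
print(store.emit("zone", 2, "merge failed"))      # logged and kept in NVRAM
```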
Note When logging to a console session is disabled or enabled, that state is applied to all future console
sessions. If a user exits and logs in again to a new session, the state is preserved. However, when logging
to a Telnet or SSH session is enabled or disabled, that state is applied only to that session. The state is
not preserved after the user exits the session.
switch2(config-span)# destination interface fc1/3 <<==== Specify the span destination port
switch2(config-span)# end
For more information about configuring SPAN, refer to the Cisco MDS 9000 Family Configuration
Guide.
Note The Cisco MDS 9000 Family Port Analyzer Adapter does not support half-duplex mode. For this
reason, it does not work when connected to a hub.
The Cisco MDS 9000 Family Port Analyzer Adapter provides the following features:
• Encapsulates Fibre Channel frames into Ethernet frames.
• Sustains a burst of 32 maximum-size Fibre Channel frames (in 100 Mbps mode).
• Line rate at 1 Gbps (for Fibre Channel frames larger than 91 bytes).
• 64 KB of onboard frame buffer.
• Configurable option for Truncating Fibre Channel frames to 256 bytes (for greater burst).
• Configurable option for Deep Truncating Fibre Channel frames to 64 bytes (best frames burst).
• Configurable option for Ethernet Truncating Fibre Channel frames to 1496 bytes (maximum size
E-net frames).
• Configurable option for No Truncate Mode (sends jumbo frames on E-net side).
• Packet Counter (Indicates number of previous packet drops).
• SOF/EOF type information embedded.
• 100/1000 Mb/s Ethernet interface (option on board).
• Auto Configuration on power up.
• Fibre Channel and Ethernet Link up indicator LEDs.
• Checks Fibre Channel frame CRC.
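The four truncation options listed above differ only in how many bytes of each frame are kept. The Python sketch below models that choice on a plain byte string. It does not model the adapter's Ethernet encapsulation, and exactly which bytes count toward each limit is not specified here, so the simple prefix cut is an assumption. The mode names and the function truncate_frame are hypothetical labels for illustration.

```python
# Byte limits taken from the feature list above; None means no truncation.
TRUNCATION_LIMITS = {
    "no_truncate": None,
    "ethernet_truncate": 1496,
    "truncate": 256,
    "deep_truncate": 64,
}


def truncate_frame(frame, mode):
    """Return the leading bytes of a frame that the given mode would keep."""
    limit = TRUNCATION_LIMITS[mode]
    return frame if limit is None else frame[:limit]


# A placeholder 2148-byte buffer standing in for a large Fibre Channel frame.
frame = bytes(2148)
for mode in TRUNCATION_LIMITS:
    print(mode, len(truncate_frame(frame, mode)))
```

Smaller limits keep less of the payload per frame, which is why the list associates deeper truncation with better burst handling.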
When used in conjunction with the open source protocol analyzer, Ethereal (http://www.ethereal.com),
the Cisco MDS 9000 Family Port Analyzer Adapter provides a cost-effective and powerful
troubleshooting tool. It allows any PC with an Ethernet card to provide the functionality of a flexible Fibre
Channel analyzer. For more information on using the Cisco MDS 9000 Family Port Analyzer Adapter
see the Cisco MDS 9000 Family Port Analyzer Adapter Installation and Configuration Guide.
The Ethereal application allows remote access to Fibre Channel control traffic and does not require a
Fibre Channel connection on the remote workstation.
The Cisco Fabric Analyzer lets you capture and decode Fibre Channel traffic remotely over Ethernet. It
captures Fibre Channel traffic, encapsulates it in TCP/IP, and transports it over an Ethernet network to
the remote client. The remote client then deencapsulates and fully decodes the Fibre Channel frames.
This capability provides flexibility for troubleshooting problems in remote locations.
The Cisco Fabric Analyzer captures and analyzes control traffic coming to the Supervisor Card. This tool
is much more effective than the debug facility for packet trace and traffic analysis, because it is not very
CPU intensive and it provides a graphic interface for easy analysis and decoding of the captured traffic.
switch# config terminal
switch(config)# fcanalyzer local brief
Capturing on eth2
0.000000 ff.ff.fd -> ff.ff.fd SW_ILS 1 0x59b7 0xffff 0x7 -> 0xf HLO
0.000089 ff.ff.fd -> ff.ff.fd FC 1 0x59b7 0x59c9 0xff -> 0x0 Link Ctl, ACK1
1.991615 ff.ff.fd -> ff.ff.fd SW_ILS 1 0x59ca 0xffff 0xff -> 0x0 HLO
1.992024 ff.ff.fd -> ff.ff.fd FC 1 0x59ca 0x59b8 0x7 -> 0xf Link Ctl, ACK1
Device Entry 0
Port State 0x20
Port Id 7f.00.01
Port WWN 1000000530005f1f (000530)
Node WWN 1000000530005f1f (000530)
However, the Cisco Fabric Analyzer is not the right tool for troubleshooting end-to-end problems
because it cannot access any traffic between the server and storage subsystems. That traffic is switched
locally on the linecards, and does not reach the Supervisor card. In order to debug issues related to the
communication between server and storage subsystems, you need to use Fibre Channel SPAN with an
external protocol analyzer.
There are two ways you can start the Cisco Fabric Analyzer from the CLI.
• fcanalyzer local—Launches the text-based version of the analyzer, with output sent directly to the console screen
or to a file local to the system.
• fcanalyzer remote ip address—Activates the remote capture agent on the switch, where ip address
is the address of the management station running Ethereal.
For more information about using the Cisco Fabric Analyzer, refer to the Cisco MDS 9000 Family
Configuration Guide.
Fibre Channel network devices (HBAs, switches, and storage subsystems) are not able to monitor many
SAN behavior patterns. Also, management tools that gather data from these devices are not necessarily
aware of problems occurring at the Fibre Channel physical, framing, or SCSI upper layers for a number
of reasons.
Fibre Channel devices are specialized for handling and distributing incoming and outgoing data streams.
When devices are under maximum loads, which is when problems often occur, the device resources
available for error reporting are typically at a minimum and are frequently inadequate for accurate error
tracking. Also, Fibre Channel host bus adapters (HBAs) do not provide the ability to capture raw network
data.
For these reasons, a protocol analyzer may be more important in troubleshooting a storage network than
in a typical Ethernet network. There are a number of common SAN problems that occur in deployed
systems and test environments that are visible only with a Fibre Channel analyzer. These include the
following:
• Credit starvation
• Missing, malformed, or non-standard-compliant frames or primitives
• Protocol errors
Utilities provided by the Sun Solaris operating system let you determine if the remote storage has been
recognized and exported to you in the form of a raw device or mounted file system, and to issue some basic
queries and tests to the storage. You can measure performance and generate loads using the iostat utility,
the perfmeter GUI utility, the dd utility, or a third-party utility like Extreme SCSI.
Every UNIX version provides similar utilities, but this guide only provides examples for Solaris. Refer
to the documentation for your specific operating system for details.