
Troubleshooting the Cisco UCS Compute Deployment
Session ID BRKCOM-3001
Agenda
• UCS insights to Troubleshooting
• Blade/Server Troubleshooting
• IOM/CMC Troubleshooting
• Fabric Interconnect Troubleshooting
• SAN NPV Troubleshooting

Presentation_ID © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 2
UCS tools for Troubleshooting

System Components - Major Points of Service

Points of service:
• UCS Manager (XML and CLI)
• NXOS, Physical Connections to Chassis & Core SAN/LAN network, Cluster Operations
• Chassis Management Controller (CMC) Operations, Chassis Discovery, Physical Connections to Fabric Interconnect (FI) and Logical Connections to Adaptor Cards
• Baseboard Management Controller (BMC) of Compute nodes, All Compute node Components (memory, proc, mezz cards, disk)
• Power, Fans, Connectors

Components:
• Cisco UCS Manager
  Embedded in Fabric Interconnect
• Cisco UCS 6100 Series Fabric Interconnects
  UCS 6120XP 20 Port Fabric Interconnect
  UCS 6140XP 40 Port Fabric Interconnect
• Cisco UCS 2100 Series Fabric Extenders
  Logically part of Fabric Switch
  Inserts into Blade Enclosure
• Cisco UCS 5100 Series Blade Chassis
  Flexible bay configurations
  Logically part of Fabric Interconnect
• Cisco UCS B-Series Blade Servers
  UCS B-200 M1 Blade Server
  UCS B-250 M1 Extended Memory Blade Server
• Cisco UCS Network Adapters
  Three adapter options
  Mix adapters within blade chassis

61xx Fabric Interconnect (FI)
• Active/Active Clustered System
• Navigation to proper component when troubleshooting
  CLI NX-OS or UCSM
Virtual IP
Management Network

IP #A IP #B

Switch-A# Switch-B#

UCS 2100 Fabric Extender Switch Connection
• Each UCS 2100 Fabric Extender in a UCS 5100 Blade Server Chassis is connected to a 6100 Series Fabric Interconnect for Redundancy or Bandwidth Aggregation
• Fabric Extender provides 4x10GE ports to the NX5K switch.
• Physical link health checks and chassis discovery occur over these links

[Diagram: back of a UCS 5100 Series Blade Server Chassis; its UCS 2100 Series Fabric Extenders connect to UCS 6100 Series Switch A and UCS 6100 Series Switch B]


Unified Compute System Manager
Part of UCS troubleshooting is verifying that UCSM is communicating with the end systems correctly.

[Diagram labels: management interfaces; redundant management service; UCSM; switch elements; chassis elements; server elements; multiple protocol support; redundant management plane]

UCSM access Enable Logging in Java to capture issues

Example of session log file on client

Client logs for debugging UCSM access & Client KVM access are found at this location
on Client system:
C:\Documents and Settings\userid\Application Data\Sun\Java\Deployment\log\.ucsm
UCSM Client Logs
To find which log to view for issues with the UCSM window, open Task Manager and check the process ID of the javaw process. The corresponding file should appear in the log directory; also go by the time modified.
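
A minimal sketch of the "go by the time modified" step, assuming the client logs live in a single directory such as the one named on the previous slide. The function name `logs_by_mtime` is hypothetical; it lists the files with the most recently modified first, so the log of the current session is likely near the top.

```python
from pathlib import Path


def logs_by_mtime(log_dir):
    """Return the files in log_dir, most recently modified first."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python logs_by_mtime.py <path-to-log-directory>
    for path in logs_by_mtime(sys.argv[1])[:5]:
        print(path.name)
```

Pass the log directory from the client system as the only argument.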

Interface Stats and reports

Statistics breakdown Live/now

History

UCS Internal Operations
• Unified Compute System Manager (UCSM) & Data Management Engine (DME) run as a cluster
• Stateful switch-over
• Object state is replicated
• Distributed Cluster State
  Stored in Chassis EPROM
  Solves split brain
• Application Gateway (AG) interfaces to the blade

[Diagram: Fabric Interconnect A runs UCSM-A and Fabric Interconnect B runs UCSM-B. Each side has an Interface Layer, HA Controller, Replicator, DME with FSM (one active, one standby), a Persistifier writing to flash, and an Application Gateway Layer. Both sides connect to the CMCs and EPROM of Chassis 1, Chassis 2, Chassis 3, ...]

Events per component
FarNorth-A# scope server ?
  WORD          <chassis-id>/<blade-id>
  dynamic-uuid  Dynamic UUID

FarNorth-A# scope server 1/1
FarNorth-A /chassis/server # show event

Server Discovery FSM
• FSM runs as a workflow involving many stages (FSM-Stage)
• Workflows are predefined and stages can be skipped if not needed (in HA if remote is down, no NIC configuration for Oplin)
  FSM Flags (shallow checkpoint or deep checkpoint)
• Each stage is an interaction between:
  DME -> Application Gateway -> End Point
• DME just manages the state of the object and workflow, and then instructs the AG to perform the activity.
• AGs do the real work.
• FSMs usually have the following notation:
  FSM <Object><Workflow><Operation><Where-is-it-executed>
  Object: "Blade/Chassis"...
  Workflow: "Discover"/"Association"
  Operation: "Pnuos-Config"
  Where is generally "", or "A" or "B" or "Local" or "Peer"
  If 'Where' is not specified, it is executed on the managing node

Processing Node Utility OS (PnuOS): Linux-based pre-boot execution environment that can boot on a processing node to run diagnostics, report inventory, or configure the firmware state of the blade.
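
The FSM naming notation can be sketched as a small parser. This is an illustration only: the sample names such as `BladeDiscoverPnuosConfigLocal` are hypothetical strings built from the notation on this slide, not names taken from a real system, and the object and workflow vocabularies are limited to the examples the slide lists.

```python
# Vocabularies taken from the slide's examples; a real system has more values.
OBJECTS = ("Blade", "Chassis")
WORKFLOWS = ("Discover", "Association")
WHERE = ("Local", "Peer", "A", "B")


def split_fsm_name(name):
    """Split '<Object><Workflow><Operation><Where>' into its four parts.

    'Where' is returned as '' when absent, which the slide says means the
    stage runs on the managing node. An operation that itself ends in a
    capital A or B would be ambiguous; this sketch does not handle that.
    """
    for obj in OBJECTS:
        if name.startswith(obj):
            rest = name[len(obj):]
            break
    else:
        raise ValueError(f"unknown object in {name!r}")

    for workflow in WORKFLOWS:
        if rest.startswith(workflow):
            rest = rest[len(workflow):]
            break
    else:
        raise ValueError(f"unknown workflow in {name!r}")

    where = ""
    for suffix in WHERE:
        if rest.endswith(suffix) and len(rest) > len(suffix):
            where = suffix
            rest = rest[: -len(suffix)]
            break
    return obj, workflow, rest, where


if __name__ == "__main__":
    print(split_fsm_name("BladeDiscoverPnuosConfigLocal"))
    print(split_fsm_name("BladeAssociationPnuosConfig"))
```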

FSM

• Almost every action done by UCSM has an FSM to verify operation and status
• View and monitor each action for ongoing feedback and the progress state of the action
• Logs are kept for review and troubleshooting

FSM mapped out - example

OBFL

• The Onboard Fault Log (OBFL) stores hardware logs on the different components, saved at the time of the issue.
• The logs can alternatively be viewed by connecting to the device.
• "show tech-support" will capture these logs

System Event Log (SEL) – Events Supported
• Server BIOS events
  3 kinds of equipment end-points:
  Memory Unit (DIMM): ECC errors, Address Parity, Memory Mismatch
  Processor Unit: Memory Mirroring, Sparing, SMI Link errors
  Motherboard: PCIe, QPI uncorrectable errors, Legacy PCI errors

• All these errors are modeled as stats properties. The ones for which thresholds are not defined get reported as statistics only

• BMC, BIOS, OS log platform errors to BMC's System Event Log (SEL) buffer

• POST and run-time errors

• Used as an effective health monitoring tool

System Event Log (SEL) - config
Users can define rules (policies) for backing up and clearing SEL across all
servers in the UCS system, or they can manually trigger a SEL backup on
individual servers.

System Event Logs = Management Logs
Make sure that servers are discovered
Make sure backup destination path is valid
Can be done via CLI also
[Screenshot labels: Chassis, Server]

CLI navigation

• SSH or Telnet to the Cluster IP when possible
  You will connect to the Primary FI in the cluster automatically

Cisco UCS 6100 Series Fabric Interconnect

Using keyboard-interactive authentication.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software may be covered under the GNU Public
License or the GNU Lesser General Public License. A copy of
each such license is available at
http://www.gnu.org/licenses/gpl.html and
http://www.gnu.org/licenses/lgpl.html
FarNorth-B#

FarNorth-B# show cluster state


Cluster Id: 0xf76362a0c56011de-0x8446000decd07b44

B: UP, PRIMARY
A: UP, SUBORDINATE

HA READY
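
When the cluster state is collected by a script, the same check can be done programmatically. The following is a rough sketch that assumes the output has exactly the shape shown on this slide (one line per fabric, followed by an HA line); the function name `parse_cluster_state` is made up for illustration.

```python
import re


def parse_cluster_state(text):
    """Extract fabric roles and the HA flag from 'show cluster state' text."""
    roles = {}
    # Lines look like "B: UP, PRIMARY" in the sample output.
    for match in re.finditer(r"^\s*([AB]):\s*(\w+),\s*(\w+)\s*$", text, re.MULTILINE):
        fabric, status, role = match.groups()
        roles[fabric] = (status, role)
    primary = next((f for f, (_, role) in roles.items() if role == "PRIMARY"), None)
    ha_ready = re.search(r"^\s*HA READY\s*$", text, re.MULTILINE) is not None
    return {"roles": roles, "primary": primary, "ha_ready": ha_ready}


if __name__ == "__main__":
    sample = """Cluster Id: 0xf76362a0c56011de-0x8446000decd07b44

B: UP, PRIMARY
A: UP, SUBORDINATE

HA READY
"""
    print(parse_cluster_state(sample))
```

Knowing which fabric is primary matters because later slides recommend scoping from the primary FI.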

UCS CLI navigation Structure
• Almost same as NXOS, slight differences in layout
• But Configuration is in XML structure

FarNorth-B# show
chassis              Chassis
cli                  CLI commands
clock                Display current Date
cluster              Cluster mode
configuration        Show information about configuration sessions
eth-uplink           Ethernet Uplink
event                Event Manager commands
fabric-interconnect  Show Fabric Interconnect
fault                Fault
identity             Identity
iom                  IO Module
license              Show the contents of all the license files
org                  Organizations
security             Security mode
sel                  System Event Log
server               Server
service-profile      Service Profile
system               System-related show commands
timezone             Set timezone
version              System version
vif                  Virtual Interfaces

FarNorth-B#
acknowledge          Acknowledge
backup               Backup
clear                Reset functions
commit-buffer        Commit transaction buffer
connect              Connect to Another CLI
decommission         Decommission managed objects
discard-buffer       Discard transaction buffer
end                  Go to exec mode
exit                 Exit from command interpreter
recommission         Recommission Server Resources
remove               Remove
scope                Changes the current mode
set                  Set property values
show                 Show running system information
terminal             Set terminal line parameters
top                  Go to the top mode
up                   Go up one mode
where                Show information about the current mode

UCS Configuration from CLI

• Not recommended as best practice, but sometimes required due to a problem
• More for use when troubleshooting directly or verifying proper config from UCSM
• Will give you a good understanding of the XML structure for third-party API configurations and uses of navigation
• As a system admin, for troubleshooting you will need to be somewhat familiar with the CLI

XML configuration navigation
Configuration verification or to show pending changes
FarNorth-A# show configuration ?
<CR>
>                Redirect it to a file
>>               Redirect it to a file in append mode
all              All
no-diff-markers  Don't Show Diff Markers
no-pending       Don't Show Pending Config
pending          Show Only Pending Config
|                Pipe command output to filter

Configuration tools
FarNorth-A# show configuration | ?
cut      Print selected parts of lines.
egrep    Egrep - print lines matching a pattern
grep     Grep - print lines matching a pattern
head     Display first lines
last     Display last lines
less     Filter for paging
no-more  Turn-off pagination for command output
sort     Stream Sorter
tr       Translate, squeeze, and/or delete characters
uniq     Discard all but one of successive identical lines
vsh      The shell that understands cli command
wc       Count words, lines, characters
begin    Begin with the line that matches
count    Count number of lines
end      End with the line that matches
exclude  Exclude lines that match
include  Include lines that match

Save off config to file (UCSM also has backup methods)
FarNorth-A# show configuration > ?
ftp:        Dest File URI
scp:        Dest File URI
sftp:       Dest File URI
tftp:       Dest File URI
volatile:   Dest File URI
workspace:  Dest File URI

Scope
• Scoping: movement to different UCS configuration components
  Details on hardware components are obtained with the connect command
• You want to be on the Primary FI

FarNorth-B# scope
adapter Mezzanine Adapter
chassis Chassis
eth-server Ethernet Server Domain
eth-uplink Ethernet Uplink
fabric-interconnect Fabric Interconnect
fc-uplink FC Uplink
firmware Firmware
host-eth-if Host Ethernet Interface
host-fc-if Host FC Interface
monitoring Monitor the system
org Organizations
security Security mode
server Server
service-profile Service Profile
system Systems
vhba VHBA
vnic VNIC
Management Commands (scope, where, up & top)

UCSM Navigation CLI Equivalent to Nav Pane

Connect NXOS

• Connecting from the XML to the Fabric Interconnect (FI) standard NXOS component.
• Used to assist in troubleshooting; very familiar to IOS and Nexus users, with all the show commands
• Used to run advised debugs
• Show switch running config (non server config)
• Enable and run ethanalyzer
• Clear interface counters found on the FI
• Cannot be used to configure UCS (read only)

Connect – Hardware Troubleshooting

• Connect – attaches you to hardware and read only NXOS

FarNorth-B# connect
adapter     Mezzanine Adapter
bmc         Baseboard Management Controller (CIMC)
clp         Connect to DMTF CLP
iom         IO Module
local-mgmt  Connect to Local Management CLI
nxos        Connect to NXOS CLI

FarNorth-A# connect local-mgmt
<CR>        (defaults to primary)
a           Fabric A
b           Fabric B

FarNorth-A(local-mgmt)# ?
cd                Change current directory
clear             Reset functions
cluster           Cluster mode
connect           Connect to Another CLI
copy              Copy a file
cp                Copy a file
delete            Delete managed objects
dir               Show content of dir
enable            Enable
end               Go to exec mode
erase             Erase
erase-log-config  Erase the mgmt logging config file
exit              Exit from command interpreter
install-license   Install a license
ls                Show content of dir
mkdir             Create a directory
move              Move a file
mv                Move a file
ping              Test network reachability
pwd               Print current directory
reboot            Reboots Fabric Interconnect
rm                Remove a file
rmdir             Remove a directory
run-script        Run a script
show              Show running system information
ssh               SSH to another system
tail-mgmt-log     Tail mgmt log file
telnet            Telnet to another system
terminal          Set terminal line parameters
top               Go to the top mode
traceroute        Traceroute to destination

Most dangerous:
- erase configuration
- reboot

Connect to NXOS

FarNorth-A# connect nxos
<CR>
a Fabric A
b Fabric B

Most popular examples:
Show run
Show fex detail
Show interface
Show lacp
Debug
Sh npv flogi-table
Show mac-address-table

FarNorth-A(nxos)# ?
clear Reset functions (only place you can clear counters today)
cli CLI commands
debug Debugging functions
debug-filter Enable filtering for debugging functions
end Go to exec mode
ethanalyzer Configure cisco fabric analyzer
exit Exit from command interpreter
no Negate a command or set its defaults
ntp Execute NTP commands
pop Pop mode from stack or restore from name
push Push current mode to stack or save it under name
show Show running system information
system System management commands
terminal Set terminal line parameters
test Test command
undebug Disable Debugging functions (See also debug)
where Shows the cli context you are in

Ethanalyzer tool usage

• Uses the Wireshark utility to view FI control data and management traffic
• Ethanalyzer is a tool that collects frames that are destined to, or originate from, the FI control plane. Node to FI, or FI to Network traffic can be seen with this tool.
• Need to be connected to NXOS

FarNorth-A(nxos)# ethanalyzer local interface
inbound-hi   Inbound(high priority) interface
inbound-low  Inbound(low priority) interface
mgmt         Management interface

Ethernet Interfaces on CPU
Troubleshooting Uses
In Ethanalyzer terminology, internal Ethernet interfaces are used:
eth3 = inbound-low
eth4 = inbound-hi

eth3 handles Rx and Tx of low priority control pkts
  IGMP, CDP
  TCP/UDP/IP/ARP (for management purpose only)

eth4 handles Rx and Tx of high priority control pkts
  FC (FC packets come to Switch CPU as FCoE packets) and FCoE
  STP (spanning-tree), LACP, DCBX (Data Center Bridging)

• Save to file and use the Wireshark tool to help diagnose the issue

1) FarNorth-A(nxos)# ethanalyzer local interface inbound-hi write volatile:///ciscolive

2) FarNorth-A(local-mgmt)# cd volatile:///
   FarNorth-A(local-mgmt)# dir
   25192 May 18 11:08:17 2010 ciscolive

3) FarNorth-A(local-mgmt)# copy volatile:///ciscolive tftp:
   Enter hostname for the tftp server: 10.91.42.134
   Trying to connect to tftp server......
   Connection to server Established. Copying Started.....
   TFTP put operation was successful

KVM
• Tool to snapshot the screen for support
• A WebEx recording works best

Monitoring – with UCSM and CLI

Compute System
BMC (Per blade)
• Voltage, current sensors (Power)
• Thermal Sensors
• DIMMs, CPUs, Adapter,…
• Sensor values available via IPMI
CMC (IOM)
• Per blade totals
• Per chassis totals
• PSU redundancy state
Changes are passed to UCSM
• Critical transitions via async notifications
• Periodic polling
• UCSM maintains stats
• SAM Maintains state
• State, stats available via GUI, CLI, API

Fabric Monitoring
VIFs
• Interface stats
• States
Adaptor
• Interface stats
• Aggregate stats
• States
FEX
• Interface stats
• States
Switch
• Interface stats
• VIF stats
• States

Data Gathering for Support

A UCSM detailed tech-support should be taken as soon as possible after a failure has occurred. The UCSM tech-support contains a running configuration snapshot as well as an application error/debug log.

If a problem is easily reproducible, please retry the configuration attempt and collect tech-support files immediately.

1. Collect UCSM tech-support

A# connect local-mgmt
A(local-mgmt)# show tech-support ucsm detail

2. Collect tech-support on one or more problematic chassis (and its components like server, IOM, BMC)

A(local-mgmt)# show tech-support chassis <chassis id> all detail

3. Copy collected file to tftp.cisco.com (171.69.17.19)

A(local-mgmt)# copy workspace:///techsupport/<name_of_the_file>.tar tftp://171.69.17.19

Data Gathering for Support - examples
FarNorth-A(local-mgmt)# show tech-support ucsm detail
Initiating tech-support information task on FABRIC A ...
Initiating tech-support information task on FABRIC B ...
Completed initiating tech-support subsystem tasks (Total: 2)
All tech-support subsystem tasks are completed (Total: 2)

The detailed tech-support information is located at
workspace:///techsupport/20100517125801_FarNorth_UCSM.tar

FarNorth-A(local-mgmt)# show tech-support chassis 1 all detail
Initiating tech-support information task on Chassis 1 FabricExtender 1 ...
Remotely initiating tech-support information task on Chassis 1 FabricExtender 2
Initiating tech-support information task on Chassis 1 FabricExtender 2 ...
Initiating tech-support information task on IBMC 1 on Chassis 1 ...
Initiating tech-support information task on Adaptor 1 on Chassis/Blade 1/1 ...
Initiating tech-support information task on IBMC 2 on Chassis 1 ...
Initiating tech-support information task on Adaptor 1 on Chassis/Blade 1/2 ...
Initiating tech-support information task on IBMC 3 on Chassis 1 ...
Initiating tech-support information task on Adaptor 1 on Chassis/Blade 1/3 ...
Initiating tech-support information task on Adaptor 2 on Chassis/Blade 1/3 ...
Initiating tech-support information task on IBMC 7 on Chassis 1 ...
Initiating tech-support information task on Adaptor 1 on Chassis/Blade 1/7 ...
Completed initiating tech-support subsystem tasks (Total: 11)
All tech-support subsystem tasks are completed (Total: 11)
The detailed tech-support information is located at
workspace:///techsupport/20100517124544_FarNorth_BC001_all.tar

FarNorth-A(local-mgmt)# dir
16 Oct 30 09:31:03 2009 cores
31 Nov 20 13:14:20 2009 diagnostics
1024 Oct 30 09:29:05 2009 lost+found/
1024 May 17 12:59:47 2010 techsupport/

FarNorth-A(local-mgmt)# cd ///techsupport
FarNorth-A(local-mgmt)# ls
2140160 May 17 12:52:58 2010 20100517124544_FarNorth_BC001_all.tar
12871680 May 17 12:59:47 2010 20100517125801_FarNorth_UCSM.tar
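
If the session output is captured to a file, the archive paths can be pulled out and turned into copy commands. This is a sketch under the assumption that the output always names the archive as `workspace:///techsupport/<name>.tar`, as in the examples here; the helper names are hypothetical, and the copy command simply mirrors the form used on the previous slide.

```python
import re


def techsupport_paths(output):
    """Return every workspace:///techsupport/*.tar path found in the text."""
    return re.findall(r"workspace:///techsupport/\S+\.tar", output)


def copy_command(path, tftp_host):
    """Build the local-mgmt copy command for one archive."""
    return f"copy {path} tftp://{tftp_host}"


if __name__ == "__main__":
    captured = (
        "All tech-support subsystem tasks are completed (Total: 2)\n"
        "The detailed tech-support information is located at\n"
        "workspace:///techsupport/20100517125801_FarNorth_UCSM.tar\n"
    )
    for path in techsupport_paths(captured):
        print(copy_command(path, "171.69.17.19"))
```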

Core Dumps
• Once the TFTP Core Exporter is configured and enabled, dumps will be transferred
• Once transferred, select and move to trash can

Blade Troubleshooting

Troubleshooting Flow
For the rest of the session we will work from the blade servers up toward the LAN and SAN networks.

Start: Blades -> IOM Modules -> Fabric Interconnects -> LAN-SAN: End
Common Debug Scenarios – Blades

BMC doesn't boot
  Corrupt BMC BIOS, POST failure, not completing
  Attempt to connect to BMC to diagnose
  View logs, collect tech-support
Bad Service-Profile - Association Failure

Bad Hardware
  Bad/Reseat/Replace DIMM(s)
  CPU or other component – check logs

Adaptor issues
  Connect to Mezz cards to diagnose issues

BMC Troubleshooting - Debug Firmware Utility

Command   Description
mctool    Gets basic information on the state of the BMC to UCS management API
network   See current network configuration and socket information
obfl      Live obfl
messages  Live /var/log/messages file
alarms    What sensors are in alarm
sensors   Current sensor readings from IPMI
power     The current power state of the x86

Connect CIMC – Debug Utility

• Show tech detail and logs
• Get snapshot of KVM screen images
• To verify health of blade if questioning UCSM and wanting to look at lowest level of Blade data points

FarNorth-A# connect cimc 1/3
Trying 127.5.1.3...
Connected to 127.5.1.3.
Escape character is '^]'.

BMC Debug Firmware Utility Shell
[ help ]#
__________________________________________
Debug Firmware Utility
__________________________________________
Command List
__________________________________________
alarms
cores
exit
help [COMMAND]
mctools
memory
messages
network
obfl
post
power
sensors
sel
fru
mezz1fru
mezz2fru
tasks
top
update
users
version
__________________________________________
Notes:
"enter Key" will execute last command
"COMMAND ?" will execute help for that command
__________________________________________

(On the slide, useful commands are marked with an arrow.)

Mezz Cards Common Debug & Isolation Hints
Adapters pictured: VIC M81KR (Palo); M71KR-Q & M71KR-E (Menlo)
• Verify physical link state between IOM and M71KR using “show interface brief” on the switch CLI
• Verify vif state and vnic state from M71KR perspective using the “show-vifs” and “show-systemstatus” commands
• Find vif corresponding to the link
• Verify M71KR-Intel/M71KR-QorE physical link state using M71KR Link Event Log
• Verify state of the control channel (VIC/DCBX/VNTAG)
• Verify state of VIF from vic protocol perspective (VIC log on M71KR)
• For FC, look at FC logs for FLOGI/LS_ACC
• Look at the link state from host perspective using host based tools

M71KR Overview

• 2 types: M71KR-Q & M71KR-E
• Standard 10G Ethernet adapter
• Standard 4G HBA
• Menlo ASIC
  Encap/Decap FCoE
  Convert Ethernet to DCE
  Provide ability to ACL packets
  Establish NIV with Switch to provide fabric failover

[Diagram: Menlo ASIC port map]
Host side (QLOGIC/EMULEX FC, OPLIN Ethernet):
HOST ETH 0   VNTAG=0  PIF=0  PORT=0
HOST ETH 1   VNTAG 1  PIF 1  PORT 1
HOST FC 0    VNTAG 2  PIF 2  PORT 4
HOST FC 1    VNTAG 3  PIF 3  PORT 5
Menlo mCPU   VNTAG=6  PIF=6  Vif=0
Uplink side:
UIF 0        PORT=2   TO IOM/CMC 0
UIF 1        PORT=3   TO IOM/CMC 1

Adaptor Debug
• From the UCS CLI, use the following command to talk to the Cisco M71KR:
  connect adapter <chassisid>/<bladeid>/<adapterid>

FarNorth-A# connect adapter 1/3/2        (Mezz 2 on B250 M1 Blade)
adapter 1/3/2 #

• The following commands are available

adapter 1/3/2 # help
Available commands:
exit - Exit from subshell
help - List available commands
history - Show command history
show-asic-stats - Show adapter's asic stats
show-cfg - Show adapter's configuration
show-debug-log - Show adapter's debug log
show-fwlist - Show firmware versions on the adapter
show-identity - Show adapter identity
show-memory - Show adapter's memory
show-panic-log - Show adapter's panic log
show-phyinfo - Show adapter phy info
show-port-stats - Show adapter's port stats
show-systemstatus - Show adapter status
show-vif-stats - Show adapter's vif stats
show-vifs - Show adapter's vifs

M71KR General Configuration
• Using outputs to verify hardware operations with UCSM

adapter 1/3/1 # show-identity
type: Menlo
description: "Cisco MENLO Adapter"
hw_version: "1.0"
sw_version: "1.3(0.193)"

adapter 1/3/1 # show-fwlist
[0]: BOOT Version 1.0(1e)
[1]: APP Version 1.3(0.193) [RUNNING]
[2]: APP Version 1.3(0.168a) [STARTUP]
[3]: DIAG Version 5.0.0.0

adapter 1/3/1 # show-cfg
ChipVersion : 0x00000002
uif 0 mac-addr : 00:26:51:08:cf:cc
uif 1 mac_addr : 00:26:51:08:cf:cd
timeout : 0x07d0
fw_updt_timeout : 0x2710
eth_failover : disabled(0)
fcoe_cfg : T11(1)
fcoe_fc_map : 0x00fc0e
stdby_recovery_delay : 5 secs
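
In the show-fwlist output above, the RUNNING and STARTUP application versions differ, which is worth flagging during verification. The sketch below parses text of that shape and compares the two; it assumes the line layout shown on this slide, and the function names are illustrative only.

```python
import re

# One firmware entry per line, e.g. "[1]: APP Version 1.3(0.193) [RUNNING]".
ENTRY = re.compile(r"\[(\d+)\]:\s+(\w+)\s+Version\s+(\S+)(?:\s+\[(\w+)\])?")


def parse_fwlist(text):
    """Return one dict per firmware entry found in show-fwlist output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            slot, kind, version, flag = match.groups()
            entries.append(
                {"slot": int(slot), "kind": kind, "version": version, "flag": flag}
            )
    return entries


def running_and_startup(entries):
    """Return (running_version, startup_version); None where not flagged."""
    by_flag = {e["flag"]: e["version"] for e in entries if e["flag"]}
    return by_flag.get("RUNNING"), by_flag.get("STARTUP")


if __name__ == "__main__":
    sample = """[0]: BOOT Version 1.0(1e)
[1]: APP Version 1.3(0.193) [RUNNING]
[2]: APP Version 1.3(0.168a) [STARTUP]
[3]: DIAG Version 5.0.0.0"""
    running, startup = running_and_startup(parse_fwlist(sample))
    if running != startup:
        print(f"running {running} differs from startup {startup}")
```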
M71KR debug logs

 Mapping of ”show-debug-log” LOGID’s to names

0 - Debug-log
2 - FC : Shows FLOGI LS_ACC info : OX_ID, NPortID
3 - FC : Shows FLOGI ELS request info : WWPN, OX_ID
8 - Link Events : physical/logical links up/down
9 - DCBX : dcbx configuration changes/updates
10 - VIC : vic protocol
11 - Adapter management: Menlo adapter management protocol
Others – unused at this point

Note: Each type of log has fixed number of entries. Logs wrap around.

M71KR debug logs - outputs
(The number after show-debug-log is the Log ID.)

Management – FW load
adapter 1/3/1 # show-debug-log 11
00000:00:00:00:010 vfc1206: uifid 1 pifid 3 initialized
00000:00:00:00:010 vfc1204: uifid 0 pifid 2 initialized
00000:03:24:20:950 IDLE [BEGIN( 0)] => OPENING
00000:03:24:20:970 OPENING [REPLY( 0)] => INPROGRESS
00000:03:24:24:360 INPROGRESS [REPLY( 0)] => CLOSING
00000:03:24:24:360 fwupdate: complete

Physical Link
adapter 1/3/1 # show-debug-log 8
00000:04:07:30:600 Host ethernet port 1, physical link down
00000:04:07:30:600 Host fibre channel port 1, logical link down
00000:04:07:35:600 Host ethernet port 1, logical link down
00000:04:07:35:600 Host ethernet port 1, physical link down
00000:04:07:35:600 Host fibre channel port 1, logical link down
00000:04:07:40:600 Host ethernet port 1, logical link down
00000:04:07:40:600 Host ethernet port 1, physical link down

VIC Protocol
adapter 1/3/1 # show-debug-log 10
00000:04:07:56:000 vif[4]: vfc1206: s:INIT(e:CREATE)->s:CREATE
00000:04:07:56:000 create: port 1 vfc1206 primary 1
00000:04:07:56:000 create_cb: port 1 veth1202 status ERET
00000:04:07:56:000 vif[2]: veth1202: s:CREATE(e:LINK_DOWN)->s:INIT
00000:04:07:56:000 active_vif_down: port 1 veth1202 primary 1
00000:04:07:56:000 vic_eth_phys_if_down: port 1 vif 4294967295
00000:04:07:56:000 create_cb: port 1 vfc1206 status ERET
00000:04:07:56:000 vif[4]: vfc1206: s:CREATE(e:LINK_DOWN)->s:INIT

SAN Login
adapter 1/3/1 # show-debug-log 3
00000:00:03:47:640 Egress FLOGI: port 0, wwpn 20000025b5000007, oxid 0
00000:00:03:48:350 Egress FLOGI: port 1, wwpn 20000025b5000009, oxid 0
00000:00:20:31:610 Egress FLOGI: port 0, wwpn 20000025b5000007, oxid d
00000:00:32:58:000 Egress FLOGI: port 1, wwpn 20000025b5000009, oxid 11
00000:04:05:39:200 Egress FLOGI: port 0, wwpn 20000025b5000007, oxid 19
00000:04:08:16:630 Egress FLOGI: port 1, wwpn 20000025b5000009, oxid 21
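
Repeated FLOGI requests from the same WWPN, as in the log above, can be counted with a short script when reviewing a longer capture. This sketch assumes every request line follows the "Egress FLOGI: port N, wwpn W, oxid X" layout seen here; `flogi_counts` is a hypothetical helper name. A high count may point to repeated fabric logins, though the log alone does not say why.

```python
import re
from collections import Counter

# Matches e.g. "Egress FLOGI: port 0, wwpn 20000025b5000007, oxid d"
FLOGI = re.compile(r"Egress FLOGI: port (\d+), wwpn ([0-9a-f]{16}), oxid ([0-9a-f]+)")


def flogi_counts(log_text):
    """Count FLOGI requests per (port, wwpn) pair."""
    counts = Counter()
    for match in FLOGI.finditer(log_text):
        port, wwpn, _oxid = match.groups()
        counts[(int(port), wwpn)] += 1
    return counts


if __name__ == "__main__":
    sample = """00000:00:03:47:640 Egress FLOGI: port 0, wwpn 20000025b5000007, oxid 0
00000:00:03:48:350 Egress FLOGI: port 1, wwpn 20000025b5000009, oxid 0
00000:00:20:31:610 Egress FLOGI: port 0, wwpn 20000025b5000007, oxid d"""
    for (port, wwpn), n in sorted(flogi_counts(sample).items()):
        print(f"port {port} wwpn {wwpn}: {n} FLOGI request(s)")
```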
M71KR Port Stats
See example outputs in Appendix A

• Mapping of portids to ports for the “show-port-stats” command
0 : Host ethernet port 0 – connected to Intel Oplin port 0
1 : Host ethernet port 1 – connected to Intel Oplin port 1
2 : DCE Port 0 – connected to IOM0
3 : DCE Port 1 – connected to IOM1
4 : Host FC Port 0 – connected to Q/E FC port 0
5 : Host FC Port 1 – connected to Q/E FC Port 1
 RMON stats for portids 0,1,2,3
 FC Port stats for portids 4,5

 All the stats are from M71KR perspective.
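
The port-id mapping on this slide can be kept as a small lookup table when scripting around show-port-stats. The table below copies the slide's mapping; the function name `describe_port` and the tuple layout are assumptions made for this sketch.

```python
# portid -> (description, statistics type), as listed on the slide.
M71KR_PORTS = {
    0: ("Host ethernet port 0, connected to Intel Oplin port 0", "RMON"),
    1: ("Host ethernet port 1, connected to Intel Oplin port 1", "RMON"),
    2: ("DCE Port 0, connected to IOM0", "RMON"),
    3: ("DCE Port 1, connected to IOM1", "RMON"),
    4: ("Host FC Port 0, connected to Q/E FC port 0", "FC"),
    5: ("Host FC Port 1, connected to Q/E FC port 1", "FC"),
}


def describe_port(portid):
    """Return a one-line description of an M71KR portid."""
    if portid not in M71KR_PORTS:
        raise ValueError(f"portid {portid} is not in the slide's mapping (0-5)")
    description, stats_type = M71KR_PORTS[portid]
    return f"{portid}: {description} ({stats_type} stats)"


if __name__ == "__main__":
    for pid in sorted(M71KR_PORTS):
        print(describe_port(pid))
```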

VN-Tag: instantiation of virtual interfaces

• Virtual interfaces (VIFs) help distinguish between FC and Eth interfaces
• They also identify the origin server
• VIFs are instantiated on the FI and correspond to frame-level tags assigned to blade mezz cards
• A 6-byte tag (VN-Tag) is prepended by Palo and Menlo as traffic leaves the server to identify the interface
  VN-Tag associates frames to a VIF
• VIFs are 'spawned off' the server's EthX/Y/Z interfaces (examples follow)

VN-Tag at the adapter (mezz card) level

 Connect to a server’s adapter and use “show-vifs”

Verifying & Viewing Pause Frames on M71KR
adapter 1/2/1 # show-asic-stats

M81KR - Palo Adaptor
Same type of commands as M71KR

adapter 1/1/1 # help
Available commands:
connect - Connect to remote debug shell
exit - Exit from subshell
help - List available commands
history - Show command history
show-fwlist - Show firmware versions on the adapter
show-identity - Show adapter identity
show-phyinfo - Show adapter phy info
show-systemstatus - Show adapter status

Use the connect command and then attach to the Master Control Program, which is the main Palo firmware application, to get more details.

adapter 1/1/1 # connect
adapter 1/1/1 (top):1# help
Available commands:
attach-fls - Attach to fls
attach-mcp - Attach to mcp
estat - Run fc performance monitor
exit - Exit from subshell
help - List available commands
history - Show command history
phy-read - Read PHY register
show-fru - Show FRU contents
show-fwdtab - Show forwarding table
show-log - Show system log
show-macstats - Show MAC statistics

adapter 1/1/1 (top):2# attach-mcp
M81KR - Adapter Debug CLI (vif info)

adapter 2/8/1 (top):2# attach-mcp


vnic - shows vnic overview

FarNorth-A# connect adapter 2/8/1


adapter 2/8/1 # connect
adapter 2/8/1 (top):1# attach-mcp
adapter 2/8/1 (mcp):1# vnic
vnic id : internal id of vnic, use for other vnic cmds
vnic name : ucsm provisioned name for this vnic
vnic type : en=ethernet, fc=fcoe
vnic state: state of vnic
lif : internal logical if id, use for other lif/vif cmds
lif state : state of lif
vif uif : bound uplink 0 or 1, =:primary, -:secondary, >:current
vif ucsm : ucsm id for this vif
vif idx : switch id for this vif (vethXXX)
vif vlan : default vlan for traffic
vif state : state of vif

Details of Vif
• Vif info shows network connectivity
  COS, default vlan, rate limits
• Vif info shows address registration list
  Unicast, broadcast, multicast

adapter 2/8/1 (mcp):2# vif 2
lifid: 2
uif: 0
state: UP
adminst: UP
flags: NIV, CREATED, VIFHASH, VUP, VIFINFO
vifindex: 1241
hash: 89
priority: 0
provinfo.oui : 00 00 0c
provinfo.type: SAM_CA
provinfo.data.vifid : 1241
provinfo.data.cookie : 0x5285a
provinfo.data.viftype: ETH
vifinfo.priority : 0
vifinfo.vifid : 2
vifinfo.default_cos : 0
vifinfo.vifstate : E---
vifinfo.vlan : 1
vifinfo.ratelimit.burstsize : 0
vifinfo.ratelimit.rate : -1

create retries: 2
last req: VIF_ENABLE
req status: OK
req cc: SUCCESS
ev trace: LINK_UP CREATE_FAILED TIMEOUT CREATE_FAILED TIMEOUT CREATE_OK ENABLE_OK SET_UP
reg'd addrs: vlan 0 mac 00:25:b5:00:00:17
             vlan 0 mac ff:ff:ff:ff:ff:ff
             vlan 0 mac 00:00:00:00:00:00
inadd addrs:
toadd addrs:
indel addrs:
todel addrs:

M81KR MAC Statistics

adapter 2/8/1 (mcp):3# dcem-macstats 0

TOTAL DESCRIPTION
24841 Tx frames len == 64
63470 Tx frames 64 < len <= 127
51113 Tx frames 128 <= len <= 255
380 Tx frames 256 <= len <= 511
225020 Tx frames 512 <= len <= 1023
160 Tx frames 1024 <= len <= 1518
2865 Tx frames 1519 <= len <= 2047
367849 Tx total packets
147903879 Tx bytes
367849 Tx good packets
346958 Tx unicast frames
20277 Tx multicast frames
614 Tx broadcast frames
25 Tx frames with VLAN tag
8 Rx Frames len == 64
1063448 Rx Frames 64 < len <= 127
41133 Rx Frames 128 <= len <= 255
24707 Rx Frames 256 <= len <= 511
2359 Rx Frames 512 <= len <= 1023
372 Rx Frames 1024 <= len <= 1518
8901 Rx Frames 1519 <= len <= 2047
1140928 Rx total received packets
110619220 Rx bytes
1140928 Rx good packets
311492 Rx unicast frames
74263 Rx multicast frames
755173 Rx broadcast frames
147903879 Tx bytes for good packets
110619220 Rx bytes for good packets
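One way to sanity-check a dcem-macstats capture is to confirm that the per-size buckets and the unicast/multicast/broadcast counters each add up to the reported totals. The sketch below is only an illustration in Python: the constant and function names are made up for this example, and the numbers are copied by hand from the output above.

```python
# Counters copied by hand from the dcem-macstats output above.
# All names here are illustrative, not part of any Cisco tool.
TX_SIZE_BUCKETS = [24841, 63470, 51113, 380, 225020, 160, 2865]
TX_CAST_TYPES = [346958, 20277, 614]        # unicast, multicast, broadcast
TX_TOTAL = 367849
RX_SIZE_BUCKETS = [8, 1063448, 41133, 24707, 2359, 372, 8901]
RX_CAST_TYPES = [311492, 74263, 755173]     # unicast, multicast, broadcast
RX_TOTAL = 1140928


def counters_consistent(parts, total):
    """Return True when the partial counters add up to the reported total."""
    return sum(parts) == total


def share(part, total):
    """Fraction of the total represented by one counter."""
    return part / total


if __name__ == "__main__":
    checks = [
        ("Tx size buckets", TX_SIZE_BUCKETS, TX_TOTAL),
        ("Tx cast types", TX_CAST_TYPES, TX_TOTAL),
        ("Rx size buckets", RX_SIZE_BUCKETS, RX_TOTAL),
        ("Rx cast types", RX_CAST_TYPES, RX_TOTAL),
    ]
    for label, parts, total in checks:
        print(label, "consistent:", counters_consistent(parts, total))
    print(f"Rx broadcast share: {share(RX_CAST_TYPES[2], RX_TOTAL):.1%}")
```

In this capture all four checks hold, and roughly two thirds of the received frames are broadcast, which may be worth a closer look when chasing performance concerns.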

Adapter Debug CLI (logs)

• show-log – display internal adapter logs

adapter 1/3/1 (top):2# show-log

2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.uif[289]-6-Port 0 set to VNTAG mode
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.uif[289]-6-Port 0: Running
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.vif[289]-6-uif0 starting link up in niv mode
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.vic[289]-6-vic0: peer eth0.0 00:0d:ec:6d:b8:3c start
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.uif[289]-6-Port 0 FSM: WAIT_NIVDELAYTIMEO/RXVNTAG => RUNNING
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.vic[289]-6-vic0: starting timer for peer VIC_OPEN
2009 Oct 5 16:21:15 palo %BCxx_MEZZxxxx_mcp.vic[289]-6-vic0: app_start_done flags OPEN_SENT status OK
...

Memory errors
Check the Server Event Log/Faults: sh sel 2/1

5ed | 03/29/2010 02:20:50 | Memory 0x02 | Uncorrectable ECC/other uncorrectable memory error | Rank: 0, DIMM Socket: 1, Channel: C, Socket: 0 | Asserted
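When many SEL entries have to be reviewed, it can help to split each line into fields and pull out the DIMM location. The following Python sketch assumes every entry has the same six pipe-separated fields as the line above; the function name and dictionary keys are chosen for this example only.

```python
def parse_sel_line(line):
    """Split one pipe-separated SEL entry into named fields.

    Assumes the six-field layout of the sample line above; the key names
    are illustrative, not an official schema.
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    record_id, timestamp, sensor, description, detail, state = fields
    # Detail looks like "Rank: 0, DIMM Socket: 1, Channel: C, Socket: 0".
    location = dict(item.strip().split(": ", 1) for item in detail.split(","))
    return {
        "id": record_id,
        "timestamp": timestamp,
        "sensor": sensor,
        "description": description,
        "location": location,
        "state": state,
    }


SAMPLE = ("5ed | 03/29/2010 02:20:50 | Memory 0x02 | "
          "Uncorrectable ECC/other uncorrectable memory error | "
          "Rank: 0, DIMM Socket: 1, Channel: C, Socket: 0 | Asserted")

if __name__ == "__main__":
    entry = parse_sel_line(SAMPLE)
    print(entry["description"], "at", entry["location"])
```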

What to gather and look at for memory issues

• On CIMC - do show tech
• On KVM - capture the BIOS version
• On KVM - capture the memory configuration from the BIOS
• On CIMC - capture the BIOS version
• On CIMC - capture the memory inventory
• Show mem details (get shot)

Reboots

 Need to find out reason for reboot of hardware

• BMC (CIMC) – issue in hardware/firmware on server

• UCS Service Profile – caused by a profile change/issue

• Other Hardware on the blade – CPU, Memory

• User induced – reset button

Blade Reboots – Viewing OBFL for reason of reboot
Reboot - pressing front-panel button:
0:2009 Dec 29 19:45:04:BMC:kernel::<0>LPC Reset ISR -> ResetState: 1 <---this indicates Reset occurred
4:2009 Dec 29 19:45:04:BMC:kernel::<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/vdd_pwr_good/
gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Deasserted
5:2009 Dec 29 19:45:04:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = OFF
5:2009 Dec 29 19:45:04:BMC:kernel:-:<5>USB HS: VDD Power WAKEUP- Power Good = OFF
1:2009 Dec 29 19:45:04:BMC:kernel::<1>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/block_transfer/
block_transfer.c:564:block_transfer_deallocate_entire_list --> Dumped: 0x0000 files.
5:2009 Dec 29 19:45:04:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[0]
5:2009 Dec 29 19:45:04:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[1]
5:2009 Dec 29 19:45:05:BMC:IPMI:470: Pilot2SrvPower.c:369:Blade Power Changed To: [ OFF ]
5:2009 Dec 29 19:45:05:BMC:IPMI:497: VirtualSEL.c:26:SEL Evt[02 0D]< C1 0B 02 41 5C 3A 4B 20 00 04 25 52 08 00 FF FF >
4:2009 Dec 29 19:45:34:BMC:kernel:-:<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/vdd_pwr_good/
gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Asserted
5:2009 Dec 29 19:45:34:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = ON

This is the signature of a HW failure (power off followed by power on within 4-5 seconds; an Intel feature that reacts to HW failure):
0:2009 Nov 25 11:44:55:BMC:kernel::<0>LPC Reset ISR -> ResetState: 1 <---this indicates Reset occurred
4:2009 Nov 25 11:44:55:BMC:kernel:-:<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/vdd_pwr_good/
gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Deasserted
5:2009 Nov 25 11:44:55:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = OFF
5:2009 Nov 25 11:44:55:BMC:kernel:-:<5>USB HS: VDD Power WAKEUP- Power Good = OFF
1:2009 Nov 25 11:44:55:BMC:kernel::<1>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/block_transfer/
block_transfer.c:564:block_transfer_deallocate_entire_list --> Dumped: 0x0000 files.
5:2009 Nov 25 11:44:55:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[0]
5:2009 Nov 25 11:44:55:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[1]
4:2009 Nov 25 11:44:55:BMC:kernel:-:<4>kbdmouse_write: mouse write aborted for device reset.
5:2009 Nov 25 11:44:55:BMC:IPMI:472: Pilot2SrvPower.c:369:Blade Power Changed To: [ OFF ]
5:2009 Nov 25 11:44:55:BMC:IPMI:500: VirtualSEL.c:26:SEL Evt[22 02]< 22 02 02 B7 18 0D 4B 20 00 04 25 52 08 00 FF FF >
3:2009 Nov 25 11:45:16:BMC:doctor-bmc:584: doctor-bmc.c:1143:Tcp -> Connection between remote ip 0xFE00037F at port 0x86A4
and local ip 0x200037F at port 0xFAA is in TCP_TIME_WAIT state for at least 2 min 30 seconds.
3:2009 Nov 25 11:45:16:BMC:doctor-bmc:584: doctor-bmc.c:1155:Tcp -> Total Errors Found: 1
5:2009 Nov 25 11:45:21:BMC:kernel:-:<5>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/pilot2_power
/pilot2_power.c:266:do_power_on
remote ip 0xFE00037F = bytes 254 0 3 127, i.e. 127.3.0.254 (the CMC0 interface to the blades), and local ip 0x200037F = bytes 2 0 3 127, i.e. 127.3.0.2,
which is blade 2's IP facing the CMC
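The hex addresses printed by doctor-bmc can be decoded by reading the four bytes of the value in reverse order. A minimal Python sketch, assuming the value is stored least-significant byte first as the byte breakdown above suggests (the function name is hypothetical):

```python
import socket
import struct


def decode_bmc_ip(value):
    """Convert a hex address from the BMC log into dotted-quad notation.

    Assumption: the logged integer is little-endian, so its lowest byte
    (0x7F in both samples) is the first octet of the address.
    """
    return socket.inet_ntoa(struct.pack("<I", value))


if __name__ == "__main__":
    print("remote:", decode_bmc_ip(0xFE00037F))
    print("local: ", decode_bmc_ip(0x0200037F))
```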
Blade Reboots – Viewing OBFL for reason

This is an actual customer power reset from UCSM (power on about 8 minutes later):

0:2009 Dec 22 17:16:26:BMC:kernel::<0>LPC Reset ISR -> ResetState: 1 <---this indicates Reset occurred
4:2009 Dec 22 17:16:26:BMC:kernel:-:<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/vdd_pwr_good/
gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Deasserted
5:2009 Dec 22 17:16:26:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = OFF
5:2009 Dec 22 17:16:26:BMC:kernel:-:<5>USB HS: VDD Power WAKEUP- Power Good = OFF
1:2009 Dec 22 17:16:26:BMC:kernel::<1>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/block_transfer/
block_transfer.c:564:block_transfer_deallocate_entire_list --> Dumped: 0x0000 files.
5:2009 Dec 22 17:16:26:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[0]
5:2009 Dec 22 17:16:26:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[1]
5:2009 Dec 22 17:16:27:BMC:IPMI:474: Pilot2SrvPower.c:369:Blade Power Changed To: [ OFF ]
5:2009 Dec 22 17:16:27:BMC:IPMI:511: VirtualSEL.c:26:SEL Evt[98 02]< 98 02 02 EB FE 30 4B 20 00 04 25 52 08 00 FF FF >
5:2009 Dec 22 17:24:49:BMC:[email protected]:1275: mcserver_ipmi_extensions.c:212:[mcserver_set_vdd_power] "Power Cycle”
5:2009 Dec 22 17:24:49:BMC:kernel:-:<5>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/pilot2_power/pilot2_power.c:313:do_cycle
5:2009 Dec 22 17:24:49:BMC:kernel:-:<5>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/pilot2_power/pilot2_power.c:232:do_power_off
5:2009 Dec 22 17:24:59:BMC:kernel:-:<5>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/pilot2_power/pilot2_power.c:266:do_power_on
4:2009 Dec 22 17:24:59:BMC:kernel:-:<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/vdd_pwr_good/gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Asserted
5:2009 Dec 22 17:24:59:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = ON
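Since the time between power-off and power-on is one of the clues that separates these reboot cases, it can be measured directly from the OBFL timestamps. The Python sketch below assumes the line prefix format seen in these captures (severity digit, then "YYYY Mon DD HH:MM:SS", then ":BMC:") and uses the "Blade Power Changed To: [ OFF ]" and "Power Good = ON" messages as markers; the function names are invented for this example.

```python
import re
from datetime import datetime

# Prefix format assumed from the captures above: "5:2009 Dec 22 17:16:27:BMC:..."
PREFIX = re.compile(r"^\d:(\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}):BMC:")


def timestamp(line):
    """Return the datetime of one OBFL line, or None for wrapped continuation lines."""
    match = PREFIX.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y %b %d %H:%M:%S")


def off_to_on_seconds(lines):
    """Seconds between the first power-off message and the next Power Good = ON."""
    off_time = None
    for line in lines:
        stamp = timestamp(line)
        if stamp is None:
            continue
        if off_time is None and "Blade Power Changed To: [ OFF ]" in line:
            off_time = stamp
        elif off_time is not None and "Power Good = ON" in line:
            return (stamp - off_time).total_seconds()
    return None


UCSM_RESET = [
    "5:2009 Dec 22 17:16:27:BMC:IPMI:474: Pilot2SrvPower.c:369:Blade Power Changed To: [ OFF ]",
    "5:2009 Dec 22 17:24:59:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = ON",
]

if __name__ == "__main__":
    print("off-to-on gap in seconds:", off_to_on_seconds(UCSM_RESET))
```

For the two UCSM lines above the gap is 512 seconds, in line with the "about 8 minutes" noted on the slide.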

Blade Reboots – Viewing OBFL for reason

This is an IPMI request coming from UCSM, either as an authorized reboot or as a result of the desired power state being OFF.

5:2009 Dec 23 18:16:58:BMC:[email protected]:1275: mcserver_ipmi_extensions.c:212:[mcserver_set_vdd_power]
"Power Off" <--- indicator that an IPMI initiated reset has occurred.
5:2009 Dec 23 18:16:58:BMC:kernel:-:<5>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/pilot2_power/pilot2_power.c:232:do_power_off
0:2009 Dec 23 18:17:03:BMC:kernel::<0>LPC Reset ISR -> ResetState: 1 <---this indicates you've entered Reset for whatever reason
4:2009 Dec 23 18:17:03:BMC:kernel:-:<4>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers/
vdd_pwr_good/gooding/vdd_pwr_good_cb.c:19:Platform is Gooding: Deasserted
5:2009 Dec 23 18:17:03:BMC:kernel:-:<5>USB FS: VDD Power WAKEUP- Power Good = OFF
5:2009 Dec 23 18:17:03:BMC:kernel:-:<5>USB HS: VDD Power WAKEUP- Power Good = OFF
1:2009 Dec 23 18:17:03:BMC:kernel::<1>/nuova/builds1/ca-ventura_1-build/091027-100438-rev34618-FCSd/bmc/drivers
/block_transfer/block_transfer.c:564:block_transfer_deallocate_entire_list --> Dumped: 0x0000 files.
5:2009 Dec 23 18:17:03:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[0]
5:2009 Dec 23 18:17:03:BMC:kernel:-:<5>handle_exception: Handling MSD_STATE_DISCONNECT for interface[1]

For all resets, the DME logs should also be reviewed for more information.
The DME log (svc_sam_dme.log) is found under /var/sysmgr/sam_logs/ inside the .tar file
produced by <show tech-support ucsm detail>

A# connect local-mgmt
A(local-mgmt)# show tech-support ucsm detail
Serial over LAN (SoL)
• Requires Serial over LAN and an IPMI profile to be configured and then applied to the service profile
• Access via the same IP address as KVM
• Can be configured on the fly and applied to the service profile without disruption
• Uses the open IPMI tool ipmitool: http://ipmitool.sourceforge.net/

[Diagram: an IPMI user on the management network accesses the BMC interface over a Serial over LAN connection to the KVM endpoint IP address on the blade]

IPMI
 IPMI does not run on the OS installed on the blade
Totally independent of the installed OS; runs even if OS is down

 IPMI runs on the Baseboard Management Controller


Supports serviceability in four main areas:
• System Event Log (SEL)
OS Watchdog, hardware alerts, etc.
• Sensors Data Repository (SDR)
Temperature controls, Inventory, etc.
• Power control
• Serial over LAN
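The four areas map naturally onto ipmitool subcommands (sel list, sdr list, chassis power status, sol activate). The Python sketch below only builds the command lines and does not run them. The host address, user name and password are placeholders, and the use of the lanplus interface is an assumption that depends on how the IPMI profile is configured.

```python
# Mapping from the four serviceability areas to ipmitool subcommands.
# The dictionary keys are labels chosen for this example.
SUBCOMMANDS = {
    "sel": ["sel", "list"],                   # System Event Log
    "sdr": ["sdr", "list"],                   # Sensor Data Repository
    "power": ["chassis", "power", "status"],  # Power control (read-only query)
    "sol": ["sol", "activate"],               # Serial over LAN session
}


def ipmitool_argv(host, user, password, area):
    """Build an ipmitool command line for one serviceability area.

    Assumes IPMI-over-LAN v2.0 (-I lanplus); adjust if the BMC is set up
    differently.
    """
    if area not in SUBCOMMANDS:
        raise ValueError(f"unknown area: {area}")
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, *SUBCOMMANDS[area]]


if __name__ == "__main__":
    # 192.0.2.10, ipmiuser and secret are placeholders, not real values.
    for area in SUBCOMMANDS:
        print(" ".join(ipmitool_argv("192.0.2.10", "ipmiuser", "secret", area)))
```

The resulting list can be handed to subprocess.run on a management station where ipmitool is installed.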

DMIDECODE http://www.nongnu.org/dmidecode/

• Dmidecode reports information about your system's hardware as described in your system BIOS according to the SMBIOS/DMI standard.

 This will often include usage status for the CPU sockets, expansion slots (e.g.
AGP, PCI, ISA) and memory module slots, and the list of I/O ports (e.g. serial,
parallel, USB).

 Support for Linux and Windows

dmidecode --type {KEYWORD / Number }

bios
system
baseboard
chassis
processor
memory
cache
connector
slot
IOM (FEX) Troubleshooting

Troubleshooting Flow
We will work from Blade servers up toward LAN and SAN network

End

LAN-SAN

Fabric-
Interconnects

IOM Modules

Blades
Start
IOM connections: chassis backplane view

[Diagram: chassis backplane view. Each blade has a Path A and a Path B to the chassis IOMs (IOM1 and IOM2)]

Half-width servers: 1 mezz card (one A and one B path)
Full-width servers: 2 mezz cards (two A & B paths)

FarNorth-A(nxos)# show fex
  FEX         FEX           FEX          FEX
Number    Description      State        Model          Serial
------------------------------------------------------------------------
1         FEX0001          Online       N20-C6508      QCI132800SN
2         FEX0002          Online       N20-C6508      QCI131600Z9
IOM connections
• Each IOM (aka 'Fabric Extender') provides
  8+1 internal IO channels (8 slots + 1 internal mgmt network)
  4 external ports (10Gbps each; no Etherchannel in the 1st release)
• The servers' mezz cards use those IO channels for external connectivity
• Servers with one mezz card use one IO channel per IOM
  vNIC1 can for instance use IOM 1 while vNIC2 uses IOM2
  This vNIC-to-IOM 'routing' is flexible and user-configurable
• Servers with two mezz cards use two IO channels per IOM
• Server vNICs are automatically pinned to fabric links
• Each IOM actually provides a 9th internal IO channel for internal management connectivity

Viewing Blade ports

• These interfaces (from <sh int brief> at the NXOS prompt) are backplane traces
• Eth X/Y/Z where
  X = chassis number
  Y = mezz card number (always 1 with half-width blades)
  Z = IOM port number (slot where the blade server resides)
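When scanning long interface listings, the X/Y/Z naming can be decoded mechanically. A small Python sketch, with a hypothetical helper name and field names taken from the legend above:

```python
import re
from typing import NamedTuple


class BackplanePort(NamedTuple):
    chassis: int   # X
    mezz: int      # Y
    slot: int      # Z (IOM port number, i.e. the blade slot)


def parse_backplane_port(name):
    """Decode an 'EthX/Y/Z' or 'EthernetX/Y/Z' interface name."""
    match = re.fullmatch(r"Eth(?:ernet)?\s*(\d+)/(\d+)/(\d+)", name.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a backplane interface name: {name!r}")
    return BackplanePort(*(int(part) for part in match.groups()))


if __name__ == "__main__":
    print(parse_backplane_port("Eth2/1/8"))
    print(parse_backplane_port("Ethernet1/1/7"))
```

Two-part names such as Eth1/1 are fabric interconnect ports rather than backplane traces, so the helper rejects them.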

IOM to Fabric Interconnect connections
 UCSM calls these ports server ports
 NXOS CLI calls them fex-fabric interfaces
Note: those EthX/Y ports are interfaces on the fabric interconnects

 There can be 1, 2 or 4 ports between an IOM and a FI


FarNorth-A(nxos)# sh interface fex-fabric
       Fabric     Fabric       Fex          FEX
Fex    Port       Port State   Uplink       Model         Serial
---------------------------------------------------------------
1      Eth1/1     Active       1            N20-C6508     QCI132800SN
1      Eth1/2     Active       2            N20-C6508     QCI132800SN
2      Eth1/5     Active       2            N20-C6508     QCI131600Z9
2      Eth1/6     Active       1            N20-C6508     QCI131600Z9

interface Ethernet1/1
switchport mode fex-fabric
pinning server
fex associate 1 chassis-serial FOX1327GKGN module-serial QCI132800SN module-slot left
no shutdown

interface Ethernet1/2
switchport mode fex-fabric
pinning server
fex associate 1 chassis-serial FOX1327GKGN module-serial QCI132800SN module-slot left
no shutdown
Actual IOM-to-FI pinning scheme
Server slots pinned to uplink

1 link
Uplink: slots 1,2,3,4,5,6,7,8
How to read this: with one IOM-to-FI link, all servers use that link

2 links
Uplink 1: slots 1,3,5,7
Uplink 2: slots 2,4,6,8
How to read this: with two IOM-to-FI links, servers in slots 1,3,5,7 use link number 1 while other slots use link number 2

4 links
Uplink 1: slots 1,5
Uplink 2: slots 2,6
Uplink 3: slots 3,7
Uplink 4: slots 4,8
How to read this: with four IOM-to-FI links, servers in slots 1 and 5 use link 1, servers in slots 2 and 6 use link 2, etc.
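The three pinning layouts follow a simple round-robin pattern, which can be written as a one-line formula. The Python sketch below reproduces the slot-to-uplink assignments listed on this slide; the function name is made up for illustration, and the formula is an observation about these tables rather than a statement of how the firmware computes pinning.

```python
def pinned_uplink(slot, links):
    """Return the IOM uplink number (1-based) that a blade slot is pinned to.

    Covers only the 1-, 2- and 4-link layouts shown on the slide.
    """
    if links not in (1, 2, 4):
        raise ValueError("only 1, 2 or 4 IOM-to-FI links are described")
    if not 1 <= slot <= 8:
        raise ValueError("slot must be between 1 and 8")
    return (slot - 1) % links + 1


if __name__ == "__main__":
    for links in (1, 2, 4):
        layout = {}
        for slot in range(1, 9):
            layout.setdefault(pinned_uplink(slot, links), []).append(slot)
        print(f"{links} link(s):", layout)
```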
Understanding the Virtual Interface

• Servers with one mezz card present two 10GE 'external' interfaces to the Fabric Interconnect
• The server OS views the interfaces as 10GE NICs and HBAs, depending on the configuration specified in the Service Profile
• These northbound interfaces can carry both Ethernet and FC traffic (FCoE). We need a mechanism to identify the origin server

Hence the concept of a Virtual Interface, or VIF (see next slide)

Virtual interfaces (Vif)
[Diagram: Blade 1. The 'southbound' or OS-side interfaces (veth0, veth1, vhba0, vhba1) sit above the external mezz card 10GE ports 0 and 1. A virtual interface tag associates frames to a VIF. The ports reach IOM 1 and IOM 2 through Eth X/Y/Z interfaces, and the IOM-to-FI links carry Vif 1 to Vif 4 to Fabric A and Fabric B.]

Verifying IOM-to-FI pinning
• Good for identifying the proper path to the mezz adapter
• E.g.: IOM 1, slot 7 pinned to link 1; IOM 2, slot 8 pinned to link 5. Do show run int eX/Y/Z to verify

FarNorth-A(nxos)# show run interface ethernet 1/1/7
version 4.1(3)N2(1.3)

interface Ethernet1/1/7
vntag max-vifs 30
pinning server
fabric-interface Eth1/1
no shutdown

FarNorth-A(nxos)# show run interface ethernet 2/1/8
version 4.1(3)N2(1.3)

interface Ethernet2/1/8
vntag max-vifs 30
pinning server
fabric-interface Eth1/5
no shutdown
Show Fex Detail

FEX: 1 Description: FEX0001 state: Online
FEX version: 4.1(3)N2(1.3) [Switch version: 4.1(3)N2(1.3)]
FEX Interim version: 4.1(3)N2(1.2.168a)
Switch Interim version: 4.1(3)N2(1.2.168a)
Chassis Model: N20-C6508, Chassis Serial: FOX1327GKGN
Extender Model: N20-I6584, Extender Serial: QCI132800SN
Part No: 73-11623-04
Card Id: 67, Mac Addr: 00:26:51:08:67:f4, Num Macs: 10
Module Sw Gen: 12594 [Switch Sw Gen: 21]
pinning-mode: static Max-links: 1
Fabric port for control traffic: Eth1/1
Fabric interface state:
Eth1/1 - Interface Up. State: Active
Eth1/2 - Interface Up. State: Active
Fex Port    State   Fabric Port   Primary Fabric
Eth1/1/1    Up      Eth1/1        Eth1/2
Eth1/1/2    Up      Eth1/2        Eth1/2
Eth1/1/3    Up      Eth1/1        Eth1/2
Eth1/1/4    Up      Eth1/2        Eth1/2
Eth1/1/7    Up      Eth1/1        Eth1/2
Eth1/1/9    Up      Eth1/2        Eth1/2

FEX: 2 Description: FEX0002 state: Online
FEX version: 4.1(3)N2(1.3) [Switch version: 4.1(3)N2(1.3)]
FEX Interim version: 4.1(3)N2(1.2.168a)
Switch Interim version: 4.1(3)N2(1.2.168a)
Chassis Model: N20-C6508, Chassis Serial: FOX1317G26R
Extender Model: N20-I6584, Extender Serial: QCI131600Z9
Part No: 73-11623-04
Card Id: 67, Mac Addr: 00:24:97:1f:6d:aa, Num Macs: 10
Module Sw Gen: 12594 [Switch Sw Gen: 21]
pinning-mode: static Max-links: 1
Fabric port for control traffic: Eth1/5
Fabric interface state:
Eth1/5 - Interface Up. State: Active
Eth1/6 - Interface Up. State: Active
Fex Port    State   Fabric Port   Primary Fabric
Eth2/1/1    Up      Eth1/6        Eth1/5
Eth2/1/2    Up      Eth1/5        Eth1/5
Eth2/1/8    Up      Eth1/5        Eth1/5
Eth2/1/9    Up      Eth1/5        Eth1/5

Attaching to FEX

FarNorth-A# connect iom ?
<1-255> Chassis ID

FarNorth-A# connect iom 1
Attaching to FEX 1 ...
To exit type 'exit', to abort type '$.'
Bad terminal type: "xterm". Will assume vt100.

From the FEX attach CLI, the user can monitor CPU, memory etc.
show system resources
show process cpu
show process memory
show system uptime

VIFs
• Ethernet and FC are muxed on the same physical links, hence the concept of virtual interfaces (vifs) to split Eth and FC
• Two types of VIFs: veth and vfc
  veth for Ethernet; vfc for FC traffic
• Each EthX/Y/Z interface typically has multiple vifs attached to it to carry traffic to and from a server
• To find all vifs associated with an EthX/Y/Z interface, do this:

FarNorth-A(nxos)# show vifs interface ethernet 2/1/8

Interface VIFS
-------------- ---------------------------------------------------------
Eth2/1/8 veth1241, veth1243, veth9461, veth9463

VIFs for FC traffic (FCoE)
• All vifs associated with an EthX/Y/Z interface are pinned to the fabric port that EthX/Y/Z interface is pinned to.
• Vifs in the 10000+ range are used for FC traffic. Check the VLAN to VSAN mapping (show vlan fcoe)

FarNorth-A(nxos)# show vifs interface ethernet 2/1/8

Interface VIFS
-------------- ---------------------------------------------------------
Eth2/1/8 veth1241, veth1243, veth9461, veth9463,

FarNorth-A(nxos)# sh int vethernet 9463
vethernet9463 is up
Bound Interface is Ethernet2/1/8
Hardware: VEthernet
Encapsulation ARPA
Port mode is access
Last link flapped 1week(s) 1day(s)
Last clearing of "show interface" counters never
1 interface resets

FarNorth-A(nxos)# show vifs interface vethernet 9463

Interface VIFS
-------------- ---------------------------------------------------------
veth9463 vfc1271,

FarNorth-A(nxos)# show int vfc1271
vfc1271 is up
Bound interface is vethernet9463
Hardware is Virtual Fibre Channel
Port WWN is 24:f6:00:0d:ec:d0:7b:7f
FCoE VLAN is 100
Admin port mode is F, trunk mode is off
snmp link state traps are enabled
Port mode is F, FCID is 0x710005
Port vsan is 100

FarNorth-A(nxos)# show vlan fcoe
VLAN     VSAN     Status
-------- -------- --------
1        1        Operational
100      100      Operational
Redwood Connection Information

show tech-support fex <1 or 2>

This captures the output needed to determine congestion, packet counters, and pause control on the server ports and network ports of the IOM

The next few slides show a few examples of the output

Redwood Traffic Information
Traffic Rates on IOM

Will show pause frames and drops if looking for performance concerns

RMON Stats

Top commands for debugging
# Port Info
Show clock
Show platform fwm event-history lif <PORT>
Show system internal ethpm info interface <PORT>
Show system internal ethpm event-history interface <PORT>
Show platform software dcbx internal info interface <PORT>
Show platform software dcbx internal errors
Show platform software sifmgr info interface <PORT>
Show clock

# Global Info
Show clock
Show platform fwm event-history errors
Show platform fwm event-history msgs
Show platform fwm errors
Show system internal ethpm event-history errors
Show system internal ethpm info trace
Show system internal ethpm event-history msgs
Show platform software sifmgr event-history errors
Show platform software sifmgr event-history lock
Show platform software sifmgr info trace
Show platform software sifmgr event-history msgs

# IOM
Connect local-mgmt <fabric>
Connect iom <chassis_id>
terminal length 0
show platform software redwood sts
show platform software redwood oper
show platform software redwood log
show platform software redwood elog
show platform software redwood ilog
show platform software redwood ints

Fabric Interconnect Troubleshooting

Troubleshooting Flow
We will work from Blade servers up toward LAN and SAN network

End

LAN-SAN

Fabric-
Interconnects

IOM Modules

Blades
Start
6100 Fabric Interconnect Troubleshooting

 Understanding the Fabric Port Manager


 Physical Links issues
 Server Links
 FEX-Links
 DCBX Discovery
 Mac Addresses functions in End Host Mode

Fabric Port Management

• Managed by UCS Manager as part of the overall chassis discovery process
• Number of deployed fabric ports defined in UCS Manager service profile
• A change in the number of deployed fabric ports requires 'Re-acknowledge Chassis'
• Supports Explicit Pinning only, as determined by UCS Manager
• UCS Manager recalculates pinning distribution when fabric port(s) go down
• Supports even number of fabric ports only
• No support for fabric port channel

Troubleshooting 10GBE - Link Not Coming Up
Check PHY driver software link state:
switch# show hardware internal gatos port ethernet 1/19 xcvr info

Port 0/18:
State: UP
XCVR insert debounce timer running
XCVR link debounce timer not running
TX enable signal is on
Debounce timeout: 0.100 seconds

Link up : 506097 usecs after Wed May 12 22:38:08 2010


Link dn debounce start : 0 usecs after Thu Jan 1 00:00:00 1970
Link debounce end : 0 usecs after Thu Jan 1 00:00:00 1970

Counters:
Interrupt cntrs:
Bit error cntrs:
Bit Error Rate: 0x0000000000000000 Bit Error Rate(since linkup): 0x00000000
Error blocks : 0x0000000000000043 Error blocks(since linkup) : 0x00000011
Link cntrs:
Link up: 0x9 (9)
Link dn: 0x0 (0)
Link debounced with link up: 0x0 (0)
Link debounced with link up since last enable: 0x0 (0)

Enabling the Server link
• After enabling fabric port

FarNorth-A(nxos)# show running-config interface ethernet 1/1


version 4.1(3)N2(1.3)

interface Ethernet1/1
switchport mode fex-fabric
pinning server
fex associate 1 chassis-serial FOX1327GKGN module-serial QCI132800SN module-slot left
no shutdown

FarNorth-A(nxos)# show interface fex-fabric
       Fabric     Fabric       Fex          FEX
Fex    Port       Port State   Uplink       Model         Serial
--------------------------------------------------------------------------------------
1      Eth1/1     Active       1            N20-C6508     QCI132800SN
1      Eth1/2     Active       2            N20-C6508     QCI132800SN
2      Eth1/5     Active       2            N20-C6508     QCI131600Z9

Transition States:
2                 Discovered   1            N20-C6508     QCI131600Z9
2      Eth1/6     Configured   1            N20-C6508     QCI131600Z9
2      Eth1/6     Fabric Up    0
2      Eth1/6     Active       1            N20-C6508     QCI131600Z9

Fabric Port Management
FarNorth-A(nxos)# show fex 1 detail
FEX: 1 Description: FEX0001 state: Online
FEX version: 4.1(3)N2(1.3) [Switch version: 4.1(3)N2(1.3)]
FEX Interim version: 4.1(3)N2(1.2.168a)
Switch Interim version: 4.1(3)N2(1.2.168a)
Chassis Model: N20-C6508, Chassis Serial: FOX1327GKGN
Extender Model: N20-I6584, Extender Serial: QCI132800SN
Part No: 73-11623-04
Card Id: 67, Mac Addr: 00:26:51:08:67:f4, Num Macs: 10
Module Sw Gen: 21 [Switch Sw Gen: 21]
pinning-mode: static Max-links: 1
Fabric port for control traffic: Eth1/1
Fabric interface state: <--- Fabric Ports
Eth1/1 - Interface Up. State: Active
Eth1/2 - Interface Up. State: Active
Fex Port    State   Fabric Port   Primary Fabric <--- Fabric Port column shows the pinned fabric port
Eth1/1/1    Up      Eth1/1        Eth1/2
Eth1/1/2    Up      Eth1/2        Eth1/2
Eth1/1/3    Up      Eth1/1        Eth1/2
Eth1/1/4    Up      Eth1/2        Eth1/2
Eth1/1/7    Up      Eth1/1        Eth1/2
Eth1/1/9    Up      Eth1/2        Eth1/2
Logs: <--- FEX Event history
[05/12/2010 22:38:28.273779] Module register received
[05/12/2010 22:38:28.276776] Registration response sent
[05/12/2010 22:38:28.546132] Module Online Sequence
[05/12/2010 22:38:29.45265] Module Online
Network Interface Virtualization (NIV) - protocol negotiation w/ DCBX

• Switch and adapter use the DCBX (LLDP based protocol) NIV TLV (Feature Type 7, Subtype 0) to:
  • indicate NIV capability
  • negotiate the control VNTAG for the virtual interface used by the adapter management entity

• Initial protocol frames are non-VNTAG

• All frames contain VNTAG once negotiated

• VIC protocol
  • Allocate/Deallocate virtual interfaces (driven by Interface Virtualizer)
  • Set VIF State (active/standby)
  • Virtual Interface list management (driven by switch)
  • MAC address registration (mac filtering offload from adapter to switch)

DCBX Troubleshooting
Checking for DCBX negotiation results
In the dump of "show platform software dcbx internal info interface ethernet 1/1/1", look for every feature negotiation result as shown below:

feature type 3 sub_type 0
feature state variables: oper_version 0 error 0 oper_mode 1 feature_seq_no 0 remote_feature_tlv_present 1 remote_tlv_not_present_notification_sent 0 remote_tlv_aged_out 0
feature register params max_version 0, enable 1, willing 0 advertise 1, disruptive_error 0 mts_addr_node 0x101 mts_addr_sap 0x1e5
Desired config cfg length: 1 data bytes:08
Operating config cfg length: 1 data bytes:08

Error
1) Indicates negotiation error.
2) Never expected to happen when connected to CNA adaptor
3) When two N5Ks are connected back-to-back
4) If PFC is enabled on different CoS values negotiation error can happen
Operating Config
Indicates negotiation result
Absence of operating config indicates that the peer does not support this DCBX TLV or negotiation error
"remote_feature_tlv_present" indicates whether the remote peer supports this feature TLV or not
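The checks described on this slide can be applied mechanically to each feature block in the dump. The Python sketch below assumes the "feature state variables" text is available as a single line of name/value pairs, as in the sample; the function names and the wording of the findings are invented for this example.

```python
def parse_state_variables(line):
    """Turn 'feature state variables: name value name value ...' into a dict."""
    prefix = "feature state variables:"
    body = line.strip()
    if body.startswith(prefix):
        body = body[len(prefix):]
    tokens = body.split()
    return {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}


def diagnose(state, operating_config):
    """List the problems the slide says to look for; an empty list means none found.

    operating_config is the 'data bytes' string of the Operating config line,
    or None when that line is absent from the dump.
    """
    findings = []
    if state.get("error", 0) != 0:
        findings.append("negotiation error flagged")
    if state.get("remote_feature_tlv_present", 0) != 1:
        findings.append("peer did not send this feature TLV")
    if operating_config is None:
        findings.append("no operating config: peer lacks this TLV or negotiation failed")
    return findings


SAMPLE = ("feature state variables: oper_version 0 error 0 oper_mode 1 "
          "feature_seq_no 0 remote_feature_tlv_present 1 "
          "remote_tlv_not_present_notification_sent 0 remote_tlv_aged_out 0")

if __name__ == "__main__":
    state = parse_state_variables(SAMPLE)
    print(diagnose(state, "08") or "negotiation looks clean")
```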

MAC Address Learning Functions
• Server mac address is learned via traffic generated by the server
• Once learned, the server mac address is static
• Server mac address is only learned on a server port
• MAC address learning is disabled on border ports
• Network to server traffic can only be forwarded (subject to RPF and déjà vu check) if the server mac address is already learned on a server port.
• Server mac address can 'move' from one server port to another server port
• Server mac address can 'move' outside the EH-node. The old server mac address is removed when a packet with the same source mac is received on the original pinned border port (more on that later). E.g. a VM moved and generates a gratuitous arp
• Adapter can register mac addresses with the switch
• Switch offloads adapter from performing mac address filtering
• Menlo adapters always register * (send all traffic to Menlo)

Verifying End Host Mode Status and Configuration
 Mac address table

FarNorth-A(nxos)# show mac-address-table ?
  <CR>
  >             Redirect it to a file
  >>            Redirect it to a file in append mode
  address       Address
  aging-time    Display Aging Time (configured or default)
  count         Display only the count of MAC entries
  dynamic       Display Dynamic Entries
  interface     Interface
  multicast     Show Multicast MAC Table entries
  notification  Display Notification Information
  static        Display Static Entries
  vlan          VLAN
  |             Pipe command output to filter

FarNorth-A(nxos)# show mac-address-table
VLAN      MAC Address       Type    Age       Port
---------+-----------------+-------+---------+------------------------------
1         0025.b500.0004    static  0         veth1235
1         0025.b500.0007    static  0         veth1243
1         0025.b500.0008    static  0         veth1200
1         0025.b500.0009    static  0         veth1199
1         0025.b500.000c    static  0         veth1207
1         0025.b500.0017    static  0         veth1241
1         0025.b500.0018    static  0         veth1277
.
. <cut>
.
4044      0024.971f.6a45    dynamic 0         Eth1/1/9
4044      0024.971f.6b6f    dynamic 0         Eth1/1/9
4044      0024.971f.6b8d    dynamic 0         Eth2/1/9
4044      0024.971f.6da8    dynamic 0         Eth2/1/9
4044      0026.5108.67f2    dynamic 0         Eth1/1/9
4044      0026.5108.7de1    dynamic 0         Eth1/1/9
4044      0026.5108.ac59    dynamic 0         Eth1/1/9
4044      0026.5108.c9a1    dynamic 0         Eth2/1/9
1         0100.5e7f.fffa    igmp    0         Po2 veth1207
1         0100.5e7f.fffd    igmp    0         Po2 veth1277
200       0100.5e7f.fffa    igmp    0         veth1199 veth1200
Total MAC Addresses: 47
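
When the table is long, it can help to summarise a saved capture offline. The following is a minimal Python sketch, assuming the output of `show mac-address-table` has been saved as text with the columns shown on the slide; `parse_mac_table` and the dictionary keys are hypothetical names chosen for this example.

```python
import re
from collections import Counter

# A few rows transcribed from the slide.
SAMPLE = """\
VLAN      MAC Address       Type    Age       Port
---------+-----------------+-------+---------+------------------------------
1         0025.b500.0004    static  0         veth1235
4044      0024.971f.6a45    dynamic 0         Eth1/1/9
1         0100.5e7f.fffa    igmp    0         Po2 veth1207
Total MAC Addresses: 47
"""

# VLAN, dotted MAC, entry type, age, then one or more ports.
ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(\S+)\s+(\d+)\s+(.+?)\s*$"
)


def parse_mac_table(text):
    """Return one dict per table row; header, ruler and total lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            vlan, mac, entry_type, _age, ports = match.groups()
            entries.append(
                {"vlan": int(vlan), "mac": mac, "type": entry_type, "ports": ports.split()}
            )
    return entries


entries = parse_mac_table(SAMPLE)
print(Counter(entry["type"] for entry in entries))
```

Counting entries by type makes it easy to confirm that server MAC addresses appear as static entries on veth ports, as described on the MAC learning slide.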

Verifying End Host Mode Status and Configuration
running-config

UCS-HA-B(nxos)# show running-config interface ethernet 1/9


interface Ethernet1/9
switchport mode trunk
switchport trunk allowed vlan 1
pinning border
no shutdown

UCS-HA-B(nxos)# show running-config interface veth681


interface vethernet681
switchport trunk allowed vlan 1
bind interface Ethernet1/1/5
no pinning server sticky
pinning server pinning-failure link-down

Verifying End Host Mode Status and Configuration
Server port pinning information

FarNorth-A(nxos)# show pinning server-interfaces

---------------+-----------------+------------------------+-----------------
SIF Interface   Sticky            Pinned Border Interface  Pinned Duration
---------------+-----------------+------------------------+-----------------
Eth1/1          Yes               -                        -
Eth1/2          Yes               -                        -
Eth1/5          Yes               -                        -
Eth1/6          Yes               -                        -
veth1199        No                Po2                      2d 53:9:57
veth1200        No                Po2                      2d 53:9:59
veth1207        No                Po2                      2d 53:10:18
veth1235        No                Po2                      2d 53:10:22
veth1241        No                Po2                      2d 53:9:38
veth1243        No                Po2                      2d 53:9:38
veth1277        No                Po2                      2d 53:9:50
veth9395        Yes               -                        -
veth9396        Yes               -                        -
.
. <cut.>
.
Total Interfaces : 37
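
To see at a glance which server interfaces share an uplink, a saved capture of this table can be grouped by border interface. This is a minimal Python sketch under the assumption that the output has the four columns shown above; `group_by_border` is a hypothetical helper name, and rows whose border column is `-` are simply grouped under that key without judging whether that is a fault.

```python
import re
from collections import defaultdict

# A few rows transcribed from the slide.
SAMPLE = """\
SIF Interface   Sticky            Pinned Border Interface  Pinned Duration
Eth1/1          Yes               -                        -
veth1199        No                Po2                      2d 53:9:57
veth1200        No                Po2                      2d 53:9:59
veth9395        Yes               -                        -
Total Interfaces : 37
"""

# Server interface, sticky flag, pinned border interface, pinned duration.
ROW = re.compile(r"^\s*((?:Eth|veth)\S+)\s+(Yes|No)\s+(\S+)\s+(.+?)\s*$")


def group_by_border(text):
    """Map each pinned border interface to the server interfaces pinned to it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            sif, _sticky, border, _duration = match.groups()
            groups[border].append(sif)
    return dict(groups)


print(group_by_border(SAMPLE))
```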

Verifying End Host Mode Status and Configuration

 Border port information

FarNorth-A(nxos)# show pinning border-interfaces

--------------------+---------+----------------------------------------------------------
Border Interface     Status    SIFs
--------------------+---------+----------------------------------------------------------
Po2                  Active    veth1199 veth1200 veth1207 veth1235
                               veth1241 veth1243 veth1277
Eth1/19              Down
Eth1/20              Down

Total Interfaces : 3

SAN – NPV Troubleshooting

NPV: Supported Hardware & Management
 NPV Mode Supported on:
All MDS blade switches, 9124s and 9134s
MDS NXOS 3.x latest
Nexus 5010 and 5020 switches
UCS 6100 Fabric Interconnects

 NPV-Core (NPIV Mode)


MDS 95xx, 9216i, 9216a, 9222i, Nexus 5010, 5020, and 3rd party switch with NPIV
support (Support Matrix)

 Management
NPV device has its own IP address and management port, for management and debugging
All relevant configuration is supported via SNMP and CLI
FM support for discovering and configuring NPV switches (e.g. NP port mode)
No change in image upgrade and installation procedure

N-Port Virtualization (NPV) mode

 UCS FI Configured in NPV mode


Server-facing ports are regular F ports
Uplinks toward SAN core fabric are NP ports
 UCS distributes (relays) FCIDs to attached devices
No local domain ID to maintain
 One VSAN per uplink on UCS Fabrics
No trunking or channelling of NP ports
 Zoning, FSPF, DPVM, etc. are not configured on the UCS Fabrics
 Domain manager, FSPF, zone server, fabric login server and name server
do not run on the UCS Fabrics
 No local switching
All traffic routed via the core SAN switches

N-Port Virtualization Operations
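[Figure: NPIV-NPV login flow. A vHBA (N port) connects over FCoE to an F port on the NPV-enabled 6100 FI; the FI's NP port connects over FC to an F port on the NPV-Core MDS 9000 switch with NPIV configured. Exchanges labeled in the figure: FLOGI, FDISC, PLOGI and PRLI, with accepts (ACC) returned.]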

6100 FI and FC Operations (NPV Mode)
 Remember no FC services running in NPV Mode
 FCIDs assigned from Core NPIV switch
 NP port to core switch must be up and assigned to proper VSANs

FarNorth-B(nxos)# show npv flogi-table
-------------------------------------------------------------------------------------
SERVER                                                                       EXTERNAL
INTERFACE  VSAN  FCID      PORT NAME                NODE NAME                INTERFACE
-------------------------------------------------------------------------------------
vfc1205    100   0x240007  20:00:00:25:b5:00:00:0a  20:00:00:25:b5:00:00:06  fc2/1
vfc1206    100   0x240006  20:00:00:25:b5:00:00:09  20:00:00:25:b5:00:00:06  fc2/1
vfc1210    100   0x240008  20:00:10:25:b5:00:00:09  20:00:00:10:b5:00:00:09  fc2/2
vfc1238    100   0x240002  20:00:00:25:b5:00:00:10  20:00:00:25:b5:00:00:0f  fc2/1
vfc1240    100   0x240003  20:00:00:25:b5:00:00:04  20:00:00:25:b5:00:00:0f  fc2/2
Total number of flogi = 5.

FarNorth-B(nxos)# show npv status

npiv is enabled
disruptive load balancing is disabled

External Interfaces:
====================
  Interface: fc2/1, VSAN: 100, FCID: 0x240000, State: Up
  Interface: fc2/2, VSAN: 100, FCID: 0x240001, State: Up

  Number of External Interfaces: 2

Server Interfaces:
==================
  Interface: vfc1205, VSAN: 100, State: Up
  Interface: vfc1206, VSAN: 100, State: Up
  Interface: vfc1210, VSAN: 100, State: Up
  Interface: vfc1238, VSAN: 100, State: Up
  Interface: vfc1240, VSAN: 100, State: Up
  Interface: vfc1270, VSAN: 100, State: Up
  Interface: vfc1272, VSAN: 100, State: Up
  Interface: vfc1280, VSAN: 100, State: Up
  Interface: vfc1284, VSAN: 100, State: Up

  Number of Server Interfaces: 9

FarNorth-B(nxos)# show int brief
-------------------------------------------------------------------------------
Interface  Vsan   Admin  Admin   Status      SFP    Oper  Oper    Port
                  Mode   Trunk                      Mode  Speed   Channel
                         Mode                             (Gbps)
-------------------------------------------------------------------------------
fc2/1      100    NP     off     up          swl    NP    2       --
fc2/2      100    NP     off     up          swl    NP    2       --
fc2/3      1      NP     off     sfpAbsent   --     --    --
fc2/4      1      NP     off     sfpAbsent   --     --    --
fc2/5      1      NP     off     sfpAbsent   --     --    --
fc2/6      1      NP     off     sfpAbsent   --     --    --
fc2/7      1      NP     off     sfpAbsent   --     --    --
fc2/8      1      NP     off     sfpAbsent   --     --    --
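
The requirement that NP uplinks be up and in the proper VSAN can be checked against a saved `show npv status` capture. The following is a minimal Python sketch, assuming the text layout shown on the slide; `parse_npv_status`, `uplink_problems` and the `expected_vsan` parameter are hypothetical names introduced for this example.

```python
import re

# Abbreviated capture transcribed from the slide.
SAMPLE = """\
npiv is enabled
disruptive load balancing is disabled

External Interfaces:
====================
  Interface: fc2/1, VSAN: 100, FCID: 0x240000, State: Up
  Interface: fc2/2, VSAN: 100, FCID: 0x240001, State: Up

Server Interfaces:
==================
  Interface: vfc1205, VSAN: 100, State: Up
  Interface: vfc1206, VSAN: 100, State: Up
"""

# The FCID field only appears on external (NP) interfaces, so it is optional.
LINE = re.compile(r"Interface: (\S+), VSAN: (\d+)(?:, FCID: \S+)?, State: (\S+)")


def parse_npv_status(text):
    """Split interface lines into 'external' and 'server' lists of (name, vsan, state)."""
    section = None
    result = {"external": [], "server": []}
    for line in text.splitlines():
        if line.startswith("External Interfaces"):
            section = "external"
        elif line.startswith("Server Interfaces"):
            section = "server"
        else:
            match = LINE.search(line)
            if match and section:
                name, vsan, state = match.groups()
                result[section].append((name, int(vsan), state))
    return result


def uplink_problems(text, expected_vsan):
    """Return external interfaces that are not Up or not in the expected VSAN."""
    return [
        (name, vsan, state)
        for name, vsan, state in parse_npv_status(text)["external"]
        if state != "Up" or vsan != expected_vsan
    ]


print(uplink_problems(SAMPLE, expected_vsan=100))
```

An empty result means every NP uplink in the capture is up and sits in the VSAN that was passed in.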

NPV Related Show Commands on NPV Switch
 The following show commands can be used on the NPV switch to
display info on the NPV devices

FarNorth-B(nxos)# sh npv
flogi-table Show information about FLOGI sessions
internal Show internal NPV information
status Show NPV status
traffic-map Show information about Traffic Map
traffic-usage Show information about Traffic Usage

FarNorth-B(nxos)# show npv internal


errors Show error logs of NPV
event-history Show various event logs of NPV
events Show important events of NPV
info Show internal data structure information
mem-stats Show memory allocation statistics of NPV
msgs Show various message logs of NPV
pending-queue Show pending queue information

Available Debugs (Edge 6100 FI and Core NPIV)

FarNorth-B(nxos)# debug npv ?


all Configure all debug flags of NPV
demux Configure debugging of NPV message demux
dequeue Configure debugging of NPV message dequeue
distrib Configure distribution debug flags of NPV
errors Configure debugging of NPV errors
events Configure debugging of NPV events
ext-if-fsm Configure debugging of ext-if-fsm
flogi-fsm Configure debugging of flogi-fsm
fsm Configure debugging of NPV FSM transitions
ha Configure debugging of NPV High Availability
svr-if-fsm Configure debugging of svr-if-fsm
trace Configure debugging of NPV trace
warning Configure debugging of NPV warnings

FarNorth-B(nxos)# debug npv flogi-fsm ?


errors Configure debugging of flogi-fsm errors
events Configure debugging of flogi-fsm events
trace Configure debugging of flogi-fsm trace

Tracing a server FC connection

 Determine the server's pWWN


Assigned through the service profile
Verify on the host – it will match:

 Check local FLOGI for that pWWN on UCS:
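
Looking up one pWWN in a long FLOGI table is easy to script. This is a minimal Python sketch, assuming the output of `show npv flogi-table` has been saved as text with the six columns shown earlier in this deck; `find_flogi` and the dictionary keys are hypothetical names, and the sample row is the vfc1240 entry from that slide.

```python
import re

WWN = r"(?:[0-9a-f]{2}:){7}[0-9a-f]{2}"
# Server interface, VSAN, FCID, port name (pWWN), node name, external interface.
ROW = re.compile(
    rf"^\s*(vfc\d+)\s+(\d+)\s+(0x[0-9a-f]+)\s+({WWN})\s+({WWN})\s+(\S+)",
    re.IGNORECASE,
)

SAMPLE = """\
vfc1240 100 0x240003 20:00:00:25:b5:00:00:04 20:00:00:25:b5:00:00:0f fc2/2
Total number of flogi = 5.
"""


def find_flogi(text, pwwn):
    """Return the FLOGI row whose port name equals `pwwn`, or None if absent."""
    wanted = pwwn.lower()
    keys = ("server_interface", "vsan", "fcid", "port_name", "node_name", "external_interface")
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(4).lower() == wanted:
            return dict(zip(keys, match.groups()))
    return None


print(find_flogi(SAMPLE, "20:00:00:25:B5:00:00:04"))
```

A `None` result means the server has not logged in on this fabric, which points back to the vFC interface, the service profile or the uplink.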

About WWN pools

 Cisco MDS switches will not accept a FLOGI from just any WWN
This can be difficult to diagnose. If your vFC (server-side) interface
does not come up, check for malformed WWNs on the upstream
MDS using "show flogi internal event-history errors"

Event:E_DEBUG, length:146, at 154805 usecs after Fri Sep 4 17:55:13 2009
[102] Err(NAA=5 and IEEE Company ID is zero)invalid node name 50:00:00:00:00:00:00:07 from interface fc1/9; nport name is 20:00:00:00:00:00:04:02.

 Try to use IEEE Type 2 WWNs
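
WWN pools can be screened before they are used in a service profile. The following is a minimal Python sketch that extracts the NAA nibble and the IEEE company ID (OUI) from a WWN, using the standard Fibre Channel name layouts for NAA 1, 2 and 5. The function names are hypothetical, and the check only mirrors the condition named in the error message above (a zero company ID); the full set of validation rules applied by the MDS is not described in this deck.

```python
def parse_wwn(wwn):
    """Return (naa, oui) for a colon-separated 8-byte WWN; oui is None for other NAA types."""
    octets = bytes(int(part, 16) for part in wwn.split(":"))
    if len(octets) != 8:
        raise ValueError(f"expected 8 octets, got {len(octets)}: {wwn!r}")
    value = int.from_bytes(octets, "big")
    naa = value >> 60  # top 4 bits select the name format
    if naa in (1, 2):
        # NAA 1/2: 4-bit NAA, 12 bits, 24-bit OUI, 24-bit vendor-assigned value.
        oui = (value >> 24) & 0xFFFFFF
    elif naa == 5:
        # NAA 5: 4-bit NAA, 24-bit OUI, 36-bit vendor-assigned value.
        oui = (value >> 36) & 0xFFFFFF
    else:
        oui = None
    return naa, oui


def has_zero_company_id(wwn):
    """True when the WWN uses NAA 1, 2 or 5 and its IEEE company ID is all zeros."""
    _naa, oui = parse_wwn(wwn)
    return oui == 0


for name in ("50:00:00:00:00:00:00:07", "20:00:00:25:b5:00:00:0a"):
    naa, oui = parse_wwn(name)
    print(name, "NAA", naa, "zero company ID:", has_zero_company_id(name))
```

The first WWN is the node name rejected in the log above; the second is a port name from the FLOGI table earlier in the deck, whose company ID is 00:25:b5.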

What to collect in a pinch?
 If you are rushed, collect the output of the following
commands on both the NPV and core switches. Collecting
some of the debugs discussed in the prior section would
also be beneficial.

 CORE SWITCH (MDS)


show tech-support details

 NPV SWITCH (6100 FI)


show tech-support details
show tech-support npv

Misc Troubleshooting
Appendix

Backup Considerations

Potential to Overwrite Backup Files:


If you rerun a backup operation without changing the filename, UCSM
overwrites the existing file on the server.

Scheduled Backups:
You cannot schedule a backup operation. However, you can create a
backup operation and re-trigger it by setting the admin state to enabled.

Incremental Backups:
Incremental backups are not supported.

Authorization:
You must have a user account that includes the admin role to create and
run backup operations
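
Because a rerun with the same filename overwrites the previous backup, one simple habit is to put a timestamp in the filename each time a backup operation is triggered. This is a minimal Python sketch of generating such a name; the function name, the `ucsm-backup` prefix and the `xml` extension are assumptions for illustration, and setting the filename on the backup operation itself is still done in UCSM.

```python
from datetime import datetime, timezone


def backup_filename(prefix="ucsm-backup", ext="xml", now=None):
    """Build a filename that is unique per second, e.g. ucsm-backup-20100102T030405Z.xml."""
    # Use the supplied time if given, otherwise the current UTC time.
    now = now or datetime.now(timezone.utc)
    return f"{prefix}-{now:%Y%m%dT%H%M%SZ}.{ext}"


print(backup_filename())
```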

Software Update

 Fabric Extender software can be updated two ways


 Auto-update
The Fabric Extender is discovered by the switch using the L2 Satellite
Discovery Protocol (SDP) on the uplink port
The NX5K switch checks software compatibility and starts an update if needed
After about 8 minutes the Fabric Extender is rebooted and then discovered
normally

 UCS Manager
Once the Fabric Extender is discovered the alternate software image
can be updated and activated under the Firmware tab
The running image is not affected by this operation so a failure to
activate the alternate image will not leave the Fabric Extender in a
non-operational state
 Note: The bootloader is not part of software update but can be
updated if required using the debug plug-in

Appendix A

Port Stats

Circuit Information
 Service Profile Circuit Paths:

Troubleshooting QoS : Fabric Interconnect Queue specific counters

Troubleshooting QoS (cont.): IOM Queue Specific Counters

Troubleshooting QoS (cont.): IOM Flow

