LNX Storage 1a

The document discusses checking the status of remote fibre channel ports using the /sys filesystem. Key information about remote ports like the port_name, port_state, roles, and scsi_target_id can be viewed. The systool command provides a unified way to examine fibre channel driver and hardware information stored throughout the /sys filesystem. Remote ports with an unknown role indicate the port is no longer present, but the data is kept to preserve original scsi target id bindings if the port returns.

Uploaded by

Yulin Liu

"rport" is short for remote port.

From the host's point of view, the remote ports are the ports at the other end of the fibre
connection. For example, if a system is connected to a storage array with two controllers, the
controller ports are its remote ports, and their information and state can be checked. There are
a few ways of checking this:
1. The information related to remote ports is at the location:
/sys/class/fc_remote_ports/rport-*
where attributes such as node_name, port_name, port_state and scsi_target_id can be read
with the cat command. For example, the following command checks whether the port is online:
$ cat /sys/class/fc_remote_ports/rport-1:0-0/port_state
which shows "Online" if the port is online and working normally. A simple grep command can
be used to see all port states at once (-v "zz" matches every line that does not contain "zz",
and -H prefixes each line with its file name):
$ grep -Hv "zz" /sys/class/fc_remote_ports/rport*/port_state

2. Another very good utility is the systool command from the sysfsutils package (see: man
systool). With this command the information from the remote ports can be gathered in one pass by
running the command with the following options:
$ systool -c fc_remote_ports -v
where -c indicates class, and -v is for verbose.
Not all "rports" are storage ports. Examine the "roles" parameter under rport to determine if the
remote port is a storage target.
$ grep -Hv "zz" /sys/class/fc_remote_ports/rport-*/roles
/sys/class/fc_remote_ports/rport-6:0-0/roles:FCP Target << storage target
/sys/class/fc_remote_ports/rport-6:0-3/roles:FCP Target << storage target
/sys/class/fc_remote_ports/rport-6:0-4/roles:FCP Initiator << HBA
/sys/class/fc_remote_ports/rport-8:0-8/roles:Fabric Port << fabric port
/sys/class/fc_remote_ports/rport-8:0-9/roles:Directory Server << fabric directory server

Remote ports associated with storage ports will also have a non-negative value (not -1) assigned to
their scsi_target_id.
$ grep -Hv "zz" /sys/class/fc_remote_ports/rport-*/scsi_target_id
/sys/class/fc_remote_ports/rport-6:0-0/scsi_target_id:0
/sys/class/fc_remote_ports/rport-6:0-3/scsi_target_id:1
/sys/class/fc_remote_ports/rport-6:0-4/scsi_target_id:-1
/sys/class/fc_remote_ports/rport-8:0-8/scsi_target_id:-1
/sys/class/fc_remote_ports/rport-8:0-9/scsi_target_id:-1
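
The two checks above (roles and scsi_target_id) can be combined in one pass. The following is a minimal sketch, not a supported tool: the function name list_storage_rports and the "|" separated output format are assumptions made for illustration, and the optional argument only exists so that the function can be pointed at a copy of the sysfs tree.

```shell
# list_storage_rports [BASE]
# Print "name|port_state|scsi_target_id" for every remote port whose roles
# include "FCP Target". BASE defaults to the live sysfs class directory.
# (Function name and output format are illustrative assumptions.)
list_storage_rports() {
    base=${1:-/sys/class/fc_remote_ports}
    for dir in "$base"/rport-*; do
        [ -r "$dir/roles" ] || continue        # also skips an unmatched glob
        roles=$(cat "$dir/roles")
        case $roles in
            *"FCP Target"*) ;;                 # storage target: keep it
            *) continue ;;                     # initiator, fabric port, unknown
        esac
        state=$(cat "$dir/port_state" 2>/dev/null)
        tid=$(cat "$dir/scsi_target_id" 2>/dev/null)
        printf '%s|%s|%s\n' "${dir##*/}" "$state" "$tid"
    done
}
```

Running list_storage_rports with no argument reads /sys/class/fc_remote_ports. A "|" separator is used because port_state values such as "Not Present" contain a space.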

What does an 'unknown' role mean?

$ cat /sys/class/fc_remote_ports/rport-0:0-17/roles
unknown
How can we clean up remote ports with an unknown role?
Is there a way to reset the driver or HBA to clean this up (without rebooting)?
The 'unknown' role means the device is no longer connected/available to this host at this time.
There is no detrimental effect of leaving these fc remote ports within the running configuration.
A reboot is the supported method of removing fc remote ports from sysfs that are no longer present on
the SAN or zoned to this host.

FCP-SCSI targets, once discovered, are retained within the system until the next reboot. If access
to an FCP-SCSI target is lost, you will probably see a message such as:
"blocked FC remote port time out: removing target and saving binding"
The message indicates that the SCSI target is being removed from the active configuration
but that its binding is being saved. "Saving binding" means that if this storage target
port returns to the configuration (can be seen by this host again), it will be assigned its
original scsi target id. So if, for example, a storage target attached to scsi host0 that was assigned
scsi target id 5 left and returned, then the SCSI addresses of the LUNs behind that storage target would be
0:0:5:* (h:c:t:l) both before and after that event. This prevents devices from moving around and
being renamed if a storage target goes missing for a period of time.

In order to maintain the above logic, that is returning original bindings to returning devices, two
pieces of information need to be retained:
a) the wwpn of the storage port, and
b) the original assigned scsi target id.
This is why the leftover data from ports that are no longer available to this host is kept. Ports that
have left the active configuration and are no longer accessible will have a role of "unknown" and
a port_state of "Not Present".
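
The retained bindings described above can be listed with a short loop. This is a sketch under stated assumptions: the function name list_saved_bindings is hypothetical, and the h:c:t prefix is derived on the assumption that the rport directory name has the form rport-H:C-N, where H is the SCSI host number and C the channel.

```shell
# list_saved_bindings [BASE]
# For every remote port whose role is "unknown", print its name, the saved
# WWPN (port_name) and the SCSI address prefix h:c:t:* that is preserved.
# (Function name is illustrative; rport-H:C-N naming is an assumption.)
list_saved_bindings() {
    base=${1:-/sys/class/fc_remote_ports}
    for dir in "$base"/rport-*; do
        [ -r "$dir/roles" ] || continue
        [ "$(cat "$dir/roles")" = "unknown" ] || continue
        name=${dir##*/}        # e.g. rport-0:0-17
        hc=${name#rport-}      # 0:0-17
        hc=${hc%-*}            # 0:0  (host:channel)
        wwpn=$(cat "$dir/port_name" 2>/dev/null)
        tid=$(cat "$dir/scsi_target_id" 2>/dev/null)
        printf '%s %s %s:%s:*\n' "$name" "$wwpn" "$hc" "$tid"
    done
}
```

Each output line pairs the two pieces of retained information (a and b above) with the address range that the returning port would reclaim.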

Examine the number of "unknown" roles in fc_remote_ports:

$ ls -1c ./sys/class/fc_remote_ports/rport*/roles 2> /dev/null | xargs -I {} grep -H -v \"ZzZz\" {} |
grep unknown
./sys/class/fc_remote_ports/rport-0:0-17/roles:unknown
./sys/class/fc_remote_ports/rport-0:0-2/roles:unknown
./sys/class/fc_remote_ports/rport-0:0-3/roles:unknown
./sys/class/fc_remote_ports/rport-1:0-5/roles:unknown
Then examine the full contents of each such rport* directory. Remote ports with an "unknown"
role should list port_state as "Not Present" along with the saved wwpn (port_name) and
assigned scsi target id.
$ cd ./sys/class/fc_remote_ports/rport-1:0-7; ls -1c 2> /dev/null |xargs -I {} grep -H -v \"ZzZz\" {}
dev_loss_tmo:30
fast_io_fail_tmo:off
maxframe_size:4294967295 bytes
node_name:0xffffffffffffffff
port_id:0xffffffff
port_name:0x5006016b3b241784
port_state:Not Present
roles:unknown
scsi_target_id:5
In the above case the preserved SCSI addresses will be 1:0:5:* (rport-1:0-7 belongs to host 1,
channel 0, and the saved scsi target id is 5). You can also use systool (from the sysfsutils
package) to get the same information as above:
$ systool -c fc_transport -v
:
The port_name is a WWN (world wide name) in NAA format. This can be decoded, if desired, to
see what storage type this wwpn belonged to.
0x5006016b3b241784:
NAA   OUI        Vendor Specific   Vendor
5     00-60-16   B3B241784         CLARIION

So this fc remote port, which is no longer present within the storage configuration, was associated with
Clariion storage.
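
The field split shown in the table above can be done mechanically. The sketch below handles NAA type 5 names only (1 hex digit NAA, 6 hex digits OUI, 9 hex digits vendor specific); the function name decode_naa5 and the output format are assumptions, and mapping the OUI to a vendor name still requires a lookup in the IEEE OUI registry.

```shell
# decode_naa5 WWN
# Split an NAA type 5 world wide name into its NAA, OUI and vendor-specific
# fields. Returns non-zero for anything that is not a 16-digit NAA 5 name.
# (Function name and output format are illustrative assumptions.)
decode_naa5() {
    wwn=$(printf '%s' "$1" | tr 'A-FX' 'a-fx')   # normalise case
    wwn=${wwn#0x}                                # drop optional 0x prefix
    [ "${#wwn}" -eq 16 ] || return 1
    naa=$(printf '%s' "$wwn" | cut -c1)
    [ "$naa" = "5" ] || return 1
    # Hex digits 2-7 are the 24-bit OUI; print it as xx-xx-xx.
    oui=$(printf '%s' "$wwn" | cut -c2-7 | sed 's/../&-/g; s/-$//')
    # Hex digits 8-16 are the 36-bit vendor-specific identifier.
    vsid=$(printf '%s' "$wwn" | cut -c8-16)
    printf 'NAA=%s OUI=%s VSID=%s\n' "$naa" "$oui" "$vsid"
}
```

For the port_name above, decode_naa5 0x5006016b3b241784 prints NAA=5 OUI=00-60-16 VSID=b3b241784, matching the table.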

The 2.6.11 Linux kernel introduced changes to the lpfc (Emulex driver) and qla2xxx (QLogic
driver) Fibre Channel HBA drivers which removed the following entries from the proc pseudo-filesystem:
/proc/scsi/qla2xxx, /proc/scsi/lpfc. These entries had provided a centralized repository
of information about the drivers and connected hardware. After the changes, the drivers started
storing all this information within the /sys filesystem. Since Red Hat Enterprise Linux 5 uses version
2.6.18 of the Linux kernel, it is affected by this change.

Using the /sys filesystem has the advantage that all the Fibre Channel drivers now use a unified
and consistent manner to report data. However it also means that the data previously available in a
single file is now scattered across a myriad of files in different parts of the /sys filesystem.
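
One way to cope with attributes scattered over many small files is to print a whole sysfs directory as name:value pairs. The following is a minimal sketch; the function name dump_attrs is an assumption, and it simply skips entries that are not regular files or cannot be read.

```shell
# dump_attrs DIR
# Print every readable regular file in DIR as "name:value", one per line.
# (Function name is an illustrative assumption.)
dump_attrs() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        # Some sysfs attributes are write-only or fail on read; skip those.
        value=$(cat "$f" 2>/dev/null) || continue
        printf '%s:%s\n' "${f##*/}" "$value"
    done
}
```

For example, dump_attrs /sys/class/fc_remote_ports/rport-1:0-0 would list all attributes of that remote port in one view.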

One basic example is the status of a Fibre Channel HBA: checking this can now be accomplished
with the following command:
# cat /sys/class/scsi_host/host#/state
...where host# is the H-value in the HBTL SCSI addressing format, which references the
appropriate FC HBA. For Emulex adapters (lpfc driver), for example, this command would yield:
# cat /sys/class/scsi_host/host1/state
Link Up - Ready:
   Fabric
For qlogic devices (qla2xxx driver) the output would instead be as follows:
# cat /sys/class/scsi_host/host1/state
Link Up - F_Port

Obviously it becomes quite impractical to search through the /sys filesystem for the relevant files
when there is a large variety of Fibre Channel-related information of interest. Instead of manual
searching, the systool (1) command provides a simple but powerful means of examining and
analyzing this information. Detailed below are several commands which demonstrate samples of
information which the systool command can be used to examine.
To examine some simple information about the Fibre Channel HBAs in a machine:
# systool -c fc_host -v
To look at verbose information regarding the SCSI adapters present on a system:
# systool -c scsi_host -v
To see what Fibre Channel devices are connected to the Fibre Channel HBA cards:
# systool -c fc_remote_ports -v -d

For Fibre Channel transport information:

# systool -c fc_transport -v
For information on SCSI disks connected to a system:
# systool -c scsi_disk -v

To examine more disk information including which hosts are connected to which disks:
# systool -b scsi -v
Furthermore, by installing the sg3_utils package it is possible to use the sg_map command to view
more information about the SCSI map. After installing the package, run:
# modprobe sg
# sg_map -x

Finally, to obtain driver information, including version numbers and active parameters, the
following commands can be used for the lpfc and qla2xxx drivers respectively:
# systool -m lpfc -v
# systool -m qla2xxx -v
ATTENTION: The syntax of the systool (1) command differs across versions of Red Hat
Enterprise Linux. Therefore the commands above are only valid for Red Hat Enterprise Linux 5.

I'm checking 2.6.16-rc5 with 2 QLogic 2312 adapters using the qla2xxx driver from 2.6.16-rc5.
As with earlier kernels (I think > 2.6.12, since scsi_transport_fc gained functionality), I have the
following problem.
2 scsi hosts are available, 4 and 5 (for QLogic).
I disconnect the cable from one of the QLogic cards. After a timeout I get the message:
rport-4:0-0: blocked FC remote port time out: removing target and saving binding
and the appropriate SCSI devices that came from adapter 4 disappear from /proc/scsi/scsi.
So far, so good. I reconnect the cable, and the directory /sys/class/fc_remote_ports/rport-4:0-1 appears
along with the old ones rport-4:0-0 and rport-5:0-0, so currently I have 3.
However, no automatic rescan happens on adapter 4.
What's worse, if I try echo "0 1 0" > /sys/class/scsi_host/host4/scan, the process gets stuck.

Most of the problem seems to be a QLogic driver problem.

HBAs are connected to target via FC switch.
1). If I have several LUNs on each HBA, with QLogic only 1 directory per adapter (for LUN 0) is
created in /sys/class/fc_remote_ports, while with Emulex a directory for every LUN is created.
2). The situation I described occurs with QLogic only if the cable connecting between HBA and
switch is pulled out/in. If I (dis)connect the cable between switch and target, disks come back.
3). With Emulex in both cases disks come back. However, with both Emulex and QLogic, stale
directories are left in /sys/class/fc_remote_ports.

For example, with Emulex, if I had in the beginning
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-0 rport-7:0-1 rport-7:0-2
then disconnected adapter 7, I got
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-0 rport-7:0-2
(7-0-0 and 7-0-2 didn't disappear while 7-0-1 did). After connecting 7 back:
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-2 rport-7:0-4 rport-7:0-5 rport-7:0-6
(7-0-0 disappeared, but 7-0-2 is still here).

I applied the patch changing 2 lines in scsi_transport_fc.c to
if (fc_host_tgtid_bind_type(shost) != FC_TGTID_BIND_NONE)
I tried first with Emulex as QLogic seemed yesterday to have more problems, (not only orphan
rports, but also not creating all rports), so let's start solving problems one by one.
I saw no change in behavior with Emulex.
1). I had 3 LUNs on adapters 6 and 7.
# ls /sys/class/fc_remote_ports
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-0 rport-7:0-1 rport-7:0-2

So specifically one target-port at rport-7:0-2 which has three luns (0, 1, and 2).
/devices/pci0000:00/0000:00:06.0/0000:05:00.2/0000:07:01.1/host7/rport-7:0-2/target7:0:0/7:0:0:0
/devices/pci0000:00/0000:00:06.0/0000:05:00.2/0000:07:01.1/host7/rport-7:0-2/target7:0:0/7:0:0:1
/devices/pci0000:00/0000:00:06.0/0000:05:00.2/0000:07:01.1/host7/rport-7:0-2/target7:0:0/7:0:0:2
So what are the other rports? Other initiators?

2). Disconnected the cable between adapter 7 and the switch. rport-7:0-0 disappeared momentarily
with the Emulex LinkDown event.
lpfc 0000:07:01.1: 1:1305 Link Down Event x2 received Data: x2 x20 x110
# ls /sys/class/fc_remote_ports
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-0 rport-7:0-2
Actually, rport-7:0-1 disappears -- maybe an initiator. Is rport-7:0-0 an FCP_TARGET with no
luns?

3). Then after a timeout I got a message about blocking rport-7:0-2, but nothing changed.
rport-7:0-2: blocked FC remote port time out: removing target and saving binding
Exactly, the scsi_target and scsi_device is reaped after TMO expires.
# ls /sys/class/fc_remote_ports
rport-6:0-0 rport-6:0-1 rport-6:0-2 rport-7:0-0 rport-7:0-2
This is still correct. However, all scsi entries related to adapter 7 are removed from /proc/scsi/scsi.

If your question is whether the /proc/scsi/scsi devices (/dev/sda, sdb, ...) are supposed to disappear when
the rport TMO expires, then yes, they are supposed to disappear. The rports persist with a port_state
of 'not present' for persistent-binding purposes until the port returns (i.e. you reinsert the
cable). If that were to occur, then upon rport addition, the 3-lun storage would be attached to rport-7:0-2
and a request signaled for lun scanning by the midlayer.

I'll just describe Emulex situation to confirm.

Let's connect only 1 Emulex port (adapter 7) to a switch and leave adapter 6 not connected. Then
we have
# ls /sys/class/fc_remote_ports/
rport-7:0-0 rport-7:0-1 rport-7:0-2
# cat /sys/class/fc_remote_ports/*/roles
Fabric Port
Directory Server
FCP Target, FCP Initiator
When the cable is disconnected from adapter 7, immediately with LinkDown event, the rport with
the role of Directory server disappears and only 2 are left:
# ls /sys/class/fc_remote_ports/
rport-7:0-0 rport-7:0-2
# cat /sys/class/fc_remote_ports/*/roles
Fabric Port
FCP Target, FCP Initiator

Then after a timeout, the role of rport-7:0-2 is changed to unknown and the relevant entries are
removed from /proc/scsi/scsi. rport-7:0-0 is still here.

rport-7:0-2: blocked FC remote port time out: removing target and saving binding
# ls /sys/class/fc_remote_ports/
rport-7:0-0 rport-7:0-2
# cat /sys/class/fc_remote_ports/*/roles
Fabric Port
unknown

After reconnecting the cable, rport-7:0-0 disappears and rport-7:0-4 and rport-7:0-5 appear along
with newly recognized LUNs in /proc/scsi/scsi.
# ls /sys/class/fc_remote_ports/
rport-7:0-2 rport-7:0-4 rport-7:0-5
# cat /sys/class/fc_remote_ports/*/roles
FCP Target, FCP Initiator
Fabric Port
Directory Server

If I'm not mistaken, in QLogic case only 1 rport per adapter appeared instead of 3. Tomorrow I'll
connect QLogic and report again.
I can confirm that very problem (pulling the cable between HBA and switch results in only LUN 0
or nothing coming back afterward) on 2.6.15.4 here too.

Please try recent 2.6.16-rcX kernels, as there have been a number of patches submitted since 2.6.15
which attempt to address most of these holes:
[PATCH] qla2xxx: Close window on race between rport removal and fcport transition.
[SCSI] qla2xxx: Drop legacy 'bypass lun scan for tape device' code.
[SCSI] qla2xxx: Correct issue where the rport's upcall was not being made after relogin.
[SCSI] qla2xxx: Correct synchronization issues during rport addition/deletion.
[SCSI] qla2xxx: Disable port-type RSCN handling via driver state-machine.

Today I tested disconnecting QLogic port.

Adapter 4 is connected via switch to a storage and 3 LUNs are seen via the adapter.
Only 1 rport is created (for FCP Target) while in Emulex case there were 3: (Fabric Port, Directory
Server and FCP Target, FCP Initiator).
# ls /sys/class/fc_remote_ports/
rport-4:0-0
# cat /sys/class/fc_remote_ports/*/roles
FCP Target

Default dev_loss_tmo is 6 (1+5) while in Emulex case the default was 35.

After disconnecting the cable between the HBA and the switch:
qla2xxx 0000:03:01.0: LOOP DOWN detected (2).
rport-4:0-0: blocked FC remote port time out: removing target and saving binding
# ls /sys/class/fc_remote_ports/
rport-4:0-0
# cat /sys/class/fc_remote_ports/*/roles
unknown

Relevant scsi devices are removed from /proc/scsi/scsi.

After reconnecting the cable

qla2xxx 0000:03:01.0: LIP reset occured (f7f7).
qla2xxx 0000:03:01.0: LOOP UP detected (2 Gbps).
# ls /sys/class/fc_remote_ports/
rport-4:0-0
# cat /sys/class/fc_remote_ports/*/roles
FCP Target
However, scsi devices don't reappear in /proc/scsi/scsi.
When I issue rescan, the command is stuck
echo - - - > /sys/class/scsi_host/host4/scan
That's correct, we currently don't make an upcall for the SNS server port nor the switch F-port.
...
Could you add the enable-debug patch I sent you earlier and retry the test? Again, forward the
relevant snippets from /var/log/messages.

Here's the patch again.

diff --git a/drivers/scsi/qla2xxx/qla_dbg.h b/drivers/scsi/qla2xxx/qla_dbg.h
index 935a59a..632f653 100644
--- a/drivers/scsi/qla2xxx/qla_dbg.h
+++ b/drivers/scsi/qla2xxx/qla_dbg.h
@@ -9,6 +9,7 @@
*/
/* #define QL_DEBUG_LEVEL_1 */ /* Output register accesses to COM1 */
/* #define QL_DEBUG_LEVEL_2 */ /* Output error msgs to COM1 */
+#define QL_DEBUG_LEVEL_2 /* Output error msgs to COM1 */
/* #define QL_DEBUG_LEVEL_3 */ /* Output function trace msgs to COM1 */
/* #define QL_DEBUG_LEVEL_4 */ /* Output NVRAM trace msgs to COM1 */
/* #define QL_DEBUG_LEVEL_5 */ /* Output ring trace msgs to COM1 */

diff --git a/drivers/scsi/qla2xxx/qla_settings.h b/drivers/scsi/qla2xxx/qla_settings.h
index 363205c..b2e22b0 100644
--- a/drivers/scsi/qla2xxx/qla_settings.h
+++ b/drivers/scsi/qla2xxx/qla_settings.h
@@ -8,7 +8,7 @@
* Compile time Options:
* 0 - Disable and 1 - Enable
*/
-#define DEBUG_QLA2100 0 /* For Debug of qla2x00 */
+#define DEBUG_QLA2100 1 /* For Debug of qla2x00 */

#define USE_ABORT_TGT 1 /* Use Abort Target mbx cmd *

Please see the log with debug-patch.

The module is loaded with option qlport_down_retry=1. Adapter 4 is connected to switch, adapter
5 doesn't have cable attached.
After reconnecting the cable the disks don't reappear and the rescan is stuck. Before applying your
patches a ghost rport was staying; now it's OK.

Before you try the patch I sent earlier, could you send me the output from the following:
# echo t > /proc/sysrq-trigger

Historically the qlogic driver rescan is a 2-phase process:

1) schedule the rescan, e.g.: echo scsi-qlascan > /proc/scsi/qla2xxx/4
2) rescan, e.g.: echo - - - > /sys/class/scsi_host/host4/scan
BUT, I've just used scsi-qlascan to discover _new_ devices... not existing devices that experienced
FC connection loss. I assume the qla driver _should_ just bring those lost devices back? But does
the historic 2-phase rescan for new devices speak to why the qlogic driver doesn't automagically
bring the old devices back? Or has the latest qlogic driver in mainline advanced past this 2-phase
requirement in

Unfortunately I don't have the directory /proc/scsi/qla2xxx. However the target sees PRLI from
the host again after reconnecting the cable between the initiator and the switch.
Does it mean the rediscovering new devices on initiator side is already done?

The two-stage discovery process has not been needed since FC transport integration. Instead, the
driver simply makes up-calls to signal rport visibility (add on PLOGI/PRLI; delete on
LOGO/cable-pull/etc).

Yes, after plugging the cable back in, the driver rediscovers ports:
Mar 3 01:07:22 multipath kernel: scsi(4): RSNN_NN exiting normally.
Mar 3 01:07:22 multipath kernel: scsi(4): GID_PT entry - nn 200000e08b079a69 pn 210000e08b079a69 portid=010700.
Mar 3 01:07:22 multipath kernel: scsi(4): GID_PT entry - nn 2000001738279c00 pn 1000001738279c11 portid=010200.
Mar 3 01:07:22 multipath kernel: scsi(4): device wrap (010200)
Initiates PLOGI/PRLI:
Mar 3 01:07:22 multipath kernel: scsi(4): Trying Fabric Login w/loop id 0x0081 for port 010200.

And upcall via fc_remote_port_add() is done.

Mar 3 01:07:22 multipath kernel: scsi(4): LOOP READY
Mar 3 01:07:22 multipath kernel: scsi(4): qla2x00_loop_resync - end

Firmware then notifies software that the port has logged out:
Mar 3 01:07:22 multipath kernel: scsi(4): Async PORT UPDATE ignored 0081/0007/7 ee5.
Mar 3 01:07:22 multipath kernel: scsi(4:0:0): status_entry: Port Down pid=43, compl status=0x29,
port state=0x4

A CDB also returns with a completion status of PORT_LOGGED_OUT. From the driver's DPC routine
(process-context), the upcall to fc_remote_port_delete() is issued.

Driver attempts a relogin:

Mar 3 01:07:22 multipath kernel: scsi(4): Port login retry:1000008279c11, id =0x0081 retry cnt=8
Mar 3 01:07:23 multipath kernel: scsi(4): fcport-0 - port retry count: 0 remaining
Mar 3 01:07:23 multipath kernel: scsi(4): qla2x00_port_login()
Mar 3 01:07:23 multipath kernel: scsi(4): Trying Fabric Login w/loop id 0x0081 for port 010200.

Relogin complete
Mar 3 01:07:23 multipath kernel: scsi(4): port login OK: logged in ID 0x81
Upcall to fc_remote_port_add() done.
Mar 3 01:07:23 multipath kernel: scsi(4): qla2x00_port_login - end
Mar 3 01:07:23 multipath kernel: scsi(4): Async PORT UPDATE ignored 0000/0006/0001.
Mar 3 01:07:23 multipath kernel: scsi(4): Async PORT UPDATE ignored 0000/0007/0001.
Mar 3 01:07:23 multipath kernel: scsi(4): Async PORT UPDATE ignored 0000/0004/0001

I also noticed that scsi_transport_fc.c::fc_user_scan() is not called with the host_lock held...
hmm.. could you try out the patch I sent earlier and provide the results.

Also, could you send the "echo t > /proc/..." output after the cable has been reinserted, but, before
the 'echo "- - -" > /sys/class' scan is initiated.

Here's the sysrq output after reconnecting the cable without a manual disk rescan, before applying
a patch.
The same lock exists:
#001: [ffff81006ee20080] {scsi_host_alloc}
.. held by: scsi_wq_4: 4255 [ffff81006f9147b0, 110]
... acquired at: scsi_scan_target+0x51/0x87 [scsi_mod]

After applying the patch the same lock exists:

#001: [ffff81006edc4080] {scsi_host_alloc}
.. held by: scsi_wq_4: 4255 [ffff81007edaf770, 110]
... acquired at: scsi_scan_target+0x51/0x87 [scsi_mod]

The kernel from scsi-rc-fixes git and your patch are working.
By the way, could you please tell me how to get only the scsi patches from the git repository? I
got the whole kernel by using cg-clone
http://kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git

Now the process looks like following:

Mar 11 23:54:22 multipath kernel: qla2xxx 0000:03:01.0: LOOP DOWN detected (2).
Mar 11 23:54:28 multipath kernel: rport-4:0-0: blocked FC remote port time out: removing target and saving binding
Mar 11 23:54:37 multipath kernel: qla2xxx 0000:03:01.0: LIP reset occured (f7f7).
Mar 11 23:54:37 multipath kernel: qla2xxx 0000:03:01.0: LOOP UP detected (2 Gbps).
Mar 11 23:54:59 multipath kernel: 4:0:0:0: timing out command, waited 22s

And the disks appear. Could you tell me, please, where this 22sec timeout came from?

Looks like your fiber fabric decided to renegotiate, and halfway it went for a coffee and donuts
break to not upset the union rules :)
I've seen LOOP negotiations take 10+ seconds before, and that is on a really simple setup, so
nothing super special.

Actually Mike R. and James S. deserve the credit for the composite patch which consists of:
1) [PATCH] FC transport : Avoid device offline cases by stalling aborts until device unblocked
http://marc.theaimsgroup.com/?l=linux-scsi&m=114225658724378&w=2
2) Serialize scan work during fc_remote_port_delete() so rport removal doesn't deadlock midlayer
scans. The problem you were seeing. (Mike R.)
3) rport race fixes during removal (James S.).
...
Essentially there are currently several issues with rport consumers making delete() calls during
mid-layer scanning.

At a minimum to get Mike R's fixes into 2.6.16, and address the additional races going forward...

Here's a minimal version of the serialize scan-work patch. Could you check whether it addresses your
issue? Start with any recent linux-2.6.git tree.
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 929032e..3d09920 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -1649,6 +1649,8 @@ fc_remote_port_delete(struct fc_rport *
+ /* flush any scan work */ /* which can sleep */
+ scsi_flush_work(rport_to_shost(rport));
scsi_target_block(&rport->dev);
/* cap the length the devices can be blocked until they are deleted */

There are several commands to determine the WWN of a Fibre Channel (FC) HBA and its status
(online/offline). This post discusses a few of the most commonly used methods.
Method 1
# lspci -nn | grep -i hba
07:00.0 Fibre Channel [0c04]: QLogic Corp. ISP2532 8Gb FC PCI-E HBA [1077:2532] (rev 02)
07:00.1 Fibre Channel [0c04]: QLogic Corp. ISP2532 8Gb FC PCI-E HBA [1077:2532] (rev 02)
To check the available HBA ports :
# ls -l /sys/class/fc_host
total 0
drwxr-xr-x 3 root root 0 Feb 3 2015 host2
drwxr-xr-x 3 root root 0 Feb 3 2015 host3

To find the state of HBA ports (online/offline) :

# more /sys/class/fc_host/host?/port_state
/sys/class/fc_host/host2/port_state
Online
/sys/class/fc_host/host3/port_state
Online

To find the WWN numbers of the above ports :

# more /sys/class/fc_host/host?/port_name
/sys/class/fc_host/host2/port_name
0x500143802426baf4
/sys/class/fc_host/host3/port_name
0x500143802426baf6
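
The two lookups in Method 1 can be combined so that each HBA port is reported on one line. This is a minimal sketch: the function name fc_host_summary is an assumption, and the optional argument only exists so the function can be run against a copy of the sysfs tree.

```shell
# fc_host_summary [BASE]
# Print "host port_name port_state" for every FC HBA port under BASE,
# which defaults to /sys/class/fc_host.
# (Function name is an illustrative assumption.)
fc_host_summary() {
    base=${1:-/sys/class/fc_host}
    for dir in "$base"/host*; do
        [ -r "$dir/port_name" ] || continue    # also skips an unmatched glob
        wwpn=$(cat "$dir/port_name")
        state=$(cat "$dir/port_state" 2>/dev/null)
        printf '%s %s %s\n' "${dir##*/}" "$wwpn" "$state"
    done
}
```

With the two ports shown above, the output would be one line for host2 and one for host3, each giving the WWN followed by Online.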

Method 2 : Using systool

Another useful command for finding information about HBAs is systool. If it is not already installed,
you may need to install the sysfsutils package.
# yum install sysfsutils

To check the available HBA ports :

# systool -c fc_host
Class = "fc_host"
Class Device = "host2"
Device = "host2"
Class Device = "host3"
Device = "host3"

To find the WWNs for the HBA ports :

# systool -c fc_host -v | grep port_name
port_name = "0x500143802426baf4"
port_name = "0x500143802426baf6"
To check the state of the HBA ports (online/offline) :
# systool -c fc_host -v | grep port_state
port_state = "Online"
port_state = "Online"

9 pages
Ora Net 0c
No ratings yet
Ora Net 0c
13 pages
NBU TS Net 1
No ratings yet
NBU TS Net 1
7 pages
Ora Upd Seg 1
No ratings yet
Ora Upd Seg 1
3 pages
Ora Net 1a
No ratings yet
Ora Net 1a
12 pages
Ora Net 0a
No ratings yet
Ora Net 0a
13 pages
Testing of Network Using Sophos Firewall With Layer Three Switch Through Dos Attacks
No ratings yet
Testing of Network Using Sophos Firewall With Layer Three Switch Through Dos Attacks
5 pages
Kill Switch: Command Description
No ratings yet
Kill Switch: Command Description
2 pages
Huawei B315s-936 Unlock Instructions (LATEST)
No ratings yet
Huawei B315s-936 Unlock Instructions (LATEST)
11 pages
AIX SAN Boot
No ratings yet
AIX SAN Boot
4 pages
CCR2004 1G 12S+2XS
No ratings yet
CCR2004 1G 12S+2XS
3 pages
Guitar Pro 6 On Ubuntu 64bit
No ratings yet
Guitar Pro 6 On Ubuntu 64bit
3 pages
MPMC by Godse
No ratings yet
MPMC by Godse
5 pages
Change DNS Settings in Windows XP
No ratings yet
Change DNS Settings in Windows XP
3 pages
Introduction To Programming STM32 ARM Cortex-M 32-Bit Microcontrollers
100% (2)
Introduction To Programming STM32 ARM Cortex-M 32-Bit Microcontrollers
13 pages
Ora-13013 1
No ratings yet
Ora-13013 1
6 pages
CISCO PACKET TRACER LABS: Best practice of configuring or troubleshooting Network
From Everand
CISCO PACKET TRACER LABS: Best practice of configuring or troubleshooting Network
Mulayam Singh
No ratings yet
JHS-770 Software Upgrade Procedure
100% (2)
JHS-770 Software Upgrade Procedure
19 pages
TCP/IP Foundation For Engineers: Network +
No ratings yet
TCP/IP Foundation For Engineers: Network +
2 pages
A Practical Guide Wireshark Forensics
From Everand
A Practical Guide Wireshark Forensics
alasdair gilchrist
5/5 (4)
Configuration of a Simple Samba File Server, Quota and Schedule Backup
From Everand
Configuration of a Simple Samba File Server, Quota and Schedule Backup
Dr. Hidaia Mahmood Alassouli
No ratings yet
WAN TECHNOLOGY FRAME-RELAY: An Expert's Handbook of Navigating Frame Relay Networks
From Everand
WAN TECHNOLOGY FRAME-RELAY: An Expert's Handbook of Navigating Frame Relay Networks
Mamta Devi
No ratings yet
What does an 'unknown' role mean?

Examine the number of "unknown" roles in fc_remote_ports:
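A minimal sketch of such a check, assuming the usual sysfs layout in which every rport directory carries a `roles` file. The function name `count_unknown_roles` and its optional base-directory argument are illustrative and do not come from the original text.

```shell
# Count remote ports whose "roles" attribute reads "unknown".
# The base directory is an optional argument (hypothetical helper) so the
# function can be pointed at a copy of the tree; it defaults to sysfs.
count_unknown_roles() {
    base="${1:-/sys/class/fc_remote_ports}"
    count=0
    for f in "$base"/rport-*/roles; do
        # Skip the unexpanded glob when no rport entries exist.
        [ -r "$f" ] || continue
        if grep -qi 'unknown' "$f"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Calling `count_unknown_roles` with no argument reads the live /sys tree and prints a single number.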

For Fibre Channel transport information:

Most of the problem seems to be a QLogic driver problem.

I applied the patch changing 2 lines in scsi_transport_fc.c to if (fc_host_tgtid_bind_type(shost) !=

I'll just describe the Emulex situation to confirm.

Today I tested disconnecting a QLogic port.

Relevant scsi devices are removed from /proc/scsi/scsi.

After reconnecting the cable

Here's the patch again.

diff --git a/drivers/scsi/qla2xxx/qla_settings.h b/drivers/scsi/qla2xxx/qla_settings.h

#define USE_ABORT_TGT 1 /* Use Abort Target mbx cmd */

Please see the log with debug-patch.

Historically the qlogic driver rescan is a 2-phase process:

And upcall via fc_remote_port_add() is done.

A CDB also returns with a completion status of PORT_LOGGED_OUT. From

Driver attempts a relogin:

Here's sysrq output after reconnecting cable without manual disk

After applying the patch the same lock exists:

Now the process looks like the following:

To find the state of HBA ports (online/offline):

To find the WWN numbers of the above ports:
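The two lookups above can be sketched together by reading the `port_state` and `port_name` attributes under /sys/class/fc_host. This is an illustration only: the helper name `list_fc_hosts` and its optional base-directory argument are assumptions, not part of the original text.

```shell
# Print "<host> <port_state> <port_name>" for every FC HBA port.
# The base directory is an optional argument (hypothetical helper);
# it defaults to the sysfs fc_host class directory.
list_fc_hosts() {
    base="${1:-/sys/class/fc_host}"
    for d in "$base"/host*; do
        # Skip the unexpanded glob when no FC hosts are present.
        [ -d "$d" ] || continue
        state=$(cat "$d/port_state" 2>/dev/null)
        wwpn=$(cat "$d/port_name" 2>/dev/null)
        # A "?" marks an attribute that could not be read.
        echo "$(basename "$d") ${state:-?} ${wwpn:-?}"
    done
}
```

Run without arguments, `list_fc_hosts` prints one line per HBA port found on the system, and nothing if the host has no FC adapters.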

Method 2: Using systool

To check the available HBA ports:

To find the WWNs for the HBA ports:
