The Basic "How To" On Device Mapper Multipath (Dm-Multipath)

This document provides an overview of device mapper multipath (dm-multipath) in Linux. It explains how dm-multipath creates a single block device for each LUN and assembles paths into priority groups, with only one group active at a time. It also describes the components of dm-multipath and provides explanations of multipath command output.

Uploaded by

robby nazareth

The basic “How To” on device mapper multipath (dm-multipath)

Revision 1.0
Author: Steve Valimaki
Date: 7/1/2011

Contents
1.0 dm-multipath
1.1 dm-multipath output explanation
2.0 Starting dm-multipath
2.1 File Systems
3.0 Multipath Commands
4.0 multipathd
5.0 multipath.conf
6.0 Troubleshooting
Appendix A: RHEL 6 Differences
Appendix B: Manipulating Linux Devices

1.0 Device Mapper Multipath (dm-multipath)
This document provides some details about device mapper multipathing (dm-multipath). Much of the information has been culled
from various internet sources, but I’ve reworded most of the content in an attempt to make it easier to understand.
Additionally, the file /usr/share/doc/device-mapper-multipath-0.4.7/Multipath-usage.txt contains very useful information.

Dm-multipath provides a means of accessing a device over multiple paths in Linux. The device mapper kernel module
creates a single block device for every LUN probed by Linux at boot time (or manually; see later in this document). The device for
each LUN can be found in /dev/mapper/.

How it works (basically)


Paths, which are the physical connections between the initiator* and a specific LUN on the target** device, are assembled into
priority groups. Only one of these priority groups is used at a time for I/O to the device. The priority group currently being used
for I/O is labeled active. A component of dm-multipath called the Path Selector determines which path to use for the next I/O.

* The Fibre Channel or InfiniBand interface within the attached server
** Data storage device

If an I/O fails on the selected active path, that path is disabled and the I/O is retried down a different path within the same
group of paths, the Priority Group. There can be more than one path in this priority group; each path is weighted with a priority
level, which feeds into the Path Group Priority. The highest priority level in the group determines the primary path used to access
the device. If the primary path fails, the path with the next highest priority is used. If every path in the path group fails, a
different priority group is chosen and enabled to continue I/O to the target device.
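
The failover behavior described above can be sketched in a few lines of shell. This is only an illustration of the selection rule, not how the kernel module is implemented. The function name pick_path and the three-column input format (group priority, path name, path state) are assumptions made up for this example.

```shell
#!/bin/sh
# pick_path: read lines of "<group_prio> <path> <state>" on stdin and print
# the path that would carry the next I/O: the first usable path in the
# highest-priority group that still has one. Prints nothing and returns 1
# if every path in every group has failed.
pick_path() {
    awk '
        $3 == "ready" && (best == "" || $1 + 0 > best_prio) {
            best = $2
            best_prio = $1 + 0
        }
        END {
            if (best == "") exit 1
            print best
        }
    '
}

# Hypothetical two-group layout modeled on mpath19 in this document.
printf '%s\n' '50 sdg ready' '10 sdf ready' | pick_path     # prints sdg
printf '%s\n' '50 sdg faulty' '10 sdf ready' | pick_path    # prints sdf
```

With both paths healthy the prio=50 group wins. Once its only path is faulty, I/O moves to the prio=10 group.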

dm-multipath consists of the following components:

The dm-multipath kernel module – provides control over paths and priorities.
The multipath daemon (multipathd) – runs in user space to monitor and control the multipath paths.
The multipath command – used to manipulate (view, edit, flush cached entries…) multipath devices.
The /etc/multipath.conf file – the configuration file read by multipathd that describes the behaviors and attributes of multipath
devices.

Refer to the man pages for these: multipath, multipathd, multipath.conf, kpartx, dmsetup, mpath_prio_alua

1.1 Multipath Output Explanation

# lsscsi -g
[0:0:0:0] disk SEAGATE ST373207LC D703 /dev/sda /dev/sg0
[0:0:6:0] process PE/PV 1x2 SCSI BP 1.0 - /dev/sg1
[1:0:0:0] disk DDN S2A 6620 1.03 /dev/sdd /dev/sg3
[1:0:0:1] disk DDN S2A 6620 1.03 /dev/sde /dev/sg5
[1:0:0:2] disk DDN S2A 6620 1.03 /dev/sdg /dev/sg7
[2:0:0:0] disk DDN S2A 6620 1.03 /dev/sdb /dev/sg2
[2:0:0:1] disk DDN S2A 6620 1.03 /dev/sdc /dev/sg4
[2:0:0:2] disk DDN S2A 6620 1.03 /dev/sdf /dev/sg6

The lsscsi utility is included in some Linux distributions and is also available as a free download. It displays SCSI path information, Vendor and Product
inquiry strings, block devices, and associated SCSI generic devices in easy-to-read output. Refer to the man page for more information. If
you display the SCSI generic (sg) devices, first verify that the sg kernel module is loaded with the command ‘lsmod | grep sg’.

Functioning Multipath Output

# multipath -ll
mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620
[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:2 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdf 8:80 [active][ready]
mpath18 (360001ff0721160000000002588e00001) dm-0 DDN,S2A 6620
[size=2.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdc 8:32 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 1:0:0:1 sde 8:64 [active][ready]
mpath17 (360001ff0721160000000002488df0000) dm-1 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:0 sdd 8:48 [active][ready]

The mpath19 entry from this output is broken down line by line below:

mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620
[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:2 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdf 8:80 [active][ready]

mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620

mpath19                              User Defined Alias Name
360001ff0721160000000002688e10002    WWID of the Device
dm-2                                 Sysfs Name
DDN                                  Vendor
S2A 6620                             Product

Product and Vendor are returned from the SCSI inquiry string.
The sysfs name is the kernel name of the device mapper device (dm-2 in this example).
WWID is the unique identifier for the multipath device; it includes OEM vendor strings, owning controller MAC, and LUN IDs.
The alias name is optional and can be defined in /etc/multipath.conf in conjunction with enabling user_friendly_names.
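
As a small sketch of the field layout just described, the header line can be split apart with awk. The function name parse_map_header is made up for this example, and it assumes the header format shown in this document (alias first, WWID in parentheses), which is what you get with user_friendly_names enabled.

```shell
#!/bin/sh
# parse_map_header: split a "multipath -ll" map header line into labeled
# fields, one per output line.
parse_map_header() {
    printf '%s\n' "$1" | awk '{
        alias = $1
        wwid = $2
        gsub(/[()]/, "", wwid)
        sysfs = $3
        # Everything after the sysfs name is "vendor,product"; the product
        # string may itself contain spaces (for example "S2A 6620").
        rest = $0
        sub(/^[^ ]+ +[^ ]+ +[^ ]+ +/, "", rest)
        comma = index(rest, ",")
        vendor = substr(rest, 1, comma - 1)
        product = substr(rest, comma + 1)
        printf "alias=%s\nwwid=%s\nsysfs=%s\nvendor=%s\nproduct=%s\n", alias, wwid, sysfs, vendor, product
    }'
}

parse_map_header 'mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620'
```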

[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]

size=21T                       Size of the DM Device
features=1 queue_if_no_path    Supported Features
hwhandler=0                    Defined Hardware Handler
rw                             Device Permissions

The defined hardware handler is 0 for almost all devices, except for those storage targets implementing RDAC.

The features value determines what to do when paths have failed. In this example, queue_if_no_path queues I/O while no path is
available. The available settings are defined in /etc/multipath.conf.

Path Group 1
\_ round-robin 0 [prio=50][active]

\_               Path Group Level
round-robin 0    Path Selector and Repeat Count
prio=50          Group Priority
active           Path Group State

The previously mentioned paths are organized into Path Groups. Only one path group can be active at any time. The Path Selector
determines which path in the path group will get the next I/O. This I/O will only go down the active/primary path.

All paths have a specific priority. The priority callout program is defined in the /etc/multipath.conf file and
determines the priority of every path. The group_by_prio path grouping policy uses these path priorities to group paths
together and to determine the priority value shown on the path selector line.

The Path Group State displays the current status of the path group, which can be one of a few different states. The active state
means the path group is the optimal one and is capable of handling I/O. The enabled state means the path group is capable of
handling I/O, but is not the optimal one to take. The disabled state implies that no path is available in the path group, and I/O
will go down another path group if that group has a path in the ready state.

The Path Group Priority is a weighted value. The multipath daemon will use the highest path group priority value to determine the
active path group.

The Path Selector is a component of multipath that chooses which path to take for the next IO. The round-robin algorithm is
currently utilized for load balancing or active-passive pathing.
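
To make the grouping idea concrete, here is a rough sketch of what group_by_prio does with a set of paths. It is not the multipath implementation: the function name, the two-column input format, and the four-path layout are all assumptions for illustration. In a real setup the priority values come from the priority callout rather than being typed in.

```shell
#!/bin/sh
# group_by_prio: read "<path> <priority>" lines on stdin and print one line
# per path group, highest priority first (one group per distinct priority
# value).
group_by_prio() {
    sort -k2,2nr -k1,1 | awk '
        NR == 1 || $2 != prev {
            if (NR > 1) printf "\n"
            printf "prio=%s paths=%s", $2, $1
            prev = $2
            next
        }
        { printf ",%s", $1 }
        END { if (NR > 0) printf "\n" }
    '
}

# Hypothetical four-path LUN: two paths at priority 50, two at priority 10.
printf '%s\n' 'sdf 10' 'sdg 50' 'sdk 50' 'sdj 10' | group_by_prio
```

The first line printed corresponds to the group that would be active; the second is the group that would show as enabled.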

First Path on Path Group 1 (only path in this example, could be more than one)
\_ 1:0:0:2 sdg 8:96 [active][ready]

1:0:0:2    SCSI Path Info, host:channel:scsi id:LUN
sdg        Block Device Name
8:96       Device Major/Minor Numbers
active     DM Path State
ready      Physical Path State

The Path State refers to the physical state of a path. There are currently four states a path can be in. The ready state means
the path is available to handle I/O requests. A faulty state means the path is currently down and cannot handle I/O requests. A shaky
state means the path is available, but for some reason is not capable of handling I/O requests. The ghost state is a passive path in an
active-passive arrangement.

The DM Path State is the multipath kernel module’s state of a path. An active status means that the last I/O requested through this
path completed without incident. A failed status means that the last I/O requested down the path did not complete.

The device Major/Minor numbers are assigned by the Linux kernel and are used to read/write to the device file itself. The major
number refers to the device driver type and the minor number refers to the differentiated device within the device driver type.

Block Device names are used to address the Major/Minor devices. The sdg (in the example above) refers to a SCSI device driver type
(sd) and the differentiated device of this type (g).

SCSI Path Information displays the specific attributes of the path to the device. There are four colon separated attributes; host,
channel, scsi id, and LUN id. The host attribute refers to the initiator (Fibre Channel or Infiniband) port within the server. The channel
and scsi id attributes are set within the HBA/HCA interface. The LUN id displays the LUN number of the SCSI device from the
presented target device. In the example above, you see host 1 / LUN 2.
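
The fields of a path line can be pulled apart the same way. The sketch below assumes the 0.4.7 output layout shown in this document; the function name parse_path_line and the "usable" summary (DM state active and physical state ready) are my own additions for illustration.

```shell
#!/bin/sh
# parse_path_line: split one path line of "multipath -ll" output into its
# fields and report whether the path is usable.
parse_path_line() {
    printf '%s\n' "$1" | awk '{
        # Drop the leading "\_" tree marker; awk then re-splits the fields.
        sub(/^[ \t]*\\_[ \t]*/, "")
        split($1, hctl, ":")
        split($3, devno, ":")
        # Turn "[active][ready]" into two plain words.
        states = $4
        gsub(/\[/, " ", states)
        gsub(/\]/, " ", states)
        split(states, st, " ")
        usable = (st[1] == "active" && st[2] == "ready") ? "yes" : "no"
        printf "host=%s channel=%s id=%s lun=%s dev=%s major=%s minor=%s dm=%s phys=%s usable=%s\n",
            hctl[1], hctl[2], hctl[3], hctl[4], $2, devno[1], devno[2], st[1], st[2], usable
    }'
}

parse_path_line '\_ 1:0:0:2 sdg 8:96 [active][ready]'
```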

Path Group 2
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdf 8:80 [active][ready]

The example above shows a second path group to the same device (referenced in Path Group 1). This path group has a lower
priority (prio=10) than path group 1 (prio=50). The path group state is enabled, meaning the group is ready to service I/O, but is
not the optimal one to send I/O requests down. The physical path state is ready to accept I/O requests and the device mapper
path state is active, which implies the path was last tested successfully.

2.0 Starting Multipath

First, determine whether dm-multipath is installed on your server. Run the following command; if no installed package is
displayed, you will need to download and install it.

# rpm -qa | grep -i multipath


device-mapper-multipath-0.4.7-30.el5

For dm-multipath devices to be created and discovered, you must first enable the services. For these services to start every time the
host server starts, you must enable the chkconfig flags.

# chkconfig --level 2345 multipathd on

# chkconfig --list | grep -i multipathd


multipathd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Now, start the services:

# service multipathd start


Starting multipathd daemon: [ OK ]

2.1 File systems
File systems should be made and mounted on the dm-multipath devices, for example:

# mkfs -t xfs /dev/mapper/mpath17


meta-data=/dev/mapper/mpath17 isize=256 agcount=16, agsize=7733248 blks
= sectsz=512 attr=0
data = bsize=4096 blocks=123731968, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=1
naming =version 2 bsize=4096
log =internal log bsize=4096 blocks=32768, version=1
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0

# mount /dev/mapper/mpath17 /test1

# df -h
Filesystem Size Used Avail Use% Mounted on
...
/dev/mapper/mpath17 472G 4.6M 472G 1% /test1
/dev/mapper/mpath18 2.1T 5.1M 2.1T 1% /test2
/dev/mapper/mpath19 22T 5.1M 22T 1% /test3

3.0 Multipath Commands

The multipath command provides some useful options for discovering, updating, and debugging device mapper target devices. To
list the currently discovered devices in the multipath topology, run the following command:

# multipath -ll
mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620
[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:2 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdg 8:96 [active][ready]
mpath18 (360001ff0721160000000002588e00001) dm-1 DDN,S2A 6620
[size=2.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 1:0:0:1 sdc 8:32 [active][ready]
mpath17 (360001ff0721160000000002488df0000) dm-0 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:0 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:0 sde 8:64 [active][ready]

You can optionally view the multipath topology with more verbose output (mostly for debugging purposes) by increasing the
verbosity level (the higher the number, the more information is displayed). Verbosity level 2 (v2) is the default if not specified.

# multipath -v3 -ll

This produces a lot of verbose output, including blacklists, settings, path lists, etc.

If changes are made to the target devices (adding or removing), you will want to flush the old entries from the cached
configuration. To do this, stop the multipath services, flush the cached entries, and then start the services again. Here is an
example:

# service multipathd stop


Stopping multipathd daemon: [ OK ]

# multipath -F

# service multipathd start


Starting multipathd daemon: [ OK ]
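
If you do this often, the three steps can be wrapped in a small function. This is only a sketch: the names run and flush_multipath_maps and the DRY_RUN switch are my own, not part of multipath-tools. By default it only prints what it would do, so you can review the sequence before running it as root with DRY_RUN=0.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# flush_multipath_maps: stop multipathd, flush the multipath maps, and start
# multipathd again. Stops at the first step that fails.
flush_multipath_maps() {
    run service multipathd stop || return 1
    # -F flushes all unused multipath maps; maps that are in use stay put.
    run multipath -F || return 1
    run service multipathd start
}

flush_multipath_maps
```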

To manually remove a multipath device, you can use the following syntax:

# multipath -f devicename

Another potentially useful command is dmsetup. Reference the man page for details. Here’s an example output that may be useful:

# dmsetup info
Name: mpath19
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 10
Major, minor: 253, 2
Number of targets: 1
UUID: mpath-360001ff0721160000000002688e10002

Name: mpath18
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 4
Major, minor: 253, 1
Number of targets: 1
UUID: mpath-360001ff0721160000000002588e00001

Name: mpath17
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 26
Major, minor: 253, 0
Number of targets: 1
UUID: mpath-360001ff0721160000000002488df0000
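
The dmsetup info output is long when there are many devices. A short awk filter can condense it to one line per device. The function name dm_summary and the output layout are assumptions for this example; in practice you would pipe the real command into it (dmsetup info | dm_summary). Note that the minor number matches the dm-N sysfs name shown by multipath -ll.

```shell
#!/bin/sh
# dm_summary: condense "dmsetup info" output (read from stdin) to one line
# per device: name, state, major:minor, and open count.
dm_summary() {
    awk -F': *' '
        $1 == "Name"         { name = $2 }
        $1 == "State"        { state = $2 }
        $1 == "Open count"   { opens = $2 }
        $1 == "Major, minor" {
            split($2, mm, ", *")
            printf "%s %s %s:%s open=%s\n", name, state, mm[1], mm[2], opens
        }
    '
}

# Sample input copied from the output above.
dm_summary <<'EOF'
Name: mpath19
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 10
Major, minor: 253, 2
Number of targets: 1
UUID: mpath-360001ff0721160000000002688e10002
EOF
```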

4.0 Multipathd

The multipath daemon is responsible for monitoring device mapper paths. If a failed path is discovered, the multipath daemon is
responsible for reconfiguring the multipath map. Reconfiguration is handled outside of the multipath daemon by calling and utilizing
the multipath configuration tool. When reconfiguration by the multipath configuration tool has completed, the multipath daemon is
notified and provides failover to the new, reconfigured path. When the failed path is re-established, the multipath daemon calls
back to the multipath configuration tool to reconfigure back to the optimal path, which in turn notifies the multipath daemon to
failback to the original optimized path.

The multipath daemon also has an interactive session that can be quite useful for debugging and for a general understanding
of the multipath topology. The interactive session is started as shown below.

# multipathd -k
multipathd> ?
fail
multipath-tools v0.4.7 (03/12, 2006)
CLI commands reference:
list|show paths
list|show maps|multipaths
list|show maps|multipaths status
list|show maps|multipaths stats
list|show maps|multipaths topology
list|show topology
list|show map|multipath $map topology
list|show config
list|show blacklist
list|show devices
add path $path
remove|del path $path
add map|multipath $map
remove|del map|multipath $map
switch|switchgroup map|multipath $map group $group
reconfigure
suspend map|multipath $map
resume map|multipath $map
reinstate path $path
fail path $path
disablequeueing map|multipath $map
restorequeueing map|multipath $map
disablequeueing maps|multipaths
restorequeueing maps|multipaths
resize map|multipath $map

Use this interface with care. Commands other than list|show can have unwanted effects, so know what you’re
doing! Use ctrl-d to exit the session.
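
One way to stay on the safe side is a wrapper that only lets list and show commands through. This is a sketch under assumptions: the function name mpathd_show and the DRY_RUN switch are made up here, and it assumes your multipathd accepts a command string directly after -k instead of opening the interactive prompt. Check your version before relying on that.

```shell
#!/bin/sh
# mpathd_show: pass a single command to multipathd, but only if it is a
# read-only "list" or "show" command. With DRY_RUN=1 the command line is
# printed instead of executed.
mpathd_show() {
    case "$1" in
        "list "*|"show "*) ;;
        *)
            echo "refusing non read-only command: $1" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'multipathd -k"%s"\n' "$1"
    else
        multipathd -k"$1"
    fi
}

DRY_RUN=1
mpathd_show 'show topology'
```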

5.0 Multipath.conf

The /etc/multipath.conf file holds configuration information utilized by multipathd. There are many customizations within this file,
and these customizations can be specific to OEM storage platforms and behaviors.

5.1 Multipath.conf Structure


The multipath.conf file is a user editable text file with various sections for entries.

The defaults section of the file defines the values multipath uses unless something more specific applies. An entry has no
effect if it is commented out of the file, and it is overridden by any matching setting in the devices section.

Example defaults section

#defaults {
# udev_dir /dev
# polling_interval 5
# selector "round-robin 0"
# path_grouping_policy failover
# getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
# prio_callout none
# path_checker readsector0
# rr_min_io 1000
# rr_weight uniform
# failback manual
# no_path_retry fail
# user_friendly_names no
# bindings_file "/var/lib/multipath/bindings"
#}

The blacklist section of the multipath.conf file is used to define which device(s) will not be controlled by dm-multipath. An example
of a blacklisted device would be your boot device of the host server with an installed operating system.

Example of the blacklist section

blacklist {
devnode "^sda"
}
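
One thing to watch with devnode entries like the one above: they are regular expressions, not plain names. The sketch below uses grep -E as a stand-in for multipath's own matching (an approximation, and the function name devnode_blacklisted is made up) to show that "^sda" also catches sdaa on a host with many disks, while "^sda$" does not.

```shell
#!/bin/sh
# devnode_blacklisted: return 0 if device name $1 matches any of the devnode
# patterns given as the remaining arguments.
devnode_blacklisted() {
    name=$1
    shift
    for pattern in "$@"; do
        if printf '%s\n' "$name" | grep -E -q -- "$pattern"; then
            return 0
        fi
    done
    return 1
}

for dev in sda sdaa sdb; do
    if devnode_blacklisted "$dev" '^sda'; then
        echo "$dev: blacklisted"
    else
        echo "$dev: not blacklisted"
    fi
done
```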

It should be noted that some device names may not be persistent across reboots. Therefore, to blacklist a specific device, you can
specify the device’s WWID like this:

blacklist {
wwid 26353900f02796769
}

To obtain a drive’s WWID, you can run the following command (although my SCSI boot device did not return a WWID value).

# scsi_id -g -u -s /block/sda
SSEAGATE_ST373207LC_3KT31KD7

The devices section allows for customizations for specific OEM vendor products. Section 5.2 describes these customization
settings.

Example of the devices section

devices {
device {
vendor "DDN"
product "S2A 6620"
path_grouping_policy group_by_prio
prio "alua"
path_checker tur
getuid_callout "/sbin/scsi_id -u -g -p 0x83 -s /block/%n"
prio_callout "/sbin/mpath_prio_alua /dev/%n"
failback immediate
no_path_retry 12
}
}

5.2 Multipath.conf Settings

The /etc/multipath.conf file contains many user-definable customizations. Descriptions of these customizations can be found
in the /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.annotated file.

The defaults section

defaults {

udev_dir /dev
Description: The directory where udev creates its device nodes

verbosity 2
Scope: multipath & multipathd
Description: The verbosity level of the command. It can be overridden by the -v command line option.
Values: 0-6
Default value: 2

polling_interval 5
Scope: multipathd
Description: How often a path's state is checked, in seconds. For paths that are usable, the time between checks will gradually
increase to (4 * polling_interval).
Default value: 5

selector "round-robin 0"
Scope: multipath
Description: The default path selector algorithm to use. These algorithms are offered by the kernel multipath target.
Values: "round-robin 0"
Default value: "round-robin 0"

path_grouping_policy failover
Scope : multipath
Description : The default path grouping policy to apply to unspecified multipaths
Values : failover = 1 path per priority group
multibus = all valid paths in 1 priority group
group_by_serial = 1 priority group per detected serial number
group_by_prio = 1 priority group per path priority value
group_by_node_name = 1 priority group per target node name
Default value : failover

getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
Scope: multipath
Description: The default program and arguments to call out to obtain a unique path identifier. Absolute path required.
Default value: "/sbin/scsi_id -g -u -s /block/%n"

prio_callout "/bin/true"
Scope: multipath
Description: The default program and arguments to call out to obtain a path priority value. For example, the ALUA bits in SPC-3
provide an exploitable prio value. "none" is a valid value.
Default value: (null)

path_checker readsector0
Scope: multipath & multipathd
Description: The default method used to determine the paths' state
Values: directio|tur|hp_sw|rdac|emc_clariion|readsector0|cciss_tur
Default value: readsector0

features "1 queue_if_no_path"


Scope: multipath
Description: The default extra features of multipath devices. The only existing feature currently is queue_if_no_path, which is the
same as setting no_path_retry to queue.
Values: "1 queue_if_no_path"
Default value: (null)

rr_min_io 100
Scope: multipath
Description: The number of IO’s to route to a path before switching to the next in the same path group
Default Value : 1000

max_fds 8192
Scope: multipathd
Description: Sets the maximum number of open file descriptors for the multipathd process.
Values: max|n > 0
Default value: None

rr_weight priorities
Scope: multipath
Description: If set to priorities the multipath configurator will assign path weights as "path prio * rr_min_io"
Values: priorities|uniform
Default value: uniform

failback immediate
Scope: multipathd
Description: Tell the daemon to manage path group failback, or not to. 0 means immediate failback, values >0 means deferred
failback expressed in seconds.
Values: manual|immediate|n > 0
Default value: manual

no_path_retry queue
Scope: multipath & multipathd
Description: The number of retries until queueing is disabled. "fail" means immediate failure (no queueing); "queue" means never
stop queueing.
Values: queue|fail|n (>0)
Default value : (null)

flush_on_last_del yes
Scope: multipathd
Description: If set to "yes", multipathd will disable queueing when the last path to a device has been deleted.
Default value: no

queue_without_daemon yes
Scope: multipathd
Description: If set to "no", multipathd will disable queueing for all devices when it is shut down.
Values: yes|no
Default value: yes

user_friendly_names no
Scope: multipath
Description: If set to "yes", use the bindings file /var/lib/multipath/bindings to assign a persistent and unique alias to the
multipath, in the form of mpath<n>. If set to "no", use the WWID as the alias. In either case this will be overridden by any specific
aliases in this file.
Values: yes|no
Default value: no

bindings_file "/etc/multipath_bindings"
Scope: multipath
Description: The location of the bindings file that is used with the user_friendly_names option.
Values: <full_pathname>
Default value : "/var/lib/multipath/bindings"

mode 0644
Scope: multipath
Description: The mode to use for the multipath device nodes, in octal.
Values: 0000 - 0777
Default value: determined by the process

uid 0
Scope: multipath
Description: The user id to use for the multipath device nodes. You must use the numeric user id.
Values: <user_id_number>
Default value: determined by the process

gid 0
Scope: multipath
Description: The group id to use for the multipath device nodes. You must use the numeric group id.
Values: <group_id_number>
Default value: determined by the process
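
As a worked example of the rr_min_io and rr_weight descriptions above, the arithmetic looks like this. The function name path_weight is made up, and treating "uniform" as simply rr_min_io per path is my reading of the descriptions rather than something spelled out in the annotated file.

```shell
#!/bin/sh
# path_weight: number of I/Os routed to a path before switching to the next
# path in the group, following the "path prio * rr_min_io" rule quoted for
# rr_weight priorities.
path_weight() {
    prio=$1 rr_min_io=$2 rr_weight=$3
    if [ "$rr_weight" = "priorities" ]; then
        echo $((prio * rr_min_io))
    else
        echo "$rr_min_io"
    fi
}

path_weight 50 100 priorities   # prints 5000
path_weight 50 100 uniform      # prints 100
```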

The blacklist section

Scope: multipath & multipathd
Description: list of device names to discard as not multipath candidates
Default value: fd, hd, md, dm, sr, scd, st, ram, raw, loop

blacklist {
wwid 26353900f02796769
# devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z][[0-9]*]"
device {
vendor DEC.*
product MSA[15]00
}
}

Blacklist Exceptions

Scope: multipath & multipathd


Description: list of device names to be treated as multipath candidates even if they are on the blacklist.
Note: blacklist exceptions are only valid within the same class.
It is not possible to blacklist devices using the devnode keyword
and then exempt some of those devices using the wwid keyword.

blacklist_exceptions {
devnode "^dasd[c-d]+[0-9]*"
wwid "IBM.75000000092461.4d00.34"
wwid "IBM.75000000092461.4d00.35"
wwid "IBM.75000000092461.4d00.36"
device {
vendor "IBM"
product "S/390.*"
}
}

The multipaths section

Scope: multipath & multipathd


Description: list of multipaths finest-grained settings

multipaths
Scope: multipath & multipathd
Description: Container for settings that apply to one specific multipath

multipath {

wwid 3600508b4000156d700012000000b0000
Scope : multipath & multipathd
Description: index of the container

alias yellow
Scope : multipath
Description: Symbolic (user friendly) name for the multipath

path_grouping_policy failover
Scope: multipath
Description: Path grouping policy to apply to this multipath
Values : failover = 1 path per priority group
multibus = all valid paths in 1 priority group
group_by_serial = 1 priority group per detected serial number
group_by_prio = 1 priority group per path priority value
group_by_node_name = 1 priority group per target node name
Default value: failover

prio_callout "/sbin/mpath_prio_balance_units %d"
Scope: multipath
Description: The program and arguments to call out to obtain a path weight. Weights are summed for each path group to determine the
next PG to use in case of failure. "none" is a valid value.
Default value: no callout, all paths equal

path_selector "round-robin 0"
Scope: multipath
Description: The path selector algorithm to use for this multipath. These algorithms are offered by the kernel multipath target.
Values: "round-robin 0"
Default value: "round-robin 0"

failback immediate
Scope: multipathd
Description: Tell the daemon to manage path group failback, or not to. 0 means immediate failback, values >0 means deferred
failback expressed in seconds.
Values: manual|immediate|n > 0
Default value: manual

rr_weight priorities
Scope: multipath
Description: If set to priorities the multipath configurator will assign path weights as "path prio * rr_min_io"
Values: priorities|uniform
Default value: uniform

no_path_retry queue
Scope: multipath & multipathd
Description: The number of retries until queueing is disabled. "fail" means immediate failure (no queueing); "queue" means never
stop queueing.
Values: queue|fail|n (>0)
Default value: (null)

flush_on_last_del yes
Scope: multipathd
Description: If set to "yes", multipathd will disable queueing when the last path to a device has been deleted.
Values: yes|no
Default value: no

rr_min_io 100
Scope: multipath
Description: The number of IO to route to a path before switching to the next in the same path group
Default value: 1000

mode 0644
Scope: multipath
Description: The mode to use for the multipath device node, in octal.
Values: 0000 - 0777
Default value: determined by the process

uid 0
Scope: multipath
Description: The user id to use for the multipath device node. You must use the numeric user id.
Values: <user_id_number>
Default value: determined by the process

gid 0
Scope: multipath
Description: The group id to use for the multipath device node. You must use the numeric group id.
Values: <group_id_number>
Default value: determined by the process

The devices section

Scope: multipath & multipathd


Description: List of per storage controller settings. These override the default settings (defaults block) and are in turn
overridden by per multipath settings (multipaths block).

device
Scope: multipath & multipathd
Description: Settings for this specific storage controller

vendor, product
Scope : multipath & multipathd
Desc : Index for the block
vendor "COMPAQ "
product "HSV110 (C)COMPAQ"

path_grouping_policy failover
Scope: multipath
Description: Path grouping policy to apply to multipath hosted by this storage controller
Values: failover = 1 path per priority group
multibus = all valid paths in 1 priority group
group_by_serial = 1 priority group per detected serial number
group_by_prio = 1 priority group per path priority value
group_by_node_name = 1 priority group per target node name
Default value: failover

getuid_callout "/sbin/scsi_id -g -u -s /block/%n"


Scope: multipath
Description: The program and args to callout to obtain a unique path identifier. Absolute path required
Default value: "/sbin/scsi_id -g -u -s /block/%n"

prio_callout "/sbin/mpath_prio_balance_units %d"
Scope: multipath
Description: The program and arguments to call out to obtain a path weight. Weights are summed for each path group to determine the
next PG to use in case of failure. "none" is a valid value.
Default value: no callout, all paths equal

path_checker readsector0
Scope: multipathd
Description: Path checking algorithm to use to check path state
Values: directio|tur|hp_sw|rdac|emc_clariion|readsector0|cciss_tur
Default value: readsector0

path_selector "round-robin 0"
Description: The path selector algorithm to use for this multipath. These algorithms are offered by the kernel multipath target.
Values: "round-robin 0"
Default value: "round-robin 0"

features "1 queue_if_no_path"


Scope: multipath
Description: The default extra features of multipath devices. The only existing feature currently is queue_if_no_path, which is the
same as setting no_path_retry to queue.
Values: "1 queue_if_no_path"
Default value: (null)

hardware_handler "1 emc"


Scope: multipath
Description: If set, it specifies a module that will be used to perform hardware specific actions when switching path groups or
handling IO errors
Values: "0"|"1 emc"|"1 hp-sw"|"1 rdac"
Default value: "0"

rr_weight priorities
Scope: multipath
Description: If set to priorities the multipath configurator will assign path weights as "path prio * rr_min_io"
Values: priorities|uniform
Default value: uniform

no_path_retry queue
Scope: multipath & multipathd
Description: The number of retries until queueing is disabled. "fail" means immediate failure (no queueing); "queue" means never
stop queueing.
Values: queue|fail|n (>0)
Default value: (null)

failback 30
Scope: multipathd
Description: Tell the daemon to manage path group failback, or not to. 0 means immediate failback, values >0 means deferred
failback expressed in seconds.
Values: manual|immediate|n > 0
Default value: manual

rr_min_io 100
Scope: multipath
Description: The number of IO to route to a path before switching to the next in the same path group
Default value: 1000

flush_on_last_del yes
Scope: multipathd
Description: If set to "yes", multipathd will disable queueing when the last path to a device has been deleted.
Values: yes|no
Default value: no

product_blacklist LUN_Z
Scope: multipath & multipathd
Description: Product strings to blacklist for this vendor
Default value: none
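
The order in which the three sections are consulted (multipaths first, then devices, then defaults, as described at the top of the devices section) can be sketched as a simple fall-through. The function name effective_setting is made up, and the sketch ignores any built-in values compiled into multipath-tools.

```shell
#!/bin/sh
# effective_setting: return the value that applies for one option, given the
# value from the matching multipath block, the matching device block, and
# the defaults block. Pass "" where the option is not set.
effective_setting() {
    multipath_value=$1 device_value=$2 default_value=$3
    if [ -n "$multipath_value" ]; then
        echo "$multipath_value"
    elif [ -n "$device_value" ]; then
        echo "$device_value"
    else
        echo "$default_value"
    fi
}

# failback: not set per multipath, "immediate" in the device block,
# "manual" in defaults.
effective_setting "" "immediate" "manual"    # prints immediate
```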

6.0 Troubleshooting

6.1 Blacklists
Be sure you are not blacklisting any devices that you want discovered in the multipath topology.
Here is an example in which the device /dev/sdb is blacklisted, yet it is a device for LUN 0.

First, look at the original output:


# lsscsi -g
[0:0:0:0] disk SEAGATE ST373207LC D703 /dev/sda /dev/sg0
[0:0:6:0] process PE/PV 1x2 SCSI BP 1.0 - /dev/sg1
[1:0:0:0] disk DDN S2A 6620 1.03 /dev/sdd /dev/sg3
[1:0:0:1] disk DDN S2A 6620 1.03 /dev/sde /dev/sg5
[1:0:0:2] disk DDN S2A 6620 1.03 /dev/sdg /dev/sg7
[2:0:0:0] disk DDN S2A 6620 1.03 /dev/sdb /dev/sg2
[2:0:0:1] disk DDN S2A 6620 1.03 /dev/sdc /dev/sg4
[2:0:0:2] disk DDN S2A 6620 1.03 /dev/sdf /dev/sg6

Now the multipath listing of discovered devices for LUN 0:

mpath17 (360001ff0721160000000002488df0000) dm-1 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:0 sdd 8:48 [active][ready]

Note that there is only one path group. Now look at the blacklist section in the /etc/multipath.conf file:

# Blacklist all devices by default. Remove this to enable multipathing
# on the default devices.
blacklist {
devnode "^sda"
devnode "^sdb"
}

This prevents multipath from discovering /dev/sdb in the multipath topology. To fix it, remove the sdb entry from the blacklist so that it looks like this:

# Blacklist all devices by default. Remove this to enable multipathing
# on the default devices.
blacklist {
devnode "^sda"
}

Then you can stop the multipath services, flush all old entries, and restart the multipath services, like this:

# service multipathd stop
Stopping multipathd daemon: [ OK ]

# multipath -F

# service multipathd start
Starting multipathd daemon: [ OK ]

# multipath -ll
mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620
[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:2 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdg 8:96 [active][ready]
mpath18 (360001ff0721160000000002588e00001) dm-1 DDN,S2A 6620
[size=2.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 1:0:0:1 sdc 8:32 [active][ready]
mpath17 (360001ff0721160000000002488df0000) dm-0 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 1:0:0:0 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:0 sde 8:64 [active][ready]

Now we see all devices properly.
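A quick way to sanity-check devnode entries before restarting anything is to test device names against the patterns yourself. The sketch below does that with grep; the function name is_blacklisted is made up for illustration, and it only models devnode patterns as regular expressions matched against the device name, not any other kind of blacklist entry.

```shell
# Hypothetical helper: succeed if the device name matches any of the
# given devnode patterns (treated as regular expressions).
is_blacklisted() {
    dev=$1
    shift
    for pattern in "$@"; do
        if printf '%s\n' "$dev" | grep -Eq "$pattern"; then
            return 0
        fi
    done
    return 1
}

# The two devnode patterns from the original blacklist above.
for dev in sda sdb sdd; do
    if is_blacklisted "$dev" '^sda' '^sdb'; then
        echo "$dev: blacklisted"
    else
        echo "$dev: not blacklisted"
    fi
done
```

This reports sda and sdb as blacklisted and sdd as not blacklisted. Because the patterns are only anchored at the start, "^sda" would also match names such as sdaa on hosts with many disks.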


6.2 Failed Paths

Here is an example of a failed path. The Test Unit Ready (tur) checker has reported that the paths are down. Note that all host 1 paths are faulty, while all
host 2 paths are active and ready. Also note that the priority of each path group containing a faulty path has dropped to 0 (from 50 for mpath17 and mpath19, and from 10 for mpath18).

# multipath -ll
sdb: checker msg is "tur checker reports path is down"
sdc: checker msg is "tur checker reports path is down"
sdd: checker msg is "tur checker reports path is down"
mpath19 (360001ff0721160000000002688e10002) dm-2 DDN,S2A 6620
[size=21T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:2 sdd 8:48 [active][faulty]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:2 sdg 8:96 [active][ready]
mpath18 (360001ff0721160000000002588e00001) dm-1 DDN,S2A 6620
[size=2.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=50][active]
\_ 2:0:0:1 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdc 8:32 [active][faulty]
mpath17 (360001ff0721160000000002488df0000) dm-0 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:0 sdb 8:16 [active][faulty]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:0 sde 8:64 [active][ready]
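To pull only the faulty paths out of a listing like the one above, a small filter is enough. This is a minimal sketch that assumes the older bracketed output format shown in this section; the function name faulty_paths is made up for illustration.

```shell
# Hypothetical helper: print the device name (third field) of every
# path line that the listing marks as [faulty].
faulty_paths() {
    awk '/\[faulty\]/ { print $3 }'
}

# Sample input copied from the mpath17 listing above. On a live
# system you would run: multipath -ll | faulty_paths
faulty_paths <<'EOF'
mpath17 (360001ff0721160000000002488df0000) dm-0 DDN,S2A 6620
[size=472G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 1:0:0:0 sdb 8:16 [active][faulty]
\_ round-robin 0 [prio=10][enabled]
\_ 2:0:0:0 sde 8:64 [active][ready]
EOF
```

For the sample input this prints sdb, the host 1 path of mpath17.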

IO will be re-routed down the failover path. IO transactions already sent will time out and retry, this time going down the failover
path. You can confirm this activity by seeing the failover event on your host, looking at the LEDs of the Fibre Channel or InfiniBand
ports for activity, and looking at IO being received on the storage target device.

6.2.1 Failover event on the host

While running IO to a file system mounted on /dev/mapper/mpath17, which is dm-0, the primary path was failed.

# ls -la /dev/mpath
total 0
drwxr-xr-x 2 root root 100 Apr 21 13:29 .
drwxr-xr-x 13 root root 4280 Apr 21 13:29 ..
lrwxrwxrwx 1 root root 7 Apr 21 13:29 mpath17 -> ../dm-0
lrwxrwxrwx 1 root root 7 Apr 21 13:29 mpath18 -> ../dm-1
lrwxrwxrwx 1 root root 7 Apr 21 13:29 mpath19 -> ../dm-2

Apr 21 15:21:34 DDNfed kernel: qla2xxx 0000:04:0b.0: LOOP DOWN detected (2 5 0).

Fibre Channel path is down (above)

Failover occurs for mpath17 and mpath19 (below)


Apr 21 15:22:04 DDNfed kernel: rport-1:0-0: blocked FC remote port time out: saving binding
Apr 21 15:22:04 DDNfed kernel: sd 1:0:0:0: SCSI error: return code = 0x00010000
Apr 21 15:22:04 DDNfed kernel: end_request: I/O error, dev sdb, sector 7631792
Apr 21 15:22:04 DDNfed kernel: device-mapper: multipath: Failing path 8:16.
Apr 21 15:22:04 DDNfed kernel: sd 1:0:0:0: SCSI error: return code = 0x00010000
Apr 21 15:22:04 DDNfed kernel: end_request: I/O error, dev sdb, sector 7806768
Apr 21 15:22:04 DDNfed kernel: sd 1:0:0:0: SCSI error: return code = 0x00010000
Apr 21 15:22:04 DDNfed kernel: end_request: I/O error, dev sdb, sector 62945944
Apr 21 15:22:04 DDNfed kernel: sd 1:0:0:0: SCSI error: return code = 0x00010000
Apr 21 15:22:04 DDNfed kernel: end_request: I/O error, dev sdb, sector 63382040
Apr 21 15:22:04 DDNfed multipathd: dm-0: devmap already registered
Apr 21 15:22:04 DDNfed multipathd: dm-1: add map (uevent)
Apr 21 15:22:04 DDNfed multipathd: dm-1: devmap already registered
Apr 21 15:22:04 DDNfed multipathd: dm-2: add map (uevent)
Apr 21 15:22:04 DDNfed multipathd: dm-2: devmap already registered
Apr 21 15:22:04 DDNfed multipathd: dm-0: add map (uevent)
Apr 21 15:22:04 DDNfed multipathd: dm-0: devmap already registered
Apr 21 15:22:04 DDNfed multipathd: 8:16: mark as failed
Apr 21 15:22:04 DDNfed multipathd: mpath17: remaining active paths: 1
Apr 21 15:23:23 DDNfed multipathd: sdb: tur checker reports path is down
Apr 21 15:23:23 DDNfed multipathd: mpath17: switch to path group #2
Apr 21 15:24:30 DDNfed multipathd: sdc: tur checker reports path is down
Apr 21 15:24:30 DDNfed multipathd: checker failed path 8:32 in map mpath18
Apr 21 15:24:30 DDNfed kernel: device-mapper: multipath: Failing path 8:32.
Apr 21 15:24:30 DDNfed multipathd: mpath18: remaining active paths: 1
Apr 21 15:24:30 DDNfed multipathd: sdd: tur checker reports path is down
Apr 21 15:24:30 DDNfed kernel: device-mapper: multipath: Failing path 8:48.
Apr 21 15:24:30 DDNfed multipathd: checker failed path 8:48 in map mpath19
Apr 21 15:24:30 DDNfed multipathd: mpath19: remaining active paths: 1
Apr 21 15:24:30 DDNfed multipathd: mpath19: switch to path group #2

SCSI errors and path failover (above)


Apr 21 15:25:26 DDNfed kernel: qla2xxx 0000:04:0b.0: LIP reset occured (f8ef).
Apr 21 15:25:26 DDNfed kernel: qla2xxx 0000:04:0b.0: LIP occured (f8ef).
Apr 21 15:25:26 DDNfed kernel: qla2xxx 0000:04:0b.0: LOOP UP detected (4 Gbps).

Path returns (above)

Failback occurs for mpath17 and mpath19 (below)


Apr 21 15:25:51 DDNfed multipathd: sdc: tur checker reports path is up
Apr 21 15:25:51 DDNfed multipathd: 8:32: reinstated
Apr 21 15:25:51 DDNfed multipathd: mpath18: remaining active paths: 2
Apr 21 15:25:51 DDNfed multipathd: sdd: tur checker reports path is up
Apr 21 15:25:51 DDNfed multipathd: 8:48: reinstated
Apr 21 15:25:51 DDNfed multipathd: mpath19: remaining active paths: 2
Apr 21 15:25:51 DDNfed multipathd: mpath19: switch to path group #1
Apr 21 15:25:51 DDNfed multipathd: mpath19: switch to path group #1
Apr 21 15:25:51 DDNfed multipathd: dm-1: add map (uevent)
Apr 21 15:25:51 DDNfed multipathd: dm-1: devmap already registered
Apr 21 15:25:51 DDNfed multipathd: dm-2: add map (uevent)
Apr 21 15:25:51 DDNfed multipathd: dm-2: devmap already registered
Apr 21 15:26:04 DDNfed multipathd: sdb: tur checker reports path is up
Apr 21 15:26:04 DDNfed multipathd: 8:16: reinstated
Apr 21 15:26:04 DDNfed multipathd: mpath17: remaining active paths: 2
Apr 21 15:26:04 DDNfed multipathd: mpath17: switch to path group #1
Apr 21 15:26:04 DDNfed multipathd: mpath17: switch to path group #1
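The log excerpts above are long, and most of the lines are map registration noise. The sketch below reduces a syslog stream to the multipathd lines that record a path state change or a path group switch. The function name multipath_events is made up for illustration, the match strings are taken from the messages shown above, and /var/log/messages as the log location is an assumption about the host.

```shell
# Hypothetical helper: keep only the multipathd messages about failed
# or reinstated paths and path group switches.
multipath_events() {
    grep 'multipathd:' |
        grep -E 'mark as failed|checker failed path|reinstated|switch to path group'
}

# Sample input copied from the mpath17 messages above. On a live
# system you might run: multipath_events < /var/log/messages
multipath_events <<'EOF'
Apr 21 15:22:04 DDNfed kernel: device-mapper: multipath: Failing path 8:16.
Apr 21 15:22:04 DDNfed multipathd: dm-0: devmap already registered
Apr 21 15:22:04 DDNfed multipathd: 8:16: mark as failed
Apr 21 15:23:23 DDNfed multipathd: mpath17: switch to path group #2
Apr 21 15:26:04 DDNfed multipathd: 8:16: reinstated
Apr 21 15:26:04 DDNfed multipathd: mpath17: switch to path group #1
EOF
```

For the sample input, the four lines that remain show the whole cycle for mpath17: path 8:16 failed, the map switched to path group #2, the path was reinstated, and the map switched back to path group #1.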

Appendix A: RHEL 6 differences

A.1 Multipath output


# multipath -ll
360001ff0775480000000000b8895000b dm-5 DDN,S2A 6620
size=16G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 3:0:0:11 sdj 8:144 active ready running
| `- 1:0:0:11 sdg 8:96 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 3:0:1:11 sdm 8:192 active ready running
  `- 1:0:1:11 sdd 8:48 active ready running
360001ff0775480000000000a8894000a dm-4 DDN,S2A 6620
size=16G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 3:0:1:10 sdl 8:176 active ready running
| `- 1:0:1:10 sdc 8:32 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 3:0:0:10 sdi 8:128 active ready running
  `- 1:0:0:10 sdf 8:80 active ready running

A.2 Multipath.conf

In the defaults section of /etc/multipath.conf, the following options have been added or modified:

multipath_dir
Directory where the dynamic shared objects are stored; default is system dependent, commonly /lib/multipath

find_multipaths
If set to yes, instead of trying to create a multipath device for every non-blacklisted path, multipath will only create a device if one
of three conditions is met: 1) there are at least two non-blacklisted paths with the same WWID, 2) the user manually forces the
creation by specifying a device with the multipath command, or 3) a path has the same WWID as a multipath device that was
previously created while find_multipaths was set (even if that multipath device doesn't currently exist). Whenever a multipath
device is created with find_multipaths set, multipath will remember the WWID of the device, so that it will automatically create the
device again as soon as it sees a path with that WWID. This should allow most users to have multipath automatically choose the
correct paths to make into multipath devices, without having to edit the blacklist. Default is no.
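The first find_multipaths condition (at least two non-blacklisted paths with the same WWID) can be illustrated with a small filter. This is only a sketch of that one condition, not of what multipath does internally. The function name wwids_with_multiple_paths and the "device wwid" input format are made up for illustration, and the WWID shown for sda is a placeholder.

```shell
# Hypothetical helper: read "device wwid" pairs on stdin and print
# each WWID that is seen on two or more paths, with its path count.
wwids_with_multiple_paths() {
    awk '{ count[$2]++ }
         END { for (w in count) if (count[w] >= 2) print w, count[w] }' | sort
}

# sdb and sde share the mpath17 WWID from section 6.1; the sda WWID
# is a made-up placeholder for a single-path local disk.
wwids_with_multiple_paths <<'EOF'
sdb 360001ff0721160000000002488df0000
sde 360001ff0721160000000002488df0000
sda placeholder-local-disk-wwid
EOF
```

Only the WWID shared by sdb and sde is printed, with a count of 2. The single-path local disk is left out, which is the case that would otherwise need a blacklist entry.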

path_selector
The default path selector algorithm to use. The selectors are provided by the kernel multipath target. There are three selector algorithms.

round-robin 0
Loop through every path in the path group, sending the same amount of IO to each.
queue-length 0
Send the next bunch of IO down the path with the least amount of outstanding IO.
service-time 0
Choose the path for the next bunch of IO based on the amount of outstanding IO to the path and its relative throughput.

prio
The default method used to obtain a path priority value. Possible values are

const
Assign a priority of 1 to all paths.

emc
Generate the path priority for EMC arrays

alua
Generate the path priority based on the SCSI-3 ALUA settings.

tpg_pref
Generate the path priority based on the SCSI-3 ALUA settings, using the preferred port bit.

ontap
Generate the path priority for NetApp arrays.

rdac
Generate the path priority for LSI/Engenio RDAC controller.

hp_sw
Generate the path priority for Compaq/HP controller in active/standby mode.

hds
Generate the path priority for Hitachi HDS Modular storage arrays.

Default value is const.

path_checker
The default method used to determine the paths' state. Possible values are

readsector0
Read the first sector of the device

tur
Issue a TEST UNIT READY command to the device.

emc_clariion
Query the EMC Clariion specific EVPD page 0xC0 to determine the path state.

hp_sw
Check the path state for HP storage arrays with Active/Standby firmware.

rdac
Check the path state for LSI/Engenio RDAC storage controller.

directio
Read the first sector with direct I/O.

failback
Tell multipathd how to manage path group failback.

immediate
Immediately failback to the highest priority pathgroup that contains active paths.

manual
Do not perform automatic failback.

followover
Only perform automatic failback when the first path of a pathgroup becomes active. This keeps a node from automatically
failing back when another node requested the failover.

values > 0
Deferred failback (time to defer in seconds).

Default value is manual.

queue_without_daemon
If set to no, multipathd will disable queueing for all devices when it is shut down. Default is yes.

checker_timeout
Specify the timeout to use for path checkers that issue SCSI commands with an explicit timeout, in seconds; default taken
from /sys/block/sd<x>/device/timeout

fast_io_fail_tmo
Specify the number of seconds the SCSI layer will wait after a problem has been detected on a FC remote port before failing
IO to devices on that remote port. This should be smaller than dev_loss_tmo. Setting this to off will disable the timeout.

dev_loss_tmo
Specify the number of seconds the scsi layer will wait after a problem has been detected on a FC remote port before
removing it from the system.
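The rule that fast_io_fail_tmo should be smaller than dev_loss_tmo is easy to check before editing the configuration. The sketch below is one way to express it; the function name tmo_pair_ok is made up for illustration, the values 5, 30, and 60 are arbitrary examples, and treating "off" as acceptable is an assumption (with the timeout disabled there is nothing to compare).

```shell
# Hypothetical helper: succeed when fast_io_fail_tmo is smaller than
# dev_loss_tmo, or when fast_io_fail_tmo is "off" (timeout disabled).
tmo_pair_ok() {
    fast=$1
    loss=$2
    if [ "$fast" = "off" ]; then
        return 0
    fi
    [ "$fast" -lt "$loss" ]
}

# Arbitrary example values, not recommendations.
tmo_pair_ok 5 30 && echo "fast_io_fail_tmo=5 dev_loss_tmo=30: ok"
tmo_pair_ok 60 30 || echo "fast_io_fail_tmo=60 dev_loss_tmo=30: fast_io_fail_tmo is not smaller"
```

The first pair passes and the second is flagged, because a fast_io_fail_tmo of 60 is larger than a dev_loss_tmo of 30.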

The devices section has added the ALUA hardware handler.

hardware_handler
(Optional) The hardware handler to use for this device type. The following hardware handlers are implemented:

1 emc
Hardware handler for EMC storage arrays.

1 alua
Hardware handler for SCSI-3 ALUA arrays.

1 hp_sw
Hardware handler for Compaq/HP controllers.

1 rdac
Hardware handler for the LSI/Engenio RDAC controllers.

Appendix B: Manipulating Linux Devices
When adding, deleting, or modifying SCSI devices, you have two options for seeing these changes take effect.
1. Reboot the server (may not be desirable)
2. Re-scan the SCSI driver for the altered device(s).
To re-scan the bus, you will need to write a 1 into the rescan file for the target device. This will cause the SCSI driver to re-probe the bus
for the target device. Here's an example:

# echo 1 > /sys/block/sdg/device/rescan

To rescan a whole SCSI bus for added or removed devices, you can use pre-configured scripts like this one:
http://bash.cyberciti.biz/diskadmin/rescan-linux-scsi-bus/
I called the script rescan-bus, which provides output like this:

# ./rescan-bus
Host adapter 0 (mptspi) found.
Host adapter 1 (<NULL>) found.
Host adapter 2 (<NULL>) found.
Scanning hosts 0 1 2 channels 0 for
SCSI target IDs 0 1 2 3 4 5 6 7 , LUNs 0
Scanning for device 0 0 0 0 ...
OLD: Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: SEAGATE Model: ST373207LC Rev: D703
Type: Direct-Access ANSI SCSI revision: 03
Scanning for device 0 0 6 0 ...
OLD: Host: scsi0 Channel: 00 Id: 06 Lun: 00
Vendor: PE/PV Model: 1x2 SCSI BP Rev: 1.0
Type: Processor ANSI SCSI revision: 02
Scanning for device 1 0 0 0 ...
OLD: Host: scsi1 Channel: 00 Id: 00 Lun: 00
Host: scsi1 Channel: 00 Id: 00 Lun: 01
Host: scsi1 Channel: 00 Id: 00 Lun: 02
Vendor: DDN Model: S2A 6620 Rev: 1.03
Type: Direct-Access ANSI SCSI revision: 05
Scanning for device 2 0 0 0 ...
OLD: Host: scsi2 Channel: 00 Id: 00 Lun: 00
Host: scsi2 Channel: 00 Id: 00 Lun: 01
Host: scsi2 Channel: 00 Id: 00 Lun: 02
Vendor: DDN Model: S2A 6620 Rev: 1.03
Type: Direct-Access ANSI SCSI revision: 05
0 new device(s) found.
0 device(s) removed.
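If you only need the kernel to look for newly added devices, a common alternative to a downloaded script is to write the wildcard "- - -" (all channels, targets, and LUNs) into the scan file of each SCSI host. The sketch below is not the script referenced above and does not print an OLD/NEW report. The function name rescan_all_hosts is made up for illustration, and the base directory is a parameter only so the loop can be tried on a scratch directory first.

```shell
# Hypothetical helper: ask every SCSI host to scan all channels,
# targets, and LUNs. Defaults to the real sysfs location.
rescan_all_hosts() {
    base=${1:-/sys/class/scsi_host}
    for scan in "$base"/host*/scan; do
        # Skip hosts whose scan file is missing or not writable.
        [ -w "$scan" ] || continue
        echo "rescanning ${scan%/scan}"
        echo "- - -" > "$scan"
    done
}

# Usage on a live system (as root):
#   rescan_all_hosts
```

Follow up with lsscsi -g or multipath -ll to see whether anything new appeared.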

To find the port WWN of a Fibre Channel host adapter port, try this:

# cat /sys/class/fc_host/host1/port_name
0x210000e08b84e9ea

To find the queue_depth of a particular device:

# cat /sys/class/fc_host/host1/device/rport-1\:0-0/target1\:0\:0/1\:0\:0\:2/queue_depth
32

To find the timeout of a particular device:

# cat /sys/class/fc_host/host1/device/rport-1\:0-0/target1\:0\:0/1\:0\:0\:2/timeout
60
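The same two values can be read for every sd device at once through /sys/block/sd<x>/device, the path already used for the rescan and timeout files in this document. The sketch below loops over the devices; the function name show_sd_settings is made up for illustration, and the base directory is a parameter only so the loop can be tried against a scratch directory.

```shell
# Hypothetical helper: print "device queue_depth timeout" for every
# sd device that exposes both files. Defaults to /sys/block.
show_sd_settings() {
    base=${1:-/sys/block}
    for dev in "$base"/sd*; do
        [ -r "$dev/device/queue_depth" ] || continue
        [ -r "$dev/device/timeout" ] || continue
        echo "${dev##*/} $(cat "$dev/device/queue_depth") $(cat "$dev/device/timeout")"
    done
}

show_sd_settings
```

Each output line has the form "sdg 32 60" (device name, queue depth, timeout in seconds), which makes it easy to spot a path whose settings differ from the others.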
