Hozer Central

Thu, 18 Jan 2007

SLES9, NetApp, and multipathd Howto

Setting up multipathd on SLES9 is a hozer.

Material List

  • NetApp FAS3020 (2 in cluster)
  • QLA2422 HBA (2 per server)
  • Switching Fabric of some sort (2)
  • Target to bang head against (1)

Software List

  • SLES9 SP3
  • multipath-tools-0.4.5-020 (from Novell's update site; the stock package will work too)
  • NetApp Linux Host Utils 3.0

multipath.conf

Take a look at the examples in /usr/share/doc/packages/multipath-tools. Although multipath will work without a configuration file, you will be limited to accessing the LUNs by their WWID. My expert recommendation is to include at least the multipaths, devices, and devnode_blacklist sections.

devnode_blacklist

This section lists devices to be excluded from multipath (with regex matching). If the server boots from local SCSI disks, listing those disks here would be wise. Something along the lines of:
devnode_blacklist {
  devnode "^sda$"
  devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
  devnode "^hd[a-z]"
  devnode "^cciss!c[0-9]d[0-9]*"
}
NOTE: Any devnode pattern starting with sd needs to be terminated with a $. LUN paths can (and probably will) get names like sdaa or sdas, and an unanchored ^sda would match those too.
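To see why the $ matters, here is a quick sketch that tries the anchored and unanchored patterns against a few device names. It uses grep -E as a stand-in for multipath's own regex matching, which is an assumption on my part, and the blacklisted helper is just a made-up name for the demo.

```shell
# Return 0 if the device name in $2 matches the blacklist regex in $1.
# grep -E only approximates how multipath matches devnode patterns.
blacklisted() {
  printf '%s\n' "$2" | grep -Eq "$1"
}

# Compare the anchored and unanchored patterns on a local disk (sda)
# and on two path names taken from the multipath output in this post.
for dev in sda sdas sdal; do
  if blacklisted '^sda$' "$dev"; then
    echo "anchored:   $dev excluded"
  fi
  if blacklisted '^sda' "$dev"; then
    echo "unanchored: $dev excluded"
  fi
done
```

The anchored pattern only catches sda, while the unanchored one also swallows sdas and sdal, which are LUN paths you want multipath to keep.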

multipaths

This is where you map your LUNs to friendly names that you can work with, such as:
multipaths {
  multipath {
    wwid 360a9800043336a414c3a3954725a7869
    alias  my-lun0
  }
  multipath {
    wwid 360a9800043336a414c4a395871437a71
    alias  my-lun1
  }
}
Whoa, you're saying to yourself, how do you figure out your WWID? If you run multipath -d -v2 -l you will get something like:
360a9800043336a414c3a3954725a7869
[size=100 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 2:0:1:1  sdas 66:192 [active][ready]
 \_ 1:0:1:1  sdq  65:0   [active][ready]
\_ round-robin 0 [enabled]
 \_ 2:0:0:1  sdal 66:80  [active][ready]
 \_ 1:0:0:1  sdj  8:144  [active][ready]

360a9800043336a414c4a395871437a71
[size=100 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 2:0:1:0  sdar 66:176 [active][ready]
 \_ 1:0:1:0  sdp  8:240  [active][ready]
\_ round-robin 0 [enabled]
 \_ 2:0:0:0  sdad 65:208 [active][ready]
 \_ 1:0:0:0  sdb  8:16   [active][ready]
The important part here is to look at the common LUN ID each path has. For the LUN with WWID 360a9800043336a414c4a395871437a71 you can see that it is LUN ID 0 in the group, since all of its SCSI paths (host:channel:target:LUN) end in 0.
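If you have more than a couple of LUNs, eyeballing that output gets old. Here is a rough sketch that reads multipath -l style output on stdin and prints each map name with the LUN ID found at the end of its path addresses. The list_luns name is made up, and the parsing assumes the output layout shown above.

```shell
# Read `multipath -l` style output on stdin and print one line per
# distinct (map, LUN ID) pair, where the LUN ID is the last field of
# each path's host:channel:target:LUN address.
list_luns() {
  awk '
    NF == 0 { next }                      # skip blank separator lines
    $1 == "\\_" {
      # Path lines carry an H:C:T:L address; group lines do not.
      if ($2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
        n = split($2, hctl, ":")
        key = map SUBSEP hctl[n]
        if (!(key in seen)) { seen[key] = 1; print map, "lun=" hctl[n] }
      }
      next
    }
    $1 !~ /^\[/ { map = $1 }              # map header: WWID or alias
  '
}
```

Something like multipath -d -v2 -l | list_luns should then give one line per LUN. A map that shows up with two different LUN IDs is worth a second look.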

devices

With the devices section you define devices and options for them. Since we have a clustered NetApp, this is the section that tells multipath which path to prefer. You can access a LUN via either filer head end, although the filers will bitch about it every now and again if you have autosupport turned on and are accessing a LUN via the "wrong" head end. Your section should look something like:
devices {
  device {
    vendor  "NETAPP"
    product  "LUN"
    path_grouping_policy  group_by_prio
    getuid_callout  "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout  "/opt/netapp/santools/mpath_prio_ontap /dev/%n"
    features  "1 queue_if_no_path"
    path_checker  readsector0
    failback  immediate
  }  
}
Most important are the path_grouping_policy and prio_callout entries. With path_grouping_policy set to group_by_prio, paths to the LUN are grouped by their priority, which is determined by the head end you are accessing the LUN through. The prio_callout entry tells multipath how to ask the NetApp what priority each path has.
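When a callout misbehaves it helps to see exactly what multipath would run for a given path. The sketch below fills in the %n placeholder with a device name and prints the resulting command lines without running them. The expand_callout and show_callouts helpers are made up for illustration, and the templates are copied from the devices section above.

```shell
# Expand the %n placeholder in a callout template with a device name,
# roughly the way multipath fills it in before running the command.
expand_callout() {
  printf '%s\n' "$1" | sed "s|%n|$2|g"
}

# Print (but do not run) the two callouts for the device named in $1.
show_callouts() {
  expand_callout '/sbin/scsi_id -g -u -s /block/%n' "$1"
  expand_callout '/opt/netapp/santools/mpath_prio_ontap /dev/%n' "$1"
}
```

Running show_callouts sdb prints the two commands for sdb. Pasting them into a root shell should give you the WWID and the priority for that path, assuming the Host Utils are installed where the config expects them.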

Starting Multipathd

You can now start everything up with:
# /etc/init.d/boot.multipath start
# /etc/init.d/multipathd start
And add them to start on boot:
# insserv boot.multipath multipathd
The multipath devices can now be used in /dev/disk/by-name. To check everything out you can run multipath -d -v2 -ll and see the priority and grouping of the paths:
my-lun0 (360a9800043336a414c3a3954725a7869)
[size=100 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=8][active]
 \_ 2:0:1:1  sdas 66:192 [active][ready]
 \_ 1:0:1:1  sdq  65:0   [active][ready]
\_ round-robin 0 [prio=2][enabled]
 \_ 2:0:0:1  sdal 66:80  [active][ready]
 \_ 1:0:0:1  sdj  8:144  [active][ready]

my-lun1 (360a9800043336a414c4a395871437a71)
[size=100 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=8][active]
 \_ 2:0:1:0  sdar 66:176 [active][ready]
 \_ 1:0:1:0  sdp  8:240  [active][ready]
\_ round-robin 0 [prio=2][enabled]
 \_ 2:0:0:0  sdad 65:208 [active][ready]
 \_ 1:0:0:0  sdb  8:16   [active][ready]
You can see that you now get not only the WWID but also the alias.
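For a quick sanity check across many LUNs, a sketch like this boils the -ll output down to one line per path group: the map name, the group priority, how many paths are in the group, and how many of them are ready. The summarize_groups name is made up and the parsing assumes the output format shown above.

```shell
# Read `multipath -ll` style output on stdin and print one summary
# line per path group: map name, priority, path count, ready count.
summarize_groups() {
  awk '
    function emit() {
      if (prio != "") print map, "prio=" prio, "paths=" paths, "ready=" ready
    }
    NF == 0 { next }                      # skip blank separator lines
    $1 == "\\_" && /\[prio=/ {
      # Group line: flush the previous group and pull out the priority.
      emit()
      prio = $0
      sub(/.*\[prio=/, "", prio)
      sub(/\].*/, "", prio)
      paths = 0; ready = 0
      next
    }
    $1 == "\\_" {
      # Path line: count it, and count it again if it is ready.
      paths++
      if ($0 ~ /\[ready\]/) ready++
      next
    }
    $1 !~ /^\[/ { emit(); prio = ""; map = $1 }   # map header line
    END { emit() }
  '
}
```

Pipe multipath -d -v2 -ll into it. With the setup in this post you would expect two groups per map with two ready paths each, so anything else stands out right away.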

Rebooting

You may find that after rebooting your devices do not appear in /dev/disk/by-name. Run! The sky is falling! The streets will flow with the blood of the non-believers! Just kidding. What is happening is that /etc/init.d/boot.multipath is being run too early. I'm still looking for the "correct" solution, but I have found that editing /etc/init.d/multipathd to call boot.multipath just before multipathd starts works. Here is my diff to the file:
--- multipathd.old      2007-01-17 21:35:41.091274231 -0600
+++ multipathd  2007-01-17 21:35:05.591133859 -0600
@@ -55,6 +55,7 @@
 case "$1" in
     start)
        echo -n "Starting multipathd"
+       /etc/init.d/boot.multipath start
 
        modprobe dm-multipath
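
To find out after a reboot whether you got bitten, a small check like the one below can tell you which aliases are missing. This is only a sketch: missing_aliases is a made-up helper, and the directory and alias names you pass in are whatever your own multipath.conf defines.

```shell
# Print every alias from the argument list that has no entry in the
# directory given as the first argument. Returns non-zero if any
# alias is missing.
missing_aliases() {
  dir=$1
  shift
  rc=0
  for name in "$@"; do
    if [ ! -e "$dir/$name" ]; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}
```

For the aliases used in this post that would be missing_aliases /dev/disk/by-name my-lun0 my-lun1, which prints nothing and returns 0 when both devices showed up.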

posted at: 05:16

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License .