Age | Commit message | Author | Files | Lines |
|
This patch does two things. First, it allows the tur checker to retry when it
fails with DID_TRANSPORT_DISRUPTED. Second, it makes both calls to check a path
use get_state, to avoid duplicated code.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
|
|
This patch adds a new multipath.conf default parameter, queue_without_daemon.
If this is set to "no", when multipathd stops, queueing will be turned off for
all devices. This is useful for devices that set no_path_retry. If a machine
is shut down while all paths to a device are down, it is possible to hang
waiting for IO to return from the device after multipathd has been stopped.
Without multipathd running, access to the paths cannot be restored, and the
kernel cannot be told to stop queueing IO. Setting queue_without_daemon to "no"
makes multipathd turn off queueing on all devices when it stops, avoiding the
problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
|
|
|
|
Hi!
In latest git -master on line 1443 of multipathd/main.c lock() is called on
exit_mutex, but since exit_mutex is a pthread_mutex_t, pthread_mutex_lock()
is needed.
Attached is the one-liner patch, tested it on a gentoo machine and seems to be
working.
--
Regards,
Rumko
From a6bf54d588c2d0c9d3a97541bcb7b605fd1f3ae0 Mon Sep 17 00:00:00 2001
From: Rumko <rumcic@gmail.com>
Date: Fri, 5 Feb 2010 20:59:21 +0100
Subject: [PATCH] Use pthread_mutex_lock() instead of lock() since we are dealing with a
mutex directly.
|
|
A SCSI device can have far more states than just 'offline' and
'running'. In fact, any device _not_ in state 'running' is
inaccessible to I/O, so running a path checker on these devices
will cause the checker to be delayed and hence stall the entire
daemon.
This patch updates the path_offline() function to return the
actual device state. Path checkers will only be run if the
state is PATH_UP. A 'blocked' device state will be translated
into PATH_PENDING, causing the checkerloop to skip this device
and recheck as soon as possible.
Signed-off-by: Hannes Reinecke <hare@suse.de>
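The state translation described above could be sketched roughly as follows. This is only an illustration: the enum, the function names and the idea of matching the sysfs state string by prefix are assumptions, and treating every state other than 'running' and 'blocked' as PATH_DOWN is a simplification that the commit message does not spell out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical path states; the names follow the commit message. */
enum path_state { PATH_DOWN, PATH_UP, PATH_PENDING };

/*
 * Translate a SCSI device state string (as it might be read from sysfs)
 * into a path state. Only "running" maps to PATH_UP and "blocked" maps
 * to PATH_PENDING; every other state is assumed to be PATH_DOWN here.
 */
static enum path_state state_from_sysfs(const char *state)
{
    assert(state != NULL);
    if (strncmp(state, "running", 7) == 0)
        return PATH_UP;
    if (strncmp(state, "blocked", 7) == 0)
        return PATH_PENDING;
    return PATH_DOWN;
}

/* The checker is only worth running when the device can accept I/O. */
static int should_run_checker(const char *state)
{
    return state_from_sysfs(state) == PATH_UP;
}
```

Comparing only the first seven characters lets the helper accept a state string that still carries the trailing newline sysfs attributes usually have.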
|
|
You should lock the mutex before calling pthread_cond_wait(); otherwise
the behaviour is undefined. In fact we get away with this with glibc,
but with uclibc it causes a segfault.
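A minimal sketch of the locking order that pthread_cond_wait() expects is shown here. The variable and function names are made up for illustration and are not taken from the multipathd sources.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state, for illustration only. */
static pthread_mutex_t exit_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t exit_cond = PTHREAD_COND_INITIALIZER;
static bool exit_requested = false;

/* Block until another thread requests exit. */
static void wait_for_exit(void)
{
    int rc;

    /* The mutex must be held before pthread_cond_wait() is called. */
    rc = pthread_mutex_lock(&exit_mutex);
    assert(rc == 0);
    (void)rc;
    /* Loop on the predicate to cope with spurious wakeups. */
    while (!exit_requested)
        pthread_cond_wait(&exit_cond, &exit_mutex);
    pthread_mutex_unlock(&exit_mutex);
}

/* Set the predicate under the same mutex, then wake the waiter. */
static void request_exit(void)
{
    pthread_mutex_lock(&exit_mutex);
    exit_requested = true;
    pthread_cond_signal(&exit_cond);
    pthread_mutex_unlock(&exit_mutex);
}
```

pthread_cond_wait() releases the mutex while the caller sleeps and reacquires it before returning, which is why the caller has to own it on entry.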
|
|
when the multipath already exists and
1/ new path size is 0
2/ new path size is different than the multipath known size
as per Chandra Seetharaman recommendation.
|
|
added in the map
Hi,
If READ_CAPACITY fails during device discovery, the sd device gets attached
with a device size of 0. Currently multipath discovers these paths and adds
them to the map. The RDAC path checker sends an inquiry on each path to check
its status, which eventually marks the path as up. If the path belongs to the
owning controller, a mode select is issued to switch the pathgroup. But any
I/O sent to this path (a path with size 0) will eventually fail in sd_prep_fn
because of the incorrect device size, resulting in ping-ponging between
pathgroups. We should only allow valid paths to be added to the map. The patch
below checks two conditions before adding a path:
1) device size of path is not 0
2) there is no mismatch between mpp size and new path size.
Thanks,
Vijay
----
multipath should only add paths with a valid size to the map. If there is a
mismatch between the map size and the path size, the path should not be added.
This patch also checks that the device size is not 0 before adding a path. If
READ_CAPACITY fails during device discovery, the sd device gets attached with
a device size of 0. multipath should not allow such a device to be added to
the map.
Signed-off-by: Vijay Chauhan <vijay.chauhan@lsi.com>
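The two checks can be illustrated with a small sketch. The structures below are trimmed-down, hypothetical stand-ins and the helper name is invented; the actual patch may organise the test differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the real structures. */
struct path { unsigned long long size; };
struct multipath { unsigned long long size; };

/*
 * Decide whether a newly discovered path may join an existing map:
 * reject paths that report size 0 (for example because READ_CAPACITY
 * failed) and paths whose size differs from the size the map knows.
 */
static bool path_size_ok(const struct multipath *mpp, const struct path *pp)
{
    assert(mpp != NULL && pp != NULL);
    if (pp->size == 0)
        return false;
    if (pp->size != mpp->size)
        return false;
    return true;
}
```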
|
|
New commands added to the multipathd -k command mode.
Document them in the manpage
Signed-off-by: Ritesh Raj Sarraf <rsarraf@netapp.com>
|
|
Setting max_fds to unlimited doesn't actually work. In the kernel, there is a
fixed limit to the maximum number of open fds a process can have. If you try
to set max_fds to greater than this, it fails. This patch replaces the special
value "unlimited" with a new special value, "max". If you set max_fds to "max",
multipath will use the actual system limit, which it looks up from
/proc/sys/fs/nr_open.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
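Looking up the limit could be done roughly as below. This is a sketch under assumptions: the helper name and its error convention are invented here, and the file name is passed in by the caller rather than hard-coded.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single unsigned integer limit from a procfs-style file.
 * Returns 0 and stores the value in *limit on success, -1 on failure.
 */
static int read_fd_limit(const char *file, unsigned long *limit)
{
    FILE *fp;
    int matched;

    assert(file != NULL && limit != NULL);
    fp = fopen(file, "r");
    if (!fp)
        return -1;
    matched = fscanf(fp, "%lu", limit);
    fclose(fp);
    return matched == 1 ? 0 : -1;
}
```

A caller handling the "max" value would presumably call read_fd_limit("/proc/sys/fs/nr_open", &limit) and then apply the result as the open-file limit; how multipath applies it is not shown in the commit message.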
|
|
Use a more restrictive umask for /var/run/multipathd.sock
Group and Other do not need to access the socket.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
- properly check cli_list_wildcards()'s MALLOC returned pointer
- add missing newline to "blacklisted" reply
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
CVE-2009-0115 taught us that such paths should not be tolerated
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
If multipathd (server) crashes, fail any multipathd (client)
communication gracefully. This patch adds error checking that prevents
calls to recv_packet() if the send_packet() call failed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Running the internal memory checker revealed quite a few memory
leaks.
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
A remove event might be handled after the failed devices have already
been purged from the multipath structure, so a failure here is not
an error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
During shutdown the mpvec pointer can indeed be empty, so we should
check it first before trying to access it.
References: 437245
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
Valgrind found some issues. Also clean up whitespace while we're at it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
Various small improvements to Red Hat's multipathd initscript.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
this bit was missing from the forward-port
|
|
Even when the last path of a multipath device is deleted, it can't be
removed until all the queued IO is flushed. For devices that have
no_path_retry set to queue, this doesn't automatically happen.
This patch adds a "flush_on_last_del" config file option, that causes the
multipath device to automatically turn off queueing when the last path is
deleted. It also adds the "disablequeueing" and "restorequeueing"
multipathd cli commands.
|
|
Setting the stacksize too small just causes
pthread_attr_setstacksize() to fail, leaving you with the default stack
size. On some architectures, the default stacksize is large, like 10MB.
Since you start one waiter thread per multipath device, every 100
devices eats up 1GB of memory.
The other problem is that when I actually read the pthread_attr_init man
page (it can fail. who knew?), I saw that it can fail with ENOMEM. Also,
that it had a function to free it, and that the result of reinitializing
an attr that hadn't been freed was undefined. Clearly, this function
wasn't intended to be called over and over without ever freeing the
attr, which is how we've been using it in multipathd. So, in the spirit
of writing code to the interface, instead of to how it appears to be
currently implemented, how about this.
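The approach described above (initialise the attribute once, check every return value, and destroy it when finished) might look like the following sketch. The function name is hypothetical and the real patch may differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Initialise a thread attribute object once, with an explicit stack size.
 * Returns 0 on success or an errno-style code, for example ENOMEM from
 * pthread_attr_init() or EINVAL for a stack size that is too small.
 */
static int setup_waiter_attr(pthread_attr_t *attr, size_t stacksize)
{
    int rc;

    assert(attr != NULL);
    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(attr, stacksize);
    if (rc != 0) {
        /* Do not silently fall back to the default stack size. */
        pthread_attr_destroy(attr);
        return rc;
    }
    return 0;
}
```

The caller would reuse the same attribute object for every pthread_create() call and release it with pthread_attr_destroy() at shutdown, instead of reinitialising it for each new thread.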
|
|
When the last path in a multipath map was removed, the path wasn't getting
deleted from the pathvec before it was getting freed.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
|
|
This is a patch to fix up the linking. It does two things. First, it makes
libmultipath.so install to /lib/ just like a normal shared library, so you
don't have to use -rpath to link to it. Second, and more importantly,
it moves the libaio linking into libcheckdirectio.so, where it belongs. Since
libcheckdirectio.so is a dynamic shared object, multipath and multipathd don't
know what functions they need to link in from libaio. This fixes the directio
lockup for me.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
|
|
This is mostly a cleanup of some bugs that recently got introduced. In
ACT_RESIZE we were trying to create a read-only device before we tried
to create a read/write one (I also added the ability to fail back to
read-only in ACT_RELOAD). There were some printouts that I assume were
for debugging, and some duplicate code. And I switched it so that
dm_simplecmd_flush did flushing, and dm_simplecmd_noflush didn't.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
|
|
add missing format param
|
|
easier than the magic value used in static dm_simple_cmd to drive
the flush/noflush behaviour
|
|
This is a patch that was initially posted on the dm-devel mailing list:
http://www.linux-archive.org/device-mapper-development/162594-multipath-tools-libmultipath-configure-c-libmu.html
but was never ported over to work with the git version. This
forward-port by me works.
|
|
If multipathd is run with -v3, both the SIGHUP, and the SIGUSR1 signal handlers
will log a message. If a multipathd thread receives one of these signals while
it has a log lock held, it deadlocks itself. Also, the SIGHUP handler will grab
the vecs lock, so if any thread receives a SIGHUP while holding the vecs lock,
it deadlocks itself. This commit blocks the appropriate signals to guard
against this.
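One way to block the two signals in a thread is sketched below with pthread_sigmask(). The helper names are invented, and the commit message does not say where exactly the real code installs the mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/*
 * Block SIGHUP and SIGUSR1 in the calling thread and save the previous
 * mask in *old so that it can be restored later.
 */
static int block_handler_signals(sigset_t *old)
{
    sigset_t set;

    assert(old != NULL);
    sigemptyset(&set);
    sigaddset(&set, SIGHUP);
    sigaddset(&set, SIGUSR1);
    return pthread_sigmask(SIG_BLOCK, &set, old);
}

/* Restore the mask saved by block_handler_signals(). */
static int restore_signals(const sigset_t *old)
{
    return pthread_sigmask(SIG_SETMASK, old, NULL);
}
```

While the signals are blocked, a handler that needs the log lock or the vecs lock cannot interrupt the thread that already holds it; the signal stays pending until the mask is restored or another thread takes it.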
|
|
Due to a stray 'umask()' the socket file is in fact world-writable,
allowing for an easy exploit.
References: 458598
|
|
uev_discard uses sscanf to write a 10 byte string into an array,
but I forgot to take the trailing NUL byte into account.
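The off-by-one can be illustrated as follows: a "%10s" conversion stores up to ten characters plus a terminating NUL, so the destination needs eleven bytes. The helper and macro names are hypothetical and are not the code from uev_discard.

```c
#include <assert.h>
#include <stdio.h>

#define DEV_FIELD_LEN 10    /* maximum characters sscanf may store */

/*
 * Copy at most DEV_FIELD_LEN characters of the first whitespace-delimited
 * token of `in` into `out`. `out` must hold DEV_FIELD_LEN + 1 bytes,
 * because sscanf always appends a terminating NUL after the field.
 */
static int parse_field(const char *in, char out[DEV_FIELD_LEN + 1])
{
    assert(in != NULL && out != NULL);
    return sscanf(in, "%10s", out) == 1 ? 0 : -1;
}
```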
|
|
and don't switch groups.
A previous commit mass-changed #ifdef DAEMON to check for 'mpp->waiter'.
Unfortunately, when the 'domap' function is called with ACT_CREATE in the daemon,
mpp->waiter is not set, hence the multipath client mode logic is chosen.
Fixing this triggers another issue: paths newly added via ACT_CREATE
won't have their waitevent thread created, as the caller checks
mpp->action (which changed to ACT_NOTHING) and won't start the thread.
|
|
Our startup sequence is 'load_config', 'init_checkers', and 'init_prio'.
Both init_* functions reset the list of prio and checkers, which is
unfortunate because load_config, depending on multipath.conf, may already
have loaded prio and checker libraries. This results in double-loading of
the libraries and a memory leak.
|
|
When we shutdown, the main process locks the mutex, causing
all of the free_waiter functions to pile up on their lock.
Once we unlock in the main process, all of the free_waiters
start working. However, the next instruction in the main process
is to destroy the mutex. That short window is all the time the free_waiter
threads have to do their cleanup before they attempt to unlock
the mutex, which might have been de-allocated (and set to NULL).
End result can be a seg-fault.
This fix adds a ref-count to the mutex so that during shutdown
we spin and wait until all of the free_waiter functions
have completed and the ref-count is set to zero.
|
|
|
|
When using dm_mapname it makes a strdup of the returned value. We use
the dm_mapname return value (alias) in our function but neglected
to free it at the exit points.
|
|
This is necessary to make uevents work on fedora, since devpath appears as
something like:
'/devices/pci0000:00/0000:00:0a.0/0000:06:00.0/host11/rport-11:0-1/target11:0:1/11:0:1:0/block/sdi'
It simply strips off everything up to the /block.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
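The stripping step can be sketched with strstr(). The function name is made up, and falling back to the unmodified devpath when no "/block" component is present is an assumption for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the "/block/..." tail of a sysfs devpath, or the
 * unmodified devpath when it contains no "/block" component.
 */
static const char *strip_to_block(const char *devpath)
{
    const char *tail;

    assert(devpath != NULL);
    tail = strstr(devpath, "/block");
    return tail ? tail : devpath;
}
```

For the devpath quoted above, this helper returns a pointer to "/block/sdi".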
|
|
Tested with piped and redirected stdout, using either "multipath -l"
or "multipathd show topo". There seem to be no more blatant regressions.
Let me know if there are, because I'm tempted to go a little further
with those ANSI codes by highlighting degraded/failure situations.
|
|
Previously using the CLI commands from the shell was done using :
multipathd -k"show paths format \"%w %d %i %s %t %T %D\""
Now you can also use :
multipathd -- show paths format "%w %d %i %s %t %T %D"
or
multipathd show paths format "%w %d %i %s %t %T %D"
|
|
Just like "show paths format ...", it gives users more control
over the report format of multipaths information.
Example:
$ sudo multipathd -k'show maps format "%n %s %S %d %t %r %Q %N"'
name              vend/prod/rev    size sysfs dm-st  write_prot queueing paths
353333330000007d0 Linux,scsi_debug 8.0M dm-1  active rw         -        2
353333330000007d1 Linux,scsi_debug 8.0M dm-2  active rw         -        2
35333333000000bb8 Linux,scsi_debug 8.0M dm-3  active rw         -        2
35333333000000bb9 Linux,scsi_debug 8.0M dm-4  active rw         -        2
|
|
If not explicitly set to faulty, the default is undef.
|
|
o don't check offlined paths
Avoids log pollution and useless work
o display offlined status
This information is quite useful because offlining can be done
by the kernel in response to an abnormal situation, but un-offlining
is not automatic. Displaying the status helps admins notice that a
path has been offlined.
|
|
"exit" or "quit" may be more straight-forward than CTRL-D
Put dummy cli commands in place for auto-generated help,
even if we exit from the socket client code before sending
the command packet to the daemon.
|
|
The fact I had to look at the code to find the wildcards to use
in "show paths format ...", "show multipath format ..." and
"show pathgroup format ..." was a clear sign that more help was
necessary.
The "show wildcards" command outputs :
multipath format wildcards:
%n name
%w uuid
%d sysfs
%F failback
%Q queueing
%N paths
%r write_prot
%t dm-st
%S size
%f features
%h hwhandler
%A action
%0 path_faults
%1 switch_grp
%2 map_loads
%3 total_q_time
%4 q_timeouts
%s vend/prod/rev
path format wildcards:
%w uuid
%i hcil
%d dev
%D dev_t
%t dm_st
%T chk_st
%s vend/prod/rev
%C next_check
%p pri
%S size
pathgroup format wildcards:
%s selector
%p pri
%t dm_st
And for example, "show paths format foo:%d:%S:%i" outputs
foo:dev:size:hcil
foo:sda:149G:2:0:0:0
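How such a format string gets expanded can be shown with a toy expander that handles only the %d, %S and %i path wildcards. It is a sketch for illustration: the structure and function are hypothetical and much simpler than whatever the daemon actually does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down path record for illustration. */
struct path_info {
    const char *dev;    /* %d */
    const char *size;   /* %S */
    const char *hcil;   /* %i */
};

/*
 * Expand the %d, %S and %i wildcards of `fmt` into `out`; every other
 * character is copied literally. Output is truncated to fit `len` bytes.
 */
static void expand_path_format(char *out, size_t len, const char *fmt,
                               const struct path_info *pp)
{
    size_t pos = 0;

    assert(out != NULL && len > 0 && fmt != NULL && pp != NULL);
    out[0] = '\0';
    for (; *fmt && pos + 1 < len; fmt++) {
        const char *val = NULL;
        int n;

        if (fmt[0] == '%' && fmt[1] == 'd')
            val = pp->dev;
        else if (fmt[0] == '%' && fmt[1] == 'S')
            val = pp->size;
        else if (fmt[0] == '%' && fmt[1] == 'i')
            val = pp->hcil;

        if (val) {
            n = snprintf(out + pos, len - pos, "%s", val);
            fmt++;      /* skip the wildcard letter */
        } else {
            n = snprintf(out + pos, len - pos, "%c", *fmt);
        }
        if (n < 0)
            break;
        pos += (size_t)n;
    }
}
```

With dev "sda", size "149G" and hcil "2:0:0:0", the format "foo:%d:%S:%i" expands to "foo:sda:149G:2:0:0:0", matching the data line of the example output.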
|
|
For now just print the number of paths in each path checker state,
if not zero. For example :
path checker states:
up 2
down 1
ghost 1
|
|
o move DEFAULT_TARGET define from defaults.h to devmapper.h
o rename DEFAULT_TARGET into TGT_MPATH (multipath)
o introduce TGT_PART (linear)
o remove the type param from functions used only with TGT_MPATH
o abstract dm_addmap() with two wrappers dm_addmap_create() and
dm_addmap_reload(). Wrappers don't require the type and task
params.
o move the dm_addmap(DM_DEVICE_CREATE, ...) cleanup on failure
from configure.c into devmapper.c::dm_addmap_create()
|
|
to let users extract custom reports from the daemon data structures.
format is printf fmt-like with :
%w uuid
%i hcil
%d dev
%D dev_t
%t dm_st
%T chk_st
%s vend/prod/rev
%C next_check
%p pri
%S size
example:
$ multipathd -k'show paths format "/dev/%d is a path to %w"'
|
|
There is no reason why multipath should handle the removal
of the last path from a multipath map differently from any other
path removal. In fact, in doing so we'll shed the stale reference
to the dead device in the map and allow for a clean reconnection.
Signed-off-by: Hannes Reinecke <hare@suse.de>
|
|
which aligns daemon output to "multipath -l" output
|