summaryrefslogtreecommitdiff
path: root/fs/lockd/svc.c
AgeCommit message (Collapse)AuthorFilesLines
2012-10-13Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-15/+2
Pull nfsd update from J Bruce Fields: "Another relatively quiet cycle. There was some progress on my remaining 4.1 todo's, but a couple of them were just of the form "check that we do X correctly", so didn't have much affect on the code. Other than that, a bunch of cleanup and some bugfixes (including an annoying NFSv4.0 state leak and a busy-loop in the server that could cause it to peg the CPU without making progress)." * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits) UAPI: (Scripted) Disintegrate include/linux/sunrpc UAPI: (Scripted) Disintegrate include/linux/nfsd nfsd4: don't allow reclaims of expired clients nfsd4: remove redundant callback probe nfsd4: expire old client earlier nfsd4: separate session allocation and initialization nfsd4: clean up session allocation nfsd4: minor free_session cleanup nfsd4: new_conn_from_crses should only allocate nfsd4: separate connection allocation and initialization nfsd4: reject bad forechannel attrs earlier nfsd4: enforce per-client sessions/no-sessions distinction nfsd4: set cl_minorversion at create time nfsd4: don't pin clientids to pseudoflavors nfsd4: fix bind_conn_to_session xdr comment nfsd4: cast readlink() bug argument NFSD: pass null terminated buf to kstrtouint() nfsd: remove duplicate init in nfsd4_cb_recall nfsd4: eliminate redundant nfs4_free_stateid fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR ...
2012-10-01lockd: per-net NSM client creation and destruction helpers introducedStanislav Kinsbursky1-0/+1
NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and destroying it's mount namespace). It means, that current nsproxy is set to NULL already, but creation of RPC client requires UTS namespace for gaining hostname string. This patch introduces reference counted NFS RPC clients creation and destruction helpers (similar to RPCBIND RPC clients). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-21svcrpc: remove handling of unknown errors from svc_recvJ. Bruce Fields1-15/+2
svc_recv() returns only -EINTR or -EAGAIN. If we really want to worry about the case where it has a bug that causes it to return something else, we could stick a WARN() in svc_recv. But it's silly to require every caller to have all this boilerplate to handle that case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27Lockd: move grace period management from lockd() to per-net functionsStanislav Kinsbursky1-6/+3
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27LockD: pass actual network namespace to grace period management functionsStanislav Kinsbursky1-7/+9
Passed network namespace replaced hard-coded init_net Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27LockD: manage grace list per network namespaceStanislav Kinsbursky1-0/+1
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27LockD: make lockd manager allocated per network namespaceStanislav Kinsbursky1-8/+10
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27LockD: manage grace period per network namespaceStanislav Kinsbursky1-6/+11
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: add debug message to start and stop functionsStanislav Kinsbursky1-0/+5
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: service start function introducedStanislav Kinsbursky1-25/+42
This is just a code move, which from my POV makes the code look better. I.e. now on start we have 3 different stages: 1) Service creation. 2) Service per-net data allocation. 3) Service start. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: move global usage counter manipulation from error pathStanislav Kinsbursky1-3/+2
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: service creation function introducedStanislav Kinsbursky1-11/+27
This function creates service if it doesn't exist, or increases usage counter if it does, and returns a pointer to it. The usage counter will be droppepd by svc_destroy() later in lockd_up(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: use existing per-net data function on service creationStanislav Kinsbursky1-16/+7
This patch also replaces svc_rpcb_setup() with svc_bind(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31LockD: pass service to per-net up and down functionsStanislav Kinsbursky1-7/+5
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31SUNRPC: move per-net operations from svc_destroy()Stanislav Kinsbursky1-12/+15
The idea is to separate service destruction and per-net operations, because these are two different things and the mix looks ugly. Notes: 1) For NFS server this patch looks ugly (sorry for that). But these place will be rewritten soon during NFSd containerization. 2) LockD per-net counter increase int lockd_up() was moved prior to make_socks() to make lockd_down_net() call safe in case of error. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31SUNRPC: new svc_bind() routine introducedStanislav Kinsbursky1-0/+6
This new routine is responsible for service registration in a specified network context. The idea is to separate service creation from per-net operations. Note also: since registering service with svc_bind() can fail, the service will be destroyed and during destruction it will try to unregister itself from rpcbind. In this case unregistration has to be skipped. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-04-11Lockd: pass network namespace to creation and destruction routinesStanislav Kinsbursky1-4/+3
v2: dereference of most probably already released nlm_host removed in nlmclnt_done() and reclaimer(). These routines are called from locks reclaimer() kernel thread. This thread works in "init_net" network context and currently relays on persence on lockd thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay on current network context. So let's pass corrent one into them. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26Merge nfs containerization work from Trond's treeJ. Bruce Fields1-15/+102
The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.
2012-02-17lockd: fix arg parsing for grace_period and timeout.NeilBrown1-1/+1
If you try to set grace_period or timeout via a module parameter to lockd, and do this on a big-endian machine where sizeof(int) != sizeof(unsigned long) it won't work. This number given will be effectively shifted right by the difference in those two sizes. So cast kp->arg properly to get correct result. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15Lockd: shutdown NLM hosts in network namespace contextStanislav Kinsbursky1-1/+3
Lockd now managed in network namespace context. And this patch introduces network namespace related NLM hosts shutdown in case of releasing per-net Lockd resources. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15Lockd: per-net up and down routines introducedStanislav Kinsbursky1-2/+45
This patch introduces per-net Lockd initialization and destruction routines. The logic is the same as in global Lockd up and down routines. Probably the solution is not the best one. But at least it looks clear. So per-net "up" routine are called only in case of lockd is running already. If per-net resources are not allocated yet, then service is being registered with local portmapper and lockd sockets created. Per-net "down" routine is called on every lockd_down() call in case of global users counter is not zero. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15Lockd: pernet usage counter introducedStanislav Kinsbursky1-3/+42
Lockd is going to be shared between network namespaces - i.e. going to be able to handle lock requests from different network namespaces. This means, that network namespace related resources have to be allocated not once (like now), but for every network namespace context, from which service is requested to operate. This patch implements Lockd per-net users accounting. New per-net counter is used to determine, when per-net resources have to be freed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15Lockd: create permanent lockd sockets in current network namespaceStanislav Kinsbursky1-10/+13
This patch parametrizes Lockd permanent sockets creation routine by network namespace context. It also replaces hard-coded init_net with current network namespace context in Lockd sockets creation routines. This approach looks safe, because Lockd is created during NFS mount (or NFS server start) and thus socket is required exactly in current network namespace context. But in the same time it means, that Lockd sockets inherits first Lockd requester network namespace. This issue will be fixed in further patches of the series. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31SUNRPC: search for service transports in network namespace contextStanislav Kinsbursky1-1/+1
Service transports are parametrized by network namespace. And thus lookup of transport instance have to take network namespace into account. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19sunrpc: use better NUMA affinitiesEric Dumazet1-1/+1
Use NUMA aware allocations to reduce latencies and increase throughput. sunrpc kthreads can use kthread_create_on_node() if pool_mode is "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can also take into account NUMA node affinity for memory allocations. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: "J. Bruce Fields" <bfields@fieldses.org> CC: Neil Brown <neilb@suse.de> CC: David Miller <davem@davemloft.net> Reviewed-by: Greg Banks <gnb@fastmail.fm> [bfields@redhat.com: fix up caller nfs41_callback_up] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-27lockd: push lock_flocks downArnd Bergmann1-11/+0
lockd should use lock_flocks() instead of lock_kernel() to lock against posix locks accessing the i_flock list. This is a prerequisite to turning lock_flocks into a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01sunrpc: Add net argument to svc_create_xprtPavel Emelyanov1-1/+1
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-1/+0
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-01-26SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()Chuck Lever1-2/+0
Clean up: Bruce observed we have more or less common logic in each of svc_create_xprt()'s callers: the check to create an IPv6 RPC listener socket only if CONFIG_IPV6 is set. I'm about to add another case that does just the same. If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt() call sites can get rid of the "#ifdef" ugliness, and can use the same logic with or without IPv6 support available in the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-18sysctl: Drop & in front of every proc_handler.Eric W. Biederman1-6/+6
For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12sysctl fs: Remove dead binary sysctl supportEric W. Biederman1-11/+3
Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-05-06lockd: fix list corruption on lockd restartJ. Bruce Fields1-4/+11
If lockd is signalled soon enough after restart then locks_start_grace() will try to re-add an entry to a list and trigger a lock corruption warning. Thanks to Wang Chen for the problem report and diagnosis. WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c() ... list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128). ... Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3 Call Trace: [<c042d5b5>] warn_slowpath+0x71/0xa0 [<c0422a96>] ? update_curr+0x11d/0x125 [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c044b270>] ? trace_hardirqs_on+0xb/0xd [<c051c61a>] ? _raw_spin_lock+0x53/0xfa [<c051c89f>] __list_add+0x27/0x5c [<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd] [<ef8f34da>] set_grace_period+0x39/0x53 [lockd] [<c06b8921>] ? lock_kernel+0x1c/0x28 [<ef8f3558>] lockd+0x64/0x164 [lockd] [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c04227b0>] ? complete+0x34/0x3e [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<c043dd42>] kthread+0x45/0x6b [<c043dcfd>] ? kthread+0x0/0x6b [<c0403c23>] kernel_thread_helper+0x7/0x10 Reported-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org
2009-03-28lockd: Start PF_INET6 listener only if IPv6 support is availableChuck Lever1-9/+21
Apparently a lot of people need to disable IPv6 completely on their distributor-built systems, which have CONFIG_IPV6_MODULE enabled at build time. They do this by blacklisting the ipv6.ko module. This causes the creation of the lockd service listener to fail if CONFIG_IPV6_MODULE is set, but the module cannot be loaded. Now that the kernel's PF_INET6 RPC listeners are completely separate from PF_INET listeners, we can always start PF_INET. Then lockd can try to start PF_INET6, but it isn't required to be available. Note this has the added benefit that NLM callbacks from AF_INET6 servers will never come from AF_INET remotes. We no longer have to worry about matching mapped IPv4 addresses to AF_INET when comparing addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-28NFS: Revert creation of IPv6 listeners for lockd and NFSv4 callbacksChuck Lever1-12/+1
We're about to convert over to using separate PF_INET and PF_INET6 listeners, instead of a single PF_INET6 listener that also receives AF_INET requests and maps them to AF_INET6. Clear the way by removing the logic in lockd and the NFSv4 callback server that creates an AF_INET6 service listener. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-28SUNRPC: Remove @family argument from svc_create() and svc_create_pooled()Chuck Lever1-1/+1
Since an RPC service listener's protocol family is specified now via svc_create_xprt(), it no longer needs to be passed to svc_create() or svc_create_pooled(). Remove that argument from the synopsis of those functions, and remove the sv_family field from the svc_serv struct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-28SUNRPC: Change svc_create_xprt() to take a @family argumentChuck Lever1-1/+2
The sv_family field is going away. Pass a protocol family argument to svc_create_xprt() instead of extracting the family from the passed-in svc_serv struct. Again, as this is a listener socket and not an address, we make this new argument an "int" protocol family, instead of an "sa_family_t." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-01-07NLM: Clean up flow of control in make_socks() functionChuck Lever1-8/+14
Clean up: Use Bruce's preferred control flow style in make_socks(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-07NLM: Refactor make_socks() functionChuck Lever1-15/+16
Clean up: extract common logic in NLM's make_socks() function into a helper. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06lockd: Enable NLM use of AF_INET6Chuck Lever1-1/+12
If the kernel is configured to support IPv6 and the RPC server can register services via rpcbindv4, we are all set to enable IPv6 support for lockd. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aime Le Rouzic <aime.le-rouzic@bull.net> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Move nsm_use_hostnames to mon.cChuck Lever1-1/+0
Clean up. Treat the nsm_use_hostnames global variable like nsm_local_state. Note that the default value of nsm_use_hostnames is still zero. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Remove include/linux/lockd/sm_inter.hChuck Lever1-1/+0
Clean up: The include/linux/lockd/sm_inter.h header is nearly empty now. Remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)Jeff Layton1-0/+8
The default method for calculating the number of connections allowed per RPC service arbitrarily limits single-threaded services to 80 connections. This is too low for services like lockd and artificially limits the number of TCP clients that it can support. Have lockd set a default sv_maxconn value to 1024 (which is the typical default value for RLIMIT_NOFILE. Also add a module parameter to allow an admin to set this to an arbitrary value. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-12-23LOCKD: Make lockd_up() and lockd_down() exported GPL-onlyTrond Myklebust1-3/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-11-24nfsd: clean up grace period on early exitJ. Bruce Fields1-0/+1
If nfsd was shut down before the grace period ended, we could end up with a freed object still on grace_list. Thanks to Jeff Moyer for reporting the resulting list corruption warnings. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Jeff Moyer <jmoyer@redhat.com>
2008-10-04NLM: Remove "proto" argument from lockd_up()Chuck Lever1-2/+1
Clean up: Now that lockd_up() starts listeners for both transports, the "proto" argument is no longer needed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-04NLM: Always start both UDP and TCP listenersChuck Lever1-18/+19
Commit 24e36663, which first appeared in 2.6.19, changed lockd so that the client side starts a UDP listener only if there is a UDP NFSv2/v3 mount. Its description notes: This... means that lockd will *not* listen on UDP if the only mounts are TCP mount (and nfsd hasn't started). The latter is the only one that concerns me at all - I don't know if this might be a problem with some servers. Unfortunately it is a problem for Linux itself. The rpc.statd daemon on Linux uses UDP for contacting the local lockd, no matter which protocol is used for NFS mounts. Without a local lockd UDP listener, NFSv2/v3 lock recovery from Linux NFS clients always fails. Revert parts of commit 24e36663 so lockd_up() always starts both listeners. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03nfsd: common grace period controlJ. Bruce Fields1-15/+5
Rewrite grace period code to unify management of grace period across lockd and nfsd. The current code has lockd and nfsd cooperate to compute a grace period which is satisfactory to them both, and then individually enforce it. This creates a slight race condition, since the enforcement is not coordinated. It's also more complicated than necessary. Here instead we have lockd and nfsd each inform common code when they enter the grace period, and when they're ready to leave the grace period, and allow normal locking only after both of them are ready to leave. We also expect the locks_start_grace()/locks_end_grace() interface here to be simpler to build on for future cluster/high-availability work, which may require (for example) putting individual filesystems into grace, or enforcing grace periods across multiple cluster nodes. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29lockd: don't depend on lockd main loop to end graceJ. Bruce Fields1-11/+13
End lockd's grace period using schedule_delayed_work() instead of a check on every pass through the main loop. After a later patch, we'll depend on lockd to end its grace period even if it's not currently handling requests; so it shouldn't depend on being woken up from the main loop to do so. Also, Nakano Hiroaki (who independently produced a similar patch) noticed that the current behavior is buggy in the face of jiffies wraparound: "lockd uses time_before() to determine whether the grace period has expired. This would seem to be enough to avoid timer wrap-around issues, but, unfortunately, that is not the case. The time_* family of comparison functions can be safely used to compare jiffies relatively close in time, but they stop working after approximately LONG_MAX/2 ticks. nfsd can suffer this problem because the time_before() comparison in lockd() is not performed until the first request comes in, which means that if there is no lockd traffic for more than LONG_MAX/2 ticks we are screwed. "The implication of this is that once time_before() starts misbehaving any attempt from a NFS client to execute fcntl() will be received with a NLM_LCK_DENIED_GRACE_PERIOD message for 25 days (assuming HZ=1000). In other words, the 50 seconds grace period could turn into a grace period of 50 days or more. "Note: This bug was analyzed independently by Oda-san <oda@valinux.co.jp> and myself." Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Nakano Hiroaki <nakano.hiroaki@oss.ntt.co.jp> Cc: Itsuro Oda <oda@valinux.co.jp>
2008-09-29locks: allow lockd to process blocked locks during grace periodJ. Bruce Fields1-9/+3
The check here is currently harmless but unnecessary, since, as the comment notes, there aren't any blocked-lock callbacks to process during the grace period anyway. And eventually we want to allow multiple grace periods that come and go for different filesystems over the course of the lifetime of lockd, at which point this check is just going to get in the way. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-29SUNRPC: Add address family field to svc_serv data structureChuck Lever1-1/+1
Introduce and initialize an address family field in the svc_serv structure. This field will determine what family to use for the service's listener sockets and what families are advertised via the local rpcbind daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>