This removes the BKL from the RPC service creation codepath. The BKL
really isn't adequate for this job since some of this info needs
protection across sleeps.
Also, add some comments to try and clarify how the locking should work
and to make it clear that the BKL isn't necessary as long as there is
adequate locking between tasks when touching the svc_serv fields.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* Pass reference to cpumask variable instead of using stack.
For inclusion into sched-devel/latest tree.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or
it will overflow the DTO context. Verify this when processing the
protocol header.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The svc_rdma_send_error function is called when an RPCRDMA protocol
error is detected. This function attempts to post an error reply message.
Since an error posting to a transport in error is ignored, change
the return type to void.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
This race was found by inspection. Messages can be received from the peer
immediately following the rdma_accept call, however, the CQ have not yet
been armed and the transport address has not yet been set.
Set the transport address in the connect request handler and arm the CQ
prior to calling rdma_accept.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The rdma_read_complete function needs to copy the rqstp transport address
from the transport. Failure to do so can result in using the wrong
authentication method for the RPC or bug checking if the rqstp address
is not valid.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Use the ib_verbs version of the dma_unmap service in the
svc_rdma_put_context function. This should support providers
using software rdma.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
When the transport is closing, the DTO tasklet may queue data
that never gets processed. Clean up resources associated with
this I/O.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Move the destruction of the QP and CM_ID to the free path so that the
QP cleanup code doesn't race with the dto_tasklet handling flushed WR.
The QP reference is not needed because we now have a reference for
every WR.
Also add a guard in the SQ and RQ completion handlers to ignore
calls generated by some providers when the QP is destroyed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Some providers may wait while destroying adapter resources.
Since it is possible that the last reference is put on the
dto_tasklet, the actual destroy must be scheduled as a work item.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The rq_cq_reap function is only called from the dto_tasklet. The
only resource shared with other threads is the sc_rq_dto_q. Move the
spin lock to protect only this list.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Replace the one-off linked list implementation used to implement the
context cache with the standard Linux list_head lists. Add a context
counter to catch resource leaks. A WARN_ON will be added later to
ensure that we've freed all contexts.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
An NFS_WRITE requires a set of RDMA_READ requests to fetch the write
data from the client. There are two principal pieces of data that
need to be tracked: the list of pages that comprise the completed RPC
and the SGE of dma mapped pages to refer to this list of pages. Previously
this whole bit was managed as a linked list of contexts with the
context containing the page list buried in this list. This patch
simplifies this processing by not keeping a linked list, but rather only
a pionter from the last submitted RDMA_READ's context to the context
that maps the set of pages that describe the RPC. This significantly
simplifies this code path. SGE contexts are cleaned up inline in the DTO
path instead of at read completion time.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The rdma_read_xdr function did not discriminate between no read-list and
an error posting the read-list. This results in a leak of a page if there
is an error posting the read-list.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
A listening endpoint isn't known to the generic transport switch until
the svc_create_xprt function returns without error. Calling
svc_xprt_put within the xpo_create function causes the module reference
count to be erroneously decremented.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
If an error is encountered trying to post a recv buffer in send_reply,
free the passed in context. Return an error to the caller so it is
aware that the request was not posted.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
If there is an error posting the recv WR to the RQ, free the
context associated with the WR. This would leak a context when
asynchronous errors occurred on the transport while conccurent threads
were processing their RPC.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The svcrdma transport takes a reference when it gets the ESTABLISHED
event from the provider. This reference is supposed to be removed when
the DISCONNECT event is received, however, the call to svc_xprt_put
was missing in the switch statement. This results in the memory
associated with the transport never being freed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Fix the return value on close to -ENOTCONN so caller knows to free context.
Also if a thread is waiting for free SQ space, check for close when waking
to avoid posting WR to a closing transport.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The svc_rdma_send function will attempt to reap SQ WR to make room for
a new request if it finds the SQ full. This function races with the
dto_tasklet that also reaps SQ WR. To avoid polling and arming the CQ
unnecessarily move the test_and_clear_bit of the RDMAXPRT_SQ_PENDING
flag and arming of the CQ to the sq_cq_reap function.
Refactor the rq_cq_reap function to match sq_cq_reap so that the
code is easier to follow.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
The svcrdma transport provider currently allocates receive buffers
to the RQ through the xpo_release_rqst method. This approach is overly
complicated since it means that the rqstp rq_xprt_ctxt has to be
selectively set based on whether the RPC is going to be processed
immediately or deferred. Instead, just post the receive buffer when
we are certain that we are replying in the send_reply function.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Remove a redundant check for the XPT_DEAD bit in the svc_xprt_enqueue
function. This same bit is checked below while holding the pool lock
and prints a debug message if found to be dead.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Commit f15364bd4c ("IPv6 support for NFS
server export caches") dropped a couple spaces, rendering the output
here difficult to read.
(However note that we expect the output to be parsed only by humans, not
machines, so this shouldn't have broken any userland software.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Apparently this causes Solaris 10 servers to refuse our NFSv4 SETCLIENTID
calls. Fall back to root creds for now, since most servers that care are
very likely to have root squashing enabled.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix missing sunrpc kernel-doc:
Warning(linux-2.6.25-git7//net/sunrpc/xprt.c:451): No description found for parameter 'action'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 2203 requires the server to drop the request if it believes the
RPCSEC_GSS context is out of sequence. The problem is that we have no way
on the client to know why the server dropped the request. In order to avoid
spinning forever trying to resend the request, the safe approach is
therefore to always invalidate the RPCSEC_GSS context on every major
timeout.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Suppress a harmless compiler warning.
Index rq_pages[] with an unsigned type. Make "pages" unsigned as well,
as it never represents a value less than zero.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Suppress a harmless compiler warning in the RPC server related
to array indices.
ARRAY_SIZE() returns a size_t, so use unsigned type for a loop index when
looping over arrays.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Update the RPC server's TCP record marker decoder to match the
constructs used by the RPC client's TCP socket transport.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use the 2.6 method for disabling TCP Nagle in the kernel's RPC server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Now that the nfs4 callback thread uses the kthread API, there are no
more users of svc_create_thread(). Remove it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
g_make_token_header() and g_token_size() add two too many, and
therefore their callers pass in "(logical_value - 2)" rather
than "logical_value" as hard-coded values which causes confusion.
This dates back to the original g_make_token_header which took an
optional token type (token_id) value and added it to the token.
This was removed, but the routine always adds room for the token_id
rather than not.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Consistently use unsigned (u32 vs. s32) for seqnum.
In get_mic function, send the local copy of seq_send,
rather than the context version.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
net/sunrpc/svc.c: In function '__svc_create_thread':
net/sunrpc/svc.c:587: warning: 'oldmask.bits[0u]' may be used uninitialized in this function
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Tom Tucker <tom@opengridcomputing.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
cleanup: When adding new encryption types, the checksum length
can be different for each enctype. Face the fact that the
current code only supports DES which has a checksum length of 8.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
cleanup: Fix grammer/typos to use "too" instead of "to"
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send
The svcrdma transport can crash if a send is waiting for an
empty SQ slot and the connection is closed due to an asynchronous error.
The crash is caused when svc_rdma_send attempts to send on a deleted
QP.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
In function svcauth_gss_accept() (net/sunrpc/auth_gss/svcauth_gss.c) the
code that handles GSS integrity and decryption failures should be
returning GARBAGE_ARGS as specified in RFC 2203, sections 5.3.3.4.2 and
5.3.3.4.3.
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
svc_recv() calls alloc_page(), and if it fails it does a 500ms
uninterruptible sleep and then reattempts. There doesn't seem to be any
real reason for this to be uninterruptible, so change it to an
interruptible sleep. Also check for kthread_stop() and signalled() after
setting the task state to avoid races that might lead to sleeping after
kthread_stop() wakes up the task.
I've done some very basic smoke testing with this, but obviously it's
hard to test the actual changes since this all depends on an
alloc_page() call failing.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This field is set once and never used; probably some artifact of an
earlier implementation idea.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This adds IPv6 support to the interfaces that are used to express nfsd
exports. All addressed are stored internally as IPv6; backwards
compatibility is maintained using mapped addresses.
Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji
for comments
Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Brian Haley <brian.haley@hp.com>
Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
When using kthreads that call into svc_recv, we want to make sure that
they do not block there for a long time when we're trying to take down
the kthread.
This patch changes svc_recv() to check kthread_should_stop() at the same
places that it checks to see if it's signalled(). Also check just before
svc_recv() tries to schedule(). By making sure that we check it just
after setting the task state we can avoid having to use any locking or
signalling to ensure it doesn't block for a long time.
There's still a chance of a 500ms sleep if alloc_page() fails, but
that should be a rare occurrence and isn't a terribly long time in
the context of a kthread being taken down.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Needed since the plan is to not have a svc_create_thread helper and to
have current users of that function just call kthread_run directly.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
When a server rejects our credential with an AUTH_REJECTEDCRED or similar,
we need to refresh the credential and then retry the request.
However, we do want to allow any requests that are in flight to finish
executing, so that we can at least attempt to process the replies that
depend on this instance of the credential.
The solution is to ensure that gss_refresh() looks up an entirely new
RPCSEC_GSS credential instead of attempting to create a context for the
existing invalid credential.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>