NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!"

Nick Bowler <nbowler@elliptictech.com> reports:

> We were just having some NFS server troubles, and my client machine
> running 2.6.38-rc1+ (specifically, commit 2b1caf6ed7b888c95) crashed
> hard (syslog output appended to this mail).
>
> I'm not sure what the exact timeline was or how to reproduce this,
> but the server was rebooted during all this. Since I've never seen
> this happen before, it is possibly a regression from previous kernel
> releases. However, I recently updated my nfs-utils (on the client) to
> version 1.2.3, so that might be related as well.

[ BUG output redacted ]

When the search finds no matching host, the for_each_host loop in
next_host_state() falls through and returns the final host on the
host chain without bumping its reference count.

Since the host's ref count is only one at that point, releasing the
host in nlm_host_rebooted() attempts to destroy the host prematurely,
and therefore hits a BUG().

Likely, the original intent of the for_each_host behavior in
next_host_state() was to handle the case when the host chain is empty.
Searching the chain and finding no suitable host to return needs to be
handled as well.

Defensively restructure next_host_state() always to return NULL when
the loop falls through.
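
For illustration only, the fall-through can be modelled in user space.
This is a minimal sketch, not the lockd code: struct model_host, the
model_for_each_host() macro and both lookup functions are hypothetical
stand-ins, and locking is left out. The one assumption carried over is
that the iterator leaves its cursor on the last entry visited once the
loop ends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct nlm_host. */
struct model_host {
	struct model_host *next;
	int nsm_id;	/* stands in for h_nsmhandle */
	int nsmstate;	/* stands in for h_nsmstate */
	int refcount;	/* stands in for the host's reference count */
};

/*
 * Assumed iterator shape: 'host' is assigned on every step, so after
 * the loop ends it still points at the last entry visited.
 */
#define model_for_each_host(host, pos, head) \
	for ((pos) = (head); (pos) && ((host) = (pos), 1); (pos) = (pos)->next)

/* Pre-patch shape: one exit path returns whatever 'host' holds. */
struct model_host *next_host_buggy(struct model_host *head,
				   int nsm_id, int state)
{
	struct model_host *host = NULL;	/* only covers an empty chain */
	struct model_host *pos;

	model_for_each_host(host, pos, head) {
		if (host->nsm_id == nsm_id && host->nsmstate != state) {
			host->nsmstate = state;
			host->refcount++;
			goto out;
		}
	}
out:
	return host;	/* on fall-through: last host, no reference taken */
}

/* Post-patch shape: a match returns from inside the loop. */
struct model_host *next_host_fixed(struct model_host *head,
				   int nsm_id, int state)
{
	struct model_host *host;
	struct model_host *pos;

	model_for_each_host(host, pos, head) {
		if (host->nsm_id == nsm_id && host->nsmstate != state) {
			host->nsmstate = state;
			host->refcount++;
			return host;
		}
	}

	return NULL;	/* fall-through always means "no match" */
}

/* Two hosts on the chain, neither of which needs a state update. */
void model_demo(void)
{
	struct model_host b = { NULL, 2, 5, 1 };
	struct model_host a = { &b, 1, 5, 1 };

	/* Buggy lookup hands back the last host without a new reference. */
	assert(next_host_buggy(&a, 1, 5) == &b);
	assert(b.refcount == 1);

	/* Fixed lookup reports "nothing to do". */
	assert(next_host_fixed(&a, 1, 5) == NULL);
}
```

In the model, a caller that drops a reference on the pointer returned
by next_host_buggy() would take b's count from one to zero, which
mirrors the premature release described above.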

Introduced by commit b10e30f6 "lockd: reorganize nlm_host_rebooted".

Cc: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

authored by Chuck Lever and committed by Trond Myklebust 80c30e8d f61f6da0

+5 -4
fs/lockd/host.c
@@ -520,7 +520,7 @@
 					struct nsm_handle *nsm,
 					const struct nlm_reboot *info)
 {
-	struct nlm_host *host = NULL;
+	struct nlm_host *host;
 	struct hlist_head *chain;
 	struct hlist_node *pos;
 
@@ -532,12 +532,13 @@
 			host->h_state++;
 
 			nlm_get_host(host);
-			goto out;
+			mutex_unlock(&nlm_host_mutex);
+			return host;
 		}
 	}
-out:
+
 	mutex_unlock(&nlm_host_mutex);
-	return host;
+	return NULL;
 }
 
 /**