[PATCH] nfsd: decrease sc_count directly if fail to queue dl_recall

Li Lingfeng posted 1 patch 8 months, 1 week ago
fs/nfsd/nfs4state.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] nfsd: decrease sc_count directly if fail to queue dl_recall
Posted by Li Lingfeng 8 months, 1 week ago
A deadlock warning occurred when invoking nfs4_put_stid following a failed
dl_recall queue operation:
            T1                            T2
                                nfs4_laundromat
                                 nfs4_get_client_reaplist
                                  nfs4_anylock_blockers
__break_lease
 spin_lock // ctx->flc_lock
                                   spin_lock // clp->cl_lock
                                   nfs4_lockowner_has_blockers
                                    locks_owner_has_blockers
                                     spin_lock // flctx->flc_lock
 nfsd_break_deleg_cb
  nfsd_break_one_deleg
   nfs4_put_stid
    refcount_dec_and_lock
     spin_lock // clp->cl_lock

When a file is opened, an nfs4_delegation is allocated with sc_count
initialized to 1, and the file_lease holds a reference to the delegation.
The file_lease is then associated with the file through kernel_setlease.

The disassociation is performed in nfsd4_delegreturn via the following
call chain:
nfsd4_delegreturn --> destroy_delegation --> destroy_unhashed_deleg -->
nfs4_unlock_deleg_lease --> kernel_setlease --> generic_delete_lease
The corresponding sc_count reference will be released after this
disassociation.

Since nfsd_break_one_deleg executes while holding the flc_lock, the
disassociation process becomes blocked when attempting to acquire flc_lock
in generic_delete_lease. This means:
1) sc_count in nfsd_break_one_deleg will not be decremented to 0;
2) The nfs4_put_stid called by nfsd_break_one_deleg will not attempt to
acquire cl_lock;
3) Consequently, no deadlock condition is created.

Given that sc_count in nfsd_break_one_deleg remains non-zero, we can
safely perform refcount_dec on sc_count directly. This approach
effectively avoids triggering deadlock warnings.

Fixes: 230ca758453c ("nfsd: put dl_stid if fail to queue dl_recall")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 fs/nfsd/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2041268b398a..59a693f22452 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5430,7 +5430,7 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
 	queued = nfsd4_run_cb(&dp->dl_recall);
 	WARN_ON_ONCE(!queued);
 	if (!queued)
-		nfs4_put_stid(&dp->dl_stid);
+		refcount_dec(&dp->dl_stid.sc_count);
 }
 
 /* Called from break_lease() with flc_lock held. */
-- 
2.31.1
Re: [PATCH] nfsd: decrease sc_count directly if fail to queue dl_recall
Posted by cel@kernel.org 8 months, 1 week ago
From: Chuck Lever <chuck.lever@oracle.com>

On Thu, 10 Apr 2025 09:57:08 +0800, Li Lingfeng wrote:
> A deadlock warning occurred when invoking nfs4_put_stid following a failed
> dl_recall queue operation:
>             T1                            T2
>                                 nfs4_laundromat
>                                  nfs4_get_client_reaplist
>                                   nfs4_anylock_blockers
> __break_lease
>  spin_lock // ctx->flc_lock
>                                    spin_lock // clp->cl_lock
>                                    nfs4_lockowner_has_blockers
>                                     locks_owner_has_blockers
>                                      spin_lock // flctx->flc_lock
>  nfsd_break_deleg_cb
>   nfsd_break_one_deleg
>    nfs4_put_stid
>     refcount_dec_and_lock
>      spin_lock // clp->cl_lock
> 
> [...]

Applied to nfsd-testing, thanks!

[1/1] nfsd: decrease sc_count directly if fail to queue dl_recall
      commit: e8f1e5f463b88ce923eefac447abc2079075f921

--
Chuck Lever
Re: [PATCH] nfsd: decrease sc_count directly if fail to queue dl_recall
Posted by Jeff Layton 8 months, 1 week ago
On Thu, 2025-04-10 at 09:57 +0800, Li Lingfeng wrote:
> A deadlock warning occurred when invoking nfs4_put_stid following a failed
> dl_recall queue operation:
>             T1                            T2
>                                 nfs4_laundromat
>                                  nfs4_get_client_reaplist
>                                   nfs4_anylock_blockers
> __break_lease
>  spin_lock // ctx->flc_lock
>                                    spin_lock // clp->cl_lock
>                                    nfs4_lockowner_has_blockers
>                                     locks_owner_has_blockers
>                                      spin_lock // flctx->flc_lock
>  nfsd_break_deleg_cb
>   nfsd_break_one_deleg
>    nfs4_put_stid
>     refcount_dec_and_lock
>      spin_lock // clp->cl_lock
> 

Was this warning generated via static analysis? Given that the refcount
shouldn't go to 0 with the nfs4_put_stid() call in this function, I'm
wondering how the deadlock could occur?

> When a file is opened, an nfs4_delegation is allocated with sc_count
> initialized to 1, and the file_lease holds a reference to the delegation.
> The file_lease is then associated with the file through kernel_setlease.
> 
> The disassociation is performed in nfsd4_delegreturn via the following
> call chain:
> nfsd4_delegreturn --> destroy_delegation --> destroy_unhashed_deleg -->
> nfs4_unlock_deleg_lease --> kernel_setlease --> generic_delete_lease
> The corresponding sc_count reference will be released after this
> disassociation.
> 
> Since nfsd_break_one_deleg executes while holding the flc_lock, the
> disassociation process becomes blocked when attempting to acquire flc_lock
> in generic_delete_lease. This means:
> 1) sc_count in nfsd_break_one_deleg will not be decremented to 0;
> 2) The nfs4_put_stid called by nfsd_break_one_deleg will not attempt to
> acquire cl_lock;
> 3) Consequently, no deadlock condition is created.
> 
> Given that sc_count in nfsd_break_one_deleg remains non-zero, we can
> safely perform refcount_dec on sc_count directly. This approach
> effectively avoids triggering deadlock warnings.
> 
> Fixes: 230ca758453c ("nfsd: put dl_stid if fail to queue dl_recall")
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>  fs/nfsd/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 2041268b398a..59a693f22452 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -5430,7 +5430,7 @@ static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
>  	queued = nfsd4_run_cb(&dp->dl_recall);
>  	WARN_ON_ONCE(!queued);
>  	if (!queued)
> -		nfs4_put_stid(&dp->dl_stid);
> +		refcount_dec(&dp->dl_stid.sc_count);
>  }
>  
>  /* Called from break_lease() with flc_lock held. */

Ok. I think you're right that it's safe to just decrement it since we
are able to do an unconditional increment just above it. It would be
nice to update the comment above this with an explanation so we're not
scratching our heads over this in a couple of years.

Reviewed-by: Jeff Layton <jlayton@kernel.org>