[PATCH v4] nfsd: remove long-standing revoked delegations by force

Li Lingfeng posted 1 patch 2 months ago
fs/nfsd/nfs4state.c | 41 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 39 insertions(+), 2 deletions(-)
[PATCH v4] nfsd: remove long-standing revoked delegations by force
Posted by Li Lingfeng 2 months ago
When file access conflicts occur between clients, the server recalls
delegations. If the client holding delegation fails to return it after
a recall, nfs4_laundromat adds the delegation to cl_revoked list.
This causes subsequent SEQUENCE operations to set the
SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag, forcing the client to
validate all delegations and return the revoked one.

However, if the client fails to return the delegation like this:
nfs4_laundromat                       nfsd4_delegreturn
 unhash_delegation_locked
 list_add // add dp to reaplist
          // by dl_recall_lru
 list_del_init // delete dp from
               // reaplist
                                       destroy_delegation
                                        unhash_delegation_locked
                                         // do nothing but return false
 revoke_delegation
 list_add // add dp to cl_revoked
          // by dl_recall_lru

The delegation will remain in the server's cl_revoked list while the
client marks it revoked and won't find it upon detecting
SEQ4_STATUS_RECALLABLE_STATE_REVOKED.
This leads to a loop:
the server persistently sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the
client repeatedly tests all delegations, severely impacting performance
when numerous delegations exist.

Since abnormal delegations are removed from flc_lease via nfs4_laundromat
--> revoke_delegation --> destroy_unhashed_deleg -->
nfs4_unlock_deleg_lease --> kernel_setlease, and do not block new open
requests indefinitely, retaining such a delegation on the server is
unnecessary.

Reported-by: Zhang Jian <zhangjian496@huawei.com>
Fixes: 3bd64a5ba171 ("nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED")
Closes: https://lore.kernel.org/all/ff8debe9-6877-4cf7-ba29-fc98eae0ffa0@huawei.com/
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
  Changes in v2:
  1) Set SC_STATUS_CLOSED unconditionally in destroy_delegation();
  2) Determine whether to remove the delegation based on SC_STATUS_CLOSED,
     rather than by timeout;
  3) Modify the commit message.

  Changes in v3:
  1) Move variables used for traversal inside the if statement;
  2) Add a comment to explain why we have to do this;
  3) Move the second check of cl_revoked inside the if statement of
     the first check.

  Changes in v4:
  Stuff dp onto a local list under the protect of cl_lock and put all
  the items later.
 fs/nfsd/nfs4state.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 81fa7cc6c77b..30fed3845fa1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1373,6 +1373,11 @@ static void destroy_delegation(struct nfs4_delegation *dp)
 
 	spin_lock(&state_lock);
 	unhashed = unhash_delegation_locked(dp, SC_STATUS_CLOSED);
+	/*
+	 * Unconditionally set SC_STATUS_CLOSED, regardless of whether the
+	 * delegation is hashed, to mark the current delegation as invalid.
+	 */
+	dp->dl_stid.sc_status |= SC_STATUS_CLOSED;
 	spin_unlock(&state_lock);
 	if (unhashed)
 		destroy_unhashed_deleg(dp);
@@ -4507,8 +4512,40 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	default:
 		seq->status_flags = 0;
 	}
-	if (!list_empty(&clp->cl_revoked))
-		seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
+	if (!list_empty(&clp->cl_revoked)) {
+		struct list_head *pos, *next, reaplist;
+		struct nfs4_delegation *dp;
+
+		/*
+		 * Concurrent nfs4_laundromat() and nfsd4_delegreturn()
+		 * may add a delegation to cl_revoked even after the
+		 * client has returned it, causing persistent
+		 * SEQ4_STATUS_RECALLABLE_STATE_REVOKED, disrupting normal
+		 * operations.
+		 * Remove delegations with SC_STATUS_CLOSED from cl_revoked
+		 * to resolve.
+		 */
+		INIT_LIST_HEAD(&reaplist);
+		spin_lock(&clp->cl_lock);
+		list_for_each_safe(pos, next, &clp->cl_revoked) {
+			dp = list_entry(pos, struct nfs4_delegation, dl_recall_lru);
+			if (dp->dl_stid.sc_status & SC_STATUS_CLOSED) {
+				list_del_init(&dp->dl_recall_lru);
+				list_add(&dp->dl_recall_lru, &reaplist);
+			}
+		}
+		spin_unlock(&clp->cl_lock);
+
+		while (!list_empty(&reaplist)) {
+			dp = list_first_entry(&reaplist, struct nfs4_delegation,
+						dl_recall_lru);
+			list_del_init(&dp->dl_recall_lru);
+			nfs4_put_stid(&dp->dl_stid);
+		}
+
+		if (!list_empty(&clp->cl_revoked))
+			seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
+	}
 	if (atomic_read(&clp->cl_admin_revoked))
 		seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED;
 	trace_nfsd_seq4_status(rqstp, seq);
-- 
2.46.1
Re: [PATCH v4] nfsd: remove long-standing revoked delegations by force
Posted by Jeff Layton 2 months ago
On Fri, 2025-10-10 at 11:43 +0800, Li Lingfeng wrote:
> When file access conflicts occur between clients, the server recalls
> delegations. If the client holding delegation fails to return it after
> a recall, nfs4_laundromat adds the delegation to cl_revoked list.
> This causes subsequent SEQUENCE operations to set the
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag, forcing the client to
> validate all delegations and return the revoked one.
> 
> However, if the client fails to return the delegation like this:
> nfs4_laundromat                       nfsd4_delegreturn
>  unhash_delegation_locked
>  list_add // add dp to reaplist
>           // by dl_recall_lru
>  list_del_init // delete dp from
>                // reaplist
>                                        destroy_delegation
>                                         unhash_delegation_locked
>                                          // do nothing but return false
>  revoke_delegation
>  list_add // add dp to cl_revoked
>           // by dl_recall_lru
> 
> The delegation will remain in the server's cl_revoked list while the
> client marks it revoked and won't find it upon detecting
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED.
> This leads to a loop:
> the server persistently sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the
> client repeatedly tests all delegations, severely impacting performance
> when numerous delegations exist.
> 
> Since abnormal delegations are removed from flc_lease via nfs4_laundromat
> --> revoke_delegation --> destroy_unhashed_deleg -->
> nfs4_unlock_deleg_lease --> kernel_setlease, and do not block new open
> requests indefinitely, retaining such a delegation on the server is
> unnecessary.
> 
> Reported-by: Zhang Jian <zhangjian496@huawei.com>
> Fixes: 3bd64a5ba171 ("nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED")
> Closes: https://lore.kernel.org/all/ff8debe9-6877-4cf7-ba29-fc98eae0ffa0@huawei.com/
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>   Changes in v2:
>   1) Set SC_STATUS_CLOSED unconditionally in destroy_delegation();
>   2) Determine whether to remove the delegation based on SC_STATUS_CLOSED,
>      rather than by timeout;
>   3) Modify the commit message.
> 
>   Changes in v3:
>   1) Move variables used for traversal inside the if statement;
>   2) Add a comment to explain why we have to do this;
>   3) Move the second check of cl_revoked inside the if statement of
>      the first check.
> 
>   Changes in v4:
>   Stuff dp onto a local list under the protect of cl_lock and put all
>   the items later.
>  fs/nfsd/nfs4state.c | 41 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 81fa7cc6c77b..30fed3845fa1 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1373,6 +1373,11 @@ static void destroy_delegation(struct nfs4_delegation *dp)
>  
>  	spin_lock(&state_lock);
>  	unhashed = unhash_delegation_locked(dp, SC_STATUS_CLOSED);
> +	/*
> +	 * Unconditionally set SC_STATUS_CLOSED, regardless of whether the
> +	 * delegation is hashed, to mark the current delegation as invalid.
> +	 */
> +	dp->dl_stid.sc_status |= SC_STATUS_CLOSED;

I think we should just unconditionally apply status flags on every call
to unhash_delegation_locked() instead of special-casing it here.

>  	spin_unlock(&state_lock);
>  	if (unhashed)
>  		destroy_unhashed_deleg(dp);
> @@ -4507,8 +4512,40 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	default:
>  		seq->status_flags = 0;
>  	}
> -	if (!list_empty(&clp->cl_revoked))
> -		seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
> +	if (!list_empty(&clp->cl_revoked)) {
> +		struct list_head *pos, *next, reaplist;
> +		struct nfs4_delegation *dp;
> +
> +		/*
> +		 * Concurrent nfs4_laundromat() and nfsd4_delegreturn()
> +		 * may add a delegation to cl_revoked even after the
> +		 * client has returned it, causing persistent
> +		 * SEQ4_STATUS_RECALLABLE_STATE_REVOKED, disrupting normal
> +		 * operations.
> +		 * Remove delegations with SC_STATUS_CLOSED from cl_revoked
> +		 * to resolve.
> +		 */
> +		INIT_LIST_HEAD(&reaplist);
> +		spin_lock(&clp->cl_lock);
> +		list_for_each_safe(pos, next, &clp->cl_revoked) {
> +			dp = list_entry(pos, struct nfs4_delegation, dl_recall_lru);
> +			if (dp->dl_stid.sc_status & SC_STATUS_CLOSED) {
> +				list_del_init(&dp->dl_recall_lru);
> +				list_add(&dp->dl_recall_lru, &reaplist);
> +			}
> +		}
> +		spin_unlock(&clp->cl_lock);
> +
> +		while (!list_empty(&reaplist)) {
> +			dp = list_first_entry(&reaplist, struct nfs4_delegation,
> +						dl_recall_lru);
> +			list_del_init(&dp->dl_recall_lru);
> +			nfs4_put_stid(&dp->dl_stid);
> +		}
> +
> +		if (!list_empty(&clp->cl_revoked))
> +			seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
> +	}
>  	if (atomic_read(&clp->cl_admin_revoked))
>  		seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED;
>  	trace_nfsd_seq4_status(rqstp, seq);

I'm not thrilled with this fix. It seems heavyweight, and it's also
less than ideal to not just release the delegation in the laundromat
when this happens. Given the race you described in the changelog, would
something like this fix the problem too?

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c9053ef4d79f..7977f94fdc1d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1347,13 +1347,14 @@ unhash_delegation_locked(struct nfs4_delegation *dp, unsigned short statusmask)
 
        lockdep_assert_held(&state_lock);
 
-       if (!delegation_hashed(dp))
-               return false;
-
        if (statusmask == SC_STATUS_REVOKED &&
            dp->dl_stid.sc_client->cl_minorversion == 0)
                statusmask = SC_STATUS_CLOSED;
        dp->dl_stid.sc_status |= statusmask;
+
+       if (!delegation_hashed(dp))
+               return false;
+
        if (statusmask & SC_STATUS_ADMIN_REVOKED)
                atomic_inc(&dp->dl_stid.sc_client->cl_admin_revoked);
 
@@ -6899,7 +6900,13 @@ nfs4_laundromat(struct nfsd_net *nn)
                dp = list_first_entry(&reaplist, struct nfs4_delegation,
                                        dl_recall_lru);
                list_del_init(&dp->dl_recall_lru);
-               revoke_delegation(dp);
+               if (dp->dl_stid.sc_status & SC_STATUS_REVOKED)
+                       revoke_delegation(dp);
+               else if (dp->dl_stid.sc_status & SC_STATUS_CLOSED)
+                       destroy_unhashed_deleg(dp);
+               else
+                       pr_warn_once("nfsd: unexpected status when revoking deleg: 0x%hx\n",
+                                    dp->dl_stid.sc_status);
        }
 
        spin_lock(&nn->client_lock);
Re: [PATCH v4] nfsd: remove long-standing revoked delegations by force
Posted by Chuck Lever 2 months ago
On 10/9/25 11:43 PM, Li Lingfeng wrote:
> When file access conflicts occur between clients, the server recalls
> delegations. If the client holding delegation fails to return it after
> a recall, nfs4_laundromat adds the delegation to cl_revoked list.
> This causes subsequent SEQUENCE operations to set the
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag, forcing the client to
> validate all delegations and return the revoked one.
> 
> However, if the client fails to return the delegation like this:
> nfs4_laundromat                       nfsd4_delegreturn
>  unhash_delegation_locked
>  list_add // add dp to reaplist
>           // by dl_recall_lru
>  list_del_init // delete dp from
>                // reaplist
>                                        destroy_delegation
>                                         unhash_delegation_locked
>                                          // do nothing but return false
>  revoke_delegation
>  list_add // add dp to cl_revoked
>           // by dl_recall_lru
> 
> The delegation will remain in the server's cl_revoked list while the
> client marks it revoked and won't find it upon detecting
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED.
> This leads to a loop:
> the server persistently sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the
> client repeatedly tests all delegations, severely impacting performance
> when numerous delegations exist.
> 
> Since abnormal delegations are removed from flc_lease via nfs4_laundromat
> --> revoke_delegation --> destroy_unhashed_deleg -->
> nfs4_unlock_deleg_lease --> kernel_setlease, and do not block new open
> requests indefinitely, retaining such a delegation on the server is
> unnecessary.
> 
> Reported-by: Zhang Jian <zhangjian496@huawei.com>
> Fixes: 3bd64a5ba171 ("nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED")
> Closes: https://lore.kernel.org/all/ff8debe9-6877-4cf7-ba29-fc98eae0ffa0@huawei.com/
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>   Changes in v2:
>   1) Set SC_STATUS_CLOSED unconditionally in destroy_delegation();
>   2) Determine whether to remove the delegation based on SC_STATUS_CLOSED,
>      rather than by timeout;
>   3) Modify the commit message.
> 
>   Changes in v3:
>   1) Move variables used for traversal inside the if statement;
>   2) Add a comment to explain why we have to do this;
>   3) Move the second check of cl_revoked inside the if statement of
>      the first check.
> 
>   Changes in v4:
>   Stuff dp onto a local list under the protect of cl_lock and put all
>   the items later.
>  fs/nfsd/nfs4state.c | 41 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 81fa7cc6c77b..30fed3845fa1 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1373,6 +1373,11 @@ static void destroy_delegation(struct nfs4_delegation *dp)
>  
>  	spin_lock(&state_lock);
>  	unhashed = unhash_delegation_locked(dp, SC_STATUS_CLOSED);
> +	/*
> +	 * Unconditionally set SC_STATUS_CLOSED, regardless of whether the
> +	 * delegation is hashed, to mark the current delegation as invalid.
> +	 */
> +	dp->dl_stid.sc_status |= SC_STATUS_CLOSED;
>  	spin_unlock(&state_lock);
>  	if (unhashed)
>  		destroy_unhashed_deleg(dp);
> @@ -4507,8 +4512,40 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	default:
>  		seq->status_flags = 0;
>  	}
> -	if (!list_empty(&clp->cl_revoked))
> -		seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
> +	if (!list_empty(&clp->cl_revoked)) {
> +		struct list_head *pos, *next, reaplist;

Declare reaplist with:

		LIST_HEAD(reaplist);

Then you don't need the INIT_LIST_HEAD(reaplist); below.


> +		struct nfs4_delegation *dp;
> +
> +		/*
> +		 * Concurrent nfs4_laundromat() and nfsd4_delegreturn()
> +		 * may add a delegation to cl_revoked even after the
> +		 * client has returned it, causing persistent
> +		 * SEQ4_STATUS_RECALLABLE_STATE_REVOKED, disrupting normal
> +		 * operations.
> +		 * Remove delegations with SC_STATUS_CLOSED from cl_revoked
> +		 * to resolve.
> +		 */
> +		INIT_LIST_HEAD(&reaplist);
> +		spin_lock(&clp->cl_lock);
> +		list_for_each_safe(pos, next, &clp->cl_revoked) {
> +			dp = list_entry(pos, struct nfs4_delegation, dl_recall_lru);
> +			if (dp->dl_stid.sc_status & SC_STATUS_CLOSED) {
> +				list_del_init(&dp->dl_recall_lru);
> +				list_add(&dp->dl_recall_lru, &reaplist);
> +			}
> +		}
> +		spin_unlock(&clp->cl_lock);

Yep, this follows the same form used elsewhere in nfs4state.c.


> +
> +		while (!list_empty(&reaplist)) {
> +			dp = list_first_entry(&reaplist, struct nfs4_delegation,
> +						dl_recall_lru);
> +			list_del_init(&dp->dl_recall_lru);
> +			nfs4_put_stid(&dp->dl_stid);
> +		}
> +
> +		if (!list_empty(&clp->cl_revoked))
> +			seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;

Is this checking if cl_revoked still has remaining revoked state? If
yes, a brief comment to that effect would be helpful.


> +	}
>  	if (atomic_read(&clp->cl_admin_revoked))
>  		seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED;
>  	trace_nfsd_seq4_status(rqstp, seq);


-- 
Chuck Lever