[RFC PATCH] nfsd: convert the nfsd_users to atomic_t

chenxiaosong@chenxiaosong.com posted 1 patch 3 months, 3 weeks ago
fs/nfsd/nfssvc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
[RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by chenxiaosong@chenxiaosong.com 3 months, 3 weeks ago
From: ChenXiaoSong <chenxiaosong@kylinos.cn>

Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]):

 nfsd: last server has exited, flushing export cache
 NFSD: starting 90-second grace period (net f0000030)
 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
 ...
 Call trace:
  __queue_work+0xb4/0x558
  queue_work_on+0x88/0x90
  nfsd4_probe_callback+0x4c/0x58 [nfsd]
 NFSD: starting 90-second grace period (net f0000030)
  nfsd4_probe_callback_sync+0x20/0x38 [nfsd]
  nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd]
  nfsd4_create_session+0x5b8/0x718 [nfsd]
  nfsd4_proc_compound+0x4c0/0x710 [nfsd]
  nfsd_dispatch+0x104/0x248 [nfsd]
  svc_process_common+0x348/0x808 [sunrpc]
  svc_process+0xb0/0xc8 [sunrpc]
  nfsd+0xf0/0x160 [nfsd]
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18
 Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262)
 SMP: stopping secondary CPUs
 Starting crashdump kernel...
 Bye!

One of the cases is:

    task A (cpu 1)    |   task B (cpu 2)     |   task C (cpu 3)
 ---------------------|----------------------|---------------------------------
 nfsd_startup_generic | nfsd_startup_generic |
   nfsd_users == 0    |  nfsd_users == 0     |
   nfsd_users++       |  nfsd_users++        |
   nfsd_users == 1    |                      |
   ...                |                      |
   callback_wq == xxx |                      |
 ---------------------|----------------------|---------------------------------
                      |                      | nfsd_shutdown_generic
                      |                      |   nfsd_users == 1
                      |                      |   --nfsd_users
                      |                      |   nfsd_users == 0
                      |                      |   ...
                      |                      |   callback_wq == xxx
                      |                      |   destroy_workqueue(callback_wq)
 ---------------------|----------------------|---------------------------------
                      |  nfsd_users == 1     |
                      |  ...                 |
                      |  callback_wq == yyy  |

After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
this issue no longer occurs, but we should still convert the nfsd_users
to atomic_t to prevent other similar issues.

Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
Co-developed-by: huhai <huhai@kylinos.cn>
Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
---
 fs/nfsd/nfssvc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 9b3d6cff0e1e..08b1f9ebdc2a 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred)
 	return 0;
 }
 
-static int nfsd_users = 0;
+static atomic_t nfsd_users = ATOMIC_INIT(0);
 
 static int nfsd_startup_generic(void)
 {
 	int ret;
 
-	if (nfsd_users++)
+	if (atomic_fetch_inc(&nfsd_users))
 		return 0;
 
 	ret = nfsd_file_cache_init();
@@ -291,13 +291,13 @@ static int nfsd_startup_generic(void)
 out_file_cache:
 	nfsd_file_cache_shutdown();
 dec_users:
-	nfsd_users--;
+	atomic_dec(&nfsd_users);
 	return ret;
 }
 
 static void nfsd_shutdown_generic(void)
 {
-	if (--nfsd_users)
+	if (atomic_dec_return(&nfsd_users))
 		return;
 
 	nfs4_state_shutdown();
-- 
2.34.1
Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by NeilBrown 3 months, 2 weeks ago
On Wed, 18 Jun 2025, chenxiaosong@chenxiaosong.com wrote:
> From: ChenXiaoSong <chenxiaosong@kylinos.cn>
> 
> Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]):
> 
>  nfsd: last server has exited, flushing export cache
>  NFSD: starting 90-second grace period (net f0000030)
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000

The only possible cause that I can find for this crash is that the nfsd
thread must have still been running when nfsd_shutdown_net() and then
nfsd_shutdown_generic() were called resulting in the workqueue being
destroyed.

The threads will all have been signalled with SIGKILL, but there was no
mechanism to wait for the threads to complete.

This was changed in

Commit: 3409e4f1e8f2 ("NFSD: Make it possible to use svc_set_num_threads_sync")

Sync then threads were stopped synchronously so they were certainly all
stopped before the workqueue was removed.

NeilBrown


>  ...
>  Call trace:
>   __queue_work+0xb4/0x558
>   queue_work_on+0x88/0x90
>   nfsd4_probe_callback+0x4c/0x58 [nfsd]
>  NFSD: starting 90-second grace period (net f0000030)
>   nfsd4_probe_callback_sync+0x20/0x38 [nfsd]
>   nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd]
>   nfsd4_create_session+0x5b8/0x718 [nfsd]
>   nfsd4_proc_compound+0x4c0/0x710 [nfsd]
>   nfsd_dispatch+0x104/0x248 [nfsd]
>   svc_process_common+0x348/0x808 [sunrpc]
>   svc_process+0xb0/0xc8 [sunrpc]
>   nfsd+0xf0/0x160 [nfsd]
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
>  Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262)
>  SMP: stopping secondary CPUs
>  Starting crashdump kernel...
>  Bye!
> 
> One of the cases is:
> 
>     task A (cpu 1)    |   task B (cpu 2)     |   task C (cpu 3)
>  ---------------------|----------------------|---------------------------------
>  nfsd_startup_generic | nfsd_startup_generic |
>    nfsd_users == 0    |  nfsd_users == 0     |
>    nfsd_users++       |  nfsd_users++        |
>    nfsd_users == 1    |                      |
>    ...                |                      |
>    callback_wq == xxx |                      |
>  ---------------------|----------------------|---------------------------------
>                       |                      | nfsd_shutdown_generic
>                       |                      |   nfsd_users == 1
>                       |                      |   --nfsd_users
>                       |                      |   nfsd_users == 0
>                       |                      |   ...
>                       |                      |   callback_wq == xxx
>                       |                      |   destroy_workqueue(callback_wq)
>  ---------------------|----------------------|---------------------------------
>                       |  nfsd_users == 1     |
>                       |  ...                 |
>                       |  callback_wq == yyy  |
> 
> After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> this issue no longer occurs, but we should still convert the nfsd_users
> to atomic_t to prevent other similar issues.
> 
> Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
> Co-developed-by: huhai <huhai@kylinos.cn>
> Signed-off-by: huhai <huhai@kylinos.cn>
> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
> ---
>  fs/nfsd/nfssvc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 9b3d6cff0e1e..08b1f9ebdc2a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred)
>  	return 0;
>  }
>  
> -static int nfsd_users = 0;
> +static atomic_t nfsd_users = ATOMIC_INIT(0);
>  
>  static int nfsd_startup_generic(void)
>  {
>  	int ret;
>  
> -	if (nfsd_users++)
> +	if (atomic_fetch_inc(&nfsd_users))
>  		return 0;
>  
>  	ret = nfsd_file_cache_init();
> @@ -291,13 +291,13 @@ static int nfsd_startup_generic(void)
>  out_file_cache:
>  	nfsd_file_cache_shutdown();
>  dec_users:
> -	nfsd_users--;
> +	atomic_dec(&nfsd_users);
>  	return ret;
>  }
>  
>  static void nfsd_shutdown_generic(void)
>  {
> -	if (--nfsd_users)
> +	if (atomic_dec_return(&nfsd_users))
>  		return;
>  
>  	nfs4_state_shutdown();
> -- 
> 2.34.1
> 
> 
Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by ChenXiaoSong 3 months, 2 weeks ago
It seems this patch (or patchset) could resolve the issue.

Thanks for your reply.
ChenXiaoSong.

在 2025/6/20 19:55, NeilBrown 写道:
> The only possible cause that I can find for this crash is that the nfsd
> thread must have still been running when nfsd_shutdown_net() and then
> nfsd_shutdown_generic() were called resulting in the workqueue being
> destroyed.
> 
> The threads will all have been signalled with SIGKILL, but there was no
> mechanism to wait for the threads to complete.
> 
> This was changed in
> 
> Commit: 3409e4f1e8f2 ("NFSD: Make it possible to use svc_set_num_threads_sync")
> 
> Sync then threads were stopped synchronously so they were certainly all
> stopped before the workqueue was removed.
> 
> NeilBrown
> 

Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by Jeff Layton 3 months, 3 weeks ago
On Wed, 2025-06-18 at 18:41 +0800, chenxiaosong@chenxiaosong.com wrote:
> From: ChenXiaoSong <chenxiaosong@kylinos.cn>
> 
> Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]):
> 
>  nfsd: last server has exited, flushing export cache
>  NFSD: starting 90-second grace period (net f0000030)
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>  ...
>  Call trace:
>   __queue_work+0xb4/0x558
>   queue_work_on+0x88/0x90
>   nfsd4_probe_callback+0x4c/0x58 [nfsd]
>  NFSD: starting 90-second grace period (net f0000030)
>   nfsd4_probe_callback_sync+0x20/0x38 [nfsd]
>   nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd]
>   nfsd4_create_session+0x5b8/0x718 [nfsd]
>   nfsd4_proc_compound+0x4c0/0x710 [nfsd]
>   nfsd_dispatch+0x104/0x248 [nfsd]
>   svc_process_common+0x348/0x808 [sunrpc]
>   svc_process+0xb0/0xc8 [sunrpc]
>   nfsd+0xf0/0x160 [nfsd]
>   kthread+0x134/0x138
>   ret_from_fork+0x10/0x18
>  Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262)
>  SMP: stopping secondary CPUs
>  Starting crashdump kernel...
>  Bye!
> 
> One of the cases is:
> 
>     task A (cpu 1)    |   task B (cpu 2)     |   task C (cpu 3)
>  ---------------------|----------------------|---------------------------------
>  nfsd_startup_generic | nfsd_startup_generic |
>    nfsd_users == 0    |  nfsd_users == 0     |
>    nfsd_users++       |  nfsd_users++        |
>    nfsd_users == 1    |                      |
>    ...                |                      |
>    callback_wq == xxx |                      |
>  ---------------------|----------------------|---------------------------------
>                       |                      | nfsd_shutdown_generic
>                       |                      |   nfsd_users == 1
>                       |                      |   --nfsd_users
>                       |                      |   nfsd_users == 0
>                       |                      |   ...
>                       |                      |   callback_wq == xxx
>                       |                      |   destroy_workqueue(callback_wq)
>  ---------------------|----------------------|---------------------------------
>                       |  nfsd_users == 1     |
>                       |  ...                 |
>                       |  callback_wq == yyy  |
> 
> After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"),
> this issue no longer occurs, but we should still convert the nfsd_users
> to atomic_t to prevent other similar issues.
> 
> Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
> Co-developed-by: huhai <huhai@kylinos.cn>
> Signed-off-by: huhai <huhai@kylinos.cn>
> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
> ---
>  fs/nfsd/nfssvc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 9b3d6cff0e1e..08b1f9ebdc2a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred)
>  	return 0;
>  }
>  
> -static int nfsd_users = 0;
> +static atomic_t nfsd_users = ATOMIC_INIT(0);
>  
>  static int nfsd_startup_generic(void)
>  {
>  	int ret;
>  
> -	if (nfsd_users++)
> +	if (atomic_fetch_inc(&nfsd_users))
>  		return 0;
>  
>  	ret = nfsd_file_cache_init();
> @@ -291,13 +291,13 @@ static int nfsd_startup_generic(void)
>  out_file_cache:
>  	nfsd_file_cache_shutdown();
>  dec_users:
> -	nfsd_users--;
> +	atomic_dec(&nfsd_users);
>  	return ret;
>  }
>  
>  static void nfsd_shutdown_generic(void)
>  {
> -	if (--nfsd_users)
> +	if (atomic_dec_return(&nfsd_users))
>  		return;
>  
>  	nfs4_state_shutdown();

Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held
in all of the places this counter is accessed.

-- 
Jeff Layton <jlayton@kernel.org>
Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by ChenXiaoSong 3 months, 3 weeks ago
Yes, nfsd_users is protected by the nfsd_mutex. But the following log 
confuse me, why were they printed in a very short period when crash?

     [24225.575708] nfsd: last server has exited, flushing export cache
     [24225.580242] NFSD: starting 90-second grace period (net f0000030)
     ...
     [24225.807458] NFSD: starting 90-second grace period (net f0000030)

Why was callback_wq queued that it had already been freed? And a new 
callback_wq was created. I’ve added some new vmcore analysis to the link:

https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html


在 2025/6/18 19:50, Jeff Layton 写道:
> Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held
> in all of the places this counter is accessed.
> 


Re: [RFC PATCH] nfsd: convert the nfsd_users to atomic_t
Posted by Jeff Layton 3 months, 3 weeks ago
On Thu, 2025-06-19 at 15:10 +0800, ChenXiaoSong wrote:
> Yes, nfsd_users is protected by the nfsd_mutex. But the following log 
> confuse me, why were they printed in a very short period when crash?
> 
>      [24225.575708] nfsd: last server has exited, flushing export cache
>      [24225.580242] NFSD: starting 90-second grace period (net f0000030)
>      ...
>      [24225.807458] NFSD: starting 90-second grace period (net f0000030)
> 
> Why was callback_wq queued that it had already been freed? And a new 
> callback_wq was created. I’ve added some new vmcore analysis to the link:
> 
> https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html
> 
> 
> 在 2025/6/18 19:50, Jeff Layton 写道:
> > Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held
> > in all of the places this counter is accessed.
> > 
> 

I don't know, specifically.

4.19.90 was released more than 5 years ago, and I have no idea what
else you have in that kernel. If this is only reproducible there, then
there's not much we can do to help you. Can you reproduce this on
something more recent?
-- 
Jeff Layton <jlayton@kernel.org>