auth_domain_put() uses kref_put_lock(), which atomically decrements the
refcount before acquiring auth_domain_lock. This creates a window where
an auth_domain entry is still linked on the hash list with refcount == 0.

auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
kref_get() to acquire a reference. If it finds an entry in this transient
zero-refcount state, refcount_inc() triggers a WARN and refuses to
increment (saturating refcount_t semantics), but the function returns the
pointer anyway. The caller then holds a dangling reference: when the
concurrent auth_domain_put() finally acquires the lock and runs
auth_domain_release(), the object is freed while the lookup caller still
has a pointer to it.

The sibling function auth_domain_find() already handles this correctly
using kref_get_unless_zero(). Apply the same pattern in
auth_domain_lookup(): treat a zero-refcount entry as absent and continue
searching. The loop then either finds another live entry or falls through
to insert the new domain, preserving existing semantics.

Reported-by: Chris Mason <clm@meta.com>
Assisted-by: kres:claude-opus-4-6
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
net/sunrpc/svcauth.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
index 55b4d2874188..8e01f0626759 100644
--- a/net/sunrpc/svcauth.c
+++ b/net/sunrpc/svcauth.c
@@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
 	spin_lock(&auth_domain_lock);
 
 	hlist_for_each_entry(hp, head, hash) {
-		if (strcmp(hp->name, name)==0) {
-			kref_get(&hp->ref);
+		if (strcmp(hp->name, name) == 0) {
+			if (!kref_get_unless_zero(&hp->ref))
+				continue;
+
 			spin_unlock(&auth_domain_lock);
 			return hp;
 		}
---
base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
change-id: 20260520-nfsd-fixes-f137572d0480
Best regards,
--
Jeff Layton <jlayton@kernel.org>
On Wed, May 20, 2026, at 2:10 PM, Jeff Layton wrote:
> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.
>
> auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
> kref_get() to acquire a reference. If it finds an entry in this transient
> zero-refcount state, refcount_inc() triggers a WARN and refuses to
> increment (saturating refcount_t semantics), but the function returns the
> pointer anyway. The caller then holds a dangling reference: when the
> concurrent auth_domain_put() finally acquires the lock and runs
> auth_domain_release(), the object is freed while the lookup caller still
> has a pointer to it.
>
> The sibling function auth_domain_find() already handles this correctly
> using kref_get_unless_zero(). Apply the same pattern in
> auth_domain_lookup(): treat a zero-refcount entry as absent and continue
> searching. The loop then either finds another live entry or falls through
> to insert the new domain, preserving existing semantics.
>
> Reported-by: Chris Mason <clm@meta.com>
> Assisted-by: kres:claude-opus-4-6
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> net/sunrpc/svcauth.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
> spin_lock(&auth_domain_lock);
>
> hlist_for_each_entry(hp, head, hash) {
> - if (strcmp(hp->name, name)==0) {
> - kref_get(&hp->ref);
> + if (strcmp(hp->name, name) == 0) {
> + if (!kref_get_unless_zero(&hp->ref))
> + continue;
> +
> spin_unlock(&auth_domain_lock);
> return hp;
> }
>
> ---
> base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
> change-id: 20260520-nfsd-fixes-f137572d0480
>
> Best regards,
> --
> Jeff Layton <jlayton@kernel.org>
Forwarding sashiko review results:
> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.
[Severity: High]
Is this description of kref_put_lock() accurate?
Looking at kref_put_lock(), it uses refcount_dec_and_lock() which acquires
the spinlock before the 1->0 transition is completed. Since
auth_domain_lookup() holds the same auth_domain_lock, is it actually possible
for it to encounter a 0-refcount entry under normal conditions?
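
For illustration, the ordering in question can be modelled in userspace.
This is only a sketch, assuming C11 atomics and a toy spinlock; the
model_* names are hypothetical stand-ins and not the kernel's
refcount_t or spinlock_t, and saturation handling is not modelled. It
shows the shape of a dec-and-lock helper: any decrement that cannot
reach zero happens without the lock, while the final 1->0 step is only
performed after the lock has been taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel types. */
struct model_lock { atomic_flag held; };
struct model_ref { atomic_int count; };

static void model_lock_acquire(struct model_lock *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static void model_lock_release(struct model_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Decrement unless the count is exactly 1; true if it decremented. */
static bool model_dec_not_one(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 1) {
		assert(old > 0);	/* underflow would be a caller bug */
		if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
			return true;
		/* CAS failed: old now holds the current value, retry. */
	}
	return false;
}

/*
 * Returns true, with the lock held, only when this call dropped the
 * count to zero. The 1->0 step happens after the lock is acquired, so
 * a reader holding the same lock does not observe a zero count on an
 * entry that is still linked.
 */
static bool model_dec_and_lock(struct model_ref *r, struct model_lock *l)
{
	if (model_dec_not_one(r))
		return false;

	model_lock_acquire(l);
	if (atomic_fetch_sub(&r->count, 1) != 1) {
		/* Someone else took a reference in the meantime. */
		model_lock_release(l);
		return false;
	}
	return true;
}
```

In this model a caller that gets true back is expected to unlink and
free the object and then release the lock itself.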
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
> spin_lock(&auth_domain_lock);
>
> hlist_for_each_entry(hp, head, hash) {
> - if (strcmp(hp->name, name)==0) {
> - kref_get(&hp->ref);
> + if (strcmp(hp->name, name) == 0) {
> + if (!kref_get_unless_zero(&hp->ref))
> + continue;
> +
> spin_unlock(&auth_domain_lock);
> return hp;
> }
[Severity: High]
If a 0-refcount entry is encountered here, wouldn't it indicate a severe
underlying refcount imbalance or use-after-free rather than a normal race?
By using kref_get_unless_zero() and continuing the search, might this silently
mask the root cause of the WARN instead of resolving it?
Furthermore, if the loop skips the 0-refcount entry and falls through to the
end of the function, won't it insert a duplicate auth_domain with the same
name into the hash list?
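
The duplicate-insert concern can be shown with a small single-threaded
model. This is a hedged sketch only: struct model_domain and the
model_* helpers are hypothetical, locking and RCU are left out, and the
zero-count entry is set up by hand purely to exercise the path (whether
that state is reachable in the real code is the question above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an auth_domain on a singly linked bucket. */
struct model_domain {
	const char *name;
	int refs;
	struct model_domain *next;
};

/* Take a reference only if the count is non-zero. */
static bool model_get_unless_zero(struct model_domain *d)
{
	if (d->refs == 0)
		return false;
	d->refs++;
	return true;
}

/*
 * Same loop shape as the patched lookup: a matching entry with a zero
 * count is skipped, and if nothing live matches, the candidate is
 * linked at the head of the bucket and returned.
 */
static struct model_domain *model_lookup(struct model_domain **head,
					 const char *name,
					 struct model_domain *candidate)
{
	struct model_domain *hp;

	assert(head && name);
	for (hp = *head; hp; hp = hp->next) {
		if (strcmp(hp->name, name) == 0) {
			if (!model_get_unless_zero(hp))
				continue;
			return hp;
		}
	}
	if (candidate) {
		candidate->next = *head;
		*head = candidate;
	}
	return candidate;
}

/* Count how many entries in the bucket carry the given name. */
static int model_count_name(const struct model_domain *head, const char *name)
{
	int n = 0;

	for (; head; head = head->next)
		if (strcmp(head->name, name) == 0)
			n++;
	return n;
}
```

With a zero-count "nfsd" entry already in the bucket, a lookup for
"nfsd" with a candidate leaves two entries of that name linked until
the stale one is removed.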
--
Chuck Lever
On Wed, 2026-05-20 at 15:47 -0400, Chuck Lever wrote:
>
> On Wed, May 20, 2026, at 2:10 PM, Jeff Layton wrote:
> > auth_domain_put() uses kref_put_lock(), which atomically decrements the
> > refcount before acquiring auth_domain_lock. This creates a window where
> > an auth_domain entry is still linked on the hash list with refcount == 0.
> >
> > auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
> > kref_get() to acquire a reference. If it finds an entry in this transient
> > zero-refcount state, refcount_inc() triggers a WARN and refuses to
> > increment (saturating refcount_t semantics), but the function returns the
> > pointer anyway. The caller then holds a dangling reference: when the
> > concurrent auth_domain_put() finally acquires the lock and runs
> > auth_domain_release(), the object is freed while the lookup caller still
> > has a pointer to it.
> >
> > The sibling function auth_domain_find() already handles this correctly
> > using kref_get_unless_zero(). Apply the same pattern in
> > auth_domain_lookup(): treat a zero-refcount entry as absent and continue
> > searching. The loop then either finds another live entry or falls through
> > to insert the new domain, preserving existing semantics.
> >
> > Reported-by: Chris Mason <clm@meta.com>
> > Assisted-by: kres:claude-opus-4-6
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > net/sunrpc/svcauth.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> > index 55b4d2874188..8e01f0626759 100644
> > --- a/net/sunrpc/svcauth.c
> > +++ b/net/sunrpc/svcauth.c
> > @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
> > spin_lock(&auth_domain_lock);
> >
> > hlist_for_each_entry(hp, head, hash) {
> > - if (strcmp(hp->name, name)==0) {
> > - kref_get(&hp->ref);
> > + if (strcmp(hp->name, name) == 0) {
> > + if (!kref_get_unless_zero(&hp->ref))
> > + continue;
> > +
> > spin_unlock(&auth_domain_lock);
> > return hp;
> > }
> >
> > ---
> > base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
> > change-id: 20260520-nfsd-fixes-f137572d0480
> >
> > Best regards,
> > --
> > Jeff Layton <jlayton@kernel.org>
>
> Forwarding sashiko review results:
>
> > auth_domain_put() uses kref_put_lock(), which atomically decrements the
> > refcount before acquiring auth_domain_lock. This creates a window where
> > an auth_domain entry is still linked on the hash list with refcount == 0.
>
> [Severity: High]
> Is this description of kref_put_lock() accurate?
>
> Looking at kref_put_lock(), it uses refcount_dec_and_lock() which acquires
> the spinlock before the 1->0 transition is completed. Since
> auth_domain_lookup() holds the same auth_domain_lock, is it actually possible
> for it to encounter a 0-refcount entry under normal conditions?
>
> > diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> > index 55b4d2874188..8e01f0626759 100644
> > --- a/net/sunrpc/svcauth.c
> > +++ b/net/sunrpc/svcauth.c
> > @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
> > spin_lock(&auth_domain_lock);
> >
> > hlist_for_each_entry(hp, head, hash) {
> > - if (strcmp(hp->name, name)==0) {
> > - kref_get(&hp->ref);
> > + if (strcmp(hp->name, name) == 0) {
> > + if (!kref_get_unless_zero(&hp->ref))
> > + continue;
> > +
> > spin_unlock(&auth_domain_lock);
> > return hp;
> > }
>
> [Severity: High]
> If a 0-refcount entry is encountered here, wouldn't it indicate a severe
> underlying refcount imbalance or use-after-free rather than a normal race?
>
> By using kref_get_unless_zero() and continuing the search, might this silently
> mask the root cause of the WARN instead of resolving it?
>
> Furthermore, if the loop skips the 0-refcount entry and falls through to the
> end of the function, won't it insert a duplicate auth_domain with the same
> name into the hash list?
>
Sashiko is correct. This codepath can't see a 0 refcount here. The
patch won't break anything, but it's not fixing anything either. Let's
just drop this one.
--
Jeff Layton <jlayton@kernel.org>
On Wed, May 20, 2026, at 2:10 PM, Jeff Layton wrote:
> auth_domain_put() uses kref_put_lock(), which atomically decrements the
> refcount before acquiring auth_domain_lock. This creates a window where
> an auth_domain entry is still linked on the hash list with refcount == 0.
>
> auth_domain_lookup() walks the hash under auth_domain_lock but uses plain
> kref_get() to acquire a reference. If it finds an entry in this transient
> zero-refcount state, refcount_inc() triggers a WARN and refuses to
> increment (saturating refcount_t semantics), but the function returns the
> pointer anyway. The caller then holds a dangling reference: when the
> concurrent auth_domain_put() finally acquires the lock and runs
> auth_domain_release(), the object is freed while the lookup caller still
> has a pointer to it.
>
> The sibling function auth_domain_find() already handles this correctly
> using kref_get_unless_zero(). Apply the same pattern in
> auth_domain_lookup(): treat a zero-refcount entry as absent and continue
> searching. The loop then either finds another live entry or falls through
> to insert the new domain, preserving existing semantics.
>
> Reported-by: Chris Mason <clm@meta.com>
> Assisted-by: kres:claude-opus-4-6
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
Fixes: 608a0ab2f54a ("SUNRPC: Add lockless lookup of the server's auth domain")
> ---
> net/sunrpc/svcauth.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
> index 55b4d2874188..8e01f0626759 100644
> --- a/net/sunrpc/svcauth.c
> +++ b/net/sunrpc/svcauth.c
> @@ -245,8 +245,10 @@ auth_domain_lookup(char *name, struct auth_domain *new)
> spin_lock(&auth_domain_lock);
>
> hlist_for_each_entry(hp, head, hash) {
> - if (strcmp(hp->name, name)==0) {
> - kref_get(&hp->ref);
> + if (strcmp(hp->name, name) == 0) {
> + if (!kref_get_unless_zero(&hp->ref))
> + continue;
> +
> spin_unlock(&auth_domain_lock);
> return hp;
> }
>
> ---
> base-commit: 508c9eaa7e0b952c4fe019880796e6207e3cd201
> change-id: 20260520-nfsd-fixes-f137572d0480
>
> Best regards,
> --
> Jeff Layton <jlayton@kernel.org>
--
Chuck Lever