[PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period

Yohei Kojima posted 4 patches 2 months, 1 week ago
kernel/nstree.c                               |  68 ++++--
.../namespaces/listns_pagination_bug.c        | 200 ++++++++++++++++++
.../selftests/namespaces/ns_active_ref_test.c |   4 +
.../testing/selftests/namespaces/nsid_test.c  |   8 +
4 files changed, 258 insertions(+), 22 deletions(-)
[PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period
Posted by Yohei Kojima 2 months, 1 week ago
This series fixes a spurious ENOENT returned by listns() when (1)
pagination is used and (2) listns() tries to start enumeration from a
destroyed or inactive namespace.

The Cause of the Bug
====================
The bug is caused by lookup_ns_id_at(kls->last_ns_id + 1, ...), which
is called by do_listns(). This function returns NULL if the first
namespace after the given ns id was destroyed or inactivated before the
function is called:

A: active namespace
D: destroyed (or inactive) namespace

         +-----+-----+-----+-----+-----+-----+-----+-----+
state:   |  A  |  A  |  A  |  D  |  D  |  A  |  A  |  A  |
ns_id:   |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |
         +-----+-----+-----+-----+-----+-----+-----+-----+
                        |     |
                        |     +-- (kls->last_ns_id + 1)
                        +-- req.ns_id = 3

listns() cannot distinguish this case from the case where the nstree is
empty, so it returns -ENOENT even though three active namespaces
(6, 7, and 8) remain in the tree.
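
The failure mode can be modeled in userspace. The following is a minimal
sketch, not the kernel code: struct model_ns, model_lookup_strict(), and
model_list_page() are hypothetical names invented for illustration, and
the array mirrors the diagram above. It assumes, as described, that the
lookup gives up as soon as the first candidate at or after the requested
id is not active.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_ns {
	uint64_t ns_id;
	bool active;
};

/* Same layout as the diagram: ids 1-8, with 4 and 5 destroyed/inactive. */
static const struct model_ns model_tree[] = {
	{1, true}, {2, true}, {3, true}, {4, false},
	{5, false}, {6, true}, {7, true}, {8, true},
};
#define MODEL_TREE_LEN (sizeof(model_tree) / sizeof(model_tree[0]))

/*
 * Models the behavior described above: find the first entry whose id is
 * >= start_id and return NULL if that entry is not active.
 */
static const struct model_ns *model_lookup_strict(uint64_t start_id)
{
	assert(start_id > 0); /* ids in this model start at 1 */

	for (size_t i = 0; i < MODEL_TREE_LEN; i++) {
		if (model_tree[i].ns_id < start_id)
			continue;
		return model_tree[i].active ? &model_tree[i] : NULL;
	}
	return NULL; /* nothing at or after start_id */
}

/* Models one paginated call that resumes after last_ns_id. */
static int model_list_page(uint64_t last_ns_id)
{
	const struct model_ns *first = model_lookup_strict(last_ns_id + 1);

	if (!first)
		return -ENOENT; /* indistinguishable from an empty tree */
	return 0;
}
```

In this model, resuming after id 3 fails with -ENOENT because id 4 is
inactive, even though ids 6 through 8 are still active. Resuming after
id 8 fails with the same error code, which is why the caller cannot tell
the two situations apart.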

Solution
========
The bug is fixed by iterating over the nstree's internal list until it
reaches the first active namespace.
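
One way to picture the fix, as a sketch only and not the code in patch 3:
instead of giving up when the first candidate is inactive, the lookup
keeps walking forward until it finds an active namespace or runs out of
entries. struct sketch_ns, sketch_first_active(), sketch_build(), and the
singly linked list are hypothetical stand-ins for the nstree's internal
list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an nstree entry; not a kernel structure. */
struct sketch_ns {
	uint64_t ns_id;
	bool active;
	struct sketch_ns *next; /* next entry in ascending ns_id order */
};

/*
 * Return the first active entry with ns_id >= start_id, or NULL only if
 * no active entry remains at or after start_id.
 */
static struct sketch_ns *sketch_first_active(struct sketch_ns *head,
					     uint64_t start_id)
{
	for (struct sketch_ns *ns = head; ns; ns = ns->next) {
		if (ns->ns_id < start_id)
			continue; /* before the resume point */
		if (ns->active)
			return ns;
		/* destroyed or inactive: keep walking instead of failing */
	}
	return NULL;
}

/* Build the list from the diagram (ids 1-8, with 4 and 5 inactive). */
static struct sketch_ns *sketch_build(struct sketch_ns nodes[8])
{
	for (int i = 0; i < 8; i++) {
		nodes[i].ns_id = (uint64_t)i + 1;
		nodes[i].active = (i != 3 && i != 4);
		nodes[i].next = (i < 7) ? &nodes[i + 1] : NULL;
	}
	assert(nodes[7].next == NULL); /* list must be terminated */
	return &nodes[0];
}
```

With the diagram's layout, resuming at id 4 skips ids 4 and 5 and yields
namespace 6. NULL is returned only when nothing active is left at or
after the resume point, so -ENOENT keeps its intended meaning.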

Patches Sequence
================
Patches 1 and 2 fix existing issues in the namespace selftests.
Patch 3 fixes the spurious ENOENT bug. Patch 4 adds a regression test
for this bug.

Disclaimer on Reproduction
==========================
Unfortunately, I couldn't reproduce this bug in a VM environment,
perhaps because the test I added relies on timing-sensitive RCU
behavior. I did confirm that the bug reproduces on my bare-metal
machine equipped with an i7-14700K. I also confirmed that all namespace
tests pass after applying this series.


Yohei Kojima (4):
  selftests/namespace: fix selftest hang-up caused by zombie processes
  selftests/namespace: fix unintentional skip in ns_active_ref_test.c
  nstree: Fix spurious ENOENT in listns pagination during grace period
  selftests/namespace: test spurious ENOENT bug in listns pagination

 kernel/nstree.c                               |  68 ++++--
 .../namespaces/listns_pagination_bug.c        | 200 ++++++++++++++++++
 .../selftests/namespaces/ns_active_ref_test.c |   4 +
 .../testing/selftests/namespaces/nsid_test.c  |   8 +
 4 files changed, 258 insertions(+), 22 deletions(-)

-- 
2.52.0
Re: [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period
Posted by Yohei Kojima 2 months, 1 week ago
On Mon, Apr 06, 2026 at 01:50:36AM +0900, Yohei Kojima wrote:
> Yohei Kojima (4):
>   selftests/namespace: fix selftest hang-up caused by zombie processes
>   selftests/namespace: fix unintentional skip in ns_active_ref_test.c
>   nstree: Fix spurious ENOENT in listns pagination during grace period

I'm sorry, the subjects of the cover letter and the third patch are
incorrect. This bug is unrelated to the RCU grace period; instead, it
is caused by the handling of inactive and destroyed namespaces. I'll
fix the subject in v2.

Thanks,
Yohei

>   selftests/namespace: test spurious ENOENT bug in listns pagination
> 
>  kernel/nstree.c                               |  68 ++++--
>  .../namespaces/listns_pagination_bug.c        | 200 ++++++++++++++++++
>  .../selftests/namespaces/ns_active_ref_test.c |   4 +
>  .../testing/selftests/namespaces/nsid_test.c  |   8 +
>  4 files changed, 258 insertions(+), 22 deletions(-)
> 
> -- 
> 2.52.0
>
Re: [PATCH 0/4] nstree: Fix spurious ENOENT in listns pagination during grace period
Posted by Christian Brauner 2 months ago
On Tue, Apr 07, 2026 at 09:57:38PM +0900, Yohei Kojima wrote:
> On Mon, Apr 06, 2026 at 01:50:36AM +0900, Yohei Kojima wrote:
> > Yohei Kojima (4):
> >   selftests/namespace: fix selftest hang-up caused by zombie processes
> >   selftests/namespace: fix unintentional skip in ns_active_ref_test.c
> >   nstree: Fix spurious ENOENT in listns pagination during grace period
> 
> I'm sorry, the subjects of the cover letter and the third patch are
> incorrect. This bug is unrelated to the RCU grace period; instead, it
> is caused by the handling of inactive and destroyed namespaces. I'll
> fix the subject in v2.

Ok, sounds good. We can wait.