[PATCH v3] tools: jobserver: Prevent deadlock caused by incorrect jobserver configuration and enhance error reporting

Changbin Du posted 1 patch 1 month ago
There is a newer version of this series
tools/lib/python/jobserver.py | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Posted by Changbin Du 1 month ago
When using GNU Make's jobserver feature in kernel builds, a bug in MAKEFLAGS
propagation caused "--jobserver-auth=r,w" to reference an unintended file
descriptor. This led to an infinite loop in jobserver-exec, because its
os.read() calls kept returning empty tokens.

My shell had opened /etc/passwd for some reason without closing it. As a
result, all child processes inherited it as fd 3.

$ ls -l /proc/self/fd
total 0
lrwx------ 1 changbin changbin 64 Dec 25 13:03 0 -> /dev/pts/1
lrwx------ 1 changbin changbin 64 Dec 25 13:03 1 -> /dev/pts/1
lrwx------ 1 changbin changbin 64 Dec 25 13:03 2 -> /dev/pts/1
lr-x------ 1 changbin changbin 64 Dec 25 13:03 3 -> /etc/passwd
lr-x------ 1 changbin changbin 64 Dec 25 13:03 4 -> /proc/1421383/fd

In this case, `make` should have opened a new file descriptor for jobserver
control, but it clearly did not and instead passed fd 3 as
"--jobserver-auth=3,4" in MAKEFLAGS. (My GNU Make version is 4.3.)

This update ensures robustness against invalid jobserver configurations,
even when `make` incorrectly passes non-pipe file descriptors:
 * Reject empty reads to prevent infinite loops on EOF.
 * Clear `self.jobs` to avoid writing to the wrong file if invalid tokens
   are detected.
 * Print a detailed error message to stderr to inform the user.

Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Changbin Du <changbin.du@huawei.com>

---
  v3: format exception with repr(e).
  v2: remove the check that all bytes are '+' characters. (Mauro)
---
 tools/lib/python/jobserver.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/lib/python/jobserver.py b/tools/lib/python/jobserver.py
index a24f30ef4fa8..bd231f847032 100755
--- a/tools/lib/python/jobserver.py
+++ b/tools/lib/python/jobserver.py
@@ -91,6 +91,10 @@ class JobserverExec:
             while True:
                 try:
                     slot = os.read(self.reader, 8)
+                    if not slot:
+                        # Clear self.jobs to prevent us from probably writing incorrect file.
+                        self.jobs = []
+                        raise ValueError("unexpected empty token from jobserver fd, invalid '--jobserver-auth=' setting?")
                     self.jobs += slot
                 except (OSError, IOError) as e:
                     if e.errno == errno.EWOULDBLOCK:
@@ -105,7 +109,8 @@ class JobserverExec:
             # to sit here blocked on our child.
             self.claim = len(self.jobs) + 1
 
-        except (KeyError, IndexError, ValueError, OSError, IOError):
+        except (KeyError, IndexError, ValueError, OSError, IOError) as e:
+            print(f"Warning: {repr(e)}", file=sys.stderr)
             # Any missing environment strings or bad fds should result in just
             # not being parallel.
             self.claim = None
-- 
2.43.0
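
The empty-read check in the patch can be tried in isolation. The following is a minimal sketch, not the code from jobserver.py: `claim_tokens` is a hypothetical helper, a temporary file stands in for the inherited /etc/passwd descriptor, and re-raising errors other than EWOULDBLOCK is an assumption. It shows that a non-blocking pipe ends the loop with EWOULDBLOCK, while a regular file reaches EOF and returns b"" on every read, which would loop forever without the check.

```python
import errno
import os
import tempfile


def claim_tokens(reader):
    """Drain jobserver tokens from fd `reader` (hypothetical helper)."""
    jobs = b""
    while True:
        try:
            slot = os.read(reader, 8)
        except OSError as e:
            if e.errno == errno.EWOULDBLOCK:
                break  # pipe drained: no more tokens available right now
            raise  # assumption: other errors are passed to the caller
        if not slot:
            # os.read() returned b"", i.e. EOF. Without this check the loop
            # would keep reading b"" and never terminate.
            raise ValueError("unexpected empty token from jobserver fd")
        jobs += slot
    return jobs


# Case 1: a non-blocking pipe holding three tokens, like a working jobserver.
r, w = os.pipe()
os.set_blocking(r, False)
os.write(w, b"+++")
pipe_tokens = claim_tokens(r)
os.close(r)
os.close(w)

# Case 2: a regular file standing in for the inherited /etc/passwd fd.
with tempfile.TemporaryFile() as f:
    f.write(b"root:x:0:0\n")
    f.flush()
    f.seek(0)
    try:
        claim_tokens(f.fileno())
        file_result = "no error"
    except ValueError as e:
        file_result = repr(e)

print(pipe_tokens)
print(file_result)
```

In the second case the file's contents are read as if they were tokens before EOF is hit, which is why the patch also clears `self.jobs` before raising.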
Re: [PATCH v3] tools: jobserver: Prevent deadlock caused by incorrect jobserver configuration and enhance error reporting
Posted by duchangbin 1 month ago
On Thu, Jan 08, 2026 at 07:15:34PM +0800, Changbin Du wrote:
> When using GNU Make's jobserver feature in kernel builds, a bug in MAKEFLAGS
> propagation caused "--jobserver-auth=r,w" to reference an unintended file
> descriptor. This led to an infinite loop in jobserver-exec, because its
> os.read() calls kept returning empty tokens.
> 
> My shell had opened /etc/passwd for some reason without closing it. As a
> result, all child processes inherited it as fd 3.
> 
> $ ls -l /proc/self/fd
> total 0
> lrwx------ 1 changbin changbin 64 Dec 25 13:03 0 -> /dev/pts/1
> lrwx------ 1 changbin changbin 64 Dec 25 13:03 1 -> /dev/pts/1
> lrwx------ 1 changbin changbin 64 Dec 25 13:03 2 -> /dev/pts/1
> lr-x------ 1 changbin changbin 64 Dec 25 13:03 3 -> /etc/passwd
> lr-x------ 1 changbin changbin 64 Dec 25 13:03 4 -> /proc/1421383/fd
> 
> In this case, `make` should have opened a new file descriptor for jobserver
> control, but it clearly did not and instead passed fd 3 as
> "--jobserver-auth=3,4" in MAKEFLAGS. (My GNU Make version is 4.3.)
> 
> This update ensures robustness against invalid jobserver configurations,
> even when `make` incorrectly passes non-pipe file descriptors:
>  * Reject empty reads to prevent infinite loops on EOF.
>  * Clear `self.jobs` to avoid writing to the wrong file if invalid tokens
>    are detected.
>  * Print a detailed error message to stderr to inform the user.
> 
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Signed-off-by: Changbin Du <changbin.du@huawei.com>
> 
> ---
>   v3: format exception with repr(e).
>   v2: remove the check that all bytes are '+' characters. (Mauro)
> ---
>  tools/lib/python/jobserver.py | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/python/jobserver.py b/tools/lib/python/jobserver.py
> index a24f30ef4fa8..bd231f847032 100755
> --- a/tools/lib/python/jobserver.py
> +++ b/tools/lib/python/jobserver.py
> @@ -91,6 +91,10 @@ class JobserverExec:
>              while True:
>                  try:
>                      slot = os.read(self.reader, 8)
> +                    if not slot:
> +                        # Clear self.jobs to prevent us from probably writing incorrect file.
> +                        self.jobs = []
Oops, I should not change the type of self.jobs here. It would be better to use:
			   self.jobs = b""

-- 
Cheers,
Changbin Du
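
The type concern raised in the reply can be illustrated with a small sketch. It assumes that claimed tokens are accumulated with `+=` and later written back with os.write(), and the variable names are hypothetical. In the patched path the list would stay empty because the exception is raised right away, so this mainly shows why keeping the accumulator as bytes is the safer choice.

```python
import os

slot = b"++"  # two tokens as returned by os.read()

# Accumulating into a list: += extends the list with the integer value
# of each byte instead of concatenating bytes.
as_list = []
as_list += slot

# Accumulating into bytes keeps the tokens in a form os.write() accepts.
as_bytes = b""
as_bytes += slot

r, w = os.pipe()
os.write(w, as_bytes)  # works: bytes-like object
try:
    os.write(w, as_list)  # a list is not a bytes-like object
    list_write = "ok"
except TypeError:
    list_write = "TypeError"
returned = os.read(r, 8)
os.close(r)
os.close(w)

print(as_list, as_bytes, list_write, returned)
```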