[PATCH v4] x86/fpu: Fix NULL dereference in avx512_status()

Sohil Mehta posted 1 patch 1 month, 3 weeks ago
arch/x86/kernel/fpu/xstate.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
Posted by Sohil Mehta 1 month, 3 weeks ago
From: Fushuai Wang <wangfushuai@baidu.com>

Problem
-------
With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status
causes a kernel NULL pointer dereference.

Kernel threads aren't expected to access the FPU state directly. Kernel
usage of FPU registers is contained within kernel_fpu_begin()/_end()
sections.

However, to report AVX-512 usage, the avx512_timestamp variable within
struct fpu needs to be accessed, which triggers a warning in
x86_task_fpu().

For Kthreads:
  proc_pid_arch_status()
    avx512_status()
      x86_task_fpu() => Warning and returns NULL
      x86_task_fpu()->avx512_timestamp => NULL dereference

The warning is a false alarm since the access isn't intended for
modifying the FPU state. All kernel threads (except the init_task) have
a "struct fpu" with an avx512_timestamp variable that is valid to
access. Also, the init_task (PID 0) never follows this path since it is
not exposed in /proc.

Solution
--------
One option is to get rid of the warning in x86_task_fpu() for kernel
threads. However, that warning was recently added and might be useful to
catch any potential misuse of the FPU state in kernel threads.

A better option is to avoid the access altogether. The kernel does not
track AVX-512 usage for kernel threads.
save_fpregs_to_fpstate()->update_avx_timestamp() is never invoked for
kernel threads, so avx512_timestamp is always guaranteed to be 0.

Also, the legacy behavior of reporting "AVX512_elapsed_ms: -1", which
signifies "no AVX-512 usage", is misleading. The kernel usage just isn't
tracked.

For now, update the ABI for kernel threads and do not report AVX-512
usage for them. Reading /proc/[kthread]/arch_status would display no
AVX-512 information. This avoids the NULL dereference as well as the
misleading report.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Fixes: 22aafe3bcb67 ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks")
Cc: <stable@vger.kernel.org> # v6.15+
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Co-developed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v4:
 - No significant change, minor wording improvements.

v3: https://lore.kernel.org/lkml/20250724013422.307954-1-sohil.mehta@intel.com/
 - Do not report anything for kernel threads. (DaveH)
 - Make the commit message more precise.

v2:
 - Avoid making the fix dependent on CONFIG_X86_DEBUG_FPU.
 - Include PF_USER_WORKER in the kernel thread check.
 - Update commit message for clarity.
---
 arch/x86/kernel/fpu/xstate.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 12ed75c1b567..28e4fd65c9da 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1881,19 +1881,20 @@ long fpu_xstate_prctl(int option, unsigned long arg2)
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * use in the task. Report -1 if no AVX-512 usage.
  */
 static void avx512_status(struct seq_file *m, struct task_struct *task)
 {
-	unsigned long timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
-	long delta;
+	unsigned long timestamp;
+	long delta = -1;
 
-	if (!timestamp) {
-		/*
-		 * Report -1 if no AVX512 usage
-		 */
-		delta = -1;
-	} else {
+	/* AVX-512 usage is not tracked for kernel threads. Don't report anything. */
+	if (task->flags & (PF_KTHREAD | PF_USER_WORKER))
+		return;
+
+	timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
+
+	if (timestamp) {
 		delta = (long)(jiffies - timestamp);
 		/*
 		 * Cap to LONG_MAX if time difference > LONG_MAX
-- 
2.43.0
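
As a rough userspace model of the change above (not kernel code), the control flow can be sketched as follows. Everything with a _model suffix is hypothetical, the flag values are only illustrative, and ticks are treated as milliseconds instead of converting from jiffies. The point it tries to show is that returning early for PF_KTHREAD | PF_USER_WORKER tasks means the NULL handed back by x86_task_fpu() under CONFIG_X86_DEBUG_FPU is never reached.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative flag values; only their distinctness matters in this model. */
#define PF_USER_WORKER 0x00004000u
#define PF_KTHREAD     0x00200000u

/* Hypothetical stand-ins for struct fpu and struct task_struct. */
struct fpu_model {
	unsigned long avx512_timestamp;
};

struct task_model {
	unsigned int flags;
	struct fpu_model fpu;
};

/* Models the debug variant of x86_task_fpu(): warn and return NULL for kthreads. */
struct fpu_model *task_fpu_model(struct task_model *task)
{
	if (task->flags & PF_KTHREAD) {
		fprintf(stderr, "warning: FPU state requested for a kernel thread\n");
		return NULL;
	}
	return &task->fpu;
}

/*
 * Models the patched avx512_status(). Returns false (and leaves *elapsed
 * untouched) when nothing should be reported, true otherwise.
 */
bool avx512_elapsed_model(struct task_model *task, unsigned long now, long *elapsed)
{
	struct fpu_model *fpu;
	long delta = -1;	/* -1 means "no AVX-512 usage recorded" */

	/* Usage is not tracked for kernel threads, so report nothing. */
	if (task->flags & (PF_KTHREAD | PF_USER_WORKER))
		return false;

	fpu = task_fpu_model(task);
	assert(fpu != NULL);	/* the unpatched code dereferenced without this guarantee */

	if (fpu->avx512_timestamp) {
		delta = (long)(now - fpu->avx512_timestamp);
		if (delta < 0)	/* cap to LONG_MAX if the difference overflows */
			delta = LONG_MAX;
	}

	*elapsed = delta;
	return true;
}

/* Prints the arch_status line only when there is something to report. */
void avx512_status_model(FILE *out, struct task_model *task, unsigned long now)
{
	long elapsed;

	if (avx512_elapsed_model(task, now, &elapsed))
		fprintf(out, "AVX512_elapsed_ms:\t%ld\n", elapsed);
}
```

In this model, calling avx512_status_model(stdout, &task, now) for a task with PF_KTHREAD set prints nothing, while a user task with a zero timestamp prints an elapsed value of -1.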
Re: [PATCH v4] x86/fpu: Fix NULL dereference in avx512_status()
Posted by Dave Hansen 1 month, 3 weeks ago
On 8/11/25 11:50, Sohil Mehta wrote:
> From: Fushuai Wang <wangfushuai@baidu.com>
> 
> Problem
> -------
> With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status
> causes a kernel NULL pointer dereference.
<snip>

That changelog is getting a bit long-winded and has a lot of extra
information.

The changelog also isn't really converging, so I gave it a go to
rewrite it. Is this missing anything?

https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/commit/?h=testme&id=d61828dcbcff4ac80b91f5071ba6d21ef6c97347
Re: [PATCH v4] x86/fpu: Fix NULL dereference in avx512_status()
Posted by Sohil Mehta 1 month, 3 weeks ago
On 8/11/2025 12:22 PM, Dave Hansen wrote:
> 
> The changelog also isn't really converging, so I gave it a go to
> rewrite it. Is this missing anything?
> 

Thank you! Your changelog covers the essentials and makes it concise.
The dual nature of x86_task_fpu() was making it hard to write for me.

A couple of typos:

> This is because the AVX-512 timestamp code uses x86_task_fpu() doesn't
								^^^but
> check it for NULL.

Missing "but"

> If anyone ever wants to track kernel thread AVX-512 use, the can come
> back later and do it properly, separate from this bug fix.

s/the/they

> https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/commit/?h=testme&id=d61828dcbcff4ac80b91f5071ba6d21ef6c97347

It probably doesn't matter, but the documentation suggests ordering
Co-developed-by and Signed-off-by tags a bit differently.

"...every Co-developed-by: must be immediately followed by a
Signed-off-by: of the associated co-author."

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by

Example of a patch submitted by a Co-developed-by: author:

From: From Author <from@author.example.org>

<changelog>

Co-developed-by: Random Co-Author <random@coauthor.example.org>
Signed-off-by: Random Co-Author <random@coauthor.example.org>
Signed-off-by: From Author <from@author.example.org>
Co-developed-by: Submitting Co-Author <sub@coauthor.example.org>
Signed-off-by: Submitting Co-Author <sub@coauthor.example.org>
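
The ordering rule quoted above can be checked mechanically. Below is a minimal sketch in plain C, assuming the trailers are already split into one string per tag; the function name co_developed_tags_ok() is hypothetical and this is not how checkpatch.pl implements its check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CO_DEV "Co-developed-by: "
#define SOB    "Signed-off-by: "

/*
 * Returns true if every Co-developed-by: trailer is immediately followed
 * by a Signed-off-by: trailer naming the same person.
 */
bool co_developed_tags_ok(const char *const *trailers, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const char *who;

		if (strncmp(trailers[i], CO_DEV, strlen(CO_DEV)) != 0)
			continue;	/* not a Co-developed-by: line */

		who = trailers[i] + strlen(CO_DEV);

		/* The very next trailer must exist and be a Signed-off-by: */
		if (i + 1 >= n)
			return false;
		if (strncmp(trailers[i + 1], SOB, strlen(SOB)) != 0)
			return false;

		/* ...and it must belong to the same co-author. */
		if (strcmp(trailers[i + 1] + strlen(SOB), who) != 0)
			return false;
	}
	return true;
}
```

Fed the three tags from the v4 posting (Signed-off-by: Fushuai Wang, then Co-developed-by: and Signed-off-by: Sohil Mehta), the check passes; swapping the last two Signed-off-by: lines makes it fail.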
Re: [PATCH v4] x86/fpu: Fix NULL dereference in avx512_status()
Posted by Dave Hansen 1 month, 3 weeks ago
On 8/11/25 13:16, Sohil Mehta wrote:
> On 8/11/2025 12:22 PM, Dave Hansen wrote:
>>
>> The changelog also isn't really converging, so I gave it a go to
>> rewrite it. Is this missing anything?
>>
> 
> Thank you! Your changelog covers the essentials and makes it concise.
> The dual nature of x86_task_fpu() was making it hard to write for me.
> 
> A couple of typos:

Thanks for the second set of eyes!

> Co-developed-by: Random Co-Author <random@coauthor.example.org>
> Signed-off-by: Random Co-Author <random@coauthor.example.org>
> Signed-off-by: From Author <from@author.example.org>
> Co-developed-by: Submitting Co-Author <sub@coauthor.example.org>
> Signed-off-by: Submitting Co-Author <sub@coauthor.example.org>

Gah, I hate Co-developed-by.

With those fixes, it's in x86/urgent now.