[PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled

Mashiro Chen posted 1 patch 2 months, 2 weeks ago
kernel/sched/psi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
[PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
Posted by Mashiro Chen 2 months, 2 weeks ago
When PSI is disabled, psi_memstall_enter() returns early without
writing to *flags, leaving the caller's local variable uninitialized.
psi_memstall_leave() also returns early when PSI is disabled and does
not read *flags, so the uninitialized value is never used functionally.

However, KMSAN tracks the shadow and origin metadata per physical
address. When a kernel stack page is subsequently reused, a new object
at the same address inherits the stale KMSAN shadow from the old
uninitialized pflags, causing spurious uninit-value reports in
unrelated code paths such as __flush_smp_call_function_queue().

Initialize *flags to 0 in the psi_disabled early-return path to
prevent the stale shadow from escaping the callers' stack frames.

Fixes: eb414681d5a0 ("psi: pressure stall information for CPU, memory, and IO")
Reported-by: syzbot+4b1bd55fba6260160779@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4b1bd55fba6260160779
Link: https://lore.kernel.org/all/6991885b.050a0220.340abe.02ef.GAE@google.com/
Signed-off-by: Mashiro Chen <mashiro.chen@mailbox.org>
---
 kernel/sched/psi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45b..32d3a180fc03b 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1058,8 +1058,10 @@ void psi_memstall_enter(unsigned long *flags)
 	struct rq_flags rf;
 	struct rq *rq;
 
-	if (static_branch_likely(&psi_disabled))
+	if (static_branch_likely(&psi_disabled)) {
+		*flags = 0;
 		return;
+	}
 
 	*flags = current->in_memstall;
 	if (*flags)
-- 
2.53.0
Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
Posted by Johannes Weiner 2 months, 1 week ago
On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote:
> When PSI is disabled, psi_memstall_enter() returns early without
> writing to *flags, leaving the caller's local variable uninitialized.
> psi_memstall_leave() also returns early when PSI is disabled and does
> not read *flags, so the uninitialized value is never used functionally.
>
> However, KMSAN tracks the shadow and origin metadata per physical
> address. When a kernel stack page is subsequently reused, a new object
> at the same address inherits the stale KMSAN shadow from the old
> uninitialized pflags, causing spurious uninit-value reports in
> unrelated code paths such as __flush_smp_call_function_queue().
>
> Initialize *flags to 0 in the psi_disabled early-return path to
> prevent the stale shadow from escaping the callers' stack frames.

How is this not a kmsan bug?

I don't think putting an unexpected init into unrelated code with no
comment is an appropriate fix for what seems like a false positive in
this tool.
Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
Posted by Mashiro Chen 2 months, 1 week ago
Hi Johannes,

Good question. You're right that KMSAN's stack tracking persisting
across page reuse boundaries is arguably a tool limitation. That said,
I think fixing it on the PSI side is still reasonable:

psi_memstall_enter() takes a pointer parameter with an implicit contract:
if the caller passes &flags, they expect *flags to be initialized upon
return. The current early-return silently violates that contract by
leaving *flags uninitialized, even though the value is never actually used
functionally.

The fix is essentially free (we're already in the early-return path) and
makes the contract explicit. You're right that the original patch lacked
a comment explaining this, I should have added:

     /* Initialize to 0 even in psi_disabled case to honor the
      * implicit API contract that *flags is initialized on return.
      * psi_memstall_leave() also returns early when psi_disabled
      * and does not read *flags, so this is zero-cost. */
     *flags = 0;
     return;

That said, if you prefer this stays in KMSAN (e.g., treating stack
variables as out-of-scope once their frame returns), I'm happy to drop
the patch and redirect the effort there instead.

Thanks for the thoughtful review feedback.

Best,
Mashiro Chen

On 4/8/26 23:10, Johannes Weiner wrote:
> On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote:
>> When PSI is disabled, psi_memstall_enter() returns early without
>> writing to *flags, leaving the caller's local variable uninitialized.
>> psi_memstall_leave() also returns early when PSI is disabled and does
>> not read *flags, so the uninitialized value is never used functionally.
>>
>> However, KMSAN tracks the shadow and origin metadata per physical
>> address. When a kernel stack page is subsequently reused, a new object
>> at the same address inherits the stale KMSAN shadow from the old
>> uninitialized pflags, causing spurious uninit-value reports in
>> unrelated code paths such as __flush_smp_call_function_queue().
>>
>> Initialize *flags to 0 in the psi_disabled early-return path to
>> prevent the stale shadow from escaping the callers' stack frames.
> How is this not a kmsan bug?
>
> I don't think putting an unexpected init into unrelated code with no
> comment is an appropriate fix for what seems like a false positive in
> this tool.
Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
Posted by Johannes Weiner 2 months, 1 week ago
On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote:
> Hi Johannes,
> 
> Good question. You're right that KMSAN's stack tracking persisting
> across page reuse boundaries is arguably a tool limitation. That said,
> I think fixing it on the PSI side is still reasonable:
> 
> psi_memstall_enter() takes a pointer parameter with an implicit contract:
> if the caller passes &flags, they expect *flags to be initialized upon
> return. The current early-return silently violates that contract by
> leaving *flags uninitialized, even though the value is never actually used
> functionally.

The caller has no expectations towards the contents of *flags and no
business reading or manipulating them. It's an opaque channel that
lets _enter() communicate with _leave().

> The fix is essentially free (we're already in the early-return path) and
> makes the contract explicit. You're right that the original patch lacked
> a comment explaining this, I should have added:
> 
>      /* Initialize to 0 even in psi_disabled case to honor the
>       * implicit API contract that *flags is initialized on return.
>       * psi_memstall_leave() also returns early when psi_disabled
>       * and does not read *flags, so this is zero-cost. */
>      *flags = 0;
>      return;
> 
> That said, if you prefer this stays in KMSAN (e.g., treating stack
> variables as out-of-scope once their frame returns), I'm happy to drop
> the patch and redirect the effort there instead.

It sounds to me like this would be a good thing to fix regardless of
what psi is doing here. Even if psi initialized it to some value that
is meaningful to psi - that value is totally random, and for all
intents and purposes "uninitialized", from the view of a subsequent
user of that stack slot?
Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
Posted by Mashiro Chen 2 months, 1 week ago
Hi Johannes,

You're right on both counts. The 'opaque channel' framing makes it
clear there's no meaningful API contract being violated here -- the
caller is not supposed to interpret *flags at all.

And yes, your second point is exactly the real issue: once a stack
frame returns, its local variables should be considered dead. KMSAN
tracking that shadow across page reuse into an unrelated frame is the
actual bug.

I'll drop this patch. The correct fix is in KMSAN -- it should treat
stack slots as out-of-scope once their owning frame returns, rather
than letting stale shadow metadata escape into subsequent users of
the same physical address.

Thanks for the clear explanation.

Best,
Mashiro Chen

On 4/9/26 00:40, Johannes Weiner wrote:
> On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote:
>> Hi Johannes,
>>
>> Good question. You're right that KMSAN's stack tracking persisting
>> across page reuse boundaries is arguably a tool limitation. That said,
>> I think fixing it on the PSI side is still reasonable:
>>
>> psi_memstall_enter() takes a pointer parameter with an implicit contract:
>> if the caller passes &flags, they expect *flags to be initialized upon
>> return. The current early-return silently violates that contract by
>> leaving *flags uninitialized, even though the value is never actually used
>> functionally.
> The caller has no expectations towards the contents of *flags and no
> business reading or manipulating them. It's an opaque channel that
> lets _enter() communicate with _leave().
>
>> The fix is essentially free (we're already in the early-return path) and
>> makes the contract explicit. You're right that the original patch lacked
>> a comment explaining this, I should have added:
>>
>>       /* Initialize to 0 even in psi_disabled case to honor the
>>        * implicit API contract that *flags is initialized on return.
>>        * psi_memstall_leave() also returns early when psi_disabled
>>        * and does not read *flags, so this is zero-cost. */
>>       *flags = 0;
>>       return;
>>
>> That said, if you prefer this stays in KMSAN (e.g., treating stack
>> variables as out-of-scope once their frame returns), I'm happy to drop
>> the patch and redirect the effort there instead.
> It sounds to me like this would be a good thing to fix regardless of
> what psi is doing here. Even if psi initialized it to some value that
> is meaningful to psi - that value is totally random, and for all
> intents and purposes "uninitialized", from the view of a subsequent
> user of that stack slot?