kernel/sched/psi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
When PSI is disabled, psi_memstall_enter() returns early without
writing to *flags, leaving the caller's local variable uninitialized.
psi_memstall_leave() also returns early when PSI is disabled and does
not read *flags, so the uninitialized value is never used functionally.
However, KMSAN tracks the shadow and origin metadata per physical
address. When a kernel stack page is subsequently reused, a new object
at the same address inherits the stale KMSAN shadow from the old
uninitialized pflags, causing spurious uninit-value reports in
unrelated code paths such as __flush_smp_call_function_queue().
Initialize *flags to 0 in the psi_disabled early-return path to
prevent the stale shadow from escaping the callers' stack frames.
Fixes: eb414681d5a0 ("psi: pressure stall information for CPU, memory, and IO")
Reported-by: syzbot+4b1bd55fba6260160779@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4b1bd55fba6260160779
Link: https://lore.kernel.org/all/6991885b.050a0220.340abe.02ef.GAE@google.com/
Signed-off-by: Mashiro Chen <mashiro.chen@mailbox.org>
---
kernel/sched/psi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45b..32d3a180fc03b 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1058,8 +1058,10 @@ void psi_memstall_enter(unsigned long *flags)
struct rq_flags rf;
struct rq *rq;
- if (static_branch_likely(&psi_disabled))
+ if (static_branch_likely(&psi_disabled)) {
+ *flags = 0;
return;
+ }
*flags = current->in_memstall;
if (*flags)
--
2.53.0
On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote: > When PSI is disabled, psi_memstall_enter() returns early without > writing to *flags, leaving the caller's local variable uninitialized. > psi_memstall_leave() also returns early when PSI is disabled and does > not read *flags, so the uninitialized value is never used functionally. > > However, KMSAN tracks the shadow and origin metadata per physical > address. When a kernel stack page is subsequently reused, a new object > at the same address inherits the stale KMSAN shadow from the old > uninitialized pflags, causing spurious uninit-value reports in > unrelated code paths such as __flush_smp_call_function_queue(). > > Initialize *flags to 0 in the psi_disabled early-return path to > prevent the stale shadow from escaping the callers' stack frames. How is this not a kmsan bug? I don't think putting an unexpected init into unrelated code with no comment is an appropriate fix for what seems like a false positive in this tool.
Hi Johannes, Good question. You're right that KMSAN's stack tracking persisting across page reuse boundaries is arguably a tool limitation. That said, I think fixing it on the PSI side is still reasonable: psi_memstall_enter() takes a pointer parameter with an implicit contract: if the caller passes &flags, they expect *flags to be initialized upon return. The current early-return silently violates that contract by leaving *flags uninitialized, even though the value is never actually used functionally. The fix is essentially free (we're already in the early-return path) and makes the contract explicit. You're right that the original patch lacked a comment explaining this, I should have added: /* Initialize to 0 even in psi_disabled case to honor the * implicit API contract that *flags is initialized on return. * psi_memstall_leave() also returns early when psi_disabled * and does not read *flags, so this is zero-cost. */ *flags = 0; return; That said, if you prefer this stays in KMSAN (e.g., treating stack variables as out-of-scope once their frame returns), I'm happy to drop the patch and redirect the effort there instead. Thanks for the thoughtful review feedback. Best, Mashiro Chen On 4/8/26 23:10, Johannes Weiner wrote: > On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote: >> When PSI is disabled, psi_memstall_enter() returns early without >> writing to *flags, leaving the caller's local variable uninitialized. >> psi_memstall_leave() also returns early when PSI is disabled and does >> not read *flags, so the uninitialized value is never used functionally. >> >> However, KMSAN tracks the shadow and origin metadata per physical >> address. When a kernel stack page is subsequently reused, a new object >> at the same address inherits the stale KMSAN shadow from the old >> uninitialized pflags, causing spurious uninit-value reports in >> unrelated code paths such as __flush_smp_call_function_queue(). >> >> Initialize *flags to 0 in the psi_disabled early-return path to >> prevent the stale shadow from escaping the callers' stack frames. > How is this not a kmsan bug? > > I don't think putting an unexpected init into unrelated code with no > comment is an appropriate fix for what seems like a false positive in > this tool.
On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote: > Hi Johannes, > > Good question. You're right that KMSAN's stack tracking persisting > across page reuse boundaries is arguably a tool limitation. That said, > I think fixing it on the PSI side is still reasonable: > > psi_memstall_enter() takes a pointer parameter with an implicit contract: > if the caller passes &flags, they expect *flags to be initialized upon > return. The current early-return silently violates that contract by > leaving *flags uninitialized, even though the value is never actually used > functionally. The caller has no expectations towards the contents of *flags and no business reading or manipulating them. It's an opaque channel that lets _enter() communicate with _leave(). > The fix is essentially free (we're already in the early-return path) and > makes the contract explicit. You're right that the original patch lacked > a comment explaining this, I should have added: > > /* Initialize to 0 even in psi_disabled case to honor the > * implicit API contract that *flags is initialized on return. > * psi_memstall_leave() also returns early when psi_disabled > * and does not read *flags, so this is zero-cost. */ > *flags = 0; > return; > > That said, if you prefer this stays in KMSAN (e.g., treating stack > variables as out-of-scope once their frame returns), I'm happy to drop > the patch and redirect the effort there instead. It sounds to me like this would be a good thing to fix regardless of what psi is doing here. Even if psi initialized it to some value that is meaningful to psi - that value is totally random, and for all intents and purposes "uninitialized", from the view of a subsequent user of that stack slot?
Hi Johannes, You're right on both counts. The 'opaque channel' framing makes it clear there's no meaningful API contract being violated here -- the caller is not supposed to interpret *flags at all. And yes, your second point is exactly the real issue: once a stack frame returns, its local variables should be considered dead. KMSAN tracking that shadow across page reuse into an unrelated frame is the actual bug. I'll drop this patch. The correct fix is in KMSAN -- it should treat stack slots as out-of-scope once their owning frame returns, rather than letting stale shadow metadata escape into subsequent users of the same physical address. Thanks for the clear explanation. Best, Mashiro Chen On 4/9/26 00:40, Johannes Weiner wrote: > On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote: >> Hi Johannes, >> >> Good question. You're right that KMSAN's stack tracking persisting >> across page reuse boundaries is arguably a tool limitation. That said, >> I think fixing it on the PSI side is still reasonable: >> >> psi_memstall_enter() takes a pointer parameter with an implicit contract: >> if the caller passes &flags, they expect *flags to be initialized upon >> return. The current early-return silently violates that contract by >> leaving *flags uninitialized, even though the value is never actually used >> functionally. > The caller has no expectations towards the contents of *flags and no > business reading or manipulating them. It's an opaque channel that > lets _enter() communicate with _leave(). > >> The fix is essentially free (we're already in the early-return path) and >> makes the contract explicit. You're right that the original patch lacked >> a comment explaining this, I should have added: >> >> /* Initialize to 0 even in psi_disabled case to honor the >> * implicit API contract that *flags is initialized on return. >> * psi_memstall_leave() also returns early when psi_disabled >> * and does not read *flags, so this is zero-cost. */ >> *flags = 0; >> return; >> >> That said, if you prefer this stays in KMSAN (e.g., treating stack >> variables as out-of-scope once their frame returns), I'm happy to drop >> the patch and redirect the effort there instead. > It sounds to me like this would be a good thing to fix regardless of > what psi is doing here. Even if psi initialized it to some value that > is meaningful to psi - that value is totally random, and for all > intents and purposes "uninitialized", from the view of a subsequent > user of that stack slot?
© 2016 - 2026 Red Hat, Inc.