kernel/acct.c | 1 + 1 file changed, 1 insertion(+)
From: Ozgur Karatas <ozgur@goosey.org>
Hello,
i want to try out the openai codex and it seemed like a logical
process so with rcu_read_lock() a protection is started but we dont
call rcu_read_unlock(); it has to be called to end rcu read lock.
I guess, this means rcu stays open forever and data structures are not
cleaned which causes performance degradation.
Regards
Ozgur
Reported-by: Ozgur Karatas <ozgur@goosey.org>
---
kernel/acct.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/acct.c b/kernel/acct.c
index 6520baa13669..8a2b75063d16 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -330,6 +330,7 @@ SYSCALL_DEFINE1(acct, const char __user *, name)
} else {
rcu_read_lock();
pin_kill(task_active_pid_ns(current)->bacct);
+ rcu_read_unlock();
}
return error;
--
2.39.5
On Fri, May 16, 2025 at 11:15:49PM +0000, Ozgur Kara wrote: > From: Ozgur Karatas <ozgur@goosey.org> > > Hello, > > i want to try out the openai codex and it seemed like a logical > process so with rcu_read_lock() a protection is started but we dont > call rcu_read_unlock(); it has to be called to end rcu read lock. > > I guess, this means rcu stays open forever and data structures are not > cleaned which causes performance degradation. > > Regards > > Ozgur > > Reported-by: Ozgur Karatas <ozgur@goosey.org> ... showing that one needs *something* that would be able to reason somewhere in the loop. Your "I guess, this means" is an excellent example - you've got something (openai codex, visual examination, whatever) pointing you to unusual pattern - rcu_read_lock() + call of function with no rcu_read_unlock() in sight. Next step: hypothesis that unlock might be missing. Next step: blindly send a patch adding an unlock on the strength of that hypothesis? Sorry, no. It's not just that hypothesis is wrong - it is, but that's not the real issue. It's that instead of * asking how the hell could that work * trying to trigger that and checking whether the hypothesis is corrent * looking around for similar places to see whether it looks intentional (in which case it still might be a bug, but if it's inconsistent, the odds of a bug are going up) * looking through the history (e.g. with git blame) to see if there might be any explanations in commit message * looking through the function in question (git grep would immediately point to fs/fs_pin.c, and the first glance would show rcu_read_unlock() on all paths through it, with the total balance of lock/unlock being -1 on each) you've chosen to post a patch "fixing" the problem as a way to see whether it's actually there. Folks, programming tools can be very useful in directing to places that might be worth checking. The codebase is huge, so any heuristics useful in triage are Good Things(tm); if instead of "bug found because somebody looked at random line picked out of millions such and it turned out to be broken" you have "bug found because such-and-such heuristic had picked that line as odd-looking and it turned out to be broken", you win even if it lists 20 places and only one of them turns out to be actually broken. But you really can't treat that as anything beyond a hint to bump that particular place in the list of things to look through. It might tell you what looks unusual, but that's it. Same as with any warning from any source. Asking around is fine; so's experimenting, so's reasoning, etc. But you can't go from "tool gives a warning, I guess it translates into plain language this way" to "modify that place so that the tool doesn't point to that place anymore". You really need to understand what's going on first.
Al Viro <viro@zeniv.linux.org.uk>, 17 May 2025 Cmt, 02:55 tarihinde şunu yazdı: > > On Fri, May 16, 2025 at 11:15:49PM +0000, Ozgur Kara wrote: > > From: Ozgur Karatas <ozgur@goosey.org> > > > > Hello, > > > > i want to try out the openai codex and it seemed like a logical > > process so with rcu_read_lock() a protection is started but we dont > > call rcu_read_unlock(); it has to be called to end rcu read lock. > > > > I guess, this means rcu stays open forever and data structures are not > > cleaned which causes performance degradation. > > > > Regards > > > > Ozgur > > > > Reported-by: Ozgur Karatas <ozgur@goosey.org> > > ... showing that one needs *something* that would be able to reason somewhere > in the loop. > > Your "I guess, this means" is an excellent example - you've got something > (openai codex, visual examination, whatever) pointing you to unusual pattern - > rcu_read_lock() + call of function with no rcu_read_unlock() in sight. > Hello Al, first thanks a lot for your guidance and insightful perspective and your advice was incredibly valuable, very valuable. And not just for this specific patch but also in helping us understand how to approach bugs and testing in kernel world. it really clarified how things that look like straightforward fixes can actually be symptoms of deeper design decisions and why blindly patching based on heuristics can lead us astray. So your mentorship reminded us that understanding the why behind kernel code is way more important than just code and that thoughtful investigation is key in kernel development. When examining every new technology (ai, openai codex) i think we need an engineering perspective but you should know that i didnt send it with a very short research, sorry. and i didnt think about whether rcu should leave this open while doing read copy and what would happen if it did but you said it very well, so this was enough of a guide for me and i will definitely look into it and try to understand it better by going to for example git blame and grep. I actually thought i understood this code and fix but that was it and i just believed the patch was correct without any need for static analysis or heuristic warnings. Still we live with these, i mean ai or many other tools come into our lives and this is also a learning phase for me, i need to learn more. and i actually asked ai tool this again and this time it got a better answer: "rcu locks are complex, it is not enough to just appear to be missing. why does it appear to be missing? look at the entire code, how is the lock/unlock balance provided? look at past commits, why does it appear that there is no unlock there? look at similar codes, is it intentional to act like this in that place? experiments, tests, asking questions are a must. in other words, do not ask “the tool warned, let’s patch it immediately” but “why is there such a warning, is it really a problem?” " this is a pattern i dont normally see elsewhere in kernel and i was convinced that the switch was left open. i learned that rcu_read_lock shouldnt be missing and it actually improved my approach to kernel. so thank you again. Regards Ozgur > Next step: hypothesis that unlock might be missing. > > Next step: blindly send a patch adding an unlock on the strength of that > hypothesis? > > Sorry, no. It's not just that hypothesis is wrong - it is, but that's not > the real issue. It's that instead of > * asking how the hell could that work > * trying to trigger that and checking whether the hypothesis is > corrent > * looking around for similar places to see whether it looks intentional > (in which case it still might be a bug, but if it's inconsistent, the odds of > a bug are going up) > * looking through the history (e.g. with git blame) to see if there > might be any explanations in commit message > * looking through the function in question (git grep would immediately > point to fs/fs_pin.c, and the first glance would show rcu_read_unlock() on > all paths through it, with the total balance of lock/unlock being -1 on > each) > you've chosen to post a patch "fixing" the problem as a way to see whether > it's actually there. > > Folks, programming tools can be very useful in directing to places that might > be worth checking. The codebase is huge, so any heuristics useful in triage > are Good Things(tm); if instead of "bug found because somebody looked at > random line picked out of millions such and it turned out to be broken" you > have "bug found because such-and-such heuristic had picked that line as > odd-looking and it turned out to be broken", you win even if it lists > 20 places and only one of them turns out to be actually broken. > > But you really can't treat that as anything beyond a hint to bump that > particular place in the list of things to look through. It might tell you > what looks unusual, but that's it. Same as with any warning from any > source. > > Asking around is fine; so's experimenting, so's reasoning, etc. > But you can't go from "tool gives a warning, I guess it translates into plain > language this way" to "modify that place so that the tool doesn't point to > that place anymore". You really need to understand what's going on first. > >
© 2016 - 2025 Red Hat, Inc.