[PATCH] posix-timers: cond_resched() during exit_itimers()

Benjamin Segall posted 1 patch 10 months, 1 week ago
kernel/time/posix-timers.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
[PATCH] posix-timers: cond_resched() during exit_itimers()
Posted by Benjamin Segall 10 months, 1 week ago
exit_itimers() loops through every timer in the process to delete it.
This requires taking the system-wide hash_lock for each of these timers,
and contends with other processes trying to create or delete timers.
When a process creates hundreds of thousands of timers, and then exits
while other processes contend with it, this can trigger softlockups on
CONFIG_PREEMPT=n.

Ideally this will some day be better solved by eliminating the global
hashtable, but until that point mitigate the issue by doing
cond_resched in that loop.

Signed-off-by: Ben Segall <bsegall@google.com>
---
 kernel/time/posix-timers.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1b675aee99a98..44ba7db07e900 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1097,12 +1097,14 @@ void exit_itimers(struct task_struct *tsk)
 	spin_lock_irq(&tsk->sighand->siglock);
 	hlist_move_list(&tsk->signal->posix_timers, &timers);
 	spin_unlock_irq(&tsk->sighand->siglock);
 
 	/* The timers are not longer accessible via tsk::signal */
-	while (!hlist_empty(&timers))
+	while (!hlist_empty(&timers)) {
 		itimer_delete(hlist_entry(timers.first, struct k_itimer, list));
+		cond_resched();
+	}
 
 	/*
 	 * There should be no timers on the ignored list. itimer_delete() has
 	 * mopped them up.
 	 */
-- 
2.48.1.601.g30ceb7b040-goog
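
For readers who want to see the shape of the change outside the kernel, here is a minimal userspace sketch of the same pattern: drain a list one entry at a time and offer the CPU after each deletion. Everything in it is an assumption made for illustration. struct fake_timer, fake_timers_create() and fake_exit_itimers() are hypothetical names, and sched_yield() is only a rough analogue of cond_resched(), which is a kernel-internal call that reschedules only when another task actually needs the CPU.

```c
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct k_itimer; only the list linkage matters. */
struct fake_timer {
	struct fake_timer *next;
	int id;
};

/* Push n entries onto the list; returns how many were actually allocated. */
static size_t fake_timers_create(struct fake_timer **head, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct fake_timer *t = malloc(sizeof(*t));

		if (!t)
			break;
		t->id = (int)i;
		t->next = *head;
		*head = t;
	}
	return i;
}

/*
 * Drain the list from the head, as the loop in exit_itimers() does, and
 * offer the CPU after every deletion. Returns the number of entries freed.
 */
static size_t fake_exit_itimers(struct fake_timer **head)
{
	size_t deleted = 0;

	while (*head) {
		struct fake_timer *t = *head;

		*head = t->next;	/* unlink first, then free */
		free(t);
		deleted++;
		sched_yield();		/* userspace analogue of cond_resched() */
	}
	return deleted;
}
```

Note that the loop still performs the same number of iterations as before. The yield only bounds how long the caller runs without giving other tasks a chance; it does not shorten the total teardown time.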
Re: [PATCH] posix-timers: cond_resched() during exit_itimers()
Posted by Thomas Gleixner 10 months ago
On Fri, Feb 14 2025 at 14:12, Benjamin Segall wrote:
> exit_itimers() loops through every timer in the process to delete it.
> This requires taking the system-wide hash_lock for each of these timers,
> and contends with other processes trying to create or delete timers.
> When a process creates hundreds of thousands of timers, and then exits
> while other processes contend with it, this can trigger softlockups on
> CONFIG_PREEMPT=n.
>
> Ideally this will some day be better solved by eliminating the global
> hashtable, but until that point mitigate the issue by doing
> cond_resched in that loop.

It won't help on a PREEMPT_NONE kernel because the loop will be just as
long as before. Only the hash lock contention will be smaller, but that
does not mean that mopping up 100k timers won't take ages.

We really need to get this PREEMPT_LAZY thing going and kill all of this
cond_resched() nonsense.

Thanks,

        tglx
Re: [PATCH] posix-timers: cond_resched() during exit_itimers()
Posted by Benjamin Segall 10 months ago
Thomas Gleixner <tglx@linutronix.de> writes:

> On Fri, Feb 14 2025 at 14:12, Benjamin Segall wrote:
>> exit_itimers() loops through every timer in the process to delete it.
>> This requires taking the system-wide hash_lock for each of these timers,
>> and contends with other processes trying to create or delete timers.
>> When a process creates hundreds of thousands of timers, and then exits
>> while other processes contend with it, this can trigger softlockups on
>> CONFIG_PREEMPT=n.
>>
>> Ideally this will some day be better solved by eliminating the global
>> hashtable, but until that point mitigate the issue by doing
>> cond_resched in that loop.
>
> It won't help on a PREEMPT_NONE kernel because the loop will be just as
> long as before. Only the hash lock contention will be smaller, but that
> does not mean that mopping up 100k timers won't take ages.

Yeah, it could just run into a new lock or other bottleneck, though it's
not immediately obvious to me what it would be (hash_lock isn't sharing
~any of the time in perf tracing, the obvious other locks like hrtimer
are sharded, etc). Just sharding the lock a bunch (leaving the actual
hashtable with the same cacheline sharing even) boosts the speed of my
synthetic contention test freeing 100k timers from 6s to 380ms (with
uncontended exit at 17ms), so I think it's realistic that avoiding
the shared lock/table might well do the job.
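
To make "sharding the lock" concrete, here is a minimal userspace sketch, not the code used for the experiment above and not anything from the kernel tree. It replaces one global lock with an array of spinlocks and picks a lock by hashing the key, so that operations on keys that hash to different shards no longer serialize. SHARD_BITS, NR_SHARDS, shard_of() and the shard_*lock() helpers are all hypothetical names, and the multiplicative hash constant is just a common choice.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SHARD_BITS 6			/* hypothetical: 64 shards */
#define NR_SHARDS  (1u << SHARD_BITS)

/* One tiny spinlock per shard; static storage starts out unlocked (false). */
static atomic_bool shard_locks[NR_SHARDS];

/* Map a key (e.g. a timer id) to a shard with a multiplicative hash. */
static unsigned int shard_of(uint32_t key)
{
	return (uint32_t)(key * 0x9E3779B1u) >> (32 - SHARD_BITS);
}

/* Try once to take the lock covering key; true means we now hold it. */
static bool shard_trylock(uint32_t key)
{
	return !atomic_exchange_explicit(&shard_locks[shard_of(key)], true,
					 memory_order_acquire);
}

/* Spin until the lock covering key is ours. */
static void shard_lock(uint32_t key)
{
	while (!shard_trylock(key))
		;	/* busy-wait; a real lock would relax the CPU or sleep */
}

/* Release the lock covering key. */
static void shard_unlock(uint32_t key)
{
	atomic_store_explicit(&shard_locks[shard_of(key)], false,
			      memory_order_release);
}
```

With this layout two threads deleting entries whose keys fall in different shards never wait on each other, while the table the locks protect can stay exactly as it was, which is the "leaving the actual hashtable with the same cacheline sharing" case described above.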

Of course nothing is stopping an even buggier application from
just creating more timers (and at that point starting to notice the
fixed hashtable size during timer_create)...

>
> We really need to get this PREEMPT_LAZY thing going and kill all of this
> cond_resched() nonsense.
>
> Thanks,
>
>         tglx
[tip: timers/core] posix-timers: Invoke cond_resched() during exit_itimers()
Posted by tip-bot2 for Benjamin Segall 10 months ago
The following commit has been merged into the timers/core branch of tip:

Commit-ID:     f99c5bb396b8d1424ed229d1ffa6f596e3b9c36b
Gitweb:        https://git.kernel.org/tip/f99c5bb396b8d1424ed229d1ffa6f596e3b9c36b
Author:        Benjamin Segall <bsegall@google.com>
AuthorDate:    Fri, 14 Feb 2025 14:12:20 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 18 Feb 2025 10:12:49 +01:00

posix-timers: Invoke cond_resched() during exit_itimers()

exit_itimers() loops through every timer in the process to delete it.  This
requires taking the system-wide hash_lock for each of these timers, and
contends with other processes trying to create or delete timers.

When a process creates hundreds of thousands of timers, and then exits
while other processes contend with it, this can trigger softlockups on
CONFIG_PREEMPT=n.

Add a cond_resched() invocation into the loop to allow the system to make
progress.

Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/xm2634gg2n23.fsf@google.com
---
 kernel/time/posix-timers.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1b675ae..44ba7db 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1099,8 +1099,10 @@ void exit_itimers(struct task_struct *tsk)
 	spin_unlock_irq(&tsk->sighand->siglock);
 
 	/* The timers are not longer accessible via tsk::signal */
-	while (!hlist_empty(&timers))
+	while (!hlist_empty(&timers)) {
 		itimer_delete(hlist_entry(timers.first, struct k_itimer, list));
+		cond_resched();
+	}
 
 	/*
 	 * There should be no timers on the ignored list. itimer_delete() has