[PATCH] timers: Fix NULL function pointer race in timer_shutdown_sync

Yipeng Zou posted 1 patch 1 week, 2 days ago
kernel/time/timer.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
[PATCH] timers: Fix NULL function pointer race in timer_shutdown_sync
Posted by Yipeng Zou 1 week, 2 days ago
There is a race condition between timer_shutdown_sync() and timer
expiration that can lead to hitting a WARN_ON in expire_timers().

The issue occurs when timer_shutdown_sync() clears the timer function
to NULL while the timer is still running on another CPU. The race
scenario looks like this:

CPU0					CPU1
					<SOFTIRQ>
					lock_timer_base
					expire_timers
					base->running_timer = timer;
					unlock_timer_base
					[call_timer_fn enter]
					mod_timer
					...
timer_shutdown_sync
lock_timer_base
// For now, will not detach the timer but only clear its function to NULL
if (base->running_timer != timer)
	ret = detach_if_pending(timer, base, true);
if (shutdown)
	timer->function = NULL;
unlock_timer_base
					[call_timer_fn exit]
					lock_timer_base
					base->running_timer = NULL
					unlock_timer_base
					...
					// Now timer is pending while its function set to NULL.
					// next timer trigger
					<SOFTIRQ>
					expire_timers
					WARN_ON_ONCE(!fn) // hit
					...
lock_timer_base
// Now timer will detach
if (base->running_timer != timer)
	ret = detach_if_pending(timer, base, true);
if (shutdown)
	timer->function = NULL;
unlock_timer_base

The problem is that timer_shutdown_sync() clears the timer function
regardless of whether the timer is currently running. This can leave
a pending timer with a NULL function pointer, which triggers the
WARN_ON_ONCE(!fn) check in expire_timers().

Fix this by only clearing the timer function when we actually detach
the timer. If the timer is running, we leave the function pointer
intact, which is safe because the timer will be properly detached
when it finishes running.

Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
---
 kernel/time/timer.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 282a8e5c05f8..1f2364126894 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1458,10 +1458,11 @@ static int __try_to_del_timer_sync(struct timer_list *timer, bool shutdown)
 
 	base = lock_timer_base(timer, &flags);
 
-	if (base->running_timer != timer)
+	if (base->running_timer != timer) {
 		ret = detach_if_pending(timer, base, true);
-	if (shutdown)
-		timer->function = NULL;
+		if (shutdown)
+			timer->function = NULL;
+	}
 
 	raw_spin_unlock_irqrestore(&base->lock, flags);
 
-- 
2.34.1
[tip: timers/urgent] timers: Fix NULL function pointer race in timer_shutdown_sync()
Posted by tip-bot2 for Yipeng Zou 1 week, 2 days ago
The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     20739af07383e6eb1ec59dcd70b72ebfa9ac362c
Gitweb:        https://git.kernel.org/tip/20739af07383e6eb1ec59dcd70b72ebfa9ac362c
Author:        Yipeng Zou <zouyipeng@huawei.com>
AuthorDate:    Sat, 22 Nov 2025 09:39:42 
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 22 Nov 2025 22:55:26 +01:00

timers: Fix NULL function pointer race in timer_shutdown_sync()

There is a race condition between timer_shutdown_sync() and timer
expiration that can lead to hitting a WARN_ON in expire_timers().

The issue occurs when timer_shutdown_sync() clears the timer function
to NULL while the timer is still running on another CPU. The race
scenario looks like this:

CPU0					CPU1
					<SOFTIRQ>
					lock_timer_base()
					expire_timers()
					base->running_timer = timer;
					unlock_timer_base()
					[call_timer_fn enter]
					mod_timer()
					...
timer_shutdown_sync()
lock_timer_base()
// For now, will not detach the timer but only clear its function to NULL
if (base->running_timer != timer)
	ret = detach_if_pending(timer, base, true);
if (shutdown)
	timer->function = NULL;
unlock_timer_base()
					[call_timer_fn exit]
					lock_timer_base()
					base->running_timer = NULL;
					unlock_timer_base()
					...
					// Now timer is pending while its function set to NULL.
					// next timer trigger
					<SOFTIRQ>
					expire_timers()
					WARN_ON_ONCE(!fn) // hit
					...
lock_timer_base()
// Now timer will detach
if (base->running_timer != timer)
	ret = detach_if_pending(timer, base, true);
if (shutdown)
	timer->function = NULL;
unlock_timer_base()

The problem is that timer_shutdown_sync() clears the timer function
regardless of whether the timer is currently running. This can leave a
pending timer with a NULL function pointer, which triggers the
WARN_ON_ONCE(!fn) check in expire_timers().

Fix this by only clearing the timer function when actually detaching the
timer. If the timer is running, leave the function pointer intact, which is
safe because the timer will be properly detached when it finishes running.

Fixes: 0cc04e80458a ("timers: Add shutdown mechanism to the internal functions")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251122093942.301559-1-zouyipeng@huawei.com
---
 kernel/time/timer.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 553fa46..d5ebb1d 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1458,10 +1458,11 @@ static int __try_to_del_timer_sync(struct timer_list *timer, bool shutdown)
 
 	base = lock_timer_base(timer, &flags);
 
-	if (base->running_timer != timer)
+	if (base->running_timer != timer) {
 		ret = detach_if_pending(timer, base, true);
-	if (shutdown)
-		timer->function = NULL;
+		if (shutdown)
+			timer->function = NULL;
+	}
 
 	raw_spin_unlock_irqrestore(&base->lock, flags);