[PATCH 0/2] timers/workqueue: Add support for active CPU

Partha Satapathy posted 2 patches 1 month, 3 weeks ago
Posted by Partha Satapathy 1 month, 3 weeks ago
From: Partha Sarathi Satapathy <partha.satapathy@oracle.com>

Hi,

Timers queued with add_timer_on() and delayed work queued with
queue_delayed_work_on() currently rely on the caller to ensure that the
target CPU remains online until the enqueue operation completes. In
practice, CPU hotplug can still race with that sequence and leave the
timer queued on an offline CPU, where it will not run until that CPU
comes back online.

For delayed work, this has a direct knock-on effect: if the backing
timer is stranded on an offline CPU, the work item is not queued for
execution until that CPU returns.

In many cases, the target CPU is chosen for locality and cache affinity
rather than as a strict execution requirement. Falling back to an active
CPU is preferable to leaving the timer or delayed work blocked on a dead
CPU. While callers can try to track CPU hotplug state themselves, that
does not close the race, and taking the hotplug lock around enqueue
operations is too expensive for this class of use.

This series adds opt-in helpers for that fallback behavior without
changing the semantics of the existing interfaces:

  - add_timer_active_cpu() queues a timer on the requested CPU only if
    the target CPU's timer base is active; otherwise it falls back to
    the current CPU.

  - queue_delayed_work_active_cpu() uses the new timer helper for the
    delayed timer path and updates dwork->cpu to reflect the CPU
    actually selected for the timer, so the work item is queued on the
    same active CPU.

The existing add_timer_on() and queue_delayed_work_on() behavior is left
unchanged for callers that require strict CPU placement.
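
To make the intended fallback concrete, here is a minimal user-space
model of the behavior described above. It is a sketch only, not the code
from the patches: the model_* names, the fixed NR_CPUS value, and the
explicit current_cpu argument are all hypothetical, and locking and CPU
hotplug synchronization are left out entirely.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: small fixed CPU count for the model */

/* Hypothetical stand-in for a per-CPU timer base (not struct timer_base). */
struct model_timer_base {
	bool active;	/* false once the CPU's base is no longer active */
	int nr_queued;	/* number of timers queued on this base */
};

struct model_timer {
	int cpu;	/* CPU whose base holds the timer, -1 if not queued */
};

struct model_delayed_work {
	struct model_timer timer;
	int cpu;	/* CPU the work item is queued on when the timer fires */
};

static struct model_timer_base bases[NR_CPUS] = {
	{ .active = true }, { .active = true },
	{ .active = true }, { .active = true },
};

/*
 * Queue the timer on target_cpu if that CPU's base is active, otherwise
 * fall back to current_cpu. Returns the CPU that was actually selected.
 */
static int model_add_timer_active_cpu(struct model_timer *t,
				      int target_cpu, int current_cpu)
{
	int cpu = bases[target_cpu].active ? target_cpu : current_cpu;

	bases[cpu].nr_queued++;
	t->cpu = cpu;
	return cpu;
}

/*
 * Arm the backing timer via the helper above and record the selected CPU
 * in dw->cpu, so the work item follows the timer to the same CPU.
 */
static void model_queue_delayed_work_active_cpu(struct model_delayed_work *dw,
						int target_cpu, int current_cpu)
{
	dw->cpu = model_add_timer_active_cpu(&dw->timer, target_cpu,
					     current_cpu);
}
```

In this model, a request for CPU 2 made from CPU 0 lands on CPU 2 while
bases[2].active is true, and on CPU 0 once it has been cleared; dw->cpu
always matches the CPU holding the timer.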

Partha Sarathi Satapathy (2):
  timer: add add_timer_active_cpu()
  workqueue: add queue_delayed_work_active_cpu()

 include/linux/timer.h     |  1 +
 include/linux/workqueue.h |  3 ++
 kernel/time/timer.c       | 45 +++++++++++++++++-
 kernel/workqueue.c        | 96 +++++++++++++++++++++++++++++++++------
 4 files changed, 131 insertions(+), 14 deletions(-)

-- 
2.43.7
Re: [PATCH 0/2] timers/workqueue: Add support for active CPU
Posted by Frederic Weisbecker 1 month, 3 weeks ago
Hi,

On Thu, Apr 23, 2026 at 09:19:05AM +0000, Partha Satapathy wrote:
> From: Partha Sarathi Satapathy <partha.satapathy@oracle.com>
> 
> Hi,
> 
> Timers queued with add_timer_on() and delayed work queued with
> queue_delayed_work_on() currently rely on the caller to ensure that the
> target CPU remains online until the enqueue operation completes. In
> practice, CPU hotplug can still race with that sequence and leave the
> timer queued on an offline CPU, where it will not run until that CPU
> comes back online.
> 
> For delayed work, this has a direct knock-on effect: if the backing
> timer is stranded on an offline CPU, the work item is not queued for
> execution until that CPU returns.
> 
> In many cases, the target CPU is chosen for locality and cache affinity
> rather than as a strict execution requirement. Falling back to an active
> CPU is preferable to leaving the timer or delayed work blocked on a dead
> CPU. While callers can try to track CPU hotplug state themselves, that
> does not close the race, and taking the hotplug lock around enqueue
> operations is too expensive for this class of use.
> 
> This series adds opt-in helpers for that fallback behavior without
> changing the semantics of the existing interfaces:
> 
>   - add_timer_active_cpu() queues a timer on the requested CPU only if
>     the target CPU's timer base is active; otherwise it falls back to
>     the current CPU.
> 
>   - queue_delayed_work_active_cpu() uses the new timer helper for the
>     delayed timer path and updates dwork->cpu to reflect the CPU
>     actually selected for the timer, so the work item is queued on the
>     same active CPU.
> 
> The existing add_timer_on() and queue_delayed_work_on() behavior is left
> unchanged for callers that require strict CPU placement.

Timers are migrated when CPUs go offline. So the problem is queueing
a timer to an offline CPU. It should be the responsibility of a subsystem
to synchronize with CPU hotplug in order to avoid that.

As for timers that are queued locally for performance reasons rather than
for correctness, do we know of any such example?

Thanks.

-- 
Frederic Weisbecker
SUSE Labs