[PATCH] workqueue: warn if delayed_work is queued to an offlined cpu.

Imran Khan posted 1 patch 1 year ago
There is a newer version of this series
kernel/workqueue.c | 1 +
1 file changed, 1 insertion(+)
[PATCH] workqueue: warn if delayed_work is queued to an offlined cpu.
Posted by Imran Khan 1 year ago
delayed_work submitted to an offlined cpu, will not get executed,
after the specified delay if the cpu remains offline. If the cpu
never comes online the work will never get executed.
checking for online cpu in __queue_delayed_work, does not sound
like a good idea because to do this reliably we need hotplug lock
and since work may be submitted from atomic contexts, we would
have to use cpus_read_trylock. But if trylock fails we would queue
the work on any cpu and this may not be optimal because our intended
cpu might still be online.

Putting a WARN_ON for an already offlined cpu, will indicate users
of queue_delayed_work_on, if they are (wrongly) trying to queue
delayed_work on offlined cpu.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
---
 kernel/workqueue.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 8e0bb3c608239..10878b5e3d74f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2508,6 +2508,7 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
 		return;
 	}
 
+	WARN_ON(cpu != WORK_CPU_UNBOUND && !cpu_online(cpu));
 	dwork->wq = wq;
 	dwork->cpu = cpu;
 	timer->expires = jiffies + delay;

base-commit: 6ecd20965bdc21b265a0671ccf36d9ad8043f5ab
-- 
2.34.1
Re: [PATCH] workqueue: warn if delayed_work is queued to an offlined cpu.
Posted by Tejun Heo 1 year ago
On Thu, Jan 09, 2025 at 09:18:29PM +1100, Imran Khan wrote:
> delayed_work submitted to an offlined cpu, will not get executed,
> after the specified delay if the cpu remains offline. If the cpu
> never comes online the work will never get executed.
> checking for online cpu in __queue_delayed_work, does not sound
> like a good idea because to do this reliably we need hotplug lock
> and since work may be submitted from atomic contexts, we would
> have to use cpus_read_trylock. But if trylock fails we would queue
> the work on any cpu and this may not be optimal because our intended
> cpu might still be online.
> 
> Putting a WARN_ON for an already offlined cpu, will indicate users
> of queue_delayed_work_on, if they are (wrongly) trying to queue
> delayed_work on offlined cpu.
> 
> Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
> ---
>  kernel/workqueue.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 8e0bb3c608239..10878b5e3d74f 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2508,6 +2508,7 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
>  		return;
>  	}
>  
> +	WARN_ON(cpu != WORK_CPU_UNBOUND && !cpu_online(cpu));

Can we use WARN_ON_ONCE() instead?

Thanks.

-- 
tejun
Re: [PATCH] workqueue: warn if delayed_work is queued to an offlined cpu.
Posted by imran.f.khan@oracle.com 1 year ago
Hello Tejun,
Thanks a lot for having a look.
On 10/1/2025 8:08 am, Tejun Heo wrote:
> On Thu, Jan 09, 2025 at 09:18:29PM +1100, Imran Khan wrote:
>> delayed_work submitted to an offlined cpu, will not get executed,
>> after the specified delay if the cpu remains offline. If the cpu
>> never comes online the work will never get executed.
>> checking for online cpu in __queue_delayed_work, does not sound
>> like a good idea because to do this reliably we need hotplug lock
>> and since work may be submitted from atomic contexts, we would
>> have to use cpus_read_trylock. But if trylock fails we would queue
>> the work on any cpu and this may not be optimal because our intended
>> cpu might still be online.
>>
>> Putting a WARN_ON for an already offlined cpu, will indicate users
>> of queue_delayed_work_on, if they are (wrongly) trying to queue
>> delayed_work on offlined cpu.
>>
>> Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
>> ---
>>  kernel/workqueue.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 8e0bb3c608239..10878b5e3d74f 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -2508,6 +2508,7 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
>>  		return;
>>  	}
>>  
>> +	WARN_ON(cpu != WORK_CPU_UNBOUND && !cpu_online(cpu));
> 
> Can we use WARN_ON_ONCE() instead?
> 
Yes we can. The reason I was using WARN_ON was that, if 
someone does this for multiple offlined CPU, we could see the
warnings for each of them.

I have sent a v2 of this patch with WARN_ON_ONCE. Also in v2, 
I have modified description of queue_delayed_work_on, similar to
that of queue_work_on, to indicate why the caller should ensure
that specified cpu is and remains online.

Thanks,
Imran

> Thanks.
>