[PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users

Marco Crivellari posted 4 patches 3 months, 1 week ago
There is a newer version of this series
[PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users
Posted by Marco Crivellari 3 months, 1 week ago
Currently, if a user enqueues a work item using schedule_delayed_work(), the
workqueue used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ddfe30c13e9d..ebc9925f4e66 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -690,7 +690,8 @@ void kfd_procfs_del_queue(struct queue *q)
 int kfd_process_create_wq(void)
 {
 	if (!kfd_process_wq)
-		kfd_process_wq = alloc_workqueue("kfd_process_wq", 0, 0);
+		kfd_process_wq = alloc_workqueue("kfd_process_wq", WQ_PERCPU,
+						 0);
 	if (!kfd_restore_wq)
 		kfd_restore_wq = alloc_ordered_workqueue("kfd_restore_wq",
 							 WQ_FREEZABLE);
-- 
2.51.0
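
The flag semantics described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: the bit values below are placeholders rather than the ones in the kernel's workqueue.h, and wq_resolve_binding() and its unbound_is_default parameter are hypothetical names invented for this sketch, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel flags; the bit values are
 * placeholders chosen for this sketch, not copied from workqueue.h. */
#define WQ_UNBOUND (1u << 0)
#define WQ_PERCPU  (1u << 1)

enum wq_binding { WQ_BINDING_PERCPU, WQ_BINDING_UNBOUND };

/* Hypothetical helper: how a flags word is interpreted before
 * (unbound_is_default == false) and after (true) the migration. */
static enum wq_binding wq_resolve_binding(unsigned int flags,
					  bool unbound_is_default)
{
	if (flags & WQ_UNBOUND)
		return WQ_BINDING_UNBOUND;
	if (flags & WQ_PERCPU)
		return WQ_BINDING_PERCPU;
	/* Neither flag given: fall back to the default of the current phase. */
	return unbound_is_default ? WQ_BINDING_UNBOUND : WQ_BINDING_PERCPU;
}

static void wq_binding_self_check(void)
{
	/* Today: flags == 0 and WQ_PERCPU behave identically. */
	assert(wq_resolve_binding(0, false) == WQ_BINDING_PERCPU);
	assert(wq_resolve_binding(WQ_PERCPU, false) == WQ_BINDING_PERCPU);
	/* After the default flips, only the explicit flag keeps per-CPU. */
	assert(wq_resolve_binding(WQ_PERCPU, true) == WQ_BINDING_PERCPU);
	assert(wq_resolve_binding(0, true) == WQ_BINDING_UNBOUND);
}
```

The sketch shows why callers that pass 0 today need an explicit flag before the default changes: without one, their binding would silently switch from per-CPU to unbound.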

Re: [PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users
Posted by Christian König 3 months, 1 week ago
On 10/30/25 17:10, Marco Crivellari wrote:
> Currently, if a user enqueues a work item using schedule_delayed_work(), the
> workqueue used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
> WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
> schedule_work(), which uses system_wq, and queue_work(), which again makes
> use of WORK_CPU_UNBOUND.
> This lack of consistency cannot be addressed without refactoring the API.
> 
> alloc_workqueue() treats all queues as per-CPU by default, while unbound
> workqueues must opt-in via WQ_UNBOUND.
> 
> This default is suboptimal: most workloads benefit from unbound queues,
> allowing the scheduler to place worker threads where they’re needed and
> reducing noise when CPUs are isolated.
> 
> This change adds a new WQ_PERCPU flag to explicitly request
> alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
> 
> With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
> any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
> must now use WQ_PERCPU.
> 
> Once migration is complete, WQ_UNBOUND can be removed and unbound will
> become the implicit default.

Adding Philip and Felix to comment, but this work most likely should not be tied to the CPU that scheduled it either.

Regards,
Christian.

> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index ddfe30c13e9d..ebc9925f4e66 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -690,7 +690,8 @@ void kfd_procfs_del_queue(struct queue *q)
>  int kfd_process_create_wq(void)
>  {
>  	if (!kfd_process_wq)
> -		kfd_process_wq = alloc_workqueue("kfd_process_wq", 0, 0);
> +		kfd_process_wq = alloc_workqueue("kfd_process_wq", WQ_PERCPU,
> +						 0);
>  	if (!kfd_restore_wq)
>  		kfd_restore_wq = alloc_ordered_workqueue("kfd_restore_wq",
>  							 WQ_FREEZABLE);

Re: [PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users
Posted by Marco Crivellari 3 months, 1 week ago
On Thu, Oct 30, 2025 at 6:15 PM Christian König
<christian.koenig@amd.com> wrote:
>[...]
> Adding Philip and Felix to comment, but this work most likely should not be tied to the CPU that scheduled it either.

Hi Christian,

The behavior is exactly the same as without WQ_PERCPU: passing 0
already means the workqueue is per-CPU. Adding the WQ_PERCPU flag
just makes that explicit.

So if you need this to be unbound, I can send the v2 with WQ_UNBOUND
instead of WQ_PERCPU.

Thanks!

--

Marco Crivellari

L3 Support Engineer, Technology & Product
Re: [PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users
Posted by Philip Yang 3 months, 1 week ago
On 2025-10-31 04:48, Marco Crivellari wrote:
> On Thu, Oct 30, 2025 at 6:15 PM Christian König
> <christian.koenig@amd.com> wrote:
>> [...]
>> Adding Philip and Felix to comment, but this work most likely should not be tied to the CPU that scheduled it either.
> Hi Christian,
>
> The behavior is exactly the same as without WQ_PERCPU: passing 0
> already means the workqueue is per-CPU. Adding the WQ_PERCPU flag
> just makes that explicit.
>
> So if you need this to be unbound, I can send the v2 with WQ_UNBOUND
> instead of WQ_PERCPU.
Hi,

WQ_UNBOUND is more appropriate here: the KFD release work should execute immediately on any CPU with available resources, rather than being tied to the CPU on which kfd_unref_process() dropped the last process refcount.

Thanks,
Philip

> Thanks!
>
> --
>
> Marco Crivellari
>
> L3 Support Engineer, Technology & Product
Re: [PATCH 3/4] amd/amdkfd: WQ_PERCPU added to alloc_workqueue users
Posted by Marco Crivellari 3 months, 1 week ago
On Fri, Oct 31, 2025 at 2:12 PM Philip Yang <yangp@amd.com> wrote:
> Hi,
>
> WQ_UNBOUND is more appropriate here: the KFD release work should execute immediately on any CPU with available resources, rather than being tied to the CPU on which kfd_unref_process() dropped the last process refcount.

Hi,

I will do what you suggest.

Thank you!

-- 

Marco Crivellari

L3 Support Engineer, Technology & Product
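
For reference, the change agreed on above (WQ_UNBOUND instead of WQ_PERCPU for kfd_process_wq) might look roughly like the following. This is a userspace mock, not the actual v2 patch: the struct layout, the three-argument alloc_workqueue() stub, the flag value, and the -1 error return are all assumptions made so the fragment compiles on its own, and the kfd_restore_wq allocation from the real function is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit value for this sketch, not taken from workqueue.h. */
#define WQ_UNBOUND (1u << 0)

/* Minimal stand-in for the kernel's opaque workqueue type. */
struct workqueue_struct {
	const char *name;
	unsigned int flags;
	int max_active;
};

static struct workqueue_struct kfd_process_wq_storage;
static struct workqueue_struct *kfd_process_wq;

/* Stub that only records its arguments; the real alloc_workqueue()
 * is a kernel API and does far more than this. */
static struct workqueue_struct *alloc_workqueue(const char *name,
						unsigned int flags,
						int max_active)
{
	kfd_process_wq_storage.name = name;
	kfd_process_wq_storage.flags = flags;
	kfd_process_wq_storage.max_active = max_active;
	return &kfd_process_wq_storage;
}

/* Mock of kfd_process_create_wq() with the flag requested in review. */
static int kfd_process_create_wq(void)
{
	if (!kfd_process_wq)
		kfd_process_wq = alloc_workqueue("kfd_process_wq",
						 WQ_UNBOUND, 0);
	/* Assumed error convention for the mock only. */
	return kfd_process_wq ? 0 : -1;
}

static void kfd_wq_self_check(void)
{
	assert(kfd_process_create_wq() == 0);
	/* The release work is no longer bound to the enqueuing CPU. */
	assert(kfd_process_wq->flags & WQ_UNBOUND);
}
```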