This patch continues the effort to refactor the workqueue APIs, which began
with the changes introducing new workqueues and a new alloc_workqueue flag:

commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually make workqueues unbound by
default, so that their workload placement is optimized by the scheduler.

Before that can happen, each individual case must be carefully reviewed and
converted to the better-named new workqueues, with no intended behaviour
changes:

system_wq -> system_percpu_wq
system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
index 918a59e686ea..b2cf6e17fbc5 100644
--- a/drivers/gpu/drm/xe/xe_tlb_inval.c
+++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
@@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
xe_tlb_inval_fence_signal(fence);
}
if (!list_empty(&tlb_inval->pending_fences))
- queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
+ queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
timeout_delay);
spin_unlock_irq(&tlb_inval->pending_lock);
}
@@ -226,7 +226,7 @@ static void xe_tlb_inval_fence_prep(struct xe_tlb_inval_fence *fence)
list_add_tail(&fence->link, &tlb_inval->pending_fences);
if (list_is_singular(&tlb_inval->pending_fences))
- queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
+ queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
tlb_inval->ops->timeout_delay(tlb_inval));
spin_unlock_irq(&tlb_inval->pending_lock);
@@ -378,7 +378,7 @@ void xe_tlb_inval_done_handler(struct xe_tlb_inval *tlb_inval, int seqno)
}
if (!list_empty(&tlb_inval->pending_fences))
- mod_delayed_work(system_wq,
+ mod_delayed_work(system_percpu_wq,
&tlb_inval->fence_tdr,
tlb_inval->ops->timeout_delay(tlb_inval));
else
--
2.52.0
On Wed, Dec 24, 2025 at 05:00:26PM +0100, Marco Crivellari wrote:
> This patch continues the effort to refactor the workqueue APIs, which began
> with the changes introducing new workqueues and a new alloc_workqueue flag:
>
> commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
> commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>
> The point of the refactoring is to eventually make workqueues unbound by
> default, so that their workload placement is optimized by the scheduler.
>
> Before that can happen, each individual case must be carefully reviewed and
> converted to the better-named new workqueues, with no intended behaviour
> changes:
>
> system_wq -> system_percpu_wq
> system_unbound_wq -> system_dfl_wq
>
> This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
> removed in the future.
>
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
> ---
> drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> index 918a59e686ea..b2cf6e17fbc5 100644
> --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> xe_tlb_inval_fence_signal(fence);
> }
> if (!list_empty(&tlb_inval->pending_fences))
> - queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> + queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
Actually, neither system_wq nor system_percpu_wq works here, as this is the
fence signaling path. We should use one of Xe's ordered work queues, which
are properly set up to be reclaim safe.
Matt
> timeout_delay);
> spin_unlock_irq(&tlb_inval->pending_lock);
> }
> @@ -226,7 +226,7 @@ static void xe_tlb_inval_fence_prep(struct xe_tlb_inval_fence *fence)
> list_add_tail(&fence->link, &tlb_inval->pending_fences);
>
> if (list_is_singular(&tlb_inval->pending_fences))
> - queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> + queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
> tlb_inval->ops->timeout_delay(tlb_inval));
> spin_unlock_irq(&tlb_inval->pending_lock);
>
> @@ -378,7 +378,7 @@ void xe_tlb_inval_done_handler(struct xe_tlb_inval *tlb_inval, int seqno)
> }
>
> if (!list_empty(&tlb_inval->pending_fences))
> - mod_delayed_work(system_wq,
> + mod_delayed_work(system_percpu_wq,
> &tlb_inval->fence_tdr,
> tlb_inval->ops->timeout_delay(tlb_inval));
> else
> --
> 2.52.0
>
On Wed, Dec 24, 2025 at 7:28 PM Matthew Brost <matthew.brost@intel.com> wrote:
> [...]
> > drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > index 918a59e686ea..b2cf6e17fbc5 100644
> > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> > xe_tlb_inval_fence_signal(fence);
> > }
> > if (!list_empty(&tlb_inval->pending_fences))
> > - queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > + queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
>
> Actually, neither system_wq nor system_percpu_wq works here, as this is the
> fence signaling path. We should use one of Xe's ordered work queues, which
> are properly set up to be reclaim safe.
Hi,
So, only for this specific workqueue, we could for example use this instead:
462 /** @ordered_wq: used to serialize compute mode resume */
463 struct workqueue_struct *ordered_wq;
I noticed this has been allocated using:
490 xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
Using alloc_ordered_workqueue() makes this workqueue unbound:
569 #define alloc_ordered_workqueue(fmt, flags, args...) \
570 alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
So this patch should be split in two:
- one patch switching to Xe's ordered workqueue, which "implicitly" changes
  the behavior to unbound
- one patch replacing system_wq with the new per-cpu wq (system_percpu_wq)

To keep this workqueue per-cpu we can use xe->unordered_wq, which is
allocated with alloc_workqueue() without specifying flags (e.g. WQ_UNBOUND or
the new WQ_PERCPU) and is therefore per-cpu.
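
Purely as a sketch of how that choice could be made explicit (assuming the
current allocation is alloc_workqueue("xe-unordered-wq", 0, 0) and that the
new WQ_PERCPU flag from the series above is available; names and flags here
are illustrative, not a tested change):

    /* Sketch only, not part of this patch. */

    /* Ordered workqueues are unbound: alloc_ordered_workqueue() ORs in
     * WQ_UNBOUND | __WQ_ORDERED, so the scheduler decides placement.
     */
    xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);

    /* A plain alloc_workqueue() without WQ_UNBOUND is per-cpu today;
     * passing the new WQ_PERCPU flag keeps that behaviour explicit once
     * the default flips to unbound.
     */
    xe->unordered_wq = alloc_workqueue("xe-unordered-wq", WQ_PERCPU, 0);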
Would this sound reasonable to you?
Thanks!
--
Marco Crivellari
L3 Support Engineer
On Tue, Dec 30, 2025 at 04:42:51PM +0100, Marco Crivellari wrote:
> On Wed, Dec 24, 2025 at 7:28 PM Matthew Brost <matthew.brost@intel.com> wrote:
> > [...]
> > > drivers/gpu/drm/xe/xe_tlb_inval.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > > index 918a59e686ea..b2cf6e17fbc5 100644
> > > --- a/drivers/gpu/drm/xe/xe_tlb_inval.c
> > > +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c
> > > @@ -94,7 +94,7 @@ static void xe_tlb_inval_fence_timeout(struct work_struct *work)
> > > xe_tlb_inval_fence_signal(fence);
> > > }
> > > if (!list_empty(&tlb_inval->pending_fences))
> > > - queue_delayed_work(system_wq, &tlb_inval->fence_tdr,
> > > + queue_delayed_work(system_percpu_wq, &tlb_inval->fence_tdr,
> >
> > Actually, neither system_wq nor system_percpu_wq works here, as this is the
> > fence signaling path. We should use one of Xe's ordered work queues, which
> > are properly set up to be reclaim safe.
>
> Hi,
>
> So, only for this specific workqueue, we could for example use this instead:
>
> 462 /** @ordered_wq: used to serialize compute mode resume */
> 463 struct workqueue_struct *ordered_wq;
>
> I noticed this has been allocated using:
>
> 490 xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
>
> Using alloc_ordered_workqueue() makes this workqueue unbound:
>
> 569 #define alloc_ordered_workqueue(fmt, flags, args...) \
> 570 alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
>
> So this patch should be split in two:
> - one patch switching to Xe's ordered workqueue, which "implicitly" changes
>   the behavior to unbound
> - one patch replacing system_wq with the new per-cpu wq (system_percpu_wq)
>
> To keep this workqueue per-cpu we can use xe->unordered_wq, which is
> allocated with alloc_workqueue() without specifying flags (e.g. WQ_UNBOUND
> or the new WQ_PERCPU) and is therefore per-cpu.
>
> Would this sound reasonable to you?
>
What I'd probably do here is store a WQ pointer in 'struct
xe_tlb_inval' and update all calls in xe_tlb_inval to use that work
queue.
Since these are tied to a GT currently - see xe_gt_tlb_inval_init_early,
I'd set the WQ pointer in 'struct xe_tlb_inval' to 'gt->ordered_wq'.
'gt->ordered_wq' is the per-GT WQ for resets, job timeouts, and VF
migrations (all operations in the reclaim path), so it seems to fit here
as well.
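
A rough sketch of that idea, purely for illustration (the 'wq' member, the
'gt->tlb_inval' field name, and the exact init hook are assumptions based on
the description above, not code from this thread):

    /* Hypothetical sketch: give xe_tlb_inval its own workqueue pointer
     * and point it at the per-GT ordered, reclaim-safe workqueue.
     */

    /* 1) Assumed new member in struct xe_tlb_inval: */
    struct workqueue_struct *wq;

    /* 2) Set during early init, e.g. in xe_gt_tlb_inval_init_early(): */
    gt->tlb_inval.wq = gt->ordered_wq;

    /* 3) The existing call sites then queue on it instead of system_wq: */
    queue_delayed_work(tlb_inval->wq, &tlb_inval->fence_tdr, timeout_delay);

That would keep the fence-signaling work on a queue that is already used from
the reclaim path, per the reasoning above.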
Matt
> Thanks!
> --
>
> Marco Crivellari
>
> L3 Support Engineer
On Wed, Dec 31, 2025 at 3:25 AM Matthew Brost <matthew.brost@intel.com> wrote:
> What I'd probably do here is store a WQ pointer in 'struct
> xe_tlb_inval' and update all calls in xe_tlb_inval to use that work
> queue.
>
> Since these are tied to a GT currently - see xe_gt_tlb_inval_init_early,
> I'd set the WQ pointer in 'struct xe_tlb_inval' to 'gt->ordered_wq'.
> 'gt->ordered_wq' is the per-GT WQ for resets, job timeouts, and VF
> migrations (all operations in the reclaim path), so it seems to fit here
> as well.

Hi Matthew,

Thanks for your hint. I will send a new version with this change, or if
something is not completely clear to me I will get back to you here!

Thanks!
--

Marco Crivellari

L3 Support Engineer