[PATCH V3 6/7] workqueue: Limit number of processed works in rescuer per turn

Lai Jiangshan posted 7 patches 1 week, 3 days ago
There is a newer version of this series
Posted by Lai Jiangshan 1 week, 3 days ago
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>

Currently the rescuer keeps looping until all work on a PWQ is done, and
this may hurt fairness among PWQs, as the rescuer could remain stuck on
one PWQ indefinitely.

Introduce RESCUER_BATCH to control the maximum number of work items the
rescuer processes in each turn, and move on to other PWQs when the limit
is reached.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
 kernel/workqueue.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 49dce50ff647..9bc155545492 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -117,6 +117,8 @@ enum wq_internal_consts {
 	MAYDAY_INTERVAL		= HZ / 10,	/* and then every 100ms */
 	CREATE_COOLDOWN		= HZ,		/* time to breath after fail */
 
+	RESCUER_BATCH		= 16,		/* process items per turn */
+
 	/*
 	 * Rescue workers are used only on emergencies and shared by
 	 * all cpus.  Give MIN_NICE.
@@ -3456,7 +3458,7 @@ static int worker_thread(void *__worker)
 	goto woke_up;
 }
 
-static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescuer)
+static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescuer, bool limited)
 {
 	struct worker_pool *pool = pwq->pool;
 	struct work_struct *cursor = &pwq->mayday_cursor;
@@ -3477,7 +3479,20 @@ static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescu
 
 	/* try to assign a work to rescue */
 	list_for_each_entry_safe_from(work, n, &pool->worklist, entry) {
-		if (get_work_pwq(work) == pwq && assign_work(work, rescuer, &n)) {
+		if (get_work_pwq(work) != pwq)
+			continue;
+		/*
+		 * put the cursor, resend mayday for itself and move on to other
+		 * PWQs when the limit is reached.
+		 */
+		if (limited && !list_empty(&pwq->wq->maydays)) {
+			list_add_tail(&cursor->entry, &work->entry);
+			raw_spin_lock(&wq_mayday_lock);		/* for wq->maydays */
+			send_mayday(work);
+			raw_spin_unlock(&wq_mayday_lock);
+			return false;
+		}
+		if (assign_work(work, rescuer, &n)) {
 			pwq->stats[PWQ_STAT_RESCUED]++;
 			/* put the cursor for next search */
 			list_add_tail(&cursor->entry, &n->entry);
@@ -3542,6 +3557,7 @@ static int rescuer_thread(void *__rescuer)
 		struct pool_workqueue *pwq = list_first_entry(&wq->maydays,
 					struct pool_workqueue, mayday_node);
 		struct worker_pool *pool = pwq->pool;
+		unsigned int count = 0;
 
 		__set_current_state(TASK_RUNNING);
 		list_del_init(&pwq->mayday_node);
@@ -3554,7 +3570,7 @@ static int rescuer_thread(void *__rescuer)
 
 		WARN_ON_ONCE(!list_empty(&rescuer->scheduled));
 
-		while (assign_rescuer_work(pwq, rescuer))
+		while (assign_rescuer_work(pwq, rescuer, ++count > RESCUER_BATCH))
 			process_scheduled_works(rescuer);
 
 		/*
-- 
2.19.1.6.gb485710b
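
The batching behaviour described in the commit message can be modelled in
userspace. The following is only a minimal sketch, not kernel code:
struct toy_pwq, struct toy_maydays and the toy_* helpers are hypothetical
stand-ins for the PWQ and for wq->maydays, and locking, cursors and the
real work items are left out. It mirrors two details of the patch: the
limit only kicks in once more than RESCUER_BATCH items have been counted,
and the rescuer only yields when another PWQ is actually waiting.

```c
#include <assert.h>
#include <stddef.h>

#define RESCUER_BATCH	16	/* same value the patch introduces */
#define TOY_MAX_PWQS	8	/* hypothetical capacity of the toy queue */

/* Hypothetical stand-in for a pool_workqueue: only a pending-item count. */
struct toy_pwq {
	int pending;
};

/* Hypothetical FIFO standing in for wq->maydays; holds PWQ indices. */
struct toy_maydays {
	int ids[TOY_MAX_PWQS];
	size_t head, len;
};

static void toy_mayday_push(struct toy_maydays *m, int id)
{
	assert(m->len < TOY_MAX_PWQS);
	m->ids[(m->head + m->len) % TOY_MAX_PWQS] = id;
	m->len++;
}

static int toy_mayday_pop(struct toy_maydays *m)
{
	int id;

	assert(m->len > 0);
	id = m->ids[m->head];
	m->head = (m->head + 1) % TOY_MAX_PWQS;
	m->len--;
	return id;
}

/*
 * One rescuer turn: take the first PWQ off the mayday queue and process
 * its items.  Once more than RESCUER_BATCH items have been counted and
 * another PWQ is waiting, requeue this PWQ at the tail and stop.
 * Returns the index of the PWQ served; *processed gets the item count.
 */
static int toy_rescuer_turn(struct toy_pwq *pwqs, struct toy_maydays *m,
			    int *processed)
{
	int id = toy_mayday_pop(m);
	unsigned int count = 0;

	*processed = 0;
	while (pwqs[id].pending > 0) {
		int limited = ++count > RESCUER_BATCH;

		if (limited && m->len > 0) {
			toy_mayday_push(m, id);	/* "resend mayday" for itself */
			break;
		}
		pwqs[id].pending--;
		(*processed)++;
	}
	return id;
}
```

With two toy PWQs holding 40 and 5 items, both queued for rescue, three
calls to toy_rescuer_turn() serve PWQ 0, then PWQ 1, then PWQ 0 again. In
the last turn nothing else is waiting, so the rescuer drains the remaining
items without yielding.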
Re: [PATCH V3 6/7] workqueue: Limit number of processed works in rescuer per turn
Posted by Tejun Heo 1 week, 3 days ago
Hello,

On Fri, Nov 21, 2025 at 10:57:19PM +0800, Lai Jiangshan wrote:
> +static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescuer, bool limited)

I find the organization a bit odd with the expiration detection in the
caller and the implementation of it piped into this function. Please see
below.

>  	list_for_each_entry_safe_from(work, n, &pool->worklist, entry) {
> -		if (get_work_pwq(work) == pwq && assign_work(work, rescuer, &n)) {
> +		if (get_work_pwq(work) != pwq)
> +		       continue;
> +		/*
> +		 * put the cursor, resend mayday for itself and move on to other
> +		 * PWQs when the limit is reached.
> +		 */
> +		if (limited && !list_empty(&pwq->wq->maydays)) {
> +			list_add_tail(&cursor->entry, &work->entry);
> +			raw_spin_lock(&wq_mayday_lock);		/* for wq->maydays */
> +			send_mayday(work);
> +			raw_spin_unlock(&wq_mayday_lock);
> +			return false;

Does it make sense to maintain cursor position across pwqs? Shouldn't it be
reset? Imagine two pwqs' (A, B) work items interleaved:

        A1 B1 A2 B2 A3 B3

1. Two of A's work items are rescued and cursor is inserted before the next
   eligible one:

        B1 B2 A3 B3
              ^

2. Let's say limit is reached and we're moving on to B. Then, the rescuer
   would first run B3. Wouldn't it make more sense to go back to the head of
   the queue and start over so that it can pick up B1 first?

Thanks.

-- 
tejun
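
The scenario in this mail can be replayed with a small userspace model. It
is only a sketch under the premise the question makes, namely a single
cursor position reused when searching for a different PWQ's items (the
reply in this thread explains that the cursor is in fact per PWQ). The
struct toy_work type and both toy_* helpers are hypothetical, and an array
index stands in for the list-node cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item: owning PWQ ('A' or 'B') and a sequence number. */
struct toy_work {
	char owner;
	int seq;
};

/*
 * Remove the first `limit` items owned by `owner` from the list and
 * return the index of the next item of that owner left behind (the
 * "cursor"), or the new list length if none remains.
 */
static size_t toy_rescue(struct toy_work *list, size_t *n, char owner,
			 int limit)
{
	size_t out = 0, cursor = 0;
	int taken = 0, cursor_set = 0;

	assert(list != NULL && n != NULL);
	for (size_t i = 0; i < *n; i++) {
		if (list[i].owner == owner) {
			if (taken < limit) {
				taken++;	/* rescued: drop it from the list */
				continue;
			}
			if (!cursor_set) {
				cursor = out;	/* first eligible item left behind */
				cursor_set = 1;
			}
		}
		list[out++] = list[i];
	}
	*n = out;
	return cursor_set ? cursor : out;
}

/* Index of the first item owned by `owner` at or after `start`, or -1. */
static int toy_find_from(const struct toy_work *list, size_t n,
			 size_t start, char owner)
{
	for (size_t i = start; i < n; i++)
		if (list[i].owner == owner)
			return (int)i;
	return -1;
}
```

Starting from A1 B1 A2 B2 A3 B3 and rescuing two of A's items leaves
B1 B2 A3 B3 with the cursor on A3. Searching for B's items from that cursor
finds B3 first, while searching from the head finds B1.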
Re: [PATCH V3 6/7] workqueue: Limit number of processed works in rescuer per turn
Posted by Lai Jiangshan 1 week, 2 days ago
Hello

On Sat, Nov 22, 2025 at 3:28 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Fri, Nov 21, 2025 at 10:57:19PM +0800, Lai Jiangshan wrote:
> > +static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescuer, bool limited)
>
> I find the organization a bit odd with the expiration detection in the
> caller and the implementation of it piped into this function. Please see
> below.
>
> >       list_for_each_entry_safe_from(work, n, &pool->worklist, entry) {
> > -             if (get_work_pwq(work) == pwq && assign_work(work, rescuer, &n)) {
> > +             if (get_work_pwq(work) != pwq)
> > +                    continue;
> > +             /*
> > +              * put the cursor, resend mayday for itself and move on to other
> > +              * PWQs when the limit is reached.
> > +              */
> > +             if (limited && !list_empty(&pwq->wq->maydays)) {
> > +                     list_add_tail(&cursor->entry, &work->entry);
> > +                     raw_spin_lock(&wq_mayday_lock);         /* for wq->maydays */
> > +                     send_mayday(work);
> > +                     raw_spin_unlock(&wq_mayday_lock);
> > +                     return false;
>
> Does it make sense to maintain cursor position across pwqs? Shouldn't it be
> reset? Imagine two pwqs' (A, B) work items interleaved:
>
>         A1 B1 A2 B2 A3 B3

I might be misunderstanding the setup.

>
> 1. Two of A's work items are rescued and cursor is inserted before the next
>    eligible one:
>
>         B1 B2 A3 B3
>               ^
>
> 2. Let's say limit is reached and we're moving on to B. Then, the rescuer
>    would first run B3. Wouldn't it make more sense to go back to the head of
>    the queue and start over so that it can pick up B1 first?
>

The cursor is per PWQ. When the rescuer comes back to this pool next time,
it can only handle the PWQ belonging to its wq, which is A. It will resume
the search from A3 and process A3, rather than searching from B1 as it
would if the cursor were reset.

The rescuer never moves to B. (I assume "A1 B1 A2 B2 A3 B3" is the worklist
of the pool, which shows that B is a PWQ of the pool.) B is not a PWQ of
this workqueue, because there are never two PWQs of the same wq in the
same pool.

The rescuer will leave this pool and move on to the next PWQ in
pwq->wq->maydays, continuing to help its own wq: it shifts from PWQ to
PWQ, processing a limited number of work items for each PWQ in turn.

Thanks,
Lai
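
The per-PWQ cursor described above can be illustrated with a small
userspace model. This is a sketch under simplifying assumptions, not the
kernel implementation: struct toy_work, struct toy_pwq and
toy_rescue_turn() are hypothetical, an array index replaces the list-node
cursor, finished items are flagged rather than unlinked, and the batch
limit is applied unconditionally instead of only when another PWQ is
waiting in wq->maydays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item on a pool worklist shared by several PWQs. */
struct toy_work {
	char owner;	/* PWQ the item belongs to: 'A' or 'B' */
	int seq;
	bool done;
};

/* Hypothetical PWQ: an id plus its own resume position. */
struct toy_pwq {
	char id;
	size_t cursor;	/* per-PWQ cursor into the pool worklist */
};

/*
 * Rescue up to `batch` items owned by `pwq`, resuming from that PWQ's
 * own cursor.  Items of other PWQs are skipped and never touched.
 * Returns the number of items rescued in this turn.
 */
static int toy_rescue_turn(struct toy_pwq *pwq, struct toy_work *list,
			   size_t n, int batch)
{
	int rescued = 0;
	size_t i;

	assert(batch > 0);
	for (i = pwq->cursor; i < n && rescued < batch; i++) {
		if (list[i].owner != pwq->id || list[i].done)
			continue;
		list[i].done = true;
		rescued++;
	}
	pwq->cursor = i;	/* remember where to resume next time */
	return rescued;
}
```

On the worklist A1 B1 A2 B2 A3 B3 with a batch of two, the first turn for A
rescues A1 and A2. The next turn resumes from A's saved cursor and rescues
A3. B's items are left for B's own workqueue.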
Re: [PATCH V3 6/7] workqueue: Limit number of processed works in rescuer per turn
Posted by Tejun Heo 1 week, 2 days ago
Hello,

On Sat, Nov 22, 2025 at 02:22:12PM +0800, Lai Jiangshan wrote:
> The cursor is per PWQ. When the rescuer comes back to this pool next time,
> it can only handle the PWQ belonging to its wq, which is A, and it will search
> from A3 and process A3 instead of searching from B1 if the cursor is reset.

Oh yeah, you're right. I was thinking that the cursor was shared across pwqs
for some reason.

Thanks.

-- 
tejun