From: Lai Jiangshan
To: linux-kernel@vger.kernel.org
Cc: Tejun Heo, ying chen, Lai Jiangshan, Lai Jiangshan
Subject: [PATCH V3 5/7] workqueue: Process rescuer work items one-by-one using a cursor
Date: Fri, 21 Nov 2025 22:57:18 +0800
Message-Id: <20251121145720.342467-6-jiangshanlai@gmail.com>
In-Reply-To: <20251121145720.342467-1-jiangshanlai@gmail.com>
References: <20251121145720.342467-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Previously, the rescuer scanned for all matching work items at once and
processed them within a single rescuer thread, so one blocking work item
could stall all the others.

Make the rescuer process work items one-by-one instead of slurping all
matches in a single pass. Break the rescuer loop after finding and
processing the first matching work item, then restart the search to pick
up the next. This gives normal worker threads the opportunity to process
the remaining items instead of leaving them waiting on the rescuer's
queue, and prevents a blocking work item from stalling the rest once
memory pressure is relieved.

Introduce a dummy cursor work item to avoid potentially O(N^2) rescans
of the work list. The cursor records the resume position for the next
scan, eliminating redundant traversals.
Cc: ying chen
Reported-by: ying chen
Fixes: e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
Signed-off-by: Lai Jiangshan
---
 kernel/workqueue.c | 56 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 7 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3032235a131e..49dce50ff647 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -286,6 +286,7 @@ struct pool_workqueue {
 	struct list_head	pending_node;	/* LN: node on wq_node_nr_active->pending_pwqs */
 	struct list_head	pwqs_node;	/* WR: node on wq->pwqs */
 	struct list_head	mayday_node;	/* MD: node on wq->maydays */
+	struct work_struct	mayday_cursor;	/* L: cursor on pool->worklist */

 	u64			stats[PWQ_NR_STATS];

@@ -1126,6 +1127,12 @@ static struct worker *find_worker_executing_work(struct worker_pool *pool,
 	return NULL;
 }

+static void mayday_cursor_func(struct work_struct *work)
+{
+	/* should not be processed, only for marking position */
+	BUG();
+}
+
 /**
  * move_linked_works - move linked works to a list
  * @work: start of series of works to be scheduled
@@ -1188,6 +1195,16 @@ static bool assign_work(struct work_struct *work, struct worker *worker,

 	lockdep_assert_held(&pool->lock);

+	/* The cursor work should not be processed */
+	if (unlikely(work->func == mayday_cursor_func)) {
+		/* only worker_thread() can possibly take this branch */
+		WARN_ON_ONCE(worker->rescue_wq);
+		if (nextp)
+			*nextp = list_next_entry(work, entry);
+		list_del_init(&work->entry);
+		return false;
+	}
+
 	/*
 	 * A single work shouldn't be executed concurrently by multiple workers.
	 * __queue_work() ensures that @work doesn't jump to a different pool
@@ -3442,22 +3459,33 @@ static int worker_thread(void *__worker)
 static bool assign_rescuer_work(struct pool_workqueue *pwq, struct worker *rescuer)
 {
 	struct worker_pool *pool = pwq->pool;
+	struct work_struct *cursor = &pwq->mayday_cursor;
 	struct work_struct *work, *n;

+	/* pick where to start the search */
+	if (list_empty(&cursor->entry)) {
+		work = list_first_entry(&pool->worklist, struct work_struct, entry);
+	} else {
+		work = list_next_entry(cursor, entry);
+		/* it will get a new position or no longer be needed */
+		list_del_init(&cursor->entry);
+	}
+
 	/* need rescue? */
 	if (!pwq->nr_active || !need_to_create_worker(pool))
 		return false;

-	/*
-	 * Slurp in all works issued via this workqueue and
-	 * process'em.
-	 */
-	list_for_each_entry_safe(work, n, &pool->worklist, entry) {
-		if (get_work_pwq(work) == pwq && assign_work(work, rescuer, &n))
+	/* try to assign one work item to rescue */
+	list_for_each_entry_safe_from(work, n, &pool->worklist, entry) {
+		if (get_work_pwq(work) == pwq && assign_work(work, rescuer, &n)) {
 			pwq->stats[PWQ_STAT_RESCUED]++;
+			/* place the cursor for the next search */
+			list_add_tail(&cursor->entry, &n->entry);
+			return true;
+		}
 	}

-	return !list_empty(&rescuer->scheduled);
+	return false;
 }

 /**
@@ -5141,6 +5169,20 @@ static void init_pwq(struct pool_workqueue *pwq, struct workqueue_struct *wq,
 	INIT_LIST_HEAD(&pwq->pwqs_node);
 	INIT_LIST_HEAD(&pwq->mayday_node);
 	kthread_init_work(&pwq->release_work, pwq_release_workfn);
+
+	/*
+	 * Set up the dummy cursor work with a valid function and a valid
+	 * get_work_pwq().
+	 *
+	 * The cursor work may only ever sit on pwq->pool->worklist; it must
+	 * never be queued, processed, flushed, cancelled or even examined
+	 * as a work item.
+	 *
+	 * WORK_STRUCT_PENDING and WORK_STRUCT_INACTIVE just make it less
+	 * surprising to kernel debugging tools and reviewers.
+	 */
+	INIT_WORK(&pwq->mayday_cursor, mayday_cursor_func);
+	atomic_long_set(&pwq->mayday_cursor.data, (unsigned long)pwq |
+			WORK_STRUCT_PENDING | WORK_STRUCT_PWQ | WORK_STRUCT_INACTIVE);
 }

 /* sync @pwq with the current state of its associated wq and link it */
-- 
2.19.1.6.gb485710b