From nobody Tue Dec 23 23:48:27 2025
From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH wq/for-6.9 v5 1/4] workqueue: Link pwq's into wq->pwqs from oldest to newest
Date: Thu, 8 Feb 2024 11:10:11 -0500
Message-Id: <20240208161014.1084943-2-longman@redhat.com>
In-Reply-To: <20240208161014.1084943-1-longman@redhat.com>
References: <20240208161014.1084943-1-longman@redhat.com>

Add a new pwq into the tail of wq->pwqs so that pwq iteration will start
from the oldest pwq to the newest.
This ordering will facilitate the inclusion of ordered workqueues in a
wq_unbound_cpumask update.

Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index cf514ba0dfc3..fa7bd3b34f52 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4804,7 +4804,7 @@ static void link_pwq(struct pool_workqueue *pwq)
 	pwq->work_color = wq->work_color;
 
 	/* link in @pwq */
-	list_add_rcu(&pwq->pwqs_node, &wq->pwqs);
+	list_add_tail_rcu(&pwq->pwqs_node, &wq->pwqs);
 }
 
 /* obtain a pool matching @attr and create a pwq associating the pool and @wq */
-- 
2.39.3

From nobody Tue Dec 23 23:48:27 2025
From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH wq/for-6.9 v6 2/4] workqueue: Enable unbound cpumask update on ordered workqueues
Date: Thu, 8 Feb 2024 14:12:20 -0500
Message-Id: <20240208191220.1094426-1-longman@redhat.com>
In-Reply-To: <20240208161014.1084943-1-longman@redhat.com>
References: <20240208161014.1084943-1-longman@redhat.com>

Ordered workqueues do not currently follow changes made to the
global unbound cpumask because per-pool workqueue changes may break
the ordering guarantee. IOW, a work function in an ordered workqueue
may run on an isolated CPU.

This patch enables ordered workqueues to follow changes made to the
global unbound cpumask by temporarily plugging (suspending) the newly
allocated pool_workqueue so that it does not execute newly queued work
items until the old pwq has been properly drained. For ordered
workqueues, there should only be one pwq that is unplugged; the rest
should be plugged.

This enables ordered workqueues to follow the unbound cpumask changes
like other unbound workqueues at the expense of some delay in execution
of work functions during the transition period.

Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 69 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 10 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fa7bd3b34f52..da124859a691 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -255,6 +255,7 @@ struct pool_workqueue {
 	int			refcnt;		/* L: reference count */
 	int			nr_in_flight[WORK_NR_COLORS];
 						/* L: nr of in_flight works */
+	bool			plugged;	/* L: execution suspended */
 
 	/*
 	 * nr_active management and WORK_STRUCT_INACTIVE:
@@ -1708,6 +1709,9 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
 		goto out;
 	}
 
+	if (unlikely(pwq->plugged))
+		return false;
+
 	/*
 	 * Unbound workqueue uses per-node shared nr_active $nna. If @pwq is
 	 * already waiting on $nna, pwq_dec_nr_active() will maintain the
@@ -1782,6 +1786,43 @@ static bool pwq_activate_first_inactive(struct pool_workqueue *pwq, bool fill)
 	}
 }
 
+/**
+ * unplug_oldest_pwq - restart an oldest plugged pool_workqueue
+ * @wq: workqueue_struct to be restarted
+ *
+ * pwq's are linked into wq->pwqs with the oldest first. For ordered
+ * workqueues, only the oldest pwq is unplugged, the others are plugged to
+ * suspend execution until the oldest one is drained. When this happens, the
+ * next oldest one (first plugged pwq in iteration) will be unplugged to
+ * restart work item execution to ensure proper work item ordering.
+ *
+ *    dfl_pwq --------------+     [P] - plugged
+ *                          |
+ *                          v
+ *    pwqs -> A -> B [P] -> C [P] (newest)
+ *            |    |        |
+ *            1    3        5
+ *            |    |        |
+ *            2    4        6
+ */
+static void unplug_oldest_pwq(struct workqueue_struct *wq)
+{
+	struct pool_workqueue *pwq;
+
+	lockdep_assert_held(&wq->mutex);
+
+	/* Caller should make sure that pwqs isn't empty before calling */
+	pwq = list_first_entry_or_null(&wq->pwqs, struct pool_workqueue,
+				       pwqs_node);
+	raw_spin_lock_irq(&pwq->pool->lock);
+	if (pwq->plugged) {
+		pwq->plugged = false;
+		if (pwq_activate_first_inactive(pwq, true))
+			kick_pool(pwq->pool);
+	}
+	raw_spin_unlock_irq(&pwq->pool->lock);
+}
+
 /**
  * node_activate_pending_pwq - Activate a pending pwq on a wq_node_nr_active
  * @nna: wq_node_nr_active to activate a pending pwq for
@@ -4740,6 +4781,13 @@ static void pwq_release_workfn(struct kthread_work *work)
 		mutex_lock(&wq->mutex);
 		list_del_rcu(&pwq->pwqs_node);
 		is_last = list_empty(&wq->pwqs);
+
+		/*
+		 * For ordered workqueue with a plugged dfl_pwq, restart it now.
+		 */
+		if (!is_last && (wq->flags & __WQ_ORDERED))
+			unplug_oldest_pwq(wq);
+
 		mutex_unlock(&wq->mutex);
 	}
 
@@ -4966,6 +5014,15 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
 	cpumask_copy(new_attrs->__pod_cpumask, new_attrs->cpumask);
 	ctx->attrs = new_attrs;
 
+	/*
+	 * For initialized ordered workqueues, there should only be one pwq
+	 * (dfl_pwq). Set the plugged flag of ctx->dfl_pwq to suspend execution
+	 * of newly queued work items until execution of older work items in
+	 * the old pwq's have completed.
+	 */
+	if ((wq->flags & __WQ_ORDERED) && !list_empty(&wq->pwqs))
+		ctx->dfl_pwq->plugged = true;
+
 	ctx->wq = wq;
 	return ctx;
 
@@ -5006,10 +5063,6 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
 	if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
 		return -EINVAL;
 
-	/* creating multiple pwqs breaks ordering guarantee */
-	if (!list_empty(&wq->pwqs) && WARN_ON(wq->flags & __WQ_ORDERED))
-		return -EINVAL;
-
 	ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
@@ -6489,9 +6542,6 @@ static int workqueue_apply_unbound_cpumask(const cpumask_var_t unbound_cpumask)
 	list_for_each_entry(wq, &workqueues, list) {
 		if (!(wq->flags & WQ_UNBOUND) || (wq->flags & __WQ_DESTROYING))
 			continue;
-		/* creating multiple pwqs breaks ordering guarantee */
-		if (wq->flags & __WQ_ORDERED)
-			continue;
 
 		ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs, unbound_cpumask);
 		if (IS_ERR(ctx)) {
@@ -7006,9 +7056,8 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
 	int ret;
 
 	/*
-	 * Adjusting max_active or creating new pwqs by applying
-	 * attributes breaks ordering guarantee. Disallow exposing ordered
-	 * workqueues.
+	 * Adjusting max_active breaks ordering guarantee. Disallow exposing
+	 * ordered workqueues.
 	 */
 	if (WARN_ON(wq->flags & __WQ_ORDERED))
 		return -EINVAL;
-- 
2.39.3

From nobody Tue Dec 23 23:48:27 2025
From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH wq/for-6.9 v5 2/4] workqueue: Enable unbound cpumask update on ordered workqueues
Date: Thu, 8 Feb 2024 11:10:12 -0500
Message-Id: <20240208161014.1084943-3-longman@redhat.com>
In-Reply-To: <20240208161014.1084943-1-longman@redhat.com>
References: <20240208161014.1084943-1-longman@redhat.com>

Ordered workqueues do not currently follow changes made to the
global unbound cpumask because per-pool workqueue changes may break
the ordering guarantee. IOW, a work function in an ordered workqueue
may run on an isolated CPU.

This patch enables ordered workqueues to follow changes made to the
global unbound cpumask by temporarily plugging (suspending) the newly
allocated pool_workqueue so that it does not execute newly queued work
items until the old pwq has been properly drained. For ordered
workqueues, there should only be one pwq that is unplugged; the rest
should be plugged.

This enables ordered workqueues to follow the unbound cpumask changes
like other unbound workqueues at the expense of some delay in execution
of work functions during the transition period.

Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 72 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 10 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fa7bd3b34f52..e261acf258b8 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -255,6 +255,7 @@ struct pool_workqueue {
 	int			refcnt;		/* L: reference count */
 	int			nr_in_flight[WORK_NR_COLORS];
 						/* L: nr of in_flight works */
+	bool			plugged;	/* L: execution suspended */
 
 	/*
 	 * nr_active management and WORK_STRUCT_INACTIVE:
@@ -1708,6 +1709,9 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
 		goto out;
 	}
 
+	if (unlikely(pwq->plugged))
+		return false;
+
 	/*
 	 * Unbound workqueue uses per-node shared nr_active $nna. If @pwq is
 	 * already waiting on $nna, pwq_dec_nr_active() will maintain the
@@ -1782,6 +1786,46 @@ static bool pwq_activate_first_inactive(struct pool_workqueue *pwq, bool fill)
 	}
 }
 
+/**
+ * unplug_oldest_pwq - restart an oldest plugged pool_workqueue
+ * @wq: workqueue_struct to be restarted
+ *
+ * pwq's are linked into wq->pwqs with the oldest first. For ordered
+ * workqueues, only the oldest pwq is unplugged, the others are plugged to
+ * suspend execution until the oldest one is drained. When this happens, the
+ * next oldest one (first plugged pwq in iteration) will be unplugged to
+ * restart work item execution to ensure proper work item ordering.
+ *
+ *    dfl_pwq --------------+     [P] - plugged
+ *                          |
+ *                          v
+ *    pwqs -> A -> B [P] -> C [P] (newest)
+ *            |    |        |
+ *            1    3        5
+ *            |    |        |
+ *            2    4        6
+ */
+static void unplug_oldest_pwq(struct workqueue_struct *wq)
+{
+	struct pool_workqueue *pwq;
+	unsigned long flags;
+
+	lockdep_assert_held(&wq->mutex);
+
+	pwq = list_first_entry_or_null(&wq->pwqs, struct pool_workqueue,
+				       pwqs_node);
+	if (WARN_ON_ONCE(!pwq))
+		return;
+
+	raw_spin_lock_irqsave(&pwq->pool->lock, flags);
+	if (pwq->plugged) {
+		pwq->plugged = false;
+		if (pwq_activate_first_inactive(pwq, true))
+			kick_pool(pwq->pool);
+	}
+	raw_spin_unlock_irqrestore(&pwq->pool->lock, flags);
+}
+
 /**
  * node_activate_pending_pwq - Activate a pending pwq on a wq_node_nr_active
  * @nna: wq_node_nr_active to activate a pending pwq for
@@ -4740,6 +4784,13 @@ static void pwq_release_workfn(struct kthread_work *work)
 		mutex_lock(&wq->mutex);
 		list_del_rcu(&pwq->pwqs_node);
 		is_last = list_empty(&wq->pwqs);
+
+		/*
+		 * For ordered workqueue with a plugged dfl_pwq, restart it now.
+		 */
+		if (!is_last && (wq->flags & __WQ_ORDERED))
+			unplug_oldest_pwq(wq);
+
 		mutex_unlock(&wq->mutex);
 	}
 
@@ -4966,6 +5017,15 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
 	cpumask_copy(new_attrs->__pod_cpumask, new_attrs->cpumask);
 	ctx->attrs = new_attrs;
 
+	/*
+	 * For initialized ordered workqueues, there is only one pwq (dfl_pwq).
+	 * Set the plugged flag of ctx->dfl_pwq to suspend execution of newly
+	 * queued work items until execution of older work items in the old
+	 * pwq's have completed.
+	 */
+	if (!list_empty(&wq->pwqs) && (wq->flags & __WQ_ORDERED))
+		ctx->dfl_pwq->plugged = true;
+
 	ctx->wq = wq;
 	return ctx;
 
@@ -5006,10 +5066,6 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
 	if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
 		return -EINVAL;
 
-	/* creating multiple pwqs breaks ordering guarantee */
-	if (!list_empty(&wq->pwqs) && WARN_ON(wq->flags & __WQ_ORDERED))
-		return -EINVAL;
-
 	ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
@@ -6489,9 +6545,6 @@ static int workqueue_apply_unbound_cpumask(const cpumask_var_t unbound_cpumask)
 	list_for_each_entry(wq, &workqueues, list) {
 		if (!(wq->flags & WQ_UNBOUND) || (wq->flags & __WQ_DESTROYING))
 			continue;
-		/* creating multiple pwqs breaks ordering guarantee */
-		if (wq->flags & __WQ_ORDERED)
-			continue;
 
 		ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs, unbound_cpumask);
 		if (IS_ERR(ctx)) {
@@ -7006,9 +7059,8 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
 	int ret;
 
 	/*
-	 * Adjusting max_active or creating new pwqs by applying
-	 * attributes breaks ordering guarantee. Disallow exposing ordered
-	 * workqueues.
+	 * Adjusting max_active breaks ordering guarantee. Disallow exposing
+	 * ordered workqueues.
 	 */
 	if (WARN_ON(wq->flags & __WQ_ORDERED))
 		return -EINVAL;
-- 
2.39.3

From nobody Tue Dec 23 23:48:27 2025
From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH wq/for-6.9 v5 3/4] kernel/workqueue: Let rescuers follow unbound wq cpumask changes
Date: Thu, 8 Feb 2024 11:10:13 -0500
Message-Id: <20240208161014.1084943-4-longman@redhat.com>
In-Reply-To: <20240208161014.1084943-1-longman@redhat.com>
References: <20240208161014.1084943-1-longman@redhat.com>

From: Juri Lelli

When workqueue cpumask changes are committed the associated rescuer
(if one exists) affinity is not touched and
this might be a problem down the line for isolated setups.

Make sure rescuers' affinity is updated every time a workqueue cpumask
changes, so that rescuers can't break isolation.

[longman: set_cpus_allowed_ptr() will block until the designated task
 is enqueued on an allowed CPU, no wake_up_process() needed. Also use
 the unbound_effective_cpumask() helper as suggested by Tejun.]

Signed-off-by: Juri Lelli
Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e261acf258b8..8df27c496b63 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5054,6 +5054,11 @@ static void apply_wqattrs_commit(struct apply_wqattrs_ctx *ctx)
 	/* update node_nr_active->max */
 	wq_update_node_max_active(ctx->wq, -1);
 
+	/* rescuer needs to respect wq cpumask changes */
+	if (ctx->wq->rescuer)
+		set_cpus_allowed_ptr(ctx->wq->rescuer->task,
+				     unbound_effective_cpumask(ctx->wq));
+
 	mutex_unlock(&ctx->wq->mutex);
 }
 
-- 
2.39.3

From nobody Tue Dec 23 23:48:27 2025
From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH wq/for-6.9 v5 4/4] workqueue: Bind unbound workqueue rescuer to wq_unbound_cpumask
Date: Thu, 8 Feb 2024 11:10:14 -0500
Message-Id: <20240208161014.1084943-5-longman@redhat.com>
In-Reply-To: <20240208161014.1084943-1-longman@redhat.com>
References: <20240208161014.1084943-1-longman@redhat.com>

Commit 85f0ab43f9de ("kernel/workqueue: Bind rescuer to unbound
cpumask for WQ_UNBOUND") modified init_rescuer() to bind rescuer of
an unbound workqueue to the cpumask in wq->unbound_attrs. However
unbound_attrs->cpumask's of all workqueues are initialized to
cpu_possible_mask and will only be changed if it has the WQ_SYSFS flag
to expose a cpumask sysfs file to be written by users. So this patch
doesn't achieve what it is intended to do.

If an unbound workqueue is created after wq_unbound_cpumask is modified
and there is no more unbound cpumask update after that, the unbound
rescuer will be bound to all CPUs unless the workqueue is created
with the WQ_SYSFS flag and a user explicitly modified its cpumask
sysfs file. Fix this problem by binding directly to wq_unbound_cpumask
in init_rescuer().

Fixes: 85f0ab43f9de ("kernel/workqueue: Bind rescuer to unbound cpumask for WQ_UNBOUND")
Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 8df27c496b63..ca53e1144f0a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5302,7 +5302,7 @@ static int init_rescuer(struct workqueue_struct *wq)
 
 	wq->rescuer = rescuer;
 	if (wq->flags & WQ_UNBOUND)
-		kthread_bind_mask(rescuer->task, wq->unbound_attrs->cpumask);
+		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
 	else
 		kthread_bind_mask(rescuer->task, cpu_possible_mask);
 	wake_up_process(rescuer->task);
-- 
2.39.3
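
A minimal user-space sketch of the plug/unplug scheme from patch 2/4
follows. It is an illustration under simplifying assumptions, not kernel
code: struct toy_pwq, struct toy_wq and the toy_*() helpers are
hypothetical names, there is no locking or concurrency, and each pwq is
a fixed-size FIFO. It only tries to show why linking pwq's oldest first
and keeping all but the oldest plugged preserves the 1..6 execution
order drawn in the unplug_oldest_pwq() comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ITEMS 8
#define TOY_MAX_PWQS  4

/* Hypothetical stand-in for struct pool_workqueue (not the kernel layout). */
struct toy_pwq {
	bool plugged;			/* execution suspended */
	int items[TOY_MAX_ITEMS];	/* queued work item ids, FIFO */
	size_t head, tail;
};

/* Hypothetical stand-in for wq->pwqs; pwqs[first] is the oldest pwq. */
struct toy_wq {
	struct toy_pwq pwqs[TOY_MAX_PWQS];
	size_t first, count;
};

/* Link a new pwq at the tail; plug it if an older pwq still exists. */
struct toy_pwq *toy_add_pwq(struct toy_wq *wq)
{
	struct toy_pwq *pwq;

	assert(wq->first + wq->count < TOY_MAX_PWQS);
	pwq = &wq->pwqs[wq->first + wq->count];
	pwq->plugged = wq->count > 0;
	pwq->head = pwq->tail = 0;
	wq->count++;
	return pwq;
}

/* Append one work item id to a pwq's FIFO. */
void toy_queue(struct toy_pwq *pwq, int id)
{
	assert(pwq->tail < TOY_MAX_ITEMS);
	pwq->items[pwq->tail++] = id;
}

/*
 * Drain pwqs oldest first. Once the oldest pwq is empty it is released and
 * the next oldest one is unplugged. Returns the number of items "executed"
 * and records their ids in @out.
 */
size_t toy_run(struct toy_wq *wq, int *out)
{
	size_t n = 0;

	while (wq->count > 0) {
		struct toy_pwq *pwq = &wq->pwqs[wq->first];

		if (pwq->plugged)	/* a plugged pwq never executes work */
			break;
		while (pwq->head < pwq->tail)
			out[n++] = pwq->items[pwq->head++];

		wq->first++;		/* release the drained pwq */
		wq->count--;
		if (wq->count > 0)	/* unplug the new oldest pwq */
			wq->pwqs[wq->first].plugged = false;
	}
	return n;
}

/* Build the A -> B [P] -> C [P] scenario from the diagram and run it. */
size_t toy_demo(int *out)
{
	struct toy_wq wq = { .first = 0, .count = 0 };
	struct toy_pwq *a = toy_add_pwq(&wq);
	struct toy_pwq *b = toy_add_pwq(&wq);
	struct toy_pwq *c = toy_add_pwq(&wq);

	toy_queue(a, 1); toy_queue(a, 2);
	toy_queue(b, 3); toy_queue(b, 4);
	toy_queue(c, 5); toy_queue(c, 6);
	return toy_run(&wq, out);
}
```

Calling toy_demo() with a buffer of at least six ints fills it with the
ids in queueing order, 1 through 6. Items queued on B or C cannot
overtake those on A because only the oldest pwq is ever unplugged.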