From nobody Tue Sep 16 08:46:14 2025
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Zqiang, "Paul E. McKenney"
Subject: [PATCH rcu 1/6] rcu-tasks: Use accurate runstart time for RCU Tasks boot-time testing
Date: Wed, 4 Jan 2023 16:44:49 -0800
Message-Id: <20230105004501.1771332-1-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

From: Zqiang

Currently, test_rcu_tasks_callback() reads from the jiffies counter only
once, when rcu_tasks_initiate_self_tests() is invoked. This introduces
inaccuracies because of the latencies induced by the intervening
synchronize_rcu_tasks*() invocations. This commit therefore re-reads the
jiffies counter at the beginning of each test, using that value as the
test's runstart time and thus avoiding penalizing later tests for the
latencies induced by earlier tests.

Signed-off-by: Zqiang
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tasks.h | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index fe9840d90e960..c418aa1c038a9 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1815,23 +1815,21 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
 
 static void rcu_tasks_initiate_self_tests(void)
 {
-	unsigned long j = jiffies;
-
 	pr_info("Running RCU-tasks wait API self tests\n");
 #ifdef CONFIG_TASKS_RCU
-	tests[0].runstart = j;
+	tests[0].runstart = jiffies;
 	synchronize_rcu_tasks();
 	call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback);
 #endif
 
 #ifdef CONFIG_TASKS_RUDE_RCU
-	tests[1].runstart = j;
+	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_rude();
 	call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
 #endif
 
 #ifdef CONFIG_TASKS_TRACE_RCU
-	tests[2].runstart = j;
+	tests[2].runstart = jiffies;
 	synchronize_rcu_tasks_trace();
 	call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
 #endif
-- 
2.31.1.189.g2e36527f23
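The measurement error this patch removes can be illustrated with a minimal user-space sketch (not kernel code; `fake_jiffies`, `run_test()`, and the helper names are invented for this example). With one shared runstart, the second test is also charged the first test's grace-period latency; re-reading the counter per test fixes that.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies;

/* Each "test" completes after 'latency' ticks; return its measured duration. */
static unsigned long run_test(unsigned long runstart, unsigned long latency)
{
	fake_jiffies += latency;
	return fake_jiffies - runstart;
}

/* Duration charged to the second of two tests when both share one runstart
 * (the old behavior, with a single read of the counter). */
static unsigned long second_test_shared_start(void)
{
	unsigned long j = fake_jiffies;

	(void)run_test(j, 100);
	return run_test(j, 100);	/* includes the first test's latency */
}

/* Duration charged to the second test when runstart is re-read per test
 * (the patched behavior). */
static unsigned long second_test_fresh_start(void)
{
	(void)run_test(fake_jiffies, 100);
	return run_test(fake_jiffies, 100);
}
```

With equal 100-tick latencies, the shared-start variant reports 200 ticks for the second test, while the fresh-start variant reports the accurate 100.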
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney"
Subject: [PATCH rcu 1/7] torture: Seed torture_random_state on CPU
Date: Wed, 4 Jan 2023 16:44:50 -0800
Message-Id: <20230105004501.1771332-2-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

The DEFINE_TORTURE_RANDOM_PERCPU() macro defines per-CPU random-number
generators for torture testing, but the seeds for each CPU's instance
will be identical if they are first used at the same time. This commit
therefore adds the CPU number to the mix when reseeding.

Signed-off-by: Paul E. McKenney
---
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/torture.c b/kernel/torture.c
index 789aeb0e1159c..29afc62f2bfec 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -450,7 +450,7 @@
 unsigned long torture_random(struct torture_random_state *trsp)
 {
 	if (--trsp->trs_count < 0) {
-		trsp->trs_state += (unsigned long)local_clock();
+		trsp->trs_state += (unsigned long)local_clock() + raw_smp_processor_id();
 		trsp->trs_count = TORTURE_RANDOM_REFRESH;
 	}
 	trsp->trs_state = trsp->trs_state * TORTURE_RANDOM_MULT +
-- 
2.31.1.189.g2e36527f23
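The seeding problem can be demonstrated with a small user-space model of the linear-congruential generator (a sketch only: the `SIM_*` constants and `sim_*` names are placeholders, not the kernel's TORTURE_RANDOM_* values). Two instances reseeded from the same clock reading produce identical streams unless a per-instance id, playing the role of raw_smp_processor_id(), is mixed in.

```c
#include <assert.h>

/* Placeholder LCG constants; the kernel uses its own TORTURE_RANDOM_* values. */
#define SIM_RANDOM_MULT	6364136223846793005UL
#define SIM_RANDOM_ADD	1442695040888963407UL

struct sim_random_state {
	unsigned long state;
};

/* Reseed; 'id' plays the role of raw_smp_processor_id() in the patch. */
static void sim_reseed(struct sim_random_state *s, unsigned long clock, unsigned long id)
{
	s->state += clock + id;
}

/* One LCG step, as in torture_random(). */
static unsigned long sim_random(struct sim_random_state *s)
{
	s->state = s->state * SIM_RANDOM_MULT + SIM_RANDOM_ADD;
	return s->state;
}
```

Seeding two generators with the same clock value yields the same first output; adding a distinct id makes the outputs differ from the very first draw.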
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Frederic Weisbecker, Boqun Feng, Neeraj Upadhyay, "Paul E. McKenney", Oleg Nesterov, Lai Jiangshan, "Eric W. Biederman"
Subject: [PATCH rcu 2/6] rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose
Date: Wed, 4 Jan 2023 16:44:51 -0800
Message-Id: <20230105004501.1771332-3-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

From: Frederic Weisbecker

Improve the comments so that future readers need not dig through the
depths of git blame to avoid missing a subtle aspect of how RCU Tasks
deals with exiting tasks.

Suggested-by: Boqun Feng
Suggested-by: Neeraj Upadhyay
Suggested-by: Paul E. McKenney
Cc: Oleg Nesterov
Cc: Lai Jiangshan
Cc: Eric W. Biederman
Signed-off-by: Frederic Weisbecker
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tasks.h | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index c418aa1c038a9..50d4c0ec7a89f 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -827,11 +827,21 @@ static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
 static void rcu_tasks_postscan(struct list_head *hop)
 {
 	/*
-	 * Wait for tasks that are in the process of exiting. This
-	 * does only part of the job, ensuring that all tasks that were
-	 * previously exiting reach the point where they have disabled
-	 * preemption, allowing the later synchronize_rcu() to finish
-	 * the job.
+	 * Exiting tasks may escape the tasklist scan. Those are vulnerable
+	 * until their final schedule() with TASK_DEAD state. To cope with
+	 * this, divide the fragile exit path part in two intersecting
+	 * read side critical sections:
+	 *
+	 * 1) An _SRCU_ read side starting before calling exit_notify(),
+	 *    which may remove the task from the tasklist, and ending after
+	 *    the final preempt_disable() call in do_exit().
+	 *
+	 * 2) An _RCU_ read side starting with the final preempt_disable()
+	 *    call in do_exit() and ending with the final call to schedule()
+	 *    with TASK_DEAD state.
+	 *
+	 * This handles the part 1). And postgp will handle part 2) with a
+	 * call to synchronize_rcu().
 	 */
 	synchronize_srcu(&tasks_rcu_exit_srcu);
 }
@@ -898,7 +908,10 @@ static void rcu_tasks_postgp(struct rcu_tasks *rtp)
 	 *
 	 * In addition, this synchronize_rcu() waits for exiting tasks
 	 * to complete their final preempt_disable() region of execution,
-	 * cleaning up after the synchronize_srcu() above.
+	 * cleaning up after synchronize_srcu(&tasks_rcu_exit_srcu),
+	 * enforcing the whole region before tasklist removal until
+	 * the final schedule() with TASK_DEAD state to be an RCU TASKS
+	 * read side critical section.
 	 */
 	synchronize_rcu();
 }
@@ -988,7 +1001,11 @@ void show_rcu_tasks_classic_gp_kthread(void)
 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
 #endif // !defined(CONFIG_TINY_RCU)
 
-/* Do the srcu_read_lock() for the above synchronize_srcu(). */
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
 void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
 	preempt_disable();
@@ -996,7 +1013,11 @@ void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 	preempt_enable();
 }
 
-/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
 void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;
-- 
2.31.1.189.g2e36527f23
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney"
Subject: [PATCH rcu 2/7] refscale: Provide for initialization failure
Date: Wed, 4 Jan 2023 16:44:52 -0800
Message-Id: <20230105004501.1771332-4-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

Current tests all have init() functions that are guaranteed to succeed.
But upcoming tests will need to allocate memory, thus possibly failing.
This commit therefore handles init() function failure.

Signed-off-by: Paul E. McKenney
---
 kernel/rcu/refscale.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 435c884c02b5c..7f12168627a1f 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -124,7 +124,7 @@ static int exp_idx;
 
 // Operations vector for selecting different types of tests.
 struct ref_scale_ops {
-	void (*init)(void);
+	bool (*init)(void);
 	void (*cleanup)(void);
 	void (*readsection)(const int nloops);
 	void (*delaysection)(const int nloops, const int udl, const int ndl);
@@ -162,8 +162,9 @@ static void ref_rcu_delay_section(const int nloops, const int udl, const int ndl)
 	}
 }
 
-static void rcu_sync_scale_init(void)
+static bool rcu_sync_scale_init(void)
 {
+	return true;
 }
 
 static struct ref_scale_ops rcu_ops = {
@@ -315,9 +316,10 @@ static struct ref_scale_ops refcnt_ops = {
 // Definitions for rwlock
 static rwlock_t test_rwlock;
 
-static void ref_rwlock_init(void)
+static bool ref_rwlock_init(void)
 {
 	rwlock_init(&test_rwlock);
+	return true;
 }
 
 static void ref_rwlock_section(const int nloops)
@@ -351,9 +353,10 @@ static struct ref_scale_ops rwlock_ops = {
 // Definitions for rwsem
 static struct rw_semaphore test_rwsem;
 
-static void ref_rwsem_init(void)
+static bool ref_rwsem_init(void)
 {
 	init_rwsem(&test_rwsem);
+	return true;
 }
 
 static void ref_rwsem_section(const int nloops)
@@ -833,7 +836,10 @@ ref_scale_init(void)
 		goto unwind;
 	}
 	if (cur_ops->init)
-		cur_ops->init();
+		if (!cur_ops->init()) {
+			firsterr = -EUCLEAN;
+			goto unwind;
+		}
 
 	ref_scale_print_module_parms(cur_ops, "Start of test");
 
-- 
2.31.1.189.g2e36527f23
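The ops-vector pattern with a fallible init() can be sketched in a few lines of user-space C (the `demo_*` names are invented for this example; the only detail taken from the patch is that init() now returns bool and failure maps to -EUCLEAN, whose errno value is 117 on Linux):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_EUCLEAN 117	/* errno value of EUCLEAN on Linux */

/* Simplified stand-in for struct ref_scale_ops with a fallible init(). */
struct demo_ops {
	bool (*init)(void);	/* was void (*init)(void) before the patch */
	const char *name;
};

static bool demo_init_ok(void)
{
	return true;
}

static bool demo_init_oom(void)
{
	return false;	/* simulates a failed memory allocation */
}

/* Mirrors the patched ref_scale_init() flow: propagate init() failure. */
static int demo_scale_init(const struct demo_ops *ops)
{
	if (ops->init && !ops->init())
		return -DEMO_EUCLEAN;
	return 0;
}

static const struct demo_ops good_ops = { .init = demo_init_ok,  .name = "good" };
static const struct demo_ops bad_ops  = { .init = demo_init_oom, .name = "bad" };
```

Returning bool keeps the ops vector uniform: infallible tests simply return true, while allocating tests can report failure without a separate error path in each init() implementation.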
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, Frederic Weisbecker, Boqun Feng, "Paul E. McKenney", Neeraj Upadhyay, Lai Jiangshan
Subject: [PATCH rcu 3/6] rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls
Date: Wed, 4 Jan 2023 16:44:53 -0800
Message-Id: <20230105004501.1771332-5-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

From: Frederic Weisbecker

Ever since the following commit:

  5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()")

SRCU no longer relies on preemption being disabled in order to modify
the per-CPU counter, and even before that, the disabling was done from
within the API itself. After checking further, it therefore appears to
be safe to remove the preemption disablement around the
__srcu_read_[un]lock() calls in exit_tasks_rcu_start() and
exit_tasks_rcu_finish().

Suggested-by: Boqun Feng
Suggested-by: Paul E. McKenney
Suggested-by: Neeraj Upadhyay
Cc: Lai Jiangshan
Signed-off-by: Frederic Weisbecker
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tasks.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 50d4c0ec7a89f..fbaed2637a7ff 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1008,9 +1008,7 @@ EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
  */
 void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
-	preempt_disable();
 	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
-	preempt_enable();
 }
 
 /*
@@ -1022,9 +1020,7 @@ void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;
 
-	preempt_disable();
 	__srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx);
-	preempt_enable();
 	exit_tasks_rcu_finish_trace(t);
 }
 
-- 
2.31.1.189.g2e36527f23
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney"
Subject: [PATCH rcu 3/7] refscale: Add tests using SLAB_TYPESAFE_BY_RCU
Date: Wed, 4 Jan 2023 16:44:54 -0800
Message-Id: <20230105004501.1771332-6-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>
References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

This commit adds three read-side-only tests of three use cases featuring
SLAB_TYPESAFE_BY_RCU: one using per-object reference counting, one using
per-object locking, and one using per-object sequence locking.

[ paulmck: Apply feedback from kernel test robot. ]

Signed-off-by: Paul E. McKenney
---
 kernel/rcu/refscale.c | 236 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 236 insertions(+)

diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 7f12168627a1f..abeeeadb83b59 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -76,6 +76,8 @@ torture_param(int, verbose_batched, 0, "Batch verbose debugging printk()s");
 // Wait until there are multiple CPUs before starting test.
 torture_param(int, holdoff, IS_BUILTIN(CONFIG_RCU_REF_SCALE_TEST) ? 10 : 0,
 	      "Holdoff time before test start (s)");
+// Number of typesafe_lookup structures, that is, the degree of concurrency.
+torture_param(long, lookup_instances, 0, "Number of typesafe_lookup structures.");
 // Number of loops per experiment, all readers execute operations concurrently.
 torture_param(long, loops, 10000, "Number of loops per experiment.");
 // Number of readers, with -1 defaulting to about 75% of the CPUs.
@@ -526,6 +528,239 @@ static struct ref_scale_ops clock_ops = {
 	.name		= "clock"
 };
 
+////////////////////////////////////////////////////////////////////////
+//
+// Methods leveraging SLAB_TYPESAFE_BY_RCU.
+//
+
+// Item to look up in a typesafe manner.  Array of pointers to these.
+struct refscale_typesafe {
+	atomic_t rts_refctr;  // Used by all flavors
+	spinlock_t rts_lock;
+	seqlock_t rts_seqlock;
+	unsigned int a;
+	unsigned int b;
+};
+
+static struct kmem_cache *typesafe_kmem_cachep;
+static struct refscale_typesafe **rtsarray;
+static long rtsarray_size;
+static DEFINE_TORTURE_RANDOM_PERCPU(refscale_rand);
+static bool (*rts_acquire)(struct refscale_typesafe *rtsp, unsigned int *start);
+static bool (*rts_release)(struct refscale_typesafe *rtsp, unsigned int start);
+
+// Conditionally acquire an explicit in-structure reference count.
+static bool typesafe_ref_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	return atomic_inc_not_zero(&rtsp->rts_refctr);
+}
+
+// Unconditionally release an explicit in-structure reference count.
+static bool typesafe_ref_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	if (!atomic_dec_return(&rtsp->rts_refctr)) {
+		WRITE_ONCE(rtsp->a, rtsp->a + 1);
+		kmem_cache_free(typesafe_kmem_cachep, rtsp);
+	}
+	return true;
+}
+
+// Unconditionally acquire an explicit in-structure spinlock.
+static bool typesafe_lock_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	spin_lock(&rtsp->rts_lock);
+	return true;
+}
+
+// Unconditionally release an explicit in-structure spinlock.
+static bool typesafe_lock_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	spin_unlock(&rtsp->rts_lock);
+	return true;
+}
+
+// Unconditionally acquire an explicit in-structure sequence lock.
+static bool typesafe_seqlock_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	*start = read_seqbegin(&rtsp->rts_seqlock);
+	return true;
+}
+
+// Conditionally release an explicit in-structure sequence lock.  Return
+// true if this release was successful, that is, if no retry is required.
+static bool typesafe_seqlock_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	return !read_seqretry(&rtsp->rts_seqlock, start);
+}
+
+// Do a read-side critical section with the specified delay in
+// microseconds and nanoseconds inserted so as to increase probability
+// of failure.
+static void typesafe_delay_section(const int nloops, const int udl, const int ndl)
+{
+	unsigned int a;
+	unsigned int b;
+	int i;
+	long idx;
+	struct refscale_typesafe *rtsp;
+	unsigned int start;
+
+	for (i = nloops; i >= 0; i--) {
+		preempt_disable();
+		idx = torture_random(this_cpu_ptr(&refscale_rand)) % rtsarray_size;
+		preempt_enable();
+retry:
+		rcu_read_lock();
+		rtsp = rcu_dereference(rtsarray[idx]);
+		a = READ_ONCE(rtsp->a);
+		if (!rts_acquire(rtsp, &start)) {
+			rcu_read_unlock();
+			goto retry;
+		}
+		if (a != READ_ONCE(rtsp->a)) {
+			(void)rts_release(rtsp, start);
+			rcu_read_unlock();
+			goto retry;
+		}
+		un_delay(udl, ndl);
+		// Remember, seqlock read-side release can fail.
+		if (!rts_release(rtsp, start)) {
+			rcu_read_unlock();
+			goto retry;
+		}
+		b = READ_ONCE(rtsp->a);
+		WARN_ONCE(a != b, "Re-read of ->a changed from %u to %u.\n", a, b);
+		b = rtsp->b;
+		rcu_read_unlock();
+		WARN_ON_ONCE(a * a != b);
+	}
+}
+
+// Because the acquisition and release methods are expensive, there
+// is no point in optimizing away the un_delay() function's two checks.
+// Thus simply define typesafe_read_section() as a simple wrapper around
+// typesafe_delay_section().
+static void typesafe_read_section(const int nloops)
+{
+	typesafe_delay_section(nloops, 0, 0);
+}
+
+// Allocate and initialize one refscale_typesafe structure.
+static struct refscale_typesafe *typesafe_alloc_one(void)
+{
+	struct refscale_typesafe *rtsp;
+
+	rtsp = kmem_cache_alloc(typesafe_kmem_cachep, GFP_KERNEL);
+	if (!rtsp)
+		return NULL;
+	atomic_set(&rtsp->rts_refctr, 1);
+	WRITE_ONCE(rtsp->a, rtsp->a + 1);
+	WRITE_ONCE(rtsp->b, rtsp->a * rtsp->a);
+	return rtsp;
+}
+
+// Slab-allocator constructor for refscale_typesafe structures created
+// out of a new slab of system memory.
+static void refscale_typesafe_ctor(void *rtsp_in)
+{
+	struct refscale_typesafe *rtsp = rtsp_in;
+
+	spin_lock_init(&rtsp->rts_lock);
+	seqlock_init(&rtsp->rts_seqlock);
+	preempt_disable();
+	rtsp->a = torture_random(this_cpu_ptr(&refscale_rand));
+	preempt_enable();
+}
+
+static struct ref_scale_ops typesafe_ref_ops;
+static struct ref_scale_ops typesafe_lock_ops;
+static struct ref_scale_ops typesafe_seqlock_ops;
+
+// Initialize for a typesafe test.
+static bool typesafe_init(void)
+{
+	long idx;
+	long si = lookup_instances;
+
+	typesafe_kmem_cachep = kmem_cache_create("refscale_typesafe",
+						 sizeof(struct refscale_typesafe), sizeof(void *),
+						 SLAB_TYPESAFE_BY_RCU, refscale_typesafe_ctor);
+	if (!typesafe_kmem_cachep)
+		return false;
+	if (si < 0)
+		si = -si * nr_cpu_ids;
+	else if (si == 0)
+		si = nr_cpu_ids;
+	rtsarray_size = si;
+	rtsarray = kcalloc(si, sizeof(*rtsarray), GFP_KERNEL);
+	if (!rtsarray)
+		return false;
+	for (idx = 0; idx < rtsarray_size; idx++) {
+		rtsarray[idx] = typesafe_alloc_one();
+		if (!rtsarray[idx])
+			return false;
+	}
+	if (cur_ops == &typesafe_ref_ops) {
+		rts_acquire = typesafe_ref_acquire;
+		rts_release = typesafe_ref_release;
+	} else if (cur_ops == &typesafe_lock_ops) {
+		rts_acquire = typesafe_lock_acquire;
+		rts_release = typesafe_lock_release;
+	} else if (cur_ops == &typesafe_seqlock_ops) {
+		rts_acquire = typesafe_seqlock_acquire;
+		rts_release = typesafe_seqlock_release;
+	} else {
+		WARN_ON_ONCE(1);
+		return false;
+	}
+	return true;
+}
+
+// Clean up after a typesafe test.
+static void typesafe_cleanup(void)
+{
+	long idx;
+
+	if (rtsarray) {
+		for (idx = 0; idx < rtsarray_size; idx++)
+			kmem_cache_free(typesafe_kmem_cachep, rtsarray[idx]);
+		kfree(rtsarray);
+		rtsarray = NULL;
+		rtsarray_size = 0;
+	}
+	if (typesafe_kmem_cachep) {
+		kmem_cache_destroy(typesafe_kmem_cachep);
+		typesafe_kmem_cachep = NULL;
+	}
+	rts_acquire = NULL;
+	rts_release = NULL;
+}
+
+// The typesafe_init() function distinguishes these structures by address.
+static struct ref_scale_ops typesafe_ref_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_ref"
+};
+
+static struct ref_scale_ops typesafe_lock_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_lock"
+};
+
+static struct ref_scale_ops typesafe_seqlock_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_seqlock"
+};
+
 static void rcu_scale_one_reader(void)
 {
 	if (readdelay <= 0)
@@ -815,6 +1050,7 @@ ref_scale_init(void)
 	static struct ref_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
 		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops,
+		&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
 	};
 
 	if (!torture_init_begin(scale_type, verbose))
-- 
2.31.1.189.g2e36527f23
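The core read-side idea in typesafe_delay_section(), snapshot the identity field ->a, acquire a reference, re-validate ->a, and retry if the object was recycled in between, can be modeled in user-space C. This is a sketch only: the `sim_*` names are invented, and recycling is simulated by a flag rather than driven by a real SLAB_TYPESAFE_BY_RCU cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Object with the same a/b invariant as refscale_typesafe: b == a * a. */
struct sim_typesafe {
	unsigned int a;
	unsigned int b;
};

static int sim_retries;

/* Simulated acquire hook: when *recycle is set, pretend the object was
 * freed and reallocated (its identity field ->a changes) before we got
 * the reference, which is exactly the window that SLAB_TYPESAFE_BY_RCU
 * leaves open. */
static bool sim_acquire(struct sim_typesafe *obj, bool *recycle)
{
	if (*recycle) {
		obj->a++;
		obj->b = obj->a * obj->a;
		*recycle = false;
	}
	return true;
}

/* Mirrors typesafe_delay_section()'s snapshot/acquire/re-check/retry loop. */
static unsigned int sim_typesafe_read(struct sim_typesafe *obj, bool recycle)
{
	unsigned int a;

	for (;;) {
		a = obj->a;			/* snapshot identity */
		if (!sim_acquire(obj, &recycle))
			continue;
		if (a != obj->a) {		/* recycled under us: retry */
			sim_retries++;
			continue;
		}
		assert(obj->b == a * a);	/* invariant holds once stable */
		return a;
	}
}
```

When no recycling happens, the first pass succeeds; when the object is recycled between snapshot and acquisition, the identity re-check fails once, the loop retries, and the second pass observes a consistent a/b pair.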
via listexpand id S229733AbjAEAts (ORCPT ); Wed, 4 Jan 2023 19:49:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229581AbjAEAsM (ORCPT ); Wed, 4 Jan 2023 19:48:12 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 953394731B; Wed, 4 Jan 2023 16:45:09 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1816FB81987; Thu, 5 Jan 2023 00:45:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 83ED2C433A4; Thu, 5 Jan 2023 00:45:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672879503; bh=O1cvLXslwasqBFvrGBgEFY2X2iW5lbp1qzrBi1HIquQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XoBSKmKMCpBx6dipT4sHYSHxYlTMCXPOOKc9PNEE9r+T5VtnYkWvPE5uqmXDcUdLy GiwF+7wtiCbhh2uCMOIFdBV1ZvUKuHjxQ0BMokq6L2vPTm36Xuhq2KdLafaHbIrVS9 EGpXoJ82b0K3T8QT9GVqAY2wQPdxYneOwP/gCupvvQkx/15+fMPx8LDCP43qvp1RBs nw45YtXTYbTLxCPWBpMgunw6wOsMCxOk8KmlVdA1T9Tw+8/6/Lvzbq7gdNFzv5UfjX lUAcoP3JjZqBD8ZDzMH7Y9hWLzkttUAqG51nn/WamxGmAOyHzMvNGbfoYeDzDAlwww mpxubPcIpu0+g== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id E4E175C1AE0; Wed, 4 Jan 2023 16:45:02 -0800 (PST) From: "Paul E. McKenney" To: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)" , "Paul E . 
McKenney" Subject: [PATCH rcu 4/7] locktorture: Allow non-rtmutex lock types to be boosted Date: Wed, 4 Jan 2023 16:44:55 -0800 Message-Id: <20230105004501.1771332-7-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1> References: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: "Joel Fernandes (Google)" Currently RT boosting is only done for rtmutex_lock, however with proxy execution, we also have the mutex_lock participating in priorities. To exercise the testing better, add RT boosting to other lock testing types as well, using a new knob (rt_boost). Tested with boot parameters: locktorture.torture_type=3Dmutex_lock locktorture.onoff_interval=3D1 locktorture.nwriters_stress=3D8 locktorture.stutter=3D0 locktorture.rt_boost=3D1 locktorture.rt_boost_factor=3D1 locktorture.nlocks=3D3 Signed-off-by: Joel Fernandes (Google) Signed-off-by: Paul E. McKenney --- kernel/locking/locktorture.c | 99 ++++++++++++++++++++---------------- 1 file changed, 56 insertions(+), 43 deletions(-) diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 9c2fb613a55d2..e2271e8fc3027 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -46,6 +46,8 @@ torture_param(int, shutdown_secs, 0, "Shutdown time (j), = <=3D zero to disable."); torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s"); torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=3Ddi= sable"); +torture_param(int, rt_boost, 2, + "Do periodic rt-boost. 
0=3DDisable, 1=3DOnly for rt_mutex, 2=3DFor all l= ock types."); torture_param(int, verbose, 1, "Enable verbose debugging printk()s"); =20 @@ -127,15 +129,49 @@ static void torture_lock_busted_write_unlock(int tid = __maybe_unused) /* BUGGY, do not use in real life!!! */ } =20 -static void torture_boost_dummy(struct torture_random_state *trsp) +static void __torture_rt_boost(struct torture_random_state *trsp) { - /* Only rtmutexes care about priority */ + const unsigned int factor =3D 50000; /* yes, quite arbitrary */ + + if (!rt_task(current)) { + /* + * Boost priority once every ~50k operations. When the + * task tries to take the lock, the rtmutex it will account + * for the new priority, and do any corresponding pi-dance. + */ + if (trsp && !(torture_random(trsp) % + (cxt.nrealwriters_stress * factor))) { + sched_set_fifo(current); + } else /* common case, do nothing */ + return; + } else { + /* + * The task will remain boosted for another ~500k operations, + * then restored back to its original prio, and so forth. + * + * When @trsp is nil, we want to force-reset the task for + * stopping the kthread. 
+		 */
+		if (!trsp || !(torture_random(trsp) %
+			       (cxt.nrealwriters_stress * factor * 2))) {
+			sched_set_normal(current, 0);
+		} else /* common case, do nothing */
+			return;
+	}
+}
+
+static void torture_rt_boost(struct torture_random_state *trsp)
+{
+	if (rt_boost != 2)
+		return;
+
+	__torture_rt_boost(trsp);
+}
 
 static struct lock_torture_ops lock_busted_ops = {
 	.writelock	= torture_lock_busted_write_lock,
 	.write_delay	= torture_lock_busted_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_lock_busted_write_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -179,7 +215,7 @@ __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_ops = {
 	.writelock	= torture_spin_lock_write_lock,
 	.write_delay	= torture_spin_lock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_spin_lock_write_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -206,7 +242,7 @@ __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_irq_ops = {
 	.writelock	= torture_spin_lock_write_lock_irq,
 	.write_delay	= torture_spin_lock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_lock_spin_write_unlock_irq,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -275,7 +311,7 @@ __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_ops = {
 	.writelock	= torture_rwlock_write_lock,
 	.write_delay	= torture_rwlock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwlock_write_unlock,
 	.readlock	= torture_rwlock_read_lock,
 	.read_delay	= torture_rwlock_read_delay,
@@ -318,7 +354,7 @@ __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_irq_ops = {
 	.writelock	= torture_rwlock_write_lock_irq,
 	.write_delay	= torture_rwlock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwlock_write_unlock_irq,
 	.readlock	= torture_rwlock_read_lock_irq,
 	.read_delay	= torture_rwlock_read_delay,
@@ -358,7 +394,7 @@ __releases(torture_mutex)
 static struct lock_torture_ops mutex_lock_ops = {
 	.writelock	= torture_mutex_lock,
 	.write_delay	= torture_mutex_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_mutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -456,7 +492,7 @@ static struct lock_torture_ops ww_mutex_lock_ops = {
 	.exit		= torture_ww_mutex_exit,
 	.writelock	= torture_ww_mutex_lock,
 	.write_delay	= torture_mutex_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_ww_mutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -474,37 +510,6 @@ __acquires(torture_rtmutex)
 	return 0;
 }
 
-static void torture_rtmutex_boost(struct torture_random_state *trsp)
-{
-	const unsigned int factor = 50000;	/* yes, quite arbitrary */
-
-	if (!rt_task(current)) {
-		/*
-		 * Boost priority once every ~50k operations. When the
-		 * task tries to take the lock, the rtmutex it will account
-		 * for the new priority, and do any corresponding pi-dance.
-		 */
-		if (trsp && !(torture_random(trsp) %
-		    (cxt.nrealwriters_stress * factor))) {
-			sched_set_fifo(current);
-		} else /* common case, do nothing */
-			return;
-	} else {
-		/*
-		 * The task will remain boosted for another ~500k operations,
-		 * then restored back to its original prio, and so forth.
-		 *
-		 * When @trsp is nil, we want to force-reset the task for
-		 * stopping the kthread.
-		 */
-		if (!trsp || !(torture_random(trsp) %
-		    (cxt.nrealwriters_stress * factor * 2))) {
-			sched_set_normal(current, 0);
-		} else /* common case, do nothing */
-			return;
-	}
-}
-
 static void torture_rtmutex_delay(struct torture_random_state *trsp)
 {
 	const unsigned long shortdelay_us = 2;
@@ -530,10 +535,18 @@ __releases(torture_rtmutex)
 	rt_mutex_unlock(&torture_rtmutex);
 }
 
+static void torture_rt_boost_rtmutex(struct torture_random_state *trsp)
+{
+	if (!rt_boost)
+		return;
+
+	__torture_rt_boost(trsp);
+}
+
 static struct lock_torture_ops rtmutex_lock_ops = {
 	.writelock	= torture_rtmutex_lock,
 	.write_delay	= torture_rtmutex_delay,
-	.task_boost	= torture_rtmutex_boost,
+	.task_boost	= torture_rt_boost_rtmutex,
 	.writeunlock	= torture_rtmutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -600,7 +613,7 @@ __releases(torture_rwsem)
 static struct lock_torture_ops rwsem_lock_ops = {
 	.writelock	= torture_rwsem_down_write,
 	.write_delay	= torture_rwsem_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwsem_up_write,
 	.readlock	= torture_rwsem_down_read,
 	.read_delay	= torture_rwsem_read_delay,
@@ -652,7 +665,7 @@ static struct lock_torture_ops percpu_rwsem_lock_ops = {
 	.exit		= torture_percpu_rwsem_exit,
 	.writelock	= torture_percpu_rwsem_down_write,
 	.write_delay	= torture_rwsem_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_percpu_rwsem_up_write,
 	.readlock	= torture_percpu_rwsem_down_read,
 	.read_delay	= torture_rwsem_read_delay,
-- 
2.31.1.189.g2e36527f23

From: "Paul E. McKenney"
Subject: [PATCH rcu 4/6] rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
Date: Wed, 4 Jan 2023 16:44:56 -0800
Message-Id: <20230105004501.1771332-8-paulmck@kernel.org>

From: Frederic Weisbecker

RCU Tasks and PID-namespace unshare can interact in do_exit() in a complicated circular dependency:

1) TASK A calls unshare(CLONE_NEWPID); this creates a new PID namespace
   that every subsequent child of TASK A will belong to. But TASK A
   doesn't itself belong to that new PID namespace.

2) TASK A forks() and creates TASK B. TASK A stays attached to its PID
   namespace (let's say PID_NS1) and TASK B is the first task belonging
   to the new PID namespace created by unshare() (let's call it PID_NS2).

3) Since TASK B is the first task attached to PID_NS2, it becomes the
   PID_NS2 child reaper.

4) TASK A forks() again and creates TASK C, which gets attached to
   PID_NS2. Note how TASK C has TASK A as a parent (belonging to
   PID_NS1) but has TASK B (belonging to PID_NS2) as a pid_namespace
   child_reaper.

5) TASK B exits, and since it is the child reaper for PID_NS2, it has
   to kill all other tasks attached to PID_NS2 and wait for all of them
   to die before getting reaped itself (zap_pid_ns_processes()).

6) TASK A calls synchronize_rcu_tasks(), which leads to
   synchronize_srcu(&tasks_rcu_exit_srcu).

7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
   tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
   exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking TASK A.
8) TASK C exits and since TASK A is its parent, it waits for it to reap TASK C, but it can't because TASK A waits for TASK B that waits for TASK C. Pid_namespace semantics can hardly be changed at this point. But the coverage of tasks_rcu_exit_srcu can be reduced instead. The current task is assumed not to be concurrently reapable at this stage of exit_notify() and therefore tasks_rcu_exit_srcu can be temporarily relaxed without breaking its constraints, providing a way out of the deadlock scenario. [ paulmck: Fix build failure by adding additional declaration. ] Fixes: 3f95aa81d265 ("rcu: Make TASKS_RCU handle tasks that are almost done= exiting") Reported-by: Pengfei Xu Suggested-by: Boqun Feng Suggested-by: Neeraj Upadhyay Suggested-by: Paul E. McKenney Cc: Oleg Nesterov Cc: Lai Jiangshan Cc: Eric W . Biederman Signed-off-by: Frederic Weisbecker Signed-off-by: Paul E. McKenney --- include/linux/rcupdate.h | 2 ++ kernel/pid_namespace.c | 17 +++++++++++++++++ kernel/rcu/tasks.h | 15 +++++++++++++-- 3 files changed, 32 insertions(+), 2 deletions(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 03abf883a281b..c0c79beac3fe6 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -238,6 +238,7 @@ void synchronize_rcu_tasks_rude(void); =20 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false) void exit_tasks_rcu_start(void); +void exit_tasks_rcu_stop(void); void exit_tasks_rcu_finish(void); #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */ #define rcu_tasks_classic_qs(t, preempt) do { } while (0) @@ -246,6 +247,7 @@ void exit_tasks_rcu_finish(void); #define call_rcu_tasks call_rcu #define synchronize_rcu_tasks synchronize_rcu static inline void exit_tasks_rcu_start(void) { } +static inline void exit_tasks_rcu_stop(void) { } static inline void exit_tasks_rcu_finish(void) { } #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */ =20 diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index 
f4f8cb0435b45..fc21c5d5fd5de 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -244,7 +244,24 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns) set_current_state(TASK_INTERRUPTIBLE); if (pid_ns->pid_allocated =3D=3D init_pids) break; + /* + * Release tasks_rcu_exit_srcu to avoid following deadlock: + * + * 1) TASK A unshare(CLONE_NEWPID) + * 2) TASK A fork() twice -> TASK B (child reaper for new ns) + * and TASK C + * 3) TASK B exits, kills TASK C, waits for TASK A to reap it + * 4) TASK A calls synchronize_rcu_tasks() + * -> synchronize_srcu(tasks_rcu_exit_srcu) + * 5) *DEADLOCK* + * + * It is considered safe to release tasks_rcu_exit_srcu here + * because we assume the current task can not be concurrently + * reaped at this point. + */ + exit_tasks_rcu_stop(); schedule(); + exit_tasks_rcu_start(); } __set_current_state(TASK_RUNNING); =20 diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index fbaed2637a7ff..5de61f12a1645 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -1016,16 +1016,27 @@ void exit_tasks_rcu_start(void) __acquires(&tasks_r= cu_exit_srcu) * task is exiting and may be removed from the tasklist. See * corresponding synchronize_srcu() for further details. */ -void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu) +void exit_tasks_rcu_stop(void) __releases(&tasks_rcu_exit_srcu) { struct task_struct *t =3D current; =20 __srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx); - exit_tasks_rcu_finish_trace(t); +} + +/* + * Contribute to protect against tasklist scan blind spot while the + * task is exiting and may be removed from the tasklist. See + * corresponding synchronize_srcu() for further details. 
+ */
+void exit_tasks_rcu_finish(void)
+{
+	exit_tasks_rcu_stop();
+	exit_tasks_rcu_finish_trace(current);
 }
 
 #else /* #ifdef CONFIG_TASKS_RCU */
 void exit_tasks_rcu_start(void) { }
+void exit_tasks_rcu_stop(void) { }
 void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
-- 
2.31.1.189.g2e36527f23

From: "Paul E. McKenney"
Subject: [PATCH rcu 5/6] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug
Date: Wed, 4 Jan 2023 16:44:58 -0800
Message-Id: <20230105004501.1771332-10-paulmck@kernel.org>

From: Zqiang

The synchronize_rcu_tasks_rude() function invokes rcu_tasks_rude_wait_gp() to wait for one rude RCU-tasks grace period. The rcu_tasks_rude_wait_gp() function in turn checks whether there is only a single online CPU. If so, it immediately returns, because a call to synchronize_rcu_tasks_rude() is by definition a grace period on a single-CPU system. (We could have blocked!)

Unfortunately, this check uses num_online_cpus() without synchronization, which can result in too-short grace periods. To see this, consider the following scenario:

        CPU0                                   CPU1 (going offline)
                                          migration/1 task:
                                          cpu_stopper_thread
                                           -> take_cpu_down
                                              -> _cpu_disable
                                                 (dec __num_online_cpus)
                                           -> cpuhp_invoke_callback
                                                preempt_disable
                                                access old_data0
        task1
        del old_data0                           .....
        synchronize_rcu_tasks_rude()
        task1 schedule out
        ....
        task2 schedule in
        rcu_tasks_rude_wait_gp()
            -> __num_online_cpus == 1
               -> return
        ....
task1 schedule in ->free old_data0 preempt_enable When CPU1 decrements __num_online_cpus, its value becomes 1. However, CPU1 has not finished going offline, and will take one last trip through the scheduler and the idle loop before it actually stops executing instructions. Because synchronize_rcu_tasks_rude() is mostly used for tracing, and because both the scheduler and the idle loop can be traced, this means that CPU0's prematurely ended grace period might disrupt the tracing on CPU1. Given that this disruption might include CPU1 executing instructions in memory that was just now freed (and maybe reallocated), this is a matter of some concern. This commit therefore removes that problematic single-CPU check from the rcu_tasks_rude_wait_gp() function. This dispenses with the single-CPU optimization, but there is no evidence indicating that this optimization is important. In addition, synchronize_rcu_tasks_generic() contains a similar optimization (albeit only for early boot), which also splats. (As in exactly why are you invoking synchronize_rcu_tasks_rude() so early in boot, anyway???) It is OK for the synchronize_rcu_tasks_rude() function's check to be unsynchronized because the only times that this check can evaluate to true is when there is only a single CPU running with preemption disabled. While in the area, this commit also fixes a minor bug in which a call to synchronize_rcu_tasks_rude() would instead be attributed to synchronize_rcu_tasks(). [ paulmck: Add "synchronize_" prefix and "()" suffix. ] Signed-off-by: Zqiang Signed-off-by: Paul E. McKenney --- kernel/rcu/tasks.h | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 5de61f12a1645..eee38b0d362a8 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -560,8 +560,9 @@ static int __noreturn rcu_tasks_kthread(void *arg) static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp) { /* Complain if the scheduler has not started. 
 */
-	WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
-		  "synchronize_rcu_tasks called too soon");
+	if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
+		      "synchronize_%s() called too soon", rtp->name))
+		return;
 
 	// If the grace-period kthread is running, use it.
 	if (READ_ONCE(rtp->kthread_ptr)) {
@@ -1064,9 +1065,6 @@ static void rcu_tasks_be_rude(struct work_struct *work)
 // Wait for one rude RCU-tasks grace period.
 static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 {
-	if (num_online_cpus() <= 1)
-		return;	// Fastpath for only one CPU.
-
 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }
-- 
2.31.1.189.g2e36527f23

From: "Paul E. McKenney"
Subject: [PATCH rcu 5/7] locktorture: Make the rt_boost factor a tunable
Date: Wed, 4 Jan 2023 16:44:57 -0800
Message-Id: <20230105004501.1771332-9-paulmck@kernel.org>

From: "Joel Fernandes (Google)"

The rt boosting in locktorture has a factor variable that is currently large enough that boosting happens only once every minute or so. Add a tunable to reduce the factor so that boosting happens more often, in order to test more paths and arrive at failure modes earlier. With this change, setting the factor to 50 (for example) makes the boosting happen every 10 seconds or so.
Tested with boot parameters:
locktorture.torture_type=mutex_lock locktorture.onoff_interval=1 locktorture.nwriters_stress=8 locktorture.stutter=0 locktorture.rt_boost=1 locktorture.rt_boost_factor=50 locktorture.nlocks=3

Signed-off-by: Joel Fernandes (Google)
Reviewed-by: Davidlohr Bueso
Signed-off-by: Paul E. McKenney
---
 kernel/locking/locktorture.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index e2271e8fc3027..f04b1978899dd 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -48,6 +48,7 @@ torture_param(int, stat_interval, 60,
 torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
 torture_param(int, rt_boost, 2,
		   "Do periodic rt-boost. 0=Disable, 1=Only for rt_mutex, 2=For all lock types.");
+torture_param(int, rt_boost_factor, 50, "A factor determining how often rt-boost happens.");
 torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
 
@@ -131,12 +132,12 @@ static void torture_lock_busted_write_unlock(int tid __maybe_unused)
 
 static void __torture_rt_boost(struct torture_random_state *trsp)
 {
-	const unsigned int factor = 50000;	/* yes, quite arbitrary */
+	const unsigned int factor = rt_boost_factor;
 
 	if (!rt_task(current)) {
 		/*
-		 * Boost priority once every ~50k operations. When the
-		 * task tries to take the lock, the rtmutex it will account
+		 * Boost priority once every rt_boost_factor operations. When
+		 * the task tries to take the lock, the rtmutex it will account
		 * for the new priority, and do any corresponding pi-dance.
		 */
		if (trsp && !(torture_random(trsp) %
@@ -146,8 +147,9 @@ static void __torture_rt_boost(struct torture_random_state *trsp)
			return;
	} else {
		/*
-		 * The task will remain boosted for another ~500k operations,
-		 * then restored back to its original prio, and so forth.
+		 * The task will remain boosted for another 10 * rt_boost_factor
+		 * operations, then restored back to its original prio, and so
+		 * forth.
		 *
		 * When @trsp is nil, we want to force-reset the task for
		 * stopping the kthread.
-- 
2.31.1.189.g2e36527f23

From: "Paul E. McKenney"
Subject: [PATCH rcu 6/6] rcu-tasks: Handle queue-shrink/callback-enqueue race condition
Date: Wed, 4 Jan 2023 16:44:59 -0800
Message-Id: <20230105004501.1771332-11-paulmck@kernel.org>

From: Zqiang

The rcu_tasks_need_gpcb() function determines whether or not: (1) There are callbacks needing another grace period, (2) There are callbacks ready to be invoked, and (3) It would be a good time to shrink back down to a single-CPU callback list. This third case is interesting because some other CPU might be adding new callbacks, which might suddenly make this a very bad time to be shrinking.

This is currently handled by requiring call_rcu_tasks_generic() to enqueue callbacks under the protection of rcu_read_lock() and requiring rcu_tasks_need_gpcb() to wait for an RCU grace period to elapse before finalizing the transition. This works well in practice.

Unfortunately, the current code assumes that a grace period whose end is detected by the poll_state_synchronize_rcu() in the second "if" condition actually ended before the earlier code counted the callbacks queued on CPUs other than CPU 0 (local variable "ncbsnz").
Given the current code, it is possible that a long-delayed call_rcu_tasks_generic() invocation will queue a callback on a non-zero CPU after these CPUs have had their callbacks counted and zero has been stored to ncbsnz. Such a callback would trigger the WARN_ON_ONCE() in the second "if" statement. To see this, consider the following sequence of events:

o	CPU 0 invokes rcu_tasks_one_gp(), and counts fewer than
	rcu_task_collapse_lim callbacks. It sees at least one callback
	queued on some other CPU, thus setting ncbsnz to a non-zero value.

o	CPU 1 invokes call_rcu_tasks_generic() and loads 42 from
	->percpu_enqueue_lim. It therefore decides to enqueue its
	callback onto CPU 1's callback list, but is delayed.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero and that the
	number of callbacks does not exceed rcu_task_collapse_lim.
	It therefore checks percpu_enqueue_lim, and sees that its
	value is greater than one. CPU 0 therefore starts the shift
	back to a single callback list. It sets ->percpu_enqueue_lim
	to 1, but CPU 1 has already read the old value of 42. It also
	gets a grace-period state value from get_state_synchronize_rcu().

o	CPU 0 sees that ncbsnz is non-zero in its second "if" statement,
	so it declines to finalize the shrink operation.

o	CPU 0 again invokes rcu_tasks_one_gp(), and counts fewer than
	rcu_task_collapse_lim callbacks. It also sees that there are
	no callbacks queued on any other CPU, and thus sets ncbsnz
	to zero.

o	CPU 1 resumes execution and enqueues its callback onto its own
	list. This invalidates the value of ncbsnz.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero and that the
	number of callbacks does not exceed rcu_task_collapse_lim.
	It therefore checks percpu_enqueue_lim, but sees that its
	value is already unity. It therefore does not get a new
	grace-period state value.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero, ncbsnz is
	zero, and that poll_state_synchronize_rcu() says that the
	grace period has completed.
	It therefore finalizes the shrink operation, setting
	->percpu_dequeue_lim to the value one.

o	CPU 0 does a debug check, scanning the other CPUs' callback
	lists. It sees that CPU 1's list has a callback, so it (rightly)
	triggers the WARN_ON_ONCE(). After all, the new value of
	->percpu_dequeue_lim says to not bother looking at CPU 1's
	callback list, which means that this callback will never be
	invoked. This can result in hangs and maybe even OOMs.

Based on long experience with rcutorture, this is an extremely low-probability race condition, but it really can happen, especially in preemptible kernels or within guest OSes. This commit therefore checks for completion of the grace period before counting callbacks. With this change, in the above failure scenario CPU 0 would know not to prematurely end the shrink operation, because the grace period would not have completed before the count operation started.

[ paulmck: Adjust grace-period end rather than adding RCU reader. ]
[ paulmck: Avoid spurious WARN_ON_ONCE() with ->percpu_dequeue_lim check. ]

Signed-off-by: Zqiang
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tasks.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index eee38b0d362a8..bfb5e1549f2b2 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -384,6 +384,7 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 {
 	int cpu;
 	unsigned long flags;
+	bool gpdone = poll_state_synchronize_rcu(rtp->percpu_dequeue_gpseq);
 	long n;
 	long ncbs = 0;
 	long ncbsnz = 0;
@@ -425,21 +426,23 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
 			smp_store_release(&rtp->percpu_enqueue_lim, 1);
 			rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
+			gpdone = false;
 			pr_info("Starting switch %s to CPU-0 callback queuing.\n", rtp->name);
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
 	}
-	if (rcu_task_cb_adjust && !ncbsnz &&
-	    poll_state_synchronize_rcu(rtp->percpu_dequeue_gpseq)) {
+	if (rcu_task_cb_adjust && !ncbsnz && gpdone) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
 		if (rtp->percpu_enqueue_lim < rtp->percpu_dequeue_lim) {
 			WRITE_ONCE(rtp->percpu_dequeue_lim, 1);
 			pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name);
 		}
-		for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
-			struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
+		if (rtp->percpu_dequeue_lim == 1) {
+			for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
+				struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);

-			WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
+				WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
+			}
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
 	}
-- 
2.31.1.189.g2e36527f23
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Paul E. McKenney", Tejun Heo
Subject: [PATCH rcu 6/7] rcutorture: Drop sparse lock-acquisition annotations
Date: Wed, 4 Jan 2023 16:45:00 -0800
Message-Id: <20230105004501.1771332-12-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

The sparse __acquires() and __releases() annotations provide very little value. The argument is ignored, so sparse cannot tell the difference between acquiring one lock and releasing another on the one hand, and acquiring and releasing a given lock on the other. In addition, lockdep annotations provide much more precision, for but one example, actually knowing which lock is held.

This commit therefore removes the __acquires() and __releases() annotations from rcutorture.

Reported-by: Tejun Heo
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/rcutorture.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 634df26a2c27c..8e6c023212cb3 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -399,7 +399,7 @@ static int torture_readlock_not_held(void)
 	return rcu_read_lock_bh_held() || rcu_read_lock_sched_held();
 }

-static int rcu_torture_read_lock(void) __acquires(RCU)
+static int rcu_torture_read_lock(void)
 {
 	rcu_read_lock();
 	return 0;
@@ -441,7 +441,7 @@ rcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
 	}
 }

-static void rcu_torture_read_unlock(int idx) __releases(RCU)
+static void rcu_torture_read_unlock(int idx)
 {
 	rcu_read_unlock();
 }
@@ -625,7 +625,7 @@ static struct srcu_struct srcu_ctld;
 static struct srcu_struct *srcu_ctlp = &srcu_ctl;
 static struct rcu_torture_ops srcud_ops;

-static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
+static int srcu_torture_read_lock(void)
 {
 	if (cur_ops == &srcud_ops)
 		return srcu_read_lock_nmisafe(srcu_ctlp);
@@ -652,7 +652,7 @@ srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
 	}
 }

-static void srcu_torture_read_unlock(int idx) __releases(srcu_ctlp)
+static void srcu_torture_read_unlock(int idx)
 {
 	if (cur_ops == &srcud_ops)
 		srcu_read_unlock_nmisafe(srcu_ctlp, idx);
@@ -814,13 +814,13 @@ static void synchronize_rcu_trivial(void)
 	}
 }

-static int rcu_torture_read_lock_trivial(void) __acquires(RCU)
+static int rcu_torture_read_lock_trivial(void)
 {
 	preempt_disable();
 	return 0;
 }

-static void rcu_torture_read_unlock_trivial(int idx) __releases(RCU)
+static void rcu_torture_read_unlock_trivial(int idx)
 {
 	preempt_enable();
 }
-- 
2.31.1.189.g2e36527f23
From: "Paul E. McKenney"
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org, "Joel Fernandes (Google)", Paul McKenney, Frederic Weisbecker, Zhouyi Zhou, stable@vger.kernel.org, Davidlohr Bueso
Subject: [PATCH rcu 7/7] torture: Fix hang during kthread shutdown phase
Date: Wed, 4 Jan 2023 16:45:01 -0800
Message-Id: <20230105004501.1771332-13-paulmck@kernel.org>
In-Reply-To: <20230105004454.GA1771168@paulmck-ThinkPad-P17-Gen-1>

From: "Joel Fernandes (Google)"

During rcutorture shutdown, the rcu_torture_cleanup() function calls torture_cleanup_begin(), which sets the fullstop global variable to FULLSTOP_RMMOD. This causes the rcutorture threads for readers and fakewriters to exit all of their "while" loops and start shutting down. They then call torture_kthread_stopping(), which in turn waits for kthread_stop() to be called. However, rcu_torture_cleanup() has not yet called kthread_stop() on those threads, and before it gets a chance to do so, multiple instances of torture_kthread_stopping() invoke schedule_timeout_uninterruptible(1) in a tight loop. Tracing confirms that TIMER_SOFTIRQ can then continuously execute timer callbacks. If that TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that task might never invoke kthread_stop().

This commit improves this situation by increasing the timeout passed to schedule_timeout_uninterruptible() from one jiffy to 1/20th of a second. This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus allowing rcu_torture_cleanup() to carry out the needed kthread_stop() invocations.

Testing has shown 100 runs of TREE07 passing reliably, as opposed to the tens-of-percent failure rates seen beforehand.

Cc: Paul McKenney
Cc: Frederic Weisbecker
Cc: Zhouyi Zhou
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Joel Fernandes (Google)
Tested-by: Zhouyi Zhou
Reviewed-by: Davidlohr Bueso
Signed-off-by: Paul E. McKenney
---
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfec..1a0519b836ac9 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ / 20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
-- 
2.31.1.189.g2e36527f23