From nobody Tue Oct 7 01:58:30 2025
From: Joel Fernandes
To: linux-kernel@vger.kernel.org
Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com, paulmck@kernel.org,
    Joel Fernandes, Andrea Righi, Frederic Weisbecker, Peter Zijlstra
Subject: [PATCH -next 1/6] smp: Document preemption and stop_machine() mutual exclusion
Date: Tue, 15 Jul 2025 16:01:51 -0400
Message-Id: <20250715200156.2852484-2-joelagnelf@nvidia.com>
In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com>
References: <20250715200156.2852484-1-joelagnelf@nvidia.com>

Recently while revising RCU's cpu online checks, there was some discussion
around how IPIs synchronize with hotplug.

Add comments explaining how preemption disable creates mutual exclusion with
CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
atomically updates CPU masks and flushes IPIs with interrupts disabled, and
cannot proceed while any CPU (including the IPI sender) has preemption
disabled.

[ Apply peterz feedback. ]

Cc: Andrea Righi
Cc: Paul E. McKenney
Cc: Frederic Weisbecker
Cc: Peter Zijlstra (Intel)
Cc: rcu@vger.kernel.org
Acked-by: Paul E. McKenney
Co-developed-by: Frederic Weisbecker
Signed-off-by: Joel Fernandes
---
 kernel/smp.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..23d51a8e582d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -86,13 +86,15 @@ int smpcfd_dead_cpu(unsigned int cpu)
 int smpcfd_dying_cpu(unsigned int cpu)
 {
 	/*
-	 * The IPIs for the smp-call-function callbacks queued by other
-	 * CPUs might arrive late, either due to hardware latencies or
-	 * because this CPU disabled interrupts (inside stop-machine)
-	 * before the IPIs were sent. So flush out any pending callbacks
-	 * explicitly (without waiting for the IPIs to arrive), to
-	 * ensure that the outgoing CPU doesn't go offline with work
-	 * still pending.
+	 * The IPIs for the smp-call-function callbacks queued by other CPUs
+	 * might arrive late, either due to hardware latencies or because this
+	 * CPU disabled interrupts (inside stop-machine) before the IPIs were
+	 * sent. So flush out any pending callbacks explicitly (without waiting
+	 * for the IPIs to arrive), to ensure that the outgoing CPU doesn't go
+	 * offline with work still pending.
+	 *
+	 * This runs with interrupts disabled inside the stopper task invoked by
+	 * stop_machine(), ensuring mutually exclusive CPU offlining and IPI flush.
 	 */
 	__flush_smp_call_function_queue(false);
 	irq_work_run();
@@ -418,6 +420,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
  */
 static int generic_exec_single(int cpu, call_single_data_t *csd)
 {
+	/*
+	 * Preemption already disabled here so stopper cannot run on this CPU,
+	 * ensuring mutually exclusive CPU offlining and last IPI flush.
+	 */
 	if (cpu == smp_processor_id()) {
 		smp_call_func_t func = csd->func;
 		void *info = csd->info;
@@ -638,8 +644,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	int err;
 
 	/*
-	 * prevent preemption and reschedule on another processor,
-	 * as well as CPU removal
+	 * Prevent preemption and reschedule on another CPU, as well as CPU
+	 * removal. This prevents stopper from running on this CPU, thus
+	 * providing mutual exclusion of the below cpu_online() check and
+	 * IPI sending ensuring IPI are not missed by CPU going offline.
 	 */
 	this_cpu = get_cpu();
 
-- 
2.34.1

From nobody Tue Oct 7 01:58:30 2025
From: Joel Fernandes
To: linux-kernel@vger.kernel.org, "Paul E. McKenney", Frederic Weisbecker,
    Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
    Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
    Zqiang, Sebastian Andrzej Siewior, Clark Williams
Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com, linux-rt-devel@lists.linux.dev
Subject: [PATCH -next 2/6] rcu: Refactor expedited handling check in rcu_read_unlock_special()
Date: Tue, 15 Jul 2025 16:01:52 -0400
Message-Id: <20250715200156.2852484-3-joelagnelf@nvidia.com>
In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com>
References: <20250715200156.2852484-1-joelagnelf@nvidia.com>

Extract the complex expedited handling condition in rcu_read_unlock_special()
into a separate function rcu_unlock_needs_exp_handling() with detailed
comments explaining each condition.

This improves code readability. No functional change intended.

Reviewed-by: Paul E. McKenney
Signed-off-by: Joel Fernandes
---
 kernel/rcu/tree_plugin.h | 83 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 74 insertions(+), 9 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 1ee0d34ec333..ffe6eb5d8e34 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -646,6 +646,75 @@ static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
 	local_irq_restore(flags);
 }
 
+/*
+ * Check if expedited grace period processing during unlock is needed.
+ *
+ * This function determines whether expedited handling is required based on:
+ * 1. Task blocking an expedited grace period (based on a heuristic, could be
+ *    false-positive, see below.)
+ * 2. CPU participating in an expedited grace period
+ * 3. Strict grace period mode requiring expedited handling
+ * 4. RCU priority deboosting needs when interrupts were disabled
+ *
+ * @t: The task being checked
+ * @rdp: The per-CPU RCU data
+ * @rnp: The RCU node for this CPU
+ * @irqs_were_disabled: Whether interrupts were disabled before rcu_read_unlock()
+ *
+ * Returns true if expedited processing of the rcu_read_unlock() is needed.
+ */
+static bool rcu_unlock_needs_exp_handling(struct task_struct *t,
+					  struct rcu_data *rdp,
+					  struct rcu_node *rnp,
+					  bool irqs_were_disabled)
+{
+	/*
+	 * Check if this task is blocking an expedited grace period. If the
+	 * task was preempted within an RCU read-side critical section and is
+	 * on the expedited grace period blockers list (exp_tasks), we need
+	 * expedited handling to unblock the expedited GP. This is not an exact
+	 * check because 't' might not be on the exp_tasks list at all - its
+	 * just a fast heuristic that can be false-positive sometimes.
+	 */
+	if (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks))
+		return true;
+
+	/*
+	 * Check if this CPU is participating in an expedited grace period.
+	 * The expmask bitmap tracks which CPUs need to check in for the
+	 * current expedited GP. If our CPU's bit is set, we need expedited
+	 * handling to help complete the expedited GP.
+	 */
+	if (rdp->grpmask & READ_ONCE(rnp->expmask))
+		return true;
+
+	/*
+	 * In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels, all grace periods
+	 * are treated as short for testing purposes even if that means
+	 * disturbing the system more. Check if either:
+	 * - This CPU has not yet reported a quiescent state, or
+	 * - This task was preempted within an RCU critical section
+	 * In either case, require expedited handling for strict GP mode.
+	 */
+	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) &&
+	    ((rdp->grpmask & READ_ONCE(rnp->qsmask)) || t->rcu_blocked_node))
+		return true;
+
+	/*
+	 * RCU priority boosting case: If a task is subject to RCU priority
+	 * boosting and exits an RCU read-side critical section with interrupts
+	 * disabled, we need expedited handling to ensure timely deboosting.
+	 * Without this, a low-priority task could incorrectly run at high
+	 * real-time priority for an extended period degrading real-time
+	 * responsiveness. This applies to all CONFIG_RCU_BOOST=y kernels,
+	 * not just to PREEMPT_RT.
+	 */
+	if (IS_ENABLED(CONFIG_RCU_BOOST) && irqs_were_disabled && t->rcu_blocked_node)
+		return true;
+
+	return false;
+}
+
 /*
  * Handle special cases during rcu_read_unlock(), such as needing to
  * notify RCU core processing or task having blocked during the RCU
@@ -665,18 +734,14 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	local_irq_save(flags);
 	irqs_were_disabled = irqs_disabled_flags(flags);
 	if (preempt_bh_were_disabled || irqs_were_disabled) {
-		bool expboost; // Expedited GP in flight or possible boosting.
+		bool needs_exp; // Expedited handling needed.
 		struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 		struct rcu_node *rnp = rdp->mynode;
 
-		expboost = (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks)) ||
-			   (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
-			   (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) &&
-			   ((rdp->grpmask & READ_ONCE(rnp->qsmask)) || t->rcu_blocked_node)) ||
-			   (IS_ENABLED(CONFIG_RCU_BOOST) && irqs_were_disabled &&
-			    t->rcu_blocked_node);
+		needs_exp = rcu_unlock_needs_exp_handling(t, rdp, rnp, irqs_were_disabled);
+	
 		// Need to defer quiescent state until everything is enabled.
-		if (use_softirq && (in_hardirq() || (expboost && !irqs_were_disabled))) {
+		if (use_softirq && (in_hardirq() || (needs_exp && !irqs_were_disabled))) {
 			// Using softirq, safe to awaken, and either the
 			// wakeup is free or there is either an expedited
 			// GP in flight or a potential need to deboost.
@@ -689,7 +754,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 			set_tsk_need_resched(current);
 			set_preempt_need_resched();
 			if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled &&
-			    expboost && rdp->defer_qs_iw_pending != DEFER_QS_PENDING &&
+			    needs_exp && rdp->defer_qs_iw_pending != DEFER_QS_PENDING &&
 			    cpu_online(rdp->cpu)) {
 				// Get scheduler to re-evaluate and call hooks.
 				// If !IRQ_WORK, FQS scan will eventually IPI.
-- 
2.34.1

From nobody Tue Oct 7 01:58:30 2025
From: Joel Fernandes
To: linux-kernel@vger.kernel.org, "Paul E. McKenney", Frederic Weisbecker,
    Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
    Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
    Zqiang, Jonathan Corbet
Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com, linux-doc@vger.kernel.org
Subject: [PATCH -next 3/6] rcu: Document GP init vs hotplug-scan ordering requirements
Date: Tue, 15 Jul 2025 16:01:53 -0400
Message-Id: <20250715200156.2852484-4-joelagnelf@nvidia.com>
In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com>
References: <20250715200156.2852484-1-joelagnelf@nvidia.com>
=?us-ascii?Q?TQF/w3fVUQAhBo85cOYP6qurs1/ICjQkbdivYHrAzpn9LrVKj22o3hvytuja?= =?us-ascii?Q?yS4DNy9/JBhr4OMd9TqtbBSnV6kZHhOwEy0fkBH/wRgDseV19U7NDRPM/DYY?= =?us-ascii?Q?pm2TA5u5+1EFtVAo9WM6GBrdGLV0geQsaQgRkNQbE7pjuJoUAAqgw3MmiCho?= =?us-ascii?Q?cWLMkmt+DIKQtZiFQBGMHoDkYiQkoyVwuMbqF9pXV87QEKog48/ppnrqugR5?= =?us-ascii?Q?8l34iOGLEol6DbceoHoteNlSPmvcyLeYdu3LgitiVjWujNvcoiVzsMuCGT9b?= =?us-ascii?Q?xiwn8SjuuNZ28fMKxutcvF1fLQ9Heyyetn1Kb4Y0BqO4emdTT/hdiB3yaUsO?= =?us-ascii?Q?MQ4Anqnx27QRV/ui4mEqIumht18lLqXW1ZAezR8HFm2axGUEE1TZH/5yfrHx?= =?us-ascii?Q?21RkbjIjZCYlRwfOjcDqky1PV+6lapWxQi5K8bAfinSEXhXXYq6z//0ekMin?= =?us-ascii?Q?naZo8e+esqIlSvjCo4iGWUMXJ372fhHL6u+k48/Em2we0mpwK01YSzB5yikD?= =?us-ascii?Q?nAqCr8Bs6H/gGHi+DqegBbuH9nEHX3CZZ9n+g3TccyDmnojIyckV7FW0jpoc?= =?us-ascii?Q?WeLyqA+K4JauO745qfuElVGSiuVpAWYTbW0SL6rVs1VhOl/WjK164CL70n/w?= =?us-ascii?Q?oF6QLfpVSpPkIwktSTK7otWyUHpDZKiPjqEYKLHyOpM7U/KRZ9y3qUHPJ076?= =?us-ascii?Q?1IDTxnWLsKHZag2CvHWqYJZqaRlcXeTnQIpbGRgtkiBAPbtxTa0TJ+pDOoEE?= =?us-ascii?Q?yPqIvfakFXqG9XABoukqj8BXf5ArFeMbY9PFJDC3fyBT/cUsQl46lnK4NzPU?= =?us-ascii?Q?zlv6nkOWX7m4X3QMDuAJdh1cr67ZZX9tC0nidQ1/peJPiyrbzwThe257h1cK?= =?us-ascii?Q?dLktAFVdvm9GXhuX81cT0rd+ofvTrr4gJWSFKykWdEZbry2gxhK5eQ=3D=3D?= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SN7PR12MB8059.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(366016)(376014)(7416014)(1800799024)(921020);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?9Nn0vUpvbc1X/M7GrmFt/EDvNJn4OelAPP/OC5AWpHTMY+5J3o6Jkb1mZMkz?= =?us-ascii?Q?z1lPdjP3vlIaeFqK6vez4hFLSPV7MVEFX+HsLpVLInLXFN+20T9o4roYCv22?= =?us-ascii?Q?k/Bb7Uf6xl7/nKyXEbArLDU5kwCKKwcgJwKsFo05GWbwmechE1noQo+yEWE2?= =?us-ascii?Q?Rs75nD7WoNhSlQJmkCJ6Puu7MMdUzMoXirLCd+0ITP6nmPuIVaoazSJ56p4p?= =?us-ascii?Q?viqikKjmnDfeEsCza97Uk6NoAKJwJYUaqd+JJ5GqFOIdhVUwxH5eQicQMc6r?= =?us-ascii?Q?vgVbgf6og4uLnlkDPSsTBFqJMm6Rm7kubqT3trNHoAFK73SLT6YmLd/gigUk?= 
=?us-ascii?Q?ft4VUxpyOwe8NT6eV20cz1Hu3P0bRlM+F344gChZE5l0kA1CayvDfFgTkUUA?= =?us-ascii?Q?yOYRRiMZZq/RUjTLYFYQaYJ6TnVECspK/grAOx3Ivs4oDvAf1gcBwiMBToCQ?= =?us-ascii?Q?PXa9K3UmD5n0r8oQAf9nk58H1+LcaCv8+4w0wKSsoa6Xl9CQjeQhdsYCp+8s?= =?us-ascii?Q?6N4vmzv9yRP2e6tTYjVwFK0MdZpw9RwmIh7SEldqyoFWah4K/Q8y+cw0BWsc?= =?us-ascii?Q?Exb0IiQOyHl5ywCE26LSm1pcyU8j5Q08TqY1TDhoEjFdwJ9ThTDTuaVvWBIR?= =?us-ascii?Q?fFA7LqJY1jrO3ZNgfSK/3KA71nUVOLBwi9D8boAcMbEzQ4KwFZ/BzYDNMt90?= =?us-ascii?Q?w9Gd/sjkk6Q7LszxJd3/xSgEUvfNn/iKY1Lc/vyfViYKGyDnB5ddVMlX37CL?= =?us-ascii?Q?61PafVp7ZgBv2leVl5DipPQZ+hWS6eyWVl6Hx/B0Q6B7+ZEf5G9qeNOnoZi3?= =?us-ascii?Q?p2vKd4nKAkeTyCAFMsdZHic/S9ot5ZBY/e5ETA5MIQN0smAna4fGw0seVpNq?= =?us-ascii?Q?qq0qrQgMAMj+NIgMgCHHcN2YGnoLxHiDcUI24un+BNFwP0pFLEhI/tOGuDbk?= =?us-ascii?Q?BcwFRvv0dhYk6QuINTe6VpA+XzhOb0ogTJEEbHUPwl3wNiVPMLkco+4RPEiO?= =?us-ascii?Q?m6srK4aWJT618eTnWH82K/IeADQslgQEK+jnTbmHuNSZJVZKsdUXpA8ppfFt?= =?us-ascii?Q?GD+qr7FQy8kodPGV3aTp2H0BxRnAiM5Wo5cZQAW2/nD8sCKwJNhMOCComzP/?= =?us-ascii?Q?aQuu8Lx81ST4/iz+6M63guWo5reEXqHGISzGkREozzULb0UmeFRrDycsryTj?= =?us-ascii?Q?+IjbQ9fpw5vcjB7Lw2R8cGoCOitTdRjGWc5VkMrCxSIxQafwdJC0NTpiu9Yw?= =?us-ascii?Q?fmfcXmBzO2OYb9U3I1ksFyJKraExx9HmxlA4YYTHdtCTDe/zkpYaYUNoMDq1?= =?us-ascii?Q?LIhF/+IXlP4cRx24OlD1k406GgTfG8vMzvV6XXf6yl953RXia6LcBYaq+lEV?= =?us-ascii?Q?MgN0RXu9zmGmtKWn3yy3X+lH3a1iPKj4RmmLNfLu3q46ogISySmZxDuEFuVQ?= =?us-ascii?Q?s9nbT3fOX1+U2wd7Bymowi0e3yCEONYYXn2cxKimklo5lv6SYkg+NAWeQeRg?= =?us-ascii?Q?0DFtQLbC8CfD9/dOuZZQubVlZmFwECneiOGMVmYXRTGnDsgREE2cTNzgLPzX?= =?us-ascii?Q?LRUpgqW1l4kpHAEFuU1XXcxMJmEKeqPcDDYeZgIz?= X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: d5397963-e9d6-4691-ef3e-08ddc3da8460 X-MS-Exchange-CrossTenant-AuthSource: SN7PR12MB8059.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jul 2025 20:02:24.8713 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: k/Dl3qLtO0I0q7UFMPf7bPOee75J2TufamLrKOnK6Ne2psWet5VNP/0+GgXtess2YHC5M1c19dhQ80xF1yNspQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ1PR12MB6171 Content-Type: text/plain; charset="utf-8" Add detailed comments explaining the critical ordering constraints during RCU grace period initialization, based on discussions with Frederic. Reviewed-by: Paul E. McKenney Co-developed-by: Frederic Weisbecker Signed-off-by: Joel Fernandes --- .../RCU/Design/Requirements/Requirements.rst | 41 +++++++++++++++++++ kernel/rcu/tree.c | 8 ++++ 2 files changed, 49 insertions(+) diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Docum= entation/RCU/Design/Requirements/Requirements.rst index 6125e7068d2c..771a1ce4c84d 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -1970,6 +1970,47 @@ corresponding CPU's leaf node lock is held. This avo= ids race conditions between RCU's hotplug notifier hooks, the grace period initialization code, and the FQS loop, all of which refer to or modify this bookkeeping. =20 +Note that grace period initialization (rcu_gp_init()) must carefully seque= nce +CPU hotplug scanning with grace period state changes. For example, the +following race could occur in rcu_gp_init() if rcu_seq_start() were to hap= pen +after the CPU hotplug scanning. + +.. 
code-block:: none + + CPU0 (rcu_gp_init) CPU1 CPU2 + --------------------- ---- ---- + // Hotplug scan first (WRONG ORDER) + rcu_for_each_leaf_node(rnp) { + rnp->qsmaskinit =3D rnp->qsmaskinitnext; + } + rcutree_report_cpu_starting() + rnp->qsmaskinitnext |=3D mask; + rcu_read_lock() + r0 =3D *X; + r1 = =3D *X; + X = =3D NULL; + cook= ie =3D get_state_synchronize_rcu(); + // c= ookie =3D 8 (future GP) + rcu_seq_start(&rcu_state.gp_seq); + // gp_seq =3D 5 + + // CPU1 now invisible to this GP! + rcu_for_each_node_breadth_first() { + rnp->qsmask =3D rnp->qsmaskinit; + // CPU1 not included! + } + + // GP completes without CPU1 + rcu_seq_end(&rcu_state.gp_seq); + // gp_seq =3D 8 + poll= _state_synchronize_rcu(cookie); + // R= eturns true! + kfre= e(r1); + r2 =3D *r0; // USE-AFTER-FREE! + +By incrementing gp_seq first, CPU1's RCU read-side critical section +is guaranteed to not be missed by CPU2. + Scheduler and RCU ~~~~~~~~~~~~~~~~~ =20 diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 08be3df95e68..040e853758df 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1837,6 +1837,14 @@ static noinline_for_stack bool rcu_gp_init(void) start_new_poll =3D rcu_sr_normal_gp_init(); /* Record GP times before starting GP, hence rcu_seq_start(). */ old_gp_seq =3D rcu_state.gp_seq; + /* + * Critical ordering: rcu_seq_start() must happen BEFORE the CPU hotplug + * scan below. Otherwise we risk a race where a newly onlining CPU could + * be missed by the current grace period, potentially leading to + * use-after-free errors. For a detailed explanation of this race, see + * Documentation/RCU/Design/Requirements/Requirements.rst in the + * "Hotplug CPU" section. + */ rcu_seq_start(&rcu_state.gp_seq); /* Ensure that rcu_seq_done_exact() guardband doesn't give false positive= s. 
*/ WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && --=20 2.34.1 From nobody Tue Oct 7 01:58:30 2025 Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2087.outbound.protection.outlook.com [40.107.92.87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E53982BDC3D; Tue, 15 Jul 2025 20:02:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.92.87 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752609753; cv=fail; b=fEkMENeEFriJDTu/+eiBNqoy1DI3kETmssoSNXAmG5xhklLqTeJLt7UZFBVfIHZ8kQ0obyl7fPW56I7yFQae8JU76fqd36qSMP4/y8qnjtAsuHtrwYiougSbZRuSgqo0v8y4UYLi98ZbetderG3FGYVwlaq70rfPCZxGfMUatfI= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752609753; c=relaxed/simple; bh=xbrjRO+2o+EmBpiieN3XEPkkux9P9ufD1VSv4HzhqbU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=aX2+4KPqfY8on5ZKzDorZUJURtTwW1TxkyCGVXFXW1NBrEsM8iXLr90dtChaHGeTD7OTG+6LDGlKgvb7m4QDYalyd1S1Jhi2K5ot3vAUba57pAbB/L8D2XLE1BvofOKEm9vpuFlAESKHBzaWxNVGpURkXbO/f1GUeFsKZPbzvS8= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=F6bxxQwJ; arc=fail smtp.client-ip=40.107.92.87 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="F6bxxQwJ" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=l3dbdfJg5Srj/Nv/cbsmglExIe9mXpstYii1r6+KAICS1mQbxCHwjjuh3gU5mkjRF90W5Fa1BiW3SNyA84qhpNIPPWt3Ep5EiIm1mxMDoFXzWQ0EG5m57KVCg5o5CcG6Lvme+gjVFs4hanrTN4PZHYZML+okiNbf4i9cm7xCC3Kk8M0MhCPamaywJhGPIA7SUMsvm2A4FTzyW2FlLtCXQKS9FshsfJtHR1ou4XoCyWR5hnla32+Ylbo6jrd9opPKV3znXWKinHqhqAdZTKhCC6qG8f6s2z9Vx+V66SIwVl0Vt/+9CXxI3vzZDtAFLdA4HgzFxSULpDCX3Q2RkA+CMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FUUjCNOXJjfj3G8F3HFBZAAmw94YCQJlIDPodvrY0yQ=; b=qK0pqaaq9g81v5hoQ7qsG8Z7tlazbEzQCRSQk62uu6cH2hI3qWCxctYDg6TEutcx/nVSFMuCzkb+4tfDPYBqLzPOI2tGyWVCAMkw6zTRo6l/2mc0gfLV3AGn7hLw6t8jga8DlRD4/bwIwOi680QdHQ0VdSI0yjlbeAksVnjflyWWdhh2xq1g75o6IoqbQ6xpwUJRHbDEU6PkJ8IQieKbNaXJtqijvT1+OywUyG8fL0JUUR9RZ9ye3tCPCyfrw90GZNOLIuLEXyz+S68Uj8NY3aOuksnYT/2i63EqI7N5xoo5nOiWOqxoE4mmphBSROv6Wrq2ysiiF2lEMwYE3TvV1w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=FUUjCNOXJjfj3G8F3HFBZAAmw94YCQJlIDPodvrY0yQ=; b=F6bxxQwJ7DDXuvJUhmtWFQ2mkuPHgvJ4tDKTWtNTOcuf8FIQ6MF1Z8+4b1t/0gd+BnB84M1HbuN8mIvZhrcTDuR/YgYKnsKoa5F26abD+ItlYGdGXyQLcuLbdLpcSqG0FDnbMD/ivy1S9gaf+pqEDkzLdtU1d/l1dY85TbOu1xSimJxy8VQ2Yvn3By71jpG3YDAY3tdjglCriZ4SQMOYVjrbA7KI9XeEnDCFhqoaleAEahWI1qD0zQe7cc2oRj2lOTUN7g4GTMaKzbo7gXzchntuDlWKejKAbcWzspbJk5nDJdHA0klE3WAsZh2ByG6gYY9Ldzbx2A2wuvLT95OQag== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from SN7PR12MB8059.namprd12.prod.outlook.com (2603:10b6:806:32b::7) by PH0PR12MB7009.namprd12.prod.outlook.com 
(2603:10b6:510:21c::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8922.33; Tue, 15 Jul 2025 20:02:27 +0000 Received: from SN7PR12MB8059.namprd12.prod.outlook.com ([fe80::4ee2:654e:1fe8:4b91]) by SN7PR12MB8059.namprd12.prod.outlook.com ([fe80::4ee2:654e:1fe8:4b91%7]) with mapi id 15.20.8901.018; Tue, 15 Jul 2025 20:02:27 +0000 From: Joel Fernandes To: linux-kernel@vger.kernel.org, "Paul E. McKenney" , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Josh Triplett , Boqun Feng , Uladzislau Rezki , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Jonathan Corbet Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com, linux-doc@vger.kernel.org Subject: [PATCH -next 4/6] rcu: Document separation of rcu_state and rnp's gp_seq Date: Tue, 15 Jul 2025 16:01:54 -0400 Message-Id: <20250715200156.2852484-5-joelagnelf@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com> References: <20250715200156.2852484-1-joelagnelf@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: MN0PR04CA0019.namprd04.prod.outlook.com (2603:10b6:208:52d::33) To SN7PR12MB8059.namprd12.prod.outlook.com (2603:10b6:806:32b::7) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SN7PR12MB8059:EE_|PH0PR12MB7009:EE_ X-MS-Office365-Filtering-Correlation-Id: 5a4cfb05-817d-4e7d-bce4-08ddc3da85c6 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|366016|1800799024|7416014|376014|921020; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?oRLjiDGTh9Z+b/Rrff8M3Q3QUxks3s3seFWNpLsb/0ffcrnzHwfZR+X7KMXS?= =?us-ascii?Q?c/07tDyoE0eawCkM/ijKSk7WFd8qACoyIh/D32uQr4tMpTGxSuUSZ6PgMWT5?= =?us-ascii?Q?5aR8/XL/Ls+c4pu1ts/l0LAA3+Mk3R6mPa/N0mI5i+CH5yK1I9AbcpBHMmJH?= 
=?us-ascii?Q?CB6KWQIHaLLUSZn8ns9e/JqXVkIC2fFei1CuwMt1zlziJ/AG7zyfSYcpGLv7?= =?us-ascii?Q?SYR/L3WPUeJxvE5wpOjC9evEIp5dad61lM5JHrWMTDV5uxcbEJacCYHCctGL?= =?us-ascii?Q?mA1IPT4tMNNCkKC5JF6scx6RLEty6YuqHvbR29ykJ15TYKxRQxFIoeLOkrhG?= =?us-ascii?Q?od9RCugEIdh+QUn4uxa0c1cYzb5b7IJqZfL9uSWUino4l5Jz1K+2V++zQ6My?= =?us-ascii?Q?w8/uIVPRBtqoditr8+6sn9j5v57J4OcUZ0kMt3QkI1RhOQlXnXh6PESJ5NeY?= =?us-ascii?Q?2qGj5x+NA543PtPUQMDiQmjsUky0JJgTBDmrnad5SviYPRHpvBv9EC7TLsPR?= =?us-ascii?Q?aOzaKVsZ5FUVIFH4MnoY8hair3SHEF/RxDtCjNhYXH20iXnN64uVPni2CmBW?= =?us-ascii?Q?n1Zpx56fYNyKg6Btpbj3Yq0gIkbkoPYdXvMZuddK91XuCp7KCFGvlOyFv3Cp?= =?us-ascii?Q?j3HdA94nJ9lwvYRYF0HmWKX7FX/sVzdnbn+D7qVyKmPRzvYZ9YZV058MkSpT?= =?us-ascii?Q?flq6/uUf9jtZt90kC53bk7uDkIM4kN/aMyvrHFkegjJrLd7f258aI+34n3mN?= =?us-ascii?Q?cACADJqt9/LIkS1mckhlYrOBk/FGJzNr+yBjys9dVo6sR+8cBS9wTPapdApn?= =?us-ascii?Q?FSwH2XEX3NrJxzcF94K+Azat3vf7Lr/uThst2h2L2tV+84IC2s8iVIRNfqHr?= =?us-ascii?Q?1LHd1zlLbWcc3W30qV8iEAyJ5rz+QnTIcnVRgehMfgWcT5Mgd4fHUquOLdPE?= =?us-ascii?Q?eLRDX+xkvxgrt6VLZxSYQ0BdKDXnn1h+dRIgaqUOYhDfqNAYhU1s5XEGpk/J?= =?us-ascii?Q?yMFVrLfScn7pIpXl2s1K+w3lHKXnC1IB4DU2OsaEVIYLnOL503fAT5F+ef/3?= =?us-ascii?Q?s+IiTclHNEHGWFK/Uy3ci+BNwNpt2G7jxvEMsQCeYPeA6l1U0/NlXlh0OjG+?= =?us-ascii?Q?/oApCZbhcPEbf7xlxD7luDEarTyOOAsWvB5XBEFxq+3JZx9NjAdVYrNnmxZr?= =?us-ascii?Q?CeAZHfAW4AeYqWP/QEUN4HSgG9NYaFUTW94VS03E9j7EG5frLwEKZGOGsR7I?= =?us-ascii?Q?A3HcaxBoh4JGS0wkNT7ZXxI+wHeKd8nrGrDRYDlR+jqrFeKrqhErsR+aNT6G?= =?us-ascii?Q?QQ22SREsR3374uQqiHCBReSaAYgVQN2dmPRAv7P9noLhtHxbWrUOXNxYY77e?= =?us-ascii?Q?JLruUtF4eevlrLZqj7u5LXN0XiEGWf28QOPkmlgC2FDnYW0gbw+4pocn/7UP?= =?us-ascii?Q?cfYXyHktFgNIIrXIZFtYmJuYHQm0PJDnw7D8wBR8ghhW9N9zil8u8g=3D=3D?= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SN7PR12MB8059.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(366016)(1800799024)(7416014)(376014)(921020);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
=?us-ascii?Q?9cSZEhPUDKXa104Vb5dagQJzYnFV0+kUNrrFo/A6QNnoijdqE43jlSe3nVAh?= =?us-ascii?Q?bUvMoTzwS9gdMwXOMr+T1iX+wpm/56lpfopSEDGqfwvEx68PLWZS2aZRsWaX?= =?us-ascii?Q?K5Ftyh3BMjAWW/YqzLxMi5oZMicN/9Ax+xAtN+8PsLAuil52M19Dl/s6Omu9?= =?us-ascii?Q?JZcc5yv3386UUMhZKDi+E0H7yg/318goPWvueCyb6LECNKd5qgqYVu/WvJpu?= =?us-ascii?Q?lDFOHt10ocwn2UT3eTMrTYoKIHkMxES6GxzrsQnI9gQDRU3czEj7evRcFMqx?= =?us-ascii?Q?DaTXpzOwl2kswMBfijtJHvdYAyzrRd1nIUKSp+e6Pv5KfLteiJq3cRqNGhbw?= =?us-ascii?Q?v1pURw6Eb2BeOeh/0WBExMFQXUVLjSX0FmnVyJlAzoI49m0ioFq7DlfJRFpf?= =?us-ascii?Q?FCEK+qsaOFVNBf1yclMTf01sDAJTXrh66EzmbUPJExlr0W6j/rPGp9U1f6i1?= =?us-ascii?Q?JCLWDX3+o61wV2vcS/M4C3hYEk0Th5rhrP3mBcgVVAtkDUJ+plxr1RwZplz+?= =?us-ascii?Q?Ej51SMepaYv4L51efxIG45B3gtWKYEFGNixwYsplnA3gtrC2RO3Bgwi0CdGV?= =?us-ascii?Q?Eeigy6G4K+3b3arMxsvxDOcFAMEoCigYhxkHMv1Dhgz59TElKRbmJqmkP/P3?= =?us-ascii?Q?Yp72pTxI/t9v/xGBCCwiC3e0WK4HipOYwlyjXqf2f46Y0gINvUGd5olZjOUR?= =?us-ascii?Q?VoLyLjYuJLrg46SGOWr8HcmzSUyP+rhA4aI8IgDl98p59KWk5YiTOEXRyCZa?= =?us-ascii?Q?wzChIDLwk0tExpGwhuBBIPdv7eXQ988RSnEtrWnwbkLi+XzshVMog52Y+Zyl?= =?us-ascii?Q?oEONRuA3/EnZqk/BlpUOIEZf8I/IDvJWWWTfBhlIbzyWAvzLYizwCTHPaczY?= =?us-ascii?Q?t2N3Xp+DomfTgxxqWsRXUHEH7ux4WEkxObgiTfhwZjdu9dts3P0fc9nzqR6K?= =?us-ascii?Q?ccMZkAx7LE2V3AwSS4/5hyJ3FsYB7YP1DhQgXcDV+knnOfkvuIT+nSPtIPnU?= =?us-ascii?Q?pH14W6rMuBxoag47+oDOHH6vPqbh7yGnaSFsnCOzYdn97cJSyilGz0cM9Vox?= =?us-ascii?Q?XmdDxZgBSMlhcE/85pD4OMAeVUa+iecSsFMRPUDh5Cmtac5wqwl19Hwv255Y?= =?us-ascii?Q?Hp4pN3wGImKKrPe2GuNh13t8gJiEy2ql7le4lmQ6LJhkS50Irw/DOd8F4RUo?= =?us-ascii?Q?USol3yKOfKsZvt/3xP917hwdRwAqk5whEp468PI0sgMcnd4K2SY3qNxb8DQ6?= =?us-ascii?Q?jPbY4QsMNzTDZ4kvsjOQs0KCPgRnvU3r807bb0yQjZhkqO52DTSzWIDJznd8?= =?us-ascii?Q?CvOr8+jmXxS8Iduag1U78uer4xKbxu7IBJZlEvOUz4rgmD7F4gGKn2lvXR1A?= =?us-ascii?Q?FNRs12BJH5wcOSlICvE3b175wkg7NA6T6oyo8eTMRRHsHMwMDHYJsZ6sSG84?= =?us-ascii?Q?PmSNBaXeGpRksbnsfnTsqJf6ADGZoPRu3Z78Nuh6eE9ejGQZk6Lj0uueSIBx?= =?us-ascii?Q?7z9KqhgrKQ4hoWj/LuQ7zBa5QUNSwSipiTG2DJVlTvvZUtyyh3SbRT7gMdEa?= 
=?us-ascii?Q?sif2xr5dBC3PVPO8N24F2w9d4PIa20gDtn2hpg/F?= X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5a4cfb05-817d-4e7d-bce4-08ddc3da85c6 X-MS-Exchange-CrossTenant-AuthSource: SN7PR12MB8059.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jul 2025 20:02:27.2112 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: KsnwEVO03u1GDwYJMI80H3j0N4P1cgsbF4PomjbSFamekWc/U32xCghU1v9TmwTq0V9A508oWieEbyzSyMFx0A== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR12MB7009 Content-Type: text/plain; charset="utf-8" The details of this are subtle and were discussed recently. Add a quick quiz about this and refer to it from the code for clarity. Reviewed-by: Paul E. McKenney Signed-off-by: Joel Fernandes --- .../Data-Structures/Data-Structures.rst | 32 +++++++++++++++++++ kernel/rcu/tree.c | 4 +++ 2 files changed, 36 insertions(+) diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst b= /Documentation/RCU/Design/Data-Structures/Data-Structures.rst index 04e16775c752..930535f076b4 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst @@ -286,6 +286,38 @@ in order to detect the beginnings and ends of grace pe= riods in a distributed fashion. The values flow from ``rcu_state`` to ``rcu_node`` (down the tree from the root to the leaves) to ``rcu_data``. =20 ++-----------------------------------------------------------------------+ +| **Quick Quiz**: | ++-----------------------------------------------------------------------+ +| Given that the root rcu_node structure has a gp_seq field, | +| why does RCU maintain a separate gp_seq in the rcu_state structure?
| +| Why not just use the root rcu_node's gp_seq as the official record | +| and update it directly when starting a new grace period? | ++-----------------------------------------------------------------------+ +| **Answer**: | ++-----------------------------------------------------------------------+ +| On single-node RCU trees (where the root node is also a leaf), | +| updating the root node's gp_seq immediately would create unnecessary | +| lock contention. Here's why: | +| | +| If we did rcu_seq_start() directly on the root node's gp_seq: | +| 1. All CPUs would immediately see their node's gp_seq differ from | +| their rdp's gp_seq in rcu_pending() and would invoke RCU core. | +| 2. It calls note_gp_changes(), which tries to acquire the node lock. | +| 3. But rnp->qsmask isn't initialized yet (happens later in | +| rcu_gp_init()) | +| 4. So each CPU would acquire the lock, find it can't determine if it | +| needs to report quiescent state (no qsmask), update rdp->gp_seq, | +| and release the lock. | +| 5. Result: Lots of lock acquisitions with no grace period progress | +| | +| By having a separate rcu_state.gp_seq, we can increment the official | +| grace period counter without immediately affecting what CPUs see in | +| their nodes. The hierarchical propagation in rcu_gp_init() then | +| updates the root node's gp_seq and qsmask together under the same lock| +| acquisition, avoiding this useless contention. | ++-----------------------------------------------------------------------+ + Miscellaneous ''''''''''''' =20 diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 040e853758df..aa6cbd1501cb 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1844,6 +1844,10 @@ static noinline_for_stack bool rcu_gp_init(void) * use-after-free errors. For a detailed explanation of this race, see * Documentation/RCU/Design/Requirements/Requirements.rst in the * "Hotplug CPU" section.
+ * + * Also note that the root rnp's gp_seq is kept separate from, and lags, + * the rcu_state's gp_seq, for a reason. See the Quick-Quiz on + * Single-node systems for more details (in Data-Structures.rst). */ rcu_seq_start(&rcu_state.gp_seq); /* Ensure that rcu_seq_done_exact() guardband doesn't give false positive= s. */ --=20 2.34.1 From nobody Tue Oct 7 01:58:30 2025 Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2081.outbound.protection.outlook.com [40.107.92.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADEEF285CAF; Tue, 15 Jul 2025 20:02:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.92.81 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752609762; cv=fail; b=qmJ2Dh59JQeAFE8VHe53emj5hRrkIk88w/snYZqi+5Q00mrFUI0SVzkgZpgSm5sE/Q+DvZ0DQPsBTpbjofuX+GI0EGthNbekOlwECe24gYsodPZllVfthOrLP52JnliTDbr3vMf28mHbTIoFSV1ZPTN7hpduOb8ek77FPVbXdfE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752609762; c=relaxed/simple; bh=QJpr/TDGm8E/EQmfJZ4z3BxWvQtDSqRZnFQ7peH+60E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=MdDdVurTmd2AsvnOD6/GjwRs0AWLA78BGGZzNlDwhS53TMr6WDT4VrMjjS7muf1px/OolN/0IsUXZlKGUiJb1J/mtCLwuugW5+9Z0lMviJY12oiFtDUQ8OWe+znykzQ6l869y7NRO5862XTqOY+Ebto0uxY1sT2TmIJb6tIBKTo= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=gunVe18w; arc=fail smtp.client-ip=40.107.92.81 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="gunVe18w" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=rIFo7fuQaur5XRvyggRYsgAy0mQ+Y5SmfUpvZBI9NWlGYb7Z07zI1wWQBuNt9t0DIAFjDPZWHmRJHk6Q0BLIFf2DtExIv/IKfShN5ivUtddHVGurArjZ2yKXsmLNjuWWc9Wsqv4q8cFQ//VfkX5ERc7lLpPuDefqp4TTvb1L+/8E5mJmisvsyldg0Z7/Ugp3tYtiN+icohFdKBORe9IzLfVTQv6a1o7eYAqIsMfpEp5/jrnMy0/F8tKSQUrQvsD5dTxiqaXX0TDrZM9Kezc261uuObD4GcVPBbxNgQgflgrxstRPZDUyuivOSO4ixyEwis+F7Ef1jMuQ3dZioDDQZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GQPMC+1ew8p1kEWGqaAo6UPDR1H3fjRnIU5QzV37Byw=; b=QXPgv3MfzTilOHDbSuTBYpIW7gxm+yBUGoCr2W7G+/ug76Ftq/B2fYQyfW1c7CCH/25ByF5N13m2atiDVba8RT0ap0FNGAuQE5gByxAAoTwtMEoDzQs5I1oH4Ea/FXKrFe9BGqrZhE+BCcE2u05ptpv68K43EvwzI5tNJP1Qqvw4H/ntTr9x/cZiJztdO0a5x3DHm277ZEMHBDP57wD1x5b2EYDt9b1tkEKE/WrM6+T2tOLYJB4ZRuYpdUqM+MpKrvfCtQNDm/fFmQZahh9HOGf3tDyKB9YzU4GSmem9qIgL/FhWvop8AzaAZ05mjIyse1++lvZj9qnSk64za+6irw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GQPMC+1ew8p1kEWGqaAo6UPDR1H3fjRnIU5QzV37Byw=; b=gunVe18w8JN3pvXZRcNptYYW7arTfAd/UhIajVqeFNBTFbdAK81073vA+nRoL8djCRmpcSQ0V3H/GjwULaX3EVv8megrL48PeLkvxcTj48KD5BKDzZ+MjOAwTzrw4lXAfS0mjxzx4MUf2LuupXGtvWU1Qk1/zpwCl3KMoUDGIr5/xKiSXIQ4I3dQyWJsiFHdXM1YljUvXPEugK+L5vliSn+uVGPUiqV9FE84dJBQWIUic+xA3vghXYlvLDL190xR3B+9/23vD0IU4nkBnEMPTohspg0r4qXE9PF/1HU4jmSrvqRflJ5sVx7lCpzpkjboiWNv8VZFcuXAXvAFbNYdQA== Authentication-Results: dkim=none (message not signed) 
header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from SN7PR12MB8059.namprd12.prod.outlook.com (2603:10b6:806:32b::7) by PH0PR12MB7009.namprd12.prod.outlook.com (2603:10b6:510:21c::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8922.33; Tue, 15 Jul 2025 20:02:29 +0000 Received: from SN7PR12MB8059.namprd12.prod.outlook.com ([fe80::4ee2:654e:1fe8:4b91]) by SN7PR12MB8059.namprd12.prod.outlook.com ([fe80::4ee2:654e:1fe8:4b91%7]) with mapi id 15.20.8901.018; Tue, 15 Jul 2025 20:02:29 +0000 From: Joel Fernandes To: linux-kernel@vger.kernel.org, "Paul E. McKenney" , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Josh Triplett , Boqun Feng , Uladzislau Rezki , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Jonathan Corbet Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com, linux-doc@vger.kernel.org Subject: [PATCH -next 5/6] rcu: Document concurrent quiescent state reporting for offline CPUs Date: Tue, 15 Jul 2025 16:01:55 -0400 Message-Id: <20250715200156.2852484-6-joelagnelf@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com> References: <20250715200156.2852484-1-joelagnelf@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: BL1PR13CA0006.namprd13.prod.outlook.com (2603:10b6:208:256::11) To SN7PR12MB8059.namprd12.prod.outlook.com (2603:10b6:806:32b::7) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SN7PR12MB8059:EE_|PH0PR12MB7009:EE_ X-MS-Office365-Filtering-Correlation-Id: 8b0c6031-1eb7-4f30-ae1b-08ddc3da8746 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|366016|1800799024|7416014|376014|921020; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?VIYDdbi5rkfRZ5mw6bgQciTwBWHE+F9k8HMyCshLNpSTq9H1sMASIkvaqPtI?= 
=?us-ascii?Q?TXQsaYbS0GRJkS1kcnDF606KG8Koecb0qwbKVBgMdOOw2aY1SBeurz5mONQK?= =?us-ascii?Q?dFiIBBEGe1t3j9538i/5pc32XbhjqAA5NjpX0rgYjmQF65BRVLPQxUidLw4p?= =?us-ascii?Q?a82bv25dM+2qmyJ9hTPcXL6NXy49G+bJK5gee8IQSlJ0qcXNaiya63eEdGIa?= =?us-ascii?Q?g7jLZmol9cjX3gg6gjoZnTD3GwQ+2DHNdN9VvQgqU7tXPTv5Sd3UqamBJxcU?= =?us-ascii?Q?pEJaibKA2lbe0/fblaG3pEAv7VAw/oArIv21Ly7mzViOlH07LEEsokMOnD5V?= =?us-ascii?Q?o1SdK/JJQGp0RhpEIUxBp9dpU8KIP3FLBCYcikE6K9WJgAA61hYD2BgWNyM1?= =?us-ascii?Q?28HytoWIwHJNnO4600Q3ECjd57oRZWY2XqNGQDqJDA9dC/KZoPtM1hg9yPvZ?= =?us-ascii?Q?6KMo9nD1PRLvG4DYxSV+h8FeRlF2CKRfR4MSDo+FnQy+SA7caTrQgj5qP7lx?= =?us-ascii?Q?OLapEgQ5WPpdkpnzANevnENqhA15L2vuShuqeT1Rwg7EQXXtS0HNWXfkzpMy?= =?us-ascii?Q?2V6M9B3Dj/7d2poFwOs0zlUX0Wpp4f5m7nYEQskvWeZu51BLlxIb2ZDYSAXd?= =?us-ascii?Q?6+ZWDlBn0dlSlUGrkP2t/RPCCaY/9jOABO4cDc2YWE7dzx5ZjnNAcRcG2x06?= =?us-ascii?Q?I5U/bmjhh7FpIsibJxSElBahS5Ym0UHuxAIskiVE0Z8M9ffNM/gjrXc6FVVb?= =?us-ascii?Q?rrrqOuHTRxX3FGft81Y3sUMIQgxuY098n507ninvIzFy8zvQF/XKWXiZNL3p?= =?us-ascii?Q?BjZe7ZG24i5ofkwTLq6BhXl2Ipxuxzv6dd1rjdXfQpD/rsWScAf0Op7r4T74?= =?us-ascii?Q?ZzP+/Fk37vs+6xzjt/3dolsKTaKnoPrmV52VdxctetC6LzLuiNMJS6w/BwYp?= =?us-ascii?Q?HtF0CjefBz02m3Gm8oga8b+sh/shki7+42/rglrr9ZUEWa8+HEcAA1jBZK5B?= =?us-ascii?Q?5CZcmzBUdMZCeIqAnQdoP3jW+HwuAwlUzoaAJ27DDuHwPdq8L71a0gQ7YLRA?= =?us-ascii?Q?cBSwM75p/qoZypbIYka6yVLjTDBcL/oCEUk3YDxwKQ1tkOzmeSa+PpZCiCPP?= =?us-ascii?Q?BnBSU3YmcI1CkNmpq8sTjM776GW1i8LDlMxM1CLavTGjtW1OYErtBg/8KS1L?= =?us-ascii?Q?JW0q/Xht6dCcKZHfDAvZvkS/zd6ghxQgxrr3dh4l4+XqVJzKSp5mn2GvXbPm?= =?us-ascii?Q?2ZVgJ6zidFmI8BQMCnz+ib54CFMe+fT2rhq1ug1sSZy8j0R02hD6c6v4YVkB?= =?us-ascii?Q?EQzv0uFe6O2yvCF0R5Ev6Ani78bFSOILJstrUwC5wUKm0DHjtq/QUhJ5hOSt?= =?us-ascii?Q?tHCK/JylxWVWf38AHk+JOAi7jYGcQrvgNI+h7z8ho5I35vQN/50N9aNa3Rl6?= =?us-ascii?Q?8t6O6llTg+SGvw6lwnn8zcWtUL3gXhzfuVnbTJBZe4Azca71n1GAKQ=3D=3D?= X-Forefront-Antispam-Report: 
CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SN7PR12MB8059.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(366016)(1800799024)(7416014)(376014)(921020);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?1RjihZSBZfGKocYkU2LVQg5qJ2rqSg6IrxolknM7SI8QqElBoivLqaWGjm1n?= =?us-ascii?Q?akAKJXrYV1cGJxaw69zUHCVzClu4GF4cKAQ0HtAQcI6o1WX8gHI8NdqqPr0D?= =?us-ascii?Q?cF8WtNVkgLEokvtFjlVAM24j+iN9IbLIc6tWd+4T/8a7gGHTgMU6yS9GYDRo?= =?us-ascii?Q?Hl1coAgShlAZl4XYCnvmgtXPdKw7dfWcfpVt91wUhWmTPSccBjhdOxqpwgm0?= =?us-ascii?Q?Mw1gJdNwCylUs2VdMw5Q6lXOaLfNfHpVLDG35SbSYxtg4f/UofmdDnh3+ql3?= =?us-ascii?Q?kSDIpz3ulkvA0Vyx2dsDMmUbZsqNKT0I/w/GJkXyZNIREuPp3DaKxTCEeAmX?= =?us-ascii?Q?6fqmwNW9yXYKQYbMr3vjPJR4NinPIXKTAkfjCngRTESsHRdFJIVrFrttYZvn?= =?us-ascii?Q?YUesbQlINou3IAfepaynNHzjJ/a8G5trH8Thk7hzODVYQb4QY9lIctr5MZe+?= =?us-ascii?Q?sLpIEsxao/soeYyb7+C5tIY5V5RsRk+SK4DMmMXKgpeA/4gpSdyLY4C8Ewa3?= =?us-ascii?Q?0Q3MjDQUUgE2lj8RSasuOkY4V7nSFo9Sq++e+CBnmudf5jiBpe7LQoTxJaYo?= =?us-ascii?Q?EMT+e1zw9yzNAcJk/Wz22cuL5hpSbGNrZehpEV5qRZaQO5hQZgo11Raf/CW6?= =?us-ascii?Q?lratQqRVwtKuK2txNbmEBMWfSYPDfw3q2L/eJzytWLnybPyYDmg8fozEJdqV?= =?us-ascii?Q?PpFszWGtaGr0M7Koj1LyL9ghUvXVBjhtYvwSVxQG4/l8+1ue7kUrYZ1SGzVm?= =?us-ascii?Q?AQRAdxA0PglgFJdf5kaS0ZrKv+3mJlc+Fbvfi6BDfL35WjM8yGHCB3mq0PiO?= =?us-ascii?Q?MUrcoJvqF21QWgfvhc7tVNw1JiuDmlT1Oy1kdeEACJQglKWTT8Gx6Z9pli2j?= =?us-ascii?Q?skokR5Ja34/FiuyJew3zoQdm/p0LAsT62GeNi4vbcnEvPrtHfaQzx8H0EKh5?= =?us-ascii?Q?aMvv5VWod1temS64bvObJe6I1EOBqFMwBSeFtO/yFAv1qocSYvHKw/tu/iL1?= =?us-ascii?Q?gR9u+0ReKREqKoDNjbVfjWuWpjzlBLZjRDDrCwSirrg0VNuWV9djaWVYaTOm?= =?us-ascii?Q?VqTGB5PbN+TmTFbHZeFS17/QG65ef1RWlqEC8CYJ5prn+3pr1IVi08PkCLXC?= =?us-ascii?Q?GKB3PqNGTncU7l6a04bVN52E8TaMtUpdEgHJMrMGstty/YIxr3hC/AlOfPak?= =?us-ascii?Q?EUQKwVAFy4/6ODXWXnlhTv/a7CzDkPLzCvXtgRJzO4yWKjSRFRoBraVn+7ql?= =?us-ascii?Q?zqha7RyJFoXZwVl9QixprDWHbxK9DxBr3hOTSxtT4lknNDizJPP0hNnUV4lY?= 
Content-Type: text/plain; charset="utf-8"

The synchronization of CPU offlining with GP initialization is confusing,
to put it mildly (rightfully so, as the issue it deals with is complex).
Recent discussions brought up a question: what prevents
rcu_implicit_dyntick_qs() from warning about QS reports for offline
CPUs (missing QS reports for offline CPUs cause indefinite hangs)?

QS reporting for now-offline CPUs should only happen from:
- rcu_gp_init()
- rcutree_report_cpu_dead()

Add some documentation on this and refer to it from comments in the code
explaining how QS reporting is not missed when these functions are
running concurrently.

I referred heavily to this post [1] about the need for the ofl_lock.
[1] https://lore.kernel.org/all/20180924164443.GF4222@linux.ibm.com/

[ Applied paulmck feedback on moving documentation to Requirements.rst ]

Link: https://lore.kernel.org/all/01b4d228-9416-43f8-a62e-124b92e8741a@paulmck-laptop/
Co-developed-by: Paul E. McKenney
Reviewed-by: Frederic Weisbecker
Signed-off-by: Joel Fernandes
---
 .../RCU/Design/Requirements/Requirements.rst | 87 +++++++++++++++++++
 kernel/rcu/tree.c                             | 19 +++-
 2 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 771a1ce4c84d..841326d9358d 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2011,6 +2011,93 @@ after the CPU hotplug scanning.
 By incrementing gp_seq first, CPU1's RCU read-side critical section
 is guaranteed to not be missed by CPU2.
 
+**Concurrent Quiescent State Reporting for Offline CPUs**
+
+RCU must ensure that CPUs going offline report quiescent states to avoid
+blocking grace periods. This requires careful synchronization to handle
+race conditions.
+
+**Race condition causing Offline CPU to hang GP**
+
+A race between CPU offlining and new GP initialization (`rcu_gp_init()`) may
+occur because `rcu_report_qs_rnp()` in `rcutree_report_cpu_dead()` must
+temporarily release the `rcu_node` lock to wake the RCU grace-period kthread:
+
+.. code-block:: none
+
+   CPU1 (going offline)                         CPU0 (GP kthread)
+   --------------------                         -----------------
+   rcutree_report_cpu_dead()
+     rcu_report_qs_rnp()
+       // Must release rnp->lock to wake GP kthread
+       raw_spin_unlock_irqrestore_rcu_node()
+                                                // Wakes up and starts new GP
+                                                rcu_gp_init()
+                                                  // First loop:
+                                                  copies qsmaskinitnext->qsmaskinit
+                                                  // CPU1 still in qsmaskinitnext!
+
+                                                  // Second loop:
+                                                  rnp->qsmask = rnp->qsmaskinit
+                                                  mask = rnp->qsmask & ~rnp->qsmaskinitnext
+                                                  // mask is 0! CPU1 still in both masks
+       // Reacquire lock (but too late)
+     rnp->qsmaskinitnext &= ~mask               // Finally clears bit
+
+Without `ofl_lock`, the new grace period includes the offline CPU and waits
+forever for its quiescent state, causing a GP hang.
+
+**A solution with ofl_lock**
+
+The `ofl_lock` (offline lock) prevents `rcu_gp_init()` from running during
+the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
+
+.. code-block:: none
+
+   CPU0 (rcu_gp_init)                   CPU1 (rcutree_report_cpu_dead)
+   ------------------                   ------------------------------
+   rcu_for_each_leaf_node(rnp) {
+       arch_spin_lock(&ofl_lock) -----> arch_spin_lock(&ofl_lock) [BLOCKED]
+
+       // Safe: CPU1 can't interfere
+       rnp->qsmaskinit = rnp->qsmaskinitnext
+
+       arch_spin_unlock(&ofl_lock) ---> // Now CPU1 can proceed
+   }                                    // But snapshot already taken
+
+**Another race causing GP hangs in rcu_gp_init(): Reporting QS for Now-offline CPUs**
+
+After the first loop takes an atomic snapshot of online CPUs, as shown above,
+the second loop in `rcu_gp_init()` detects CPUs that went offline between
+releasing `ofl_lock` and acquiring the per-node `rnp->lock`. This detection is
+crucial because:
+
+1. The CPU might have gone offline after the snapshot but before the second loop.
+2. The offline CPU cannot report its own QS if it's already dead.
+3. Without this detection, the grace period would wait forever for CPUs that
+   are now offline.
+
+The second loop performs this detection safely:
+
+.. code-block:: none
+
+   rcu_for_each_node_breadth_first(rnp) {
+       raw_spin_lock_irqsave_rcu_node(rnp, flags);
+       rnp->qsmask = rnp->qsmaskinit;  // Apply the snapshot
+
+       // Detect CPUs offline after snapshot
+       mask = rnp->qsmask & ~rnp->qsmaskinitnext;
+
+       if (mask && rcu_is_leaf_node(rnp))
+           rcu_report_qs_rnp(mask, ...)  // Report QS for offline CPUs
+   }
+
+This approach ensures atomicity: quiescent state reporting for offline CPUs
+happens either in `rcu_gp_init()` (second loop) or in `rcutree_report_cpu_dead()`,
+never both and never neither. The `rnp->lock` held throughout the sequence
+prevents races - `rcutree_report_cpu_dead()` also acquires this lock when
+clearing `qsmaskinitnext`, ensuring mutual exclusion.
+
 Scheduler and RCU
 ~~~~~~~~~~~~~~~~~
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index aa6cbd1501cb..174ee243b349 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1885,6 +1885,10 @@ static noinline_for_stack bool rcu_gp_init(void)
 	/* Exclude CPU hotplug operations. */
 	rcu_for_each_leaf_node(rnp) {
 		local_irq_disable();
+		/*
+		 * Serialize with CPU offline. See Requirements.rst > Hotplug CPU >
+		 * Concurrent Quiescent State Reporting for Offline CPUs.
+		 */
 		arch_spin_lock(&rcu_state.ofl_lock);
 		raw_spin_lock_rcu_node(rnp);
 		if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
@@ -1959,7 +1963,12 @@ static noinline_for_stack bool rcu_gp_init(void)
 		trace_rcu_grace_period_init(rcu_state.name, rnp->gp_seq,
 					    rnp->level, rnp->grplo,
 					    rnp->grphi, rnp->qsmask);
-		/* Quiescent states for tasks on any now-offline CPUs. */
+		/*
+		 * Quiescent states for tasks on any now-offline CPUs. Since we
+		 * released the ofl and rnp lock before this loop, CPUs might
+		 * have gone offline and we have to report QS on their behalf.
+		 * See Requirements.rst > Hotplug CPU > Concurrent QS Reporting.
+		 */
 		mask = rnp->qsmask & ~rnp->qsmaskinitnext;
 		rnp->rcu_gp_init_mask = mask;
 		if ((mask || rnp->wait_blkd_tasks) && rcu_is_leaf_node(rnp))
@@ -4390,6 +4399,13 @@ void rcutree_report_cpu_dead(void)
 
 	/* Remove outgoing CPU from mask in the leaf rcu_node structure. */
 	mask = rdp->grpmask;
+
+	/*
+	 * Hold the ofl_lock and rnp lock to avoid races between CPU going
+	 * offline and doing a QS report (as below), versus rcu_gp_init().
+	 * See Requirements.rst > Hotplug CPU > Concurrent QS Reporting section
+	 * for more details.
+	 */
 	arch_spin_lock(&rcu_state.ofl_lock);
 	raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
 	rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
@@ -4400,6 +4416,7 @@ void rcutree_report_cpu_dead(void)
 		rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	}
+	/* Clear from ->qsmaskinitnext to mark offline. */
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext & ~mask);
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	arch_spin_unlock(&rcu_state.ofl_lock);
-- 
2.34.1

From nobody Tue Oct 7 01:58:30 2025
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com
 header.i=@Nvidia.com header.b=hdcZga5o; arc=fail smtp.client-ip=40.107.92.81
From: Joel Fernandes
To: linux-kernel@vger.kernel.org, "Paul E. McKenney", Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang
Cc: rcu@vger.kernel.org, neeraj.iitr10@gmail.com
Subject: [PATCH -next 6/6] [please squash] fixup! rcu: Fix rcu_read_unlock() deadloop due to IRQ work
Date: Tue, 15 Jul 2025 16:01:56 -0400
Message-Id: <20250715200156.2852484-7-joelagnelf@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250715200156.2852484-1-joelagnelf@nvidia.com>
References: <20250715200156.2852484-1-joelagnelf@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Please squash a few comment-related changes, courtesy of review from Frederic.

Signed-off-by: Joel Fernandes
---
 kernel/rcu/tree.h        | 10 ++++++----
 kernel/rcu/tree_plugin.h |  7 ++++++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index f8f612269e6e..de6ca13a7b5f 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -175,10 +175,12 @@ struct rcu_snap_record {
 };
 
 /*
- * The IRQ work (deferred_qs_iw) is used by RCU to get scheduler's attention.
- * It can be in one of the following states:
- * - DEFER_QS_IDLE: An IRQ work was never scheduled.
- * - DEFER_QS_PENDING: An IRQ work was scheduler but never run.
+ * An IRQ work (deferred_qs_iw) is used by RCU to get the scheduler's attention
+ * to report quiescent states at the soonest possible time.
+ * The request can be in one of the following states:
+ * - DEFER_QS_IDLE: An IRQ work is yet to be scheduled.
+ * - DEFER_QS_PENDING: An IRQ work was scheduled but either not yet run, or it
+ *   ran and we still haven't reported a quiescent state.
  */
 #define DEFER_QS_IDLE		0
 #define DEFER_QS_PENDING	1
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ffe6eb5d8e34..1b9403505c42 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -633,7 +633,12 @@ static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
 	local_irq_save(flags);
 
 	/*
-	 * Requeue the IRQ work on next unlock in following situation:
+	 * If the IRQ work handler happens to run in the middle of an RCU
+	 * read-side critical section, it could be ineffective in getting the
+	 * scheduler's attention to report a deferred quiescent state (the
+	 * whole point of the IRQ work). For this reason, requeue the IRQ work.
+	 *
+	 * Basically, we want to avoid the following situation:
 	 * 1. rcu_read_unlock() queues IRQ work (state -> DEFER_QS_PENDING)
 	 * 2. CPU enters new rcu_read_lock()
 	 * 3. IRQ work runs but cannot report QS due to rcu_preempt_depth() > 0
-- 
2.34.1
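
For anyone who wants to poke at the requeue scenario outside the kernel, the three steps listed in the tree_plugin.h comment can be replayed with a small user-space toy model. This is only a sketch under assumptions: `struct toy_cpu` and every `toy_*` function are hypothetical names and not kernel APIs, the handler is assumed to reset the state to DEFER_QS_IDLE when it runs inside a reader, and the quiescent state is reported directly in the handler, which the real code does not do.

```c
#include <assert.h>
#include <stdbool.h>

#define DEFER_QS_IDLE		0
#define DEFER_QS_PENDING	1

/* Hypothetical toy state for one CPU; not the kernel's struct rcu_data. */
struct toy_cpu {
	int defer_qs_state;	/* DEFER_QS_IDLE or DEFER_QS_PENDING */
	int depth;		/* stand-in for rcu_preempt_depth() */
	bool qs_owed;		/* a deferred quiescent state is outstanding */
	int nr_queued;		/* how many times the IRQ work was queued */
};

static void toy_read_lock(struct toy_cpu *c)
{
	c->depth++;
}

/* Outermost unlock with a QS owed queues the IRQ work unless one is pending. */
static void toy_read_unlock(struct toy_cpu *c)
{
	if (--c->depth == 0 && c->qs_owed &&
	    c->defer_qs_state == DEFER_QS_IDLE) {
		c->defer_qs_state = DEFER_QS_PENDING;
		c->nr_queued++;
	}
}

/* Toy handler: inside a reader it can only re-arm; outside it reports. */
static void toy_irq_work_handler(struct toy_cpu *c)
{
	if (c->depth == 0)
		c->qs_owed = false;	/* simplification: report the QS here */
	c->defer_qs_state = DEFER_QS_IDLE;	/* let the next unlock requeue */
}

/* Replays steps 1-3 from the comment, then one more unlock and handler run. */
int toy_scenario(void)
{
	struct toy_cpu c = { .defer_qs_state = DEFER_QS_IDLE, .qs_owed = true };

	toy_read_lock(&c);
	toy_read_unlock(&c);		/* 1. queues IRQ work, state -> PENDING */
	assert(c.defer_qs_state == DEFER_QS_PENDING);
	toy_read_lock(&c);		/* 2. new read-side critical section */
	toy_irq_work_handler(&c);	/* 3. runs at depth > 0, cannot report */
	assert(c.qs_owed && c.defer_qs_state == DEFER_QS_IDLE);
	toy_read_unlock(&c);		/* state is IDLE, so this requeues */
	toy_irq_work_handler(&c);	/* depth == 0, QS finally reported */
	assert(!c.qs_owed);
	return c.nr_queued;
}
```

Calling `toy_scenario()` from a `main()` exercises the sequence. If the handler in this model left the state at DEFER_QS_PENDING in step 3, the second unlock would not requeue and the owed quiescent state would stay outstanding, which is the situation the comment sets out to avoid.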