From: Andrea Righi
To: Tejun Heo, David Vernet, Changwoo Min
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v5] sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
Date: Mon, 27 Jan 2025 23:06:16 +0100
Message-ID: <20250127220616.620097-1-arighi@nvidia.com>
X-Mailer: git-send-email 2.48.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

While performing the rq locking dance in dispatch_to_local_dsq(), we may
trigger the following lock imbalance condition, in particular when
multiple tasks are rapidly changing CPU affinity (e.g., when running
`stress-ng --race-sched 0`):

[ 13.413579] =====================================
[ 13.413660] WARNING: bad unlock balance detected!
[ 13.413729] 6.13.0-virtme #15 Not tainted
[ 13.413792] -------------------------------------
[ 13.413859] kworker/1:1/80 is trying to release lock (&rq->__lock) at:
[ 13.413954] [] dispatch_to_local_dsq+0x108/0x1a0
[ 13.414111] but there are no more locks to release!
[ 13.414176]
[ 13.414176] other info that might help us debug this:
[ 13.414258] 1 lock held by kworker/1:1/80:
[ 13.414318] #0: ffff8b66feb41698 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x20/0x90
[ 13.414612]
[ 13.414612] stack backtrace:
[ 13.415255] CPU: 1 UID: 0 PID: 80 Comm: kworker/1:1 Not tainted 6.13.0-virtme #15
[ 13.415505] Workqueue: 0x0 (events)
[ 13.415567] Sched_ext: dsp_local_on (enabled+all), task: runnable_at=-2ms
[ 13.415570] Call Trace:
[ 13.415700]
[ 13.415744] dump_stack_lvl+0x78/0xe0
[ 13.415806] ? dispatch_to_local_dsq+0x108/0x1a0
[ 13.415884] print_unlock_imbalance_bug+0x11b/0x130
[ 13.415965] ? dispatch_to_local_dsq+0x108/0x1a0
[ 13.416226] lock_release+0x231/0x2c0
[ 13.416326] _raw_spin_unlock+0x1b/0x40
[ 13.416422] dispatch_to_local_dsq+0x108/0x1a0
[ 13.416554] flush_dispatch_buf+0x199/0x1d0
[ 13.416652] balance_one+0x194/0x370
[ 13.416751] balance_scx+0x61/0x1e0
[ 13.416848] prev_balance+0x43/0xb0
[ 13.416947] __pick_next_task+0x6b/0x1b0
[ 13.417052] __schedule+0x20d/0x1740

This happens because dispatch_to_local_dsq() is racing with
dispatch_dequeue() and, when the latter wins, we incorrectly assume that
the task has been moved to dst_rq.

Fix by properly tracking the currently locked rq.

Fixes: 4d3ca89bdd31 ("sched_ext: Refactor consume_remote_task()")
Signed-off-by: Andrea Righi
---
 kernel/sched/ext.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

ChangeLog v4 -> v5:
 - apply the fix without introducing any code refactoring

ChangeLog v3 -> v4:
 - small refactoring to fix an unused variable build warning on UP

ChangeLog v2 -> v3:
 - keep track of the currently locked rq in dispatch_to_local_dsq()

ChangeLog v1 -> v2:
 - more comments to clarify the race with dequeue
 - rebase to tip

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 773fa2e4d6a5..41c41a082a97 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2557,6 +2557,9 @@ static void dispatch_to_local_dsq(struct rq *rq, struct scx_dispatch_q *dst_dsq,
 {
 	struct rq *src_rq = task_rq(p);
 	struct rq *dst_rq = container_of(dst_dsq, struct rq, scx.local_dsq);
+#ifdef CONFIG_SMP
+	struct rq *locked_rq = rq;
+#endif
 
 	/*
 	 * We're synchronized against dequeue through DISPATCHING. As @p can't
@@ -2593,8 +2596,9 @@ static void dispatch_to_local_dsq(struct rq *rq, struct scx_dispatch_q *dst_dsq,
 	atomic_long_set_release(&p->scx.ops_state, SCX_OPSS_NONE);
 
 	/* switch to @src_rq lock */
-	if (rq != src_rq) {
-		raw_spin_rq_unlock(rq);
+	if (locked_rq != src_rq) {
+		raw_spin_rq_unlock(locked_rq);
+		locked_rq = src_rq;
 		raw_spin_rq_lock(src_rq);
 	}
 
@@ -2612,6 +2616,8 @@ static void dispatch_to_local_dsq(struct rq *rq, struct scx_dispatch_q *dst_dsq,
 		} else {
 			move_remote_task_to_local_dsq(p, enq_flags,
 						      src_rq, dst_rq);
+			/* task has been moved to dst_rq, which is now locked */
+			locked_rq = dst_rq;
 		}
 
 		/* if the destination CPU is idle, wake it up */
@@ -2620,8 +2626,8 @@ static void dispatch_to_local_dsq(struct rq *rq, struct scx_dispatch_q *dst_dsq,
 	}
 
 	/* switch back to @rq lock */
-	if (rq != dst_rq) {
-		raw_spin_rq_unlock(dst_rq);
+	if (locked_rq != rq) {
+		raw_spin_rq_unlock(locked_rq);
 		raw_spin_rq_lock(rq);
 	}
 #else /* CONFIG_SMP */
-- 
2.48.1
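
For readers who want to see the locking pattern in isolation, below is a
minimal user-space sketch. It is not kernel code. struct toy_rq,
toy_lock(), toy_unlock(), toy_move(), bad_unlocks and the two
toy_dispatch_*() functions are hypothetical names made up for this
illustration. The sketch assumes that a successful remote move drops the
src_rq lock and takes the dst_rq lock, and it uses `moved == false` to
stand for dispatch_dequeue() winning the race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rq: only records whether it is locked. */
struct toy_rq {
	bool locked;
};

/* Counts unlocks of a lock that is not held ("bad unlock balance"). */
static int bad_unlocks;

static void toy_lock(struct toy_rq *rq)
{
	rq->locked = true;
}

static void toy_unlock(struct toy_rq *rq)
{
	if (!rq->locked)
		bad_unlocks++;
	rq->locked = false;
}

/*
 * Assumed model of the task move: on a successful remote move the src_rq
 * lock is dropped and the dst_rq lock is taken. Returns true only if the
 * held lock was switched from src_rq to dst_rq.
 */
static bool toy_move(struct toy_rq *src_rq, struct toy_rq *dst_rq, bool moved)
{
	if (!moved || src_rq == dst_rq)
		return false;
	toy_unlock(src_rq);
	toy_lock(dst_rq);
	return true;
}

/* Old logic: assumes dst_rq is the locked rq whenever rq != dst_rq. */
static void toy_dispatch_old(struct toy_rq *rq, struct toy_rq *src_rq,
			     struct toy_rq *dst_rq, bool moved)
{
	assert(rq->locked);	/* the caller must hold rq */

	if (rq != src_rq) {
		toy_unlock(rq);
		toy_lock(src_rq);
	}
	toy_move(src_rq, dst_rq, moved);
	if (rq != dst_rq) {
		toy_unlock(dst_rq);	/* wrong if the move did not happen */
		toy_lock(rq);
	}
}

/* Fixed logic: remember which rq is locked and release exactly that one. */
static void toy_dispatch_fixed(struct toy_rq *rq, struct toy_rq *src_rq,
			       struct toy_rq *dst_rq, bool moved)
{
	struct toy_rq *locked_rq = rq;

	assert(rq->locked);	/* the caller must hold rq */

	if (locked_rq != src_rq) {
		toy_unlock(locked_rq);
		locked_rq = src_rq;
		toy_lock(src_rq);
	}
	if (toy_move(src_rq, dst_rq, moved))
		locked_rq = dst_rq;
	if (locked_rq != rq) {
		toy_unlock(locked_rq);
		toy_lock(rq);
	}
}
```

With rq, src_rq and dst_rq all distinct and moved == false,
toy_dispatch_old() records one bad unlock and returns with src_rq still
locked, while toy_dispatch_fixed() returns with only rq locked.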