From: K Prateek Nayak
To: Thomas Gleixner, Ingo Molnar, Sebastian Andrzej Siewior
CC: Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida, K Prateek Nayak
Subject: [RFC PATCH] futex: Dynamically allocate futex_queues depending on nr_node_ids
Date: Wed, 28 Jan 2026 10:13:58 +0000
Message-ID: <20260128101358.20954-1-kprateek.nayak@amd.com>
X-Mailer: git-send-email 2.43.0
X-Mailing-List: linux-kernel@vger.kernel.org

CONFIG_NODES_SHIFT (which influences MAX_NUMNODES) is often configured
generously by distros while the actual number of possible NUMA nodes on
most systems is often quite conservative.

Instead of reserving MAX_NUMNODES worth of space for futex_queues,
dynamically allocate it based on "nr_node_ids" at the time of
futex_init().

"nr_node_ids" at the time of futex_init() is cached as "nr_futex_queues"
to compensate for the extra dereference necessary to access the elements
of futex_queues which ends up in a different cacheline now.
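
For a rough sense of the space involved, the sketch below compares the two layouts in plain user-space C. The names "struct bucket", static_queues_bytes() and dynamic_queues_bytes() are hypothetical and are not part of the patch, and the NODES_SHIFT value used in the note afterwards is an assumption about a typical distro config rather than something stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct futex_hash_bucket. */
struct bucket;

/* Bytes reserved by a fixed "struct bucket *queues[1 << nodes_shift]". */
static size_t static_queues_bytes(unsigned int nodes_shift)
{
	assert(nodes_shift < 8 * sizeof(size_t));
	return ((size_t)1 << nodes_shift) * sizeof(struct bucket *);
}

/* Bytes needed when the pointer array is sized by the runtime node count. */
static size_t dynamic_queues_bytes(unsigned int nr_nodes)
{
	return (size_t)nr_nodes * sizeof(struct bucket *);
}
```

Assuming a shift of 10 and 8-byte pointers, the fixed array reserves 8 KiB, while a system with two node IDs needs 16 bytes for the same pointers.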

Running 5 runs of perf bench futex showed no measurable impact for any
variants on a dual socket 3rd generation AMD EPYC system (2 x 64C/128T):

  variant               locking/futex   base + patch   %diff
  futex/hash            1220783.2       1333296.2      (9%)
  futex/wake            0.71186         0.72584        (2%)
  futex/wake-parallel   0.00624         0.00664        (6%)
  futex/requeue         0.25088         0.26102        (4%)
  futex/lock-pi         57.6            57.8           (0%)

Note: futex/hash had noticeable run to run variance on test machine.

"nr_node_ids" can rarely be larger than num_possible_nodes() but the
additional space allows for simpler handling of node index in presence
of sparse node_possible_map.

Reported-by: Sebastian Andrzej Siewior
Signed-off-by: K Prateek Nayak
---
Sebastian,

Does this work for your concerns with the large "MAX_NUMNODES" values on
most distros? It does put the "queues" into a separate cacheline from
the __futex_data.

The other option is to dynamically allocate the entire __futex_data as:

	struct {
		unsigned long hashmask;
		unsigned int hashshift;
		unsigned int nr_queues;
		struct futex_hash_bucket *queues[] __counted_by(nr_queues);
	} *__futex_data __ro_after_init;

with a variable length "queues" at the end if we want to ensure
everything ends up in the same cacheline but all the __futex_data member
access would then be pointer dereferencing which might not be ideal.

Thoughts?
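
To make the flexible-array alternative above concrete, here is a minimal user-space sketch of a single allocation that holds both the header fields and the queue pointers. The names "struct futex_data", "struct bucket" and futex_data_alloc() are hypothetical, and calloc() stands in for a kernel allocator; in the kernel the size would more likely be computed with struct_size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct bucket;	/* hypothetical stand-in for struct futex_hash_bucket */

struct futex_data {
	unsigned long hashmask;
	unsigned int hashshift;
	unsigned int nr_queues;
	struct bucket *queues[];	/* nr_queues entries follow the header */
};

/* Allocate header plus nr zeroed queue slots in one block; NULL on failure. */
static struct futex_data *futex_data_alloc(unsigned int nr)
{
	struct futex_data *fd;

	assert(nr > 0);
	fd = calloc(1, sizeof(*fd) + (size_t)nr * sizeof(fd->queues[0]));
	if (fd)
		fd->nr_queues = nr;
	return fd;
}
```

With this layout the first queue pointers sit directly behind hashmask, hashshift and nr_queues, but every access to those fields goes through the pointer returned by futex_data_alloc(), which is the trade-off raised above.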
---
 kernel/futex/core.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 125804fbb5cb..d8567c2ca72a 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -56,11 +56,13 @@
 static struct {
 	unsigned long            hashmask;
 	unsigned int             hashshift;
-	struct futex_hash_bucket *queues[MAX_NUMNODES];
+	unsigned int             nr_queues;
+	struct futex_hash_bucket **queues;
 } __futex_data __read_mostly __aligned(2*sizeof(long));
 
 #define futex_hashmask	(__futex_data.hashmask)
 #define futex_hashshift	(__futex_data.hashshift)
+#define nr_futex_queues	(__futex_data.nr_queues)
 #define futex_queues	(__futex_data.queues)
 
 struct futex_private_hash {
@@ -439,10 +441,10 @@ __futex_hash(union futex_key *key, struct futex_private_hash *fph)
 		 * NOTE: this isn't perfectly uniform, but it is fast and
 		 * handles sparse node masks.
 		 */
-		node = (hash >> futex_hashshift) % nr_node_ids;
+		node = (hash >> futex_hashshift) % nr_futex_queues;
 		if (!node_possible(node)) {
 			node = find_next_bit_wrap(node_possible_map.bits,
-						  nr_node_ids, node);
+						  nr_futex_queues, node);
 		}
 	}
 
@@ -1987,6 +1989,10 @@ static int __init futex_init(void)
 	size = sizeof(struct futex_hash_bucket) * hashsize;
 	order = get_order(size);
 
+	nr_futex_queues = nr_node_ids;
+	futex_queues = kcalloc(nr_futex_queues, sizeof(*futex_queues), GFP_KERNEL);
+	BUG_ON(!futex_queues);
+
 	for_each_node(n) {
 		struct futex_hash_bucket *table;
 
base-commit: c42ba5a87bdccbca11403b7ca8bad1a57b833732
-- 
2.34.1
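
The node selection touched by the second hunk can be modelled in user space as below. This is only a sketch: the possible-node map is a hypothetical 64-bit bitmask rather than node_possible_map, pick_node() and node_is_possible() are made-up names, and the loop only approximates find_next_bit_wrap() (if no bit is set it returns the starting index instead of the bitmap size).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 64-node bitmap standing in for node_possible_map. */
static bool node_is_possible(unsigned long long possible, unsigned int node)
{
	return (possible >> node) & 1ULL;
}

/* Pick a queue index from spare hash bits, skipping impossible nodes. */
static unsigned int pick_node(unsigned int hash, unsigned int hashshift,
			      unsigned int nr_queues,
			      unsigned long long possible)
{
	unsigned int node, i;

	assert(nr_queues > 0 && nr_queues <= 64);
	assert(hashshift < 8 * sizeof(hash));

	node = (hash >> hashshift) % nr_queues;
	/* Walk forward with wrap-around until a possible node is found. */
	for (i = 0; i < nr_queues && !node_is_possible(possible, node); i++)
		node = (node + 1) % nr_queues;
	return node;
}
```

With four queue slots and only nodes 0 and 2 possible (mask 0x5), a hash that maps to slot 1 lands on node 2, and one that maps to slot 3 wraps around to node 0. This is why sizing the array by "nr_node_ids" rather than by the count of possible nodes keeps the index arithmetic simple.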