From: "Yury Norov (NVIDIA)"
To: Andrew Morton, Thomas Gleixner
Cc: "Yury Norov (NVIDIA)", Rasmus Villemoes, linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] group_cpus: simplify inner loop in grp_spread_init_one()
Date: Tue, 18 Nov 2025 22:13:05 -0500
Message-ID: <20251119031306.644129-4-yury.norov@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20251119031306.644129-1-yury.norov@gmail.com>
References: <20251119031306.644129-1-yury.norov@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Three optimizations for grp_spread_init_one().

1. Drop most of the housekeeping code in grp_spread_init_one() with
   for_each_cpu_and_andnot_from().

2. Fix Shlemiel the Painter's algorithm by adding the 'sibl = cpu' line.
   This improves the grp_spread_init_one() complexity from quadratic to
   linear.

3. Don't clear the nmsk because it's rewritten in the caller code anyway,
   and switch to the non-atomic bit setter for irqmsk, as the mask is
   local and implies no concurrency.

Signed-off-by: Yury Norov (NVIDIA)
---
 lib/group_cpus.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index 6aae1560b796..35aba99d8cd0 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -17,27 +17,14 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 	const struct cpumask *siblmsk;
 	int cpu, sibl;
 
-	for ( ; cpus_per_grp > 0; ) {
-		cpu = cpumask_first(nmsk);
-
-		/* Should not happen, but I'm too lazy to think about it */
-		if (cpu >= nr_cpu_ids)
-			return;
-
-		cpumask_clear_cpu(cpu, nmsk);
-		cpumask_set_cpu(cpu, irqmsk);
-		cpus_per_grp--;
-
+	for_each_cpu(cpu, nmsk) {
 		/* If the cpu has siblings, use them first */
 		siblmsk = topology_sibling_cpumask(cpu);
-		for (sibl = -1; cpus_per_grp > 0; ) {
-			sibl = cpumask_next(sibl, siblmsk);
-			if (sibl >= nr_cpu_ids)
-				break;
-			if (!cpumask_test_and_clear_cpu(sibl, nmsk))
-				continue;
-			cpumask_set_cpu(sibl, irqmsk);
-			cpus_per_grp--;
+		sibl = cpu;
+		for_each_cpu_and_andnot_from(sibl, nmsk, siblmsk, irqmsk) {
+			__cpumask_set_cpu(sibl, irqmsk);
+			if (--cpus_per_grp == 0)
+				return;
 		}
 	}
 }
-- 
2.43.0
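
For readers without a kernel tree at hand, the scan described in the commit message can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: a plain 64-bit word stands in for struct cpumask, and spread_one(), next_bit_from() and the sibl_of[] table are hypothetical names made up for illustration, not kernel API. The point it tries to show is that the sibling scan starts at 'cpu' rather than at bit 0, and stops as soon as the group is full.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest set bit of 'mask' at or above 'from', or 64 if there is none. */
static unsigned next_bit_from(uint64_t mask, unsigned from)
{
	for (; from < 64; from++)
		if (mask & (UINT64_C(1) << from))
			return from;
	return 64;
}

/*
 * Hypothetical userspace model of the loop: pick up to 'cpus_per_grp'
 * CPUs from 'nmsk' into '*irqmsk', taking the siblings of each CPU first.
 * sibl_of[cpu] is the sibling mask of 'cpu' and includes 'cpu' itself.
 */
static void spread_one(uint64_t *irqmsk, uint64_t nmsk,
		       const uint64_t *sibl_of, unsigned cpus_per_grp)
{
	unsigned cpu, sibl;

	/* Assumption of this sketch: the caller asks for at least one CPU. */
	assert(cpus_per_grp > 0);

	for (cpu = next_bit_from(nmsk, 0); cpu < 64;
	     cpu = next_bit_from(nmsk, cpu + 1)) {
		/*
		 * Start at 'cpu' instead of rescanning from bit 0:
		 * lower-numbered siblings present in nmsk were already
		 * visited by the outer loop.
		 */
		for (sibl = next_bit_from(nmsk & sibl_of[cpu] & ~*irqmsk, cpu);
		     sibl < 64;
		     sibl = next_bit_from(nmsk & sibl_of[cpu] & ~*irqmsk, sibl + 1)) {
			*irqmsk |= UINT64_C(1) << sibl;
			if (--cpus_per_grp == 0)
				return;
		}
	}
}
```

With eight CPUs paired as (0,4), (1,5), (2,6), (3,7) and a group size of three, this model picks CPUs 0 and 4 first and then CPU 1; nmsk itself is never modified, and already-chosen CPUs are skipped through the ~irqmsk term.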