From: Jianyong Wu
Subject: [PATCH v2] sched/core: avoid calling select_task_rq cb if bound to one CPU for exec
Date: Mon, 1 Dec 2025 18:55:45 +0800
Message-ID: <20251201105545.778087-1-wujianyong@hygon.cn>
X-Mailer: git-send-email 2.43.0
X-Mailing-List: linux-kernel@vger.kernel.org

In the current implementation, even if the task calling execl is bound
to a single CPU (or not allowed to be migrated), sched_exec() still
invokes the select_task_rq callback to select a CPU. This is unnecessary
and wastes cycles.

Since select_task_rq() already includes checks for the above scenarios
(e.g., tasks bound to a single CPU or forbidden to migrate) and skips
the select_task_rq callback in such cases, we can directly use
select_task_rq() instead of invoking the callback here.

Test environment: 256-CPU X86 server

Test method: Run unixbench's execl test with the task bound to a single
CPU:

  $ numactl -C 10 ./Run execl -c 1

Test results: Average of 5 runs

  baseline    patched    improvement
  383.82      436.78     +13.8%

Change Log:
v1->v2
As suggested by Peter, replace manual corner-case checks with
select_task_rq() to align with existing logic.

Additional testing on a 256-CPU server where all sched domains have the
SD_BALANCE_EXEC flag set shows that sched_exec now searches all CPUs in
the system (previously, some SD_NUMA sched domains lacked
SD_BALANCE_EXEC). This increased the performance improvement to 13.8%.

Suggested-by: Peter Zijlstra
Co-developed-by: Yibin Liu
Signed-off-by: Yibin Liu
Signed-off-by: Jianyong Wu
---
 kernel/sched/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f754a60de848..6e4ba3c27e5c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5439,10 +5439,11 @@ void sched_exec(void)
 {
 	struct task_struct *p = current;
 	struct migration_arg arg;
-	int dest_cpu;
+	int dest_cpu, wake_flag = WF_EXEC;
 
 	scoped_guard (raw_spinlock_irqsave, &p->pi_lock) {
-		dest_cpu = p->sched_class->select_task_rq(p, task_cpu(p), WF_EXEC);
+		dest_cpu = select_task_rq(p, task_cpu(p), &wake_flag);
+
 		if (dest_cpu == smp_processor_id())
 			return;
 
-- 
2.43.0
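
For readers who want to see the check that the commit message relies on,
here is a minimal userspace sketch of the idea: a wrapper that skips the
expensive class callback when the task cannot move. This is not kernel
code. struct model_task, model_class_select_rq() and
model_select_task_rq() are hypothetical names made up for illustration,
and returning a stored "first allowed CPU" on the fast path is a
simplifying assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the scheduler's per-task state. */
struct model_task {
	int nr_cpus_allowed;     /* number of CPUs in the affinity mask */
	bool migration_disabled; /* true while the task must not migrate */
	int first_allowed_cpu;   /* some CPU taken from the affinity mask */
	int callback_calls;      /* counts runs of the class callback */
};

/* Stand-in for the sched-class callback, i.e. the costly CPU search. */
static int model_class_select_rq(struct model_task *t, int prev_cpu)
{
	t->callback_calls++;
	return prev_cpu; /* the model performs no real balancing */
}

/* Models the wrapper: only run the callback if the task can move. */
static int model_select_task_rq(struct model_task *t, int prev_cpu)
{
	assert(t->nr_cpus_allowed >= 1);

	if (t->nr_cpus_allowed > 1 && !t->migration_disabled)
		return model_class_select_rq(t, prev_cpu);

	/* Bound to one CPU or not migratable: nothing to search. */
	return t->first_allowed_cpu;
}
```

In this model, a task pinned the way "numactl -C 10" pins it has
nr_cpus_allowed == 1, so model_select_task_rq() returns CPU 10 and
callback_calls stays at 0. A task allowed on many CPUs goes through the
callback once.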