From: I Hsin Cheng
To: mingo@redhat.com
Cc: peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, nysal@linux.ibm.com, jserv@ccns.ncku.edu.tw,
	linux-kernel@vger.kernel.org, I Hsin Cheng
Subject: [RFC PATCH v2] sched/fair: Refactor can_migrate_task() to eliminate looping
Date: Mon, 13 Jan 2025 12:12:49 +0800
Message-ID: <20250113041249.6847-1-richard120310@gmail.com>
X-Mailer: git-send-email 2.43.0

The function "can_migrate_task()" uses "for_each_cpu_and" with an "if"
statement inside the loop to find the destination cpu. This is
equivalent to finding the first set bit in the bitwise-AND of
"env->dst_grpmask", "env->cpus" and "p->cpus_ptr".

Refactor it to use "cpumask_first_and_and()", which performs the
bitwise-AND of "env->dst_grpmask", "env->cpus" and "p->cpus_ptr" and
picks the first cpu in the intersection as the destination cpu. This
eliminates the loop and its repeated branches.

After the refactoring, this part of the code speeds up from ~115ns to
~54ns, according to the test below.

The test was run 5 times. The results are shown in the following
table, and the test script is pasted in the next section.

-------------------------------------------------------
|Old method|  130|  118|  115|  109|  106|  avg ~115ns|
-------------------------------------------------------
|New method|   58|   55|   54|   48|   55|  avg  ~54ns|
-------------------------------------------------------

v1 -> v2:
	- Use cpumask_first_and_and()
	- Remove additional cpumask

Signed-off-by: I Hsin Cheng
---
The test was done on Linux 6.9.0-0-generic x86_64 with
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

The test was run as a kernel module.

Test script:

int init_module(void)
{
	struct cpumask cur_mask, custom_mask;
	struct task_struct *p = current;
	int cpu, cpu1 = nr_cpu_ids, cpu2 = nr_cpu_ids;
	unsigned tmp = 0;

	cpumask_copy(&cur_mask, cpu_online_mask);

	/* Self-implemented helper, not pasted here because of its length */
	generate_random_cpumask(&custom_mask);

	/* Old method: loop over the two-mask AND, test the third mask */
	ktime_t start_1 = ktime_get();
	for_each_cpu_and(cpu, &cur_mask, &custom_mask) {
		if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
			/* imitate load balance operation */
			tmp |= 0x01010101;
			cpu1 = cpu;
			break;
		}
	}
	ktime_t end_1 = ktime_get();

	/* New method: first cpu in the three-mask AND */
	ktime_t start_2 = ktime_get();
	cpu = cpumask_first_and_and(&cur_mask, &custom_mask, p->cpus_ptr);
	if (cpu < nr_cpu_ids) {
		/* imitate load balance operation */
		tmp |= 0x01010101;
		cpu2 = cpu;
	}
	ktime_t end_2 = ktime_get();

	/* Both methods must pick the same cpu */
	if (cpu1 != cpu2) {
		pr_err("Failed Assertion, cpu1 = %d, cpu2 = %d\n", cpu1, cpu2);
		return 0;
	}

	pr_info("Old method spend time : %lld\n", ktime_to_ns(end_1 - start_1));
	pr_info("New method spend time : %lld\n", ktime_to_ns(end_2 - start_2));
	return 0;
}
---
 kernel/sched/fair.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d16c8545..d49960d50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9404,12 +9404,11 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 			return 0;
 
 		/* Prevent to re-select dst_cpu via env's CPUs: */
-		for_each_cpu_and(cpu, env->dst_grpmask, env->cpus) {
-			if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
-				env->flags |= LBF_DST_PINNED;
-				env->new_dst_cpu = cpu;
-				break;
-			}
+		cpu = cpumask_first_and_and(env->dst_grpmask, env->cpus, p->cpus_ptr);
+
+		if (cpu < nr_cpu_ids) {
+			env->flags |= LBF_DST_PINNED;
+			env->new_dst_cpu = cpu;
 		}
 
 		return 0;
-- 
2.43.0
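
For readers who want to check the equivalence claimed in the commit
message without building a kernel module, here is a minimal user-space
sketch. It is not kernel code: a single 64-bit word stands in for a
cpumask, and NR_CPU_IDS, first_by_loop(), first_by_and_and() and
check_equivalence() are hypothetical names made up for this
illustration. __builtin_ctzll() is a GCC/Clang builtin and is assumed
to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: one 64-bit word models a cpumask. */
#define NR_CPU_IDS 64

/* Old shape: walk the bits of (grp & cpus), test each against allowed. */
static int first_by_loop(uint64_t grp, uint64_t cpus, uint64_t allowed)
{
	uint64_t both = grp & cpus;
	int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++) {
		if (!(both & (UINT64_C(1) << cpu)))
			continue;
		if (allowed & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NR_CPU_IDS;	/* no candidate, like cpu >= nr_cpu_ids */
}

/* New shape: AND all three masks, then take the lowest set bit. */
static int first_by_and_and(uint64_t grp, uint64_t cpus, uint64_t allowed)
{
	uint64_t all = grp & cpus & allowed;

	return all ? __builtin_ctzll(all) : NR_CPU_IDS;
}

/* Compare both shapes on a few hand-picked mask combinations. */
static void check_equivalence(void)
{
	static const uint64_t m[] = { 0x0, 0x1, 0xF0, 0x3C, 0xA0, ~UINT64_C(0) };
	const int n = sizeof(m) / sizeof(m[0]);
	int i, j, k;

	for (i = 0; i < n; i++)
		for (j = 0; j < n; j++)
			for (k = 0; k < n; k++)
				assert(first_by_loop(m[i], m[j], m[k]) ==
				       first_by_and_and(m[i], m[j], m[k]));
}
```

With grp = 0xF0, cpus = 0x3C and allowed = 0xA0, the only common bit
is bit 5, so both functions return 5. When the intersection is empty,
both return NR_CPU_IDS, which mirrors the "cpu < nr_cpu_ids" check in
the patch.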