From: I Hsin Cheng
To: mingo@redhat.com
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, vschneid@redhat.com, nysal@linux.ibm.com,
 linux-kernel@vger.kernel.org, I Hsin Cheng
Subject: [RFC PATCH v2] sched/fair: Refactor can_migrate_task() to eliminate looping
Date: Thu, 9 Jan 2025 23:26:15 +0800
Message-ID: <20250109152615.49760-1-richard120310@gmail.com>
X-Mailer: git-send-email 2.43.0

can_migrate_task() uses for_each_cpu_and() with an "if" statement inside
to find the destination CPU. This is equivalent to finding the first set
bit in the bitwise AND of "env->dst_grpmask", "env->cpus" and
"p->cpus_ptr".

Refactor it to use cpumask_first_and_and(), which ANDs "env->dst_grpmask",
"env->cpus" and "p->cpus_ptr" and picks the first CPU in the intersection
as the destination CPU. This eliminates the loop and its repeated
branching.

After the refactoring, this part of the code speeds up from ~115ns to
~54ns, according to the test below. The test was run 5 times; the results
are shown in the following table, and the test script is pasted in the
next section.

------------------------------------------------------
|Old method|  130|  118|  115|  109|  106| avg ~115ns|
------------------------------------------------------
|New method|   58|   55|   54|   48|   55| avg  ~54ns|
------------------------------------------------------

v1 -> v2:
	- Use cpumask_first_and_and()
	- Remove additional cpumask

Signed-off-by: I Hsin Cheng
---
The test was done on Linux 6.9.0-0-generic x86_64 with
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

The test is executed as a kernel module.

Test script:

int init_module(void)
{
	struct cpumask cur_mask, custom_mask;
	struct task_struct *p = current;
	int cpu, cpu1 = nr_cpu_ids, cpu2 = nr_cpu_ids;
	unsigned tmp = 0;

	cpumask_copy(&cur_mask, cpu_online_mask);

	/* Self-implemented helper, not pasted here because of its length */
	generate_random_cpumask(&custom_mask);

	ktime_t start_1 = ktime_get();
	for_each_cpu_and(cpu, &cur_mask, &custom_mask) {
		if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
			/* imitate load balance operation */
			tmp |= 0x01010101;
			cpu1 = cpu;
			break;
		}
	}
	ktime_t end_1 = ktime_get();

	ktime_t start_2 = ktime_get();
	cpu = cpumask_first_and_and(&cur_mask, &custom_mask, p->cpus_ptr);
	if (cpu < nr_cpu_ids) {
		/* imitate load balance operation */
		tmp |= 0x01010101;
		cpu2 = cpu;
	}
	ktime_t end_2 = ktime_get();

	if (cpu1 != cpu2) {
		pr_err("Failed Assertion, cpu1 = %d, cpu2 = %d\n", cpu1, cpu2);
		return 0;
	}

	pr_info("Old method spend time : %lld\n", ktime_to_ns(end_1 - start_1));
	pr_info("New method spend time : %lld\n", ktime_to_ns(end_2 - start_2));
	return 0;
}
---
 kernel/sched/fair.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d16c8545..d49960d50 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9404,12 +9404,11 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 			return 0;
 
 		/* Prevent to re-select dst_cpu via env's CPUs: */
-		for_each_cpu_and(cpu, env->dst_grpmask, env->cpus) {
-			if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
-				env->flags |= LBF_DST_PINNED;
-				env->new_dst_cpu = cpu;
-				break;
-			}
+		cpu = cpumask_first_and_and(env->dst_grpmask, env->cpus, p->cpus_ptr);
+
+		if (cpu < nr_cpu_ids) {
+			env->flags |= LBF_DST_PINNED;
+			env->new_dst_cpu = cpu;
 		}
 
 		return 0;
-- 
2.43.0
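
The equivalence the commit message relies on (loop over the AND of two
masks and test the third, versus take the first set bit of the three-way
AND) can be checked outside the kernel with a minimal userspace sketch.
This is not the kernel implementation: a single uint64_t stands in for
struct cpumask, the names NR_CPUS_SKETCH, first_by_loop(), first_by_and()
and check_small_masks() are hypothetical and exist only for illustration,
and __builtin_ctzll() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one 64-bit word models a cpumask; 64 plays the role of nr_cpu_ids. */
#define NR_CPUS_SKETCH 64u

/* Old shape: walk the bits of (a & b), test each against c, stop at the first hit. */
static unsigned int first_by_loop(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t ab = a & b;

	for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (!(ab & (UINT64_C(1) << cpu)))
			continue;
		if (c & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NR_CPUS_SKETCH;	/* nothing found, like cpu >= nr_cpu_ids */
}

/* New shape: AND all three masks once and take the lowest set bit. */
static unsigned int first_by_and(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t all = a & b & c;

	/* __builtin_ctzll() is undefined for 0, so handle the empty case first. */
	return all ? (unsigned int)__builtin_ctzll(all) : NR_CPUS_SKETCH;
}

/* Compare both shapes exhaustively over every combination of 6-bit masks. */
static void check_small_masks(void)
{
	for (uint64_t a = 0; a < 64; a++)
		for (uint64_t b = 0; b < 64; b++)
			for (uint64_t c = 0; c < 64; c++)
				assert(first_by_loop(a, b, c) == first_by_and(a, b, c));
}
```

Calling check_small_masks() from a test program exercises both shapes on
every combination of 6-bit masks. It says nothing about timing, which
depends on the kernel's actual cpumask helpers and on the masks involved.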