From nobody Fri Dec 26 11:23:00 2025 Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C425625 for ; Sun, 24 Mar 2024 00:46:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241179; cv=none; b=dD34FrPykhFB8nSyjRGTL2tnYkAZl+/cmY8UCWIxkW/Qrd5qPHs+6zPHloPsfGD1vBcA1yRhuZTetKW7HY4rXn76hdhTNMUIjLfljBtT+Ecvr+EqQYlXeeRTrsJV2MMx34mN8/bhIspIF5Sy7vrgV6wgVNbBUXue1BnhSG7L1QI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241179; c=relaxed/simple; bh=RZZS1ahiS/h2qhBPp/IFChBOnTSZuyG6fr0nrFDfFvQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=pqfWS/rE+bPtINVsJQBg1wFO6S8Nw35pVO73EM60ehLUZLywyLDtDMgiFGIz904QhcgqjwoHw2Wq6Jw19n7azeelifR/sPIcfHiZNw8nTkvAdEju1/b/n3A09B8XvznNfkNzQoi8hktgUHMyp+VDnS93e8RJjcIO6DDtgvDJQ/U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io; spf=pass smtp.mailfrom=layalina.io; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b=ye/+8V1z; arc=none smtp.client-ip=209.85.128.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=layalina.io Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b="ye/+8V1z" Received: by mail-wm1-f49.google.com with SMTP id 5b1f17b1804b1-4147c4862caso15939805e9.0 for ; Sat, 23 Mar 2024 17:46:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=layalina-io.20230601.gappssmtp.com; s=20230601; t=1711241176; x=1711845976; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iHUfTdaeXxCHn4zySRUikeyKEbkZDx+ZCbXavnvcml4=; b=ye/+8V1zPg4PsC3lPBxpwqBLAP5ydFrD6AQjDGKWo4kquaupOyQeJfGKolzRMHo8fO heLqMdtcRxP6cLNLZ7LeTydNgJnnX+uMtz0/oGpUmJmdsYj/wv11bFHLvAqjyaPO3Pgo 7eGMNhQv2LkT4tm8jpgzUbqtzW/rv+4YrUcKtGT8YV3S+ErTNoHljMN1gftV5KMZGs9x kmRTlGXgE4xtculUq3rR1jvO8JoFD6EFOUrF1qJUuDQLRK4gdy/llrRlPSuG91aIEwGc t9N7W+9nxrHVUuep88mG5pc6pEsbb6p4uqY3SmgAFt+vO9cs1pAz65yssuGefL8WwiUY wQWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711241176; x=1711845976; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iHUfTdaeXxCHn4zySRUikeyKEbkZDx+ZCbXavnvcml4=; b=XMR/cK/OUWkg+0dh9h+Cz5qGteJPAyIBJj0scKnSzaw8bMqsN3iql3UB2TsvzMpskp FS+JP+BddEWWnqiCIR0E8leUKbs52oUWH754St1W33oB7AYAjVGzJzkLn358iJE8d8k2 WvyRWUiFViPeUZVq3/ExrKCUNO6UEQJZfd2l2+Sr+dY0SPcqDVv5j9CN4Z9UyhtmwsNz QFsznDNGCAIoIMwI2WCUw9/VQsBr6CpcQMMAA09II6rgIvEnxonRTV3UfJxTFxsS7zgm fcFFyhyBiB0/GNgopUOQlVJU85MuZQIToHlJEhjRSvHkJAf3xq8YK8RelDTuy1F3EPGM f7Ug== X-Gm-Message-State: AOJu0YxzyUPxuC35Flq82xnqycqb0VAt0qsVPf+erJoOTyaDZmsoDmZ0 mV+ZDsCTYUqC1IkntGQRZe6uxFW8G4fBtTt2hjnHE9G/DCf3SQjFC2XLVh9pZi1ThCsFVT8Wmv0 pmP4= X-Google-Smtp-Source: AGHT+IFYuqA5653aoYoTa8bCna+lvkbCCau75RH9JUkACAN+1dmPbBDGsUVrYhibS+ruKs+3er46kA== X-Received: by 2002:a05:600c:4792:b0:413:e8db:2c9b with SMTP id k18-20020a05600c479200b00413e8db2c9bmr2442309wmo.40.1711241176089; Sat, 23 Mar 2024 17:46:16 -0700 (PDT) Received: from airbuntu.. (host81-157-90-255.range81-157.btcentralplus.com. 
[81.157.90.255]) by smtp.gmail.com with ESMTPSA id i6-20020a05600c354600b00414674a1a40sm3778179wmq.45.2024.03.23.17.46.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 Mar 2024 17:46:15 -0700 (PDT)
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v8 1/4] sched/topology: Export asym_capacity_list
Date: Sun, 24 Mar 2024 00:45:49 +0000
Message-Id: <20240324004552.999936-2-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240324004552.999936-1-qyousef@layalina.io>
References: <20240324004552.999936-1-qyousef@layalina.io>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Export asym_cap_list so that it can be used to iterate through the
available capacities in the system.

Sort asym_cap_list in descending order, as expected users are likely to be
interested in the highest capacity first.

Make the list RCU protected to allow for cheap access in hot paths.
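
The ordering described above can be illustrated with a minimal userspace
sketch. It is not the kernel code and makes several simplifying
assumptions: struct cap_node and cap_list_add_cpu() are hypothetical
names, a plain singly linked list stands in for the RCU-protected
list_head, and the CPUs of each capacity level are tracked in a single
unsigned long instead of a cpumask.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct asym_cap_data. */
struct cap_node {
	unsigned long capacity;
	unsigned long cpus;		/* bit N set: CPU N has this capacity */
	struct cap_node *next;
};

/*
 * Record that @cpu has @capacity. An existing capacity level is reused;
 * a new level is linked in so that the list stays sorted from the highest
 * to the lowest capacity.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int cap_list_add_cpu(struct cap_node **head, int cpu,
			    unsigned long capacity)
{
	struct cap_node **pos = head;
	struct cap_node *node;

	/* This sketch only handles as many CPUs as fit in one long. */
	assert(cpu >= 0 && cpu < (int)(8 * sizeof(unsigned long)));

	/* Skip every level that is strictly bigger than the new one. */
	while (*pos && (*pos)->capacity > capacity)
		pos = &(*pos)->next;

	/* Capacity level already known: just add the CPU to it. */
	if (*pos && (*pos)->capacity == capacity) {
		(*pos)->cpus |= 1UL << cpu;
		return 0;
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return -1;

	/* Link the new level in front of the first smaller one (or last). */
	node->capacity = capacity;
	node->cpus = 1UL << cpu;
	node->next = *pos;
	*pos = node;
	return 0;
}
```

With the list kept in this order, a reader that wants the biggest capacity
matching some condition can stop at the first entry that satisfies it.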
Reviewed-by: Vincent Guittot Signed-off-by: Qais Yousef --- kernel/sched/sched.h | 14 ++++++++++++++ kernel/sched/topology.c | 43 ++++++++++++++++++++++++----------------- 2 files changed, 39 insertions(+), 18 deletions(-) diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 41024c1c49b4..f77c00dddfe1 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -109,6 +109,20 @@ extern int sysctl_sched_rt_period; extern int sysctl_sched_rt_runtime; extern int sched_rr_timeslice; =20 +/* + * Asymmetric CPU capacity bits + */ +struct asym_cap_data { + struct list_head link; + struct rcu_head rcu; + unsigned long capacity; + unsigned long cpus[]; +}; + +extern struct list_head asym_cap_list; + +#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus) + /* * Helpers for converting nanosecond timing to jiffy resolution */ diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index 99ea5986038c..44ed3d0812ab 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -1329,24 +1329,13 @@ static void init_sched_groups_capacity(int cpu, str= uct sched_domain *sd) update_group_capacity(sd, cpu); } =20 -/* - * Asymmetric CPU capacity bits - */ -struct asym_cap_data { - struct list_head link; - unsigned long capacity; - unsigned long cpus[]; -}; - /* * Set of available CPUs grouped by their corresponding capacities * Each list entry contains a CPU mask reflecting CPUs that share the same * capacity. * The lifespan of data is unlimited. */ -static LIST_HEAD(asym_cap_list); - -#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus) +LIST_HEAD(asym_cap_list); =20 /* * Verify whether there is any CPU capacity asymmetry in a given sched dom= ain. 
@@ -1386,21 +1375,39 @@ asym_cpu_capacity_classify(const struct cpumask *sd=
_span,
=20
 }
=20
+static void free_asym_cap_entry(struct rcu_head *head)
+{
+	struct asym_cap_data *entry =3D container_of(head, struct asym_cap_data, =
rcu);
+	kfree(entry);
+}
+
 static inline void asym_cpu_capacity_update_data(int cpu)
 {
 	unsigned long capacity =3D arch_scale_cpu_capacity(cpu);
-	struct asym_cap_data *entry =3D NULL;
+	struct asym_cap_data *insert_entry =3D NULL;
+	struct asym_cap_data *entry;
=20
+	/*
+	 * Check whether the capacity already exists. If not, track the entry
+	 * after which to insert so that the list stays in descending order.
+	 */
 	list_for_each_entry(entry, &asym_cap_list, link) {
 		if (capacity =3D=3D entry->capacity)
 			goto done;
+		else if (!insert_entry && capacity > entry->capacity)
+			insert_entry =3D list_prev_entry(entry, link);
 	}
=20
 	entry =3D kzalloc(sizeof(*entry) + cpumask_size(), GFP_KERNEL);
 	if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
 		return;
 	entry->capacity =3D capacity;
-	list_add(&entry->link, &asym_cap_list);
+
+	/* If NULL then the new capacity is the smallest, add last.
*/ + if (!insert_entry) + list_add_tail_rcu(&entry->link, &asym_cap_list); + else + list_add_rcu(&entry->link, &insert_entry->link); done: __cpumask_set_cpu(cpu, cpu_capacity_span(entry)); } @@ -1423,8 +1430,8 @@ static void asym_cpu_capacity_scan(void) =20 list_for_each_entry_safe(entry, next, &asym_cap_list, link) { if (cpumask_empty(cpu_capacity_span(entry))) { - list_del(&entry->link); - kfree(entry); + list_del_rcu(&entry->link); + call_rcu(&entry->rcu, free_asym_cap_entry); } } =20 @@ -1434,8 +1441,8 @@ static void asym_cpu_capacity_scan(void) */ if (list_is_singular(&asym_cap_list)) { entry =3D list_first_entry(&asym_cap_list, typeof(*entry), link); - list_del(&entry->link); - kfree(entry); + list_del_rcu(&entry->link); + call_rcu(&entry->rcu, free_asym_cap_entry); } } =20 --=20 2.34.1 From nobody Fri Dec 26 11:23:00 2025 Received: from mail-wm1-f48.google.com (mail-wm1-f48.google.com [209.85.128.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74FA663B for ; Sun, 24 Mar 2024 00:46:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241180; cv=none; b=RFyOtOs69+zbIFwK8Eezikkv2JWUCkzfjMkOxxb38NtaMeltYmaqbES4VMXPhgcphtoYpWXwJaqFkCKwkudgYhzidxr3NmBs+gpOyj8EZ7kTIzqkm+e8+OPi4106rYXXFrFBzHxRnG7DyVmxZ+0NSVdERs6FzXFUBLHolXVkFv8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241180; c=relaxed/simple; bh=fagvafmRW8bAsH4PlwGjJkpm+a8ityfbiaduDn++QaU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=djtutAOtba+7oJlIcPinTWBzh79TazhNSpYi4YjLAMInRatBPiyzOxlQrH/YpnNuTWVMYRnQtO1owyA8G95MqMsj6Y1Lg+ZvGvL50v0zBzBDtmejTatSJbcFPbYB2hv3mU2teJhUVGEJ2PKPLhJdR9jvAFTWTttrB0y/lYFIXCw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=layalina.io; spf=pass smtp.mailfrom=layalina.io; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b=z9a2vU0n; arc=none smtp.client-ip=209.85.128.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=layalina.io Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b="z9a2vU0n" Received: by mail-wm1-f48.google.com with SMTP id 5b1f17b1804b1-414811d8241so3895995e9.2 for ; Sat, 23 Mar 2024 17:46:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=layalina-io.20230601.gappssmtp.com; s=20230601; t=1711241177; x=1711845977; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cxpj4P4M/KwhXcgLKvhRv6OYn6sM9Gs71BX39fQzMVY=; b=z9a2vU0nqm0lTF5a//AfOBYoxj9TSR4nmH09ZnCZECNWekfpzxR5OShVjJkjpBaiuu gf3MOCv3szcfmxweH3OccOtf/Os4Fj4RS6h6TYf5RfQYbSpQFIRYmvvska2ELAk0Ld6E cSpHVkCy3v0L+bhIN3IRDRRT0k2b5053XLE6k5wRjNgQMhdZVA7Z9Qi4cYwxcyKNO4nh 098dTgTAwJvWWwSwtBhsUm93i6hhTxEsT8WUzUgIO0+d8jqEtekXY/72UjlHFgVrwY0N FoexWUOkEpD0J/8dj+uAi8PkZ1m05EMy4Wgjr6HUuts4cle5/tlU6/bnqnQ0bjDi+YVF DgYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711241177; x=1711845977; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cxpj4P4M/KwhXcgLKvhRv6OYn6sM9Gs71BX39fQzMVY=; b=lruOG7pdp9B0fwaTKAquBXmS9PpMAKr2+/qQ5CshX1r+Guw5VP4wjv7sCU7xrmaIkw 1ys11H5Kse1h834fwNLnB8xaw5PyTG/RXVRW2LcLngxVXKwRaaH6zzUhbvLZgg49r5he uwRi9M3JQeq/mBQvR7TlVfZtqQnMu2WlAvkHtm7/n+aY+K1blNZN078M6zgasq5od/o/ 
XM+wIEMvuHuS9sbKowStF8UZMvqoWoKrR38EW0f8Oc+vz1HN6IuVPHkIvqSBkadB1csk NI8NBcNB1wf+AlQdOyczzGzCcF5EGW0Hj+JG4zDkrsfgowy+hV6nX7xSJXbR8umpF/Eb CbGg== X-Gm-Message-State: AOJu0YwAB91dyN8kO+7e8qL0k29+9mlEa7TcgvGYwshC56OL5d4ffvil cJYF+kGmyZXMBZ+7q/Dw0jIVKvdAJDTvGyRE/QzMg0ZG2z+/lkk2q/T2re1FKQk= X-Google-Smtp-Source: AGHT+IH2RobPfD8yX8gSuTNLW776CVT+XMomm7GaS5uOIbTqLVAz/wbTNVfGJnNan5dhUnV6plrdxw== X-Received: by 2002:a05:600c:3ba8:b0:414:63c6:8665 with SMTP id n40-20020a05600c3ba800b0041463c68665mr2819002wms.2.1711241176860; Sat, 23 Mar 2024 17:46:16 -0700 (PDT) Received: from airbuntu.. (host81-157-90-255.range81-157.btcentralplus.com. [81.157.90.255]) by smtp.gmail.com with ESMTPSA id i6-20020a05600c354600b00414674a1a40sm3778179wmq.45.2024.03.23.17.46.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 Mar 2024 17:46:16 -0700 (PDT) From: Qais Yousef To: Ingo Molnar , Peter Zijlstra , Vincent Guittot , Dietmar Eggemann Cc: linux-kernel@vger.kernel.org, "Pierre Gondois" , Qais Yousef Subject: [PATCH v8 2/4] sched/fair: Check a task has a fitting cpu when updating misfit Date: Sun, 24 Mar 2024 00:45:50 +0000 Message-Id: <20240324004552.999936-3-qyousef@layalina.io> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240324004552.999936-1-qyousef@layalina.io> References: <20240324004552.999936-1-qyousef@layalina.io> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" If a misfit task is affined to a subset of the possible cpus, we need to verify that one of these cpus can fit it. Otherwise the load balancer code will continuously trigger needlessly leading the balance_interval to increase in return and eventually end up with a situation where real imbalances take a long time to address because of this impossible imbalance situation. 
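
The check this calls for can be sketched in plain C as follows. This is
only an illustration under stated assumptions, not the code in this patch:
struct cap_level, max_allowed_capacity() and worth_flagging_misfit() are
hypothetical names, the affinity mask is a single unsigned long, and the
capacity levels are assumed to be sorted from highest to lowest. The
existing "pinned to a single CPU" shortcut is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capacity level: a capacity and the CPUs that have it. */
struct cap_level {
	unsigned long capacity;
	unsigned long cpus;		/* bit N set: CPU N has this capacity */
};

/*
 * Return the highest capacity among the CPUs in @allowed.
 * @levels must be sorted from highest to lowest capacity, so the first
 * level that intersects the affinity mask is the answer.
 * Returns 0 if no level intersects the mask.
 */
static unsigned long max_allowed_capacity(const struct cap_level *levels,
					  int nr_levels, unsigned long allowed)
{
	assert(nr_levels >= 0);

	for (int i = 0; i < nr_levels; i++)
		if (levels[i].cpus & allowed)
			return levels[i].capacity;
	return 0;
}

/*
 * Flagging a task as misfit is only useful when it does not fit its
 * current CPU and its affinity mask contains a CPU with more capacity.
 */
static bool worth_flagging_misfit(bool fits_current_cpu,
				  unsigned long current_cpu_capacity,
				  unsigned long task_max_capacity)
{
	if (fits_current_cpu)
		return false;
	return current_cpu_capacity < task_max_capacity;
}
```

Computing the maximum once, when the affinity changes, keeps the per-tick
decision down to a couple of integer comparisons.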
This can happen in the Android world, where it's common for background
tasks to be restricted to little cores.

Similarly, if the task doesn't fit even the biggest core, triggering misfit
is pointless, as that core is the best we can ever get on this system.

To detect that, we use asym_cap_list to iterate through the capacities in
the system and see whether the task is able to run at a higher capacity
level based on its p->cpus_ptr. We do that when the affinity changes, when
a fair task is forked, or when a task is switched to the fair policy. We
store max_allowed_capacity in task_struct to allow for a cheap comparison
in the fast path.

Improve check_misfit_status() by removing redundant checks.
misfit_task_load will be 0 if the task can't move to a bigger CPU, and
nohz_balancer_kick() already calls check_cpu_capacity() before calling
check_misfit_status().

Test:
=3D=3D=3D=3D=3D

Add

	trace_printk("balance_interval =3D %lu\n", interval)

in get_sd_balance_interval().

Run

	if [ "$MASK" !=3D "0" ]; then
		adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
	fi
	sleep 10
	# parse ftrace buffer counting the occurrence of each value

Where MASK is either:

	* 0: no busy task running
	* 1: busy task is pinned to 1 cpu; handled today to not cause misfit
	* f: busy task pinned to little cores, simulates busy background
	     task, demonstrates the problem to be fixed

Results:
=3D=3D=3D=3D=3D=3D=3D=3D

Note how the number of occurrences of balance_interval =3D 128 overshoots
for MASK =3D f.
BEFORE
------

	MASK=3D0

		   1 balance_interval =3D 175
		 120 balance_interval =3D 128
		 846 balance_interval =3D 64
		  55 balance_interval =3D 63
		 215 balance_interval =3D 32
		   2 balance_interval =3D 31
		   2 balance_interval =3D 16
		   4 balance_interval =3D 8
		1870 balance_interval =3D 4
		  65 balance_interval =3D 2

	MASK=3D1

		  27 balance_interval =3D 175
		  37 balance_interval =3D 127
		 840 balance_interval =3D 64
		 167 balance_interval =3D 63
		 449 balance_interval =3D 32
		  84 balance_interval =3D 31
		 304 balance_interval =3D 16
		1156 balance_interval =3D 8
		2781 balance_interval =3D 4
		 428 balance_interval =3D 2

	MASK=3Df

		   1 balance_interval =3D 175
		1328 balance_interval =3D 128
		  44 balance_interval =3D 64
		 101 balance_interval =3D 63
		  25 balance_interval =3D 32
		   5 balance_interval =3D 31
		  23 balance_interval =3D 16
		  23 balance_interval =3D 8
		4306 balance_interval =3D 4
		 177 balance_interval =3D 2

AFTER
-----

Note how the high values almost disappear for all MASK values. The system
has background tasks that could trigger the problem without simulating it,
even with MASK=3D0.

	MASK=3D0

		 103 balance_interval =3D 63
		  19 balance_interval =3D 31
		 194 balance_interval =3D 8
		4827 balance_interval =3D 4
		 179 balance_interval =3D 2

	MASK=3D1

		 131 balance_interval =3D 63
		   1 balance_interval =3D 31
		  87 balance_interval =3D 8
		3600 balance_interval =3D 4
		   7 balance_interval =3D 2

	MASK=3Df

		   8 balance_interval =3D 127
		 182 balance_interval =3D 63
		   3 balance_interval =3D 31
		   9 balance_interval =3D 16
		 415 balance_interval =3D 8
		3415 balance_interval =3D 4
		  21 balance_interval =3D 2

Reviewed-by: Vincent Guittot
Signed-off-by: Qais Yousef
---
 include/linux/sched.h |  1 +
 init/init_task.c      |  1 +
 kernel/sched/fair.c   | 66 ++++++++++++++++++++++++++++++++-----------
 3 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7eb7f31af796..37b95dbdb4cb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -835,6 +835,7 @@ struct task_struct {
 #endif
=20
 	unsigned int policy;
+	unsigned long max_allowed_capacity;
 	int nr_cpus_allowed;
 	const cpumask_t *cpus_ptr;
 	cpumask_t *user_cpus_ptr;
diff --git a/init/init_task.c b/init/init_task.c
index 4daee6d761c8..2558b719e053 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -77,6 +77,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) =
=3D {
 	.cpus_ptr =3D &init_task.cpus_mask,
 	.user_cpus_ptr =3D NULL,
 	.cpus_mask =3D CPU_MASK_ALL,
+	.max_allowed_capacity =3D SCHED_CAPACITY_SCALE,
 	.nr_cpus_allowed=3D NR_CPUS,
 	.mm =3D NULL,
 	.active_mm =3D &init_mm,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c8e50fbac345..3b88cf58fb45 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5092,15 +5092,19 @@ static inline int task_fits_cpu(struct task_struct =
*p, int cpu)
=20
 static inline void update_misfit_status(struct task_struct *p, struct rq *=
rq)
 {
+	int cpu =3D cpu_of(rq);
+
 	if (!sched_asym_cpucap_active())
 		return;
=20
-	if (!p || p->nr_cpus_allowed =3D=3D 1) {
-		rq->misfit_task_load =3D 0;
-		return;
-	}
+	/*
+	 * Affinity allows us to go
somewhere higher? Or are we on biggest + * available CPU already? Or do we fit into this CPU ? + */ + if (!p || (p->nr_cpus_allowed =3D=3D 1) || + (arch_scale_cpu_capacity(cpu) =3D=3D p->max_allowed_capacity) || + task_fits_cpu(p, cpu)) { =20 - if (task_fits_cpu(p, cpu_of(rq))) { rq->misfit_task_load =3D 0; return; } @@ -8247,6 +8251,36 @@ static void task_dead_fair(struct task_struct *p) remove_entity_load_avg(&p->se); } =20 +/* + * Set the max capacity the task is allowed to run at for misfit detection. + */ +static void set_task_max_allowed_capacity(struct task_struct *p) +{ + struct asym_cap_data *entry; + + if (!sched_asym_cpucap_active()) + return; + + rcu_read_lock(); + list_for_each_entry_rcu(entry, &asym_cap_list, link) { + cpumask_t *cpumask; + + cpumask =3D cpu_capacity_span(entry); + if (!cpumask_intersects(p->cpus_ptr, cpumask)) + continue; + + p->max_allowed_capacity =3D entry->capacity; + break; + } + rcu_read_unlock(); +} + +static void set_cpus_allowed_fair(struct task_struct *p, struct affinity_c= ontext *ctx) +{ + set_cpus_allowed_common(p, ctx); + set_task_max_allowed_capacity(p); +} + static int balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) { @@ -8255,6 +8289,8 @@ balance_fair(struct rq *rq, struct task_struct *prev,= struct rq_flags *rf) =20 return sched_balance_newidle(rq, rf) !=3D 0; } +#else +static inline void set_task_max_allowed_capacity(struct task_struct *p) {} #endif /* CONFIG_SMP */ =20 static void set_next_buddy(struct sched_entity *se) @@ -9604,16 +9640,10 @@ check_cpu_capacity(struct rq *rq, struct sched_doma= in *sd) (arch_scale_cpu_capacity(cpu_of(rq)) * 100)); } =20 -/* - * Check whether a rq has a misfit task and if it looks like we can actual= ly - * help that task: we can migrate the task to a CPU of higher capacity, or - * the task's current CPU is heavily pressured. 
- */ -static inline int check_misfit_status(struct rq *rq, struct sched_domain *= sd) +/* Check if the rq has a misfit task */ +static inline bool check_misfit_status(struct rq *rq) { - return rq->misfit_task_load && - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity || - check_cpu_capacity(rq, sd)); + return rq->misfit_task_load; } =20 /* @@ -11917,7 +11947,7 @@ static void nohz_balancer_kick(struct rq *rq) * When ASYM_CPUCAPACITY; see if there's a higher capacity CPU * to run the misfit task on. */ - if (check_misfit_status(rq, sd)) { + if (check_misfit_status(rq)) { flags =3D NOHZ_STATS_KICK | NOHZ_BALANCE_KICK; goto unlock; } @@ -12642,6 +12672,8 @@ static void task_fork_fair(struct task_struct *p) rq_lock(rq, &rf); update_rq_clock(rq); =20 + set_task_max_allowed_capacity(p); + cfs_rq =3D task_cfs_rq(current); curr =3D cfs_rq->curr; if (curr) @@ -12765,6 +12797,8 @@ static void switched_to_fair(struct rq *rq, struct = task_struct *p) { attach_task_cfs_rq(p); =20 + set_task_max_allowed_capacity(p); + if (task_on_rq_queued(p)) { /* * We were most likely switched from sched_rt, so @@ -13136,7 +13170,7 @@ DEFINE_SCHED_CLASS(fair) =3D { .rq_offline =3D rq_offline_fair, =20 .task_dead =3D task_dead_fair, - .set_cpus_allowed =3D set_cpus_allowed_common, + .set_cpus_allowed =3D set_cpus_allowed_fair, #endif =20 .task_tick =3D task_tick_fair, --=20 2.34.1 From nobody Fri Dec 26 11:23:00 2025 Received: from mail-wm1-f45.google.com (mail-wm1-f45.google.com [209.85.128.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EFFA646 for ; Sun, 24 Mar 2024 00:46:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241181; cv=none; 
b=F4NrTvwp+juM++rHyTMroF1QhyFq1UxlbFIDNJNcYP/h5MXyP3TevZHhF7R61GoXvj4iNuq5WCuP0i5vSdZjPaSmjj4yt8nr7QJbleo9D/5X4hJyKMEXQwbXFiTzZ7NACMI6AZ7P5N45hiBSWERXU3Hds0RMlqUZE9mVebHJkuY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241181; c=relaxed/simple; bh=AOzbBt2px6IAhcB3mpW6y/UMjXJbufYujPoS27Gf5Iw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=NNJD7uItgOGoirUWryLVasCU6Z/PNuOBzr4h9ZNtX0Fd25qOLS0VdJ8ScSxDUN4P/n7k7jjKqUqpITB901mnukg+VA3jIKZNL0AgParzTPHlMDqy8Tsc+0lhFS+OGevC3T+EJcm52NeXVyff1QTO3GIB5L37+zuZY7577tTSMac= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io; spf=pass smtp.mailfrom=layalina.io; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b=rSO/h7Ba; arc=none smtp.client-ip=209.85.128.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=layalina.io Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b="rSO/h7Ba" Received: by mail-wm1-f45.google.com with SMTP id 5b1f17b1804b1-414850d5924so2241085e9.1 for ; Sat, 23 Mar 2024 17:46:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=layalina-io.20230601.gappssmtp.com; s=20230601; t=1711241177; x=1711845977; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uUO5KvfpRMNgU51mJbx/1qemdCABN5ekHEl8L1wBADc=; b=rSO/h7BacU42KdbBCR3IMJduQyOUfCg49ENfWp0xDsMv5ZacId324ORTo5PUxmV3iY iqb9BAwF5VZd3DRjtCtQE5d2l2t28fNLHFt+Wp3CIPfjAEtGPxKVtGOlzO3FweFvRrhj bOiy+VLa039w8/VfXvofjsdC6xiYX08R2k4YhPRnbbUmFFLZb6aBexv4hZC69gHZhioW 
VYHr05igavVX+OJ4dW9/+5UGutCjOmH9VuojM7b6ljc55kLKdOe3aGAgoRKqJuTuoPJg F2XC8+urUOaSaR2CQi20IEc+pke8DMWzzTDNE/1LrxGaTxv8vbve/WWYxg+K9VzJiQJW tQ+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711241177; x=1711845977; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uUO5KvfpRMNgU51mJbx/1qemdCABN5ekHEl8L1wBADc=; b=q0mP5RFXmRmFtGY4RSINJWzv1VG/QyDeLnxb1tIrV+Xw/BdCEMjDOwEm/oV/kUxPyV MmQ6FseonEQn8/CbYGuaKiT+WKqadCMkqM0Kp6WhR2OjHjZU/ztmD7Jr5mKxpy9ahTxP bkSmRRJOf8JRA7f/hdml77RlOmLtLB8wp+BgBBkj1LTSaM0lZ2AaoZRVCU/86N3nj2qH atbMy10L/gGiz9OHjhXyNqJnchUZ5TKRDe2hliSziLHNQmFNZae+n0mcetX8Wc6ANEpE r4QXcZfDsqUTjLLoW5p6tVt8H6h7PHtwHy6U4ES18L650X3K1LAd5yqQXL0u4Zgx6ocI iiUw== X-Gm-Message-State: AOJu0Yzp/eQ5DS1fAwgw+UWt55M8iGIZZcyU/gfnEHQbBnpeYjFlmbSw DF3kTv/rs3ZRdbI4RbaThho9Qv0V0QnMHWKzWfw3I66goLWR1wM/30wKxKeAZoY= X-Google-Smtp-Source: AGHT+IEhh9sC66eyja7PpoxKhCyQwdcZ5Jis9V7utN6IHA9uELFaU59XzgYJAC6tAoLuXSRog35tLw== X-Received: by 2002:a5d:4e8c:0:b0:33e:b7d4:892f with SMTP id e12-20020a5d4e8c000000b0033eb7d4892fmr2101700wru.20.1711241177719; Sat, 23 Mar 2024 17:46:17 -0700 (PDT) Received: from airbuntu.. (host81-157-90-255.range81-157.btcentralplus.com. 
[81.157.90.255]) by smtp.gmail.com with ESMTPSA id i6-20020a05600c354600b00414674a1a40sm3778179wmq.45.2024.03.23.17.46.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 Mar 2024 17:46:17 -0700 (PDT) From: Qais Yousef To: Ingo Molnar , Peter Zijlstra , Vincent Guittot , Dietmar Eggemann Cc: linux-kernel@vger.kernel.org, "Pierre Gondois" , Qais Yousef Subject: [PATCH v8 3/4] sched/topology: Remove max_cpu_capacity from root_domain Date: Sun, 24 Mar 2024 00:45:51 +0000 Message-Id: <20240324004552.999936-4-qyousef@layalina.io> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240324004552.999936-1-qyousef@layalina.io> References: <20240324004552.999936-1-qyousef@layalina.io> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The value is no longer used as we now keep track of max_allowed_capacity for each task instead. Reviewed-by: Vincent Guittot Signed-off-by: Qais Yousef --- kernel/sched/sched.h | 2 -- kernel/sched/topology.c | 13 ++----------- 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index f77c00dddfe1..4f9e952d4fad 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -917,8 +917,6 @@ struct root_domain { cpumask_var_t rto_mask; struct cpupri cpupri; =20 - unsigned long max_cpu_capacity; - /* * NULL-terminated list of performance domains intersecting with the * CPUs of the rd. Protected by RCU. 
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c index 44ed3d0812ab..63aecd2a7a9f 100644 --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -2514,16 +2514,9 @@ build_sched_domains(const struct cpumask *cpu_map, s= truct sched_domain_attr *att /* Attach the domains */ rcu_read_lock(); for_each_cpu(i, cpu_map) { - unsigned long capacity; - rq =3D cpu_rq(i); sd =3D *per_cpu_ptr(d.sd, i); =20 - capacity =3D arch_scale_cpu_capacity(i); - /* Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing: */ - if (capacity > READ_ONCE(d.rd->max_cpu_capacity)) - WRITE_ONCE(d.rd->max_cpu_capacity, capacity); - cpu_attach_domain(sd, d.rd, i); =20 if (lowest_flag_domain(i, SD_CLUSTER)) @@ -2537,10 +2530,8 @@ build_sched_domains(const struct cpumask *cpu_map, s= truct sched_domain_attr *att if (has_cluster) static_branch_inc_cpuslocked(&sched_cluster_active); =20 - if (rq && sched_debug_verbose) { - pr_info("root domain span: %*pbl (max cpu_capacity =3D %lu)\n", - cpumask_pr_args(cpu_map), rq->rd->max_cpu_capacity); - } + if (rq && sched_debug_verbose) + pr_info("root domain span: %*pbl\n", cpumask_pr_args(cpu_map)); =20 ret =3D 0; error: --=20 2.34.1 From nobody Fri Dec 26 11:23:00 2025 Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F01110F7 for ; Sun, 24 Mar 2024 00:46:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241181; cv=none; b=Z6mDONR5NfSvARz+UOlbaF9SH8dmwo6iJfJNdXOPbUvNEfPb+tSmJ6lsxsjZKB4WeaXkiC8H9jsssGF0auXAPzId/OY3INIY2m4jejQrFuydYz1xSokpjCmwmLeXOlU7BwPpVnnk4wi5/eg78fxSj+HRZOP7kQaQ9+ld7BFBwdc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711241181; c=relaxed/simple; 
bh=vUXbbl8Gs+LBXu8bh5D8UeeeidKdQZUSJAtT8+qGjLg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=HC3agNH8/7S9dOlTkH/f5bqoshSUCcdF/XnqfUg5/zDk+LZC4NBPHjhm5nZCiG4kKz6A3ZnnEFPO6wcy7nFwgRGEwP4wel5Hzt6zXvKOotLW7P/N4/qzxet8asloI4VmKv9Brcz52duUYHEqqqatzBOoVTD+FRWrzJT0VBrcnPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io; spf=pass smtp.mailfrom=layalina.io; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b=TkDJjbB7; arc=none smtp.client-ip=209.85.128.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=layalina.io Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=layalina.io Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=layalina-io.20230601.gappssmtp.com header.i=@layalina-io.20230601.gappssmtp.com header.b="TkDJjbB7" Received: by mail-wm1-f44.google.com with SMTP id 5b1f17b1804b1-414811d8241so3896105e9.2 for ; Sat, 23 Mar 2024 17:46:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=layalina-io.20230601.gappssmtp.com; s=20230601; t=1711241178; x=1711845978; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=1fgqiNts6+Cbc6tMllJTEWHDx0DYaBorRTxJ4SzU6S4=; b=TkDJjbB7tJoly+89mTTFXvxXmTqPTmoCOL+bZSNTUvtaEe+mfXzxeCVvkF+ZQajP6V vaMZPXUFH34QbE3aU4fa6QO6jYO6lvWvOjF7vTxVeAbTniWruysTnmvxzBIq3q8HRQIe lSHKEBcAej+WYnO8SjD0K35OTEtul3+wcDRbP6lDtAuvEkKCSaTYENSXqVUtLMH+oJwN PixioLrQuxL3ij0S9Vp+kd5g5UkY7hukP9ZdtLbWCYZm++IOQi1iAq6UqcdGPryISQ/L vH3me5nREPOF3rkY4tXlTDyEtz7DLnGrjrwvM2fgKpJ/Sl1pxxbAhuU8xwYE8yOYGULO 0Aqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711241178; x=1711845978; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=1fgqiNts6+Cbc6tMllJTEWHDx0DYaBorRTxJ4SzU6S4=; b=kAMh+Lldkw24hSBKahtRKiHOasXmei+owuz2E0LG0iKvhbr3dvGQOz4w+FUVRinWHl Pb+8YrlvwQ7P9C5ECOT9jFs7voVeFRWP2rpJ6z1cKDb+LZmnWxWtfi5BYPO7sZTah3qY c6YsYdUEEvb6g+2RYjnzpMG+e0L9tLq7HienFRHdC0+ykRVzguHy+vx0NWzTlCcJWpri jhvtd6xc26BMid9BMrjVwCSIzhhF4xPIYSbCqcIJt57HXcHxZa0LzpUmgZTRjmWw5xc3 Gb5bcnzvxLox5NvdgY7+jCWa3Li7M/UDS0GLdjy2+EP5SslrUwOubVCp8VEAnOxUYJZv yJgA== X-Gm-Message-State: AOJu0YzYbm130ZCs+wJjOaRVcZEu7y38GzgLF5pB5l5dCAha2uYMLLFs SVgCDBYL3ozdoapnCDtujTsOis3rP8f14gB60TtZKhSpVo2NB4ORZGC6jQ1/j+c= X-Google-Smtp-Source: AGHT+IGUfW60lgVTsTKE1PBsEQ5OHEGDLD8qcl3Z2wM+u1m2H1+SFTIWIdFFfx4nyQnfbjnbw7Tuhg== X-Received: by 2002:a05:600c:20d4:b0:413:f75f:98b6 with SMTP id y20-20020a05600c20d400b00413f75f98b6mr2491841wmm.26.1711241178542; Sat, 23 Mar 2024 17:46:18 -0700 (PDT) Received: from airbuntu.. (host81-157-90-255.range81-157.btcentralplus.com. 
[81.157.90.255]) by smtp.gmail.com with ESMTPSA id i6-20020a05600c354600b00414674a1a40sm3778179wmq.45.2024.03.23.17.46.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 Mar 2024 17:46:18 -0700 (PDT)
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v8 4/4] sched/fair: Don't double balance_interval for migrate_misfit
Date: Sun, 24 Mar 2024 00:45:52 +0000
Message-Id: <20240324004552.999936-5-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240324004552.999936-1-qyousef@layalina.io>
References: <20240324004552.999936-1-qyousef@layalina.io>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

A failed misfit migration is not necessarily an indication that the system
is busy and that the load balancer needs to back off. Pushing
balance_interval high, however, could delay other misfit handling or other
types of imbalance.

Also don't pollute nr_balance_failed because of misfit failures. The value
is used for enabling cache hot migration and in the migrate_util/load
types, none of which should be impacted (skewed) by misfit failures.

Reviewed-by: Vincent Guittot
Signed-off-by: Qais Yousef
---
 kernel/sched/fair.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3b88cf58fb45..18da54da48a5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11443,8 +11443,12 @@ static int sched_balance_rq(int this_cpu, struct r=
q *this_rq,
 		 * We do not want newidle balance, which can be very
 		 * frequent, pollute the failure counter causing
 		 * excessive cache_hot migrations and active balances.
+		 *
+		 * Similarly for migrate_misfit, which is not related to
+		 * load/util migration: don't pollute nr_balance_failed.
 		 */
-		if (idle !=3D CPU_NEWLY_IDLE)
+		if (idle !=3D CPU_NEWLY_IDLE &&
+		    env.migration_type !=3D migrate_misfit)
 			sd->nr_balance_failed++;
=20
 		if (need_active_balance(&env)) {
@@ -11527,8 +11531,13 @@ static int sched_balance_rq(int this_cpu, struct r=
q *this_rq,
 	 * repeatedly reach this code, which would lead to balance_interval
 	 * skyrocketing in a short amount of time. Skip the balance_interval
 	 * increase logic to avoid that.
+	 *
+	 * Similarly for misfit migration, which is not necessarily an
+	 * indication that the system is busy and that lb needs to back off
+	 * to let it settle down.
 	 */
-	if (env.idle =3D=3D CPU_NEWLY_IDLE)
+	if (env.idle =3D=3D CPU_NEWLY_IDLE ||
+	    env.migration_type =3D=3D migrate_misfit)
 		goto out;
=20
 	/* tune up the balancing interval */
--=20
2.34.1
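
The effect of the two hunks in patch 4/4 can be modelled with a small
userspace sketch. It is a simplification under stated assumptions and not
the scheduler code: struct lb_state, count_failed_balance() and
tune_balance_interval() are hypothetical names, only the fields relevant
here are modelled, and the pinned-task handling around the interval
doubling is left out. The migrate_* names mirror the migration types
mentioned in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Migration types, named after the ones referred to in the series. */
enum migration_type {
	migrate_load,
	migrate_util,
	migrate_task,
	migrate_misfit,
};

/* Hypothetical subset of the per-domain load balancing state. */
struct lb_state {
	unsigned int balance_interval;
	unsigned int max_interval;
	unsigned int nr_balance_failed;
};

/* A failed attempt only counts if it was a regular load/util balance. */
static void count_failed_balance(struct lb_state *s, bool newly_idle,
				 enum migration_type type)
{
	assert(s);

	if (!newly_idle && type != migrate_misfit)
		s->nr_balance_failed++;
}

/* Back off by doubling the interval, except for newidle and misfit. */
static void tune_balance_interval(struct lb_state *s, bool newly_idle,
				  enum migration_type type)
{
	assert(s);

	if (newly_idle || type == migrate_misfit)
		return;

	if (s->balance_interval < s->max_interval)
		s->balance_interval *= 2;
}
```

In this model a run of failed misfit attempts leaves both
balance_interval and nr_balance_failed untouched, so a later load or
utilisation imbalance is handled with the interval it would have had
anyway.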