From nobody Mon Feb 9 03:30:20 2026
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v6 1/4] sched/topology: Export asym_capacity_list
Date: Tue, 20 Feb 2024 22:56:19 +0000
Message-Id: <20240220225622.2626569-2-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240220225622.2626569-1-qyousef@layalina.io>
References: <20240220225622.2626569-1-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

So that we can use it to iterate through available capacities in the
system. Sort asym_cap_list in descending order as expected users are
likely to be interested in the highest capacity first.

Make the list RCU protected to allow for cheap access in hot paths.

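For illustration only (not part of the patch): a minimal userspace sketch
of the descending ordering described above. It assumes a plain singly
linked list instead of the kernel's RCU-protected list_head, and a single
unsigned long instead of a cpumask. struct cap_node and
cap_list_add_cpu() are hypothetical names used only for this sketch.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct asym_cap_data (no RCU, no cpumask). */
struct cap_node {
	unsigned long capacity;
	unsigned long cpus;	/* bit n set means CPU n has this capacity */
	struct cap_node *next;
};

/*
 * Record that `cpu` has `capacity`, keeping the list sorted from the
 * highest capacity to the lowest. Returns 0 on success and -1 if the
 * allocation of a new node fails.
 */
int cap_list_add_cpu(struct cap_node **head, int cpu, unsigned long capacity)
{
	struct cap_node **pos = head;
	struct cap_node *node;

	/* Skip every entry with a strictly higher capacity. */
	while (*pos && (*pos)->capacity > capacity)
		pos = &(*pos)->next;

	if (*pos && (*pos)->capacity == capacity) {
		/* Capacity level already tracked, reuse its node. */
		node = *pos;
	} else {
		/* New level: link it in front of the first smaller entry. */
		node = calloc(1, sizeof(*node));
		if (!node)
			return -1;
		node->capacity = capacity;
		node->next = *pos;
		*pos = node;
	}

	node->cpus |= 1UL << cpu;
	return 0;
}
```

A reader walking such a list from the head sees the biggest capacity
first and can stop at the first entry that satisfies it.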
Signed-off-by: Qais Yousef
Reviewed-by: Vincent Guittot
---
 kernel/sched/sched.h    | 14 ++++++++++++++
 kernel/sched/topology.c | 43 ++++++++++++++++++++++++-----------------
 2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 001fe047bd5d..e85976bd2bab 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -109,6 +109,20 @@ extern int sysctl_sched_rt_period;
 extern int sysctl_sched_rt_runtime;
 extern int sched_rr_timeslice;
 
+/*
+ * Asymmetric CPU capacity bits
+ */
+struct asym_cap_data {
+	struct list_head link;
+	struct rcu_head rcu;
+	unsigned long capacity;
+	unsigned long cpus[];
+};
+
+extern struct list_head asym_cap_list;
+
+#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
+
 /*
  * Helpers for converting nanosecond timing to jiffy resolution
  */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 10d1391e7416..1505677e4247 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1329,24 +1329,13 @@ static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
 	update_group_capacity(sd, cpu);
 }
 
-/*
- * Asymmetric CPU capacity bits
- */
-struct asym_cap_data {
-	struct list_head link;
-	unsigned long capacity;
-	unsigned long cpus[];
-};
-
 /*
  * Set of available CPUs grouped by their corresponding capacities
  * Each list entry contains a CPU mask reflecting CPUs that share the same
  * capacity.
  * The lifespan of data is unlimited.
  */
-static LIST_HEAD(asym_cap_list);
-
-#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
+LIST_HEAD(asym_cap_list);
 
 /*
  * Verify whether there is any CPU capacity asymmetry in a given sched domain.
@@ -1386,21 +1375,39 @@ asym_cpu_capacity_classify(const struct cpumask *sd_span,
 
 }
 
+static void free_asym_cap_entry(struct rcu_head *head)
+{
+	struct asym_cap_data *entry = container_of(head, struct asym_cap_data, rcu);
+	kfree(entry);
+}
+
 static inline void asym_cpu_capacity_update_data(int cpu)
 {
 	unsigned long capacity = arch_scale_cpu_capacity(cpu);
-	struct asym_cap_data *entry = NULL;
+	struct asym_cap_data *insert_entry = NULL;
+	struct asym_cap_data *entry;
 
+	/*
+	 * Search if capacity already exists. If not, track the entry
+	 * where we should insert to keep the list ordered descendingly.
+	 */
 	list_for_each_entry(entry, &asym_cap_list, link) {
 		if (capacity == entry->capacity)
 			goto done;
+		else if (!insert_entry && capacity > entry->capacity)
+			insert_entry = list_prev_entry(entry, link);
 	}
 
 	entry = kzalloc(sizeof(*entry) + cpumask_size(), GFP_KERNEL);
 	if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
 		return;
 	entry->capacity = capacity;
-	list_add(&entry->link, &asym_cap_list);
+
+	/* If NULL then the new capacity is the smallest, add last. */
+	if (!insert_entry)
+		list_add_tail_rcu(&entry->link, &asym_cap_list);
+	else
+		list_add_rcu(&entry->link, &insert_entry->link);
 done:
 	__cpumask_set_cpu(cpu, cpu_capacity_span(entry));
 }
@@ -1423,8 +1430,8 @@ static void asym_cpu_capacity_scan(void)
 
 	list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
 		if (cpumask_empty(cpu_capacity_span(entry))) {
-			list_del(&entry->link);
-			kfree(entry);
+			list_del_rcu(&entry->link);
+			call_rcu(&entry->rcu, free_asym_cap_entry);
 		}
 	}
 
@@ -1434,8 +1441,8 @@ static void asym_cpu_capacity_scan(void)
 	 */
 	if (list_is_singular(&asym_cap_list)) {
 		entry = list_first_entry(&asym_cap_list, typeof(*entry), link);
-		list_del(&entry->link);
-		kfree(entry);
+		list_del_rcu(&entry->link);
+		call_rcu(&entry->rcu, free_asym_cap_entry);
 	}
 }
 
-- 
2.34.1

From nobody Mon Feb 9 03:30:20 2026
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v6 2/4] sched/fair: Check a task has a fitting cpu when updating misfit
Date: Tue, 20 Feb 2024 22:56:20 +0000
Message-Id: <20240220225622.2626569-3-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240220225622.2626569-1-qyousef@layalina.io>
References: <20240220225622.2626569-1-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

If a misfit task is affined to a subset of the possible cpus, we need to
verify that one of these cpus can fit it. Otherwise the load balancer
code will keep triggering needlessly, causing balance_interval to
increase in turn, until we end up in a situation where real imbalances
take a long time to address because of this impossible imbalance.

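For illustration only (not part of the patch): a minimal userspace sketch
of the check described above. It assumes the capacity levels are kept
sorted from highest to lowest and that a CPU set fits in one unsigned
long. struct cap_level, max_allowed_capacity() and
worth_flagging_misfit() are hypothetical names; the comparison is a
simplified model, not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity level: one capacity value and the CPUs that have it. */
struct cap_level {
	unsigned long capacity;
	unsigned long cpus;	/* bit n set means CPU n has this capacity */
};

/*
 * Return the highest capacity among the CPUs in `affinity`, or 0 when the
 * mask matches no level. `levels` must be sorted from highest capacity to
 * lowest, so the first intersecting level is the answer.
 */
unsigned long max_allowed_capacity(const struct cap_level *levels, size_t n,
				   unsigned long affinity)
{
	for (size_t i = 0; i < n; i++) {
		if (levels[i].cpus & affinity)
			return levels[i].capacity;
	}
	return 0;
}

/*
 * A task that does not fit its current CPU is only worth flagging as
 * misfit if its affinity allows a CPU with more capacity than the
 * current one.
 */
bool worth_flagging_misfit(unsigned long cur_cpu_capacity,
			   unsigned long max_allowed, bool fits_cur_cpu)
{
	if (fits_cur_cpu)
		return false;
	return cur_cpu_capacity < max_allowed;
}
```

With a task pinned to the little cores, max_allowed_capacity() returns
the little capacity, so no misfit is flagged and the load balancer is not
triggered for an imbalance it cannot fix.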
This can happen in the Android world where it's common for background
tasks to be restricted to little cores.

Similarly, if the task doesn't fit even on the biggest core, triggering
misfit is pointless as that core is the best we can ever get on this
system.

To detect that, we use asym_cap_list to iterate through the capacities
in the system to see if the task is able to run at a higher capacity
level based on its p->cpus_ptr. We do that when the affinity changes, a
fair task is forked, or when a task is switched to the fair policy. We
store the max_allowed_capacity in task_struct to allow for cheap
comparison in the fast path.

Improve check_misfit_status() function by removing redundant checks.
misfit_task_load will be 0 if the task can't move to a bigger CPU. And
nohz_balancer_kick() already checks for check_cpu_capacity() before
calling check_misfit_status().

Test:
=====

Add

	trace_printk("balance_interval = %lu\n", interval)

in get_sd_balance_interval().

run

	if [ "$MASK" != "0" ]; then
		adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
	fi
	sleep 10
	// parse ftrace buffer counting the occurrence of each value

Where MASK is either:

	* 0: no busy task running
	* 1: busy task is pinned to 1 cpu; handled today to not cause misfit
	* f: busy task pinned to little cores, simulates busy background task,
	     demonstrates the problem to be fixed

Results:
========

Note how occurrence of balance_interval = 128 overshoots for MASK = f.

BEFORE
------

	MASK=0

		   1 balance_interval = 175
		 120 balance_interval = 128
		 846 balance_interval = 64
		  55 balance_interval = 63
		 215 balance_interval = 32
		   2 balance_interval = 31
		   2 balance_interval = 16
		   4 balance_interval = 8
		1870 balance_interval = 4
		  65 balance_interval = 2

	MASK=1

		  27 balance_interval = 175
		  37 balance_interval = 127
		 840 balance_interval = 64
		 167 balance_interval = 63
		 449 balance_interval = 32
		  84 balance_interval = 31
		 304 balance_interval = 16
		1156 balance_interval = 8
		2781 balance_interval = 4
		 428 balance_interval = 2

	MASK=f

		   1 balance_interval = 175
		1328 balance_interval = 128
		  44 balance_interval = 64
		 101 balance_interval = 63
		  25 balance_interval = 32
		   5 balance_interval = 31
		  23 balance_interval = 16
		  23 balance_interval = 8
		4306 balance_interval = 4
		 177 balance_interval = 2

AFTER
-----

Note how the high values almost disappear for all MASK values. The
system has background tasks that could trigger the problem without
simulating it even with MASK=0.

	MASK=0

		 103 balance_interval = 63
		  19 balance_interval = 31
		 194 balance_interval = 8
		4827 balance_interval = 4
		 179 balance_interval = 2

	MASK=1

		 131 balance_interval = 63
		   1 balance_interval = 31
		  87 balance_interval = 8
		3600 balance_interval = 4
		   7 balance_interval = 2

	MASK=f

		   8 balance_interval = 127
		 182 balance_interval = 63
		   3 balance_interval = 31
		   9 balance_interval = 16
		 415 balance_interval = 8
		3415 balance_interval = 4
		  21 balance_interval = 2

Signed-off-by: Qais Yousef
Reviewed-by: Vincent Guittot
---
 include/linux/sched.h |  1 +
 init/init_task.c      |  1 +
 kernel/sched/fair.c   | 77 +++++++++++++++++++++++++++++++++----------
 3 files changed, 61 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ffe8f618ab86..774cddbeab09 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -835,6 +835,7 @@ struct task_struct {
 #endif
 
 	unsigned int			policy;
+	unsigned long			max_allowed_capacity;
 	int				nr_cpus_allowed;
 	const cpumask_t			*cpus_ptr;
 	cpumask_t			*user_cpus_ptr;
diff --git a/init/init_task.c b/init/init_task.c
index 7ecb458eb3da..b3dbab4c959e 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -77,6 +77,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 	.cpus_ptr	= &init_task.cpus_mask,
 	.user_cpus_ptr	= NULL,
 	.cpus_mask	= CPU_MASK_ALL,
+	.max_allowed_capacity	= SCHED_CAPACITY_SCALE,
 	.nr_cpus_allowed= NR_CPUS,
 	.mm		= NULL,
 	.active_mm	= &init_mm,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8e30e2bb77a0..20006fcf7df2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5092,24 +5092,35 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
 
 static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
 {
+	unsigned long cpu_cap;
+	int cpu = cpu_of(rq);
+
 	if (!sched_asym_cpucap_active())
 		return;
 
-	if (!p || p->nr_cpus_allowed == 1) {
-		rq->misfit_task_load = 0;
-		return;
-	}
+	if (!p || p->nr_cpus_allowed == 1)
+		goto out;
 
-	if (task_fits_cpu(p, cpu_of(rq))) {
-		rq->misfit_task_load = 0;
-		return;
-	}
+	cpu_cap = arch_scale_cpu_capacity(cpu);
+
+	/*
+	 * Affinity allows us to go somewhere higher? Or are we on biggest
+	 * available CPU already?
+	 */
+	if (cpu_cap == p->max_allowed_capacity)
+		goto out;
+
+	if (task_fits_cpu(p, cpu))
+		goto out;
 
 	/*
 	 * Make sure that misfit_task_load will not be null even if
 	 * task_h_load() returns 0.
 	 */
 	rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
+	return;
+out:
+	rq->misfit_task_load = 0;
 }
 
 #else /* CONFIG_SMP */
@@ -8241,6 +8252,36 @@ static void task_dead_fair(struct task_struct *p)
 	remove_entity_load_avg(&p->se);
 }
 
+/*
+ * Check the max capacity the task is allowed to run at for misfit detection.
+ */
+static void set_task_max_allowed_capacity(struct task_struct *p)
+{
+	struct asym_cap_data *entry;
+
+	if (!sched_asym_cpucap_active())
+		return;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(entry, &asym_cap_list, link) {
+		cpumask_t *cpumask;
+
+		cpumask = cpu_capacity_span(entry);
+		if (!cpumask_intersects(p->cpus_ptr, cpumask))
+			continue;
+
+		p->max_allowed_capacity = entry->capacity;
+		break;
+	}
+	rcu_read_unlock();
+}
+
+static void set_cpus_allowed_fair(struct task_struct *p, struct affinity_context *ctx)
+{
+	set_cpus_allowed_common(p, ctx);
+	set_task_max_allowed_capacity(p);
+}
+
 static int
 balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 {
@@ -8249,6 +8290,8 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 
 	return newidle_balance(rq, rf) != 0;
 }
+#else
+static inline void set_task_max_allowed_capacity(struct task_struct *p) {}
 #endif /* CONFIG_SMP */
 
 static void set_next_buddy(struct sched_entity *se)
@@ -9601,16 +9644,10 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
 				(arch_scale_cpu_capacity(cpu_of(rq)) * 100));
 }
 
-/*
- * Check whether a rq has a misfit task and if it looks like we can actually
- * help that task: we can migrate the task to a CPU of higher capacity, or
- * the task's current CPU is heavily pressured.
- */
-static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
+/* Check if the rq has a misfit task */
+static inline bool check_misfit_status(struct rq *rq, struct sched_domain *sd)
 {
-	return rq->misfit_task_load &&
-		(arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
-		 check_cpu_capacity(rq, sd));
+	return rq->misfit_task_load;
 }
 
 /*
@@ -12645,6 +12682,8 @@ static void task_fork_fair(struct task_struct *p)
 	rq_lock(rq, &rf);
 	update_rq_clock(rq);
 
+	set_task_max_allowed_capacity(p);
+
 	cfs_rq = task_cfs_rq(current);
 	curr = cfs_rq->curr;
 	if (curr)
@@ -12768,6 +12807,8 @@ static void switched_to_fair(struct rq *rq, struct task_struct *p)
 {
 	attach_task_cfs_rq(p);
 
+	set_task_max_allowed_capacity(p);
+
 	if (task_on_rq_queued(p)) {
 		/*
 		 * We were most likely switched from sched_rt, so
@@ -13139,7 +13180,7 @@ DEFINE_SCHED_CLASS(fair) = {
 	.rq_offline		= rq_offline_fair,
 
 	.task_dead		= task_dead_fair,
-	.set_cpus_allowed	= set_cpus_allowed_common,
+	.set_cpus_allowed	= set_cpus_allowed_fair,
 #endif
 
 	.task_tick		= task_tick_fair,
-- 
2.34.1

From nobody Mon Feb 9 03:30:20 2026
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v6 3/4] sched/topology: Remove max_cpu_capacity from root_domain
Date: Tue, 20 Feb 2024 22:56:21 +0000
Message-Id: <20240220225622.2626569-4-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240220225622.2626569-1-qyousef@layalina.io>
References: <20240220225622.2626569-1-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

The value is no longer used as we now keep track of max_allowed_capacity
for each task instead.

Signed-off-by: Qais Yousef
Reviewed-by: Vincent Guittot
---
 kernel/sched/sched.h    |  2 --
 kernel/sched/topology.c | 13 ++-----------
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e85976bd2bab..bc9e598d6f62 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -917,8 +917,6 @@ struct root_domain {
 	cpumask_var_t		rto_mask;
 	struct cpupri		cpupri;
 
-	unsigned long		max_cpu_capacity;
-
 	/*
 	 * NULL-terminated list of performance domains intersecting with the
 	 * CPUs of the rd. Protected by RCU.
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 1505677e4247..a57c006d2923 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2513,16 +2513,9 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	/* Attach the domains */
 	rcu_read_lock();
 	for_each_cpu(i, cpu_map) {
-		unsigned long capacity;
-
 		rq = cpu_rq(i);
 		sd = *per_cpu_ptr(d.sd, i);
 
-		capacity = arch_scale_cpu_capacity(i);
-		/* Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing: */
-		if (capacity > READ_ONCE(d.rd->max_cpu_capacity))
-			WRITE_ONCE(d.rd->max_cpu_capacity, capacity);
-
 		cpu_attach_domain(sd, d.rd, i);
 
 		if (lowest_flag_domain(i, SD_CLUSTER))
@@ -2536,10 +2529,8 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	if (has_cluster)
 		static_branch_inc_cpuslocked(&sched_cluster_active);
 
-	if (rq && sched_debug_verbose) {
-		pr_info("root domain span: %*pbl (max cpu_capacity = %lu)\n",
-			cpumask_pr_args(cpu_map), rq->rd->max_cpu_capacity);
-	}
+	if (rq && sched_debug_verbose)
+		pr_info("root domain span: %*pbl\n", cpumask_pr_args(cpu_map));
 
 	ret = 0;
 error:
-- 
2.34.1

From nobody Mon Feb 9 03:30:20 2026
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, "Pierre Gondois", Qais Yousef
Subject: [PATCH v6 4/4] sched/fair: Don't double balance_interval for migrate_misfit
Date: Tue, 20 Feb 2024 22:56:22 +0000
Message-Id: <20240220225622.2626569-5-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240220225622.2626569-1-qyousef@layalina.io>
References: <20240220225622.2626569-1-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

A failed misfit migration is not necessarily an indication that the
system is busy and that the load balancer needs to back off. Pushing
balance_interval high, however, could delay other misfit migrations or
the handling of other types of imbalance.

Also don't pollute nr_balance_failed because of misfit failures. The
value is used for enabling cache hot migration and in migrate_util/load
types. None of these should be impacted (skewed) by misfit failures.

Signed-off-by: Qais Yousef
Reviewed-by: Vincent Guittot
---
 kernel/sched/fair.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 20006fcf7df2..4c1235a5dd60 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11467,8 +11467,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 * We do not want newidle balance, which can be very
 		 * frequent, pollute the failure counter causing
 		 * excessive cache_hot migrations and active balances.
+		 *
+		 * Similarly for migrate_misfit which is not related to
+		 * load/util migration, don't pollute nr_balance_failed.
 		 */
-		if (idle != CPU_NEWLY_IDLE)
+		if (idle != CPU_NEWLY_IDLE &&
+		    env.migration_type != migrate_misfit)
 			sd->nr_balance_failed++;
 
 		if (need_active_balance(&env)) {
@@ -11551,8 +11555,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 	 * repeatedly reach this code, which would lead to balance_interval
 	 * skyrocketing in a short amount of time. Skip the balance_interval
 	 * increase logic to avoid that.
+	 *
+	 * Similarly misfit migration which is not necessarily an indication of
+	 * the system being busy and requires lb to backoff to let it settle
+	 * down.
 	 */
-	if (env.idle == CPU_NEWLY_IDLE)
+	if (env.idle == CPU_NEWLY_IDLE ||
+	    env.migration_type == migrate_misfit)
 		goto out;
 
 	/* tune up the balancing interval */
-- 
2.34.1
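
For illustration only (not part of the series): a minimal userspace model
of the accounting rule in the last patch. It collapses the two separate
sites in load_balance() into one helper, which is a simplification, and
struct lb_state, the enums and account_failed_balance() are hypothetical
names that do not exist in the kernel.

```c
/* Hypothetical stand-ins for the kernel's idle and migration types. */
enum idle_kind { IDLE_NEWLY, IDLE_OTHER };
enum mig_kind { MIG_LOAD, MIG_UTIL, MIG_TASK, MIG_MISFIT };

/* Hypothetical subset of the per-domain balancing state. */
struct lb_state {
	unsigned int nr_balance_failed;
	unsigned long balance_interval;
	unsigned long max_interval;
};

/*
 * Account one failed balance attempt. Newidle balances and misfit
 * migrations neither bump the failure counter nor double the interval,
 * so they cannot delay the handling of other imbalances.
 */
void account_failed_balance(struct lb_state *s, enum idle_kind idle,
			    enum mig_kind type)
{
	if (idle == IDLE_NEWLY || type == MIG_MISFIT)
		return;

	s->nr_balance_failed++;
	if (s->balance_interval < s->max_interval)
		s->balance_interval *= 2;
}
```

In this model a run of failed misfit attempts leaves balance_interval
where it was, while a failed load migration still doubles it up to
max_interval.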