From nobody Sat Feb 7 03:56:35 2026
From: Chuyi Zhou
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
 longman@redhat.com, riel@surriel.com
Cc: chengming.zhou@linux.dev, kprateek.nayak@amd.com,
 linux-kernel@vger.kernel.org, Chuyi Zhou, Madadi Vineeth Reddy
Subject: [PATCH v3 1/3] sched/fair: Remove unused task_numa_migrate return value
Date: Mon, 13 Jan 2025 15:30:48 +0800
Message-Id: <20250113073050.2811925-2-zhouchuyi@bytedance.com>
In-Reply-To: <20250113073050.2811925-1-zhouchuyi@bytedance.com>
References: <20250113073050.2811925-1-zhouchuyi@bytedance.com>

The initial NUMA Balancing implementation used the return value of
task_numa_migrate() to retry NUMA Balancing, in commit 6b9a7460b6ba
("sched/numa: Retry migration of tasks to CPU on a preferred node").
However, in the same series [1], Mel also included an optimization from
Rik that retries NUMA Balancing periodically, irrespective of the return
value of task_numa_migrate(), in commit 2739d3eef3a9 ("sched/numa: Retry
task_numa_migrate() periodically").

The return value of task_numa_migrate() is now unused, so remove it.

[1] https://lore.kernel.org/all/1381141781-10992-34-git-send-email-mgorman@suse.de/

Reviewed-by: K Prateek Nayak
Reviewed-by: Madadi Vineeth Reddy
Signed-off-by: Chuyi Zhou
---
 kernel/sched/fair.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d5127d9beaea..f544012b9320 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2486,7 +2486,7 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 	}
 }
 
-static int task_numa_migrate(struct task_struct *p)
+static void task_numa_migrate(struct task_struct *p)
 {
 	struct task_numa_env env = {
 		.p = p,
@@ -2531,7 +2531,7 @@ static int task_numa_migrate(struct task_struct *p)
 	 */
 	if (unlikely(!sd)) {
 		sched_setnuma(p, task_node(p));
-		return -EINVAL;
+		return;
 	}
 
 	env.dst_nid = p->numa_preferred_nid;
@@ -2600,7 +2600,7 @@ static int task_numa_migrate(struct task_struct *p)
 	/* No better CPU than the current one was found. */
 	if (env.best_cpu == -1) {
 		trace_sched_stick_numa(p, env.src_cpu, NULL, -1);
-		return -EAGAIN;
+		return;
 	}
 
 	best_rq = cpu_rq(env.best_cpu);
@@ -2609,7 +2609,7 @@ static int task_numa_migrate(struct task_struct *p)
 		WRITE_ONCE(best_rq->numa_migrate_on, 0);
 		if (ret != 0)
 			trace_sched_stick_numa(p, env.src_cpu, NULL, env.best_cpu);
-		return ret;
+		return;
 	}
 
 	ret = migrate_swap(p, env.best_task, env.best_cpu, env.src_cpu);
@@ -2618,7 +2618,6 @@ static int task_numa_migrate(struct task_struct *p)
 	if (ret != 0)
 		trace_sched_stick_numa(p, env.src_cpu, env.best_task, env.best_cpu);
 	put_task_struct(env.best_task);
-	return ret;
 }
 
 /* Attempt to migrate a task to a CPU on the preferred node. */
-- 
2.20.1

From nobody Sat Feb 7 03:56:35 2026
From: Chuyi Zhou
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
 longman@redhat.com, riel@surriel.com
Cc: chengming.zhou@linux.dev, kprateek.nayak@amd.com,
 linux-kernel@vger.kernel.org, Chuyi Zhou
Subject: [PATCH v3 2/3] sched/fair: Introduce per cpu numa_balance_mask
Date: Mon, 13 Jan 2025 15:30:49 +0800
Message-Id: <20250113073050.2811925-3-zhouchuyi@bytedance.com>
In-Reply-To: <20250113073050.2811925-1-zhouchuyi@bytedance.com>
References: <20250113073050.2811925-1-zhouchuyi@bytedance.com>

Introduce a per-CPU numa_balance_mask. Similar to select_rq_mask, it is
used as a scratch cpumask for the candidate CPU search and the NUMA
stats update. This simplifies the next patch and removes the need to
check repeatedly, during iteration, whether a candidate CPU is in
env->p->cpus_ptr.

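As a rough userspace illustration of the idea (not kernel code), the
sketch below models CPU masks as 64-bit integers and compares testing
the affinity mask on every iteration with intersecting once into a
scratch mask. The type mask_t and both function names are hypothetical
stand-ins chosen for this sketch; node_mask plays the role of
cpumask_of_node() and allowed_mask the role of p->cpus_ptr.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is present. */
typedef uint64_t mask_t;

/* Old style: walk the node mask and test the affinity mask on every step. */
int count_candidates_filtered(mask_t node_mask, mask_t allowed_mask)
{
	int n = 0;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (!(node_mask & ((mask_t)1 << cpu)))
			continue;
		/* Skip this CPU if the task is not allowed to run on it. */
		if (!(allowed_mask & ((mask_t)1 << cpu)))
			continue;
		n++;
	}
	return n;
}

/* New style: intersect once into a scratch mask, then walk only that mask. */
int count_candidates_precomputed(mask_t node_mask, mask_t allowed_mask)
{
	mask_t scratch = node_mask & allowed_mask; /* models cpumask_and() */
	int n = 0;

	for (int cpu = 0; cpu < 64; cpu++)
		if (scratch & ((mask_t)1 << cpu))
			n++;
	return n;
}
```

With node CPUs 4-7 (0xF0) and allowed CPUs 2-5 (0x3C), both functions
count the same two candidates, CPUs 4 and 5.
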
Signed-off-by: Chuyi Zhou
---
 kernel/sched/fair.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f544012b9320..53fd95129b48 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1448,6 +1448,9 @@ unsigned int sysctl_numa_balancing_scan_delay = 1000;
 /* The page with hint page fault latency < threshold in ms is considered hot */
 unsigned int sysctl_numa_balancing_hot_threshold = MSEC_PER_SEC;
 
+/* Working cpumask for task_numa_migrate() */
+static DEFINE_PER_CPU(cpumask_var_t, numa_balance_mask);
+
 struct numa_group {
 	refcount_t refcount;
 
@@ -2047,6 +2050,7 @@ struct numa_stats {
 struct task_numa_env {
 	struct task_struct *p;
 
+	struct cpumask *cpus;
 	int src_cpu, src_nid;
 	int dst_cpu, dst_nid;
 	int imb_numa_nr;
@@ -2121,8 +2125,10 @@ static void update_numa_stats(struct task_numa_env *env,
 	memset(ns, 0, sizeof(*ns));
 	ns->idle_cpu = -1;
 
+	cpumask_copy(env->cpus, cpumask_of_node(nid));
+
 	rcu_read_lock();
-	for_each_cpu(cpu, cpumask_of_node(nid)) {
+	for_each_cpu(cpu, env->cpus) {
 		struct rq *rq = cpu_rq(cpu);
 
 		ns->load += cpu_load(rq);
@@ -2144,7 +2150,7 @@ static void update_numa_stats(struct task_numa_env *env,
 	}
 	rcu_read_unlock();
 
-	ns->weight = cpumask_weight(cpumask_of_node(nid));
+	ns->weight = cpumask_weight(env->cpus);
 
 	ns->node_type = numa_classify(env->imbalance_pct, ns);
 
@@ -2163,11 +2169,9 @@ static void task_numa_assign(struct task_numa_env *env,
 		int start = env->dst_cpu;
 
 		/* Find alternative idle CPU. */
-		for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
-			if (cpu == env->best_cpu || !idle_cpu(cpu) ||
-			    !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
+		for_each_cpu_wrap(cpu, env->cpus, start + 1) {
+			if (cpu == env->best_cpu || !idle_cpu(cpu))
 				continue;
-			}
 
 			env->dst_cpu = cpu;
 			rq = cpu_rq(env->dst_cpu);
@@ -2434,6 +2438,8 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 	bool maymove = false;
 	int cpu;
 
+	cpumask_and(env->cpus, cpumask_of_node(env->dst_nid), env->p->cpus_ptr);
+
 	/*
 	 * If dst node has spare capacity, then check if there is an
 	 * imbalance that would be overruled by the load balancer.
@@ -2475,11 +2481,7 @@ static void task_numa_find_cpu(struct task_numa_env *env,
 		maymove = !load_too_imbalanced(src_load, dst_load, env);
 	}
 
-	for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
-		/* Skip this CPU if the source task cannot migrate */
-		if (!cpumask_test_cpu(cpu, env->p->cpus_ptr))
-			continue;
-
+	for_each_cpu(cpu, env->cpus) {
 		env->dst_cpu = cpu;
 		if (task_numa_compare(env, taskimp, groupimp, maymove))
 			break;
@@ -2534,6 +2536,12 @@ static void task_numa_migrate(struct task_struct *p)
 		return;
 	}
 
+	/*
+	 * per-cpu numa_balance_mask and rq->rd->span usage
+	 */
+	preempt_disable();
+
+	env.cpus = this_cpu_cpumask_var_ptr(numa_balance_mask);
 	env.dst_nid = p->numa_preferred_nid;
 	dist = env.dist = node_distance(env.src_nid, env.dst_nid);
 	taskweight = task_weight(p, env.src_nid, dist);
@@ -2579,6 +2587,8 @@ static void task_numa_migrate(struct task_struct *p)
 		}
 	}
 
+	preempt_enable();
+
 	/*
 	 * If the task is part of a workload that spans multiple NUMA nodes,
 	 * and is migrating into one of the workload's active nodes, remember
@@ -13638,6 +13648,10 @@ __init void init_sched_fair_class(void)
 		zalloc_cpumask_var_node(&per_cpu(should_we_balance_tmpmask, i),
 					GFP_KERNEL, cpu_to_node(i));
 
+#ifdef CONFIG_NUMA_BALANCING
+		zalloc_cpumask_var_node(&per_cpu(numa_balance_mask, i), GFP_KERNEL, cpu_to_node(i));
+#endif
+
 #ifdef CONFIG_CFS_BANDWIDTH
 		INIT_CSD(&cpu_rq(i)->cfsb_csd, __cfsb_csd_unthrottle, cpu_rq(i));
 		INIT_LIST_HEAD(&cpu_rq(i)->cfsb_csd_list);
-- 
2.20.1

From nobody Sat Feb 7 03:56:35 2026
From: Chuyi Zhou
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
 longman@redhat.com, riel@surriel.com
Cc: chengming.zhou@linux.dev, kprateek.nayak@amd.com,
 linux-kernel@vger.kernel.org, Chuyi Zhou
Subject: [PATCH v3 3/3] sched/fair: Take sched_domain into account in task_numa_migrate
Date: Mon, 13 Jan 2025 15:30:50 +0800
Message-Id: <20250113073050.2811925-4-zhouchuyi@bytedance.com>
In-Reply-To: <20250113073050.2811925-1-zhouchuyi@bytedance.com>
References: <20250113073050.2811925-1-zhouchuyi@bytedance.com>

When migrating a task in task_numa_migrate(), we need to take the
scheduling domain into account. Specifically:

When searching for best_cpu, we should skip CPUs that are not in the
current scheduling domain, such as isolated CPUs. Currently we only
restrict the search to p->cpus_ptr, which is not sufficient.
Cpuset-configured partitions are always reflected in each member task's
cpumask. With the isolcpus= kernel command-line option, however, the
isolated CPUs are simply omitted from the sched_domains without further
restrictions on tasks' cpumasks.
If a task's cpumask includes isolated CPUs, the task may be migrated to
an isolated CPU.

In update_numa_stats(), skip CPUs that are not in the scheduling domain.
update_numa_stats() is meant to be compatible with standard load
balancing, so CPUs that do not participate in load balancing, such as
isolated CPUs, should be skipped as well.

Fix the above by taking src_rq->rd->span into account in
task_numa_migrate(). Note that src_cpu itself may be in an isolated
domain, in which case its rd may point to def_root_domain and the span
may not be what we expect. In such cases, bail out early by checking
whether sd_numa is NULL.

Signed-off-by: Chuyi Zhou
---
 kernel/sched/fair.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 53fd95129b48..764797dd3744 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2120,12 +2120,13 @@ static void update_numa_stats(struct task_numa_env *env,
 			      struct numa_stats *ns, int nid,
 			      bool find_idle)
 {
+	cpumask_t *span = cpu_rq(env->src_cpu)->rd->span;
 	int cpu, idle_core = -1;
 
 	memset(ns, 0, sizeof(*ns));
 	ns->idle_cpu = -1;
 
-	cpumask_copy(env->cpus, cpumask_of_node(nid));
+	cpumask_and(env->cpus, span, cpumask_of_node(nid));
 
 	rcu_read_lock();
 	for_each_cpu(cpu, env->cpus) {
@@ -2435,10 +2436,12 @@ static bool task_numa_compare(struct task_numa_env *env,
 static void task_numa_find_cpu(struct task_numa_env *env,
 				long taskimp, long groupimp)
 {
+	cpumask_t *span = cpu_rq(env->src_cpu)->rd->span;
 	bool maymove = false;
 	int cpu;
 
 	cpumask_and(env->cpus, cpumask_of_node(env->dst_nid), env->p->cpus_ptr);
+	cpumask_and(env->cpus, env->cpus, span);
 
 	/*
 	 * If dst node has spare capacity, then check if there is an
@@ -2503,10 +2506,10 @@ static void task_numa_migrate(struct task_struct *p)
 		.best_cpu = -1,
 	};
 	unsigned long taskweight, groupweight;
+	struct rq *best_rq, *src_rq;
 	struct sched_domain *sd;
 	long taskimp, groupimp;
 	struct numa_group *ng;
-	struct rq *best_rq;
 	int nid, ret, dist;
 
 	/*
@@ -2530,6 +2533,9 @@ static void task_numa_migrate(struct task_struct *p)
 	 * balance domains, some of which do not cross NUMA boundaries.
 	 * Tasks that are "trapped" in such domains cannot be migrated
 	 * elsewhere, so there is no point in (re)trying.
+	 *
+	 * Another situation is that src_cpu is in the isolated domain,
+	 * if so, bail out early.
 	 */
 	if (unlikely(!sd)) {
 		sched_setnuma(p, task_node(p));
@@ -2541,6 +2547,7 @@ static void task_numa_migrate(struct task_struct *p)
 	 */
 	preempt_disable();
 
+	src_rq = cpu_rq(env.src_cpu);
 	env.cpus = this_cpu_cpumask_var_ptr(numa_balance_mask);
 	env.dst_nid = p->numa_preferred_nid;
 	dist = env.dist = node_distance(env.src_nid, env.dst_nid);
@@ -2567,6 +2574,10 @@ static void task_numa_migrate(struct task_struct *p)
 			if (nid == env.src_nid || nid == p->numa_preferred_nid)
 				continue;
 
+			if (unlikely(!cpumask_intersects(src_rq->rd->span,
+							 cpumask_of_node(nid))))
+				continue;
+
 			dist = node_distance(env.src_nid, env.dst_nid);
 			if (sched_numa_topology_type == NUMA_BACKPLANE &&
 						dist != env.dist) {
-- 
2.20.1
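
The effect of the span filtering in this patch can be sketched in
userspace as follows. This is only an illustration under simplifying
assumptions, not kernel code: masks are plain 64-bit integers, and
mask_t, node_in_span() and candidate_cpus() are hypothetical names that
model cpumask_intersects() and the two cpumask_and() calls in
task_numa_find_cpu().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is present. */
typedef uint64_t mask_t;

/* Models cpumask_intersects(): does the node have any CPU inside the span? */
bool node_in_span(mask_t span, mask_t node_mask)
{
	return (span & node_mask) != 0;
}

/*
 * Models the candidate mask built in task_numa_find_cpu():
 * CPUs of the destination node, restricted to the task's affinity,
 * then restricted to the root domain span of the source CPU.
 */
mask_t candidate_cpus(mask_t node_mask, mask_t allowed_mask, mask_t span)
{
	mask_t cpus = node_mask & allowed_mask;

	return cpus & span;
}
```

For example, with CPUs 6-7 isolated (span 0x3F), a node covering CPUs
4-7 (0xF0) and an unrestricted task (0xFF) leave only CPUs 4-5 (0x30)
as candidates, while a node made up solely of CPUs 6-7 (0xC0) does not
intersect the span and is skipped.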