From: Qiliang Yuan
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org
Cc: dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org,
 yuanql9@chinatelecom.cn, Qiliang Yuan
Subject: [PATCH] sched/fair: Cache NUMA node statistics to avoid O(N) scanning
Date: Thu, 22 Jan 2026 11:16:46 -0500
Message-ID: <20260122161647.142704-1-realwujing@gmail.com>
X-Mailer: git-send-email 2.51.0
X-Mailing-List: linux-kernel@vger.kernel.org

Optimize update_numa_stats() by reusing the group statistics that the
load balancer already computes for NUMA-level sched domains. When the
cached stats are fresh (updated within the last 10 ms) and the caller
does not ask for an idle-CPU search, this reduces the cost of the NUMA
balancing hot path from O(CPUs_per_node) to O(1). Otherwise the
existing per-CPU scan is used unchanged.

Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
---
 kernel/sched/fair.c  | 35 +++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h |  7 +++++++
 2 files changed, 42 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e71302282671..dc46262bd227 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2099,11 +2099,36 @@ static void update_numa_stats(struct task_numa_env *env,
 			      bool find_idle)
 {
 	int cpu, idle_core = -1;
+	struct sched_domain *sd;
+	struct sched_group *sg;
 
 	memset(ns, 0, sizeof(*ns));
 	ns->idle_cpu = -1;
 
 	rcu_read_lock();
+	/* Algorithmic Optimization: Avoid O(N) scan by using cached stats from load balancer */
+	sd = rcu_dereference(per_cpu(sd_numa, env->src_cpu));
+	if (sd && !find_idle) {
+		sg = sd->groups;
+		do {
+			/* Check if this group corresponds to the node we are interested in */
+			if (cpumask_test_cpu(cpumask_first(cpumask_of_node(nid)), sched_group_span(sg))) {
+				/* Use cached stats if they are recent enough (e.g. within 10ms) */
+				if (time_before(jiffies, sg->sgc->stats_update + msecs_to_jiffies(10))) {
+					ns->load = sg->sgc->load;
+					ns->runnable = sg->sgc->runnable;
+					ns->util = sg->sgc->util;
+					ns->nr_running = sg->sgc->nr_running;
+					ns->compute_capacity = sg->sgc->capacity;
+					rcu_read_unlock();
+					goto skip_scan;
+				}
+				break;
+			}
+			sg = sg->next;
+		} while (sg != sd->groups);
+	}
+
 	for_each_cpu(cpu, cpumask_of_node(nid)) {
 		struct rq *rq = cpu_rq(cpu);
 
@@ -2126,6 +2151,7 @@ static void update_numa_stats(struct task_numa_env *env,
 	}
 	rcu_read_unlock();
 
+skip_scan:
 	ns->weight = cpumask_weight(cpumask_of_node(nid));
 
 	ns->node_type = numa_classify(env->imbalance_pct, ns);
@@ -10488,6 +10514,15 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	if (sgs->group_type == group_overloaded)
 		sgs->avg_load = (sgs->group_load * SCHED_CAPACITY_SCALE) /
 				sgs->group_capacity;
+
+	/* Algorithmic Optimization: Cache group stats for O(1) NUMA lookups */
+	if (env->sd->flags & SD_NUMA) {
+		group->sgc->nr_running = sgs->sum_h_nr_running;
+		group->sgc->load = sgs->group_load;
+		group->sgc->util = sgs->group_util;
+		group->sgc->runnable = sgs->group_runnable;
+		WRITE_ONCE(group->sgc->stats_update, jiffies);
+	}
 }
 
 /**
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d30cca6870f5..81160790993e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2105,6 +2105,13 @@ struct sched_group_capacity {
 
 	int id;
 
+	/* O(1) NUMA stats cache */
+	unsigned long nr_running;
+	unsigned long load;
+	unsigned long util;
+	unsigned long runnable;
+	unsigned long stats_update;
+
 	unsigned long cpumask[]; /* Balance mask */
 };
 
-- 
2.51.0
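
Note for readers: the freshness test in update_numa_stats() can be tried
outside the kernel tree. Below is a minimal userspace sketch of the same
idea, not the kernel code. All names in it (tick_t, node_stats,
node_cache, tick_before, cache_store, cache_lookup) are hypothetical
stand-ins chosen for illustration. A plain unsigned counter plays the
role of jiffies, and tick_before() is assumed to mirror the
wraparound-safe comparison that time_before() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
typedef unsigned long tick_t;

struct node_stats {
	unsigned long load;
	unsigned long nr_running;
};

struct node_cache {
	struct node_stats stats;
	tick_t stamp;	/* tick at which stats were last refreshed */
};

/* Wraparound-safe "a is before b", the same idea as time_before(). */
static bool tick_before(tick_t a, tick_t b)
{
	return (long)(a - b) < 0;
}

/* Writer side: record the stats together with the current tick. */
static void cache_store(struct node_cache *c, const struct node_stats *s,
			tick_t now)
{
	assert(c != NULL && s != NULL);
	c->stats = *s;
	c->stamp = now;
}

/*
 * Reader side: copy the cached stats into *out and return true if they
 * are younger than max_age ticks. Return false otherwise, so the caller
 * can fall back to the full scan.
 */
static bool cache_lookup(const struct node_cache *c, tick_t now,
			 tick_t max_age, struct node_stats *out)
{
	assert(c != NULL && out != NULL);
	if (!tick_before(now, c->stamp + max_age))
		return false;
	*out = c->stats;
	return true;
}
```

A caller would try cache_lookup() first and run the per-CPU loop only when
it returns false. The signed-difference comparison keeps the check correct
when the tick counter wraps, which is why the sketch avoids a plain
"now < stamp + max_age".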