From: Li Chen <chenl311@chinatelecom.cn>
Currently, the SMT domain is added into sched_domain_topology
by default if CONFIG_SCHED_SMT is enabled.
If cpu_attach_domain finds that the CPU SMT domain’s cpumask_weight
is just 1, it will destroy_sched_domain it.
On a large machine, such as one with 512 cores, this results in
512 redundant domain attach/destroy operations.
We can avoid these unnecessary operations by simply checking
cpu_smt_num_threads and not inserting SMT domain into x86_topology if SMT
is not enabled.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
---
arch/x86/kernel/smpboot.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7d202f9785362..9ff8b10715cc1 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -492,8 +492,24 @@ static struct sched_domain_topology_level x86_topology[] = {
{ NULL },
};
+static void __init maybe_remove_smt_level(void)
+{
+ if (cpu_smt_num_threads <= 1) {
+ /*
+ * SMT level is x86_topology[0]. Shift the array left by one,
+ * keep the sentinel { NULL } at the end.
+ */
+ memmove(&x86_topology[0], &x86_topology[1],
+ sizeof(x86_topology) - sizeof(x86_topology[0]));
+ memset(&x86_topology[ARRAY_SIZE(x86_topology) - 1], 0,
+ sizeof(x86_topology[0]));
+ }
+}
+
static void __init build_sched_topology(void)
{
+ maybe_remove_smt_level();
+
/*
* When there is NUMA topology inside the package invalidate the
* PKG domain since the NUMA domains will auto-magically create the
--
2.49.0
On Tue, Jun 24 2025 at 16:08, Li Chen wrote: > From: Li Chen <chenl311@chinatelecom.cn> > > Currently, the SMT domain is added into sched_domain_topology > by default if CONFIG_SCHED_SMT is enabled. > > If cpu_attach_domain finds that the CPU SMT domain’s cpumask_weight If cpu_attach_domain() IIRC, I told you that before. > is just 1, it will destroy_sched_domain it. > > On a large machine, such as one with 512 cores, this results in > 512 redundant domain attach/destroy operations. > > We can avoid these unnecessary operations by simply checking s/We can avoid/Avoid/ Care to read my reviews? If you disagree, then discuss it with me, but silently ignoring it them is not an option. > cpu_smt_num_threads and not inserting SMT domain into x86_topology if SMT not inserting? That's not what this new version does. > +static void __init maybe_remove_smt_level(void) > +{ > + if (cpu_smt_num_threads <= 1) { > + /* > + * SMT level is x86_topology[0]. Shift the array left by one, > + * keep the sentinel { NULL } at the end. > + */ > + memmove(&x86_topology[0], &x86_topology[1], > + sizeof(x86_topology) - sizeof(x86_topology[0])); > + memset(&x86_topology[ARRAY_SIZE(x86_topology) - 1], 0, > + sizeof(x86_topology[0])); So this sets the last entry in the array, aka the original sentinel in the last array entry, to zero... This is completely pointless. The above memmove() copies topo[1 .. (N - 1)] to topo[0 .. (N - 2)] Where N = ARRAY_SIZE(topo). Therefore topo[N - 1] == NULL and topo[N - 2] == NULL No? But then what's worse is that you fail to take that removal into account for the x86_has_numa_in_package case, which still unconditionally sets topo[N - 2] to zero even if the SMT level had been removed... Please take your time and do not rush out half baked stuff. Thanks, tglx
Hi Thomas. ---- On Tue, 24 Jun 2025 21:36:10 +0800 Thomas Gleixner <tglx@linutronix.de> wrote --- > On Tue, Jun 24 2025 at 16:08, Li Chen wrote: > > From: Li Chen <chenl311@chinatelecom.cn> > > > > Currently, the SMT domain is added into sched_domain_topology > > by default if CONFIG_SCHED_SMT is enabled. > > > > If cpu_attach_domain finds that the CPU SMT domain’s cpumask_weight > > If cpu_attach_domain() > > IIRC, I told you that before. > > > is just 1, it will destroy_sched_domain it. > > > > On a large machine, such as one with 512 cores, this results in > > 512 redundant domain attach/destroy operations. > > > > We can avoid these unnecessary operations by simply checking > > s/We can avoid/Avoid/ > > Care to read my reviews? If you disagree, then discuss it with me, but I'm sorry that I forget to say that your previous wording review have already been fixed in v2 https://lore.kernel.org/all/20250624085559.69436-3-me@linux.beauty/ And I would replace cpu_attach_domain with cpu_attach_domain(). Sorry for wasting your time. > silently ignoring it them is not an option. > > > cpu_smt_num_threads and not inserting SMT domain into x86_topology if SMT > > not inserting? That's not what this new version does. > > > +static void __init maybe_remove_smt_level(void) > > +{ > > + if (cpu_smt_num_threads <= 1) { > > + /* > > + * SMT level is x86_topology[0]. Shift the array left by one, > > + * keep the sentinel { NULL } at the end. > > + */ > > + memmove(&x86_topology[0], &x86_topology[1], > > + sizeof(x86_topology) - sizeof(x86_topology[0])); > > + memset(&x86_topology[ARRAY_SIZE(x86_topology) - 1], 0, > > + sizeof(x86_topology[0])); > > So this sets the last entry in the array, aka the original sentinel in > the last array entry, to zero... > > This is completely pointless. The above memmove() copies > > topo[1 .. (N - 1)] > to > topo[0 .. (N - 2)] > > Where N = ARRAY_SIZE(topo). > > Therefore > topo[N - 1] == NULL > and > topo[N - 2] == NULL > > No? > > But then what's worse is that you fail to take that removal into account > for the x86_has_numa_in_package case, which still unconditionally sets > topo[N - 2] to zero even if the SMT level had been removed... > > Please take your time and do not rush out half baked stuff. Sorry again for my mistake. I will fix it in v3. Regards, Li
© 2016 - 2025 Red Hat, Inc.