The function may call cpumask_equal() with tl->mask(cpu) == tl->mask(i),
even when cpu != i. In such a case, cpumask_equal() would always return
true, and we can proceed to the next iteration immediately.

Valentin Schneider comments on this:

  PKG can potentially hit that condition, and so can any
  sched_domain_mask_f that relies on the node masks...

  I'm thinking ideally we should have checks in place to
  ensure all node_to_cpumask_map[] masks are disjoint,
  then we could entirely skip the levels that use these
  masks in topology_span_sane(), but there's unfortunately
  no nice way to flag them... Also there would be cases
  where there's no real difference between PKG and NODE
  other than NODE is still based on a per-cpu cpumask and
  PKG isn't, so I don't see a nicer way to go about this.

v1: https://lore.kernel.org/lkml/ZrJk00cmVaUIAr4G@yury-ThinkPad/T/
v2:
 - defer initialization of 'mc' in patch #1 @Chen Yu;
 - more comments from Valentin Schneider.

Yury Norov (2):
  sched/topology: pre-compute topology_span_sane() loop params
  sched/topology: optimize topology_span_sane()

 kernel/sched/topology.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--
2.43.0
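For readers who want to see the shortcut in isolation, here is a minimal
user-space sketch of the idea, not the kernel patch itself. It assumes a
mask fits in one 64-bit word instead of a struct cpumask, and every toy_*
name is hypothetical. The point it illustrates: when two CPUs get the very
same mask object back from the level's mask function, the pointers compare
equal and the content comparison can be skipped.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef uint64_t toy_mask_t;
/* Hypothetical stand-in for sched_domain_mask_f. */
typedef const toy_mask_t *(*toy_mask_fn)(int cpu);

/* Node-style level: CPUs 0-3 and 4-7 each share ONE mask object. */
static const toy_mask_t toy_node_masks[2] = { 0x0f, 0xf0 };
static const toy_mask_t *toy_node_mask(int cpu)
{
	return &toy_node_masks[cpu / 4];
}

/* Per-CPU level: same contents as above, but one object per CPU. */
static const toy_mask_t toy_percpu_masks[TOY_NR_CPUS] = {
	0x0f, 0x0f, 0x0f, 0x0f, 0xf0, 0xf0, 0xf0, 0xf0,
};
static const toy_mask_t *toy_percpu_mask(int cpu)
{
	return &toy_percpu_masks[cpu];
}

/* Broken level: CPU 3 appears in two different, overlapping spans. */
static const toy_mask_t toy_broken_masks[TOY_NR_CPUS] = {
	0x0f, 0x0f, 0x0f, 0x0f, 0xf8, 0xf8, 0xf8, 0xf8,
};
static const toy_mask_t *toy_broken_mask(int cpu)
{
	return &toy_broken_masks[cpu];
}

/*
 * Spans at one level must be either identical or disjoint.
 * *full_compares counts how many content comparisons were needed.
 */
static bool toy_span_sane(toy_mask_fn mask, int cpu, int *full_compares)
{
	const toy_mask_t *mc = mask(cpu);

	for (int i = cpu + 1; i < TOY_NR_CPUS; i++) {
		const toy_mask_t *mi = mask(i);

		/* Same object: contents are trivially equal, skip. */
		if (mi == mc)
			continue;

		(*full_compares)++;
		/* Not equal but intersecting: the topology is broken. */
		if (*mc != *mi && (*mc & *mi))
			return false;
	}
	return true;
}
```

With the node-style level, checking CPU 0 only needs content comparisons
against CPUs 4-7; with the per-CPU level every other CPU is compared, which
is the case the pointer check cannot help with.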
Ping?

On Wed, Aug 07, 2024 at 12:05:19PM -0700, Yury Norov wrote:
> The function may call cpumask_equal with tl->mask(cpu) == tl->mask(i),
> even when cpu != i. In such case, cpumask_equal() would always return
> true, and we can proceed to the next iteration immediately.
>
> Valentin Schneider shares on it:
>
>   PKG can potentially hit that condition, and so can any
>   sched_domain_mask_f that relies on the node masks...
>
>   I'm thinking ideally we should have checks in place to
>   ensure all node_to_cpumask_map[] masks are disjoint,
>   then we could entirely skip the levels that use these
>   masks in topology_span_sane(), but there's unfortunately
>   no nice way to flag them... Also there would be cases
>   where there's no real difference between PKG and NODE
>   other than NODE is still based on a per-cpu cpumask and
>   PKG isn't, so I don't see a nicer way to go about this.
>
> v1: https://lore.kernel.org/lkml/ZrJk00cmVaUIAr4G@yury-ThinkPad/T/
> v2:
>  - defer initialization of 'mc' in patch #1 @Chen Yu;
>  - more comments from Valentin Schneider.
>
> Yury Norov (2):
>   sched/topology: pre-compute topology_span_sane() loop params
>   sched/topology: optimize topology_span_sane()
>
>  kernel/sched/topology.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> --
> 2.43.0
Ping again?

On Wed, Aug 14, 2024 at 09:35:04AM -0700, Yury Norov wrote:
> Ping?
>
> On Wed, Aug 07, 2024 at 12:05:19PM -0700, Yury Norov wrote:
> > The function may call cpumask_equal with tl->mask(cpu) == tl->mask(i),
> > even when cpu != i. In such case, cpumask_equal() would always return
> > true, and we can proceed to the next iteration immediately.
> >
> > Valentin Schneider shares on it:
> >
> >   PKG can potentially hit that condition, and so can any
> >   sched_domain_mask_f that relies on the node masks...
> >
> >   I'm thinking ideally we should have checks in place to
> >   ensure all node_to_cpumask_map[] masks are disjoint,
> >   then we could entirely skip the levels that use these
> >   masks in topology_span_sane(), but there's unfortunately
> >   no nice way to flag them... Also there would be cases
> >   where there's no real difference between PKG and NODE
> >   other than NODE is still based on a per-cpu cpumask and
> >   PKG isn't, so I don't see a nicer way to go about this.
> >
> > v1: https://lore.kernel.org/lkml/ZrJk00cmVaUIAr4G@yury-ThinkPad/T/
> > v2:
> >  - defer initialization of 'mc' in patch #1 @Chen Yu;
> >  - more comments from Valentin Schneider.
> >
> > Yury Norov (2):
> >   sched/topology: pre-compute topology_span_sane() loop params
> >   sched/topology: optimize topology_span_sane()
> >
> >  kernel/sched/topology.c | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > --
> > 2.43.0
On 28/08/2024 at 16:09, Yury Norov wrote:
> Ping again?
>

Hi,

The cover letter is v2, but the patches below it are not marked v2.

In patch 1/2, the Suggested-by: tags in v1 have been removed in this new
version :(.

CJ

> On Wed, Aug 14, 2024 at 09:35:04AM -0700, Yury Norov wrote:
>> Ping?
>>
>> On Wed, Aug 07, 2024 at 12:05:19PM -0700, Yury Norov wrote:
>>> The function may call cpumask_equal with tl->mask(cpu) == tl->mask(i),
>>> even when cpu != i. In such case, cpumask_equal() would always return
>>> true, and we can proceed to the next iteration immediately.
>>>
>>> Valentin Schneider shares on it:
>>>
>>>   PKG can potentially hit that condition, and so can any
>>>   sched_domain_mask_f that relies on the node masks...
>>>
>>>   I'm thinking ideally we should have checks in place to
>>>   ensure all node_to_cpumask_map[] masks are disjoint,
>>>   then we could entirely skip the levels that use these
>>>   masks in topology_span_sane(), but there's unfortunately
>>>   no nice way to flag them... Also there would be cases
>>>   where there's no real difference between PKG and NODE
>>>   other than NODE is still based on a per-cpu cpumask and
>>>   PKG isn't, so I don't see a nicer way to go about this.
>>>
>>> v1: https://lore.kernel.org/lkml/ZrJk00cmVaUIAr4G@yury-ThinkPad/T/
>>> v2:
>>>  - defer initialization of 'mc' in patch #1 @Chen Yu;
>>>  - more comments from Valentin Schneider.
>>>
>>> Yury Norov (2):
>>>   sched/topology: pre-compute topology_span_sane() loop params
>>>   sched/topology: optimize topology_span_sane()
>>>
>>>  kernel/sched/topology.c | 20 ++++++++++++++++++--
>>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>>
>>> --
>>> 2.43.0
On Thu, Aug 29, 2024 at 08:11:41AM +0200, Christophe JAILLET wrote:
> On 28/08/2024 at 16:09, Yury Norov wrote:
> > Ping again?
> >
>
> Hi,
>
> The cover letter is v2, but the patches below it are not marked v2.
>
> In patch 1/2, the Suggested-by: tags in v1 have been removed in this new
> version :(.

My bad. I'll send v3.