Since the x86 defconfig aims to be a distro kernel work-alike with
fewer drivers and a shorter build time, enable a handful of
popular scheduler and cgroups options that are typically enabled
on major Linux distributions.
The options enabled are a superset of the latest Ubuntu and Fedora
kernel debugging configs, using Ubuntu's config-6.11.0-24-generic
file, Fedora's kernel-x86_64-fedora.config and RHEL's
kernel-x86_64-rhel.config from kernel-ark.git.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
---
arch/x86/configs/defconfig.x86_64 | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/x86/configs/defconfig.x86_64 b/arch/x86/configs/defconfig.x86_64
index 3c4a03633328..225aed921e21 100644
--- a/arch/x86/configs/defconfig.x86_64
+++ b/arch/x86/configs/defconfig.x86_64
@@ -2,6 +2,7 @@ CONFIG_WERROR=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_AUDIT=y
+# CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_BPF_SYSCALL=y
@@ -11,26 +12,45 @@ CONFIG_BPF_PRELOAD=y
CONFIG_BPF_PRELOAD_UMD=y
CONFIG_BPF_LSM=y
CONFIG_PREEMPT_VOLUNTARY=y
+CONFIG_SCHED_CORE=y
+CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
+CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
+CONFIG_PSI_DEFAULT_DISABLED=y
CONFIG_LOG_BUF_SHIFT=18
-CONFIG_CGROUPS=y
+CONFIG_PRINTK_INDEX=y
+CONFIG_UCLAMP_TASK=y
+CONFIG_NUMA_BALANCING=y
+CONFIG_MEMCG=y
+CONFIG_MEMCG_V1=y
CONFIG_BLK_CGROUP=y
-CONFIG_CGROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_UCLAMP_TASK_GROUP=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_RDMA=y
+CONFIG_CGROUP_DMEM=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
+CONFIG_CPUSETS_V1=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
CONFIG_CGROUP_MISC=y
CONFIG_CGROUP_DEBUG=y
+CONFIG_NAMESPACES=y
+CONFIG_USER_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+CONFIG_SCHED_AUTOGROUP=y
+CONFIG_SYSFS_SYSCALL=y
+CONFIG_EXPERT=y
CONFIG_KALLSYMS_ALL=y
CONFIG_PROFILING=y
CONFIG_KEXEC=y
@@ -305,7 +325,6 @@ CONFIG_LIST_HARDENED=y
CONFIG_PRINTK_TIME=y
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y
-CONFIG_DEBUG_KERNEL=y
CONFIG_STRIP_ASM_SYMS=y
CONFIG_HEADERS_INSTALL=y
CONFIG_DEBUG_SECTION_MISMATCH=y
--
2.45.2
On Tue, May 06, 2025 at 07:09:22PM +0200, Ingo Molnar <mingo@kernel.org> wrote:
> +CONFIG_MEMCG_V1=y
Ugh.
> +CONFIG_CPUSETS_V1=y
Ugh.
Those config options were introduced to retire old code (their Kconfig
defaults are N). I'd prefer if these defaults matched the Kconfig ones
(and leave it up to distros if they need them some time longer).
Thanks,
Michal
On Tue, May 6, 2025, at 19:09, Ingo Molnar wrote:
> Since the x86 defconfig aims to be a distro kernel work-alike with
> fewer drivers and a shorter build time, enable a handful of
> popular scheduler and cgroups options that are typically enabled
> on major Linux distributions.
>
> The options enabled are a superset of the latest Ubuntu and Fedora
> kernel debugging configs, using Ubuntu's config-6.11.0-24-generic
> file, Fedora's kernel-x86_64-fedora.config and RHEL's
> kernel-x86_64-rhel.config from kernel-ark.git.
I think having a way to get something close to a distro config
is super useful for common options like this, but I wonder if
we could turn this into a kernel/configs/*.config fragment
instead that gets shared across architectures.
> +CONFIG_SYSFS_SYSCALL=y
> +CONFIG_EXPERT=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_PROFILING=y
I really don't like enabling CONFIG_EXPERT=y in a generic
defconfig. What changes if you turn this off?
Based on the help text for CONFIG_EXPERT, nothing we
consider the default should ever be guarded by it. If there
is something that distros commonly enable that is prevented by
EXPERT=n, it would be better to relax the dependency on that
particular thing.
Arnd
* Arnd Bergmann <arnd@arndb.de> wrote:
> > +CONFIG_SYSFS_SYSCALL=y
> > +CONFIG_EXPERT=y
> > CONFIG_KALLSYMS_ALL=y
> > CONFIG_PROFILING=y
>
> I really don't like enabling CONFIG_EXPERT=y in a generic
> defconfig. What changes if you turn this off?
That's a good question.
Disabling it gives me material changes for 4 options:
--- .config.before
+++ .config.after
-CONFIG_EXPERT=y
-CONFIG_ARCH_HAS_ZONE_DMA_SET=y
+CONFIG_RFKILL_INPUT=y
-CONFIG_PCIE_BUS_DEFAULT=y
+CONFIG_DEBUG_MEMORY_INIT=y
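A comparison like the one above can be reproduced with a few lines of script. The kernel tree ships scripts/diffconfig for this purpose; the following is only a minimal sketch of the same idea, not that tool. The two config excerpts are hypothetical and merely mirror the diff above, and a symbol missing from one side is assumed to be equivalent to "is not set":

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {symbol: value}; symbols marked 'is not set' map to 'n'."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            symbols[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            symbols[m.group(1)] = "n"
    return symbols


def changed_symbols(before, after):
    """Return {symbol: (old, new)} for every symbol whose value differs.

    Assumption: a symbol absent from one side is treated as 'n', so a
    line that disappears and an explicit 'is not set' compare as equal.
    """
    a, b = parse_config(before), parse_config(after)
    diff = {}
    for name in sorted(set(a) | set(b)):
        old, new = a.get(name, "n"), b.get(name, "n")
        if old != new:
            diff[name] = (old, new)
    return diff


# Hypothetical excerpts that mirror the diff above, not real .config files.
BEFORE = """\
CONFIG_EXPERT=y
CONFIG_ARCH_HAS_ZONE_DMA_SET=y
CONFIG_PCIE_BUS_DEFAULT=y
CONFIG_PRINTK_TIME=y
"""
AFTER = """\
# CONFIG_EXPERT is not set
CONFIG_RFKILL_INPUT=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_PRINTK_TIME=y
"""

if __name__ == "__main__":
    for name, (old, new) in changed_symbols(BEFORE, AFTER).items():
        print(f"{name}: {old} -> {new}")
```

Options that are identical on both sides (CONFIG_PRINTK_TIME here) are filtered out, which leaves only the material changes.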
1) CONFIG_DEBUG_MEMORY_INIT
The CONFIG_DEBUG_MEMORY_INIT default is super weird:
  config DEBUG_MEMORY_INIT
          bool "Debug memory initialisation" if EXPERT
          default !EXPERT
I think this might in fact be a bug, and Ubuntu might have fallen victim to
it:
.config.fedora: CONFIG_DEBUG_MEMORY_INIT=y
.config.ubuntu: # CONFIG_DEBUG_MEMORY_INIT is not set
I believe this should be 'default y', or 'default n'.
2) CONFIG_ARCH_HAS_ZONE_DMA_SET
This one is an interim Kconfig helper flag, and it's a bit weird as
well:
arch/x86/Kconfig: select ARCH_HAS_ZONE_DMA_SET if EXPERT
I *think* the intent here is to make configurability of ZONE_DMA and
ZONE_DMA32 dependent on EXPERT, while still giving architectures an
opt-in as well:
  config ZONE_DMA
          bool "Support DMA zone" if ARCH_HAS_ZONE_DMA_SET
          default y if ARM64 || X86
  config ZONE_DMA32
          bool "Support DMA32 zone" if ARCH_HAS_ZONE_DMA_SET
          depends on !X86_32
          default y if ARM64
I think the better approach would be to make the EXPERT policy at the
ZONE_DMA and ZONE_DMA32 level:
bool "Support DMA zone" if ARCH_HAS_ZONE_DMA_SET && EXPERT
but it should be functionally equivalent.
3) RFKILL_INPUT
I think this one's a bug too:
  config RFKILL_INPUT
          bool "RF switch input support" if EXPERT
          depends on RFKILL
          depends on INPUT = y || RFKILL = INPUT
          default y if !EXPERT
Basically if you turn on EXPERT, the default changes from Y to N.
I think this should be a plain 'default y'.
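Items 1) and 3) share one pattern: the prompt is only visible with EXPERT=y, and the default is only 'y' with EXPERT=n. A minimal sketch of that behaviour, assuming the usual Kconfig rule that a symbol without a visible prompt always takes its default, and assuming all 'depends on' lines are satisfied (the function names are made up for illustration):

```python
def resolve_bool(prompt_visible, default, user_value=None):
    """Simplified model of how a Kconfig bool gets its value.

    If the prompt is visible and the user (or a defconfig) supplied a
    value, that value wins; otherwise the symbol takes its default.
    Dependencies ('depends on') are assumed to be satisfied.
    """
    if prompt_visible and user_value is not None:
        return user_value
    return default


def debug_memory_init(expert, user_value=None):
    # bool "Debug memory initialisation" if EXPERT
    # default !EXPERT
    return resolve_bool(prompt_visible=expert, default=not expert,
                        user_value=user_value)


def rfkill_input(expert, user_value=None):
    # bool "RF switch input support" if EXPERT
    # default y if !EXPERT   (no default applies with EXPERT=y, so 'n')
    return resolve_bool(prompt_visible=expert, default=not expert,
                        user_value=user_value)


if __name__ == "__main__":
    for expert in (False, True):
        print(f"EXPERT={'y' if expert else 'n'}: "
              f"DEBUG_MEMORY_INIT={debug_memory_init(expert)} "
              f"RFKILL_INPUT={rfkill_input(expert)}")
```

In this model a config with EXPERT=y that never sets the option explicitly ends up with it disabled, while one that sets it explicitly keeps it. That would be consistent with the Fedora and Ubuntu values quoted above.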
4) CONFIG_PCIE_BUS_DEFAULT
I think this is quite confusing code as well:
  choice
          prompt "PCI Express hierarchy optimization setting"
          default PCIE_BUS_DEFAULT
          depends on PCI && EXPERT
          help
            ...
  config PCIE_BUS_DEFAULT
          bool "Default"
          depends on PCI
          help
            Default choice; ensure that the MPS matches upstream bridge.
  ...
  endchoice
So the intent here is clearly to steer users towards picking
PCIE_BUS_DEFAULT.
But the 'depends' line turns off the option entirely on !EXPERT.
Which happens to work due to how the config options are used by the PCI
code:
#ifdef CONFIG_PCIE_BUS_TUNE_OFF
enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_TUNE_OFF;
#elif defined CONFIG_PCIE_BUS_SAFE
enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_SAFE;
#elif defined CONFIG_PCIE_BUS_PERFORMANCE
enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PERFORMANCE;
#elif defined CONFIG_PCIE_BUS_PEER2PEER
enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PEER2PEER;
#else
enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT;
#endif
But this is highly unintuitive IMO. A cleaner implementation would be
to always have CONFIG_PCIE_BUS_DEFAULT enabled on !EXPERT, which can be
done by making the configurability of the choice-list depend on EXPERT:
  choice
          prompt "PCI Express hierarchy optimization setting" if EXPERT
          default PCIE_BUS_DEFAULT
          depends on PCI
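To illustrate why the !EXPERT case works today even though CONFIG_PCIE_BUS_DEFAULT is not defined: the preprocessor chain quoted above never tests that symbol and simply falls through to the final #else. A minimal model of that chain (the Python function is made up for illustration, only the symbol names come from the code above):

```python
# Order mirrors the #ifdef/#elif chain quoted above.
_CHAIN = [
    ("CONFIG_PCIE_BUS_TUNE_OFF", "PCIE_BUS_TUNE_OFF"),
    ("CONFIG_PCIE_BUS_SAFE", "PCIE_BUS_SAFE"),
    ("CONFIG_PCIE_BUS_PERFORMANCE", "PCIE_BUS_PERFORMANCE"),
    ("CONFIG_PCIE_BUS_PEER2PEER", "PCIE_BUS_PEER2PEER"),
]


def pcie_bus_config(defined):
    """Return the enum value the chain selects for a set of defined symbols."""
    for symbol, value in _CHAIN:
        if symbol in defined:
            return value
    # The #else branch: reached whether or not CONFIG_PCIE_BUS_DEFAULT is
    # defined, because the chain never checks for it.
    return "PCIE_BUS_DEFAULT"


if __name__ == "__main__":
    print(pcie_bus_config(set()))                        # EXPERT=n today
    print(pcie_bus_config({"CONFIG_PCIE_BUS_DEFAULT"}))  # EXPERT=y default
    print(pcie_bus_config({"CONFIG_PCIE_BUS_SAFE"}))
```

The first two calls select the same value, so the proposed Kconfig change would alter the generated .config but not the resulting pcie_bus_config.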
> Based on the help text for CONFIG_EXPERT, nothing we
> consider the default should ever be guarded by it. If there
> is something that distros commonly enable that is prevented by
> EXPERT=n, it would be better to relax the dependency on that
> particular thing.
I think distro kernel maintainers mainly inherited their old configs
and aren't afraid of CONFIG_EXPERT.
Thus *all* major distros I checked have CONFIG_EXPERT enabled: Ubuntu,
Fedora, Debian, you name it. So literally over 99% of our users use a
kernel that has CONFIG_EXPERT=y in it. Which is perfectly fine, distro
kernel maintainers *are* the ultimate experts in this matter - but
their choices inevitably make it to users configuring their own
kernels: if users type 'make localmodconfig' they'll have
CONFIG_EXPERT=y.
So I don't think we should ostracize CONFIG_EXPERT too much. :)
Otherwise I think you were right: 2 out of 4 of the configuration
settings that change due to EXPERT are outright bugs IMO, the other 2
are weird code that could be done in a more standard fashion, resulting
in an invariant .config when EXPERT is toggled on/off.
Also, I kinda don't mind having CONFIG_EXPERT=y in the kernel
defconfig: it's a helper config for *kernel developers* who want to
have finegrained control over debug facilities and other details, it's
not something for users - the resulting kernels won't result in a fully
working system on modern x86 systems.
Thanks,
Ingo
Hello Mingo,
> +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> +CONFIG_IRQ_TIME_ACCOUNTING=y
Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible
behavioral changes. For more context, please refer to the related
discussion here:
https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
If we decide to enable it by default, we should clearly document this
behavior change. Below is the patch I wrote earlier but haven’t sent out
for review yet.
----
Subject: [PATCH] init/Kconfig: document behavior change when enabling
IRQ_TIME_ACCOUNTING
After we enabled CONFIG_IRQ_TIME_ACCOUNTING, we noticed that IRQ usage
is no longer accounted to tasks, and thus not to the CPU cgroup either.
This behavior change caused issues [0] on our production servers, and we
eventually had to revert it.
We'd better clearly document this behavior change in case it might
matter to the user.
Link: https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ [0]
Suggested-by: "Michal Koutný" <mkoutny@suse.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
---
init/Kconfig | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index a20e6efd3f0f..191df0b5cf1c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,14 @@ config IRQ_TIME_ACCOUNTING
transitions between softirq and hardirq state, so there can be a
small performance impact.
+ Enabling IRQ_TIME_ACCOUNTING excludes IRQ usage from the CPU usage
+ statistics of individual tasks and, consequently, it is not accounted
+ for in CPU cgroups. As a result, a task's CPU usage will accurately
+ reflect only its user time and system time. IRQ usage is instead
+ attributed at the global level and can be observed in metrics such as
+ /proc/stat or, potentially, at the cgroup level in files like
+ irq.pressure.
+
If in doubt, say N here.
config HAVE_SCHED_AVG_IRQ
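The effect described in the proposed help text can be shown with a toy calculation. This is only a sketch of the arithmetic, assuming a single on-CPU interval during which some time is spent servicing interrupts; the function name and the numbers are made up, and real scheduler accounting is considerably more involved:

```python
def task_cpu_time_ns(wall_ns, irq_ns, irq_time_accounting):
    """CPU time charged to a task for one on-CPU interval (toy model).

    wall_ns: how long the task was the current task on the CPU.
    irq_ns:  portion of that interval spent in hardirq/softirq context.
    With IRQ_TIME_ACCOUNTING the IRQ portion is excluded from the task
    (and therefore from its CPU cgroup); without it, the task is charged
    for the whole interval.
    """
    if not 0 <= irq_ns <= wall_ns:
        raise ValueError("irq_ns must be between 0 and wall_ns")
    return wall_ns - irq_ns if irq_time_accounting else wall_ns


if __name__ == "__main__":
    wall, irq = 10_000_000, 3_000_000  # hypothetical 10 ms interval, 3 ms of IRQs
    print("IRQ_TIME_ACCOUNTING=n:", task_cpu_time_ns(wall, irq, False))
    print("IRQ_TIME_ACCOUNTING=y:", task_cpu_time_ns(wall, irq, True))
```

A tool that sums per-task or per-cgroup CPU time would see the second, smaller number after the option is enabled, which is the kind of user-visible change the patch wants documented.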
* Yafang Shao <laoar.shao@gmail.com> wrote:
> Hello Mingo,
>
> > +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> > +CONFIG_IRQ_TIME_ACCOUNTING=y
>
> Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible behavioral
> changes. For more context, please refer to the related discussion here:
> https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
Yeah. I actually agree with your series. It (re-)includes IRQ/softirq
time in task CPU usage statistics even under IRQ_TIME_ACCOUNTING=y,
while still keeping the finegrained IRQ/softirq statistics as well,
correct?
The Kconfig option is also arguably rather misleading:
  config IRQ_TIME_ACCOUNTING
          bool "Fine granularity task level IRQ time accounting"
          depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE
          help
            Select this option to enable fine granularity task irq time
            accounting. This is done by reading a timestamp on each
            transitions between softirq and hardirq state, so there can be a
            small performance impact.
It only warns about a small performance impact, but doesn't warn that
CPU accounting is changed in an incompatible fashion that surprises
tooling...
But I think we should probably treat this as a bug, not as lack of
documentation. Peter, do you concur?
> If we decide to enable it by default, we should clearly document this
> behavior change. Below is the patch I wrote earlier but haven’t sent
> out for review yet.
Note that it's not enabled by default - this patch is just about the
x86 defconfig.
Thanks,
Ingo
On Wed, May 7, 2025 at 3:06 PM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > Hello Mingo,
> >
> > > +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> > > +CONFIG_IRQ_TIME_ACCOUNTING=y
> >
> > Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible behavioral
> > changes. For more context, please refer to the related discussion here:
> > https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
>
> Yeah. I actually agree with your series. It (re-)includes IRQ/softirq
> time in task CPU usage statistics even under IRQ_TIME_ACCOUNTING=y,
> while still keeping the finegrained IRQ/softirq statistics as well,
> correct?
Correct.
>
> The Kconfig option is also arguably rather misleading:
>
> config IRQ_TIME_ACCOUNTING
> bool "Fine granularity task level IRQ time accounting"
> depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE
> help
> Select this option to enable fine granularity task irq time
> accounting. This is done by reading a timestamp on each
> transitions between softirq and hardirq state, so there can be a
> small performance impact.
>
> It only warns about a small performance impact, but doesn't warn that
> CPU accounting is changed in an incompatible fashion that surprises
> tooling...
Yes, this breaks our userspace tools.
>
> But I think we should probably treat this as a bug, not as lack of
> documentation. Peter, do you concur?
>
> > If we decide to enable it by default, we should clearly document this
> > behavior change. Below is the patch I wrote earlier but haven’t sent
> > out for review yet.
>
> Note that it's not enabled by default - this patch is just about the
> x86 defconfig.
>
> Thanks,
>
> Ingo
--
Regards
Yafang
* Yafang Shao <laoar.shao@gmail.com> wrote:
> On Wed, May 7, 2025 at 3:06 PM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > > Hello Mingo,
> > >
> > > > +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> > > > +CONFIG_IRQ_TIME_ACCOUNTING=y
> > >
> > > Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible behavioral
> > > changes. For more context, please refer to the related discussion here:
> > > https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
> >
> > Yeah. I actually agree with your series. It (re-)includes IRQ/softirq
> > time in task CPU usage statistics even under IRQ_TIME_ACCOUNTING=y,
> > while still keeping the finegrained IRQ/softirq statistics as well,
> > correct?
>
> Correct.
>
> >
> > The Kconfig option is also arguably rather misleading:
> >
> > config IRQ_TIME_ACCOUNTING
> > bool "Fine granularity task level IRQ time accounting"
> > depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE
> > help
> > Select this option to enable fine granularity task irq time
> > accounting. This is done by reading a timestamp on each
> > transitions between softirq and hardirq state, so there can be a
> > small performance impact.
> >
> > It only warns about a small performance impact, but doesn't warn that
> > CPU accounting is changed in an incompatible fashion that surprises
> > tooling...
>
> Yes, this breaks our userspace tools.
Okay, so 2 out of your 3 fixes are upstream already:
763a744e24a8 ("sched: Don't account irq time if sched_clock_irqtime is disabled")
a6fd16148fdd ("sched, psi: Don't account irq time if sched_clock_irqtime is disabled")
But we don't have this one yet:
[PATCH v8 4/4] sched: Fix cgroup irq time for CONFIG_IRQ_TIME_ACCOUNTING
https://lore.kernel.org/r/20250103022409.2544-5-laoar.shao@gmail.com
which is also essential to fully fix the tooling regression, right?
I think this last patch fell between the cracks, I didn't see any
fundamental objections against the fix.
Since the patch does not apply cleanly anymore, mind sending a fresh
-v9 version against v6.15-rc5 or so?
Thanks,
Ingo
On Thu, May 8, 2025 at 12:23 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > On Wed, May 7, 2025 at 3:06 PM Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > >
> > > * Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > Hello Mingo,
> > > >
> > > > > +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> > > > > +CONFIG_IRQ_TIME_ACCOUNTING=y
> > > >
> > > > Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible behavioral
> > > > changes. For more context, please refer to the related discussion here:
> > > > https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
> > >
> > > Yeah. I actually agree with your series. It (re-)includes IRQ/softirq
> > > time in task CPU usage statistics even under IRQ_TIME_ACCOUNTING=y,
> > > while still keeping the finegrained IRQ/softirq statistics as well,
> > > correct?
> >
> > Correct.
> >
> > >
> > > The Kconfig option is also arguably rather misleading:
> > >
> > > config IRQ_TIME_ACCOUNTING
> > > bool "Fine granularity task level IRQ time accounting"
> > > depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE
> > > help
> > > Select this option to enable fine granularity task irq time
> > > accounting. This is done by reading a timestamp on each
> > > transitions between softirq and hardirq state, so there can be a
> > > small performance impact.
> > >
> > > It only warns about a small performance impact, but doesn't warn that
> > > CPU accounting is changed in an incompatible fashion that surprises
> > > tooling...
> >
> > Yes, this breaks our userspace tools.
>
> Okay, so 2 out of your 3 fixes are upstream already:
>
> 763a744e24a8 ("sched: Don't account irq time if sched_clock_irqtime is disabled")
> a6fd16148fdd ("sched, psi: Don't account irq time if sched_clock_irqtime is disabled")
>
> But we don't have this one yet:
>
> [PATCH v8 4/4] sched: Fix cgroup irq time for CONFIG_IRQ_TIME_ACCOUNTING
>
> https://lore.kernel.org/r/20250103022409.2544-5-laoar.shao@gmail.com
>
> which is also essential to fully fix the tooling regression, right?
>
> I think this last patch fell between the cracks, I didn't see any
> fundamental objections against the fix.
>
> Since the patch does not apply cleanly anymore, mind sending a fresh
> -v9 version against v6.15-rc5 or so?
Hello Ingo,
I have sent the v9:
https://lore.kernel.org/all/20250511030800.1900-1-laoar.shao@gmail.com/
Could you please help review this? I’d appreciate your feedback.
--
Regards
Yafang
On Thu, May 8, 2025 at 12:23 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > On Wed, May 7, 2025 at 3:06 PM Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > >
> > > * Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > > Hello Mingo,
> > > >
> > > > > +CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> > > > > +CONFIG_IRQ_TIME_ACCOUNTING=y
> > > >
> > > > Enabling CONFIG_IRQ_TIME_ACCOUNTING=y can lead to user-visible behavioral
> > > > changes. For more context, please refer to the related discussion here:
> > > > https://lore.kernel.org/all/20241222024734.63894-1-laoar.shao@gmail.com/ .
> > >
> > > Yeah. I actually agree with your series. It (re-)includes IRQ/softirq
> > > time in task CPU usage statistics even under IRQ_TIME_ACCOUNTING=y,
> > > while still keeping the finegrained IRQ/softirq statistics as well,
> > > correct?
> >
> > Correct.
> >
> > >
> > > The Kconfig option is also arguably rather misleading:
> > >
> > > config IRQ_TIME_ACCOUNTING
> > > bool "Fine granularity task level IRQ time accounting"
> > > depends on HAVE_IRQ_TIME_ACCOUNTING && !VIRT_CPU_ACCOUNTING_NATIVE
> > > help
> > > Select this option to enable fine granularity task irq time
> > > accounting. This is done by reading a timestamp on each
> > > transitions between softirq and hardirq state, so there can be a
> > > small performance impact.
> > >
> > > It only warns about a small performance impact, but doesn't warn that
> > > CPU accounting is changed in an incompatible fashion that surprises
> > > tooling...
> >
> > Yes, this breaks our userspace tools.
>
> Okay, so 2 out of your 3 fixes are upstream already:
>
> 763a744e24a8 ("sched: Don't account irq time if sched_clock_irqtime is disabled")
> a6fd16148fdd ("sched, psi: Don't account irq time if sched_clock_irqtime is disabled")
Right.
>
> But we don't have this one yet:
>
> [PATCH v8 4/4] sched: Fix cgroup irq time for CONFIG_IRQ_TIME_ACCOUNTING
>
> https://lore.kernel.org/r/20250103022409.2544-5-laoar.shao@gmail.com
>
> which is also essential to fully fix the tooling regression, right?
This patch resolves the container tooling regression but does not
address tools depending on getrusage() for CPU measurement. The
getrusage() fix will be implemented in a subsequent patch.
>
> I think this last patch fell between the cracks, I didn't see any
> fundamental objections against the fix.
>
> Since the patch does not apply cleanly anymore, mind sending a fresh
> -v9 version against v6.15-rc5 or so?
I will send a new version.
--
Regards
Yafang