The get_pd_power_uw() function contains a race condition during CPU
offlining:

* DTPM power calculations are triggered (e.g., via sysfs reads) while CPU is online
* The CPU goes offline during the calculation, before em_cpu_get() is called
* em_cpu_get() now returns NULL since the energy model was unregistered
* em_span_cpus() dereferences the NULL pointer, causing a crash

Commit eb82bace8931 introduced the call to em_span_cpus(pd) without
checking if pd is NULL.

Add a NULL check after em_cpu_get() and return 0 power if no energy model
is available, matching the existing fallback behavior.

Fixes: eb82bace8931 ("powercap/drivers/dtpm: Scale the power with the load")
Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
---
drivers/powercap/dtpm_cpu.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 6b6f51b21550..80d93ab4dc54 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -97,6 +97,11 @@ static u64 get_pd_power_uw(struct dtpm *dtpm)
 
 	pd = em_cpu_get(dtpm_cpu->cpu);
 
+	if (!pd) {
+		pr_warn("DTPM: No energy model available for CPU%d\n", dtpm_cpu->cpu);
+		return 0;
+	}
+
 	pd_mask = em_span_cpus(pd);
 
 	freq = cpufreq_quick_get(dtpm_cpu->cpu);
@@ -207,6 +212,7 @@ static int __dtpm_cpu_setup(int cpu, struct dtpm *parent)
 	pd = em_cpu_get(cpu);
 	if (!pd || em_is_artificial(pd)) {
 		ret = -EINVAL;
+
 		goto release_policy;
 	}
 
--
2.45.2
On Thu, Jun 19, 2025 at 1:16 AM Sivan Zohar-Kotzer <sivany32@gmail.com> wrote:
>
> The get_pd_power_uw() function contains a race condition during CPU
> offlining:
>
> * DTPM power calculations are triggered (e.g., via sysfs reads) while CPU is online
> * The CPU goes offline during the calculation, before em_cpu_get() is called
> * em_cpu_get() now returns NULL since the energy model was unregistered

But energy models for CPUs are never unregistered.

> * em_span_cpus() dereferences the NULL pointer, causing a crash
>
> Commit eb82bace8931 introduced the call to em_span_cpus(pd) without
> checking if pd is NULL.
>
> Add a NULL check after em_cpu_get() and return 0 power if no energy model
> is available, matching the existing fallback behavior.
>
> Fixes: eb82bace8931 ("powercap/drivers/dtpm: Scale the power with the load")
> Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
> ---
> drivers/powercap/dtpm_cpu.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
> index 6b6f51b21550..80d93ab4dc54 100644
> --- a/drivers/powercap/dtpm_cpu.c
> +++ b/drivers/powercap/dtpm_cpu.c
> @@ -97,6 +97,11 @@ static u64 get_pd_power_uw(struct dtpm *dtpm)
>
> 	pd = em_cpu_get(dtpm_cpu->cpu);
>
> +	if (!pd) {
> +		pr_warn("DTPM: No energy model available for CPU%d\n", dtpm_cpu->cpu);
> +		return 0;
> +	}
> +
> 	pd_mask = em_span_cpus(pd);
>
> 	freq = cpufreq_quick_get(dtpm_cpu->cpu);
> @@ -207,6 +212,7 @@ static int __dtpm_cpu_setup(int cpu, struct dtpm *parent)
> 	pd = em_cpu_get(cpu);
> 	if (!pd || em_is_artificial(pd)) {
> 		ret = -EINVAL;
> +
> 		goto release_policy;
> 	}
>
> --
On Fri, Jun 27, 2025 at 11:07 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > * DTPM power calculations are triggered (e.g., via sysfs reads) while CPU is online
> > * The CPU goes offline during the calculation, before em_cpu_get() is called
> > * em_cpu_get() now returns NULL since the energy model was unregistered
>
> But energy models for CPUs are never unregistered.
>

Can't the following happen (extremely rare, but still):

CPU gets set to impossible during shutdown sequence, e.g.

// arch/alpha/kernel/process.c
common_shutdown_1(void *generic_ptr)
...
set_cpu_possible(boot_cpuid, false);

Just before `get_cpu_device` is called by `em_cpu_get`.
Then `get_cpu_device` returns NULL for impossible CPU, causing
`em_cpu_get` to return NULL.

It's not a common scenario, but it seems NULL checking doesn't cost much,
and can assure us no rare case is crashing the system.
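The scenario described above can be sketched as a small userspace model.
This is not kernel code: every `model_*` name, the structure layouts, and
the power values are hypothetical stand-ins, and the lookup behaviour simply
follows the description in this message (no device for an impossible CPU,
hence no perf domain). It only illustrates why a caller that dereferences
the result without a check would crash, and how the proposed guard returns 0
instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-ins for the kernel objects; not the real layouts. */
struct model_device { int unused; };
struct model_perf_domain { uint64_t power_uw; };

static bool model_possible[MODEL_NR_CPUS] = { true, true, true, true };
static struct model_device model_devices[MODEL_NR_CPUS];
static struct model_perf_domain model_domains[MODEL_NR_CPUS] = {
	{ 1000 }, { 1000 }, { 2000 }, { 2000 }	/* made-up power values */
};

/* Models set_cpu_possible(): flips the CPU's bit in the possible mask. */
void model_set_cpu_possible(unsigned int cpu, bool possible)
{
	assert(cpu < MODEL_NR_CPUS);
	model_possible[cpu] = possible;
}

/* Models the described lookup: no device for an impossible CPU. */
struct model_device *model_get_cpu_device(unsigned int cpu)
{
	if (cpu < MODEL_NR_CPUS && model_possible[cpu])
		return &model_devices[cpu];
	return NULL;
}

/* Models the described behaviour: no device means no perf domain. */
struct model_perf_domain *model_em_cpu_get(unsigned int cpu)
{
	if (!model_get_cpu_device(cpu))
		return NULL;
	return &model_domains[cpu];
}

/* Guarded caller, mirroring the NULL check the patch adds. */
uint64_t model_get_pd_power_uw(unsigned int cpu)
{
	struct model_perf_domain *pd = model_em_cpu_get(cpu);

	if (!pd)
		return 0;	/* without this check, pd->power_uw dereferences NULL */
	return pd->power_uw;
}
```

In this model, calling `model_set_cpu_possible(2, false)` and then
`model_get_pd_power_uw(2)` yields 0 rather than a NULL dereference.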
On Sun, Jun 29, 2025 at 12:13 AM Elazar Leibovich <elazarl@atero.ai> wrote:
>
> On Fri, Jun 27, 2025 at 11:07 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > * DTPM power calculations are triggered (e.g., via sysfs reads) while CPU is online
> > > * The CPU goes offline during the calculation, before em_cpu_get() is called
> > > * em_cpu_get() now returns NULL since the energy model was unregistered
> >
> > But energy models for CPUs are never unregistered.
> >
>
> Can't the following happen (extremely rare, but still):
>
> CPU gets set to impossible during shutdown sequence, e.g.
>
> // arch/alpha/kernel/process.c
> common_shutdown_1(void *generic_ptr)
> ...
> set_cpu_possible(boot_cpuid, false);
>
> Just before `get_cpu_device` is called by `em_cpu_get`.
> Then `get_cpu_device` returns NULL for impossible CPU, causing
> `em_cpu_get` to return NULL.
>
> It's not a common scenario, but it seems NULL checking doesn't cost much,
> and can assure us no rare case is crashing the system.

It can happen, but in that case (1) the patch changelog is misleading
and (2) the message printed by the new code is not particularly
useful.

Thanks!