Hi all,

Mario, I've made the actual changes you have requested me on bugzilla for
https://bugzilla.kernel.org/show_bug.cgi?id=221473 . Let me know if it sits
right with you.

Prevent the amd-pstate driver from loading in EPP-reliant "active" mode on
systems without Collaborative Processor Performance Control (CPPC) Energy
Performance Preference (EPP) support (such as Zen 2 and older processors).

Zen 2 and older processors do not support EPP. Loading the driver in active
mode on these systems pins the CPU frequency to the lowest non-linear
frequency (1.7GHz).

To resolve this:
- Unconditionally probe EPP support during driver initialization via a new
  amd_pstate_epp_supported() helper, which queries cppc_get_epp_perf() on
  the first online CPU.
- Lack of ENERGY_PERF support or any error from cppc_get_epp_perf() causes
  EPP to be treated as unsupported.
- Cache EPP support to avoid repeated capability checks during mode switches.
- Fall back to passive mode at boot if EPP is unsupported.
- Reject runtime switches to active mode with -ENODEV.

Please review and provide feedback.

Thanks,
Marco

Marco Scardovi (1):
  cpufreq/amd-pstate: Prevent active mode on systems without EPP support

 drivers/cpufreq/amd-pstate.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

--
2.54.0
Hello Marco,
Sorry for the late follow up on this issue.
On 6/3/2026 1:49 PM, Marco Scardovi wrote:
> Hi all,
>
> Mario, I've made the actual changes you have requested me on bugzilla for
> https://bugzilla.kernel.org/show_bug.cgi?id=221473 . Let me know if it sits
> right with you.
>
> Prevent the amd-pstate driver from loading in EPP-reliant "active" mode on
> systems without Collaborative Processor Performance Control (CPPC) Energy
> Performance Preference (EPP) support (such as Zen 2 and older processors).
>
> Zen 2 and older processors do not support EPP. Loading the driver in active
> mode on these systems pins the CPU frequency to the lowest non-linear
> frequency (1.7GHz).
In the initial report on that bugzilla, for the good case, you had:
==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq <==
3693566
==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver <==
amd-pstate-epp
Which is still the active mode driver. I'm a bit confused as to how that
got to 3.7GHz. Did you do a mode switch back then?
>
> To resolve this:
> - Unconditionally probe EPP support during driver initialization via a new
> amd_pstate_epp_supported() helper, which queries cppc_get_epp_perf() on
> the first online CPU.
> - Lack of ENERGY_PERF support or any error from cppc_get_epp_perf() causes
> EPP to be treated as unsupported.
I believe even the AUTO_SEL_ENABLE path allows cppc_set_epp_perf() to
work which would have otherwise failed the driver load during
amd_pstate_epp_cpu_init().
The EPP value makes no difference in that case but there is some
interaction that seems to cap your frequency to 1.7GHz
> - Cache EPP support to avoid repeated capability checks during mode switches.
> - Fall back to passive mode at boot if EPP is unsupported.
> - Reject runtime switches to active mode with -ENODEV.
>
> Please review and provide feedback.
I think I may know what was happening: The initial cpudata->cppc_req_cached
is all 0s and as a result, you may be hitting the early bailout in
shmem_set_epp() which skips enabling EPP mode altogether if you start off
the driver with AMD_CPPC_EPP_PERFORMANCE (EPP: 0)
If it is not too much trouble, can you try applying the following on top
of amd-pstate-fixes branch [1]:
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 72df461e7b39..3a13ce7256e2 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -426,9 +426,6 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
epp != epp_cached);
}
- if (epp == epp_cached)
- return 0;
-
perf_ctrls.energy_perf = epp;
ret = cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1);
if (ret) {
---
and booting the kernel with "amd_pstate=active amd_dynamic_epp=enable" on the
kernel cmdline while plugged into the wall, then checking whether you can hit
the boost frequency with your workload?
[1] git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/log/?h=amd-pstate-fixes
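
For anyone repeating this check, the two sysfs files quoted from the bugzilla report can be read with a small helper. This is only a sketch: `read_sysfs_line()`, `parse_khz()` and `report_cpu0()` are hypothetical names made up for illustration, and the paths are the cpu0 ones shown earlier in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the first line of a sysfs-style file, stripping the trailing newline.
 * Returns 0 on success and -1 if the file cannot be opened or read. */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf && len > 1);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Parse a kHz value as printed by scaling_cur_freq; -1 if not a number. */
static long parse_khz(const char *s)
{
	char *end;
	long v = strtol(s, &end, 10);

	return (end == s || v < 0) ? -1 : v;
}

/* Print the driver name and current frequency of cpu0.
 * Returns -1 when the cpufreq files are not available. */
static int report_cpu0(void)
{
	char drv[64], freq[64];

	if (read_sysfs_line("/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver",
			    drv, sizeof(drv)) ||
	    read_sysfs_line("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq",
			    freq, sizeof(freq)))
		return -1;
	printf("driver=%s cur_freq=%ld kHz\n", drv, parse_khz(freq));
	return 0;
}
```

Calling report_cpu0() while the workload runs shows whether scaling_cur_freq rises above the 1.7GHz floor (1700000 kHz) under amd-pstate-epp.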
--
Thanks and Regards,
Prateek
Hi Prateek and all,

Before trying to merge on top of Mario's git tree, I'll post the new patches
with the changes you suggested. If something doesn't feel right let me know.

This series addresses a boot-time performance regression on shared memory
systems where the CPU gets stuck at the lowest non-linear frequency
(1.7GHz), and adds proper EPP capability checks to prevent active mode from
being incorrectly enabled on systems without EPP support.

Patch 1 fixes the boot-time initialization regression by introducing an
`epp_hw_programmed` flag in struct amd_cpudata to track whether EPP has
been configured on the hardware. This bypasses a false cache hit on boot
due to zero-initialized caching registers, ensuring that EPP and auto_sel
are always written during CPU initialization, while still preserving the
cache guard optimization for all subsequent runtime transitions.

Patch 2 implements an EPP capability check at boot time using a new
amd_pstate_epp_supported() helper, which queries the EPP performance
capability on the first online CPU. If EPP is unsupported, the driver falls
back to passive mode at boot, and rejects any subsequent runtime
transitions to active mode.

Changes in v2:
- Patch 1: Rename `epp_initialized` to `epp_hw_programmed`, explicitly
  document the flag semantics, and add a comment documenting the EPP cache
  guard optimization behavior.
- Patch 2: Add comments explaining the uniform CPU capability check on x86,
  handle EPP capability check errors robustly using a tri-state model (only
  treat -EOPNOTSUPP as unsupported, warn and assume supported for other
  errors to avoid false negatives), and reject runtime active mode transitions
  at sysfs store time (preventing the driver from being left in an
  unregistered state).

Changes in v1:
- Fix the boot-time CPPC EPP/auto_sel initialization regression in
  shmem_set_epp() using a state tracking flag while preserving runtime
  cache optimization.
- Add an EPP capability check helper during initialization.
- Fall back to passive mode at boot if EPP is not supported, and reject
  transitions to active mode at runtime if EPP is not supported.

Marco Scardovi (2):
  cpufreq/amd-pstate: Fix EPP initialization for shared memory systems
  cpufreq/amd-pstate: Prevent active mode on systems without EPP support

 drivers/cpufreq/amd-pstate.c | 52 +++++++++++++++++++++++++++++++++++-
 drivers/cpufreq/amd-pstate.h |  2 ++
 2 files changed, 53 insertions(+), 1 deletion(-)

--
2.54.0
On 6/3/26 06:56, Marco Scardovi wrote:
> Hi Prateek and all,
>
> Before trying to merge on top of Mario's git tree, I'll post the new patches
> with the changes you suggested. If something doesn't feel right let me know.
>
> This series addresses a boot-time performance regression on shared memory
> systems where the CPU gets stuck at the lowest non-linear frequency
> (1.7GHz), and adds proper EPP capability checks to prevent active mode from
> being incorrectly enabled on systems without EPP support.
>
> Patch 1 fixes the boot-time initialization regression by introducing an
> `epp_hw_programmed` flag in struct amd_cpudata to track whether EPP has
> been configured on the hardware. This bypasses a false cache hit on boot
> due to zero-initialized caching registers, ensuring that EPP and auto_sel
> are always written during CPU initialization, while still preserving the
> cache guard optimization for all subsequent runtime transitions.
>
> Patch 2 implements an EPP capability check at boot time using a new
> amd_pstate_epp_supported() helper, which queries the EPP performance
> capability on the first online CPU. If EPP is unsupported, the driver falls
> back to passive mode at boot, and rejects any subsequent runtime
> transitions to active mode.
>
> Changes in v2:
> - Patch 1: Rename `epp_initialized` to `epp_hw_programmed`, explicitly
>   document the flag semantics, and add a comment documenting the EPP cache
>   guard optimization behavior.
> - Patch 2: Add comments explaining the uniform CPU capability check on x86,
>   handle EPP capability check errors robustly using a tri-state model (only
>   treat -EOPNOTSUPP as unsupported, warn and assume supported for other
>   errors to avoid false negatives), and reject runtime active mode transitions
>   at sysfs store time (preventing the driver from being left in an
>   unregistered state).
>
> Changes in v1:
> - Fix the boot-time CPPC EPP/auto_sel initialization regression in
>   shmem_set_epp() using a state tracking flag while preserving runtime
>   cache optimization.
> - Add an EPP capability check helper during initialization.
> - Fall back to passive mode at boot if EPP is not supported, and reject
>   transitions to active mode at runtime if EPP is not supported.
>
> Marco Scardovi (2):
>   cpufreq/amd-pstate: Fix EPP initialization for shared memory systems
>   cpufreq/amd-pstate: Prevent active mode on systems without EPP support
>
>  drivers/cpufreq/amd-pstate.c | 52 +++++++++++++++++++++++++++++++++++-
>  drivers/cpufreq/amd-pstate.h |  2 ++
>  2 files changed, 53 insertions(+), 1 deletion(-)
>

Did you see Prateek's suggestion? Can you check if that helps before we go
down this path?

Thanks,
At CPU boot/initialization, the private cpudata structure is allocated
via kzalloc, which means cpudata->cppc_req_cached is initialized to 0.
This makes the default cached EPP value 0 (AMD_CPPC_EPP_PERFORMANCE).
When initializing a system that defaults to performance EPP, the driver
attempts to configure the EPP via shmem_set_epp(policy, 0). Because the
requested EPP (0) matches the uninitialized cached value (0), the early
return check (epp == epp_cached) triggers, and the driver skips writing
to the hardware.
The cppc_set_epp_perf() helper is responsible for writing both the EPP
register and the CPPC autonomous mode enable register (auto_sel). Skipping
the EPP write on initialization due to the false cache hit consequently
skips setting auto_sel to 1, leaving the CPU in non-autonomous mode. This
prevents the hardware from boosting and leaves the CPU frequency stuck at
the lowest non-linear frequency (1.7GHz).
Fix this by introducing an `epp_hw_programmed` flag in struct amd_cpudata
to track whether EPP has been configured on the hardware. Bypass the early
return check in shmem_set_epp() if EPP has not yet been programmed on the
hardware. Once EPP has been successfully programmed once, the cached EPP
value becomes authoritative (representing driver intent rather than dynamic
firmware state), allowing subsequent runtime mode transitions to use the
cache guard optimization safely.
Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221473
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
---
drivers/cpufreq/amd-pstate.c | 9 ++++++++-
drivers/cpufreq/amd-pstate.h | 2 ++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 8d55e2be825b..e8057f3dfb1e 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -424,7 +424,13 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
epp != epp_cached);
}
- if (epp == epp_cached)
+ /*
+ * After the first successful programming, the cached EPP value
+ * becomes authoritative (representing driver intent rather than
+ * dynamic firmware state), allowing us to skip redundant hardware
+ * writes when the EPP value is unchanged.
+ */
+ if (cpudata->epp_hw_programmed && epp == epp_cached)
return 0;
perf_ctrls.energy_perf = epp;
@@ -434,6 +440,7 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
return ret;
}
+ cpudata->epp_hw_programmed = true;
value = READ_ONCE(cpudata->cppc_req_cached);
FIELD_MODIFY(AMD_CPPC_EPP_PERF_MASK, &value, epp);
WRITE_ONCE(cpudata->cppc_req_cached, value);
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index e4722e54387b..fc6cd4b873f6 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -128,6 +128,8 @@ struct amd_cpudata {
u8 epp_default_dc;
bool dynamic_epp;
bool raw_epp;
+ /* Indicates that EPP has been successfully programmed at least once since boot. */
+ bool epp_hw_programmed;
struct notifier_block power_nb;
/* platform profile */
--
2.54.0
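
The false cache hit described in this patch can be reproduced in a small userspace model. This is a sketch under stated assumptions, not the driver code: `struct model_cpudata`, `model_set_epp_old()`, `model_set_epp_new()` and `model_selfcheck()` are hypothetical names, and the call to cppc_set_epp_perf() is simulated by a counter plus a boolean standing in for auto_sel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPP_PERFORMANCE 0x00u	/* mirrors AMD_CPPC_EPP_PERFORMANCE (EPP: 0) */

/* Hypothetical stand-in for the relevant parts of struct amd_cpudata. */
struct model_cpudata {
	uint8_t epp_cached;		/* zero right after allocation */
	bool epp_hw_programmed;		/* flag proposed by the patch */
	bool auto_sel_enabled;		/* models the auto_sel register */
	unsigned int hw_writes;		/* simulated hardware writes */
};

/* Old behaviour: return early whenever the request equals the cache. */
static int model_set_epp_old(struct model_cpudata *d, uint8_t epp)
{
	if (epp == d->epp_cached)
		return 0;
	d->hw_writes++;
	d->auto_sel_enabled = true;
	d->epp_cached = epp;
	return 0;
}

/* Patched behaviour: trust the cache only after one successful write. */
static int model_set_epp_new(struct model_cpudata *d, uint8_t epp)
{
	if (d->epp_hw_programmed && epp == d->epp_cached)
		return 0;
	d->hw_writes++;
	d->auto_sel_enabled = true;
	d->epp_hw_programmed = true;
	d->epp_cached = epp;
	return 0;
}

/* Usage example: the first request for EPP 0 on zeroed state. */
static void model_selfcheck(void)
{
	struct model_cpudata old = {0}, patched = {0};

	model_set_epp_old(&old, EPP_PERFORMANCE);
	assert(old.hw_writes == 0 && !old.auto_sel_enabled);	/* skipped */

	model_set_epp_new(&patched, EPP_PERFORMANCE);
	assert(patched.hw_writes == 1 && patched.auto_sel_enabled);

	model_set_epp_new(&patched, EPP_PERFORMANCE);
	assert(patched.hw_writes == 1);	/* cache guard still works later */
}
```

In the model, the flag costs one extra write on the first call and leaves every later call with an unchanged EPP as cheap as before.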
On 6/3/26 06:56, Marco Scardovi wrote:
> At CPU boot/initialization, the private cpudata structure is allocated
> via kzalloc, which means cpudata->cppc_req_cached is initialized to 0.
> This makes the default cached EPP value 0 (AMD_CPPC_EPP_PERFORMANCE).
>
> When initializing a system that defaults to performance EPP, the driver
> attempts to configure the EPP via shmem_set_epp(policy, 0). Because the
> requested EPP (0) matches the uninitialized cached value (0), the early
> return check (epp == epp_cached) triggers, and the driver skips writing
> to the hardware.
>
> The cppc_set_epp_perf() helper is responsible for writing both the EPP
> register and the CPPC autonomous mode enable register (auto_sel). Skipping
> the EPP write on initialization due to the false cache hit consequently
> skips setting auto_sel to 1, leaving the CPU in non-autonomous mode. This
> prevents the hardware from boosting and leaves the CPU frequency stuck at
> the lowest non-linear frequency (1.7GHz).
>
> Fix this by introducing an `epp_hw_programmed` flag in struct amd_cpudata
> to track whether EPP has been configured on the hardware. Bypass the early
> return check in shmem_set_epp() if EPP has not yet been programmed on the
> hardware. Once EPP has been successfully programmed once, the cached EPP
> value becomes authoritative (representing driver intent rather than dynamic
> firmware state), allowing subsequent runtime mode transitions to use the
> cache guard optimization safely.
>
> Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=221473
> Assisted-by: Antigravity:gemini-3.5-flash
> Signed-off-by: Marco Scardovi <scardracs@disroot.org>
> ---
> drivers/cpufreq/amd-pstate.c | 9 ++++++++-
> drivers/cpufreq/amd-pstate.h | 2 ++
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 8d55e2be825b..e8057f3dfb1e 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -424,7 +424,13 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> epp != epp_cached);
> }
>
> - if (epp == epp_cached)
> + /*
> + * After the first successful programming, the cached EPP value
> + * becomes authoritative (representing driver intent rather than
> + * dynamic firmware state), allowing us to skip redundant hardware
> + * writes when the EPP value is unchanged.
> + */
> + if (cpudata->epp_hw_programmed && epp == epp_cached)
> return 0;
>
> perf_ctrls.energy_perf = epp;
> @@ -434,6 +440,7 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> return ret;
> }
>
> + cpudata->epp_hw_programmed = true;
> value = READ_ONCE(cpudata->cppc_req_cached);
> FIELD_MODIFY(AMD_CPPC_EPP_PERF_MASK, &value, epp);
> WRITE_ONCE(cpudata->cppc_req_cached, value);
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index e4722e54387b..fc6cd4b873f6 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -128,6 +128,8 @@ struct amd_cpudata {
> u8 epp_default_dc;
> bool dynamic_epp;
> bool raw_epp;
> + /* Indicates that EPP has been successfully programmed at least once since boot. */
> + bool epp_hw_programmed;
> struct notifier_block power_nb;
>
> /* platform profile */
I'm having a hard time following why this is needed. Let me explain my
chain of thought.
We start at amd_pstate_epp_cpu_init().
* cpudata (and thus cppc_req_cached) is initialized to 0 via kzalloc()
  - On a server we initialize "epp_default" via a lookup to the HW
    (amd_pstate_get_epp).
  - On a non-server we initialize "epp_default" to 0x80.
* we call amd_pstate_set_epp() with epp_default as the argument.
Now in the shared mem backend (shmem_set_epp) we do a lookup of
epp_cached and it should be 0. We do a comparison of the argument sent
to the function, and this should be 0x80.
So how do we get into a case that amd_pstate_set_epp() is actually
called with 0?
There are some trace points specifically for this purpose - you could
confirm there really is such a call by using them at bootup (or by
starting in passive and with a mode change to active at runtime).
Then we call amd_pstate_set_epp() with that epp_default value.
Hello Mario,
On 6/4/2026 12:23 AM, Mario Limonciello wrote:
>> --- a/drivers/cpufreq/amd-pstate.h
>> +++ b/drivers/cpufreq/amd-pstate.h
>> @@ -128,6 +128,8 @@ struct amd_cpudata {
>> u8 epp_default_dc;
>> bool dynamic_epp;
>> bool raw_epp;
>> + /* Indicates that EPP has been successfully programmed at least once since boot. */
>> + bool epp_hw_programmed;
>> struct notifier_block power_nb;
>> /* platform profile */
>
> I'm having a hard time following why this is needed. Let me explain my chain of thought.
>
> We start at amd_pstate_epp_cpu_init().
> * cpudata (and thus cppc_req_cached) is initialized to 0 via kzalloc()
>   - On a server we initialize "epp_default" via a lookup to the HW (amd_pstate_get_epp).
>   - On a non-server we initialize "epp_default" to 0x80.
> * we call amd_pstate_set_epp() with epp_default as the argument.
>
> Now in the shared mem backend (shmem_set_epp) we do a lookup of epp_cached and it should be 0. We do a comparison of the argument sent to the function, and this should be 0x80.
>
> So how do we get into a case that amd_pstate_set_epp() is actually called with 0?
This was the path I traced yesterday considering amd_dynamic_epp enabled for
a shared memory system that is plugged into the wall:
amd_pstate_epp_cpu_init()
cpudata->epp_default_ac = AMD_CPPC_EPP_PERFORMANCE;
cpudata->epp_default_dc = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
cpudata->current_profile = PLATFORM_PROFILE_BALANCED;
/* cpudata->cppc_req_cached has epp_cached as 0 */
amd_pstate_set_dynamic_epp(policy)
epp = amd_pstate_get_balanced_epp(policy); /* returns cpudata->epp_default_ac which is EPP_PERFORMANCE */
amd_pstate_set_epp(policy, epp /* AMD_CPPC_EPP_PERFORMANCE */)
if (epp == epp_cached)
return;
/*
* Skips cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1)
* where the last argument "enable" enables EPP via the
* AUTO_SEL_ENABLE path.
*/
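
The difference between the two readings of the init path (a static default of 0x80 versus the dynamic EPP default on AC power) can be sketched in userspace as follows. The names `struct init_model`, `model_balanced_epp()` and `model_first_call_skipped()` are hypothetical, and the numeric value 0x80 for the balance-performance EPP is taken from Mario's description of the non-server default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPP_PERFORMANCE		0x00u
#define EPP_BALANCE_PERFORMANCE	0x80u	/* value as quoted in this thread */

/* Hypothetical, reduced view of the per-CPU state at init time. */
struct init_model {
	uint8_t epp_default_ac;	/* default when plugged in */
	uint8_t epp_default_dc;	/* default on battery */
	uint8_t epp_cached;	/* 0 straight after allocation */
};

/* EPP the balanced profile would request for the given power source. */
static uint8_t model_balanced_epp(const struct init_model *m, bool on_ac)
{
	return on_ac ? m->epp_default_ac : m->epp_default_dc;
}

/* True when the first set_epp() call would return before any hardware write. */
static bool model_first_call_skipped(const struct init_model *m, uint8_t epp)
{
	return epp == m->epp_cached;
}

/* Usage example covering the three cases discussed in the thread. */
static void model_selfcheck(void)
{
	struct init_model m = {
		.epp_default_ac = EPP_PERFORMANCE,
		.epp_default_dc = EPP_BALANCE_PERFORMANCE,
		.epp_cached = 0,
	};

	/* Dynamic EPP on AC: request 0 matches the zeroed cache. */
	assert(model_first_call_skipped(&m, model_balanced_epp(&m, true)));
	/* Dynamic EPP on battery: request 0x80 differs, hardware is written. */
	assert(!model_first_call_skipped(&m, model_balanced_epp(&m, false)));
	/* Static non-server default of 0x80: also written. */
	assert(!model_first_call_skipped(&m, EPP_BALANCE_PERFORMANCE));
}
```

Only the AC case trips the guard, which is consistent with the report being reproduced while plugged into the wall.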
>
> There are some trace points specifically for this purpose - you could confirm there really is such a call by using them at bootup (or by starting in passive and with a mode change to active at runtime).
It should show up in the traces yes!
>
> Then we call amd_pstate_set_epp() with that epp_default value.
I think this patch has some merit, although the second patch, I'm not
too sure of. After some digging, it seems we always had AUTO_SEL_ENABLE
and ENERGY_PERF right from the time CPPC was introduced (including Zen1
systems that had CPPC support).
Marco, do you actually have a system that has neither AUTO_SEL_ENABLE
nor ENERGY_PERF? It seems very unlikely that a configuration like that
made it out.
--
Thanks and Regards,
Prateek
On Thursday, 4 June 2026 05:56:20 Central European Summer Time, K
Prateek Nayak wrote:
> Hello Mario,
>
> On 6/4/2026 12:23 AM, Mario Limonciello wrote:
> >> --- a/drivers/cpufreq/amd-pstate.h
> >> +++ b/drivers/cpufreq/amd-pstate.h
> >> @@ -128,6 +128,8 @@ struct amd_cpudata {
> >> u8 epp_default_dc;
> >> bool dynamic_epp;
> >> bool raw_epp;
> >> + /* Indicates that EPP has been successfully programmed at least once since boot. */
> >> + bool epp_hw_programmed;
> >> struct notifier_block power_nb;
> >> /* platform profile */
> >
> > I'm having a hard time following why this is needed. Let me explain my
> > chain of thought.
> >
> > We start at amd_pstate_epp_cpu_init().
> > * cpudata (and thus cppc_req_cached) is initialized to 0 via kzalloc()
> >   - On a server we initialize "epp_default" via a lookup to the HW
> >     (amd_pstate_get_epp).
> >   - On a non-server we initialize "epp_default" to 0x80.
> > * we call amd_pstate_set_epp() with epp_default as the argument.
> >
> > Now in the shared mem backend (shmem_set_epp) we do a lookup of epp_cached
> > and it should be 0. We do a comparison of the argument sent to the
> > function, and this should be 0x80.
> >
> > So how do we get into a case that amd_pstate_set_epp() is actually called
> > with 0?
> This was the path I traced yesterday considering amd_dynamic_epp enabled for
> a shared memory system that is plugged into the wall:
>
> amd_pstate_epp_cpu_init()
> cpudata->epp_default_ac = AMD_CPPC_EPP_PERFORMANCE;
> cpudata->epp_default_dc = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
> cpudata->current_profile = PLATFORM_PROFILE_BALANCED;
>
> /* cpudata->cppc_req_cached has epp_cached as 0 */
>
> amd_pstate_set_dynamic_epp(policy)
> epp = amd_pstate_get_balanced_epp(policy); /* returns
> cpudata->epp_default_ac which is EPP_PERFORMANCE */
>
> amd_pstate_set_epp(policy, epp /* AMD_CPPC_EPP_PERFORMANCE */)
> if (epp == epp_cached)
> return;
>
> /*
> * Skips cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1)
> * where the last argument "enable" enables EPP via the
> * AUTO_SEL_ENABLE path.
> */
>
> > There are some trace points specifically for this purpose - you could
> > confirm there really is such a call by using them at bootup (or by
> > starting in passive and with a mode change to active at runtime).
> It should show up in the traces yes!
>
> > Then we call amd_pstate_set_epp() with that epp_default value.
>
> I think this patch has some merit, although the second patch, I'm not
> too sure of. After some digging, it seems we always had AUTO_SEL_ENABLE
> and ENERGY_PERF right from the time CPPC was introduced (including Zen1
> systems that had CPPC support).
>
> Marco, do you actually have a system that has neither AUTO_SEL_ENABLE
> nor ENERGY_PERF? It seems very unlikely that a configuration like that
> made it out.
Hi Prateek, Mario,
No, I don't have a system lacking AUTO_SEL_ENABLE or ENERGY_PERF support.
The only AMD system I currently have access to is an 8940HX.
I think I understand the reasoning behind both of your comments. My
understanding is that the main question is whether there is a real boot path
where amd_pstate_set_epp() returns early before EPP has ever been programmed
in hardware, which would justify tracking that state separately from the
cached EPP value.
However, I'm not familiar enough with this part of the driver to confidently
turn that reasoning into a correct implementation, so I'd prefer to defer to
your judgement on whether the additional state tracking is the right approach.
Thanks,
Marco
On 6/4/26 01:56, Marco Scardovi wrote:
> On Thursday, 4 June 2026 05:56:20 Central European Summer Time, K
> Prateek Nayak wrote:
>> Hello Mario,
>>
>> On 6/4/2026 12:23 AM, Mario Limonciello wrote:
>>>> --- a/drivers/cpufreq/amd-pstate.h
>>>> +++ b/drivers/cpufreq/amd-pstate.h
>>>> @@ -128,6 +128,8 @@ struct amd_cpudata {
>>>> u8 epp_default_dc;
>>>> bool dynamic_epp;
>>>> bool raw_epp;
>>>> + /* Indicates that EPP has been successfully programmed at least once since boot. */
>>>> + bool epp_hw_programmed;
>>>> struct notifier_block power_nb;
>>>> /* platform profile */
>>>
>>> I'm having a hard time following why this is needed. Let me explain my
>>> chain of thought.
>>>
>>> We start at amd_pstate_epp_cpu_init().
>>> * cpudata (and thus cppc_req_cached) is initialized to 0 via kzalloc()
>>>   - On a server we initialize "epp_default" via a lookup to the HW
>>>     (amd_pstate_get_epp).
>>>   - On a non-server we initialize "epp_default" to 0x80.
>>> * we call amd_pstate_set_epp() with epp_default as the argument.
>>>
>>> Now in the shared mem backend (shmem_set_epp) we do a lookup of epp_cached
>>> and it should be 0. We do a comparison of the argument sent to the
>>> function, and this should be 0x80.
>>>
>>> So how do we get into a case that amd_pstate_set_epp() is actually called
>>> with 0?
>> This was the path I traced yesterday considering amd_dynamic_epp enabled for
>> a shared memory system that is plugged into the wall:
>>
>> amd_pstate_epp_cpu_init()
>> cpudata->epp_default_ac = AMD_CPPC_EPP_PERFORMANCE;
>> cpudata->epp_default_dc = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
>> cpudata->current_profile = PLATFORM_PROFILE_BALANCED;
>>
>> /* cpudata->cppc_req_cached has epp_cached as 0 */
>>
>> amd_pstate_set_dynamic_epp(policy)
>> epp = amd_pstate_get_balanced_epp(policy); /* returns
>> cpudata->epp_default_ac which is EPP_PERFORMANCE */
>>
>> amd_pstate_set_epp(policy, epp /* AMD_CPPC_EPP_PERFORMANCE */)
>> if (epp == epp_cached)
>> return;
>>
>> /*
>> * Skips cppc_set_epp_perf(cpudata->cpu, &perf_ctrls, 1)
>> * where the last argument "enable" enables EPP via the
>> * AUTO_SEL_ENABLE path.
>> */
>>
>>> There are some trace points specifically for this purpose - you could
>>> confirm there really is such a call by using them at bootup (or by
>>> starting in passive and with a mode change to active at runtime).
>> It should show up in the traces yes!
>>
>>> Then we call amd_pstate_set_epp() with that epp_default value.
>>
>> I think this patch has some merit, although the second patch, I'm not
>> too sure of. After some digging, it seems we always had AUTO_SEL_ENABLE
>> and ENERGY_PERF right from the time CPPC was introduced (including Zen1
>> systems that had CPPC support).
>>
>> Marco, do you actually have a system that has neither AUTO_SEL_ENABLE
>> nor ENERGY_PERF? It seems very unlikely that a configuration like that
>> made it out.
>
> Hi Prateek, Mario,
>
> No, I don't have a system lacking AUTO_SEL_ENABLE or ENERGY_PERF support.
> The only AMD system I currently have access to is an 8940HX.
>
> I think I understand the reasoning behind both of your comments. My
> understanding is that the main question is whether there is a real boot path
> where amd_pstate_set_epp() returns early before EPP has ever been programmed
> in hardware, which would justify tracking that state separately from the
> cached EPP value.
>
> However, I'm not familiar enough with this part of the driver to confidently
> turn that reasoning into a correct implementation, so I'd prefer to defer to
> your judgement on whether the additional state tracking is the right approach.
>
> Thanks,
> Marco
>
>
>
So at least for patch 1 if you run dynamic epp, and are plugged into a
wall you start out on EPP 0, but I'm pretty sure that's usually the
firmware default anyway. But this gives me a (better?) idea. How about
we just read the EPP from the firmware initially at startup?
Something like this:
╰─❯ git diff
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 3f06e33f47120..0e32f7f92651b 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1939,6 +1939,8 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
policy->boost_supported = READ_ONCE(cpudata->boost_supported);
+ WRITE_ONCE(cpudata->cppc_req_cached, FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, amd_pstate_get_epp(cpudata)));
+
/*
* Set the policy to provide a valid fallback value in case
* the default cpufreq governor is neither powersave nor performance.
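
The idea of seeding the cache from firmware can be illustrated with a userspace stand-in for FIELD_PREP(). This is a sketch only: placing the EPP field in bits 31:24 of the cached request is an assumption made here for illustration, and `struct seed_model` and every `model_*` function are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration: EPP occupies bits 31:24 of the request. */
#define EPP_PERF_SHIFT	24
#define EPP_PERF_MASK	(UINT64_C(0xff) << EPP_PERF_SHIFT)

/* Userspace equivalents of FIELD_PREP()/FIELD_GET() for the EPP field. */
static uint64_t model_field_prep_epp(uint8_t epp)
{
	return ((uint64_t)epp << EPP_PERF_SHIFT) & EPP_PERF_MASK;
}

static uint8_t model_field_get_epp(uint64_t req)
{
	return (uint8_t)((req & EPP_PERF_MASK) >> EPP_PERF_SHIFT);
}

struct seed_model {
	uint64_t cppc_req_cached;	/* zero after allocation */
	uint8_t fw_epp;			/* EPP currently held by the firmware */
};

/* Mirrors the proposed WRITE_ONCE(): overwrite the cache with the FW value. */
static void model_seed_from_firmware(struct seed_model *m)
{
	m->cppc_req_cached = model_field_prep_epp(m->fw_epp);
}

/* True when the cache guard would skip a request for this EPP. */
static bool model_guard_hits(const struct seed_model *m, uint8_t epp)
{
	return epp == model_field_get_epp(m->cppc_req_cached);
}

/* Usage example: firmware left EPP at 0x80 while the cache is still zero. */
static void model_selfcheck(void)
{
	struct seed_model m = { .cppc_req_cached = 0, .fw_epp = 0x80 };

	assert(model_guard_hits(&m, 0x00));	/* stale: hardware holds 0x80 */
	model_seed_from_firmware(&m);
	assert(!model_guard_hits(&m, 0x00));	/* request for 0 now goes out */
	assert(model_guard_hits(&m, 0x80));	/* skip is now truthful */
}
```

Once the cache is seeded, an early return only happens when the hardware already holds the requested EPP. Whether auto_sel is enabled at that point is a separate question that this model does not cover.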
Hello Mario,
On 6/5/2026 12:13 AM, Mario Limonciello wrote:
> So at least for patch 1 if you run dynamic epp, and are plugged into a wall
> you start out on EPP 0, but I'm pretty sure that's usually the firmware
> default anyway. But this gives me a (better?) idea. How about we just
> read the EPP from the firmware initially at startup?
I'm looking at the ACPI tables across various generations and, from Zen4
onward, we stopped including the optional AUTO_SEL_ENABLE register as
the platforms introduced MSR based interfaces.
The Energy Performance Preference Register definition reads:
If supported, contains a resource descriptor with a single
Register() descriptor that describes a register to which OSPM writes
a value to control the Energy vs. Performance preference of the
platform’s energy efficiency and performance optimization policies
when *Autonomous Selection is enabled*.
I think the shmem platforms should call cppc_set_auto_sel(cpu, 1) during
epp_cpu_init as a safety measure, to ensure we are not missing out on
enabling it if there is some dependency there.
>
> Something like this:
>
> ╰─❯ git diff
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 3f06e33f47120..0e32f7f92651b 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1939,6 +1939,8 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>
> policy->boost_supported = READ_ONCE(cpudata->boost_supported);
>
> + WRITE_ONCE(cpudata->cppc_req_cached, FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, amd_pstate_get_epp(cpudata)));
> +
> /*
> * Set the policy to provide a valid fallback value in case
> * the default cpufreq governor is neither powersave nor performance.
>
Building on top of your suggestion:
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 72df461e7b39..27ed0db02bcb 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1882,6 +1882,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
struct amd_cpudata *cpudata;
union perf_cached perf;
struct device *dev;
+ s16 default_epp;
int ret;
/*
@@ -1928,9 +1929,22 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
-
policy->boost_supported = READ_ONCE(cpudata->boost_supported);
+ /* Cache the firmware programmed EPP */
+ default_epp = amd_pstate_get_epp(cpudata);
+ FIELD_MODIFY(AMD_CPPC_EPP_PERF_MASK, &cpudata->cppc_req_cached, default_epp);
+
+ /*
+ * Shared memory based systems may require the AUTO_SEL_ENABLE register
+ * to be toggled on to function correctly. Since the first call to
+ * amd_pstate_set_epp() may bail out early if the desired EPP is
+ * same as the one configured by the firmware, attempt to toggle the
+ * AUTO_SEL_ENABLE here, independent of EPP programming.
+ */
+ if (!cpu_feature_enabled(X86_FEATURE_CPPC))
+ cppc_set_auto_sel(policy->cpu, 1);
+
/*
* Set the policy to provide a valid fallback value in case
* the default cpufreq governor is neither powersave nor performance.
@@ -1938,7 +1952,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (amd_pstate_acpi_pm_profile_server() ||
amd_pstate_acpi_pm_profile_undefined()) {
policy->policy = CPUFREQ_POLICY_PERFORMANCE;
- cpudata->epp_default_ac = cpudata->epp_default_dc = amd_pstate_get_epp(cpudata);
+ cpudata->epp_default_ac = cpudata->epp_default_dc = default_epp;
cpudata->current_profile = PLATFORM_PROFILE_PERFORMANCE;
} else {
policy->policy = CPUFREQ_POLICY_POWERSAVE;
---
I'm not sure whether guided mode needs this but I'll get my hands on a
Zen2 part to test this out.
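
The effect of decoupling the auto_sel enable from the EPP write can be sketched in userspace. This is a simplified model under stated assumptions: `struct shmem_model`, `model_epp_cpu_init()` and `model_set_epp()` are hypothetical names, `has_msr_cppc` stands in for cpu_feature_enabled(X86_FEATURE_CPPC), and setting `auto_sel` stands in for cppc_set_auto_sel(policy->cpu, 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU as seen by the driver. */
struct shmem_model {
	bool has_msr_cppc;	/* false on shared memory systems */
	bool auto_sel;		/* models AUTO_SEL_ENABLE */
	uint8_t hw_epp;		/* EPP programmed by firmware or driver */
	uint8_t epp_cached;	/* driver-side cache */
	unsigned int epp_writes;
};

/* Init as in the diff above: cache the FW EPP, enable auto_sel on shmem. */
static void model_epp_cpu_init(struct shmem_model *m)
{
	m->epp_cached = m->hw_epp;
	if (!m->has_msr_cppc)
		m->auto_sel = true;
}

/* EPP write with the unchanged early return on a cache match. */
static void model_set_epp(struct shmem_model *m, uint8_t epp)
{
	if (epp == m->epp_cached)
		return;
	m->hw_epp = epp;
	m->epp_cached = epp;
	m->epp_writes++;
}

/* Usage example: shmem system whose firmware already programmed EPP 0. */
static void model_selfcheck(void)
{
	struct shmem_model m = { .has_msr_cppc = false, .hw_epp = 0 };

	model_epp_cpu_init(&m);
	model_set_epp(&m, 0);
	assert(m.epp_writes == 0);	/* EPP write still skipped */
	assert(m.auto_sel);		/* but auto_sel is enabled anyway */
}
```

In this model the early return becomes harmless, because enabling auto_sel no longer depends on whether an EPP write happens.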
--
Thanks and Regards,
Prateek
Some AMD processors or firmware configurations do not support Collaborative
Processor Performance Control (CPPC) Energy Performance Preference (EPP).
When loading the amd-pstate driver in EPP-reliant "active" mode on these
systems, the driver fails to function correctly.
Unconditionally probe EPP support during driver initialization via a new
amd_pstate_epp_supported() helper, which queries cppc_get_epp_perf() on
the first online CPU. EPP capability is uniform across all CPUs on x86,
making a single CPU query sufficient.
Treat EPP support as a tri-state during probing:
- Success: EPP is supported.
- -EOPNOTSUPP: EPP is definitively unsupported.
- Unknown error: Warn about the unexpected failure, but default to assuming
EPP is supported to avoid false negatives.
Cache the capability in a static global `epp_supported` boolean to avoid
redundant capability checks during runtime mode switches.
If EPP is unsupported:
- Fall back to passive mode at boot to keep the driver functional rather
than failing to load entirely.
- Reject runtime switches to active mode via sysfs with -ENODEV and print
a warning.
Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221473
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
---
drivers/cpufreq/amd-pstate.c | 44 ++++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index e8057f3dfb1e..cd052880ecb3 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -27,6 +27,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/smp.h>
+#include <linux/cpumask.h>
#include <linux/sched.h>
#include <linux/cpufreq.h>
#include <linux/compiler.h>
@@ -88,6 +89,8 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool amd_pstate_prefcore = true;
static bool dynamic_epp;
+/* EPP support capability cached for driver mode switches. */
+static bool epp_supported;
static struct quirk_entry *quirks;
/*
@@ -1772,6 +1775,11 @@ int amd_pstate_update_status(const char *buf, size_t size)
if (mode_idx < 0)
return mode_idx;
+ if (mode_idx == AMD_PSTATE_ACTIVE && !epp_supported) {
+ pr_warn("EPP is not supported by this processor, active mode rejected\n");
+ return -ENODEV;
+ }
+
if (mode_state_machine[cppc_state][mode_idx]) {
guard(mutex)(&amd_pstate_driver_lock);
return mode_state_machine[cppc_state][mode_idx](mode_idx);
@@ -2229,6 +2237,36 @@ static bool amd_cppc_supported(void)
return true;
}
+static bool amd_pstate_epp_supported(void)
+{
+ unsigned int cpu = cpumask_first(cpu_online_mask);
+ u64 epp;
+ int ret;
+
+ /*
+ * On symmetric x86 systems, CPPC EPP support is uniform across all
+ * CPUs. Probing the first online CPU is sufficient to determine
+ * system-wide capability.
+ */
+ ret = cppc_get_epp_perf(cpu, &epp);
+ if (!ret)
+ return true;
+
+ /*
+ * We treat EPP support as a tri-state:
+ * - Success (0): EPP is supported.
+ * - Unsupported (-EOPNOTSUPP): EPP is definitively unsupported.
+ * - Unknown error (others): Warn about the unexpected failure, but
+ * default to assuming support to avoid false negatives (this may
+ * be revisited if transient errors cause driver instability).
+ */
+ if (ret == -EOPNOTSUPP)
+ return false;
+
+ pr_warn("Unable to determine EPP capability: %d\n", ret);
+ return true;
+}
+
static int __init amd_pstate_init(void)
{
struct device *dev_root;
@@ -2251,6 +2289,12 @@ static int __init amd_pstate_init(void)
if (cpufreq_get_current_driver())
return -EEXIST;
+ epp_supported = amd_pstate_epp_supported();
+ if (cppc_state == AMD_PSTATE_ACTIVE && !epp_supported) {
+ pr_warn("EPP not supported, falling back to passive mode\n");
+ cppc_state = AMD_PSTATE_PASSIVE;
+ }
+
quirks = NULL;
/* check if this machine need CPPC quirks */
--
2.54.0
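
The tri-state handling and the two fallbacks in this patch can be summarised in a userspace sketch. The names `enum epp_probe`, `enum model_mode` and all `model_*` functions are hypothetical; the probe result is passed in as a plain integer instead of coming from cppc_get_epp_perf().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum epp_probe { EPP_PROBE_SUPPORTED, EPP_PROBE_UNSUPPORTED, EPP_PROBE_UNKNOWN };
enum model_mode { MODE_PASSIVE, MODE_ACTIVE };

/* Map a probe return code onto the three states described above. */
static enum epp_probe model_classify_probe(int ret)
{
	if (ret == 0)
		return EPP_PROBE_SUPPORTED;
	if (ret == -EOPNOTSUPP)
		return EPP_PROBE_UNSUPPORTED;
	return EPP_PROBE_UNKNOWN;
}

/* Only a definitive "unsupported" disables EPP; unknown errors assume support. */
static bool model_epp_supported(int ret)
{
	return model_classify_probe(ret) != EPP_PROBE_UNSUPPORTED;
}

/* Boot-time policy: fall back from active to passive without EPP. */
static enum model_mode model_boot_mode(enum model_mode requested, bool epp_ok)
{
	if (requested == MODE_ACTIVE && !epp_ok)
		return MODE_PASSIVE;
	return requested;
}

/* Runtime policy: refuse a switch to active mode without EPP. */
static int model_switch_mode(enum model_mode target, bool epp_ok)
{
	if (target == MODE_ACTIVE && !epp_ok)
		return -ENODEV;
	return 0;
}

/* Usage example exercising each branch once. */
static void model_selfcheck(void)
{
	assert(model_epp_supported(0));
	assert(!model_epp_supported(-EOPNOTSUPP));
	assert(model_epp_supported(-EIO));	/* unknown error: assume support */
	assert(model_boot_mode(MODE_ACTIVE, false) == MODE_PASSIVE);
	assert(model_switch_mode(MODE_ACTIVE, false) == -ENODEV);
}
```

Treating unknown errors as "supported" keeps the current behaviour on systems where the probe fails for unrelated reasons, at the cost of not catching those systems in this check.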