[PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver

Penny Zheng posted 13 patches 2 months, 1 week ago
Only 12 patches received!
There is a newer version of this series
[PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Penny Zheng 2 months, 1 week ago
Introduce helper set_amd_cppc_para() and get_amd_cppc_para() to
SET/GET CPPC-related para for amd-cppc/amd-cppc-epp driver.

In get_cpufreq_cppc()/set_cpufreq_cppc(), we include
"processor_pminfo[cpuid]->init & XEN_CPPC_INIT" condition check to deal with
cpufreq driver in amd-cppc.

Also, a new field "policy" has also been added in "struct xen_get_cppc_para"
to describe performance policy in active mode. It gets printed with other
cppc paras. Move manifest constants "XEN_CPUFREQ_POLICY_xxx" to public header
to let it be used in user space tools.

Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
v1 -> v2:
- Give the variable des_perf an initializer of 0
- Use the strncmp()s directly in the if()
---
v3 -> v4
- refactor comments
- remove double blank lines
- replace amd_cppc_in_use flag with XEN_PROCESSOR_PM_CPPC
---
v4 -> v5:
- add new field "policy" in "struct xen_cppc_para"
- add new performamce policy XEN_CPUFREQ_POLICY_BALANCE
- drop string comparisons with "processor_pminfo[cpuid]->init & XEN_CPPC_INIT"
and "cpufreq.setpolicy == NULL"
- Blank line ahead of the main "return" of a function
- refactor comments, commit message and title
---
v5 -> v6:
- remove duplicated manifest constants, and just move it to public header
- use "else if" to avoid confusion that it looks as if both paths could be taken
- add check for legitimate perf values
- use "unknown" instead of "none"
- introduce "CPUFREQ_POLICY_END" for array overrun check in user space tools
---
v6 -> v7:
- use ARRAY_SIZE() instead
- ->policy print is avoided in passive mode and print "unknown" in invalid
cases
- let cpufreq_is_governorless() being the variable's initializer
- refactor with the conditional operator to increase readability
- move duplicated defination ahead and use local variable
- avoid using "else-condition" to bring "dead code" in Misra's nomeclature
- move the comment out of public header and into the respective internal
struct field
- wrap set{,get}_amd_cppc_para() with CONFIG_PM_OP
- add symmetry scenario for maximum check
---
 tools/misc/xenpm.c                   |  16 +++
 xen/arch/x86/acpi/cpufreq/amd-cppc.c | 181 +++++++++++++++++++++++++++
 xen/drivers/acpi/pm-op.c             |  10 +-
 xen/include/acpi/cpufreq/cpufreq.h   |  32 ++---
 xen/include/public/sysctl.h          |   6 +
 5 files changed, 226 insertions(+), 19 deletions(-)

diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index 02981c4583..eedb745a46 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -38,6 +38,13 @@
 static xc_interface *xc_handle;
 static unsigned int max_cpu_nr;
 
+static const char cpufreq_policy_str[][12] = {
+    [CPUFREQ_POLICY_UNKNOWN] = "unknown",
+    [CPUFREQ_POLICY_POWERSAVE] = "powersave",
+    [CPUFREQ_POLICY_PERFORMANCE] = "performance",
+    [CPUFREQ_POLICY_ONDEMAND] = "ondemand",
+};
+
 /* help message */
 void show_help(void)
 {
@@ -826,6 +833,15 @@ static void print_cppc_para(unsigned int cpuid,
     printf("                     : desired [%"PRIu32"%s]\n",
            cppc->desired,
            cppc->desired ? "" : " hw autonomous");
+
+    if ( !cppc->desired )
+    {
+        if ( cppc->policy < ARRAY_SIZE(cpufreq_policy_str) )
+            printf("  performance policy : %s\n",
+                   cpufreq_policy_str[cppc->policy]);
+        else
+            printf("  performance policy : unknown\n");
+    }
     printf("\n");
 }
 
diff --git a/xen/arch/x86/acpi/cpufreq/amd-cppc.c b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
index 1b4753a062..493550bbb3 100644
--- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
+++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
@@ -557,6 +557,187 @@ static int cf_check amd_cppc_epp_set_policy(struct cpufreq_policy *policy)
     return 0;
 }
 
+#ifdef CONFIG_PM_OP
+int get_amd_cppc_para(const struct cpufreq_policy *policy,
+                      struct xen_get_cppc_para *cppc_para)
+{
+    const struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data,
+                                                   policy->cpu);
+
+    if ( data == NULL )
+        return -ENODATA;
+
+    cppc_para->policy           = policy->policy;
+    cppc_para->lowest           = data->caps.lowest_perf;
+    cppc_para->lowest_nonlinear = data->caps.lowest_nonlinear_perf;
+    cppc_para->nominal          = data->caps.nominal_perf;
+    cppc_para->highest          = data->caps.highest_perf;
+    cppc_para->minimum          = data->req.min_perf;
+    cppc_para->maximum          = data->req.max_perf;
+    cppc_para->desired          = data->req.des_perf;
+    cppc_para->energy_perf      = data->req.epp;
+
+    return 0;
+}
+
+int set_amd_cppc_para(struct cpufreq_policy *policy,
+                      const struct xen_set_cppc_para *set_cppc)
+{
+    unsigned int cpu = policy->cpu;
+    struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data, cpu);
+    uint8_t max_perf, min_perf, des_perf, epp;
+    bool active_mode = cpufreq_is_governorless(cpu);
+
+    if ( data == NULL )
+        return -ENOENT;
+
+    /* Return if there is nothing to do. */
+    if ( set_cppc->set_params == 0 )
+        return 0;
+
+    /* Only allow values if params bit is set. */
+    if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) &&
+          set_cppc->desired) ||
+         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
+          set_cppc->minimum) ||
+         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
+          set_cppc->maximum) ||
+         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF) &&
+          set_cppc->energy_perf) )
+        return -EINVAL;
+
+    /*
+     * Validate all parameters
+     * Maximum performance may be set to any performance value in the range
+     * [Nonlinear Lowest Performance, Highest Performance], inclusive but must
+     * be set to a value that is larger than or equal to minimum Performance.
+     */
+    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
+         (set_cppc->maximum > data->caps.highest_perf ||
+          set_cppc->maximum <
+                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
+                         ? set_cppc->minimum
+                         : data->req.min_perf)) )
+        return -EINVAL;
+    /*
+     * Minimum performance may be set to any performance value in the range
+     * [Nonlinear Lowest Performance, Highest Performance], inclusive but must
+     * be set to a value that is less than or equal to Maximum Performance.
+     */
+    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
+         (set_cppc->minimum < data->caps.lowest_nonlinear_perf ||
+          (set_cppc->minimum >
+                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM
+                         ? set_cppc->maximum
+                         : data->req.max_perf))) )
+        return -EINVAL;
+    /*
+     * Desired performance may be set to any performance value in the range
+     * [Minimum Performance, Maximum Performance], inclusive.
+     */
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED )
+    {
+        if ( active_mode )
+            return -EOPNOTSUPP;
+
+        if ( (set_cppc->desired >
+                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM
+                         ? set_cppc->maximum
+                         : data->req.max_perf)) ||
+             (set_cppc->desired <
+                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
+                         ? set_cppc->minimum
+                         : data->req.min_perf)) )
+            return -EINVAL;
+    }
+    /*
+     * Energy Performance Preference may be set with a range of values
+     * from 0 to 0xFF
+     */
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF )
+    {
+        if ( !active_mode )
+            return -EOPNOTSUPP;
+
+        if ( set_cppc->energy_perf > UINT8_MAX )
+            return -EINVAL;
+    }
+
+    /* Activity window not supported in MSR */
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ACT_WINDOW )
+        return -EOPNOTSUPP;
+
+    epp = per_cpu(epp_init, cpu);
+    min_perf = data->caps.lowest_nonlinear_perf;
+    max_perf = data->caps.highest_perf;
+    des_perf = data->req.des_perf;
+    /*
+     * Apply presets:
+     * XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE/PERFORMANCE/ONDEMAND are
+     * only available when CPPC in active mode
+     */
+    switch ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_PRESET_MASK )
+    {
+    case XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE:
+        if ( !active_mode )
+            return -EINVAL;
+        policy->policy = CPUFREQ_POLICY_POWERSAVE;
+        /*
+         * Lower max_perf to nonlinear_lowest to achieve
+         * ultmost power saviongs
+         */
+        max_perf = min_perf;
+        epp = CPPC_ENERGY_PERF_MAX_POWERSAVE;
+        break;
+
+    case XEN_SYSCTL_CPPC_SET_PRESET_PERFORMANCE:
+        if ( !active_mode )
+            return -EINVAL;
+        policy->policy = CPUFREQ_POLICY_PERFORMANCE;
+        /* Increase min_perf to highest to achieve ultmost performance */
+        min_perf = max_perf;
+        epp = CPPC_ENERGY_PERF_MAX_PERFORMANCE;
+        break;
+
+    case XEN_SYSCTL_CPPC_SET_PRESET_ONDEMAND:
+        if ( !active_mode )
+            return -EINVAL;
+        policy->policy = CPUFREQ_POLICY_ONDEMAND;
+        /*
+         * Take medium value to show no preference over
+         * performance or powersave
+         */
+        epp = CPPC_ENERGY_PERF_BALANCE;
+        break;
+
+    case XEN_SYSCTL_CPPC_SET_PRESET_NONE:
+        if ( active_mode )
+            policy->policy = CPUFREQ_POLICY_UNKNOWN;
+        break;
+
+    default:
+        return -EINVAL;
+    }
+
+    /* Further customize presets if needed */
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM )
+        min_perf = set_cppc->minimum;
+
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM )
+        max_perf = set_cppc->maximum;
+
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF )
+        epp = set_cppc->energy_perf;
+
+    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED )
+        des_perf = set_cppc->desired;
+
+    amd_cppc_write_request(cpu, min_perf, des_perf, max_perf, epp);
+
+    return 0;
+}
+#endif /* CONFIG_PM_OP */
+
 static const struct cpufreq_driver __initconst_cf_clobber
 amd_cppc_cpufreq_driver =
 {
diff --git a/xen/drivers/acpi/pm-op.c b/xen/drivers/acpi/pm-op.c
index 2b4c8070aa..195dcb88b5 100644
--- a/xen/drivers/acpi/pm-op.c
+++ b/xen/drivers/acpi/pm-op.c
@@ -84,6 +84,8 @@ static int get_cpufreq_cppc(unsigned int cpu,
 
     if ( hwp_active() )
         ret = get_hwp_para(cpu, cppc_para);
+    else if ( processor_pminfo[cpu]->init & XEN_CPPC_INIT )
+        ret = get_amd_cppc_para(per_cpu(cpufreq_cpu_policy, cpu), cppc_para);
 
     return ret;
 }
@@ -317,10 +319,12 @@ static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op)
     if ( !policy || !policy->governor )
         return -ENOENT;
 
-    if ( !hwp_active() )
-        return -EOPNOTSUPP;
+    if ( hwp_active() )
+        return set_hwp_para(policy, &op->u.set_cppc);
+    if ( processor_pminfo[op->cpuid]->init & XEN_CPPC_INIT )
+        return set_amd_cppc_para(policy, &op->u.set_cppc);
 
-    return set_hwp_para(policy, &op->u.set_cppc);
+    return -EOPNOTSUPP;
 }
 
 int do_pm_op(struct xen_sysctl_pm_op *op)
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index babc4a1a2c..6ca686c4b2 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -81,7 +81,18 @@ struct cpufreq_policy {
     int8_t              turbo;  /* tristate flag: 0 for unsupported
                                  * -1 for disable, 1 for enabled
                                  * See CPUFREQ_TURBO_* below for defines */
-    unsigned int        policy; /* CPUFREQ_POLICY_* */
+    unsigned int        policy; /* Performance Policy
+                                 * If cpufreq_driver->target() exists,
+                                 * the ->governor decides what frequency
+                                 * within the limits is used.
+                                 * If cpufreq_driver->setpolicy() exists, these
+                                 * following policies are available:
+                                 * CPUFREQ_POLICY_PERFORMANCE represents
+                                 * maximum performance
+                                 * CPUFREQ_POLICY_POWERSAVE represents least
+                                 * power consumption
+                                 * CPUFREQ_POLICY_ONDEMAND represents no
+                                 * preference over performance or powersave */
 };
 DECLARE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_policy);
 
@@ -132,21 +143,6 @@ extern int cpufreq_register_governor(struct cpufreq_governor *governor);
 extern struct cpufreq_governor *__find_governor(const char *governor);
 #define CPUFREQ_DEFAULT_GOVERNOR &cpufreq_gov_dbs
 
-/*
- * Performance Policy
- * If cpufreq_driver->target() exists, the ->governor decides what frequency
- * within the limits is used. If cpufreq_driver->setpolicy() exists, these
- * following policies are available:
- * CPUFREQ_POLICY_PERFORMANCE represents maximum performance
- * CPUFREQ_POLICY_POWERSAVE represents least power consumption
- * CPUFREQ_POLICY_ONDEMAND represents no preference over performance or
- * powersave
- */
-#define CPUFREQ_POLICY_UNKNOWN      0
-#define CPUFREQ_POLICY_POWERSAVE    1
-#define CPUFREQ_POLICY_PERFORMANCE  2
-#define CPUFREQ_POLICY_ONDEMAND     3
-
 unsigned int cpufreq_policy_from_governor(const struct cpufreq_governor *gov);
 
 /* pass a target to the cpufreq driver */
@@ -293,6 +289,10 @@ int acpi_cpufreq_register(void);
 
 int amd_cppc_cmdline_parse(const char *s, const char *e);
 int amd_cppc_register_driver(void);
+int get_amd_cppc_para(const struct cpufreq_policy *policy,
+                      struct xen_get_cppc_para *cppc_para);
+int set_amd_cppc_para(struct cpufreq_policy *policy,
+                      const struct xen_set_cppc_para *set_cppc);
 
 bool cpufreq_is_governorless(unsigned int cpuid);
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 3f654f98ab..c50fa7bb3c 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -336,8 +336,14 @@ struct xen_ondemand {
     uint32_t up_threshold;
 };
 
+#define CPUFREQ_POLICY_UNKNOWN      0
+#define CPUFREQ_POLICY_POWERSAVE    1
+#define CPUFREQ_POLICY_PERFORMANCE  2
+#define CPUFREQ_POLICY_ONDEMAND     3
+
 struct xen_get_cppc_para {
     /* OUT */
+    uint32_t policy; /* CPUFREQ_POLICY_xxx */
     /* activity_window supported if set */
 #define XEN_SYSCTL_CPPC_FEAT_ACT_WINDOW  (1 << 0)
     uint32_t features; /* bit flags for features */
-- 
2.34.1
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Anthony PERARD 2 months ago
On Fri, Aug 22, 2025 at 06:52:18PM +0800, Penny Zheng wrote:
> diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
> index 02981c4583..eedb745a46 100644
> --- a/tools/misc/xenpm.c
> +++ b/tools/misc/xenpm.c
> @@ -38,6 +38,13 @@
>  static xc_interface *xc_handle;
>  static unsigned int max_cpu_nr;
>  
> +static const char cpufreq_policy_str[][12] = {

Is it necessary to hard-code an hand calculated size of the literal
strings? Can't we let the compiler do that for us? With this as type:

    static const char *cpufreq_policy_str[] = {

The compiler might not detect an issue if we write "11" instead of "12",
for example.

> +    [CPUFREQ_POLICY_UNKNOWN] = "unknown",
> +    [CPUFREQ_POLICY_POWERSAVE] = "powersave",
> +    [CPUFREQ_POLICY_PERFORMANCE] = "performance",
> +    [CPUFREQ_POLICY_ONDEMAND] = "ondemand",
> +};
> +
>  /* help message */
>  void show_help(void)
>  {

Otherwise the tool side of the patch looks fine to me,
so: Acked-by: Anthony PERARD <anthony.perard@vates.tech>

Thanks,

-- 
Anthony PERARD
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Jan Beulich 2 months ago
On 27.08.2025 17:58, Anthony PERARD wrote:
> On Fri, Aug 22, 2025 at 06:52:18PM +0800, Penny Zheng wrote:
>> diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
>> index 02981c4583..eedb745a46 100644
>> --- a/tools/misc/xenpm.c
>> +++ b/tools/misc/xenpm.c
>> @@ -38,6 +38,13 @@
>>  static xc_interface *xc_handle;
>>  static unsigned int max_cpu_nr;
>>  
>> +static const char cpufreq_policy_str[][12] = {
> 
> Is it necessary to hard-code an hand calculated size of the literal
> strings? Can't we let the compiler do that for us? With this as type:
> 
>     static const char *cpufreq_policy_str[] = {

I think it was me to request this. Your approach has an extra level of
indirection (perhaps not a big problem here), and requires runtime
relocations when building as PIE (maybe also not a big problem here).
The 2nd const that's wanted is also, as can be seen, frequently
omitted. Overall I'm generally striving towards using more efficient
code also where efficiency isn't of primary concern, simply because
code is being copied, often without looking very closely.

What we may want to do is bump to 12 to 16, adding some leeway and
making calculations a little easier.

Jan
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Jan Beulich 2 months ago
On 22.08.2025 12:52, Penny Zheng wrote:
> --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> @@ -557,6 +557,187 @@ static int cf_check amd_cppc_epp_set_policy(struct cpufreq_policy *policy)
>      return 0;
>  }
>  
> +#ifdef CONFIG_PM_OP
> +int get_amd_cppc_para(const struct cpufreq_policy *policy,
> +                      struct xen_get_cppc_para *cppc_para)

amd_cppc_get_para() and ...

> +{
> +    const struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data,
> +                                                   policy->cpu);
> +
> +    if ( data == NULL )
> +        return -ENODATA;
> +
> +    cppc_para->policy           = policy->policy;
> +    cppc_para->lowest           = data->caps.lowest_perf;
> +    cppc_para->lowest_nonlinear = data->caps.lowest_nonlinear_perf;
> +    cppc_para->nominal          = data->caps.nominal_perf;
> +    cppc_para->highest          = data->caps.highest_perf;
> +    cppc_para->minimum          = data->req.min_perf;
> +    cppc_para->maximum          = data->req.max_perf;
> +    cppc_para->desired          = data->req.des_perf;
> +    cppc_para->energy_perf      = data->req.epp;
> +
> +    return 0;
> +}
> +
> +int set_amd_cppc_para(struct cpufreq_policy *policy,
> +                      const struct xen_set_cppc_para *set_cppc)

... amd_cppc_set_para() would imo be more consistent names, considering how
other functions are named.

> +{
> +    unsigned int cpu = policy->cpu;
> +    struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data, cpu);
> +    uint8_t max_perf, min_perf, des_perf, epp;
> +    bool active_mode = cpufreq_is_governorless(cpu);
> +
> +    if ( data == NULL )
> +        return -ENOENT;
> +
> +    /* Return if there is nothing to do. */
> +    if ( set_cppc->set_params == 0 )
> +        return 0;

That is ...

> +    /* Only allow values if params bit is set. */
> +    if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) &&
> +          set_cppc->desired) ||
> +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
> +          set_cppc->minimum) ||
> +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
> +          set_cppc->maximum) ||
> +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF) &&
> +          set_cppc->energy_perf) )
> +        return -EINVAL;

... all the errors checked here are to be ignored when no flag is set at
all?

> +    /*
> +     * Validate all parameters
> +     * Maximum performance may be set to any performance value in the range
> +     * [Nonlinear Lowest Performance, Highest Performance], inclusive but must
> +     * be set to a value that is larger than or equal to minimum Performance.
> +     */
> +    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
> +         (set_cppc->maximum > data->caps.highest_perf ||
> +          set_cppc->maximum <
> +                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
> +                         ? set_cppc->minimum
> +                         : data->req.min_perf)) )

Too deep indentation (more of this throughout the function), and seeing ...

> +        return -EINVAL;
> +    /*
> +     * Minimum performance may be set to any performance value in the range
> +     * [Nonlinear Lowest Performance, Highest Performance], inclusive but must
> +     * be set to a value that is less than or equal to Maximum Performance.
> +     */
> +    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
> +         (set_cppc->minimum < data->caps.lowest_nonlinear_perf ||
> +          (set_cppc->minimum >

... this, one more pair of parentheses may also help there. (Recall:
symmetry where possible.)

> +                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM
> +                         ? set_cppc->maximum
> +                         : data->req.max_perf))) )
> +        return -EINVAL;
> +    /*
> +     * Desired performance may be set to any performance value in the range
> +     * [Minimum Performance, Maximum Performance], inclusive.
> +     */
> +    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED )
> +    {
> +        if ( active_mode )
> +            return -EOPNOTSUPP;
> +
> +        if ( (set_cppc->desired >
> +                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM
> +                         ? set_cppc->maximum
> +                         : data->req.max_perf)) ||
> +             (set_cppc->desired <
> +                        (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
> +                         ? set_cppc->minimum
> +                         : data->req.min_perf)) )
> +            return -EINVAL;
> +    }
> +    /*
> +     * Energy Performance Preference may be set with a range of values
> +     * from 0 to 0xFF
> +     */
> +    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ENERGY_PERF )
> +    {
> +        if ( !active_mode )
> +            return -EOPNOTSUPP;
> +
> +        if ( set_cppc->energy_perf > UINT8_MAX )
> +            return -EINVAL;
> +    }
> +
> +    /* Activity window not supported in MSR */
> +    if ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_ACT_WINDOW )
> +        return -EOPNOTSUPP;
> +
> +    epp = per_cpu(epp_init, cpu);
> +    min_perf = data->caps.lowest_nonlinear_perf;
> +    max_perf = data->caps.highest_perf;
> +    des_perf = data->req.des_perf;
> +    /*
> +     * Apply presets:
> +     * XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE/PERFORMANCE/ONDEMAND are
> +     * only available when CPPC in active mode
> +     */
> +    switch ( set_cppc->set_params & XEN_SYSCTL_CPPC_SET_PRESET_MASK )
> +    {
> +    case XEN_SYSCTL_CPPC_SET_PRESET_POWERSAVE:
> +        if ( !active_mode )
> +            return -EINVAL;
> +        policy->policy = CPUFREQ_POLICY_POWERSAVE;
> +        /*
> +         * Lower max_perf to nonlinear_lowest to achieve
> +         * ultmost power saviongs
> +         */
> +        max_perf = min_perf;
> +        epp = CPPC_ENERGY_PERF_MAX_POWERSAVE;
> +        break;
> +
> +    case XEN_SYSCTL_CPPC_SET_PRESET_PERFORMANCE:
> +        if ( !active_mode )
> +            return -EINVAL;
> +        policy->policy = CPUFREQ_POLICY_PERFORMANCE;
> +        /* Increase min_perf to highest to achieve ultmost performance */
> +        min_perf = max_perf;
> +        epp = CPPC_ENERGY_PERF_MAX_PERFORMANCE;
> +        break;
> +
> +    case XEN_SYSCTL_CPPC_SET_PRESET_ONDEMAND:
> +        if ( !active_mode )
> +            return -EINVAL;
> +        policy->policy = CPUFREQ_POLICY_ONDEMAND;
> +        /*
> +         * Take medium value to show no preference over
> +         * performance or powersave
> +         */
> +        epp = CPPC_ENERGY_PERF_BALANCE;
> +        break;
> +
> +    case XEN_SYSCTL_CPPC_SET_PRESET_NONE:
> +        if ( active_mode )
> +            policy->policy = CPUFREQ_POLICY_UNKNOWN;
> +        break;
> +
> +    default:
> +        return -EINVAL;
> +    }

Much of this looks very similar to what patch 09 introduces in
amd_cppc_epp_set_policy(). Is it not possible to reduce the redundancy?

> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ b/xen/include/acpi/cpufreq/cpufreq.h
> @@ -81,7 +81,18 @@ struct cpufreq_policy {
>      int8_t              turbo;  /* tristate flag: 0 for unsupported
>                                   * -1 for disable, 1 for enabled
>                                   * See CPUFREQ_TURBO_* below for defines */
> -    unsigned int        policy; /* CPUFREQ_POLICY_* */
> +    unsigned int        policy; /* Performance Policy
> +                                 * If cpufreq_driver->target() exists,
> +                                 * the ->governor decides what frequency
> +                                 * within the limits is used.
> +                                 * If cpufreq_driver->setpolicy() exists, these
> +                                 * following policies are available:
> +                                 * CPUFREQ_POLICY_PERFORMANCE represents
> +                                 * maximum performance
> +                                 * CPUFREQ_POLICY_POWERSAVE represents least
> +                                 * power consumption
> +                                 * CPUFREQ_POLICY_ONDEMAND represents no
> +                                 * preference over performance or powersave */

Besides not being a well-formed comment, this is close to unreadable in this
shape. This much text wants putting ahead of the field.

> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -336,8 +336,14 @@ struct xen_ondemand {
>      uint32_t up_threshold;
>  };
>  
> +#define CPUFREQ_POLICY_UNKNOWN      0
> +#define CPUFREQ_POLICY_POWERSAVE    1
> +#define CPUFREQ_POLICY_PERFORMANCE  2
> +#define CPUFREQ_POLICY_ONDEMAND     3

Without XEN_ prefixes they shouldn't appear in a public header. But do
we need ...

>  struct xen_get_cppc_para {
>      /* OUT */
> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */

... the new field at all? Can't you synthesize the kind-of-governor into
struct xen_get_cpufreq_para's respective field? You invoke both sub-ops
from xenpm now anyway ...

Jan
RE: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Penny, Zheng 2 months ago
[Public]

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, August 26, 2025 12:03 AM
> To: Penny, Zheng <penny.zheng@amd.com>
> Cc: Huang, Ray <Ray.Huang@amd.com>; Anthony PERARD
> <anthony.perard@vates.tech>; Andrew Cooper <andrew.cooper3@citrix.com>;
> Orzel, Michal <Michal.Orzel@amd.com>; Julien Grall <julien@xen.org>; Roger
> Pau Monné <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC
> xen_sysctl_pm_op for amd-cppc driver
>
> On 22.08.2025 12:52, Penny Zheng wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
> > +    /* Only allow values if params bit is set. */
> > +    if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) &&
> > +          set_cppc->desired) ||
> > +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
> > +          set_cppc->minimum) ||
> > +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
> > +          set_cppc->maximum) ||
> > +         (!(set_cppc->set_params &
> XEN_SYSCTL_CPPC_SET_ENERGY_PERF) &&
> > +          set_cppc->energy_perf) )
> > +        return -EINVAL;
>
> ... all the errors checked here are to be ignored when no flag is set at all?
>

Yes, values are only meaningful when according flag is properly set, which has been described in the comment for "struct xen_set_cppc_para"

> > +    /*
> > +     * Validate all parameters
> > +     * Maximum performance may be set to any performance value in the range
> > +     * [Nonlinear Lowest Performance, Highest Performance], inclusive but
> must
> > +     * be set to a value that is larger than or equal to minimum Performance.
> > +     */
> > +    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
> > +         (set_cppc->maximum > data->caps.highest_perf ||
> > +          set_cppc->maximum <
> > +                        (set_cppc->set_params &
> XEN_SYSCTL_CPPC_SET_MINIMUM
> > +                         ? set_cppc->minimum
> > +                         : data->req.min_perf)) )
>
> Too deep indentation (more of this throughout the function), and seeing ...

Maybe four indention is more proper
```
        if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
             (set_cppc->maximum > data->caps.highest_perf ||
              (set_cppc->maximum <
                          (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
                    ? set_cppc->minimum
                    : data->req.min_perf))) )
```

> > +    case XEN_SYSCTL_CPPC_SET_PRESET_NONE:
> > +        if ( active_mode )
> > +            policy->policy = CPUFREQ_POLICY_UNKNOWN;
> > +        break;
> > +
> > +    default:
> > +        return -EINVAL;
> > +    }
>
> Much of this looks very similar to what patch 09 introduces in
> amd_cppc_epp_set_policy(). Is it not possible to reduce the redundancy?
>

I'll add a new helper to amd_cppc_prepare_policy() to extract common

> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -336,8 +336,14 @@ struct xen_ondemand {
> >      uint32_t up_threshold;
> >  };
> >
> > +#define CPUFREQ_POLICY_UNKNOWN      0
> > +#define CPUFREQ_POLICY_POWERSAVE    1
> > +#define CPUFREQ_POLICY_PERFORMANCE  2
> > +#define CPUFREQ_POLICY_ONDEMAND     3
>
> Without XEN_ prefixes they shouldn't appear in a public header. But do we
> need ...
>
> >  struct xen_get_cppc_para {
> >      /* OUT */
> > +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
>
> ... the new field at all? Can't you synthesize the kind-of-governor into struct
> xen_get_cpufreq_para's respective field? You invoke both sub-ops from xenpm
> now anyway ...
>

Maybe I could borrow governor field to indicate policy info, like the following in print_cpufreq_para(), then we don't need to add the new filed "policy"
```
+    /* Translate governor info to policy info in CPPC active mode */
+    if ( is_cppc_active )
+    {
+        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
+                      "ondemand", CPUFREQ_NAME_LEN) )
+            printf("cppc policy           : ondemand\n");
+        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
+                           "performance", CPUFREQ_NAME_LEN) )
+            printf("cppc policy           : performance\n");
+
+        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
+                           "powersave", CPUFREQ_NAME_LEN) )
+            printf("cppc policy           : powersave\n");
+        else
+            printf("cppc policy           : unknown\n");
+    }
+
```

> Jan
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Jan Beulich 2 months ago
On 28.08.2025 06:06, Penny, Zheng wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Tuesday, August 26, 2025 12:03 AM
>>
>> On 22.08.2025 12:52, Penny Zheng wrote:
>>> --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c
>>> +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c
>>> +    /* Only allow values if params bit is set. */
>>> +    if ( (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_DESIRED) &&
>>> +          set_cppc->desired) ||
>>> +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM) &&
>>> +          set_cppc->minimum) ||
>>> +         (!(set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
>>> +          set_cppc->maximum) ||
>>> +         (!(set_cppc->set_params &
>> XEN_SYSCTL_CPPC_SET_ENERGY_PERF) &&
>>> +          set_cppc->energy_perf) )
>>> +        return -EINVAL;
>>
>> ... all the errors checked here are to be ignored when no flag is set at all?
> 
> Yes, values are only meaningful when according flag is properly set, which has been described in the comment for "struct xen_set_cppc_para"

Especially since you stripped the initial part of this comment of mine, it feels
as if you misunderstood my request. What it boils down to is the question whether
"if ( set_cppc->set_params == 0 )" shouldn't move after the if() you left in
context above.

>>> +    /*
>>> +     * Validate all parameters
>>> +     * Maximum performance may be set to any performance value in the range
>>> +     * [Nonlinear Lowest Performance, Highest Performance], inclusive but
>> must
>>> +     * be set to a value that is larger than or equal to minimum Performance.
>>> +     */
>>> +    if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
>>> +         (set_cppc->maximum > data->caps.highest_perf ||
>>> +          set_cppc->maximum <
>>> +                        (set_cppc->set_params &
>> XEN_SYSCTL_CPPC_SET_MINIMUM
>>> +                         ? set_cppc->minimum
>>> +                         : data->req.min_perf)) )
>>
>> Too deep indentation (more of this throughout the function), and seeing ...
> 
> Maybe four indention is more proper
> ```
>         if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
>              (set_cppc->maximum > data->caps.highest_perf ||
>               (set_cppc->maximum <
>                           (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
>                     ? set_cppc->minimum
>                     : data->req.min_perf))) )
> ```

No. In expressions you always want to indent according to pending open
parentheses:

        if ( (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MAXIMUM) &&
             (set_cppc->maximum > data->caps.highest_perf ||
              (set_cppc->maximum <
               (set_cppc->set_params & XEN_SYSCTL_CPPC_SET_MINIMUM
                ? set_cppc->minimum
                : data->req.min_perf))) )

>>> +    case XEN_SYSCTL_CPPC_SET_PRESET_NONE:
>>> +        if ( active_mode )
>>> +            policy->policy = CPUFREQ_POLICY_UNKNOWN;
>>> +        break;
>>> +
>>> +    default:
>>> +        return -EINVAL;
>>> +    }
>>
>> Much of this looks very similar to what patch 09 introduces in
>> amd_cppc_epp_set_policy(). Is it not possible to reduce the redundancy?
>>
> 
> I'll add a new helper to amd_cppc_prepare_policy() to extract common
> 
>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -336,8 +336,14 @@ struct xen_ondemand {
>>>      uint32_t up_threshold;
>>>  };
>>>
>>> +#define CPUFREQ_POLICY_UNKNOWN      0
>>> +#define CPUFREQ_POLICY_POWERSAVE    1
>>> +#define CPUFREQ_POLICY_PERFORMANCE  2
>>> +#define CPUFREQ_POLICY_ONDEMAND     3
>>
>> Without XEN_ prefixes they shouldn't appear in a public header. But do we
>> need ...
>>
>>>  struct xen_get_cppc_para {
>>>      /* OUT */
>>> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
>>
>> ... the new field at all? Can't you synthesize the kind-of-governor into struct
>> xen_get_cpufreq_para's respective field? You invoke both sub-ops from xenpm
>> now anyway ...
>>
> 
> Maybe I could borrow governor field to indicate policy info, like the following in print_cpufreq_para(), then we don't need to add the new filed "policy"
> ```
> +    /* Translate governor info to policy info in CPPC active mode */
> +    if ( is_cppc_active )
> +    {
> +        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> +                      "ondemand", CPUFREQ_NAME_LEN) )
> +            printf("cppc policy           : ondemand\n");
> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> +                           "performance", CPUFREQ_NAME_LEN) )
> +            printf("cppc policy           : performance\n");
> +
> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> +                           "powersave", CPUFREQ_NAME_LEN) )
> +            printf("cppc policy           : powersave\n");
> +        else
> +            printf("cppc policy           : unknown\n");
> +    }
> +
> ```

Something like this is what I was thinking of, yes.

Jan
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Jan Beulich 2 months ago
On 28.08.2025 08:35, Jan Beulich wrote:
> On 28.08.2025 06:06, Penny, Zheng wrote:
>>> -----Original Message-----
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: Tuesday, August 26, 2025 12:03 AM
>>>
>>> On 22.08.2025 12:52, Penny Zheng wrote:
>>>> --- a/xen/include/public/sysctl.h
>>>> +++ b/xen/include/public/sysctl.h
>>>> @@ -336,8 +336,14 @@ struct xen_ondemand {
>>>>      uint32_t up_threshold;
>>>>  };
>>>>
>>>> +#define CPUFREQ_POLICY_UNKNOWN      0
>>>> +#define CPUFREQ_POLICY_POWERSAVE    1
>>>> +#define CPUFREQ_POLICY_PERFORMANCE  2
>>>> +#define CPUFREQ_POLICY_ONDEMAND     3
>>>
>>> Without XEN_ prefixes they shouldn't appear in a public header. But do we
>>> need ...
>>>
>>>>  struct xen_get_cppc_para {
>>>>      /* OUT */
>>>> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
>>>
>>> ... the new field at all? Can't you synthesize the kind-of-governor into struct
>>> xen_get_cpufreq_para's respective field? You invoke both sub-ops from xenpm
>>> now anyway ...
>>>
>>
>> Maybe I could borrow governor field to indicate policy info, like the following in print_cpufreq_para(), then we don't need to add the new filed "policy"
>> ```
>> +    /* Translate governor info to policy info in CPPC active mode */
>> +    if ( is_cppc_active )
>> +    {
>> +        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>> +                      "ondemand", CPUFREQ_NAME_LEN) )
>> +            printf("cppc policy           : ondemand\n");
>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>> +                           "performance", CPUFREQ_NAME_LEN) )
>> +            printf("cppc policy           : performance\n");
>> +
>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>> +                           "powersave", CPUFREQ_NAME_LEN) )
>> +            printf("cppc policy           : powersave\n");
>> +        else
>> +            printf("cppc policy           : unknown\n");
>> +    }
>> +
>> ```
> 
> Something like this is what I was thinking of, yes.

Albeit - why the complicated if/else sequence? Why not simply print
the field the hypercall returned?

Jan
RE: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Penny, Zheng 2 months ago
[Public]

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, August 28, 2025 2:38 PM
> To: Penny, Zheng <penny.zheng@amd.com>
> Cc: Huang, Ray <Ray.Huang@amd.com>; Anthony PERARD
> <anthony.perard@vates.tech>; Andrew Cooper <andrew.cooper3@citrix.com>;
> Orzel, Michal <Michal.Orzel@amd.com>; Julien Grall <julien@xen.org>; Roger Pau
> Monné <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC
> xen_sysctl_pm_op for amd-cppc driver
>
> On 28.08.2025 08:35, Jan Beulich wrote:
> > On 28.08.2025 06:06, Penny, Zheng wrote:
> >>> -----Original Message-----
> >>> From: Jan Beulich <jbeulich@suse.com>
> >>> Sent: Tuesday, August 26, 2025 12:03 AM
> >>>
> >>> On 22.08.2025 12:52, Penny Zheng wrote:
> >>>> --- a/xen/include/public/sysctl.h
> >>>> +++ b/xen/include/public/sysctl.h
> >>>> @@ -336,8 +336,14 @@ struct xen_ondemand {
> >>>>      uint32_t up_threshold;
> >>>>  };
> >>>>
> >>>> +#define CPUFREQ_POLICY_UNKNOWN      0
> >>>> +#define CPUFREQ_POLICY_POWERSAVE    1
> >>>> +#define CPUFREQ_POLICY_PERFORMANCE  2
> >>>> +#define CPUFREQ_POLICY_ONDEMAND     3
> >>>
> >>> Without XEN_ prefixes they shouldn't appear in a public header. But
> >>> do we need ...
> >>>
> >>>>  struct xen_get_cppc_para {
> >>>>      /* OUT */
> >>>> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
> >>>
> >>> ... the new field at all? Can't you synthesize the kind-of-governor
> >>> into struct xen_get_cpufreq_para's respective field? You invoke both
> >>> sub-ops from xenpm now anyway ...
> >>>
> >>
> >> Maybe I could borrow governor field to indicate policy info, like the following in
> print_cpufreq_para(), then we don't need to add the new filed "policy"
> >> ```
> >> +    /* Translate governor info to policy info in CPPC active mode */
> >> +    if ( is_cppc_active )
> >> +    {
> >> +        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >> +                      "ondemand", CPUFREQ_NAME_LEN) )
> >> +            printf("cppc policy           : ondemand\n");
> >> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >> +                           "performance", CPUFREQ_NAME_LEN) )
> >> +            printf("cppc policy           : performance\n");
> >> +
> >> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >> +                           "powersave", CPUFREQ_NAME_LEN) )
> >> +            printf("cppc policy           : powersave\n");
> >> +        else
> >> +            printf("cppc policy           : unknown\n");
> >> +    }
> >> +
> >> ```
> >
> > Something like this is what I was thinking of, yes.
>
> Albeit - why the complicated if/else sequence? Why not simply print the field the
> hypercall returned?
>

userspace governor doesn't have according policy. I could simplify it to
```
        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
             "userspace", CPUFREQ_NAME_LEN) )
                printf("policy               : unknown\n");
        else
                printf("policy               : %s\n",
                          p_cpufreq->u.s.scaling_governor);
```


> Jan
Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Jan Beulich 2 months ago
On 28.08.2025 08:54, Penny, Zheng wrote:
> [Public]
> 
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Thursday, August 28, 2025 2:38 PM
>> To: Penny, Zheng <penny.zheng@amd.com>
>> Cc: Huang, Ray <Ray.Huang@amd.com>; Anthony PERARD
>> <anthony.perard@vates.tech>; Andrew Cooper <andrew.cooper3@citrix.com>;
>> Orzel, Michal <Michal.Orzel@amd.com>; Julien Grall <julien@xen.org>; Roger Pau
>> Monné <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
>> devel@lists.xenproject.org
>> Subject: Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC
>> xen_sysctl_pm_op for amd-cppc driver
>>
>> On 28.08.2025 08:35, Jan Beulich wrote:
>>> On 28.08.2025 06:06, Penny, Zheng wrote:
>>>>> -----Original Message-----
>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>> Sent: Tuesday, August 26, 2025 12:03 AM
>>>>>
>>>>> On 22.08.2025 12:52, Penny Zheng wrote:
>>>>>> --- a/xen/include/public/sysctl.h
>>>>>> +++ b/xen/include/public/sysctl.h
>>>>>> @@ -336,8 +336,14 @@ struct xen_ondemand {
>>>>>>      uint32_t up_threshold;
>>>>>>  };
>>>>>>
>>>>>> +#define CPUFREQ_POLICY_UNKNOWN      0
>>>>>> +#define CPUFREQ_POLICY_POWERSAVE    1
>>>>>> +#define CPUFREQ_POLICY_PERFORMANCE  2
>>>>>> +#define CPUFREQ_POLICY_ONDEMAND     3
>>>>>
>>>>> Without XEN_ prefixes they shouldn't appear in a public header. But
>>>>> do we need ...
>>>>>
>>>>>>  struct xen_get_cppc_para {
>>>>>>      /* OUT */
>>>>>> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
>>>>>
>>>>> ... the new field at all? Can't you synthesize the kind-of-governor
>>>>> into struct xen_get_cpufreq_para's respective field? You invoke both
>>>>> sub-ops from xenpm now anyway ...
>>>>>
>>>>
>>>> Maybe I could borrow governor field to indicate policy info, like the following in
>> print_cpufreq_para(), then we don't need to add the new filed "policy"
>>>> ```
>>>> +    /* Translate governor info to policy info in CPPC active mode */
>>>> +    if ( is_cppc_active )
>>>> +    {
>>>> +        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>>>> +                      "ondemand", CPUFREQ_NAME_LEN) )
>>>> +            printf("cppc policy           : ondemand\n");
>>>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>>>> +                           "performance", CPUFREQ_NAME_LEN) )
>>>> +            printf("cppc policy           : performance\n");
>>>> +
>>>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>>>> +                           "powersave", CPUFREQ_NAME_LEN) )
>>>> +            printf("cppc policy           : powersave\n");
>>>> +        else
>>>> +            printf("cppc policy           : unknown\n");
>>>> +    }
>>>> +
>>>> ```
>>>
>>> Something like this is what I was thinking of, yes.
>>
>> Albeit - why the complicated if/else sequence? Why not simply print the field the
>> hypercall returned?
> 
> userspace governor doesn't have according policy. I could simplify it to
> ```
>         if ( !strncmp(p_cpufreq->u.s.scaling_governor,
>              "userspace", CPUFREQ_NAME_LEN) )
>                 printf("policy               : unknown\n");
>         else
>                 printf("policy               : %s\n",
>                           p_cpufreq->u.s.scaling_governor);
> ```

But the hypervisor shouldn't report back "userspace" when the CPPC driver
is in use. ANd I think the tool is okay to trust the hypervisor.

Jan

RE: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC xen_sysctl_pm_op for amd-cppc driver
Posted by Penny, Zheng 2 months ago
[Public]

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, August 28, 2025 3:09 PM
> To: Penny, Zheng <penny.zheng@amd.com>
> Cc: Huang, Ray <Ray.Huang@amd.com>; Anthony PERARD
> <anthony.perard@vates.tech>; Andrew Cooper <andrew.cooper3@citrix.com>;
> Orzel, Michal <Michal.Orzel@amd.com>; Julien Grall <julien@xen.org>; Roger Pau
> Monné <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v7 13/13] xen/cpufreq: Adapt SET/GET_CPUFREQ_CPPC
> xen_sysctl_pm_op for amd-cppc driver
>
> On 28.08.2025 08:54, Penny, Zheng wrote:
> > [Public]
> >
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Thursday, August 28, 2025 2:38 PM
> >> To: Penny, Zheng <penny.zheng@amd.com>
> >> Cc: Huang, Ray <Ray.Huang@amd.com>; Anthony PERARD
> >> <anthony.perard@vates.tech>; Andrew Cooper
> >> <andrew.cooper3@citrix.com>; Orzel, Michal <Michal.Orzel@amd.com>;
> >> Julien Grall <julien@xen.org>; Roger Pau Monné
> >> <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>;
> >> xen- devel@lists.xenproject.org
> >> Subject: Re: [PATCH v7 13/13] xen/cpufreq: Adapt
> SET/GET_CPUFREQ_CPPC
> >> xen_sysctl_pm_op for amd-cppc driver
> >>
> >> On 28.08.2025 08:35, Jan Beulich wrote:
> >>> On 28.08.2025 06:06, Penny, Zheng wrote:
> >>>>> -----Original Message-----
> >>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>> Sent: Tuesday, August 26, 2025 12:03 AM
> >>>>>
> >>>>> On 22.08.2025 12:52, Penny Zheng wrote:
> >>>>>> --- a/xen/include/public/sysctl.h
> >>>>>> +++ b/xen/include/public/sysctl.h
> >>>>>> @@ -336,8 +336,14 @@ struct xen_ondemand {
> >>>>>>      uint32_t up_threshold;
> >>>>>>  };
> >>>>>>
> >>>>>> +#define CPUFREQ_POLICY_UNKNOWN      0
> >>>>>> +#define CPUFREQ_POLICY_POWERSAVE    1
> >>>>>> +#define CPUFREQ_POLICY_PERFORMANCE  2
> >>>>>> +#define CPUFREQ_POLICY_ONDEMAND     3
> >>>>>
> >>>>> Without XEN_ prefixes they shouldn't appear in a public header.
> >>>>> But do we need ...
> >>>>>
> >>>>>>  struct xen_get_cppc_para {
> >>>>>>      /* OUT */
> >>>>>> +    uint32_t policy; /* CPUFREQ_POLICY_xxx */
> >>>>>
> >>>>> ... the new field at all? Can't you synthesize the
> >>>>> kind-of-governor into struct xen_get_cpufreq_para's respective
> >>>>> field? You invoke both sub-ops from xenpm now anyway ...
> >>>>>
> >>>>
> >>>> Maybe I could borrow governor field to indicate policy info, like
> >>>> the following in
> >> print_cpufreq_para(), then we don't need to add the new filed "policy"
> >>>> ```
> >>>> +    /* Translate governor info to policy info in CPPC active mode */
> >>>> +    if ( is_cppc_active )
> >>>> +    {
> >>>> +        if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >>>> +                      "ondemand", CPUFREQ_NAME_LEN) )
> >>>> +            printf("cppc policy           : ondemand\n");
> >>>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >>>> +                           "performance", CPUFREQ_NAME_LEN) )
> >>>> +            printf("cppc policy           : performance\n");
> >>>> +
> >>>> +        else if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >>>> +                           "powersave", CPUFREQ_NAME_LEN) )
> >>>> +            printf("cppc policy           : powersave\n");
> >>>> +        else
> >>>> +            printf("cppc policy           : unknown\n");
> >>>> +    }
> >>>> +
> >>>> ```
> >>>
> >>> Something like this is what I was thinking of, yes.
> >>
> >> Albeit - why the complicated if/else sequence? Why not simply print
> >> the field the hypercall returned?
> >
> > userspace governor doesn't have according policy. I could simplify it
> > to ```
> >         if ( !strncmp(p_cpufreq->u.s.scaling_governor,
> >              "userspace", CPUFREQ_NAME_LEN) )
> >                 printf("policy               : unknown\n");
> >         else
> >                 printf("policy               : %s\n",
> >                           p_cpufreq->u.s.scaling_governor); ```
>
> But the hypervisor shouldn't report back "userspace" when the CPPC driver is in
> use. ANd I think the tool is okay to trust the hypervisor.

True, we shall make sure governor is set properly in hypervisor side even in cppc mode

>
> Jan