Hi,
This patch series adds Hardware-Controlled Performance States (HWP) for
Intel processors to Xen.
With HWP, the processor makes its own determinations for frequency
selection, though users can set some parameters and preferences. There
is also Turbo Boost which dynamically pushes the max frequency if
possible.
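For reviewers less familiar with the knobs involved, below is a minimal
sketch of how the main request fields are laid out in the low 32 bits of
the IA32_HWP_REQUEST MSR, going by the Intel SDM. The struct and helper
names are made up for illustration and are not the ones used in this
series.

```c
#include <stdint.h>

/* Hypothetical container for illustration; not the series' structure. */
struct hwp_req_sketch {
    uint8_t min_perf;     /* bits 7:0   lowest performance level allowed */
    uint8_t max_perf;     /* bits 15:8  highest performance level allowed */
    uint8_t desired;      /* bits 23:16 0 lets the hardware choose */
    uint8_t energy_perf;  /* bits 31:24 0 = performance, 255 = energy saving */
};

/* Pack the fields into the low 32 bits of an IA32_HWP_REQUEST value. */
static uint64_t hwp_req_pack(const struct hwp_req_sketch *r)
{
    return (uint64_t)r->min_perf |
           ((uint64_t)r->max_perf << 8) |
           ((uint64_t)r->desired << 16) |
           ((uint64_t)r->energy_perf << 24);
}
```

The activity window and package control bits live above bit 31 and are
left out of the sketch.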
The existing governors don't work with HWP since they select frequencies
and HWP doesn't expose those. Therefore a dummy hwp-internal governor is
used that doesn't do anything.
xenpm get-cpufreq-para is extended to show HWP parameters, and
set-cpufreq-hwp is added to set them.
A lightly loaded OpenXT laptop showed ~1W power savings according to
powertop. A mostly idle Fedora system (dom0 only) showed more modest
power savings.
This is for a 10th gen 6-core CPU with a 1600 MHz base and 4900 MHz max
frequency. In the default balance mode, Turbo Boost doesn't exceed 4GHz.
Tweaking the energy_perf preference with
`xenpm set-cpufreq-hwp balance ene:64`, I've seen the CPU hit 4.7GHz
before throttling down and bouncing around between 4.3 and 4.5 GHz.
Curiously the other cores read ~4GHz when Turbo Boost takes effect. This
was done after pinning all dom0 cores, and using taskset to pin to
vCPU/pCPU 11 and running a bash tight loop.
In v2, I think I addressed all comments for v1. I kept patch 11 "xenpm:
Factor out a non-fatal cpuid_parse variant", with a v2 comment
explaining why I keep it.
HWP defaults to disabled. When enabled, it runs with the existing HWP
configuration; it doesn't reconfigure by default. It can be enabled with
cpufreq=xen:hwp.
Hardware Duty Cycling (HDC) is another feature to autonomously power
down things. It defaults to enabled when HWP is enabled, but HDC can be
disabled on the command line with cpufreq=xen:hwp,no-hdc.
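To make the option semantics concrete, here is a rough sketch of parsing
the part after "cpufreq=xen:". It is not the parser from this series:
the type and function names are hypothetical, and it only knows the
"hwp" and "no-hdc" tokens mentioned above, rejecting anything else.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type; the series keeps its own state. */
struct hwp_opts_sketch {
    bool hwp;  /* HWP driver requested */
    bool hdc;  /* Hardware Duty Cycling wanted */
};

/*
 * Parse a comma-separated list such as "hwp,no-hdc".
 * Returns 0 on success, -1 on an unknown or empty token.
 */
static int parse_hwp_opts_sketch(const char *s, struct hwp_opts_sketch *o)
{
    bool no_hdc = false;

    o->hwp = false;
    o->hdc = false;

    while (*s) {
        size_t len = strcspn(s, ",");  /* length of the current token */

        if (len == 3 && !strncmp(s, "hwp", 3))
            o->hwp = true;
        else if (len == 6 && !strncmp(s, "no-hdc", 6))
            no_hdc = true;
        else
            return -1;

        s += len;
        if (*s == ',')
            s++;
    }

    /* HDC defaults to enabled once HWP is enabled, unless opted out. */
    o->hdc = o->hwp && !no_hdc;
    return 0;
}
```

Resolving HDC after the loop keeps the result independent of token
order, so "no-hdc,hwp" behaves the same as "hwp,no-hdc" in this sketch.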
I've only tested on 8th gen and 10th gen systems with activity window
and energy_perf support. So the paths for CPUs lacking those features
are untested.
Fast MSR support was removed in v2. The model specific checking was not
done properly, and I don't have hardware to test with. Since writes are
expected to be infrequent, I just removed the code.
This changes the sysctl_pm_op hypercall, so that wants review.
Regards,
Jason
Jason Andryuk (13):
cpufreq: Allow restricting to internal governors only
cpufreq: Add perf_freq to cpuinfo
cpufreq: Export intel_feature_detect
cpufreq: Add Hardware P-State (HWP) driver
xenpm: Change get-cpufreq-para output for internal
cpufreq: Export HWP parameters to userspace
libxc: Include hwp_para in definitions
xenpm: Print HWP parameters
xen: Add SET_CPUFREQ_HWP xen_sysctl_pm_op
libxc: Add xc_set_cpufreq_hwp
xenpm: Factor out a non-fatal cpuid_parse variant
xenpm: Add set-cpufreq-hwp subcommand
CHANGELOG: Add Intel HWP entry
CHANGELOG.md | 3 +
docs/misc/xen-command-line.pandoc | 8 +-
tools/include/xenctrl.h | 6 +
tools/libs/ctrl/xc_pm.c | 18 +
tools/misc/xenpm.c | 355 +++++++++++-
xen/arch/x86/acpi/cpufreq/Makefile | 1 +
xen/arch/x86/acpi/cpufreq/cpufreq.c | 15 +-
xen/arch/x86/acpi/cpufreq/hwp.c | 627 ++++++++++++++++++++++
xen/arch/x86/include/asm/cpufeature.h | 13 +-
xen/arch/x86/include/asm/msr-index.h | 13 +
xen/drivers/acpi/pmstat.c | 28 +
xen/drivers/cpufreq/cpufreq.c | 37 ++
xen/drivers/cpufreq/utility.c | 1 +
xen/include/acpi/cpufreq/cpufreq.h | 14 +
xen/include/acpi/cpufreq/processor_perf.h | 3 +
xen/include/public/sysctl.h | 57 ++
16 files changed, 1171 insertions(+), 28 deletions(-)
create mode 100644 xen/arch/x86/acpi/cpufreq/hwp.c
--
2.37.1