[RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR

Teddy Astie posted 2 patches 2 days, 4 hours ago
There is a newer version of this series
[RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR
Posted by Teddy Astie 2 days, 4 hours ago
Intel provide CPU sensors through "DTS" MSRs. As there MSR are core-specific
(or package-specific), we can't reliably fetch them from Dom0 directly.
Expose these MSR (if supported) through XENPF_resource_op so that it is
accessible through hypercall.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
I'm not a fan of doing a inline cpuid check here, but I don't have a
better approach in mind.

 xen/arch/x86/include/asm/msr-index.h | 2 ++
 xen/arch/x86/platform_hypercall.c    | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index df52587c85..98dda629e5 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -510,6 +510,8 @@
 #define MSR_IA32_THERM_INTERRUPT	0x0000019b
 #define MSR_IA32_THERM_STATUS		0x0000019c
 #define MSR_IA32_MISC_ENABLE		0x000001a0
+#define MSR_IA32_TEMPERATURE_TARGET   0x000001a2
+#define MSR_IA32_PACKAGE_THERM_STATUS 0x000001b1
 #define MSR_IA32_MISC_ENABLE_FAST_STRING  (1<<0)
 #define MSR_IA32_MISC_ENABLE_PERF_AVAIL   (1<<7)
 #define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL  (1<<11)
diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index 79bb99e0b6..3190803cc2 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -86,6 +86,12 @@ static bool msr_read_allowed(unsigned int msr)
 
     case MSR_MCU_OPT_CTRL:
         return cpu_has_srbds_ctrl;
+    
+    case MSR_IA32_TEMPERATURE_TARGET:
+    case MSR_IA32_THERM_STATUS:
+    case MSR_IA32_PACKAGE_THERM_STATUS:
+        return boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+            (cpuid_eax(0x6) & 0x1); /* Digital temperature sensor */
     }
 
     if ( ppin_msr && msr == ppin_msr )
-- 
2.51.1



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR
Posted by Jan Beulich 1 day, 12 hours ago
On 27.10.2025 18:26, Teddy Astie wrote:
> Intel provide CPU sensors through "DTS" MSRs. As there MSR are core-specific
> (or package-specific), we can't reliably fetch them from Dom0 directly.
> Expose these MSR (if supported) through XENPF_resource_op so that it is
> accessible through hypercall.
> 
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
> I'm not a fan of doing a inline cpuid check here, but I don't have a
> better approach in mind.
> 
>  xen/arch/x86/include/asm/msr-index.h | 2 ++
>  xen/arch/x86/platform_hypercall.c    | 6 ++++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
> index df52587c85..98dda629e5 100644
> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -510,6 +510,8 @@
>  #define MSR_IA32_THERM_INTERRUPT	0x0000019b
>  #define MSR_IA32_THERM_STATUS		0x0000019c
>  #define MSR_IA32_MISC_ENABLE		0x000001a0
> +#define MSR_IA32_TEMPERATURE_TARGET   0x000001a2
> +#define MSR_IA32_PACKAGE_THERM_STATUS 0x000001b1
>  #define MSR_IA32_MISC_ENABLE_FAST_STRING  (1<<0)
>  #define MSR_IA32_MISC_ENABLE_PERF_AVAIL   (1<<7)
>  #define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL  (1<<11)

Now new additions like this to this file please (and even less so ones
disagreeing in padding with adjacent lines). Please go find this comment
in the file:

/*
 * Legacy MSR constants in need of cleanup.  No new MSRs below this comment.
 */

Jan
Re: [RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR
Posted by Andrew Cooper 2 days, 2 hours ago
On 27/10/2025 5:26 pm, Teddy Astie wrote:
> I'm not a fan of doing a inline cpuid check here, but I don't have a
> better approach in mind.

I'm not sure if there's enough information in leaf 6 to justify putting
it fully into the CPUID infrastructure.

But, if you do something like this:

diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index f94f23e159d2..d02fe4d22151 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -121,7 +121,13 @@ struct cpu_policy
             uint64_t :64, :64; /* Leaf 0x3 - PSN. */
             uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
             uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
-            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
+
+            /* Leaf 0x6 - Thermal and Perf. */
+            struct {
+                bool /* a */ dts:1;
+                uint32_t /* b */:32, /* c */:32, /* d */:32;
+            };
+
             uint64_t :64, :64; /* Leaf 0x7 - Structured Features. */
             uint64_t :64, :64; /* Leaf 0x8 - rsvd */
             uint64_t :64, :64; /* Leaf 0x9 - DCA */


then ...

> diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
> index 79bb99e0b6..3190803cc2 100644
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -86,6 +86,12 @@ static bool msr_read_allowed(unsigned int msr)
>  
>      case MSR_MCU_OPT_CTRL:
>          return cpu_has_srbds_ctrl;
> +    

You've added trailing whitespace here.

> +    case MSR_IA32_TEMPERATURE_TARGET:
> +    case MSR_IA32_THERM_STATUS:
> +    case MSR_IA32_PACKAGE_THERM_STATUS:
> +        return boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> +            (cpuid_eax(0x6) & 0x1); /* Digital temperature sensor */

... you ought to be able to use host_policy.basic.dts here.  In
principle the Intel check can be dropped too.

~Andrew

Re: [RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR
Posted by Jan Beulich 1 day, 12 hours ago
On 27.10.2025 20:38, Andrew Cooper wrote:
> On 27/10/2025 5:26 pm, Teddy Astie wrote:
>> I'm not a fan of doing a inline cpuid check here, but I don't have a
>> better approach in mind.
> 
> I'm not sure if there's enough information in leaf 6 to justify putting
> it fully into the CPUID infrastructure.
> 
> But, if you do something like this:
> 
> diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
> index f94f23e159d2..d02fe4d22151 100644
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -121,7 +121,13 @@ struct cpu_policy
>              uint64_t :64, :64; /* Leaf 0x3 - PSN. */
>              uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
>              uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
> -            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
> +
> +            /* Leaf 0x6 - Thermal and Perf. */
> +            struct {
> +                bool /* a */ dts:1;
> +                uint32_t /* b */:32, /* c */:32, /* d */:32;
> +            };
> +
>              uint64_t :64, :64; /* Leaf 0x7 - Structured Features. */
>              uint64_t :64, :64; /* Leaf 0x8 - rsvd */
>              uint64_t :64, :64; /* Leaf 0x9 - DCA */

Just to mention, below a patch I have pending as part of a series to
e.g. replace the various CPUID6_* values we presently use. As you did
indicate when we talked about this, a prereq to then use respective
bits from host_policy is an adjustment to cpu-policy.c, which is also
part of that series. If we weren't in freeze right now, I would have
posted the series already.

Jan

x86/cpu-policy: define bits of leaf 6

... as far as we presently use them in the codebase.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Or should we make both parts proper featureset elements? At least
APERFMPERF could likely be made visible to guests (in principle).

--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -128,7 +128,31 @@ struct cpu_policy
             uint64_t :64, :64; /* Leaf 0x3 - PSN. */
             uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
             uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
-            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
+
+            /* Leaf 0x6 - Therm/Perf. */
+            struct {
+                uint32_t /* a */:1,
+                    turbo:1,
+                    arat:1,
+                    :4,
+                    hwp:1,
+                    hwp_notification:1,
+                    hwp_activity_window:1,
+                    hwp_epp:1,
+                    hwp_plr:1,
+                    :1,
+                    hdc:1,
+                    :2,
+                    hwp_peci:1,
+                    :2,
+                    hw_feedback:1,
+                    :12;
+                uint32_t /* b */:32;
+                uint32_t /* c */ aperfmperf:1,
+                    :31;
+                uint32_t /* d */:32;
+            } pm;
+
             uint64_t :64, :64; /* Leaf 0x7 - Structured Features. */
             uint64_t :64, :64; /* Leaf 0x8 - rsvd */
             uint64_t :64, :64; /* Leaf 0x9 - DCA */


Re: [RFC PATCH for-4.22 1/2] x86/platform: Expose DTS sensors MSR
Posted by Andrew Cooper 5 hours ago
On 28/10/2025 9:20 am, Jan Beulich wrote:
> On 27.10.2025 20:38, Andrew Cooper wrote:
>> On 27/10/2025 5:26 pm, Teddy Astie wrote:
>>> I'm not a fan of doing a inline cpuid check here, but I don't have a
>>> better approach in mind.
>> I'm not sure if there's enough information in leaf 6 to justify putting
>> it fully into the CPUID infrastructure.
>>
>> But, if you do something like this:
>>
>> diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
>> index f94f23e159d2..d02fe4d22151 100644
>> --- a/xen/include/xen/lib/x86/cpu-policy.h
>> +++ b/xen/include/xen/lib/x86/cpu-policy.h
>> @@ -121,7 +121,13 @@ struct cpu_policy
>>              uint64_t :64, :64; /* Leaf 0x3 - PSN. */
>>              uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
>>              uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
>> -            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
>> +
>> +            /* Leaf 0x6 - Thermal and Perf. */
>> +            struct {
>> +                bool /* a */ dts:1;
>> +                uint32_t /* b */:32, /* c */:32, /* d */:32;
>> +            };
>> +
>>              uint64_t :64, :64; /* Leaf 0x7 - Structured Features. */
>>              uint64_t :64, :64; /* Leaf 0x8 - rsvd */
>>              uint64_t :64, :64; /* Leaf 0x9 - DCA */
> Just to mention, below a patch I have pending as part of a series to
> e.g. replace the various CPUID6_* values we presently use. As you did
> indicate when we talked about this, a prereq to then use respective
> bits from host_policy is an adjustment to cpu-policy.c, which is also
> part of that series. If we weren't in freeze right now, I would have
> posted the series already.
>
> Jan
>
> x86/cpu-policy: define bits of leaf 6
>
> ... as far as we presently use them in the codebase.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Or should we make both parts proper featureset elements? At least
> APERFMPERF could likely be made visible to guests (in principle).
>
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -128,7 +128,31 @@ struct cpu_policy
>              uint64_t :64, :64; /* Leaf 0x3 - PSN. */
>              uint64_t :64, :64; /* Leaf 0x4 - Structured Cache. */
>              uint64_t :64, :64; /* Leaf 0x5 - MONITOR. */
> -            uint64_t :64, :64; /* Leaf 0x6 - Therm/Perf. */
> +
> +            /* Leaf 0x6 - Therm/Perf. */
> +            struct {
> +                uint32_t /* a */:1,
> +                    turbo:1,
> +                    arat:1,
> +                    :4,
> +                    hwp:1,
> +                    hwp_notification:1,
> +                    hwp_activity_window:1,
> +                    hwp_epp:1,
> +                    hwp_plr:1,
> +                    :1,
> +                    hdc:1,
> +                    :2,
> +                    hwp_peci:1,
> +                    :2,
> +                    hw_feedback:1,
> +                    :12;
> +                uint32_t /* b */:32;
> +                uint32_t /* c */ aperfmperf:1,
> +                    :31;
> +                uint32_t /* d */:32;
> +            } pm;

This works too, although we don't have 'pm' equivalents elsewhere in
this part of the union.

APERF/MPERF is a disaster of an interface.  It can't safely be read even
in root mode, because an NMI/SMI breaks the algorithm in a way that
isn't easy to spot and retry.  On AMD, it's marginally better because
GIF can be used to exclude NMIs and non-fatal MCEs while sampling the
register pair.

In a VM, it's simply unusable.  Any VMExit, and even a vCPU reschedule,
breaks reading the pair.

Until the CPU vendors produce a way of reading the two counters together
(i.e. atomically, which has been asked for, repeatedly), there's no
point considering it for use in a VM.

~Andrew