xen/arch/x86/msr.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
set, and will also attempt to unconditionally access HWCR if the TSC is
reported as Invariant.
The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a
(bogus) warning message, but doing so at the cost of OpenBSD not booting is not
a suitable solution.
In order to fix expose an empty HWCR.
Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Not sure whether we want to expose something when is_cpufreq_controller() is
true, seeing as there's a special wrmsr handler for the same MSR in that case.
Likely should be done for PV only, but also likely quite bogus.
Missing reported by as the issue came from the QubesOS tracker.
---
xen/arch/x86/msr.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3f0450259cdf..964d500ff8a1 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
case MSR_K8_HWCR:
if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
goto gp_fault;
- *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
- ? K8_HWCR_TSC_FREQ_SEL : 0;
+ /*
+ * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
+ * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
+ * also poke at PSTATE0.
+ */
+ *val = 0;
break;
case MSR_VIRT_SPEC_CTRL:
--
2.42.0
On 12/09/2023 5:23 pm, Roger Pau Monne wrote: > OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is > set, and will also attempt to unconditionally access HWCR if the TSC is > reported as Invariant. > > The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a > (bogus) warning message, but doing so at the cost of OpenBSD not booting is not > a suitable solution. > > In order to fix expose an empty HWCR. At first I was thinking a straight up revert, but AMD's CPUID Faulting is an architectural bit in here so it's worth keeping the register around. > > Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Not sure whether we want to expose something when is_cpufreq_controller() is > true, seeing as there's a special wrmsr handler for the same MSR in that case. > Likely should be done for PV only, but also likely quite bogus. > > Missing reported by as the issue came from the QubesOS tracker. Well - we can at least have a: Link: https://github.com/QubesOS/qubes-issues/issues/8502 in the commit message, and it's probably worth asking Solène / Marek (both CC'd) if they want a Reported-by tag. > --- > xen/arch/x86/msr.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c > index 3f0450259cdf..964d500ff8a1 100644 > --- a/xen/arch/x86/msr.c > +++ b/xen/arch/x86/msr.c > @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val) > case MSR_K8_HWCR: > if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ) > goto gp_fault; > - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10 > - ? K8_HWCR_TSC_FREQ_SEL : 0; > + /* > + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as > + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to > + * also poke at PSTATE0. > + */ While this is true, the justification for removing this is because TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR. Also because it's addition without writing into the migration stream was bogus irrespective of the specifics of the bit. I'm still of the opinion that it's buggy for OpenBSD to be looking at model specific bits when virtualised, but given my latest reading of the AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it can see TSC_FREQ_SEL. In some theoretical future where the toolstack better understands MSRs and (non)migratable VMs (which is the QubesOS usecase), then it would in principle be fine to construct a VM which can see the host TSC_FREQ_SEL and PSTATE* values. Preferably with an adjusted comment, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> ~Andrew
On 12/09/2023 5:35 pm, Andrew Cooper wrote: > On 12/09/2023 5:23 pm, Roger Pau Monne wrote: >> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is >> set, and will also attempt to unconditionally access HWCR if the TSC is >> reported as Invariant. >> >> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a >> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not >> a suitable solution. >> >> In order to fix expose an empty HWCR. > At first I was thinking a straight up revert, but AMD's CPUID Faulting > is an architectural bit in here so it's worth keeping the register around. > >> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests') >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> --- >> Not sure whether we want to expose something when is_cpufreq_controller() is >> true, seeing as there's a special wrmsr handler for the same MSR in that case. >> Likely should be done for PV only, but also likely quite bogus. >> >> Missing reported by as the issue came from the QubesOS tracker. > Well - we can at least have a: > > Link: https://github.com/QubesOS/qubes-issues/issues/8502 > > in the commit message, and it's probably worth asking Solène / Marek > (both CC'd) if they want a Reported-by tag. > >> --- >> xen/arch/x86/msr.c | 8 ++++++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c >> index 3f0450259cdf..964d500ff8a1 100644 >> --- a/xen/arch/x86/msr.c >> +++ b/xen/arch/x86/msr.c >> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val) >> case MSR_K8_HWCR: >> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ) >> goto gp_fault; >> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10 >> - ? K8_HWCR_TSC_FREQ_SEL : 0; >> + /* >> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as >> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to >> + * also poke at PSTATE0. >> + */ > While this is true, the justification for removing this is because > TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR. > > Also because it's addition without writing into the migration stream was > bogus irrespective of the specifics of the bit. > > I'm still of the opinion that it's buggy for OpenBSD to be looking at > model specific bits when virtualised, but given my latest reading of the > AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it > can see TSC_FREQ_SEL. > > In some theoretical future where the toolstack better understands MSRs > and (non)migratable VMs (which is the QubesOS usecase), then it would in > principle be fine to construct a VM which can see the host TSC_FREQ_SEL > and PSTATE* values. > > Preferably with an adjusted comment, Reviewed-by: Andrew Cooper > <andrew.cooper3@citrix.com> Sorry - I meant to be clearer here. I'd suggest just deleting the comment and leaving an unconditional return of 0 (which will become conditional when we wire up CPUID Faulting). MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't fault. ~Andrew
On Tue, Sep 12, 2023 at 05:36:53PM +0100, Andrew Cooper wrote: > On 12/09/2023 5:35 pm, Andrew Cooper wrote: > > On 12/09/2023 5:23 pm, Roger Pau Monne wrote: > >> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is > >> set, and will also attempt to unconditionally access HWCR if the TSC is > >> reported as Invariant. > >> > >> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a > >> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not > >> a suitable solution. > >> > >> In order to fix expose an empty HWCR. > > At first I was thinking a straight up revert, but AMD's CPUID Faulting > > is an architectural bit in here so it's worth keeping the register around. > > > >> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests') > >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > >> --- > >> Not sure whether we want to expose something when is_cpufreq_controller() is > >> true, seeing as there's a special wrmsr handler for the same MSR in that case. > >> Likely should be done for PV only, but also likely quite bogus. > >> > >> Missing reported by as the issue came from the QubesOS tracker. > > Well - we can at least have a: > > > > Link: https://github.com/QubesOS/qubes-issues/issues/8502 > > > > in the commit message, and it's probably worth asking Solène / Marek > > (both CC'd) if they want a Reported-by tag. > > > >> --- > >> xen/arch/x86/msr.c | 8 ++++++-- > >> 1 file changed, 6 insertions(+), 2 deletions(-) > >> > >> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c > >> index 3f0450259cdf..964d500ff8a1 100644 > >> --- a/xen/arch/x86/msr.c > >> +++ b/xen/arch/x86/msr.c > >> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val) > >> case MSR_K8_HWCR: > >> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ) > >> goto gp_fault; > >> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10 > >> - ? K8_HWCR_TSC_FREQ_SEL : 0; > >> + /* > >> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as > >> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to > >> + * also poke at PSTATE0. > >> + */ > > While this is true, the justification for removing this is because > > TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR. > > > > Also because it's addition without writing into the migration stream was > > bogus irrespective of the specifics of the bit. > > > > I'm still of the opinion that it's buggy for OpenBSD to be looking at > > model specific bits when virtualised, but given my latest reading of the > > AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it > > can see TSC_FREQ_SEL. > > > > In some theoretical future where the toolstack better understands MSRs > > and (non)migratable VMs (which is the QubesOS usecase), then it would in > > principle be fine to construct a VM which can see the host TSC_FREQ_SEL > > and PSTATE* values. > > > > Preferably with an adjusted comment, Reviewed-by: Andrew Cooper > > <andrew.cooper3@citrix.com> > > Sorry - I meant to be clearer here. I'd suggest just deleting the > comment and leaving an unconditional return of 0 (which will become > conditional when we wire up CPUID Faulting). > > MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't > fault. Hm, I think it's worth to at least keep a note that if TSC_FREQ_SEL is exposed PSTATE0 must also be exposed to prevent OpenBSD 7.3 from panicking. Thanks, Roger.
On 13/09/2023 9:08 am, Roger Pau Monné wrote: > On Tue, Sep 12, 2023 at 05:36:53PM +0100, Andrew Cooper wrote: >> On 12/09/2023 5:35 pm, Andrew Cooper wrote: >>> On 12/09/2023 5:23 pm, Roger Pau Monne wrote: >>>> OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is >>>> set, and will also attempt to unconditionally access HWCR if the TSC is >>>> reported as Invariant. >>>> >>>> The reasoning for exposing HWCR.TscFreqSel was to avoid Linux from printing a >>>> (bogus) warning message, but doing so at the cost of OpenBSD not booting is not >>>> a suitable solution. >>>> >>>> In order to fix expose an empty HWCR. >>> At first I was thinking a straight up revert, but AMD's CPUID Faulting >>> is an architectural bit in here so it's worth keeping the register around. >>> >>>> Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests') >>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >>>> --- >>>> Not sure whether we want to expose something when is_cpufreq_controller() is >>>> true, seeing as there's a special wrmsr handler for the same MSR in that case. >>>> Likely should be done for PV only, but also likely quite bogus. >>>> >>>> Missing reported by as the issue came from the QubesOS tracker. >>> Well - we can at least have a: >>> >>> Link: https://github.com/QubesOS/qubes-issues/issues/8502 >>> >>> in the commit message, and it's probably worth asking Solène / Marek >>> (both CC'd) if they want a Reported-by tag. >>> >>>> --- >>>> xen/arch/x86/msr.c | 8 ++++++-- >>>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c >>>> index 3f0450259cdf..964d500ff8a1 100644 >>>> --- a/xen/arch/x86/msr.c >>>> +++ b/xen/arch/x86/msr.c >>>> @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val) >>>> case MSR_K8_HWCR: >>>> if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ) >>>> goto gp_fault; >>>> - *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10 >>>> - ? K8_HWCR_TSC_FREQ_SEL : 0; >>>> + /* >>>> + * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as >>>> + * Invariant. Do not set TSC_FREQ_SEL as that would trigger OpenBSD to >>>> + * also poke at PSTATE0. >>>> + */ >>> While this is true, the justification for removing this is because >>> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR. >>> >>> Also because it's addition without writing into the migration stream was >>> bogus irrespective of the specifics of the bit. >>> >>> I'm still of the opinion that it's buggy for OpenBSD to be looking at >>> model specific bits when virtualised, but given my latest reading of the >>> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it >>> can see TSC_FREQ_SEL. >>> >>> In some theoretical future where the toolstack better understands MSRs >>> and (non)migratable VMs (which is the QubesOS usecase), then it would in >>> principle be fine to construct a VM which can see the host TSC_FREQ_SEL >>> and PSTATE* values. >>> >>> Preferably with an adjusted comment, Reviewed-by: Andrew Cooper >>> <andrew.cooper3@citrix.com> >> Sorry - I meant to be clearer here. I'd suggest just deleting the >> comment and leaving an unconditional return of 0 (which will become >> conditional when we wire up CPUID Faulting). >> >> MSR_HWCR *is* an architectural MSR on any 64bit AMD system, so shouldn't >> fault. > Hm, I think it's worth to at least keep a note that if TSC_FREQ_SEL is > exposed PSTATE0 must also be exposed to prevent OpenBSD 7.3 from > panicking. But there's nothing OpenBSD 7.3 specific about it. Any software which sees this bit is permitted (expected even) to edit the pstate registers. You know why it's called frequency select? Because firmware is supposed to program the systemwide frequency/voltage to the user's request for whether the system wants to run cooler, or be overclocked. Putting OpenBSD 7.3 in the commit message is absolutely relevant to why we're making this change now, but it's not relevant to why we have concluded that TSC_FREQ_SEL is bogus to expose to guests. ~Andrew
© 2016 - 2024 Red Hat, Inc.