Fix a long-lurking bug in SVM where KVM runs the guest with the host's
DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
context switch DEBUGCTL if and only if LBR virtualization is enabled (not
just supported, but fully enabled).
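
As a rough illustration of the fix described above, the following is a minimal user-space sketch, not the actual KVM code. It models "manually context switch DEBUGCTL when LBR virtualization is disabled" against a fake MSR. All names here (`fake_debugctl_msr`, `struct sketch_vcpu`, `sketch_vcpu_run`) are hypothetical stand-ins, and the "VMRUN" step is simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the physical DEBUGCTL MSR; real code would use RDMSR/WRMSR. */
static uint64_t fake_debugctl_msr;
static unsigned int fake_wrmsr_count;

static uint64_t read_debugctl(void)
{
	return fake_debugctl_msr;
}

static void write_debugctl(uint64_t val)
{
	fake_debugctl_msr = val;
	fake_wrmsr_count++;
}

/* Hypothetical per-vCPU state, loosely modeled on the cover letter. */
struct sketch_vcpu {
	bool lbrv_enabled;       /* LBR virtualization fully enabled? */
	uint64_t guest_debugctl; /* value the guest is supposed to run with */
	uint64_t host_debugctl;  /* host snapshot taken before guest entry */
	uint64_t seen_by_guest;  /* DEBUGCTL in effect while the "guest" ran */
};

static void sketch_vcpu_run(struct sketch_vcpu *vcpu)
{
	bool manual;

	/* Snapshot the host value (the series does this after disabling IRQs). */
	vcpu->host_debugctl = read_debugctl();

	/* Hardware swaps DEBUGCTL only when LBR virtualization is enabled. */
	manual = !vcpu->lbrv_enabled &&
		 vcpu->host_debugctl != vcpu->guest_debugctl;
	if (manual)
		write_debugctl(vcpu->guest_debugctl);

	/* Simulated VMRUN: with LBRV, hardware would load the guest value. */
	vcpu->seen_by_guest = vcpu->lbrv_enabled ? vcpu->guest_debugctl
						 : read_debugctl();

	/* Simulated #VMEXIT: put the host's value back if we replaced it. */
	if (manual)
		write_debugctl(vcpu->host_debugctl);

	assert(read_debugctl() == vcpu->host_debugctl);
}
```

The comparison against the snapshot skips both MSR writes in the common case where the host and guest values already match.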
The bug has gone unnoticed because until recently, the only bits that
KVM would leave set were things like BTF, which are guest visible but
won't cause functional problems unless guest software is being especially
particular about #DBs.
The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
as the resulting #DBs due to split-lock accesses in guest userspace (lol
Steam) get reflected into the guest by KVM.
Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
likely the behavior that SVM guests have gotten the vast, vast majority of
the time, and given that it's the behavior on Intel, it's (hopefully) a safe
option for a fix, e.g. versus trying to add proper BTF virtualization on the
fly.
v3:
- Suppress BTF, as KVM doesn't actually support it. [Ravi]
- Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
it's guaranteed to be '0' in this scenario). [Ravi]
v2:
- Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
- Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
- Collect a review. [Xiaoyao]
- Make bits 5:3 fully reserved, in a separate not-for-stable patch.
v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
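
To make the v2/v3 changes to the guest's effective DEBUGCTL concrete, here is a small sketch of the masking, assuming the bit range from the patch titles (DEBUGCTL[5:2]) rather than the "5:3" wording in the changelog. LBR (bit 0) and BTF (bit 1) are the architectural bit positions. The macro and function names are made up for illustration and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DEBUGCTL_LBR      (1ULL << 0)   /* architectural bit 0 */
#define SKETCH_DEBUGCTL_BTF      (1ULL << 1)   /* architectural bit 1 */
#define SKETCH_DEBUGCTL_BITS_5_2 (0xfULL << 2) /* bits 5:2, per patch titles */

/*
 * Hypothetical helper: compute the value the guest actually runs with.
 * Bits 5:2 are dropped so the guest can't unintentionally enable
 * BusLockTrap, and BTF is suppressed because KVM doesn't virtualize it.
 */
static uint64_t sketch_effective_guest_debugctl(uint64_t guest_val)
{
	uint64_t eff = guest_val &
		       ~(SKETCH_DEBUGCTL_BITS_5_2 | SKETCH_DEBUGCTL_BTF);

	/* Sanity check: none of the dropped bits may survive. */
	assert(!(eff & (SKETCH_DEBUGCTL_BITS_5_2 | SKETCH_DEBUGCTL_BTF)));
	return eff;
}
```

With these masks, a guest value that sets only BTF and bits 5:2 collapses to 0, which matches the v3 remark that the loaded value is guaranteed to be '0' in this scenario.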
Sean Christopherson (6):
KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
KVM: SVM: Suppress DEBUGCTL.BTF on AMD
KVM: x86: Snapshot the host's DEBUGCTL in common x86
KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is
disabled
KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
KVM: SVM: Treat DEBUGCTL[5:2] as reserved
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm/svm.c | 24 ++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 8 ++------
arch/x86/kvm/vmx/vmx.h | 2 --
arch/x86/kvm/x86.c | 2 ++
6 files changed, 30 insertions(+), 9 deletions(-)
base-commit: fed48e2967f402f561d80075a20c5c9e16866e53
--
2.48.1.711.g2feabab25a-goog
On Thu, 27 Feb 2025 14:24:05 -0800, Sean Christopherson wrote:
> Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> just supported, but fully enabled).
>
> The bug has gone unnoticed because until recently, the only bits that
> KVM would leave set were things like BTF, which are guest visible but
> won't cause functional problems unless guest software is being especially
> particular about #DBs.
>
> [...]
Applied patch 6 to kvm-x86 svm (1-5 already went into 6.15).
[6/6] KVM: SVM: Treat DEBUGCTL[5:2] as reserved
https://github.com/kvm-x86/linux/commit/5ecdb48dd918
--
https://github.com/kvm-x86/linux/tree/next
On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> just supported, but fully enabled).
>
> The bug has gone unnoticed because until recently, the only bits that
> KVM would leave set were things like BTF, which are guest visible but
> won't cause functional problems unless guest software is being especially
> particular about #DBs.
>
> The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> as the resulting #DBs due to split-lock accesses in guest userspace (lol
> Steam) get reflected into the guest by KVM.
>
> Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> likely the behavior that SVM guests have gotten the vast, vast majority of
> the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> option for a fix, e.g. versus trying to add proper BTF virtualization on the
> fly.
>
> v3:
> - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> it's guaranteed to be '0' in this scenario). [Ravi]
>
> v2:
> - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> - Collect a review. [Xiaoyao]
> - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
>
> v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
>
Hi,
Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
(if I am not mistaken of course).
We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM, nor do we
set it ourselves in the GUEST_IA32_DEBUGCTL VMCS field, even when it is supported by the host
(if I read the code correctly; I didn't verify this at runtime).
This means that the host #SMIs will interfere with the guest PMU.
In particular this causes the 'pmu' kvm-unit-test to fail, which is something that our CI caught.
I think that kvm should just set this bit, or even better, use the host value of this bit,
and hide it from the guest, because the guest shouldn't know about host's smm,
and we AFAIK don't really support freezing perfmon when the guest enters its own emulated SMM.
What do you think? I'll post patches if you think that this is a good idea.
(A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
I also need to check if AMD also has this feature, or if this is Intel specific.
Best regards,
Maxim Levitsky
>
> Sean Christopherson (6):
> KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
> KVM: SVM: Suppress DEBUGCTL.BTF on AMD
> KVM: x86: Snapshot the host's DEBUGCTL in common x86
> KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is
> disabled
> KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
> KVM: SVM: Treat DEBUGCTL[5:2] as reserved
>
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/svm/svm.c | 24 ++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.h | 2 +-
> arch/x86/kvm/vmx/vmx.c | 8 ++------
> arch/x86/kvm/vmx/vmx.h | 2 --
> arch/x86/kvm/x86.c | 2 ++
> 6 files changed, 30 insertions(+), 9 deletions(-)
>
>
> base-commit: fed48e2967f402f561d80075a20c5c9e16866e53
On Tue, Apr 01, 2025, Maxim Levitsky wrote:
> On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > just supported, but fully enabled).
> >
> > The bug has gone unnoticed because until recently, the only bits that
> > KVM would leave set were things like BTF, which are guest visible but
> > won't cause functional problems unless guest software is being especially
> > particular about #DBs.
> >
> > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > Steam) get reflected into the guest by KVM.
> >
> > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > likely the behavior that SVM guests have gotten the vast, vast majority of
> > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > fly.
> >
> > v3:
> > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > it's guaranteed to be '0' in this scenario). [Ravi]
> >
> > v2:
> > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > - Collect a review. [Xiaoyao]
> > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> >
> > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
> >
>
>
> Hi,
>
> Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
> (if I am not mistaken of course).
>
> We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
> set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
> (If I read the code correctly, I didn't verify this in runtime)
Ugh, SMM. Yeah, KVM doesn't propagate DEBUGCTLMSR_FREEZE_IN_SMM to the guest
value. KVM intercepts reads and writes to DEBUGCTL, so it should be easy enough
to shove the bit in on writes, and drop it on reads.
> This means that the host #SMIs will interfere with the guest PMU. In
> particular this causes the 'pmu' kvm-unit-test to fail, which is something
> that our CI caught.
>
> I think that kvm should just set this bit, or even better, use the host value
> of this bit, and hide it from the guest, because the guest shouldn't know
> about host's smm, and we AFAIK don't really support freezing perfmon when the
> guest enters its own emulated SMM.
Agreed. Easy thing is to use the host's value, so that KVM doesn't need to check
for its existence. I can't think of anything that would go sideways by freezing
perfmon if the host happens to take an SMI.
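
A minimal sketch of the idea discussed above: OR the host's FREEZE_IN_SMM bit into the value the hardware uses on intercepted guest writes, and hide it on guest reads. This is not the eventual patch. The struct, field and function names are hypothetical, `hw_guest_debugctl` stands in for the GUEST_IA32_DEBUGCTL VMCS field, and silently ignoring a guest attempt to set the bit (instead of, say, rejecting the write) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FREEZE_IN_SMM (1ULL << 14) /* IA32_DEBUGCTL bit 14 */

struct sketch_vmx_vcpu {
	uint64_t host_debugctl;     /* host snapshot of DEBUGCTL */
	uint64_t hw_guest_debugctl; /* stand-in for GUEST_IA32_DEBUGCTL */
};

/* Intercepted guest WRMSR: shove the host's bit in. */
static void sketch_guest_write(struct sketch_vmx_vcpu *v, uint64_t data)
{
	/* Assumption: the guest doesn't own the bit, so drop what it wrote. */
	data &= ~SKETCH_FREEZE_IN_SMM;
	v->hw_guest_debugctl = data |
			       (v->host_debugctl & SKETCH_FREEZE_IN_SMM);

	/* The hardware-visible bit must mirror the host's setting. */
	assert((v->hw_guest_debugctl & SKETCH_FREEZE_IN_SMM) ==
	       (v->host_debugctl & SKETCH_FREEZE_IN_SMM));
}

/* Intercepted guest RDMSR: drop the bit so the guest never sees it. */
static uint64_t sketch_guest_read(const struct sketch_vmx_vcpu *v)
{
	return v->hw_guest_debugctl & ~SKETCH_FREEZE_IN_SMM;
}
```

Taking the bit from the host's value means the sketch never has to check whether the CPU supports it: if the host hasn't set it, nothing is added.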
> What do you think? I'll post patches if you think that this is a good idea.
> (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
>
> I also need to check if AMD also has this feature, or if this is Intel specific.
Intel only. I assume/think/hope AMD's Host/Guest Only field in the event selector
effectively hides SMM from the guest.
On 4/9/2025 4:13 AM, Sean Christopherson wrote:
> On Tue, Apr 01, 2025, Maxim Levitsky wrote:
>> On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
>>> Fix a long-lurking bug in SVM where KVM runs the guest with the host's
>>> DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
>>> context switch DEBUGCTL if and only if LBR virtualization is enabled (not
>>> just supported, but fully enabled).
>>>
>>> The bug has gone unnoticed because until recently, the only bits that
>>> KVM would leave set were things like BTF, which are guest visible but
>>> won't cause functional problems unless guest software is being especially
>>> particular about #DBs.
>>>
>>> The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
>>> as the resulting #DBs due to split-lock accesses in guest userspace (lol
>>> Steam) get reflected into the guest by KVM.
>>>
>>> Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
>>> likely the behavior that SVM guests have gotten the vast, vast majority of
>>> the time, and given that it's the behavior on Intel, it's (hopefully) a safe
>>> option for a fix, e.g. versus trying to add proper BTF virtualization on the
>>> fly.
>>>
>>> v3:
>>> - Suppress BTF, as KVM doesn't actually support it. [Ravi]
>>> - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
>>> it's guaranteed to be '0' in this scenario). [Ravi]
>>>
>>> v2:
>>> - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
>>> - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
>>> unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
>>> - Collect a review. [Xiaoyao]
>>> - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
>>>
>>> v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
>>>
>>
>>
>> Hi,
>>
>> Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
>> (if I am not mistaken of course).
>>
>> We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
>> set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
>> (If I read the code correctly, I didn't verify this in runtime)
>
> Ugh, SMM. Yeah, KVM doesn't propagate DEBUGCTLMSR_FREEZE_IN_SMM to the guest
> value. KVM intercepts reads and writes to DEBUGCTL, so it should be easy enough
> to shove the bit in on writes, and drop it on reads.
>
>> This means that the host #SMIs will interfere with the guest PMU. In
>> particular this causes the 'pmu' kvm-unit-test to fail, which is something
>> that our CI caught.
>>
>> I think that kvm should just set this bit, or even better, use the host value
>> of this bit, and hide it from the guest, because the guest shouldn't know
>> about host's smm, and we AFAIK don't really support freezing perfmon when the
>> guest enters its own emulated SMM.
>
> Agreed. Easy thing is to use the host's value, so that KVM doesn't need to check
> for its existence. I can't think of anything that would go sideways by freezing
> perfmon if the host happens to take an SMI.
>
>> What do you think? I'll post patches if you think that this is a good idea.
>> (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
>>
>> I also need to check if AMD also has this feature, or if this is Intel specific.
>
> Intel only. I assume/think/hope AMD's Host/Guest Only field in the event selector
> effectively hides SMM from the guest.
Just using the GuestOnly bit does not hide SMM activity from guests. SMIs are
generally intercepted (kvm_amd.intercept_smi defaults to true) and handled in the
host context. So guest PMCs are isolated by a combination of having the GuestOnly
bit set and the #VMEXITs resulting from SMI interception.
On Mon, 2025-04-14 at 12:02 +0530, Sandipan Das wrote:
> On 4/9/2025 4:13 AM, Sean Christopherson wrote:
> > On Tue, Apr 01, 2025, Maxim Levitsky wrote:
> > > On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> > > > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > > > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > > > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > > > just supported, but fully enabled).
> > > >
> > > > The bug has gone unnoticed because until recently, the only bits that
> > > > KVM would leave set were things like BTF, which are guest visible but
> > > > won't cause functional problems unless guest software is being especially
> > > > particular about #DBs.
> > > >
> > > > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > > > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > > > Steam) get reflected into the guest by KVM.
> > > >
> > > > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > > > likely the behavior that SVM guests have gotten the vast, vast majority of
> > > > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > > > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > > > fly.
> > > >
> > > > v3:
> > > > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > > > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > > > it's guaranteed to be '0' in this scenario). [Ravi]
> > > >
> > > > v2:
> > > > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > > > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > > > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > > > - Collect a review. [Xiaoyao]
> > > > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> > > >
> > > > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
> > > >
> > >
> > > Hi,
> > >
> > > Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
> > > (if I am not mistaken of course).
> > >
> > > We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
> > > set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
> > > (If I read the code correctly, I didn't verify this in runtime)
> >
> > Ugh, SMM. Yeah, KVM doesn't propagate DEBUGCTLMSR_FREEZE_IN_SMM to the guest
> > value. KVM intercepts reads and writes to DEBUGCTL, so it should be easy enough
> > to shove the bit in on writes, and drop it on reads.
> >
> > > This means that the host #SMIs will interfere with the guest PMU. In
> > > particular this causes the 'pmu' kvm-unit-test to fail, which is something
> > > that our CI caught.
> > >
> > > I think that kvm should just set this bit, or even better, use the host value
> > > of this bit, and hide it from the guest, because the guest shouldn't know
> > > about host's smm, and we AFAIK don't really support freezing perfmon when the
> > > guest enters its own emulated SMM.
> >
> > Agreed. Easy thing is to use the host's value, so that KVM doesn't need to check
> > for its existence. I can't think of anything that would go sideways by freezing
> > perfmon if the host happens to take an SMI.
> >
> > > What do you think? I'll post patches if you think that this is a good idea.
> > > (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
> > >
> > > I also need to check if AMD also has this feature, or if this is Intel specific.
> >
> > Intel only. I assume/think/hope AMD's Host/Guest Only field in the event selector
> > effectively hides SMM from the guest.
>
> Just using the GuestOnly bit does not hide SMM activity from guests. SMIs are
> generally intercepted (kvm_amd.intercept_smi defaults to true)
Hi,
Actually this setting doesn't really work these days, at least not on my Zen2 machine (3070x).
Long ago I tested it: I flooded the system with SMIs, either via the APIC or via writes to I/O port 0xB2,
and in both cases I noticed a significant slowdown of a VM pinned to the receiving CPU, yet I got no SMI VM exits.
The BIOS likely has an option to override this setting.
I guess the reason is security, because with SVM
one could otherwise effectively block SMIs from being processed on the host.
Best regards,
Maxim Levitsky
> and handled in the
> host context. So guest PMCs are isolated by a combination of having the GuestOnly
> bit set and the #VMEXITs resulting from SMI interception.
>
On Tue, 2025-04-08 at 15:43 -0700, Sean Christopherson wrote:
> On Tue, Apr 01, 2025, Maxim Levitsky wrote:
> > On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> > > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > > just supported, but fully enabled).
> > >
> > > The bug has gone unnoticed because until recently, the only bits that
> > > KVM would leave set were things like BTF, which are guest visible but
> > > won't cause functional problems unless guest software is being especially
> > > particular about #DBs.
> > >
> > > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > > Steam) get reflected into the guest by KVM.
> > >
> > > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > > likely the behavior that SVM guests have gotten the vast, vast majority of
> > > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > > fly.
> > >
> > > v3:
> > > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > > it's guaranteed to be '0' in this scenario). [Ravi]
> > >
> > > v2:
> > > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > > - Collect a review. [Xiaoyao]
> > > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> > >
> > > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
> > >
> >
> > Hi,
> >
> > Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
> > (if I am not mistaken of course).
> >
> > We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
> > set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
> > (If I read the code correctly, I didn't verify this in runtime)
>
> Ugh, SMM. Yeah, KVM doesn't propagate DEBUGCTLMSR_FREEZE_IN_SMM to the guest
> value. KVM intercepts reads and writes to DEBUGCTL, so it should be easy enough
> to shove the bit in on writes, and drop it on reads.
>
> > This means that the host #SMIs will interfere with the guest PMU. In
> > particular this causes the 'pmu' kvm-unit-test to fail, which is something
> > that our CI caught.
> >
> > I think that kvm should just set this bit, or even better, use the host value
> > of this bit, and hide it from the guest, because the guest shouldn't know
> > about host's smm, and we AFAIK don't really support freezing perfmon when the
> > guest enters its own emulated SMM.
>
> Agreed. Easy thing is to use the host's value, so that KVM doesn't need to check
> for its existence. I can't think of anything that would go sideways by freezing
> perfmon if the host happens to take an SMI.
>
> > What do you think? I'll post patches if you think that this is a good idea.
> > (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
> >
> > I also need to check if AMD also has this feature, or if this is Intel specific.
>
> Intel only. I assume/think/hope AMD's Host/Guest Only field in the event selector
> effectively hides SMM from the guest.
>
Hi,
I will post a patch soon then. I just got my hands on the CI machine where the test failed
and yes, the machine receives about 8 #SMIs per second on each core. Oh well...
BTW the pmu_counters_test selftest is also affected, since it counts the number of retired instructions.
With #SMIs getting in the way, the number of course soars.
It doesn't fail often at this rate, but it does when the test is run a sufficient
number of times or you just get lucky.
Best regards,
Maxim Levitsky
On Tue, 2025-04-01 at 23:57 -0400, Maxim Levitsky wrote:
> On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > just supported, but fully enabled).
> >
> > The bug has gone unnoticed because until recently, the only bits that
> > KVM would leave set were things like BTF, which are guest visible but
> > won't cause functional problems unless guest software is being especially
> > particular about #DBs.
> >
> > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > Steam) get reflected into the guest by KVM.
> >
> > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > likely the behavior that SVM guests have gotten the vast, vast majority of
> > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > fly.
> >
> > v3:
> > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > it's guaranteed to be '0' in this scenario). [Ravi]
> >
> > v2:
> > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > - Collect a review. [Xiaoyao]
> > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> >
> > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
> >
>
> Hi,
>
> Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
> (if I am not mistaken of course).
>
> We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
> set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
> (If I read the code correctly, I didn't verify this in runtime)
>
> This means that the host #SMIs will interfere with the guest PMU.
> In particular this causes the 'pmu' kvm-unit-test to fail, which is something that our CI caught.
>
> I think that kvm should just set this bit, or even better, use the host value of this bit,
> and hide it from the guest, because the guest shouldn't know about host's smm,
> and we AFAIK don't really support freezing perfmon when the guest enters its own emulated SMM.
>
> What do you think? I'll post patches if you think that this is a good idea.
> (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
>
> I also need to check if AMD also has this feature, or if this is Intel specific.
Any update?
Best regards,
Maxim Levitsky
>
> Best regards,
> Maxim Levitsky
>
> > Sean Christopherson (6):
> > KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
> > KVM: SVM: Suppress DEBUGCTL.BTF on AMD
> > KVM: x86: Snapshot the host's DEBUGCTL in common x86
> > KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is
> > disabled
> > KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
> > KVM: SVM: Treat DEBUGCTL[5:2] as reserved
> >
> > arch/x86/include/asm/kvm_host.h | 1 +
> > arch/x86/kvm/svm/svm.c | 24 ++++++++++++++++++++++++
> > arch/x86/kvm/svm/svm.h | 2 +-
> > arch/x86/kvm/vmx/vmx.c | 8 ++------
> > arch/x86/kvm/vmx/vmx.h | 2 --
> > arch/x86/kvm/x86.c | 2 ++
> > 6 files changed, 30 insertions(+), 9 deletions(-)
> >
> >
> > base-commit: fed48e2967f402f561d80075a20c5c9e16866e53
On Thu, 27 Feb 2025 14:24:05 -0800, Sean Christopherson wrote:
> Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> just supported, but fully enabled).
>
> The bug has gone unnoticed because until recently, the only bits that
> KVM would leave set were things like BTF, which are guest visible but
> won't cause functional problems unless guest software is being especially
> particular about #DBs.
>
> [...]
Applied 1-5 to kvm-x86 fixes (for 6.14). I'm going to hold off on making
DEBUGCTL[5:2] reserved until at least 6.15.
[1/6] KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value
https://github.com/kvm-x86/linux/commit/ee89e8013383
[2/6] KVM: SVM: Suppress DEBUGCTL.BTF on AMD
https://github.com/kvm-x86/linux/commit/d0eac42f5cec
[3/6] KVM: x86: Snapshot the host's DEBUGCTL in common x86
https://github.com/kvm-x86/linux/commit/fb71c7959356
[4/6] KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled
https://github.com/kvm-x86/linux/commit/433265870ab3
[5/6] KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
https://github.com/kvm-x86/linux/commit/189ecdb3e112
[6/6] KVM: SVM: Treat DEBUGCTL[5:2] as reserved
(no commit info)
--
https://github.com/kvm-x86/linux/tree/next
On 28-Feb-25 3:54 AM, Sean Christopherson wrote:
> Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> just supported, but fully enabled).
>
> The bug has gone unnoticed because until recently, the only bits that
> KVM would leave set were things like BTF, which are guest visible but
> won't cause functional problems unless guest software is being especially
> particular about #DBs.
>
> The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> as the resulting #DBs due to split-lock accesses in guest userspace (lol
> Steam) get reflected into the guest by KVM.
>
> Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> likely the behavior that SVM guests have gotten the vast, vast majority of
> the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> option for a fix, e.g. versus trying to add proper BTF virtualization on the
> fly.
>
> v3:
> - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> it's guaranteed to be '0' in this scenario). [Ravi]
>
> v2:
> - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> - Collect a review. [Xiaoyao]
> - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
>
> v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
For the series,
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Thanks,
Ravi
On Fri, Feb 28, 2025, Ravi Bangoria wrote:
> On 28-Feb-25 3:54 AM, Sean Christopherson wrote:
> > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > just supported, but fully enabled).
> >
> > The bug has gone unnoticed because until recently, the only bits that
> > KVM would leave set were things like BTF, which are guest visible but
> > won't cause functional problems unless guest software is being especially
> > particular about #DBs.
> >
> > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > Steam) get reflected into the guest by KVM.
> >
> > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > likely the behavior that SVM guests have gotten the vast, vast majority of
> > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > fly.
> >
> > v3:
> > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > it's guaranteed to be '0' in this scenario). [Ravi]
> >
> > v2:
> > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > - Collect a review. [Xiaoyao]
> > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> >
> > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@google.com
>
> For the series,
>
> Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Thank you for all your help, much appreciated!