Explicitly set/clear CR8 write interception when AVIC is (de)activated to
fix a bug where KVM leaves the interception enabled after AVIC is
activated. E.g. if KVM emulates INIT=>WFS while AVIC is deactivated, CR8
will remain intercepted in perpetuity.

On its own, the dangling CR8 intercept is "just" a performance issue, but
combined with the TPR sync bug fixed by commit d02e48830e3f ("KVM: SVM:
Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active"), the dangling
intercept is fatal to Windows guests as the TPR seen by hardware gets
wildly out of sync with reality.

Note, VMX isn't affected by the bug as TPR_THRESHOLD is explicitly ignored
when Virtual Interrupt Delivery is enabled, i.e. when APICv is active in
KVM's world. I.e. there's no need to trigger update_cr8_intercept(), this
is firmly an SVM implementation flaw/detail.

WARN if KVM gets a CR8 write #VMEXIT while AVIC is active, as KVM should
never enter the guest with AVIC enabled and CR8 writes intercepted.

Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
Cc: stable@vger.kernel.org
Cc: Jim Mattson <jmattson@google.com>
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/avic.c | 6 ++++--
arch/x86/kvm/svm/svm.c | 9 +++++----
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 44e07c27b190..13a4a8949aba 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -189,12 +189,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
struct kvm_vcpu *vcpu = &svm->vcpu;
vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
-
vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
vmcb->control.avic_physical_id |= avic_get_max_physical_id(vcpu);
-
vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ svm_clr_intercept(svm, INTERCEPT_CR8_WRITE);
+
/*
* Note: KVM supports hybrid-AVIC mode, where KVM emulates x2APIC MSR
* accesses, while interrupt injection to a running vCPU can be
@@ -226,6 +226,8 @@ static void avic_deactivate_vmcb(struct vcpu_svm *svm)
vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+ svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
+
/*
* If running nested and the guest uses its own MSR bitmap, there
* is no need to update L0's msr bitmap
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e8313fdc5465..aa3ab22215f5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1077,8 +1077,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu, bool init_event)
svm_set_intercept(svm, INTERCEPT_CR0_WRITE);
svm_set_intercept(svm, INTERCEPT_CR3_WRITE);
svm_set_intercept(svm, INTERCEPT_CR4_WRITE);
- if (!kvm_vcpu_apicv_active(vcpu))
- svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
+ svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
set_dr_intercepts(svm);
@@ -2674,9 +2673,11 @@ static int dr_interception(struct kvm_vcpu *vcpu)
static int cr8_write_interception(struct kvm_vcpu *vcpu)
{
- int r;
-
u8 cr8_prev = kvm_get_cr8(vcpu);
+ int r;
+
+ WARN_ON_ONCE(kvm_vcpu_apicv_active(vcpu));
+
/* instruction emulation calls kvm_set_cr8() */
r = cr_interception(vcpu);
if (lapic_in_kernel(vcpu))
--
2.53.0.rc2.204.g2597b5adb4-goog
On Tue, Feb 03, 2026 at 11:07:10AM -0800, Sean Christopherson wrote:
> Explicitly set/clear CR8 write interception when AVIC is (de)activated to
> fix a bug where KVM leaves the interception enabled after AVIC is
> activated. E.g. if KVM emulates INIT=>WFS while AVIC is deactivated, CR8
> will remain intercepted in perpetuity.

Looking at svm_update_cr8_intercept(), I suppose this could also more
commonly happen whenever AVIC is inhibited (IRQ Windows, as an example)?

>
> On its own, the dangling CR8 intercept is "just" a performance issue, but
> combined with the TPR sync bug fixed by commit d02e48830e3f ("KVM: SVM:
> Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active"), the dangling
> intercept is fatal to Windows guests as the TPR seen by hardware gets
> wildly out of sync with reality.
>
> Note, VMX isn't affected by the bug as TPR_THRESHOLD is explicitly ignored
> when Virtual Interrupt Delivery is enabled, i.e. when APICv is active in
> KVM's world. I.e. there's no need to trigger update_cr8_intercept(), this
> is firmly an SVM implementation flaw/detail.
>
> WARN if KVM gets a CR8 write #VMEXIT while AVIC is active, as KVM should
> never enter the guest with AVIC enabled and CR8 writes intercepted.
>
> Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
> Cc: stable@vger.kernel.org
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Naveen N Rao (AMD) <naveen@kernel.org>
> Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/avic.c | 6 ++++--
> arch/x86/kvm/svm/svm.c | 9 +++++----
> 2 files changed, 9 insertions(+), 6 deletions(-)
LGTM.
Reviewed-by: Naveen N Rao (AMD) <naveen@kernel.org>
Thanks,
Naveen
On Fri, Feb 06, 2026, Naveen N Rao wrote:
> On Tue, Feb 03, 2026 at 11:07:10AM -0800, Sean Christopherson wrote:
> > Explicitly set/clear CR8 write interception when AVIC is (de)activated to
> > fix a bug where KVM leaves the interception enabled after AVIC is
> > activated. E.g. if KVM emulates INIT=>WFS while AVIC is deactivated, CR8
> > will remain intercepted in perpetuity.
>
> Looking at svm_update_cr8_intercept(), I suppose this could also more
> commonly happen whenever AVIC is inhibited (IRQ Windows, as an example)?

Maybe? I don't think it's actually common in practice, because the bug
requires the source of the inhibition to be removed while the vCPU still
has a pending IRQ that is below PPR. That is definitely possible, but it
seems unlikely overall, and it'd also be self-healing to some extent.
E.g. if a workload is triggering ExtINT, then odds are good it's going to
_keep_ generating ExtINT, keep toggling the inhibit, and thus reconcile
CR8 interception every time AVIC is inhibited.

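For readers following the inhibit/uninhibit scenario discussed above, here
is a minimal userspace sketch of the state machine, not KVM code. Every
name in it (toy_vcpu, toy_set_avic(), toy_update_cr8_intercept(), the
"fixed" flag) is hypothetical and exists only for illustration. It assumes
that the CR8 intercept is recomputed from TPR/IRR only while AVIC is
inactive, and that the patch makes AVIC (de)activation set/clear the
intercept directly.

```c
#include <stdbool.h>

/* Toy stand-in for the two bits of vCPU state that matter here. */
struct toy_vcpu {
	bool avic_active;
	bool cr8_write_intercepted;
};

/*
 * Models the TPR/IRR-based intercept update. Assumption: the update is
 * skipped entirely while AVIC is active, so it can never clear a stale
 * intercept once AVIC has been re-activated.
 */
static void toy_update_cr8_intercept(struct toy_vcpu *v, int tpr, int irr)
{
	if (v->avic_active)
		return;

	/* Intercept CR8 writes only if a pending IRQ is masked by TPR. */
	v->cr8_write_intercepted = (irr != -1 && tpr >= irr);
}

/*
 * Models AVIC (de)activation. With fixed == false, the intercept is left
 * untouched (pre-patch behavior); with fixed == true, it is explicitly
 * set on deactivation and cleared on activation (post-patch behavior).
 */
static void toy_set_avic(struct toy_vcpu *v, bool active, bool fixed)
{
	v->avic_active = active;
	if (fixed)
		v->cr8_write_intercepted = !active;
}

/* Returns true if CR8 writes are still intercepted once AVIC is back on. */
static bool toy_dangling_after_uninhibit(bool fixed)
{
	struct toy_vcpu v = { .avic_active = true, .cr8_write_intercepted = false };

	toy_set_avic(&v, false, fixed);      /* AVIC inhibited */
	toy_update_cr8_intercept(&v, 8, 5);  /* pending IRQ masked by TPR */
	toy_set_avic(&v, true, fixed);       /* inhibit removed, AVIC active */
	toy_update_cr8_intercept(&v, 0, -1); /* no-op: AVIC is active */

	return v.cr8_write_intercepted;
}
```

In this model, toy_dangling_after_uninhibit(false) reports a stale
intercept, while toy_dangling_after_uninhibit(true) does not. The
"self-healing" case corresponds to a later toy_set_avic(&v, false, ...)
followed by another toy_update_cr8_intercept() call, which recomputes the
intercept from scratch.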
On Tue, Feb 3, 2026 at 11:07 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Explicitly set/clear CR8 write interception when AVIC is (de)activated to
> fix a bug where KVM leaves the interception enabled after AVIC is
> activated. E.g. if KVM emulates INIT=>WFS while AVIC is deactivated, CR8
> will remain intercepted in perpetuity.
>
> On its own, the dangling CR8 intercept is "just" a performance issue, but
> combined with the TPR sync bug fixed by commit d02e48830e3f ("KVM: SVM:
> Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active"), the dangling
> intercept is fatal to Windows guests as the TPR seen by hardware gets
> wildly out of sync with reality.
>
> Note, VMX isn't affected by the bug as TPR_THRESHOLD is explicitly ignored
> when Virtual Interrupt Delivery is enabled, i.e. when APICv is active in
> KVM's world. I.e. there's no need to trigger update_cr8_intercept(), this
> is firmly an SVM implementation flaw/detail.
>
> WARN if KVM gets a CR8 write #VMEXIT while AVIC is active, as KVM should
> never enter the guest with AVIC enabled and CR8 writes intercepted.
>
> Fixes: 3bbf3565f48c ("svm: Do not intercept CR8 when enable AVIC")
> Cc: stable@vger.kernel.org
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Naveen N Rao (AMD) <naveen@kernel.org>
> Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>