The exact behaviour of LVTERR interrupt generation is implementation
specific.
* Newer Intel CPUs generate an interrupt when pending_esr becomes
nonzero.
* Older Intel and all AMD CPUs generate an interrupt when any
individual bit in pending_esr becomes nonzero.
Neither vendor documents their behaviour very well. Xen implements
the per-bit behaviour and has done so since support was added.
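(Illustration only, not part of the patch: a minimal standalone C
program showing where the two models diverge, namely when a second,
different error is recorded before the guest clears ESR. The helper
names are made up for the example; the ESR bit positions are the
architectural ones.)

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Per-bit model: interrupt whenever an individual bit goes 0 -> 1. */
    static bool err_per_bit(uint32_t *esr, uint32_t bit)
    {
        bool fire = !(*esr & bit);

        *esr |= bit;
        return fire;
    }

    /* Becomes-nonzero model: interrupt only when ESR goes 0 -> nonzero. */
    static bool err_on_nonzero(uint32_t *esr, uint32_t bit)
    {
        bool fire = !*esr;

        *esr |= bit;
        return fire;
    }

    int main(void)
    {
        const uint32_t SENDILL = 1u << 5, RECVILL = 1u << 6;
        uint32_t a = 0, b = 0;

        /* First error: both models raise an interrupt. */
        printf("SENDILL: per-bit=%d nonzero=%d\n",
               err_per_bit(&a, SENDILL), err_on_nonzero(&b, SENDILL));

        /* Different error before ESR is read/cleared: only per-bit fires. */
        printf("RECVILL: per-bit=%d nonzero=%d\n",
               err_per_bit(&a, RECVILL), err_on_nonzero(&b, RECVILL));

        return 0;
    }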
Importantly, the per-bit behaviour can be expressed using the atomic
operations available in the x86 architecture, whereas the
former (interrupt only on pending_esr becoming nonzero) cannot.
With vlapic->hw.pending_esr held outside of the main regs page, it's
much easier to use atomic operations.
Use xchg() in vlapic_reg_write(), and *set_bit() in vlapic_error().
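(Again illustration only, not Xen code: roughly the same two operations
expressed with GCC __atomic builtins standing in for Xen's
test_and_set_bit() and xchg(). The per-bit check only needs the prior
state of one bit, which x86 can return directly (LOCK BTS), whereas
detecting the whole-register 0 -> nonzero transition needs the full old
value and would typically mean a CMPXCHG loop.)

    #include <stdbool.h>
    #include <stdint.h>

    /* Record one error bit; report whether it made a 0 -> 1 transition.
     * A single atomic RMW is enough, so no lock is required. */
    static bool record_error(uint32_t *pending_esr, unsigned int err_bit)
    {
        uint32_t old = __atomic_fetch_or(pending_esr, 1u << err_bit,
                                         __ATOMIC_SEQ_CST);

        return !(old & (1u << err_bit));
    }

    /* Guest write to APIC_ESR: atomically snapshot and clear the
     * accumulated bits, analogous to the xchg() in vlapic_reg_write(). */
    static uint32_t read_and_clear_esr(uint32_t *pending_esr)
    {
        return __atomic_exchange_n(pending_esr, 0, __ATOMIC_SEQ_CST);
    }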
The only interesting change is that vlapic_error() now takes a single
bit rather than a mask, but this is fine for all current callers and
foreseeable changes.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Confirmed by Intel, AMD, and 3rd party sources.
https://sandpile.org/x86/apic.htm has been updated to note this
behaviour. Neither vendor has shown any enthusiasm for clarifying
the behaviour in their docs.
v2:
* Rewrite the commit message from scratch.
---
xen/arch/x86/hvm/vlapic.c | 39 ++++++++++-----------------
xen/arch/x86/include/asm/hvm/vlapic.h | 1 -
2 files changed, 14 insertions(+), 26 deletions(-)
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 98394ed26a52..82b6d12e99d4 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -102,14 +102,16 @@ static int vlapic_find_highest_irr(struct vlapic *vlapic)
return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
}
-static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
+static void vlapic_error(struct vlapic *vlapic, unsigned int err_bit)
{
- unsigned long flags;
- uint32_t esr;
-
- spin_lock_irqsave(&vlapic->esr_lock, flags);
- esr = vlapic->hw.pending_esr;
- if ( (esr & errmask) != errmask )
+ /*
+ * Whether LVTERR is delivered on a per-bit basis, or only on
+ * pending_esr becoming nonzero is implementation specific.
+ *
+ * Xen implements the per-bit behaviour as it can be expressed
+ * locklessly.
+ */
+ if ( !test_and_set_bit(err_bit, &vlapic->hw.pending_esr) )
{
uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR);
bool inj = false;
@@ -124,15 +126,12 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
if ( (lvterr & APIC_VECTOR_MASK) >= 16 )
inj = true;
else
- errmask |= APIC_ESR_RECVILL;
+ set_bit(ilog2(APIC_ESR_RECVILL), &vlapic->hw.pending_esr);
}
- vlapic->hw.pending_esr |= errmask;
-
if ( inj )
vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0);
}
- spin_unlock_irqrestore(&vlapic->esr_lock, flags);
}
bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
@@ -153,7 +152,7 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
if ( unlikely(vec < 16) )
{
- vlapic_error(vlapic, APIC_ESR_RECVILL);
+ vlapic_error(vlapic, ilog2(APIC_ESR_RECVILL));
return;
}
@@ -525,7 +524,7 @@ void vlapic_ipi(
vlapic_domain(vlapic), vlapic, short_hand, dest, dest_mode);
if ( unlikely((icr_low & APIC_VECTOR_MASK) < 16) )
- vlapic_error(vlapic, APIC_ESR_SENDILL);
+ vlapic_error(vlapic, ilog2(APIC_ESR_SENDILL));
else if ( target )
vlapic_accept_irq(vlapic_vcpu(target), icr_low);
break;
@@ -534,7 +533,7 @@ void vlapic_ipi(
case APIC_DM_FIXED:
if ( unlikely((icr_low & APIC_VECTOR_MASK) < 16) )
{
- vlapic_error(vlapic, APIC_ESR_SENDILL);
+ vlapic_error(vlapic, ilog2(APIC_ESR_SENDILL));
break;
}
/* fall through */
@@ -803,17 +802,9 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val)
break;
case APIC_ESR:
- {
- unsigned long flags;
-
- spin_lock_irqsave(&vlapic->esr_lock, flags);
- val = vlapic->hw.pending_esr;
- vlapic->hw.pending_esr = 0;
- spin_unlock_irqrestore(&vlapic->esr_lock, flags);
-
+ val = xchg(&vlapic->hw.pending_esr, 0);
vlapic_set_reg(vlapic, APIC_ESR, val);
break;
- }
case APIC_TASKPRI:
vlapic_set_reg(vlapic, APIC_TASKPRI, val & 0xff);
@@ -1716,8 +1707,6 @@ int vlapic_init(struct vcpu *v)
vlapic_reset(vlapic);
- spin_lock_init(&vlapic->esr_lock);
-
tasklet_init(&vlapic->init_sipi.tasklet, vlapic_init_sipi_action, v);
if ( v->vcpu_id == 0 )
diff --git a/xen/arch/x86/include/asm/hvm/vlapic.h b/xen/arch/x86/include/asm/hvm/vlapic.h
index 2c4ff94ae7a8..c38855119836 100644
--- a/xen/arch/x86/include/asm/hvm/vlapic.h
+++ b/xen/arch/x86/include/asm/hvm/vlapic.h
@@ -69,7 +69,6 @@ struct vlapic {
bool hw, regs;
uint32_t id, ldr;
} loaded;
- spinlock_t esr_lock;
struct periodic_time pt;
s_time_t timer_last_update;
struct page_info *regs_page;
--
2.34.1
On 03.03.2025 19:53, Andrew Cooper wrote:
> The exact behaviour of LVTERR interrupt generation is implementation
> specific.
>
> * Newer Intel CPUs generate an interrupt when pending_esr becomes
>   nonzero.
>
> * Older Intel and all AMD CPUs generate an interrupt when any
>   individual bit in pending_esr becomes nonzero.
>
> Neither vendor documents their behaviour very well. Xen implements
> the per-bit behaviour and has done since support was added.
>
> Importantly, the per-bit behaviour can be expressed using the atomic
> operations available in the x86 architecture, whereas the
> former (interrupt only on pending_esr becoming nonzero) cannot.
>
> With vlapic->hw.pending_esr held outside of the main regs page, it's
> much easier to use atomic operations.
>
> Use xchg() in vlapic_reg_write(), and *set_bit() in vlapic_error().
>
> The only interesting change is that vlapic_error() now needs to take a
> single bit only, rather than a mask, but this fine for all current
> callers and forseable changes.
>
> No practical change.

From a guest perspective that is.

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> @@ -124,15 +126,12 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
>          if ( (lvterr & APIC_VECTOR_MASK) >= 16 )
>              inj = true;

I wouldn't, btw, mind if you also corrected this indentation screw-up of
mine along with you doing so ...

>          else
> -            errmask |= APIC_ESR_RECVILL;
> +            set_bit(ilog2(APIC_ESR_RECVILL), &vlapic->hw.pending_esr);

... here.

Jan
On 05/03/2025 1:56 pm, Jan Beulich wrote:
> On 03.03.2025 19:53, Andrew Cooper wrote:
>> The exact behaviour of LVTERR interrupt generation is implementation
>> specific.
>>
>> * Newer Intel CPUs generate an interrupt when pending_esr becomes
>>   nonzero.
>>
>> * Older Intel and all AMD CPUs generate an interrupt when any
>>   individual bit in pending_esr becomes nonzero.
>>
>> Neither vendor documents their behaviour very well. Xen implements
>> the per-bit behaviour and has done since support was added.
>>
>> Importantly, the per-bit behaviour can be expressed using the atomic
>> operations available in the x86 architecture, whereas the
>> former (interrupt only on pending_esr becoming nonzero) cannot.
>>
>> With vlapic->hw.pending_esr held outside of the main regs page, it's
>> much easier to use atomic operations.
>>
>> Use xchg() in vlapic_reg_write(), and *set_bit() in vlapic_error().
>>
>> The only interesting change is that vlapic_error() now needs to take a
>> single bit only, rather than a mask, but this fine for all current
>> callers and forseable changes.
>>
>> No practical change.
> From a guest perspective that is.
>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

>
>> @@ -124,15 +126,12 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
>>          if ( (lvterr & APIC_VECTOR_MASK) >= 16 )
>>              inj = true;
> I wouldn't, btw, mind if you also corrected this indentation screw-up of
> mine along with you doing so ...
>
>>          else
>> -            errmask |= APIC_ESR_RECVILL;
>> +            set_bit(ilog2(APIC_ESR_RECVILL), &vlapic->hw.pending_esr);
> ... here.

Both fixed.

~Andrew