Xen currently presents APIC_ESR to guests as a simple read/write register.

This is incorrect.  The SDM states:

  The ESR is a write/read register. Before attempt to read from the ESR,
  software should first write to it. (The value written does not affect the
  values read subsequently; only zero may be written in x2APIC mode.) This
  write clears any previously logged errors and updates the ESR with any
  errors detected since the last write to the ESR.

Introduce a new pending_esr field in hvm_hw_lapic.

Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
to discard the written value and transfer pending_esr into APIC_ESR.  Reads
are still as before.

Importantly, this means that a guest no longer destroys the ESR value it's
looking for in the LVTERR handler when following the SDM instructions.
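
For illustration only, the write-then-read semantics described above can be
modelled in a few lines of standalone C.  This is a minimal sketch, not Xen
code: struct esr_model and the model_*() helpers are hypothetical names, the
esr_lock locking is omitted, and 0x40 in the usage note is just an arbitrary
example error bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant state; not Xen's struct vlapic. */
struct esr_model {
    uint32_t esr;          /* value returned by a guest read of APIC_ESR */
    uint32_t pending_esr;  /* errors accumulated since the last ESR write */
};

/* An error is detected: accumulate it in the pending set only. */
static void model_error(struct esr_model *m, uint32_t errmask)
{
    m->pending_esr |= errmask;
}

/* Guest write: discard the written value, latch pending errors, clear them. */
static void model_write_esr(struct esr_model *m, uint32_t val)
{
    (void)val;
    m->esr = m->pending_esr;
    m->pending_esr = 0;
}

/* Guest read: return whatever the last write latched. */
static uint32_t model_read_esr(const struct esr_model *m)
{
    return m->esr;
}

/* Guest LVTERR handler following the SDM: write first, then read. */
static uint32_t model_lvterr_handler(struct esr_model *m)
{
    model_write_esr(m, 0);
    return model_read_esr(m);
}

/* Small self-check of the model, using 0x40 as an example error bit. */
static void model_demo(void)
{
    struct esr_model m = { 0, 0 };

    model_error(&m, 0x40);
    assert(model_lvterr_handler(&m) == 0x40); /* error survives the write */
    assert(model_lvterr_handler(&m) == 0);    /* nothing new since then */
}
```

In this model, an error raised before the handler runs is still visible after
the handler's write, whereas with a plain read/write register the same write
of zero would have erased it.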
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
* Minor adjustment to the commit message
---
xen/arch/x86/hvm/vlapic.c | 17 +++++++++++++++--
xen/include/public/arch-x86/hvm/save.h | 1 +
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 3363926b487b..98394ed26a52 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -108,7 +108,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
uint32_t esr;
spin_lock_irqsave(&vlapic->esr_lock, flags);
- esr = vlapic_get_reg(vlapic, APIC_ESR);
+ esr = vlapic->hw.pending_esr;
if ( (esr & errmask) != errmask )
{
uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR);
@@ -127,7 +127,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
errmask |= APIC_ESR_RECVILL;
}
- vlapic_set_reg(vlapic, APIC_ESR, esr | errmask);
+ vlapic->hw.pending_esr |= errmask;
if ( inj )
vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0);
@@ -802,6 +802,19 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val)
vlapic_set_reg(vlapic, APIC_ID, val);
break;
+ case APIC_ESR:
+ {
+ unsigned long flags;
+
+ spin_lock_irqsave(&vlapic->esr_lock, flags);
+ val = vlapic->hw.pending_esr;
+ vlapic->hw.pending_esr = 0;
+ spin_unlock_irqrestore(&vlapic->esr_lock, flags);
+
+ vlapic_set_reg(vlapic, APIC_ESR, val);
+ break;
+ }
+
case APIC_TASKPRI:
vlapic_set_reg(vlapic, APIC_TASKPRI, val & 0xff);
break;
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 7ecacadde165..9c4bfc7ebdac 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -394,6 +394,7 @@ struct hvm_hw_lapic {
uint32_t disabled; /* VLAPIC_xx_DISABLED */
uint32_t timer_divisor;
uint64_t tdt_msr;
+ uint32_t pending_esr;
};
DECLARE_HVM_SAVE_TYPE(LAPIC, 5, struct hvm_hw_lapic);
--
2.34.1

On 03.03.2025 19:53, Andrew Cooper wrote:
> Xen currently presents APIC_ESR to guests as a simple read/write register.
>
> This is incorrect.  The SDM states:
>
>   The ESR is a write/read register. Before attempt to read from the ESR,
>   software should first write to it. (The value written does not affect the
>   values read subsequently; only zero may be written in x2APIC mode.) This
>   write clears any previously logged errors and updates the ESR with any
>   errors detected since the last write to the ESR.
>
> Introduce a new pending_esr field in hvm_hw_lapic.
>
> Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
> to discard the written value and transfer pending_esr into APIC_ESR.  Reads
> are still as before.
>
> Importantly, this means that guests no longer destroys the ESR value it's
> looking for in the LVTERR handler when following the SDM instructions.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I guess there's no good Fixes: candidate?

Jan

On 05/03/2025 1:49 pm, Jan Beulich wrote:
> On 03.03.2025 19:53, Andrew Cooper wrote:
>> Xen currently presents APIC_ESR to guests as a simple read/write register.
>>
>> This is incorrect.  The SDM states:
>>
>>   The ESR is a write/read register. Before attempt to read from the ESR,
>>   software should first write to it. (The value written does not affect the
>>   values read subsequently; only zero may be written in x2APIC mode.) This
>>   write clears any previously logged errors and updates the ESR with any
>>   errors detected since the last write to the ESR.
>>
>> Introduce a new pending_esr field in hvm_hw_lapic.
>>
>> Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
>> to discard the written value and transfer pending_esr into APIC_ESR.  Reads
>> are still as before.
>>
>> Importantly, this means that guests no longer destroys the ESR value it's
>> looking for in the LVTERR handler when following the SDM instructions.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> I guess there's no good Fixes: candidate?

Not that I could find.

5f32d186a8b introduced vlapic_error(), and therefore delivery of LVTERR
to guests, but the (mis)behaviour of the APIC_ESR register goes back
further.

d1bd157fbc9b "Big merge the HVM full-virtualisation abstractions" 19
years ago had a far more buggy behaviour (counted writes, and only
returned data on half the reads).

The read side (of this far more broken behaviour) was broken by
50b3cef2eecb ("[HVM] Place all APIC registers into one page in native
format.") and restored by 69f646a61b1b ("[HVM] Fix some IOAPIC and LAPIC
device model bugs").

The far more broken behaviour was dropped by f7c8af3a6476 ("[XEN] HVM:
Clean up and simplify vlapic device-model code.") in 3.0.4, leaving the
behaviour we've had until today.

~Andrew