[PATCH v2 1/2] x86/vlapic: Fix handling of writes to APIC_ESR

Andrew Cooper posted 2 patches 6 days, 3 hours ago
[PATCH v2 1/2] x86/vlapic: Fix handling of writes to APIC_ESR
Posted by Andrew Cooper 6 days, 3 hours ago
Xen currently presents APIC_ESR to guests as a simple read/write register.

This is incorrect.  The SDM states:

  The ESR is a write/read register. Before attempt to read from the ESR,
  software should first write to it. (The value written does not affect the
  values read subsequently; only zero may be written in x2APIC mode.) This
  write clears any previously logged errors and updates the ESR with any
  errors detected since the last write to the ESR.

Introduce a new pending_esr field in hvm_hw_lapic.

Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
to discard the written value and transfer pending_esr into APIC_ESR.  Reads
are still as before.

Importantly, this means that guests no longer destroys the ESR value it's
looking for in the LVTERR handler when following the SDM instructions.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

v2:
 * Minor adjustment to the commit message
---
 xen/arch/x86/hvm/vlapic.c              | 17 +++++++++++++++--
 xen/include/public/arch-x86/hvm/save.h |  1 +
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 3363926b487b..98394ed26a52 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -108,7 +108,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
     uint32_t esr;
 
     spin_lock_irqsave(&vlapic->esr_lock, flags);
-    esr = vlapic_get_reg(vlapic, APIC_ESR);
+    esr = vlapic->hw.pending_esr;
     if ( (esr & errmask) != errmask )
     {
         uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR);
@@ -127,7 +127,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask)
                  errmask |= APIC_ESR_RECVILL;
         }
 
-        vlapic_set_reg(vlapic, APIC_ESR, esr | errmask);
+        vlapic->hw.pending_esr |= errmask;
 
         if ( inj )
             vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0);
@@ -802,6 +802,19 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val)
         vlapic_set_reg(vlapic, APIC_ID, val);
         break;
 
+    case APIC_ESR:
+    {
+        unsigned long flags;
+
+        spin_lock_irqsave(&vlapic->esr_lock, flags);
+        val = vlapic->hw.pending_esr;
+        vlapic->hw.pending_esr = 0;
+        spin_unlock_irqrestore(&vlapic->esr_lock, flags);
+
+        vlapic_set_reg(vlapic, APIC_ESR, val);
+        break;
+    }
+
     case APIC_TASKPRI:
         vlapic_set_reg(vlapic, APIC_TASKPRI, val & 0xff);
         break;
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 7ecacadde165..9c4bfc7ebdac 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -394,6 +394,7 @@ struct hvm_hw_lapic {
     uint32_t             disabled; /* VLAPIC_xx_DISABLED */
     uint32_t             timer_divisor;
     uint64_t             tdt_msr;
+    uint32_t             pending_esr;
 };
 
 DECLARE_HVM_SAVE_TYPE(LAPIC, 5, struct hvm_hw_lapic);
-- 
2.34.1


Re: [PATCH v2 1/2] x86/vlapic: Fix handling of writes to APIC_ESR
Posted by Jan Beulich 4 days, 8 hours ago
On 03.03.2025 19:53, Andrew Cooper wrote:
> Xen currently presents APIC_ESR to guests as a simple read/write register.
> 
> This is incorrect.  The SDM states:
> 
>   The ESR is a write/read register. Before attempt to read from the ESR,
>   software should first write to it. (The value written does not affect the
>   values read subsequently; only zero may be written in x2APIC mode.) This
>   write clears any previously logged errors and updates the ESR with any
>   errors detected since the last write to the ESR.
> 
> Introduce a new pending_esr field in hvm_hw_lapic.
> 
> Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
> to discard the written value and transfer pending_esr into APIC_ESR.  Reads
> are still as before.
> 
> Importantly, this means that guests no longer destroys the ESR value it's
> looking for in the LVTERR handler when following the SDM instructions.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I guess there's no good Fixes: candidate?

Jan
Re: [PATCH v2 1/2] x86/vlapic: Fix handling of writes to APIC_ESR
Posted by Andrew Cooper 2 days, 21 hours ago
On 05/03/2025 1:49 pm, Jan Beulich wrote:
> On 03.03.2025 19:53, Andrew Cooper wrote:
>> Xen currently presents APIC_ESR to guests as a simple read/write register.
>>
>> This is incorrect.  The SDM states:
>>
>>   The ESR is a write/read register. Before attempt to read from the ESR,
>>   software should first write to it. (The value written does not affect the
>>   values read subsequently; only zero may be written in x2APIC mode.) This
>>   write clears any previously logged errors and updates the ESR with any
>>   errors detected since the last write to the ESR.
>>
>> Introduce a new pending_esr field in hvm_hw_lapic.
>>
>> Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write()
>> to discard the written value and transfer pending_esr into APIC_ESR.  Reads
>> are still as before.
>>
>> Importantly, this means that guests no longer destroys the ESR value it's
>> looking for in the LVTERR handler when following the SDM instructions.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> I guess there's no good Fixes: candidate?

Not that I could find.  5f32d186a8b introduced vlapic_error(), and
therefore delivery of LVTERR to guests, but the (mis)behaviour of the
APIC_ESR register goes back further.

d1bd157fbc9b "Big merge the HVM full-virtualisation abstractions" 19
years ago had a far more buggy behaviour (counted writes, and only
returned data on half the reads).

The read side (of this far more broken behaviour) was broken by
50b3cef2eecb ("[HVM] Place all APIC registers into one page in native
format.") and restored by 69f646a61b1b ("[HVM] Fix some IOAPIC and LAPIC
device model bugs").

The far more broken behaviour was dropped by f7c8af3a6476 ("[XEN] HVM:
Clean up and simplify vlapic device-model code.") in 3.0.4, leaving the
behaviour we've had until today.

~Andrew