From: Grygorii Strashko <grygorii_strashko@epam.com>
The LAPIC LVTx registers have two RO bits:
- all: Delivery Status (DS) bit 12
- LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
and the IRQ is:
- or accepted at destination and appears as pending
(vLAPIC Interrupt Request Register (IRR))
- or get rejected immediately.
The Remote IRR Flag (RIR) behavior emulation is not implemented for
LINT0/LINT1 in Xen for now.
The current vLAPIC implementations allows guest to write to these RO bits.
The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
to implement "Write ignore" access type for RO bits by applying masks from
vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
allows writing to RO fields.
Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
to reserved bits should cause #GP exception), but contains no statements
for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
now uses different masks (than vlapic_reg_write()) for checking LVTx
registers values for "Reserved" bit settings, which include RO bits and
do not cause #GP exception.
Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
---
Changes in v2:
- masks fixed in vlapic_lvt_mask[]
- commit msg reworded
v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 79697487ba90..2ecba8163f48 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -44,15 +44,17 @@
static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
{
/* LVTT */
- LVT_MASK | APIC_TIMER_MODE_MASK,
+ (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
/* LVTTHMR */
- LVT_MASK | APIC_DM_MASK,
+ (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
/* LVTPC */
- LVT_MASK | APIC_DM_MASK,
- /* LVT0-1 */
- LINT_MASK, LINT_MASK,
+ (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
+ /* LVT0 */
+ LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
+ /* LVT1 */
+ LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
/* LVTERR */
- LVT_MASK
+ LVT_MASK & ~APIC_SEND_PENDING,
};
#define vlapic_lvtt_period(vlapic) \
--
2.34.1
Hi Jan,
On 30.09.25 22:05, Grygorii Strashko wrote:
> From: Grygorii Strashko <grygorii_strashko@epam.com>
>
> The LAPIC LVTx registers have two RO bits:
> - all: Delivery Status (DS) bit 12
> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>
> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
> and the IRQ is:
> - or accepted at destination and appears as pending
> (vLAPIC Interrupt Request Register (IRR))
> - or get rejected immediately.
>
> The Remote IRR Flag (RIR) behavior emulation is not implemented for
> LINT0/LINT1 in Xen for now.
>
> The current vLAPIC implementations allows guest to write to these RO bits.
>
> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
> to implement "Write ignore" access type for RO bits by applying masks from
> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
> allows writing to RO fields.
>
> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>
> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
> to reserved bits should cause #GP exception), but contains no statements
> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
> now uses different masks (than vlapic_reg_write()) for checking LVTx
> registers values for "Reserved" bit settings, which include RO bits and
> do not cause #GP exception.
>
> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
> ---
> Changes in v2:
> - masks fixed in vlapic_lvt_mask[]
> - commit msg reworded
>
> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 79697487ba90..2ecba8163f48 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -44,15 +44,17 @@
> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
> {
> /* LVTT */
> - LVT_MASK | APIC_TIMER_MODE_MASK,
> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
> /* LVTTHMR */
> - LVT_MASK | APIC_DM_MASK,
> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
> /* LVTPC */
> - LVT_MASK | APIC_DM_MASK,
> - /* LVT0-1 */
> - LINT_MASK, LINT_MASK,
> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
> + /* LVT0 */
> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
> + /* LVT1 */
> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
> /* LVTERR */
> - LVT_MASK
> + LVT_MASK & ~APIC_SEND_PENDING,
> };
>
> #define vlapic_lvtt_period(vlapic) \
I'd like to return back here and finally get this issue fixed (and make tests green again).
So could this simple fix be merged?
There is also follow up patch [1] which can be still a "follow up" patch or can be made prerequisite patch.
[XEN] x86: hvm: vlapic: rework WR/rsvdz masks for LVTx regs
[1] https://patchwork.kernel.org/project/xen-devel/patch/20251009114249.1964387-1-grygorii_strashko@epam.com/
--
Best regards,
-grygorii
On 18.12.2025 16:32, Grygorii Strashko wrote:
> On 30.09.25 22:05, Grygorii Strashko wrote:
>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>
>> The LAPIC LVTx registers have two RO bits:
>> - all: Delivery Status (DS) bit 12
>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>
>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>> and the IRQ is:
>> - or accepted at destination and appears as pending
>> (vLAPIC Interrupt Request Register (IRR))
>> - or get rejected immediately.
>>
>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>> LINT0/LINT1 in Xen for now.
>>
>> The current vLAPIC implementations allows guest to write to these RO bits.
>>
>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>> to implement "Write ignore" access type for RO bits by applying masks from
>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>> allows writing to RO fields.
>>
>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>
>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>> to reserved bits should cause #GP exception), but contains no statements
>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>> registers values for "Reserved" bit settings, which include RO bits and
>> do not cause #GP exception.
>>
>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>> ---
>> Changes in v2:
>> - masks fixed in vlapic_lvt_mask[]
>> - commit msg reworded
>>
>> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>> index 79697487ba90..2ecba8163f48 100644
>> --- a/xen/arch/x86/hvm/vlapic.c
>> +++ b/xen/arch/x86/hvm/vlapic.c
>> @@ -44,15 +44,17 @@
>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>> {
>> /* LVTT */
>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>> /* LVTTHMR */
>> - LVT_MASK | APIC_DM_MASK,
>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>> /* LVTPC */
>> - LVT_MASK | APIC_DM_MASK,
>> - /* LVT0-1 */
>> - LINT_MASK, LINT_MASK,
>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>> + /* LVT0 */
>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>> + /* LVT1 */
>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>> /* LVTERR */
>> - LVT_MASK
>> + LVT_MASK & ~APIC_SEND_PENDING,
>> };
>>
>> #define vlapic_lvtt_period(vlapic) \
>
> I'd like to return back here and finally get this issue fixed (and make tests green again).
>
> So could this simple fix be merged?
>
> There is also follow up patch [1] which can be still a "follow up" patch or can be made prerequisite patch.
>
> [XEN] x86: hvm: vlapic: rework WR/rsvdz masks for LVTx regs
> [1] https://patchwork.kernel.org/project/xen-devel/patch/20251009114249.1964387-1-grygorii_strashko@epam.com/
I was really intending for "x86/vLAPIC: CMCI LVT handling" to go first. Which
obviously requires someone to review it ...
Jan
On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
> From: Grygorii Strashko <grygorii_strashko@epam.com>
>
> The LAPIC LVTx registers have two RO bits:
> - all: Delivery Status (DS) bit 12
> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>
> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
> and the IRQ is:
> - or accepted at destination and appears as pending
> (vLAPIC Interrupt Request Register (IRR))
> - or get rejected immediately.
>
> The Remote IRR Flag (RIR) behavior emulation is not implemented for
> LINT0/LINT1 in Xen for now.
>
> The current vLAPIC implementations allows guest to write to these RO bits.
>
> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
> to implement "Write ignore" access type for RO bits by applying masks from
> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
> allows writing to RO fields.
>
> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>
> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
> to reserved bits should cause #GP exception), but contains no statements
> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
> now uses different masks (than vlapic_reg_write()) for checking LVTx
> registers values for "Reserved" bit settings, which include RO bits and
> do not cause #GP exception.
>
> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
> ---
> Changes in v2:
> - masks fixed in vlapic_lvt_mask[]
> - commit msg reworded
>
> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 79697487ba90..2ecba8163f48 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -44,15 +44,17 @@
> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
> {
> /* LVTT */
> - LVT_MASK | APIC_TIMER_MODE_MASK,
> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
> /* LVTTHMR */
> - LVT_MASK | APIC_DM_MASK,
> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
> /* LVTPC */
> - LVT_MASK | APIC_DM_MASK,
> - /* LVT0-1 */
> - LINT_MASK, LINT_MASK,
> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
> + /* LVT0 */
> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
> + /* LVT1 */
> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
> /* LVTERR */
> - LVT_MASK
> + LVT_MASK & ~APIC_SEND_PENDING,
> };
This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
for RO?
That ought to simplify the logic in both the MSR and MMIO cases.
MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
#GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
writes.
Thoughts?
Cheers,
Alejandro
On 01.10.25 18:18, Alejandro Vallejo wrote:
> On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>
>> The LAPIC LVTx registers have two RO bits:
>> - all: Delivery Status (DS) bit 12
>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>
>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>> and the IRQ is:
>> - or accepted at destination and appears as pending
>> (vLAPIC Interrupt Request Register (IRR))
>> - or get rejected immediately.
>>
>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>> LINT0/LINT1 in Xen for now.
>>
>> The current vLAPIC implementations allows guest to write to these RO bits.
>>
>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>> to implement "Write ignore" access type for RO bits by applying masks from
>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>> allows writing to RO fields.
>>
>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>
>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>> to reserved bits should cause #GP exception), but contains no statements
>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>> registers values for "Reserved" bit settings, which include RO bits and
>> do not cause #GP exception.
>>
>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>> ---
>> Changes in v2:
>> - masks fixed in vlapic_lvt_mask[]
>> - commit msg reworded
>>
>> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
>> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>> index 79697487ba90..2ecba8163f48 100644
>> --- a/xen/arch/x86/hvm/vlapic.c
>> +++ b/xen/arch/x86/hvm/vlapic.c
>> @@ -44,15 +44,17 @@
>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>> {
>> /* LVTT */
>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>> /* LVTTHMR */
>> - LVT_MASK | APIC_DM_MASK,
>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>> /* LVTPC */
>> - LVT_MASK | APIC_DM_MASK,
>> - /* LVT0-1 */
>> - LINT_MASK, LINT_MASK,
>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>> + /* LVT0 */
>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>> + /* LVT1 */
>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>> /* LVTERR */
>> - LVT_MASK
>> + LVT_MASK & ~APIC_SEND_PENDING,
>> };
>
> This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
> for RO?
>
> That ought to simplify the logic in both the MSR and MMIO cases.
>
> MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
> #GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
> writes.
>
> Thoughts?
I've been thinking about the same and It can be done, np.
I always trying to make "fix" with as small diff as possible
considering back-porting.
How about "follow up" patch if there is an agreement to proceed this way on the Top level?
--
Best regards,
-grygorii
On 03.10.2025 16:04, Grygorii Strashko wrote:
>
>
> On 01.10.25 18:18, Alejandro Vallejo wrote:
>> On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
>>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>>
>>> The LAPIC LVTx registers have two RO bits:
>>> - all: Delivery Status (DS) bit 12
>>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>>
>>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>>> and the IRQ is:
>>> - or accepted at destination and appears as pending
>>> (vLAPIC Interrupt Request Register (IRR))
>>> - or get rejected immediately.
>>>
>>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>>> LINT0/LINT1 in Xen for now.
>>>
>>> The current vLAPIC implementations allows guest to write to these RO bits.
>>>
>>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>>> to implement "Write ignore" access type for RO bits by applying masks from
>>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>>> allows writing to RO fields.
>>>
>>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>>
>>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>>> to reserved bits should cause #GP exception), but contains no statements
>>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>>> registers values for "Reserved" bit settings, which include RO bits and
>>> do not cause #GP exception.
>>>
>>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>> ---
>>> Changes in v2:
>>> - masks fixed in vlapic_lvt_mask[]
>>> - commit msg reworded
>>>
>>> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
>>> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>>> index 79697487ba90..2ecba8163f48 100644
>>> --- a/xen/arch/x86/hvm/vlapic.c
>>> +++ b/xen/arch/x86/hvm/vlapic.c
>>> @@ -44,15 +44,17 @@
>>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>>> {
>>> /* LVTT */
>>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>>> /* LVTTHMR */
>>> - LVT_MASK | APIC_DM_MASK,
>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>> /* LVTPC */
>>> - LVT_MASK | APIC_DM_MASK,
>>> - /* LVT0-1 */
>>> - LINT_MASK, LINT_MASK,
>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>> + /* LVT0 */
>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>> + /* LVT1 */
>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>> /* LVTERR */
>>> - LVT_MASK
>>> + LVT_MASK & ~APIC_SEND_PENDING,
>>> };
>>
>> This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
>> for RO?
>>
>> That ought to simplify the logic in both the MSR and MMIO cases.
>>
>> MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
>> #GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
>> writes.
>>
>> Thoughts?
>
> I've been thinking about the same and It can be done, np.
> I always trying to make "fix" with as small diff as possible
> considering back-porting.
>
> How about "follow up" patch if there is an agreement to proceed this way on the Top level?
Doing it in two steps would be okay with me (I expected it to go that way
anyway), but then it would still be nice to limit churn some. Specifically,
taking LINT_MASK as example, can't we do
#define LINT_RO_MASK (LVT_RO_MASK | APIC_LVT_REMOTE_IRR)
#define LINT_WR_MASK \
(LVT_WR_MASK | APIC_DM_MASK | APIC_INPUT_POLARITY | \
APIC_LVT_LEVEL_TRIGGER)
#define LINT_MASK (LINT_WR_MASK | LINT_RO_MASK)
or some such, and then use *_WR_MASK in the table initializer?
Jan
Hi Jan,
Thanks for your comments and support.
On 07.10.25 18:35, Jan Beulich wrote:
> On 03.10.2025 16:04, Grygorii Strashko wrote:
>>
>>
>> On 01.10.25 18:18, Alejandro Vallejo wrote:
>>> On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
>>>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>
>>>> The LAPIC LVTx registers have two RO bits:
>>>> - all: Delivery Status (DS) bit 12
>>>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>>>
>>>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>>>> and the IRQ is:
>>>> - or accepted at destination and appears as pending
>>>> (vLAPIC Interrupt Request Register (IRR))
>>>> - or get rejected immediately.
>>>>
>>>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>>>> LINT0/LINT1 in Xen for now.
>>>>
>>>> The current vLAPIC implementations allows guest to write to these RO bits.
>>>>
>>>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>>>> to implement "Write ignore" access type for RO bits by applying masks from
>>>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>>>> allows writing to RO fields.
>>>>
>>>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>>>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>>>
>>>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>>>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>>>> to reserved bits should cause #GP exception), but contains no statements
>>>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>>>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>>>> registers values for "Reserved" bit settings, which include RO bits and
>>>> do not cause #GP exception.
>>>>
>>>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>>>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>> ---
>>>> Changes in v2:
>>>> - masks fixed in vlapic_lvt_mask[]
>>>> - commit msg reworded
>>>>
>>>> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
>>>> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>>>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>>>> index 79697487ba90..2ecba8163f48 100644
>>>> --- a/xen/arch/x86/hvm/vlapic.c
>>>> +++ b/xen/arch/x86/hvm/vlapic.c
>>>> @@ -44,15 +44,17 @@
>>>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>>>> {
>>>> /* LVTT */
>>>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>>>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>>>> /* LVTTHMR */
>>>> - LVT_MASK | APIC_DM_MASK,
>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>> /* LVTPC */
>>>> - LVT_MASK | APIC_DM_MASK,
>>>> - /* LVT0-1 */
>>>> - LINT_MASK, LINT_MASK,
>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>> + /* LVT0 */
>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>> + /* LVT1 */
>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>> /* LVTERR */
>>>> - LVT_MASK
>>>> + LVT_MASK & ~APIC_SEND_PENDING,
>>>> };
>>>
>>> This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
>>> for RO?
>>>
>>> That ought to simplify the logic in both the MSR and MMIO cases.
>>>
>>> MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
>>> #GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
>>> writes.
>>>
>>> Thoughts?
>>
>> I've been thinking about the same and It can be done, np.
>> I always trying to make "fix" with as small diff as possible
>> considering back-porting.
>>
>> How about "follow up" patch if there is an agreement to proceed this way on the Top level?
>
> Doing it in two steps would be okay with me (I expected it to go that way
> anyway), but then it would still be nice to limit churn some. Specifically,
> taking LINT_MASK as example, can't we do
>
> #define LINT_RO_MASK (LVT_RO_MASK | APIC_LVT_REMOTE_IRR)
>
> #define LINT_WR_MASK \
> (LVT_WR_MASK | APIC_DM_MASK | APIC_INPUT_POLARITY | \
> APIC_LVT_LEVEL_TRIGGER)
>
> #define LINT_MASK (LINT_WR_MASK | LINT_RO_MASK)
>
> or some such, and then use *_WR_MASK in the table initializer?
Huh. I seems lost a bit, so it's time for ask for more clarifications.
I was under impression (seems wrong) that this patch is ok in general, but
more improvements need to be done while here [1].
My situation is simple - I have a broken safety test with obvious reason "RO bits are writable".
And for me fixing a bug (in most simple and fast way) is a high priority.
Then whatever optimization/improvements/refactoring (while have time slot).
So, what need to be done to get the bug fixed and fix merged? (preferably in 4.21)
Should i update this patch and port here "do LVTx masks renaming and rearranging to x_WR_MASK/x_RO_MASK" part from [1]?
Will it be enough?
[1] https://patchwork.kernel.org/project/xen-devel/patch/20251009114249.1964387-1-grygorii_strashko@epam.com/
--
Best regards,
-grygorii
On 09.10.2025 16:55, Grygorii Strashko wrote:
> Hi Jan,
>
> Thanks for your comments and support.
>
> On 07.10.25 18:35, Jan Beulich wrote:
>> On 03.10.2025 16:04, Grygorii Strashko wrote:
>>>
>>>
>>> On 01.10.25 18:18, Alejandro Vallejo wrote:
>>>> On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
>>>>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>
>>>>> The LAPIC LVTx registers have two RO bits:
>>>>> - all: Delivery Status (DS) bit 12
>>>>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>>>>
>>>>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>>>>> and the IRQ is:
>>>>> - or accepted at destination and appears as pending
>>>>> (vLAPIC Interrupt Request Register (IRR))
>>>>> - or get rejected immediately.
>>>>>
>>>>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>>>>> LINT0/LINT1 in Xen for now.
>>>>>
>>>>> The current vLAPIC implementations allows guest to write to these RO bits.
>>>>>
>>>>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>>>>> to implement "Write ignore" access type for RO bits by applying masks from
>>>>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>>>>> allows writing to RO fields.
>>>>>
>>>>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>>>>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>>>>
>>>>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>>>>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>>>>> to reserved bits should cause #GP exception), but contains no statements
>>>>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>>>>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>>>>> registers values for "Reserved" bit settings, which include RO bits and
>>>>> do not cause #GP exception.
>>>>>
>>>>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>>>>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>> ---
>>>>> Changes in v2:
>>>>> - masks fixed in vlapic_lvt_mask[]
>>>>> - commit msg reworded
>>>>>
>>>>> v1: https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
>>>>> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>>>>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>>>>> index 79697487ba90..2ecba8163f48 100644
>>>>> --- a/xen/arch/x86/hvm/vlapic.c
>>>>> +++ b/xen/arch/x86/hvm/vlapic.c
>>>>> @@ -44,15 +44,17 @@
>>>>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>>>>> {
>>>>> /* LVTT */
>>>>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>>>>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>>>>> /* LVTTHMR */
>>>>> - LVT_MASK | APIC_DM_MASK,
>>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>>> /* LVTPC */
>>>>> - LVT_MASK | APIC_DM_MASK,
>>>>> - /* LVT0-1 */
>>>>> - LINT_MASK, LINT_MASK,
>>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>>> + /* LVT0 */
>>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>>> + /* LVT1 */
>>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>>> /* LVTERR */
>>>>> - LVT_MASK
>>>>> + LVT_MASK & ~APIC_SEND_PENDING,
>>>>> };
>>>>
>>>> This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
>>>> for RO?
>>>>
>>>> That ought to simplify the logic in both the MSR and MMIO cases.
>>>>
>>>> MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
>>>> #GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
>>>> writes.
>>>>
>>>> Thoughts?
>>>
>>> I've been thinking about the same and It can be done, np.
>>> I always trying to make "fix" with as small diff as possible
>>> considering back-porting.
>>>
>>> How about "follow up" patch if there is an agreement to proceed this way on the Top level?
>>
>> Doing it in two steps would be okay with me (I expected it to go that way
>> anyway), but then it would still be nice to limit churn some. Specifically,
>> taking LINT_MASK as example, can't we do
>>
>> #define LINT_RO_MASK (LVT_RO_MASK | APIC_LVT_REMOTE_IRR)
>>
>> #define LINT_WR_MASK \
>> (LVT_WR_MASK | APIC_DM_MASK | APIC_INPUT_POLARITY | \
>> APIC_LVT_LEVEL_TRIGGER)
>>
>> #define LINT_MASK (LINT_WR_MASK | LINT_RO_MASK)
>>
>> or some such, and then use *_WR_MASK in the table initializer?
>
> Huh. I seems lost a bit, so it's time for ask for more clarifications.
>
> I was under impression (seems wrong) that this patch is ok in general, but
> more improvements need to be done while here [1].
> My situation is simple - I have a broken safety test with obvious reason "RO bits are writable".
> And for me fixing a bug (in most simple and fast way) is a high priority.
> Then whatever optimization/improvements/refactoring (while have time slot).
>
> So, what need to be done to get the bug fixed and fix merged? (preferably in 4.21)
I think what Oleksii said on my series likely applies to this fix too: Has
been around for a long time, and hence isn't really release critical.
Jan
On 10/9/25 5:26 PM, Jan Beulich wrote:
> On 09.10.2025 16:55, Grygorii Strashko wrote:
>> Hi Jan,
>>
>> Thanks for your comments and support.
>>
>> On 07.10.25 18:35, Jan Beulich wrote:
>>> On 03.10.2025 16:04, Grygorii Strashko wrote:
>>>>
>>>> On 01.10.25 18:18, Alejandro Vallejo wrote:
>>>>> On Tue Sep 30, 2025 at 9:05 PM CEST, Grygorii Strashko wrote:
>>>>>> From: Grygorii Strashko<grygorii_strashko@epam.com>
>>>>>>
>>>>>> The LAPIC LVTx registers have two RO bits:
>>>>>> - all: Delivery Status (DS) bit 12
>>>>>> - LINT0/LINT1: Remote IRR Flag (RIR) bit 14.
>>>>>>
>>>>>> The Delivery Status (DS) is not emulated by Xen - there is no IRQ msg bus,
>>>>>> and the IRQ is:
>>>>>> - or accepted at destination and appears as pending
>>>>>> (vLAPIC Interrupt Request Register (IRR))
>>>>>> - or get rejected immediately.
>>>>>>
>>>>>> The Remote IRR Flag (RIR) behavior emulation is not implemented for
>>>>>> LINT0/LINT1 in Xen for now.
>>>>>>
>>>>>> The current vLAPIC implementations allows guest to write to these RO bits.
>>>>>>
>>>>>> The vLAPIC LVTx registers write happens in vlapic_reg_write() which expect
>>>>>> to implement "Write ignore" access type for RO bits by applying masks from
>>>>>> vlapic_lvt_mask[], but vlapic_lvt_mask[] contains incorrect masks which
>>>>>> allows writing to RO fields.
>>>>>>
>>>>>> Hence it is definitely wrong to allow guest to write to LVTx regs RO bits,
>>>>>> fix it by fixing LVTx registers masks in vlapic_lvt_mask[].
>>>>>>
>>>>>> In case of WRMSR (guest_wrmsr_x2apic()) access to LVTx registers, the SDM
>>>>>> clearly defines access type for "Reserved" bits as RsvdZ (Non-zero writes
>>>>>> to reserved bits should cause #GP exception), but contains no statements
>>>>>> for RO bits except that they are not "Reserved". So, guest_wrmsr_x2apic()
>>>>>> now uses different masks (than vlapic_reg_write()) for checking LVTx
>>>>>> registers values for "Reserved" bit settings, which include RO bits and
>>>>>> do not cause #GP exception.
>>>>>>
>>>>>> Fixes: d1bd157fbc9b ("Big merge the HVM full-virtualisation abstractions.")
>>>>>> Signed-off-by: Grygorii Strashko<grygorii_strashko@epam.com>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> - masks fixed in vlapic_lvt_mask[]
>>>>>> - commit msg reworded
>>>>>>
>>>>>> v1:https://patchwork.kernel.org/project/xen-devel/patch/20250925195558.519568-1-grygorii_strashko@epam.com/
>>>>>> xen/arch/x86/hvm/vlapic.c | 14 ++++++++------
>>>>>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>>>>>> index 79697487ba90..2ecba8163f48 100644
>>>>>> --- a/xen/arch/x86/hvm/vlapic.c
>>>>>> +++ b/xen/arch/x86/hvm/vlapic.c
>>>>>> @@ -44,15 +44,17 @@
>>>>>> static const unsigned int vlapic_lvt_mask[VLAPIC_LVT_NUM] =
>>>>>> {
>>>>>> /* LVTT */
>>>>>> - LVT_MASK | APIC_TIMER_MODE_MASK,
>>>>>> + (LVT_MASK | APIC_TIMER_MODE_MASK) & ~APIC_SEND_PENDING,
>>>>>> /* LVTTHMR */
>>>>>> - LVT_MASK | APIC_DM_MASK,
>>>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>>>> /* LVTPC */
>>>>>> - LVT_MASK | APIC_DM_MASK,
>>>>>> - /* LVT0-1 */
>>>>>> - LINT_MASK, LINT_MASK,
>>>>>> + (LVT_MASK | APIC_DM_MASK) & ~APIC_SEND_PENDING,
>>>>>> + /* LVT0 */
>>>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>>>> + /* LVT1 */
>>>>>> + LINT_MASK & ~(APIC_LVT_REMOTE_IRR | APIC_SEND_PENDING),
>>>>>> /* LVTERR */
>>>>>> - LVT_MASK
>>>>>> + LVT_MASK & ~APIC_SEND_PENDING,
>>>>>> };
>>>>> This is a bit messy. Why not have 2 masks? One for rsvdZ bits, and one
>>>>> for RO?
>>>>>
>>>>> That ought to simplify the logic in both the MSR and MMIO cases.
>>>>>
>>>>> MMIO would do RAZ/WI on the OR of both, while the MSR interface would gate
>>>>> #GP(0) on the mask for rsvd bits only and ensure all RO bits are preserved on
>>>>> writes.
>>>>>
>>>>> Thoughts?
>>>> I've been thinking about the same and It can be done, np.
>>>> I always trying to make "fix" with as small diff as possible
>>>> considering back-porting.
>>>>
>>>> How about "follow up" patch if there is an agreement to proceed this way on the Top level?
>>> Doing it in two steps would be okay with me (I expected it to go that way
>>> anyway), but then it would still be nice to limit churn some. Specifically,
>>> taking LINT_MASK as example, can't we do
>>>
>>> #define LINT_RO_MASK (LVT_RO_MASK | APIC_LVT_REMOTE_IRR)
>>>
>>> #define LINT_WR_MASK \
>>> (LVT_WR_MASK | APIC_DM_MASK | APIC_INPUT_POLARITY | \
>>> APIC_LVT_LEVEL_TRIGGER)
>>>
>>> #define LINT_MASK (LINT_WR_MASK | LINT_RO_MASK)
>>>
>>> or some such, and then use *_WR_MASK in the table initializer?
>> Huh. I seems lost a bit, so it's time for ask for more clarifications.
>>
>> I was under impression (seems wrong) that this patch is ok in general, but
>> more improvements need to be done while here [1].
>> My situation is simple - I have a broken safety test with obvious reason "RO bits are writable".
>> And for me fixing a bug (in most simple and fast way) is a high priority.
>> Then whatever optimization/improvements/refactoring (while have time slot).
>>
>> So, what need to be done to get the bug fixed and fix merged? (preferably in 4.21)
> I think what Oleksii said on my series likely applies to this fix too: Has
> been around for a long time, and hence isn't really release critical.
Yes, your understanding is correct. IMO, this change can wait until 4.22.
~ Oleksii
© 2016 - 2026 Red Hat, Inc.