The IO-APIC RTEs are unconditionally programmed with physical destination
mode, and hence the field to set in the RTE is always physical_dest.
Remove the mode parameter from SET_DEST() and take the opportunity to
convert it into a function, there's no need for it to be a macro.
This is a benign fix, because due to the endianness of x86 the start of the
physical_dest and logical_dest fields on the RTE overlap.
While there also adjust setup_IO_APIC_irqs() to use the target CPU set by
the assign_irq_vector(), rather than using TARGET_CPUS and possibly
creating a mismatch between the target CPU in desc->arch.cpu_mask and the
one programmed in the IO-APIC RTE.
Fixes: 032d6733a458 ("x86/apic: remove delivery and destination mode fields from drivers")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/io_apic.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 46c2a43dac6d..3c59bf0e15da 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1037,12 +1037,14 @@ static hw_irq_controller ioapic_edge_type;
#define IOAPIC_EDGE 0
#define IOAPIC_LEVEL 1
-#define SET_DEST(ent, mode, val) do { \
- if (x2apic_enabled && iommu_intremap) \
- (ent).dest.dest32 = (val); \
- else \
- (ent).dest.mode.mode##_dest = (val); \
-} while (0)
+static void set_entry_dest(struct IO_APIC_route_entry *entry,
+ unsigned int apic_id)
+{
+ if ( x2apic_enabled && iommu_intremap )
+ entry->dest.dest32 = apic_id;
+ else
+ entry->dest.physical.physical_dest = apic_id;
+}
static inline void ioapic_register_intr(int irq, unsigned long trigger)
{
@@ -1109,7 +1111,8 @@ static void __init setup_IO_APIC_irqs(void)
if (platform_legacy_irq(irq))
disable_8259A_irq(irq_to_desc(irq));
- SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
+ set_entry_dest(&entry,
+ cpu_mask_to_apicid(irq_to_desc(irq)->arch.cpu_mask));
spin_lock_irqsave(&ioapic_lock, flags);
__ioapic_write_entry(apic, pin, false, entry);
spin_unlock_irqrestore(&ioapic_lock, flags);
@@ -1140,7 +1143,7 @@ static void __init setup_ExtINT_IRQ0_pin(unsigned int apic, unsigned int pin, in
*/
entry.dest_mode = 0; /* physical delivery */
entry.mask = 0; /* unmask IRQ now */
- SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
+ set_entry_dest(&entry, cpu_mask_to_apicid(TARGET_CPUS));
entry.delivery_mode = dest_Fixed;
entry.polarity = 0;
entry.trigger = 0;
@@ -1432,7 +1435,7 @@ void disable_IO_APIC(void)
entry.dest_mode = 0; /* Physical */
entry.delivery_mode = dest_ExtINT; /* ExtInt */
entry.vector = 0;
- SET_DEST(entry, physical, get_apic_id());
+ set_entry_dest(&entry, get_apic_id());
/*
* Add it to the IO-APIC irq-routing table:
@@ -1806,7 +1809,7 @@ static void __init unlock_ExtINT_logic(void)
entry1.dest_mode = 0; /* physical delivery */
entry1.mask = 0; /* unmask IRQ now */
- SET_DEST(entry1, physical, get_apic_id());
+ set_entry_dest(&entry1, get_apic_id());
entry1.delivery_mode = dest_ExtINT;
entry1.polarity = entry0.polarity;
entry1.trigger = 0;
@@ -2137,7 +2140,7 @@ int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int a
cpumask_t *mask = this_cpu(scratch_cpumask);
cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+ set_entry_dest(&entry, cpu_mask_to_apicid(mask));
} else {
printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
@@ -2334,7 +2337,7 @@ int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
cpumask_t *mask = this_cpu(scratch_cpumask);
cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+ set_entry_dest(&rte, cpu_mask_to_apicid(mask));
}
else
{
--
2.51.0
On 20/11/2025 9:06 am, Roger Pau Monne wrote: > The IO-APIC RTEs are unconditionally programmed with physical destination > mode, and hence the field to set in the RTE is always physical_dest. > > Remove the mode parameter from SET_DEST() and take the opportunity to > convert it into a function, there's no need for it to be a macro. > > This is a benign fix, because due to the endianness of x86 the start of the > physical_dest and logical_dest fields on the RTE overlap. RTEs do not have overlapping fields; it's Xen's abstraction of the IO-APIC which is buggy. For starters, Xen's IO_APIC_route_entry still only has a 4-bit physical_dest field which hasn't been true since the Pentium 4 days. This might explain some of the interrupt bugs we see. ~Andrew
On 20.11.2025 14:18, Andrew Cooper wrote: > On 20/11/2025 9:06 am, Roger Pau Monne wrote: >> The IO-APIC RTEs are unconditionally programmed with physical destination >> mode, and hence the field to set in the RTE is always physical_dest. >> >> Remove the mode parameter from SET_DEST() and take the opportunity to >> convert it into a function, there's no need for it to be a macro. >> >> This is a benign fix, because due to the endianness of x86 the start of the >> physical_dest and logical_dest fields on the RTE overlap. > > RTEs do not have overlapping fields; it's Xen's abstraction of the > IO-APIC which is buggy. I wouldn't put it this negatively. In the old days, ... > For starters, Xen's IO_APIC_route_entry still only has a 4-bit > physical_dest field which hasn't been true since the Pentium 4 days. > This might explain some of the interrupt bugs we see. ... as you mention here, the two fields were distinct (and hence overlapping). In a number of places we passed "logical" to SET_DEST() as the middle argument, thus covering for the too narrow field width of physical_dest. Dropping that parameter and always using physical_dest requires that field to be widened, though (or else we'll end up chopping off the top 4 bits, as we already do in disable_IO_APIC() and unlock_ExtINT_logic() - both benign as long as the CPU used always has APIC ID 0, which will at least typically be the case, I think). Jan
On 20/11/2025 2:27 pm, Jan Beulich wrote: > On 20.11.2025 14:18, Andrew Cooper wrote: >> On 20/11/2025 9:06 am, Roger Pau Monne wrote: >>> The IO-APIC RTEs are unconditionally programmed with physical destination >>> mode, and hence the field to set in the RTE is always physical_dest. >>> >>> Remove the mode parameter from SET_DEST() and take the opportunity to >>> convert it into a function, there's no need for it to be a macro. >>> >>> This is a benign fix, because due to the endianness of x86 the start of the >>> physical_dest and logical_dest fields on the RTE overlap. >> RTEs do not have overlapping fields; it's Xen's abstraction of the >> IO-APIC which is buggy. > I wouldn't put it this negatively. In the old days, ... > >> For starters, Xen's IO_APIC_route_entry still only has a 4-bit >> physical_dest field which hasn't been true since the Pentium 4 days. >> This might explain some of the interrupt bugs we see. > ... as you mention here, the two fields were distinct (and hence overlapping). Since when? Even in the oldest manuals I can find, it's a single field called destination, and who's contents is interpreted differently depending on the logical mode bit. From a hardware point of view, there will either be 4 or 8 bits of storage, and it will have nothing to do with the lower bits. > In a number of places we passed "logical" to SET_DEST() as the middle argument, > thus covering for the too narrow field width of physical_dest. Dropping that > parameter and always using physical_dest requires that field to be widened, > though (or else we'll end up chopping off the top 4 bits, as we already do in > disable_IO_APIC() and unlock_ExtINT_logic() - both benign as long as the CPU > used always has APIC ID 0, which will at least typically be the case, I think). Latent or not, it's still in need of fixing. It looks like the code Xen inherited was added to Linux in e1d919786 (Jan 2008, even then only x86_32) and deleted by d83e94acd957 (August 2008). It looks like we're 17 years late undoing this... ~Andrew
On 20.11.2025 19:30, Andrew Cooper wrote: > On 20/11/2025 2:27 pm, Jan Beulich wrote: >> On 20.11.2025 14:18, Andrew Cooper wrote: >>> On 20/11/2025 9:06 am, Roger Pau Monne wrote: >>>> The IO-APIC RTEs are unconditionally programmed with physical destination >>>> mode, and hence the field to set in the RTE is always physical_dest. >>>> >>>> Remove the mode parameter from SET_DEST() and take the opportunity to >>>> convert it into a function, there's no need for it to be a macro. >>>> >>>> This is a benign fix, because due to the endianness of x86 the start of the >>>> physical_dest and logical_dest fields on the RTE overlap. >>> RTEs do not have overlapping fields; it's Xen's abstraction of the >>> IO-APIC which is buggy. >> I wouldn't put it this negatively. In the old days, ... >> >>> For starters, Xen's IO_APIC_route_entry still only has a 4-bit >>> physical_dest field which hasn't been true since the Pentium 4 days. >>> This might explain some of the interrupt bugs we see. >> ... as you mention here, the two fields were distinct (and hence overlapping). > > Since when? > > Even in the oldest manuals I can find, it's a single field called > destination, and who's contents is interpreted differently depending on > the logical mode bit. And these two different meanings are reflected in the union-ized definition in Xen. Which I still think is fair to describe as "overlapping". And the use of a union itself isn't buggy imo; it's some of the code using it which is. Jan
© 2016 - 2025 Red Hat, Inc.