While commit 46c4061cd2bf ("x86/IOMMU: mark IOMMU / intremap not in use
when ACPI tables are missing") deals with apic_x2apic_probe() as called
from x2apic_bsp_setup(), the check_x2apic_preenabled() path is similarly
affected: The call needs to occur after acpi_iommu_init(), such that
iommu_intremap getting disabled there can be properly taken into account
by apic_x2apic_probe().
Note that, for the time being (further cleanup patches following),
reversing the order of the calls to generic_apic_probe() and
acpi_boot_init() is not an option:
- acpi_process_madt() calls clustered_apic_check() and hence relies on
genapic to have got filled before,
- generic_bigsmp_probe() (called from acpi_process_madt()) needs to
occur after generic_apic_probe(),
- acpi_parse_madt() (called from acpi_process_madt()) calls
acpi_madt_oem_check(), which wants to be after generic_apic_probe().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Based on code inspection only - I have no affected system and hence no
way to actually test the case.
---
v2.2: Move generic_apic_probe() again, but only past acpi_iommu_init().
v2.1: Respect acpi_disabled in acpi_iommu_init().
v2: Don't move generic_apic_probe() invocation, instead pull out
acpi_iommu_init() from acpi_boot_init().
---
4.16: While investigating the issue addressed by the referenced commit,
a variant of that problem was identified when firmware pre-enables
x2APIC mode. Whether that's something sane firmware would do when
at the same time IOMMU(s) is/are disabled is unclear, so this may
be a purely academical consideration. Working around the problem
also ought to be as simple as passing "iommu=no-intremap" on the
command line. Considering the fragility of the code (as further
demonstrated by v1 having been completely wrong), it may therefore
be advisable to defer this change until after branching.
Nevertheless it will then be a backporting candidate, so
considering to take it right away can't simply be put off.
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -757,8 +757,6 @@ int __init acpi_boot_init(void)
acpi_mmcfg_init();
- acpi_iommu_init();
-
erst_init();
acpi_hest_init();
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1700,15 +1700,30 @@ void __init noreturn __start_xen(unsigne
dmi_scan_machine();
- generic_apic_probe();
-
mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
RANGESETF_prettyprint_hex);
xsm_multiboot_init(module_map, mbi);
+ /*
+ * IOMMU-related ACPI table parsing may require some of the system domains
+ * to be usable.
+ */
setup_system_domains();
+ /*
+ * IOMMU-related ACPI table parsing has to happen before APIC probing, for
+ * check_x2apic_preenabled() to be able to observe respective findings, in
+ * particular iommu_intremap having got turned off.
+ */
+ acpi_iommu_init();
+
+ /*
+ * APIC probing needs to happen before general ACPI table parsing, as e.g.
+ * generic_bigsmp_probe() may occur only afterwards.
+ */
+ generic_apic_probe();
+
acpi_boot_init();
if ( smp_found_config )
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -43,14 +43,17 @@ bool __read_mostly iommu_intpost;
void __init acpi_iommu_init(void)
{
- int ret;
+ int ret = -ENODEV;
if ( !iommu_enable && !iommu_intremap )
return;
- ret = acpi_dmar_init();
- if ( ret == -ENODEV )
- ret = acpi_ivrs_init();
+ if ( !acpi_disabled )
+ {
+ ret = acpi_dmar_init();
+ if ( ret == -ENODEV )
+ ret = acpi_ivrs_init();
+ }
if ( ret )
{
On Mon, Nov 15, 2021 at 03:31:39PM +0100, Jan Beulich wrote:
> While commit 46c4061cd2bf ("x86/IOMMU: mark IOMMU / intremap not in use
> when ACPI tables are missing") deals with apic_x2apic_probe() as called
> from x2apic_bsp_setup(), the check_x2apic_preenabled() path is similarly
> affected: The call needs to occur after acpi_iommu_init(), such that
> iommu_intremap getting disabled there can be properly taken into account
> by apic_x2apic_probe().
>
> Note that, for the time being (further cleanup patches following),
> reversing the order of the calls to generic_apic_probe() and
> acpi_boot_init() is not an option:
> - acpi_process_madt() calls clustered_apic_check() and hence relies on
> genapic to have got filled before,
> - generic_bigsmp_probe() (called from acpi_process_madt()) needs to
> occur after generic_apic_probe(),
> - acpi_parse_madt() (called from acpi_process_madt()) calls
> acpi_madt_oem_check(), which wants to be after generic_apic_probe().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Just one comment below.
> ---
> Based on code inspection only - I have no affected system and hence no
> way to actually test the case.
> ---
> v2.2: Move generic_apic_probe() again, but only past acpi_iommu_init().
> v2.1: Respect acpi_disabled in acpi_iommu_init().
> v2: Don't move generic_apic_probe() invocation, instead pull out
> acpi_iommu_init() from acpi_boot_init().
> ---
> 4.16: While investigating the issue addressed by the referenced commit,
> a variant of that problem was identified when firmware pre-enables
> x2APIC mode. Whether that's something sane firmware would do when
> at the same time IOMMU(s) is/are disabled is unclear, so this may
> be a purely academical consideration. Working around the problem
> also ought to be as simple as passing "iommu=no-intremap" on the
> command line. Considering the fragility of the code (as further
> demonstrated by v1 having been completely wrong), it may therefore
> be advisable to defer this change until after branching.
> Nevertheless it will then be a backporting candidate, so
> considering to take it right away can't simply be put off.
>
> --- a/xen/arch/x86/acpi/boot.c
> +++ b/xen/arch/x86/acpi/boot.c
> @@ -757,8 +757,6 @@ int __init acpi_boot_init(void)
>
> acpi_mmcfg_init();
>
> - acpi_iommu_init();
> -
> erst_init();
>
> acpi_hest_init();
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1700,15 +1700,30 @@ void __init noreturn __start_xen(unsigne
>
> dmi_scan_machine();
>
> - generic_apic_probe();
> -
> mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
> RANGESETF_prettyprint_hex);
>
> xsm_multiboot_init(module_map, mbi);
>
> + /*
> + * IOMMU-related ACPI table parsing may require some of the system domains
> + * to be usable.
I would be a bit more specific and likely add "...to be usable in
order to hide IOMMU PCI devices.".
Thanks, Roger.
On 15.11.2021 16:07, Roger Pau Monné wrote:
> On Mon, Nov 15, 2021 at 03:31:39PM +0100, Jan Beulich wrote:
>> While commit 46c4061cd2bf ("x86/IOMMU: mark IOMMU / intremap not in use
>> when ACPI tables are missing") deals with apic_x2apic_probe() as called
>> from x2apic_bsp_setup(), the check_x2apic_preenabled() path is similarly
>> affected: The call needs to occur after acpi_iommu_init(), such that
>> iommu_intremap getting disabled there can be properly taken into account
>> by apic_x2apic_probe().
>>
>> Note that, for the time being (further cleanup patches following),
>> reversing the order of the calls to generic_apic_probe() and
>> acpi_boot_init() is not an option:
>> - acpi_process_madt() calls clustered_apic_check() and hence relies on
>> genapic to have got filled before,
>> - generic_bigsmp_probe() (called from acpi_process_madt()) needs to
>> occur after generic_apic_probe(),
>> - acpi_parse_madt() (called from acpi_process_madt()) calls
>> acpi_madt_oem_check(), which wants to be after generic_apic_probe().
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Thanks.
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -1700,15 +1700,30 @@ void __init noreturn __start_xen(unsigne
>>
>> dmi_scan_machine();
>>
>> - generic_apic_probe();
>> -
>> mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
>> RANGESETF_prettyprint_hex);
>>
>> xsm_multiboot_init(module_map, mbi);
>>
>> + /*
>> + * IOMMU-related ACPI table parsing may require some of the system domains
>> + * to be usable.
>
> I would be a bit more specific and likely add "...to be usable in
> order to hide IOMMU PCI devices.".
Hmm, not sure. I did specifically leave out the "why" part, as dom_io
might also become required for something. If you nevertheless think
I should extend the comment, then I'd insist on using "e.g." just
like I did in the comment next to the call to generic_apic_probe().
Jan
On Mon, Nov 15, 2021 at 05:10:04PM +0100, Jan Beulich wrote:
> On 15.11.2021 16:07, Roger Pau Monné wrote:
> > On Mon, Nov 15, 2021 at 03:31:39PM +0100, Jan Beulich wrote:
> >> While commit 46c4061cd2bf ("x86/IOMMU: mark IOMMU / intremap not in use
> >> when ACPI tables are missing") deals with apic_x2apic_probe() as called
> >> from x2apic_bsp_setup(), the check_x2apic_preenabled() path is similarly
> >> affected: The call needs to occur after acpi_iommu_init(), such that
> >> iommu_intremap getting disabled there can be properly taken into account
> >> by apic_x2apic_probe().
> >>
> >> Note that, for the time being (further cleanup patches following),
> >> reversing the order of the calls to generic_apic_probe() and
> >> acpi_boot_init() is not an option:
> >> - acpi_process_madt() calls clustered_apic_check() and hence relies on
> >> genapic to have got filled before,
> >> - generic_bigsmp_probe() (called from acpi_process_madt()) needs to
> >> occur after generic_apic_probe(),
> >> - acpi_parse_madt() (called from acpi_process_madt()) calls
> >> acpi_madt_oem_check(), which wants to be after generic_apic_probe().
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Thanks.
>
> >> --- a/xen/arch/x86/setup.c
> >> +++ b/xen/arch/x86/setup.c
> >> @@ -1700,15 +1700,30 @@ void __init noreturn __start_xen(unsigne
> >>
> >> dmi_scan_machine();
> >>
> >> - generic_apic_probe();
> >> -
> >> mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
> >> RANGESETF_prettyprint_hex);
> >>
> >> xsm_multiboot_init(module_map, mbi);
> >>
> >> + /*
> >> + * IOMMU-related ACPI table parsing may require some of the system domains
> >> + * to be usable.
> >
> > I would be a bit more specific and likely add "...to be usable in
> > order to hide IOMMU PCI devices.".
>
> Hmm, not sure. I did specifically leave out the "why" part, as dom_io
> might also become required for something. If you nevertheless think
> I should extend the comment, then I'd insist on using "e.g." just
> like I did in the comment next to the call to generic_apic_probe().
My request was mostly to make it clear where dom_io is used, it's IMO
not trivial to spot that pci_ro_device requires the dom_io to be
setup. I'm fine if you want to use 'e.g.', or if you think this is not
helpful.
Thanks, Roger.
© 2016 - 2026 Red Hat, Inc.