[PATCH] IOMMU/x86: don't map IO-APICs for PV Dom0

Jan Beulich posted 1 patch 2 years, 8 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/d01563bc-ae9c-fe91-b313-19a30af09170@suse.com
[PATCH] IOMMU/x86: don't map IO-APICs for PV Dom0
Posted by Jan Beulich 2 years, 8 months ago
While already the case for PVH, there's no reason to treat PV
differently here (except of course where to take the addresses from).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -199,9 +199,18 @@ static bool __hwdom_init hwdom_iommu_map
     if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
         return false;
     /* ... or the IO-APIC */
-    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
-        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
-            return false;
+    if ( has_vioapic(d) )
+    {
+        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
+            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
+                return false;
+    }
+    else
+    {
+        for ( i = 0; i < nr_ioapics; i++ )
+            if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
+                return false;
+    }
     /*
      * ... or the PCIe MCFG regions.
      * TODO: runtime added MMCFG regions are not checked to make sure they


Re: [PATCH] IOMMU/x86: don't map IO-APICs for PV Dom0
Posted by Andrew Cooper 2 years, 8 months ago
On 16/08/2021 16:31, Jan Beulich wrote:
> While already the case for PVH, there's no reason to treat PV
> differently here (except of course where to take the addresses from).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Honestly, this is already a mess but I think the change is making things
worse rather than better.

To start with, IO-APIC windows are 1k not 4k, except that no-one added a
"4k safe" flag because IO-APICs weren't mapped into userspace by Linux
at the time.

More generally though, if something is safe to let dom0 map in the CPU
pagetables, it is safe to go in the IOMMU pagetables.  Conversely, if
it's not safe to go in one, it's not safe to go in either.

Mappings (or at least mapability) of everything/anything should be
uniform between the CPU and IOMMU pagetables for any kind of sanity to
prevail.

This is most easily demonstrated with PVH dom0 and shared vs split EPT
tables.  Split vs shared is an internal choice within Xen, and shouldn't
cause in any change in static DMA behaviour (obviously - there is
transient difference with logdirty but that's not relevant here).

~Andrew


Re: [PATCH] IOMMU/x86: don't map IO-APICs for PV Dom0
Posted by Jan Beulich 2 years, 8 months ago
On 16.08.2021 20:31, Andrew Cooper wrote:
> On 16/08/2021 16:31, Jan Beulich wrote:
>> While already the case for PVH, there's no reason to treat PV
>> differently here (except of course where to take the addresses from).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Honestly, this is already a mess but I think the change is making things
> worse rather than better.
> 
> To start with, IO-APIC windows are 1k not 4k, except that no-one added a
> "4k safe" flag because IO-APICs weren't mapped into userspace by Linux
> at the time.
> 
> More generally though, if something is safe to let dom0 map in the CPU
> pagetables, it is safe to go in the IOMMU pagetables.  Conversely, if
> it's not safe to go in one, it's not safe to go in either.
> 
> Mappings (or at least mapability) of everything/anything should be
> uniform between the CPU and IOMMU pagetables for any kind of sanity to
> prevail.
> 
> This is most easily demonstrated with PVH dom0 and shared vs split EPT
> tables.  Split vs shared is an internal choice within Xen, and shouldn't
> cause in any change in static DMA behaviour (obviously - there is
> transient difference with logdirty but that's not relevant here).

Well, as frequently my aim is consistency: Either we exclude IO-APIC
space here uniformly (regardless of guest type), or we include it.
Yet including is impossible for PVH afaict, because there's no MFN
to map to (IOW I don't buy all aspects of the last paragraph of your
reply - there's no mapping of the IO-APIC page(s) in either kind of
page tables).

I did notice the oddity while closely inspecting the IOMMU mappings
created for Dom0 in the context of finally making use of large pages
there. Seeing it I didn't think the IO-APICs should be mapped in the
IOMMU page tables when they are _not_ mapped / mappable in the CPU
ones (see the respective iomem_deny_access() in
dom0_setup_permissions()). Hence the change actually only brings us
to what you describe in the 2nd to last paragraph of your reply
above.

Or wait - default behavior actually is to allow r/o mappings of the
IO-APIC pages. This would suggest the IOMMU mappings should be r/o
as well (unless the rangeset addition in ioapic_init_mappings()
failed). I can certainly alter the change to this effect, at the
expense of more code and more code churn.

Jan


Re: [PATCH] IOMMU/x86: don't map IO-APICs for PV Dom0
Posted by Jan Beulich 2 years, 8 months ago
On 16.08.2021 20:31, Andrew Cooper wrote:
> On 16/08/2021 16:31, Jan Beulich wrote:
>> While already the case for PVH, there's no reason to treat PV
>> differently here (except of course where to take the addresses from).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Honestly, this is already a mess but I think the change is making things
> worse rather than better.
> 
> To start with, IO-APIC windows are 1k not 4k, except that no-one added a
> "4k safe" flag because IO-APICs weren't mapped into userspace by Linux
> at the time.
> 
> More generally though, if something is safe to let dom0 map in the CPU
> pagetables, it is safe to go in the IOMMU pagetables.  Conversely, if
> it's not safe to go in one, it's not safe to go in either.
> 
> Mappings (or at least mapability) of everything/anything should be
> uniform between the CPU and IOMMU pagetables for any kind of sanity to
> prevail.

Just in case it's not obvious (it just occurred to me to mention it):
There's a similar inconsistency with all other MMIO: HVM/PVH get this
mapped in both CPU and IOMMU, while PV doesn't. If mappings were
mirrored to the IOMMU, the patch here wouldn't be necessary in its v2
form (or the form when integrated into the larger series), but instead
would be correct in this initial shape.

So if you think that's the way to go, I can restore the initial version
of this patch after adding a prereq one to mirror MMIO page mappings to
the IOMMU. There's one difficulty though: We'd need to have a way to
tell when it's safe to unmap from the IOMMU again. In case of multiple
CPU side mappings we can't unmap when the first of these mappings goes
away.

Jan