[PATCH] x86/pvh: fix identity mapping of low 1MB

Roger Pau Monne posted 1 patch 6 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20231011153756.16714-1-roger.pau@citrix.com
There is a newer version of this series
xen/arch/x86/hvm/dom0_build.c       | 22 ----------------------
xen/drivers/passthrough/x86/iommu.c |  8 +-------
2 files changed, 1 insertion(+), 29 deletions(-)
[PATCH] x86/pvh: fix identity mapping of low 1MB
Posted by Roger Pau Monne 6 months, 3 weeks ago
The mapping of memory regions below the 1MB mark was all done by the PVH dom0
builder code, thus completely avoiding that region in the arch-specific IOMMU
hardware domain initialization code.  That lead to the IOMMU being enabled
without reserved regions in the low 1MB identity mapped in the p2m for PVH
hardware domains.  Firmware with missing RMRR/IVMD ranges that would otherwise
be located in the low 1MB would transiently trigger IOMMU faults until the p2m
is populated by the PVH dom0 builder:

AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0

Mostly remove the special handling of the low 1MB done by the PVH dom0 builder,
leaving just the data copy between RAM regions.  Otherwise rely on the IOMMU
arch init code to create any identity mappings for reserved regions in such
range (like it already does for all reserved regions).

Note there's a small difference in behavior, as holes in the low 1MB will no
longer be identity mapped to the p2m.

Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/dom0_build.c       | 22 ----------------------
 xen/drivers/passthrough/x86/iommu.c |  8 +-------
 2 files changed, 1 insertion(+), 29 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index bc0e290db612..979db7d1ec4d 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -449,28 +449,6 @@ static int __init pvh_populate_p2m(struct domain *d)
         }
     }
 
-    /* Non-RAM regions of space below 1MB get identity mapped. */
-    for ( i = rc = 0; i < MB1_PAGES; ++i )
-    {
-        p2m_type_t p2mt;
-        mfn_t mfn = get_gfn_query(d, i, &p2mt);
-
-        if ( mfn_eq(mfn, INVALID_MFN) )
-            rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
-        else
-            /*
-             * If the p2m entry is already set it must belong to a RMRR and
-             * already be identity mapped, or be a RAM region.
-             */
-            ASSERT(p2mt == p2m_ram_rw || mfn_eq(mfn, _mfn(i)));
-        put_gfn(d, i);
-        if ( rc )
-        {
-            printk("Failed to identity map PFN %x: %d\n", i, rc);
-            return rc;
-        }
-    }
-
     if ( cpu_has_vmx && paging_mode_hap(d) && !vmx_unrestricted_guest(v) )
     {
         /*
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index c85549ccad6e..857dccb6a465 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -400,13 +400,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
     max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
     top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
 
-    /*
-     * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
-     * setting up potentially conflicting mappings here.
-     */
-    start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
-
-    for ( i = pfn_to_pdx(start), count = 0; i < top; )
+    for ( i = 0, start = 0, count = 0; i < top; )
     {
         unsigned long pfn = pdx_to_pfn(i);
         unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
-- 
2.42.0


Re: [PATCH] x86/pvh: fix identity mapping of low 1MB
Posted by Andrew Cooper 6 months, 3 weeks ago
On 11/10/2023 11:37 pm, Roger Pau Monne wrote:
> The mapping of memory regions below the 1MB mark was all done by the PVH dom0
> builder code, thus completely avoiding that region in the arch-specific IOMMU
> hardware domain initialization code.

This took a while to parse.  I think it would be clearer to say "builder
code, causing the region to be avoided by the arch ..."

>   That lead to the IOMMU being enabled
> without reserved regions in the low 1MB identity mapped in the p2m for PVH
> hardware domains.  Firmware with missing RMRR/IVMD ranges that would otherwise
> be located in the low 1MB would transiently trigger IOMMU faults until the p2m
> is populated by the PVH dom0 builder:

"Firmware which happens to be missing RMRR/IVMD ranges describing E820
reserved regions in the low 1MB would ..." ?

> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
>
> Mostly remove the special handling of the low 1MB done by the PVH dom0 builder,
> leaving just the data copy between RAM regions.  Otherwise rely on the IOMMU
> arch init code to create any identity mappings for reserved regions in such
> range (like it already does for all reserved regions).

"in such ranges", or in this case "in that range" would be better.  Also
"for reserved regions elsewhere" IMO.

Just to confirm, we're saying our default treatment of identity mapping
e820 reserved regions into the IOMMU is masking (or not) a missing
RMRR/IVMD entry?

>
> Note there's a small difference in behavior, as holes in the low 1MB will no
> longer be identity mapped to the p2m.
>
> Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

I suppose you intended to mark this for 4.18 as you CC'd Henry, and also
send it for x86 (CC added)?

I'm tempted to commit it based on the diffstat alone.  How do we still
have so much junk code like this lying around breaking things...

Anyway - it's a clear improvement.

But a question first.  Is this from debugging the XSA-442 fallout?  If
so, it's probably worth mentioning the hardware we saw this on (which
IIRC was fairly old AMD), and that XSA-442 unmasked a pre-existing bug. 
And we think it's USB/PS2 emulation?

Thanks,

~Andrew

Re: [PATCH] x86/pvh: fix identity mapping of low 1MB
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Thu, Oct 12, 2023 at 12:48:20AM +0800, Andrew Cooper wrote:
> On 11/10/2023 11:37 pm, Roger Pau Monne wrote:
> > The mapping of memory regions below the 1MB mark was all done by the PVH dom0
> > builder code, thus completely avoiding that region in the arch-specific IOMMU
> > hardware domain initialization code.
> 
> This took a while to parse.  I think it would be clearer to say "builder
> code, causing the region to be avoided by the arch ..."

Yes, that's likely better, so:

"The mapping of memory regions below the 1MB mark was all done by the PVH dom0
builder code, causing the region to be avoided by the arch specific IOMMU
hardware domain initialization code."

> >   That lead to the IOMMU being enabled
> > without reserved regions in the low 1MB identity mapped in the p2m for PVH
> > hardware domains.  Firmware with missing RMRR/IVMD ranges that would otherwise
> > be located in the low 1MB would transiently trigger IOMMU faults until the p2m
> > is populated by the PVH dom0 builder:
> 
> "Firmware which happens to be missing RMRR/IVMD ranges describing E820
> reserved regions in the low 1MB would ..." ?

Will adjust.

> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
> >
> > Mostly remove the special handling of the low 1MB done by the PVH dom0 builder,
> > leaving just the data copy between RAM regions.  Otherwise rely on the IOMMU
> > arch init code to create any identity mappings for reserved regions in such
> > range (like it already does for all reserved regions).
> 
> "in such ranges", or in this case "in that range" would be better.  Also
> "for reserved regions elsewhere" IMO.
> 
> Just to confirm, we're saying our default treatment of identity mapping
> e820 reserved regions into the IOMMU is masking (or not) a missing
> RMRR/IVMD entry?

Yes, the provided tables are missing (some?) IVMD region(s) for the
USB devices (the 0xea and 0xeb pages at least, possibly others).  We
identity map reserved regions in the memory map for dom0 in order to
cover up for missing RMRR/IVMD regions, because it's quite common.

> >
> > Note there's a small difference in behavior, as holes in the low 1MB will no
> > longer be identity mapped to the p2m.
> >
> > Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> I suppose you intended to mark this for 4.18 as you CC'd Henry, and also
> send it for x86 (CC added)?

I didn't run add_maintainer on it, sorry.

> I'm tempted to commit it based on the diffstat alone.  How do we still
> have so much junk code like this lying around breaking things...
> 
> Anyway - it's a clear improvement.

I've got further improvements in this area, but not 4.18 material,
will try to post them soon anyway.

> But a question first.  Is this from debugging the XSA-442 fallout?  If
> so, it's probably worth mentioning the hardware we saw this on (which
> IIRC was fairly old AMD), and that XSA-442 unmasked a pre-existing bug. 
> And we think it's USB/PS2 emulation?

Yeah, can add this all.  It is triggered by pinot{0,1} which is AMD
Fam15h (AMD Opteron(tm) Processor 3350 HE).

Thanks, Roger.

Re: [PATCH] x86/pvh: fix identity mapping of low 1MB
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Fri, Oct 13, 2023 at 10:39:06AM +0200, Roger Pau Monné wrote:
> On Thu, Oct 12, 2023 at 12:48:20AM +0800, Andrew Cooper wrote:
> > On 11/10/2023 11:37 pm, Roger Pau Monne wrote:
> > But a question first.  Is this from debugging the XSA-442 fallout?  If
> > so, it's probably worth mentioning the hardware we saw this on (which
> > IIRC was fairly old AMD), and that XSA-442 unmasked a pre-existing bug. 
> > And we think it's USB/PS2 emulation?
> 
> Yeah, can add this all.  It is triggered by pinot{0,1} which is AMD
> Fam15h (AMD Opteron(tm) Processor 3350 HE).

Just re-checked this in order to be sure, and no, the IOMMU faults are
already there prior to XSA-442 for the dom0pvh tests, but they don't
cause issues to osstest, and hence we didn't realize.  I just came
across those as I was triaging a different issue and looking at the
serial log.

Hence I won't add any mentions to XSA-442 in the commit message.

Thanks, Roger.

Re: [PATCH] x86/pvh: fix identity mapping of low 1MB
Posted by Henry Wang 6 months, 3 weeks ago
Hi,

> On Oct 12, 2023, at 00:48, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> On 11/10/2023 11:37 pm, Roger Pau Monne wrote:
>> The mapping of memory regions below the 1MB mark was all done by the PVH dom0
>> builder code, thus completely avoiding that region in the arch-specific IOMMU
>> hardware domain initialization code.
> 
> This took a while to parse.  I think it would be clearer to say "builder
> code, causing the region to be avoided by the arch ..."
> 
>>  That lead to the IOMMU being enabled
>> without reserved regions in the low 1MB identity mapped in the p2m for PVH
>> hardware domains.  Firmware with missing RMRR/IVMD ranges that would otherwise
>> be located in the low 1MB would transiently trigger IOMMU faults until the p2m
>> is populated by the PVH dom0 builder:
> 
> "Firmware which happens to be missing RMRR/IVMD ranges describing E820
> reserved regions in the low 1MB would ..." ?
> 
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
>> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
>> 
>> Mostly remove the special handling of the low 1MB done by the PVH dom0 builder,
>> leaving just the data copy between RAM regions.  Otherwise rely on the IOMMU
>> arch init code to create any identity mappings for reserved regions in such
>> range (like it already does for all reserved regions).
> 
> "in such ranges", or in this case "in that range" would be better.  Also
> "for reserved regions elsewhere" IMO.
> 
> Just to confirm, we're saying our default treatment of identity mapping
> e820 reserved regions into the IOMMU is masking (or not) a missing
> RMRR/IVMD entry?
> 
>> 
>> Note there's a small difference in behavior, as holes in the low 1MB will no
>> longer be identity mapped to the p2m.
>> 
>> Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> I suppose you intended to mark this for 4.18 as you CC'd Henry, and also
> send it for x86 (CC added)?

Fine for me to include this bugfix,

Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry


> 
> I'm tempted to commit it based on the diffstat alone.  How do we still
> have so much junk code like this lying around breaking things...
> 
> Anyway - it's a clear improvement.
> 
> But a question first.  Is this from debugging the XSA-442 fallout?  If
> so, it's probably worth mentioning the hardware we saw this on (which
> IIRC was fairly old AMD), and that XSA-442 unmasked a pre-existing bug. 
> And we think it's USB/PS2 emulation?
> 
> Thanks,
> 
> ~Andrew