[PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB

Roger Pau Monne posted 1 patch 6 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20231017082907.14455-1-roger.pau@citrix.com
xen/arch/x86/hvm/dom0_build.c       | 6 +++---
xen/drivers/passthrough/x86/iommu.c | 8 +-------
2 files changed, 4 insertions(+), 10 deletions(-)
[PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB
Posted by Roger Pau Monne 6 months, 3 weeks ago
The mapping of memory regions below the 1MB mark was all done by the PVH dom0
builder code, causing the region to be avoided by the arch specific IOMMU
hardware domain initialization code.  That lead to the IOMMU being enabled
without reserved regions in the low 1MB identity mapped in the p2m for PVH
hardware domains.  Firmware which happens to be missing RMRR/IVMD ranges
describing E820 reserved regions in the low 1MB would transiently trigger IOMMU
faults until the p2m is populated by the PVH dom0 builder:

AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0

Those errors have been observed on the osstest pinot{0,1} boxes (AMD Fam15h
Opteron(tm) Processor 3350 HE).

Rely on the IOMMU arch init code to create any identity mappings for reserved
regions in the low 1MB range (like it already does for reserved regions
elsewhere), and leave the mapping of any holes to be performed by the dom0
builder code.

Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Leave the identity mapping of holes in the low 1MB.

Changes since v1:
 - Reword commit message.
---
 xen/arch/x86/hvm/dom0_build.c       | 6 +++---
 xen/drivers/passthrough/x86/iommu.c | 8 +-------
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index bc0e290db612..b8c27c1b1646 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -449,7 +449,7 @@ static int __init pvh_populate_p2m(struct domain *d)
         }
     }
 
-    /* Non-RAM regions of space below 1MB get identity mapped. */
+    /* Identity map everything below 1MB that's not already mapped. */
     for ( i = rc = 0; i < MB1_PAGES; ++i )
     {
         p2m_type_t p2mt;
@@ -459,8 +459,8 @@ static int __init pvh_populate_p2m(struct domain *d)
             rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
         else
             /*
-             * If the p2m entry is already set it must belong to a RMRR and
-             * already be identity mapped, or be a RAM region.
+             * If the p2m entry is already set it must belong to a RMRR/IVMD or
+             * reserved region and be identity mapped, or else be a RAM region.
              */
             ASSERT(p2mt == p2m_ram_rw || mfn_eq(mfn, _mfn(i)));
         put_gfn(d, i);
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index c85549ccad6e..857dccb6a465 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -400,13 +400,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
     max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
     top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
 
-    /*
-     * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
-     * setting up potentially conflicting mappings here.
-     */
-    start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
-
-    for ( i = pfn_to_pdx(start), count = 0; i < top; )
+    for ( i = 0, start = 0, count = 0; i < top; )
     {
         unsigned long pfn = pdx_to_pfn(i);
         unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
-- 
2.42.0


Re: [PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB
Posted by Jan Beulich 6 months, 2 weeks ago
On 17.10.2023 10:29, Roger Pau Monne wrote:
> The mapping of memory regions below the 1MB mark was all done by the PVH dom0
> builder code, causing the region to be avoided by the arch specific IOMMU
> hardware domain initialization code.  That lead to the IOMMU being enabled
> without reserved regions in the low 1MB identity mapped in the p2m for PVH
> hardware domains.  Firmware which happens to be missing RMRR/IVMD ranges
> describing E820 reserved regions in the low 1MB would transiently trigger IOMMU
> faults until the p2m is populated by the PVH dom0 builder:
> 
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
> 
> Those errors have been observed on the osstest pinot{0,1} boxes (AMD Fam15h
> Opteron(tm) Processor 3350 HE).
> 
> Rely on the IOMMU arch init code to create any identity mappings for reserved
> regions in the low 1MB range (like it already does for reserved regions
> elsewhere), and leave the mapping of any holes to be performed by the dom0
> builder code.
> 
> Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one suggestion:

> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -449,7 +449,7 @@ static int __init pvh_populate_p2m(struct domain *d)
>          }
>      }
>  
> -    /* Non-RAM regions of space below 1MB get identity mapped. */
> +    /* Identity map everything below 1MB that's not already mapped. */
>      for ( i = rc = 0; i < MB1_PAGES; ++i )
>      {
>          p2m_type_t p2mt;
> @@ -459,8 +459,8 @@ static int __init pvh_populate_p2m(struct domain *d)
>              rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
>          else
>              /*
> -             * If the p2m entry is already set it must belong to a RMRR and
> -             * already be identity mapped, or be a RAM region.
> +             * If the p2m entry is already set it must belong to a RMRR/IVMD or
> +             * reserved region and be identity mapped, or else be a RAM region.
>               */
>              ASSERT(p2mt == p2m_ram_rw || mfn_eq(mfn, _mfn(i)));

Would you mind wording the comment slightly differently, e.g.

"If the p2m entry is already set it must belong to a reserved region
 (e.g. RMRR/IVMD) and be identity mapped, or else be a RAM region."

This is because such RMRR/IVMD regions are required to be in reserved
ranges anyway.

Jan

Re: [PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB
Posted by Roger Pau Monné 6 months, 2 weeks ago
On Wed, Oct 18, 2023 at 05:11:58PM +0200, Jan Beulich wrote:
> On 17.10.2023 10:29, Roger Pau Monne wrote:
> > The mapping of memory regions below the 1MB mark was all done by the PVH dom0
> > builder code, causing the region to be avoided by the arch specific IOMMU
> > hardware domain initialization code.  That lead to the IOMMU being enabled
> > without reserved regions in the low 1MB identity mapped in the p2m for PVH
> > hardware domains.  Firmware which happens to be missing RMRR/IVMD ranges
> > describing E820 reserved regions in the low 1MB would transiently trigger IOMMU
> > faults until the p2m is populated by the PVH dom0 builder:
> > 
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
> > AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
> > 
> > Those errors have been observed on the osstest pinot{0,1} boxes (AMD Fam15h
> > Opteron(tm) Processor 3350 HE).
> > 
> > Rely on the IOMMU arch init code to create any identity mappings for reserved
> > regions in the low 1MB range (like it already does for reserved regions
> > elsewhere), and leave the mapping of any holes to be performed by the dom0
> > builder code.
> > 
> > Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with one suggestion:
> 
> > --- a/xen/arch/x86/hvm/dom0_build.c
> > +++ b/xen/arch/x86/hvm/dom0_build.c
> > @@ -449,7 +449,7 @@ static int __init pvh_populate_p2m(struct domain *d)
> >          }
> >      }
> >  
> > -    /* Non-RAM regions of space below 1MB get identity mapped. */
> > +    /* Identity map everything below 1MB that's not already mapped. */
> >      for ( i = rc = 0; i < MB1_PAGES; ++i )
> >      {
> >          p2m_type_t p2mt;
> > @@ -459,8 +459,8 @@ static int __init pvh_populate_p2m(struct domain *d)
> >              rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
> >          else
> >              /*
> > -             * If the p2m entry is already set it must belong to a RMRR and
> > -             * already be identity mapped, or be a RAM region.
> > +             * If the p2m entry is already set it must belong to a RMRR/IVMD or
> > +             * reserved region and be identity mapped, or else be a RAM region.
> >               */
> >              ASSERT(p2mt == p2m_ram_rw || mfn_eq(mfn, _mfn(i)));
> 
> Would you mind wording the comment slightly differently, e.g.
> 
> "If the p2m entry is already set it must belong to a reserved region
>  (e.g. RMRR/IVMD) and be identity mapped, or else be a RAM region."
> 
> This is because such RMRR/IVMD regions are required to be in reserved
> ranges anyway.

IIRC there's an option to provide extra RMRR/IVMD regions on the
command line, and those are not required to be on reserved regions?

Otherwise LGTM, so would you mind adjusting at commit?

Thanks, Roger.

Re: [PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB
Posted by Jan Beulich 6 months, 2 weeks ago
On 18.10.2023 18:12, Roger Pau Monné wrote:
> On Wed, Oct 18, 2023 at 05:11:58PM +0200, Jan Beulich wrote:
>> On 17.10.2023 10:29, Roger Pau Monne wrote:
>>> The mapping of memory regions below the 1MB mark was all done by the PVH dom0
>>> builder code, causing the region to be avoided by the arch specific IOMMU
>>> hardware domain initialization code.  That lead to the IOMMU being enabled
>>> without reserved regions in the low 1MB identity mapped in the p2m for PVH
>>> hardware domains.  Firmware which happens to be missing RMRR/IVMD ranges
>>> describing E820 reserved regions in the low 1MB would transiently trigger IOMMU
>>> faults until the p2m is populated by the PVH dom0 builder:
>>>
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
>>> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
>>>
>>> Those errors have been observed on the osstest pinot{0,1} boxes (AMD Fam15h
>>> Opteron(tm) Processor 3350 HE).
>>>
>>> Rely on the IOMMU arch init code to create any identity mappings for reserved
>>> regions in the low 1MB range (like it already does for reserved regions
>>> elsewhere), and leave the mapping of any holes to be performed by the dom0
>>> builder code.
>>>
>>> Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> with one suggestion:
>>
>>> --- a/xen/arch/x86/hvm/dom0_build.c
>>> +++ b/xen/arch/x86/hvm/dom0_build.c
>>> @@ -449,7 +449,7 @@ static int __init pvh_populate_p2m(struct domain *d)
>>>          }
>>>      }
>>>  
>>> -    /* Non-RAM regions of space below 1MB get identity mapped. */
>>> +    /* Identity map everything below 1MB that's not already mapped. */
>>>      for ( i = rc = 0; i < MB1_PAGES; ++i )
>>>      {
>>>          p2m_type_t p2mt;
>>> @@ -459,8 +459,8 @@ static int __init pvh_populate_p2m(struct domain *d)
>>>              rc = set_mmio_p2m_entry(d, _gfn(i), _mfn(i), PAGE_ORDER_4K);
>>>          else
>>>              /*
>>> -             * If the p2m entry is already set it must belong to a RMRR and
>>> -             * already be identity mapped, or be a RAM region.
>>> +             * If the p2m entry is already set it must belong to a RMRR/IVMD or
>>> +             * reserved region and be identity mapped, or else be a RAM region.
>>>               */
>>>              ASSERT(p2mt == p2m_ram_rw || mfn_eq(mfn, _mfn(i)));
>>
>> Would you mind wording the comment slightly differently, e.g.
>>
>> "If the p2m entry is already set it must belong to a reserved region
>>  (e.g. RMRR/IVMD) and be identity mapped, or else be a RAM region."
>>
>> This is because such RMRR/IVMD regions are required to be in reserved
>> ranges anyway.
> 
> IIRC there's an option to provide extra RMRR/IVMD regions on the
> command line, and those are not required to be on reserved regions?

On AMD we force-reserve such regions, and I think it is a mistake that
we don't on VT-d.

> Otherwise LGTM, so would you mind adjusting at commit?

Sure.

Jan

Re: [PATCH for-4.18 v3] x86/pvh: fix identity mapping of low 1MB
Posted by Henry Wang 6 months, 2 weeks ago
Hi Roger,

> On Oct 17, 2023, at 16:29, Roger Pau Monne <roger.pau@citrix.com> wrote:
> 
> The mapping of memory regions below the 1MB mark was all done by the PVH dom0
> builder code, causing the region to be avoided by the arch specific IOMMU
> hardware domain initialization code.  That lead to the IOMMU being enabled
> without reserved regions in the low 1MB identity mapped in the p2m for PVH
> hardware domains.  Firmware which happens to be missing RMRR/IVMD ranges
> describing E820 reserved regions in the low 1MB would transiently trigger IOMMU
> faults until the p2m is populated by the PVH dom0 builder:
> 
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb380 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.1 d0 addr 00000000000eb340 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:13.2 d0 addr 00000000000ea1c0 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb480 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb080 flags 0x20 RW
> AMD-Vi: IO_PAGE_FAULT: 0000:00:14.5 d0 addr 00000000000eb400 flags 0
> AMD-Vi: IO_PAGE_FAULT: 0000:00:12.0 d0 addr 00000000000eb040 flags 0
> 
> Those errors have been observed on the osstest pinot{0,1} boxes (AMD Fam15h
> Opteron(tm) Processor 3350 HE).
> 
> Rely on the IOMMU arch init code to create any identity mappings for reserved
> regions in the low 1MB range (like it already does for reserved regions
> elsewhere), and leave the mapping of any holes to be performed by the dom0
> builder code.
> 
> Fixes: 6b4f6a31ace1 ('x86/PVH: de-duplicate mappings for first Mb of Dom0 memory')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry