[PATCH v3] xen/pci: detect when BARs are not suitably positioned

Roger Pau Monne posted 1 patch 2 years, 3 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20220201124551.2392-1-roger.pau@citrix.com
There is a newer version of this series
xen/arch/x86/mm.c             | 17 +++++++++
xen/drivers/passthrough/pci.c | 71 ++++++++++++++++++++++++++++++++++-
xen/include/xen/mm.h          |  2 +
xen/include/xen/pci_regs.h    |  2 +
xen/include/xen/vpci.h        |  2 -
5 files changed, 91 insertions(+), 3 deletions(-)
[PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Roger Pau Monne 2 years, 3 months ago
One of the boxes where I was attempting to boot Xen in PVH dom0 mode
has quirky firmware, as it will handover with a PCI device with memory
decoding enabled and a BAR of size 4K at address 0. Such BAR overlaps
with a RAM range on the e820.

This interacts badly with the dom0 PVH build, as BARs will be setup on
the p2m before RAM, so if there's a BAR positioned over a RAM region
it will trigger a domain crash when the dom0 builder attempts to
populate that region with a regular RAM page.

It's in general a very bad idea to have a BAR overlapping with any
memory region defined in the memory map, so add some sanity checks for
devices that are added with memory decoding enabled in order to assure
that BARs are not placed on top of memory regions defined in the
memory map. If overlaps are detected just disable the memory decoding
bit for the device and expect the hardware domain to properly position
the BAR.

Note apply_quirks must be called before check_pdev so that ignore_bars
is set when calling the later. PCI_HEADER_{NORMAL,BRIDGE}_NR_BARS
needs to be moved into pci_regs.h so it's defined even in the absence
of vPCI.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Unify warning message and store in a static const var.
 - Rename checker function to is_memory_hole.
 - Pass an inclusive MFN range to the checker function.
 - Remove Arm implementation of is_memory_hole due to lack of
   feedback.

Changes since v1:
 - Add comment regarding pci_size_mem_bar failure.
 - Make e820entry const.
 - Move is_iomem_range after is_iomem_page.
 - Reword error message.
 - Make is_iomem_range paddr_t
 - Expand commit message.
 - Move PCI_HEADER_{NORMAL,BRIDGE}_NR_BARS.
 - Only attempt to read ROM BAR if rom_pos != 0.
---
 xen/arch/x86/mm.c             | 17 +++++++++
 xen/drivers/passthrough/pci.c | 71 ++++++++++++++++++++++++++++++++++-
 xen/include/xen/mm.h          |  2 +
 xen/include/xen/pci_regs.h    |  2 +
 xen/include/xen/vpci.h        |  2 -
 5 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1397f83e41..468efd8fb4 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
     return (page_get_owner(page) == dom_io);
 }
 
+bool is_memory_hole(unsigned long start, unsigned long end)
+{
+    unsigned int i;
+
+    for ( i = 0; i < e820.nr_map; i++ )
+    {
+        const struct e820entry *entry = &e820.map[i];
+
+        /* Do not allow overlaps with any memory range. */
+        if ( start < PFN_DOWN(entry->addr + entry->size) &&
+             PFN_DOWN(entry->addr) <= end )
+            return false;
+    }
+
+    return true;
+}
+
 static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
 {
     int err = 0;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 1fad80362f..9a5e6cf842 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -233,6 +233,9 @@ static void check_pdev(const struct pci_dev *pdev)
      PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \
      PCI_STATUS_SIG_SYSTEM_ERROR | PCI_STATUS_DETECTED_PARITY)
     u16 val;
+    unsigned int nbars = 0, rom_pos = 0, i;
+    static const char warn[] = XENLOG_WARNING
+        "%pp disabled: %sBAR [%#lx, %#lx] overlaps with memory map\n";
 
     if ( command_mask )
     {
@@ -251,6 +254,8 @@ static void check_pdev(const struct pci_dev *pdev)
     switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
     {
     case PCI_HEADER_TYPE_BRIDGE:
+        nbars = PCI_HEADER_BRIDGE_NR_BARS;
+        rom_pos = PCI_ROM_ADDRESS1;
         if ( !bridge_ctl_mask )
             break;
         val = pci_conf_read16(pdev->sbdf, PCI_BRIDGE_CONTROL);
@@ -267,11 +272,75 @@ static void check_pdev(const struct pci_dev *pdev)
         }
         break;
 
+    case PCI_HEADER_TYPE_NORMAL:
+        nbars = PCI_HEADER_NORMAL_NR_BARS;
+        rom_pos = PCI_ROM_ADDRESS;
+        break;
+
     case PCI_HEADER_TYPE_CARDBUS:
         /* TODO */
         break;
     }
 #undef PCI_STATUS_CHECK
+
+    /* Check if BARs overlap with other memory regions. */
+    val = pci_conf_read16(pdev->sbdf, PCI_COMMAND);
+    if ( !(val & PCI_COMMAND_MEMORY) || pdev->ignore_bars )
+        return;
+
+    pci_conf_write16(pdev->sbdf, PCI_COMMAND, val & ~PCI_COMMAND_MEMORY);
+    for ( i = 0; i < nbars; )
+    {
+        uint64_t addr, size;
+        unsigned int reg = PCI_BASE_ADDRESS_0 + i * 4;
+        int rc = 1;
+
+        if ( (pci_conf_read32(pdev->sbdf, reg) & PCI_BASE_ADDRESS_SPACE) !=
+             PCI_BASE_ADDRESS_SPACE_MEMORY )
+            goto next;
+
+        rc = pci_size_mem_bar(pdev->sbdf, reg, &addr, &size,
+                              (i == nbars - 1) ? PCI_BAR_LAST : 0);
+        if ( rc < 0 )
+            /* Unable to size, better leave memory decoding disabled. */
+            return;
+        if ( size && !is_memory_hole(PFN_DOWN(addr),
+                                     PFN_DOWN(addr + size - 1)) )
+        {
+            /*
+             * Return without enabling memory decoding if BAR position is not
+             * in IO suitable memory. Let the hardware domain re-position the
+             * BAR.
+             */
+            printk(warn,
+                   &pdev->sbdf, "", PFN_DOWN(addr), PFN_DOWN(addr + size - 1));
+            return;
+        }
+
+ next:
+        ASSERT(rc > 0);
+        i += rc;
+    }
+
+    if ( rom_pos &&
+         (pci_conf_read32(pdev->sbdf, rom_pos) & PCI_ROM_ADDRESS_ENABLE) )
+    {
+        uint64_t addr, size;
+        int rc = pci_size_mem_bar(pdev->sbdf, rom_pos, &addr, &size,
+                                  PCI_BAR_ROM);
+
+        if ( rc < 0 )
+            return;
+        if ( size && !is_memory_hole(PFN_DOWN(addr),
+                                     PFN_DOWN(addr + size - 1)) )
+        {
+            printk(warn, &pdev->sbdf, "ROM ", PFN_DOWN(addr),
+                   PFN_DOWN(addr + size - 1));
+            return;
+        }
+    }
+
+    pci_conf_write16(pdev->sbdf, PCI_COMMAND, val);
 }
 
 static void apply_quirks(struct pci_dev *pdev)
@@ -399,8 +468,8 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
             break;
     }
 
-    check_pdev(pdev);
     apply_quirks(pdev);
+    check_pdev(pdev);
 
     return pdev;
 }
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 5db26ed477..c434a53daa 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -554,6 +554,8 @@ int __must_check steal_page(struct domain *d, struct page_info *page,
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
 /* Returns the page type(s). */
 unsigned int page_get_ram_type(mfn_t mfn);
+/* Check if a range falls into a hole in the memory map. */
+bool is_memory_hole(paddr_t start, uint64_t size);
 
 /* Prepare/destroy a ring for a dom0 helper. Helper with talk
  * with Xen on behalf of this domain. */
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index cc4ee3b83e..ee8e82be36 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -88,6 +88,8 @@
  * 0xffffffff to the register, and reading it back.  Only
  * 1 bits are decoded.
  */
+#define PCI_HEADER_NORMAL_NR_BARS	6
+#define PCI_HEADER_BRIDGE_NR_BARS	2
 #define PCI_BASE_ADDRESS_0	0x10	/* 32 bits */
 #define PCI_BASE_ADDRESS_1	0x14	/* 32 bits [htype 0,1 only] */
 #define PCI_BASE_ADDRESS_2	0x18	/* 32 bits [htype 0 only] */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 3f32de9d7e..e8ac1eb395 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -80,8 +80,6 @@ struct vpci {
             bool prefetchable : 1;
             /* Store whether the BAR is mapped into guest p2m. */
             bool enabled      : 1;
-#define PCI_HEADER_NORMAL_NR_BARS        6
-#define PCI_HEADER_BRIDGE_NR_BARS        2
         } bars[PCI_HEADER_NORMAL_NR_BARS + 1];
         /* At most 6 BARS + 1 expansion ROM BAR. */
 
-- 
2.34.1


Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Jan Beulich 2 years, 3 months ago
On 01.02.2022 13:45, Roger Pau Monne wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
>      return (page_get_owner(page) == dom_io);
>  }
>  
> +bool is_memory_hole(unsigned long start, unsigned long end)
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < e820.nr_map; i++ )
> +    {
> +        const struct e820entry *entry = &e820.map[i];
> +
> +        /* Do not allow overlaps with any memory range. */
> +        if ( start < PFN_DOWN(entry->addr + entry->size) &&
> +             PFN_DOWN(entry->addr) <= end )
> +            return false;
> +    }
> +
> +    return true;
> +}

Doesn't the left side of the && need to use PFN_UP()?

I also think it would help if a brief comment ahead of the
function said that the range is inclusive. Otherwise the use
of < and >= gives the impression of something being wrong.
Then again it may be better to switch to <= anyway, as I
think you want to avoid possible zero-size regions (at which
point subtracting 1 for using <= is going to be valid).

Finally I wonder whether the function parameters wouldn't
better be named e.g. spfn and epfn, but maybe their units can
be inferred from their types being unsigned long (which,
however, would build on the assumption that we use appropriate
types everywhere).

> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -554,6 +554,8 @@ int __must_check steal_page(struct domain *d, struct page_info *page,
>  int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
>  /* Returns the page type(s). */
>  unsigned int page_get_ram_type(mfn_t mfn);
> +/* Check if a range falls into a hole in the memory map. */
> +bool is_memory_hole(paddr_t start, uint64_t size);

While resolving to the same type, these now also want to be
"unsigned long".

Jan


Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Julien Grall 2 years, 3 months ago
Hi,

On 02/02/2022 09:42, Jan Beulich wrote:
> On 01.02.2022 13:45, Roger Pau Monne wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
>>       return (page_get_owner(page) == dom_io);
>>   }
>>   
>> +bool is_memory_hole(unsigned long start, unsigned long end)
>> +{
>> +    unsigned int i;
>> +
>> +    for ( i = 0; i < e820.nr_map; i++ )
>> +    {
>> +        const struct e820entry *entry = &e820.map[i];
>> +
>> +        /* Do not allow overlaps with any memory range. */
>> +        if ( start < PFN_DOWN(entry->addr + entry->size) &&
>> +             PFN_DOWN(entry->addr) <= end )
>> +            return false;
>> +    }
>> +
>> +    return true;
>> +}
> 
> Doesn't the left side of the && need to use PFN_UP()?
> 
> I also think it would help if a brief comment ahead of the
> function said that the range is inclusive. Otherwise the use
> of < and >= gives the impression of something being wrong.
> Then again it may be better to switch to <= anyway, as I
> think you want to avoid possible zero-size regions (at which
> point subtracting 1 for using <= is going to be valid).
> 
> Finally I wonder whether the function parameters wouldn't
> better be named e.g. spfn and epfn, but maybe their units can
> be inferred from their types being unsigned long (which,
> however, would build on the assumption that we use appropriate
> types everywhere).

I think this a case where mfn_t would be useful to use.

> 
>> --- a/xen/include/xen/mm.h
>> +++ b/xen/include/xen/mm.h
>> @@ -554,6 +554,8 @@ int __must_check steal_page(struct domain *d, struct page_info *page,
>>   int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
>>   /* Returns the page type(s). */
>>   unsigned int page_get_ram_type(mfn_t mfn);
>> +/* Check if a range falls into a hole in the memory map. */
>> +bool is_memory_hole(paddr_t start, uint64_t size);
> 
> While resolving to the same type, these now also want to be
> "unsigned long".
> 
> Jan
> 

Cheers,

-- 
Julien Grall

Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Jan Beulich 2 years, 3 months ago
On 02.02.2022 10:57, Julien Grall wrote:
> On 02/02/2022 09:42, Jan Beulich wrote:
>> On 01.02.2022 13:45, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
>>>       return (page_get_owner(page) == dom_io);
>>>   }
>>>   
>>> +bool is_memory_hole(unsigned long start, unsigned long end)
>>> +{
>>> +    unsigned int i;
>>> +
>>> +    for ( i = 0; i < e820.nr_map; i++ )
>>> +    {
>>> +        const struct e820entry *entry = &e820.map[i];
>>> +
>>> +        /* Do not allow overlaps with any memory range. */
>>> +        if ( start < PFN_DOWN(entry->addr + entry->size) &&
>>> +             PFN_DOWN(entry->addr) <= end )
>>> +            return false;
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>
>> Doesn't the left side of the && need to use PFN_UP()?
>>
>> I also think it would help if a brief comment ahead of the
>> function said that the range is inclusive. Otherwise the use
>> of < and >= gives the impression of something being wrong.
>> Then again it may be better to switch to <= anyway, as I
>> think you want to avoid possible zero-size regions (at which
>> point subtracting 1 for using <= is going to be valid).
>>
>> Finally I wonder whether the function parameters wouldn't
>> better be named e.g. spfn and epfn, but maybe their units can
>> be inferred from their types being unsigned long (which,
>> however, would build on the assumption that we use appropriate
>> types everywhere).
> 
> I think this a case where mfn_t would be useful to use.

Actually I did consider to suggest it when asking to convert
to frame numbers, and specifically didn't because its use will
clutter the code here quite a bit. Which isn't to mean I'd
object if the adjustment was made ...

Jan


Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Julien Grall 2 years, 3 months ago
Hi Jan,

On 02/02/2022 10:05, Jan Beulich wrote:
> On 02.02.2022 10:57, Julien Grall wrote:
>> On 02/02/2022 09:42, Jan Beulich wrote:
>>> On 01.02.2022 13:45, Roger Pau Monne wrote:
>>>> --- a/xen/arch/x86/mm.c
>>>> +++ b/xen/arch/x86/mm.c
>>>> @@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
>>>>        return (page_get_owner(page) == dom_io);
>>>>    }
>>>>    
>>>> +bool is_memory_hole(unsigned long start, unsigned long end)
>>>> +{
>>>> +    unsigned int i;
>>>> +
>>>> +    for ( i = 0; i < e820.nr_map; i++ )
>>>> +    {
>>>> +        const struct e820entry *entry = &e820.map[i];
>>>> +
>>>> +        /* Do not allow overlaps with any memory range. */
>>>> +        if ( start < PFN_DOWN(entry->addr + entry->size) &&
>>>> +             PFN_DOWN(entry->addr) <= end )
>>>> +            return false;
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>
>>> Doesn't the left side of the && need to use PFN_UP()?
>>>
>>> I also think it would help if a brief comment ahead of the
>>> function said that the range is inclusive. Otherwise the use
>>> of < and >= gives the impression of something being wrong.
>>> Then again it may be better to switch to <= anyway, as I
>>> think you want to avoid possible zero-size regions (at which
>>> point subtracting 1 for using <= is going to be valid).
>>>
>>> Finally I wonder whether the function parameters wouldn't
>>> better be named e.g. spfn and epfn, but maybe their units can
>>> be inferred from their types being unsigned long (which,
>>> however, would build on the assumption that we use appropriate
>>> types everywhere).
>>
>> I think this a case where mfn_t would be useful to use.
> 
> Actually I did consider to suggest it when asking to convert
> to frame numbers, and specifically didn't because its use will
> clutter the code here quite a bit. Which isn't to mean I'd
> object if the adjustment was made ...
I thought about code cluterring, but there are two use
of the variables. So it would not look too bad.

But I care more about the external interface to be typesafe. So an 
alternative would be:

unsigned long smfn_ = mfn_x();
unsigned long emfn_ = mfn_x();

Cheers,

-- 
Julien Grall

Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Roger Pau Monné 2 years, 3 months ago
On Wed, Feb 02, 2022 at 10:42:22AM +0100, Jan Beulich wrote:
> On 01.02.2022 13:45, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -783,6 +783,23 @@ bool is_iomem_page(mfn_t mfn)
> >      return (page_get_owner(page) == dom_io);
> >  }
> >  
> > +bool is_memory_hole(unsigned long start, unsigned long end)
> > +{
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < e820.nr_map; i++ )
> > +    {
> > +        const struct e820entry *entry = &e820.map[i];
> > +
> > +        /* Do not allow overlaps with any memory range. */
> > +        if ( start < PFN_DOWN(entry->addr + entry->size) &&
> > +             PFN_DOWN(entry->addr) <= end )
> > +            return false;
> > +    }
> > +
> > +    return true;
> > +}
> 
> Doesn't the left side of the && need to use PFN_UP()?

Hm, I had is using PFN_UP before and switched to PFN_DOWN for some
weird reasoning.

> 
> I also think it would help if a brief comment ahead of the
> function said that the range is inclusive. Otherwise the use
> of < and >= gives the impression of something being wrong.
> Then again it may be better to switch to <= anyway, as I
> think you want to avoid possible zero-size regions (at which
> point subtracting 1 for using <= is going to be valid).

Right, so that would end up being:

start <= PFN_DOWN(entry->addr + entry->size - 1) &&

Rejecting entries with size == 0 beforehand.

> Finally I wonder whether the function parameters wouldn't
> better be named e.g. spfn and epfn, but maybe their units can
> be inferred from their types being unsigned long (which,
> however, would build on the assumption that we use appropriate
> types everywhere).

I guess I should switch to using mfn_t for the types and convert them
locally to unsigned long for the comparisons.

> > --- a/xen/include/xen/mm.h
> > +++ b/xen/include/xen/mm.h
> > @@ -554,6 +554,8 @@ int __must_check steal_page(struct domain *d, struct page_info *page,
> >  int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
> >  /* Returns the page type(s). */
> >  unsigned int page_get_ram_type(mfn_t mfn);
> > +/* Check if a range falls into a hole in the memory map. */
> > +bool is_memory_hole(paddr_t start, uint64_t size);
> 
> While resolving to the same type, these now also want to be
> "unsigned long".

Doh, yes, sorry. Will convert them to mfn_t if we agree on that.

Thanks, Roger.

Re: [PATCH v3] xen/pci: detect when BARs are not suitably positioned
Posted by Jan Beulich 2 years, 3 months ago
On 02.02.2022 13:13, Roger Pau Monné wrote:
> On Wed, Feb 02, 2022 at 10:42:22AM +0100, Jan Beulich wrote:
>> On 01.02.2022 13:45, Roger Pau Monne wrote:
>>> --- a/xen/include/xen/mm.h
>>> +++ b/xen/include/xen/mm.h
>>> @@ -554,6 +554,8 @@ int __must_check steal_page(struct domain *d, struct page_info *page,
>>>  int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
>>>  /* Returns the page type(s). */
>>>  unsigned int page_get_ram_type(mfn_t mfn);
>>> +/* Check if a range falls into a hole in the memory map. */
>>> +bool is_memory_hole(paddr_t start, uint64_t size);
>>
>> While resolving to the same type, these now also want to be
>> "unsigned long".
> 
> Doh, yes, sorry. Will convert them to mfn_t if we agree on that.

As said in reply to Julien - I don't mind the change, but in this
particular case I also don't view it as strictly necessary / useful.

Jan