[PATCH] iommu/vtd: fix address translation for superpages

Roger Pau Monne posted 1 patch 12 months ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20230509104146.61178-1-roger.pau@citrix.com
There is a newer version of this series
xen/drivers/passthrough/vtd/iommu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
[PATCH] iommu/vtd: fix address translation for superpages
Posted by Roger Pau Monne 12 months ago
When translating an address that falls inside of a superpage in the
IOMMU page tables the fetching of the PTE physical address field
wasn't using dma_pte_addr(), which caused the returned data to be
corrupt as it would contain bits not related to the address field.

Fix this by re-using the value of pte_maddr which is obtained using
dma_pte_addr().  Such change requires adjusting the previous error
path to zero pte_maddr.

Fixes: c71e55501a61 ('VT-d: have callers specify the target level for page table walks')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/drivers/passthrough/vtd/iommu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 130a159cde07..819e996e6269 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
 
             if ( !alloc )
             {
-                pte_maddr = 0;
                 if ( !dma_pte_present(*pte) )
+                {
+                    pte_maddr = 0;
                     break;
+                }
 
                 /*
                  * When the leaf entry was requested, pass back the full PTE,
                  * with the address adjusted to account for the residual of
                  * the walk.
                  */
-                pte_maddr = pte->val +
+                pte_maddr +=
                     (addr & ((1UL << level_to_offset_bits(level)) - 1) &
                      PAGE_MASK);
                 if ( !target )
-- 
2.40.0


Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Jan Beulich 12 months ago
On 09.05.2023 12:41, Roger Pau Monne wrote:
> When translating an address that falls inside of a superpage in the
> IOMMU page tables the fetching of the PTE physical address field
> wasn't using dma_pte_addr(), which caused the returned data to be
> corrupt as it would contain bits not related to the address field.

I'm afraid I don't understand:

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
>  
>              if ( !alloc )
>              {
> -                pte_maddr = 0;
>                  if ( !dma_pte_present(*pte) )
> +                {
> +                    pte_maddr = 0;
>                      break;
> +                }
>  
>                  /*
>                   * When the leaf entry was requested, pass back the full PTE,
>                   * with the address adjusted to account for the residual of
>                   * the walk.
>                   */
> -                pte_maddr = pte->val +
> +                pte_maddr +=
>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
>                       PAGE_MASK);

With this change you're now violating what the comment says (plus what
the comment ahead of the function says). And it says what it says for
a reason - see intel_iommu_lookup_page(), which I think your change is
breaking.

Note also the following code:

                if ( !target )
                    break;
            }

            pte_maddr = level - 1;

IOW the local variable is overwritten right away unless target == 0.

Jan
Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Roger Pau Monné 11 months, 4 weeks ago
On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
> On 09.05.2023 12:41, Roger Pau Monne wrote:
> > When translating an address that falls inside of a superpage in the
> > IOMMU page tables the fetching of the PTE physical address field
> > wasn't using dma_pte_addr(), which caused the returned data to be
> > corrupt as it would contain bits not related to the address field.
> 
> I'm afraid I don't understand:
> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
> >  
> >              if ( !alloc )
> >              {
> > -                pte_maddr = 0;
> >                  if ( !dma_pte_present(*pte) )
> > +                {
> > +                    pte_maddr = 0;
> >                      break;
> > +                }
> >  
> >                  /*
> >                   * When the leaf entry was requested, pass back the full PTE,
> >                   * with the address adjusted to account for the residual of
> >                   * the walk.
> >                   */
> > -                pte_maddr = pte->val +
> > +                pte_maddr +=
> >                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
> >                       PAGE_MASK);
> 
> With this change you're now violating what the comment says (plus what
> the comment ahead of the function says). And it says what it says for
> a reason - see intel_iommu_lookup_page(), which I think your change is
> breaking.

Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
masking to remove the bits above PADDR_BITS.

Thanks, Roger.
Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Jan Beulich 11 months, 4 weeks ago
On 10.05.2023 10:27, Roger Pau Monné wrote:
> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
>> On 09.05.2023 12:41, Roger Pau Monne wrote:
>>> When translating an address that falls inside of a superpage in the
>>> IOMMU page tables the fetching of the PTE physical address field
>>> wasn't using dma_pte_addr(), which caused the returned data to be
>>> corrupt as it would contain bits not related to the address field.
>>
>> I'm afraid I don't understand:
>>
>>> --- a/xen/drivers/passthrough/vtd/iommu.c
>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
>>>  
>>>              if ( !alloc )
>>>              {
>>> -                pte_maddr = 0;
>>>                  if ( !dma_pte_present(*pte) )
>>> +                {
>>> +                    pte_maddr = 0;
>>>                      break;
>>> +                }
>>>  
>>>                  /*
>>>                   * When the leaf entry was requested, pass back the full PTE,
>>>                   * with the address adjusted to account for the residual of
>>>                   * the walk.
>>>                   */
>>> -                pte_maddr = pte->val +
>>> +                pte_maddr +=
>>>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
>>>                       PAGE_MASK);
>>
>> With this change you're now violating what the comment says (plus what
>> the comment ahead of the function says). And it says what it says for
>> a reason - see intel_iommu_lookup_page(), which I think your change is
>> breaking.
> 
> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
> doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
> masking to remove the bits above PADDR_BITS.

Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page()
then. (It would likely be better anyway to switch "uint64_t val" to
"struct dma_pte pte" there, to make more visible that it's a PTE we're
dealing with.) I indeed overlooked this aspect when doing the earlier
change.

Jan

Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Roger Pau Monné 11 months, 4 weeks ago
On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote:
> On 10.05.2023 10:27, Roger Pau Monné wrote:
> > On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
> >> On 09.05.2023 12:41, Roger Pau Monne wrote:
> >>> When translating an address that falls inside of a superpage in the
> >>> IOMMU page tables the fetching of the PTE physical address field
> >>> wasn't using dma_pte_addr(), which caused the returned data to be
> >>> corrupt as it would contain bits not related to the address field.
> >>
> >> I'm afraid I don't understand:
> >>
> >>> --- a/xen/drivers/passthrough/vtd/iommu.c
> >>> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
> >>>  
> >>>              if ( !alloc )
> >>>              {
> >>> -                pte_maddr = 0;
> >>>                  if ( !dma_pte_present(*pte) )
> >>> +                {
> >>> +                    pte_maddr = 0;
> >>>                      break;
> >>> +                }
> >>>  
> >>>                  /*
> >>>                   * When the leaf entry was requested, pass back the full PTE,
> >>>                   * with the address adjusted to account for the residual of
> >>>                   * the walk.
> >>>                   */
> >>> -                pte_maddr = pte->val +
> >>> +                pte_maddr +=
> >>>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
> >>>                       PAGE_MASK);
> >>
> >> With this change you're now violating what the comment says (plus what
> >> the comment ahead of the function says). And it says what it says for
> >> a reason - see intel_iommu_lookup_page(), which I think your change is
> >> breaking.
> > 
> > Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
> > the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
> > doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
> > masking to remove the bits above PADDR_BITS.
> 
> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page()
> then. (It would likely be better anyway to switch "uint64_t val" to
> "struct dma_pte pte" there, to make more visible that it's a PTE we're
> dealing with.) I indeed overlooked this aspect when doing the earlier
> change.

I guess I'm still confused, as the other return value for target == 0
(when the address is not part of a superpage) does return
dma_pte_addr(pte).  I think that needs further fixing then.

Thanks, Roger.

Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Jan Beulich 11 months, 4 weeks ago
On 10.05.2023 12:22, Roger Pau Monné wrote:
> On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote:
>> On 10.05.2023 10:27, Roger Pau Monné wrote:
>>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
>>>> On 09.05.2023 12:41, Roger Pau Monne wrote:
>>>>> When translating an address that falls inside of a superpage in the
>>>>> IOMMU page tables the fetching of the PTE physical address field
>>>>> wasn't using dma_pte_addr(), which caused the returned data to be
>>>>> corrupt as it would contain bits not related to the address field.
>>>>
>>>> I'm afraid I don't understand:
>>>>
>>>>> --- a/xen/drivers/passthrough/vtd/iommu.c
>>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
>>>>>  
>>>>>              if ( !alloc )
>>>>>              {
>>>>> -                pte_maddr = 0;
>>>>>                  if ( !dma_pte_present(*pte) )
>>>>> +                {
>>>>> +                    pte_maddr = 0;
>>>>>                      break;
>>>>> +                }
>>>>>  
>>>>>                  /*
>>>>>                   * When the leaf entry was requested, pass back the full PTE,
>>>>>                   * with the address adjusted to account for the residual of
>>>>>                   * the walk.
>>>>>                   */
>>>>> -                pte_maddr = pte->val +
>>>>> +                pte_maddr +=
>>>>>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
>>>>>                       PAGE_MASK);
>>>>
>>>> With this change you're now violating what the comment says (plus what
>>>> the comment ahead of the function says). And it says what it says for
>>>> a reason - see intel_iommu_lookup_page(), which I think your change is
>>>> breaking.
>>>
>>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
>>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
>>> doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
>>> masking to remove the bits above PADDR_BITS.
>>
>> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page()
>> then. (It would likely be better anyway to switch "uint64_t val" to
>> "struct dma_pte pte" there, to make more visible that it's a PTE we're
>> dealing with.) I indeed overlooked this aspect when doing the earlier
>> change.
> 
> I guess I'm still confused, as the other return value for target == 0
> (when the address is not part of a superpage) does return
> dma_pte_addr(pte).  I think that needs further fixing then.

Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr()
also does one too many iterations in that case. All "normal" callers
supply a positive "target". We need to terminate the walk at level 1
also when target == 0.

Jan

Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Roger Pau Monné 11 months, 4 weeks ago
On Wed, May 10, 2023 at 03:30:21PM +0200, Jan Beulich wrote:
> On 10.05.2023 12:22, Roger Pau Monné wrote:
> > On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote:
> >> On 10.05.2023 10:27, Roger Pau Monné wrote:
> >>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
> >>>> On 09.05.2023 12:41, Roger Pau Monne wrote:
> >>>>> When translating an address that falls inside of a superpage in the
> >>>>> IOMMU page tables the fetching of the PTE physical address field
> >>>>> wasn't using dma_pte_addr(), which caused the returned data to be
> >>>>> corrupt as it would contain bits not related to the address field.
> >>>>
> >>>> I'm afraid I don't understand:
> >>>>
> >>>>> --- a/xen/drivers/passthrough/vtd/iommu.c
> >>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
> >>>>>  
> >>>>>              if ( !alloc )
> >>>>>              {
> >>>>> -                pte_maddr = 0;
> >>>>>                  if ( !dma_pte_present(*pte) )
> >>>>> +                {
> >>>>> +                    pte_maddr = 0;
> >>>>>                      break;
> >>>>> +                }
> >>>>>  
> >>>>>                  /*
> >>>>>                   * When the leaf entry was requested, pass back the full PTE,
> >>>>>                   * with the address adjusted to account for the residual of
> >>>>>                   * the walk.
> >>>>>                   */
> >>>>> -                pte_maddr = pte->val +
> >>>>> +                pte_maddr +=
> >>>>>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
> >>>>>                       PAGE_MASK);
> >>>>
> >>>> With this change you're now violating what the comment says (plus what
> >>>> the comment ahead of the function says). And it says what it says for
> >>>> a reason - see intel_iommu_lookup_page(), which I think your change is
> >>>> breaking.
> >>>
> >>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
> >>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
> >>> doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
> >>> masking to remove the bits above PADDR_BITS.
> >>
> >> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page()
> >> then. (It would likely be better anyway to switch "uint64_t val" to
> >> "struct dma_pte pte" there, to make more visible that it's a PTE we're
> >> dealing with.) I indeed overlooked this aspect when doing the earlier
> >> change.
> > 
> > I guess I'm still confused, as the other return value for target == 0
> > (when the address is not part of a superpage) does return
> > dma_pte_addr(pte).  I think that needs further fixing then.
> 
> Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr()
> also does one too many iterations in that case. All "normal" callers
> supply a positive "target". We need to terminate the walk at level 1
> also when target == 0.

Don't we do that already due to the following check:

if ( --level == target )
    break;

Which prevents mapping the PTE address as a page table directory?

Thanks, Roger.

Re: [PATCH] iommu/vtd: fix address translation for superpages
Posted by Jan Beulich 11 months, 4 weeks ago
On 10.05.2023 17:19, Roger Pau Monné wrote:
> On Wed, May 10, 2023 at 03:30:21PM +0200, Jan Beulich wrote:
>> On 10.05.2023 12:22, Roger Pau Monné wrote:
>>> On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote:
>>>> On 10.05.2023 10:27, Roger Pau Monné wrote:
>>>>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote:
>>>>>> On 09.05.2023 12:41, Roger Pau Monne wrote:
>>>>>>> When translating an address that falls inside of a superpage in the
>>>>>>> IOMMU page tables the fetching of the PTE physical address field
>>>>>>> wasn't using dma_pte_addr(), which caused the returned data to be
>>>>>>> corrupt as it would contain bits not related to the address field.
>>>>>>
>>>>>> I'm afraid I don't understand:
>>>>>>
>>>>>>> --- a/xen/drivers/passthrough/vtd/iommu.c
>>>>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>>>>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
>>>>>>>  
>>>>>>>              if ( !alloc )
>>>>>>>              {
>>>>>>> -                pte_maddr = 0;
>>>>>>>                  if ( !dma_pte_present(*pte) )
>>>>>>> +                {
>>>>>>> +                    pte_maddr = 0;
>>>>>>>                      break;
>>>>>>> +                }
>>>>>>>  
>>>>>>>                  /*
>>>>>>>                   * When the leaf entry was requested, pass back the full PTE,
>>>>>>>                   * with the address adjusted to account for the residual of
>>>>>>>                   * the walk.
>>>>>>>                   */
>>>>>>> -                pte_maddr = pte->val +
>>>>>>> +                pte_maddr +=
>>>>>>>                      (addr & ((1UL << level_to_offset_bits(level)) - 1) &
>>>>>>>                       PAGE_MASK);
>>>>>>
>>>>>> With this change you're now violating what the comment says (plus what
>>>>>> the comment ahead of the function says). And it says what it says for
>>>>>> a reason - see intel_iommu_lookup_page(), which I think your change is
>>>>>> breaking.
>>>>>
>>>>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes
>>>>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when
>>>>> doing the conversion to mfn?  maddr_to_mfn() doesn't perform a any
>>>>> masking to remove the bits above PADDR_BITS.
>>>>
>>>> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page()
>>>> then. (It would likely be better anyway to switch "uint64_t val" to
>>>> "struct dma_pte pte" there, to make more visible that it's a PTE we're
>>>> dealing with.) I indeed overlooked this aspect when doing the earlier
>>>> change.
>>>
>>> I guess I'm still confused, as the other return value for target == 0
>>> (when the address is not part of a superpage) does return
>>> dma_pte_addr(pte).  I think that needs further fixing then.
>>
>> Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr()
>> also does one too many iterations in that case. All "normal" callers
>> supply a positive "target". We need to terminate the walk at level 1
>> also when target == 0.
> 
> Don't we do that already due to the following check:
> 
> if ( --level == target )
>     break;
> 
> Which prevents mapping the PTE address as a page table directory?

I don't think this is enough - this code, afaict right now, is only
sufficient when target >= 1.

Jan