xen/drivers/passthrough/vtd/iommu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
When translating an address that falls inside of a superpage in the
IOMMU page tables the fetching of the PTE physical address field
wasn't using dma_pte_addr(), which caused the returned data to be
corrupt as it would contain bits not related to the address field.
Fix this by re-using the value of pte_maddr which is obtained using
dma_pte_addr(). Such change requires adjusting the previous error
path to zero pte_maddr.
Fixes: c71e55501a61 ('VT-d: have callers specify the target level for page table walks')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/drivers/passthrough/vtd/iommu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 130a159cde07..819e996e6269 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr,
if ( !alloc )
{
- pte_maddr = 0;
if ( !dma_pte_present(*pte) )
+ {
+ pte_maddr = 0;
break;
+ }
/*
* When the leaf entry was requested, pass back the full PTE,
* with the address adjusted to account for the residual of
* the walk.
*/
- pte_maddr = pte->val +
+ pte_maddr +=
(addr & ((1UL << level_to_offset_bits(level)) - 1) &
PAGE_MASK);
if ( !target )
--
2.40.0
On 09.05.2023 12:41, Roger Pau Monne wrote: > When translating an address that falls inside of a superpage in the > IOMMU page tables the fetching of the PTE physical address field > wasn't using dma_pte_addr(), which caused the returned data to be > corrupt as it would contain bits not related to the address field. I'm afraid I don't understand: > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, > > if ( !alloc ) > { > - pte_maddr = 0; > if ( !dma_pte_present(*pte) ) > + { > + pte_maddr = 0; > break; > + } > > /* > * When the leaf entry was requested, pass back the full PTE, > * with the address adjusted to account for the residual of > * the walk. > */ > - pte_maddr = pte->val + > + pte_maddr += > (addr & ((1UL << level_to_offset_bits(level)) - 1) & > PAGE_MASK); With this change you're now violating what the comment says (plus what the comment ahead of the function says). And it says what it says for a reason - see intel_iommu_lookup_page(), which I think your change is breaking. Note also the following code: if ( !target ) break; } pte_maddr = level - 1; IOW the local variable is overwritten right away unless target == 0. Jan
On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: > On 09.05.2023 12:41, Roger Pau Monne wrote: > > When translating an address that falls inside of a superpage in the > > IOMMU page tables the fetching of the PTE physical address field > > wasn't using dma_pte_addr(), which caused the returned data to be > > corrupt as it would contain bits not related to the address field. > > I'm afraid I don't understand: > > > --- a/xen/drivers/passthrough/vtd/iommu.c > > +++ b/xen/drivers/passthrough/vtd/iommu.c > > @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, > > > > if ( !alloc ) > > { > > - pte_maddr = 0; > > if ( !dma_pte_present(*pte) ) > > + { > > + pte_maddr = 0; > > break; > > + } > > > > /* > > * When the leaf entry was requested, pass back the full PTE, > > * with the address adjusted to account for the residual of > > * the walk. > > */ > > - pte_maddr = pte->val + > > + pte_maddr += > > (addr & ((1UL << level_to_offset_bits(level)) - 1) & > > PAGE_MASK); > > With this change you're now violating what the comment says (plus what > the comment ahead of the function says). And it says what it says for > a reason - see intel_iommu_lookup_page(), which I think your change is > breaking. Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes the bits in DMA_PTE_CONTIG_MASK as part of the physical address when doing the conversion to mfn? maddr_to_mfn() doesn't perform a any masking to remove the bits above PADDR_BITS. Thanks, Roger.
On 10.05.2023 10:27, Roger Pau Monné wrote: > On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: >> On 09.05.2023 12:41, Roger Pau Monne wrote: >>> When translating an address that falls inside of a superpage in the >>> IOMMU page tables the fetching of the PTE physical address field >>> wasn't using dma_pte_addr(), which caused the returned data to be >>> corrupt as it would contain bits not related to the address field. >> >> I'm afraid I don't understand: >> >>> --- a/xen/drivers/passthrough/vtd/iommu.c >>> +++ b/xen/drivers/passthrough/vtd/iommu.c >>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, >>> >>> if ( !alloc ) >>> { >>> - pte_maddr = 0; >>> if ( !dma_pte_present(*pte) ) >>> + { >>> + pte_maddr = 0; >>> break; >>> + } >>> >>> /* >>> * When the leaf entry was requested, pass back the full PTE, >>> * with the address adjusted to account for the residual of >>> * the walk. >>> */ >>> - pte_maddr = pte->val + >>> + pte_maddr += >>> (addr & ((1UL << level_to_offset_bits(level)) - 1) & >>> PAGE_MASK); >> >> With this change you're now violating what the comment says (plus what >> the comment ahead of the function says). And it says what it says for >> a reason - see intel_iommu_lookup_page(), which I think your change is >> breaking. > > Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes > the bits in DMA_PTE_CONTIG_MASK as part of the physical address when > doing the conversion to mfn? maddr_to_mfn() doesn't perform a any > masking to remove the bits above PADDR_BITS. Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page() then. (It would likely be better anyway to switch "uint64_t val" to "struct dma_pte pte" there, to make more visible that it's a PTE we're dealing with.) I indeed overlooked this aspect when doing the earlier change. Jan
On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote: > On 10.05.2023 10:27, Roger Pau Monné wrote: > > On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: > >> On 09.05.2023 12:41, Roger Pau Monne wrote: > >>> When translating an address that falls inside of a superpage in the > >>> IOMMU page tables the fetching of the PTE physical address field > >>> wasn't using dma_pte_addr(), which caused the returned data to be > >>> corrupt as it would contain bits not related to the address field. > >> > >> I'm afraid I don't understand: > >> > >>> --- a/xen/drivers/passthrough/vtd/iommu.c > >>> +++ b/xen/drivers/passthrough/vtd/iommu.c > >>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, > >>> > >>> if ( !alloc ) > >>> { > >>> - pte_maddr = 0; > >>> if ( !dma_pte_present(*pte) ) > >>> + { > >>> + pte_maddr = 0; > >>> break; > >>> + } > >>> > >>> /* > >>> * When the leaf entry was requested, pass back the full PTE, > >>> * with the address adjusted to account for the residual of > >>> * the walk. > >>> */ > >>> - pte_maddr = pte->val + > >>> + pte_maddr += > >>> (addr & ((1UL << level_to_offset_bits(level)) - 1) & > >>> PAGE_MASK); > >> > >> With this change you're now violating what the comment says (plus what > >> the comment ahead of the function says). And it says what it says for > >> a reason - see intel_iommu_lookup_page(), which I think your change is > >> breaking. > > > > Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes > > the bits in DMA_PTE_CONTIG_MASK as part of the physical address when > > doing the conversion to mfn? maddr_to_mfn() doesn't perform a any > > masking to remove the bits above PADDR_BITS. > > Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page() > then. (It would likely be better anyway to switch "uint64_t val" to > "struct dma_pte pte" there, to make more visible that it's a PTE we're > dealing with.) I indeed overlooked this aspect when doing the earlier > change. I guess I'm still confused, as the other return value for target == 0 (when the address is not part of a superpage) does return dma_pte_addr(pte). I think that needs further fixing then. Thanks, Roger.
On 10.05.2023 12:22, Roger Pau Monné wrote: > On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote: >> On 10.05.2023 10:27, Roger Pau Monné wrote: >>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: >>>> On 09.05.2023 12:41, Roger Pau Monne wrote: >>>>> When translating an address that falls inside of a superpage in the >>>>> IOMMU page tables the fetching of the PTE physical address field >>>>> wasn't using dma_pte_addr(), which caused the returned data to be >>>>> corrupt as it would contain bits not related to the address field. >>>> >>>> I'm afraid I don't understand: >>>> >>>>> --- a/xen/drivers/passthrough/vtd/iommu.c >>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c >>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, >>>>> >>>>> if ( !alloc ) >>>>> { >>>>> - pte_maddr = 0; >>>>> if ( !dma_pte_present(*pte) ) >>>>> + { >>>>> + pte_maddr = 0; >>>>> break; >>>>> + } >>>>> >>>>> /* >>>>> * When the leaf entry was requested, pass back the full PTE, >>>>> * with the address adjusted to account for the residual of >>>>> * the walk. >>>>> */ >>>>> - pte_maddr = pte->val + >>>>> + pte_maddr += >>>>> (addr & ((1UL << level_to_offset_bits(level)) - 1) & >>>>> PAGE_MASK); >>>> >>>> With this change you're now violating what the comment says (plus what >>>> the comment ahead of the function says). And it says what it says for >>>> a reason - see intel_iommu_lookup_page(), which I think your change is >>>> breaking. >>> >>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes >>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when >>> doing the conversion to mfn? maddr_to_mfn() doesn't perform a any >>> masking to remove the bits above PADDR_BITS. >> >> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page() >> then. (It would likely be better anyway to switch "uint64_t val" to >> "struct dma_pte pte" there, to make more visible that it's a PTE we're >> dealing with.) I indeed overlooked this aspect when doing the earlier >> change. > > I guess I'm still confused, as the other return value for target == 0 > (when the address is not part of a superpage) does return > dma_pte_addr(pte). I think that needs further fixing then. Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr() also does one too many iterations in that case. All "normal" callers supply a positive "target". We need to terminate the walk at level 1 also when target == 0. Jan
On Wed, May 10, 2023 at 03:30:21PM +0200, Jan Beulich wrote: > On 10.05.2023 12:22, Roger Pau Monné wrote: > > On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote: > >> On 10.05.2023 10:27, Roger Pau Monné wrote: > >>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: > >>>> On 09.05.2023 12:41, Roger Pau Monne wrote: > >>>>> When translating an address that falls inside of a superpage in the > >>>>> IOMMU page tables the fetching of the PTE physical address field > >>>>> wasn't using dma_pte_addr(), which caused the returned data to be > >>>>> corrupt as it would contain bits not related to the address field. > >>>> > >>>> I'm afraid I don't understand: > >>>> > >>>>> --- a/xen/drivers/passthrough/vtd/iommu.c > >>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c > >>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, > >>>>> > >>>>> if ( !alloc ) > >>>>> { > >>>>> - pte_maddr = 0; > >>>>> if ( !dma_pte_present(*pte) ) > >>>>> + { > >>>>> + pte_maddr = 0; > >>>>> break; > >>>>> + } > >>>>> > >>>>> /* > >>>>> * When the leaf entry was requested, pass back the full PTE, > >>>>> * with the address adjusted to account for the residual of > >>>>> * the walk. > >>>>> */ > >>>>> - pte_maddr = pte->val + > >>>>> + pte_maddr += > >>>>> (addr & ((1UL << level_to_offset_bits(level)) - 1) & > >>>>> PAGE_MASK); > >>>> > >>>> With this change you're now violating what the comment says (plus what > >>>> the comment ahead of the function says). And it says what it says for > >>>> a reason - see intel_iommu_lookup_page(), which I think your change is > >>>> breaking. > >>> > >>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes > >>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when > >>> doing the conversion to mfn? maddr_to_mfn() doesn't perform a any > >>> masking to remove the bits above PADDR_BITS. > >> > >> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page() > >> then. (It would likely be better anyway to switch "uint64_t val" to > >> "struct dma_pte pte" there, to make more visible that it's a PTE we're > >> dealing with.) I indeed overlooked this aspect when doing the earlier > >> change. > > > > I guess I'm still confused, as the other return value for target == 0 > > (when the address is not part of a superpage) does return > > dma_pte_addr(pte). I think that needs further fixing then. > > Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr() > also does one too many iterations in that case. All "normal" callers > supply a positive "target". We need to terminate the walk at level 1 > also when target == 0. Don't we do that already due to the following check: if ( --level == target ) break; Which prevents mapping the PTE address as a page table directory? Thanks, Roger.
On 10.05.2023 17:19, Roger Pau Monné wrote: > On Wed, May 10, 2023 at 03:30:21PM +0200, Jan Beulich wrote: >> On 10.05.2023 12:22, Roger Pau Monné wrote: >>> On Wed, May 10, 2023 at 12:00:51PM +0200, Jan Beulich wrote: >>>> On 10.05.2023 10:27, Roger Pau Monné wrote: >>>>> On Tue, May 09, 2023 at 06:06:45PM +0200, Jan Beulich wrote: >>>>>> On 09.05.2023 12:41, Roger Pau Monne wrote: >>>>>>> When translating an address that falls inside of a superpage in the >>>>>>> IOMMU page tables the fetching of the PTE physical address field >>>>>>> wasn't using dma_pte_addr(), which caused the returned data to be >>>>>>> corrupt as it would contain bits not related to the address field. >>>>>> >>>>>> I'm afraid I don't understand: >>>>>> >>>>>>> --- a/xen/drivers/passthrough/vtd/iommu.c >>>>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c >>>>>>> @@ -359,16 +359,18 @@ static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, >>>>>>> >>>>>>> if ( !alloc ) >>>>>>> { >>>>>>> - pte_maddr = 0; >>>>>>> if ( !dma_pte_present(*pte) ) >>>>>>> + { >>>>>>> + pte_maddr = 0; >>>>>>> break; >>>>>>> + } >>>>>>> >>>>>>> /* >>>>>>> * When the leaf entry was requested, pass back the full PTE, >>>>>>> * with the address adjusted to account for the residual of >>>>>>> * the walk. >>>>>>> */ >>>>>>> - pte_maddr = pte->val + >>>>>>> + pte_maddr += >>>>>>> (addr & ((1UL << level_to_offset_bits(level)) - 1) & >>>>>>> PAGE_MASK); >>>>>> >>>>>> With this change you're now violating what the comment says (plus what >>>>>> the comment ahead of the function says). And it says what it says for >>>>>> a reason - see intel_iommu_lookup_page(), which I think your change is >>>>>> breaking. >>>>> >>>>> Hm, but the code in intel_iommu_lookup_page() is now wrong as it takes >>>>> the bits in DMA_PTE_CONTIG_MASK as part of the physical address when >>>>> doing the conversion to mfn? maddr_to_mfn() doesn't perform a any >>>>> masking to remove the bits above PADDR_BITS. >>>> >>>> Oh, right. But that's a missing dma_pte_addr() in intel_iommu_lookup_page() >>>> then. (It would likely be better anyway to switch "uint64_t val" to >>>> "struct dma_pte pte" there, to make more visible that it's a PTE we're >>>> dealing with.) I indeed overlooked this aspect when doing the earlier >>>> change. >>> >>> I guess I'm still confused, as the other return value for target == 0 >>> (when the address is not part of a superpage) does return >>> dma_pte_addr(pte). I think that needs further fixing then. >> >> Hmm, indeed. But I think it's worse than this: addr_to_dma_page_maddr() >> also does one too many iterations in that case. All "normal" callers >> supply a positive "target". We need to terminate the walk at level 1 >> also when target == 0. > > Don't we do that already due to the following check: > > if ( --level == target ) > break; > > Which prevents mapping the PTE address as a page table directory? I don't think this is enough - this code, afaict right now, is only sufficient when target >= 1. Jan
© 2016 - 2024 Red Hat, Inc.