RE: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Tian, Kevin 11 months ago
> From: Mostafa Saleh <smostafa@google.com>
> Sent: Wednesday, January 8, 2025 8:10 PM
> 
> On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > but tbh, I don’t think it’s a deal breaker for now.
> >
> > Again, it depends what your actual use case for translation is inside
> > the host/guest environments. It would be good to clearly spell this out.
> > There are a few drivers that directly manipulate the iommu_domains of a
> > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > of those are you targeting?
> >
> 
> Not sure I understand this point about manipulating domains.
> AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> that’s why it’s not a priority here.

Nested translation is required beyond SVA. Any scenario which requires
a vIOMMU and multiple device domains within the guest would want to
embrace nesting. Especially for ARM vSMMU, nesting is a must.

But I'm not sure that I got Jason's point about "there is no way to get
SVA support with para-virtualization." virtio-iommu is a para-virtualized
model and SVA support is in its plan. The main requirement is to pass
the base pointer of the guest CPU page table to the backend, and PRI
faults/responses back and forth.
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Jason Gunthorpe 11 months ago
On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > From: Mostafa Saleh <smostafa@google.com>
> > Sent: Wednesday, January 8, 2025 8:10 PM
> > 
> > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > but tbh, I don’t think it’s a deal breaker for now.
> > >
> > > Again, it depends what your actual use case for translation is inside
> > > the host/guest environments. It would be good to clearly spell this out.
> > > There are a few drivers that directly manipulate the iommu_domains of a
> > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > of those are you targeting?
> > >
> > 
> > Not sure I understand this point about manipulating domains.
> > AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> > that’s why it’s not a priority here.
> 
> Nested translation is required beyond SVA. Any scenario which requires
> a vIOMMU and multiple device domains within the guest would want to
> embrace nesting. Especially for ARM vSMMU, nesting is a must.

Right, if you need an iommu domain in the guest there are only three
mainstream ways to get this in Linux:
 1) Use the DMA API and have the iommu group be translating. This is
    optional in that the DMA API usually supports identity as an option.
 2) A driver directly calls iommu_paging_domain_alloc() and manually
    attaches it to some device, and does not use the DMA API. My list
    above of ath1x/etc are examples doing this
 3) Use VFIO
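
As a rough sketch of what #2 looks like in a driver (illustrative only:
the function name and addresses below are made up, and error unwinding
is trimmed):

#include <linux/iommu.h>
#include <linux/sizes.h>

/* Option #2: a driver-owned paging domain, managed without the DMA API. */
static int demo_attach_private_domain(struct device *dev, phys_addr_t buf_pa)
{
        struct iommu_domain *domain;
        int ret;

        domain = iommu_paging_domain_alloc(dev);
        if (IS_ERR(domain))
                return PTR_ERR(domain);

        /* Take the device off the DMA API's default domain. */
        ret = iommu_attach_device(domain, dev);
        if (ret) {
                iommu_domain_free(domain);
                return ret;
        }

        /* The driver now hands out IOVA itself, e.g. map 64K at IOVA 1M. */
        return iommu_map(domain, SZ_1M, buf_pa, SZ_64K,
                         IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
}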

My remark to Mostafa is to be specific, which of the above do you want
to do in your mobile guest (and what driver exactly if #2) and why.

This will help inform what the performance profile looks like and
guide if nesting/para virt is appropriate.

> But I'm not sure that I got Jason's point about "there is no way to get
> SVA support with para-virtualization." virtio-iommu is a para-virtualized
> model and SVA support is in its plan. The main requirement is to pass
> the base pointer of the guest CPU page table to the backend, and PRI
> faults/responses back and forth.

That's nesting, you have a full page table under the control of the
guest, and the guest needs to have a level of HW-specific
knowledge. It is just an alternative to using the native nesting
vIOMMU.

What I mean by "para-virtualization" is that the guest does map/unmap calls
to the hypervisor and has no page table.
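
Roughly this shape, with names invented purely for illustration (they are
not from this series):

/* Para-virt model: the guest builds no IOMMU page table; every map and
 * unmap traps to the hypervisor, which updates the one hypervisor-owned
 * stage of translation on the guest's behalf. */
ret = hyp_iommu_map(domain_id, iova, gpa, size, prot);
/* ... DMA ... */
ret = hyp_iommu_unmap(domain_id, iova, size);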

Jason
RE: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Tian, Kevin 11 months ago
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Friday, January 17, 2025 3:15 AM
> 
> On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > > From: Mostafa Saleh <smostafa@google.com>
> > > Sent: Wednesday, January 8, 2025 8:10 PM
> > >
> > > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > > but tbh, I don’t think it’s a deal breaker for now.
> > > >
> > > > Again, it depends what your actual use case for translation is inside
> > > > the host/guest environments. It would be good to clearly spell this out.
> > > > There are a few drivers that directly manipulate the iommu_domains of a
> > > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > > of those are you targeting?
> > > >
> > >
> > > Not sure I understand this point about manipulating domains.
> > > AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> > > that’s why it’s not a priority here.
> >
> > Nested translation is required beyond SVA. Any scenario which requires
> > a vIOMMU and multiple device domains within the guest would want to
> > embrace nesting. Especially for ARM vSMMU, nesting is a must.
> 
> Right, if you need an iommu domain in the guest there are only three
> mainstream ways to get this in Linux:
>  1) Use the DMA API and have the iommu group be translating. This is
>     optional in that the DMA API usually supports identity as an option.
>  2) A driver directly calls iommu_paging_domain_alloc() and manually
>     attaches it to some device, and does not use the DMA API. My list
>     above of ath1x/etc are examples doing this
>  3) Use VFIO
> 
> My remark to Mostafa is to be specific, which of the above do you want
> to do in your mobile guest (and what driver exactly if #2) and why.
> 
> This will help inform what the performance profile looks like and
> guide if nesting/para virt is appropriate.

Yeah, that part would be critical to help decide which route to pursue
first. Even if all options might be required in the end when pKVM
is scaled to more scenarios, as you mentioned in another mail, a staged
approach would be much preferable.

The pros/cons between nesting and para-virt are clear: the more static the
mapping, the more the para approach gains due to less page-table walking
and a smaller TLB footprint, while conversely nesting performs much better
by avoiding frequent para-virt calls for page table management. 😊

> 
> > But I'm not sure that I got Jason's point about "there is no way to get
> > SVA support with para-virtualization." virtio-iommu is a para-virtualized
> > model and SVA support is in its plan. The main requirement is to pass
> > the base pointer of the guest CPU page table to the backend, and PRI
> > faults/responses back and forth.
> 
> That's nesting, you have a full page table under the control of the
> guest, and the guest needs to have a level of HW-specific
> knowledge. It is just an alternative to using the native nesting
> vIOMMU.
> 
> What I mean by "para-virtualization" is that the guest does map/unmap calls
> to the hypervisor and has no page table.
> 

Yes, that should never happen for SVA.
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Mostafa Saleh 10 months, 3 weeks ago
On Fri, Jan 17, 2025 at 06:57:12AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@ziepe.ca>
> > Sent: Friday, January 17, 2025 3:15 AM
> > 
> > On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > > > From: Mostafa Saleh <smostafa@google.com>
> > > > Sent: Wednesday, January 8, 2025 8:10 PM
> > > >
> > > > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > > > but tbh, I don’t think it’s a deal breaker for now.
> > > > >
> > > > > Again, it depends what your actual use case for translation is inside
> > > > > the host/guest environments. It would be good to clearly spell this out.
> > > > > There are a few drivers that directly manipulate the iommu_domains of a
> > > > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > > > of those are you targeting?
> > > > >
> > > >
> > > > Not sure I understand this point about manipulating domains.
> > > > AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> > > > that’s why it’s not a priority here.
> > >
> > > Nested translation is required beyond SVA. Any scenario which requires
> > > a vIOMMU and multiple device domains within the guest would want to
> > > embrace nesting. Especially for ARM vSMMU, nesting is a must.

We can still do para-virtualization for guests the same way we do for the
host and use a single-stage IOMMU.

> > 
> > Right, if you need an iommu domain in the guest there are only three
> > mainstream ways to get this in Linux:
> >  1) Use the DMA API and have the iommu group be translating. This is
> >     optional in that the DMA API usually supports identity as an option.
> >  2) A driver directly calls iommu_paging_domain_alloc() and manually
> >     attaches it to some device, and does not use the DMA API. My list
> >     above of ath1x/etc are examples doing this
> >  3) Use VFIO
> > 
> > My remark to Mostafa is to be specific, which of the above do you want
> > to do in your mobile guest (and what driver exactly if #2) and why.
> > 
> > This will help inform what the performance profile looks like and
> > guide if nesting/para virt is appropriate.
> 

AFAIK, the most common use cases would be:
- Devices using the DMA API because they require a lot of memory to be
  contiguous in IOVA, which is hard to do with identity
- Devices with security requirements/constraints that need to be isolated
  from the rest of the system, also using the DMA API
- VFIO is something we are looking into at the moment and have prototyped
  with pKVM, and it should be supported soon in Android (only for platform
  devices for now)

> Yeah, that part would be critical to help decide which route to pursue
> first. Even if all options might be required in the end when pKVM
> is scaled to more scenarios, as you mentioned in another mail, a staged
> approach would be much preferable.

I agree that would probably be the case. I will work on a more staged
approach for v3, mostly without the pv part, as Jason suggested.

> 
> The pros/cons between nesting and para-virt are clear: the more static the
> mapping, the more the para approach gains due to less page-table walking
> and a smaller TLB footprint, while conversely nesting performs much better
> by avoiding frequent para-virt calls for page table management. 😊

I am also working to get the numbers for both cases so we know the
order of magnitude of each, as I guess it won't be clear for large
systems with many DMA initiators which approach is best.


Thanks,
Mostafa

> 
> > 
> > > But I'm not sure that I got Jason's point about "there is no way to get
> > > SVA support with para-virtualization." virtio-iommu is a para-virtualized
> > > model and SVA support is in its plan. The main requirement is to pass
> > > the base pointer of the guest CPU page table to the backend, and PRI
> > > faults/responses back and forth.
> > 
> > That's nesting, you have a full page table under the control of the
> > guest, and the guest needs to have a level of HW-specific
> > knowledge. It is just an alternative to using the native nesting
> > vIOMMU.
> > 
> > What I mean by "para-virtualization" is that the guest does map/unmap calls
> > to the hypervisor and has no page table.
> > 
> 
> Yes, that should never happen for SVA.
RE: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Tian, Kevin 10 months, 3 weeks ago
> From: Mostafa Saleh <smostafa@google.com>
> Sent: Wednesday, January 22, 2025 7:04 PM
> 
> On Fri, Jan 17, 2025 at 06:57:12AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > Sent: Friday, January 17, 2025 3:15 AM
> > >
> > > On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > > > > From: Mostafa Saleh <smostafa@google.com>
> > > > > Sent: Wednesday, January 8, 2025 8:10 PM
> > > > >
> > > > > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > > > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > > > > but tbh, I don’t think it’s a deal breaker for now.
> > > > > >
> > > > > > Again, it depends what your actual use case for translation is inside
> > > > > > the host/guest environments. It would be good to clearly spell this out.
> > > > > > There are a few drivers that directly manipulate the iommu_domains of a
> > > > > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > > > > of those are you targeting?
> > > > > >
> > > > >
> > > > > Not sure I understand this point about manipulating domains.
> > > > > AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> > > > > that’s why it’s not a priority here.
> > > >
> > > > Nested translation is required beyond SVA. Any scenario which requires
> > > > a vIOMMU and multiple device domains within the guest would want to
> > > > embrace nesting. Especially for ARM vSMMU, nesting is a must.
> 
> We can still do para-virtualization for guests the same way we do for the
> host and use a single-stage IOMMU.

Same way, but both require a nested setup.

In concept there are two layers of address translation: GVA->GPA via the
guest page table, and GPA->HPA via the pKVM page table.

The difference between host/guest is just in the GPA mapping. For the host
it's 1:1, with additional hardening for which portions can be mapped and
which cannot. For a guest it's non-identity, with the mapping established
by the host.

A nested translation naturally fits those conceptual layers.

Using a single-stage IOMMU means you need to combine the two layers
into one, i.e. GVA->HPA, by removing the GPA. Then you have to
para-virtualize the guest page table so every guest PTE change is
intercepted to replace GPA with HPA.

Doing so completely kills the benefit of SVA, which is why Jason said
it's a no-go.
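
As formulas, restating the above:

  \text{nesting:}\qquad \mathrm{GVA} \xrightarrow{S_1} \mathrm{GPA} \xrightarrow{S_2} \mathrm{HPA}

  \text{single-stage:}\quad \mathrm{GVA} \xrightarrow{S} \mathrm{HPA},\qquad S = S_2 \circ S_1

With nesting the hardware walks S_1 and S_2 separately, so the guest owns
S_1 outright. With a single stage, any guest change to S_1 invalidates the
corresponding entries of the combined S, forcing a trap per PTE update to
rebuild them, which is exactly the interception described above.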

> 
> > >
> > > Right, if you need an iommu domain in the guest there are only three
> > > mainstream ways to get this in Linux:
> > >  1) Use the DMA API and have the iommu group be translating. This is
> > >     optional in that the DMA API usually supports identity as an option.
> > >  2) A driver directly calls iommu_paging_domain_alloc() and manually
> > >     attaches it to some device, and does not use the DMA API. My list
> > >     above of ath1x/etc are examples doing this
> > >  3) Use VFIO
> > >
> > > My remark to Mostafa is to be specific, which of the above do you want
> > > to do in your mobile guest (and what driver exactly if #2) and why.
> > >
> > > This will help inform what the performance profile looks like and
> > > guide if nesting/para virt is appropriate.
> >
> 
> AFAIK, the most common use cases would be:
> - Devices using the DMA API because they require a lot of memory to be
>   contiguous in IOVA, which is hard to do with identity
> - Devices with security requirements/constraints that need to be isolated
>   from the rest of the system, also using the DMA API
> - VFIO is something we are looking into at the moment and have prototyped
>   with pKVM, and it should be supported soon in Android (only for platform
>   devices for now)

What really matters is the frequency of map/unmap.

> 
> > Yeah, that part would be critical to help decide which route to pursue
> > first. Even if all options might be required in the end when pKVM
> > is scaled to more scenarios, as you mentioned in another mail, a staged
> > approach would be much preferable.
> 
> I agree that would probably be the case. I will work on a more staged
> approach for v3, mostly without the pv part, as Jason suggested.
> 
> >
> > The pros/cons between nesting and para-virt are clear: the more static the
> > mapping, the more the para approach gains due to less page-table walking
> > and a smaller TLB footprint, while conversely nesting performs much better
> > by avoiding frequent para-virt calls for page table management. 😊
> 
> I am also working to get the numbers for both cases so we know the
> order of magnitude of each, as I guess it won't be clear for large
> systems with many DMA initiators which approach is best.
> 
> 

That'd be great!
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Mostafa Saleh 10 months, 2 weeks ago
On Thu, Jan 23, 2025 at 08:13:34AM +0000, Tian, Kevin wrote:
> > From: Mostafa Saleh <smostafa@google.com>
> > Sent: Wednesday, January 22, 2025 7:04 PM
> > 
> > On Fri, Jan 17, 2025 at 06:57:12AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > > Sent: Friday, January 17, 2025 3:15 AM
> > > >
> > > > On Thu, Jan 16, 2025 at 06:39:31AM +0000, Tian, Kevin wrote:
> > > > > > From: Mostafa Saleh <smostafa@google.com>
> > > > > > Sent: Wednesday, January 8, 2025 8:10 PM
> > > > > >
> > > > > > On Thu, Jan 02, 2025 at 04:16:14PM -0400, Jason Gunthorpe wrote:
> > > > > > > On Fri, Dec 13, 2024 at 07:39:04PM +0000, Mostafa Saleh wrote:
> > > > > > > > Yeah, SVA is tricky, I guess for that we would have to use nesting,
> > > > > > > > but tbh, I don’t think it’s a deal breaker for now.
> > > > > > >
> > > > > > > Again, it depends what your actual use case for translation is inside
> > > > > > > the host/guest environments. It would be good to clearly spell this out.
> > > > > > > There are a few drivers that directly manipulate the iommu_domains of a
> > > > > > > device: a few GPUs, ath1x wireless, some Tegra stuff, "venus". Which
> > > > > > > of those are you targeting?
> > > > > > >
> > > > > >
> > > > > > Not sure I understand this point about manipulating domains.
> > > > > > AFAIK, SVA is not that common, including mobile spaces but I can be wrong,
> > > > > > that’s why it’s not a priority here.
> > > > >
> > > > > Nested translation is required beyond SVA. Any scenario which requires
> > > > > a vIOMMU and multiple device domains within the guest would want to
> > > > > embrace nesting. Especially for ARM vSMMU, nesting is a must.
> > 
> > We can still do para-virtualization for guests the same way we do for the
> > host and use a single-stage IOMMU.
> 
> Same way, but both require a nested setup.
> 
> In concept there are two layers of address translation: GVA->GPA via the
> guest page table, and GPA->HPA via the pKVM page table.
> 
> The difference between host/guest is just in the GPA mapping. For the host
> it's 1:1, with additional hardening for which portions can be mapped and
> which cannot. For a guest it's non-identity, with the mapping established
> by the host.
> 
> A nested translation naturally fits those conceptual layers.
> 
> Using a single-stage IOMMU means you need to combine the two layers
> into one, i.e. GVA->HPA, by removing the GPA. Then you have to
> para-virtualize the guest page table so every guest PTE change is
> intercepted to replace GPA with HPA.
> 
> Doing so completely kills the benefit of SVA, which is why Jason said
> it's a no-go.

I agree, this can’t work with SVA; to make that work we would need
some new para-virt operation to install the S1 table, and the
hypervisor would have to configure the device in nested translation.

But guests that don’t need SVA can just use single-stage para-virt
(like virtio-iommu).
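
Something of this shape, say (entirely hypothetical; no such interface
exists in this series):

/* Hypothetical para-virt call: the guest hands its stage-1 table to the
 * hypervisor, which validates it and programs the SMMU STE in nested
 * mode (guest-owned S1 over the pKVM-owned S2). */
struct pviommu_attach_s1 {
        u64 endpoint_id;  /* which device/stream to configure */
        u64 s1_ttb;       /* GPA of the guest stage-1 table base */
        u64 asid;         /* address space ID for TLB tagging */
        u64 s1_cfg;       /* SMMU-specific format bits: granule, IA size */
};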

> 
> > 
> > > >
> > > > Right, if you need an iommu domain in the guest there are only three
> > > > mainstream ways to get this in Linux:
> > > >  1) Use the DMA API and have the iommu group be translating. This is
> > > >     optional in that the DMA API usually supports identity as an option.
> > > >  2) A driver directly calls iommu_paging_domain_alloc() and manually
> > > >     attaches it to some device, and does not use the DMA API. My list
> > > >     above of ath1x/etc are examples doing this
> > > >  3) Use VFIO
> > > >
> > > > My remark to Mostafa is to be specific, which of the above do you want
> > > > to do in your mobile guest (and what driver exactly if #2) and why.
> > > >
> > > > This will help inform what the performance profile looks like and
> > > > guide if nesting/para virt is appropriate.
> > >
> > 
> > AFAIK, the most common use cases would be:
> > - Devices using the DMA API because they require a lot of memory to be
> >   contiguous in IOVA, which is hard to do with identity
> > - Devices with security requirements/constraints that need to be isolated
> >   from the rest of the system, also using the DMA API
> > - VFIO is something we are looking into at the moment and have prototyped
> >   with pKVM, and it should be supported soon in Android (only for platform
> >   devices for now)
> 
> What really matters is the frequency of map/unmap.

Yes, though it differs between devices/systems :/ that’s why I reckon we
would need both in the long term. However, starting with some benchmarks
for these cases can help us understand the magnitude of both solutions
and prioritise which one is more suitable to start with upstream.

Thanks,
Mostafa
> 
> > 
> > > Yeah, that part would be critical to help decide which route to pursue
> > > first. Even if all options might be required in the end when pKVM
> > > is scaled to more scenarios, as you mentioned in another mail, a staged
> > > approach would be much preferable.
> > 
> > I agree that would probably be the case. I will work on a more staged
> > approach for v3, mostly without the pv part, as Jason suggested.
> > 
> > >
> > > The pros/cons between nesting and para-virt are clear: the more static the
> > > mapping, the more the para approach gains due to less page-table walking
> > > and a smaller TLB footprint, while conversely nesting performs much better
> > > by avoiding frequent para-virt calls for page table management. 😊
> > 
> > I am also working to get the numbers for both cases so we know the
> > order of magnitude of each, as I guess it won't be clear for large
> > systems with many DMA initiators which approach is best.
> > 
> > 
> 
> That'd be great!
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Jason Gunthorpe 10 months, 3 weeks ago
On Wed, Jan 22, 2025 at 11:04:24AM +0000, Mostafa Saleh wrote:
> AFAIK, the most common use cases would be:
> - Devices using the DMA API because they require a lot of memory to be
>   contiguous in IOVA, which is hard to do with identity

This is not a feature of the DMA API any driver should rely on. Are
you aware of one that does?

> - Devices with security requirements/constraints that need to be isolated
>   from the rest of the system, also using the DMA API

This is real, but again, in a mobile context does this even exist? It isn't
like there are external PCIe ports that need securing on a phone?

> - VFIO is something we are looking into at the moment and have prototyped
>   with pKVM, and it should be supported soon in Android (only for platform
>   devices for now)

Yes, this makes sense

Jason
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Mostafa Saleh 10 months, 3 weeks ago
On Wed, Jan 22, 2025 at 12:20:55PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 22, 2025 at 11:04:24AM +0000, Mostafa Saleh wrote:
> > AFAIK, the most common use cases would be:
> > - Devices using the DMA API because they require a lot of memory to be
> >   contiguous in IOVA, which is hard to do with identity
> 
> This is not a feature of the DMA API any driver should rely on. Are
> you aware of one that does?
> 

I’d guess one example is media drivers; they usually need large contiguous
buffers and would use, for example, dma_alloc_coherent(). If the IOMMU is
disabled or bypassed, the kernel has to find that much contiguous space in
physical memory, which can be impossible on devices with small memory such
as mobile devices.
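
For instance (a sketch only; the device and size are placeholders):

#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/* With a translating IOMMU behind the DMA API, the returned handle is
 * contiguous in IOVA even if the backing pages are scattered in physical
 * memory; with identity/bypass it must be physically contiguous instead. */
static void *demo_alloc_frame_buf(struct device *dev, dma_addr_t *handle)
{
        return dma_alloc_coherent(dev, SZ_16M, handle, GFP_KERNEL);
}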

I will look more into this while working on the patches to identity-map
everything for v3, and I’ll see what kind of issues I hit.

> > - Devices with security requirements/constraints that need to be isolated
> >   from the rest of the system, also using the DMA API
> 
> This is real, but again, in a mobile context does this even exist? It isn't
> like there are external PCIe ports that need securing on a phone?

It’s not just about completely external devices; it’s a defence-in-depth
measure where, for example, network devices can be poked externally, and
there have been cases in the past where exploits were found[1], so some
vendors might have a policy of isolating such devices, which I believe
is valid.

[1] https://lwn.net/ml/oss-security/20221013101046.GB20615@suse.de/

Thanks,
Mostafa

> 
> > - VFIO is something we are looking into at the moment and have prototyped
> >   with pKVM, and it should be supported soon in Android (only for platform
> >   devices for now)
> 
> Yes, this makes sense
> 
> Jason
Re: [RFC PATCH v2 00/58] KVM: Arm SMMUv3 driver for pKVM
Posted by Jason Gunthorpe 10 months, 3 weeks ago
On Wed, Jan 22, 2025 at 05:17:50PM +0000, Mostafa Saleh wrote:
> On Wed, Jan 22, 2025 at 12:20:55PM -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 22, 2025 at 11:04:24AM +0000, Mostafa Saleh wrote:
> > > AFAIK, the most common use cases would be:
> > > - Devices using the DMA API because they require a lot of memory to be
> > >   contiguous in IOVA, which is hard to do with identity
> > 
> > This is not a feature of the DMA API any driver should rely on. Are
> > you aware of one that does?
> > 
> 
> I’d guess one example is media drivers; they usually need large contiguous
> buffers and would use, for example, dma_alloc_coherent(). If the IOMMU is
> disabled or bypassed, the kernel has to find that much contiguous space in
> physical memory, which can be impossible on devices with small memory such
> as mobile devices.

I see, that makes sense

> It’s not just about completely external devices; it’s a defence-in-depth
> measure where, for example, network devices can be poked externally, and
> there have been cases in the past where exploits were found[1], so some
> vendors might have a policy of isolating such devices, which I believe
> is valid.

The performance cost of doing isolation like that with networking is
probably prohibitive with para-virt.

Jason