[PATCH 0/2] Fix issues with untrusted devices and AMD IOMMU

Mario Limonciello posted 2 patches 4 years, 2 months ago
There is a newer version of this series
drivers/iommu/dma-iommu.c | 3 ++-
kernel/dma/swiotlb.c      | 5 +++++
2 files changed, 7 insertions(+), 1 deletion(-)
[PATCH 0/2] Fix issues with untrusted devices and AMD IOMMU
Posted by Mario Limonciello 4 years, 2 months ago
It's been observed that plugging in a TBT3 NVME device to a port marked
with ExternalFacingPort that some DMA transactions occur that are not a
full page and so the DMA API attempts to use software bounce buffers
instead of relying upon the IOMMU translation.

This doesn't work and leads to messaging like:

swiotlb buffer is full (sz: 4096 bytes), total 0 (slots), used 0 (slots)

The bounce buffers were originally set up, but torn down during
the boot process.
* This happens because as part of IOMMU initialization
  `amd_iommu_init_dma_ops` gets called and resets the global swiotlb to 0.
* When late_init gets called `pci_swiotlb_late_init` `swiotlb_exit` is
  called and the buffers are torn down.

This can be observed in the logs:
```
[    0.407286] AMD-Vi: Extended features (0x246577efa2254afa): PPR NX GT [5] IA GA PC GA_vAPIC
[    0.407291] AMD-Vi: Interrupt remapping enabled
[    0.407292] AMD-Vi: Virtual APIC enabled
[    0.407872] software IO TLB: tearing down default memory pool
```

This series adds some better messaging in case something like this comes
up again and also adds checks that swiotlb really is active before
trying to use it.

Mario Limonciello (2):
  swiotlb: Check that slabs have been allocated when requested
  iommu: Don't use swiotlb unless it's active

 drivers/iommu/dma-iommu.c | 3 ++-
 kernel/dma/swiotlb.c      | 5 +++++
 2 files changed, 7 insertions(+), 1 deletion(-)

-- 
2.34.1
Re: [PATCH 0/2] Fix issues with untrusted devices and AMD IOMMU
Posted by Christoph Hellwig 4 years, 2 months ago
On Mon, Apr 04, 2022 at 11:47:05AM -0500, Mario Limonciello wrote:
> The bounce buffers were originally set up, but torn down during
> the boot process.
> * This happens because as part of IOMMU initialization
>   `amd_iommu_init_dma_ops` gets called and resets the global swiotlb to 0.
> * When late_init gets called `pci_swiotlb_late_init` `swiotlb_exit` is
>   called and the buffers are torn down.

I think the proper thing is to do this:

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a1ada7bff44e6..079694f894b85 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1838,17 +1838,10 @@ void amd_iommu_domain_update(struct protection_domain *domain)
 	amd_iommu_domain_flush_complete(domain);
 }
 
-static void __init amd_iommu_init_dma_ops(void)
-{
-	swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0;
-}
-
 int __init amd_iommu_init_api(void)
 {
 	int err;
 
-	amd_iommu_init_dma_ops();
-
 	err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
 	if (err)
 		return err;
RE: [PATCH 0/2] Fix issues with untrusted devices and AMD IOMMU
Posted by Limonciello, Mario 4 years, 2 months ago
[AMD Official Use Only]

> On Mon, Apr 04, 2022 at 11:47:05AM -0500, Mario Limonciello wrote:
> > The bounce buffers were originally set up, but torn down during
> > the boot process.
> > * This happens because as part of IOMMU initialization
> >   `amd_iommu_init_dma_ops` gets called and resets the global swiotlb to 0.
> > * When late_init gets called `pci_swiotlb_late_init` `swiotlb_exit` is
> >   called and the buffers are torn down.
> 
> I think the proper thing is to do this:
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e6..079694f894b85 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1838,17 +1838,10 @@ void amd_iommu_domain_update(struct
> protection_domain *domain)
>  	amd_iommu_domain_flush_complete(domain);
>  }
> 
> -static void __init amd_iommu_init_dma_ops(void)
> -{
> -	swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0;
> -}
> -
>  int __init amd_iommu_init_api(void)
>  {
>  	int err;
> 
> -	amd_iommu_init_dma_ops();
> -
>  	err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
>  	if (err)
>  		return err;

I do expect that solves it as well.  The reason I submitted the way I did is
that there seemed to be a strong affinity for having swiotlb disabled when IOMMU
is enabled on AMD IOMMU.  The original code that disabled SWIOTLB in AMD IOMMU
dates all the way back to 2.6.33 (commit 75f1cdf1dda92cae037ec848ae63690d91913eac)
and it has ping ponged around since then to add more criteria that it would be or wouldn't
be disabled, but was never actually dropped until your suggestion.

If the consensus is that it should be dropped, I'll validate it does help and then send that out
as a v2.  I do think that my messaging patch (1/2) may still be useful for debugging in the
future if for another reason SWIOTLB is disabled.
Re: [PATCH 0/2] Fix issues with untrusted devices and AMD IOMMU
Posted by Christoph Hellwig 4 years, 2 months ago
On Mon, Apr 04, 2022 at 05:05:00PM +0000, Limonciello, Mario wrote:
> I do expect that solves it as well.  The reason I submitted the way I
> did is that there seemed to be a strong affinity for having swiotlb
> disabled when IOMMU is enabled on AMD IOMMU.  The original code that
> disabled SWIOTLB in AMD IOMMU dates all the way back to 2.6.33 (commit
> 75f1cdf1dda92cae037ec848ae63690d91913eac) and it has ping ponged around
> since then to add more criteria that it would be or wouldn't be
> disabled, but was never actually dropped until your suggestion.

Well, that was before we started bounce buffering for untrusted devices.
We can't just have a less secure path for them because some conditions
are not met.  Especially given that most AMD systems right now probably
don't have that swiotlb buffer if the IOMMU is enabled.  So not freeing
the buffer in this case is a bug fix that is needed to properly
support the bounce buffering for unaligned I/O to untrusted devices.

> I do think that my messaging patch (1/2) may still be useful for
> debugging in the future if for another reason SWIOTLB is disabled.

I think the warning is useful.  For dma-direct we have it in the caller
so I'd be tempted todo the same for dma-iommu.