hw/virtio/virtio-iommu.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
When assigning VFIO devices protected by a virtio-iommu we need to replay the mappings when adding a new IOMMU MR and when attaching a device to a domain. While we do a "remap" we currently fail to first unmap the existing IOVA mapping and just map the new one. With some device/group topology this can lead to errors in VFIO when trying to DMA_MAP IOVA ranges onto existing ones. Eric Auger (2): virtio-iommu: Add unmap on virtio_iommu_remap() virtio-iommu: Fix replay on device attach hw/virtio/virtio-iommu.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) -- 2.37.3
Hi, Eric, On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > When assigning VFIO devices protected by a virtio-iommu we need to replay > the mappings when adding a new IOMMU MR and when attaching a device to > a domain. While we do a "remap" we currently fail to first unmap the > existing IOVA mapping and just map the new one. With some device/group > topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > ranges onto existing ones. I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when we were working on the vt-d replay for vfio. The issue is whether DMA can happen right after UNMAP but before MAP of the same page if the page was always mapped. The vt-d resolved it by using iova_tree so in a replay vt-d knows the page didn't change, so it avoids unmap+map. It only notifies newly unmapped or newly mapped. Thanks, -- Peter Xu
Hi Peter, On 12/8/22 00:49, Peter Xu wrote: > Hi, Eric, > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: >> When assigning VFIO devices protected by a virtio-iommu we need to replay >> the mappings when adding a new IOMMU MR and when attaching a device to >> a domain. While we do a "remap" we currently fail to first unmap the >> existing IOVA mapping and just map the new one. With some device/group >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA >> ranges onto existing ones. > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > we were working on the vt-d replay for vfio. The issue is whether DMA can > happen right after UNMAP but before MAP of the same page if the page was > always mapped. I don't think it can race because a mutex is hold while doing the virtio_iommu_replay(), and each time a virtio cmd is handled (attach, map, unmap), see virtio_iommu_handle_command. So I think it is safe. Thanks Eric > > The vt-d resolved it by using iova_tree so in a replay vt-d knows the page > didn't change, so it avoids unmap+map. It only notifies newly unmapped or > newly mapped. > > Thanks, >
On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: > Hi Peter, Hi, Eric, > > On 12/8/22 00:49, Peter Xu wrote: > > Hi, Eric, > > > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > >> When assigning VFIO devices protected by a virtio-iommu we need to replay > >> the mappings when adding a new IOMMU MR and when attaching a device to > >> a domain. While we do a "remap" we currently fail to first unmap the > >> existing IOVA mapping and just map the new one. With some device/group > >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > >> ranges onto existing ones. > > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > > we were working on the vt-d replay for vfio. The issue is whether DMA can > > happen right after UNMAP but before MAP of the same page if the page was > > always mapped. > > I don't think it can race because a mutex is hold while doing the > virtio_iommu_replay(), and each time a virtio cmd is handled (attach, > map, unmap), see virtio_iommu_handle_command. > So I think it is safe. It's not the race in the code, it's the race between modifying host IOMMU pgtable with DMA happening in parallel. The bug triggered with DMA_MAP returning -EEXIST means there's existing mapping. If during replay there's mapped ranges and the ranges are prone to DMA, then IIUC it can happen. I didn't really check specifically for virtio-iommu and I mostly forget the details, just to raise this up. It's possible for some reason it just can't trigger. VT-d definitely can, in which case we'll see DMA errors on the host from the assigned device when the DMA triggers during the "unmap and map" window. Thanks, -- Peter Xu
On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote: > On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: > > Hi Peter, > > Hi, Eric, > > > > > On 12/8/22 00:49, Peter Xu wrote: > > > Hi, Eric, > > > > > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: > > >> When assigning VFIO devices protected by a virtio-iommu we need to replay > > >> the mappings when adding a new IOMMU MR and when attaching a device to > > >> a domain. While we do a "remap" we currently fail to first unmap the > > >> existing IOVA mapping and just map the new one. With some device/group > > >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA > > >> ranges onto existing ones. > > > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when > > > we were working on the vt-d replay for vfio. The issue is whether DMA can > > > happen right after UNMAP but before MAP of the same page if the page was > > > always mapped. > > > > I don't think it can race because a mutex is hold while doing the > > virtio_iommu_replay(), and each time a virtio cmd is handled (attach, > > map, unmap), see virtio_iommu_handle_command. > > So I think it is safe. > > It's not the race in the code, it's the race between modifying host IOMMU > pgtable with DMA happening in parallel. The bug triggered with DMA_MAP > returning -EEXIST means there's existing mapping. > > If during replay there's mapped ranges and the ranges are prone to DMA, > then IIUC it can happen. > > I didn't really check specifically for virtio-iommu and I mostly forget the > details, just to raise this up. It's possible for some reason it just > can't trigger. VT-d definitely can, in which case we'll see DMA errors on > the host from the assigned device when the DMA triggers during the "unmap > and map" window. > > Thanks, Eric any resolution on this? > -- > Peter Xu
Hi Michael, Peter, On 12/20/22 15:59, Michael S. Tsirkin wrote: > On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote: >> On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote: >>> Hi Peter, >> Hi, Eric, >> >>> On 12/8/22 00:49, Peter Xu wrote: >>>> Hi, Eric, >>>> >>>> On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote: >>>>> When assigning VFIO devices protected by a virtio-iommu we need to replay >>>>> the mappings when adding a new IOMMU MR and when attaching a device to >>>>> a domain. While we do a "remap" we currently fail to first unmap the >>>>> existing IOVA mapping and just map the new one. With some device/group >>>>> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA >>>>> ranges onto existing ones. >>>> I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when >>>> we were working on the vt-d replay for vfio. The issue is whether DMA can >>>> happen right after UNMAP but before MAP of the same page if the page was >>>> always mapped. >>> I don't think it can race because a mutex is hold while doing the >>> virtio_iommu_replay(), and each time a virtio cmd is handled (attach, >>> map, unmap), see virtio_iommu_handle_command. >>> So I think it is safe. >> It's not the race in the code, it's the race between modifying host IOMMU >> pgtable with DMA happening in parallel. The bug triggered with DMA_MAP >> returning -EEXIST means there's existing mapping. >> >> If during replay there's mapped ranges and the ranges are prone to DMA, >> then IIUC it can happen. >> >> I didn't really check specifically for virtio-iommu and I mostly forget the >> details, just to raise this up. It's possible for some reason it just >> can't trigger. VT-d definitely can, in which case we'll see DMA errors on >> the host from the assigned device when the DMA triggers during the "unmap >> and map" window. >> >> Thanks, > Eric any resolution on this? Sorry for the delay. Not yet unfortunately. following Peter's reply I now understand the race issue and it makes total sense but I need to study it further. Eric > >> -- >> Peter Xu
© 2016 - 2024 Red Hat, Inc.