[PATCH for 8.0 0/2] virtio-iommu: Fix Replay

Eric Auger posted 2 patches 1 year, 4 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20221207133646.635760-1-eric.auger@redhat.com
Maintainers: Eric Auger <eric.auger@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>
hw/virtio/virtio-iommu.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
[PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Eric Auger 1 year, 4 months ago
When assigning VFIO devices protected by a virtio-iommu we need to replay
the mappings when adding a new IOMMU MR and when attaching a device to
a domain. While we do a "remap" we currently fail to first unmap the
existing IOVA mapping and just map the new one. With some device/group
topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
ranges onto existing ones.

Eric Auger (2):
  virtio-iommu: Add unmap on virtio_iommu_remap()
  virtio-iommu: Fix replay on device attach

 hw/virtio/virtio-iommu.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

-- 
2.37.3
Re: [PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Peter Xu 1 year, 4 months ago
Hi, Eric,

On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote:
> When assigning VFIO devices protected by a virtio-iommu we need to replay
> the mappings when adding a new IOMMU MR and when attaching a device to
> a domain. While we do a "remap" we currently fail to first unmap the
> existing IOVA mapping and just map the new one. With some device/group
> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
> ranges onto existing ones.

I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when
we were working on the vt-d replay for vfio.  The issue is whether DMA can
happen right after UNMAP but before MAP of the same page if the page was
always mapped.

The vt-d resolved it by using iova_tree so in a replay vt-d knows the page
didn't change, so it avoids unmap+map.  It only notifies newly unmapped or
newly mapped.

Thanks,

-- 
Peter Xu
Re: [PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Eric Auger 1 year, 4 months ago
Hi Peter,

On 12/8/22 00:49, Peter Xu wrote:
> Hi, Eric,
>
> On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote:
>> When assigning VFIO devices protected by a virtio-iommu we need to replay
>> the mappings when adding a new IOMMU MR and when attaching a device to
>> a domain. While we do a "remap" we currently fail to first unmap the
>> existing IOVA mapping and just map the new one. With some device/group
>> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
>> ranges onto existing ones.
> I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when
> we were working on the vt-d replay for vfio.  The issue is whether DMA can
> happen right after UNMAP but before MAP of the same page if the page was
> always mapped.

I don't think it can race because a mutex is hold while doing the
virtio_iommu_replay(), and each time a virtio cmd is handled (attach,
map, unmap), see virtio_iommu_handle_command.
So I think it is safe.

Thanks

Eric
>
> The vt-d resolved it by using iova_tree so in a replay vt-d knows the page
> didn't change, so it avoids unmap+map.  It only notifies newly unmapped or
> newly mapped.
>
> Thanks,
>
Re: [PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Peter Xu 1 year, 4 months ago
On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote:
> Hi Peter,

Hi, Eric,

> 
> On 12/8/22 00:49, Peter Xu wrote:
> > Hi, Eric,
> >
> > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote:
> >> When assigning VFIO devices protected by a virtio-iommu we need to replay
> >> the mappings when adding a new IOMMU MR and when attaching a device to
> >> a domain. While we do a "remap" we currently fail to first unmap the
> >> existing IOVA mapping and just map the new one. With some device/group
> >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
> >> ranges onto existing ones.
> > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when
> > we were working on the vt-d replay for vfio.  The issue is whether DMA can
> > happen right after UNMAP but before MAP of the same page if the page was
> > always mapped.
> 
> I don't think it can race because a mutex is hold while doing the
> virtio_iommu_replay(), and each time a virtio cmd is handled (attach,
> map, unmap), see virtio_iommu_handle_command.
> So I think it is safe.

It's not the race in the code, it's the race between modifying host IOMMU
pgtable with DMA happening in parallel.  The bug triggered with DMA_MAP
returning -EEXIST means there's existing mapping.

If during replay there's mapped ranges and the ranges are prone to DMA,
then IIUC it can happen.

I didn't really check specifically for virtio-iommu and I mostly forget the
details, just to raise this up.  It's possible for some reason it just
can't trigger.  VT-d definitely can, in which case we'll see DMA errors on
the host from the assigned device when the DMA triggers during the "unmap
and map" window.

Thanks,

-- 
Peter Xu
Re: [PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Michael S. Tsirkin 1 year, 4 months ago
On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote:
> On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote:
> > Hi Peter,
> 
> Hi, Eric,
> 
> > 
> > On 12/8/22 00:49, Peter Xu wrote:
> > > Hi, Eric,
> > >
> > > On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote:
> > >> When assigning VFIO devices protected by a virtio-iommu we need to replay
> > >> the mappings when adding a new IOMMU MR and when attaching a device to
> > >> a domain. While we do a "remap" we currently fail to first unmap the
> > >> existing IOVA mapping and just map the new one. With some device/group
> > >> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
> > >> ranges onto existing ones.
> > > I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when
> > > we were working on the vt-d replay for vfio.  The issue is whether DMA can
> > > happen right after UNMAP but before MAP of the same page if the page was
> > > always mapped.
> > 
> > I don't think it can race because a mutex is hold while doing the
> > virtio_iommu_replay(), and each time a virtio cmd is handled (attach,
> > map, unmap), see virtio_iommu_handle_command.
> > So I think it is safe.
> 
> It's not the race in the code, it's the race between modifying host IOMMU
> pgtable with DMA happening in parallel.  The bug triggered with DMA_MAP
> returning -EEXIST means there's existing mapping.
> 
> If during replay there's mapped ranges and the ranges are prone to DMA,
> then IIUC it can happen.
> 
> I didn't really check specifically for virtio-iommu and I mostly forget the
> details, just to raise this up.  It's possible for some reason it just
> can't trigger.  VT-d definitely can, in which case we'll see DMA errors on
> the host from the assigned device when the DMA triggers during the "unmap
> and map" window.
> 
> Thanks,

Eric any resolution on this?

> -- 
> Peter Xu
Re: [PATCH for 8.0 0/2] virtio-iommu: Fix Replay
Posted by Eric Auger 1 year, 4 months ago
Hi Michael, Peter,
On 12/20/22 15:59, Michael S. Tsirkin wrote:
> On Thu, Dec 08, 2022 at 09:58:06AM -0500, Peter Xu wrote:
>> On Thu, Dec 08, 2022 at 08:48:09AM +0100, Eric Auger wrote:
>>> Hi Peter,
>> Hi, Eric,
>>
>>> On 12/8/22 00:49, Peter Xu wrote:
>>>> Hi, Eric,
>>>>
>>>> On Wed, Dec 07, 2022 at 02:36:44PM +0100, Eric Auger wrote:
>>>>> When assigning VFIO devices protected by a virtio-iommu we need to replay
>>>>> the mappings when adding a new IOMMU MR and when attaching a device to
>>>>> a domain. While we do a "remap" we currently fail to first unmap the
>>>>> existing IOVA mapping and just map the new one. With some device/group
>>>>> topology this can lead to errors in VFIO when trying to DMA_MAP IOVA
>>>>> ranges onto existing ones.
>>>> I'm not sure whether virtio-iommu+vfio will suffer from DMA races like when
>>>> we were working on the vt-d replay for vfio.  The issue is whether DMA can
>>>> happen right after UNMAP but before MAP of the same page if the page was
>>>> always mapped.
>>> I don't think it can race because a mutex is hold while doing the
>>> virtio_iommu_replay(), and each time a virtio cmd is handled (attach,
>>> map, unmap), see virtio_iommu_handle_command.
>>> So I think it is safe.
>> It's not the race in the code, it's the race between modifying host IOMMU
>> pgtable with DMA happening in parallel.  The bug triggered with DMA_MAP
>> returning -EEXIST means there's existing mapping.
>>
>> If during replay there's mapped ranges and the ranges are prone to DMA,
>> then IIUC it can happen.
>>
>> I didn't really check specifically for virtio-iommu and I mostly forget the
>> details, just to raise this up.  It's possible for some reason it just
>> can't trigger.  VT-d definitely can, in which case we'll see DMA errors on
>> the host from the assigned device when the DMA triggers during the "unmap
>> and map" window.
>>
>> Thanks,
> Eric any resolution on this?

Sorry for the delay. Not yet unfortunately. following Peter's reply I
now understand the race issue and it makes total sense but I need to
study it further.

Eric


>
>> -- 
>> Peter Xu