[PATCH v3 0/3] virtio-scsi: fix SCSIDevice hot unplug with IOThread

Stefan Hajnoczi posted 3 patches 1 year, 2 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230221212218.1378734-1-stefanha@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>, "Michael S. Tsirkin" <mst@redhat.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>
Posted by Stefan Hajnoczi 1 year, 2 months ago
v3:
- Fix s/see/sees/ typo in Patch 2 commit description [Eric]
- Add call stack to Patch 3 commit description to make it clear how blk_drain()
  is invoked [Kevin]

Unplugging SCSIDevices when virtio-scsi is using an IOThread suffers from race
conditions:
- scsi_device_purge_requests() is called from the IOThread in TMF emulation.
  This is unsafe; it should only be called with the BQL held.
- SCSIRequest->aiocb is not protected by a lock, so there are races between the
  main loop thread and the IOThread when scsi_device_purge_requests() runs in
  the main loop thread.
- DMAAIOCB->acb is not protected by a lock, so there are races in the DMA
  helpers code when cancelling a request from the main loop thread.
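
To make the second and third races concrete, here is a minimal sketch of the
general pattern, not the code in these patches. It assumes a plain pthread
mutex as a stand-in for the AioContext lock, and the names Request,
request_finish() and run_unplug_demo() are hypothetical. Two threads (one
playing the IOThread completion path, one playing main loop cancellation)
both try to finish the same request; checking and clearing the aiocb pointer
under the lock means only one of them wins.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; these are not QEMU types. */
typedef struct Request {
    pthread_mutex_t lock;  /* plays the role of the AioContext lock */
    int *aiocb;            /* non-NULL while I/O is in flight */
    int completions;       /* how many times the request was finished */
} Request;

static int inflight_token; /* dummy object that aiocb points at */

static void request_init(Request *req)
{
    int ret = pthread_mutex_init(&req->lock, NULL);
    assert(ret == 0);
    (void)ret; /* silence unused warning when NDEBUG is defined */
    req->aiocb = &inflight_token;
    req->completions = 0;
}

/*
 * Finish the request at most once. The check of aiocb and the update
 * happen under the lock, so a completion and a cancellation cannot
 * both observe a non-NULL aiocb. Returns true if this caller finished it.
 */
static bool request_finish(Request *req)
{
    bool finished = false;

    pthread_mutex_lock(&req->lock);
    if (req->aiocb != NULL) {
        req->aiocb = NULL;
        req->completions++;
        finished = true;
    }
    pthread_mutex_unlock(&req->lock);
    return finished;
}

/* Completion path, as if run in the IOThread. */
static void *iothread_complete(void *opaque)
{
    request_finish(opaque);
    return NULL;
}

/* Cancellation path, as if run in the main loop thread during unplug. */
static void *main_loop_cancel(void *opaque)
{
    request_finish(opaque);
    return NULL;
}

/* Race the two paths once; returns how many times the request finished. */
int run_unplug_demo(void)
{
    Request req;
    pthread_t iothread, main_loop;
    int completions;

    request_init(&req);
    pthread_create(&iothread, NULL, iothread_complete, &req);
    pthread_create(&main_loop, NULL, main_loop_cancel, &req);
    pthread_join(iothread, NULL);
    pthread_join(main_loop, NULL);

    completions = req.completions;
    pthread_mutex_destroy(&req.lock);
    return completions;
}
```

Whichever thread runs first, run_unplug_demo() reports exactly one
completion. Without the lock, both threads could see a non-NULL aiocb and
finish the request twice, which is the kind of double completion that tends
to surface as an assertion failure.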

These fixes solve assertion failures during SCSIDevice hot unplug in
virtio-scsi with IOThread. Expanding the use of the AioContext lock isn't great
since we're in the midst of trying to remove it. However, I think this solution
is appropriate so that stable trees or distros can backport the fix without
depending on QEMU multi-queue block layer refactoring.

Special thanks to Qing Wang, who helped me iterate these patches because I
couldn't reproduce the assertion failures myself.

Stefan Hajnoczi (3):
  scsi: protect req->aiocb with AioContext lock
  dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race
  virtio-scsi: reset SCSI devices from main loop thread

 include/hw/virtio/virtio-scsi.h |  11 ++-
 hw/scsi/scsi-disk.c             |  23 +++--
 hw/scsi/scsi-generic.c          |  11 ++-
 hw/scsi/virtio-scsi.c           | 169 +++++++++++++++++++++++++-------
 softmmu/dma-helpers.c           |  12 ++-
 5 files changed, 171 insertions(+), 55 deletions(-)

-- 
2.39.1
Re: [PATCH v3 0/3] virtio-scsi: fix SCSIDevice hot unplug with IOThread
Posted by Kevin Wolf 1 year, 2 months ago
On 21.02.2023 at 22:22, Stefan Hajnoczi wrote:
> v3:
> - Fix s/see/sees/ typo in Patch 2 commit description [Eric]
> - Add call stack to Patch 3 commit description to make it clear how blk_drain()
>   is invoked [Kevin]
> 
> Unplugging SCSIDevices when virtio-scsi is using an IOThread suffers from race
> conditions:
> - scsi_device_purge_requests() is called from the IOThread in TMF emulation.
>   This is unsafe; it should only be called with the BQL held.
> - SCSIRequest->aiocb is not protected by a lock, so there are races between the
>   main loop thread and the IOThread when scsi_device_purge_requests() runs in
>   the main loop thread.
> - DMAAIOCB->acb is not protected by a lock, so there are races in the DMA
>   helpers code when cancelling a request from the main loop thread.
> 
> These fixes solve assertion failures during SCSIDevice hot unplug in
> virtio-scsi with IOThread. Expanding the use of the AioContext lock isn't great
> since we're in the midst of trying to remove it. However, I think this solution
> is appropriate so that stable trees or distros can backport the fix without
> depending on QEMU multi-queue block layer refactoring.
> 
> Special thanks to Qing Wang, who helped me iterate these patches because I
> couldn't reproduce the assertion failures myself.

Thanks, applied to the block branch.

Kevin