[Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()

Alberto Garcia posted 6 patches 5 years, 3 months ago
Test asan passed
Test checkpatch passed
Test docker-clang@ubuntu passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/cover.1547475602.git.berto@igalia.com
Maintainers: John Snow <jsnow@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Fam Zheng <fam@euphon.net>, Gerd Hoffmann <kraxel@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Keith Busch <keith.busch@intel.com>, Max Reitz <mreitz@redhat.com>
hw/block/fdc.c             |  15 ++++--
hw/block/nvme.c            |  13 +++--
hw/block/virtio-blk.c      |  22 ++++++---
hw/ide/qdev.c              |  17 +++++--
hw/scsi/scsi-disk.c        |  23 +++++++--
hw/usb/dev-storage.c       |   5 ++
tests/qemu-iotests/236     | 121 +++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/236.out |  46 +++++++++++++++++
tests/qemu-iotests/group   |   1 +
9 files changed, 241 insertions(+), 22 deletions(-)
create mode 100755 tests/qemu-iotests/236
create mode 100644 tests/qemu-iotests/236.out
[Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Alberto Garcia 5 years, 3 months ago
This series acquires the AioContext in the _realize() functions of
several devices before making use of their block backends. This fixes
at least a couple of crashes (in virtio-blk and scsi). The other
devices don't currently support iothreads so there's no crashes.

Berto

v2:
- Patch 1: include a test case [Kevin]
- Patch 2: include a test case and extend the locking to protect more
  function calls [Kevin]

v1: https://lists.gnu.org/archive/html/qemu-block/2019-01/msg00259.html
- Initial version

Output of backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/6:[0095] [FC] 'block: Acquire the AioContext in virtio_blk_device_realize()'
002/6:[0094] [FC] 'block: Acquire the AioContext in scsi_*_realize()'
003/6:[----] [--] 'block: Acquire the AioContext in floppy_drive_realize()'
004/6:[----] [--] 'block: Acquire the AioContext in nvme_realize()'
005/6:[----] [--] 'block: Acquire the AioContext in ide_dev_initfn()'
006/6:[----] [--] 'block: Acquire the AioContext in usb_msd_storage_realize()'

Alberto Garcia (6):
  block: Acquire the AioContext in virtio_blk_device_realize()
  block: Acquire the AioContext in scsi_*_realize()
  block: Acquire the AioContext in floppy_drive_realize()
  block: Acquire the AioContext in nvme_realize()
  block: Acquire the AioContext in ide_dev_initfn()
  block: Acquire the AioContext in usb_msd_storage_realize()

 hw/block/fdc.c             |  15 ++++--
 hw/block/nvme.c            |  13 +++--
 hw/block/virtio-blk.c      |  22 ++++++---
 hw/ide/qdev.c              |  17 +++++--
 hw/scsi/scsi-disk.c        |  23 +++++++--
 hw/usb/dev-storage.c       |   5 ++
 tests/qemu-iotests/236     | 121 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/236.out |  46 +++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 9 files changed, 241 insertions(+), 22 deletions(-)
 create mode 100755 tests/qemu-iotests/236
 create mode 100644 tests/qemu-iotests/236.out

-- 
2.11.0


Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Stefan Hajnoczi 5 years, 3 months ago
On Mon, Jan 14, 2019 at 04:23:58PM +0200, Alberto Garcia wrote:
> This series acquires the AioContext in the _realize() functions of
> several devices before making use of their block backends. This fixes
> at least a couple of crashes (in virtio-blk and scsi). The other
> devices don't currently support iothreads so there's no crashes.
> 
> Berto

My assumption has been that drives are in the main loop AioContext until
the device model (e.g. virtio-blk) decides to move them into the
IOThread configured by the user.

(And when the VM is stopped or the device is removed, the drive is moved
back into the main loop AioContext by the device.)

The x-blockdev-set-iothread command is for low-level tests so I don't
expect users to invoke it.  So I'm not sure which real-world scenario is
being tested here?

This patch series makes virtio-blk and virtio-scsi more robust, although
I haven't checked what happens if the drive is attached to a different
IOThread than the device - will the switchover work?

What I'm unclear about is why fdc, ide, usb-msd, and nvme should worry
about IOThreads/AioContext in .realize() since they don't support it in
their data path.  What happens when you submit I/O requests to devices
configured like this?  I guess they would crash or hit assertion
failures since they invoke blk_aio_*() APIs from outside the appropriate
AioContext.

Maybe fdc and friends should forbid this scenario:

  /* Fail if @blk is attached to an IOThread */
  static bool blk_check_main_aio_context(BlockBackend *blk, Error **errp) {
      if (blk_get_aio_context(blk) != qemu_get_aio_context()) {
          error_setg(errp, "Device does not support iothreads");
          return false;
      }
      return true;
  }

  ...in a realize() function...
  if (!blk_check_main_aio_context(blk, errp)) {
      return;
  }
Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Alberto Garcia 5 years, 3 months ago
On Wed 16 Jan 2019 02:54:44 PM CET, Stefan Hajnoczi wrote:
> The x-blockdev-set-iothread command is for low-level tests so I don't
> expect users to invoke it.

As I said in a different e-mail maybe this is not necessary then.

> This patch series makes virtio-blk and virtio-scsi more robust,
> although I haven't checked what happens if the drive is attached to a
> different IOThread than the device - will the switchover work?

You mean if you use x-blockdev-set-iothread with a drive and then attach
it to a device using a different thread? That seems to work (with my
patch).

> What I'm unclear about is why fdc, ide, usb-msd, and nvme should worry
> about IOThreads/AioContext in .realize() since they don't support it
> in their data path.  What happens when you submit I/O requests to
> devices configured like this?  I guess they would crash or hit
> assertion failures since they invoke blk_aio_*() APIs from outside the
> appropriate AioContext.
>
> Maybe fdc and friends should forbid this scenario:

Yeah, that seems like a simple solution.

Berto

Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Stefan Hajnoczi 5 years, 3 months ago
On Thu, Jan 17, 2019 at 02:24:10PM +0100, Alberto Garcia wrote:
> On Wed 16 Jan 2019 02:54:44 PM CET, Stefan Hajnoczi wrote:
> > This patch series makes virtio-blk and virtio-scsi more robust,
> > although I haven't checked what happens if the drive is attached to a
> > different IOThread than the device - will the switchover work?
> 
> You mean if you use x-blockdev-set-iothread with a drive and then attach
> it to a device using a different thread? That seems to work (with my
> patch).

Does it work with a running VM that will do I/O?  If yes, then that's
great and we should merge the virtio-blk/scsi patches to make them
robust for this case.  I'm a little surprised it works though :-)
because the code wasn't designed or tested for this.

Stefan
Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Kevin Wolf 5 years, 3 months ago
Am 18.01.2019 um 10:56 hat Stefan Hajnoczi geschrieben:
> On Thu, Jan 17, 2019 at 02:24:10PM +0100, Alberto Garcia wrote:
> > On Wed 16 Jan 2019 02:54:44 PM CET, Stefan Hajnoczi wrote:
> > > This patch series makes virtio-blk and virtio-scsi more robust,
> > > although I haven't checked what happens if the drive is attached to a
> > > different IOThread than the device - will the switchover work?
> > 
> > You mean if you use x-blockdev-set-iothread with a drive and then attach
> > it to a device using a different thread? That seems to work (with my
> > patch).
> 
> Does it work with a running VM that will do I/O?  If yes, then that's
> great and we should merge the virtio-blk/scsi patches to make them
> robust for this case.  I'm a little surprised it works though :-)
> because the code wasn't designed or tested for this.

There are two ways to trigger the crash even without
x-blockdev-set-iothread:

* device_del, then device_add for a device with iothread (virtio-scsi;
  may or may not exist with virtio-blk)
  https://bugzilla.redhat.com/show_bug.cgi?id=1656276

* Simply attach two devices with iothread to the the same node
  https://bugzilla.redhat.com/show_bug.cgi?id=1662508

Maybe this series is actually papering over the real problems, though.
Unplug should move the BDS back to the main AioContext, and attaching
two users with different AioContexts must fail. If we don't take care
there, the locking will be wrong.

Note that even though there are more problems, we still need this series
so we can allow attaching two devices to the same node while using the
same iothread.

Kevin
Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Alberto Garcia 5 years, 3 months ago
On Fri 18 Jan 2019 11:14:15 AM CET, Kevin Wolf wrote:
> There are two ways to trigger the crash even without
> x-blockdev-set-iothread:
>
> * device_del, then device_add for a device with iothread (virtio-scsi;
>   may or may not exist with virtio-blk)
>   https://bugzilla.redhat.com/show_bug.cgi?id=1656276
>
> * Simply attach two devices with iothread to the the same node
>   https://bugzilla.redhat.com/show_bug.cgi?id=1662508

While having a look at this I found another crash. Here's how to
reproduce it (wait for the events after each system_reset):

   { "execute": "qmp_capabilities" }
   { "execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "hd0"}}
   { "execute": "device_add", "arguments": {"id": "vb0", "driver": "virtio-blk", "drive": "hd0"}}
   { "execute": "system_reset"}

   { "execute": "device_del", "arguments": {"id": "vb0"}}
   { "execute": "system_reset"}

   { "execute": "device_add", "arguments": {"id": "vb0", "driver": "virtio-blk", "drive": "hd0"}}
   { "execute": "system_reset"}

   { "execute": "device_del", "arguments": {"id": "vb0"}}
   { "execute": "system_reset"}

   { "execute": "device_add", "arguments": {"id": "vb0", "driver": "virtio-blk", "drive": "hd0"}}
   { "execute": "system_reset"}

kvm_mem_ioeventfd_add: error adding ioeventfd: No space left on device
Aborted

git-bisect points to this commit:

   commit 3ac7d43a6fbb5d4a3d01fc9a055c218030af3727
   Author: Paolo Bonzini <pbonzini@redhat.com>
   Date:   Wed Nov 28 17:28:45 2018 +0100

       memory: update coalesced_range on transaction_commit

       The e1000 driver calls memory_region_add_coalescing but
       kvm_coalesce_mmio_region is never called for those regions.  The bug
       dates back to the introduction of the memory region API; to fix it,
       delete and re-add coalesced MMIO ranges when building the FlatViews.
       
       Because coalesced MMIO regions apply to all address spaces, the
       has_coalesced_range flag has to be changed into an int.

Berto

Re: [Qemu-devel] [PATCH v2 0/6] Acquire the AioContext during _realize()
Posted by Kevin Wolf 5 years, 3 months ago
Am 14.01.2019 um 15:23 hat Alberto Garcia geschrieben:
> This series acquires the AioContext in the _realize() functions of
> several devices before making use of their block backends. This fixes
> at least a couple of crashes (in virtio-blk and scsi). The other
> devices don't currently support iothreads so there's no crashes.

>  tests/qemu-iotests/236     | 121 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/236.out |  46 +++++++++++++++++

236 is already taken by two other series. :-)

I'll use 239 and 239.out for the conflict resolution, so I think the
next free one for you is 240.

Kevin