[PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter

Stefan Hajnoczi posted 2 patches 7 months, 2 weeks ago
Failed in applying to current master (apply log)
There is a newer version of this series
qapi/virtio.json                    |  30 +++++
hw/block/dataplane/virtio-blk.h     |   3 +
include/hw/qdev-properties-system.h |   4 +
include/hw/virtio/virtio-blk.h      |   2 +
hw/block/dataplane/virtio-blk.c     | 163 +++++++++++++++++++++-------
hw/block/virtio-blk.c               |  92 ++++++++++++++--
hw/core/qdev-properties-system.c    |  47 ++++++++
7 files changed, 287 insertions(+), 54 deletions(-)
[PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Stefan Hajnoczi 7 months, 2 weeks ago
virtio-blk and virtio-scsi devices need a way to specify the mapping between
IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
necessary to allow finer-grained assignment to spread the load. With this
series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
to exploit multiple IOThreads.

This series introduces command-line syntax for the new iothread-vq-mapping
property is as follows:

  --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'

IOThreads are specified by name and virtqueues are specified by 0-based
index.

It will be common to simply assign virtqueues round-robin across a set
of IOThreads. A convenient syntax that does not require specifying
individual virtqueue indices is available:

  --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'

There is no way to reassign virtqueues at runtime and I expect that to be a
very rare requirement.

Note that JSON --device syntax is required for the iothread-vq-mapping
parameter because it's non-scalar.

Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")

Stefan Hajnoczi (2):
  qdev: add IOThreadVirtQueueMappingList property type
  virtio-blk: add iothread-vq-mapping parameter

 qapi/virtio.json                    |  30 +++++
 hw/block/dataplane/virtio-blk.h     |   3 +
 include/hw/qdev-properties-system.h |   4 +
 include/hw/virtio/virtio-blk.h      |   2 +
 hw/block/dataplane/virtio-blk.c     | 163 +++++++++++++++++++++-------
 hw/block/virtio-blk.c               |  92 ++++++++++++++--
 hw/core/qdev-properties-system.c    |  47 ++++++++
 7 files changed, 287 insertions(+), 54 deletions(-)

-- 
2.41.0
Re: [PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Kevin Wolf 6 months ago
Am 18.09.2023 um 18:16 hat Stefan Hajnoczi geschrieben:
> virtio-blk and virtio-scsi devices need a way to specify the mapping between
> IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
> IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
> necessary to allow finer-grained assignment to spread the load. With this
> series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
> to exploit multiple IOThreads.
> 
> This series introduces command-line syntax for the new iothread-vq-mapping
> property is as follows:
> 
>   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'
> 
> IOThreads are specified by name and virtqueues are specified by 0-based
> index.
> 
> It will be common to simply assign virtqueues round-robin across a set
> of IOThreads. A convenient syntax that does not require specifying
> individual virtqueue indices is available:
> 
>   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'
> 
> There is no way to reassign virtqueues at runtime and I expect that to be a
> very rare requirement.
> 
> Note that JSON --device syntax is required for the iothread-vq-mapping
> parameter because it's non-scalar.
> 
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")

Does this strictly depend on patch 5/5 of that series, or would it just
be a missed opportunity for optimisation by unnecessarily running some
requests from a different thread?

I suspect it does depend on the other virtio-blk series, though:

[PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
https://patchew.org/QEMU/20230914140101.1065008-1-stefanha@redhat.com/

Is this right?

Given that soft freeze is early next week, maybe we should try to merge
just the bare minimum of strictly necessary dependencies.

Kevin
Re: [PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Stefan Hajnoczi 5 months, 3 weeks ago
On Thu, Nov 02, 2023 at 03:10:52PM +0100, Kevin Wolf wrote:
> Am 18.09.2023 um 18:16 hat Stefan Hajnoczi geschrieben:
> > virtio-blk and virtio-scsi devices need a way to specify the mapping between
> > IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
> > IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
> > necessary to allow finer-grained assignment to spread the load. With this
> > series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
> > to exploit multiple IOThreads.
> > 
> > This series introduces command-line syntax for the new iothread-vq-mapping
> > property is as follows:
> > 
> >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'
> > 
> > IOThreads are specified by name and virtqueues are specified by 0-based
> > index.
> > 
> > It will be common to simply assign virtqueues round-robin across a set
> > of IOThreads. A convenient syntax that does not require specifying
> > individual virtqueue indices is available:
> > 
> >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'
> > 
> > There is no way to reassign virtqueues at runtime and I expect that to be a
> > very rare requirement.
> > 
> > Note that JSON --device syntax is required for the iothread-vq-mapping
> > parameter because it's non-scalar.
> > 
> > Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
> 
> Does this strictly depend on patch 5/5 of that series, or would it just
> be a missed opportunity for optimisation by unnecessarily running some
> requests from a different thread?

I looked at the issue with PATCH 5/5 more and didn't find a minimal
solution that I can implement today for soft freeze. There are too much
inconsistency between blk_remove_bs() in whether or not the AioContext
is acquired:

block/block-backend.c:        blk_remove_bs(blk); <- blk_unref (can't tell if AioContext is acquired)
block/block-backend.c:            blk_remove_bs(blk); (acquired)
block/monitor/block-hmp-cmds.c:        blk_remove_bs(blk); (acquired)
block/qapi-sysemu.c:    blk_remove_bs(blk); (acquired)
block/qapi-sysemu.c:            blk_remove_bs(blk); (not acquired)
qemu-nbd.c:        blk_remove_bs(blk); (not acquired)
tests/unit/test-block-iothread.c:    blk_remove_bs(blk); (acquired)
tests/unit/test-blockjob.c:    blk_remove_bs(blk); (sometimes acquired, sometimes not)

They usually get away with it because BDRV_WAIT_WHILE() only unlocks the
AioContext when the BlockDriverState's AioContext is not the current
thread's home context. This means main loop code works when the
AioContext is not acquired as long as the BDS AioContext is the main
loop AioContext.

The solution I have confidence in is to stop using the AioContext lock,
but it will take more time to refactor the SCSI layer (the last real
user of the AioContext).

I'm afraid iothread-vq-mapping can't make it into QEMU 8.2.

Stefan
Re: [PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Stefan Hajnoczi 5 months, 3 weeks ago
On Thu, Nov 02, 2023 at 03:10:52PM +0100, Kevin Wolf wrote:
> Am 18.09.2023 um 18:16 hat Stefan Hajnoczi geschrieben:
> > virtio-blk and virtio-scsi devices need a way to specify the mapping between
> > IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
> > IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
> > necessary to allow finer-grained assignment to spread the load. With this
> > series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
> > to exploit multiple IOThreads.
> > 
> > This series introduces command-line syntax for the new iothread-vq-mapping
> > property is as follows:
> > 
> >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'
> > 
> > IOThreads are specified by name and virtqueues are specified by 0-based
> > index.
> > 
> > It will be common to simply assign virtqueues round-robin across a set
> > of IOThreads. A convenient syntax that does not require specifying
> > individual virtqueue indices is available:
> > 
> >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'
> > 
> > There is no way to reassign virtqueues at runtime and I expect that to be a
> > very rare requirement.
> > 
> > Note that JSON --device syntax is required for the iothread-vq-mapping
> > parameter because it's non-scalar.
> > 
> > Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
> 
> Does this strictly depend on patch 5/5 of that series, or would it just
> be a missed opportunity for optimisation by unnecessarily running some
> requests from a different thread?

"[PATCH v3 5/5] block-coroutine-wrapper: use
qemu_get_current_aio_context()" is necessary so that
virtio_blk_sect_range_ok -> blk_get_geometry -> blk_nb_sectors ->
bdrv_refresh_total_sectors -> bdrv_poll_co can be called without holding
the AioContext lock.

That case only happens when the BlockDriverState is a file-posix host
CD-ROM or a file-win32 host_device. Most users will never hit this
problem, but it would be unsafe to proceed merging code without this
patch.

> I suspect it does depend on the other virtio-blk series, though:
> 
> [PATCH 0/4] virtio-blk: prepare for the multi-queue block layer
> https://patchew.org/QEMU/20230914140101.1065008-1-stefanha@redhat.com/
> 
> Is this right?

Yes, it depends on "[PATCH 0/4] virtio-blk: prepare for the multi-queue
block layer" so that every AioContext is able to handle Linux AIO or
io_uring I/O and to stop using the AioContext lock in the virtio-blk I/O
code path.

Stefan

> 
> Given that soft freeze is early next week, maybe we should try to merge
> just the bare minimum of strictly necessary dependencies.
> 
> Kevin
> 
Re: [PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Kevin Wolf 5 months, 3 weeks ago
Am 07.11.2023 um 04:00 hat Stefan Hajnoczi geschrieben:
> On Thu, Nov 02, 2023 at 03:10:52PM +0100, Kevin Wolf wrote:
> > Am 18.09.2023 um 18:16 hat Stefan Hajnoczi geschrieben:
> > > virtio-blk and virtio-scsi devices need a way to specify the mapping between
> > > IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
> > > IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
> > > necessary to allow finer-grained assignment to spread the load. With this
> > > series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
> > > to exploit multiple IOThreads.
> > > 
> > > This series introduces command-line syntax for the new iothread-vq-mapping
> > > property is as follows:
> > > 
> > >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'
> > > 
> > > IOThreads are specified by name and virtqueues are specified by 0-based
> > > index.
> > > 
> > > It will be common to simply assign virtqueues round-robin across a set
> > > of IOThreads. A convenient syntax that does not require specifying
> > > individual virtqueue indices is available:
> > > 
> > >   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'
> > > 
> > > There is no way to reassign virtqueues at runtime and I expect that to be a
> > > very rare requirement.
> > > 
> > > Note that JSON --device syntax is required for the iothread-vq-mapping
> > > parameter because it's non-scalar.
> > > 
> > > Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")
> > 
> > Does this strictly depend on patch 5/5 of that series, or would it just
> > be a missed opportunity for optimisation by unnecessarily running some
> > requests from a different thread?
> 
> "[PATCH v3 5/5] block-coroutine-wrapper: use
> qemu_get_current_aio_context()" is necessary so that
> virtio_blk_sect_range_ok -> blk_get_geometry -> blk_nb_sectors ->
> bdrv_refresh_total_sectors -> bdrv_poll_co can be called without holding
> the AioContext lock.

Ooh, so we only have the whole problem because bdrv_poll_co() wants to
temporarily unlock an AioContext that we don't even hold? That's a real
shame, but I understand now why we need the patch.

> That case only happens when the BlockDriverState is a file-posix host
> CD-ROM or a file-win32 host_device. Most users will never hit this
> problem, but it would be unsafe to proceed merging code without this
> patch.

Yes, I agree.

Kevin
Re: [PATCH v2 0/2] virtio-blk: add iothread-vq-mapping parameter
Posted by Michael S. Tsirkin 7 months ago
On Mon, Sep 18, 2023 at 12:16:02PM -0400, Stefan Hajnoczi wrote:
> virtio-blk and virtio-scsi devices need a way to specify the mapping between
> IOThreads and virtqueues. At the moment all virtqueues are assigned to a single
> IOThread or the main loop. This single thread can be a CPU bottleneck, so it is
> necessary to allow finer-grained assignment to spread the load. With this
> series applied, "pidstat -t 1" shows that guests with -smp 2 or higher are able
> to exploit multiple IOThreads.
> 
> This series introduces command-line syntax for the new iothread-vq-mapping
> property is as follows:
> 
>   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]},...'
> 
> IOThreads are specified by name and virtqueues are specified by 0-based
> index.
> 
> It will be common to simply assign virtqueues round-robin across a set
> of IOThreads. A convenient syntax that does not require specifying
> individual virtqueue indices is available:
> 
>   --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]},...'
> 
> There is no way to reassign virtqueues at runtime and I expect that to be a
> very rare requirement.
> 
> Note that JSON --device syntax is required for the iothread-vq-mapping
> parameter because it's non-scalar.
> 
> Based-on: 20230912231037.826804-1-stefanha@redhat.com ("[PATCH v3 0/5] block-backend: process I/O in the current AioContext")


Acked-by: Michael S. Tsirkin <mst@redhat.com>

More of a block thingy so pls use that tree.

> Stefan Hajnoczi (2):
>   qdev: add IOThreadVirtQueueMappingList property type
>   virtio-blk: add iothread-vq-mapping parameter
> 
>  qapi/virtio.json                    |  30 +++++
>  hw/block/dataplane/virtio-blk.h     |   3 +
>  include/hw/qdev-properties-system.h |   4 +
>  include/hw/virtio/virtio-blk.h      |   2 +
>  hw/block/dataplane/virtio-blk.c     | 163 +++++++++++++++++++++-------
>  hw/block/virtio-blk.c               |  92 ++++++++++++++--
>  hw/core/qdev-properties-system.c    |  47 ++++++++
>  7 files changed, 287 insertions(+), 54 deletions(-)
> 
> -- 
> 2.41.0