[PATCH RESEND v3 0/4] virtio-pci: enable blk and scsi multi-queue by default

Posted by Stefan Hajnoczi 4 years ago
v3:
 * Add new performance results that demonstrate the scalability
 * Mention that this is PCI-specific [Cornelia]
v2:
 * Let the virtio-DEVICE-pci device select num-queues because the optimal
   multi-queue configuration may differ between virtio-pci, virtio-mmio, and
   virtio-ccw [Cornelia]

Enabling multi-queue on virtio-pci storage devices improves performance on SMP
guests because the completion interrupt is handled on the vCPU that submitted
the I/O request.  This avoids IPIs inside the guest.
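
Multi-queue support is already available when configured by hand; this series
only changes the default.  An explicit setting looks like this (illustrative
command line; note that virtio-blk-pci spells the property "num-queues" while
virtio-scsi-pci uses "num_queues"):

  qemu-system-x86_64 -smp 4 -m 4G \
      -drive if=none,id=drive0,file=disk.img,format=raw \
      -device virtio-blk-pci,drive=drive0,num-queues=4 \
      -device virtio-scsi-pci,num_queues=4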

Note that performance is unchanged in these cases:
1. Uniprocessor guests.  They don't have IPIs.
2. Application threads that happen to be scheduled on the sole vCPU that
   handles completion interrupts.  (This is one reason why benchmark results
   can vary noticeably between runs.)
3. Applications that the user has bound to the vCPU that handles completion
   interrupts, as the commands below illustrate.
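
On a Linux guest you can check which vCPU services a device's completion
interrupts and, for case 3, pin an application there by hand (device names
and CPU numbers below are illustrative):

  $ grep virtio /proc/interrupts        # per-vCPU counts for each virtio IRQ
  $ taskset -c 2 ./my-benchmark         # pin the workload to that vCPU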

Set the number of queues to the number of vCPUs by default on virtio-blk and
virtio-scsi PCI devices.  Older machine types continue to default to 1 queue
for live migration compatibility.
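
The selection happens in the PCI proxy's realize function.  A simplified
sketch of the approach (names follow the series, but details may differ from
the final code):

  /* Choose a queue count that allows a 1:1 vq:vCPU mapping, leaving room
   * for fixed virtqueues such as virtio-scsi's control and event queues.
   */
  unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues)
  {
      return MIN(current_machine->smp.cpus,
                 VIRTIO_QUEUE_MAX - fixed_queues);
  }

  /* virtio-blk-pci realize: only pick a default when the user did not set
   * num-queues explicitly on the command line.
   */
  if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) {
      conf->num_queues = virtio_pci_optimal_num_queues(0 /* fixed queues */);
  }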

Random read performance:
      IOPS
q=1    78k
q=32  104k  +33%

Boot time:
      Duration
q=1        51s
q=32     1m41s  +98%

Guest configuration: 32 vCPUs, 101 virtio-blk-pci disks
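
A random read job along these lines exercises the measured path (fio
parameters are illustrative, not necessarily the exact ones used):

  $ fio --name=randread --filename=/dev/vdb --direct=1 --rw=randread \
        --bs=4k --ioengine=libaio --iodepth=64 --runtime=60 --time_based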

Previously measured results on a 4 vCPU guest were also positive but showed a
smaller 1-4% performance improvement.  They are no longer valid because
significant event loop optimizations have been merged.

Stefan Hajnoczi (4):
  virtio-scsi: introduce a constant for fixed virtqueues
  virtio-scsi: default num_queues to -smp N
  virtio-blk: default num_queues to -smp N
  vhost-user-blk: default num_queues to -smp N

 hw/block/vhost-user-blk.c          |  6 +++++-
 hw/block/virtio-blk.c              |  6 +++++-
 hw/core/machine.c                  |  5 +++++
 hw/scsi/vhost-scsi.c               |  3 ++-
 hw/scsi/vhost-user-scsi.c          |  5 +++--
 hw/scsi/virtio-scsi.c              | 13 +++++++++----
 hw/virtio/vhost-scsi-pci.c         | 10 ++++++++--
 hw/virtio/vhost-user-blk-pci.c     |  6 ++++++
 hw/virtio/vhost-user-scsi-pci.c    | 10 ++++++++--
 hw/virtio/virtio-blk-pci.c         |  9 ++++++++-
 hw/virtio/virtio-scsi-pci.c        | 10 ++++++++--
 include/hw/virtio/vhost-user-blk.h |  2 ++
 include/hw/virtio/virtio-blk.h     |  2 ++
 include/hw/virtio/virtio-scsi.h    |  5 +++++
 14 files changed, 76 insertions(+), 16 deletions(-)

-- 
2.24.1

Re: [PATCH RESEND v3 0/4] virtio-pci: enable blk and scsi multi-queue by default
Posted by Michael S. Tsirkin 4 years ago
On Fri, Mar 20, 2020 at 10:30:37AM +0000, Stefan Hajnoczi wrote:
> v3:
>  * Add new performance results that demonstrate the scalability
>  * Mention that this is PCI-specific [Cornelia]
> v2:
>  * Let the virtio-DEVICE-pci device select num-queues because the optimal
>    multi-queue configuration may differ between virtio-pci, virtio-mmio, and
>    virtio-ccw [Cornelia]


I'd like to queue it for merge after the release. If possible,
please ping me after the release to help make sure it doesn't get
dropped.

Thanks!


> Enabling multi-queue on virtio-pci storage devices improves performance on SMP
> guests because the completion interrupt is handled on the vCPU that submitted
> the I/O request.  This avoids IPIs inside the guest.
> 
> Note that performance is unchanged in these cases:
> 1. Uniprocessor guests.  They don't have IPIs.
> 2. Application threads that happen to be scheduled on the sole vCPU that
>    handles completion interrupts.  (This is one reason why benchmark results
>    can vary noticeably between runs.)
> 3. Applications that the user has bound to the vCPU that handles completion
>    interrupts.
> 
> Set the number of queues to the number of vCPUs by default on virtio-blk and
> virtio-scsi PCI devices.  Older machine types continue to default to 1 queue
> for live migration compatibility.
> 
> Random read performance:
>       IOPS
> q=1    78k
> q=32  104k  +33%
> 
> Boot time:
>       Duration
> q=1        51s
> q=32     1m41s  +98%
> 
> Guest configuration: 32 vCPUs, 101 virtio-blk-pci disks
> 
> Previously measured results on a 4 vCPU guest were also positive but showed a
> smaller 1-4% performance improvement.  They are no longer valid because
> significant event loop optimizations have been merged.
> 
> Stefan Hajnoczi (4):
>   virtio-scsi: introduce a constant for fixed virtqueues
>   virtio-scsi: default num_queues to -smp N
>   virtio-blk: default num_queues to -smp N
>   vhost-user-blk: default num_queues to -smp N
> 
>  hw/block/vhost-user-blk.c          |  6 +++++-
>  hw/block/virtio-blk.c              |  6 +++++-
>  hw/core/machine.c                  |  5 +++++
>  hw/scsi/vhost-scsi.c               |  3 ++-
>  hw/scsi/vhost-user-scsi.c          |  5 +++--
>  hw/scsi/virtio-scsi.c              | 13 +++++++++----
>  hw/virtio/vhost-scsi-pci.c         | 10 ++++++++--
>  hw/virtio/vhost-user-blk-pci.c     |  6 ++++++
>  hw/virtio/vhost-user-scsi-pci.c    | 10 ++++++++--
>  hw/virtio/virtio-blk-pci.c         |  9 ++++++++-
>  hw/virtio/virtio-scsi-pci.c        | 10 ++++++++--
>  include/hw/virtio/vhost-user-blk.h |  2 ++
>  include/hw/virtio/virtio-blk.h     |  2 ++
>  include/hw/virtio/virtio-scsi.h    |  5 +++++
>  14 files changed, 76 insertions(+), 16 deletions(-)
> 
> -- 
> 2.24.1
> 


Re: [PATCH RESEND v3 0/4] virtio-pci: enable blk and scsi multi-queue by default
Posted by Pankaj Gupta 3 years, 12 months ago
For the best case it's really a good idea to configure the default number of
queues to the number of CPUs.

For the series:
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>

Re: [PATCH RESEND v3 0/4] virtio-pci: enable blk and scsi multi-queue by default
Posted by Michael S. Tsirkin 3 years, 10 months ago
On Fri, Mar 20, 2020 at 10:30:37AM +0000, Stefan Hajnoczi wrote:
> v3:
>  * Add new performance results that demonstrate the scalability
>  * Mention that this is PCI-specific [Cornelia]
> v2:
>  * Let the virtio-DEVICE-pci device select num-queues because the optimal
>    multi-queue configuration may differ between virtio-pci, virtio-mmio, and
>    virtio-ccw [Cornelia]


So this needs to be rebased with respect to compat properties. I also see
some comments from Cornelia that are worth addressing.

> Enabling multi-queue on virtio-pci storage devices improves performance on SMP
> guests because the completion interrupt is handled on the vCPU that submitted
> the I/O request.  This avoids IPIs inside the guest.
> 
> Note that performance is unchanged in these cases:
> 1. Uniprocessor guests.  They don't have IPIs.
> 2. Application threads that happen to be scheduled on the sole vCPU that
>    handles completion interrupts.  (This is one reason why benchmark results
>    can vary noticeably between runs.)
> 3. Applications that the user has bound to the vCPU that handles completion
>    interrupts.
> 
> Set the number of queues to the number of vCPUs by default on virtio-blk and
> virtio-scsi PCI devices.  Older machine types continue to default to 1 queue
> for live migration compatibility.
> 
> Random read performance:
>       IOPS
> q=1    78k
> q=32  104k  +33%
> 
> Boot time:
>       Duration
> q=1        51s
> q=32     1m41s  +98%
> 
> Guest configuration: 32 vCPUs, 101 virtio-blk-pci disks
> 
> Previously measured results on a 4 vCPU guest were also positive but showed a
> smaller 1-4% performance improvement.  They are no longer valid because
> significant event loop optimizations have been merged.
> 
> Stefan Hajnoczi (4):
>   virtio-scsi: introduce a constant for fixed virtqueues
>   virtio-scsi: default num_queues to -smp N
>   virtio-blk: default num_queues to -smp N
>   vhost-user-blk: default num_queues to -smp N
> 
>  hw/block/vhost-user-blk.c          |  6 +++++-
>  hw/block/virtio-blk.c              |  6 +++++-
>  hw/core/machine.c                  |  5 +++++
>  hw/scsi/vhost-scsi.c               |  3 ++-
>  hw/scsi/vhost-user-scsi.c          |  5 +++--
>  hw/scsi/virtio-scsi.c              | 13 +++++++++----
>  hw/virtio/vhost-scsi-pci.c         | 10 ++++++++--
>  hw/virtio/vhost-user-blk-pci.c     |  6 ++++++
>  hw/virtio/vhost-user-scsi-pci.c    | 10 ++++++++--
>  hw/virtio/virtio-blk-pci.c         |  9 ++++++++-
>  hw/virtio/virtio-scsi-pci.c        | 10 ++++++++--
>  include/hw/virtio/vhost-user-blk.h |  2 ++
>  include/hw/virtio/virtio-blk.h     |  2 ++
>  include/hw/virtio/virtio-scsi.h    |  5 +++++
>  14 files changed, 76 insertions(+), 16 deletions(-)
> 
> -- 
> 2.24.1
>