[PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path

Paolo Bonzini posted 8 patches 1 year ago
Posted by Paolo Bonzini 1 year ago
The introduction of the graph lock is causing blk_get_geometry, a hot
function used in the I/O path, to create a coroutine for the call to
bdrv_co_refresh_total_sectors.

In theory the call to bdrv_co_refresh_total_sectors should only matter
in the rare case of host CD-ROM devices, whose size changes when a medium
is added or removed.  However, the call is actually keyed by a field in
BlockDriver, drv->has_variable_length, and the field is true in the common
case of the raw driver!  This is because the host CD-ROM is usually
layered below the raw driver.

So, this series starts by moving has_variable_length from BlockDriver to
BlockLimits.  This is patches 1-4, which also include a fix for a small
latent bug (patch 3).
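
To make the difference concrete, here is a minimal sketch of the two schemes. It uses simplified stand-in types, not the real QEMU structures: `Driver`, `Limits`, `Node`, `needs_refresh_by_driver()`, `refresh_limits()` and `needs_refresh_by_limits()` are all hypothetical names chosen for illustration. The assumption is that a layered node (such as raw) copies the flag from its child when limits are refreshed, while a protocol-level node sets it from its own knowledge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not the real QEMU types. */
typedef struct Driver {
    const char *name;
    bool has_variable_length;   /* per-driver flag (old scheme) */
} Driver;

typedef struct Limits {
    bool has_variable_length;   /* per-node flag (new scheme) */
} Limits;

typedef struct Node {
    const Driver *drv;
    struct Node *child;         /* NULL for protocol-level nodes */
    Limits bl;
} Node;

/* Old scheme: the decision depends only on which driver the node uses. */
static bool needs_refresh_by_driver(const Node *n)
{
    return n->drv->has_variable_length;
}

/*
 * New scheme: compute the flag per node.  A layered node inherits the
 * flag from its child; a protocol-level node uses what its driver knows.
 */
static void refresh_limits(Node *n)
{
    assert(n != NULL && n->drv != NULL);
    if (n->child) {
        refresh_limits(n->child);
        n->bl.has_variable_length = n->child->bl.has_variable_length;
    } else {
        n->bl.has_variable_length = n->drv->has_variable_length;
    }
}

/* New scheme: the hot path consults the per-node flag only. */
static bool needs_refresh_by_limits(const Node *n)
{
    return n->bl.has_variable_length;
}
```

In this model, a raw node over a regular file is flagged by the per-driver check but not by the per-node check, while a raw node over a host CD-ROM is flagged by both.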

The second half of the series then cleans up the functions to retrieve
the BlockDriverState's size (patches 5-7) to limit the amount of duplicated
code introduced by the hand-written wrappers of patch 8.  The final result
is that blk_get_geometry will no longer create a coroutine.

This series applies to qemu.git, or to the block-next branch if commit
d8fbf9aa85ae ("block/export: Fix graph locking in blk_get_geometry()
call", 2023-03-27) is cherry-picked.  Commit d8fbf9aa85ae is also where
bdrv_co_get_geometry() was introduced and with it the performance
regression.  It is quite a recent change, and therefore this is
probably a regression in 8.0 that had not been detected yet (except by
Stefan, who talked to Kevin and me about it yesterday).  I'm not sure how
we can avoid the regression, other than by completely disabling the graph
lock (!) or applying this large series.

I'm throwing this out before disappearing for a couple days for Easter;
I have only tested it with qemu-iotests and "make check-unit".

Thanks,

Paolo

Paolo Bonzini (8):
  block: move has_variable_length to BlockLimits
  block: remove has_variable_length from filters
  block: refresh bs->total_sectors on reopen
  block: remove has_variable_length from BlockDriver
  migration/block: replace uses of blk_nb_sectors that do not check
    result
  block-backend: inline bdrv_co_get_geometry
  block-backend: ignore inserted state in blk_co_nb_sectors
  block, block-backend: write some hot coroutine wrappers by hand

 block.c                           | 35 ++++++++++++++++++--------
 block/block-backend.c             | 42 ++++++++++++++++++++++++-------
 block/copy-on-read.c              |  1 -
 block/file-posix.c                | 12 ++++++---
 block/file-win32.c                |  2 +-
 block/filter-compress.c           |  1 -
 block/io.c                        |  4 +++
 block/preallocate.c               |  1 -
 block/raw-format.c                |  3 ++-
 block/replication.c               |  1 -
 include/block/block-io.h          |  5 +---
 include/block/block_int-common.h  | 10 ++++++--
 include/sysemu/block-backend-io.h |  5 ++--
 migration/block.c                 |  5 ++--
 14 files changed, 85 insertions(+), 42 deletions(-)

-- 
2.39.2
Re: [PATCH 8.0 regression 0/8] block: remove bdrv_co_get_geometry coroutines from I/O hot path
Posted by Kevin Wolf 1 year ago
On 07.04.2023 at 17:32, Paolo Bonzini wrote:
> The introduction of the graph lock is causing blk_get_geometry, a hot
> function used in the I/O path, to create a coroutine for the call to
> bdrv_co_refresh_total_sectors.
> 
> In theory the call to bdrv_co_refresh_total_sectors should only matter
> in the rare case of host CD-ROM devices, whose size changes when a medium
> is added or removed.  However, the call is actually keyed by a field in
> BlockDriver, drv->has_variable_length, and the field is true in the common
> case of the raw driver!  This is because the host CD-ROM is usually
> layered below the raw driver.
> 
> So, this series starts by moving has_variable_length from BlockDriver to
> BlockLimits.  This is patches 1-4, which also include a fix for a small
> latent bug (patch 3).
> 
> The second half of the series then cleans up the functions to retrieve
> the BlockDriverState's size (patches 5-7) to limit the amount of duplicated
> code introduced by the hand-written wrappers of patch 8.  The final result
> is that blk_get_geometry will no longer create a coroutine.
> 
> This series applies to qemu.git, or to the block-next branch if commit
> d8fbf9aa85ae ("block/export: Fix graph locking in blk_get_geometry()
> call", 2023-03-27) is cherry-picked.  Commit d8fbf9aa85ae is also where
> bdrv_co_get_geometry() was introduced and with it the performance
> regression.  It is quite a recent change, and therefore this is
> probably a regression in 8.0 that had not been detected yet (except by
> Stefan, who talked to Kevin and me about it yesterday).  I'm not sure how
> we can avoid the regression, other than by completely disabling the graph
> lock (!) or applying this large series.
> 
> I'm throwing this out before disappearing for a couple days for Easter;
> I have only tested it with qemu-iotests and "make check-unit".

Thanks, fixed up patch 8 to make the non-coroutine wrappers almost exact
copies of the coroutine version (including fixing the bug that Eric
found), and applied to the block branch.

I'm not sure whether the functions actually need to be coroutine_mixed_fn,
because coroutines should already call blk_co_get_geometry(), but we can
clean that up later.

Kevin