[PATCH 0/4] qemu: fix block jobs when copy_on_read is enabled

Peter Krempa posted 4 patches 4 years, 3 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/cover.1579794423.git.pkrempa@redhat.com
src/qemu/qemu_domain.c    | 38 ++++++++++++++++++++++++++------------
src/qemu/qemu_domain.h    |  4 ++++
src/qemu/qemu_driver.c    |  6 +++---
src/qemu/qemu_migration.c |  2 +-
4 files changed, 34 insertions(+), 16 deletions(-)
[PATCH 0/4] qemu: fix block jobs when copy_on_read is enabled
Posted by Peter Krempa 4 years, 3 months ago
QEMU requires us to pass the topmost node(-name) in the backing chain as
the 'device' argument for the blockdev-mirror/block-commit commands.
Otherwise QEMU complains:

 "Need a root block node"

Since libvirt always puts the copy-on-read driver on top of the backing
chain, blockjobs on disks which use "copy_on_read" fail.

The series below fixes that by passing the proper node. This fixes it
just fine for the mirror, but for block-commit we still get another
error due to a bug in qemu:

 "'libvirt-4-format' is not in this backing file chain"

Additionally one weird thing happens with block-stream (blockpull). If I
pass the layer below the copy-on-read as 'device', everything works.
This is kind of weird as the documentation for that is the same. If I
pass the nodename of copy-on-read.

Max, Kevin, could you please comment on the block-stream part and
possibly also on the plans to fix the issue with block-commit?

Cc: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>

https://bugzilla.redhat.com/show_bug.cgi?id=1792195

Peter Krempa (4):
  qemu: blockcopy: Actually unplug unused images when mirror job fails
    to start
  qemu: domain: Extract code to determine topmost nodename to
    qemuDomainDiskGetTopNodename
  qemu: Fix value of 'device' argument for blockdev-mirror
  qemu: Fix value of 'device' argument for block-commit

 src/qemu/qemu_domain.c    | 38 ++++++++++++++++++++++++++------------
 src/qemu/qemu_domain.h    |  4 ++++
 src/qemu/qemu_driver.c    |  6 +++---
 src/qemu/qemu_migration.c |  2 +-
 4 files changed, 34 insertions(+), 16 deletions(-)

-- 
2.24.1

Re: [PATCH 0/4] qemu: fix block jobs when copy_on_read is enabled
Posted by Max Reitz 4 years, 3 months ago
On 23.01.20 16:57, Peter Krempa wrote:
> QEMU requires us to pass the topmost node(-name) in the backing chain as
> the 'device' argument for the blockdev-mirror/block-commit commands.
> Otherwise QEMU complains:
> 
>  "Need a root block node"
> 
> Since libvirt always puts the copy-on-read driver on top of the backing
> chain, blockjobs on disks which use "copy_on_read" fail.
> 
> The series below fixes that by passing the proper node. This fixes it
> just fine for the mirror, but for block-commit we still get another
> error due to a bug in qemu:
> 
>  "'libvirt-4-format' is not in this backing file chain"
> 
> Additionally one weird thing happens with block-stream (blockpull). If I
> pass the layer below the copy-on-read as 'device', everything works.
> This is kind of weird as the documentation for that is the same. If I
> pass the nodename of copy-on-read.

It isn’t.  block-stream says:
> # @device: the device or node name of the top image
block-commit says:
> # @device:  the device name or node-name of a root node
blockdev-mirror says:
> # @device: The device name or node-name of a root node whose writes should be
> #          mirrored.

Note the subtle difference: “top” != “root”.  “top” is meant just as
part of the job: There’s a base image, and a top image.  The top image
is the target (for stream, that is), the base is the first image in the
chain that will be kept.

“root node” means actually a root node in the graph.

Hence, stream doesn’t require @device to point to a root node.  (Before
qemu commit 554b6147650 it did, but that commit relaxed block-stream.)

> Max, Kevin, could you please comment on the block-stream part and
> possibly also on the plans to fix the issue with block-commit?

The plans are to fix it.  The latest version of the series is here:

https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00350.html

It’s from August because it’s an extreme churn to work on new versions.
 It touches basically everything in the block layer, so rebasing it
alone takes time.  And then there are facts like reviews leading to new
prerequesite series, namely:

https://lists.nongnu.org/archive/html/qemu-block/2019-11/msg00370.html
https://lists.nongnu.org/archive/html/qemu-block/2019-11/msg00908.html

So in total, there are 96 patches.  The churn for each version is hard.
 Finding willing reviewers is even harder.

(I actually have basically no reviews for one of those prerequisite
series (“block: Introduce real BdrvChildRole”) yet, so there’s little
point in working on v7 of the main series.)

Max

> 
> Cc: Max Reitz <mreitz@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1792195
> 
> Peter Krempa (4):
>   qemu: blockcopy: Actually unplug unused images when mirror job fails
>     to start
>   qemu: domain: Extract code to determine topmost nodename to
>     qemuDomainDiskGetTopNodename
>   qemu: Fix value of 'device' argument for blockdev-mirror
>   qemu: Fix value of 'device' argument for block-commit
> 
>  src/qemu/qemu_domain.c    | 38 ++++++++++++++++++++++++++------------
>  src/qemu/qemu_domain.h    |  4 ++++
>  src/qemu/qemu_driver.c    |  6 +++---
>  src/qemu/qemu_migration.c |  2 +-
>  4 files changed, 34 insertions(+), 16 deletions(-)
> 


Re: [PATCH 0/4] qemu: fix block jobs when copy_on_read is enabled
Posted by Peter Krempa 4 years, 3 months ago
On Fri, Jan 24, 2020 at 11:23:46 +0100, Max Reitz wrote:
> On 23.01.20 16:57, Peter Krempa wrote:
> > QEMU requires us to pass the topmost node(-name) in the backing chain as
> > the 'device' argument for the blockdev-mirror/block-commit commands.
> > Otherwise QEMU complains:
> > 
> >  "Need a root block node"
> > 
> > Since libvirt always puts the copy-on-read driver on top of the backing
> > chain, blockjobs on disks which use "copy_on_read" fail.
> > 
> > The series below fixes that by passing the proper node. This fixes it
> > just fine for the mirror, but for block-commit we still get another
> > error due to a bug in qemu:
> > 
> >  "'libvirt-4-format' is not in this backing file chain"
> > 
> > Additionally one weird thing happens with block-stream (blockpull). If I
> > pass the layer below the copy-on-read as 'device', everything works.
> > This is kind of weird as the documentation for that is the same. If I
> > pass the nodename of copy-on-read.
> 
> It isn’t.  block-stream says:
> > # @device: the device or node name of the top image
> block-commit says:
> > # @device:  the device name or node-name of a root node
> blockdev-mirror says:
> > # @device: The device name or node-name of a root node whose writes should be
> > #          mirrored.
> 
> Note the subtle difference: “top” != “root”.  “top” is meant just as
> part of the job: There’s a base image, and a top image.  The top image
> is the target (for stream, that is), the base is the first image in the
> chain that will be kept.

Ah, right. The distinction was too subtle for me to detect :)

> 
> “root node” means actually a root node in the graph.
> 
> Hence, stream doesn’t require @device to point to a root node.  (Before
> qemu commit 554b6147650 it did, but that commit relaxed block-stream.)

Cool, thanks for the version info. Since we use this only starting from
qemu-4.2 we are safe to always use it.


> > Max, Kevin, could you please comment on the block-stream part and
> > possibly also on the plans to fix the issue with block-commit?
> 
> The plans are to fix it.  The latest version of the series is here:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00350.html
> 
> It’s from August because it’s an extreme churn to work on new versions.
>  It touches basically everything in the block layer, so rebasing it
> alone takes time.  And then there are facts like reviews leading to new
> prerequesite series, namely:
> 
> https://lists.nongnu.org/archive/html/qemu-block/2019-11/msg00370.html
> https://lists.nongnu.org/archive/html/qemu-block/2019-11/msg00908.html
> 
> So in total, there are 96 patches.  The churn for each version is hard.
>  Finding willing reviewers is even harder.
> 
> (I actually have basically no reviews for one of those prerequisite
> series (“block: Introduce real BdrvChildRole”) yet, so there’s little
> point in working on v7 of the main series.)

Thanks for the info. Unfortunately I'm not too familiar with qemu to
warant a review for this :(

Thank you for the information!