This patch series aims to improve the speed of qemu-img rebase.

1. Mainly by removing unnecessary reads when rebasing on the same
   chain.
2. But also by minimizing the number of bdrv_open calls rebase
   requires.

v2:
- Added missing g_free in
  "qemu-img: rebase: Reuse in-chain BlockDriverState"

Sam Eiderman (3):
  qemu-img: rebase: Reuse parent BlockDriverState
  qemu-img: rebase: Reduce reads on in-chain rebase
  qemu-img: rebase: Reuse in-chain BlockDriverState

 qemu-img.c | 85 ++++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 27 deletions(-)

--
2.13.3
On 5/2/19 8:58 AM, Sam Eiderman wrote:
> This patch series aims to improve the speed of qemu-img rebase.
>
> 1. Mainly by removing unnecessary reads when rebasing on the same
>    chain.
> 2. But also by minimizing the number of bdrv_open calls rebase
>    requires.
>

When sending a v2 series, it's best to do so as a new top-level thread
rather than in-reply-to the v1 series, as our CI tools are more likely
to spot it that way.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
Gentle ping.

Do you want me to resend this patch as a new top-level thread?

Thanks,
Sam

> On 2 May 2019, at 17:21, Eric Blake <eblake@redhat.com> wrote:
>
> On 5/2/19 8:58 AM, Sam Eiderman wrote:
>> This patch series aims to improve the speed of qemu-img rebase.
>>
>> 1. Mainly by removing unnecessary reads when rebasing on the same
>>    chain.
>> 2. But also by minimizing the number of bdrv_open calls rebase
>>    requires.
>>
>
> When sending a v2 series, it's best to do so as a new top-level thread
> rather than in-reply-to the v1 series, as our CI tools are more likely
> to spot it that way.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
>
I see, Thanks

> On 2 May 2019, at 17:21, Eric Blake <eblake@redhat.com> wrote:
>
> On 5/2/19 8:58 AM, Sam Eiderman wrote:
>> This patch series aims to improve the speed of qemu-img rebase.
>>
>> 1. Mainly by removing unnecessary reads when rebasing on the same
>>    chain.
>> 2. But also by minimizing the number of bdrv_open calls rebase
>>    requires.
>>
>
> When sending a v2 series, it's best to do so as a new top-level thread
> rather than in-reply-to the v1 series, as our CI tools are more likely
> to spot it that way.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
>
In safe mode we open the entire chain, including the parent backing
file of the rebased file.
Do not open a new BlockBackend for the parent backing file, which
saves opening the rest of the chain twice, which for long chains
saves many "pricy" bdrv_open() calls.
Permissions for blk_new() were copied from blk_new_open() when
flags = 0.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
---
qemu-img.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 8ee63daeae..d9b609b3f0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3297,28 +3297,23 @@ static int img_rebase(int argc, char **argv)
     /* For safe rebasing we need to compare old and new backing file */
     if (!unsafe) {
-        char backing_name[PATH_MAX];
         QDict *options = NULL;
+        BlockDriverState *base_bs = backing_bs(bs);

-        if (bs->backing_format[0] != '\0') {
-            options = qdict_new();
-            qdict_put_str(options, "driver", bs->backing_format);
+        if (!base_bs) {
+            error_setg(&local_err, "Image does not have a backing file");
+            ret = -1;
+            goto out;
         }

-        if (force_share) {
-            if (!options) {
-                options = qdict_new();
-            }
-            qdict_put_bool(options, BDRV_OPT_FORCE_SHARE, true);
-        }
-        bdrv_get_backing_filename(bs, backing_name, sizeof(backing_name));
-        blk_old_backing = blk_new_open(backing_name, NULL,
-                                       options, src_flags, &local_err);
-        if (!blk_old_backing) {
+        blk_old_backing = blk_new(BLK_PERM_CONSISTENT_READ,
+                                  BLK_PERM_ALL);
+        ret = blk_insert_bs(blk_old_backing, base_bs,
+                            &local_err);
+        if (ret < 0) {
             error_reportf_err(local_err,
-                              "Could not open old backing file '%s': ",
-                              backing_name);
-            ret = -1;
+                              "Could not reuse old backing file '%s': ",
+                              base_bs->filename);
             goto out;
         }
--
2.13.3
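As a rough illustration of the saving described in this commit message, the toy model below counts image opens for a backing chain of a given length. It is not QEMU code: `chain_opens_old()` and `chain_opens_new()` are hypothetical helpers, and the only assumption is that opening an image also opens every image below it in its backing chain, at one bdrv_open() call each. The open of the new backing file is left out of the count.

```c
#include <assert.h>

/*
 * Toy model, not QEMU code. Assumption: opening an image opens it plus
 * every image below it in the backing chain, one bdrv_open() each.
 * chain_len counts the rebased image itself, so it must be at least 2
 * for a safe rebase (the image needs a backing file).
 */

/*
 * Before the patch: the image's chain is opened, then the old backing
 * file is opened again through a second BlockBackend, which re-opens
 * everything below the rebased image.
 */
static unsigned chain_opens_old(unsigned chain_len)
{
    assert(chain_len >= 2);
    return chain_len + (chain_len - 1);
}

/*
 * After the patch: the already-open backing node is attached to a new
 * BlockBackend, so nothing is opened a second time.
 */
static unsigned chain_opens_new(unsigned chain_len)
{
    assert(chain_len >= 2);
    return chain_len;
}
```

Under this assumption a three-image chain (A <- B <- C) goes from five opens to three, and the saving grows linearly with the chain length.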
On 02.05.19 15:58, Sam Eiderman wrote:
> In safe mode we open the entire chain, including the parent backing
> file of the rebased file.
> Do not open a new BlockBackend for the parent backing file, which
> saves opening the rest of the chain twice, which for long chains
> saves many "pricy" bdrv_open() calls.
>
> Permissions for blk_new() were copied from blk_new_open() when
> flags = 0.
>
> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
> Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
> Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
> Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
> ---
>  qemu-img.c | 29 ++++++++++++-----------------
>  1 file changed, 12 insertions(+), 17 deletions(-)

Looks good! But I’m afraid it will need a rebase on my “Allow rebase
with no input base” series (which is in master)...

Max
In the following case:
(base) A <- B <- C (tip)
when running:
qemu-img rebase -b A C
QEMU would read all sectors not allocated in the file being rebased (C)
and compare them to the new base image (A), regardless of whether they
were changed or even allocated anywhere along the chain between the new
base and the top image (B). This causes many unneeded reads when
rebasing an image which represents a small diff of a large disk, as it
would read most of the disk's sectors.
Instead, use bdrv_is_allocated_above() to reduce the number of
unnecessary reads.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
---
qemu-img.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index d9b609b3f0..7f20858cb9 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3152,7 +3152,7 @@ static int img_rebase(int argc, char **argv)
     BlockBackend *blk = NULL, *blk_old_backing = NULL, *blk_new_backing = NULL;
     uint8_t *buf_old = NULL;
     uint8_t *buf_new = NULL;
-    BlockDriverState *bs = NULL;
+    BlockDriverState *bs = NULL, *prefix_chain_bs = NULL;
     char *filename;
     const char *fmt, *cache, *src_cache, *out_basefmt, *out_baseimg;
     int c, flags, src_flags, ret;
@@ -3343,6 +3343,12 @@ static int img_rebase(int argc, char **argv)
                 goto out;
             }

+            /*
+             * Find out whether we rebase an image on top of a previous image
+             * in its chain.
+             */
+            prefix_chain_bs = bdrv_find_backing_image(bs, out_real_path);
+
             blk_new_backing = blk_new_open(out_real_path, NULL,
                                            options, src_flags, &local_err);
             g_free(out_real_path);
@@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
                 continue;
             }

+            if (prefix_chain_bs) {
+                /*
+                 * If cluster wasn't changed since prefix_chain, we don't need
+                 * to take action
+                 */
+                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
+                                              offset, n, &n);
+                if (ret < 0) {
+                    error_report("error while reading image metadata: %s",
+                                 strerror(-ret));
+                    goto out;
+                }
+                if (!ret) {
+                    continue;
+                }
+            }
+
             /*
              * Read old and new backing file and take into consideration that
              * backing files may be smaller than the COW image.
--
2.13.3
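The read reduction in this patch can be modelled outside QEMU. The sketch below is a toy, not the QEMU implementation: `struct toy_image`, `toy_is_allocated_above()`, `toy_clusters_to_compare()` and `NB_CLUSTERS` are all hypothetical names, and each image is reduced to one allocation flag per cluster. It counts how many clusters would have to be read and compared with and without the allocated-above check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_CLUSTERS 8   /* hypothetical image size, in clusters */

/* Toy image: one allocation flag per cluster and a backing pointer. */
struct toy_image {
    bool allocated[NB_CLUSTERS];
    const struct toy_image *backing;
};

/*
 * Toy counterpart of the idea behind bdrv_is_allocated_above(): is the
 * cluster allocated in any layer from 'top' down to, but not including,
 * 'base'? 'base' must be reachable from 'top'.
 */
static bool toy_is_allocated_above(const struct toy_image *top,
                                   const struct toy_image *base,
                                   size_t cluster)
{
    assert(cluster < NB_CLUSTERS);
    for (const struct toy_image *img = top; img != base; img = img->backing) {
        assert(img != NULL);   /* fails if base is not in the chain */
        if (img->allocated[cluster]) {
            return true;
        }
    }
    return false;
}

/*
 * Count the clusters whose old and new backing data must be read and
 * compared when rebasing 'top' onto 'new_base'. With skip_unchanged set,
 * clusters that no layer above new_base has allocated are skipped.
 */
static size_t toy_clusters_to_compare(const struct toy_image *top,
                                      const struct toy_image *new_base,
                                      bool skip_unchanged)
{
    size_t count = 0;

    for (size_t c = 0; c < NB_CLUSTERS; c++) {
        if (top->allocated[c]) {
            continue;   /* data already lives in the rebased image */
        }
        if (skip_unchanged && !toy_is_allocated_above(top, new_base, c)) {
            continue;   /* nothing between top and new_base changed it */
        }
        count++;
    }
    return count;
}
```

For a chain A <- B <- C where B and C each allocate a single cluster, the model compares seven of eight clusters without the check and only one with it, which is the "small diff of a large disk" case from the commit message.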
On 02.05.19 15:58, Sam Eiderman wrote:
> In the following case:
>
> (base) A <- B <- C (tip)
>
> when running:
>
> qemu-img rebase -b A C
>
> QEMU would read all sectors not allocated in the file being rebased (C)
> and compare them to the new base image (A), regardless of whether they
> were changed or even allocated anywhere along the chain between the new
> base and the top image (B). This causes many unneeded reads when
> rebasing an image which represents a small diff of a large disk, as it
> would read most of the disk's sectors.
>
> Instead, use bdrv_is_allocated_above() to reduce the number of
> unnecessary reads.
>
> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
> Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
> ---
>  qemu-img.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/qemu-img.c b/qemu-img.c
> index d9b609b3f0..7f20858cb9 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c

[...]

> @@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
>                  continue;
>              }
>
> +            if (prefix_chain_bs) {
> +                /*
> +                 * If cluster wasn't changed since prefix_chain, we don't need
> +                 * to take action
> +                 */
> +                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
> +                                              offset, n, &n);

This will always return true because it definitely is allocated in @bs,
or we wouldn’t be here. (We just checked that with
bdrv_is_allocated().) I think @top should be backing_bs(bs).

Max

> +                if (ret < 0) {
> +                    error_report("error while reading image metadata: %s",
> +                                 strerror(-ret));
> +                    goto out;
> +                }
> +                if (!ret) {
> +                    continue;
> +                }
> +            }
> +
>              /*
>               * Read old and new backing file and take into consideration that
>               * backing files may be smaller than the COW image.
>
> On 23 May 2019, at 17:01, Max Reitz <mreitz@redhat.com> wrote:
>
> On 02.05.19 15:58, Sam Eiderman wrote:
>> In the following case:
>>
>> (base) A <- B <- C (tip)
>>
>> when running:
>>
>> qemu-img rebase -b A C
>>
>> QEMU would read all sectors not allocated in the file being rebased (C)
>> and compare them to the new base image (A), regardless of whether they
>> were changed or even allocated anywhere along the chain between the new
>> base and the top image (B). This causes many unneeded reads when
>> rebasing an image which represents a small diff of a large disk, as it
>> would read most of the disk's sectors.
>>
>> Instead, use bdrv_is_allocated_above() to reduce the number of
>> unnecessary reads.
>>
>> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
>> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
>> Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
>> ---
>>  qemu-img.c | 25 ++++++++++++++++++++++++-
>>  1 file changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index d9b609b3f0..7f20858cb9 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>
> [...]
>
>> @@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
>>                  continue;
>>              }
>>
>> +            if (prefix_chain_bs) {
>> +                /*
>> +                 * If cluster wasn't changed since prefix_chain, we don't need
>> +                 * to take action
>> +                 */
>> +                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
>> +                                              offset, n, &n);
>
> This will always return true because it definitely is allocated in @bs,
> or we wouldn’t be here. (We just checked that with
> bdrv_is_allocated().) I think @top should be backing_bs(bs).
>
> Max

I don’t think that’s true:

Examine the case where we have the following chain:

A <- B <- C

When we rebase C directly over A: qemu-img rebase -b A C

We must check for every offset (sector): bdrv_is_allocated_above(C, A,
offset, n, &n);

If a sector from C is allocated above A - it may have been changed - so
we need to do a read from A and a read from C and compare.
If the sector is not allocated above, it was not changed - we don’t need
to read from A or C.

Sam

>
>> +                if (ret < 0) {
>> +                    error_report("error while reading image metadata: %s",
>> +                                 strerror(-ret));
>> +                    goto out;
>> +                }
>> +                if (!ret) {
>> +                    continue;
>> +                }
>> +            }
>> +
>>              /*
>>               * Read old and new backing file and take into consideration that
>>               * backing files may be smaller than the COW image.
On 23.05.19 16:09, Sam Eiderman wrote:
>
>
>> On 23 May 2019, at 17:01, Max Reitz <mreitz@redhat.com> wrote:
>>
>> On 02.05.19 15:58, Sam Eiderman wrote:
>>> In the following case:
>>>
>>> (base) A <- B <- C (tip)
>>>
>>> when running:
>>>
>>> qemu-img rebase -b A C
>>>
>>> QEMU would read all sectors not allocated in the file being rebased (C)
>>> and compare them to the new base image (A), regardless of whether they
>>> were changed or even allocated anywhere along the chain between the new
>>> base and the top image (B). This causes many unneeded reads when
>>> rebasing an image which represents a small diff of a large disk, as it
>>> would read most of the disk's sectors.
>>>
>>> Instead, use bdrv_is_allocated_above() to reduce the number of
>>> unnecessary reads.
>>>
>>> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
>>> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
>>> Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
>>> ---
>>>  qemu-img.c | 25 ++++++++++++++++++++++++-
>>>  1 file changed, 24 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index d9b609b3f0..7f20858cb9 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>
>> [...]
>>
>>> @@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
>>>                  continue;
>>>              }
>>>
>>> +            if (prefix_chain_bs) {
>>> +                /*
>>> +                 * If cluster wasn't changed since prefix_chain, we don't need
>>> +                 * to take action
>>> +                 */
>>> +                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
>>> +                                              offset, n, &n);
>>
>> This will always return true because it definitely is allocated in @bs,
>> or we wouldn’t be here. (We just checked that with
>> bdrv_is_allocated().) I think @top should be backing_bs(bs).
>>
>> Max
>
> I don’t think that’s true:
>
> Examine the case where we have the following chain:
>
> A <- B <- C
>
> When we rebase C directly over A: qemu-img rebase -b A C
>
> We must check for every offset (sector): bdrv_is_allocated_above(C, A,
> offset, n, &n);
>
> If a sector from C is allocated above A - it may have been changed - so
> we need to do a read from A and a read from C and compare.
> If the sector is not allocated above, it was not changed - we don’t need
> to read from A or C.

First: Oops, somehow I inverted the bdrv_is_allocated() check in my
head. (For context: I mean this part above this hunk here:

    /* If the cluster is allocated, we don't need to take action */
    ret = bdrv_is_allocated(bs, offset, n, &n);
    if (ret < 0) {
        error_report("error while reading image metadata: %s",
                     strerror(-ret));
        goto out;
    }
    if (ret) {
        continue;
    }

) So at this point, the range definitely is *not* allocated in @bs.

But second: That still means that we do not have to check @bs itself,
because we already did. We know the range isn’t allocated there, so we
can start at its backing file.

On a more abstract level: No, we do not need to read all sectors from A
and C and compare them if they are allocated anywhere above A. If they
are allocated in C, we’re good, because all we’d do is write them back
to C (which is a no-op). That’s exactly what the existing
bdrv_is_allocated() check is for.

So we only need to know whether the sectors are allocated above the base
(A) and below the top (C), so in your example, whether they are
allocated in B. If they are, we need to compare and potentially copy,
if they are not, we can skip them.

So my claim that bdrv_is_allocated_above() would always return true is
wrong, but it still should use backing_bs(bs) for the top because we
have checked @bs already.

Max
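The per-cluster decision described in this review can be written down as a small classifier. This is a sketch under stated assumptions rather than QEMU code: `struct toy_image`, `enum toy_action` and `toy_rebase_action()` are hypothetical names, and an image is reduced to one allocation flag per cluster. The walk over intermediate layers starts at the backing file of the top image, since the top image itself has already been checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_CLUSTERS 4   /* hypothetical image size, in clusters */

/* Toy image: one allocation flag per cluster and a backing pointer. */
struct toy_image {
    bool allocated[NB_CLUSTERS];
    const struct toy_image *backing;
};

enum toy_action {
    TOY_SKIP_IN_TOP,      /* allocated in the rebased image itself */
    TOY_SKIP_UNCHANGED,   /* no intermediate layer touched the cluster */
    TOY_COMPARE,          /* an intermediate layer allocated it */
};

/* Decide what a safe rebase of 'top' onto 'new_base' does per cluster. */
static enum toy_action toy_rebase_action(const struct toy_image *top,
                                         const struct toy_image *new_base,
                                         size_t cluster)
{
    assert(cluster < NB_CLUSTERS);

    /* Existing check: data in the top image stays valid after rebase. */
    if (top->allocated[cluster]) {
        return TOY_SKIP_IN_TOP;
    }

    /*
     * The top image is known to be unallocated here, so the walk starts
     * at its backing file and stops before new_base.
     */
    for (const struct toy_image *img = top->backing; img != new_base;
         img = img->backing) {
        assert(img != NULL);   /* fails if new_base is not in the chain */
        if (img->allocated[cluster]) {
            return TOY_COMPARE;
        }
    }
    return TOY_SKIP_UNCHANGED;
}
```

Starting the walk at the top image instead would give the same answers in this model, because the first branch has already ruled out an allocation there. The only difference is one extra layer visited per cluster.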
> On 23 May 2019, at 17:26, Max Reitz <mreitz@redhat.com> wrote:
>
> On 23.05.19 16:09, Sam Eiderman wrote:
>>
>>
>>> On 23 May 2019, at 17:01, Max Reitz <mreitz@redhat.com> wrote:
>>>
>>> On 02.05.19 15:58, Sam Eiderman wrote:
>>>> In the following case:
>>>>
>>>> (base) A <- B <- C (tip)
>>>>
>>>> when running:
>>>>
>>>> qemu-img rebase -b A C
>>>>
>>>> QEMU would read all sectors not allocated in the file being rebased (C)
>>>> and compare them to the new base image (A), regardless of whether they
>>>> were changed or even allocated anywhere along the chain between the new
>>>> base and the top image (B). This causes many unneeded reads when
>>>> rebasing an image which represents a small diff of a large disk, as it
>>>> would read most of the disk's sectors.
>>>>
>>>> Instead, use bdrv_is_allocated_above() to reduce the number of
>>>> unnecessary reads.
>>>>
>>>> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
>>>> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
>>>> Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
>>>> ---
>>>>  qemu-img.c | 25 ++++++++++++++++++++++++-
>>>>  1 file changed, 24 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>> index d9b609b3f0..7f20858cb9 100644
>>>> --- a/qemu-img.c
>>>> +++ b/qemu-img.c
>>>
>>> [...]
>>>
>>>> @@ -3422,6 +3428,23 @@ static int img_rebase(int argc, char **argv)
>>>>                  continue;
>>>>              }
>>>>
>>>> +            if (prefix_chain_bs) {
>>>> +                /*
>>>> +                 * If cluster wasn't changed since prefix_chain, we don't need
>>>> +                 * to take action
>>>> +                 */
>>>> +                ret = bdrv_is_allocated_above(bs, prefix_chain_bs,
>>>> +                                              offset, n, &n);
>>>
>>> This will always return true because it definitely is allocated in @bs,
>>> or we wouldn’t be here. (We just checked that with
>>> bdrv_is_allocated().) I think @top should be backing_bs(bs).
>>>
>>> Max
>>
>> I don’t think that’s true:
>>
>> Examine the case where we have the following chain:
>>
>> A <- B <- C
>>
>> When we rebase C directly over A: qemu-img rebase -b A C
>>
>> We must check for every offset (sector): bdrv_is_allocated_above(C, A,
>> offset, n, &n);
>>
>> If a sector from C is allocated above A - it may have been changed - so
>> we need to do a read from A and a read from C and compare.
>> If the sector is not allocated above, it was not changed - we don’t need
>> to read from A or C.
>
> First: Oops, somehow I inverted the bdrv_is_allocated() check in my
> head. (For context: I mean this part above this hunk here:
>
>     /* If the cluster is allocated, we don't need to take action */
>     ret = bdrv_is_allocated(bs, offset, n, &n);
>     if (ret < 0) {
>         error_report("error while reading image metadata: %s",
>                      strerror(-ret));
>         goto out;
>     }
>     if (ret) {
>         continue;
>     }
>
> ) So at this point, the range definitely is *not* allocated in @bs.
>
> But second: That still means that we do not have to check @bs itself,
> because we already did. We know the range isn’t allocated there, so we
> can start at its backing file.
>
> On a more abstract level: No, we do not need to read all sectors from A
> and C and compare them if they are allocated anywhere above A. If they
> are allocated in C, we’re good, because all we’d do is write them back
> to C (which is a no-op). That’s exactly what the existing
> bdrv_is_allocated() check is for.
>
> So we only need to know whether the sectors are allocated above the base
> (A) and below the top (C), so in your example, whether they are
> allocated in B. If they are, we need to compare and potentially copy,
> if they are not, we can skip them.
>
> So my claim that bdrv_is_allocated_above() would always return true is
> wrong, but it still should use backing_bs(bs) for the top because we
> have checked @bs already.

I see your point, basically we save a single iteration in the loop at
bdrv_is_allocated_above.

I’ll submit a v4 patch series.

Sam

>
> Max
>
If a chain was detected, don't open a new BlockBackend from the target
backing file which will create a new BlockDriverState. Instead, create
an empty BlockBackend and attach the already open BlockDriverState.
Permissions for blk_new() were copied from blk_new_open() when
flags = 0.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
---
qemu-img.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 7f20858cb9..b32884bfc5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3348,16 +3348,29 @@ static int img_rebase(int argc, char **argv)
              * in its chain.
              */
             prefix_chain_bs = bdrv_find_backing_image(bs, out_real_path);
-
-            blk_new_backing = blk_new_open(out_real_path, NULL,
-                                           options, src_flags, &local_err);
-            g_free(out_real_path);
-            if (!blk_new_backing) {
-                error_reportf_err(local_err,
-                                  "Could not open new backing file '%s': ",
-                                  out_baseimg);
-                ret = -1;
-                goto out;
+            if (prefix_chain_bs) {
+                g_free(out_real_path);
+                blk_new_backing = blk_new(BLK_PERM_CONSISTENT_READ,
+                                          BLK_PERM_ALL);
+                ret = blk_insert_bs(blk_new_backing, prefix_chain_bs,
+                                    &local_err);
+                if (ret < 0) {
+                    error_reportf_err(local_err,
+                                      "Could not reuse backing file '%s': ",
+                                      out_baseimg);
+                    goto out;
+                }
+            } else {
+                blk_new_backing = blk_new_open(out_real_path, NULL,
+                                               options, src_flags, &local_err);
+                g_free(out_real_path);
+                if (!blk_new_backing) {
+                    error_reportf_err(local_err,
+                                      "Could not open new backing file '%s': ",
+                                      out_baseimg);
+                    ret = -1;
+                    goto out;
+                }
             }
         }
     }
--
2.13.3
On 02.05.19 15:58, Sam Eiderman wrote:
> If a chain was detected, don't open a new BlockBackend from the target
> backing file which will create a new BlockDriverState. Instead, create
> an empty BlockBackend and attach the already open BlockDriverState.
>
> Permissions for blk_new() were copied from blk_new_open() when
> flags = 0.
>
> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
> Reviewed-by: Eyal Moscovici <eyal.moscovici@oracle.com>
> Signed-off-by: Sagi Amit <sagi.amit@oracle.com>
> Co-developed-by: Sagi Amit <sagi.amit@oracle.com>
> Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
> ---
>  qemu-img.c | 33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>