[PATCH] storage: only fallocate when allocation matches capacity

Christian Ehrhardt posted 1 patch 3 years, 7 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20200902135814.869486-1-christian.ehrhardt@canonical.com
src/storage/storage_util.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
[PATCH] storage: only fallocate when allocation matches capacity
Posted by Christian Ehrhardt 3 years, 7 months ago
In c9ec7088 "storage: extend preallocation flags support for qemu-img"
the option to fallocate was added and meant to be active when (quote):
"the XML described storage <allocation> matches its <capacity>"

Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
the compared allocation size was an order of magnitude too small, but still
it does use fallocate too often unless capacity>allocation.

This change fixes the comparison to match the intended description
of the feature.

Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 src/storage/storage_util.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/storage/storage_util.c b/src/storage/storage_util.c
index cf82ea0a87..85bed76863 100644
--- a/src/storage/storage_util.c
+++ b/src/storage/storage_util.c
@@ -710,10 +710,10 @@ storageBackendCreateQemuImgOpts(virStorageEncryptionInfoDefPtr encinfo,
         virQEMUBuildQemuImgKeySecretOpts(&buf, encinfo, info->secretAlias);
 
     if (info->preallocate) {
-        if (info->size_arg > info->allocation)
-            virBufferAddLit(&buf, "preallocation=metadata,");
-        else
+        if (info->size_arg == info->allocation)
             virBufferAddLit(&buf, "preallocation=falloc,");
+        else
+            virBufferAddLit(&buf, "preallocation=metadata,");
     }
 
     if (info->nocow)
-- 
2.28.0

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Michal Privoznik 3 years, 7 months ago
On 9/2/20 3:58 PM, Christian Ehrhardt wrote:
> In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> the option to fallocate was added and meant to be active when (quote):
> "the XML described storage <allocation> matches its <capacity>"
> 
> Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> the compared allocation size was an order of magnitude too small, but still
> it does use fallocate too often unless capacity>allocation.
> 
> This change fixes the comparison to match the intended description
> of the feature.
> 
> Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> 
> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> ---
>   src/storage/storage_util.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

And sorry for making the mess earlier (~2 years ago).

Michal

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Christian Ehrhardt 3 years, 7 months ago
On Wed, Sep 2, 2020 at 6:49 PM Michal Privoznik <mprivozn@redhat.com> wrote:
>
> On 9/2/20 3:58 PM, Christian Ehrhardt wrote:
> > In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> > the option to fallocate was added and meant to be active when (quote):
> > "the XML described storage <allocation> matches its <capacity>"
> >
> > Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> > the compared allocation size was an order of magnitude too small, but still
> > it does use fallocate too often unless capacity>allocation.
> >
> > This change fixes the comparison to match the intended description
> > of the feature.
> >
> > Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> > Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> >
> > Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > ---
> >   src/storage/storage_util.c | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
>
> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
>
> And sorry for making the mess earlier (~2 years ago).

no problem - it turned out to be even more confusing.

Due to some further testing and encouraged by feedback in the same
direction by Richard Lager (on CC now) I realized that while the
suggested change reads correct it will still not help my case :-/

Even if my fix lands, we are back to square one and would need
virt-manager to submit a different XML.
Remember: my target here would be to come back to pralloca=metadata as
it was before for image creations from virt-manager.
I've started that aspect of the discussion at the BZ [1] already.

On the libvirt side allocation>capacity sounds like being wrong anyway.
And if that is so we have these possible conditions:
- capacity==allocation now and before my change falloc
- capacity>allocation now and before my change metadata
- capacity<allocation before my change falloc, afterwards metadata
(but this one seems invalid anyway)

So I wonder are we really back at me asking Cole to let virt-manager
request things differently which is how this started about a year ago?
Or was I wrong trying to make the code to match the wording in the
commit that added it and do we actually want it to behave differently
(read no falloc) for the XMLs sent by virt-manager as of today?

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1759454


> Michal
>


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Daniel P. Berrangé 3 years, 7 months ago
On Thu, Sep 03, 2020 at 12:18:42PM +0200, Christian Ehrhardt wrote:
> On Wed, Sep 2, 2020 at 6:49 PM Michal Privoznik <mprivozn@redhat.com> wrote:
> >
> > On 9/2/20 3:58 PM, Christian Ehrhardt wrote:
> > > In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> > > the option to fallocate was added and meant to be active when (quote):
> > > "the XML described storage <allocation> matches its <capacity>"
> > >
> > > Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> > > the compared allocation size was an order of magnitude too small, but still
> > > it does use fallocate too often unless capacity>allocation.
> > >
> > > This change fixes the comparison to match the intended description
> > > of the feature.
> > >
> > > Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> > > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> > > Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> > >
> > > Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > ---
> > >   src/storage/storage_util.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
> >
> > And sorry for making the mess earlier (~2 years ago).
> 
> no problem - it turned out to be even more confusing.
> 
> Due to some further testing and encouraged by feedback in the same
> direction by Richard Lager (on CC now) I realized that while the
> suggested change reads correct it will still not help my case :-/
> 
> Even if my fix lands, we are back to square one and would need
> virt-manager to submit a different XML.
> Remember: my target here would be to come back to pralloca=metadata as
> it was before for image creations from virt-manager.
> I've started that aspect of the discussion at the BZ [1] already.
> 
> On the libvirt side allocation>capacity sounds like being wrong anyway.

It is a bit wierd as an input XML from a mgmt app. It is to be expected
as an output XML from libvirt though. Some filesystems, notably XFS,
will sometimes speculatively over-allocate data extents in the belief
that further size-extending writes will probably arrive. So you can end
up with allocated blocks being greater than the current logical file
size.

> And if that is so we have these possible conditions:
> - capacity==allocation now and before my change falloc
> - capacity>allocation now and before my change metadata
> - capacity<allocation before my change falloc, afterwards metadata
> (but this one seems invalid anyway)
> 
> So I wonder are we really back at me asking Cole to let virt-manager
> request things differently which is how this started about a year ago?
> Or was I wrong trying to make the code to match the wording in the
> commit that added it and do we actually want it to behave differently
> (read no falloc) for the XMLs sent by virt-manager as of today?

I think we should provide three flags

  VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA
  VIR_STORAGE_VOL_CREATE_PREALLOC_NONE
  VIR_STORAGE_VOL_CREATE_PREALLOC_PAYLOAD

as a way to get explicit behaviour, with those flags ignoring the
"allocation" field. Only look at "allocation" if none of the flags
are given for sake of back compat.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Christian Ehrhardt 3 years, 7 months ago
On Thu, Sep 3, 2020 at 12:36 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Sep 03, 2020 at 12:18:42PM +0200, Christian Ehrhardt wrote:
> > On Wed, Sep 2, 2020 at 6:49 PM Michal Privoznik <mprivozn@redhat.com> wrote:
> > >
> > > On 9/2/20 3:58 PM, Christian Ehrhardt wrote:
> > > > In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> > > > the option to fallocate was added and meant to be active when (quote):
> > > > "the XML described storage <allocation> matches its <capacity>"
> > > >
> > > > Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> > > > the compared allocation size was an order of magnitude too small, but still
> > > > it does use fallocate too often unless capacity>allocation.
> > > >
> > > > This change fixes the comparison to match the intended description
> > > > of the feature.
> > > >
> > > > Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> > > > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> > > > Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> > > >
> > > > Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > ---
> > > >   src/storage/storage_util.c | 6 +++---
> > > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
> > >
> > > And sorry for making the mess earlier (~2 years ago).
> >
> > no problem - it turned out to be even more confusing.
> >
> > Due to some further testing and encouraged by feedback in the same
> > direction by Richard Lager (on CC now) I realized that while the
> > suggested change reads correct it will still not help my case :-/
> >
> > Even if my fix lands, we are back to square one and would need
> > virt-manager to submit a different XML.
> > Remember: my target here would be to come back to pralloca=metadata as
> > it was before for image creations from virt-manager.
> > I've started that aspect of the discussion at the BZ [1] already.
> >
> > On the libvirt side allocation>capacity sounds like being wrong anyway.
>
> It is a bit wierd as an input XML from a mgmt app. It is to be expected
> as an output XML from libvirt though. Some filesystems, notably XFS,
> will sometimes speculatively over-allocate data extents in the belief
> that further size-extending writes will probably arrive. So you can end
> up with allocated blocks being greater than the current logical file
> size.
>
> > And if that is so we have these possible conditions:
> > - capacity==allocation now and before my change falloc
> > - capacity>allocation now and before my change metadata
> > - capacity<allocation before my change falloc, afterwards metadata
> > (but this one seems invalid anyway)
> >
> > So I wonder are we really back at me asking Cole to let virt-manager
> > request things differently which is how this started about a year ago?
> > Or was I wrong trying to make the code to match the wording in the
> > commit that added it and do we actually want it to behave differently
> > (read no falloc) for the XMLs sent by virt-manager as of today?
>
> I think we should provide three flags
>
>   VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA
>   VIR_STORAGE_VOL_CREATE_PREALLOC_NONE
>   VIR_STORAGE_VOL_CREATE_PREALLOC_PAYLOAD
>
> as a way to get explicit behaviour, with those flags ignoring the
> "allocation" field. Only look at "allocation" if none of the flags
> are given for sake of back compat.

Independent to the issue I initially wanted to solve and to the other
discussion on
why fallocate on ZFS makes barely any sense I like the suggestion Daniel.

I think the past try to implicitly derive from allocation size has
proven to be misleading at best.

> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Daniel P. Berrangé 3 years, 7 months ago
On Thu, Sep 03, 2020 at 03:40:22PM +0200, Christian Ehrhardt wrote:
> On Thu, Sep 3, 2020 at 12:36 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, Sep 03, 2020 at 12:18:42PM +0200, Christian Ehrhardt wrote:
> > > On Wed, Sep 2, 2020 at 6:49 PM Michal Privoznik <mprivozn@redhat.com> wrote:
> > > >
> > > > On 9/2/20 3:58 PM, Christian Ehrhardt wrote:
> > > > > In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> > > > > the option to fallocate was added and meant to be active when (quote):
> > > > > "the XML described storage <allocation> matches its <capacity>"
> > > > >
> > > > > Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> > > > > the compared allocation size was an order of magnitude too small, but still
> > > > > it does use fallocate too often unless capacity>allocation.
> > > > >
> > > > > This change fixes the comparison to match the intended description
> > > > > of the feature.
> > > > >
> > > > > Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> > > > > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> > > > > Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> > > > >
> > > > > Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > > > ---
> > > > >   src/storage/storage_util.c | 6 +++---
> > > > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
> > > >
> > > > And sorry for making the mess earlier (~2 years ago).
> > >
> > > no problem - it turned out to be even more confusing.
> > >
> > > Due to some further testing and encouraged by feedback in the same
> > > direction by Richard Lager (on CC now) I realized that while the
> > > suggested change reads correct it will still not help my case :-/
> > >
> > > Even if my fix lands, we are back to square one and would need
> > > virt-manager to submit a different XML.
> > > Remember: my target here would be to come back to pralloca=metadata as
> > > it was before for image creations from virt-manager.
> > > I've started that aspect of the discussion at the BZ [1] already.
> > >
> > > On the libvirt side allocation>capacity sounds like being wrong anyway.
> >
> > It is a bit wierd as an input XML from a mgmt app. It is to be expected
> > as an output XML from libvirt though. Some filesystems, notably XFS,
> > will sometimes speculatively over-allocate data extents in the belief
> > that further size-extending writes will probably arrive. So you can end
> > up with allocated blocks being greater than the current logical file
> > size.
> >
> > > And if that is so we have these possible conditions:
> > > - capacity==allocation now and before my change falloc
> > > - capacity>allocation now and before my change metadata
> > > - capacity<allocation before my change falloc, afterwards metadata
> > > (but this one seems invalid anyway)
> > >
> > > So I wonder are we really back at me asking Cole to let virt-manager
> > > request things differently which is how this started about a year ago?
> > > Or was I wrong trying to make the code to match the wording in the
> > > commit that added it and do we actually want it to behave differently
> > > (read no falloc) for the XMLs sent by virt-manager as of today?
> >
> > I think we should provide three flags
> >
> >   VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA
> >   VIR_STORAGE_VOL_CREATE_PREALLOC_NONE
> >   VIR_STORAGE_VOL_CREATE_PREALLOC_PAYLOAD
> >
> > as a way to get explicit behaviour, with those flags ignoring the
> > "allocation" field. Only look at "allocation" if none of the flags
> > are given for sake of back compat.
> 
> Independent to the issue I initially wanted to solve and to the other
> discussion on
> why fallocate on ZFS makes barely any sense I like the suggestion Daniel.

We could add a 4th flag. 

   VIR_STORAGE_VOL_CREATE_PREALLOC_NONE
   VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA
   VIR_STORAGE_VOL_CREATE_PREALLOC_TRY_PAYLOAD
   VIR_STORAGE_VOL_CREATE_PREALLOC_FORCE_PAYLOAD

TRY_PAYLOAD == plain  fallocate() only, ignore ENOSYS

FORCE_PAYLOAD == fallocate() and fallback to write zeros on ENOSYS

virt-manager would try TRY_PAYLOAD so it mostly does the right thing
without being terribly slow on FS that lack fallocate().

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Richard Laager 3 years, 7 months ago
On 9/3/20 5:18 AM, Christian Ehrhardt wrote:
> Even if my fix lands, we are back to square one and would need
> virt-manager to submit a different XML.
> Remember: my target here would be to come back to pralloca=metadata as
> it was before for image creations from virt-manager.

Why is that your goal?

If this is simply because OpenZFS doesn't support fallocate(mode=0),
that has (finally!) been resolved for the next release:
https://github.com/openzfs/zfs/commit/f734301d2267cbb33eaffbca195fc93f1dae7b74

ZFS will "fake" the fallocate() request. It'll check to make sure
there's enough free space at the moment, which is about all it can do
anyway. It can't reserve the space anyway, mostly because it is a
copy-on-write filesystem. Even if the application writes zeros, ZFS will
just throw them away anyway (assuming you are using compression, which
everyone should be).

> On the libvirt side allocation>capacity sounds like being wrong anyway.
> And if that is so we have these possible conditions:
> - capacity==allocation now and before my change falloc
> - capacity>allocation now and before my change metadata
> - capacity<allocation before my change falloc, afterwards metadata
> (but this one seems invalid anyway)
> 
> So I wonder are we really back at me asking Cole to let virt-manager
> request things differently which is how this started about a year ago?

Setting aside cases of semi-allocation (capacity > allocation != 0) and
overprovisioning (allocation > capacity), I assume the common cases are
thin provisioning (allocation == 0) and thick provisioning (capacity ==
allocation).

virt-manager (at least in the way I use it) asks explicitly for the
allocation and capacity. If virt-manager is properly conveying (and I'd
assume it is) the user's capacity and allocation choices from the GUI to
libvirt, then virt-manager is working correctly in my view and should be
left alone.

I believe the main goal for thick provisioning is to reserve the space
as best as possible, because ENOSPC underneath a virtual machine is bad.
Secondary goals would be allocating the space relatively contiguously
for performance and accounting for the space immediately to help the
administrator keep track of usage.

If the filesystem supports fallocate(), using it accomplishes all of
these goals in a very performant way. If the filesystem does not support
fallocate(), then the application can either write zeros or do nothing.
Writing zeros is slow, but achieves the goals to the extent possible.
Not writing zeros is fast, but does not reserve/account for the space;
though, depending on the filesystem, that might not be possible anyway.

I think the question fundamentally comes down to: how strong do you take
a "thick provisioning" request? Do you do everything in your power to
achieve it (which would mean writing zeros*) or do you treat it as a
hint that you'll only follow if it is fast to do so?

If it's a demand, then try fallocate() but fall back to writing zeroes.
(glibc's posix_fallocate() does exactly this.). If it's a hint, then
only ever call fallocate().

I think it is reasonable to treat it as a demand and write zeros if
fallocate() fails. If it is too slow, the admin will notice and can make
the decision to (in the future) stop requesting thick provisioning and
just request thin provisioning.

In the ZFS case, why is the admin requesting thick provisioning anyway?


* One could go further and defeat compression by writing random data.
  But that seems extreme, so I'm going to ignore that.

-- 
Richard

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Christian Ehrhardt 3 years, 7 months ago
On Thu, Sep 3, 2020 at 12:49 PM Richard Laager <rlaager@wiktel.com> wrote:
>
> On 9/3/20 5:18 AM, Christian Ehrhardt wrote:
> > Even if my fix lands, we are back to square one and would need
> > virt-manager to submit a different XML.
> > Remember: my target here would be to come back to pralloca=metadata as
> > it was before for image creations from virt-manager.
>
> Why is that your goal?
>
> If this is simply because OpenZFS doesn't support fallocate(mode=0),

Yeah it was because behavior changed for users on upgrades and became
slow (if on ZFS).
And since the symptom was restricted to just that and also just a
slowdown the prio was always low anyway.

> that has (finally!) been resolved for the next release:
> https://github.com/openzfs/zfs/commit/f734301d2267cbb33eaffbca195fc93f1dae7b74

Glad to hear that - and I agree that fallocate @ COW/Compression FS is
not really applicable.

So it seems "my need" for this is completely gone once we get the change above.
Thanks Richard for the FYI on it!

> ZFS will "fake" the fallocate() request. It'll check to make sure
> there's enough free space at the moment, which is about all it can do
> anyway. It can't reserve the space anyway, mostly because it is a
> copy-on-write filesystem. Even if the application writes zeros, ZFS will
> just throw them away anyway (assuming you are using compression, which
> everyone should be).
>
> > On the libvirt side allocation>capacity sounds like being wrong anyway.
> > And if that is so we have these possible conditions:
> > - capacity==allocation now and before my change falloc
> > - capacity>allocation now and before my change metadata
> > - capacity<allocation before my change falloc, afterwards metadata
> > (but this one seems invalid anyway)
> >
> > So I wonder are we really back at me asking Cole to let virt-manager
> > request things differently which is how this started about a year ago?
>
> Setting aside cases of semi-allocation (capacity > allocation != 0) and
> overprovisioning (allocation > capacity), I assume the common cases are
> thin provisioning (allocation == 0) and thick provisioning (capacity ==
> allocation).
>
> virt-manager (at least in the way I use it) asks explicitly for the
> allocation and capacity. If virt-manager is properly conveying (and I'd
> assume it is) the user's capacity and allocation choices from the GUI to
> libvirt, then virt-manager is working correctly in my view and should be
> left alone.
>
> I believe the main goal for thick provisioning is to reserve the space
> as best as possible, because ENOSPC underneath a virtual machine is bad.
> Secondary goals would be allocating the space relatively contiguously
> for performance and accounting for the space immediately to help the
> administrator keep track of usage.
>
> If the filesystem supports fallocate(), using it accomplishes all of
> these goals in a very performant way. If the filesystem does not support
> fallocate(), then the application can either write zeros or do nothing.
> Writing zeros is slow, but achieves the goals to the extent possible.
> Not writing zeros is fast, but does not reserve/account for the space;
> though, depending on the filesystem, that might not be possible anyway.
>
> I think the question fundamentally comes down to: how strong do you take
> a "thick provisioning" request? Do you do everything in your power to
> achieve it (which would mean writing zeros*) or do you treat it as a
> hint that you'll only follow if it is fast to do so?
>
> If it's a demand, then try fallocate() but fall back to writing zeroes.
> (glibc's posix_fallocate() does exactly this.). If it's a hint, then
> only ever call fallocate().
>
> I think it is reasonable to treat it as a demand and write zeros if
> fallocate() fails. If it is too slow, the admin will notice and can make
> the decision to (in the future) stop requesting thick provisioning and
> just request thin provisioning.
>
> In the ZFS case, why is the admin requesting thick provisioning anyway?
>
>
> * One could go further and defeat compression by writing random data.
>   But that seems extreme, so I'm going to ignore that.
>
> --
> Richard



-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

Re: [PATCH] storage: only fallocate when allocation matches capacity
Posted by Daniel P. Berrangé 3 years, 7 months ago
On Wed, Sep 02, 2020 at 03:58:14PM +0200, Christian Ehrhardt wrote:
> In c9ec7088 "storage: extend preallocation flags support for qemu-img"
> the option to fallocate was added and meant to be active when (quote):
> "the XML described storage <allocation> matches its <capacity>"
> 
> Up until recently 81a3042a12 "storage_util: fix qemu-img sparse allocation"
> the compared allocation size was an order of magnitude too small, but still
> it does use fallocate too often unless capacity>allocation.
> 
> This change fixes the comparison to match the intended description
> of the feature.
> 
> Fixes: c9ec7088c7a3f4cd26bb471f1f243931fff6f4f9
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1759454
> Fixes: https://bugs.launchpad.net/ubuntu/focal/+source/libvirt/+bug/1847105
> 
> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> ---
>  src/storage/storage_util.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/storage/storage_util.c b/src/storage/storage_util.c
> index cf82ea0a87..85bed76863 100644
> --- a/src/storage/storage_util.c
> +++ b/src/storage/storage_util.c
> @@ -710,10 +710,10 @@ storageBackendCreateQemuImgOpts(virStorageEncryptionInfoDefPtr encinfo,
>          virQEMUBuildQemuImgKeySecretOpts(&buf, encinfo, info->secretAlias);
>  
>      if (info->preallocate) {
> -        if (info->size_arg > info->allocation)
> -            virBufferAddLit(&buf, "preallocation=metadata,");
> -        else
> +        if (info->size_arg == info->allocation)
>              virBufferAddLit(&buf, "preallocation=falloc,");
> +        else
> +            virBufferAddLit(&buf, "preallocation=metadata,");
>      }

This is still wrong, as preallocation=falloc is inside the
"info->preallocate" check.

The "info->preallocate" field is set if VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA
was passed as a flag to virStorageVolCreate. By its very name that flag only
refers to metadata preallocation, not payload.

IOW Pre-allocating  image payload should not be dependendant on the
VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA flag, it should be driven
by the capacity/allocation values in the XML, as it done for raw
volumes.

I regret that we ever used "allocation" in the XML as a sign to preallocate.
We should really have treated it as an output only field, and had an explicit
flag.

It isn't too late for us to introduce two new flags:

  VIR_STORAGE_VOL_CREATE_PREALLOC_PAYLOAD
  VIR_STORAGE_VOL_CREATE_PREALLOC_NONE

If either of the three PREALLOC flags are present, we should declare that
"allocation" in the XML is ignored. So at least we'll have sane behaviour
for apps that pass the flags at that point.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|