block/io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Commit 5634622bcb ("file-posix: allow BLKZEROOUT with -t writeback")
enables the BLKZEROOUT ioctl when using 'writeback' cache, regressing
certain 'qemu-img convert' invocations, because of a pre-existing
issue. Namely, the BLKZEROOUT ioctl might fail with errno EINVAL when
the request is shorter than the block size of the block device.
Fallback to the bounce buffer, similar to when the ioctl is not
supported at all, rather than treating such an error as fatal.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3257
Resolves: https://bugzilla.proxmox.com/show_bug.cgi?id=7197
Cc: qemu-stable@nongnu.org
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/io.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/io.c b/block/io.c
index cace297f22..e37d9d2417 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1917,7 +1917,8 @@ bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
assert(!bs->supported_zero_flags);
}
- if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
+ if ((ret == -ENOTSUP || (ret == -EINVAL && num < alignment)) &&
+ !(flags & BDRV_REQ_NO_FALLBACK)) {
/* Fall back to bounce buffer if write zeroes is unsupported */
BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
--
2.47.3
On Mon, Jan 05, 2026 at 03:29:55PM +0100, Fiona Ebner wrote:
> Commit 5634622bcb ("file-posix: allow BLKZEROOUT with -t writeback")
> enables the BLKZEROOUT ioctl when using 'writeback' cache, regressing
> certain 'qemu-img convert' invocations, because of a pre-existing
> issue. Namely, the BLKZEROOUT ioctl might fail with errno EINVAL when
> the request is shorter than the block size of the block device.
> Fallback to the bounce buffer, similar to when the ioctl is not
> supported at all, rather than treating such an error as fatal.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3257
> Resolves: https://bugzilla.proxmox.com/show_bug.cgi?id=7197
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> block/io.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/block/io.c b/block/io.c
> index cace297f22..e37d9d2417 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1917,7 +1917,8 @@ bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
> assert(!bs->supported_zero_flags);
> }
>
> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> + if ((ret == -ENOTSUP || (ret == -EINVAL && num < alignment)) &&
> + !(flags & BDRV_REQ_NO_FALLBACK)) {
> /* Fall back to bounce buffer if write zeroes is unsupported */
> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
This patch invokes the ioctl(BLKZERO) first and then falls back to a
write request. In the bug report this only occurs at the end of qemu-img
convert, so the performance overhead of failing the ioctl first and then
submitting a write doesn't matter.
I would prefer taking advantage of the existing alignment code in
bdrv_co_do_zero_pwritev() instead of falling back like this. The
alignment code currently has no effect because
BlockLimits->request_alignment = 1 for cache=writeback. It is unaware
that ioctl(BLKZERO) requires block alignment. Once that is fixed,
bdrv_co_do_zero_pwritev() will submit a regular write request for the
misaligned tail.
Block drivers with specific alignment requirements for write zeroes
report them in BlockLimits->pwrite_zeroes_alignment. I think
bdrv_co_do_zero_pwritev() should prioritize
bs->bl.pwrite_zeroes_alignment over bs->bl.request_alignment.
file-posix.c already populates pwrite_zeroes_alignment - it's just not
used by bdrv_co_do_zero_pwritev() yet.
Do you want to give that a try?
Thanks,
Stefan
Am 06.01.26 um 3:23 PM schrieb Stefan Hajnoczi:
> On Mon, Jan 05, 2026 at 03:29:55PM +0100, Fiona Ebner wrote:
>> Commit 5634622bcb ("file-posix: allow BLKZEROOUT with -t writeback")
>> enables the BLKZEROOUT ioctl when using 'writeback' cache, regressing
>> certain 'qemu-img convert' invocations, because of a pre-existing
>> issue. Namely, the BLKZEROOUT ioctl might fail with errno EINVAL when
>> the request is shorter than the block size of the block device.
>> Fallback to the bounce buffer, similar to when the ioctl is not
>> supported at all, rather than treating such an error as fatal.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3257
>> Resolves: https://bugzilla.proxmox.com/show_bug.cgi?id=7197
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>> block/io.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index cace297f22..e37d9d2417 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -1917,7 +1917,8 @@ bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
>> assert(!bs->supported_zero_flags);
>> }
>>
>> - if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
>> + if ((ret == -ENOTSUP || (ret == -EINVAL && num < alignment)) &&
>> + !(flags & BDRV_REQ_NO_FALLBACK)) {
>> /* Fall back to bounce buffer if write zeroes is unsupported */
>> BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
>
> This patch invokes the ioctl(BLKZERO) first and then falls back to a
> write request. In the bug report this only occurs at the end of qemu-img
> convert, so the performance overhead of failing the ioctl first and then
> submitting a write doesn't matter.
In my testing, it does not only happen at the end of the image, but
after the 2 GiB mark for larger zero images. Initially,
convert_iteration_sectors() returns 4096, but at the 2 GiB mark
s->sector_next_status - sector_num is 4095, so it will take precedence
in the assignment
n = MIN(n, s->sector_next_status - sector_num);
in convert_iteration_sectors(). After that, convert_iteration_sectors()
will return 4096 again, but short zero writes happen from there on,
because of that one time throwing off the alignment. (Similarly, around
other multiples of 2 GiB, there will be one time where 4095 is returned
in between).
> I would prefer taking advantage of the existing alignment code in
> bdrv_co_do_zero_pwritev() instead of falling back like this. The
> alignment code currently has no effect because
> BlockLimits->request_alignment = 1 for cache=writeback. It is unaware
> that ioctl(BLKZERO) requires block alignment. Once that is fixed,
> bdrv_co_do_zero_pwritev() will submit a regular write request for the
> misaligned tail.
>
> Block drivers with specific alignment requirements for write zeroes
> report them in BlockLimits->pwrite_zeroes_alignment. I think
> bdrv_co_do_zero_pwritev() should prioritize
> bs->bl.pwrite_zeroes_alignment over bs->bl.request_alignment.
> file-posix.c already populates pwrite_zeroes_alignment - it's just not
> used by bdrv_co_do_zero_pwritev() yet.
>
> Do you want to give that a try?
Yes, I'll try. This approach sounds cleaner :)
Best Regards,
Fiona
© 2016 - 2026 Red Hat, Inc.