[PATCH 4/6] block/io: safeguard max transfer calculation in bdrv_aligned_pwritev()

Fiona Ebner posted 6 patches 1 month ago
Maintainers: Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>
Posted by Fiona Ebner 1 month ago
This partially fixes iotest 177 with qcow2, where max_transfer is
64KiB, but the cluster size and thus pwrite_zeroes_alignment is 1MiB.
Previously, max_transfer would be calculated as 0, triggering an
assertion later.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 block/io.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 12dc153573..233b2617ea 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2087,8 +2087,10 @@ bdrv_aligned_pwritev(BdrvChild *child, BdrvTrackedRequest *req,
     assert(is_power_of_2(align));
     assert((offset & (align - 1)) == 0);
     assert((bytes & (align - 1)) == 0);
-    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
-                                   align);
+    max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX);
+    if (max_transfer > align) {
+        max_transfer = QEMU_ALIGN_DOWN(max_transfer, align);
+    }
 
     ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags);
 
-- 
2.47.3
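
For illustration, the arithmetic described in the commit message can be
reproduced outside of QEMU. This is a minimal sketch, not QEMU code:
ALIGN_DOWN and min_non_zero() are simplified local stand-ins written for this
example (assumed to behave like QEMU_ALIGN_DOWN and MIN_NON_ZERO for
non-negative inputs), and the two helper functions are hypothetical names that
mirror only the removed and added lines of the hunk.

```c
#include <limits.h>
#include <stdint.h>

/* Simplified stand-in for QEMU_ALIGN_DOWN: round n down to a multiple of m. */
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))

/* Simplified stand-in for MIN_NON_ZERO: the smaller value, ignoring zeros. */
static int64_t min_non_zero(int64_t a, int64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Mirrors the removed lines: always round down to the alignment. */
static int64_t max_transfer_old(int64_t bl_max_transfer, int64_t align)
{
    return ALIGN_DOWN(min_non_zero(bl_max_transfer, INT_MAX), align);
}

/* Mirrors the added lines: round down only if the limit exceeds align. */
static int64_t max_transfer_patched(int64_t bl_max_transfer, int64_t align)
{
    int64_t max_transfer = min_non_zero(bl_max_transfer, INT_MAX);

    if (max_transfer > align) {
        max_transfer = ALIGN_DOWN(max_transfer, align);
    }
    return max_transfer;
}
```

With bl_max_transfer = 64 KiB and align = 1 MiB, max_transfer_old() returns 0,
while max_transfer_patched() returns 65536. The patched result is non-zero,
but it is smaller than align, so it cannot be a multiple of it.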
Re: [PATCH 4/6] block/io: safeguard max transfer calculation in bdrv_aligned_pwritev()
Posted by Stefan Hajnoczi 3 weeks ago
On Fri, Jan 09, 2026 at 01:08:31PM +0100, Fiona Ebner wrote:
> This partially fixes iotest 177 with qcow2, where max_transfer is
> 64KiB, but the cluster size and thus pwrite_zeroes_alignment is 1MiB.
> Previously, max_transfer would be calculated as 0, triggering an
> assertion later.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  block/io.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 12dc153573..233b2617ea 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2087,8 +2087,10 @@ bdrv_aligned_pwritev(BdrvChild *child, BdrvTrackedRequest *req,
>      assert(is_power_of_2(align));
>      assert((offset & (align - 1)) == 0);
>      assert((bytes & (align - 1)) == 0);
> -    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
> -                                   align);
> +    max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX);
> +    if (max_transfer > align) {
> +        max_transfer = QEMU_ALIGN_DOWN(max_transfer, align);
> +    }

max_transfer < align seems paradoxical. It's a situation where the
largest allowed I/O request cannot meet alignment requirements.

Every other place that uses max_transfer in QEMU would also need to cope
with this.

block/blkdebug.c:blkdebug_open() fails if max_transfer is not aligned,
indicating that there are assumptions at least in some places that
max_transfer is aligned.

I hesitate to make this change because I fear it will break more things.
Why was max_transfer 64KiB while pwrite_zeroes_alignment was 1MiB? Either
max_transfer should fit the alignment or pwrite_zeroes code needs to
distinguish between actual write zeroes operations and read-modify-write
I/O (which is plain read/write and not subject to
pwrite_zeroes_alignment).

Stefan
Re: [PATCH 4/6] block/io: safeguard max transfer calculation in bdrv_aligned_pwritev()
Posted by Kevin Wolf 4 days, 6 hours ago
Am 19.01.2026 um 20:34 hat Stefan Hajnoczi geschrieben:
> On Fri, Jan 09, 2026 at 01:08:31PM +0100, Fiona Ebner wrote:
> > This partially fixes iotest 177 with qcow2, where max_transfer is
> > 64KiB, but the cluster size and thus pwrite_zeroes_alignment is 1MiB.
> > Previously, max_transfer would be calculated as 0, triggering an
> > assertion later.
> > 
> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> > ---
> >  block/io.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 12dc153573..233b2617ea 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2087,8 +2087,10 @@ bdrv_aligned_pwritev(BdrvChild *child, BdrvTrackedRequest *req,
> >      assert(is_power_of_2(align));
> >      assert((offset & (align - 1)) == 0);
> >      assert((bytes & (align - 1)) == 0);
> > -    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
> > -                                   align);
> > +    max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX);
> > +    if (max_transfer > align) {
> > +        max_transfer = QEMU_ALIGN_DOWN(max_transfer, align);
> > +    }
> 
> max_transfer < align seems paradoxical. It's a situation where the
> largest allowed I/O request cannot meet alignment requirements.
> 
> Every other place that uses max_transfer in QEMU would also need to cope
> with this.
> 
> block/blkdebug.c:blkdebug_open() fails if max_transfer is not aligned,
> indicating that there are assumptions at least in some places that
> max_transfer is aligned.
> 
> I hesitate to make this change because I fear it will break more things.
> Why was max_transfer 64KiB while pwrite_zeroes_alignment was 1MiB? Either
> max_transfer should fit the alignment or pwrite_zeroes code needs to
> distinguish between actual write zeroes operations and read-modify-write
> I/O (which is plain read/write and not subject to
> pwrite_zeroes_alignment).

I don't think having a 64k max_transfer for normal I/O, where data is
actually transferred, and allowing much larger write_zeroes requests
should be inherently incompatible.

In fact, in this specific case, I'm wondering how we even hit a
problematic case where align == pwrite_zeroes_alignment, but
max_transfer is still used. Shouldn't we take the BDRV_REQ_ZERO_WRITE
code path, which leaves max_transfer completely unused?

Do we somehow end up with normal writes using pwrite_zeroes_alignment?
Ah, yes. This seems to be a problem with patch 3. It doesn't consider
that bdrv_co_do_zero_pwritev() deals both with write_zeroes requests
(for the bulk in the middle) and with normal writes for the padding,
which require different alignments. So changing 'align' for all calls
seems wrong; it should probably be changed only for those requests that
keep the BDRV_REQ_ZERO_WRITE flag set.
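
To make the head/bulk/tail distinction concrete, here is a rough sketch of how
a zero-write request splits at alignment boundaries. It is not the code in
bdrv_co_do_zero_pwritev(): ZeroSplit and split_zero_request() are hypothetical
names made up for this example, and the sketch only computes the sizes of the
three pieces; it does not model the padding or the read-modify-write itself.

```c
#include <stdint.h>

/* Hypothetical result type: sizes of the three pieces of one request. */
typedef struct ZeroSplit {
    int64_t head_bytes;   /* unaligned prefix, handled as a normal write */
    int64_t middle_bytes; /* aligned bulk, can stay a write_zeroes request */
    int64_t tail_bytes;   /* unaligned suffix, handled as a normal write */
} ZeroSplit;

/* Split [offset, offset + bytes) at multiples of align (align > 0). */
static ZeroSplit split_zero_request(int64_t offset, int64_t bytes,
                                    int64_t align)
{
    ZeroSplit s = { 0, 0, 0 };
    int64_t head = offset % align;

    if (head) {
        /* Bytes up to the next boundary, capped by the request length. */
        int64_t to_boundary = align - head;

        s.head_bytes = bytes < to_boundary ? bytes : to_boundary;
        bytes -= s.head_bytes;
    }

    /* Whole aligned units form the bulk; whatever remains is the tail. */
    s.middle_bytes = bytes / align * align;
    s.tail_bytes = bytes - s.middle_bytes;
    return s;
}
```

For offset = 512 KiB, bytes = 2 MiB and align = 1 MiB, this gives a 512 KiB
head, a 1 MiB middle and a 512 KiB tail. Only the middle piece would keep
BDRV_REQ_ZERO_WRITE; the head and tail are plain writes, which is why applying
pwrite_zeroes_alignment to them looks questionable.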

Kevin