On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> When performing a read-modify-write(RMW) operation, any modification
> to a buffered block must cause the entire buffer to be marked dirty.
>
> Marking only a subrange as dirty is incorrect because the underlying
> device block size(ubs) defines the minimum read/write granularity. A
> lower device can perform I/O only on regions which are fully aligned
> and sized to ubs.
Hi
I think it would be better to fix this in dm-bufio, so that other dm-bufio
users would also benefit from the fix. Please try this patch - does it fix
it?
Mikulas
From: Mikulas Patocka <mpatocka@redhat.com>
There may be devices with logical block size larger than 4k. Fix
dm-bufio, so that it will align I/O on logical block size. This commit
fixes I/O errors on the dm-ebs target on the top of emulated nvme device
with 8k logical block size created with qemu parameters:
-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
---
drivers/md/dm-bufio.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
@@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
{
unsigned int n_sectors;
sector_t sector;
- unsigned int offset, end;
+ unsigned int offset, end, align;
b->end_io = end_io;
@@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
b->c->write_callback(b);
offset = b->write_start;
end = b->write_end;
- offset &= -DM_BUFIO_WRITE_ALIGN;
- end += DM_BUFIO_WRITE_ALIGN - 1;
- end &= -DM_BUFIO_WRITE_ALIGN;
+ align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
+ offset &= -align;
+ end += align - 1;
+ end &= -align;
if (unlikely(end > b->c->block_size))
end = b->c->block_size;
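
The rounding in the hunk above can be tried outside the kernel. What follows is a minimal userspace sketch, not the dm-bufio code itself: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN (assumed here to be 4096), and struct byte_range and align_write_range() are hypothetical names introduced only for illustration. It assumes the device block size is a power of two, which the mask arithmetic requires.

```c
#include <assert.h>

/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN. */
#define WRITE_ALIGN 4096u

/* Hypothetical helper type: a byte range inside one buffer. */
struct byte_range {
	unsigned int offset;	/* first byte to write, rounded down */
	unsigned int end;	/* one past the last byte, rounded up */
};

/*
 * Widen the dirty range [write_start, write_end) to the larger of
 * WRITE_ALIGN and the device block size, then clamp it to the buffer
 * size, mirroring the offset/end computation in the patch.
 */
static struct byte_range align_write_range(unsigned int write_start,
					   unsigned int write_end,
					   unsigned int dev_block_size,
					   unsigned int buf_block_size)
{
	unsigned int align = dev_block_size > WRITE_ALIGN ?
			     dev_block_size : WRITE_ALIGN;
	struct byte_range r;

	/* The "& -align" trick only works for powers of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	r.offset = write_start & -align;
	r.end = (write_end + align - 1) & -align;
	if (r.end > buf_block_size)
		r.end = buf_block_size;
	return r;
}
```

With an 8k device block and a 16k buffer, a dirty range of 5000..6000 widens to 0..8192. The final clamp can still leave a range that is not a multiple of the device block size when the buffer size itself is not one: 9000..10000 in a 12288-byte buffer yields 8192..12288.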
On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
>
>
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
>
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> >
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
>
> Hi
>
> I think it would be better to fix this in dm-bufio, so that other dm-bufio
> users would also benefit from the fix.
This looks to me like it should accomplish the same thing as
Uladzislau's patch. But I think there could still be problems with other
dm-bufio users, for devices where the blocksize is larger than 4k.
In dm_bufio_client_create() I think we want to make sure that block_size
is a multiple of bdev_logical_block_size(bdev), instead of 512b.
Otherwise block_to_sector() can return sectors that are not addressable
on the device. Unfortunately, I don't think all users of dm-bufio will
pass in block_sizes that are larger than 4k (uds_make_bufio() in
dm-vdo/indexer/io-factory.c for instance).
-Ben
> Please try this patch - does it fix it?
>
> Mikulas
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
>
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
>
> ---
> drivers/md/dm-bufio.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> {
> unsigned int n_sectors;
> sector_t sector;
> - unsigned int offset, end;
> + unsigned int offset, end, align;
>
> b->end_io = end_io;
>
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> b->c->write_callback(b);
> offset = b->write_start;
> end = b->write_end;
> - offset &= -DM_BUFIO_WRITE_ALIGN;
> - end += DM_BUFIO_WRITE_ALIGN - 1;
> - end &= -DM_BUFIO_WRITE_ALIGN;
> + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> + offset &= -align;
> + end += align - 1;
> + end &= -align;
> if (unlikely(end > b->c->block_size))
> end = b->c->block_size;
>
>
On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> >
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix.
>
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
>
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.
I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
dm_bufio_client_create. But I think it's too late in this development
cycle; I would add it after the next merge window closes, when I open a
new patch series for the kernel 6.20 (or 7.0).
> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
>
> -Ben
>
> > Please try this patch - does it fix it?
> >
> > Mikulas
I changed the patch below, so that it aligns write bios on
max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
greater than logical block size, the writes are aligned so that the device
doesn't do read-modify-write.
Mikulas
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > + offset &= -align;
> > + end += align - 1;
> > + end &= -align;
> > if (unlikely(end > b->c->block_size))
> > end = b->c->block_size;
> >
> >
>
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.
That doesn't make any sense whatsoever. The physical block size must
be >= logical block size, and the block layer enforces that.
On Wed, 19 Nov 2025, Christoph Hellwig wrote:
> On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> > I changed the patch below, so that it aligns write bios on
> > max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> > bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> > greater than logical block size, the writes are aligned so that the device
> > doesn't do read-modify-write.
>
> That doesn't make any sense whatsoever. The physical block size must
> be >= logical block size, and the block layer enforces that.
OK, so I changed it to
max(DM_BUFIO_WRITE_ALIGN, bdev_physical_block_size(b->c->bdev))
Mikulas
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> >
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
>
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to
> dm_bufio_client_create. But I think it's too late in this development
> cycle, I would add it after the next merge window closes, when I open a
> new patch series for the kernel 6.20 (or 7.0).
>
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> >
> > -Ben
> >
> > > Please try this patch - does it fix it?
> > >
> > > Mikulas
>
> I changed the patch below, so that it aligns write bios on
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev),
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is
> greater than logical block size, the writes are aligned so that the device
> doesn't do read-modify-write.
This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure
out the alignment in dm_bufio_client_create(), with something like:
c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
if (block_size & -bdev_physical_block_size(bdev) &&
bdev_alignment_offset(bdev) == 0)
c->align = bdev_physical_block_size(bdev);
I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.
-Ben
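
One way to read the proposal above is sketched below as plain userspace C, not as kernel code. It is an interpretation: the prose asks for block_size to be a multiple of the physical block size, so the sketch tests that with a modulo, whereas the mask expression in the mail is nonzero whenever block_size is at least as large as a power-of-two physical block size. WRITE_ALIGN (assumed 4096) stands in for DM_BUFIO_WRITE_ALIGN, choose_write_align() is a hypothetical name, and its parameters stand in for the values the bdev_*() helpers would return.

```c
#include <assert.h>

/* Assumed stand-in for DM_BUFIO_WRITE_ALIGN. */
#define WRITE_ALIGN 4096u

/*
 * Pick the write alignment once, at client creation time.
 * Start from max(WRITE_ALIGN, logical block size); use the physical
 * block size only when the client block size is a whole multiple of
 * it and the device reports no alignment offset.
 */
static unsigned int choose_write_align(unsigned int block_size,
				       unsigned int logical,
				       unsigned int physical,
				       int alignment_offset)
{
	unsigned int align = logical > WRITE_ALIGN ? logical : WRITE_ALIGN;

	/* The block layer keeps physical >= logical. */
	assert(physical != 0 && physical >= logical);

	/*
	 * "physical > align" is an addition of this sketch, so that the
	 * alignment is only ever raised, never lowered below the
	 * starting value.
	 */
	if (block_size % physical == 0 && alignment_offset == 0 &&
	    physical > align)
		align = physical;
	return align;
}
```

For a 16k client block on a device with 512-byte logical and 8k physical blocks, this returns 8192; if the client block is only 4k, or the device reports a nonzero alignment offset, it stays at 4096.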
> Mikulas
>
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > {
> > > unsigned int n_sectors;
> > > sector_t sector;
> > > - unsigned int offset, end;
> > > + unsigned int offset, end, align;
> > >
> > > b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > b->c->write_callback(b);
> > > offset = b->write_start;
> > > end = b->write_end;
> > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > + offset &= -align;
> > > + end += align - 1;
> > > + end &= -align;
> > > if (unlikely(end > b->c->block_size))
> > > end = b->c->block_size;
> > >
> > >
> >
On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> >
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix.
>
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
Yes, but Uladzislau said that this patch doesn't work for him. So, I
suspect that he has "logical_block_size" set incorrectly.
Mikulas
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.
> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
>
> -Ben
>
> > Please try this patch - does it fix it?
> >
> > Mikulas
> >
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > + offset &= -align;
> > + end += align - 1;
> > + end &= -align;
> > if (unlikely(end > b->c->block_size))
> > end = b->c->block_size;
> >
> >
>
On Tue, Nov 18, 2025 at 12:15:43PM +0100, Mikulas Patocka wrote:
>
>
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
>
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix.
> >
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
>
> Yes, but Uladzislau said that this patch doesn't work for him. So, I
> suspect that he has "logical_block_size" set incorrectly.
>
Indeed, because logical is < physical in my case. Your change does not
fix it because the I/O size is equal to the physical block size.
--
Uladzislau Rezki
Hello!
Sorry, I missed your email for some reason. It is probably because
you replied with a different subject than the one I sent
initially.
>
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
>
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> >
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
>
> Hi
>
> I think it would be better to fix this in dm-bufio, so that other dm-bufio
> users would also benefit from the fix. Please try this patch - does it fix
> it?
>
If it solves what I described, I do not mind :)
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
>
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
>
> ---
> drivers/md/dm-bufio.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> {
> unsigned int n_sectors;
> sector_t sector;
> - unsigned int offset, end;
> + unsigned int offset, end, align;
>
> b->end_io = end_io;
>
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> b->c->write_callback(b);
> offset = b->write_start;
> end = b->write_end;
> - offset &= -DM_BUFIO_WRITE_ALIGN;
> - end += DM_BUFIO_WRITE_ALIGN - 1;
> - end &= -DM_BUFIO_WRITE_ALIGN;
> + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> + offset &= -align;
> + end += align - 1;
> + end &= -align;
> if (unlikely(end > b->c->block_size))
> end = b->c->block_size;
>
>
I will check it and get back soon.
Thank you.
--
Uladzislau Rezki
On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> Hello!
>
> Sorry i have missed you email for unknown reason to me. It is
> probably because you answered to email with different subject
> i sent initially.
>
> >
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> >
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > >
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> >
> > Hi
> >
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > users would also benefit from the fix. Please try this patch - does it fix
> > it?
> >
> If it solves what i describe i do not mind :)
>
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> >
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> >
> > ---
> > drivers/md/dm-bufio.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > {
> > unsigned int n_sectors;
> > sector_t sector;
> > - unsigned int offset, end;
> > + unsigned int offset, end, align;
> >
> > b->end_io = end_io;
> >
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > b->c->write_callback(b);
> > offset = b->write_start;
> > end = b->write_end;
> > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > - end &= -DM_BUFIO_WRITE_ALIGN;
> > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
>
Should it be the physical_block_size of the device? It is the minimum I/O
the device can perform. The point is, a user sets the "ubs" size, which should
correspond to the smallest I/O the device can write, i.e. physically.
--
Uladzislau Rezki
On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > Hello!
> >
> > Sorry i have missed you email for unknown reason to me. It is
> > probably because you answered to email with different subject
> > i sent initially.
> >
> > >
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > >
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > >
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > >
> > > Hi
> > >
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > users would also benefit from the fix. Please try this patch - does it fix
> > > it?
> > >
> > If it solves what i describe i do not mind :)
> >
> > >
> > >
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > >
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > >
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > >
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > >
> > > ---
> > > drivers/md/dm-bufio.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > {
> > > unsigned int n_sectors;
> > > sector_t sector;
> > > - unsigned int offset, end;
> > > + unsigned int offset, end, align;
> > >
> > > b->end_io = end_io;
> > >
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > b->c->write_callback(b);
> > > offset = b->write_start;
> > > end = b->write_end;
> > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> >
> Should it be physical_block_size of device? It is a min_io the device
> can perform. The point is, a user sets "ubs" size which should correspond
> to the smallest I/O the device can write, i.e. physically.
physical_block_size is unreliable - some SSDs report physical block size
512 bytes, some 4k. Regardless of what they report, all current SSDs have
4k sector size internally and they do slow read-modify-write cycle on
requests that are not aligned on 4k boundary.
Mikulas
> --
> Uladzislau Rezki
>
On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
>
>
> On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
>
> > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > Hello!
> > >
> > > Sorry i have missed you email for unknown reason to me. It is
> > > probably because you answered to email with different subject
> > > i sent initially.
> > >
> > > >
> > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > >
> > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > >
> > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > and sized to ubs.
> > > >
> > > > Hi
> > > >
> > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > > users would also benefit from the fix. Please try this patch - does it fix
> > > > it?
> > > >
> > > If it solves what i describe i do not mind :)
> > >
> > > >
> > > >
> > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > >
> > > > There may be devices with logical block size larger than 4k. Fix
> > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > with 8k logical block size created with qemu parameters:
> > > >
> > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > >
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > Cc: stable@vger.kernel.org
> > > >
> > > > ---
> > > > drivers/md/dm-bufio.c | 9 +++++----
> > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > >
> > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > {
> > > > unsigned int n_sectors;
> > > > sector_t sector;
> > > > - unsigned int offset, end;
> > > > + unsigned int offset, end, align;
> > > >
> > > > b->end_io = end_io;
> > > >
> > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > b->c->write_callback(b);
> > > > offset = b->write_start;
> > > > end = b->write_end;
> > > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > >
> > Should it be physical_block_size of device? It is a min_io the device
> > can perform. The point is, a user sets "ubs" size which should correspond
> > to the smallest I/O the device can write, i.e. physically.
>
> physical_block_size is unreliable - some SSDs report physical block size
> 512 bytes, some 4k. Regardless of what they report, all current SSDs have
> 4k sector size internally and they do slow read-modify-write cycle on
> requests that are not aligned on 4k boundary.
>
I see. Some NVMe devices have buggy firmware, therefore we have a lot of
quirk flags. I agree there is a mess there.
The change does not help my project and case. I posted the patch to fix
dm-ebs because the code submits a partial size instead of the ubs size,
which is what the user actually asks for. When a target is created, the
physical_block_size corresponds to ubs.
I would really appreciate it if you took the fix I posted. Your patch can
be sent out separately.
Does it work for you?
Thank you!
--
Uladzislau Rezki
On Wed, Oct 29, 2025 at 02:06:31PM +0100, Uladzislau Rezki wrote:
> On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> >
> >
> > On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> >
> > > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > > Hello!
> > > >
> > > > Sorry i have missed you email for unknown reason to me. It is
> > > > probably because you answered to email with different subject
> > > > i sent initially.
> > > >
> > > > >
> > > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > >
> > > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > >
> > > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > > and sized to ubs.
> > > > >
> > > > > Hi
> > > > >
> > > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio
> > > > > users would also benefit from the fix. Please try this patch - does it fix
> > > > > it?
> > > > >
> > > > If it solves what i describe i do not mind :)
> > > >
> > > > >
> > > > >
> > > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > >
> > > > > There may be devices with logical block size larger than 4k. Fix
> > > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > > with 8k logical block size created with qemu parameters:
> > > > >
> > > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > >
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > Cc: stable@vger.kernel.org
> > > > >
> > > > > ---
> > > > > drivers/md/dm-bufio.c | 9 +++++----
> > > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > > >
> > > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/drivers/md/dm-bufio.c 2025-10-13 21:42:47.000000000 +0200
> > > > > +++ linux-2.6/drivers/md/dm-bufio.c 2025-10-20 14:40:32.000000000 +0200
> > > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > > {
> > > > > unsigned int n_sectors;
> > > > > sector_t sector;
> > > > > - unsigned int offset, end;
> > > > > + unsigned int offset, end, align;
> > > > >
> > > > > b->end_io = end_io;
> > > > >
> > > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > > b->c->write_callback(b);
> > > > > offset = b->write_start;
> > > > > end = b->write_end;
> > > > > - offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > > - end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > > - end &= -DM_BUFIO_WRITE_ALIGN;
> > > > > + align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > >
> > > Should it be physical_block_size of device? It is a min_io the device
> > > can perform. The point is, a user sets "ubs" size which should correspond
> > > to the smallest I/O the device can write, i.e. physically.
> >
> > physical_block_size is unreliable - some SSDs report physical block size
> > 512 bytes, some 4k. Regardless of what they report, all current SSDs have
> > 4k sector size internally and they do slow read-modify-write cycle on
> > requests that are not aligned on 4k boundary.
> >
> I see. Some NVMEs have buggy firmwares therefore we have a lot of quicks
> flags. I agree there is mess there.
>
> The change does not help my project and case. I posted the patch to fix
> the dm-ebs as the code offloads partial size instead of ubs size, what
> actually a user asking for. When a target is created, the physical_block_size
> corresponds to ubs.
>
> I really appreciate if you take the fix i posted. Your patch can be
> sent out separately.
>
> Does it work for you?
>
Any feedback or comments on it?
--
Uladzislau Rezki