[PATCH] dm-bufio: align write boundary on bdev_logical_block_size

Mikulas Patocka posted 1 patch 3 months, 2 weeks ago
drivers/md/dm-bufio.c |    9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
[PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Mikulas Patocka 3 months, 2 weeks ago


On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:

> When performing a read-modify-write(RMW) operation, any modification
> to a buffered block must cause the entire buffer to be marked dirty.
> 
> Marking only a subrange as dirty is incorrect because the underlying
> device block size(ubs) defines the minimum read/write granularity. A
> lower device can perform I/O only on regions which are fully aligned
> and sized to ubs.

Hi

I think it would be better to fix this in dm-bufio, so that other dm-bufio 
users would also benefit from the fix. Please try this patch - does it fix 
it?

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

There may be devices with logical block size larger than 4k. Fix
dm-bufio, so that it will align I/O on logical block size. This commit
fixes I/O errors on the dm-ebs target on the top of emulated nvme device
with 8k logical block size created with qemu parameters:

-device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/md/dm-bufio.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
@@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
 {
 	unsigned int n_sectors;
 	sector_t sector;
-	unsigned int offset, end;
+	unsigned int offset, end, align;
 
 	b->end_io = end_io;
 
@@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
 			b->c->write_callback(b);
 		offset = b->write_start;
 		end = b->write_end;
-		offset &= -DM_BUFIO_WRITE_ALIGN;
-		end += DM_BUFIO_WRITE_ALIGN - 1;
-		end &= -DM_BUFIO_WRITE_ALIGN;
+		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
+		offset &= -align;
+		end += align - 1;
+		end &= -align;
 		if (unlikely(end > b->c->block_size))
 			end = b->c->block_size;
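
The rounding in the hunk above can be tried in user space. This is a minimal sketch, not the kernel code: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN and is assumed to be 4096 (the "4k" in the commit message), the helper names are hypothetical, and the clamp of end to block_size is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the commit message refers to the existing alignment as "4k". */
#define WRITE_ALIGN 4096u

/* Round x down to a multiple of align; align must be a power of two.
 * Equivalent to the "x &= -align" form used in the patch. */
static uint32_t align_down(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint32_t align_up(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Choose the alignment as the patch does: max(constant, logical block size). */
static uint32_t write_align(uint32_t logical_block_size)
{
	return logical_block_size > WRITE_ALIGN ? logical_block_size : WRITE_ALIGN;
}
```

With a dirty range of bytes 4608..5120 and a 512-byte logical block size, the write is rounded out to 4096..8192. With the 8k logical block size from the qemu example it is rounded out to 0..8192, so the bio starts and ends on a boundary the device can address.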
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Benjamin Marzinski 2 months, 3 weeks ago
On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> 
> 
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> 
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> > 
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
> 
> Hi
> 
> I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> users would also benefit from the fix.

This looks to me like it should accomplish the same thing as
Uladzislau's patch. But I think there could still be problems with other
dm-bufio users, for devices where the blocksize is larger than 4k.

In dm_bufio_client_create() I think we want to make sure that block_size
is a multiple of bdev_logical_block_size(bdev), instead of 512b.
Otherwise block_to_sector() can return sectors that are not addressable
on the device. Unfortunately, I don't think all users of dm-bufio will
pass in block_sizes that are larger than 4k (uds_make_bufio() in
dm-vdo/indexer/io-factory.c for instance).
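
A small user-space model of the addressing concern, as a sketch only: model_block_to_sector() is a hypothetical stand-in that ignores any start offset the real block_to_sector() may add, and sector_is_addressable() is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Hypothetical model: first 512-byte sector of a given bufio block. */
static uint64_t model_block_to_sector(uint64_t block, uint32_t block_size)
{
	assert(block_size % (1u << SECTOR_SHIFT) == 0);
	return block * (block_size >> SECTOR_SHIFT);
}

/* A sector is addressable only if it starts on a logical block boundary. */
static bool sector_is_addressable(uint64_t sector, uint32_t logical_block_size)
{
	assert(logical_block_size != 0);
	return (sector << SECTOR_SHIFT) % logical_block_size == 0;
}
```

With a 4096-byte block_size on a device with an 8192-byte logical block size, block 1 maps to sector 8, which is byte offset 4096 and falls in the middle of a logical block. With an 8192-byte block_size, block 1 maps to sector 16 and is addressable.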

-Ben

Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Mikulas Patocka 2 months, 2 weeks ago

On Mon, 17 Nov 2025, Benjamin Marzinski wrote:

> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix.
> 
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.
> 
> In dm_bufio_client_create() I think we want to make sure that block_size
> is a multiple of bdev_logical_block_size(bdev), instead of 512b.

I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to 
dm_bufio_client_create(). But I think it's too late in this development 
cycle; I would add it after the next merge window closes, when I open a 
new patch series for kernel 6.20 (or 7.0).
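
For comparison, a user-space sketch of the two candidate checks discussed here: the "not smaller than" test from the WARN_ON above and the "multiple of" test from the quoted suggestion. Both helper names are hypothetical, and the example assumes block sizes that are not a power of two are accepted, as the "multiple of 512b" rule mentioned in the quote implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Passes when block_size is at least one logical block. */
static bool passes_size_check(uint32_t block_size, uint32_t logical_block_size)
{
	assert(logical_block_size != 0);
	return block_size >= logical_block_size;
}

/* Passes only when block_size is a whole number of logical blocks. */
static bool passes_multiple_check(uint32_t block_size, uint32_t logical_block_size)
{
	assert(logical_block_size != 0);
	return block_size % logical_block_size == 0;
}
```

A 12288-byte block_size on a device with an 8192-byte logical block size passes the size check but fails the multiple check, so the two are not interchangeable for such sizes.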

> Otherwise block_to_sector() can return sectors that are not addressable
> on the device. Unfortunately, I don't think all users of dm-bufio will
> pass in block_sizes that are larger than 4k (uds_make_bufio() in
> dm-vdo/indexer/io-factory.c for instance).
> 
> -Ben
> 
> > Please try this patch - does it fix it?
> > 
> > Mikulas

I changed the patch below, so that it aligns write bios on 
max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
greater than logical block size, the writes are aligned so that the device 
doesn't do read-modify-write.

Mikulas

> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> > 
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> >  drivers/md/dm-bufio.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> >  {
> >  	unsigned int n_sectors;
> >  	sector_t sector;
> > -	unsigned int offset, end;
> > +	unsigned int offset, end, align;
> >  
> >  	b->end_io = end_io;
> >  
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> >  			b->c->write_callback(b);
> >  		offset = b->write_start;
> >  		end = b->write_end;
> > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > +		offset &= -align;
> > +		end += align - 1;
> > +		end &= -align;
> >  		if (unlikely(end > b->c->block_size))
> >  			end = b->c->block_size;
> >  
> > 
>
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Christoph Hellwig 2 months, 2 weeks ago
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> I changed the patch below, so that it aligns write bios on 
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> greater than logical block size, the writes are aligned so that the device 
> doesn't do read-modify-write.

That doesn't make any sense whatsoever.  The physical block size must
be >= logical block size, and the block layer enforces that.
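
The redundancy can be seen in a short user-space sketch. The helper names are hypothetical and the invariant physical >= logical is taken from the statement above, not verified here.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* The three-way form: max3(base, logical, physical). */
static uint32_t align_max3(uint32_t base, uint32_t lbs, uint32_t pbs)
{
	return max_u32(base, max_u32(lbs, pbs));
}

/* The two-way form, valid only while physical >= logical holds. */
static uint32_t align_max2(uint32_t base, uint32_t lbs, uint32_t pbs)
{
	assert(pbs >= lbs); /* invariant stated in the message above */
	return max_u32(base, pbs);
}
```

Whenever the invariant holds, max(lbs, pbs) is pbs, so both forms return the same value and the logical block size term adds nothing.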
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Mikulas Patocka 2 months, 2 weeks ago

On Wed, 19 Nov 2025, Christoph Hellwig wrote:

> On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> > I changed the patch below, so that it aligns write bios on 
> > max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> > bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> > greater than logical block size, the writes are aligned so that the device 
> > doesn't do read-modify-write.
> 
> That doesn't make any sense whatsoever.  The physical block size must
> be >= logical block size, and the block layer enforces that.

OK, so I changed it to max(DM_BUFIO_WRITE_ALIGN, 
bdev_physical_block_size(b->c->bdev))

Mikulas
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Benjamin Marzinski 2 months, 2 weeks ago
On Tue, Nov 18, 2025 at 06:45:55PM +0100, Mikulas Patocka wrote:
> 
> 
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> 
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix.
> > 
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> > 
> > In dm_bufio_client_create() I think we want to make sure that block_size
> > is a multiple of bdev_logical_block_size(bdev), instead of 512b.
> 
> I could add WARN_ON(block_size < bdev_logical_block_size(bdev)) to 
> dm_bufio_client_create. But I think it's too late in this development 
> cycle, I would add it after the next merge window closes, when I open a 
> new patch series for the kernel 6.20 (or 7.0).
> 
> > Otherwise block_to_sector() can return sectors that are not addressable
> > on the device. Unfortunately, I don't think all users of dm-bufio will
> > pass in block_sizes that are larger than 4k (uds_make_bufio() in
> > dm-vdo/indexer/io-factory.c for instance).
> > 
> > -Ben
> > 
> > > Please try this patch - does it fix it?
> > > 
> > > Mikulas
> 
> I changed the patch below, so that it aligns write bios on 
> max3(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev), 
> bdev_physical_block_size(b->c->bdev)); - so that if physical block size is 
> greater than logical block size, the writes are aligned so that the device 
> doesn't do read-modify-write.

This will really only help if the bufio client block_size is a multiple
of the underlying device's physical block size, and the device is
aligned to the physical block size. Perhaps we should figure
out the alignment in dm_bufio_client_create(), with something like:

	c->align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(bdev));
	if (block_size & -bdev_physical_block_size(bdev) &&
	    bdev_alignment_offset(bdev) == 0)
		c->align = bdev_physical_block_size(bdev);

I suppose pre-calculating this could cause problems if the underlying
device was another dm device, and it switched tables in a way that
changed its limits. I dunno if we care about that, however.
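
A user-space sketch of that idea, under stated assumptions: pick_write_align() is a hypothetical helper, WRITE_ALIGN is assumed to be 4096, and the device limits are passed in as plain numbers. It deviates from the fragment above in two labeled ways: it tests "multiple of" with a modulo, and it only ever raises the alignment.

```c
#include <assert.h>
#include <stdint.h>

#define WRITE_ALIGN 4096u /* assumed value of DM_BUFIO_WRITE_ALIGN */

/* Hypothetical helper: choose the write alignment once, at client creation. */
static uint32_t pick_write_align(uint32_t block_size, uint32_t lbs,
				 uint32_t pbs, int alignment_offset)
{
	uint32_t align;

	assert(lbs != 0 && pbs != 0);
	/* Start from max(constant, logical block size). */
	align = lbs > WRITE_ALIGN ? lbs : WRITE_ALIGN;
	/* Use the physical block size only if whole physical blocks fit into
	 * a bufio block and the device itself starts on a physical boundary. */
	if (block_size % pbs == 0 && alignment_offset == 0 && pbs > align)
		align = pbs;
	return align;
}
```

For a 16384-byte block_size on a device with 4096-byte logical and 8192-byte physical blocks and no alignment offset, this picks 8192. If block_size is 4096 or the alignment offset is non-zero, it stays at 4096.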

-Ben 

Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Mikulas Patocka 2 months, 3 weeks ago

On Mon, 17 Nov 2025, Benjamin Marzinski wrote:

> On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix.
> 
> This looks to me like it should accomplish the same thing as
> Uladzislau's patch. But I think there could still be problems with other
> dm-bufio users, for devices where the blocksize is larger than 4k.

Yes, but Uladzislau said that this patch doesn't work for him. So, I 
suspect that he has "logical_block_size" set incorrectly.

Mikulas

Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Uladzislau Rezki 2 months, 3 weeks ago
On Tue, Nov 18, 2025 at 12:15:43PM +0100, Mikulas Patocka wrote:
> 
> 
> On Mon, 17 Nov 2025, Benjamin Marzinski wrote:
> 
> > On Mon, Oct 20, 2025 at 02:48:13PM +0200, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix.
> > 
> > This looks to me like it should accomplish the same thing as
> > Uladzislau's patch. But I think there could still be problems with other
> > dm-bufio users, for devices where the blocksize is larger than 4k.
> 
> Yes, but Uladzislau said that this patch doesn't work for him. So, I 
> suspect that he has "logical_block_size" set incorrectly.
> 
Indeed, because logical is < physical in my case. Your change does not fix
it because the I/O size is equal to the physical block size.
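
To illustrate with assumed numbers (the actual device limits are not given in this thread): a user-space sketch of the length of the write after rounding the dirty range out to a given alignment. aligned_write_len() is a hypothetical helper and omits the clamp to block_size.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the write after rounding [start, end) out to align. */
static uint32_t aligned_write_len(uint32_t start, uint32_t end, uint32_t align)
{
	uint32_t lo, hi;

	assert(align != 0 && (align & (align - 1)) == 0);
	assert(start < end);
	lo = start & ~(align - 1);               /* round start down */
	hi = (end + align - 1) & ~(align - 1);   /* round end up */
	return hi - lo;
}
```

Assuming a 4096-byte logical and an 8192-byte physical block size, a 512-byte dirty range at offset 0 becomes a 4096-byte write when aligned on the logical block size, which is only half a physical block. Aligned on the physical block size it becomes an 8192-byte write.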

--
Uladzislau Rezki
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Uladzislau Rezki 3 months, 1 week ago
Hello!

Sorry, I missed your email for some reason. It is probably because
you replied to an email with a different subject than the one I sent
initially.

> 
> On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> 
> > When performing a read-modify-write(RMW) operation, any modification
> > to a buffered block must cause the entire buffer to be marked dirty.
> > 
> > Marking only a subrange as dirty is incorrect because the underlying
> > device block size(ubs) defines the minimum read/write granularity. A
> > lower device can perform I/O only on regions which are fully aligned
> > and sized to ubs.
> 
> Hi
> 
> I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> users would also benefit from the fix. Please try this patch - does it fix 
> it?
> 
If it solves what I describe, I do not mind :)

> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> There may be devices with logical block size larger than 4k. Fix
> dm-bufio, so that it will align I/O on logical block size. This commit
> fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> with 8k logical block size created with qemu parameters:
> 
> -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  drivers/md/dm-bufio.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
>  {
>  	unsigned int n_sectors;
>  	sector_t sector;
> -	unsigned int offset, end;
> +	unsigned int offset, end, align;
>  
>  	b->end_io = end_io;
>  
> @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
>  			b->c->write_callback(b);
>  		offset = b->write_start;
>  		end = b->write_end;
> -		offset &= -DM_BUFIO_WRITE_ALIGN;
> -		end += DM_BUFIO_WRITE_ALIGN - 1;
> -		end &= -DM_BUFIO_WRITE_ALIGN;
> +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> +		offset &= -align;
> +		end += align - 1;
> +		end &= -align;
>  		if (unlikely(end > b->c->block_size))
>  			end = b->c->block_size;
>  
> 
I will check it and get back soon.

Thank you.

--
Uladzislau Rezki
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Uladzislau Rezki 3 months, 1 week ago
On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> Hello!
> 
> Sorry, I missed your email for some reason. It is probably because
> you replied to an email with a different subject than the one I sent
> initially.
> 
> > 
> > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > 
> > > When performing a read-modify-write(RMW) operation, any modification
> > > to a buffered block must cause the entire buffer to be marked dirty.
> > > 
> > > Marking only a subrange as dirty is incorrect because the underlying
> > > device block size(ubs) defines the minimum read/write granularity. A
> > > lower device can perform I/O only on regions which are fully aligned
> > > and sized to ubs.
> > 
> > Hi
> > 
> > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > users would also benefit from the fix. Please try this patch - does it fix 
> > it?
> > 
> If it solves what I describe, I do not mind :)
> 
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > There may be devices with logical block size larger than 4k. Fix
> > dm-bufio, so that it will align I/O on logical block size. This commit
> > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > with 8k logical block size created with qemu parameters:
> > 
> > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> >  drivers/md/dm-bufio.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> >  {
> >  	unsigned int n_sectors;
> >  	sector_t sector;
> > -	unsigned int offset, end;
> > +	unsigned int offset, end, align;
> >  
> >  	b->end_io = end_io;
> >  
> > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> >  			b->c->write_callback(b);
> >  		offset = b->write_start;
> >  		end = b->write_end;
> > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
>
Should it be the physical_block_size of the device? It is the min_io the
device can perform. The point is, a user sets the "ubs" size, which should
correspond to the smallest I/O the device can physically write.

--
Uladzislau Rezki
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Mikulas Patocka 3 months, 1 week ago

On Tue, 28 Oct 2025, Uladzislau Rezki wrote:

> On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > Hello!
> > 
> > Sorry, I missed your email for some reason. It is probably because
> > you replied to an email with a different subject than the one I sent
> > initially.
> > 
> > > 
> > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > 
> > > > When performing a read-modify-write(RMW) operation, any modification
> > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > 
> > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > lower device can perform I/O only on regions which are fully aligned
> > > > and sized to ubs.
> > > 
> > > Hi
> > > 
> > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > users would also benefit from the fix. Please try this patch - does it fix 
> > > it?
> > > 
> > If it solves what I describe, I do not mind :)
> > 
> > > 
> > > 
> > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > There may be devices with logical block size larger than 4k. Fix
> > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > with 8k logical block size created with qemu parameters:
> > > 
> > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > > 
> > > ---
> > >  drivers/md/dm-bufio.c |    9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > >  {
> > >  	unsigned int n_sectors;
> > >  	sector_t sector;
> > > -	unsigned int offset, end;
> > > +	unsigned int offset, end, align;
> > >  
> > >  	b->end_io = end_io;
> > >  
> > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > >  			b->c->write_callback(b);
> > >  		offset = b->write_start;
> > >  		end = b->write_end;
> > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> >
> Should it be the physical_block_size of the device? It is the min_io the
> device can perform. The point is, a user sets the "ubs" size, which should
> correspond to the smallest I/O the device can physically write.

physical_block_size is unreliable - some SSDs report a physical block size 
of 512 bytes, some 4k. Regardless of what they report, all current SSDs have 
a 4k sector size internally and they do a slow read-modify-write cycle on 
requests that are not aligned on a 4k boundary.

Mikulas

> --
> Uladzislau Rezki
>
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Uladzislau Rezki 3 months, 1 week ago
On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> 
> 
> On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> 
> > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > Hello!
> > > 
> > > Sorry, I missed your email for some reason. It is probably because
> > > you replied to an email with a different subject than the one I sent
> > > initially.
> > > 
> > > > 
> > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > 
> > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > 
> > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > and sized to ubs.
> > > > 
> > > > Hi
> > > > 
> > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > > users would also benefit from the fix. Please try this patch - does it fix 
> > > > it?
> > > > 
> > > If it solves what I describe, I do not mind :)
> > > 
> > > > 
> > > > 
> > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > 
> > > > There may be devices with logical block size larger than 4k. Fix
> > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > with 8k logical block size created with qemu parameters:
> > > > 
> > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > 
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > Cc: stable@vger.kernel.org
> > > > 
> > > > ---
> > > >  drivers/md/dm-bufio.c |    9 +++++----
> > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > 
> > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > >  {
> > > >  	unsigned int n_sectors;
> > > >  	sector_t sector;
> > > > -	unsigned int offset, end;
> > > > +	unsigned int offset, end, align;
> > > >  
> > > >  	b->end_io = end_io;
> > > >  
> > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > >  			b->c->write_callback(b);
> > > >  		offset = b->write_start;
> > > >  		end = b->write_end;
> > > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > >
> > Should it be the physical_block_size of the device? It is the min_io the
> > device can perform. The point is, a user sets the "ubs" size, which should
> > correspond to the smallest I/O the device can physically write.
> 
> physical_block_size is unreliable - some SSDs report a physical block size 
> of 512 bytes, some 4k. Regardless of what they report, all current SSDs have 
> a 4k sector size internally and they do a slow read-modify-write cycle on 
> requests that are not aligned on a 4k boundary.
> 
I see. Some NVMe devices have buggy firmware, which is why we have so many
quirk flags. I agree it is a mess.
 
The change does not help my project or my case. I posted the patch to fix
dm-ebs because the code submits a partial range instead of a full ubs-sized
block, which is what the user actually asks for. When a target is created,
the physical_block_size corresponds to ubs.
 
I would really appreciate it if you took the fix I posted. Your patch can be
sent out separately.
 
Does it work for you?
 
Thank you!
 
--
Uladzislau Rezki
Re: [PATCH] dm-bufio: align write boundary on bdev_logical_block_size
Posted by Uladzislau Rezki 2 months, 4 weeks ago
On Wed, Oct 29, 2025 at 02:06:31PM +0100, Uladzislau Rezki wrote:
> On Wed, Oct 29, 2025 at 11:24:25AM +0100, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 28 Oct 2025, Uladzislau Rezki wrote:
> > 
> > > On Tue, Oct 28, 2025 at 09:47:40AM +0100, Uladzislau Rezki wrote:
> > > > Hello!
> > > > 
> > > > Sorry i have missed you email for unknown reason to me. It is
> > > > probably because you answered to email with different subject
> > > > i sent initially.
> > > > 
> > > > > 
> > > > > On Mon, 20 Oct 2025, Uladzislau Rezki (Sony) wrote:
> > > > > 
> > > > > > When performing a read-modify-write(RMW) operation, any modification
> > > > > > to a buffered block must cause the entire buffer to be marked dirty.
> > > > > > 
> > > > > > Marking only a subrange as dirty is incorrect because the underlying
> > > > > > device block size(ubs) defines the minimum read/write granularity. A
> > > > > > lower device can perform I/O only on regions which are fully aligned
> > > > > > and sized to ubs.
> > > > > 
> > > > > Hi
> > > > > 
> > > > > I think it would be better to fix this in dm-bufio, so that other dm-bufio 
> > > > > users would also benefit from the fix. Please try this patch - does it fix 
> > > > > it?
> > > > > 
> > > > If it solves what i describe i do not mind :)
> > > > 
> > > > > 
> > > > > 
> > > > > From: Mikulas Patocka <mpatocka@redhat.com>
> > > > > 
> > > > > There may be devices with logical block size larger than 4k. Fix
> > > > > dm-bufio, so that it will align I/O on logical block size. This commit
> > > > > fixes I/O errors on the dm-ebs target on the top of emulated nvme device
> > > > > with 8k logical block size created with qemu parameters:
> > > > > 
> > > > > -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192
> > > > > 
> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > > Cc: stable@vger.kernel.org
> > > > > 
> > > > > ---
> > > > >  drivers/md/dm-bufio.c |    9 +++++----
> > > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > > 
> > > > > Index: linux-2.6/drivers/md/dm-bufio.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/drivers/md/dm-bufio.c	2025-10-13 21:42:47.000000000 +0200
> > > > > +++ linux-2.6/drivers/md/dm-bufio.c	2025-10-20 14:40:32.000000000 +0200
> > > > > @@ -1374,7 +1374,7 @@ static void submit_io(struct dm_buffer *
> > > > >  {
> > > > >  	unsigned int n_sectors;
> > > > >  	sector_t sector;
> > > > > -	unsigned int offset, end;
> > > > > +	unsigned int offset, end, align;
> > > > >  
> > > > >  	b->end_io = end_io;
> > > > >  
> > > > > @@ -1388,9 +1388,10 @@ static void submit_io(struct dm_buffer *
> > > > >  			b->c->write_callback(b);
> > > > >  		offset = b->write_start;
> > > > >  		end = b->write_end;
> > > > > -		offset &= -DM_BUFIO_WRITE_ALIGN;
> > > > > -		end += DM_BUFIO_WRITE_ALIGN - 1;
> > > > > -		end &= -DM_BUFIO_WRITE_ALIGN;
> > > > > +		align = max(DM_BUFIO_WRITE_ALIGN, bdev_logical_block_size(b->c->bdev));
> > > >
> > > Should it be physical_block_size of device? It is a min_io the device
> > > can perform. The point is, a user sets "ubs" size which should correspond
> > > to the smallest I/O the device can write, i.e. physically.
> > 
> > physical_block_size is unreliable - some SSDs report physical block size 
> > 512 bytes, some 4k. Regardless of what they report, all current SSDs have 
> > 4k sector size internally and they do slow read-modify-write cycle on 
> > requests that are not aligned on 4k boundary.
> > 
> I see. Some NVMe devices have buggy firmware, which is why we have so many
> quirk flags. I agree it is a mess.
>  
> The change does not help my project or my case. I posted the patch to fix
> dm-ebs because the code submits a partial range instead of a full ubs-sized
> block, which is what the user actually asks for. When a target is created,
> the physical_block_size corresponds to ubs.
>  
> I would really appreciate it if you took the fix I posted. Your patch can be
> sent out separately.
>  
> Does it work for you?
>  
Any feedback or comments on it?

--
Uladzislau Rezki
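
For readers following the alignment discussion, here is a minimal user-space
sketch of the rounding the quoted patch appears to aim for. It is not the
kernel code: WRITE_ALIGN stands in for DM_BUFIO_WRITE_ALIGN (assumed here to
be 4096), and struct range and align_write_range() are hypothetical names
introduced only for illustration. The hunk quoted in this thread is cut off
after the "align = max(...)" line, so applying the same mask idiom as the
removed lines with the new align value is an assumption.

```c
#include <stdio.h>

/* Assumption: stands in for DM_BUFIO_WRITE_ALIGN in drivers/md/dm-bufio.c. */
#define WRITE_ALIGN 4096u

/* Hypothetical helper type: a byte range [start, end) inside one buffer. */
struct range {
	unsigned int start;
	unsigned int end;
};

/*
 * Widen the dirty range [start, end) so that both ends fall on a multiple
 * of max(WRITE_ALIGN, lbs), where lbs is the logical block size in bytes.
 * lbs must be a power of two, so the unsigned negation -align acts as the
 * mask ~(align - 1).
 */
static struct range align_write_range(unsigned int start, unsigned int end,
				      unsigned int lbs)
{
	unsigned int align = lbs > WRITE_ALIGN ? lbs : WRITE_ALIGN;
	struct range r;

	r.start = start & -align;		/* round down */
	r.end = (end + align - 1) & -align;	/* round up */
	return r;
}
```

With a dirty range of [4096, 4608) and a 512-byte logical block size, the
helper yields [4096, 8192). That range is 4k-aligned but starts in the middle
of an 8k block. With an 8192-byte logical block size the same dirty range
becomes [0, 8192), which a device like the emulated 8k nvme device from the
commit message could accept.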