[v1] ext4: enable block size larger than page size

[PATCH 04/25] ext4: make ext4_punch_hole() support large block size

Posted by libaokun@huaweicloud.com 3 months, 2 weeks ago

From: Baokun Li <libaokun1@huawei.com>

Since the block size may be greater than the page size, when a hole
extends beyond i_size, we need to align the hole's end upwards to the
larger of PAGE_SIZE and blocksize.

This is to prevent the issues seen in commit 2be4751b21ae ("ext4: fix
2nd xfstests 127 punch hole failure") from reappearing after BS > PS
is supported.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4c04af7e51c9..a63513a3db53 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4401,7 +4401,8 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
 	 * the page that contains i_size.
 	 */
 	if (end > inode->i_size)
-		end = round_up(inode->i_size, PAGE_SIZE);
+		end = round_up(inode->i_size,
+			       umax(PAGE_SIZE, sb->s_blocksize));
 	if (end > max_end)
 		end = max_end;
 	length = end - offset;
-- 
2.46.1

Re: [PATCH 04/25] ext4: make ext4_punch_hole() support large block size

Posted by Jan Kara 3 months, 1 week ago

On Sat 25-10-25 11:22:00, libaokun@huaweicloud.com wrote:
> From: Baokun Li <libaokun1@huawei.com>
> 
> Since the block size may be greater than the page size, when a hole
> extends beyond i_size, we need to align the hole's end upwards to the
> larger of PAGE_SIZE and blocksize.
> 
> This is to prevent the issues seen in commit 2be4751b21ae ("ext4: fix
> 2nd xfstests 127 punch hole failure") from reappearing after BS > PS
> is supported.
> 
> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

When going for bs > ps support, I'm very suspicious of any code that keeps
using PAGE_SIZE because it doesn't make too much sense anymore. Usually that
should be either appropriate folio size or something like that. For example
in this case if we indeed rely on freeing some buffers then with 4k block
size in an order-2 folio things would be already broken.

As far as I'm checking truncate_inode_pages_range() already handles partial
folio invalidation fine so I think we should just use blocksize in the
rounding (to save pointless tail block zeroing) and be done with it.

> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 4c04af7e51c9..a63513a3db53 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4401,7 +4401,8 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>  	 * the page that contains i_size.
>  	 */
>  	if (end > inode->i_size)

BTW I think here we should have >= (not your fault but we can fix it when
changing the code).

> -		end = round_up(inode->i_size, PAGE_SIZE);
> +		end = round_up(inode->i_size,
> +			       umax(PAGE_SIZE, sb->s_blocksize));
>  	if (end > max_end)
>  		end = max_end;
>  	length = end - offset;

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Re: [PATCH 04/25] ext4: make ext4_punch_hole() support large block size

Posted by Baokun Li 3 months, 1 week ago

On 2025-11-03 16:05, Jan Kara wrote:
> On Sat 25-10-25 11:22:00, libaokun@huaweicloud.com wrote:
>> From: Baokun Li <libaokun1@huawei.com>
>>
>> Since the block size may be greater than the page size, when a hole
>> extends beyond i_size, we need to align the hole's end upwards to the
>> larger of PAGE_SIZE and blocksize.
>>
>> This is to prevent the issues seen in commit 2be4751b21ae ("ext4: fix
>> 2nd xfstests 127 punch hole failure") from reappearing after BS > PS
>> is supported.
>>
>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
>> Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
> When going for bs > ps support, I'm very suspicious of any code that keeps
> using PAGE_SIZE because it doesn't make too much sense anymore. Usually that
> should be either appropriate folio size or something like that. For example
> in this case if we indeed rely on freeing some buffers then with 4k block
> size in an order-2 folio things would be already broken.
>
> As far as I'm checking truncate_inode_pages_range() already handles partial
> folio invalidation fine so I think we should just use blocksize in the
> rounding (to save pointless tail block zeroing) and be done with it.

Right. I missed that truncate_inode_pages_range already handles this.

I will directly use the blocksize in v2.

Thank you for your review!

>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 4c04af7e51c9..a63513a3db53 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -4401,7 +4401,8 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  	 * the page that contains i_size.
>>  	 */
>>  	if (end > inode->i_size)
> BTW I think here we should have >= (not your fault but we can fix it when
> changing the code).

Yes, I didn’t notice this bug. I will fix it together in v2.


Cheers,
Baokun

>
>> -		end = round_up(inode->i_size, PAGE_SIZE);
>> +		end = round_up(inode->i_size,
>> +			       umax(PAGE_SIZE, sb->s_blocksize));
>>  	if (end > max_end)
>>  		end = max_end;
>>  	length = end - offset;
> 								Honza