From: Zhang Yi <yi.zhang@huawei.com>
The journal credits calculation in ext4_ext_index_trans_blocks() is
currently inadequate. It only multiplies the depth of the extents tree
and doesn't account for the blocks that may be required for adding the
leaf extents themselves.
After enabling large folios, we can easily run out of handle credits,
triggering a warning in jbd2_journal_dirty_metadata() on filesystems
with a 1KB block size. This occurs because we may need more extents when
iterating through each large folio in
ext4_do_writepages()->mpage_map_and_submit_extent(). Therefore, we
should modify ext4_ext_index_trans_blocks() to include a count of the
leaf extents in the worst case as well.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
fs/ext4/extents.c | 5 +++--
fs/ext4/inode.c | 10 ++++------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index c616a16a9f36..e759941bd262 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2405,9 +2405,10 @@ int ext4_ext_index_trans_blocks(struct inode *inode, int extents)
depth = ext_depth(inode);
if (extents <= 1)
- index = depth * 2;
+ index = depth * 2 + extents;
else
- index = depth * 3;
+ index = depth * 3 +
+ DIV_ROUND_UP(extents, ext4_ext_space_block(inode, 0));
return index;
}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ffbf444b56d4..3e962a760d71 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5792,18 +5792,16 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
int ret;
/*
- * How many index blocks need to touch to map @lblocks logical blocks
- * to @pextents physical extents?
+ * How many index and lead blocks need to touch to map @lblocks
+ * logical blocks to @pextents physical extents?
*/
idxblocks = ext4_index_trans_blocks(inode, lblocks, pextents);
- ret = idxblocks;
-
/*
* Now let's see how many group bitmaps and group descriptors need
* to account
*/
- groups = idxblocks + pextents;
+ groups = idxblocks;
gdpblocks = groups;
if (groups > ngroups)
groups = ngroups;
@@ -5811,7 +5809,7 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
gdpblocks = EXT4_SB(inode->i_sb)->s_gdb_count;
/* bitmaps and block group descriptor blocks */
- ret += groups + gdpblocks;
+ ret = idxblocks + groups + gdpblocks;
/* Blocks for super block, inode, quota and xattr blocks */
ret += EXT4_META_TRANS_BLOCKS(inode->i_sb);
--
2.46.1
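For readers following the arithmetic, here is a minimal sketch of the old and new estimates from the extents.c hunk above, written as plain userspace C. The function names and the `depth` and `extents_per_block` parameters are hypothetical stand-ins for ext_depth(inode) and ext4_ext_space_block(inode, 0); this is not the kernel code itself. The value 84 used in the note below assumes a 1KB block with a 12-byte extent header and 12-byte extent entries, i.e. (1024 - 12) / 12.

```c
#include <assert.h>

/* Same rounding helper the patch relies on. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Estimate before the patch: index blocks only, no leaf blocks. */
static int sketch_old_index_trans_blocks(int depth, int extents)
{
	return (extents <= 1) ? depth * 2 : depth * 3;
}

/*
 * Estimate after the patch: additionally count the leaf blocks needed
 * to hold the new extents. extents_per_block is a hypothetical stand-in
 * for ext4_ext_space_block(inode, 0).
 */
static int sketch_index_trans_blocks(int depth, int extents,
				     int extents_per_block)
{
	assert(extents_per_block > 0);

	if (extents <= 1)
		return depth * 2 + extents;
	return depth * 3 + DIV_ROUND_UP(extents, extents_per_block);
}
```

With depth 2, 100 extents and 84 extents per leaf block, the old estimate is 6 blocks and the new one is 6 + 2 = 8, the extra two being the leaf blocks that the old formula did not reserve credits for.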
On Mon 12-05-25 14:33:16, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> The journal credits calculation in ext4_ext_index_trans_blocks() is
> currently inadequate. It only multiplies the depth of the extents tree
> and doesn't account for the blocks that may be required for adding the
> leaf extents themselves.
>
> After enabling large folios, we can easily run out of handle credits,
> triggering a warning in jbd2_journal_dirty_metadata() on filesystems
> with a 1KB block size. This occurs because we may need more extents when
> iterating through each large folio in
> ext4_do_writepages()->mpage_map_and_submit_extent(). Therefore, we
> should modify ext4_ext_index_trans_blocks() to include a count of the
> leaf extents in the worst case as well.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

One comment below

> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index c616a16a9f36..e759941bd262 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2405,9 +2405,10 @@ int ext4_ext_index_trans_blocks(struct inode *inode, int extents)
> depth = ext_depth(inode);
>
> if (extents <= 1)
> - index = depth * 2;
> + index = depth * 2 + extents;
> else
> - index = depth * 3;
> + index = depth * 3 +
> + DIV_ROUND_UP(extents, ext4_ext_space_block(inode, 0));
>
> return index;
> }
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ffbf444b56d4..3e962a760d71 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5792,18 +5792,16 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
> int ret;
>
> /*
> - * How many index blocks need to touch to map @lblocks logical blocks
> - * to @pextents physical extents?
> + * How many index and lead blocks need to touch to map @lblocks
> + * logical blocks to @pextents physical extents?
> */
> idxblocks = ext4_index_trans_blocks(inode, lblocks, pextents);
>
> - ret = idxblocks;
> -
> /*
> * Now let's see how many group bitmaps and group descriptors need
> * to account
> */
> - groups = idxblocks + pextents;
> + groups = idxblocks;

I don't think you can drop 'pextents' from this computation... Yes, you now
account possible number of modified extent tree leaf blocks in
ext4_index_trans_blocks() but additionally, each extent separately may be
allocated from a different group and thus need to update different bitmap
and group descriptor block. That is separate from the computation you do in
ext4_index_trans_blocks() AFAICT...

Honza

> gdpblocks = groups;
> if (groups > ngroups)
> groups = ngroups;
> @@ -5811,7 +5809,7 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
> gdpblocks = EXT4_SB(inode->i_sb)->s_gdb_count;
>
> /* bitmaps and block group descriptor blocks */
> - ret += groups + gdpblocks;
> + ret = idxblocks + groups + gdpblocks;
>
> /* Blocks for super block, inode, quota and xattr blocks */
> ret += EXT4_META_TRANS_BLOCKS(inode->i_sb);
> --
> 2.46.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
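To make the review comment above concrete, here is a rough userspace sketch of the group accounting with 'pextents' retained. All names are hypothetical: `ngroups`, `gdb_count` and `fixed_meta` stand in for the filesystem's group count, s_gdb_count and EXT4_META_TRANS_BLOCKS(). The clamp against `gdb_count` is an assumption, since its condition falls between the two hunks quoted in the patch.

```c
#include <assert.h>

/*
 * Sketch of the credit total when every new extent may land in a
 * different block group, so each one may dirty its own bitmap and
 * group descriptor block.
 */
static int sketch_meta_trans_blocks(int idxblocks, int pextents,
				    int ngroups, int gdb_count,
				    int fixed_meta)
{
	int groups, gdpblocks;

	assert(ngroups > 0 && gdb_count > 0);

	/* Extent tree blocks plus the data extents themselves. */
	groups = idxblocks + pextents;
	gdpblocks = groups;

	/* Cannot touch more bitmaps than there are groups. */
	if (groups > ngroups)
		groups = ngroups;
	/* Assumed clamp: no more descriptor blocks than exist. */
	if (gdpblocks > gdb_count)
		gdpblocks = gdb_count;

	return idxblocks + groups + gdpblocks + fixed_meta;
}
```

With idxblocks = 8, ngroups = 64, gdb_count = 4 and fixed_meta = 0, passing pextents = 100 yields 8 + 64 + 4 = 76, whereas passing 0 (which models dropping 'pextents') yields 8 + 8 + 4 = 20. The gap is the set of bitmap blocks that would go unreserved.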
On 2025/5/20 4:24, Jan Kara wrote:
> On Mon 12-05-25 14:33:16, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> The journal credits calculation in ext4_ext_index_trans_blocks() is
>> currently inadequate. It only multiplies the depth of the extents tree
>> and doesn't account for the blocks that may be required for adding the
>> leaf extents themselves.
>>
>> After enabling large folios, we can easily run out of handle credits,
>> triggering a warning in jbd2_journal_dirty_metadata() on filesystems
>> with a 1KB block size. This occurs because we may need more extents when
>> iterating through each large folio in
>> ext4_do_writepages()->mpage_map_and_submit_extent(). Therefore, we
>> should modify ext4_ext_index_trans_blocks() to include a count of the
>> leaf extents in the worst case as well.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>
> One comment below
>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index c616a16a9f36..e759941bd262 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -2405,9 +2405,10 @@ int ext4_ext_index_trans_blocks(struct inode *inode, int extents)
>> depth = ext_depth(inode);
>>
>> if (extents <= 1)
>> - index = depth * 2;
>> + index = depth * 2 + extents;
>> else
>> - index = depth * 3;
>> + index = depth * 3 +
>> + DIV_ROUND_UP(extents, ext4_ext_space_block(inode, 0));
>>
>> return index;
>> }
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index ffbf444b56d4..3e962a760d71 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5792,18 +5792,16 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
>> int ret;
>>
>> /*
>> - * How many index blocks need to touch to map @lblocks logical blocks
>> - * to @pextents physical extents?
>> + * How many index and lead blocks need to touch to map @lblocks
>> + * logical blocks to @pextents physical extents?
>> */
>> idxblocks = ext4_index_trans_blocks(inode, lblocks, pextents);
>>
>> - ret = idxblocks;
>> -
>> /*
>> * Now let's see how many group bitmaps and group descriptors need
>> * to account
>> */
>> - groups = idxblocks + pextents;
>> + groups = idxblocks;
>
> I don't think you can drop 'pextents' from this computation... Yes, you now
> account possible number of modified extent tree leaf blocks in
> ext4_index_trans_blocks() but additionally, each extent separately may be
> allocated from a different group and thus need to update different bitmap
> and group descriptor block. That is separate from the computation you do in
> ext4_index_trans_blocks() AFAICT...
>

Yes, that's right! Sorry for my mistake. I will fix this.

Thanks,
Yi.

>
>> gdpblocks = groups;
>> if (groups > ngroups)
>> groups = ngroups;
>> @@ -5811,7 +5809,7 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
>> gdpblocks = EXT4_SB(inode->i_sb)->s_gdb_count;
>>
>> /* bitmaps and block group descriptor blocks */
>> - ret += groups + gdpblocks;
>> + ret = idxblocks + groups + gdpblocks;
>>
>> /* Blocks for super block, inode, quota and xattr blocks */
>> ret += EXT4_META_TRANS_BLOCKS(inode->i_sb);
>> --
>> 2.46.1
>>
Hi Ted.
On 2025/5/12 14:33, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> The journal credits calculation in ext4_ext_index_trans_blocks() is
> currently inadequate. It only multiplies the depth of the extents tree
> and doesn't account for the blocks that may be required for adding the
> leaf extents themselves.
>
> After enabling large folios, we can easily run out of handle credits,
> triggering a warning in jbd2_journal_dirty_metadata() on filesystems
> with a 1KB block size. This occurs because we may need more extents when
> iterating through each large folio in
> ext4_do_writepages()->mpage_map_and_submit_extent(). Therefore, we
> should modify ext4_ext_index_trans_blocks() to include a count of the
> leaf extents in the worst case as well.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> ---
> fs/ext4/extents.c | 5 +++--
> fs/ext4/inode.c | 10 ++++------
> 2 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index c616a16a9f36..e759941bd262 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2405,9 +2405,10 @@ int ext4_ext_index_trans_blocks(struct inode *inode, int extents)
> depth = ext_depth(inode);
>
> if (extents <= 1)
> - index = depth * 2;
> + index = depth * 2 + extents;
> else
> - index = depth * 3;
> + index = depth * 3 +
> + DIV_ROUND_UP(extents, ext4_ext_space_block(inode, 0));
>
> return index;
> }
This patch conflicts with Jan's patch e18d4f11d240 ("ext4: fix
calculation of credits for extent tree modification") in
ext4_ext_index_trans_blocks(), the conflict should be resolved when
merging this patch. However, I checked the merged commit of this patch
in your dev branch[1], and the changes in ext4_ext_index_trans_blocks()
seem to be incorrect, which could result in insufficient credit
reservations on 1K block size filesystems.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git/commit/?h=dev&id=d80af138eb8873eb13f5fece1adabb3ca4325134
I think the correct conflict resolution in ext4_ext_index_trans_blocks()
should be:
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 9053fe68ee4c..431d66181721 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2409,9 +2409,10 @@ int ext4_ext_index_trans_blocks(struct inode *inode, int extents)
* the time we actually modify the tree. Assume the worst case.
*/
if (extents <= 1)
- index = EXT4_MAX_EXTENT_DEPTH * 2;
+ index = EXT4_MAX_EXTENT_DEPTH * 2 + extents;
else
- index = EXT4_MAX_EXTENT_DEPTH * 3;
+ index = EXT4_MAX_EXTENT_DEPTH * 3 +
+ DIV_ROUND_UP(extents, ext4_ext_space_block(inode, 0));
return index;
Best Regards,
Yi.
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index ffbf444b56d4..3e962a760d71 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5792,18 +5792,16 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
> int ret;
>
> /*
> - * How many index blocks need to touch to map @lblocks logical blocks
> - * to @pextents physical extents?
> + * How many index and lead blocks need to touch to map @lblocks
> + * logical blocks to @pextents physical extents?
> */
> idxblocks = ext4_index_trans_blocks(inode, lblocks, pextents);
>
> - ret = idxblocks;
> -
> /*
> * Now let's see how many group bitmaps and group descriptors need
> * to account
> */
> - groups = idxblocks + pextents;
> + groups = idxblocks;
> gdpblocks = groups;
> if (groups > ngroups)
> groups = ngroups;
> @@ -5811,7 +5809,7 @@ static int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
> gdpblocks = EXT4_SB(inode->i_sb)->s_gdb_count;
>
> /* bitmaps and block group descriptor blocks */
> - ret += groups + gdpblocks;
> + ret = idxblocks + groups + gdpblocks;
>
> /* Blocks for super block, inode, quota and xattr blocks */
> ret += EXT4_META_TRANS_BLOCKS(inode->i_sb);
On Mon, May 19, 2025 at 10:48:28AM +0800, Zhang Yi wrote:
>
> This patch conflicts with Jan's patch e18d4f11d240 ("ext4: fix
> calculation of credits for extent tree modification") in
> ext4_ext_index_trans_blocks(), the conflict should be resolved when
> merging this patch. However, I checked the merged commit of this patch
> in your dev branch[1], and the changes in ext4_ext_index_trans_blocks()
> seem to be incorrect, which could result in insufficient credit
> reservations on 1K block size filesystems.
Thanks so much for noticing the mis-merge! I've fixed it in my tree,
and will be pushing it out shortly. If you could take a look and make
sure that it's correct, that would be great.
- Ted
On 2025/5/19 23:48, Theodore Ts'o wrote:
> On Mon, May 19, 2025 at 10:48:28AM +0800, Zhang Yi wrote:
>>
>> This patch conflicts with Jan's patch e18d4f11d240 ("ext4: fix
>> calculation of credits for extent tree modification") in
>> ext4_ext_index_trans_blocks(), the conflict should be resolved when
>> merging this patch. However, I checked the merged commit of this patch
>> in your dev branch[1], and the changes in ext4_ext_index_trans_blocks()
>> seem to be incorrect, which could result in insufficient credit
>> reservations on 1K block size filesystems.
>
> Thanks so much for noticing the mis-merge! I've fixed it in my tree,
> and will be pushing it out shortly. If you could take a look and make
> sure that it's correct, that would be great.
>
The merge in ext4_ext_index_trans_blocks() appears to be correct now.
However, Jan is right about the modification in ext4_meta_trans_blocks():
it will also lead to insufficient credit reservations on some corner-case
images. I will send out a fix ASAP.
Best Regards.
Yi.