Before this patch, the call stack in ext4_run_li_request is as follows:
/*
* nr = no. of BGs we want to fetch (=s_mb_prefetch)
* prefetch_ios = no. of BGs not uptodate after
* ext4_read_block_bitmap_nowait()
*/
next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios);
ext4_mb_prefetch_fini(sb, next_group prefetch_ios);
ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in
range [next_group - prefetch_ios, next_group). This is incorrect since
sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to
incorrectly ignore some of the BGs that might need initialization. This
issue is more notable now with the previous patch enabling "fetching" of
BLOCK_UNINIT BGs which are marked buffer_uptodate by default.
Fix this by passing nr to ext4_mb_prefetch_fini() instead of
prefetch_ios so that it considers the right range of groups.
Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in
ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT
groups that would need buddy initialization.
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/mballoc.c | 4 ----
fs/ext4/super.c | 11 ++++-------
2 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 79455c7e645b..6775d73dfc68 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2735,8 +2735,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
if ((prefetch_grp == group) &&
(cr > CR1 ||
prefetch_ios < sbi->s_mb_prefetch_limit)) {
- unsigned int curr_ios = prefetch_ios;
-
nr = sbi->s_mb_prefetch;
if (ext4_has_feature_flex_bg(sb)) {
nr = 1 << sbi->s_log_groups_per_flex;
@@ -2745,8 +2743,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
}
prefetch_grp = ext4_mb_prefetch(sb, group,
nr, &prefetch_ios);
- if (prefetch_ios == curr_ios)
- nr = 0;
}
/* This now checks without needing the buddy page */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 2da5476fa48b..27c1dabacd43 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3692,16 +3692,13 @@ static int ext4_run_li_request(struct ext4_li_request *elr)
ext4_group_t group = elr->lr_next_group;
unsigned int prefetch_ios = 0;
int ret = 0;
+ int nr = EXT4_SB(sb)->s_mb_prefetch;
u64 start_time;
if (elr->lr_mode == EXT4_LI_MODE_PREFETCH_BBITMAP) {
- elr->lr_next_group = ext4_mb_prefetch(sb, group,
- EXT4_SB(sb)->s_mb_prefetch, &prefetch_ios);
- if (prefetch_ios)
- ext4_mb_prefetch_fini(sb, elr->lr_next_group,
- prefetch_ios);
- trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group,
- prefetch_ios);
+ elr->lr_next_group = ext4_mb_prefetch(sb, group, nr, &prefetch_ios);
+ ext4_mb_prefetch_fini(sb, elr->lr_next_group, nr);
+ trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group, nr);
if (group >= elr->lr_next_group) {
ret = 1;
if (elr->lr_first_not_zeroed != ngroups &&
--
2.31.1
Hello,
On 5/30/23 20:33, Ojaswin Mujoo wrote:
> Before this patch, the call stack in ext4_run_li_request is as follows:
>
> /*
> * nr = no. of BGs we want to fetch (=s_mb_prefetch)
> * prefetch_ios = no. of BGs not uptodate after
> * ext4_read_block_bitmap_nowait()
> */
> next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios);
> ext4_mb_prefetch_fini(sb, next_group prefetch_ios);
>
> ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in
> range [next_group - prefetch_ios, next_group). This is incorrect since
> sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to
> incorrectly ignore some of the BGs that might need initialization. This
> issue is more notable now with the previous patch enabling "fetching" of
> BLOCK_UNINIT BGs which are marked buffer_uptodate by default.
>
> Fix this by passing nr to ext4_mb_prefetch_fini() instead of
> prefetch_ios so that it considers the right range of groups.
Thanks for the series.
> Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in
> ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT
> groups that would need buddy initialization.
Seems ext4_mb_prefetch_fini can't be called by ext4_mb_regular_allocator
if nr is 0.
https://elixir.bootlin.com/linux/v6.4-rc5/source/fs/ext4/mballoc.c#L2816
Am I miss something?
Thanks,
Guoqing
> Signed-off-by: Ojaswin Mujoo<ojaswin@linux.ibm.com>
> Reviewed-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
> Reviewed-by: Jan Kara<jack@suse.cz>
> ---
> fs/ext4/mballoc.c | 4 ----
> fs/ext4/super.c | 11 ++++-------
> 2 files changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 79455c7e645b..6775d73dfc68 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2735,8 +2735,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
> if ((prefetch_grp == group) &&
> (cr > CR1 ||
> prefetch_ios < sbi->s_mb_prefetch_limit)) {
> - unsigned int curr_ios = prefetch_ios;
> -
> nr = sbi->s_mb_prefetch;
> if (ext4_has_feature_flex_bg(sb)) {
> nr = 1 << sbi->s_log_groups_per_flex;
> @@ -2745,8 +2743,6 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
> }
> prefetch_grp = ext4_mb_prefetch(sb, group,
> nr, &prefetch_ios);
> - if (prefetch_ios == curr_ios)
> - nr = 0;
> }
>
> /* This now checks without needing the buddy page */
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 2da5476fa48b..27c1dabacd43 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3692,16 +3692,13 @@ static int ext4_run_li_request(struct ext4_li_request *elr)
> ext4_group_t group = elr->lr_next_group;
> unsigned int prefetch_ios = 0;
> int ret = 0;
> + int nr = EXT4_SB(sb)->s_mb_prefetch;
> u64 start_time;
>
> if (elr->lr_mode == EXT4_LI_MODE_PREFETCH_BBITMAP) {
> - elr->lr_next_group = ext4_mb_prefetch(sb, group,
> - EXT4_SB(sb)->s_mb_prefetch, &prefetch_ios);
> - if (prefetch_ios)
> - ext4_mb_prefetch_fini(sb, elr->lr_next_group,
> - prefetch_ios);
> - trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group,
> - prefetch_ios);
> + elr->lr_next_group = ext4_mb_prefetch(sb, group, nr, &prefetch_ios);
> + ext4_mb_prefetch_fini(sb, elr->lr_next_group, nr);
> + trace_ext4_prefetch_bitmaps(sb, group, elr->lr_next_group, nr);
> if (group >= elr->lr_next_group) {
> ret = 1;
> if (elr->lr_first_not_zeroed != ngroups &&
On Tue, Jun 06, 2023 at 10:00:57PM +0800, Guoqing Jiang wrote: > Hello, > > On 5/30/23 20:33, Ojaswin Mujoo wrote: > > Before this patch, the call stack in ext4_run_li_request is as follows: > > > > /* > > * nr = no. of BGs we want to fetch (=s_mb_prefetch) > > * prefetch_ios = no. of BGs not uptodate after > > * ext4_read_block_bitmap_nowait() > > */ > > next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios); > > ext4_mb_prefetch_fini(sb, next_group prefetch_ios); > > > > ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in > > range [next_group - prefetch_ios, next_group). This is incorrect since > > sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to > > incorrectly ignore some of the BGs that might need initialization. This > > issue is more notable now with the previous patch enabling "fetching" of > > BLOCK_UNINIT BGs which are marked buffer_uptodate by default. > > > > Fix this by passing nr to ext4_mb_prefetch_fini() instead of > > prefetch_ios so that it considers the right range of groups. > > Thanks for the series. > > > Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in > > ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT > > groups that would need buddy initialization. > > Seems ext4_mb_prefetch_fini can't be called by ext4_mb_regular_allocator > if nr is 0. Hi Guoqing, Sorry I was on vacation so didn't get a chance to reply to this sooner. Let me explain what I meant by that statement in the commit message. So basically, the prefetch_ios output argument is incremented whenever ext4_mb_prefetch() reads a block group with !buffer_uptodate(bh). However, for BLOCK_UNINIT BGs the buffer is marked uptodate after initialization and hence prefetch_ios is not incremented when such BGs are prefetched. This leads to nr becoming 0 due to the following line (removed in this patch): if (prefetch_ios == curr_ios) nr = 0; hence ext4_mb_prefetch_fini() would never pre initialise the corresponding buddy structures. Instead, these structures would then get initialized probably at a later point during the slower allocation criterias. The motivation of making sure the BLOCK_UNINIT BGs' buddies are pre initialized is so the faster allocation criterias can utilize the data to make better decisions. Regards, ojaswin > > https://elixir.bootlin.com/linux/v6.4-rc5/source/fs/ext4/mballoc.c#L2816 > > Am I miss something? > > Thanks, > Guoqing >
Hi Ojaswin, On 6/27/23 14:51, Ojaswin Mujoo wrote: > On Tue, Jun 06, 2023 at 10:00:57PM +0800, Guoqing Jiang wrote: >> Hello, >> >> On 5/30/23 20:33, Ojaswin Mujoo wrote: >>> Before this patch, the call stack in ext4_run_li_request is as follows: >>> >>> /* >>> * nr = no. of BGs we want to fetch (=s_mb_prefetch) >>> * prefetch_ios = no. of BGs not uptodate after >>> * ext4_read_block_bitmap_nowait() >>> */ >>> next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios); >>> ext4_mb_prefetch_fini(sb, next_group prefetch_ios); >>> >>> ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in >>> range [next_group - prefetch_ios, next_group). This is incorrect since >>> sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to >>> incorrectly ignore some of the BGs that might need initialization. This >>> issue is more notable now with the previous patch enabling "fetching" of >>> BLOCK_UNINIT BGs which are marked buffer_uptodate by default. >>> >>> Fix this by passing nr to ext4_mb_prefetch_fini() instead of >>> prefetch_ios so that it considers the right range of groups. >> Thanks for the series. >> >>> Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in >>> ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT >>> groups that would need buddy initialization. >> Seems ext4_mb_prefetch_fini can't be called by ext4_mb_regular_allocator >> if nr is 0. > Hi Guoqing, > > Sorry I was on vacation so didn't get a chance to reply to this sooner. > Let me explain what I meant by that statement in the commit message. > > So basically, the prefetch_ios output argument is incremented whenever > ext4_mb_prefetch() reads a block group with !buffer_uptodate(bh). > However, for BLOCK_UNINIT BGs the buffer is marked uptodate after > initialization and hence prefetch_ios is not incremented when such BGs > are prefetched. > > This leads to nr becoming 0 due to the following line (removed in this patch): > > if (prefetch_ios == curr_ios) > nr = 0; > > hence ext4_mb_prefetch_fini() would never pre initialise the corresponding > buddy structures. Instead, these structures would then get initialized > probably at a later point during the slower allocation criterias. The > motivation of making sure the BLOCK_UNINIT BGs' buddies are pre > initialized is so the faster allocation criterias can utilize the data > to make better decisions. Got it, appreciate for the detailed explanation! Thanks, Guoqing
© 2016 - 2026 Red Hat, Inc.