f2fs_read_data_large_folio() can build a single read BIO across multiple
folios during readahead. If a folio ends up having none of its subpages
added to the BIO (e.g. all subpages are zeroed / treated as holes), it
will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
never called. This leaves the folio locked and not marked uptodate.
Track whether the current folio has been added to a BIO via a local
'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
mark the folio uptodate (on success) and unlock it when nothing was added.
Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
---
fs/f2fs/data.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 66ab7a43a56f..ac569a396914 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
unsigned nrpages;
struct f2fs_folio_state *ffs;
int ret = 0;
+ bool folio_in_bio = false;
if (!IS_IMMUTABLE(inode))
return -EOPNOTSUPP;
@@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
if (!folio)
goto out;
+ folio_in_bio = false
index = folio->index;
offset = 0;
ffs = NULL;
@@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
offset << PAGE_SHIFT))
goto submit_and_realloc;
+ folio_in_bio = true;
inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
F2FS_BLKSIZE);
@@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
}
trace_f2fs_read_folio(folio, DATA);
if (rac) {
+ if (!folio_in_bio) {
+ if (!ret)
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
+ }
folio = readahead_folio(rac);
goto next_folio;
}
--
2.34.1
On 1/5/2026 11:31 PM, Nanzhe Zhao wrote:
> f2fs_read_data_large_folio() can build a single read BIO across multiple
> folios during readahead. If a folio ends up having none of its subpages
> added to the BIO (e.g. all subpages are zeroed / treated as holes), it
> will never be seen by f2fs_finish_read_bio(), so folio_end_read() is
> never called. This leaves the folio locked and not marked uptodate.
>
> Track whether the current folio has been added to a BIO via a local
> 'folio_in_bio' bool flag, and when iterating readahead folios, explicitly
> mark the folio uptodate (on success) and unlock it when nothing was added.
>
> Signed-off-by: Nanzhe Zhao <nzzhao@126.com>
> ---
> fs/f2fs/data.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 66ab7a43a56f..ac569a396914 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> unsigned nrpages;
> struct f2fs_folio_state *ffs;
> int ret = 0;
> + bool folio_in_bio = false;
No need to initialize folio_in_bio?
>
> if (!IS_IMMUTABLE(inode))
> return -EOPNOTSUPP;
> @@ -2445,6 +2446,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> if (!folio)
> goto out;
>
> + folio_in_bio = false
folio_in_bio = false;
> index = folio->index;
> offset = 0;
> ffs = NULL;
> @@ -2530,6 +2532,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> offset << PAGE_SHIFT))
> goto submit_and_realloc;
>
> + folio_in_bio = true;
> inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> f2fs_update_iostat(F2FS_I_SB(inode), NULL, FS_DATA_READ_IO,
> F2FS_BLKSIZE);
> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
> }
> trace_f2fs_read_folio(folio, DATA);
> if (rac) {
> + if (!folio_in_bio) {
> + if (!ret)
> + folio_mark_uptodate(folio);
> + folio_unlock(folio);
> + }
err_out:
/* Nothing was submitted. */
if (!bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
^^^^^^^^^^^^
If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Thanks,
return ret;
}
> folio = readahead_folio(rac);
> goto next_folio;
> }
> --
> 2.34.1
>
Hi Chao yu:
At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 66ab7a43a56f..ac569a396914 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> unsigned nrpages;
>>> struct f2fs_folio_state *ffs;
>>> int ret = 0;
>>> + bool folio_in_bio = false;
>>
>>No need to initialize folio_in_bio?
Agreed. It's redundant since we reset it to false for each new folio before processing.
>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>> }
>>> trace_f2fs_read_folio(folio, DATA);
>>> if (rac) {
>>> + if (!folio_in_bio) {
>>> + if (!ret)
>>> + folio_mark_uptodate(folio);
>>> + folio_unlock(folio);
>>> + }
>>
>>err_out:
>> /* Nothing was submitted. */
>> if (!bio) {
>> if (!ret)
>> folio_mark_uptodate(folio);
>> folio_unlock(folio);
>>
>> ^^^^^^^^^^^^
>>
>>If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
unlocked again at err_out:? If so, I think that won't happen.
In such a case, every non-NULL folio will be unlocked exactly once by:
if (!folio_in_bio) {
if (!ret)
folio_mark_uptodate(folio);
folio_unlock(folio);
}
Specifically, after the last folio runs through the block above, the next call:
folio = readahead_folio(rac);
will return NULL. Then we go to next_folio:, and will directly hit:
if (!folio)
goto out;
This jumps straight to the out: label, skipping err_out: entirely.
Therefore, when ret is not an error code, the err_out: label will never be reached.
If ret becomes an error code, then the current folio will immediately goto err_out;
and be unlocked there once.
If rac is NULL (meaning we only read the single large folio passed in as the function argument),
we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
Or am I missing some edge case here?
Thanks,
Nanzhe
On 1/7/2026 8:33 AM, Nanzhe Zhao wrote:
> Hi Chao yu:
> At 2026-01-06 17:31:20, "Chao Yu" <chao@kernel.org> wrote:
>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>> index 66ab7a43a56f..ac569a396914 100644
>>>> --- a/fs/f2fs/data.c
>>>> +++ b/fs/f2fs/data.c
>>>> @@ -2430,6 +2430,7 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> unsigned nrpages;
>>>> struct f2fs_folio_state *ffs;
>>>> int ret = 0;
>>>> + bool folio_in_bio = false;
>>>
>>> No need to initialize folio_in_bio?
>
> Agreed. It's redundant since we reset it to false for each new folio before processing.
>
>>>> @@ -2539,6 +2542,11 @@ static int f2fs_read_data_large_folio(struct inode *inode,
>>>> }
>>>> trace_f2fs_read_folio(folio, DATA);
>>>> if (rac) {
>>>> + if (!folio_in_bio) {
>>>> + if (!ret)
>>>> + folio_mark_uptodate(folio);
>>>> + folio_unlock(folio);
>>>> + }
>>>
>>> err_out:
>>> /* Nothing was submitted. */
>>> if (!bio) {
>>> if (!ret)
>>> folio_mark_uptodate(folio);
>>> folio_unlock(folio);
>>>
>>> ^^^^^^^^^^^^
>>>
>>> If all folios in rac have not been mapped (hole case), will we unlock the folio twice?
>
> Are you worried the folio could be unlocked once in the if (rac) { ... } block and then
> unlocked again at err_out:? If so, I think that won't happen.
>
> In such a case, every non-NULL folio will be unlocked exactly once by:
>
> if (!folio_in_bio) {
> if (!ret)
> folio_mark_uptodate(folio);
> folio_unlock(folio);
> }
> Specifically, after the last folio runs through the block above, the next call:
>
> folio = readahead_folio(rac);
> will return NULL. Then we go to next_folio:, and will directly hit:
>
> if (!folio)
> goto out;
> This jumps straight to the out: label, skipping err_out: entirely.
> Therefore, when ret is not an error code, the err_out: label will never be reached.
>
> If ret becomes an error code, then the current folio will immediately goto err_out;
> and be unlocked there once.
>
> If rac is NULL (meaning we only read the single large folio passed in as the function argument),
> we won't enter the if (rac) { ... goto next_folio; } path at all, so we also won't go to next_folio
> and then potentially goto out;. In that case, it will naturally be unlocked once at err_out:.
> Or am I missing some edge case here?
Nanzhe,
Oh, yes, I think so, thanks for the explanation.
Thanks,
>
> Thanks,
> Nanzhe
© 2016 - 2026 Red Hat, Inc.