folio_trylock() in erofs_try_to_free_all_cached_folios() may
successfully acquire the folio lock, but the subsequent check
for erofs_folio_is_managed() can skip unlocking when the folio
is not managed by EROFS.
This leads to a lock imbalance and leaves the folio permanently
locked, which may cause reclaim stalls or interfere with memory
management.
Fix this by ensuring folio_unlock() is called before continuing.
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
fs/erofs/zdata.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index fe8121df9ef2..9d7ff22f1622 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
 		if (!folio_trylock(folio))
 			return -EBUSY;
 
-		if (!erofs_folio_is_managed(sbi, folio))
+		if (!erofs_folio_is_managed(sbi, folio)) {
+			folio_unlock(folio);
 			continue;
+		}
 		pcl->compressed_bvecs[i].page = NULL;
 		folio_detach_private(folio);
 		folio_unlock(folio);
--
2.43.0
Hi Zhan,
On 2026/3/31 10:33, Zhan Xusheng wrote:
> folio_trylock() in erofs_try_to_free_all_cached_folios() may
> successfully acquire the folio lock, but the subsequent check
> for erofs_folio_is_managed() can skip unlocking when the folio
> is not managed by EROFS.
Did you actually hit a real race that triggers this?
I don't think it can really happen, because:
>
> This leads to a lock imbalance and leaves the folio permanently
> locked, which may cause reclaim stalls or interfere with memory
> management.
>
> Fix this by ensuring folio_unlock() is called before continuing.
If a folio is linked to a pcluster, folio->private will be non-NULL,
and pcl->compressed_bvecs[i] points to that folio.
And z_erofs_cache_release_folio() will be called with the folio lock
held, and pcl->compressed_bvecs[i] will be set to NULL there.
So I don't think erofs_try_to_free_all_cached_folios() can find
!erofs_folio_is_managed(sbi, folio) in the real world.
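The argument above can be sketched as a small userspace model. This is only an illustration under stated assumptions: every model_* name is hypothetical, a back-pointer in ->private stands in for "managed", and the real erofs_folio_is_managed() check and the real z_erofs_cache_release_folio() logic are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_SLOTS 4	/* hypothetical slot count, for illustration only */

struct model_pcl;

/* Hypothetical stand-in for a cached folio; not the kernel's struct folio. */
struct model_folio {
	bool locked;
	struct model_pcl *private;	/* non-NULL while linked to a pcluster */
};

/* Hypothetical stand-in for a pcluster and its compressed_bvecs[] slots. */
struct model_pcl {
	struct model_folio *slots[MODEL_NR_SLOTS];
};

/* Link a folio to a pcluster: the slot and ->private are set together. */
static void model_attach(struct model_pcl *pcl, size_t i,
			 struct model_folio *f)
{
	pcl->slots[i] = f;
	f->private = pcl;
}

/*
 * Models the release path: the caller must hold the folio lock, and the
 * slot is cleared together with ->private.
 */
static void model_release_folio(struct model_folio *f)
{
	struct model_pcl *pcl = f->private;
	size_t i;

	assert(f->locked);
	if (!pcl)
		return;
	for (i = 0; i < MODEL_NR_SLOTS; i++)
		if (pcl->slots[i] == f)
			pcl->slots[i] = NULL;
	f->private = NULL;
}

/* Every folio still reachable from a slot is linked back to this pcluster. */
static bool model_invariant_holds(const struct model_pcl *pcl)
{
	size_t i;

	for (i = 0; i < MODEL_NR_SLOTS; i++)
		if (pcl->slots[i] && pcl->slots[i]->private != pcl)
			return false;
	return true;
}
```

In this model, a scanner that takes the folio lock before inspecting a slot can never observe a slot whose folio has already been released, because both fields change under the same lock.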
>
> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
> ---
> fs/erofs/zdata.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index fe8121df9ef2..9d7ff22f1622 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
> if (!folio_trylock(folio))
> return -EBUSY;
>
> - if (!erofs_folio_is_managed(sbi, folio))
> + if (!erofs_folio_is_managed(sbi, folio)) {
> + folio_unlock(folio);
> continue;
> + }
> pcl->compressed_bvecs[i].page = NULL;
> folio_detach_private(folio);
But I admit that we should rewrite this part of the function as:
	if (!erofs_folio_is_managed(sbi, folio)) {
		DBG_BUGON(1);
	} else {
		pcl->compressed_bvecs[i].page = NULL;
		folio_detach_private(folio);
	}
	folio_unlock(folio);
Thanks,
Gao Xiang
> folio_unlock(folio);
On 2026/3/31 10:50, Gao Xiang wrote:
> Hi Zhan,
>
> On 2026/3/31 10:33, Zhan Xusheng wrote:
>> folio_trylock() in erofs_try_to_free_all_cached_folios() may
>> successfully acquire the folio lock, but the subsequent check
>> for erofs_folio_is_managed() can skip unlocking when the folio
>> is not managed by EROFS.
>
> Did you actually hit a real race that triggers this?
>
> I don't think it can really happen, because:
>
>>
>> This leads to a lock imbalance and leaves the folio permanently
>> locked, which may cause reclaim stalls or interfere with memory
>> management.
>>
>> Fix this by ensuring folio_unlock() is called before continuing.
>
> If a folio is linked to a pcluster, folio->private will be non-NULL,
> and pcl->compressed_bvecs[i] points to that folio.
>
> And z_erofs_cache_release_folio() will be called with the folio lock
> held, and pcl->compressed_bvecs[i] will be set to NULL there.
>
> So I don't think erofs_try_to_free_all_cached_folios() can find
> !erofs_folio_is_managed(sbi, folio) in the real world.
>
>>
>> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
>> ---
>> fs/erofs/zdata.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
>> index fe8121df9ef2..9d7ff22f1622 100644
>> --- a/fs/erofs/zdata.c
>> +++ b/fs/erofs/zdata.c
>> @@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
>> if (!folio_trylock(folio))
>> return -EBUSY;
>> - if (!erofs_folio_is_managed(sbi, folio))
>> + if (!erofs_folio_is_managed(sbi, folio)) {
>> + folio_unlock(folio);
>> continue;
>> + }
>> pcl->compressed_bvecs[i].page = NULL;
>> folio_detach_private(folio);
>
> But I admit that we should rewrite this part of the function as:
>
>	if (!erofs_folio_is_managed(sbi, folio)) {
>		DBG_BUGON(1);
>	} else {
>		pcl->compressed_bvecs[i].page = NULL;
>		folio_detach_private(folio);
>	}
Or maybe just:
	DBG_BUGON(!erofs_folio_is_managed(sbi, folio));
	pcl->compressed_bvecs[i].page = NULL;
	folio_detach_private(folio);
	folio_unlock(folio);
Since if a pcluster gets here (!pcl->lockref.count),
`pcl->compressed_bvecs[i]` should contain only valid cached
folios (or some may already have been recycled by
.release_folio instead).
Unless there is another bug somewhere. In any case, I don't
think the symptom you are seeing is related to EROFS.
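For reference, the difference between the loop shape in the original code and a single-unlock-path shape can be shown with a userspace model. This is a hedged sketch, not kernel code: the model_* names are hypothetical, the `managed` flag simply stands in for the result of erofs_folio_is_managed(), and a counter replaces DBG_BUGON().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct model_folio {
	bool locked;
	bool managed;	/* models the erofs_folio_is_managed() result */
	void *private;
};

static bool model_trylock(struct model_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void model_unlock(struct model_folio *f)
{
	assert(f->locked);
	f->locked = false;
}

/* Shape of the loop before the patch: "continue" skips the unlock. */
static int model_free_all_leaky(struct model_folio **slots, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_folio *f = slots[i];

		if (!f)
			continue;
		if (!model_trylock(f))
			return -EBUSY;
		if (!f->managed)
			continue;	/* lock is still held here */
		slots[i] = NULL;
		f->private = NULL;
		model_unlock(f);
	}
	return 0;
}

/* Single unlock path: every successful trylock is paired with an unlock. */
static int model_free_all_balanced(struct model_folio **slots, size_t n,
				   size_t *unexpected)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_folio *f = slots[i];

		if (!f)
			continue;
		if (!model_trylock(f))
			return -EBUSY;
		if (!f->managed) {
			(*unexpected)++;	/* where a debug assertion would fire */
		} else {
			slots[i] = NULL;
			f->private = NULL;
		}
		model_unlock(f);
	}
	return 0;
}
```

In the model, the leaky variant only misbehaves if an unmanaged folio is ever found in a slot, which is exactly the case argued above to be unreachable. The balanced variant stays correct either way and makes that unexpected case visible.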
Thanks,
Gao Xiang
> folio_unlock(folio);
>
> Thanks,
> Gao Xiang
>
>
>> folio_unlock(folio);
>