[v1] erofs: fix missing folio_unlock causing lock imbalance

[PATCH] erofs: fix missing folio_unlock causing lock imbalance

Posted by Zhan Xusheng 1 day, 21 hours ago

folio_trylock() in erofs_try_to_free_all_cached_folios() may
successfully acquire the folio lock, but the subsequent check
for erofs_folio_is_managed() can skip unlocking when the folio
is not managed by EROFS.

This leads to a lock imbalance and leaves the folio permanently
locked, which may cause reclaim stalls or interfere with memory
management.

Fix this by ensuring folio_unlock() is called before continuing.

Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
 fs/erofs/zdata.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index fe8121df9ef2..9d7ff22f1622 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
 			if (!folio_trylock(folio))
 				return -EBUSY;
 
-			if (!erofs_folio_is_managed(sbi, folio))
+			if (!erofs_folio_is_managed(sbi, folio)) {
+				folio_unlock(folio);
 				continue;
+			}
 			pcl->compressed_bvecs[i].page = NULL;
 			folio_detach_private(folio);
 			folio_unlock(folio);
-- 
2.43.0
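
The control flow the patch changes can be modeled in userspace. The sketch below is only an illustration of the lock accounting and is not kernel code: `struct mock_folio`, `mock_trylock()`, `mock_unlock()`, `free_all_before()`, `free_all_after()` and `count_locked()` are hypothetical names, and the `managed` flag stands in for what erofs_folio_is_managed() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only the two bits the loop cares about. */
struct mock_folio {
	bool locked;   /* models the folio lock */
	bool managed;  /* models erofs_folio_is_managed() returning true */
};

static bool mock_trylock(struct mock_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void mock_unlock(struct mock_folio *f)
{
	f->locked = false;
}

/* Control flow before the patch: 'continue' skips the unlock. */
int free_all_before(struct mock_folio *v, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!mock_trylock(&v[i]))
			return -1;            /* stands in for -EBUSY */
		if (!v[i].managed)
			continue;             /* lock is still held here */
		v[i].managed = false;         /* stands in for detaching the folio */
		mock_unlock(&v[i]);
	}
	return 0;
}

/* Control flow after the patch: unlock before 'continue'. */
int free_all_after(struct mock_folio *v, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!mock_trylock(&v[i]))
			return -1;
		if (!v[i].managed) {
			mock_unlock(&v[i]);
			continue;
		}
		v[i].managed = false;
		mock_unlock(&v[i]);
	}
	return 0;
}

/* Number of entries still locked after a pass. */
size_t count_locked(const struct mock_folio *v, size_t n)
{
	size_t locked = 0;

	for (size_t i = 0; i < n; i++)
		locked += v[i].locked;
	return locked;
}
```

In this model, an unmanaged entry is the only way for `free_all_before()` to leave a lock held, so whether the imbalance is reachable depends entirely on whether such an entry can ever appear in the array.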

Re: [PATCH] erofs: fix missing folio_unlock causing lock imbalance

Posted by Gao Xiang 1 day, 20 hours ago

Hi Zhan,

On 2026/3/31 10:33, Zhan Xusheng wrote:
> folio_trylock() in erofs_try_to_free_all_cached_folios() may
> successfully acquire the folio lock, but the subsequent check
> for erofs_folio_is_managed() can skip unlocking when the folio
> is not managed by EROFS.

Have you found a real timing window where this happens?

I don't think it can really happen, because:

> 
> This leads to a lock imbalance and leaves the folio permanently
> locked, which may cause reclaim stalls or interfere with memory
> management.
> 
> Fix this by ensuring folio_unlock() is called before continuing.

If a folio is linked to a pcluster, folio->private will be non-NULL,
and pcl->compressed_bvecs[i] points to that folio.

z_erofs_cache_release_folio() is called with the folio lock held,
and it sets pcl->compressed_bvecs[i] to NULL.

So I don't think erofs_try_to_free_all_cached_folios() can find
!erofs_folio_is_managed(sbi, folio) in the real world.

> 
> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
> ---
>   fs/erofs/zdata.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index fe8121df9ef2..9d7ff22f1622 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
>   			if (!folio_trylock(folio))
>   				return -EBUSY;
>   
> -			if (!erofs_folio_is_managed(sbi, folio))
> +			if (!erofs_folio_is_managed(sbi, folio)) {
> +				folio_unlock(folio);
>   				continue;
> +			}
>   			pcl->compressed_bvecs[i].page = NULL;
>   			folio_detach_private(folio);

But I admit that we should rewrite this part of the function as:

			if (!erofs_folio_is_managed(sbi, folio)) {
				DBG_BUGON(1);
			} else {
				pcl->compressed_bvecs[i].page = NULL;
				folio_detach_private(folio);
			}
			folio_unlock(folio);

Thanks,
Gao Xiang


>   			folio_unlock(folio);

Re: [PATCH] erofs: fix missing folio_unlock causing lock imbalance

Posted by Gao Xiang 1 day, 20 hours ago


On 2026/3/31 10:50, Gao Xiang wrote:
> Hi Zhan,
> 
> On 2026/3/31 10:33, Zhan Xusheng wrote:
>> folio_trylock() in erofs_try_to_free_all_cached_folios() may
>> successfully acquire the folio lock, but the subsequent check
>> for erofs_folio_is_managed() can skip unlocking when the folio
>> is not managed by EROFS.
> 
> Do you find some real timing?
> 
> I don't think it can really happen, because:
> 
>>
>> This leads to a lock imbalance and leaves the folio permanently
>> locked, which may cause reclaim stalls or interfere with memory
>> management.
>>
>> Fix this by ensuring folio_unlock() is called before continuing.
> 
> If a folio links to a pcluster, folio->private will be non-NULL,
> and pcl->compressed_bvecs[i] points to that folios.
> 
> And z_erofs_cache_release_folio() will be called with folio lock,
> and pcl->compressed_bvecs[i] will be set NULL here.
> 
> So I don't think erofs_try_to_free_all_cached_folios() can find
> !erofs_folio_is_managed(sbi, folio) in the real world.
> 
>>
>> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
>> ---
>>   fs/erofs/zdata.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
>> index fe8121df9ef2..9d7ff22f1622 100644
>> --- a/fs/erofs/zdata.c
>> +++ b/fs/erofs/zdata.c
>> @@ -605,8 +605,10 @@ static int erofs_try_to_free_all_cached_folios(struct erofs_sb_info *sbi,
>>               if (!folio_trylock(folio))
>>                   return -EBUSY;
>> -            if (!erofs_folio_is_managed(sbi, folio))
>> +            if (!erofs_folio_is_managed(sbi, folio)) {
>> +                folio_unlock(folio);
>>                   continue;
>> +            }
>>               pcl->compressed_bvecs[i].page = NULL;
>>               folio_detach_private(folio);
> 
> But I admit that we should rewrite in function as:
> 
>              if (!erofs_folio_is_managed(sbi, folio)) {
>                  DBG_BUGON(1);
>              } else {
>                  pcl->compressed_bvecs[i].page = NULL;
>                  folio_detach_private(folio);
>              }

Or maybe just:
		DBG_BUGON(!erofs_folio_is_managed(sbi, folio));
		pcl->compressed_bvecs[i].page = NULL;
		folio_detach_private(folio);
		folio_unlock(folio);

Since if a pcluster gets here (!pcl->lockref.count),
`pcl->compressed_bvecs[i]` should contain only valid cached
folios (the others should already have been recycled by
.release_folio instead).

Unless there is another bug somewhere; in any case, I don't
think the issue you are seeing is related to EROFS.

Thanks,
Gao Xiang

>              folio_unlock(folio);
> 
> Thanks,
> Gao Xiang
> 
> 
>>               folio_unlock(folio);
>
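
The assertion-based variant suggested in the review can be modeled the same way. This is a rough userspace sketch, not kernel code: `MOCK_DBG_BUGON`, `MOCK_DEBUG`, `struct mock_folio` and `free_all_asserting()` are hypothetical names. The macro is written on the assumption that DBG_BUGON() traps only in debug builds and otherwise just evaluates its argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of DBG_BUGON(): trap only when MOCK_DEBUG is defined. */
#ifdef MOCK_DEBUG
#define MOCK_DBG_BUGON(x) assert(!(x))
#else
#define MOCK_DBG_BUGON(x) ((void)(x))
#endif

/* Hypothetical stand-in for a folio. */
struct mock_folio {
	bool locked;   /* models the folio lock */
	bool managed;  /* models erofs_folio_is_managed() returning true */
};

/*
 * Every entry that is successfully locked is also unlocked, because
 * there is a single exit path through the loop body. An unmanaged
 * entry is treated as an invariant violation rather than skipped.
 */
int free_all_asserting(struct mock_folio *v, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (v[i].locked)
			return -1;            /* stands in for -EBUSY */
		v[i].locked = true;           /* trylock succeeded */

		MOCK_DBG_BUGON(!v[i].managed);
		v[i].managed = false;         /* stands in for detaching the folio */
		v[i].locked = false;          /* the only unlock site */
	}
	return 0;
}
```

The design difference from the submitted patch is that the unexpected case is flagged in debug builds instead of being silently tolerated, while the lock stays balanced in either build.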