[PATCH RFC v2 2/7] mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED

Lisa Wang posted 7 patches 2 weeks, 3 days ago
Posted by Lisa Wang 2 weeks, 3 days ago
The .error_remove_folio a_ops is used by different filesystems to handle
folio truncation upon discovery of a memory failure in the memory
associated with the given folio.

Currently, MF_DELAYED is treated as an error, causing "Failed to punch
page" to be written to the console. MF_DELAYED is then relayed to the
caller of truncate_error_folio() as MF_FAILED. This further causes
memory_failure() to return -EBUSY, which then always causes a SIGBUS.

This also implies that regardless of whether the thread's memory
corruption kill policy is PR_MCE_KILL_EARLY or PR_MCE_KILL_LATE, a
memory failure with MF_DELAYED will always cause a SIGBUS.

Update truncate_error_folio() to return MF_DELAYED to the caller if the
.error_remove_folio() callback reports MF_DELAYED.

Signed-off-by: Lisa Wang <wyihan@google.com>
---
 mm/memory-failure.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 4f143334d5a1..57f7762e7418 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -941,6 +941,8 @@ static int truncate_error_folio(struct folio *folio, unsigned long pfn,
 	if (mapping->a_ops->error_remove_folio) {
 		int err = mapping->a_ops->error_remove_folio(mapping, folio);
 
+		if (err == MF_DELAYED)
+			return err;
 		if (err != 0)
 			pr_info("%#lx: Failed to punch page: %d\n", pfn, err);
 		else if (!filemap_release_folio(folio, GFP_NOIO))

-- 
2.53.0.959.g497ff81fa9-goog
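
The effect of the two added lines can be illustrated with a small userspace model of the error_remove_folio branch. This is only a sketch, not kernel code: model_truncate_error_folio(), its release_ok flag (standing in for filemap_release_folio() succeeding) and its patched flag are hypothetical names introduced here, and the enum is assumed to mirror the order of the kernel's enum mf_result.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed to mirror the order of the kernel's enum mf_result. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/*
 * Hypothetical userspace model of the error_remove_folio branch only.
 * err:        value returned by the callback (0, an errno, or MF_DELAYED)
 * release_ok: stands in for filemap_release_folio() succeeding
 * patched:    true to apply the early return added by this patch
 */
static int model_truncate_error_folio(int err, bool release_ok, bool patched)
{
	int ret = MF_FAILED;

	/* New behaviour: pass MF_DELAYED straight back to the caller. */
	if (patched && err == MF_DELAYED)
		return err;

	if (err != 0)
		printf("Failed to punch page: %d\n", err);
	else if (!release_ok)
		printf("failed to release buffers\n");
	else
		ret = MF_RECOVERED;

	return ret;
}
```

Without the early return, a callback result of MF_DELAYED falls into the err != 0 branch, is logged as a failure, and reaches the caller as MF_FAILED; with it, the caller sees MF_DELAYED and nothing is logged at this level.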
Re: [PATCH RFC v2 2/7] mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
Posted by Miaohe Lin 1 week ago
On 2026/3/20 7:30, Lisa Wang wrote:
> The .error_remove_folio a_ops is used by different filesystems to handle
> folio truncation upon discovery of a memory failure in the memory
> associated with the given folio.
> 
> Currently, MF_DELAYED is treated as an error, causing "Failed to punch
> page" to be written to the console. MF_DELAYED is then relayed to the
> caller of truncate_error_folio() as MF_FAILED. This further causes
> memory_failure() to return -EBUSY, which then always causes a SIGBUS.
> 
> This also implies that regardless of whether the thread's memory
> corruption kill policy is PR_MCE_KILL_EARLY or PR_MCE_KILL_LATE, a
> memory failure with MF_DELAYED will always cause a SIGBUS.
> 
> Update truncate_error_folio() to return MF_DELAYED to the caller if the
> .error_remove_folio() callback reports MF_DELAYED.
> 
> Signed-off-by: Lisa Wang <wyihan@google.com>
> ---
>  mm/memory-failure.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 4f143334d5a1..57f7762e7418 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -941,6 +941,8 @@ static int truncate_error_folio(struct folio *folio, unsigned long pfn,
>  	if (mapping->a_ops->error_remove_folio) {
>  		int err = mapping->a_ops->error_remove_folio(mapping, folio);
>  
> +		if (err == MF_DELAYED)
> +			return err;

Would it be better to add a pr_info() here to give users some information?

Thanks.
Re: [PATCH RFC v2 2/7] mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
Posted by Lisa Wang 2 days, 10 hours ago
On Mon, Mar 30, 2026 at 03:02:01PM +0800, Miaohe Lin wrote:
[...snip...]
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -941,6 +941,8 @@ static int truncate_error_folio(struct folio *folio, unsigned long pfn,
> >  	if (mapping->a_ops->error_remove_folio) {
> >  		int err = mapping->a_ops->error_remove_folio(mapping, folio);
> >  
> > +		if (err == MF_DELAYED)
> > +			return err;
> 
> Would it be better to add a pr_info() here to give users some information?
> 
> Thanks.
> .
I don't think we need a pr_info() here: the return value of
truncate_error_folio() always ends up in action_result(), which already
logs the recovery status.

Lisa
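
For readers following the thread, the logging point made above can be sketched in userspace. This is a rough model under stated assumptions, not the kernel function: model_action_result() is a hypothetical stand-in for action_result(), and the result names, the message format and the mapping of MF_RECOVERED and MF_DELAYED to 0 (anything else to -EBUSY) reflect my reading of mm/memory-failure.c and should be checked against the tree the series applies to.

```c
#include <errno.h>
#include <stdio.h>

/* Assumed to mirror the order of the kernel's enum mf_result. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

static const char *const action_name[] = {
	[MF_IGNORED]   = "Ignored",
	[MF_FAILED]    = "Failed",
	[MF_DELAYED]   = "Delayed",
	[MF_RECOVERED] = "Recovered",
};

/*
 * Hypothetical stand-in for action_result(): log the outcome once,
 * then map it to the value memory_failure() would return.
 */
static int model_action_result(unsigned long pfn, const char *page_type,
			       enum mf_result result)
{
	printf("%#lx: recovery action for %s: %s\n",
	       pfn, page_type, action_name[result]);

	return (result == MF_RECOVERED || result == MF_DELAYED) ? 0 : -EBUSY;
}
```

Under these assumptions, once truncate_error_folio() returns MF_DELAYED the outcome is logged as "Delayed" and the model returns 0 rather than -EBUSY, which is why an extra pr_info() in truncate_error_folio() would be redundant.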