[PATCH RFC v2 0/7] mm: Fix MF_DELAYED handling on memory failure

Lisa Wang posted 7 patches 2 weeks, 1 day ago
[PATCH RFC v2 0/7] mm: Fix MF_DELAYED handling on memory failure
Posted by Lisa Wang 2 weeks, 1 day ago
Here's a second revision to fix MF_DELAYED handling on memory failure.

This patch series addresses an issue in the memory failure handling path
where MF_DELAYED is incorrectly treated as an error. This issue was
discovered while testing memory failure handling for guest_memfd.

The proposed solution involves:
1. Clarifying the definition of MF_DELAYED to mean that memory failure
   handling is only partially completed, and that the metadata for the
   memory that failed (as in struct page/folio) is still referenced.
2. Updating shmem’s handling to align with the clarified definition.
3. Updating how the result of .error_remove_folio() is interpreted.
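
As a rough illustration of point 3, the sketch below shows one way a
handler's return value could be mapped to an outcome so that MF_DELAYED is
not reported as an error. It is a userspace mock under stated assumptions,
not the code in this series: the enum only mirrors the kernel's mf_result
names, and interpret_remove_result() and mf_result_is_error() are
hypothetical helpers invented for this example.

```c
#include <stdbool.h>

/* Mirrors the names of the kernel's enum mf_result; a userspace mock only. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/*
 * Hypothetical helper: interpret what a .error_remove_folio()-style
 * handler returned.
 * Assumptions made for this sketch:
 *   - 0 means the folio was removed and handling is complete,
 *   - MF_DELAYED means handling is only partially complete and the
 *     folio metadata is still referenced,
 *   - any other value (for example a negative errno) is a failure.
 */
enum mf_result interpret_remove_result(int ret)
{
	if (ret == MF_DELAYED)
		return MF_DELAYED;
	if (ret == 0)
		return MF_RECOVERED;
	return MF_FAILED;
}

/*
 * Hypothetical helper: decide whether the overall result should be
 * surfaced as an error. Delayed handling is not treated as an error.
 */
bool mf_result_is_error(enum mf_result res)
{
	return !(res == MF_RECOVERED || res == MF_DELAYED);
}
```

With these helpers, a handler that returns MF_DELAYED yields a non-error
result, while a negative return value is still reported as a failure.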

RFC v2 is a more complete solution that includes parts 1 and 2 above to
address David’s comment [1]. Selftests are included for all the above.

+ RFC v1: https://lore.kernel.org/all/cover.1760551864.git.wyihan@google.com/

[1]: https://lore.kernel.org/all/91dbea57-d5b0-49b7-8920-3a2d252c46b0@redhat.com/

Signed-off-by: Lisa Wang <wyihan@google.com>
---
Lisa Wang (7):
      mm: memory_failure: Clarify the MF_DELAYED definition
      mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
      mm: shmem: Update shmem handler to the MF_DELAYED definition
      mm: memory_failure: Generalize extra_pins handling to all MF_DELAYED cases
      mm: selftests: Add shmem memory failure test
      KVM: selftests: Add memory failure tests in guest_memfd_test
      KVM: selftests: Test guest_memfd behavior with respect to stage 2 page tables

 mm/memory-failure.c                                |  17 +-
 mm/shmem.c                                         |   2 +-
 tools/testing/selftests/kvm/guest_memfd_test.c     | 233 +++++++++++++++++++++
 tools/testing/selftests/mm/Makefile                |   3 +
 tools/testing/selftests/mm/run_vmtests.sh          |   1 +
 .../selftests/mm/shmem_memory_failure_test.c       |  98 +++++++++
 6 files changed, 344 insertions(+), 10 deletions(-)
---
base-commit: 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681
change-id: 20260319-memory-failure-mf-delayed-fix-rfc-v2-5ee11d6a7260

Best regards,
-- 
Lisa Wang <wyihan@google.com>
Re: [PATCH RFC v2 0/7] mm: Fix MF_DELAYED handling on memory failure
Posted by Andrew Morton 2 weeks, 1 day ago
On Thu, 19 Mar 2026 23:30:27 +0000 Lisa Wang <wyihan@google.com> wrote:

> Here's a second revision to fix MF_DELAYED handling on memory failure.
> 
> This patch series addresses an issue in the memory failure handling path
> where MF_DELAYED is incorrectly treated as an error. This issue was
> discovered while testing memory failure handling for guest_memfd.
> 
> The proposed solution involves:
> 1. Clarifying the definition of MF_DELAYED to mean that memory failure
>    handling is only partially completed, and that the metadata for the
>    memory that failed (as in struct page/folio) is still referenced.
> 2. Updating shmem’s handling to align with the clarified definition.
> 3. Updating how the result of .error_remove_folio() is interpreted.
> 
> RFC v2 is a more complete solution that includes parts 1 and 2 above to
> address David’s comment [1]. Selftests are included for all the above.

A few questions from Sashiko:
	https://sashiko.dev/#/patchset/20260319-memory-failure-mf-delayed-fix-rfc-v2-v2-0-92c596402a7a%40google.com