Here's a fourth revision to fix MF_DELAYED handling on memory failure.
This patch series addresses an issue in the memory failure handling path
where MF_DELAYED is incorrectly treated as an error. This issue was
discovered while testing memory failure handling for guest_memfd.
The proposed solution involves:
1. Clarifying the definition of MF_DELAYED to mean that memory failure
   handling is only partially complete and that the metadata for the
   failed memory (its struct page/folio) is still referenced.
2. Updating shmem’s handling to align with the clarified definition.
3. Updating how the result of .error_remove_folio() is interpreted.
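
As a rough illustration of the intent behind the three points above, here
is a small userspace model. It is a sketch under stated assumptions, not
code from this series: enum remove_result and the three helper functions
are hypothetical names invented for the example, and only the enum
mf_result names mirror the kernel. The point it shows is that a "delayed"
outcome is carried through as MF_DELAYED and reported as success rather
than being folded into the failure path.

```c
#include <assert.h>
#include <errno.h>

/* Names mirror the kernel's enum mf_result; this is only a userspace model. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/*
 * Hypothetical stand-in for the outcome of a .error_remove_folio()
 * callback. The real callback's return convention is not modelled here.
 */
enum remove_result {
	REMOVE_OK,	/* folio removed, metadata no longer referenced */
	REMOVE_DELAYED,	/* handling partially done, folio still referenced */
	REMOVE_ERROR,	/* removal failed */
};

/* Assumed interpretation: a delayed removal is not treated as a failure. */
enum mf_result interpret_remove_result(enum remove_result r)
{
	switch (r) {
	case REMOVE_OK:
		return MF_RECOVERED;
	case REMOVE_DELAYED:
		return MF_DELAYED;
	case REMOVE_ERROR:
		break;
	}
	return MF_FAILED;
}

/* Both MF_RECOVERED and MF_DELAYED are reported to the caller as success. */
int mf_result_to_errno(enum mf_result res)
{
	return (res == MF_RECOVERED || res == MF_DELAYED) ? 0 : -EBUSY;
}

/* End-to-end path of the model: callback outcome in, errno-style code out. */
int handle_remove_result(enum remove_result r)
{
	enum mf_result res = interpret_remove_result(r);

	/* This model never produces MF_IGNORED. */
	assert(res != MF_IGNORED);
	return mf_result_to_errno(res);
}
```

In this model, handle_remove_result(REMOVE_DELAYED) yields 0, while
handle_remove_result(REMOVE_ERROR) yields -EBUSY.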
Changes from v3:
+ Split an independent guest_memfd_memory_failure_test, as suggested by
  Ackerley and Sean
+ Align error logging style in truncate_error_folio, as suggested by
  Miaohe and David
+ Verify a clean shmem page can be read successfully after soft-offline
  memory failure, as suggested by Miaohe
Thanks!
+ RFC v3: https://lore.kernel.org/all/20260408-memory-failure-mf-delayed-fix-rfc-v3-v3-0-718f45eb7c75@google.com/
Signed-off-by: Lisa Wang <wyihan@google.com>
---
Lisa Wang (7):
      mm: memory_failure: Clarify the MF_DELAYED definition
      mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
      mm: shmem: Update shmem handler to the MF_DELAYED definition
      mm: memory_failure: Generalize extra_pins handling to all MF_DELAYED cases
      mm: selftests: Add shmem into memory failure test
      KVM: selftests: Add the guest_memfd memory failure test
      KVM: selftests: Test guest_memfd behavior with respect to stage 2 page tables

 mm/memory-failure.c                         |  29 +-
 mm/shmem.c                                  |   2 +-
 tools/testing/selftests/kvm/Makefile.kvm    |   2 +
 .../kvm/guest_memfd_memory_failure_test.c   | 402 +++++++++++++++++++++
 tools/testing/selftests/mm/memory-failure.c | 111 +++++-
 5 files changed, 527 insertions(+), 19 deletions(-)
---
base-commit: 38741a8e3bc1b809d64f8c8885ab15c3e40700ff
change-id: 20260527-memory-failure-mf-delayed-fix-7d5a8f4a8a8b
Best regards,
--
Lisa Wang <wyihan@google.com>