x86/mm: support memory-failure on 32-bits with SPARSEMEM

[PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 3 months ago

Memory bit flips are among the most common hardware errors in the server
and embedded fields. Many hardware components have memory verification
mechanisms, for example ECC. When an error is detected, some hardware or
architectures report the information to software (OS/BIOS), for example,
the MCE (Machine Check Exception) on x86.

Common errors include CE (Correctable Errors) and UE (Uncorrectable
Errors). When the kernel receives memory error information, if it has the
memory-failure feature, it can better handle memory errors without reboot.
For example, the kernel can attempt to offline the affected memory by
migrating it or killing the process. Therefore, this feature is widely
used in servers and embedded fields.

In older kernel versions, memory-failure could not be enabled with x86_32 &&
SPARSEMEM because the number of page flags was insufficient. However, this
issue has been resolved in the current version, and this series allows
SPARSEMEM and memory-failure to be enabled together on x86_32.

By the way, due to increased demand, DRAM prices have recently
skyrocketed, making memory-failure potentially even more valuable in the
coming years.

v1->v2: https://lore.kernel.org/20251103033536.52234-1-xieyuanbin1@huawei.com
  - Describe the purpose of these patches in the cover letter.

  - Correct the description of historical changes to page flags.

  - Move the memory-failure tracing code from ras_event.h to
    memory-failure.h

Xie Yuanbin (2):
  x86/mm: support memory-failure on 32-bits with SPARSEMEM
  mm/memory-failure: remove the selection of RAS

 arch/x86/Kconfig                      |  3 -
 include/ras/ras_event.h               | 86 ------------------------
 include/trace/events/memory-failure.h | 97 +++++++++++++++++++++++++++
 mm/Kconfig                            |  1 -
 mm/memory-failure.c                   |  5 +-
 5 files changed, 101 insertions(+), 91 deletions(-)
 create mode 100644 include/trace/events/memory-failure.h

-- 
2.51.0

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Dave Hansen 3 months ago

On 11/3/25 23:23, Xie Yuanbin wrote:
> Memory bit flips are among the most common hardware errors in the server
> and embedded fields. Many hardware components have memory verification
> mechanisms, for example ECC. When an error is detected, some hardware or
> architectures report the information to software (OS/BIOS), for example,
> the MCE (Machine Check Exception) on x86.
> 
> Common errors include CE (Correctable Errors) and UE (Uncorrectable
> Errors). When the kernel receives memory error information, if it has the
> memory-failure feature, it can better handle memory errors without reboot.
> For example, the kernel can attempt to offline the affected memory by
> migrating it or killing the process. Therefore, this feature is widely
> used in servers and embedded fields.
> 
> In older kernel versions, memory-failure could not be enabled with x86_32 &&
> SPARSEMEM because the number of page flags was insufficient. However, this
> issue has been resolved in the current version, and this series allows
> SPARSEMEM and memory-failure to be enabled together on x86_32.
> 
> By the way, due to increased demand, DRAM prices have recently
> skyrocketed, making memory-failure potentially even more valuable in the
> coming years.

Which LLM generated that for you, btw?

I wanted to know _specifically_ what kind of hardware or 32-bit
environment you wanted to support with this series, though.

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 3 months ago

On Tue, 4 Nov 2025 06:26:58 -0800, Dave Hansen wrote:
> Which LLM generated that for you, btw?

I wrote this myself; an LLM just helped me with the translation. My English
isn't very good, so I apologize for any mistakes.

> I wanted to know _specifically_ what kind of hardware or 32-bit
> environment you wanted to support with this series, though.

I think I have explained it clearly enough in this email:
Link: https://lore.kernel.org/20251104133254.145660-1-xieyuanbin1@huawei.com

In simple terms, it refers to some old existing equipment and some
embedded devices. More specifically, it includes some routers, switches,
and similar devices. As far as I know, there is no VM environment that
uses it.
If you are asking about a specific CPU chip model, I'm sorry, but I may
not be able to provide that information for you.

Btw, why do you only ask about which x86_32 devices use memory-failure,
but not which x86_32 devices use sparsemem? This patch just allows both
to coexist, and perhaps both are important?

Thanks!

Xie Yuanbin

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by David Hildenbrand (Red Hat) 3 months ago

On 05.11.25 03:45, Xie Yuanbin wrote:
> On Tue, 4 Nov 2025 06:26:58 -0800, Dave Hansen wrote:
>> Which LLM generated that for you, btw?
> 
> I wrote this myself; an LLM just helped me with the translation. My English
> isn't very good, so I apologize for any mistakes.
> 
>> I wanted to know _specifically_ what kind of hardware or 32-bit
>> environment you wanted to support with this series, though.
> 
> I think I have explained it clearly enough in this email:
> Link: https://lore.kernel.org/20251104133254.145660-1-xieyuanbin1@huawei.com
> 
> In simple terms, it refers to some old existing equipment and some
> embedded devices. More specifically, it includes some routers, switches,
> and similar devices. As far as I know, there is no VM environment that
> uses it.
> If you are asking about a specific CPU chip model, I'm sorry, but I may
> not be able to provide that information for you.
> 
> Btw, why do you only ask about which x86_32 devices use memory-failure,
> but not which x86_32 devices use sparsemem? This patch just allows both
> to coexist, and perhaps both are important?

Let me clarify what we need to know:

Will you (or your employer) be running such updated 32bit kernels on
hardware that supports MCEs?

In other words: is this change driven by *real demand* or just by "oh
look, we can enable that now, I can come up with a theoretical use case 
but I don't know if anybody would actually care"?

-- 
Cheers

David

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 3 months ago

On Wed, 5 Nov 2025 09:12:04 +0100, David Hildenbrand wrote:
> Let me clarify what we need to know:
>
> Will you (or your employer) be running such updated 32bit kernels on
> hardware that supports MCEs?
>
> In other words: is this change driven by *real demand*

Thanks! Asking like this, I completely understand now.

We won't directly upgrade the kernel to 6.18.x (or later versions) to use
this feature, but if the Linux community approves these patches, we will
backport them to 5.10.x and use them. I know that the page flags in 5.10.x
have been exhausted, but we can work around that by adjusting
SECTION_SIZE_BITS/MAX_PHYSMEM_BITS to free up a page flag.
Another patch I submitted for arm32:
Link: https://lore.kernel.org/20250922021453.3939-1-xieyuanbin1@huawei.com
, follows the same logic.
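
To make the page-flag arithmetic concrete, here is a minimal userspace
sketch. It assumes that with classic (non-vmemmap) SPARSEMEM the section
number stored in page->flags needs MAX_PHYSMEM_BITS - SECTION_SIZE_BITS
bits, and that page->flags is 32 bits wide on a 32-bit kernel. The helper
names are hypothetical, and 36/29 are only sample values; the real budget
also has to cover the zone and node fields.

```c
#include <assert.h>

/* Assumed width of page->flags on a 32-bit kernel. */
enum { PAGE_FLAGS_BITS = 32 };

/*
 * Bits needed for the section number: one section covers
 * 2^section_size_bits bytes, so 2^max_physmem_bits bytes of physical
 * address space contain 2^(max_physmem_bits - section_size_bits) sections.
 */
static int sections_width(int max_physmem_bits, int section_size_bits)
{
	assert(max_physmem_bits >= section_size_bits);
	return max_physmem_bits - section_size_bits;
}

/* Bits left over for the zone/node fields and the individual page flags. */
static int bits_left(int max_physmem_bits, int section_size_bits)
{
	return PAGE_FLAGS_BITS -
	       sections_width(max_physmem_bits, section_size_bits);
}
```

With the sample values, bits_left(36, 29) is 25 and bits_left(36, 30) is
26: raising SECTION_SIZE_BITS by one (or lowering MAX_PHYSMEM_BITS by one)
frees exactly one bit, at the cost of coarser sections or less addressable
memory.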

Currently, there is a clear demand for ARM32, while the demand for x86 is
still under discussion.

> or just by "oh
> look, we can enable that now, I can come up with a theoretical use case
> but I don't know if anybody would actually care"?

You could put it that way. In fact, while working on the requirement
"support MEMORY_FAILURE for 32-bit OS" on 5.10.x, I found that the
latest version already supported this feature, so I submitted these
patches in the hope that others can benefit from them as well.

> Cheers
>
> David

Thanks!

Xie Yuanbin

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 2 months, 3 weeks ago

On Wed, 5 Nov 2025 17:05:36 +0800, Xie Yuanbin wrote:
> On Wed, 5 Nov 2025 09:12:04 +0100, David Hildenbrand wrote:
>> Let me clarify what we need to know:
>>
>> Will you (or your employer) be running such updated 32bit kernels on
>> hardware that supports MCEs?
>>
>> In other words: is this change driven by *real demand*
>
> Thanks! Asking like this, I completely understand now.
>
> We won't directly upgrade the kernel to 6.18.x (or later versions) to use
> this feature, but if Linux community approves these patches, we will
> backport it to 5.10.x and use it. I know that the page-flags in 5.10.x
> have been exhausted, but we can work around them by adjusting
> SECTION_SIZE_BITS/MAX_PHYSMEM_BITS to free up a page flag.
> Another patch I submitted for arm32:
> Link: https://lore.kernel.org/20250922021453.3939-1-xieyuanbin1@huawei.com
> , follows the same logic.
>
> Currently, there is a clear demand for ARM32, while the demand for x86 is
> still under discussion.
>
>> or just by "oh
>> look, we can enable that now, I can come up with a theoretical use case
>> but I don't know if anybody would actually care"?
>
> It can also be said that way. In fact, when developing the demand
> "support MEMORY_FAILURE for 32-bit OS" in version 5.10.x, I found that the
> latest version already supported this feature, so I submitted these
> patches, and hope others can benefit from it as well.

Hello, David Hildenbrand and Dave Hansen!

Do you have any other comments on this patch? If you think that
supporting memory-failure on x86_32 is meaningless, I will only submit
patch 2 in the v3 patches.

Thank you very much!

Xie Yuanbin

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by David Hildenbrand (Red Hat) 2 months, 3 weeks ago

On 17.11.25 03:09, Xie Yuanbin wrote:
> On Wed, 5 Nov 2025 17:05:36 +0800, Xie Yuanbin wrote:
>> On Wed, 5 Nov 2025 09:12:04 +0100, David Hildenbrand wrote:
>>> Let me clarify what we need to know:
>>>
>>> Will you (or your employer) be running such updated 32bit kernels on
>>> hardware that supports MCEs?
>>>
>>> In other words: is this change driven by *real demand*
>>
>> Thanks! Asking like this, I completely understand now.
>>
>> We won't directly upgrade the kernel to 6.18.x (or later versions) to use
>> this feature, but if Linux community approves these patches, we will
>> backport it to 5.10.x and use it. I know that the page-flags in 5.10.x
>> have been exhausted, but we can work around them by adjusting
>> SECTION_SIZE_BITS/MAX_PHYSMEM_BITS to free up a page flag.
>> Another patch I submitted for arm32:
>> Link: https://lore.kernel.org/20250922021453.3939-1-xieyuanbin1@huawei.com
>> , follows the same logic.
>>
>> Currently, there is a clear demand for ARM32, while the demand for x86 is
>> still under discussion.
>>
>>> or just by "oh
>>> look, we can enable that now, I can come up with a theoretical use case
>>> but I don't know if anybody would actually care"?
>>
>> It can also be said that way. In fact, when developing the demand
>> "support MEMORY_FAILURE for 32-bit OS" in version 5.10.x, I found that the
>> latest version already supported this feature, so I submitted these
>> patches, and hope others can benefit from it as well.
> 
> Hello, David Hildenbrand and Dave Hansen!
> 
> Do you have any other comments on this patch? If you think that
> supporting memory-failure on x86_32 is meaningless, I will only submit
> patch 2 in the v3 patches.

I'd say, if nobody will really make use of that right now (customer 
request etc), just leave x86 alone for now.

-- 
Cheers

David

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 2 months, 3 weeks ago

On Mon, 17 Nov 2025 14:03:46 +0100, David Hildenbrand wrote:
> I'd say, if nobody will really make use of that right now (customer 
> request etc), just leave x86 alone for now.

Okay, thanks, I will only submit patch 2 in the v3 patches.

On Tue, 4 Nov 2025 10:38:54 +0100, David Hildenbrand wrote:
Link: https://lore.kernel.org/01b44e0f-ea2e-406f-9f65-b698b5504f42@kernel.org
> This trace system should not be called "ras". All RAS terminology should 
> be removed here.
>
> #define TRACE_SYSTEM memory_failure
>
> We want to add that new file to the "HWPOISON MEMORY FAILURE HANDLING"
> section in MAINTAINERS.
>
> Nothing else jumped at me.

Can I add
"Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>"
to patch 2?

The full patch will be:
```patch
From: Xie Yuanbin <xieyuanbin1@huawei.com>
Subject: [PATCH v3] mm/memory-failure: remove the selection of RAS

Commit 97f0b13452198290799f ("tracing: add trace event for
memory-failure") introduced the selection of RAS in memory-failure.
That commit only adds a tracing feature; in reality, there is no dependency
between memory-failure and RAS. Selecting RAS increases the size of bzImage
by 8k, and that space is very valuable on embedded devices.

Move the memory-failure tracing code from ras_event.h to
memory-failure.h and remove the selection of RAS.

v2->v3: https://lore.kernel.org/20251104072306.100738-3-xieyuanbin1@huawei.com
  - Change define TRACE_SYSTEM from ras to memory_failure
  - Add include/trace/events/memory-failure.h to
    "HWPOISON MEMORY FAILURE HANDLING" section in MAINTAINERS
  - Rebase to latest linux-next source

v1->v2: https://lore.kernel.org/20251103033536.52234-2-xieyuanbin1@huawei.com
  - Move the memory-failure tracing code from ras_event.h to
    memory-failure.h

Signed-off-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
---
 MAINTAINERS                           |  1 +
 include/ras/ras_event.h               | 87 ------------------------
 include/trace/events/memory-failure.h | 98 +++++++++++++++++++++++++++
 mm/Kconfig                            |  1 -
 mm/memory-failure.c                   |  5 +-
 5 files changed, 103 insertions(+), 89 deletions(-)
 create mode 100644 include/trace/events/memory-failure.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7310d9ca0370..43d6eb95fb05 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11631,10 +11631,11 @@ R:	Naoya Horiguchi <nao.horiguchi@gmail.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 F:	include/linux/memory-failure.h
 F:	mm/hwpoison-inject.c
 F:	mm/memory-failure.c
+F:	include/trace/events/memory-failure.h
 
 HYCON HY46XX TOUCHSCREEN SUPPORT
 M:	Giulio Benetti <giulio.benetti@benettiengineering.com>
 L:	linux-input@vger.kernel.org
 S:	Maintained
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index fecfeb7c8be7..1e5e87020eef 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -10,11 +10,10 @@
 #include <linux/edac.h>
 #include <linux/ktime.h>
 #include <linux/pci.h>
 #include <linux/aer.h>
 #include <linux/cper.h>
-#include <linux/mm.h>
 
 /*
  * MCE Extended Error Log trace event
  *
  * These events are generated when hardware detects a corrected or
@@ -337,95 +336,9 @@ TRACE_EVENT(aer_event,
 		__entry->tlp_header_valid ?
 			__print_array(__entry->tlp_header, PCIE_STD_MAX_TLP_HEADERLOG, 4) :
 			"Not available")
 );
 #endif /* CONFIG_PCIEAER */
-
-/*
- * memory-failure recovery action result event
- *
- * unsigned long pfn -	Page Frame Number of the corrupted page
- * int type	-	Page types of the corrupted page
- * int result	-	Result of recovery action
- */
-
-#ifdef CONFIG_MEMORY_FAILURE
-#define MF_ACTION_RESULT	\
-	EM ( MF_IGNORED, "Ignored" )	\
-	EM ( MF_FAILED,  "Failed" )	\
-	EM ( MF_DELAYED, "Delayed" )	\
-	EMe ( MF_RECOVERED, "Recovered" )
-
-#define MF_PAGE_TYPE		\
-	EM ( MF_MSG_KERNEL, "reserved kernel page" )			\
-	EM ( MF_MSG_KERNEL_HIGH_ORDER, "high-order kernel page" )	\
-	EM ( MF_MSG_HUGE, "huge page" )					\
-	EM ( MF_MSG_FREE_HUGE, "free huge page" )			\
-	EM ( MF_MSG_GET_HWPOISON, "get hwpoison page" )			\
-	EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" )		\
-	EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" )		\
-	EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" )		\
-	EM ( MF_MSG_DIRTY_MLOCKED_LRU, "dirty mlocked LRU page" )	\
-	EM ( MF_MSG_CLEAN_MLOCKED_LRU, "clean mlocked LRU page" )	\
-	EM ( MF_MSG_DIRTY_UNEVICTABLE_LRU, "dirty unevictable LRU page" )	\
-	EM ( MF_MSG_CLEAN_UNEVICTABLE_LRU, "clean unevictable LRU page" )	\
-	EM ( MF_MSG_DIRTY_LRU, "dirty LRU page" )			\
-	EM ( MF_MSG_CLEAN_LRU, "clean LRU page" )			\
-	EM ( MF_MSG_TRUNCATED_LRU, "already truncated LRU page" )	\
-	EM ( MF_MSG_BUDDY, "free buddy page" )				\
-	EM ( MF_MSG_DAX, "dax page" )					\
-	EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" )			\
-	EM ( MF_MSG_ALREADY_POISONED, "already poisoned" )		\
-	EM ( MF_MSG_PFN_MAP, "non struct page pfn" )                    \
-	EMe ( MF_MSG_UNKNOWN, "unknown page" )
-
-/*
- * First define the enums in MM_ACTION_RESULT to be exported to userspace
- * via TRACE_DEFINE_ENUM().
- */
-#undef EM
-#undef EMe
-#define EM(a, b) TRACE_DEFINE_ENUM(a);
-#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
-
-MF_ACTION_RESULT
-MF_PAGE_TYPE
-
-/*
- * Now redefine the EM() and EMe() macros to map the enums to the strings
- * that will be printed in the output.
- */
-#undef EM
-#undef EMe
-#define EM(a, b)		{ a, b },
-#define EMe(a, b)	{ a, b }
-
-TRACE_EVENT(memory_failure_event,
-	TP_PROTO(unsigned long pfn,
-		 int type,
-		 int result),
-
-	TP_ARGS(pfn, type, result),
-
-	TP_STRUCT__entry(
-		__field(unsigned long, pfn)
-		__field(int, type)
-		__field(int, result)
-	),
-
-	TP_fast_assign(
-		__entry->pfn	= pfn;
-		__entry->type	= type;
-		__entry->result	= result;
-	),
-
-	TP_printk("pfn %#lx: recovery action for %s: %s",
-		__entry->pfn,
-		__print_symbolic(__entry->type, MF_PAGE_TYPE),
-		__print_symbolic(__entry->result, MF_ACTION_RESULT)
-	)
-);
-#endif /* CONFIG_MEMORY_FAILURE */
 #endif /* _TRACE_HW_EVENT_MC_H */
 
 /* This part must be outside protection */
 #include <trace/define_trace.h>
diff --git a/include/trace/events/memory-failure.h b/include/trace/events/memory-failure.h
new file mode 100644
index 000000000000..aa57cc8f896b
--- /dev/null
+++ b/include/trace/events/memory-failure.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM memory_failure
+#define TRACE_INCLUDE_FILE memory-failure
+
+#if !defined(_TRACE_MEMORY_FAILURE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MEMORY_FAILURE_H
+
+#include <linux/tracepoint.h>
+#include <linux/mm.h>
+
+/*
+ * memory-failure recovery action result event
+ *
+ * unsigned long pfn -	Page Frame Number of the corrupted page
+ * int type	-	Page types of the corrupted page
+ * int result	-	Result of recovery action
+ */
+
+#define MF_ACTION_RESULT	\
+	EM ( MF_IGNORED, "Ignored" )	\
+	EM ( MF_FAILED,  "Failed" )	\
+	EM ( MF_DELAYED, "Delayed" )	\
+	EMe ( MF_RECOVERED, "Recovered" )
+
+#define MF_PAGE_TYPE		\
+	EM ( MF_MSG_KERNEL, "reserved kernel page" )			\
+	EM ( MF_MSG_KERNEL_HIGH_ORDER, "high-order kernel page" )	\
+	EM ( MF_MSG_HUGE, "huge page" )					\
+	EM ( MF_MSG_FREE_HUGE, "free huge page" )			\
+	EM ( MF_MSG_GET_HWPOISON, "get hwpoison page" )			\
+	EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" )		\
+	EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" )		\
+	EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" )		\
+	EM ( MF_MSG_DIRTY_MLOCKED_LRU, "dirty mlocked LRU page" )	\
+	EM ( MF_MSG_CLEAN_MLOCKED_LRU, "clean mlocked LRU page" )	\
+	EM ( MF_MSG_DIRTY_UNEVICTABLE_LRU, "dirty unevictable LRU page" )	\
+	EM ( MF_MSG_CLEAN_UNEVICTABLE_LRU, "clean unevictable LRU page" )	\
+	EM ( MF_MSG_DIRTY_LRU, "dirty LRU page" )			\
+	EM ( MF_MSG_CLEAN_LRU, "clean LRU page" )			\
+	EM ( MF_MSG_TRUNCATED_LRU, "already truncated LRU page" )	\
+	EM ( MF_MSG_BUDDY, "free buddy page" )				\
+	EM ( MF_MSG_DAX, "dax page" )					\
+	EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" )			\
+	EM ( MF_MSG_ALREADY_POISONED, "already poisoned" )		\
+	EM ( MF_MSG_PFN_MAP, "non struct page pfn" )                    \
+	EMe ( MF_MSG_UNKNOWN, "unknown page" )
+
+/*
+ * First define the enums in MM_ACTION_RESULT to be exported to userspace
+ * via TRACE_DEFINE_ENUM().
+ */
+#undef EM
+#undef EMe
+#define EM(a, b) TRACE_DEFINE_ENUM(a);
+#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
+
+MF_ACTION_RESULT
+MF_PAGE_TYPE
+
+/*
+ * Now redefine the EM() and EMe() macros to map the enums to the strings
+ * that will be printed in the output.
+ */
+#undef EM
+#undef EMe
+#define EM(a, b)		{ a, b },
+#define EMe(a, b)	{ a, b }
+
+TRACE_EVENT(memory_failure_event,
+	TP_PROTO(unsigned long pfn,
+		 int type,
+		 int result),
+
+	TP_ARGS(pfn, type, result),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, pfn)
+		__field(int, type)
+		__field(int, result)
+	),
+
+	TP_fast_assign(
+		__entry->pfn	= pfn;
+		__entry->type	= type;
+		__entry->result	= result;
+	),
+
+	TP_printk("pfn %#lx: recovery action for %s: %s",
+		__entry->pfn,
+		__print_symbolic(__entry->type, MF_PAGE_TYPE),
+		__print_symbolic(__entry->result, MF_ACTION_RESULT)
+	)
+);
+#endif /* _TRACE_MEMORY_FAILURE_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/mm/Kconfig b/mm/Kconfig
index d548976d0e0a..bd0ea5454af8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -738,11 +738,10 @@ config ARCH_SUPPORTS_MEMORY_FAILURE
 
 config MEMORY_FAILURE
 	depends on MMU
 	depends on ARCH_SUPPORTS_MEMORY_FAILURE
 	bool "Enable recovery from hardware memory errors"
-	select RAS
 	select INTERVAL_TREE
 	help
 	  Enables code to recover from some memory failures on systems
 	  with MCA recovery. This allows a system to continue running
 	  even when some of its memory has uncorrected errors. This requires
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 7f908ad795ad..fbc5a01260c8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -59,13 +59,16 @@
 #include <linux/kfifo.h>
 #include <linux/ratelimit.h>
 #include <linux/pagewalk.h>
 #include <linux/shmem_fs.h>
 #include <linux/sysctl.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/memory-failure.h>
+
 #include "swap.h"
 #include "internal.h"
-#include "ras/ras_event.h"
 
 static int sysctl_memory_failure_early_kill __read_mostly;
 
 static int sysctl_memory_failure_recovery __read_mostly = 1;
 
-- 
2.51.0
```
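
For readers unfamiliar with the EM()/EMe() pattern that the new header
carries over, here is a minimal userspace sketch of the same technique:
one list of (enum, string) pairs is written once and expanded twice with
different macro definitions. The enum, `struct sym`, `print_symbolic()`
and `format_event()` are hypothetical stand-ins, not kernel APIs, and the
first expansion simply counts entries where the kernel header uses
TRACE_DEFINE_ENUM().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's result enum. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

/* One list of (enum, string) pairs, written once. */
#define MF_ACTION_RESULT		\
	EM(MF_IGNORED, "Ignored")	\
	EM(MF_FAILED, "Failed")		\
	EM(MF_DELAYED, "Delayed")	\
	EMe(MF_RECOVERED, "Recovered")

/* First expansion: every entry becomes "+ 1", yielding the entry count. */
#define EM(a, b)	+ 1
#define EMe(a, b)	+ 1
enum { MF_RESULT_COUNT = 0 MF_ACTION_RESULT };

/* Second expansion: every entry becomes a { value, name } initializer. */
#undef EM
#undef EMe
#define EM(a, b)	{ a, b },
#define EMe(a, b)	{ a, b }

struct sym {
	int value;
	const char *name;
};

static const struct sym mf_result_names[] = { MF_ACTION_RESULT };

/* Simplified lookup by value; returns "unknown" when nothing matches. */
static const char *print_symbolic(int value, const struct sym *tbl, size_t n)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].value == value)
			return tbl[i].name;
	return "unknown";
}

/* Build a line shaped like the TP_printk() format string in the patch. */
static int format_event(char *buf, size_t len, unsigned long pfn,
			const char *type, int result)
{
	return snprintf(buf, len, "pfn %#lx: recovery action for %s: %s",
			pfn, type,
			print_symbolic(result, mf_result_names,
				       MF_RESULT_COUNT));
}
```

Calling `format_event(buf, sizeof(buf), 0x1234UL, "dirty LRU page",
MF_RECOVERED)` fills `buf` with a line in the same shape as the trace
output. The point of the pattern is that adding a new result only requires
touching the single MF_ACTION_RESULT list.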

Thanks very much.

Xie Yuanbin

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by David Hildenbrand (Red Hat) 3 months ago

On 04.11.25 08:23, Xie Yuanbin wrote:
> Memory bit flips are among the most common hardware errors in the server
> and embedded fields. Many hardware components have memory verification
> mechanisms, for example ECC. When an error is detected, some hardware or
> architectures report the information to software (OS/BIOS), for example,
> the MCE (Machine Check Exception) on x86.
> 
> Common errors include CE (Correctable Errors) and UE (Uncorrectable
> Errors). When the kernel receives memory error information, if it has the
> memory-failure feature, it can better handle memory errors without reboot.
> For example, the kernel can attempt to offline the affected memory by
> migrating it or killing the process. Therefore, this feature is widely
> used in servers and embedded fields.

This is a pretty generic description of MCEs.

I think what we are missing is: who runs 32bit OSes on MCE-capable 
hardware (or VMs?) and needs this to work.

What's the use case?

-- 
Cheers

David

Re: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM

Posted by Xie Yuanbin 3 months ago

The previous email was corrupted; please ignore it.
I'm very sorry about this.

On Tue, 4 Nov 2025 10:33:39 +0100, David Hildenbrand wrote:
> This is a pretty generic description of MCEs.
>
> I think what we are missing is: who runs 32bit OSes on MCE-capable 
> hardware (or VMs?) and needs this to work.
>
> What's the use case?

I did indeed miss this part in my description, and I apologize for that.
Ever since the memory-failure feature was introduced by
commit 6a46079cf57a7f7758e8 ("HWPOISON: The high level memory error
handler in the VM v7"), it could be enabled on x86_32. I am submitting
these patches only because MEMORY_FAILURE cannot be enabled together with
SPARSEMEM on x86_32. memory-failure was introduced in 2009, when
64-bit hardware was not even very popular yet, and the first caller of
`memory_failure()` was x86's MCE code.
Even in the latest version, with the default i386_defconfig, MEMORY_FAILURE
can be enabled directly on x86_32, because i386_defconfig does not enable
SPARSEMEM by default.
Therefore, I did not consider the need to explain why MEMORY_FAILURE needs
to be enabled on x86_32.

Now, let me try to explain it. From what I understand, it mainly comes
from two aspects:
1. Although almost all new CPUs are 64-bit, there are still many existing
32-bit x86 devices in use.
2. On some embedded devices, in order to save memory overhead, even with
64-bit CPU hardware, a 32-bit kernel may still be used. You might wonder
why embedded devices need SPARSEMEM. What they actually need is the
MEMORY_HOTPLUG feature, which depends on SPARSEMEM, not necessarily
SPARSEMEM itself.

On all of the above devices, the memory-failure feature may be used to
provide reliable memory error handling and to minimize service
interruptions as much as possible.

> Cheers
>
> David

Thanks!

Xie Yuanbin
