[PATCH v3.1] sched/isolation: Defer freeing of cpumask memblock memory to initcall

Waiman Long posted 1 patch 3 days, 13 hours ago
kernel/sched/isolation.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
When testing a linux-next kernel with commit 59bd1d914bb5 ("memblock:
warn when freeing reserved memory before memory map is initialized"),
the following warning was hit when booting with the "nohz_full" kernel
parameter.

  Cannot free reserved memory because of deferred initialization of the memory map
  WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
    :
  Call Trace:
   <TASK>
   memblock_phys_free+0xcb/0x100
   housekeeping_init+0x14c/0x170
   start_kernel+0x207/0x450
   x86_64_start_reservations+0x24/0x30
   x86_64_start_kernel+0xda/0xe0
   common_startup_64+0x13e/0x141
   </TASK>

IOW, memblock-allocated memory must not be freed this early in the
boot process, before deferred_init_memmap() has fully initialized the
memory map.

Fix it by having housekeeping_init() add the housekeeping cpumask
memblock memory that is to be freed to a free list, and add a new
housekeeping_late_init() helper that defers the actual freeing of
memblock memory until initcalls are being processed. The non-atomic
versions of the llist APIs are used as there is no contention.
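
The deferral pattern can be illustrated with a minimal userspace
sketch. It is not the kernel code: struct defer_node, defer_list,
defer_free() and defer_free_flush() are hypothetical stand-ins for
struct llist_node, memblock_freelist, __llist_add() and the new
helper, and free() stands in for memblock_free(). As in the patch,
the buffer being retired is reused as its own list node, which assumes
it is at least pointer-sized and no longer referenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct llist_node. */
struct defer_node {
	struct defer_node *next;
};

/* Hypothetical stand-in for the memblock_freelist head. */
static struct defer_node *defer_list;

/*
 * Queue a buffer for later freeing. The first bytes of the buffer are
 * reused as the list node, so it must be at least pointer-sized and
 * must no longer be in use by anyone else.
 */
static void defer_free(void *buf, size_t size)
{
	struct defer_node *node = buf;

	assert(size >= sizeof(*node));
	node->next = defer_list;
	defer_list = node;
}

/*
 * Detach the whole list, then free each buffer. The next pointer is
 * read before free() because it lives inside the buffer being freed.
 * Returns the number of buffers released.
 */
static size_t defer_free_flush(void)
{
	struct defer_node *pos = defer_list;
	size_t count = 0;

	defer_list = NULL;
	while (pos) {
		struct defer_node *next = pos->next;

		free(pos);
		pos = next;
		count++;
	}
	return count;
}
```

No locking is needed in this sketch for the same reason the non-atomic
llist helpers suffice in the patch: both the queueing and the flush
are assumed to run from a single context.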

This commit also depends on commit 7c2eee9c1367 ("memblock: don't
touch memblock arrays when memblock_free() is called late") to
prevent a KASAN UAF bug report [1].

 [1] https://lore.kernel.org/lkml/20260505051821.1107133-1-longman@redhat.com/

Fixes: 27c3a5967f05 ("sched/isolation: Convert housekeeping cpumasks to rcu pointers")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/isolation.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

 [v3.1] Add __initdata to memblock_freelist

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ef152d401fe2..156025ef81b7 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -8,6 +8,7 @@
  *
  */
 #include <linux/sched/isolation.h>
+#include <linux/llist.h>
 #include <linux/pci.h>
 #include "sched.h"
 
@@ -27,6 +28,7 @@ struct housekeeping {
 };
 
 static struct housekeeping housekeeping;
+static __initdata LLIST_HEAD(memblock_freelist);
 
 bool housekeeping_enabled(enum hk_type type)
 {
@@ -189,10 +191,22 @@ void __init housekeeping_init(void)
 		WARN_ON_ONCE(cpumask_empty(omask));
 		cpumask_copy(nmask, omask);
 		RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
-		memblock_free(omask, cpumask_size());
+		__llist_add((struct llist_node *)omask, &memblock_freelist);
 	}
 }
 
+static int __init housekeeping_late_init(void)
+{
+	struct llist_node *llnode, *pos, *t;
+
+	/* Free allocated memblock memory, if any */
+	llnode = __llist_del_all(&memblock_freelist);
+	llist_for_each_safe(pos, t, llnode)
+		memblock_free(pos, cpumask_size());
+	return 0;
+}
+pure_initcall(housekeeping_late_init);
+
 static void __init housekeeping_setup_type(enum hk_type type,
 					   cpumask_var_t housekeeping_staging)
 {
-- 
2.54.0