When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
freeing reserved memory before memory map is initialized"), the following
warning was hit when there was a "nohz_full" kernel boot parameter.
[ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
[ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
:
[ 0.080945] Call Trace:
[ 0.080947] <TASK>
[ 0.080949] memblock_phys_free+0xcb/0x100
[ 0.080953] housekeeping_init+0x14c/0x170
[ 0.080957] start_kernel+0x207/0x450
[ 0.080961] x86_64_start_reservations+0x24/0x30
[ 0.080965] x86_64_start_kernel+0xda/0xe0
[ 0.080967] common_startup_64+0x13e/0x141
[ 0.080972] </TASK>
The commit states that freeing of reserved memory before the memory
map is fully initialized in deferred_init_memmap() would cause access
to uninitialized struct pages and may crash when accessing spurious
list pointers. However, if the memblock_free() call is deferred to
the start of initcall processing in the bootup process, for instance,
the following KASAN warning can appear.
[ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
[ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
:
[ 8.514775] Call Trace:
[ 8.514775] <TASK>
[ 8.514775] kasan_report+0xb2/0x1b0
[ 8.514775] memblock_isolate_range+0x4ac/0x650
[ 8.514775] memblock_phys_free+0xc4/0x190
[ 8.514775] housekeeping_late_init+0x257/0x280
[ 8.514775] do_one_initcall+0xaa/0x470
[ 8.514775] do_initcalls+0x1b4/0x1f0
[ 8.514775] kernel_init_freeable+0x4b5/0x550
[ 8.514775] kernel_init+0x1c/0x150
[ 8.514775] ret_from_fork+0x5dc/0x8e0
[ 8.514775] ret_from_fork_asm+0x1a/0x30
[ 8.514775] </TASK>
It is likely that memblock_discard() may discard memblock data needed
for memblock_free(). One workaround for now to avoid these warning/bug
messages is to keep the memblock allocated cpumasks even if they are
no longer needed until the memblock subsystem is properly updated to
handle memblock_free().
On most systems, memory occupied by a cpumask is pretty small. So not
much memory will be wasted if the memblock cpumasks are not freed.
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/sched/isolation.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ef152d401fe2..ad9b1a1104e3 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -189,7 +189,13 @@ void __init housekeeping_init(void)
WARN_ON_ONCE(cpumask_empty(omask));
cpumask_copy(nmask, omask);
RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
- memblock_free(omask, cpumask_size());
+
+ /*
+ * TODO: Don't free memblock allocated cpumasks until the
+ * memblock subsystem is able to handle the memblock_free()
+ * properly.
+ */
+ // memblock_free(omask, cpumask_size());
}
}
--
2.53.0
Hi Waiman,
On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
> freeing reserved memory before memory map is initialized"), the following
> warning was hit when there was a "nohz_full" kernel boot parameter.
>
> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> :
> [ 0.080945] Call Trace:
> [ 0.080947] <TASK>
> [ 0.080949] memblock_phys_free+0xcb/0x100
> [ 0.080953] housekeeping_init+0x14c/0x170
> [ 0.080957] start_kernel+0x207/0x450
> [ 0.080961] x86_64_start_reservations+0x24/0x30
> [ 0.080965] x86_64_start_kernel+0xda/0xe0
> [ 0.080967] common_startup_64+0x13e/0x141
> [ 0.080972] </TASK>
>
> The commit states that freeing of reserved memory before the memory
> map is fully initialized in deferred_init_memmap() would cause access
> to uninitialized struct pages and may crash when accessing spurious
> list pointers. However, if the memblock_free() call is deferred to
> the start of initcall processing in the bootup process, for instance,
> the following KASAN warning can appear.
>
> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
> :
> [ 8.514775] Call Trace:
> [ 8.514775] <TASK>
> [ 8.514775] kasan_report+0xb2/0x1b0
> [ 8.514775] memblock_isolate_range+0x4ac/0x650
> [ 8.514775] memblock_phys_free+0xc4/0x190
> [ 8.514775] housekeeping_late_init+0x257/0x280
> [ 8.514775] do_one_initcall+0xaa/0x470
> [ 8.514775] do_initcalls+0x1b4/0x1f0
> [ 8.514775] kernel_init_freeable+0x4b5/0x550
> [ 8.514775] kernel_init+0x1c/0x150
> [ 8.514775] ret_from_fork+0x5dc/0x8e0
> [ 8.514775] ret_from_fork_asm+0x1a/0x30
> [ 8.514775] </TASK>
>
> It is likely that memblock_discard() may discard memblock data needed
> for memblock_free(). One workaround for now to avoid these warning/bug
> messages is to keep the memblock allocated cpumasks even if they are
> no longer needed until the memblock subsystem is properly updated to
> handle memblock_free().
>
> On most systems, memory occupied by a cpumask is pretty small. So not
> much memory will be wasted if the memblock cpumasks are not freed.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/sched/isolation.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index ef152d401fe2..ad9b1a1104e3 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
> WARN_ON_ONCE(cpumask_empty(omask));
> cpumask_copy(nmask, omask);
> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> - memblock_free(omask, cpumask_size());
> +
> + /*
> + * TODO: Don't free memblock allocated cpumasks until the
> + * memblock subsystem is able to handle the memblock_free()
> + * properly.
> + */
> + // memblock_free(omask, cpumask_size());
Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
after memblock moves all the memory to buddy, so this would only update
memblock.reserved.
The comment a few lines above says that we reallocate to be able to kfree()
later. Is it possible to delay reallocation until an initcall?
> }
> }
>
> --
> 2.53.0
>
--
Sincerely yours,
Mike.
On 5/10/26 11:02 AM, Mike Rapoport wrote:
> Hi Waiman,
>
> On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
>> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
>> freeing reserved memory before memory map is initialized"), the following
>> warning was hit when there was a "nohz_full" kernel boot parameter.
>>
>> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
>> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
>> :
>> [ 0.080945] Call Trace:
>> [ 0.080947] <TASK>
>> [ 0.080949] memblock_phys_free+0xcb/0x100
>> [ 0.080953] housekeeping_init+0x14c/0x170
>> [ 0.080957] start_kernel+0x207/0x450
>> [ 0.080961] x86_64_start_reservations+0x24/0x30
>> [ 0.080965] x86_64_start_kernel+0xda/0xe0
>> [ 0.080967] common_startup_64+0x13e/0x141
>> [ 0.080972] </TASK>
>>
>> The commit states that freeing of reserved memory before the memory
>> map is fully initialized in deferred_init_memmap() would cause access
>> to uninitialized struct pages and may crash when accessing spurious
>> list pointers. However, if the memblock_free() call is deferred to
>> the start of initcall processing in the bootup process, for instance,
>> the following KASAN warning can appear.
>>
>> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
>> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
>> :
>> [ 8.514775] Call Trace:
>> [ 8.514775] <TASK>
>> [ 8.514775] kasan_report+0xb2/0x1b0
>> [ 8.514775] memblock_isolate_range+0x4ac/0x650
>> [ 8.514775] memblock_phys_free+0xc4/0x190
>> [ 8.514775] housekeeping_late_init+0x257/0x280
>> [ 8.514775] do_one_initcall+0xaa/0x470
>> [ 8.514775] do_initcalls+0x1b4/0x1f0
>> [ 8.514775] kernel_init_freeable+0x4b5/0x550
>> [ 8.514775] kernel_init+0x1c/0x150
>> [ 8.514775] ret_from_fork+0x5dc/0x8e0
>> [ 8.514775] ret_from_fork_asm+0x1a/0x30
>> [ 8.514775] </TASK>
>>
>> It is likely that memblock_discard() may discard memblock data needed
>> for memblock_free(). One workaround for now to avoid these warning/bug
>> messages is to keep the memblock allocated cpumasks even if they are
>> no longer needed until the memblock subsystem is properly updated to
>> handle memblock_free().
>>
>> On most systems, memory occupied by a cpumask is pretty small. So not
>> much memory will be wasted if the memblock cpumasks are not freed.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> kernel/sched/isolation.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>> index ef152d401fe2..ad9b1a1104e3 100644
>> --- a/kernel/sched/isolation.c
>> +++ b/kernel/sched/isolation.c
>> @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
>> WARN_ON_ONCE(cpumask_empty(omask));
>> cpumask_copy(nmask, omask);
>> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
>> - memblock_free(omask, cpumask_size());
>> +
>> + /*
>> + * TODO: Don't free memblock allocated cpumasks until the
>> + * memblock subsystem is able to handle the memblock_free()
>> + * properly.
>> + */
>> + // memblock_free(omask, cpumask_size());
> Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
> after memblock moves all the memory to buddy, so this would only update
> memblock.reserved.
>
> The comment a few lines above says that we reallocate to be able to kfree()
> later. Is it possible to delay reallocation until an initcall?
My original thought was to defer the freeing to an initcall. That change
led to the KASAN bug splat listed in the commit log. I think the right
window to free memblock memory is currently just too narrow. Do you mean
that with the fix patch you sent to Breno, memblock freeing in an initcall
will work without a bug report? If so, I can send another patch to defer
memblock freeing after the fix patch is merged, as the KASAN bug is more
serious than the memblock warning. I will do some testing tomorrow with
your fix patch.
Cheers,
Longman
On Mon, May 11, 2026 at 12:55:39AM -0400, Waiman Long wrote:
> On 5/10/26 11:02 AM, Mike Rapoport wrote:
> > Hi Waiman,
> >
> > On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
> > > When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
> > > freeing reserved memory before memory map is initialized"), the following
> > > warning was hit when there was a "nohz_full" kernel boot parameter.
> > >
> > > [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
> > > [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> > > :
> > > [ 0.080945] Call Trace:
> > > [ 0.080947] <TASK>
> > > [ 0.080949] memblock_phys_free+0xcb/0x100
> > > [ 0.080953] housekeeping_init+0x14c/0x170
> > > [ 0.080957] start_kernel+0x207/0x450
> > > [ 0.080961] x86_64_start_reservations+0x24/0x30
> > > [ 0.080965] x86_64_start_kernel+0xda/0xe0
> > > [ 0.080967] common_startup_64+0x13e/0x141
> > > [ 0.080972] </TASK>
> > >
> > > The commit states that freeing of reserved memory before the memory
> > > map is fully initialized in deferred_init_memmap() would cause access
> > > to uninitialized struct pages and may crash when accessing spurious
> > > list pointers. However, if the memblock_free() call is deferred to
> > > the start of initcall processing in the bootup process, for instance,
> > > the following KASAN warning can appear.
> > >
> > > [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
> > > [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
> > > :
> > > [ 8.514775] Call Trace:
> > > [ 8.514775] <TASK>
> > > [ 8.514775] kasan_report+0xb2/0x1b0
> > > [ 8.514775] memblock_isolate_range+0x4ac/0x650
> > > [ 8.514775] memblock_phys_free+0xc4/0x190
> > > [ 8.514775] housekeeping_late_init+0x257/0x280
> > > [ 8.514775] do_one_initcall+0xaa/0x470
> > > [ 8.514775] do_initcalls+0x1b4/0x1f0
> > > [ 8.514775] kernel_init_freeable+0x4b5/0x550
> > > [ 8.514775] kernel_init+0x1c/0x150
> > > [ 8.514775] ret_from_fork+0x5dc/0x8e0
> > > [ 8.514775] ret_from_fork_asm+0x1a/0x30
> > > [ 8.514775] </TASK>
> > >
> > > It is likely that memblock_discard() may discard memblock data needed
> > > for memblock_free(). One workaround for now to avoid these warning/bug
> > > messages is to keep the memblock allocated cpumasks even if they are
> > > no longer needed until the memblock subsystem is properly updated to
> > > handle memblock_free().
> > >
> > > > On most systems, memory occupied by a cpumask is pretty small. So not
> > > much memory will be wasted if the memblock cpumasks are not freed.
> > >
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > > kernel/sched/isolation.c | 8 +++++++-
> > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > > index ef152d401fe2..ad9b1a1104e3 100644
> > > --- a/kernel/sched/isolation.c
> > > +++ b/kernel/sched/isolation.c
> > > @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
> > > WARN_ON_ONCE(cpumask_empty(omask));
> > > cpumask_copy(nmask, omask);
> > > RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> > > - memblock_free(omask, cpumask_size());
> > > +
> > > + /*
> > > + * TODO: Don't free memblock allocated cpumasks until the
> > > > + * memblock subsystem is able to handle the memblock_free()
> > > + * properly.
> > > + */
> > > + // memblock_free(omask, cpumask_size());
> > Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
> > after memblock moves all the memory to buddy, so this would only update
> > memblock.reserved.
> >
> > The comment a few lines above says that we reallocate to be able to kfree()
> > later. Is it possible to delay reallocation until an initcall?
>
> My original thought was to defer the freeing to init call. That changes led
> to the KASAN bug splat listed in the commit log, I think the right window to
> free memblock memory is currently just too narrow. Do you mean that with the
> fix patch you sent to Breno, memblock freeing in initcall will work without
> bug report?
Yes, with the fix I sent to Breno memblock_free() should work in an
initcall and "do the right thing".
> If so, I can send another patch to defer memblock freeing after the fix
> patch is merged as the KASAN bug is more serious than the memblock
> warning. I will do some testing tomorrow with your fix patch.
> Cheers,
> Longman
>
--
Sincerely yours,
Mike.
On 5/11/26 4:34 AM, Mike Rapoport wrote:
> On Mon, May 11, 2026 at 12:55:39AM -0400, Waiman Long wrote:
>> On 5/10/26 11:02 AM, Mike Rapoport wrote:
>>> Hi Waiman,
>>>
>>> On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
>>>> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
>>>> freeing reserved memory before memory map is initialized"), the following
>>>> warning was hit when there was a "nohz_full" kernel boot parameter.
>>>>
>>>> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
>>>> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
>>>> :
>>>> [ 0.080945] Call Trace:
>>>> [ 0.080947] <TASK>
>>>> [ 0.080949] memblock_phys_free+0xcb/0x100
>>>> [ 0.080953] housekeeping_init+0x14c/0x170
>>>> [ 0.080957] start_kernel+0x207/0x450
>>>> [ 0.080961] x86_64_start_reservations+0x24/0x30
>>>> [ 0.080965] x86_64_start_kernel+0xda/0xe0
>>>> [ 0.080967] common_startup_64+0x13e/0x141
>>>> [ 0.080972] </TASK>
>>>>
>>>> The commit states that freeing of reserved memory before the memory
>>>> map is fully initialized in deferred_init_memmap() would cause access
>>>> to uninitialized struct pages and may crash when accessing spurious
>>>> list pointers. However, if the memblock_free() call is deferred to
>>>> the start of initcall processing in the bootup process, for instance,
>>>> the following KASAN warning can appear.
>>>>
>>>> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
>>>> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
>>>> :
>>>> [ 8.514775] Call Trace:
>>>> [ 8.514775] <TASK>
>>>> [ 8.514775] kasan_report+0xb2/0x1b0
>>>> [ 8.514775] memblock_isolate_range+0x4ac/0x650
>>>> [ 8.514775] memblock_phys_free+0xc4/0x190
>>>> [ 8.514775] housekeeping_late_init+0x257/0x280
>>>> [ 8.514775] do_one_initcall+0xaa/0x470
>>>> [ 8.514775] do_initcalls+0x1b4/0x1f0
>>>> [ 8.514775] kernel_init_freeable+0x4b5/0x550
>>>> [ 8.514775] kernel_init+0x1c/0x150
>>>> [ 8.514775] ret_from_fork+0x5dc/0x8e0
>>>> [ 8.514775] ret_from_fork_asm+0x1a/0x30
>>>> [ 8.514775] </TASK>
>>>>
>>>> It is likely that memblock_discard() may discard memblock data needed
>>>> for memblock_free(). One workaround for now to avoid these warning/bug
>>>> messages is to keep the memblock allocated cpumasks even if they are
>>>> no longer needed until the memblock subsystem is properly updated to
>>>> handle memblock_free().
>>>>
>>>> On most systems, memory occupied by a cpumask is pretty small. So not
>>>> much memory will be wasted if the memblock cpumasks are not freed.
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>> kernel/sched/isolation.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>>>> index ef152d401fe2..ad9b1a1104e3 100644
>>>> --- a/kernel/sched/isolation.c
>>>> +++ b/kernel/sched/isolation.c
>>>> @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
>>>> WARN_ON_ONCE(cpumask_empty(omask));
>>>> cpumask_copy(nmask, omask);
>>>> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
>>>> - memblock_free(omask, cpumask_size());
>>>> +
>>>> + /*
>>>> + * TODO: Don't free memblock allocated cpumasks until the
>>>> + * memblock subsystem is able to handle the memblock_free()
>>>> + * properly.
>>>> + */
>>>> + // memblock_free(omask, cpumask_size());
>>> Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
>>> after memblock moves all the memory to buddy, so this would only update
>>> memblock.reserved.
>>>
>>> The comment a few lines above says that we reallocate to be able to kfree()
>>> later. Is it possible to delay reallocation until an initcall?
>> My original thought was to defer the freeing to init call. That changes led
>> to the KASAN bug splat listed in the commit log, I think the right window to
>> free memblock memory is currently just too narrow. Do you mean that with the
>> fix patch you sent to Breno, memblock freeing in initcall will work without
>> bug report?
> Yes, with the fix I sent to Breno memblock_free() should work in an
> initcall and "do the right thing".
Thanks for the confirmation. I have tested your patch with my patch to
defer the memblock_free() to initcall. There is no longer any KASAN
splat when booting up a debug test kernel. You can add the following tag
when you send out your patch.
Tested-by: Waiman Long <longman@redhat.com>
Le Mon, May 11, 2026 at 05:36:08PM -0400, Waiman Long a écrit :
> On 5/11/26 4:34 AM, Mike Rapoport wrote:
> > On Mon, May 11, 2026 at 12:55:39AM -0400, Waiman Long wrote:
> > > On 5/10/26 11:02 AM, Mike Rapoport wrote:
> > > > Hi Waiman,
> > > >
> > > > On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
> > > > > When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
> > > > > freeing reserved memory before memory map is initialized"), the following
> > > > > warning was hit when there was a "nohz_full" kernel boot parameter.
> > > > >
> > > > > [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
> > > > > [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> > > > > :
> > > > > [ 0.080945] Call Trace:
> > > > > [ 0.080947] <TASK>
> > > > > [ 0.080949] memblock_phys_free+0xcb/0x100
> > > > > [ 0.080953] housekeeping_init+0x14c/0x170
> > > > > [ 0.080957] start_kernel+0x207/0x450
> > > > > [ 0.080961] x86_64_start_reservations+0x24/0x30
> > > > > [ 0.080965] x86_64_start_kernel+0xda/0xe0
> > > > > [ 0.080967] common_startup_64+0x13e/0x141
> > > > > [ 0.080972] </TASK>
> > > > >
> > > > > The commit states that freeing of reserved memory before the memory
> > > > > map is fully initialized in deferred_init_memmap() would cause access
> > > > > to uninitialized struct pages and may crash when accessing spurious
> > > > > list pointers. However, if the memblock_free() call is deferred to
> > > > > the start of initcall processing in the bootup process, for instance,
> > > > > the following KASAN warning can appear.
> > > > >
> > > > > [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
> > > > > [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
> > > > > :
> > > > > [ 8.514775] Call Trace:
> > > > > [ 8.514775] <TASK>
> > > > > [ 8.514775] kasan_report+0xb2/0x1b0
> > > > > [ 8.514775] memblock_isolate_range+0x4ac/0x650
> > > > > [ 8.514775] memblock_phys_free+0xc4/0x190
> > > > > [ 8.514775] housekeeping_late_init+0x257/0x280
> > > > > [ 8.514775] do_one_initcall+0xaa/0x470
> > > > > [ 8.514775] do_initcalls+0x1b4/0x1f0
> > > > > [ 8.514775] kernel_init_freeable+0x4b5/0x550
> > > > > [ 8.514775] kernel_init+0x1c/0x150
> > > > > [ 8.514775] ret_from_fork+0x5dc/0x8e0
> > > > > [ 8.514775] ret_from_fork_asm+0x1a/0x30
> > > > > [ 8.514775] </TASK>
> > > > >
> > > > > It is likely that memblock_discard() may discard memblock data needed
> > > > > for memblock_free(). One workaround for now to avoid these warning/bug
> > > > > messages is to keep the memblock allocated cpumasks even if they are
> > > > > no longer needed until the memblock subsystem is properly updated to
> > > > > handle memblock_free().
> > > > >
> > > > > > On most systems, memory occupied by a cpumask is pretty small. So not
> > > > > much memory will be wasted if the memblock cpumasks are not freed.
> > > > >
> > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > ---
> > > > > kernel/sched/isolation.c | 8 +++++++-
> > > > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > > > > index ef152d401fe2..ad9b1a1104e3 100644
> > > > > --- a/kernel/sched/isolation.c
> > > > > +++ b/kernel/sched/isolation.c
> > > > > @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
> > > > > WARN_ON_ONCE(cpumask_empty(omask));
> > > > > cpumask_copy(nmask, omask);
> > > > > RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> > > > > - memblock_free(omask, cpumask_size());
> > > > > +
> > > > > + /*
> > > > > + * TODO: Don't free memblock allocated cpumasks until the
> > > > > > + * memblock subsystem is able to handle the memblock_free()
> > > > > + * properly.
> > > > > + */
> > > > > + // memblock_free(omask, cpumask_size());
> > > > Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
> > > > after memblock moves all the memory to buddy, so this would only update
> > > > memblock.reserved.
> > > >
> > > > The comment a few lines above says that we reallocate to be able to kfree()
> > > > later. Is it possible to delay reallocation until an initcall?
> > > My original thought was to defer the freeing to init call. That changes led
> > > to the KASAN bug splat listed in the commit log, I think the right window to
> > > free memblock memory is currently just too narrow. Do you mean that with the
> > > fix patch you sent to Breno, memblock freeing in initcall will work without
> > > bug report?
> > Yes, with the fix I sent to Breno memblock_free() should work in an
> > initcall and "do the right thing".
>
> Thanks for the confirmation. I have tested your patch with my patch to defer
> the memblock_free() to initcall. There is no longer any KASAN splat when
> booting up a debug test kernel. You can add the following tag when you send
> out your patch.
>
> Tested-by: Waiman Long <longman@redhat.com>
Thanks a lot guys!
--
Frederic Weisbecker
SUSE Labs
On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
> One workaround for now to avoid these warning/bug
> messages is to keep the memblock allocated cpumasks even if they are
> no longer needed until the memblock subsystem is properly updated to
> handle memblock_free().
We just hit the same KASAN UAF from a different caller on a v7.1-rc3 boot,
which I think reinforces that the fix really needs to be in memblock rather
than in each subsystem.
In our case the offender is the IMA kexec buffer release path:
[ 113.498542] BUG: KASAN: use-after-free in memblock_isolate_range+0x208/0x8f0
[ 113.514206] Read of size 8 at addr ff11001824ba4000 by task swapper/0/1
...
[ 113.532258] memblock_isolate_range+0x208/0x8f0
[ 113.532267] memblock_phys_free+0x5f/0x300
[ 113.532274] ima_free_kexec_buffer+0x1d/0x40
[ 113.532280] ima_load_kexec_buffer+0xbf/0xf0
[ 113.532285] ima_init+0x42/0xa0
[ 113.532287] init_ima+0x5e/0x190
[ 113.532290] security_initcall_late+0xad/0x210
[ 113.532301] do_one_initcall+0x138/0x540
Same shape as your second trace: memblock_phys_free() reads
memblock.reserved.regions, which memblock_discard() has already returned
to the buddy allocator (the KASAN shadow shows the page as fully poisoned,
and pfn 0x1824ba4 has been reallocated). It then page-faults a moment later
on the same address.
ima_init runs as a security_initcall_late, so by the time
ima_free_kexec_buffer() calls memblock_phys_free() on the previous
kernel's measurement buffer, memblock has long been torn down on
configurations without CONFIG_ARCH_KEEP_MEMBLOCK.
This regression seems to come from commit 87ce9e83ab8b ("memblock, treewide: make
memblock_free() handle late freeing"), which dropped memblock_free_late()
and made memblock_phys_free() unconditionally call
memblock_remove_range(&memblock.reserved, ...) followed by an optional
__free_reserved_area().
Hi Breno,
On Fri, May 08, 2026 at 07:19:06AM -0700, Breno Leitao wrote:
> On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
> > One workaround for now to avoid these warning/bug
> > messages is to keep the memblock allocated cpumasks even if they are
> > no longer needed until the memblock subsystem is properly updated to
> > handle memblock_free().
>
> We just hit the same KASAN UAF from a different caller on a v7.1-rc3 boot,
> which I think reinforces that the fix really needs to be in memblock rather
> than in each subsystem.
>
> In our case the offender is the IMA kexec buffer release path:
>
> [ 113.498542] BUG: KASAN: use-after-free in memblock_isolate_range+0x208/0x8f0
> [ 113.514206] Read of size 8 at addr ff11001824ba4000 by task swapper/0/1
> ...
> [ 113.532258] memblock_isolate_range+0x208/0x8f0
> [ 113.532267] memblock_phys_free+0x5f/0x300
> [ 113.532274] ima_free_kexec_buffer+0x1d/0x40
> [ 113.532280] ima_load_kexec_buffer+0xbf/0xf0
> [ 113.532285] ima_init+0x42/0xa0
> [ 113.532287] init_ima+0x5e/0x190
> [ 113.532290] security_initcall_late+0xad/0x210
> [ 113.532301] do_one_initcall+0x138/0x540
>
> Same shape as your second trace: memblock_phys_free() reads
> memblock.reserved.regions, which memblock_discard() has already returned
> to the buddy allocator (the KASAN shadow shows the page as fully poisoned,
> and pfn 0x1824ba4 has been reallocated). It then page-faults a moment later
> on the same address.
>
> ima_init runs as a security_initcall_late, so by the time
> ima_free_kexec_buffer() calls memblock_phys_free() on the previous
> kernel's measurement buffer, memblock has long been torn down on
> configurations without CONFIG_ARCH_KEEP_MEMBLOCK
>
> This regression seems to come from commit 87ce9e83ab8b ("memblock, treewide: make
> memblock_free() handle late freeing"), which dropped memblock_free_late()
> and made memblock_phys_free() unconditionally call
> memblock_remove_range(&memblock.reserved, ...) followed by an optional
> __free_reserved_area().
Oops, somehow I overlooked that late freeing can't access memblock arrays :(
Can you please test this fix:
diff --git a/mm/memblock.c b/mm/memblock.c
index a6a1c91e276d..ccd43f3abb82 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -989,13 +989,15 @@ void __init_memblock memblock_free(void *ptr, size_t size)
int __init_memblock memblock_phys_free(phys_addr_t base, phys_addr_t size)
{
phys_addr_t end = base + size - 1;
- int ret;
+ int ret = 0;
memblock_dbg("%s: [%pa-%pa] %pS\n", __func__,
&base, &end, (void *)_RET_IP_);
kmemleak_free_part_phys(base, size);
- ret = memblock_remove_range(&memblock.reserved, base, size);
+
+ if (!slab_is_available() || IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
+ ret = memblock_remove_range(&memblock.reserved, base, size);
if (slab_is_available())
__free_reserved_area(base, base + size, -1);
--
Sincerely yours,
Mike.
On 05/05/26 01:18, Waiman Long wrote:
> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
> freeing reserved memory before memory map is initialized"), the following
> warning was hit when there was a "nohz_full" kernel boot parameter.
>
> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
> :
> [ 0.080945] Call Trace:
> [ 0.080947] <TASK>
> [ 0.080949] memblock_phys_free+0xcb/0x100
> [ 0.080953] housekeeping_init+0x14c/0x170
> [ 0.080957] start_kernel+0x207/0x450
> [ 0.080961] x86_64_start_reservations+0x24/0x30
> [ 0.080965] x86_64_start_kernel+0xda/0xe0
> [ 0.080967] common_startup_64+0x13e/0x141
> [ 0.080972] </TASK>
>
> The commit states that freeing of reserved memory before the memory
> map is fully initialized in deferred_init_memmap() would cause access
> to uninitialized struct pages and may crash when accessing spurious
> list pointers. However, if the memblock_free() call is deferred to
> the start of initcall processing in the bootup process, for instance,
> the following KASAN warning can appear.
>
> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
> :
> [ 8.514775] Call Trace:
> [ 8.514775] <TASK>
> [ 8.514775] kasan_report+0xb2/0x1b0
> [ 8.514775] memblock_isolate_range+0x4ac/0x650
> [ 8.514775] memblock_phys_free+0xc4/0x190
> [ 8.514775] housekeeping_late_init+0x257/0x280
> [ 8.514775] do_one_initcall+0xaa/0x470
> [ 8.514775] do_initcalls+0x1b4/0x1f0
> [ 8.514775] kernel_init_freeable+0x4b5/0x550
> [ 8.514775] kernel_init+0x1c/0x150
> [ 8.514775] ret_from_fork+0x5dc/0x8e0
> [ 8.514775] ret_from_fork_asm+0x1a/0x30
> [ 8.514775] </TASK>
>
Darn, I just saw the previous version doing this.
> It is likely that memblock_discard() may discard memblock data needed
> for memblock_free(). One workaround for now to avoid these warning/bug
> messages is to keep the memblock allocated cpumasks even if they are
> no longer needed until the memblock subsystem is properly updated to
> handle memblock_free().
Pardon my ignorance, but how come this isn't the case for the other
memblock users? It sounds like there is no right place for freeing this
mask.
On 5/6/26 9:25 AM, Valentin Schneider wrote:
> On 05/05/26 01:18, Waiman Long wrote:
>> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
>> freeing reserved memory before memory map is initialized"), the following
>> warning was hit when there was a "nohz_full" kernel boot parameter.
>>
>> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
>> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
>> :
>> [ 0.080945] Call Trace:
>> [ 0.080947] <TASK>
>> [ 0.080949] memblock_phys_free+0xcb/0x100
>> [ 0.080953] housekeeping_init+0x14c/0x170
>> [ 0.080957] start_kernel+0x207/0x450
>> [ 0.080961] x86_64_start_reservations+0x24/0x30
>> [ 0.080965] x86_64_start_kernel+0xda/0xe0
>> [ 0.080967] common_startup_64+0x13e/0x141
>> [ 0.080972] </TASK>
>>
>> The commit states that freeing of reserved memory before the memory
>> map is fully initialized in deferred_init_memmap() would cause access
>> to uninitialized struct pages and may crash when accessing spurious
>> list pointers. However, if the memblock_free() call is deferred to
>> the start of initcall processing in the bootup process, for instance,
>> the following KASAN warning can appear.
>>
>> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
>> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
>> :
>> [ 8.514775] Call Trace:
>> [ 8.514775] <TASK>
>> [ 8.514775] kasan_report+0xb2/0x1b0
>> [ 8.514775] memblock_isolate_range+0x4ac/0x650
>> [ 8.514775] memblock_phys_free+0xc4/0x190
>> [ 8.514775] housekeeping_late_init+0x257/0x280
>> [ 8.514775] do_one_initcall+0xaa/0x470
>> [ 8.514775] do_initcalls+0x1b4/0x1f0
>> [ 8.514775] kernel_init_freeable+0x4b5/0x550
>> [ 8.514775] kernel_init+0x1c/0x150
>> [ 8.514775] ret_from_fork+0x5dc/0x8e0
>> [ 8.514775] ret_from_fork_asm+0x1a/0x30
>> [ 8.514775] </TASK>
>>
> Darn, I just saw the previous version doing this.
>
>> It is likely that memblock_discard() may discard memblock data needed
>> for memblock_free(). One workaround for now to avoid these warning/bug
>> messages is to keep the memblock allocated cpumasks even if they are
>> no longer needed until the memblock subsystem is properly updated to
>> handle memblock_free().
> Pardon my ignorance, but how come this isn't the case for the other
> memblock users? It sounds like there is no right place for freeing this
> mask.
My current thought is to chain all the memblock memory blocks to be
freed in a singly linked list first and then have the memblock code free
them at the right moment. That will require some more investigation
into the memblock code. This patch is just a temporary workaround which
I hope will be reverted in the future.
Cheers,
Longman