memmap pages can be allocated either from the memblock (boot) allocator
during early boot or from the buddy allocator.
When these memmap pages are removed via arch_remove_memory(), the
deallocation path depends on their source:
* For pages from the buddy allocator, depopulate_section_memmap() is
called, which should decrement the count of nr_memmap_pages.
* For pages from the boot allocator, free_map_bootmem() is called, which
  should decrement the count of nr_memmap_boot_pages.
Ensure correct tracking of memmap pages for both early and non-early
sections by adjusting the accounting in section_deactivate().
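To make the arithmetic concrete, here is a minimal, standalone userspace
model of the delta both branches subtract (not kernel code; the 4 KiB
PAGE_SIZE and 64-byte struct page size are assumptions for x86-64):

	#include <stdio.h>

	#define PAGE_SIZE        4096UL  /* assumed 4 KiB base pages */
	#define STRUCT_PAGE_SIZE 64UL    /* assumed sizeof(struct page) */
	#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

	/* Pages of memmap metadata backing nr_pages PFNs, negated for removal. */
	static long memmap_delta(unsigned long nr_pages)
	{
		return -1L * DIV_ROUND_UP(nr_pages * STRUCT_PAGE_SIZE, PAGE_SIZE);
	}

	int main(void)
	{
		/* One 128 MiB x86-64 section: 32768 PFNs -> 512 memmap pages. */
		printf("delta for one section: %ld\n", memmap_delta(32768));
		return 0;
	}

With these assumptions, removing one section subtracts 512 pages from
nr_memmap_pages (buddy-allocated memmap) or nr_memmap_boot_pages
(boot-allocated memmap), matching the DIV_ROUND_UP() expressions in the
patch below.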
Cc: stable@vger.kernel.org
Fixes: 15995a352474 ("mm: report per-page metadata information")
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
---
v2: consider accounting for !CONFIG_SPARSEMEM_VMEMMAP.
mm/sparse.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/mm/sparse.c b/mm/sparse.c
index 3c012cf83cc2..b9cc9e548f80 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -680,7 +680,6 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
 
-	memmap_pages_add(-1L * (DIV_ROUND_UP(end - start, PAGE_SIZE)));
 	vmemmap_free(start, end, altmap);
 }
 static void free_map_bootmem(struct page *memmap)
@@ -856,10 +855,14 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	 * The memmap of early sections is always fully populated. See
 	 * section_activate() and pfn_valid() .
 	 */
-	if (!section_is_early)
+	if (!section_is_early) {
+		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
 		depopulate_section_memmap(pfn, nr_pages, altmap);
-	else if (memmap)
+	} else if (memmap) {
+		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
+							  PAGE_SIZE)));
 		free_map_bootmem(memmap);
+	}
 
 	if (empty)
 		ms->section_mem_map = (unsigned long)NULL;
--
2.48.1
On 04.08.25 17:13, Sumanth Korikkar wrote:
> [full patch quoted; trimmed]

Acked-by: David Hildenbrand <david@redhat.com>

Hopefully we're not missing anything important.

--
Cheers,

David / dhildenb
On Mon, Aug 04, 2025 at 05:13:27PM +0200, Sumanth Korikkar wrote:
[full patch quoted; trimmed to the hunk commented on]
>-	if (!section_is_early)
>+	if (!section_is_early) {
>+		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
> 		depopulate_section_memmap(pfn, nr_pages, altmap);
>-	else if (memmap)
>+	} else if (memmap) {
>+		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
>+							  PAGE_SIZE)));
> 		free_map_bootmem(memmap);
>+	}

The change here is reasonable. But maybe we still miss the counting at
some other points. For example:

a. sparse_init_nid()
     __populate_section_memmap()

   If !CONFIG_SPARSEMEM_VMEMMAP and sparse_buffer_alloc() returns NULL,
   extra memory is allocated from bootmem, which does not appear to be
   counted.

b. section_activate()
     populate_section_memmap()

   If !CONFIG_SPARSEMEM_VMEMMAP, it just calls kvmalloc_node(), which
   does not appear to be counted.

Did I miss something?

--
Wei Yang
Help you, Help me
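For clarity, this is the !CONFIG_SPARSEMEM_VMEMMAP allocation path Wei
points at in (b), with one possible placement of the missing accounting
sketched in. This is only a sketch of the idea under discussion, not a
final patch; the upstream helper allocates the whole section's memmap
with kvmalloc_node():

	struct page * __meminit populate_section_memmap(unsigned long pfn,
			unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
			struct dev_pagemap *pgmap)
	{
		struct page *page;

		page = kvmalloc_node(array_size(sizeof(struct page),
						PAGES_PER_SECTION), GFP_KERNEL, nid);
		if (page)	/* sketch: count only on successful allocation */
			memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page),
						      PAGE_SIZE));
		return page;
	}

The matching depopulate_section_memmap() would then subtract the same
amount before kvfree(), mirroring the section_deactivate() hunk above.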
> The change here is reasonable. But maybe we still miss the counting at
> some other points. For example:
>
> a. sparse_init_nid()
>      __populate_section_memmap()
>
>    If !CONFIG_SPARSEMEM_VMEMMAP and sparse_buffer_alloc() returns NULL,
>    extra memory is allocated from bootmem, which does not appear to be
>    counted.

Currently, the accounting is done upfront in sparse_buffer_init(), where
memmap_boot_pages_add() is called for !CONFIG_SPARSEMEM_VMEMMAP.

The function sparse_buffer_alloc() can return NULL in two scenarios:

* During sparse_buffer_init(), if memmap_alloc() fails, sparsemap_buf
  will be NULL.

* Inside sparse_buffer_alloc(), if ptr + size exceeds sparsemap_buf_end,
  then ptr is set to NULL.

Considering this, perhaps memmap_boot_pages_add() could be moved into
__populate_section_memmap(), with the accounting done only if the
operation is successful. What do you think?

> b. section_activate()
>      populate_section_memmap()
>
>    If !CONFIG_SPARSEMEM_VMEMMAP, it just calls kvmalloc_node(), which
>    does not appear to be counted.

Sounds right. This means an nr_memmap_pages adjustment is needed for
!CONFIG_SPARSEMEM_VMEMMAP here. I will recheck this.

Thank you
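A rough sketch of what that move could look like for the
!CONFIG_SPARSEMEM_VMEMMAP boot path (assuming section_map_size(),
sparse_buffer_alloc(), and memmap_alloc() behave as in current
mm/sparse.c; the placement is the open question, not settled code):

	struct page __init *__populate_section_memmap(unsigned long pfn,
			unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
			struct dev_pagemap *pgmap)
	{
		unsigned long size = section_map_size();
		struct page *map = sparse_buffer_alloc(size);

		if (!map)	/* buffer exhausted or never allocated: fall back */
			map = memmap_alloc(size, size, __pa(MAX_DMA_ADDRESS), nid,
					   false);
		if (map)	/* sketch: account only what actually succeeded */
			memmap_boot_pages_add(DIV_ROUND_UP(size, PAGE_SIZE));
		return map;
	}

This would replace the upfront accounting in sparse_buffer_init(), so the
counter only ever reflects allocations that actually succeeded (any
corresponding adjustment in sparse_buffer_fini() would need revisiting
too, to avoid counting the same pages twice).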
On Wed, Aug 06, 2025 at 02:46:43PM +0200, Sumanth Korikkar wrote:
[quote trimmed]
>Considering this, perhaps memmap_boot_pages_add() could be moved into
>__populate_section_memmap(), with the accounting done only if the
>operation is successful. What do you think?

Looks reasonable to me.

>Sounds right. This means an nr_memmap_pages adjustment is needed for
>!CONFIG_SPARSEMEM_VMEMMAP here. I will recheck this.
>
>Thank you

--
Wei Yang
Help you, Help me