Hi all,

This patchset moves MIGRATE_ISOLATE to a standalone bit so that a
pageblock's migratetype is not overwritten during the pageblock isolation
process. Currently, MIGRATE_ISOLATE is part of enum migratetype (in
include/linux/mmzone.h), so setting a pageblock to MIGRATE_ISOLATE
overwrites its original migratetype. This causes pageblock migratetype
loss during alloc_contig_range() and memory offline, especially when the
process fails due to a failed pageblock isolation and the code tries to
undo the finished pageblock isolations.

It is on top of mm-everything-2025-06-15-23-48.

In terms of performance for changing pageblock types, no performance
change is observed:

1. I used perf to collect stats of offlining and onlining all memory of a
   40GB VM 10 times and see that get_pfnblock_flags_mask() and
   set_pfnblock_flags_mask() take about 0.12% and 0.02% of the whole
   process respectively, both with and without this patchset, across 3
   runs.

2. I used perf to collect stats of dd from /dev/random to a 40GB tmpfs
   file and find get_pfnblock_flags_mask() takes about 0.05% of the
   process, both with and without this patchset, across 3 runs.

Changelog
===
From V8[7]:
1. made init_pageblock_migratetype() set the right migratetype when
   page_group_by_mobility_disabled is 1.

From V7[6]:
1. restored acr_flags_t and renamed ACR_OTHER to ACR_NONE.

From V6[5]:
1. used MIGRATETYPE_AND_ISO_MASK in init_pageblock_migratetype() too.
2. fixed an indentation issue in Patch 3.
3. removed acr_flags_t and used enum pb_isolate_mode instead in
   alloc_contig_range().
4. collected review tags.

From V5[4]:
1. used atomic version bitops for pageblock standalone bit operations.
2. added a helper function for standalone bit check.
3. renamed PB_migrate_skip to PB_compact_skip.
4. used #define MIGRATETYPE_AND_ISO_MASK MIGRATETYPE_MASK to simplify
   !CONFIG_MEMORY_ISOLATION code.
5. added __MIGRATE_TYPE_END to make sure migratetypes can be stored in
   PB_migratetype_bits.
6. used set and clear to implement toggle_pageblock_isolate() and added
   VM_WARN_ONCE in __move_freepages_block_isolate() to warn about
   isolating an already isolated pageblock and unisolating a pageblock
   that is not isolated.
7. dropped toggle_pfnblock_bit().
8. made acr_flags_t an enum and added ACR_OTHER for non-CMA allocation.
9. renamed pb_isolate_mode items to have the PB_ISOLATE_MODE prefix.
10. collected reviewed-by.

From v4[3]:
1. cleaned up existing pageblock flag functions:
   a. added {get,set}_{pfnblock,pageblock}_migratetype() to change
      pageblock migratetype
   b. added {get,set,clear}_pfnblock_bit() to change pageblock
      standalone bits, i.e., PB_migrate_skip and PB_migrate_isolate
      (added in this series).
   c. removed {get,set}_pfnblock_flags_mask().
2. added __NR_PAGEBLOCK_BITS to represent the number of pageblock flag
   bits and used roundup_pow_of_two(__NR_PAGEBLOCK_BITS) as
   NR_PAGEBLOCK_BITS.
3. moved {get,set,clear}_pageblock_isolate() to linux/page-isolation.h.
4. added init_pageblock_migratetype() to initialize a pageblock with a
   migratetype and, optionally, as isolated. It is used by
   memmap_init_range(), which is called by move_pfn_range_to_zone() in
   online_pages() from mm/memory_hotplug.c. Other
   set_pageblock_migratetype() users are changed too, except the ones in
   mm/page_alloc.c.
5. reimplemented toggle_pageblock_isolate() using __change_bit().
6. made set_pageblock_migratetype() give a warning if a pageblock is
   changed from MIGRATE_ISOLATE to another migratetype.
7. added pb_isolate_mode: MEMORY_OFFLINE, CMA_ALLOCATION,
   ISOLATE_MODE_OTHERS to replace isolate flags.
8. removed REPORT_FAILURE, since it is only used by MEMORY_OFFLINE.

From v3[2]:
1. kept the original is_migrate_isolate_page().
2. moved {get,set,clear}_pageblock_isolate() to mm/page_isolation.c.
3. used a single version for get_pageblock_migratetype() and
   get_pfnblock_migratetype().
4. replaced get_pageblock_isolate() with
   get_pageblock_migratetype() == MIGRATE_ISOLATE, and
   get_pageblock_isolate() becomes private in mm/page_isolation.c.
5. made set_pageblock_migratetype() not accept MIGRATE_ISOLATE, so that
   people need to use the dedicated {get,set,clear}_pageblock_isolate()
   APIs.
6. changed online_pages() from mm/memory_hotplug.c to first set pageblock
   migratetype to MIGRATE_MOVABLE, then isolate pageblocks.
7. added __maybe_unused to get_pageblock_isolate(), since it is only used
   in VM_BUG_ON(), which might be compiled out when MM debug is off. It
   was reported by the kernel test robot.
8. fixed test_pages_isolated() type issues reported by the kernel test
   robot.

From v2[1]:
1. Moved MIGRATETYPE_NO_ISO_MASK to Patch 2, where it is used.
2. Removed spurious changes in Patch 1.
3. Refactored code so that the migratetype mask is passed properly for
   all callers to {get,set}_pfnblock_flags_mask().
4. Added toggle_pageblock_isolate() for setting and clearing
   MIGRATE_ISOLATE.
5. Changed get_pageblock_migratetype() when CONFIG_MEMORY_ISOLATION is
   enabled to handle the MIGRATE_ISOLATE case. It acts like a parsing
   layer for get_pfnblock_flags_mask().

Design
===

Pageblock flags are read in words to achieve good performance, and the
existing pageblock flags take 4 bits per pageblock. To avoid a
substantial change to the pageblock flag code, 8 pageblock flag bits are
used.

It might look like the pageblock flags have doubled the overhead, but in
reality the overhead is only 1 byte per 2MB/4MB pageblock (based on
pageblock config), i.e. about 0.0000476% of memory for 2MB pageblocks and
half of that for 4MB pageblocks.

Any comment and/or suggestion is welcome. Thanks.
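As a rough illustration of the design above, here is a minimal userspace sketch (not kernel code) of one byte of flags per pageblock in which isolation is a standalone bit next to the migratetype field. All names prefixed with sketch_ or SKETCH_ are hypothetical, and the bit positions are assumptions chosen for the example; they are not taken from the patches. The last helper reproduces the overhead figure quoted in the Design section under the stated assumption of one flag byte per pageblock.

```c
/*
 * Userspace sketch only. The names and bit positions below are assumptions
 * made for illustration; they do not claim to match the kernel's layout.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of migratetypes, small enough to fit in 3 bits. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
};

#define SKETCH_MIGRATETYPE_MASK 0x07u      /* assumed: low 3 bits = migratetype */
#define SKETCH_ISOLATE_BIT      (1u << 7)  /* assumed: standalone isolate bit */

/* One byte of flags per pageblock, as in the 8-bit design. */
typedef uint8_t sketch_pb_flags_t;

/* Update only the migratetype field; the isolate bit is left untouched. */
static inline void sketch_set_migratetype(sketch_pb_flags_t *pb,
					  enum sketch_migratetype mt)
{
	assert(((unsigned int)mt & ~SKETCH_MIGRATETYPE_MASK) == 0);
	*pb = (sketch_pb_flags_t)((*pb & ~SKETCH_MIGRATETYPE_MASK) |
				  (unsigned int)mt);
}

static inline enum sketch_migratetype
sketch_get_migratetype(const sketch_pb_flags_t *pb)
{
	return (enum sketch_migratetype)(*pb & SKETCH_MIGRATETYPE_MASK);
}

/* Isolation only flips its own bit, so the migratetype survives. */
static inline void sketch_set_isolate(sketch_pb_flags_t *pb)
{
	*pb = (sketch_pb_flags_t)(*pb | SKETCH_ISOLATE_BIT);
}

static inline void sketch_clear_isolate(sketch_pb_flags_t *pb)
{
	*pb = (sketch_pb_flags_t)(*pb & ~SKETCH_ISOLATE_BIT);
}

static inline bool sketch_is_isolated(const sketch_pb_flags_t *pb)
{
	return (*pb & SKETCH_ISOLATE_BIT) != 0;
}

/* Flag overhead in percent, assuming 1 flag byte per pageblock. */
static inline double sketch_flag_overhead_percent(unsigned long pageblock_bytes)
{
	return 100.0 / (double)pageblock_bytes;
}
```

With this layout, setting a block to SKETCH_MOVABLE, calling sketch_set_isolate() and later sketch_clear_isolate() leaves sketch_get_migratetype() returning SKETCH_MOVABLE, which is the property that undoing a failed isolation relies on. When the isolation state is stored as a value of the migratetype field itself, the original type has to be remembered elsewhere or guessed on undo.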

[1] https://lore.kernel.org/linux-mm/20250214154215.717537-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20250507211059.2211628-2-ziy@nvidia.com/
[3] https://lore.kernel.org/linux-mm/20250509200111.3372279-1-ziy@nvidia.com/
[4] https://lore.kernel.org/linux-mm/20250523191258.339826-1-ziy@nvidia.com/
[5] https://lore.kernel.org/linux-mm/20250530162227.715551-1-ziy@nvidia.com/
[6] https://lore.kernel.org/linux-mm/20250602151807.987731-1-ziy@nvidia.com/
[7] https://lore.kernel.org/linux-mm/20250602235247.1219983-1-ziy@nvidia.com/

Zi Yan (6):
  mm/page_alloc: pageblock flags functions clean up.
  mm/page_isolation: make page isolation a standalone bit.
  mm/page_alloc: add support for initializing pageblock as isolated.
  mm/page_isolation: remove migratetype from
    move_freepages_block_isolate()
  mm/page_isolation: remove migratetype from undo_isolate_page_range()
  mm/page_isolation: remove migratetype parameter from more functions.

 Documentation/mm/physical_memory.rst |   2 +-
 drivers/virtio/virtio_mem.c          |   2 +-
 include/linux/gfp.h                  |   7 +-
 include/linux/memory_hotplug.h       |   3 +-
 include/linux/mmzone.h               |  21 +-
 include/linux/page-isolation.h       |  47 +++-
 include/linux/pageblock-flags.h      |  48 ++--
 include/trace/events/kmem.h          |  14 +-
 mm/cma.c                             |   2 +-
 mm/hugetlb.c                         |   4 +-
 mm/internal.h                        |   3 +-
 mm/memory_hotplug.c                  |  24 +-
 mm/memremap.c                        |   2 +-
 mm/mm_init.c                         |  24 +-
 mm/page_alloc.c                      | 321 +++++++++++++++++++++------
 mm/page_isolation.c                  | 100 ++++-----
 16 files changed, 433 insertions(+), 191 deletions(-)

-- 
2.47.2

On Mon, 16 Jun 2025 08:10:13 -0400 Zi Yan <ziy@nvidia.com> wrote:

> Hi all,
>
> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
> being overwritten during pageblock isolation process. Currently,
> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
> migratetype. This causes pageblock migratetype loss during
> alloc_contig_range() and memory offline, especially when the process
> fails due to a failed pageblock isolation and the code tries to undo the
> finished pageblock isolations.
>
> It is on top of mm-everything-2025-06-15-23-48.

mm-new would be a better target. mm-new is not (yet) included in
linux-next, hence it is not in mm-everything.

I hit a few issues (x86_64 allmodconfig):

In file included from ./include/linux/slab.h:16,
                 from ./include/linux/irq.h:21,
                 from ./include/linux/of_irq.h:7,
                 from drivers/gpu/drm/msm/hdmi/hdmi.c:9:
./include/linux/gfp.h:428:25: error: expected identifier before '(' token
  428 | #define ACR_NONE        ((__force acr_flags_t)0) // ordinary allocation request
      |                         ^
drivers/gpu/drm/msm/generated/hdmi.xml.h:71:9: note: in expansion of macro 'ACR_NONE'
   71 |         ACR_NONE = 0,
      |         ^~~~~~~~

And this was needed:

kernel/kexec_handover.c uses set_pageblock_migratetype()

--- a/include/linux/page-isolation.h~mm-page_isolation-remove-migratetype-from-move_freepages_block_isolate-fix
+++ a/include/linux/page-isolation.h
@@ -45,6 +45,8 @@ void __meminit init_pageblock_migratetyp
 					enum migratetype migratetype,
 					bool isolate);
 
+void set_pageblock_migratetype(struct page *page, enum migratetype migratetype);
+
 bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
 bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
 
--- a/mm/page_alloc.c~mm-page_isolation-remove-migratetype-from-move_freepages_block_isolate-fix
+++ a/mm/page_alloc.c
@@ -525,8 +525,7 @@ void clear_pfnblock_bit(const struct pag
  * @page: The page within the block of interest
  * @migratetype: migratetype to set
  */
-static void set_pageblock_migratetype(struct page *page,
-				      enum migratetype migratetype)
+void set_pageblock_migratetype(struct page *page, enum migratetype migratetype)
 {
 	if (unlikely(page_group_by_mobility_disabled &&
 		     migratetype < MIGRATE_PCPTYPES))
_

On 16 Jun 2025, at 21:37, Andrew Morton wrote:

> On Mon, 16 Jun 2025 08:10:13 -0400 Zi Yan <ziy@nvidia.com> wrote:
>
>> Hi all,
>>
>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>> being overwritten during pageblock isolation process. Currently,
>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>> migratetype. This causes pageblock migratetype loss during
>> alloc_contig_range() and memory offline, especially when the process
>> fails due to a failed pageblock isolation and the code tries to undo the
>> finished pageblock isolations.
>>
>> It is on top of mm-everything-2025-06-15-23-48.
>
> mm-new would be a better target. mm-new is not (yet) included in
> linux-next, hence it is not in mm-everything.
>
> I hit a few issues (x86_64 allmodconfig):
>
> In file included from ./include/linux/slab.h:16,
>                  from ./include/linux/irq.h:21,
>                  from ./include/linux/of_irq.h:7,
>                  from drivers/gpu/drm/msm/hdmi/hdmi.c:9:
> ./include/linux/gfp.h:428:25: error: expected identifier before '(' token
>   428 | #define ACR_NONE        ((__force acr_flags_t)0) // ordinary allocation request
>       |                         ^
> drivers/gpu/drm/msm/generated/hdmi.xml.h:71:9: note: in expansion of macro 'ACR_NONE'
>    71 |         ACR_NONE = 0,
>       |         ^~~~~~~~
>
> And this was needed:
>
> kernel/kexec_handover.c uses set_pageblock_migratetype()
>
> --- a/include/linux/page-isolation.h~mm-page_isolation-remove-migratetype-from-move_freepages_block_isolate-fix
> +++ a/include/linux/page-isolation.h
> @@ -45,6 +45,8 @@ void __meminit init_pageblock_migratetyp
>  					enum migratetype migratetype,
>  					bool isolate);
>
> +void set_pageblock_migratetype(struct page *page, enum migratetype migratetype);
> +
>  bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
>  bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>
> --- a/mm/page_alloc.c~mm-page_isolation-remove-migratetype-from-move_freepages_block_isolate-fix
> +++ a/mm/page_alloc.c
> @@ -525,8 +525,7 @@ void clear_pfnblock_bit(const struct pag
>   * @page: The page within the block of interest
>   * @migratetype: migratetype to set
>   */
> -static void set_pageblock_migratetype(struct page *page,
> -				      enum migratetype migratetype)
> +void set_pageblock_migratetype(struct page *page, enum migratetype migratetype)
>  {
>  	if (unlikely(page_group_by_mobility_disabled &&
>  		     migratetype < MIGRATE_PCPTYPES))
> _

Got it. Let me rebase and resend it.

Best Regards,
Yan, Zi