Hi,

This is a continuation of the earlier post[1] that attempted to
convert migrations from NUMA Balancing to be async and batched.
In this version, per-node kernel threads are created to handle
migrations in an async manner.

This adds a few fields to the extended page flags that can be
used both by the sub-systems that request migrations and by kmigrated,
which migrates the pages. Some of the fields are defined for potential
use by a kpromoted-like subsystem to manage hot page metrics,
but are unused right now.

Currently only NUMA Balancing is changed to make use of the async
batched migration. It does so by recording the target NID and the
readiness of the page to be migrated in the extended page flags
fields.

Each kmigrated routinely scans its PFNs, identifies the pages
marked for migration and batch-migrates them. Unlike the previous
approach, the responsibility of isolating the pages is now with
kmigrated.

The major difference between this approach and the way kpromoted[2]
tracked hot pages is the elimination of heavy synchronization points
between producers (sub-systems that request migrations or report
a hot page) and the consumer (kmigrated or kpromoted).
Instead of tracking only the list of hot pages in an orthogonal
manner, this approach ties the hot page or migration information
to the struct page.

TODOs:

- Very lightly tested (only with NUMAB=1) and posted to get some
  feedback on the overall approach.
- Currently uses the flags field from the page extension sub-system.
  However, need to check if it is preferable to use/allocate a separate
  32-bit field exclusively for this purpose within the page extension
  sub-system or outside of it.
- Benefit of async batch migration still needs to be measured.
- Need to tune a few things like the number of pages to batch,
  the aggressiveness of the kthread, the kthread sleep interval etc.
- The logic to skip scanning of zones that don't have any pages
  marked for migration needs to be added.
- No separate kernel config is defined currently and the dependency on
  PAGE_EXTENSION isn't cleanly laid out. Some added definitions
  currently sit in page_ext.h, which may not be an ideal location for
  them.

[1] v0 - https://lore.kernel.org/linux-mm/20250521080238.209678-3-bharata@amd.com/
[2] kpromoted patchset - https://lore.kernel.org/linux-mm/20250306054532.221138-1-bharata@amd.com/

Bharata B Rao (3):
  mm: migrate: Allow misplaced migration without VMA too
  mm: kmigrated - Async kernel migration thread
  mm: sched: Batch-migrate misplaced pages

Gregory Price (1):
  migrate: implement migrate_misplaced_folios_batch

 include/linux/migrate.h  |   6 ++
 include/linux/mmzone.h   |   5 +
 include/linux/page_ext.h |  17 +++
 mm/Makefile              |   3 +-
 mm/kmigrated.c           | 223 +++++++++++++++++++++++++++++++++++++++
 mm/memory.c              |  30 +-----
 mm/migrate.c             |  36 ++++++-
 mm/mm_init.c             |   6 ++
 mm/page_ext.c            |  11 ++
 9 files changed, 309 insertions(+), 28 deletions(-)
 create mode 100644 mm/kmigrated.c

-- 
2.34.1
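The producer/consumer split described in the cover letter (a requester records the target NID and a "ready" marker in per-page flags, and a per-node thread later scans PFNs and collects a batch) can be modelled in userspace roughly as below. This is only an illustrative sketch: every name here (model_page_ext, MODEL_READY_BIT, the 10-bit NID width and so on) is hypothetical and not taken from the patches, and it is single-threaded, whereas real kernel code would presumably need atomic bit operations and page isolation before migrating.

```c
/*
 * Userspace model only. All identifiers and bit layouts are hypothetical
 * and are NOT the ones used by the kmigrated patches.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NID_BITS   10u                         /* assumed NID field width */
#define MODEL_NID_MASK   ((1u << MODEL_NID_BITS) - 1u)
#define MODEL_READY_BIT  (1u << MODEL_NID_BITS)      /* "ready to migrate" marker */

/* Stand-in for the per-page extended flags word. */
struct model_page_ext {
	uint32_t flags;
};

/* Producer side: record the target node and mark the page as ready. */
static void model_mark_for_migration(struct model_page_ext *ext,
				     unsigned int target_nid)
{
	assert(target_nid <= MODEL_NID_MASK);
	ext->flags &= ~(MODEL_NID_MASK | MODEL_READY_BIT);
	ext->flags |= target_nid | MODEL_READY_BIT;
}

/* Read back the target node recorded by the producer. */
static unsigned int model_target_nid(const struct model_page_ext *ext)
{
	return ext->flags & MODEL_NID_MASK;
}

/*
 * Consumer side: walk PFNs in [start_pfn, end_pfn), claim pages whose ready
 * bit is set, and collect up to max_batch of them. *next_pfn is set to where
 * a later scan should resume. Returns the number of PFNs placed in batch[].
 */
static size_t model_scan_batch(struct model_page_ext *exts,
			       size_t start_pfn, size_t end_pfn,
			       size_t *batch, size_t max_batch,
			       size_t *next_pfn)
{
	size_t n = 0;
	size_t pfn = start_pfn;

	for (; pfn < end_pfn && n < max_batch; pfn++) {
		if (exts[pfn].flags & MODEL_READY_BIT) {
			exts[pfn].flags &= ~MODEL_READY_BIT; /* claim the page */
			batch[n++] = pfn;
		}
	}
	*next_pfn = pfn;
	return n;
}
```

Note that the scan touches every PFN in the range even when only a few pages are marked, which is the cost that the zone-skipping TODO item is concerned with.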
Bharata B Rao <bharata@amd.com> writes:

> Hi,
>
> This is a continuation of the earlier post[1] that attempted to
> convert migrations from NUMA Balancing to be async and batched.
> In this version, per-node kernel threads are created to handle
> migrations in an async manner.
>
> This adds a few fields to the extended page flags that can be
> used both by the sub-systems that request migrations and kmigrated
> which migrates the pages. Some of the fields are potentially defined
> to be used by kpromoted-like subsystem to manage hot page metrics,
> but are unused right now.
>
> Currently only NUMA Balancing is changed to make use of the async
> batched migration. It does so by recording the target NID and the
> readiness of the page to be migrated in the extended page flags
> fields.
>
> Each kmigrated routinely scans its PFNs, identifies the pages
> marked for migration and batch-migrates them. Unlike the previous
> approach, the responsibility of isolating the pages is now with
> kmigrated.
>
> The major difference between this approach and the way kpromoted[2]
> tracked hot pages is the elimination of heavy synchronization points
> between producers (sub-systems that request migrations or report
> a hot page) and the consumer (kmigrated or kpromoted).
> Instead of tracking only the list of hot pages in an orthogonal
> manner, this approach ties the hot page or migration information
> to the struct page.

I don't think page flag + scanning is a good idea. If the
synchronization is really a problem for you (based on test results),
some per-CPU data structure can be used to record candidate pages.

[snip]

---
Best Regards,
Huang, Ying
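The per-CPU alternative suggested above is not spelled out in the thread. One possible shape, sketched here purely as an assumption, is a small fixed-size ring per CPU: each CPU appends candidate pages to its own ring, so producers do not contend on a shared lock, and the migration thread drains all rings into one batch. All names (model_ring, model_record, model_drain) and sizes are hypothetical. This is a single-threaded userspace model; a kernel version would need per-CPU accessors, preemption control and memory ordering between producer and consumer, none of which is shown.

```c
/*
 * Userspace model only. The ring layout, names and sizes are hypothetical
 * and do not come from the thread or from any kernel patch.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS    4
#define MODEL_RING_SIZE  8   /* power of two, so unsigned wrap-around is safe */

/* One migration candidate: which page, and where it should go. */
struct model_cand {
	unsigned long pfn;
	int target_nid;
};

/* One ring per CPU: only that CPU writes head, only the consumer writes tail. */
struct model_ring {
	struct model_cand slot[MODEL_RING_SIZE];
	unsigned int head;   /* next slot to write */
	unsigned int tail;   /* next slot to read */
};

static struct model_ring model_rings[MODEL_NR_CPUS];

/* Producer: record a candidate on the given CPU's ring; drop it if full. */
static bool model_record(int cpu, unsigned long pfn, int target_nid)
{
	struct model_ring *r;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	r = &model_rings[cpu];
	if (r->head - r->tail == MODEL_RING_SIZE)
		return false;                 /* ring full: candidate is dropped */
	r->slot[r->head % MODEL_RING_SIZE] =
		(struct model_cand){ pfn, target_nid };
	r->head++;
	return true;
}

/* Consumer: drain every CPU's ring, in CPU order, into out[] (up to max). */
static size_t model_drain(struct model_cand *out, size_t max)
{
	size_t n = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_ring *r = &model_rings[cpu];

		while (r->tail != r->head && n < max) {
			out[n++] = r->slot[r->tail % MODEL_RING_SIZE];
			r->tail++;
		}
	}
	return n;
}
```

In this model the consumer's work is proportional to the number of recorded candidates rather than to the number of PFNs on the node. The trade-off is that a full ring drops candidates, which a flag stored with the page does not.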
On 20-Jun-25 12:09 PM, Huang, Ying wrote:
> Bharata B Rao <bharata@amd.com> writes:
>
> <snip>
>
> I don't think page flag + scanning is a good idea. If the

If extended page flags are not the ideal location (I chose them in this
version only to get something going quickly), we can look at maintaining
a separate per-pfn allocation for the required hot page metadata.

Or is your concern specifically with scanning? What problems do you
see?

Is it the cost, or the possibility of not identifying the migrate-ready
pages in time? Or something else?

Regards,
Bharata.
Bharata B Rao <bharata@amd.com> writes:

> On 20-Jun-25 12:09 PM, Huang, Ying wrote:
>> Bharata B Rao <bharata@amd.com> writes:
>>
>> <snip>
>>
>> I don't think page flag + scanning is a good idea. If the
>
> If extended page flags are not the ideal location (I chose them in this
> version only to get something going quickly), we can look at maintaining
> a separate per-pfn allocation for the required hot page metadata.
>
> Or is your concern specifically with scanning? What problems do you
> see?
>
> Is it the cost, or the possibility of not identifying the migrate-ready
> pages in time? Or something else?

We may need to scan a large number of pages to identify a single page to
promote. This wastes CPU cycles and pollutes the cache.

---
Best Regards,
Huang, Ying