[RFC PATCH v1 0/4] Kernel thread based async batch migration

Bharata B Rao posted 4 patches 3 months, 3 weeks ago
[RFC PATCH v1 0/4] Kernel thread based async batch migration
Posted by Bharata B Rao 3 months, 3 weeks ago
Hi,

This is a continuation of the earlier post[1] that attempted to
convert migrations from NUMA Balancing to be async and batched.
In this version, per-node kernel threads are created to handle
migrations in an async manner.

This adds a few fields to the extended page flags that can be
used both by the sub-systems that request migrations and by kmigrated,
which migrates the pages. Some of the fields are defined for potential
use by a kpromoted-like subsystem to manage hot page metrics,
but are unused right now.

Currently only NUMA Balancing is changed to make use of the async
batched migration. It does so by recording the target NID and the
readiness of the page to be migrated in the extended page flags
fields.
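
As a rough illustration of the recording step described above, here is a minimal user-space sketch of packing a "ready" bit and a target NID into a 32-bit flags word. The bit layout, the macro names and the mig_*() helpers are assumptions made for this example; they are not the definitions used in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout (not taken from the patches):
 *   bit 0      - page is ready to be migrated
 *   bits 1..10 - target node ID
 * All other bits belong to other users of the flags word.
 */
#define MIG_READY_BIT  0x1u
#define MIG_NID_SHIFT  1
#define MIG_NID_BITS   10
#define MIG_NID_MASK   (((1u << MIG_NID_BITS) - 1) << MIG_NID_SHIFT)

/* Producer side: record the target node and mark the page ready. */
static inline uint32_t mig_mark(uint32_t flags, unsigned int nid)
{
	assert(nid < (1u << MIG_NID_BITS));
	flags &= ~MIG_NID_MASK;
	flags |= (uint32_t)nid << MIG_NID_SHIFT;
	return flags | MIG_READY_BIT;
}

/* Consumer side: check readiness and read back the target node. */
static inline bool mig_ready(uint32_t flags)
{
	return (flags & MIG_READY_BIT) != 0;
}

static inline unsigned int mig_target_nid(uint32_t flags)
{
	return (flags & MIG_NID_MASK) >> MIG_NID_SHIFT;
}

/* Consumer side: drop the request once the page has been handled. */
static inline uint32_t mig_clear(uint32_t flags)
{
	return flags & ~(MIG_NID_MASK | MIG_READY_BIT);
}
```

Kernel code would need atomic updates of the flags word; the sketch ignores concurrency and only shows the encoding.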

Each kmigrated thread periodically scans the PFNs of its node,
identifies the pages marked for migration and batch-migrates them.
Unlike the previous approach, the responsibility of isolating the
pages now lies with kmigrated.
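
The scan-and-batch flow described above can be sketched in user-space C as follows. This is a simulation under stated assumptions, not the code in mm/kmigrated.c: struct sim_page, KMIG_BATCH, sim_migrate_batch() and sim_kmigrated_scan() are hypothetical names, the batch size is arbitrary, and "isolation" is reduced to collecting pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KMIG_BATCH 4	/* hypothetical batch size */

/* Simplified stand-in for per-page migration metadata. */
struct sim_page {
	int nid;	/* node the page currently resides on */
	int target_nid;	/* node requested by the producer */
	bool ready;	/* set by the producer when migration is wanted */
};

/* Stand-in for a batched migration call: move every page in the batch. */
static void sim_migrate_batch(struct sim_page **batch, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		batch[i]->nid = batch[i]->target_nid;
		batch[i]->ready = false;
	}
}

/* One scan pass over a PFN range; returns the number of pages migrated. */
static size_t sim_kmigrated_scan(struct sim_page *pages, size_t npages)
{
	struct sim_page *batch[KMIG_BATCH];
	size_t n = 0, migrated = 0;

	assert(pages != NULL || npages == 0);

	for (size_t pfn = 0; pfn < npages; pfn++) {
		if (!pages[pfn].ready)
			continue;
		batch[n++] = &pages[pfn];	/* "isolate" the page */
		if (n == KMIG_BATCH) {		/* batch full: migrate it */
			sim_migrate_batch(batch, n);
			migrated += n;
			n = 0;
		}
	}
	if (n) {				/* migrate the final partial batch */
		sim_migrate_batch(batch, n);
		migrated += n;
	}
	return migrated;
}
```

For brevity the sketch puts pages with different target nodes into the same batch. Note that every PFN in the range is visited even when only a few pages are marked.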

The major difference between this approach and the way kpromoted[2]
tracked hot pages is the elimination of heavy synchronization points
between producers (sub-systems that request migrations or report
a hot page) and the consumer (kmigrated or kpromoted).
Instead of tracking only the list of hot pages in an orthogonal
manner, this approach ties the hot page or migration information
to the struct page.

TODOs:

- Very lightly tested (only with NUMAB=1) and posted to get some
  feedback on the overall approach.
- Currently uses the flags field from page extension sub-system.
  However, we need to check whether it is preferable to use/allocate
  a separate 32-bit field exclusively for this purpose, either within
  the page extension sub-system or outside of it.
- Benefit of async batch migration still needs to be measured.
- Need to really tune a few things like the number of pages to
  batch, the aggressiveness of kthread, the kthread sleep interval etc.
- The logic to skip scanning of zones that don't have any pages
  marked for migration needs to be added.
- No separate kernel config is defined currently and dependency
  on PAGE_EXTENSION isn't cleanly laid out. Some added definitions
  currently sit in page_ext.h which may not be an ideal location
  for them.

[1] v0 - https://lore.kernel.org/linux-mm/20250521080238.209678-3-bharata@amd.com/
[2] kpromoted patchset - https://lore.kernel.org/linux-mm/20250306054532.221138-1-bharata@amd.com/

Bharata B Rao (3):
  mm: migrate: Allow misplaced migration without VMA too
  mm: kmigrated - Async kernel migration thread
  mm: sched: Batch-migrate misplaced pages

Gregory Price (1):
  migrate: implement migrate_misplaced_folios_batch

 include/linux/migrate.h  |   6 ++
 include/linux/mmzone.h   |   5 +
 include/linux/page_ext.h |  17 +++
 mm/Makefile              |   3 +-
 mm/kmigrated.c           | 223 +++++++++++++++++++++++++++++++++++++++
 mm/memory.c              |  30 +-----
 mm/migrate.c             |  36 ++++++-
 mm/mm_init.c             |   6 ++
 mm/page_ext.c            |  11 ++
 9 files changed, 309 insertions(+), 28 deletions(-)
 create mode 100644 mm/kmigrated.c

-- 
2.34.1
Re: [RFC PATCH v1 0/4] Kernel thread based async batch migration
Posted by Huang, Ying 3 months, 3 weeks ago
Bharata B Rao <bharata@amd.com> writes:

> Hi,
>
> This is a continuation of the earlier post[1] that attempted to
> convert migrations from NUMA Balancing to be async and batched.
> In this version, per-node kernel threads are created to handle
> migrations in an async manner.
>
> This adds a few fields to the extended page flags that can be
> used both by the sub-systems that request migrations and kmigrated
> which migrates the pages. Some of the fields are potentially defined
> to be used by kpromoted-like subsystem to manage hot page metrics,
> but are unused right now.
>
> Currently only NUMA Balancing is changed to make use of the async
> batched migration. It does so by recording the target NID and the
> readiness of the page to be migrated in the extended page flags
> fields.
>
> Each kmigrated routinely scans its PFNs, identifies the pages
> marked for migration and batch-migrates them. Unlike the previous
> approach, the responsibility of isolating the pages is now with
> kmigrated.
>
> The major difference between this approach and the way kpromoted[2]
> tracked hot pages is the elimination of heavy synchronization points
> between producers (sub-systems that request migrations or report
> a hot page) and the consumer (kmigrated or kpromoted).
> Instead of tracking only the list of hot pages in an orthogonal
> manner, this approach ties the hot page or migration information
> to the struct page.

I don't think page flag + scanning is a good idea.  If the
synchronization is really a problem for you (based on test results),
some per-CPU data structure can be used to record candidate pages.
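
One possible reading of the per-CPU suggestion above, sketched in user-space C: each CPU appends candidate PFNs to its own small ring, and a single consumer drains all rings, so no page scanning is needed to find candidates. Everything here (NR_SIM_CPUS, RING_SIZE, struct cand_ring, cand_record(), cand_drain()) is a hypothetical illustration and not part of the posted patches or of any existing kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SIM_CPUS 4	/* hypothetical number of CPUs */
#define RING_SIZE   8	/* hypothetical per-CPU capacity */

/* One ring per CPU: written only by that CPU, read only by the consumer. */
struct cand_ring {
	unsigned long pfn[RING_SIZE];
	size_t head;	/* total entries written so far */
	size_t tail;	/* total entries read so far */
};

static struct cand_ring rings[NR_SIM_CPUS];

/* Producer: record a candidate PFN on the given CPU; drop it if full. */
static bool cand_record(unsigned int cpu, unsigned long pfn)
{
	assert(cpu < NR_SIM_CPUS);
	struct cand_ring *r = &rings[cpu];

	if (r->head - r->tail == RING_SIZE)
		return false;	/* ring full: the candidate is lost */
	r->pfn[r->head % RING_SIZE] = pfn;
	r->head++;
	return true;
}

/* Consumer: collect up to max candidates from all rings, CPU by CPU. */
static size_t cand_drain(unsigned long *out, size_t max)
{
	size_t n = 0;

	for (unsigned int cpu = 0; cpu < NR_SIM_CPUS; cpu++) {
		struct cand_ring *r = &rings[cpu];

		while (r->tail != r->head && n < max)
			out[n++] = r->pfn[r->tail++ % RING_SIZE];
	}
	return n;
}
```

The sketch is single-threaded; a kernel version would need real per-CPU storage and proper memory ordering between producer and consumer. The trade-off it shows is that candidates can be dropped when a ring is full, in exchange for not scanning pages.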

[snip]

---
Best Regards,
Huang, Ying
Re: [RFC PATCH v1 0/4] Kernel thread based async batch migration
Posted by Bharata B Rao 3 months, 3 weeks ago
On 20-Jun-25 12:09 PM, Huang, Ying wrote:
> Bharata B Rao <bharata@amd.com> writes:
> <snip>
> 
> I don't think page flag + scanning is a good idea.  If the

If extended page flags is not the ideal location (I chose it in this
version only to get something going quickly), we can look at maintaining
per-pfn allocation for the required hot page metadata separately.

Or is your concern specifically with scanning? What problems do you see?
Is it the cost or the possibility of not identifying the migrate-ready
pages in time? Or something else?

Regards,
Bharata.
Re: [RFC PATCH v1 0/4] Kernel thread based async batch migration
Posted by Huang, Ying 3 months, 3 weeks ago
Bharata B Rao <bharata@amd.com> writes:

> On 20-Jun-25 12:09 PM, Huang, Ying wrote:
>> Bharata B Rao <bharata@amd.com> writes:
>> <snip>
>> 
>> I don't think page flag + scanning is a good idea.  If the
>
> If extended page flags is not the ideal location (I chose it in this
> version only to get something going quickly), we can look at maintaining
> per-pfn allocation for the required hot page metadata separately.
>
> Or is your concern specifically with scanning? What problems do you
> see?
>
> Is it the cost or the possibility of not identifying the migrate-ready
> pages in time? Or something else?

We may need to scan a large number of pages to identify a page to
promote.  This will waste CPU cycles and pollute cache.

---
Best Regards,
Huang, Ying