[PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support

Sean Christopherson posted 12 patches 3 months, 3 weeks ago
[PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Sean Christopherson 3 months, 3 weeks ago
Miguel, you got pulled in due to a one-line change to add a new iterator
macro to .clang-format.

Shivank's series to add support for NUMA-aware memory placement in
guest_memfd.  Based on kvm-x86/next.

Note, Ackerley pointed out that we should probably have testing for the
cpuset_do_page_mem_spread() behavior.  I 100% agree, but I'm also a-ok
merging without those tests.

v13:
 - Collect reviews.
 - Add kvm_gmem_for_each_file to .clang-format. [Ackerley]
 - Fix typos. [Ackerley]
 - Don't use get_task_policy() if the guest_memfd doesn't have its own
   policy, so that cpuset_do_page_mem_spread() works. [Ackerley]
 - Fix goofs in the changelogs related to numaif.h. [Ackerley]

v12:
 - https://lore.kernel.org/all/20251007221420.344669-1-seanjc@google.com
 - Add missing functionality to KVM selftests' existing numaif.h instead of
   linking to libnuma (which appears to have caveats with -static).
 - Add KVM_SYSCALL_DEFINE() infrastructure to reduce the boilerplate needed
   to wrap syscalls and/or to assert that a syscall succeeds.
 - Rename kvm_gmem to gmem_file, and use gmem_inode for the inode structure.
 - Track flags in a gmem_inode field instead of using i_private.
 - Add comments to call out subtleties in the mempolicy code (e.g. that
   returning NULL for vm_operations_struct.get_policy() is important for ABI
   reasons).
 - Improve debuggability of guest_memfd_test (I kept generating SIGBUS when
   tweaking the tests).
 - Test mbind() with private memory (sadly, verifying placement with
   move_pages() doesn't work due to the dependency on valid page tables).

- v11: Rebase on kvm-next, remove RFC tag, use Ackerley's latest patch,
       and fix an RCU race bug during kvm module unload.
- v10: Rebase on top of Fuad's v17. Use latest guest_memfd inode patch
       from Ackerley (with David's review comments). Use newer kmem_cache_create()
       API variant with arg parameter (Vlastimil).
- v9: Rebase on top of Fuad's v13 and incorporate review comments.
- v8: Rebase on top of Fuad's v12: host mmap()ing for guest_memfd memory.
- v7: Use inodes to store NUMA policy instead of file [0].
- v4-v6: Current approach using shared_policy support and vm_ops (based on
         suggestions from David [1] and guest_memfd bi-weekly upstream
         call discussion [2]).
- v3: Introduced fbind() syscall for VMM memory-placement configuration.
- v1,v2: Extended the KVM_CREATE_GUEST_MEMFD IOCTL to pass mempolicy.

[0] https://lore.kernel.org/all/diqzbjumm167.fsf@ackerleytng-ctop.c.googlers.com
[1] https://lore.kernel.org/all/6fbef654-36e2-4be5-906e-2a648a845278@redhat.com
[2] https://lore.kernel.org/all/2b77e055-98ac-43a1-a7ad-9f9065d7f38f@amd.com

Ackerley Tng (1):
  KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes

Sean Christopherson (7):
  KVM: guest_memfd: Rename "struct kvm_gmem" to "struct gmem_file"
  KVM: guest_memfd: Add macro to iterate over gmem_files for a
    mapping/inode
  KVM: selftests: Define wrappers for common syscalls to assert success
  KVM: selftests: Report stacktraces SIGBUS, SIGSEGV, SIGILL, and SIGFPE
    by default
  KVM: selftests: Add additional equivalents to libnuma APIs in KVM's
    numaif.h
  KVM: selftests: Use proper uAPI headers to pick up mempolicy.h
    definitions
  KVM: guest_memfd: Add gmem_inode.flags field instead of using
    i_private

Shivank Garg (4):
  KVM: guest_memfd: Add slab-allocated inode cache
  KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
  KVM: selftests: Add helpers to probe for NUMA support, and multi-node
    systems
  KVM: selftests: Add guest_memfd tests for mmap and NUMA policy support

 .clang-format                                 |   1 +
 include/uapi/linux/magic.h                    |   1 +
 tools/testing/selftests/kvm/arm64/vgic_irq.c  |   2 +-
 .../testing/selftests/kvm/guest_memfd_test.c  |  98 ++++++
 .../selftests/kvm/include/kvm_syscalls.h      |  81 +++++
 .../testing/selftests/kvm/include/kvm_util.h  |  29 +-
 tools/testing/selftests/kvm/include/numaif.h  | 110 +++---
 .../selftests/kvm/kvm_binary_stats_test.c     |   4 +-
 tools/testing/selftests/kvm/lib/kvm_util.c    |  55 +--
 .../kvm/x86/private_mem_conversions_test.c    |   9 +-
 .../selftests/kvm/x86/xapic_ipi_test.c        |   5 +-
 virt/kvm/guest_memfd.c                        | 331 ++++++++++++++----
 virt/kvm/kvm_main.c                           |   7 +-
 virt/kvm/kvm_mm.h                             |   9 +-
 14 files changed, 565 insertions(+), 177 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/kvm_syscalls.h


base-commit: f222788458c8a7753d43befef2769cd282dc008e
-- 
2.51.0.858.gf9c4a03a3a-goog
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Miguel Ojeda 3 months, 3 weeks ago
On Thu, Oct 16, 2025 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Miguel, you got pulled in due to a one-line change to add a new iterator
> macro to .clang-format.

Thanks!

The macro is not in `include/`, right? That means that, currently,
when I rerun the command to update the list it will go away.

If that is correct, and you want to have it in the list, then we
should add e.g. `virt/` there or similar, or we could have a few
separate lines at the top that are independent of the ones generated
by the command.

Cheers,
Miguel
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Sean Christopherson 3 months, 3 weeks ago
On Thu, Oct 16, 2025, Miguel Ojeda wrote:
> On Thu, Oct 16, 2025 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Miguel, you got pulled in due to a one-line change to add a new iterator
> > macro to .clang-format.
> 
> Thanks!
> 
> The macro is not in `include/`, right? That means that, currently,
> when I rerun the command to update the list it will go away.

Oh, I take it .clang-format is auto-generated?  Is it a "formal" script, or do
you literally just run the grep command in the comment?

  # Taken from:
  #   git grep -h '^#define [^[:space:]]*for_each[^[:space:]]*(' include/ tools/ \
  #   | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," \
  #   | LC_ALL=C sort -u
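
For readers unfamiliar with the pipeline in that comment, here is a small
sketch of what the grep and sed stages do, run on made-up #define lines
instead of a kernel tree. The function name format_foreach_entries and the
sample input are illustrative assumptions; only the grep and sed expressions
come from the comment above.

```shell
# Hypothetical helper (not part of the kernel tree): read C source lines on
# stdin and print one .clang-format list entry per for_each-style macro.
format_foreach_entries() {
    # Keep only function-like macro definitions whose name contains
    # "for_each", rewrite each into a "  - 'name'" entry, then de-duplicate.
    grep '^#define [^[:space:]]*for_each[^[:space:]]*(' \
    | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," \
    | LC_ALL=C sort -u
}

# Made-up sample input: two iterator macros (one duplicated) and a constant.
printf '%s\n' \
    '#define kvm_gmem_for_each_file(f, mapping) \' \
    '#define list_for_each_entry(pos, head, member) \' \
    '#define KVM_EXAMPLE_CONSTANT 42' \
    '#define list_for_each_entry(pos, head, member) \' \
| format_foreach_entries
```

With this input the sketch emits one entry each for kvm_gmem_for_each_file
and list_for_each_entry, and drops the plain constant.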

> If that is correct, and you want to have it in the list,

I don't think I care if it's in the list?  I honestly don't know for sure, because
it's entirely possible I'm consuming .clang-format without knowing it.  I added
the entry based on someone else's request.

Ackerley?

> then we should add e.g. `virt/` there or similar, or we could have a few
> separate lines at the top that are independent of the ones generated
> by the command.

Is it possible, and sensible, to have per-subsystem .clang-format files?  KVM
(virt/kvm) and KVM x86 (arch/x86/kvm) both have several for_each macros,
pretty much all of which are more interesting than kvm_gmem_for_each_file().

Adding arch/x86/kvm to the "script" in .clang-format feels wrong.

E.g.

$ git grep -h '^#define [^[:space:]]*for_each[^[:space:]]*(' arch/x86/kvm/ | \
  sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," | \
  LC_ALL=C sort -u
  - '__for_each_rmap_spte'
  - '__for_each_tdp_mmu_root'
  - '__for_each_tdp_mmu_root_yield_safe'
  - 'for_each_gfn_valid_sp_with_gptes'
  - 'for_each_rmap_spte'
  - 'for_each_rmap_spte_lockless'
  - 'for_each_shadow_entry'
  - 'for_each_shadow_entry_lockless'
  - 'for_each_shadow_entry_using_root'
  - 'for_each_slot_rmap_range'
  - 'for_each_sp'
  - 'for_each_tdp_mmu_root_rcu'
  - 'for_each_tdp_mmu_root_yield_safe'
  - 'for_each_tdp_pte'
  - 'for_each_tdp_pte_min_level'
  - 'for_each_tdp_pte_min_level_all'
  - 'for_each_valid_sp'
  - 'for_each_valid_tdp_mmu_root'
  - 'for_each_valid_tdp_mmu_root_yield_safe'
  - 'kvm_for_each_pmc'
  - 'tdp_root_for_each_leaf_pte'
  - 'tdp_root_for_each_pte'
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Miguel Ojeda 3 months, 3 weeks ago
On Thu, Oct 16, 2025 at 10:28 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Oh, I take it .clang-format is auto-generated?  Is it a "formal" script, or do
> you literally just run the grep command in the comment?

I just run it and copy-paste the results there from time to time.
Yeah, a very low-tech solution :)

> I don't think I care if it's in the list?  I honestly don't know for sure, because
> it's entirely possible I'm consuming .clang-format without knowing it.  I added
> the entry based on someone else's request.
>
> Ackerley?

If you are not relying on it, then please just skip it, yeah.

> Is it possible, and sensible, to have per-subsystem .clang-format files?  KVM
> (virt/kvm) and KVM x86 (arch/x86/kvm) both have several for_each macros,
> pretty much all of which are more interesting than kvm_gmem_for_each_file().

There is `InheritParentConfig` nowadays, but from a quick look I don't
see it supports merging lists.

So to do something fancier, we would need something like we did for
rust-analyzer, i.e. a `make` target or similar that would generate it.

Otherwise, we can just add extra macros at the top meanwhile.
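
As a rough illustration of the "extra macros at the top" idea, the sketch
below combines a hand-maintained list with generated entries. Everything in
it is an assumption made for illustration: the function name
emit_foreach_block, the use of a separate extras file, and the comment lines
it prints. It is not an existing kernel script or `make` target.

```shell
# Hypothetical generator: combine hand-maintained entries with generated ones.
#   $1    - file holding hand-maintained entries, one "  - 'macro'" per line
#   stdin - entries produced by the grep/sed pipeline
emit_foreach_block() {
    extras_file=$1
    printf '%s\n' 'ForEachMacros:'
    printf '%s\n' '  # Maintained by hand:'
    cat "$extras_file"
    printf '%s\n' '  # Generated:'
    # Drop generated entries that already appear in the hand-maintained file.
    # grep exits non-zero when nothing is left, which is not an error here.
    grep -vxF -f "$extras_file" || true
}

# Example with made-up data.
extras=$(mktemp)
printf '%s\n' "  - 'kvm_gmem_for_each_file'" > "$extras"
printf '%s\n' "  - 'kvm_gmem_for_each_file'" "  - 'list_for_each_entry'" \
| emit_foreach_block "$extras"
rm -f "$extras"
```

Keeping the hand-maintained entries in their own file means rerunning the
generator cannot silently drop them, which is the problem raised earlier in
the thread.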

What we did last time is just to add `tools/` to that command --
increasing coverage is not an issue (I just started with `include/`
originally to be a bit conservative and avoid a huge list until we
knew the tool would be used).

Cheers,
Miguel
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Ackerley Tng 3 months, 3 weeks ago
Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> writes:

> On Thu, Oct 16, 2025 at 10:28 PM Sean Christopherson <seanjc@google.com> wrote:
>>
>> Oh, I take it .clang-format is auto-generated?  Is it a "formal" script, or do
>> you literally just run the grep command in the comment?
>
> I just run it and copy-paste the results there from time to time.
> Yeah, a very low-tech solution :)
>

I assumed someone was doing this from time to time, so I ran the grep
command from .clang-format.  IIUC it only reads tools/ and include/, which
doesn't cover this new macro, so I thought the "automation" would miss it.
Hence I suggested adding the macro manually.

Using the command on virt/ would pick it up. Would it be better to add
"virt/" to the "automation" + update .clang-format while we're at it?

$ git grep -h '^#define [^[:space:]]*for_each[^[:space:]]*(' virt/ | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," | LC_ALL=C sort -u
- 'kvm_for_each_memslot_in_hva_range'
- 'kvm_gmem_for_each_file'
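
The effect of adding a directory to the command can be sketched on a
throwaway tree. The helper name list_foreach_macros and the miniature
directory layout are made up for illustration, and plain `grep -rh` is used
in place of `git grep -h` purely so the sketch runs outside a git checkout.

```shell
# Hypothetical helper: list for_each-style macros defined under the given
# directories. Plain `grep -rh` is an assumption made so the sketch runs
# outside a git checkout; the command in .clang-format uses `git grep -h`.
list_foreach_macros() {
    grep -rh '^#define [^[:space:]]*for_each[^[:space:]]*(' "$@" \
    | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," \
    | LC_ALL=C sort -u
}

# Made-up miniature tree standing in for include/ and virt/.
tree=$(mktemp -d)
mkdir -p "$tree/include" "$tree/virt/kvm"
printf '%s\n' '#define list_for_each_entry(pos, head, member) \' \
    > "$tree/include/list.h"
printf '%s\n' '#define kvm_gmem_for_each_file(f, mapping) \' \
    > "$tree/virt/kvm/guest_memfd.c"

list_foreach_macros "$tree/include"               # misses the virt/ macro
list_foreach_macros "$tree/include" "$tree/virt"  # picks up both
rm -rf "$tree"
```

The first call only sees the macro under include/, while the second also
reports the one defined under virt/.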

>> I don't think I care if it's in the list?  I honestly don't know for sure, because
>> it's entirely possible I'm consuming .clang-format without knowing it.  I added
>> the entry based on someone else's request.
>>
>> Ackerley?
>
> If you are not relying on it, then please just skip it, yeah.
>

I'm using it; I believe clangd (my LSP server) uses it to reflow correctly.

>> Is it possible, and sensible, to have per-subsystem .clang-format files?  KVM
>> (virt/kvm) and KVM x86 (arch/x86/kvm) both have several for_each macros,
>> pretty much all of which are more interesting than kvm_gmem_for_each_file().
>
> There is `InheritParentConfig` nowadays, but from a quick look I don't
> see it supports merging lists.
>
> So to do something fancier, we would need something like we did for
> rust-analyzer, i.e. a `make` target or similar that would generate it.
>
> Otherwise, we can just add extra macros at the top meanwhile.
>
> What we did last time is just to add `tools/` to that command --
> increasing coverage is not an issue (I just started with `include/`
> originally to be a bit conservative and avoid a huge list until we
> knew the tool would be used).
>
> Cheers,
> Miguel
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Miguel Ojeda 3 months, 3 weeks ago
On Fri, Oct 17, 2025 at 1:57 AM Ackerley Tng <ackerleytng@google.com> wrote:
>
> Using the command on virt/ would pick it up. Would it be better to add
> "virt/" to the "automation" + update .clang-format while we're at it?

Yeah, that is what I was suggesting if you rely on it (and if the
maintainers of the relevant folders are OK with it).

Cheers,
Miguel
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Sean Christopherson 3 months, 3 weeks ago
On Fri, Oct 17, 2025, Miguel Ojeda wrote:
> On Fri, Oct 17, 2025 at 1:57 AM Ackerley Tng <ackerleytng@google.com> wrote:
> >
> > Using the command on virt/ would pick it up. Would it be better to add
> > "virt/" to the "automation" + update .clang-format while we're at it?
> 
> Yeah, that is what I was suggesting if you rely on it (and if the
> maintainers of the relevant folders are OK with it).

Hmm, my vote would be to go all-or-nothing for KVM (x86), i.e. include everything
in KVM, or explicitly filter out KVM.  I don't see how auto-formatting can be
useful if it's wildly inconsistent, e.g. if it works for some KVM for-each macros,
but clobbers others.

And I'm leaning towards filtering out KVM, because I'm not sure I want to encourage
use of auto-formatting.  I can definitely see how it's useful, but so much of the
auto-formatting is just _awful_.

E.g. I ran it on a few KVM files and it generated changes like this

-       intel_pmu_enable_fixed_counter_bits(pmu, INTEL_FIXED_0_KERNEL |
-                                                INTEL_FIXED_0_USER |
-                                                INTEL_FIXED_0_ENABLE_PMI);
+       intel_pmu_enable_fixed_counter_bits(
+               pmu, INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER |
+                            INTEL_FIXED_0_ENABLE_PMI);

and

-                       intel_pmu_enable_fixed_counter_bits(pmu, ICL_FIXED_0_ADAPTIVE);
+                       intel_pmu_enable_fixed_counter_bits(
+                               pmu, ICL_FIXED_0_ADAPTIVE);

There are definitely plenty of good changes as well, but overall I find the results
to be very net negative.  That's obviously highly subjective, and maybe there are
settings in clangd I can tweak to make things more to my liking, but my initial
reaction is that I don't want to actively encourage use of auto-formatting in KVM.

I think no matter what, any decision should be in a separate, dedicated patch/thread.
So for this series, I'll drop the .clang-format change when applying, assuming
nothing else pops up that requires a new version.
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Sean Christopherson 3 months, 2 weeks ago
On Thu, 16 Oct 2025 10:28:41 -0700, Sean Christopherson wrote:
> Miguel, you got pulled in due to a one-line change to add a new iterator
> macro to .clang-format.
> 
> Shivank's series to add support for NUMA-aware memory placement in
> guest_memfd.  Based on kvm-x86/next.
> 
> Note, Ackerley pointed out that we should probably have testing for the
> cpuset_do_page_mem_spread() behavior.  I 100% agree, but I'm also a-ok
> merging without those tests.
> 
> [...]

Applied to kvm-x86 gmem, sans the .clang-format change.  Thanks!

[01/12] KVM: guest_memfd: Rename "struct kvm_gmem" to "struct gmem_file"
        https://github.com/kvm-x86/linux/commit/497b1dfbcacf
[02/12] KVM: guest_memfd: Add macro to iterate over gmem_files for a mapping/inode
        https://github.com/kvm-x86/linux/commit/392dd9d9488a
[03/12] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes
        https://github.com/kvm-x86/linux/commit/a63ca4236e67
[04/12] KVM: guest_memfd: Add slab-allocated inode cache
        https://github.com/kvm-x86/linux/commit/f609e89ae893
[05/12] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
        https://github.com/kvm-x86/linux/commit/ed1ffa810bd6
[06/12] KVM: selftests: Define wrappers for common syscalls to assert success
        https://github.com/kvm-x86/linux/commit/3223560c93eb
[07/12] KVM: selftests: Report stacktraces SIGBUS, SIGSEGV, SIGILL, and SIGFPE by default
        https://github.com/kvm-x86/linux/commit/29dc539d74ab
[08/12] KVM: selftests: Add additional equivalents to libnuma APIs in KVM's numaif.h
        https://github.com/kvm-x86/linux/commit/2189d78269c5
[09/12] KVM: selftests: Use proper uAPI headers to pick up mempolicy.h definitions
        https://github.com/kvm-x86/linux/commit/fe7baebb99de
[10/12] KVM: selftests: Add helpers to probe for NUMA support, and multi-node systems
        https://github.com/kvm-x86/linux/commit/e698e89b3ed1
[11/12] KVM: selftests: Add guest_memfd tests for mmap and NUMA policy support
        https://github.com/kvm-x86/linux/commit/38ccc50ac037
[12/12] KVM: guest_memfd: Add gmem_inode.flags field instead of using i_private
        https://github.com/kvm-x86/linux/commit/e66438bb81c4

--
https://github.com/kvm-x86/linux/tree/next
Re: [PATCH v13 00/12] KVM: guest_memfd: Add NUMA mempolicy support
Posted by Garg, Shivank 3 months, 2 weeks ago

On 10/20/2025 10:03 PM, Sean Christopherson wrote:
> On Thu, 16 Oct 2025 10:28:41 -0700, Sean Christopherson wrote:
>> Miguel, you got pulled in due to a one-line change to add a new iterator
>> macro to .clang-format.
>>
>> Shivank's series to add support for NUMA-aware memory placement in
>> guest_memfd.  Based on kvm-x86/next.
>>
>> Note, Ackerley pointed out that we should probably have testing for the
>> cpuset_do_page_mem_spread() behavior.  I 100% agree, but I'm also a-ok
>> merging without those tests.
>>
>> [...]
> 
> Applied to kvm-x86 gmem, sans the .clang-format change.  Thanks!
> 
> [01/12] KVM: guest_memfd: Rename "struct kvm_gmem" to "struct gmem_file"
>         https://github.com/kvm-x86/linux/commit/497b1dfbcacf
> [02/12] KVM: guest_memfd: Add macro to iterate over gmem_files for a mapping/inode
>         https://github.com/kvm-x86/linux/commit/392dd9d9488a
> [03/12] KVM: guest_memfd: Use guest mem inodes instead of anonymous inodes
>         https://github.com/kvm-x86/linux/commit/a63ca4236e67
> [04/12] KVM: guest_memfd: Add slab-allocated inode cache
>         https://github.com/kvm-x86/linux/commit/f609e89ae893
> [05/12] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
>         https://github.com/kvm-x86/linux/commit/ed1ffa810bd6
> [06/12] KVM: selftests: Define wrappers for common syscalls to assert success
>         https://github.com/kvm-x86/linux/commit/3223560c93eb
> [07/12] KVM: selftests: Report stacktraces SIGBUS, SIGSEGV, SIGILL, and SIGFPE by default
>         https://github.com/kvm-x86/linux/commit/29dc539d74ab
> [08/12] KVM: selftests: Add additional equivalents to libnuma APIs in KVM's numaif.h
>         https://github.com/kvm-x86/linux/commit/2189d78269c5
> [09/12] KVM: selftests: Use proper uAPI headers to pick up mempolicy.h definitions
>         https://github.com/kvm-x86/linux/commit/fe7baebb99de
> [10/12] KVM: selftests: Add helpers to probe for NUMA support, and multi-node systems
>         https://github.com/kvm-x86/linux/commit/e698e89b3ed1
> [11/12] KVM: selftests: Add guest_memfd tests for mmap and NUMA policy support
>         https://github.com/kvm-x86/linux/commit/38ccc50ac037
> [12/12] KVM: guest_memfd: Add gmem_inode.flags field instead of using i_private
>         https://github.com/kvm-x86/linux/commit/e66438bb81c4
> 
> --
> https://github.com/kvm-x86/linux/tree/next

Hi Sean,

Thank you for handling all the changes in v12-v13. I appreciate you taking this on,
especially the refactoring and the improvements to the selftests and code clarity.

Thanks to everyone who provided support, reviews, and suggestions throughout the series.

Best regards,
Shivank