[RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups

Sean Christopherson posted 12 patches 1 month, 1 week ago
There is a newer version of this series
arch/x86/kvm/mmu.h         |   3 +-
arch/x86/kvm/mmu/mmu.c     |  66 ++++++++++++++++++--
arch/x86/kvm/mmu/tdp_mmu.c |  37 ++---------
arch/x86/kvm/vmx/tdx.c     | 123 +++++++++++++------------------------
arch/x86/kvm/vmx/tdx.h     |   9 ++-
5 files changed, 117 insertions(+), 121 deletions(-)
[RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups
Posted by Sean Christopherson 1 month, 1 week ago
This is a largely untested series to do most of what was discussed in the
thread regarding locking issues between gmem and TDX's post-populate hook[*],
with more than a few side quests thrown in as I was navigating through the
code to try to figure out how best to eliminate the copy_from_user() from
sev_gmem_post_populate(), which has the same locking problem (copying from
a userspace address can fault and in theory trigger the same problematic
path, I think).

Notably absent is the extraction of copy_from_user() from
sev_gmem_post_populate() to kvm_gmem_populate().  I've had this on my todo
list for a few weeks now, and haven't been able to focus on it for long
enough to get something hammered out, and with KVM Forum on the horizon, I
don't anticipate getting 'round to it within the next month (if not much
longer).

The thing that stymied me is what to do if snp_launch_update() is passed in
a huge batch of pages.  I waffled between doing a slow one-at-a-time approach
and a batched approached, and got especially stuck when trying to remember
and/or figure out how that handling would interact with hugepage support in
SNP in particular.

If anyone wants to tackle that project, the one thing change I definitely
think we should do is change the post_populate() callback to operate on
exactly one page.  KVM_SEV_SNP_LAUNCH_UPDATE allows for partial progress,
i.e. KVM's ABI doesn't require it to unwind a batch if adding a page fails.
If we take advantage of that, then sev_gmem_post_populate() will be a bit
simpler (though I wouldn't go so far as to call it "simple").

RFC as this is compile tested only (mostly due to lack of access to a TDX
capable system, but also due to lack of cycles).

[*] http://lore.kernel.org/all/aG_pLUlHdYIZ2luh@google.com

Sean Christopherson (12):
  KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings
  KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU
  Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP
    MMU"
  KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_prefault_page()
  KVM: TDX: Drop superfluous page pinning in S-EPT management
  KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition
  KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
  KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent
  KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller
  KVM: TDX: Assert that slots_lock is held when nr_premapped is accessed
  KVM: TDX: Track nr_premapped as an "unsigned long", not an
    "atomic64_t"
  KVM: TDX: Rename nr_premapped to nr_pending_tdh_mem_page_adds

 arch/x86/kvm/mmu.h         |   3 +-
 arch/x86/kvm/mmu/mmu.c     |  66 ++++++++++++++++++--
 arch/x86/kvm/mmu/tdp_mmu.c |  37 ++---------
 arch/x86/kvm/vmx/tdx.c     | 123 +++++++++++++------------------------
 arch/x86/kvm/vmx/tdx.h     |   9 ++-
 5 files changed, 117 insertions(+), 121 deletions(-)


base-commit: 196d9e72c4b0bd68b74a4ec7f52d248f37d0f030
-- 
2.51.0.268.g9569e192d0-goog
Re: [RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups
Posted by Edgecombe, Rick P 1 month ago
On Tue, 2025-08-26 at 17:05 -0700, Sean Christopherson wrote:
> RFC as this is compile tested only (mostly due to lack of access to a TDX
> capable system, but also due to lack of cycles).

Let us know how we could best help with this. The series fails the tests because
of the page size issue Yan pointed. We could just review and test a v2, or if
you want us to pull together the feedback, test the result, and repost please
let us know. I think either should work from our end.

I suspect Vishal could hook you up with a TDX machine. But if you need any setup
help there too, please shout.
Re: [RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups
Posted by Sean Christopherson 1 month ago
On Thu, Aug 28, 2025, Edgecombe, Rick P wrote:
> On Tue, 2025-08-26 at 17:05 -0700, Sean Christopherson wrote:
> > RFC as this is compile tested only (mostly due to lack of access to a TDX
> > capable system, but also due to lack of cycles).
> 
> Let us know how we could best help with this. The series fails the tests because
> of the page size issue Yan pointed. We could just review and test a v2, or if
> you want us to pull together the feedback, test the result, and repost please
> let us know. I think either should work from our end.

I'll post a v2, it's going to look quite different.

> I suspect Vishal could hook you up with a TDX machine. But if you need any setup
> help there too, please shout.

Oh, he can, I just haven't crossed that testing bridge yet (ditto for SNP).  I'll
do so someday, but for now I'll abuse your generosity and throw noodles at ya :-)
Re: [RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups
Posted by Yan Zhao 1 month, 1 week ago
On Tue, Aug 26, 2025 at 05:05:10PM -0700, Sean Christopherson wrote:
> This is a largely untested series to do most of what was discussed in the
> thread regarding locking issues between gmem and TDX's post-populate hook[*],
> with more than a few side quests thrown in as I was navigating through the
> code to try to figure out how best to eliminate the copy_from_user() from
> sev_gmem_post_populate(), which has the same locking problem (copying from
> a userspace address can fault and in theory trigger the same problematic
> path, I think).
> 
> Notably absent is the extraction of copy_from_user() from
> sev_gmem_post_populate() to kvm_gmem_populate().  I've had this on my todo
> list for a few weeks now, and haven't been able to focus on it for long
> enough to get something hammered out, and with KVM Forum on the horizon, I
> don't anticipate getting 'round to it within the next month (if not much
> longer).
> 
> The thing that stymied me is what to do if snp_launch_update() is passed in
> a huge batch of pages.  I waffled between doing a slow one-at-a-time approach
> and a batched approached, and got especially stuck when trying to remember
> and/or figure out how that handling would interact with hugepage support in
> SNP in particular.
> 
> If anyone wants to tackle that project, the one thing change I definitely
> think we should do is change the post_populate() callback to operate on
> exactly one page.
Not sure if I understand it correctly.
Do you mean something like the tdx_gmem_post_populate_4k() in
https://lore.kernel.org/all/20250424030500.32720-1-yan.y.zhao@intel.com, or
invoking hugepage_set_guest_inhibit() in the post_populate() callback? 

> KVM_SEV_SNP_LAUNCH_UPDATE allows for partial progress,
> i.e. KVM's ABI doesn't require it to unwind a batch if adding a page fails.
> If we take advantage of that, then sev_gmem_post_populate() will be a bit
> simpler (though I wouldn't go so far as to call it "simple").
> 
> RFC as this is compile tested only (mostly due to lack of access to a TDX
> capable system, but also due to lack of cycles).
> 
> [*] http://lore.kernel.org/all/aG_pLUlHdYIZ2luh@google.com
> 
> Sean Christopherson (12):
>   KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings
>   KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU
>   Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP
>     MMU"
>   KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_prefault_page()
>   KVM: TDX: Drop superfluous page pinning in S-EPT management
>   KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition
>   KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
>   KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent
>   KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller
>   KVM: TDX: Assert that slots_lock is held when nr_premapped is accessed
>   KVM: TDX: Track nr_premapped as an "unsigned long", not an
>     "atomic64_t"
>   KVM: TDX: Rename nr_premapped to nr_pending_tdh_mem_page_adds
> 
>  arch/x86/kvm/mmu.h         |   3 +-
>  arch/x86/kvm/mmu/mmu.c     |  66 ++++++++++++++++++--
>  arch/x86/kvm/mmu/tdp_mmu.c |  37 ++---------
>  arch/x86/kvm/vmx/tdx.c     | 123 +++++++++++++------------------------
>  arch/x86/kvm/vmx/tdx.h     |   9 ++-
>  5 files changed, 117 insertions(+), 121 deletions(-)
> 
> 
> base-commit: 196d9e72c4b0bd68b74a4ec7f52d248f37d0f030
> -- 
> 2.51.0.268.g9569e192d0-goog
>