[PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()

Sean Christopherson posted 3 patches 10 months, 1 week ago
There is a newer version of this series
[PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Sean Christopherson 10 months, 1 week ago
Now that the size of "struct kvm" is less than 2KiB, switch back to using
kzalloc() to allocate the VM structures.  Add compile-time assertions in
vendor code to ensure the size is an order-0 allocation, i.e. to prevent
unknowingly letting the size balloon in the future.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/svm/svm.c          | 1 +
 arch/x86/kvm/vmx/vmx.c          | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e523d7d8a107..6c7fd7db6f11 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1940,7 +1940,7 @@ void kvm_x86_vendor_exit(void);
 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
 {
-	return __vmalloc(kvm_x86_ops.vm_size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	return kzalloc(kvm_x86_ops.vm_size, GFP_KERNEL_ACCOUNT);
 }
 
 #define __KVM_HAVE_ARCH_VM_FREE
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8abeab91d329..589adc5f92e0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5536,6 +5536,7 @@ static int __init svm_init(void)
 	if (r)
 		goto err_kvm_init;
 
+	BUILD_BUG_ON(get_order(sizeof(struct kvm_svm) != 0));
 	return 0;
 
 err_kvm_init:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b70ed72c1783..01264842bf45 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8755,6 +8755,7 @@ static int __init vmx_init(void)
 	if (r)
 		goto err_kvm_init;
 
+	BUILD_BUG_ON(get_order(sizeof(struct kvm_vmx) != 0));
 	return 0;
 
 err_kvm_init:
-- 
2.49.0.472.ge94155a9ec-goog
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Vipin Sharma 9 months, 4 weeks ago
On 2025-04-01 08:57:13, Sean Christopherson wrote:
> Now that the size of "struct kvm" is less than 2KiB, switch back to using
> kzalloc() to allocate the VM structures.  Add compile-time assertions in
> vendor code to ensure the size is an order-0 allocation, i.e. to prevent
> unknowingly letting the size balloon in the future.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 2 +-
>  arch/x86/kvm/svm/svm.c          | 1 +
>  arch/x86/kvm/vmx/vmx.c          | 1 +
>  3 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e523d7d8a107..6c7fd7db6f11 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1940,7 +1940,7 @@ void kvm_x86_vendor_exit(void);
>  #define __KVM_HAVE_ARCH_VM_ALLOC
>  static inline struct kvm *kvm_arch_alloc_vm(void)
>  {
> -	return __vmalloc(kvm_x86_ops.vm_size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +	return kzalloc(kvm_x86_ops.vm_size, GFP_KERNEL_ACCOUNT);
>  }
>  
>  #define __KVM_HAVE_ARCH_VM_FREE
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8abeab91d329..589adc5f92e0 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5536,6 +5536,7 @@ static int __init svm_init(void)
>  	if (r)
>  		goto err_kvm_init;
>  
> +	BUILD_BUG_ON(get_order(sizeof(struct kvm_svm) != 0));

There is a typo here. It is checking sizeof(struct kvm_svm) != 0, instead
of checking get_order(...) != 0.

>  	return 0;
>  
>  err_kvm_init:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b70ed72c1783..01264842bf45 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8755,6 +8755,7 @@ static int __init vmx_init(void)
>  	if (r)
>  		goto err_kvm_init;
>  
> +	BUILD_BUG_ON(get_order(sizeof(struct kvm_vmx) != 0));

Same as above.

>  	return 0;
>  
>  err_kvm_init:
> -- 
> 2.49.0.472.ge94155a9ec-goog
>
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Vipin Sharma 9 months, 4 weeks ago
On 2025-04-16 11:24:37, Vipin Sharma wrote:
> On 2025-04-01 08:57:13, Sean Christopherson wrote:
> >  
> > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_svm) != 0));
> 
> There is a typo here. It is checking sizeof(struct kvm_svm) != 0, instead
> of checking get_order(...) != 0.
> 
> >  	return 0;
> >  
> >  err_kvm_init:
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index b70ed72c1783..01264842bf45 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -8755,6 +8755,7 @@ static int __init vmx_init(void)
> >  	if (r)
> >  		goto err_kvm_init;
> >  
> > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_vmx) != 0));
> 
> Same as above.
> 

After fixing the typo build is failing.

Checked via pahole, sizes of struct have reduced but still not under 4k.
After applying the patch:

struct kvm{} - 4104
struct kvm_svm{} - 4320
struct kvm_vmx{} - 4128

Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
under kvm_[vmx|svm] and its children are enabled. Won't that be an
issue?
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Sean Christopherson 9 months, 4 weeks ago
On Wed, Apr 16, 2025, Vipin Sharma wrote:
> On 2025-04-16 11:24:37, Vipin Sharma wrote:
> > On 2025-04-01 08:57:13, Sean Christopherson wrote:
> > >  
> > > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_svm) != 0));
> > 
> > There is a typo here. It is checking sizeof(struct kvm_svm) != 0, instead
> > of checking get_order(...) != 0.
> > 
> > >  	return 0;
> > >  
> > >  err_kvm_init:
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index b70ed72c1783..01264842bf45 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -8755,6 +8755,7 @@ static int __init vmx_init(void)
> > >  	if (r)
> > >  		goto err_kvm_init;
> > >  
> > > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_vmx) != 0));
> > 
> > Same as above.

Ugh.  That's what I get for violating the kernel's "don't check for '0'" rule
(I thought it would make the code more understandable).  Bad me.

> After fixing the typo build is failing.
> 
> Checked via pahole, sizes of struct have reduced but still not under 4k.
> After applying the patch:
> 
> struct kvm{} - 4104
> struct kvm_svm{} - 4320
> struct kvm_vmx{} - 4128
> 
> Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
> under kvm_[vmx|svm] and its children are enabled. Won't that be an
> issue?

That's what build bots (and to a lesser extent, maintainers) are for.  An individual
developer might miss a particular config, but the build bots that run allyesconfig
will very quickly detect the issue, and then we fix it.

I also build what is effectively an "allkvmconfig" before officially applying
anything, so in general things like this shouldn't even make it to the bots.
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Huang, Kai 9 months, 3 weeks ago
On Wed, 2025-04-16 at 12:57 -0700, Sean Christopherson wrote:
> On Wed, Apr 16, 2025, Vipin Sharma wrote:
> > On 2025-04-16 11:24:37, Vipin Sharma wrote:
> > > On 2025-04-01 08:57:13, Sean Christopherson wrote:
> > > >  
> > > > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_svm) != 0));
> > > 
> > > There is a typo here. It is checking sizeof(struct kvm_svm) != 0, instead
> > > of checking get_order(...) != 0.
> > > 
> > > >  	return 0;
> > > >  
> > > >  err_kvm_init:
> > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > > index b70ed72c1783..01264842bf45 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.c
> > > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > > @@ -8755,6 +8755,7 @@ static int __init vmx_init(void)
> > > >  	if (r)
> > > >  		goto err_kvm_init;
> > > >  
> > > > +	BUILD_BUG_ON(get_order(sizeof(struct kvm_vmx) != 0));
> > > 
> > > Same as above.
> 
> Ugh.  That's what I get for violating the kernel's "don't check for '0'" rule
> (I thought it would make the code more understandable).  Bad me.
> 
> > After fixing the typo build is failing.
> > 
> > Checked via pahole, sizes of struct have reduced but still not under 4k.
> > After applying the patch:
> > 
> > struct kvm{} - 4104
> > struct kvm_svm{} - 4320
> > struct kvm_vmx{} - 4128
> > 
> > Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
> > under kvm_[vmx|svm] and its children are enabled. Won't that be an
> > issue?
> 
> That's what build bots (and to a lesser extent, maintainers) are for.  An individual
> developer might miss a particular config, but the build bots that run allyesconfig
> will very quickly detect the issue, and then we fix it.
> 
> I also build what is effectively an "allkvmconfig" before officially applying
> anything, so in general things like this shouldn't even make it to the bots.
> 

Just want to understand the intention here:

What if someday a developer really needs to add some new field(s) to, lets say
'struct kvm_vmx', and that makes the size exceed 4K?

What should the developer do?  Is it a hard requirement that the size should
never go beyond 4K?  Or, should the assert of order 0 allocation be changed to
the assert of order 1 allocation?
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Sean Christopherson 9 months, 3 weeks ago
On Tue, Apr 22, 2025, Kai Huang wrote:
> On Wed, 2025-04-16 at 12:57 -0700, Sean Christopherson wrote:
> > On Wed, Apr 16, 2025, Vipin Sharma wrote:
> > > Checked via pahole, sizes of struct have reduced but still not under 4k.
> > > After applying the patch:
> > > 
> > > struct kvm{} - 4104
> > > struct kvm_svm{} - 4320
> > > struct kvm_vmx{} - 4128
> > > 
> > > Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
> > > under kvm_[vmx|svm] and its children are enabled. Won't that be an
> > > issue?
> > 
> > That's what build bots (and to a lesser extent, maintainers) are for.  An individual
> > developer might miss a particular config, but the build bots that run allyesconfig
> > will very quickly detect the issue, and then we fix it.
> > 
> > I also build what is effectively an "allkvmconfig" before officially applying
> > anything, so in general things like this shouldn't even make it to the bots.
> > 
> 
> Just want to understand the intention here:
> 
> What if someday a developer really needs to add some new field(s) to, lets say
> 'struct kvm_vmx', and that makes the size exceed 4K?

If it helps, here's the changelog I plan on posting for v3:
    
    Allocate VM structs via kvzalloc(), i.e. try to use a contiguous physical
    allocation before falling back to __vmalloc(), to avoid the overhead of
    establishing the virtual mappings.  The SVM and VMX (and TDX) structures
    are now just above 4096 bytes, i.e. are order-1 allocations, and will
    likely remain that way for quite some time.
    
    Add compile-time assertions in vendor code to ensure the size is an
    order-0 or order-1 allocation, i.e. to prevent unknowingly letting the
    size balloon in the future.  There's nothing fundamentally wrong with a
    larger kvm_{svm,vmx,tdx} size, but given that the size is barely above
    4096 after 18+ years of existence, exceeding exceed 8192 bytes would be
    quite notable.


> What should the developer do?  Is it a hard requirement that the size should
> never go beyond 4K?  Or, should the assert of order 0 allocation be changed to
> the assert of order 1 allocation?

It depends.  Now that Vipin has corrected my math, the assertion will be that the
VM struct is order-1 or smaller, i.e. <= 8KiB.  That gives us a _lot_ of room to
grow.  E.g. KVM has existed for ~18 years and is barely about 4KiB, so for organic
growth (small additions here and there), I don't expect to hit the 8KiB limit in
the next decade (famous last words).  And the memory landscape will likely be
quite different 10+ years from now, i.e. the assertion may be completely unnecessary
by the time it fires.

What I'm most interested in detecting and prevent is things like mmu_page_hash,
where a massive field is embedded in struct kvm for an *optional* feature.  I.e.
if a new feature adds a massive field, then it should probably be placed in a
separate, dynamically allocated structure.  And for those, it should be quite
obvious that a separate allocation is the way to go.
Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Huang, Kai 9 months, 3 weeks ago
On Wed, 2025-04-23 at 10:07 -0700, Sean Christopherson wrote:
> On Tue, Apr 22, 2025, Kai Huang wrote:
> > On Wed, 2025-04-16 at 12:57 -0700, Sean Christopherson wrote:
> > > On Wed, Apr 16, 2025, Vipin Sharma wrote:
> > > > Checked via pahole, sizes of struct have reduced but still not under 4k.
> > > > After applying the patch:
> > > > 
> > > > struct kvm{} - 4104
> > > > struct kvm_svm{} - 4320
> > > > struct kvm_vmx{} - 4128
> > > > 
> > > > Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
> > > > under kvm_[vmx|svm] and its children are enabled. Won't that be an
> > > > issue?
> > > 
> > > That's what build bots (and to a lesser extent, maintainers) are for.  An individual
> > > developer might miss a particular config, but the build bots that run allyesconfig
> > > will very quickly detect the issue, and then we fix it.
> > > 
> > > I also build what is effectively an "allkvmconfig" before officially applying
> > > anything, so in general things like this shouldn't even make it to the bots.
> > > 
> > 
> > Just want to understand the intention here:
> > 
> > What if someday a developer really needs to add some new field(s) to, lets say
> > 'struct kvm_vmx', and that makes the size exceed 4K?
> 
> If it helps, here's the changelog I plan on posting for v3:
>     
>     Allocate VM structs via kvzalloc(), i.e. try to use a contiguous physical
>     allocation before falling back to __vmalloc(), to avoid the overhead of
>     establishing the virtual mappings.  The SVM and VMX (and TDX) structures
>     are now just above 4096 bytes, i.e. are order-1 allocations, and will
>     likely remain that way for quite some time.
>     
>     Add compile-time assertions in vendor code to ensure the size is an
>     order-0 or order-1 allocation, i.e. to prevent unknowingly letting the
>     size balloon in the future.  There's nothing fundamentally wrong with a
>     larger kvm_{svm,vmx,tdx} size, but given that the size is barely above
>     4096 after 18+ years of existence, exceeding exceed 8192 bytes would be
>     quite notable.

Yeah looks reasonable.

Nit: I am not quite following "falling back to __vmalloc()" part.  We are
replacing __vmalloc() with kzalloc() AFAICT, therefore there should be no
"falling back"?

> 
> 
> > What should the developer do?  Is it a hard requirement that the size should
> > never go beyond 4K?  Or, should the assert of order 0 allocation be changed to
> > the assert of order 1 allocation?
> 
> It depends.  Now that Vipin has corrected my math, the assertion will be that the
> VM struct is order-1 or smaller, i.e. <= 8KiB.  That gives us a _lot_ of room to
> grow.  E.g. KVM has existed for ~18 years and is barely about 4KiB, so for organic
> growth (small additions here and there), I don't expect to hit the 8KiB limit in
> the next decade (famous last words).  And the memory landscape will likely be
> quite different 10+ years from now, i.e. the assertion may be completely unnecessary
> by the time it fires.
> 
> What I'm most interested in detecting and prevent is things like mmu_page_hash,
> where a massive field is embedded in struct kvm for an *optional* feature.  I.e.
> if a new feature adds a massive field, then it should probably be placed in a
> separate, dynamically allocated structure.  And for those, it should be quite
> obvious that a separate allocation is the way to go.

Agreed.  Thanks for explaining.

Re: [PATCH v2 2/3] KVM: x86: Allocate kvm_vmx/kvm_svm structures using kzalloc()
Posted by Sean Christopherson 9 months, 2 weeks ago
On Wed, Apr 23, 2025, Kai Huang wrote:
> On Wed, 2025-04-23 at 10:07 -0700, Sean Christopherson wrote:
> > On Tue, Apr 22, 2025, Kai Huang wrote:
> > > On Wed, 2025-04-16 at 12:57 -0700, Sean Christopherson wrote:
> > > > On Wed, Apr 16, 2025, Vipin Sharma wrote:
> > > > > Checked via pahole, sizes of struct have reduced but still not under 4k.
> > > > > After applying the patch:
> > > > > 
> > > > > struct kvm{} - 4104
> > > > > struct kvm_svm{} - 4320
> > > > > struct kvm_vmx{} - 4128
> > > > > 
> > > > > Also, this BUILD_BUG_ON() might not be reliable unless all of the ifdefs
> > > > > under kvm_[vmx|svm] and its children are enabled. Won't that be an
> > > > > issue?
> > > > 
> > > > That's what build bots (and to a lesser extent, maintainers) are for.  An individual
> > > > developer might miss a particular config, but the build bots that run allyesconfig
> > > > will very quickly detect the issue, and then we fix it.
> > > > 
> > > > I also build what is effectively an "allkvmconfig" before officially applying
> > > > anything, so in general things like this shouldn't even make it to the bots.
> > > > 
> > > 
> > > Just want to understand the intention here:
> > > 
> > > What if someday a developer really needs to add some new field(s) to, lets say
> > > 'struct kvm_vmx', and that makes the size exceed 4K?
> > 
> > If it helps, here's the changelog I plan on posting for v3:
> >     
> >     Allocate VM structs via kvzalloc(), i.e. try to use a contiguous physical
> >     allocation before falling back to __vmalloc(), to avoid the overhead of
> >     establishing the virtual mappings.  The SVM and VMX (and TDX) structures
> >     are now just above 4096 bytes, i.e. are order-1 allocations, and will
> >     likely remain that way for quite some time.
> >     
> >     Add compile-time assertions in vendor code to ensure the size is an
> >     order-0 or order-1 allocation, i.e. to prevent unknowingly letting the
> >     size balloon in the future.  There's nothing fundamentally wrong with a
> >     larger kvm_{svm,vmx,tdx} size, but given that the size is barely above
> >     4096 after 18+ years of existence, exceeding exceed 8192 bytes would be
> >     quite notable.
> 
> Yeah looks reasonable.
> 
> Nit: I am not quite following "falling back to __vmalloc()" part.  We are
> replacing __vmalloc() with kzalloc() AFAICT, therefore there should be no
> "falling back"?

Correct, not in this version.  In the next version, my plan is to use kvzalloc()
(though honestly, I'm not sure that's worth doing; it'll be an order-1 allocation,
and if that fails...).