KVM: x86: Prep KVM hypercall handling for TDX

[PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Sean Christopherson 1 year, 2 months ago

Rework __kvm_emulate_hypercall() into a macro so that completion of
hypercalls that don't exit to userspace use direct function calls to the
completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.

Opportunistically take the names of the input registers, as opposed to
taking the input values, to preemptively dedup more of the calling code
(TDX needs to use different registers).  Use the direct GPR accessors to
read values to avoid the pointless marking of the registers as available
(KVM requires GPRs to always be available).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 29 +++++++++--------------------
 arch/x86/kvm/x86.h | 25 ++++++++++++++++++++-----
 2 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 39be2a891ab4..fef8b4e63d25 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 	return kvm_skip_emulated_instruction(vcpu);
 }
 
-int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
-			    unsigned long a0, unsigned long a1,
-			    unsigned long a2, unsigned long a3,
-			    int op_64_bit, int cpl,
-			    int (*complete_hypercall)(struct kvm_vcpu *))
+int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
+			      unsigned long a0, unsigned long a1,
+			      unsigned long a2, unsigned long a3,
+			      int op_64_bit, int cpl,
+			      int (*complete_hypercall)(struct kvm_vcpu *))
 {
 	unsigned long ret;
 
@@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
 
 out:
 	vcpu->run->hypercall.ret = ret;
-	complete_hypercall(vcpu);
 	return 1;
 }
-EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall);
+EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall);
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
-	unsigned long nr, a0, a1, a2, a3;
-	int op_64_bit;
-	int cpl;
-
 	if (kvm_xen_hypercall_enabled(vcpu->kvm))
 		return kvm_xen_hypercall(vcpu);
 
 	if (kvm_hv_hypercall_enabled(vcpu))
 		return kvm_hv_hypercall(vcpu);
 
-	nr = kvm_rax_read(vcpu);
-	a0 = kvm_rbx_read(vcpu);
-	a1 = kvm_rcx_read(vcpu);
-	a2 = kvm_rdx_read(vcpu);
-	a3 = kvm_rsi_read(vcpu);
-	op_64_bit = is_64_bit_hypercall(vcpu);
-	cpl = kvm_x86_call(get_cpl)(vcpu);
-
-	return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl,
+	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
+				       is_64_bit_hypercall(vcpu),
+				       kvm_x86_call(get_cpl)(vcpu),
 				       complete_hypercall_exit);
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 28adc8ea04bf..ad6fe6159dea 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
 	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
 }
 
-int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
-			    unsigned long a0, unsigned long a1,
-			    unsigned long a2, unsigned long a3,
-			    int op_64_bit, int cpl,
-			    int (*complete_hypercall)(struct kvm_vcpu *));
+int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
+			      unsigned long a0, unsigned long a1,
+			      unsigned long a2, unsigned long a3,
+			      int op_64_bit, int cpl,
+			      int (*complete_hypercall)(struct kvm_vcpu *));
+
+#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
+({												\
+	int __ret;										\
+												\
+	__ret = ____kvm_emulate_hypercall(_vcpu,						\
+					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
+					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
+					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
+					  complete_hypercall);					\
+												\
+	if (__ret > 0)										\
+		complete_hypercall(_vcpu);							\
+	__ret;											\
+})
 
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
-- 
2.47.0.338.g60cca15819-goog
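
For readers unfamiliar with the pattern, here is a minimal userspace sketch of what the changelog describes. It is not the KVM code: struct vcpu, vcpu_<reg>_read(), emulate_worker(), complete_exit() and handle_hypercall() are hypothetical stand-ins. It shows a macro that takes register names, builds the accessor names by token pasting, and calls the completion helper by name, so the compiler sees a direct call rather than a call through a function pointer. It relies on GNU statement expressions, as the patch does.

```c
#include <assert.h>

/* Hypothetical stand-in for struct kvm_vcpu: three GPRs and a call counter. */
struct vcpu {
	unsigned long rax, rbx, rcx;
	int completions;
};

/* Generate vcpu_<reg>_read() so a macro can build the name by token pasting. */
#define DEFINE_REG_READ(reg)						\
static inline unsigned long vcpu_##reg##_read(struct vcpu *v)		\
{									\
	return v->reg;							\
}

DEFINE_REG_READ(rax)
DEFINE_REG_READ(rbx)
DEFINE_REG_READ(rcx)

/* Out-of-line worker: returns 0 for "exit to userspace", 1 for "handled". */
int emulate_worker(struct vcpu *v, unsigned long nr,
		   unsigned long a0, unsigned long a1)
{
	if (nr == 0)
		return 0;

	v->rax = a0 + a1;	/* pretend hypercall result */
	return 1;
}

/* Completion helper; in the sketch it only counts how often it ran. */
int complete_exit(struct vcpu *v)
{
	v->completions++;
	return 1;
}

/*
 * The caller passes register names and the completion function's name.
 * After expansion, the completion is an ordinary direct call.
 */
#define emulate_hypercall(_v, nr, a0, a1, complete)			\
({									\
	int __ret;							\
									\
	__ret = emulate_worker(_v, vcpu_##nr##_read(_v),		\
			       vcpu_##a0##_read(_v),			\
			       vcpu_##a1##_read(_v));			\
	if (__ret > 0)							\
		complete(_v);						\
	__ret;								\
})

int handle_hypercall(struct vcpu *v)
{
	return emulate_hypercall(v, rax, rbx, rcx, complete_exit);
}
```

A caller with a different register convention would pass different names (for example r10, r11, r12) provided matching accessors exist, which is the dedup the changelog has in mind for TDX.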

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Paolo Bonzini 1 year, 2 months ago

On 11/28/24 01:43, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
> 
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> ---
>   arch/x86/kvm/x86.c | 29 +++++++++--------------------
>   arch/x86/kvm/x86.h | 25 ++++++++++++++++++++-----
>   2 files changed, 29 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 39be2a891ab4..fef8b4e63d25 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>   	return kvm_skip_emulated_instruction(vcpu);
>   }
>   
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *))
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *))
>   {
>   	unsigned long ret;
>   
> @@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>   
>   out:
>   	vcpu->run->hypercall.ret = ret;
> -	complete_hypercall(vcpu);
>   	return 1;
>   }
> -EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall);
> +EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall);
>   
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>   {
> -	unsigned long nr, a0, a1, a2, a3;
> -	int op_64_bit;
> -	int cpl;
> -
>   	if (kvm_xen_hypercall_enabled(vcpu->kvm))
>   		return kvm_xen_hypercall(vcpu);
>   
>   	if (kvm_hv_hypercall_enabled(vcpu))
>   		return kvm_hv_hypercall(vcpu);
>   
> -	nr = kvm_rax_read(vcpu);
> -	a0 = kvm_rbx_read(vcpu);
> -	a1 = kvm_rcx_read(vcpu);
> -	a2 = kvm_rdx_read(vcpu);
> -	a3 = kvm_rsi_read(vcpu);
> -	op_64_bit = is_64_bit_hypercall(vcpu);
> -	cpl = kvm_x86_call(get_cpl)(vcpu);
> -
> -	return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl,
> +	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
> +				       is_64_bit_hypercall(vcpu),
> +				       kvm_x86_call(get_cpl)(vcpu),
>   				       complete_hypercall_exit);
>   }
>   EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 28adc8ea04bf..ad6fe6159dea 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>   	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>   }
>   
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *));
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *));
> +
> +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
> +({												\
> +	int __ret;										\
> +												\
> +	__ret = ____kvm_emulate_hypercall(_vcpu,						\
> +					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> +					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> +					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> +					  complete_hypercall);					\
> +												\
> +	if (__ret > 0)										\
> +		complete_hypercall(_vcpu);							\

So based on the review of the previous patch this should become

	__ret = complete_hypercall(_vcpu);

Applied with this change to kvm-coco-queue, thanks.

Paolo

> +	__ret;											\
> +})
>   
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
>
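
To illustrate the difference the suggested change makes, here is a small userspace sketch, assuming the "if (__ret > 0)" check stays and only the call line changes. All names (struct vcpu, worker(), complete(), run_v4(), run_amended()) are hypothetical. With the v4 form, the completion helper's return value is dropped; with the amended form, a completion helper that returns 0 (for example because skipping the instruction needs a userspace exit) propagates to the caller.

```c
#include <assert.h>

/* Hypothetical vCPU: only carries what the completion helper will return. */
struct vcpu {
	int completion_result;
};

/* Worker that always reports "handled in the kernel". */
int worker(struct vcpu *v)
{
	(void)v;
	return 1;
}

/* Completion helper whose return value matters to the caller. */
int complete(struct vcpu *v)
{
	return v->completion_result;
}

/* v4 form: the completion's return value is discarded. */
#define emulate_v4(_v, fn)						\
({									\
	int __ret = worker(_v);						\
									\
	if (__ret > 0)							\
		fn(_v);							\
	__ret;								\
})

/* Amended form: the completion's return value replaces __ret. */
#define emulate_amended(_v, fn)						\
({									\
	int __ret = worker(_v);						\
									\
	if (__ret > 0)							\
		__ret = fn(_v);						\
	__ret;								\
})

int run_v4(struct vcpu *v)
{
	return emulate_v4(v, complete);
}

int run_amended(struct vcpu *v)
{
	return emulate_amended(v, complete);
}
```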

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Sean Christopherson 1 year, 1 month ago

On Tue, Dec 10, 2024, Paolo Bonzini wrote:
> On 11/28/24 01:43, Sean Christopherson wrote:
> > +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
> > +({												\
> > +	int __ret;										\
> > +												\
> > +	__ret = ____kvm_emulate_hypercall(_vcpu,						\
> > +					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> > +					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> > +					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> > +					  complete_hypercall);					\
> > +												\
> > +	if (__ret > 0)										\
> > +		complete_hypercall(_vcpu);							\
> 
> So based on the review of the previous patch this should become
> 
> 	__ret = complete_hypercall(_vcpu);
> 
> Applied with this change to kvm-coco-queue, thanks.

I was planning on applying this for 6.14.  Should I still do that, or do you want
to take the bulk of the series through kvm/next, or maybe let it sit in
kvm-coco-queue?  I can't think of any potential conflicts off the top of my head,
and the refactoring is really only useful for TDX.

Patch 1 should go in sooner rather than later though.

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Binbin Wu 1 year, 2 months ago



On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
>
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>


Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Adrian Hunter 1 year, 2 months ago

On 28/11/24 02:43, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
> 
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).

For TDX, there is an RFC relating to using descriptively
named parameters instead of register names for tdh_vp_enter():

	https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/

Please do give some feedback on that approach.  Note we
need both KVM and x86 maintainer approval for SEAMCALL
wrappers like tdh_vp_enter().

As proposed, that ends up with putting the values back into
vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
pretty:

 static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
 {
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	int r;
 
+	kvm_r10_write(vcpu, tdx->vp_enter_args.tdcall.fn);
+	kvm_r11_write(vcpu, tdx->vp_enter_args.tdcall.subfn);
+	kvm_r12_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p2);
+	kvm_r13_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p3);
+	kvm_r14_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p4);
+
 	/*
 	 * ABI for KVM tdvmcall argument:
 	 * In Guest-Hypervisor Communication Interface(GHCI) specification,
@@ -1092,13 +1042,12 @@ static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
 	 * vendor-specific.  KVM uses this for KVM hypercall.  NOTE: KVM
 	 * hypercall number starts from one.  Zero isn't used for KVM hypercall
 	 * number.
-	 *
-	 * R10: KVM hypercall number
-	 * arguments: R11, R12, R13, R14.
 	 */
 	r = __kvm_emulate_hypercall(vcpu, r10, r11, r12, r13, r14, true, 0,
 				    R10, complete_hypercall_exit);
 
+	tdvmcall_set_return_code(vcpu, kvm_r10_read(vcpu));
+
 	return r > 0;
 }


Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Paolo Bonzini 1 year, 2 months ago

On 11/28/24 09:38, Adrian Hunter wrote:
> 
> For TDX, there is an RFC relating to using descriptively
> named parameters instead of register names for tdh_vp_enter():
> 
> 	https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
> 
> Please do give some feedback on that approach.  Note we
> need both KVM and x86 maintainer approval for SEAMCALL
> wrappers like tdh_vp_enter().
> 
> As proposed, that ends up with putting the values back into
> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
> pretty:

If needed we can revert this patch, it's not a big problem.

Paolo

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Sean Christopherson 1 year, 1 month ago

On Tue, Dec 10, 2024, Paolo Bonzini wrote:
> On 11/28/24 09:38, Adrian Hunter wrote:
> > 
> > For TDX, there is an RFC relating to using descriptively
> > named parameters instead of register names for tdh_vp_enter():
> > 
> > 	https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
> > 
> > Please do give some feedback on that approach.  Note we
> > need both KVM and x86 maintainer approval for SEAMCALL
> > wrappers like tdh_vp_enter().
> > 
> > As proposed, that ends up with putting the values back into
> > vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
> > pretty:
> 
> If needed we can revert this patch, it's not a big problem.

I don't care terribly about the SEAMCALL interfaces.  I have opinions on what
I think would be ideal, but I can live with whatever.

What I do deeply care about though is consistency within KVM, across vendors and
VM flavors.  And that means that guest registers absolutely need to be captured in
vcpu->arch.regs[].  TDX already requires too much special cased code in KVM, there
is zero reason to make TDX even more different and thus more difficult to maintain.

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Adrian Hunter 1 year, 1 month ago

On 10/12/24 22:03, Sean Christopherson wrote:
> On Tue, Dec 10, 2024, Paolo Bonzini wrote:
>> On 11/28/24 09:38, Adrian Hunter wrote:
>>>
>>> For TDX, there is an RFC relating to using descriptively
>>> named parameters instead of register names for tdh_vp_enter():
>>>
>>> 	https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
>>>
>>> Please do give some feedback on that approach.  Note we
>>> need both KVM and x86 maintainer approval for SEAMCALL
>>> wrappers like tdh_vp_enter().
>>>
>>> As proposed, that ends up with putting the values back into
>>> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
>>> pretty:
>>
>> If needed we can revert this patch, it's not a big problem.
> 
> I don't care terribly about the SEAMCALL interfaces.  I have opinions on what
> I think would be ideal, but I can live with whatever.
> 
> What I do deeply care about though is consistency within KVM, across vendors and
> VM flavors.  And that means that guest registers absolutely need to be captured in
> vcpu->arch.regs[].

In general, TDX host VMM does not know what guest register
values are.

This case, where some GPRs are passed to the host VMM via
arguments of the TDG.VP.VMCALL TDCALL, is really just a
side effect of the choice of argument passing rather than
any attempt to share guest registers with the host VMM.

It could be regarded as more consistent to never use
vcpu->arch.regs[] for confidential guests.

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Sean Christopherson 1 year, 1 month ago

On Thu, Dec 12, 2024, Adrian Hunter wrote:
> On 10/12/24 22:03, Sean Christopherson wrote:
> > On Tue, Dec 10, 2024, Paolo Bonzini wrote:
> >> On 11/28/24 09:38, Adrian Hunter wrote:
> >>>
> >>> For TDX, there is an RFC relating to using descriptively
> >>> named parameters instead of register names for tdh_vp_enter():
> >>>
> >>> 	https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
> >>>
> >>> Please do give some feedback on that approach.  Note we
> >>> need both KVM and x86 maintainer approval for SEAMCALL
> >>> wrappers like tdh_vp_enter().
> >>>
> >>> As proposed, that ends up with putting the values back into
> >>> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
> >>> pretty:
> >>
> >> If needed we can revert this patch, it's not a big problem.
> > 
> > I don't care terribly about the SEAMCALL interfaces.  I have opinions on what
> > I think would be ideal, but I can live with whatever.
> > 
> > What I do deeply care about though is consistency within KVM, across vendors and
> > VM flavors.  And that means that guest registers absolutely need to be captured in
> > vcpu->arch.regs[].
> 
> In general, TDX host VMM does not know what guest register values are.
> 
> This case, where some GPRs are passed to the host VMM via arguments of the
> TDG.VP.VMCALL TDCALL, is really just a side effect of the choice of argument
> passing rather than any attempt to share guest registers with the host VMM.
> 
> It could be regarded as more consistent to never use vcpu->arch.regs[] for
> confidential guests.

SEV-ES+ marshalls data to/from the GHCB to KVM's register array, because the GHCB
spec was intentionally crafted to allow hypervisors to reuse exit-handling code.
Granted, that's only for R{A,B,C,D}X and RSI, but the other GPRs should never be
used and thus their data is irrelevant.

Which applies to TDX as well.  For regs[], it's really only TDVMCALL that I care
about, i.e. cases where GPRs hold guest values, versus things like EXIT_QUALIFICATION
where the GPR is simply TDX's way of communicating information to the hypervisor.

Argh.  The reason I care about putting vCPU state into regs[] is because it helps
share code between vendors.  Looking at kvm-coco-queue, TDX support wandered in
the opposite direction.  E.g. TDX rolls its own RDMSR, WRMSR, CPUID, and HYPERCALL
implementations, which is quite frustrating.  Ditto for things like EXIT_REASON,
EXIT_QUALIFICATION, EXIT_INTR_INFO, etc.

For EXIT_REASON in particular, I think maintaining the guest-requested exit reason
(via TDVMCALL) in a separate field is a mistake.  Readers shouldn't have to care
that a HLT exit technically was requested via TDVMCALL.  If KVM instead immediately
morphs the requested exit reason to KVM's tracked exit_reason, then there's no need
to deal with the TDVMCALL layer in flows that don't care.  The only danger is a
collision with an EXIT_REASON_EPT_MISCONFIG from the TDX module, but that's easy
enough to handle.

And even where TDX and VMX have shared some code, IMO it doesn't go far enough.
E.g. having vcpu_tdx and vcpu_vmx open code their own version of the posted
interrupt fields, just to avoid minimal churn in the VMX code, is beyond gross.

Even concepts like guest_state_loaded are unnecessarily different for TDX.  Yes,
I get that host state doesn't need to be reloaded if KVM doesn't actually
enter the guest.  But holy moly, we're talking about avoiding _one_ WRMSR in an
extremely rare path (late abort of entry), at the cost of making TDX frustratingly
different from VMX.

Making TDX look more like everything else isn't just about code sharing.  It's also
about providing a familiar setting so that readers who know almost nothing about
TDX can find their way around without having to effectively learn an entirely new
"architecture" *and* code base.

Hacking around, I think the attached half-baked diff will provide a middle ground
for the regs[] vs. struct/union issue.  The basic gist is to essentially treat
TDX's register ABI as a faster version of VMREAD/VMWRITE, e.g. marshall state
to/from the appropriate x86 registers as needed.  That way, regs[] holds the correct
state and so TDX can reuse much of KVM's existing code verbatim, while allowing
the kernel's VP_ENTER API to evolve independently.
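
The attached diff is not reproduced in this thread, so the following is only a rough userspace sketch of the idea as described above, not the actual proposal. Every name in it is hypothetical: struct vmcall_args loosely mirrors the fn/subfn/p2/p3/p4 fields from the RFC snippet earlier in the thread, and REG_R10..REG_R14 stand in for KVM's register indices. Exit handling marshals the TDVMCALL arguments into regs[], runs common code that only knows about regs[], then marshals the result back before re-entry.

```c
#include <assert.h>

/* Hypothetical register indices, standing in for KVM's regs[] layout. */
enum { REG_R10 = 10, REG_R11, REG_R12, REG_R13, REG_R14, NR_REGS = 16 };

/* Hypothetical descriptively-named argument block for the enter/exit API. */
struct vmcall_args {
	unsigned long fn, subfn, p2, p3, p4;
	unsigned long ret;
};

struct vcpu {
	unsigned long regs[NR_REGS];
	struct vmcall_args args;
};

/* On exit: copy guest-provided values into the common register array. */
void args_to_regs(struct vcpu *v)
{
	v->regs[REG_R10] = v->args.fn;
	v->regs[REG_R11] = v->args.subfn;
	v->regs[REG_R12] = v->args.p2;
	v->regs[REG_R13] = v->args.p3;
	v->regs[REG_R14] = v->args.p4;
}

/* Before re-entry: copy the result back out of the register array. */
void regs_to_args(struct vcpu *v)
{
	v->args.ret = v->regs[REG_R10];
}

/* "Common" code: knows only about regs[], nothing vendor specific. */
int common_hypercall(struct vcpu *v, int nr, int a0, int a1)
{
	v->regs[nr] = v->regs[a0] + v->regs[a1];	/* pretend result */
	return 1;
}

int handle_vmcall(struct vcpu *v)
{
	int r;

	args_to_regs(v);
	r = common_hypercall(v, REG_R10, REG_R11, REG_R12);
	regs_to_args(v);
	return r;
}
```

The point of the split is that common_hypercall() never sees struct vmcall_args, so the argument block can change shape without touching shared code.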

Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro

Posted by Paolo Bonzini 1 year, 1 month ago

On 12/12/24 08:32, Adrian Hunter wrote:
> On 10/12/24 22:03, Sean Christopherson wrote:
>> What I do deeply care about though is consistency within KVM, across vendors and
>> VM flavors.  And that means that guest registers absolutely need to be captured in
>> vcpu->arch.regs[].
> 
> In general, TDX host VMM does not know what guest register
> values are.
> 
> This case, where some GPRs are passed to the host VMM via
> arguments of the TDG.VP.VMCALL TDCALL, is really just a
> side effect of the choice of argument passing rather than
> any attempt to share guest registers with the host VMM.
> 
> It could be regarded as more consistent to never use
> vcpu->arch.regs[] for confidential guests.

Yes, that's where I stand as well.  There are reasons to use
vcpu->arch.regs[] when "decrypted" values are available, and reasons to
not use it at all.  Either could be considered the more consistent
choice, and I think I slightly prefer the latter, but it's definitely
not a hill to die on...

Paolo