[PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC

Isaku Yamahata posted 2 patches 1 year, 3 months ago
There is a newer version of this series
arch/x86/include/asm/kvm_host.h |  1 +
arch/x86/kvm/x86.c              | 21 ++++++++++++++-------
2 files changed, 15 insertions(+), 7 deletions(-)
[PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Isaku Yamahata 1 year, 3 months ago
This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
included at the last.  The test is done by create TDX vCPU and run, get TSC
offset via vCPU device attributes and compare it with the TDX TSC OFFSET
metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
include it in this patch series.


Background
----------
X86 confidential computing technology defines protected guest TSC so that the
VMM can't change the TSC offset/multiplier once vCPU is initialized and the
guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
retrieve them.

On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
(kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).


Problem
-------
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
logic to change/adjust the TSC offset/multiplier somehow.

Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupts are injected to the guest at the
wrong time if the KVM TSC offset is different from what the TDX module
determined.

Originally the issue was found by cyclic test of rt-test [1] as the latency in
TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
the KVM TSC offset is different from what the TDX module determines.


Solution
--------
The solution is to keep the KVM TSC offset/multiplier the same as the value of
the TDX module somehow.  Possible solutions are as follows.
- Skip the logic
  Ignore (or don't call related functions) the request to change the TSC
  offset/multiplier.
  Pros
  - Logically clean.  This is similar to the guest_protected case.
  Cons
  - Needs to identify the call sites.

- Revert the change at the hooks after TSC adjustment
  x86 KVM defines the vendor hooks when the TSC offset/multiplier are
  changed.  The callback can revert the change.
  Pros
  - We don't need to care about the logic to change the TSC offset/multiplier.
  Cons:
  - Hacky to revert the KVM x86 common code logic.

Choose the first one.  With this patch series, SEV-SNP secure TSC can be
supported.


Patches:
1: Preparation for the next patch
2: Skip the logic to adjust the TSC offset/multiplier in the common x86 KVM logic

[1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git

Changes for TDX KVM

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8785309ccb46..969da729d89f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -694,8 +712,6 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.cr0_guest_owned_bits = -1ul;
 	vcpu->arch.cr4_guest_owned_bits = -1ul;
 
-	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
-	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
 	/*
 	 * TODO: support off-TD debug.  If TD DEBUG is enabled, guest state
 	 * can be accessed. guest_state_protected = false. and kvm ioctl to
@@ -706,6 +722,13 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	 */
 	vcpu->arch.guest_state_protected = true;
 
+	/* VMM can't change TSC offset/multiplier as TDX module manages them. */
+	vcpu->arch.guest_tsc_protected = true;
+	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
+	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
+	vcpu->arch.tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+	vcpu->arch.l1_tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+
 	if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE)
 		vcpu->arch.xfd_no_write_intercept = true;
 
@@ -2674,6 +2697,7 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 		goto out;
 
 	kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET);
+	kvm_tdx->tsc_multiplier = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_MULTIPLIER);
 	kvm_tdx->attributes = td_params->attributes;
 	kvm_tdx->xfam = td_params->xfam;
 
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 614b1c3b8483..c0e4fa61cab1 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -42,6 +42,7 @@ struct kvm_tdx {
 	bool tsx_supported;
 
 	u64 tsc_offset;
+	u64 tsc_multiplier;
 
 	enum kvm_tdx_state state;
 
diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
index 861c0f649b69..be4cf65c90a8 100644
--- a/arch/x86/kvm/vmx/tdx_arch.h
+++ b/arch/x86/kvm/vmx/tdx_arch.h
@@ -69,6 +69,7 @@
 
 enum tdx_tdcs_execution_control {
 	TD_TDCS_EXEC_TSC_OFFSET = 10,
+	TD_TDCS_EXEC_TSC_MULTIPLIER = 11,
 };
 
 enum tdx_vcpu_guest_other_state {

---
Isaku Yamahata (2):
  KVM: x86: Push down setting vcpu.arch.user_set_tsc
  KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)


base-commit: 909f9d422f59f863d7b6e4e2c6e57abb97a27d4d
-- 
2.45.2
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Paolo Bonzini 11 months ago
On 10/12/24 09:55, Isaku Yamahata wrote:
> The current x86 KVM implementation conflicts with protected TSC because the
> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> logic to change/adjust the TSC offset/multiplier somehow.
> 
> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> offset/multiplier, the TSC timer interrupts are injected to the guest at the
> wrong time if the KVM TSC offset is different from what the TDX module
> determined.
> 
> Originally the issue was found by cyclic test of rt-test [1] as the latency in
> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
> the KVM TSC offset is different from what the TDX module determines.
> 
> The solution is to keep the KVM TSC offset/multiplier the same as the value of
> the TDX module somehow. [...] Ignore (or don't call related functions) the
> request to change the TSC offset/multiplier.
> 
> [...]  With this patch series, SEV-SNP secure TSC can be supported.

Thanks, I've squashed these changes (apart from setting
vcpu->arch.guest_tsc_protected) into the corresponding patches in
kvm-coco-queue.  Just one small change is needed in patch 2, to
which I will reply.

For SEV-SNP, all that's necessary on top should be

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index d92e97baea0f..beddeed90ff0 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2481,6 +2481,9 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
  		}
  
  		svm->vcpu.arch.guest_state_protected = true;
+		if (snp_secure_tsc_enabled(kvm))
+			svm->vcpu.arch.guest_tsc_protected = true;
+
  		/*
  		 * SEV-ES (and thus SNP) guest mandates LBR Virtualization to
  		 * be _always_ ON. Enable it only after setting

For the sake of testing, I applied the latest SEV-SNP host patches
from https://github.com/AMDESE/linux-kvm/commits/sectsc-host-latest
to kvm-coco-queue as well, plus the above hunk; Nikunj can integrate
it in the next revision of
https://lore.kernel.org/kvm/20250310064347.13986-1-nikunj@amd.com/T/.

Paolo
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Nikunj A. Dadhania 11 months ago

On 3/12/2025 5:52 PM, Paolo Bonzini wrote:
> On 10/12/24 09:55, Isaku Yamahata wrote:
>> The current x86 KVM implementation conflicts with protected TSC because the
>> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
>> logic to change/adjust the TSC offset/multiplier somehow.
>>
>> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
>> offset/multiplier, the TSC timer interrupts are injected to the guest at the
>> wrong time if the KVM TSC offset is different from what the TDX module
>> determined.
>>
>> Originally the issue was found by cyclic test of rt-test [1] as the latency in
>> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
>> the KVM TSC offset is different from what the TDX module determines.
>>
>> The solution is to keep the KVM TSC offset/multiplier the same as the value of
>> the TDX module somehow. [...] Ignore (or don't call related functions) the
>> request to change the TSC offset/multiplier.
>>
>> [...]  With this patch series, SEV-SNP secure TSC can be supported.
> 
> Thanks, I've squashed these changes (apart from setting
> vcpu->arch.guest_tsc_protected) into the corresponding patches in
> kvm-coco-queue.  Just one small change is needed in patch 2, to
> which I will reply.
> 
> For SEV-SNP, all that's necessary on top should be
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index d92e97baea0f..beddeed90ff0 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2481,6 +2481,9 @@ static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
>          }
>  
>          svm->vcpu.arch.guest_state_protected = true;
> +        if (snp_secure_tsc_enabled(kvm))
> +            svm->vcpu.arch.guest_tsc_protected = true;
> +
>          /*
>           * SEV-ES (and thus SNP) guest mandates LBR Virtualization to
>           * be _always_ ON. Enable it only after setting
> 
> For the sake of testing, I applied the latest SEV-SNP host patches
> from https://github.com/AMDESE/linux-kvm/commits/sectsc-host-latest
> to kvm-coco-queue as well, plus the above hunk; Nikunj can integrate
> it in the next revision of
> https://lore.kernel.org/kvm/20250310064347.13986-1-nikunj@amd.com/T/.

Sure Paolo, I will add the above changes in my next revision. 
Should I rebase the Secure TSC host patches on top of kvm-coco-queue ?

Regards
Nikunj
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Marcelo Tosatti 11 months ago
On Sat, Oct 12, 2024 at 12:55:54AM -0700, Isaku Yamahata wrote:
> This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
> included at the last.  The test is done by create TDX vCPU and run, get TSC
> offset via vCPU device attributes and compare it with the TDX TSC OFFSET
> metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
> include it in this patch series.

OK, previous results were incorrect. In fact, this patches (which apply
cleanly to current kvm-coco-queue) reduce cyclictest latency from:

Max Latencies: 00167 00160
Max Latencies: 00132 00151
Max Latencies: 00138 00142
Max Latencies: 02512 02582
Max Latencies: 00139 00140
Max Latencies: 00128 00131
Max Latencies: 00131 00132
Max Latencies: 00131 00134
Max Latencies: 00136 00147
Max Latencies: 00153 00135
Max Latencies: 00138 00138

to:

Max Latencies: 00134 00131                                                                                  
Max Latencies: 00130 00129                                                                                  
Max Latencies: 00126 00141                                                                                 
Max Latencies: 00137 00138                                                                                  
Max Latencies: 00123 00115                                                                                  
Max Latencies: 00119 00127                                                                                  
Max Latencies: 00131 00104                                                                                  
Max Latencies: 00137 00127                                                                                  
Max Latencies: 00135 00126                                                                                  
Max Latencies: 00128 00142                                                                                  
Max Latencies: 00135 00138         

> 
> 
> Background
> ----------
> X86 confidential computing technology defines protected guest TSC so that the
> VMM can't change the TSC offset/multiplier once vCPU is initialized and the
> guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
> it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
> retrieve them.
> 
> On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
> offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
> creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
> (kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
> guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
> 
> 
> Problem
> -------
> The current x86 KVM implementation conflicts with protected TSC because the
> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> logic to change/adjust the TSC offset/multiplier somehow.
> 
> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> offset/multiplier, the TSC timer interrupts are injected to the guest at the
> wrong time if the KVM TSC offset is different from what the TDX module
> determined.
> 
> Originally the issue was found by cyclic test of rt-test [1] as the latency in
> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
> the KVM TSC offset is different from what the TDX module determines.
> 
> 
> Solution
> --------
> The solution is to keep the KVM TSC offset/multiplier the same as the value of
> the TDX module somehow.  Possible solutions are as follows.
> - Skip the logic
>   Ignore (or don't call related functions) the request to change the TSC
>   offset/multiplier.
>   Pros
>   - Logically clean.  This is similar to the guest_protected case.
>   Cons
>   - Needs to identify the call sites.
> 
> - Revert the change at the hooks after TSC adjustment
>   x86 KVM defines the vendor hooks when the TSC offset/multiplier are
>   changed.  The callback can revert the change.
>   Pros
>   - We don't need to care about the logic to change the TSC offset/multiplier.
>   Cons:
>   - Hacky to revert the KVM x86 common code logic.
> 
> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> supported.
> 
> 
> Patches:
> 1: Preparation for the next patch
> 2: Skip the logic to adjust the TSC offset/multiplier in the common x86 KVM logic
> 
> [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
> 
> Changes for TDX KVM
> 
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 8785309ccb46..969da729d89f 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -694,8 +712,6 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
>  	vcpu->arch.cr0_guest_owned_bits = -1ul;
>  	vcpu->arch.cr4_guest_owned_bits = -1ul;
>  
> -	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
> -	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
>  	/*
>  	 * TODO: support off-TD debug.  If TD DEBUG is enabled, guest state
>  	 * can be accessed. guest_state_protected = false. and kvm ioctl to
> @@ -706,6 +722,13 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
>  	 */
>  	vcpu->arch.guest_state_protected = true;
>  
> +	/* VMM can't change TSC offset/multiplier as TDX module manages them. */
> +	vcpu->arch.guest_tsc_protected = true;
> +	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
> +	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
> +	vcpu->arch.tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
> +	vcpu->arch.l1_tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
> +
>  	if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE)
>  		vcpu->arch.xfd_no_write_intercept = true;
>  
> @@ -2674,6 +2697,7 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
>  		goto out;
>  
>  	kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET);
> +	kvm_tdx->tsc_multiplier = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_MULTIPLIER);
>  	kvm_tdx->attributes = td_params->attributes;
>  	kvm_tdx->xfam = td_params->xfam;
>  
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index 614b1c3b8483..c0e4fa61cab1 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -42,6 +42,7 @@ struct kvm_tdx {
>  	bool tsx_supported;
>  
>  	u64 tsc_offset;
> +	u64 tsc_multiplier;
>  
>  	enum kvm_tdx_state state;
>  
> diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
> index 861c0f649b69..be4cf65c90a8 100644
> --- a/arch/x86/kvm/vmx/tdx_arch.h
> +++ b/arch/x86/kvm/vmx/tdx_arch.h
> @@ -69,6 +69,7 @@
>  
>  enum tdx_tdcs_execution_control {
>  	TD_TDCS_EXEC_TSC_OFFSET = 10,
> +	TD_TDCS_EXEC_TSC_MULTIPLIER = 11,
>  };
>  
>  enum tdx_vcpu_guest_other_state {
> 
> ---
> Isaku Yamahata (2):
>   KVM: x86: Push down setting vcpu.arch.user_set_tsc
>   KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
> 
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/x86.c              | 21 ++++++++++++++-------
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> 
> base-commit: 909f9d422f59f863d7b6e4e2c6e57abb97a27d4d
> -- 
> 2.45.2
> 
>
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Isaku Yamahata 11 months ago
On Tue, Mar 11, 2025 at 10:13:30PM -0300,
Marcelo Tosatti <mtosatti@redhat.com> wrote:

> On Sat, Oct 12, 2024 at 12:55:54AM -0700, Isaku Yamahata wrote:
> > This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
> > included at the last.  The test is done by create TDX vCPU and run, get TSC
> > offset via vCPU device attributes and compare it with the TDX TSC OFFSET
> > metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
> > include it in this patch series.
> 
> OK, previous results were incorrect. In fact, this patches (which apply
> cleanly to current kvm-coco-queue) reduce cyclictest latency from:

Thank you for evaluating it.  Does your guest include [1] (or similar fix)?
The patch would affect the result much.

[1] https://lore.kernel.org/lkml/20250228014416.3925664-1-vannapurve@google.com/
-- 
Isaku Yamahata <isaku.yamahata@intel.com>
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Nikunj A. Dadhania 1 year, 3 months ago
Hi Isaku,

On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
> This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
> included at the last.  The test is done by create TDX vCPU and run, get TSC
> offset via vCPU device attributes and compare it with the TDX TSC OFFSET
> metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
> include it in this patch series.
> 
> 
> Background
> ----------
> X86 confidential computing technology defines protected guest TSC so that the
> VMM can't change the TSC offset/multiplier once vCPU is initialized and the
> guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
> it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
> retrieve them.
> 
> On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
> offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
> creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
> (kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
> guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
> 
> 
> Problem
> -------
> The current x86 KVM implementation conflicts with protected TSC because the
> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> logic to change/adjust the TSC offset/multiplier somehow.
> 
> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> offset/multiplier, the TSC timer interrupts are injected to the guest at the
> wrong time if the KVM TSC offset is different from what the TDX module
> determined.
> 
> Originally the issue was found by cyclic test of rt-test [1] as the latency in
> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
> the KVM TSC offset is different from what the TDX module determines.

Can you provide what is the exact command line to reproduce this problem ? 
Any links to this reported issue ?

> 
> 
> Solution
> --------
> The solution is to keep the KVM TSC offset/multiplier the same as the value of
> the TDX module somehow.  Possible solutions are as follows.
> - Skip the logic
>   Ignore (or don't call related functions) the request to change the TSC
>   offset/multiplier.
>   Pros
>   - Logically clean.  This is similar to the guest_protected case.
>   Cons
>   - Needs to identify the call sites.
> 
> - Revert the change at the hooks after TSC adjustment
>   x86 KVM defines the vendor hooks when the TSC offset/multiplier are
>   changed.  The callback can revert the change.
>   Pros
>   - We don't need to care about the logic to change the TSC offset/multiplier.
>   Cons:
>   - Hacky to revert the KVM x86 common code logic.
> 
> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> supported.

I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and 
GUEST_TSC_SCALE are only available to the guest.

Regards,
Nikunj
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Isaku Yamahata 1 year, 2 months ago
On Mon, Oct 14, 2024 at 08:17:19PM +0530,
"Nikunj A. Dadhania" <nikunj@amd.com> wrote:

> > Solution
> > --------
> > The solution is to keep the KVM TSC offset/multiplier the same as the value of
> > the TDX module somehow.  Possible solutions are as follows.
> > - Skip the logic
> >   Ignore (or don't call related functions) the request to change the TSC
> >   offset/multiplier.
> >   Pros
> >   - Logically clean.  This is similar to the guest_protected case.
> >   Cons
> >   - Needs to identify the call sites.
> > 
> > - Revert the change at the hooks after TSC adjustment
> >   x86 KVM defines the vendor hooks when the TSC offset/multiplier are
> >   changed.  The callback can revert the change.
> >   Pros
> >   - We don't need to care about the logic to change the TSC offset/multiplier.
> >   Cons:
> >   - Hacky to revert the KVM x86 common code logic.
> > 
> > Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> > supported.
> 
> I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and 
> GUEST_TSC_SCALE are only available to the guest.

Although Xiaoyao mentioned KVM emulated timer, let me clarify it.
I think the following is common for SEV-SNP and TDX.

The issue is with guest MSR_IA32_TSC_DEADLINE (and guest local APIC Timer
Initial Count Register).  As long as I understand according to the public
documentation, the SEV-SNP hardware (or the TDX module) doesn't virtualize it so
that the VMM (KVM) has to emulate it.
The KVM uses the host timer(vcpu->arch.apic.lapic_timer) and inject timer
interrupt withit.  KVM tracks TSC multiplier/offset to calculate the host
TSC timeout value based on them.

The KVM multiplier and offset must match with the values the hardware 
(or the TDX module) thinks.  If they don't match,  the KVM tsc deadline timer
(or local APIC timer) emulation is inaccurate.  Timer interrupt is injected
Late or early.

KVM has several points to change tsc multiplier/offset from the original value.
This patch series is to prevent KVM from modifying TSC multipler/offset.

In reality, it's difficult to notice that the timer interrupt is injected late
or early like several miliseconds due to vCPU scheduling.  The injecting timer
interrupt always late harms time sensitive workload (e.g. heart beat) in guest
as Marcelo noticed.

Thanks,
-- 
Isaku Yamahata <isaku.yamahata@intel.com>
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Marcelo Tosatti 1 year, 3 months ago
On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
> Hi Isaku,
> 
> On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
> > This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
> > included at the last.  The test is done by create TDX vCPU and run, get TSC
> > offset via vCPU device attributes and compare it with the TDX TSC OFFSET
> > metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
> > include it in this patch series.
> > 
> > 
> > Background
> > ----------
> > X86 confidential computing technology defines protected guest TSC so that the
> > VMM can't change the TSC offset/multiplier once vCPU is initialized and the
> > guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
> > it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
> > retrieve them.
> > 
> > On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
> > offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
> > creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
> > (kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
> > guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
> > 
> > 
> > Problem
> > -------
> > The current x86 KVM implementation conflicts with protected TSC because the
> > VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> > logic to change/adjust the TSC offset/multiplier somehow.
> > 
> > Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> > offset/multiplier, the TSC timer interrupts are injected to the guest at the
> > wrong time if the KVM TSC offset is different from what the TDX module
> > determined.
> > 
> > Originally the issue was found by cyclic test of rt-test [1] as the latency in
> > TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
> > the KVM TSC offset is different from what the TDX module determines.
> 
> Can you provide what is the exact command line to reproduce this problem ? 

Nikunj,

Run cyclictest, on an isolated CPU, in a VM. For the maximum latency
metric, rather than 50us, one gets 500us at times.

> Any links to this reported issue ?

This was not posted publically. But its not hard to reproduce.

> > Solution
> > --------
> > The solution is to keep the KVM TSC offset/multiplier the same as the value of
> > the TDX module somehow.  Possible solutions are as follows.
> > - Skip the logic
> >   Ignore (or don't call related functions) the request to change the TSC
> >   offset/multiplier.
> >   Pros
> >   - Logically clean.  This is similar to the guest_protected case.
> >   Cons
> >   - Needs to identify the call sites.
> > 
> > - Revert the change at the hooks after TSC adjustment
> >   x86 KVM defines the vendor hooks when the TSC offset/multiplier are
> >   changed.  The callback can revert the change.
> >   Pros
> >   - We don't need to care about the logic to change the TSC offset/multiplier.
> >   Cons:
> >   - Hacky to revert the KVM x86 common code logic.
> > 
> > Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> > supported.
> 
> I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and 
> GUEST_TSC_SCALE are only available to the guest.

Nikunj,

FYI:

SEV-SNP processors (at least the one below) do not seem affected by this problem.

At least this one:

vendor_id	: AuthenticAMD
cpu family	: 25
model		: 17
model name	: AMD EPYC 9124 16-Core Processor
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Nikunj A. Dadhania 1 year, 3 months ago
Hi Marcelo,

On 10/25/2024 9:54 PM, Marcelo Tosatti wrote:
> On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
>> On 10/12/2024 1:25 PM, Isaku Yamahata wrote:

>>> Problem
>>> -------
>>> The current x86 KVM implementation conflicts with protected TSC because the
>>> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
>>> logic to change/adjust the TSC offset/multiplier somehow.
>>>
>>> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
>>> offset/multiplier, the TSC timer interrupts are injected to the guest at the
>>> wrong time if the KVM TSC offset is different from what the TDX module
>>> determined.
>>>
>>> Originally the issue was found by cyclic test of rt-test [1] as the latency in
>>> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
>>> the KVM TSC offset is different from what the TDX module determines.
>>
>> Can you provide what is the exact command line to reproduce this problem ? 
> 
> Nikunj,
> 
> Run cyclictest, on an isolated CPU, in a VM. For the maximum latency
> metric, rather than 50us, one gets 500us at times.

I tried out the cyclictest after referring to the documentation[1]. Here are the
results of the run on an isolated CPU in the Secure TSC-enabled SEV-SNP VM (CPUs
16-31 are isolated in the VM):

$ sudo taskset -c 16-31 ./cyclictest --mlockall --priority=80 --interval=200 --distance=0 -q -D 5m
T: 0 ( 1226) P:80 I:200 C:1500000 Min:      6 Act:   10 Avg:   16 Max:     150
$

VM detail: 32 vCPUs VM, guest kernel v6.12-rc3 compiled with CONFIG_PREEMPT_RT=y

> 
> FYI:
> 
> SEV-SNP processors (at least the one below) do not seem affected by this problem.

Thanks for testing on SEV-SNP.

Regards,
Nikunj

1. https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cyclictest/start
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Xiaoyao Li 1 year, 3 months ago
On 10/26/2024 12:24 AM, Marcelo Tosatti wrote:
> On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
>> Hi Isaku,
>>
>> On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
>>> This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
>>> included at the last.  The test is done by create TDX vCPU and run, get TSC
>>> offset via vCPU device attributes and compare it with the TDX TSC OFFSET
>>> metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
>>> include it in this patch series.
>>>
>>>
>>> Background
>>> ----------
>>> X86 confidential computing technology defines protected guest TSC so that the
>>> VMM can't change the TSC offset/multiplier once vCPU is initialized and the
>>> guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
>>> it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
>>> retrieve them.
>>>
>>> On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
>>> offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
>>> creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
>>> (kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
>>> guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
>>>
>>>
>>> Problem
>>> -------
>>> The current x86 KVM implementation conflicts with protected TSC because the
>>> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
>>> logic to change/adjust the TSC offset/multiplier somehow.
>>>
>>> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
>>> offset/multiplier, the TSC timer interrupts are injected to the guest at the
>>> wrong time if the KVM TSC offset is different from what the TDX module
>>> determined.
>>>
>>> Originally the issue was found by cyclic test of rt-test [1] as the latency in
>>> TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
>>> the KVM TSC offset is different from what the TDX module determines.
>>
>> Can you provide what is the exact command line to reproduce this problem ?
> 
> Nikunj,
> 
> Run cyclictest, on an isolated CPU, in a VM. For the maximum latency
> metric, rather than 50us, one gets 500us at times.
> 
>> Any links to this reported issue ?
> 
> This was not posted publically. But its not hard to reproduce.
> 
>>> Solution
>>> --------
>>> The solution is to keep the KVM TSC offset/multiplier the same as the value of
>>> the TDX module somehow.  Possible solutions are as follows.
>>> - Skip the logic
>>>    Ignore (or don't call related functions) the request to change the TSC
>>>    offset/multiplier.
>>>    Pros
>>>    - Logically clean.  This is similar to the guest_protected case.
>>>    Cons
>>>    - Needs to identify the call sites.
>>>
>>> - Revert the change at the hooks after TSC adjustment
>>>    x86 KVM defines the vendor hooks when the TSC offset/multiplier are
>>>    changed.  The callback can revert the change.
>>>    Pros
>>>    - We don't need to care about the logic to change the TSC offset/multiplier.
>>>    Cons:
>>>    - Hacky to revert the KVM x86 common code logic.
>>>
>>> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
>>> supported.
>>
>> I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and
>> GUEST_TSC_SCALE are only available to the guest.
> 
> Nikunj,
> 
> FYI:
> 
> SEV-SNP processors (at least the one below) do not seem affected by this problem.

Did you apply Secure TSC patches of (guest kernel, KVM and QEMU) 
manualy? because none of them are merged. Otherwise, I think SNP guest 
is still using KVM emulated TSC.

> At least this one:
> 
> vendor_id	: AuthenticAMD
> cpu family	: 25
> model		: 17
> model name	: AMD EPYC 9124 16-Core Processor
> 
>
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Marcelo Tosatti 1 year, 3 months ago
On Sun, Oct 27, 2024 at 10:06:17PM +0800, Xiaoyao Li wrote:
> On 10/26/2024 12:24 AM, Marcelo Tosatti wrote:
> > On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
> > > Hi Isaku,
> > > 
> > > On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
> > > > This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
> > > > included at the last.  The test is done by create TDX vCPU and run, get TSC
> > > > offset via vCPU device attributes and compare it with the TDX TSC OFFSET
> > > > metadata.  Because the test requires the TDX KVM and TDX KVM kselftests, don't
> > > > include it in this patch series.
> > > > 
> > > > 
> > > > Background
> > > > ----------
> > > > X86 confidential computing technology defines protected guest TSC so that the
> > > > VMM can't change the TSC offset/multiplier once vCPU is initialized and the
> > > > guest can trust TSC.  The SEV-SNP defines Secure TSC as optional.  TDX mandates
> > > > it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
> > > > retrieve them.
> > > > 
> > > > On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
> > > > offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
> > > > creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
> > > > (kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
> > > > guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
> > > > 
> > > > 
> > > > Problem
> > > > -------
> > > > The current x86 KVM implementation conflicts with protected TSC because the
> > > > VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> > > > logic to change/adjust the TSC offset/multiplier somehow.
> > > > 
> > > > Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> > > > offset/multiplier, the TSC timer interrupts are injected to the guest at the
> > > > wrong time if the KVM TSC offset is different from what the TDX module
> > > > determined.
> > > > 
> > > > Originally the issue was found by cyclic test of rt-test [1] as the latency in
> > > > TDX case is worse than VMX value + TDX SEAMCALL overhead.  It turned out that
> > > > the KVM TSC offset is different from what the TDX module determines.
> > > 
> > > Can you provide what is the exact command line to reproduce this problem ?
> > 
> > Nikunj,
> > 
> > Run cyclictest, on an isolated CPU, in a VM. For the maximum latency
> > metric, rather than 50us, one gets 500us at times.
> > 
> > > Any links to this reported issue ?
> > 
> > This was not posted publically. But its not hard to reproduce.
> > 
> > > > Solution
> > > > --------
> > > > The solution is to keep the KVM TSC offset/multiplier the same as the value of
> > > > the TDX module somehow.  Possible solutions are as follows.
> > > > - Skip the logic
> > > >    Ignore (or don't call related functions) the request to change the TSC
> > > >    offset/multiplier.
> > > >    Pros
> > > >    - Logically clean.  This is similar to the guest_protected case.
> > > >    Cons
> > > >    - Needs to identify the call sites.
> > > > 
> > > > - Revert the change at the hooks after TSC adjustment
> > > >    x86 KVM defines the vendor hooks when the TSC offset/multiplier are
> > > >    changed.  The callback can revert the change.
> > > >    Pros
> > > >    - We don't need to care about the logic to change the TSC offset/multiplier.
> > > >    Cons:
> > > >    - Hacky to revert the KVM x86 common code logic.
> > > > 
> > > > Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> > > > supported.
> > > 
> > > I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and
> > > GUEST_TSC_SCALE are only available to the guest.
> > 
> > Nikunj,
> > 
> > FYI:
> > 
> > SEV-SNP processors (at least the one below) do not seem affected by this problem.
> 
> Did you apply Secure TSC patches of (guest kernel, KVM and QEMU) manualy?
> because none of them are merged. 

Yes. cyclictest latency, on a system configured with tuned
realtime-virtual-host/realtime-virtual-guest tuned profiles,
goes from 30us to 50us.

> Otherwise, I think SNP guest is still using
> KVM emulated TSC.

Not in the case the test was made.
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Nikunj A. Dadhania 1 year, 3 months ago
Hello Marcelo

On 10/28/2024 10:12 PM, Marcelo Tosatti wrote:
> On Sun, Oct 27, 2024 at 10:06:17PM +0800, Xiaoyao Li wrote:
>> On 10/26/2024 12:24 AM, Marcelo Tosatti wrote:
>>> On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
>>>> Hi Isaku,
>>>>
>>>> On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
>>>>> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
>>>>> supported.
>>>>
>>>> I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and
>>>> GUEST_TSC_SCALE are only available to the guest.
>>>
>>> Nikunj,
>>>
>>> FYI:
>>>
>>> SEV-SNP processors (at least the one below) do not seem affected by this problem.
>>
>> Did you apply Secure TSC patches of (guest kernel, KVM and QEMU) manualy?
>> because none of them are merged. 
> 
> Yes. cyclictest latency, on a system configured with tuned
> realtime-virtual-host/realtime-virtual-guest tuned profiles,
> goes from 30us to 50us.

Would you be ok if I include your Tested-by in the next version of my Secure TSC patches?

https://lore.kernel.org/lkml/20241028053431.3439593-1-nikunj@amd.com/

>> Otherwise, I think SNP guest is still using
>> KVM emulated TSC.
> 
> Not in the case the test was made.
> 

Regards,
Nikunj
Re: [PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC
Posted by Marcelo Tosatti 1 year, 3 months ago
On Tue, Oct 29, 2024 at 09:34:58AM +0530, Nikunj A. Dadhania wrote:
> Hello Marcelo
> 
> On 10/28/2024 10:12 PM, Marcelo Tosatti wrote:
> > On Sun, Oct 27, 2024 at 10:06:17PM +0800, Xiaoyao Li wrote:
> >> On 10/26/2024 12:24 AM, Marcelo Tosatti wrote:
> >>> On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
> >>>> Hi Isaku,
> >>>>
> >>>> On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
> >>>>> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> >>>>> supported.
> >>>>
> >>>> I am not sure how will this help SNP Secure TSC, as the GUEST_TSC_OFFSET and
> >>>> GUEST_TSC_SCALE are only available to the guest.
> >>>
> >>> Nikunj,
> >>>
> >>> FYI:
> >>>
> >>> SEV-SNP processors (at least the one below) do not seem affected by this problem.
> >>
> >> Did you apply Secure TSC patches of (guest kernel, KVM and QEMU) manualy?
> >> because none of them are merged. 
> > 
> > Yes. cyclictest latency, on a system configured with tuned
> > realtime-virtual-host/realtime-virtual-guest tuned profiles,
> > goes from 30us to 50us.
> 
> Would you be ok if I include your Tested-by in the next version of my Secure TSC patches?
> 
> https://lore.kernel.org/lkml/20241028053431.3439593-1-nikunj@amd.com/

Please don't, haven't tested specifically the patches above.

> >> Otherwise, I think SNP guest is still using
> >> KVM emulated TSC.
> > 
> > Not in the case the test was made.
> > 
> 
> Regards,
> Nikunj
> 
>
[PATCH 1/2] KVM: x86: Push down setting vcpu.arch.user_set_tsc
Posted by Isaku Yamahata 1 year, 3 months ago
Push down setting vcpu.arch.user_set_tsc to true from kvm_synchronize_tsc()
to __kvm_synchronize_tsc() so that the two callers don't have to modify
user_set_tsc directly as preparation.

Later, prohibit changing TSC synchronization for TDX guests to modify
__kvm_synchornize_tsc() change.  We don't want to touch caller sites not to
change user_set_tsc.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cb24e394d768..65d871bb5b35 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2644,12 +2644,15 @@ static inline bool kvm_check_tsc_unstable(void)
  * participates in.
  */
 static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
-				  u64 ns, bool matched)
+				  u64 ns, bool matched, bool user_set_tsc)
 {
 	struct kvm *kvm = vcpu->kvm;
 
 	lockdep_assert_held(&kvm->arch.tsc_write_lock);
 
+	if (user_set_tsc)
+		vcpu->kvm->arch.user_set_tsc = true;
+
 	/*
 	 * We also track th most recent recorded KHZ, write and time to
 	 * allow the matching interval to be extended at each write.
@@ -2735,8 +2738,6 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 *user_value)
 		}
 	}
 
-	if (user_value)
-		kvm->arch.user_set_tsc = true;
 
 	/*
 	 * For a reliable TSC, we can match TSC offsets, and for an unstable
@@ -2756,7 +2757,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 *user_value)
 		matched = true;
 	}
 
-	__kvm_synchronize_tsc(vcpu, offset, data, ns, matched);
+	__kvm_synchronize_tsc(vcpu, offset, data, ns, matched, !!user_value);
 	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 }
 
@@ -5760,8 +5761,7 @@ static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
 		tsc = kvm_scale_tsc(rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
 		ns = get_kvmclock_base_ns();
 
-		kvm->arch.user_set_tsc = true;
-		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched);
+		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched, true);
 		raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 
 		r = 0;
-- 
2.45.2
[PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
Posted by Isaku Yamahata 1 year, 3 months ago
Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
changing TSC offset/multiplier when guest_tsc_protected is true.

Background
X86 confidential computing technology defines protected guest TSC so that
the VMM can't change the TSC offset/multiplier once vCPU is initialized.
The SEV-SNP defines Secure TSC as optional.  TDX mandates it.  The TDX
module determines the TSC offset/multiplier.  The VMM has to retrieve them.

On the other hand, the x86 KVM common logic tries to guess or adjust TSC
offset/multiplier for better guest TSC and TSC interrupt latency at KVM
vCPU creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
(kvm_arch_vcpu_load()), vCPU TSC device attributes
(kvm_arch_tsc_set_attr()) and guest/host writing to TSC or TSC adjust MSR
(kvm_set_msr_common()).

Problem
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
logic to change/adjust the TSC offset/multiplier somehow.

Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupts is injected to the guest at the
wrong time if the KVM TSC offset is different from what the TDX module
determined.

Originally this issue was found by cyclic test of rt-test [1] as the
latency in TDX case is worse than VMX value + TDX SEAMCALL overhead.  It
turned out that the KVM TSC offset is different from what the TDX module
determines.

Solution
The solution is to keep the KVM TSC offset/multiplier the same as the value
of the TDX module somehow.  Possible solutions are as follows.
- Skip the logic
  Ignore (or don't call related functions) the request to change the TSC
  offset/multiplier.
  Pros
  - Logically clean.  This is similar to the guest_protected case.
  Cons
  - Needs to identify the call sites.

- Revert the change at the hooks after TSC adjustment
  x86 KVM defines the vendor hooks when TSC offset/multiplier are
  changed.  The callback can revert the change.
  Pros
  - We don't need to care about the logic to change the TSC
    offset/multiplier.
  Cons:
  - Hacky to revert the KVM x86 common code logic.

Choose the first one.  With this patch series, SEV-SNP secure TSC can be
supported.

[1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/x86.c              | 9 ++++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 61b7e9fe5e57..112b8a4f1860 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1036,6 +1036,7 @@ struct kvm_vcpu_arch {
 
 	/* Protected Guests */
 	bool guest_state_protected;
+	bool guest_tsc_protected;
 
 	/*
 	 * Set when PDPTS were loaded directly by the userspace without
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 65d871bb5b35..a6cf4422df28 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2587,6 +2587,9 @@ EXPORT_SYMBOL_GPL(kvm_calc_nested_tsc_multiplier);
 
 static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 l1_offset)
 {
+	if (vcpu->arch.guest_tsc_protected)
+		return;
+
 	trace_kvm_write_tsc_offset(vcpu->vcpu_id,
 				   vcpu->arch.l1_tsc_offset,
 				   l1_offset);
@@ -2650,6 +2653,9 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
 
 	lockdep_assert_held(&kvm->arch.tsc_write_lock);
 
+	if (vcpu->arch.guest_tsc_protected)
+		return;
+
 	if (user_set_tsc)
 		vcpu->kvm->arch.user_set_tsc = true;
 
@@ -5028,7 +5034,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 			u64 offset = kvm_compute_l1_tsc_offset(vcpu,
 						vcpu->arch.last_guest_tsc);
 			kvm_vcpu_write_tsc_offset(vcpu, offset);
-			vcpu->arch.tsc_catchup = 1;
+			if (!vcpu->arch.guest_tsc_protected)
+				vcpu->arch.tsc_catchup = 1;
 		}
 
 		if (kvm_lapic_hv_timer_in_use(vcpu))
-- 
2.45.2
Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
Posted by Paolo Bonzini 11 months ago
On 10/12/24 09:55, Isaku Yamahata wrote:
> Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
> changing TSC offset/multiplier when guest_tsc_protected is true.

Thanks Isaku!  To match the behavior of the SEV-SNP patches, this is
also needed, which I have added to kvm-coco-queue:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dc2f14a6d8a1..ccde7c2b2248 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3919,7 +3925,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  	case MSR_IA32_TSC:
  		if (msr_info->host_initiated) {
  			kvm_synchronize_tsc(vcpu, &data);
-		} else {
+		} else if (!vcpu->arch.guest_tsc_protected) {
  			u64 adj = kvm_compute_l1_tsc_offset(vcpu, data) - vcpu->arch.l1_tsc_offset;
  			adjust_tsc_offset_guest(vcpu, adj);
  			vcpu->arch.ia32_tsc_adjust_msr += adj;

Also, I rewrote the commit message as follows:

     Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
     changing TSC offset/multiplier when guest_tsc_protected is true.
     
     X86 confidential computing technology defines protected guest TSC so that
     the VMM can't change the TSC offset/multiplier once vCPU is initialized.
     SEV-SNP defines Secure TSC as optional, whereas TDX mandates it.
     
     KVM has common logic on x86 that tries to guess or adjust TSC
     offset/multiplier for better guest TSC and TSC interrupt latency
     at KVM vCPU creation (kvm_arch_vcpu_postcreate()), vCPU migration
     over pCPU (kvm_arch_vcpu_load()), vCPU TSC device attributes
     (kvm_arch_tsc_set_attr()) and guest/host writing to TSC or TSC adjust MSR
     (kvm_set_msr_common()).
     
     The current x86 KVM implementation conflicts with protected TSC because the
     VMM can't change the TSC offset/multiplier.
     Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
     offset/multiplier, the TSC timer interrupts is injected to the guest at the
     wrong time if the KVM TSC offset is different from what the TDX module
     determined.
     
     Originally this issue was found by cyclic test of rt-test [1] as the
     latency in TDX case is worse than VMX value + TDX SEAMCALL overhead.  It
     turned out that the KVM TSC offset is different from what the TDX module
     determines.
     
     Disable or ignore the KVM logic to change/adjust the TSC offset/multiplier
     somehow, thus keeping the KVM TSC offset/multiplier the same as the
     value of the TDX module.  Writes to MSR_IA32_TSC are also blocked as
     they amount to a change in the TSC offset.
     
     [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git

Paolo

> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> ---
>   arch/x86/include/asm/kvm_host.h | 1 +
>   arch/x86/kvm/x86.c              | 9 ++++++++-
>   2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 61b7e9fe5e57..112b8a4f1860 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1036,6 +1036,7 @@ struct kvm_vcpu_arch {
>   
>   	/* Protected Guests */
>   	bool guest_state_protected;
> +	bool guest_tsc_protected;
>   
>   	/*
>   	 * Set when PDPTS were loaded directly by the userspace without
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 65d871bb5b35..a6cf4422df28 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2587,6 +2587,9 @@ EXPORT_SYMBOL_GPL(kvm_calc_nested_tsc_multiplier);
>   
>   static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 l1_offset)
>   {
> +	if (vcpu->arch.guest_tsc_protected)
> +		return;
> +
>   	trace_kvm_write_tsc_offset(vcpu->vcpu_id,
>   				   vcpu->arch.l1_tsc_offset,
>   				   l1_offset);
> @@ -2650,6 +2653,9 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
>   
>   	lockdep_assert_held(&kvm->arch.tsc_write_lock);
>   
> +	if (vcpu->arch.guest_tsc_protected)
> +		return;
> +
>   	if (user_set_tsc)
>   		vcpu->kvm->arch.user_set_tsc = true;
>   
> @@ -5028,7 +5034,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>   			u64 offset = kvm_compute_l1_tsc_offset(vcpu,
>   						vcpu->arch.last_guest_tsc);
>   			kvm_vcpu_write_tsc_offset(vcpu, offset);
> -			vcpu->arch.tsc_catchup = 1;
> +			if (!vcpu->arch.guest_tsc_protected)
> +				vcpu->arch.tsc_catchup = 1;
>   		}
>   
>   		if (kvm_lapic_hv_timer_in_use(vcpu))
Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
Posted by Isaku Yamahata 11 months ago
On Wed, Mar 12, 2025 at 01:24:32PM +0100,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 10/12/24 09:55, Isaku Yamahata wrote:
> > Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
> > changing TSC offset/multiplier when guest_tsc_protected is true.
> 
> Thanks Isaku!  To match the behavior of the SEV-SNP patches, this is
> also needed, which I have added to kvm-coco-queue:
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index dc2f14a6d8a1..ccde7c2b2248 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3919,7 +3925,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		if (msr_info->host_initiated) {
>  			kvm_synchronize_tsc(vcpu, &data);
> -		} else {
> +		} else if (!vcpu->arch.guest_tsc_protected) {
>  			u64 adj = kvm_compute_l1_tsc_offset(vcpu, data) - vcpu->arch.l1_tsc_offset;
>  			adjust_tsc_offset_guest(vcpu, adj);
>  			vcpu->arch.ia32_tsc_adjust_msr += adj;

Thank you for catching this. This looks good.
-- 
Isaku Yamahata <isaku.yamahata@intel.com>
Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
Posted by Edgecombe, Rick P 1 year, 3 months ago
On Sat, 2024-10-12 at 00:55 -0700, Isaku Yamahata wrote:
> Problem
> The current x86 KVM implementation conflicts with protected TSC because the
> VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> logic to change/adjust the TSC offset/multiplier somehow.
> 
> Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> offset/multiplier, the TSC timer interrupts is injected to the guest at the
> wrong time if the KVM TSC offset is different from what the TDX module
> determined.
> 
> Originally this issue was found by cyclic test of rt-test [1] as the
> latency in TDX case is worse than VMX value + TDX SEAMCALL overhead.  It
> turned out that the KVM TSC offset is different from what the TDX module
> determines.
> 
> Solution
> The solution is to keep the KVM TSC offset/multiplier the same as the value
> of the TDX module somehow.  Possible solutions are as follows.
> - Skip the logic
>   Ignore (or don't call related functions) the request to change the TSC
>   offset/multiplier.
>   Pros
>   - Logically clean.  This is similar to the guest_protected case.
>   Cons
>   - Needs to identify the call sites.
> 
> - Revert the change at the hooks after TSC adjustment
>   x86 KVM defines the vendor hooks when TSC offset/multiplier are
>   changed.  The callback can revert the change.
>   Pros
>   - We don't need to care about the logic to change the TSC
>     offset/multiplier.
>   Cons:
>   - Hacky to revert the KVM x86 common code logic.
> 
> Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> supported.
> 
> [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
> 
> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>

IIUC this problem was reported by Marcelo and he tested these patches and found
that they did *not* resolve his issue? But offline you mentioned that you
reproduced a similar seeming bug on your end that *was* resolved by these
patches. If I got that right, I would think we should figure out Marcelo's
problem before fixing this upstream. If it only affects out-of-tree TDX code we
can take more time and not thrash the code as it gets untangled further.
Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change
Posted by Isaku Yamahata 1 year, 2 months ago
On Mon, Oct 14, 2024 at 03:48:03PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com> wrote:

> On Sat, 2024-10-12 at 00:55 -0700, Isaku Yamahata wrote:
> > Problem
> > The current x86 KVM implementation conflicts with protected TSC because the
> > VMM can't change the TSC offset/multiplier.  Disable or ignore the KVM
> > logic to change/adjust the TSC offset/multiplier somehow.
> > 
> > Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
> > offset/multiplier, the TSC timer interrupts is injected to the guest at the
> > wrong time if the KVM TSC offset is different from what the TDX module
> > determined.
> > 
> > Originally this issue was found by cyclic test of rt-test [1] as the
> > latency in TDX case is worse than VMX value + TDX SEAMCALL overhead.  It
> > turned out that the KVM TSC offset is different from what the TDX module
> > determines.
> > 
> > Solution
> > The solution is to keep the KVM TSC offset/multiplier the same as the value
> > of the TDX module somehow.  Possible solutions are as follows.
> > - Skip the logic
> >   Ignore (or don't call related functions) the request to change the TSC
> >   offset/multiplier.
> >   Pros
> >   - Logically clean.  This is similar to the guest_protected case.
> >   Cons
> >   - Needs to identify the call sites.
> > 
> > - Revert the change at the hooks after TSC adjustment
> >   x86 KVM defines the vendor hooks when TSC offset/multiplier are
> >   changed.  The callback can revert the change.
> >   Pros
> >   - We don't need to care about the logic to change the TSC
> >     offset/multiplier.
> >   Cons:
> >   - Hacky to revert the KVM x86 common code logic.
> > 
> > Choose the first one.  With this patch series, SEV-SNP secure TSC can be
> > supported.
> > 
> > [1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
> > 
> > Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> IIUC this problem was reported by Marcelo and he tested these patches and found
> that they did *not* resolve his issue? But offline you mentioned that you
> reproduced a similar seeming bug on your end that *was* resolved by these
> patches.

That's right. The first experimental patch didn't, but this patch does.
(At least I belive so. Marcelo, please jump in if I'm wrong.)


> If I got that right, I would think we should figure out Marcelo's
> problem before fixing this upstream. If it only affects out-of-tree TDX code we
> can take more time and not thrash the code as it gets untangled further.

Ok.  This patch affects TDX code (and potentially SEV-SNP secure TSC host code.)
-- 
Isaku Yamahata <isaku.yamahata@intel.com>