[PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests

Nikunj A Dadhania posted 13 patches 1 month ago
There is a newer version of this series
[PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
Posted by Nikunj A Dadhania 1 month ago
Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
accesses should not exit to the hypervisor for such guests.

Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
and reads of MSR_IA32_TSC should return the result of the RDTSC
instruction.

Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Peter Gonda <pgonda@google.com>
---
 arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index 965209067f03..2ad7773458c0 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 		return ES_OK;
 	}
 
+	/*
+	 * TSC related accesses should not exit to the hypervisor when a
+	 * guest is executing with SecureTSC enabled, so special handling
+	 * is required for accesses of MSR_IA32_TSC:
+	 *
+	 * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
+	 *         of the TSC to return undefined values, so ignore all
+	 *         writes.
+	 * Reads:  Reads of MSR_IA32_TSC should return the current TSC
+	 *         value, use the value returned by RDTSC.
+	 */
+	if (regs->cx == MSR_IA32_TSC && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
+		u64 tsc;
+
+		if (exit_info_1)
+			return ES_OK;
+
+		tsc = rdtsc();
+		regs->ax = UINT_MAX & tsc;
+		regs->dx = UINT_MAX & (tsc >> 32);
+
+		return ES_OK;
+	}
+
 	ghcb_set_rcx(ghcb, regs->cx);
 	if (exit_info_1) {
 		ghcb_set_rax(ghcb, regs->ax);
-- 
2.34.1
Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
Posted by Xiaoyao Li 1 month ago
On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
> accesses should not exit to the hypervisor for such guests.
> 
> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
> and reads of MSR_IA32_TSC should return the result of the RDTSC
> instruction.
> 
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> Tested-by: Peter Gonda <pgonda@google.com>
> ---
>   arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
> index 965209067f03..2ad7773458c0 100644
> --- a/arch/x86/coco/sev/core.c
> +++ b/arch/x86/coco/sev/core.c
> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>   		return ES_OK;
>   	}
>   
> +	/*
> +	 * TSC related accesses should not exit to the hypervisor when a
> +	 * guest is executing with SecureTSC enabled, so special handling
> +	 * is required for accesses of MSR_IA32_TSC:
> +	 *
> +	 * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
> +	 *         of the TSC to return undefined values, so ignore all
> +	 *         writes.
> +	 * Reads:  Reads of MSR_IA32_TSC should return the current TSC
> +	 *         value, use the value returned by RDTSC.
> +	 */

Why doesn't handle it by returning ES_VMM_ERROR when hypervisor 
intercepts RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not 
need to be intercepted.

I think the reason is that SNP guest relies on interception to do the 
ignore behavior for WRMSR in #VC handler because the writing leads to 
undefined result. Then the question is what if the hypervisor doesn't 
intercept write to MSR_IA32_TSC in the first place?

> +	if (regs->cx == MSR_IA32_TSC && cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC)) {
> +		u64 tsc;
> +
> +		if (exit_info_1)
> +			return ES_OK;
> +
> +		tsc = rdtsc();
> +		regs->ax = UINT_MAX & tsc;
> +		regs->dx = UINT_MAX & (tsc >> 32);
> +
> +		return ES_OK;
> +	}
> +
>   	ghcb_set_rcx(ghcb, regs->cx);
>   	if (exit_info_1) {
>   		ghcb_set_rax(ghcb, regs->ax);
Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
Posted by Nikunj A. Dadhania 1 month ago

On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
>> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
>> accesses should not exit to the hypervisor for such guests.
>>
>> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
>> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
>> and reads of MSR_IA32_TSC should return the result of the RDTSC
>> instruction.
>>
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Tested-by: Peter Gonda <pgonda@google.com>
>> ---
>>   arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>>   1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>> index 965209067f03..2ad7773458c0 100644
>> --- a/arch/x86/coco/sev/core.c
>> +++ b/arch/x86/coco/sev/core.c
>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>           return ES_OK;
>>       }
>>   +    /*
>> +     * TSC related accesses should not exit to the hypervisor when a
>> +     * guest is executing with SecureTSC enabled, so special handling
>> +     * is required for accesses of MSR_IA32_TSC:
>> +     *
>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>> +     *         of the TSC to return undefined values, so ignore all
>> +     *         writes.
>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>> +     *         value, use the value returned by RDTSC.
>> +     */
> 
> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor intercepts
> RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not need to be
> intercepted.

ES_VMM_ERROR will terminate the guest, which is not the expected behaviour. As
documented, writes to the MSR is ignored and reads are done using RDTSC.

> I think the reason is that SNP guest relies on interception to do the ignore
> behavior for WRMSR in #VC handler because the writing leads to undefined
> result.

For legacy and secure guests MSR_IA32_TSC is always intercepted(for both RD/WR).
Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern OSes should
use. The idea is the catch any writes to TSC MSR and handle them gracefully.

> Then the question is what if the hypervisor doesn't intercept write to
> MSR_IA32_TSC in the first place?

I have tried to disable interception of MSR_IA32_TSC, and writes are ignored by
the HW as well. I would like to continue the current documented HW as per the APM.

Regards,
Nikunj
Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
Posted by Xiaoyao Li 1 month ago
On 10/24/2024 2:24 PM, Nikunj A. Dadhania wrote:
> 
> 
> On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
>> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>>> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as
>>> the subsequent TSC value reads are undefined. MSR_IA32_TSC read/write
>>> accesses should not exit to the hypervisor for such guests.
>>>
>>> Accesses to MSR_IA32_TSC needs special handling in the #VC handler for the
>>> guests with Secure TSC enabled. Writes to MSR_IA32_TSC should be ignored,
>>> and reads of MSR_IA32_TSC should return the result of the RDTSC
>>> instruction.
>>>
>>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>>> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>>> Tested-by: Peter Gonda <pgonda@google.com>
>>> ---
>>>    arch/x86/coco/sev/core.c | 24 ++++++++++++++++++++++++
>>>    1 file changed, 24 insertions(+)
>>>
>>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>>> index 965209067f03..2ad7773458c0 100644
>>> --- a/arch/x86/coco/sev/core.c
>>> +++ b/arch/x86/coco/sev/core.c
>>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>>            return ES_OK;
>>>        }
>>>    +    /*
>>> +     * TSC related accesses should not exit to the hypervisor when a
>>> +     * guest is executing with SecureTSC enabled, so special handling
>>> +     * is required for accesses of MSR_IA32_TSC:
>>> +     *
>>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>>> +     *         of the TSC to return undefined values, so ignore all
>>> +     *         writes.
>>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>>> +     *         value, use the value returned by RDTSC.
>>> +     */
>>
>> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor intercepts
>> RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems not need to be
>> intercepted.
> 
> ES_VMM_ERROR will terminate the guest, which is not the expected behaviour. As
> documented, writes to the MSR is ignored and reads are done using RDTSC.
> 
>> I think the reason is that SNP guest relies on interception to do the ignore
>> behavior for WRMSR in #VC handler because the writing leads to undefined
>> result.
> 
> For legacy and secure guests MSR_IA32_TSC is always intercepted(for both RD/WR).

We cannot make such assumption unless it's enforced by AMD HW.

> Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern OSes should
> use. 

Again, this is your assumption and expectation only.

> The idea is the catch any writes to TSC MSR and handle them gracefully.

If SNP guest requires MSR_IA32_TSC being intercepted by hypervisor. It 
should come with a solution that guest kernel can check it certainly, 
just like the patch 5 and patch 6, that they can check the behavior of 
hypervisor.

If there is no clean way for guest to ensure MSR_IA32_TSC is intercepted 
by hypervisor, we at least need add some comment to call out that these 
code replies on the assumption that hypervisor intercepts MSR_IA32_TSC.

>> Then the question is what if the hypervisor doesn't intercept write to
>> MSR_IA32_TSC in the first place?
> 
> I have tried to disable interception of MSR_IA32_TSC, and writes are ignored by
> the HW as well. I would like to continue the current documented HW as per the APM.

I only means the writes are ignored in your testing HW. We don't know 
the result on other SNP-capable HW or future HW, unless it's documented 
in APM.

Current documented behavior is that write leads to a undefined value of 
subsequent read. So we need to avoid write. One solution is to intercept 
write and ignore it, but it depends on hypervisor to intercept it. 
Anther solution would be we fix all the place of writing MSR_IA32_TSC 
for SNP guest in linux.

> Regards,
> Nikunj

Re: [PATCH v13 04/13] x86/sev: Change TSC MSR behavior for Secure TSC enabled guests
Posted by Nikunj A. Dadhania 1 month ago

On 10/24/2024 1:01 PM, Xiaoyao Li wrote:
> On 10/24/2024 2:24 PM, Nikunj A. Dadhania wrote:
>>
>>
>> On 10/23/2024 8:55 AM, Xiaoyao Li wrote:
>>> On 10/21/2024 1:51 PM, Nikunj A Dadhania wrote:
>>>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>>>> index 965209067f03..2ad7773458c0 100644
>>>> --- a/arch/x86/coco/sev/core.c
>>>> +++ b/arch/x86/coco/sev/core.c
>>>> @@ -1308,6 +1308,30 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>>>            return ES_OK;
>>>>        }
>>>>    +    /*
>>>> +     * TSC related accesses should not exit to the hypervisor when a
>>>> +     * guest is executing with SecureTSC enabled, so special handling
>>>> +     * is required for accesses of MSR_IA32_TSC:
>>>> +     *
>>>> +     * Writes: Writing to MSR_IA32_TSC can cause subsequent reads
>>>> +     *         of the TSC to return undefined values, so ignore all
>>>> +     *         writes.
>>>> +     * Reads:  Reads of MSR_IA32_TSC should return the current TSC
>>>> +     *         value, use the value returned by RDTSC.
>>>> +     */
>>>
>>>
>>> Why doesn't handle it by returning ES_VMM_ERROR when hypervisor
>>> intercepts RD/WR of MSR_IA32_TSC? With SECURE_TSC enabled, it seems
>>> not need to be intercepted.
>> 
>> ES_VMM_ERROR will terminate the guest, which is not the expected
>> behaviour. As documented, writes to the MSR is ignored and reads are
>> done using RDTSC.
>> 
>>> I think the reason is that SNP guest relies on interception to do
>>> the ignore behavior for WRMSR in #VC handler because the writing
>>> leads to undefined result.
>> 
>> For legacy and secure guests MSR_IA32_TSC is always intercepted(for
>> both RD/WR).
>
> We cannot make such assumption unless it's enforced by AMD HW.
>
>> Moreover, this is a legacy MSR, RDTSC and RDTSCP is the what modern
>> OSes should use.
>
> Again, this is your assumption and expectation only.

Not my assumption, I could not find any reference to MSR_IA32_TSC 
read/write in the kernel code. 

>
>> The idea is the catch any writes to TSC MSR and handle them
>> gracefully.
>
> If SNP guest requires MSR_IA32_TSC being intercepted by hypervisor. It
> should come with a solution that guest kernel can check it certainly,
> just like the patch 5 and patch 6, that they can check the behavior of
> hypervisor.
>
> If there is no clean way for guest to ensure MSR_IA32_TSC is
> intercepted by hypervisor, 

Yes, that is my understanding as well.

> we at least need add some comment to call
> out that these code replies on the assumption that hypervisor
> intercepts MSR_IA32_TSC.

Sure, I will add a comment.

>>> Then the question is what if the hypervisor doesn't intercept write
>>> to MSR_IA32_TSC in the first place?
>> 
>> I have tried to disable interception of MSR_IA32_TSC, and writes are
>> ignored by the HW as well. I would like to continue the current
>> documented HW as per the APM.
>
> I only means the writes are ignored in your testing HW. We don't know
> the result on other SNP-capable HW or future HW, unless it's
> documented in APM.
>
> Current documented behavior is that write leads to a undefined value
> of subsequent read. So we need to avoid write. One solution is to
> intercept write and ignore it, but it depends on hypervisor to
> intercept it. 
> Anther solution would be we fix all the place of writing
> MSR_IA32_TSC for SNP guest in linux.

There is no MSR_IA32_TSC write in the kernel code, IMHO, so second
solution is taken care.

Regards,
Nikunj