[v1] KVM: x86/mmu: Page fault and MMIO cleanups

[PATCH 06/16] KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zero

Posted by Sean Christopherson 9 months ago

WARN if bits 63:32 are non-zero when handling an intercepted legacy #PF,
as the error code for #PF is limited to 32 bits (and in practice, 16 bits
on Intel CPUS).  This behavior is architectural, is part of KVM's ABI
(see kvm_vcpu_events.error_code), and is explicitly documented as being
preserved for intecerpted #PF in both the APM:

  The error code saved in EXITINFO1 is the same as would be pushed onto
  the stack by a non-intercepted #PF exception in protected mode.

and even more explicitly in the SDM as VMCS.VM_EXIT_INTR_ERROR_CODE is a
32-bit field.

Simply drop the upper bits of hardware provides garbage, as spurious
information should do no harm (though in all likelihood hardware is buggy
and the kernel is doomed).

Handling all upper 32 bits in the #PF path will allow moving the sanity
check on synthetic checks from kvm_mmu_page_fault() to npf_interception(),
which in turn will allow deriving PFERR_PRIVATE_ACCESS from AMD's
PFERR_GUEST_ENC_MASK without running afoul of the sanity check.

Note, this also why Intel uses bit 15 for SGX (highest bit on Intel CPUs)
and AMD uses bit 31 for RMP (highest bit on AMD CPUs); using the highest
bit minimizes the probability of a collision with the "other" vendor,
without needing to plumb more bits through microcode.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7807bdcd87e8..5d892bd59c97 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4553,6 +4553,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 	if (WARN_ON_ONCE(fault_address >> 32))
 		return -EFAULT;
 #endif
+	/*
+	 * Legacy #PF exception only have a 32-bit error code.  Simply drop the
+	 * upper bits as KVM doesn't use them for #PF (because they are never
+	 * set), and to ensure there are no collisions with KVM-defined bits.
+	 */
+	if (WARN_ON_ONCE(error_code >> 32))
+		error_code = lower_32_bits(error_code);
 
 	vcpu->arch.l1tf_flush_l1d = true;
 	if (!flags) {
-- 
2.44.0.278.ge034bb2e1d-goog

Re: [PATCH 06/16] KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zero

Posted by Huang, Kai 8 months, 4 weeks ago


On 28/02/2024 3:41 pm, Sean Christopherson wrote:
> WARN if bits 63:32 are non-zero when handling an intercepted legacy #PF,

I found "legacy #PF" is a little bit confusing but I couldn't figure out 
a better name either :-)

> as the error code for #PF is limited to 32 bits (and in practice, 16 bits
> on Intel CPUS).  This behavior is architectural, is part of KVM's ABI
> (see kvm_vcpu_events.error_code), and is explicitly documented as being
> preserved for intecerpted #PF in both the APM:
> 
>    The error code saved in EXITINFO1 is the same as would be pushed onto
>    the stack by a non-intercepted #PF exception in protected mode.
> 
> and even more explicitly in the SDM as VMCS.VM_EXIT_INTR_ERROR_CODE is a
> 32-bit field.
> 
> Simply drop the upper bits of hardware provides garbage, as spurious

"of" -> "if" ?

> information should do no harm (though in all likelihood hardware is buggy
> and the kernel is doomed).
> 
> Handling all upper 32 bits in the #PF path will allow moving the sanity
> check on synthetic checks from kvm_mmu_page_fault() to npf_interception(),
> which in turn will allow deriving PFERR_PRIVATE_ACCESS from AMD's
> PFERR_GUEST_ENC_MASK without running afoul of the sanity check.
> 
> Note, this also why Intel uses bit 15 for SGX (highest bit on Intel CPUs)

"this" -> "this is" ?

> and AMD uses bit 31 for RMP (highest bit on AMD CPUs); using the highest
> bit minimizes the probability of a collision with the "other" vendor,
> without needing to plumb more bits through microcode.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/mmu/mmu.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7807bdcd87e8..5d892bd59c97 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4553,6 +4553,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
>   	if (WARN_ON_ONCE(fault_address >> 32))
>   		return -EFAULT;
>   #endif
> +	/*
> +	 * Legacy #PF exception only have a 32-bit error code.  Simply drop the

"have" -> "has" ?

> +	 * upper bits as KVM doesn't use them for #PF (because they are never
> +	 * set), and to ensure there are no collisions with KVM-defined bits.
> +	 */
> +	if (WARN_ON_ONCE(error_code >> 32))
> +		error_code = lower_32_bits(error_code);
>   
>   	vcpu->arch.l1tf_flush_l1d = true;
>   	if (!flags) {
Reviewed-by: Kai Huang <kai.huang@intel.com>

Re: [PATCH 06/16] KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zero

Posted by Sean Christopherson 8 months, 4 weeks ago

On Fri, Mar 01, 2024, Kai Huang wrote:
> 
> 
> On 28/02/2024 3:41 pm, Sean Christopherson wrote:
> > WARN if bits 63:32 are non-zero when handling an intercepted legacy #PF,
> 
> I found "legacy #PF" is a little bit confusing but I couldn't figure out a
> better name either :-)
> 
> > as the error code for #PF is limited to 32 bits (and in practice, 16 bits
> > on Intel CPUS).  This behavior is architectural, is part of KVM's ABI
> > (see kvm_vcpu_events.error_code), and is explicitly documented as being
> > preserved for intecerpted #PF in both the APM:
> > 
> >    The error code saved in EXITINFO1 is the same as would be pushed onto
> >    the stack by a non-intercepted #PF exception in protected mode.
> > 
> > and even more explicitly in the SDM as VMCS.VM_EXIT_INTR_ERROR_CODE is a
> > 32-bit field.
> > 
> > Simply drop the upper bits of hardware provides garbage, as spurious
> 
> "of" -> "if" ?
> 
> > information should do no harm (though in all likelihood hardware is buggy
> > and the kernel is doomed).
> > 
> > Handling all upper 32 bits in the #PF path will allow moving the sanity
> > check on synthetic checks from kvm_mmu_page_fault() to npf_interception(),
> > which in turn will allow deriving PFERR_PRIVATE_ACCESS from AMD's
> > PFERR_GUEST_ENC_MASK without running afoul of the sanity check.
> > 
> > Note, this also why Intel uses bit 15 for SGX (highest bit on Intel CPUs)
> 
> "this" -> "this is" ?
> 
> > and AMD uses bit 31 for RMP (highest bit on AMD CPUs); using the highest
> > bit minimizes the probability of a collision with the "other" vendor,
> > without needing to plumb more bits through microcode.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >   arch/x86/kvm/mmu/mmu.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7807bdcd87e8..5d892bd59c97 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4553,6 +4553,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
> >   	if (WARN_ON_ONCE(fault_address >> 32))
> >   		return -EFAULT;
> >   #endif
> > +	/*
> > +	 * Legacy #PF exception only have a 32-bit error code.  Simply drop the
> 
> "have" -> "has" ?

This one I'll fix by making "exception" plural.

Thanks much for the reviews!

> 
> > +	 * upper bits as KVM doesn't use them for #PF (because they are never
> > +	 * set), and to ensure there are no collisions with KVM-defined bits.
> > +	 */
> > +	if (WARN_ON_ONCE(error_code >> 32))
> > +		error_code = lower_32_bits(error_code);
> >   	vcpu->arch.l1tf_flush_l1d = true;
> >   	if (!flags) {
> Reviewed-by: Kai Huang <kai.huang@intel.com>

Re: [PATCH 06/16] KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zero

Posted by Binbin Wu 8 months, 2 weeks ago


On 3/1/2024 7:07 AM, Sean Christopherson wrote:
> On Fri, Mar 01, 2024, Kai Huang wrote:
>>
>> On 28/02/2024 3:41 pm, Sean Christopherson wrote:
>>> WARN if bits 63:32 are non-zero when handling an intercepted legacy #PF,
>> I found "legacy #PF" is a little bit confusing but I couldn't figure out a
>> better name either :-)

Me too.

>>
>>> as the error code for #PF is limited to 32 bits (and in practice, 16 bits
>>> on Intel CPUS).  This behavior is architectural, is part of KVM's ABI
>>> (see kvm_vcpu_events.error_code), and is explicitly documented as being
>>> preserved for intecerpted #PF in both the APM:

"intecerpted" -> "intercepted"

>>>
>>>     The error code saved in EXITINFO1 is the same as would be pushed onto
>>>     the stack by a non-intercepted #PF exception in protected mode.
>>>
>>> and even more explicitly in the SDM as VMCS.VM_EXIT_INTR_ERROR_CODE is a
>>> 32-bit field.
>>>
>>> Simply drop the upper bits of hardware provides garbage, as spurious
>> "of" -> "if" ?
>>
>>> information should do no harm (though in all likelihood hardware is buggy
>>> and the kernel is doomed).
>>>
>>> Handling all upper 32 bits in the #PF path will allow moving the sanity
>>> check on synthetic checks from kvm_mmu_page_fault() to npf_interception(),
>>> which in turn will allow deriving PFERR_PRIVATE_ACCESS from AMD's
>>> PFERR_GUEST_ENC_MASK without running afoul of the sanity check.
>>>
>>> Note, this also why Intel uses bit 15 for SGX (highest bit on Intel CPUs)
>> "this" -> "this is" ?
>>
>>> and AMD uses bit 31 for RMP (highest bit on AMD CPUs); using the highest
>>> bit minimizes the probability of a collision with the "other" vendor,
>>> without needing to plumb more bits through microcode.
>>>
>>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>>> ---
>>>    arch/x86/kvm/mmu/mmu.c | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>>> index 7807bdcd87e8..5d892bd59c97 100644
>>> --- a/arch/x86/kvm/mmu/mmu.c
>>> +++ b/arch/x86/kvm/mmu/mmu.c
>>> @@ -4553,6 +4553,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
>>>    	if (WARN_ON_ONCE(fault_address >> 32))
>>>    		return -EFAULT;
>>>    #endif
>>> +	/*
>>> +	 * Legacy #PF exception only have a 32-bit error code.  Simply drop the
>> "have" -> "has" ?
> This one I'll fix by making "exception" plural.
>
> Thanks much for the reviews!
>
>>> +	 * upper bits as KVM doesn't use them for #PF (because they are never
>>> +	 * set), and to ensure there are no collisions with KVM-defined bits.
>>> +	 */
>>> +	if (WARN_ON_ONCE(error_code >> 32))
>>> +		error_code = lower_32_bits(error_code);
>>>    	vcpu->arch.l1tf_flush_l1d = true;
>>>    	if (!flags) {
>> Reviewed-by: Kai Huang <kai.huang@intel.com>