[PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic

Maxim Levitsky posted 1 patch 3 years, 12 months ago
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Maxim Levitsky 3 years, 12 months ago
Fixes: 1c2361f667f36 ("KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
under same spte.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8ee8c91fa7625..79cabd3d97d22 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7329,7 +7329,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
 		goto emul_write;
 
 	hva = kvm_vcpu_gfn_to_hva(vcpu, gpa_to_gfn(gpa));
-	if (kvm_is_error_hva(addr))
+	if (kvm_is_error_hva(hva))
 		goto emul_write;
 
 	hva += offset_in_page(gpa);
-- 
2.26.3
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Paolo Bonzini 3 years, 12 months ago
On 5/12/22 12:14, Maxim Levitsky wrote:
> Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> under same spte.

Awesome!  And queued, thanks.

Paolo
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Sean Christopherson 3 years, 12 months ago
On Thu, May 12, 2022, Paolo Bonzini wrote:
> On 5/12/22 12:14, Maxim Levitsky wrote:
> > Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> > Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> > under same spte.
> 
> Awesome!  And queued, thanks.

If you haven't done so already, can you add 

  Cc: stable@vger.kernel.org

Also, given that we have concrete proof that not honoring atomic accesses can have
dire consequences for the guest, what about adding a capability to turn the emul_write
path into an emulation error?
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Maxim Levitsky 3 years, 11 months ago
On Thu, 2022-05-12 at 21:27 +0000, Sean Christopherson wrote:
> On Thu, May 12, 2022, Paolo Bonzini wrote:
> > On 5/12/22 12:14, Maxim Levitsky wrote:
> > > Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> > > Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> > > under same spte.
> > 
> > Awesome!  And queued, thanks.
> 
> If you haven't done so already, can you add 
> 
>   Cc: stable@vger.kernel.org

When I posted my patch, I checked that the patch didn't reach mainline yet,
so I assumed that it won't be in -stable either yet, although it was CCed there.


> 
> Also, given that we have concrete proof that not honoring atomic accesses can have
> dire consequences for the guest, what about adding a capability to turn the emul_write
> path into an emulation error?
> 


This is a good idea. It might though break some guests - I did see that
warning few times, that is why I wasn't alert by the fact that it started showing up more often.

I'll take a look at this soon.

Best regards,
	Maxim Levitsky
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Sean Christopherson 3 years, 11 months ago
On Mon, May 16, 2022, Maxim Levitsky wrote:
> On Thu, 2022-05-12 at 21:27 +0000, Sean Christopherson wrote:
> > On Thu, May 12, 2022, Paolo Bonzini wrote:
> > > On 5/12/22 12:14, Maxim Levitsky wrote:
> > > > Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> > > > Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> > > > under same spte.
> > > 
> > > Awesome!  And queued, thanks.
> > 
> > If you haven't done so already, can you add 
> > 
> >   Cc: stable@vger.kernel.org
> 
> When I posted my patch, I checked that the patch didn't reach mainline yet,
> so I assumed that it won't be in -stable either yet, although it was CCed there.

Yeah, it should hit stable trees because of the explicit stable@.  The Fixes: on
this patch is likely enough, but no harm in being paranoid.

> > Also, given that we have concrete proof that not honoring atomic accesses can have
> > dire consequences for the guest, what about adding a capability to turn the emul_write
> > path into an emulation error?
> > 
> 
> 
> This is a good idea. It might though break some guests - I did see that
> warning few times, that is why I wasn't alert by the fact that it started
> showing up more often.

It mostly shows up in KUT, one of the tests deliberately triggers the scenario.
But yeah, there's definitely potential for breakage.  Not sure if a capability or
debug oriented module param would be best.  In theory, userspace could do a better
job of emulating the atomic access than KVM, which makes me lean toward a capability,
but practically speaking I doubt a userspace will ever do anything besides
terminate the guest.
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Vitaly Kuznetsov 3 years, 12 months ago
Maxim Levitsky <mlevitsk@redhat.com> writes:

> Fixes: 1c2361f667f36 ("KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses")
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> under same spte.

In case the fix is not squashed with 1c2361f667f36, the above should
really go before '---'.

>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8ee8c91fa7625..79cabd3d97d22 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7329,7 +7329,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
>  		goto emul_write;
>  
>  	hva = kvm_vcpu_gfn_to_hva(vcpu, gpa_to_gfn(gpa));
> -	if (kvm_is_error_hva(addr))
> +	if (kvm_is_error_hva(hva))

Looks like a typo indeed, so

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

>  		goto emul_write;
>  
>  	hva += offset_in_page(gpa);

-- 
Vitaly
Re: [PATCH] KVM: x86: fix a typo in __try_cmpxchg_user that caused cmpxchg to be not atomic
Posted by Sean Christopherson 3 years, 12 months ago
On Thu, May 12, 2022, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > Fixes: 1c2361f667f36 ("KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses")
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Yes, this is the root cause of the TDP mmu leak I was doing debug of in the last week.
> > Non working cmpxchg on which TDP mmu relies makes it install two differnt shadow pages
> > under same spte.

Ewww, as in running a buggy L0 resulted in a CMPXCHG going sideways in L1?  That's
awful.  My apologies :-(

> In case the fix is not squashed with 1c2361f667f36, the above should
> really go before '---'.
> 
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 8ee8c91fa7625..79cabd3d97d22 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7329,7 +7329,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
> >  		goto emul_write;
> >  
> >  	hva = kvm_vcpu_gfn_to_hva(vcpu, gpa_to_gfn(gpa));
> > -	if (kvm_is_error_hva(addr))
> > +	if (kvm_is_error_hva(hva))
> 
> Looks like a typo indeed, so
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Yep, if this doesn't get squashed

Reviewed-by: Sean Christopherson <seanjc@google.com>