[PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block

Paolo Bonzini posted 7 patches 3 years, 7 months ago
There is a newer version of this series
[PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block
Posted by Paolo Bonzini 3 years, 7 months ago
The following backtrace:

[ 1355.807187]  kvm_vcpu_map+0x159/0x190 [kvm]
[ 1355.807628]  nested_svm_vmexit+0x4c/0x7f0 [kvm_amd]
[ 1355.808036]  ? kvm_vcpu_block+0x54/0xa0 [kvm]
[ 1355.808450]  svm_check_nested_events+0x97/0x390 [kvm_amd]
[ 1355.808920]  kvm_check_nested_events+0x1c/0x40 [kvm] 
[ 1355.809396]  kvm_arch_vcpu_runnable+0x4e/0x190 [kvm]
[ 1355.809892]  kvm_vcpu_check_block+0x4f/0x100 [kvm]
[ 1355.811259]  kvm_vcpu_block+0x6b/0xa0 [kvm] 

can occur due to kmap being called in non-sleepable (!TASK_RUNNING) context.
The fix is to extend kvm_x86_ops->nested_ops.hv_timer_pending() to cover
all events not already checked in kvm_arch_vcpu_runnable(), and then
get rid of the annoying (and wrong) call to kvm_check_nested_events()
from kvm_vcpu_check_block().
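
A rough user-space model of that split, not the kernel code: the toy_vcpu
struct and every toy_* name below are hypothetical.  The check used on the
blocking path takes a const pointer and only reads pending-event state; the
nested VM-Exit, which may touch guest memory, is left to a context that is
allowed to sleep.

```c
#include <stdbool.h>

/* Hypothetical stand-in for vCPU state; not the real struct kvm_vcpu. */
struct toy_vcpu {
	bool nested_event_pending;	/* e.g. a pending nested VM-Exit */
	bool in_guest_mode;		/* currently running L2 */
	int memory_writes;		/* counts simulated guest-memory writes */
};

/* Read-only check: never modifies state, so it cannot write to memory. */
static bool toy_vcpu_has_events(const struct toy_vcpu *vcpu)
{
	return vcpu->nested_event_pending;
}

/* Models the nested VM-Exit: writes guest memory and consumes the event. */
static void toy_check_nested_events(struct toy_vcpu *vcpu)
{
	if (vcpu->in_guest_mode && vcpu->nested_event_pending) {
		vcpu->memory_writes++;
		vcpu->nested_event_pending = false;
		vcpu->in_guest_mode = false;
	}
}

/* Blocking-path check: only asks whether there is a reason to wake up. */
static bool toy_vcpu_check_block(const struct toy_vcpu *vcpu)
{
	return toy_vcpu_has_events(vcpu);
}

/* Sleepable context: the pending event is actually processed here. */
static void toy_vcpu_block(struct toy_vcpu *vcpu)
{
	if (toy_vcpu_check_block(vcpu))
		toy_check_nested_events(vcpu);
}
```

In this model, calling toy_vcpu_check_block() any number of times leaves
memory_writes at zero; only toy_vcpu_block() performs the simulated exit.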

Beware, this is not a complete fix, because kvm_guest_apic_has_interrupt()
might still _read_ memory from non-sleepable context.  The fix here is
probably to make kvm_arch_vcpu_runnable() return -EAGAIN, and in that
case do a round of kvm_vcpu_check_block() polling in sleepable context.
Nevertheless, it is a good start as it pushes the vmexit into vcpu_block().

The series also does a small cleanup pass on kvm_vcpu_check_block(),
removing KVM_REQ_UNHALT in favor of simply calling kvm_arch_vcpu_runnable()
again.  Now that kvm_check_nested_events() is not called anymore by
kvm_arch_vcpu_runnable(), it is much easier to see that KVM will never
consume the event that caused kvm_vcpu_has_events() to return true,
and therefore it is safe to evaluate it again.
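
To illustrate why re-evaluating is safe, here is a minimal sketch with
hypothetical toy_* names (none of them are kernel APIs).  It assumes the
runnable predicate is a pure read, so the caller can simply ask again after
blocking instead of relying on a separate "unhalt" flag set as a side effect.

```c
#include <stdbool.h>

/* Hypothetical model of the state involved; not struct kvm_vcpu. */
struct toy_vcpu {
	bool irq_pending;	/* the event that makes the vCPU runnable */
	bool halted;		/* models a halted vCPU state */
};

/* Pure predicate: evaluating it never consumes the pending event. */
static bool toy_vcpu_runnable(const struct toy_vcpu *vcpu)
{
	return vcpu->irq_pending;
}

/* Stand-in for the blocking primitive; it reports nothing to the caller. */
static void toy_vcpu_halt(struct toy_vcpu *vcpu)
{
	(void)vcpu;	/* a real implementation would sleep until woken */
}

/* The caller re-evaluates the predicate rather than testing a flag. */
static void toy_vcpu_block(struct toy_vcpu *vcpu)
{
	toy_vcpu_halt(vcpu);
	if (toy_vcpu_runnable(vcpu))
		vcpu->halted = false;
}
```

Because toy_vcpu_runnable() has no side effects, the second evaluation gives
the same answer as the one that ended the wait, and the event is still there
to be delivered afterwards.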

The alternative of propagating the return value of
kvm_arch_vcpu_runnable() up to kvm_vcpu_{block,halt}() is inferior
because it does not quite get right the edge cases where the vCPU becomes
runnable right before schedule() or right after kvm_vcpu_check_block().
While these edge cases are unlikely to truly matter in practice, it is
also pointless to get them "wrong".

Paolo

v2->v3: do not propagate the return value of
	kvm_arch_vcpu_runnable() up to kvm_vcpu_{block,halt}()

	move and reformat the comment in vcpu_block()

	move KVM_REQ_UNHALT removal last

Paolo Bonzini (6):
  KVM: x86: check validity of argument to KVM_SET_MP_STATE
  KVM: x86: make vendor code check for all nested events
  KVM: x86: lapic does not have to process INIT if it is blocked
  KVM: x86: never write to memory from kvm_vcpu_check_block
  KVM: mips, x86: do not rely on KVM_REQ_UNHALT
  KVM: remove KVM_REQ_UNHALT

Sean Christopherson (1):
  KVM: nVMX: Make an event request when pending an MTF nested VM-Exit

 Documentation/virt/kvm/vcpu-requests.rst | 28 +------------
 arch/arm64/kvm/arm.c                     |  1 -
 arch/mips/kvm/emulate.c                  |  6 +--
 arch/powerpc/kvm/book3s_pr.c             |  1 -
 arch/powerpc/kvm/book3s_pr_papr.c        |  1 -
 arch/powerpc/kvm/booke.c                 |  1 -
 arch/powerpc/kvm/powerpc.c               |  1 -
 arch/riscv/kvm/vcpu_insn.c               |  1 -
 arch/s390/kvm/kvm-s390.c                 |  2 -
 arch/x86/include/asm/kvm_host.h          |  3 +-
 arch/x86/kvm/i8259.c                     |  4 +-
 arch/x86/kvm/lapic.h                     |  2 +-
 arch/x86/kvm/vmx/nested.c                |  9 +++-
 arch/x86/kvm/vmx/vmx.c                   |  6 ++-
 arch/x86/kvm/x86.c                       | 53 ++++++++++++++++++------
 arch/x86/kvm/x86.h                       |  5 ---
 arch/x86/kvm/xen.c                       |  1 -
 include/linux/kvm_host.h                 |  3 +-
 virt/kvm/kvm_main.c                      |  4 +-
 19 files changed, 63 insertions(+), 69 deletions(-)

-- 
2.31.1
Re: [PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block
Posted by Sean Christopherson 3 years, 7 months ago
On Mon, Aug 22, 2022, Paolo Bonzini wrote:
> The following backtrace:
> Paolo Bonzini (6):
>   KVM: x86: check validity of argument to KVM_SET_MP_STATE

Skipping this one since it's already in 6.0 and AFAICT isn't strictly necessary
for the rest of the series (shouldn't matter anyways?).

>   KVM: x86: make vendor code check for all nested events
>   KVM: x86: lapic does not have to process INIT if it is blocked
>   KVM: x86: never write to memory from kvm_vcpu_check_block
>   KVM: mips, x86: do not rely on KVM_REQ_UNHALT
>   KVM: remove KVM_REQ_UNHALT
> 
> Sean Christopherson (1):
>   KVM: nVMX: Make an event request when pending an MTF nested VM-Exit

Pushed to branch `for_paolo/6.1` at:

    https://github.com/sean-jc/linux.git

with a cosmetic cleanup to kvm_apic_has_events() and the MTF migration fix squashed
in.
Re: [PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block
Posted by Sean Christopherson 3 years, 6 months ago
On Thu, Sep 08, 2022, Sean Christopherson wrote:
> On Mon, Aug 22, 2022, Paolo Bonzini wrote:
> > The following backtrace:
> > Paolo Bonzini (6):
> >   KVM: x86: check validity of argument to KVM_SET_MP_STATE
> 
> Skipping this one since it's already in 6.0 and AFAICT isn't strictly necessary
> for the rest of the series (shouldn't matter anyways?).
> 
> >   KVM: x86: make vendor code check for all nested events
> >   KVM: x86: lapic does not have to process INIT if it is blocked
> >   KVM: x86: never write to memory from kvm_vcpu_check_block
> >   KVM: mips, x86: do not rely on KVM_REQ_UNHALT
> >   KVM: remove KVM_REQ_UNHALT
> > 
> > Sean Christopherson (1):
> >   KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
> 
> Pushed to branch `for_paolo/6.1` at:
> 
>     https://github.com/sean-jc/linux.git
> 
> with a cosmetic cleanup to kvm_apic_has_events() and the MTF migration fix squashed
> in.

Oh the irony about complaining that people waste maintainers' time by not running
existing tests :-)  I suppose it's not technically ironic since I was the one doing
the actual complaining, but it's still hilarious.

The eponymous patch breaks handling of INITs (and SIPIs) that are "latched"[1]
and later become unblocked, e.g. due to entering VMX non-root mode or because SVM's
GIF is set.  vmx_init_signal_test fails because KVM fails to re-evaluate pending
events after entering guest/non-root.  It passes now because KVM always checks
nested events in the outer run loop.

I have fixes, I'll (temporarily) drop this from the queue and post a new version of
this series on Monday.  As a reward to myself for bisecting and debugging, I'm going
to tweak "KVM: x86: lapic does not have to process INIT if it is blocked" to incorporate
my suggestions[2] from v2 so that the VMX and SVM code can check only for pending
INIT/SIPI and not include the blocking check to align with related checks that also
trigger KVM_REQ_EVENT (and because the resulting SVM GIF code would be quite fragile
if the blocking were incorporated).
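
A toy illustration of the failure mode and of the general shape described
above, under loud assumptions: all toy_* names are made up, event_request
merely stands in for an event request, and this is a guess at the pattern,
not the actual fix.  The unblocking transition checks only whether an INIT is
pending and raises a request; the run loop then does the blocking check.

```c
#include <stdbool.h>

/* Hypothetical model; not the real KVM data structures. */
struct toy_vcpu {
	bool init_pending;	/* INIT latched while it was blocked */
	bool init_blocked;	/* e.g. VMX root mode, or SVM GIF clear */
	bool event_request;	/* stand-in for an event request */
	int inits_handled;
};

/* Transition that unblocks INIT, e.g. entering non-root or setting GIF. */
static void toy_unblock_init(struct toy_vcpu *vcpu)
{
	vcpu->init_blocked = false;
	/* Check only for a pending INIT here, not for the blocking state. */
	if (vcpu->init_pending)
		vcpu->event_request = true;
}

/* One pass of the run loop: events are processed only when requested. */
static void toy_run_loop_iteration(struct toy_vcpu *vcpu)
{
	if (!vcpu->event_request)
		return;
	vcpu->event_request = false;
	if (vcpu->init_pending && !vcpu->init_blocked) {
		vcpu->init_pending = false;
		vcpu->inits_handled++;
	}
}
```

Without the request raised in toy_unblock_init(), the latched INIT in this
model would sit unprocessed until something else asked for events to be
re-evaluated.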

[1] It annoys me to no end that KVM uses different terminology for INIT/SIPI versus
    everything else.
[2] https://lore.kernel.org/all/YvwxJzHC5xYnc7CJ@google.com
Re: [PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block
Posted by Sean Christopherson 3 years, 6 months ago
On Sat, Sep 17, 2022, Sean Christopherson wrote:
> The eponymous patch breaks handling of INITs (and SIPIs) that are "latched"[1]
> and later become unblocked, e.g. due to entering VMX non-root mode or because SVM's
> GIF is set.  vmx_init_signal_test fails because KVM fails to re-evaluate pending
> events after entering guest/non-root.  It passes now because KVM always checks
> nested events in the outer run loop.
> 
> I have fixes, I'll (temporarily) drop this from the queue and post a new version of
> this series on Monday.

And by "Monday" I meant "Tuesday", the weird pending_events snapshot thing sent me
down a bit of a rabbit hole.