x86/virt: Silence RCU lockdep splat in emergency virt callback path

[PATCH v2] x86/virt: Silence RCU lockdep splat in emergency virt callback path

Posted by Mikhail Gavrilov 1 month, 1 week ago

x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference()
through machine_crash_shutdown() with IRQs disabled but with RCU not
necessarily watching the crashing CPU, which triggers a suspicious
RCU usage splat on debug kernels (CONFIG_PROVE_RCU=y) during
panic/kdump:

  WARNING: suspicious RCU usage
  arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!

  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by tee/11119:
   #0: ffff8881fa32c440 (sb_writers#3){.+.+}-{0:0}, at: ksys_write

  Call Trace:
   <TASK>
   dump_stack_lvl+0x84/0xd0
   lockdep_rcu_suspicious.cold+0x37/0x8f
   x86_virt_invoke_kvm_emergency_callback+0x5f/0x70
   x86_svm_emergency_disable_virtualization_cpu+0x2a/0x30
   x86_virt_emergency_disable_virtualization_cpu+0x6b/0x90
   native_machine_crash_shutdown+0x72/0x170
   __crash_kexec+0x137/0x280
   panic+0xce/0xd0
   sysrq_handle_crash+0x1f/0x20
   __handle_sysrq.cold+0x192/0x335
   write_sysrq_trigger+0x8c/0xc0
   proc_reg_write+0x1c3/0x3c0
   vfs_write+0x1d0/0xf80
   ksys_write+0x116/0x250
   do_syscall_64+0x11c/0x1480
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   </TASK>

A truly correct fix is non-trivial: the RCU usage genuinely is wrong in
panic context (RCU may ignore the crashing CPU during synchronization),
and a concurrent KVM module unload could in principle race with the
callback read; see commit 2baa33a8ddd6 ("KVM: x86: Leave user-return
notifier registered on reboot/shutdown") which notes that nothing
prevents module unload during panic/reboot.

However, the alternatives are worse:

  - smp_store_release()/smp_load_acquire() handles ordering but not
    liveness; the kernel still needs to keep the module text alive
    while the callback is in flight.
  - Taking a lock in the panic path is risky — any lock could be held
    by a CPU that has already been NMI'd to a halt.

Use rcu_dereference_raw() to silence the splat and accept the
vanishingly small remaining race. Panic context inherently cannot
guarantee complete correctness; the goal here is to keep debug builds
quiet on the kdump path so the splat doesn't obscure the actual
kernel state being captured.

Reproducible on a debug kernel (CONFIG_PROVE_LOCKING=y, CONFIG_PROVE_RCU=y)
with kvm_amd or kvm_intel loaded by triggering kdump:

  echo c > /proc/sysrq-trigger

Suggested-by: Sean Christopherson <seanjc@google.com>
Fixes: 428afac5a8ea ("KVM: x86: Move bulk of emergency virtualizaton logic to virt subsystem")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 arch/x86/virt/hw.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/hw.c b/arch/x86/virt/hw.c
index f647557d38ac..7e9091c640be 100644
--- a/arch/x86/virt/hw.c
+++ b/arch/x86/virt/hw.c
@@ -49,7 +49,20 @@ static void x86_virt_invoke_kvm_emergency_callback(void)
 {
 	cpu_emergency_virt_cb *kvm_callback;
 
-	kvm_callback = rcu_dereference(kvm_emergency_callback);
+	/*
+	 * RCU may not be watching the crashing CPU here, so rcu_dereference()
+	 * triggers a suspicious-RCU-usage splat. In principle, a concurrent
+	 * KVM module unload could race with this read; see commit 2baa33a8ddd6
+	 * ("KVM: x86: Leave user-return notifier registered on reboot/shutdown")
+	 * which notes that nothing prevents module unload during panic/reboot.
+	 *
+	 * However, taking a lock here would be riskier than the current race:
+	 * the system is going down via NMI shootdown, and any lock could be
+	 * held by an already-stopped CPU. Use rcu_dereference_raw() to silence
+	 * the lockdep splat and accept the comically small remaining race;
+	 * panic context inherently cannot guarantee complete correctness.
+	 */
+	kvm_callback = rcu_dereference_raw(kvm_emergency_callback);
 	if (kvm_callback)
 		kvm_callback();
 }
-- 
2.54.0
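
For readers following the changelog's point that release/acquire "handles ordering but not liveness", here is a minimal userspace sketch using C11 atomics. It is an analogue, not the kernel code: emergency_cb, register_cb(), unregister_cb(), invoke_cb() and demo_callback() are hypothetical names made up for illustration, and the sketch assumes a single published function pointer like the one read in the hunk above.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void (emergency_cb_fn)(void);

/* Hypothetical stand-in for the published callback; NULL means "none". */
static _Atomic(emergency_cb_fn *) emergency_cb;

/* Counts invocations so the effect of invoke_cb() is observable. */
static int calls;

static void demo_callback(void)
{
	calls++;
}

/* Publish the callback; release ordering makes prior setup visible. */
static void register_cb(emergency_cb_fn *cb)
{
	atomic_store_explicit(&emergency_cb, cb, memory_order_release);
}

/*
 * Retract the callback. Nothing here waits for a reader that already
 * loaded the old pointer, which is exactly the liveness gap: ordering
 * alone cannot keep the callee's code alive while a call is in flight.
 */
static void unregister_cb(void)
{
	atomic_store_explicit(&emergency_cb, (emergency_cb_fn *)0,
			      memory_order_release);
}

/* Load once, test, call. Returns 1 if a callback ran, 0 otherwise. */
static int invoke_cb(void)
{
	emergency_cb_fn *cb;

	cb = atomic_load_explicit(&emergency_cb, memory_order_acquire);
	if (!cb)
		return 0;

	/* Window: an unregister can complete between the load and the call. */
	cb();
	return 1;
}
```

Calling register_cb(demo_callback) and then invoke_cb() runs the callback once; after unregister_cb(), invoke_cb() returns 0. The window marked in invoke_cb() is the same kind of race the patch chooses to accept in panic context, since closing it would need either a grace-period wait or a lock, and the changelog explains why neither is attractive there.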

Re: [PATCH v2] x86/virt: Silence RCU lockdep splat in emergency virt callback path

Posted by Sean Christopherson 3 weeks, 4 days ago

On Tue, 05 May 2026 04:54:35 +0500, Mikhail Gavrilov wrote:
> x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference()
> through machine_crash_shutdown() with IRQs disabled but with RCU not
> necessarily watching the crashing CPU, which triggers a suspicious
> RCU usage splat on debug kernels (CONFIG_PROVE_RCU=y) during
> panic/kdump:
> 
>   WARNING: suspicious RCU usage
>   arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!
> 
> [...]

Applied to kvm-x86 fixes, thanks!

[1/1] x86/virt: Silence RCU lockdep splat in emergency virt callback path
      https://github.com/kvm-x86/linux/commit/fff82ea9d900

--
https://github.com/kvm-x86/linux/tree/next

Re: [PATCH v2] x86/virt: Silence RCU lockdep splat in emergency virt callback path

Posted by Sean Christopherson 1 month, 1 week ago

On Tue, May 05, 2026, Mikhail Gavrilov wrote:
> x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference()
> through machine_crash_shutdown() with IRQs disabled but with RCU not
> necessarily watching the crashing CPU, which triggers a suspicious
> RCU usage splat on debug kernels (CONFIG_PROVE_RCU=y) during
> panic/kdump:
> 
>   WARNING: suspicious RCU usage
>   arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!
> 
> [...]

Acked-by: Sean Christopherson <seanjc@google.com>

(I can also take this through kvm-x86; I have no preference whatsoever)

Re: [PATCH v2] x86/virt: Silence RCU lockdep splat in emergency virt callback path

Posted by Mikhail Gavrilov 1 month ago

On Fri, May 8, 2026 at 2:59 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Acked-by: Sean Christopherson <seanjc@google.com>
>
> (I can also take this through kvm-x86; I have no preference whatsoever)

Thanks Sean! Whichever path is most convenient works for me.

-- 
Best Regards,
Mike Gavrilov.