[PATCH] drm/vmwgfx: Break ABBA deadlock in vblank disable path

w15303746062@163.com posted 1 patch 2 days, 5 hours ago
From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Disabling the CRTC while the VKMS vblank hrtimer is running can deadlock.
The cause is a circular dependency (ABBA) between the DRM core's
dev->vbl_lock and hrtimer cancellation: the disable path holds the lock
while waiting for the timer callback, and the callback needs the lock.

NMI backtraces confirm the deadlock:
CPU 0 (IRQ Context):
 [ <0>] hrtimer_interrupt
 [ <0>] vkms_vblank_simulate
 [ <0>] drm_crtc_handle_vblank
 [ <0>] _raw_spin_lock_irqsave (waiting for dev->vbl_lock)

CPU 2 (Process Context):
 [ <2>] drm_crtc_vblank_off
 [ <2>] vmw_vkms_disable_vblank
 [ <2>] hrtimer_cancel (blocks waiting for timer callback)

This results in a system lockup and RCU stall:
[ 3367.370429] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks
[ 3367.912523] rcu: rcu_preempt kthread starved for 10504 jiffies!

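The driver calls the blocking hrtimer_cancel() inside the
disable_vblank() callback, which runs with dev->vbl_lock held. The timer
callback on CPU 0 therefore never acquires the lock, and hrtimer_cancel()
on CPU 2 never returns.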

Fix this by using the non-blocking hrtimer_try_to_cancel() in
vmw_vkms_disable_vblank(). This callback is called with dev->vbl_lock
held by the DRM core, so it must not block. Then call hrtimer_cancel() in
vmw_vkms_crtc_atomic_disable() after drm_crtc_vblank_off() has released
the lock. This stops the timer synchronously without risking the
deadlock.

Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
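Note (not part of the commit message): the two-step stop described above
can be illustrated with a small userspace model. This is only a sketch
under stated assumptions, not driver code. A pthread mutex stands in for
dev->vbl_lock, a looping thread stands in for the hrtimer callback, and
every name below (vbl_lock_model, timer_thread, disable_vblank_locked,
crtc_disable, run_demo) is hypothetical. It shows why the blocking wait
is safe only once the lock has been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t vbl_lock_model = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool timer_armed;	/* models the timer being queued */
static atomic_int ticks;		/* callbacks that ran to completion */

/* Models the timer callback: every tick needs the lock. */
static void *timer_thread(void *arg)
{
	(void)arg;
	while (atomic_load(&timer_armed)) {
		pthread_mutex_lock(&vbl_lock_model);
		atomic_fetch_add(&ticks, 1);
		pthread_mutex_unlock(&vbl_lock_model);
		sched_yield();
	}
	return NULL;
}

/* Models disable_vblank(): the caller holds the lock, so only request
 * the stop and never wait for the callback here. */
static void disable_vblank_locked(void)
{
	atomic_store(&timer_armed, false);
}

/* Models the atomic-disable path: request the stop under the lock,
 * drop the lock, then wait synchronously for the callback to finish. */
static int crtc_disable(pthread_t timer)
{
	pthread_mutex_lock(&vbl_lock_model);
	disable_vblank_locked();
	pthread_mutex_unlock(&vbl_lock_model);

	/* The lock is released, so a callback that is mid-tick can still
	 * take it and exit; joining here cannot deadlock. */
	return pthread_join(timer, NULL);
}

/* Returns the number of completed ticks, or -1 on failure. */
int run_demo(void)
{
	pthread_t timer;

	atomic_store(&ticks, 0);
	atomic_store(&timer_armed, true);
	if (pthread_create(&timer, NULL, timer_thread, NULL) != 0)
		return -1;

	/* Let the modeled callback run at least once before disabling. */
	while (atomic_load(&ticks) < 1)
		sched_yield();

	if (crtc_disable(timer) != 0)
		return -1;
	assert(!atomic_load(&timer_armed));
	return atomic_load(&ticks);
}
```

Moving the pthread_join() call inside the locked region of crtc_disable()
would recreate the hang in this model whenever the timer thread is
blocked on the mutex at that moment, which mirrors the CPU 0 / CPU 2
traces above.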
 drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
index 5abd7f5ad2db..96fc856b9e06 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
@@ -305,7 +305,10 @@ vmw_vkms_disable_vblank(struct drm_crtc *crtc)
 	if (!vmw->vkms_enabled)
 		return;
 
-	hrtimer_cancel(&du->vkms.timer);
+	/*
+	 * Non-blocking cancel to avoid ABBA deadlock while holding vbl_lock.
+	 */
+	hrtimer_try_to_cancel(&du->vkms.timer);
 	du->vkms.surface = NULL;
 	du->vkms.period_ns = ktime_set(0, 0);
 }
@@ -390,9 +393,16 @@ vmw_vkms_crtc_atomic_disable(struct drm_crtc *crtc,
 			     struct drm_atomic_state *state)
 {
 	struct vmw_private *vmw = vmw_priv(crtc->dev);
+	struct vmw_display_unit *du = vmw_crtc_to_du(crtc);
 
-	if (vmw->vkms_enabled)
-		drm_crtc_vblank_off(crtc);
+	if (vmw->vkms_enabled) {
+		drm_crtc_vblank_off(crtc);
+		/*
+		 * Synchronously stop the timer after releasing the vbl_lock
+		 * to ensure no further callbacks occur.
+		 */
+		hrtimer_cancel(&du->vkms.timer);
+	}
 }
 
 static bool
-- 
2.34.1