From nobody Sun May 24 21:38:44 2026
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v4 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to
 drm_exec
Date: Thu, 21 May 2026 15:43:32 +0500
Message-ID: <20260521104335.28978-2-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
References: <20260520151741.50575-1-mikhail.v.gavrilov@gmail.com>
 <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root
PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to
the caller. A caller that then needs to reserve further BOs (for example
the devcoredump IB dump) ends up nesting reservation_ww_class_mutex
acquires without a ww_acquire_ctx, which lockdep flags as recursive
locking.

Convert the helper to take a drm_exec context and lock the root PD with
drm_exec_lock_obj(). Callers now run it inside a
drm_exec_until_all_locked() loop and can lock additional BOs in the same
ww ticket, so there is no nested ww_mutex acquire.

The drm_exec context holds its own reference on the locked root BO, so
the helper no longer hands a root reference back to the caller: the
root output parameter is dropped, and the transient reference taken
across the PASID lookup is released before returning.

amdgpu_vm_handle_fault() is updated accordingly. Its is_compute_context
path, which previously dropped the root reservation around
svm_range_restore_pages() and re-took it, now finalises the drm_exec
context and re-initialises a fresh one; behaviour is otherwise
unchanged. The other existing caller, amdgpu_devcoredump_format(), is
converted in the next patch of this series.

No functional change intended for the page-fault path.

Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 2 files changed, 58 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.c
index 9ba9de16a27a..591980907211 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *d=
ata, struct drm_file *filp)
 }
=20
 /**
- * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pa=
sid, if possible.
+ * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD
  * @adev: amdgpu device pointer
- * @root: root BO of the VM
  * @pasid: PASID of the VM
- * The caller needs to unreserve and unref the root bo on success.
+ * @exec: drm_exec context to lock the root PD in
+ *
+ * Must be called from within a drm_exec_until_all_locked() loop; the call=
er
+ * runs drm_exec_retry_on_contention() afterwards. The drm_exec context ho=
lds
+ * a reference on the root BO until it is finalised.
+ *
+ * Return: the VM on success, or NULL if the PASID has no VM, the VM is be=
ing
+ * torn down, or locking the root PD failed (including on contention).
  */
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid)
+					  u32 pasid, struct drm_exec *exec)
 {
 	unsigned long irqflags;
+	struct amdgpu_bo *root;
 	struct amdgpu_vm *vm;
 	int r;
=20
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	*root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
+	root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
=20
-	if (!*root)
+	if (!root)
 		return NULL;
=20
-	r =3D amdgpu_bo_reserve(*root, true);
-	if (r)
-		goto error_unref;
+	r =3D drm_exec_lock_obj(exec, &root->tbo.base);
+	if (r) {
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
 	/* Double check that the VM still exists */
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	if (vm && vm->root.bo !=3D *root)
+	if (vm && vm->root.bo !=3D root)
 		vm =3D NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
-	if (!vm)
-		goto error_unlock;
+	if (!vm) {
+		drm_exec_unlock_obj(exec, &root->tbo.base);
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
-	return vm;
-error_unlock:
-	amdgpu_bo_unreserve(*root);
+	/* The drm_exec context holds its own reference on the root BO. */
+	amdgpu_bo_unref(&root);
=20
-error_unref:
-	amdgpu_bo_unref(root);
-	return NULL;
+	return vm;
 }
=20
 /**
@@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 			    uint64_t ts, bool write_fault)
 {
 	bool is_compute_context =3D false;
-	struct amdgpu_bo *root;
+	struct drm_exec exec;
 	uint64_t value, flags;
 	struct amdgpu_vm *vm;
 	int r;
=20
-	vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-	if (!vm)
+	drm_exec_init(&exec, 0, 0);
+	drm_exec_until_all_locked(&exec) {
+		vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+		drm_exec_retry_on_contention(&exec);
+		if (!vm)
+			break;
+	}
+	if (!vm) {
+		drm_exec_fini(&exec);
 		return false;
+	}
=20
 	is_compute_context =3D vm->is_compute_context;
=20
 	if (is_compute_context) {
-		/* Unreserve root since svm_range_restore_pages might try to reserve it.=
 */
-		/* TODO: rework svm_range_restore_pages so that this isn't necessary. */
-		amdgpu_bo_unreserve(root);
+		/* Release the root PD lock since svm_range_restore_pages
+		 * might try to take it.
+		 * TODO: rework svm_range_restore_pages so that this isn't
+		 * necessary.
+		 */
+		drm_exec_fini(&exec);
=20
 		if (!svm_range_restore_pages(adev, pasid, vmid,
-					     node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
-			amdgpu_bo_unref(&root);
+					     node_id, addr >> PAGE_SHIFT, ts, write_fault))
 			return true;
-		}
-		amdgpu_bo_unref(&root);
=20
 		/* Re-acquire the VM lock, could be that the VM was freed in between. */
-		vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-		if (!vm)
+		drm_exec_init(&exec, 0, 0);
+		drm_exec_until_all_locked(&exec) {
+			vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+			drm_exec_retry_on_contention(&exec);
+			if (!vm)
+				break;
+		}
+		if (!vm) {
+			drm_exec_fini(&exec);
 			return false;
+		}
 	}
=20
 	addr /=3D AMDGPU_GPU_PAGE_SIZE;
@@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *ade=
v, u32 pasid,
 		value =3D 0;
 	}
=20
-	r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1);
+	r =3D dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve fence slot\n", r);
 		goto error_unlock;
@@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 	r =3D amdgpu_vm_update_pdes(adev, vm, true);
=20
 error_unlock:
-	amdgpu_bo_unreserve(root);
+	drm_exec_fini(&exec);
 	if (r < 0)
 		dev_err(adev->dev, "Can't handle page fault (%d)\n", r);
=20
-	amdgpu_bo_unref(&root);
-
 	return false;
 }
=20
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.h
index d083d7aab75c..0c6e3e0368c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev,=
 u32 pasid,
 			    bool write_fault);
=20
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid);
+					  u32 pasid, struct drm_exec *exec);
=20
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
=20
--=20
2.54.0
From nobody Sun May 24 21:38:44 2026
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v4 2/2] drm/amdgpu: fix recursive ww_mutex acquire in
 amdgpu_devcoredump_format
Date: Thu, 21 May 2026 15:43:33 +0500
Message-ID: <20260521104335.28978-3-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
References: <20260520151741.50575-1-mikhail.v.gavrilov@gmail.com>
 <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

When dumping IB contents from a hung job, amdgpu_devcoredump_format()
acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and
then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB.
Both reservations are reservation_ww_class_mutex objects and neither
used a ww_acquire_ctx, which trips lockdep:

  WARNING: possible recursive locking detected
  --------------------------------------------
  kworker/u128:0 is trying to acquire lock:
  ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

  but task is already holding lock:
  ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

   Possible unsafe locking scenario:
         CPU0
         ----
    lock(reservation_ww_class_mutex);
    lock(reservation_ww_class_mutex);

   *** DEADLOCK ***
   May be due to missing lock nesting notation

  Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu]
  Call Trace:
   __ww_mutex_lock.constprop.0
   ww_mutex_lock
   amdgpu_bo_reserve
   amdgpu_devcoredump_format+0x1594 [amdgpu]
   amdgpu_devcoredump_deferred_work+0xea [amdgpu]

The two reservations are on different BOs in the captured trace, so the
splat is a lockdep-correctness warning, not an observed deadlock. It
becomes a real self-deadlock whenever the IB BO shares its dma_resv with
the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()):
amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket
and blocks forever.

With amdgpu.gpu_recovery=3D0 the timeout handler refires every ~2 s and
each invocation produces this splat, drowning the kernel ring buffer.

Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, lock the
root PD and every IB BO together in a single drm_exec ticket.
DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g.
always-valid BOs, or two IBs backed by the same BO). Every lock is now
a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex
condition is gone, and the per-IB amdgpu_bo_ref()/amdgpu_bo_reserve()
dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure
path -- is removed.

Reproducer (~150 LoC libdrm_amdgpu): submit a single GFX IB containing
PACKET3_INDIRECT_BUFFER chained at GPU VA 0 and wait for the fence. The
TDR fires within ~10 s and the deferred coredump worker produces the
splat above on every invocation; with this change applied the splat is
gone.

Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump")
Suggested-by: Christian K=C3=B6nig <christian.koenig@amd.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 105 ++++++++++++------
 1 file changed, 71 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu=
/drm/amd/amdgpu/amdgpu_dev_coredump.c
index d386bc775d03..456ea9911d48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -24,6 +24,7 @@
=20
 #include <generated/utsrelease.h>
 #include <linux/devcoredump.h>
+#include <drm/drm_exec.h>
 #include "amdgpu_dev_coredump.h"
 #include "atom.h"
=20
@@ -214,13 +215,9 @@ amdgpu_devcoredump_format(char *buffer, size_t count, =
struct amdgpu_coredump_inf
 	struct drm_printer p;
 	struct drm_print_iterator iter;
 	struct amdgpu_vm_fault_info *fault_info;
-	struct amdgpu_bo_va_mapping *mapping;
 	struct amdgpu_ip_block *ip_block;
 	struct amdgpu_res_cursor cursor;
-	struct amdgpu_bo *abo, *root;
-	uint64_t va_start, offset;
 	struct amdgpu_ring *ring;
-	struct amdgpu_vm *vm;
 	u32 *ib_content;
 	uint8_t *kptr;
 	int ver, i, j, r;
@@ -343,43 +340,84 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
=20
 	if (coredump->num_ibs) {
-		/* Don't try to lookup the VM or map the BOs when calculating the
-		 * size required to store the devcoredump.
+		struct amdgpu_bo_va_mapping *mapping;
+		struct amdgpu_bo *abo;
+		struct drm_exec exec;
+		struct amdgpu_vm *vm;
+		u64 va_start, offset;
+		bool locked =3D false;
+
+		/*
+		 * Lock the VM root PD and every IB BO together in a single
+		 * drm_exec ticket. Reserving the IB BOs one by one while the
+		 * root PD is held would be a recursive reservation_ww_class_mutex
+		 * acquire without a ww_acquire_ctx, which trips lockdep and
+		 * self-deadlocks for IB BOs that share their dma_resv with the
+		 * root PD (always-valid BOs).
+		 *
+		 * Skip locking entirely on the sizing pass: it does not write
+		 * IB content, so the size estimate doesn't depend on whether
+		 * the BOs are reachable.
 		 */
-		if (sizing_pass)
-			vm =3D NULL;
-		else
-			vm =3D amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid);
+		if (!sizing_pass) {
+			drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES,
+				      1 + coredump->num_ibs);
+			drm_exec_until_all_locked(&exec) {
+				vm =3D amdgpu_vm_lock_by_pasid(adev, coredump->pasid,
+							     &exec);
+				drm_exec_retry_on_contention(&exec);
+				if (!vm)
+					break;
+				r =3D 0;
+				for (int i =3D 0; i < coredump->num_ibs; i++) {
+					u64 pfn;
+
+					va_start =3D coredump->ibs[i].gpu_addr &
+						   AMDGPU_GMC_HOLE_MASK;
+					pfn =3D va_start / AMDGPU_GPU_PAGE_SIZE;
+					mapping =3D amdgpu_vm_bo_lookup_mapping(vm, pfn);
+					if (!mapping)
+						continue;
+
+					abo =3D mapping->bo_va->base.bo;
+					r =3D drm_exec_lock_obj(&exec, &abo->tbo.base);
+					drm_exec_retry_on_contention(&exec);
+					if (r)
+						break;
+				}
+				if (r)
+					break;
+			}
+			if (vm && !r)
+				locked =3D true;
+			else
+				drm_exec_fini(&exec);
+		}
+
+		for (int i =3D 0; i < coredump->num_ibs; i++) {
+			bool emit_content =3D sizing_pass;
=20
-		for (int i =3D 0; i < coredump->num_ibs && (sizing_pass || vm); i++) {
 			ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4,
 						    GFP_KERNEL);
 			if (!ib_content)
 				continue;
=20
-			/* vm=3DNULL can only happen when 'sizing_pass' is true. Skip to the
-			 * drm_printf() calls (ib_content doesn't need to be initialized
-			 * as its content won't be written anywhere).
-			 */
-			if (!vm)
+			if (!locked)
 				goto output_ib_content;
=20
 			va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK;
 			mapping =3D amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_=
SIZE);
 			if (!mapping)
-				goto free_ib_content;
+				goto output_ib_content;
=20
-			offset =3D va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE);
-			abo =3D amdgpu_bo_ref(mapping->bo_va->base.bo);
-			r =3D amdgpu_bo_reserve(abo, false);
-			if (r)
-				goto free_ib_content;
+			abo =3D mapping->bo_va->base.bo;
+			offset =3D va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE;
=20
 			if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) {
 				off =3D 0;
=20
 				if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM)
-					goto unreserve_abo;
+					goto output_ib_content;
=20
 				amdgpu_res_first(abo->tbo.resource, offset,
 						 coredump->ibs[i].ib_size_dw * 4,
@@ -391,12 +429,13 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 					off +=3D cursor.size;
 					amdgpu_res_next(&cursor, cursor.size);
 				}
+				emit_content =3D true;
 			} else {
 				r =3D ttm_bo_kmap(&abo->tbo, 0,
 						PFN_UP(abo->tbo.base.size),
 						&abo->kmap);
 				if (r)
-					goto unreserve_abo;
+					goto output_ib_content;
=20
 				kptr =3D amdgpu_bo_kptr(abo);
 				kptr +=3D offset;
@@ -404,23 +443,21 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 				       coredump->ibs[i].ib_size_dw * 4);
=20
 				amdgpu_bo_kunmap(abo);
+				emit_content =3D true;
 			}
=20
 output_ib_content:
 			drm_printf(&p, "\nIB #%d 0x%llx %d dw\n",
 				   i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw);
-			for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
-				drm_printf(&p, "0x%08x\n", ib_content[j]);
-unreserve_abo:
-			if (vm)
-				amdgpu_bo_unreserve(abo);
-free_ib_content:
+			if (emit_content) {
+				for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
+					drm_printf(&p, "0x%08x\n", ib_content[j]);
+			}
 			kvfree(ib_content);
 		}
-		if (vm) {
-			amdgpu_bo_unreserve(root);
-			amdgpu_bo_unref(&root);
-		}
+
+		if (locked)
+			drm_exec_fini(&exec);
 	}
=20
 	return count - iter.remain;
--=20
2.54.0