From nobody Sun May 24 20:35:28 2026
Received: from mail-lf1-f49.google.com (mail-lf1-f49.google.com
 [209.85.167.49])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E8BC23BCEE
	for <linux-kernel@vger.kernel.org>; Thu, 21 May 2026 15:08:50 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.167.49
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779376132; cv=none;
 b=YdQ2ZJHAQ9f6UG9+NuRK/GMu5bDvEFG8vhZ4a5ke6zDZeRPJCFjFUwhjhVGcjHyS/OgtNuY7rwOgP4K4YWCKfMh7feR9hjKfjJBsteWvGVqI36RNTSafh7ZMuFhXBxYQdDThjCyGMzbEzwnxOwplp58NF+0xpI+eJpvRPpP37PE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779376132; c=relaxed/simple;
	bh=xah8INPoIl6lNjJx8qaG+onAYm2FXP9my57l6oRqvZk=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
 b=nl9gLv8WEsmFZECfkbb7jnZGiHVM92PWZ3dxpJsO0mIhXJfY+0ERVnsDfa0dGcF1UBcOZBcxf8tj6Ev2EAKWGWiHAt44Cnp8gmDGAQ8DrEUJeKv7ajHD8tuAJdB3WXNAdAv5lwq2z4uTKCMmwzoUiOu5fRvZ8NYhS3QbZJMlen4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=Z2p9o7M5; arc=none smtp.client-ip=209.85.167.49
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="Z2p9o7M5"
Received: by mail-lf1-f49.google.com with SMTP id
 2adb3069b0e04-5a8e33556c0so7961109e87.0
        for <linux-kernel@vger.kernel.org>;
 Thu, 21 May 2026 08:08:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20251104; t=1779376129; x=1779980929;
 darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=1PYTv8RRPyhzNb2k3u7IsJEhmouq09DPN/7fCUe9rJc=;
        b=Z2p9o7M51SOMUPueTBSphmhE21Tmen5ARXcDt1muP19oAxldj/+OSFKNn88n+iL7vc
         2njzOAQL+Or0l7nFKKme8aOhxpl16mIsf6DrS3WESGo0zl0Bc64/oyUJZy042Me33uIe
         88g+fNenO8q3wdt0+KP1kW3SSWaeHHIxBC7m24MqssGtp5J9Coj36WU866IXSuGy1fFA
         CR90yT/rh0Vi6JpVlWTRl3aVPNLNK+8bnycDgq5b2I/6R1STJv4jttW0XhicosBn+G2Z
         7NZRmrIEaA6L9oLGBgBiFdHeef1h/nsAwUKEg4F+A2fbo+Im5fS588DVcg7O+RK38sPU
         l8vw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20251104; t=1779376129; x=1779980929;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from
         :to:cc:subject:date:message-id:reply-to;
        bh=1PYTv8RRPyhzNb2k3u7IsJEhmouq09DPN/7fCUe9rJc=;
        b=FS/UjRKrMQV9JGMbu9mzjF4fAtUki3qjk0rEtIwGXUEIdXpDhs83jKFuhEdf1L2lUw
         zMITGg61CuYWlGtg/eP++m3lGJPcBXN+5ASNlDmAwxSwW3NSDUIHfRuWpEL71MvfgUYh
         dptcFnS2CtRLNDJqDwEorx7KpjA1PNykSljDoUue+B1hC1Dg4us/Mv8Y6aCzxDU/jPeD
         4//zHOqMfOieRqBWM0mvfp3OV09Kket3Hn0/NsrqOW2aD5MldzL6f2i1DDJMKcONyun9
         d+gj0KAredRM5UKGtXWrBYnI4L4IqY+vI6Ln21SD4EV0fWejL/UiLKA7ze83JlMlNivT
         ab1g==
X-Forwarded-Encrypted: i=1;
 AFNElJ91JybHq1W4lrXEsxcaIGqLr2w9BAKdou3AYdPrYgAk5hHS/p6EpnL5EXjDYzY1vXBlmuN5ye9awC0vqL8=@vger.kernel.org
X-Gm-Message-State: AOJu0Yy0yWPVJHA2e7aWbqhbRshwkFs1PD4m4Nz17MFn8d9FY4OQi9f0
	lWGFE75wgkN0RBPbUin34aISA1cfWFg86jmQBuKLar6JXT5Fzj1ej52TQeY33G25go+P+LIt
X-Gm-Gg: Acq92OHqpBsZ14pxhBQI86ikJbTA8yVlw5MOe3J1MIJbuaFOYxnSlXlQ6XOtTYeyCgR
	u2XvrHs7yNkiA0/mQvUv+pulVuwbNgakjrlN2OjKst6BfNottL4NiXlMc0NibCA3BK7PZvp/fw1
	IMCxNUa1iMxEWsI/KNX1epCVBiV5RLBOLlOEV1KzlDdkMfFA8K579Yh604jb7bU2CtYNBe2tLOW
	kGPIzuiRx+/wgbAkjidwDbL5IY74n75dF9PXWPM861FNWo3giD1O0lJKm2D4+YPJJzotCghR3Ce
	3xWdj1Fc1w5xr7ChrAqc/DP9vipq5WKkr5BCNqU+mWkHMQo71JsVfbse8XmCI6g8/3+j6JQ8t3P
	pI477KejLrBmqKKeegENybhujEIPD3JbN/d9KxwsTcI57rzqfwk/5a/cpfb3lfPAKqZvrBDtRlI
	JGegtXIIBhj5iTSKunHCKoPDN6a9iJgdoD
X-Received: by 2002:a05:6512:4608:b0:5a8:837a:2a98 with SMTP id
 2adb3069b0e04-5aa2ba859admr1199599e87.14.1779376128530;
        Thu, 21 May 2026 08:08:48 -0700 (PDT)
Received: from localhost ([188.234.148.119])
        by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-5aa2f120c6esm335686e87.19.2026.05.21.08.08.47
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 21 May 2026 08:08:47 -0700 (PDT)
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v5 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to
 drm_exec
Date: Thu, 21 May 2026 20:08:39 +0500
Message-ID: <20260521150841.20625-2-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
References: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
 <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root
PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to
the caller. A caller that then needs to reserve further BOs (for example
the devcoredump IB dump) ends up nesting reservation_ww_class_mutex
acquires without a ww_acquire_ctx, which lockdep flags as recursive
locking.

Convert the helper to take a drm_exec context and lock the root PD with
drm_exec_lock_obj(). Callers now run it inside a
drm_exec_until_all_locked() loop and can lock additional BOs in the same
ww ticket, so there is no nested ww_mutex acquire.

The drm_exec context holds its own reference on the locked root BO, so
the helper no longer hands a root reference back to the caller: the
root output parameter is dropped, and the transient reference taken
across the PASID lookup is released before returning.

The only existing caller, amdgpu_vm_handle_fault(), is updated
accordingly. Its is_compute_context path, which previously dropped the
root reservation around svm_range_restore_pages() and re-took it, now
finalises the drm_exec context and re-initialises a fresh one; behaviour
is otherwise unchanged.

No functional change intended for the page-fault path.

Reviewed-by: Christian K=C3=B6nig <christian.koenig@amd.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 2 files changed, 58 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.c
index 9ba9de16a27a..d734d8c0de6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *d=
ata, struct drm_file *filp)
 }
=20
 /**
- * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pa=
sid, if possible.
+ * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD
  * @adev: amdgpu device pointer
- * @root: root BO of the VM
  * @pasid: PASID of the VM
- * The caller needs to unreserve and unref the root bo on success.
+ * @exec: drm_exec context to lock the root PD in
+ *
+ * Must be called from within a drm_exec_until_all_locked() loop; the call=
er
+ * runs drm_exec_retry_on_contention() afterwards. The drm_exec context ho=
lds
+ * a reference on the root BO until it is finalised.
+ *
+ * Return: the VM on success, or NULL if the PASID has no VM, the VM is be=
ing
+ * torn down, or locking the root PD failed.
  */
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid)
+					  u32 pasid, struct drm_exec *exec)
 {
 	unsigned long irqflags;
+	struct amdgpu_bo *root;
 	struct amdgpu_vm *vm;
 	int r;
=20
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	*root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
+	root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
=20
-	if (!*root)
+	if (!root)
 		return NULL;
=20
-	r =3D amdgpu_bo_reserve(*root, true);
-	if (r)
-		goto error_unref;
+	r =3D drm_exec_lock_obj(exec, &root->tbo.base);
+	if (r) {
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
 	/* Double check that the VM still exists */
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	if (vm && vm->root.bo !=3D *root)
+	if (vm && vm->root.bo !=3D root)
 		vm =3D NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
-	if (!vm)
-		goto error_unlock;
+	if (!vm) {
+		drm_exec_unlock_obj(exec, &root->tbo.base);
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
-	return vm;
-error_unlock:
-	amdgpu_bo_unreserve(*root);
+	/* The drm_exec context holds its own reference on the root BO. */
+	amdgpu_bo_unref(&root);
=20
-error_unref:
-	amdgpu_bo_unref(root);
-	return NULL;
+	return vm;
 }
=20
 /**
@@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 			    uint64_t ts, bool write_fault)
 {
 	bool is_compute_context =3D false;
-	struct amdgpu_bo *root;
+	struct drm_exec exec;
 	uint64_t value, flags;
 	struct amdgpu_vm *vm;
 	int r;
=20
-	vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-	if (!vm)
+	drm_exec_init(&exec, 0, 1);
+	drm_exec_until_all_locked(&exec) {
+		vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+		drm_exec_retry_on_contention(&exec);
+		if (!vm)
+			break;
+	}
+	if (!vm) {
+		drm_exec_fini(&exec);
 		return false;
+	}
=20
 	is_compute_context =3D vm->is_compute_context;
=20
 	if (is_compute_context) {
-		/* Unreserve root since svm_range_restore_pages might try to reserve it.=
 */
-		/* TODO: rework svm_range_restore_pages so that this isn't necessary. */
-		amdgpu_bo_unreserve(root);
+		/* Release the root PD lock since svm_range_restore_pages
+		 * might try to take it.
+		 * TODO: rework svm_range_restore_pages so that this isn't
+		 * necessary.
+		 */
+		drm_exec_fini(&exec);
=20
 		if (!svm_range_restore_pages(adev, pasid, vmid,
-					     node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
-			amdgpu_bo_unref(&root);
+					     node_id, addr >> PAGE_SHIFT, ts, write_fault))
 			return true;
-		}
-		amdgpu_bo_unref(&root);
=20
 		/* Re-acquire the VM lock, could be that the VM was freed in between. */
-		vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-		if (!vm)
+		drm_exec_init(&exec, 0, 1);
+		drm_exec_until_all_locked(&exec) {
+			vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+			drm_exec_retry_on_contention(&exec);
+			if (!vm)
+				break;
+		}
+		if (!vm) {
+			drm_exec_fini(&exec);
 			return false;
+		}
 	}
=20
 	addr /=3D AMDGPU_GPU_PAGE_SIZE;
@@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *ade=
v, u32 pasid,
 		value =3D 0;
 	}
=20
-	r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1);
+	r =3D dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve fence slot\n", r);
 		goto error_unlock;
@@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 	r =3D amdgpu_vm_update_pdes(adev, vm, true);
=20
 error_unlock:
-	amdgpu_bo_unreserve(root);
+	drm_exec_fini(&exec);
 	if (r < 0)
 		dev_err(adev->dev, "Can't handle page fault (%d)\n", r);
=20
-	amdgpu_bo_unref(&root);
-
 	return false;
 }
=20
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.h
index d083d7aab75c..0c6e3e0368c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev,=
 u32 pasid,
 			    bool write_fault);
=20
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid);
+					  u32 pasid, struct drm_exec *exec);
=20
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
=20
--=20
2.54.0

From nobody Sun May 24 20:35:28 2026
Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com
 [209.85.167.53])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4934332B128
	for <linux-kernel@vger.kernel.org>; Thu, 21 May 2026 15:08:52 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.167.53
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779376134; cv=none;
 b=mWVt4HWqNuGOeHZgLhSo3XlVKvl6QszGrcWKFp3A0egG0yhxf7psokBXcnO8hclTdv+e5piWo9VO7N1KGm0bCHF2n9fB0FVG1JTL2oJgv/ghBy/2Rveq3Y9PKu+FPW5m9dQUsSaZVIOUEzPvN+eHenZs20lI5MBafr/USkRdd/A=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779376134; c=relaxed/simple;
	bh=jhawuaQE4VaUOaqBjSVJ7L9kHTPtT9q8QTpefJsl81M=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
 b=D9pkNrDJJwjvB1CqpGQcfhfpMDTmIn6n2PntKjEZRjPIo+VTWhngopkSdVKfvllWU7SY8ZGW+fFCAo1IN13zwfSZH6G9vGZ3IlWm11GFouR0oMTgYUdwnDYxSxMx2buwoxjeBda4bHW2o8WOqe3RYsPftkCz7LMs8JS3bkAv7wI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=TdviwDXV; arc=none smtp.client-ip=209.85.167.53
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="TdviwDXV"
Received: by mail-lf1-f53.google.com with SMTP id
 2adb3069b0e04-5a8738c178dso5758025e87.1
        for <linux-kernel@vger.kernel.org>;
 Thu, 21 May 2026 08:08:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20251104; t=1779376130; x=1779980930;
 darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=V2MVfvQBPeY6Pwia0clIRhZPiWkl2nWuf72y0t9gH4Q=;
        b=TdviwDXVIHPeaDNfUcwJJOO/rRX1Pj2G/9FpNHhuNiL26KcIAKBuyb4W8BmhtTjpmN
         zo8xb6acZRgZXhLzrMLTxj+sINpr2Kc3ekAn/eaTmNWJxuoIckZnXynUncu3+X4q1nTp
         WnrwRZjZAssoef44K1pgkmpZVr6wFJ140cyuWNn4AxOBAMPomV6LmJw6JQKUbWGXwlny
         8h0Xb066367eIX2++TxrUWKE69ACbJhVgUR+k+byruT0O5UxsB+pmwzTPkkW9Qv5/Su5
         QIWWvkw05RipsT5tWq4HTW96XejbNNSC5Rq687N5777KvKXs8PIV9iKU5RI/bQqNCMdg
         2SGQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20251104; t=1779376130; x=1779980930;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from
         :to:cc:subject:date:message-id:reply-to;
        bh=V2MVfvQBPeY6Pwia0clIRhZPiWkl2nWuf72y0t9gH4Q=;
        b=AZLbUqhp4qLar+2MGVzsh4hgrOTF0f5ehxxkFx4o+lNsNikE7S/QfvPReR9BEEagHW
         Ga78Tm663tBq/esl/4G5VTM2x96LZxfivOL9+7IvWyyyLE/bZY9I/N3WBfLCal3P0TfK
         NCcX3TwpPJYrhOVYquo2TmKyT3fJLXVjlfZhLPToFo8idXS/Ky14mglUhtxYrRzwc1nV
         qZp4YD6p4Z9dtO4CuG/mDALStMdNnywpEkHgpZJYG5cE17mWto3vK1W2sL+BVk3V/D35
         qVSsW9Fb2YW/ZG96DDvhNTBs8tZ5SBa0eI8SFCY36pQKy8eJi9mWIxCYUmyV2MUSAp8l
         yW1g==
X-Forwarded-Encrypted: i=1;
 AFNElJ9i4FkGLXmg6z4Z7cskm6QsYTYXIDKMUs6bf0tBrwWcgZ+o3cPeDyWq/2HMjYN2WgBBAuBQW4sAuNgKMTk=@vger.kernel.org
X-Gm-Message-State: AOJu0Yx/4WFVHJTV7WPl+W7vm++4E14b3TN5HJgoCmUWbYAZeTk33Qa4
	72eO7RD/pGiWt6g+5rExAON6ZqsnNAqSBi1SwEvSqOM1WIusVt0wWYUa
X-Gm-Gg: Acq92OE80JW/tY/WP5p+Cqmc95lJf2GzldTzl5fawrt2lnoB7Oayqtp3/Zy8Aa9VBnd
	eNhHbkCt/TUrHzvN4qE5ViHJFM6pB/rtXxKi7PyPv+OCAREwkDWTBdwAopEthghLknEH48J4DD+
	/oduY5MBktyRwoCKf7DPAfpq0FxKiD0VElK1cjNCkIrYDYlzqsCj+CkZTo5AW5fklxE3fPqyi7l
	oq1KRfag93aaqwuGHUkcM9415N9Am2gP2Q1FoVdHlxyzb2wnIIbffSmxYLPqNnVsaTZNDlwuyfb
	EvXifwlixbxDfgPRfOXE9MZeklSHCcyxQ7UqGEPIQpQQNdIhOLmoOidlBXcTJkIj/3MI9KhvE2l
	eZ+RXAQy3APB+h6Z9ly9ak13uaxs/yaVuJLqRbYtz7wnajMzkoZ2mzkKbByzJ0yBGB3Gm2F4THj
	qepeXofgInsum3Ci85qkRpujXr2XG35VCK
X-Received: by 2002:ac2:5dcd:0:b0:5a8:a754:f9d5 with SMTP id
 2adb3069b0e04-5aa2ba9ac3bmr843915e87.39.1779376130152;
        Thu, 21 May 2026 08:08:50 -0700 (PDT)
Received: from localhost ([188.234.148.119])
        by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-5aa2f120c6esm335686e87.19.2026.05.21.08.08.48
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 21 May 2026 08:08:49 -0700 (PDT)
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v5 2/2] drm/amdgpu: fix recursive ww_mutex acquire in
 amdgpu_devcoredump_format
Date: Thu, 21 May 2026 20:08:40 +0500
Message-ID: <20260521150841.20625-3-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
References: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
 <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

When dumping IB contents from a hung job, amdgpu_devcoredump_format()
acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and
then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB.
Both reservations are reservation_ww_class_mutex objects and neither
used a ww_acquire_ctx, which trips lockdep:

  WARNING: possible recursive locking detected
  --------------------------------------------
  kworker/u128:0 is trying to acquire lock:
  ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

  but task is already holding lock:
  ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

   Possible unsafe locking scenario:
         CPU0
         ----
    lock(reservation_ww_class_mutex);
    lock(reservation_ww_class_mutex);

   *** DEADLOCK ***
   May be due to missing lock nesting notation

  Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu]
  Call Trace:
   __ww_mutex_lock.constprop.0
   ww_mutex_lock
   amdgpu_bo_reserve
   amdgpu_devcoredump_format+0x1594 [amdgpu]
   amdgpu_devcoredump_deferred_work+0xea [amdgpu]

The two reservations are on different BOs in the captured trace, so the
splat is a lockdep-correctness warning, not an observed deadlock. It
becomes a real self-deadlock whenever the IB BO shares its dma_resv with
the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()):
amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket
and blocks forever.

With amdgpu.gpu_recovery=3D0 the timeout handler refires every ~2 s and
each invocation produces this splat, drowning the kernel ring buffer.

Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, lock the
root PD and every IB BO together in a single drm_exec ticket.
DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g.
always-valid BOs, or two IBs backed by the same BO). Every lock is now
a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex
condition is gone, and the per-IB amdgpu_bo_reserve()/amdgpu_bo_unref()
dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure
path -- is removed.

Reproducer (~150 LoC libdrm_amdgpu): submit a single GFX IB containing
PACKET3_INDIRECT_BUFFER chained at GPU VA 0 and wait for the fence. The
TDR fires within ~10 s and the deferred coredump worker produces the
splat above on every invocation; with this change applied the splat is
gone.

Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump")
Suggested-by: Christian K=C3=B6nig <christian.koenig@amd.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 105 ++++++++++++------
 1 file changed, 71 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu=
/drm/amd/amdgpu/amdgpu_dev_coredump.c
index d386bc775d03..456ea9911d48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -24,6 +24,7 @@
=20
 #include <generated/utsrelease.h>
 #include <linux/devcoredump.h>
+#include <drm/drm_exec.h>
 #include "amdgpu_dev_coredump.h"
 #include "atom.h"
=20
@@ -214,13 +215,9 @@ amdgpu_devcoredump_format(char *buffer, size_t count, =
struct amdgpu_coredump_inf
 	struct drm_printer p;
 	struct drm_print_iterator iter;
 	struct amdgpu_vm_fault_info *fault_info;
-	struct amdgpu_bo_va_mapping *mapping;
 	struct amdgpu_ip_block *ip_block;
 	struct amdgpu_res_cursor cursor;
-	struct amdgpu_bo *abo, *root;
-	uint64_t va_start, offset;
 	struct amdgpu_ring *ring;
-	struct amdgpu_vm *vm;
 	u32 *ib_content;
 	uint8_t *kptr;
 	int ver, i, j, r;
@@ -343,43 +340,84 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
=20
 	if (coredump->num_ibs) {
-		/* Don't try to lookup the VM or map the BOs when calculating the
-		 * size required to store the devcoredump.
+		struct amdgpu_bo_va_mapping *mapping;
+		struct amdgpu_bo *abo;
+		struct drm_exec exec;
+		struct amdgpu_vm *vm;
+		u64 va_start, offset;
+		bool locked =3D false;
+
+		/*
+		 * Lock the VM root PD and every IB BO together in a single
+		 * drm_exec ticket. Reserving the IB BOs one by one while the
+		 * root PD is held would be a recursive reservation_ww_class_mutex
+		 * acquire without a ww_acquire_ctx, which trips lockdep and
+		 * self-deadlocks for IB BOs that share their dma_resv with the
+		 * root PD (always-valid BOs).
+		 *
+		 * Skip locking entirely on the sizing pass: it does not write
+		 * IB content, so the size estimate doesn't depend on whether
+		 * the BOs are reachable.
 		 */
-		if (sizing_pass)
-			vm =3D NULL;
-		else
-			vm =3D amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid);
+		if (!sizing_pass) {
+			drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES,
+				      1 + coredump->num_ibs);
+			drm_exec_until_all_locked(&exec) {
+				vm =3D amdgpu_vm_lock_by_pasid(adev, coredump->pasid,
+							     &exec);
+				drm_exec_retry_on_contention(&exec);
+				if (!vm)
+					break;
+
+				for (int i =3D 0; i < coredump->num_ibs; i++) {
+					u64 pfn;
+
+					va_start =3D coredump->ibs[i].gpu_addr &
+						   AMDGPU_GMC_HOLE_MASK;
+					pfn =3D va_start / AMDGPU_GPU_PAGE_SIZE;
+					mapping =3D amdgpu_vm_bo_lookup_mapping(vm, pfn);
+					if (!mapping)
+						continue;
+
+					abo =3D mapping->bo_va->base.bo;
+					r =3D drm_exec_lock_obj(&exec, &abo->tbo.base);
+					drm_exec_retry_on_contention(&exec);
+					if (r)
+						break;
+				}
+				if (r)
+					break;
+			}
+			if (vm && !r)
+				locked =3D true;
+			else
+				drm_exec_fini(&exec);
+		}
+
+		for (int i =3D 0; i < coredump->num_ibs; i++) {
+			bool emit_content =3D sizing_pass;
=20
-		for (int i =3D 0; i < coredump->num_ibs && (sizing_pass || vm); i++) {
 			ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4,
 						    GFP_KERNEL);
 			if (!ib_content)
 				continue;
=20
-			/* vm=3DNULL can only happen when 'sizing_pass' is true. Skip to the
-			 * drm_printf() calls (ib_content doesn't need to be initialized
-			 * as its content won't be written anywhere).
-			 */
-			if (!vm)
+			if (!locked)
 				goto output_ib_content;
=20
 			va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK;
 			mapping =3D amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_=
SIZE);
 			if (!mapping)
-				goto free_ib_content;
+				goto output_ib_content;
=20
-			offset =3D va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE);
-			abo =3D amdgpu_bo_ref(mapping->bo_va->base.bo);
-			r =3D amdgpu_bo_reserve(abo, false);
-			if (r)
-				goto free_ib_content;
+			abo =3D mapping->bo_va->base.bo;
+			offset =3D va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE;
=20
 			if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) {
 				off =3D 0;
=20
 				if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM)
-					goto unreserve_abo;
+					goto output_ib_content;
=20
 				amdgpu_res_first(abo->tbo.resource, offset,
 						 coredump->ibs[i].ib_size_dw * 4,
@@ -391,12 +429,13 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 					off +=3D cursor.size;
 					amdgpu_res_next(&cursor, cursor.size);
 				}
+				emit_content =3D true;
 			} else {
 				r =3D ttm_bo_kmap(&abo->tbo, 0,
 						PFN_UP(abo->tbo.base.size),
 						&abo->kmap);
 				if (r)
-					goto unreserve_abo;
+					goto output_ib_content;
=20
 				kptr =3D amdgpu_bo_kptr(abo);
 				kptr +=3D offset;
@@ -404,23 +443,21 @@ amdgpu_devcoredump_format(char *buffer, size_t count,=
 struct amdgpu_coredump_inf
 				       coredump->ibs[i].ib_size_dw * 4);
=20
 				amdgpu_bo_kunmap(abo);
+				emit_content =3D true;
 			}
=20
 output_ib_content:
 			drm_printf(&p, "\nIB #%d 0x%llx %d dw\n",
 				   i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw);
-			for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
-				drm_printf(&p, "0x%08x\n", ib_content[j]);
-unreserve_abo:
-			if (vm)
-				amdgpu_bo_unreserve(abo);
-free_ib_content:
+			if (emit_content) {
+				for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
+					drm_printf(&p, "0x%08x\n", ib_content[j]);
+			}
 			kvfree(ib_content);
 		}
-		if (vm) {
-			amdgpu_bo_unreserve(root);
-			amdgpu_bo_unref(&root);
-		}
+
+		if (locked)
+			drm_exec_fini(&exec);
 	}
=20
 	return count - iter.remain;
--=20
2.54.0