From nobody Mon Jun  8 13:30:20 2026
Received: from mail-lj1-f169.google.com (mail-lj1-f169.google.com
 [209.85.208.169])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5FC33AB29D
	for <linux-kernel@vger.kernel.org>; Fri, 29 May 2026 06:48:22 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.208.169
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1780037304; cv=none;
 b=O2MDmhb7TonZJc1ZTnVdiZGt0eA8nt1DeuLMtyBOp7cOI433mnlVeDcwDjO9CoVsPz5w+++ULGnW9KyNcj0sKN7XCVTLXJf/xLcO6n8vNm1wR3oYP3zGPMJvvMrXhx2o5IDL0TSjgZDTnm4KMsXpKdQp6Q6t4jqjPgh4SL6AnJ0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1780037304; c=relaxed/simple;
	bh=Cpi20G3hUtpGa0J80DAMLB46h6zNbwjq96KGIYnVB/M=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
 b=CJQ84HYC2jqlj+hkFd3duHafGYqWRUJzOqzWQYQaPNE+no8wvbaNuiEIFwl1w2PVnxLe5DEWvs7M0Y3TyQzTo3DggF/00mkvU9HRtJnxrwkiFPVSgvVjXUtJajKI4ZrJ0eQXwh6ybGxWs11KYJCw3f2wnyUa2UOonZqVirv0S1g=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=deJKX4mJ; arc=none smtp.client-ip=209.85.208.169
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="deJKX4mJ"
Received: by mail-lj1-f169.google.com with SMTP id
 38308e7fff4ca-393a49d2e5eso115936701fa.2
        for <linux-kernel@vger.kernel.org>;
 Thu, 28 May 2026 23:48:22 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20251104; t=1780037301; x=1780642101;
 darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=+aMMV5ilmpFv8Pwd8WuxApxlzDJGWTTKgJEHknglssE=;
        b=deJKX4mJsN+T6F+YLL6nnDRmGf1a+YmI36XhEvp+sMZYKxZnJSC/X8dW5sm+42wR1K
         6h2MmKxTYiGO17z7Qrf6nei5nZplwlPn+ricvRzHxBtegYMQrGHlLOzRoSRrIfI9xl8R
         4JjLBckhYdbMteRfaolfthmAuti1+rQzFBtOemRtVNbFvQuRS4IxgWx8f+ZPaD5Q9rRG
         dKGvucEko2PL+hpMk5oK08y304n9333lzdaHeUxcJUiVMicFrvYIqeK7DQ2LN7N4S2K5
         M/f3LvDh95E6D8VwnvehKqfzqGIJ6fEVCjYgmKiELZdW+E6LIhypUbKH7rK0+Tr4szCe
         Y7Tw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20251104; t=1780037301; x=1780642101;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from
         :to:cc:subject:date:message-id:reply-to;
        bh=+aMMV5ilmpFv8Pwd8WuxApxlzDJGWTTKgJEHknglssE=;
        b=ZdLyoAiA+B63C5yk5FpDdk4R9A2+dODA7Eoc5Wq4RKil2wSyWs3QNAqXFMURgZU1Lw
         OKRTedVq7P9kJ0aSuY+ZDmX3ALbTv6HMqpf0nDoVQtUsRXKApHsetlWxsUK45My/yZt6
         RhRsxneVajNJSgA1pin6SCIdg315dAzAZ2heM5XSv9QHYfK/JpphNbJJFtqlu1q5cNtk
         045XdSrsC//R6TZ0oFdaqYcbxeAtMFygQe2itk8G1y1sr5L5ml0Wf6Wq4iA33bsk+VSV
         MlDAcSxOxymk29N8RjYX2cWCb8bI/RkllTBtOccpYQz/ox1zok/LCbc9//IHd0cCx5Gs
         OcSQ==
X-Forwarded-Encrypted: i=1;
 AFNElJ+CqVXQbOvMvkknch2oXkuOiuGByHoCekzD5rM4XOcnbUgz0p3qokocLbzCeEbL/5EayfWH3NSiCgd6ZBY=@vger.kernel.org
X-Gm-Message-State: AOJu0Yz5tfo4RngYU4M/rFFWytTOP8+upHPMK2H8bGZaP6KQodRWe8Eh
	t1ZqB9ftYXarar7dIHwnqzmMvM3zXQFgDDiHaFfztahsmX6UVQlsguZ3
X-Gm-Gg: Acq92OFGEi5g5wazbmysvKYf3zUhsDvJmC0pGvbmlT9B6Jq0sgKYOLFNhxb4FzzsF9N
	9KAPJK/TESedxmn/BZOrRBlm23kCnVj1JNy3F1xFFsMVsHxXwNmherm5ZavKWko6ciYAzeorr62
	39lg5spLI2yVNhcx0QlVjQhAbUphZNdf7Cz+jATBXY2jPAIkeHzDIItk3bR+QqI7EEnnlp7XOu0
	LJRLiwCF1SV0uLhZAEaWFkKh2oEU8YYjkmcWiOsSqbgctR8QvPOMx6sEjQmVIyGp/voSHJ2ddpF
	5kKqRnUnEzFygEq128mds+WXlbKvQpZJ8mGzr1sJOdlWW9HK6eVz0xoHOYhSle3BXrqGXQMOm2V
	UB74H1I0++06b5Aw+wS4Q/x4yKUTz71W43VybCthObL9lhwMNsgkmE555TmjJF+UA/6s+hBTwQv
	dbP/Uvs6CblkMaDkSk+Oh1xLQNu392neOZpMxcQihs2qY=
X-Received: by 2002:a05:6512:230c:b0:5a8:f6fd:eb15 with SMTP id
 2adb3069b0e04-5aa594a7f4cmr457133e87.24.1780037300718;
        Thu, 28 May 2026 23:48:20 -0700 (PDT)
Received: from localhost ([188.234.148.119])
        by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-5aa5b79f04dsm72147e87.83.2026.05.28.23.48.19
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 28 May 2026 23:48:20 -0700 (PDT)
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v6 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to
 drm_exec
Date: Fri, 29 May 2026 11:47:38 +0500
Message-ID: <20260529064740.25060-2-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com>
References: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
 <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root
PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to
the caller. A caller that then needs to reserve further BOs (for example
the devcoredump IB dump) ends up nesting reservation_ww_class_mutex
acquires without a ww_acquire_ctx, which lockdep flags as recursive
locking.

Convert the helper to take a drm_exec context and lock the root PD with
drm_exec_lock_obj(). Callers now run it inside a
drm_exec_until_all_locked() loop and can lock additional BOs in the same
ww ticket, so there is no nested ww_mutex acquire.

The drm_exec context holds its own reference on the locked root BO, so
the helper no longer hands a root reference back to the caller: the
root output parameter is dropped, and the transient reference taken
across the PASID lookup is released before returning.

The only existing caller, amdgpu_vm_handle_fault(), is updated
accordingly. Its is_compute_context path, which previously dropped the
root reservation around svm_range_restore_pages() and re-took it, now
finalises the drm_exec context and re-initialises a fresh one; behaviour
is otherwise unchanged.

No functional change intended for the page-fault path.

Reviewed-by: Christian K=C3=B6nig <christian.koenig@amd.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 2 files changed, 58 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.c
index fccd758b6699..b9d93b664daf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *d=
ata, struct drm_file *filp)
 }
=20
 /**
- * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pa=
sid, if possible.
+ * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD
  * @adev: amdgpu device pointer
- * @root: root BO of the VM
  * @pasid: PASID of the VM
- * The caller needs to unreserve and unref the root bo on success.
+ * @exec: drm_exec context to lock the root PD in
+ *
+ * Must be called from within a drm_exec_until_all_locked() loop; the call=
er
+ * runs drm_exec_retry_on_contention() afterwards. The drm_exec context ho=
lds
+ * a reference on the root BO until it is finalised.
+ *
+ * Return: the VM on success, or NULL if the PASID has no VM, the VM is be=
ing
+ * torn down, or locking the root PD failed.
  */
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid)
+					  u32 pasid, struct drm_exec *exec)
 {
 	unsigned long irqflags;
+	struct amdgpu_bo *root;
 	struct amdgpu_vm *vm;
 	int r;
=20
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	*root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
+	root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
=20
-	if (!*root)
+	if (!root)
 		return NULL;
=20
-	r =3D amdgpu_bo_reserve(*root, true);
-	if (r)
-		goto error_unref;
+	r =3D drm_exec_lock_obj(exec, &root->tbo.base);
+	if (r) {
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
 	/* Double check that the VM still exists */
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm =3D xa_load(&adev->vm_manager.pasids, pasid);
-	if (vm && vm->root.bo !=3D *root)
+	if (vm && vm->root.bo !=3D root)
 		vm =3D NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
-	if (!vm)
-		goto error_unlock;
+	if (!vm) {
+		drm_exec_unlock_obj(exec, &root->tbo.base);
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
=20
-	return vm;
-error_unlock:
-	amdgpu_bo_unreserve(*root);
+	/* The drm_exec context holds its own reference on the root BO. */
+	amdgpu_bo_unref(&root);
=20
-error_unref:
-	amdgpu_bo_unref(root);
-	return NULL;
+	return vm;
 }
=20
 /**
@@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 			    uint64_t ts, bool write_fault)
 {
 	bool is_compute_context =3D false;
-	struct amdgpu_bo *root;
+	struct drm_exec exec;
 	uint64_t value, flags;
 	struct amdgpu_vm *vm;
 	int r;
=20
-	vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-	if (!vm)
+	drm_exec_init(&exec, 0, 1);
+	drm_exec_until_all_locked(&exec) {
+		vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+		drm_exec_retry_on_contention(&exec);
+		if (!vm)
+			break;
+	}
+	if (!vm) {
+		drm_exec_fini(&exec);
 		return false;
+	}
=20
 	is_compute_context =3D vm->is_compute_context;
=20
 	if (is_compute_context) {
-		/* Unreserve root since svm_range_restore_pages might try to reserve it.=
 */
-		/* TODO: rework svm_range_restore_pages so that this isn't necessary. */
-		amdgpu_bo_unreserve(root);
+		/* Release the root PD lock since svm_range_restore_pages
+		 * might try to take it.
+		 * TODO: rework svm_range_restore_pages so that this isn't
+		 * necessary.
+		 */
+		drm_exec_fini(&exec);
=20
 		if (!svm_range_restore_pages(adev, pasid, vmid,
-					     node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
-			amdgpu_bo_unref(&root);
+					     node_id, addr >> PAGE_SHIFT, ts, write_fault))
 			return true;
-		}
-		amdgpu_bo_unref(&root);
=20
 		/* Re-acquire the VM lock, could be that the VM was freed in between. */
-		vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-		if (!vm)
+		drm_exec_init(&exec, 0, 1);
+		drm_exec_until_all_locked(&exec) {
+			vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+			drm_exec_retry_on_contention(&exec);
+			if (!vm)
+				break;
+		}
+		if (!vm) {
+			drm_exec_fini(&exec);
 			return false;
+		}
 	}
=20
 	addr /=3D AMDGPU_GPU_PAGE_SIZE;
@@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *ade=
v, u32 pasid,
 		value =3D 0;
 	}
=20
-	r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1);
+	r =3D dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve fence slot\n", r);
 		goto error_unlock;
@@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a=
dev, u32 pasid,
 	r =3D amdgpu_vm_update_pdes(adev, vm, true);
=20
 error_unlock:
-	amdgpu_bo_unreserve(root);
+	drm_exec_fini(&exec);
 	if (r < 0)
 		dev_err(adev->dev, "Can't handle page fault (%d)\n", r);
=20
-	amdgpu_bo_unref(&root);
-
 	return false;
 }
=20
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a=
mdgpu/amdgpu_vm.h
index d083d7aab75c..0c6e3e0368c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev,=
 u32 pasid,
 			    bool write_fault);
=20
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid);
+					  u32 pasid, struct drm_exec *exec);
=20
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
=20
--=20
2.54.0

From nobody Mon Jun  8 13:30:20 2026
Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com
 [209.85.167.43])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 89FA93ACA59
	for <linux-kernel@vger.kernel.org>; Fri, 29 May 2026 06:48:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.167.43
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1780037307; cv=none;
 b=lw82pOmrHcRF2eS2wnXHLgNBu6iVRgUmq2tvmn3fAQllnAPhtOJlZA/vJCwjXcKw+8bNihTIpxGk3zAUZS+q8TRF/rRxegnw/5KQYnE0t65+896l9Z+5LnmUg11BsaqyYHimo+uW0PWOFdgesHDdQ622qX40h8o/Cy9sK76Mihs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1780037307; c=relaxed/simple;
	bh=XnmuBqyTezi+9eQjbJlkpvAZoXxxgwskhrT5L1ibFug=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
 b=h2n6U1qLjGMBygEkUQFSJ7+WFZ+v/lC7SqZ5uDOxkxjN9vXmwIjT4W80qOijzEAErou4sOwWuAB2V4DMg8iCpYWiTWkfm3TqEiMpMauEszezrFKG/U9Jtruuzn+uBEZ911SQG9hqxJZpCeeQYXYZ0JBV33R/gQdU18nNWyLdaOo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=iy4BXyoS; arc=none smtp.client-ip=209.85.167.43
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="iy4BXyoS"
Received: by mail-lf1-f43.google.com with SMTP id
 2adb3069b0e04-5a8891febd2so17857305e87.1
        for <linux-kernel@vger.kernel.org>;
 Thu, 28 May 2026 23:48:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20251104; t=1780037304; x=1780642104;
 darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=pJUei/KuumAUMNhgELgEfVVLFOVaFW7qqMv2FxIHEpg=;
        b=iy4BXyoSqRjVW2HpSmG2MROpvHalE9UODTEVX0/uU4xcHClf4hE1ATXlWOoIYWM+K5
         lNo8gtvktsFo0EOVWZVDdVKbzfRxBsQXYFsQZX3UeHkUBuwF+c5jAWlpxM2t9jdHWwr0
         YrUE3/+NZF43oFW7FeNW7B9ZZ75FTIJAJMcUgwCt1HeyMmMgEqVDrHL4XP9e5TObOc2Q
         O6YlxwCb0FtSOjYP8Bx8eVg0VFyLta3HtjTXu02z1q95F5do5r48g5752kqLWmhwT1rC
         3sxRmt9vpUPQBasfmdENVFhLsMQ15wBv5BBKeJ9UjLv7DVokbsJiAr0IWMn29uFFmgYz
         cXAg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20251104; t=1780037304; x=1780642104;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from
         :to:cc:subject:date:message-id:reply-to;
        bh=pJUei/KuumAUMNhgELgEfVVLFOVaFW7qqMv2FxIHEpg=;
        b=UK5EqGbac3uD5as1nTA+6DXJxuhqlEJEfUGrDzN0HTOrZWvoU91PBZU0XRQbUUjShf
         08yoFq2Q/R8LFJirb4mVfxOtyI0ubO0U7C5y+nvElI2RJOAuy++5WB/7pqSwuDnnBkWR
         8QIgaMFr1F76gMAymq/e2ZxPZRuU8aoMOCVVqVmpktujVeoGUx1nBIos81jxBd4LO/1k
         58+jrL+YHKVJGfQ/3sQapWlmdkdrXdUGQlmZZa4WyVJP2mfVAWr1xNuuUUoae5dDPq4E
         IAxRE0RomlPQfxUcNAdAIgcqQ/R+LJQQGkfVNxjcjTT+Eo7ZkW5CexvnSiD+vye086n6
         MOtA==
X-Forwarded-Encrypted: i=1;
 AFNElJ+X9zK9fGTrew1yDSrSiQxLblTKO/7pkwtYeSPXao1VifpMB8KhbNPlThgXdPG/cANEND5ZZcYHbm2XPRU=@vger.kernel.org
X-Gm-Message-State: AOJu0Ywf4cbTtwRzMVMyz9zOl2LW5vH6keONo728ZImTWu3SdKAcm3b3
	Tg68NIBIINrnnuZmD2shWsEQhqEULFfMhKWNnbaHJYGaSFlHtJqba8T7
X-Gm-Gg: Acq92OE3KVpjTKPYQBcAEgDxBu4iaJVyNKpInDtvWZzwW8sXzr3GUl2QMG2CuB1flLW
	96Kbq6mNakMxGcaPq7DUOGqpqOoZUn0ZAJAXrW9yq8t2Q6UPTX11Mo/Ep5eyI0xU7rPT2GoOAfY
	UemWaWY4kE5Bg3SccdctbkUxigJBXc9+RlTD75KEFM84AUJ9kPOjW+ii9iJDTrvNdim9my8rEEr
	lRgcfJD5WHDn8TpP8Enk4cyc0wemtr1UCnz51Hm1ibilBmQPJCA8NE7K91+17jQoIG+TyzbHwzC
	GsP64h4fA17RRdivyWcBki5h3j1ymgcVj44/xmcFlahG031RQeXS3YyayLXrM2mZPrLjJWfFjY/
	+bB5UpAJ7rekgt+3NXaCspy4SEstcB1ML4ePAC8iCzZrKctgil7ffHDboUOqIuZTiR1kjx1PE+h
	zWUPCMvtxrRtOhg6ILD4Wk2W0zpZUKNiiHAl5bWUlWusw=
X-Received: by 2002:a05:6512:a8c:b0:5aa:106f:87bb with SMTP id
 2adb3069b0e04-5aa59475a96mr400169e87.3.1780037303600;
        Thu, 28 May 2026 23:48:23 -0700 (PDT)
Received: from localhost ([188.234.148.119])
        by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-5aa5b79f04dsm72147e87.83.2026.05.28.23.48.21
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 28 May 2026 23:48:23 -0700 (PDT)
From: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Cc: Alex Deucher <alexander.deucher@amd.com>,
	=?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>,
	Simona Vetter <simona@ffwll.ch>,
	Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Subject: [PATCH v6 2/2] drm/amdgpu: fix recursive ww_mutex acquire in
 amdgpu_devcoredump_format
Date: Fri, 29 May 2026 11:47:39 +0500
Message-ID: <20260529064740.25060-3-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com>
References: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com>
 <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

When dumping IB contents from a hung job, amdgpu_devcoredump_format()
acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and
then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB.
Both reservations are reservation_ww_class_mutex objects and neither
used a ww_acquire_ctx, which trips lockdep:

  WARNING: possible recursive locking detected
  --------------------------------------------
  kworker/u128:0 is trying to acquire lock:
  ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

  but task is already holding lock:
  ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4},
    at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

   Possible unsafe locking scenario:
         CPU0
         ----
    lock(reservation_ww_class_mutex);
    lock(reservation_ww_class_mutex);

   *** DEADLOCK ***
   May be due to missing lock nesting notation

  Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu]
  Call Trace:
   __ww_mutex_lock.constprop.0
   ww_mutex_lock
   amdgpu_bo_reserve
   amdgpu_devcoredump_format+0x1594 [amdgpu]
   amdgpu_devcoredump_deferred_work+0xea [amdgpu]

The two reservations are on different BOs in the captured trace, so the
splat is a lockdep-correctness warning, not an observed deadlock. It
becomes a real self-deadlock whenever the IB BO shares its dma_resv with
the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()):
amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket
and blocks forever. With amdgpu.gpu_recovery=3D0 the timeout handler
refires every ~2 s and each invocation produces this splat, drowning the
kernel ring buffer.

Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, move the IB
dumping into a separate helper that locks the root PD and every IB BO
together in a single drm_exec ticket. DRM_EXEC_IGNORE_DUPLICATES handles
IB BOs that share a dma_resv (e.g. always-valid BOs, or two IBs backed
by the same BO). Every lock is now a top-level acquire under one
ww_acquire_ctx, so the recursive ww_mutex condition is gone, and the
per-IB amdgpu_bo_reserve()/amdgpu_bo_unref() dance -- including a BO
refcount leak on the amdgpu_bo_reserve() failure path -- is removed.

Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump")
Suggested-by: Christian K=C3=B6nig <christian.koenig@amd.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c  | 215 ++++++++++--------
 1 file changed, 126 insertions(+), 89 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu=
/drm/amd/amdgpu/amdgpu_dev_coredump.c
index d386bc775d03..c7d43796d603 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -24,6 +24,7 @@
=20
 #include <generated/utsrelease.h>
 #include <linux/devcoredump.h>
+#include <drm/drm_exec.h>
 #include "amdgpu_dev_coredump.h"
 #include "atom.h"
=20
@@ -207,23 +208,137 @@ static void amdgpu_devcoredump_fw_info(struct amdgpu=
_device *adev,
 	}
 }
=20
+static void
+amdgpu_devcoredump_print_ibs(struct drm_printer *p,
+			     struct amdgpu_coredump_info *coredump,
+			     bool sizing_pass)
+{
+	struct amdgpu_device *adev =3D coredump->adev;
+	struct amdgpu_bo_va_mapping *mapping;
+	struct amdgpu_bo *abo;
+	struct drm_exec exec;
+	struct amdgpu_vm *vm;
+	u32 *ib_content;
+	u64 va_start, offset;
+	u8 *kptr;
+	u32 off;
+	int r;
+
+	/*
+	 * On the sizing pass there is no VM to look up and no BO to lock; the
+	 * size estimate doesn't depend on whether the IB BOs are reachable.
+	 * Just emit the per-IB headers (the content is not written anywhere).
+	 */
+	if (sizing_pass) {
+		for (int i =3D 0; i < coredump->num_ibs; i++) {
+			drm_printf(p, "\nIB #%d 0x%llx %d dw\n", i,
+				   coredump->ibs[i].gpu_addr,
+				   coredump->ibs[i].ib_size_dw);
+		}
+		return;
+	}
+
+	/*
+	 * Lock the VM root PD and every IB BO together in a single drm_exec
+	 * ticket. Reserving the IB BOs one by one while the root PD is held
+	 * would be a recursive reservation_ww_class_mutex acquire without a
+	 * ww_acquire_ctx, which trips lockdep and self-deadlocks for IB BOs
+	 * that share their dma_resv with the root PD (always-valid BOs).
+	 */
+	drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 1 + coredump->num_ibs);
+	drm_exec_until_all_locked(&exec) {
+		vm =3D amdgpu_vm_lock_by_pasid(adev, coredump->pasid, &exec);
+		if (!vm)
+			goto unlock;
+
+		for (int i =3D 0; i < coredump->num_ibs; i++) {
+			u64 pfn =3D (coredump->ibs[i].gpu_addr &
+				   AMDGPU_GMC_HOLE_MASK) / AMDGPU_GPU_PAGE_SIZE;
+
+			mapping =3D amdgpu_vm_bo_lookup_mapping(vm, pfn);
+			if (!mapping)
+				continue;
+
+			abo =3D mapping->bo_va->base.bo;
+			r =3D drm_exec_lock_obj(&exec, &abo->tbo.base);
+			drm_exec_retry_on_contention(&exec);
+			if (r)
+				goto unlock;
+		}
+	}
+
+	for (int i =3D 0; i < coredump->num_ibs; i++) {
+		bool emit_content =3D false;
+
+		ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4,
+					    GFP_KERNEL);
+		if (!ib_content)
+			continue;
+
+		va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK;
+		mapping =3D amdgpu_vm_bo_lookup_mapping(vm,
+						      va_start / AMDGPU_GPU_PAGE_SIZE);
+		if (!mapping)
+			goto output_ib_content;
+
+		abo =3D mapping->bo_va->base.bo;
+		offset =3D va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE;
+
+		if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) {
+			struct amdgpu_res_cursor cursor;
+
+			off =3D 0;
+
+			if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM)
+				goto output_ib_content;
+
+			amdgpu_res_first(abo->tbo.resource, offset,
+					 coredump->ibs[i].ib_size_dw * 4, &cursor);
+			while (cursor.remaining) {
+				amdgpu_device_mm_access(adev, cursor.start / 4,
+							&ib_content[off], cursor.size / 4,
+							false);
+				off +=3D cursor.size;
+				amdgpu_res_next(&cursor, cursor.size);
+			}
+			emit_content =3D true;
+		} else {
+			r =3D ttm_bo_kmap(&abo->tbo, 0, PFN_UP(abo->tbo.base.size),
+					&abo->kmap);
+			if (r)
+				goto output_ib_content;
+
+			kptr =3D amdgpu_bo_kptr(abo);
+			kptr +=3D offset;
+			memcpy(ib_content, kptr, coredump->ibs[i].ib_size_dw * 4);
+
+			amdgpu_bo_kunmap(abo);
+			emit_content =3D true;
+		}
+
+output_ib_content:
+		drm_printf(p, "\nIB #%d 0x%llx %d dw\n", i,
+			   coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw);
+		if (emit_content) {
+			for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
+				drm_printf(p, "0x%08x\n", ib_content[j]);
+		}
+		kvfree(ib_content);
+	}
+
+unlock:
+	drm_exec_fini(&exec);
+}
+
 static ssize_t
 amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredu=
mp_info *coredump)
 {
-	struct amdgpu_device *adev =3D coredump->adev;
 	struct drm_printer p;
 	struct drm_print_iterator iter;
 	struct amdgpu_vm_fault_info *fault_info;
-	struct amdgpu_bo_va_mapping *mapping;
 	struct amdgpu_ip_block *ip_block;
-	struct amdgpu_res_cursor cursor;
-	struct amdgpu_bo *abo, *root;
-	uint64_t va_start, offset;
 	struct amdgpu_ring *ring;
-	struct amdgpu_vm *vm;
-	u32 *ib_content;
-	uint8_t *kptr;
-	int ver, i, j, r;
+	int ver, i, j;
 	u32 ring_idx, off;
 	bool sizing_pass;
=20
@@ -342,86 +457,8 @@ amdgpu_devcoredump_format(char *buffer, size_t count, =
struct amdgpu_coredump_inf
 	else if (coredump->reset_vram_lost)
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
=20
-	if (coredump->num_ibs) {
-		/* Don't try to lookup the VM or map the BOs when calculating the
-		 * size required to store the devcoredump.
-		 */
-		if (sizing_pass)
-			vm =3D NULL;
-		else
-			vm =3D amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid);
-
-		for (int i =3D 0; i < coredump->num_ibs && (sizing_pass || vm); i++) {
-			ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4,
-						    GFP_KERNEL);
-			if (!ib_content)
-				continue;
-
-			/* vm=3DNULL can only happen when 'sizing_pass' is true. Skip to the
-			 * drm_printf() calls (ib_content doesn't need to be initialized
-			 * as its content won't be written anywhere).
-			 */
-			if (!vm)
-				goto output_ib_content;
-
-			va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK;
-			mapping =3D amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_=
SIZE);
-			if (!mapping)
-				goto free_ib_content;
-
-			offset =3D va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE);
-			abo =3D amdgpu_bo_ref(mapping->bo_va->base.bo);
-			r =3D amdgpu_bo_reserve(abo, false);
-			if (r)
-				goto free_ib_content;
-
-			if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) {
-				off =3D 0;
-
-				if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM)
-					goto unreserve_abo;
-
-				amdgpu_res_first(abo->tbo.resource, offset,
-						 coredump->ibs[i].ib_size_dw * 4,
-						 &cursor);
-				while (cursor.remaining) {
-					amdgpu_device_mm_access(adev, cursor.start / 4,
-								&ib_content[off], cursor.size / 4,
-								false);
-					off +=3D cursor.size;
-					amdgpu_res_next(&cursor, cursor.size);
-				}
-			} else {
-				r =3D ttm_bo_kmap(&abo->tbo, 0,
-						PFN_UP(abo->tbo.base.size),
-						&abo->kmap);
-				if (r)
-					goto unreserve_abo;
-
-				kptr =3D amdgpu_bo_kptr(abo);
-				kptr +=3D offset;
-				memcpy(ib_content, kptr,
-				       coredump->ibs[i].ib_size_dw * 4);
-
-				amdgpu_bo_kunmap(abo);
-			}
-
-output_ib_content:
-			drm_printf(&p, "\nIB #%d 0x%llx %d dw\n",
-				   i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw);
-			for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++)
-				drm_printf(&p, "0x%08x\n", ib_content[j]);
-unreserve_abo:
-			if (vm)
-				amdgpu_bo_unreserve(abo);
-free_ib_content:
-			kvfree(ib_content);
-		}
-		if (vm) {
-			amdgpu_bo_unreserve(root);
-			amdgpu_bo_unref(&root);
-		}
-	}
+	if (coredump->num_ibs)
+		amdgpu_devcoredump_print_ibs(&p, coredump, sizing_pass);
=20
 	return count - iter.remain;
 }
--=20
2.54.0