From nobody Sun May 24 20:35:28 2026 Received: from mail-lf1-f49.google.com (mail-lf1-f49.google.com [209.85.167.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E8BC23BCEE for ; Thu, 21 May 2026 15:08:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779376132; cv=none; b=YdQ2ZJHAQ9f6UG9+NuRK/GMu5bDvEFG8vhZ4a5ke6zDZeRPJCFjFUwhjhVGcjHyS/OgtNuY7rwOgP4K4YWCKfMh7feR9hjKfjJBsteWvGVqI36RNTSafh7ZMuFhXBxYQdDThjCyGMzbEzwnxOwplp58NF+0xpI+eJpvRPpP37PE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779376132; c=relaxed/simple; bh=xah8INPoIl6lNjJx8qaG+onAYm2FXP9my57l6oRqvZk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nl9gLv8WEsmFZECfkbb7jnZGiHVM92PWZ3dxpJsO0mIhXJfY+0ERVnsDfa0dGcF1UBcOZBcxf8tj6Ev2EAKWGWiHAt44Cnp8gmDGAQ8DrEUJeKv7ajHD8tuAJdB3WXNAdAv5lwq2z4uTKCMmwzoUiOu5fRvZ8NYhS3QbZJMlen4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Z2p9o7M5; arc=none smtp.client-ip=209.85.167.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Z2p9o7M5" Received: by mail-lf1-f49.google.com with SMTP id 2adb3069b0e04-5a8e33556c0so7961109e87.0 for ; Thu, 21 May 2026 08:08:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779376129; x=1779980929; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=1PYTv8RRPyhzNb2k3u7IsJEhmouq09DPN/7fCUe9rJc=; b=Z2p9o7M51SOMUPueTBSphmhE21Tmen5ARXcDt1muP19oAxldj/+OSFKNn88n+iL7vc 2njzOAQL+Or0l7nFKKme8aOhxpl16mIsf6DrS3WESGo0zl0Bc64/oyUJZy042Me33uIe 88g+fNenO8q3wdt0+KP1kW3SSWaeHHIxBC7m24MqssGtp5J9Coj36WU866IXSuGy1fFA CR90yT/rh0Vi6JpVlWTRl3aVPNLNK+8bnycDgq5b2I/6R1STJv4jttW0XhicosBn+G2Z 7NZRmrIEaA6L9oLGBgBiFdHeef1h/nsAwUKEg4F+A2fbo+Im5fS588DVcg7O+RK38sPU l8vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779376129; x=1779980929; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=1PYTv8RRPyhzNb2k3u7IsJEhmouq09DPN/7fCUe9rJc=; b=FS/UjRKrMQV9JGMbu9mzjF4fAtUki3qjk0rEtIwGXUEIdXpDhs83jKFuhEdf1L2lUw zMITGg61CuYWlGtg/eP++m3lGJPcBXN+5ASNlDmAwxSwW3NSDUIHfRuWpEL71MvfgUYh dptcFnS2CtRLNDJqDwEorx7KpjA1PNykSljDoUue+B1hC1Dg4us/Mv8Y6aCzxDU/jPeD 4//zHOqMfOieRqBWM0mvfp3OV09Kket3Hn0/NsrqOW2aD5MldzL6f2i1DDJMKcONyun9 d+gj0KAredRM5UKGtXWrBYnI4L4IqY+vI6Ln21SD4EV0fWejL/UiLKA7ze83JlMlNivT ab1g== X-Forwarded-Encrypted: i=1; AFNElJ91JybHq1W4lrXEsxcaIGqLr2w9BAKdou3AYdPrYgAk5hHS/p6EpnL5EXjDYzY1vXBlmuN5ye9awC0vqL8=@vger.kernel.org X-Gm-Message-State: AOJu0Yy0yWPVJHA2e7aWbqhbRshwkFs1PD4m4Nz17MFn8d9FY4OQi9f0 lWGFE75wgkN0RBPbUin34aISA1cfWFg86jmQBuKLar6JXT5Fzj1ej52TQeY33G25go+P+LIt X-Gm-Gg: Acq92OHqpBsZ14pxhBQI86ikJbTA8yVlw5MOe3J1MIJbuaFOYxnSlXlQ6XOtTYeyCgR u2XvrHs7yNkiA0/mQvUv+pulVuwbNgakjrlN2OjKst6BfNottL4NiXlMc0NibCA3BK7PZvp/fw1 IMCxNUa1iMxEWsI/KNX1epCVBiV5RLBOLlOEV1KzlDdkMfFA8K579Yh604jb7bU2CtYNBe2tLOW kGPIzuiRx+/wgbAkjidwDbL5IY74n75dF9PXWPM861FNWo3giD1O0lJKm2D4+YPJJzotCghR3Ce 3xWdj1Fc1w5xr7ChrAqc/DP9vipq5WKkr5BCNqU+mWkHMQo71JsVfbse8XmCI6g8/3+j6JQ8t3P pI477KejLrBmqKKeegENybhujEIPD3JbN/d9KxwsTcI57rzqfwk/5a/cpfb3lfPAKqZvrBDtRlI JGegtXIIBhj5iTSKunHCKoPDN6a9iJgdoD X-Received: by 2002:a05:6512:4608:b0:5a8:837a:2a98 with SMTP id 2adb3069b0e04-5aa2ba859admr1199599e87.14.1779376128530; Thu, 21 May 2026 08:08:48 -0700 (PDT) Received: from localhost ([188.234.148.119]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5aa2f120c6esm335686e87.19.2026.05.21.08.08.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 May 2026 08:08:47 -0700 (PDT) From: Mikhail Gavrilov To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Pierre-Eric Pelloux-Prayer , Sumit Semwal , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov Subject: [PATCH v5 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to drm_exec Date: Thu, 21 May 2026 20:08:39 +0500 Message-ID: <20260521150841.20625-2-mikhail.v.gavrilov@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> References: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com> <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to the caller. A caller that then needs to reserve further BOs (for example the devcoredump IB dump) ends up nesting reservation_ww_class_mutex acquires without a ww_acquire_ctx, which lockdep flags as recursive locking. Convert the helper to take a drm_exec context and lock the root PD with drm_exec_lock_obj(). Callers now run it inside a drm_exec_until_all_locked() loop and can lock additional BOs in the same ww ticket, so there is no nested ww_mutex acquire. The drm_exec context holds its own reference on the locked root BO, so the helper no longer hands a root reference back to the caller: the root output parameter is dropped, and the transient reference taken across the PASID lookup is released before returning. The only existing caller, amdgpu_vm_handle_fault(), is updated accordingly. Its is_compute_context path, which previously dropped the root reservation around svm_range_restore_pages() and re-took it, now finalises the drm_exec context and re-initialises a fresh one; behaviour is otherwise unchanged. No functional change intended for the page-fault path. Reviewed-by: Christian K=C3=B6nig Signed-off-by: Mikhail Gavrilov --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++---------- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +- 2 files changed, 58 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.c index 9ba9de16a27a..d734d8c0de6e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *d= ata, struct drm_file *filp) } =20 /** - * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pa= sid, if possible. + * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD * @adev: amdgpu device pointer - * @root: root BO of the VM * @pasid: PASID of the VM - * The caller needs to unreserve and unref the root bo on success. + * @exec: drm_exec context to lock the root PD in + * + * Must be called from within a drm_exec_until_all_locked() loop; the call= er + * runs drm_exec_retry_on_contention() afterwards. The drm_exec context ho= lds + * a reference on the root BO until it is finalised. + * + * Return: the VM on success, or NULL if the PASID has no VM, the VM is be= ing + * torn down, or locking the root PD failed. */ struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev, - struct amdgpu_bo **root, u32 pasid) + u32 pasid, struct drm_exec *exec) { unsigned long irqflags; + struct amdgpu_bo *root; struct amdgpu_vm *vm; int r; =20 xa_lock_irqsave(&adev->vm_manager.pasids, irqflags); vm =3D xa_load(&adev->vm_manager.pasids, pasid); - *root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL; + root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL; xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags); =20 - if (!*root) + if (!root) return NULL; =20 - r =3D amdgpu_bo_reserve(*root, true); - if (r) - goto error_unref; + r =3D drm_exec_lock_obj(exec, &root->tbo.base); + if (r) { + amdgpu_bo_unref(&root); + return NULL; + } =20 /* Double check that the VM still exists */ xa_lock_irqsave(&adev->vm_manager.pasids, irqflags); vm =3D xa_load(&adev->vm_manager.pasids, pasid); - if (vm && vm->root.bo !=3D *root) + if (vm && vm->root.bo !=3D root) vm =3D NULL; xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags); - if (!vm) - goto error_unlock; + if (!vm) { + drm_exec_unlock_obj(exec, &root->tbo.base); + amdgpu_bo_unref(&root); + return NULL; + } =20 - return vm; -error_unlock: - amdgpu_bo_unreserve(*root); + /* The drm_exec context holds its own reference on the root BO. */ + amdgpu_bo_unref(&root); =20 -error_unref: - amdgpu_bo_unref(root); - return NULL; + return vm; } =20 /** @@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a= dev, u32 pasid, uint64_t ts, bool write_fault) { bool is_compute_context =3D false; - struct amdgpu_bo *root; + struct drm_exec exec; uint64_t value, flags; struct amdgpu_vm *vm; int r; =20 - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid); - if (!vm) + drm_exec_init(&exec, 0, 1); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec); + drm_exec_retry_on_contention(&exec); + if (!vm) + break; + } + if (!vm) { + drm_exec_fini(&exec); return false; + } =20 is_compute_context =3D vm->is_compute_context; =20 if (is_compute_context) { - /* Unreserve root since svm_range_restore_pages might try to reserve it.= */ - /* TODO: rework svm_range_restore_pages so that this isn't necessary. */ - amdgpu_bo_unreserve(root); + /* Release the root PD lock since svm_range_restore_pages + * might try to take it. + * TODO: rework svm_range_restore_pages so that this isn't + * necessary. + */ + drm_exec_fini(&exec); =20 if (!svm_range_restore_pages(adev, pasid, vmid, - node_id, addr >> PAGE_SHIFT, ts, write_fault)) { - amdgpu_bo_unref(&root); + node_id, addr >> PAGE_SHIFT, ts, write_fault)) return true; - } - amdgpu_bo_unref(&root); =20 /* Re-acquire the VM lock, could be that the VM was freed in between. */ - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid); - if (!vm) + drm_exec_init(&exec, 0, 1); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec); + drm_exec_retry_on_contention(&exec); + if (!vm) + break; + } + if (!vm) { + drm_exec_fini(&exec); return false; + } } =20 addr /=3D AMDGPU_GPU_PAGE_SIZE; @@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *ade= v, u32 pasid, value =3D 0; } =20 - r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1); + r =3D dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1); if (r) { pr_debug("failed %d to reserve fence slot\n", r); goto error_unlock; @@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a= dev, u32 pasid, r =3D amdgpu_vm_update_pdes(adev, vm, true); =20 error_unlock: - amdgpu_bo_unreserve(root); + drm_exec_fini(&exec); if (r < 0) dev_err(adev->dev, "Can't handle page fault (%d)\n", r); =20 - amdgpu_bo_unref(&root); - return false; } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.h index d083d7aab75c..0c6e3e0368c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev,= u32 pasid, bool write_fault); =20 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev, - struct amdgpu_bo **root, u32 pasid); + u32 pasid, struct drm_exec *exec); =20 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm); =20 --=20 2.54.0 From nobody Sun May 24 20:35:28 2026 Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com [209.85.167.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4934332B128 for ; Thu, 21 May 2026 15:08:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779376134; cv=none; b=mWVt4HWqNuGOeHZgLhSo3XlVKvl6QszGrcWKFp3A0egG0yhxf7psokBXcnO8hclTdv+e5piWo9VO7N1KGm0bCHF2n9fB0FVG1JTL2oJgv/ghBy/2Rveq3Y9PKu+FPW5m9dQUsSaZVIOUEzPvN+eHenZs20lI5MBafr/USkRdd/A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779376134; c=relaxed/simple; bh=jhawuaQE4VaUOaqBjSVJ7L9kHTPtT9q8QTpefJsl81M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=D9pkNrDJJwjvB1CqpGQcfhfpMDTmIn6n2PntKjEZRjPIo+VTWhngopkSdVKfvllWU7SY8ZGW+fFCAo1IN13zwfSZH6G9vGZ3IlWm11GFouR0oMTgYUdwnDYxSxMx2buwoxjeBda4bHW2o8WOqe3RYsPftkCz7LMs8JS3bkAv7wI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=TdviwDXV; arc=none smtp.client-ip=209.85.167.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TdviwDXV" Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-5a8738c178dso5758025e87.1 for ; Thu, 21 May 2026 08:08:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779376130; x=1779980930; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=V2MVfvQBPeY6Pwia0clIRhZPiWkl2nWuf72y0t9gH4Q=; b=TdviwDXVIHPeaDNfUcwJJOO/rRX1Pj2G/9FpNHhuNiL26KcIAKBuyb4W8BmhtTjpmN zo8xb6acZRgZXhLzrMLTxj+sINpr2Kc3ekAn/eaTmNWJxuoIckZnXynUncu3+X4q1nTp WnrwRZjZAssoef44K1pgkmpZVr6wFJ140cyuWNn4AxOBAMPomV6LmJw6JQKUbWGXwlny 8h0Xb066367eIX2++TxrUWKE69ACbJhVgUR+k+byruT0O5UxsB+pmwzTPkkW9Qv5/Su5 QIWWvkw05RipsT5tWq4HTW96XejbNNSC5Rq687N5777KvKXs8PIV9iKU5RI/bQqNCMdg 2SGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779376130; x=1779980930; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=V2MVfvQBPeY6Pwia0clIRhZPiWkl2nWuf72y0t9gH4Q=; b=AZLbUqhp4qLar+2MGVzsh4hgrOTF0f5ehxxkFx4o+lNsNikE7S/QfvPReR9BEEagHW Ga78Tm663tBq/esl/4G5VTM2x96LZxfivOL9+7IvWyyyLE/bZY9I/N3WBfLCal3P0TfK NCcX3TwpPJYrhOVYquo2TmKyT3fJLXVjlfZhLPToFo8idXS/Ky14mglUhtxYrRzwc1nV qZp4YD6p4Z9dtO4CuG/mDALStMdNnywpEkHgpZJYG5cE17mWto3vK1W2sL+BVk3V/D35 qVSsW9Fb2YW/ZG96DDvhNTBs8tZ5SBa0eI8SFCY36pQKy8eJi9mWIxCYUmyV2MUSAp8l yW1g== X-Forwarded-Encrypted: i=1; AFNElJ9i4FkGLXmg6z4Z7cskm6QsYTYXIDKMUs6bf0tBrwWcgZ+o3cPeDyWq/2HMjYN2WgBBAuBQW4sAuNgKMTk=@vger.kernel.org X-Gm-Message-State: AOJu0Yx/4WFVHJTV7WPl+W7vm++4E14b3TN5HJgoCmUWbYAZeTk33Qa4 72eO7RD/pGiWt6g+5rExAON6ZqsnNAqSBi1SwEvSqOM1WIusVt0wWYUa X-Gm-Gg: Acq92OE80JW/tY/WP5p+Cqmc95lJf2GzldTzl5fawrt2lnoB7Oayqtp3/Zy8Aa9VBnd eNhHbkCt/TUrHzvN4qE5ViHJFM6pB/rtXxKi7PyPv+OCAREwkDWTBdwAopEthghLknEH48J4DD+ /oduY5MBktyRwoCKf7DPAfpq0FxKiD0VElK1cjNCkIrYDYlzqsCj+CkZTo5AW5fklxE3fPqyi7l oq1KRfag93aaqwuGHUkcM9415N9Am2gP2Q1FoVdHlxyzb2wnIIbffSmxYLPqNnVsaTZNDlwuyfb EvXifwlixbxDfgPRfOXE9MZeklSHCcyxQ7UqGEPIQpQQNdIhOLmoOidlBXcTJkIj/3MI9KhvE2l eZ+RXAQy3APB+h6Z9ly9ak13uaxs/yaVuJLqRbYtz7wnajMzkoZ2mzkKbByzJ0yBGB3Gm2F4THj qepeXofgInsum3Ci85qkRpujXr2XG35VCK X-Received: by 2002:ac2:5dcd:0:b0:5a8:a754:f9d5 with SMTP id 2adb3069b0e04-5aa2ba9ac3bmr843915e87.39.1779376130152; Thu, 21 May 2026 08:08:50 -0700 (PDT) Received: from localhost ([188.234.148.119]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5aa2f120c6esm335686e87.19.2026.05.21.08.08.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 May 2026 08:08:49 -0700 (PDT) From: Mikhail Gavrilov To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Pierre-Eric Pelloux-Prayer , Sumit Semwal , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov Subject: [PATCH v5 2/2] drm/amdgpu: fix recursive ww_mutex acquire in amdgpu_devcoredump_format Date: Thu, 21 May 2026 20:08:40 +0500 Message-ID: <20260521150841.20625-3-mikhail.v.gavrilov@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> References: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com> <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When dumping IB contents from a hung job, amdgpu_devcoredump_format() acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB. Both reservations are reservation_ww_class_mutex objects and neither used a ww_acquire_ctx, which trips lockdep: WARNING: possible recursive locking detected -------------------------------------------- kworker/u128:0 is trying to acquire lock: ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] but task is already holding lock: ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] Possible unsafe locking scenario: CPU0 ---- lock(reservation_ww_class_mutex); lock(reservation_ww_class_mutex); *** DEADLOCK *** May be due to missing lock nesting notation Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu] Call Trace: __ww_mutex_lock.constprop.0 ww_mutex_lock amdgpu_bo_reserve amdgpu_devcoredump_format+0x1594 [amdgpu] amdgpu_devcoredump_deferred_work+0xea [amdgpu] The two reservations are on different BOs in the captured trace, so the splat is a lockdep-correctness warning, not an observed deadlock. It becomes a real self-deadlock whenever the IB BO shares its dma_resv with the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()): amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket and blocks forever. With amdgpu.gpu_recovery=3D0 the timeout handler refires every ~2 s and each invocation produces this splat, drowning the kernel ring buffer. Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, lock the root PD and every IB BO together in a single drm_exec ticket. DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g. always-valid BOs, or two IBs backed by the same BO). Every lock is now a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex condition is gone, and the per-IB amdgpu_bo_reserve()/amdgpu_bo_unref() dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure path -- is removed. Reproducer (~150 LoC libdrm_amdgpu): submit a single GFX IB containing PACKET3_INDIRECT_BUFFER chained at GPU VA 0 and wait for the fence. The TDR fires within ~10 s and the deferred coredump worker produces the splat above on every invocation; with this change applied the splat is gone. Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump") Suggested-by: Christian K=C3=B6nig Signed-off-by: Mikhail Gavrilov --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 105 ++++++++++++------ 1 file changed, 71 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu= /drm/amd/amdgpu/amdgpu_dev_coredump.c index d386bc775d03..456ea9911d48 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c @@ -24,6 +24,7 @@ =20 #include #include +#include #include "amdgpu_dev_coredump.h" #include "atom.h" =20 @@ -214,13 +215,9 @@ amdgpu_devcoredump_format(char *buffer, size_t count, = struct amdgpu_coredump_inf struct drm_printer p; struct drm_print_iterator iter; struct amdgpu_vm_fault_info *fault_info; - struct amdgpu_bo_va_mapping *mapping; struct amdgpu_ip_block *ip_block; struct amdgpu_res_cursor cursor; - struct amdgpu_bo *abo, *root; - uint64_t va_start, offset; struct amdgpu_ring *ring; - struct amdgpu_vm *vm; u32 *ib_content; uint8_t *kptr; int ver, i, j, r; @@ -343,43 +340,84 @@ amdgpu_devcoredump_format(char *buffer, size_t count,= struct amdgpu_coredump_inf drm_printf(&p, "VRAM is lost due to GPU reset!\n"); =20 if (coredump->num_ibs) { - /* Don't try to lookup the VM or map the BOs when calculating the - * size required to store the devcoredump. + struct amdgpu_bo_va_mapping *mapping; + struct amdgpu_bo *abo; + struct drm_exec exec; + struct amdgpu_vm *vm; + u64 va_start, offset; + bool locked =3D false; + + /* + * Lock the VM root PD and every IB BO together in a single + * drm_exec ticket. Reserving the IB BOs one by one while the + * root PD is held would be a recursive reservation_ww_class_mutex + * acquire without a ww_acquire_ctx, which trips lockdep and + * self-deadlocks for IB BOs that share their dma_resv with the + * root PD (always-valid BOs). + * + * Skip locking entirely on the sizing pass: it does not write + * IB content, so the size estimate doesn't depend on whether + * the BOs are reachable. */ - if (sizing_pass) - vm =3D NULL; - else - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid); + if (!sizing_pass) { + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, + 1 + coredump->num_ibs); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, coredump->pasid, + &exec); + drm_exec_retry_on_contention(&exec); + if (!vm) + break; + + for (int i =3D 0; i < coredump->num_ibs; i++) { + u64 pfn; + + va_start =3D coredump->ibs[i].gpu_addr & + AMDGPU_GMC_HOLE_MASK; + pfn =3D va_start / AMDGPU_GPU_PAGE_SIZE; + mapping =3D amdgpu_vm_bo_lookup_mapping(vm, pfn); + if (!mapping) + continue; + + abo =3D mapping->bo_va->base.bo; + r =3D drm_exec_lock_obj(&exec, &abo->tbo.base); + drm_exec_retry_on_contention(&exec); + if (r) + break; + } + if (r) + break; + } + if (vm && !r) + locked =3D true; + else + drm_exec_fini(&exec); + } + + for (int i =3D 0; i < coredump->num_ibs; i++) { + bool emit_content =3D sizing_pass; =20 - for (int i =3D 0; i < coredump->num_ibs && (sizing_pass || vm); i++) { ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4, GFP_KERNEL); if (!ib_content) continue; =20 - /* vm=3DNULL can only happen when 'sizing_pass' is true. Skip to the - * drm_printf() calls (ib_content doesn't need to be initialized - * as its content won't be written anywhere). - */ - if (!vm) + if (!locked) goto output_ib_content; =20 va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK; mapping =3D amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_= SIZE); if (!mapping) - goto free_ib_content; + goto output_ib_content; =20 - offset =3D va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE); - abo =3D amdgpu_bo_ref(mapping->bo_va->base.bo); - r =3D amdgpu_bo_reserve(abo, false); - if (r) - goto free_ib_content; + abo =3D mapping->bo_va->base.bo; + offset =3D va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE; =20 if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) { off =3D 0; =20 if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM) - goto unreserve_abo; + goto output_ib_content; =20 amdgpu_res_first(abo->tbo.resource, offset, coredump->ibs[i].ib_size_dw * 4, @@ -391,12 +429,13 @@ amdgpu_devcoredump_format(char *buffer, size_t count,= struct amdgpu_coredump_inf off +=3D cursor.size; amdgpu_res_next(&cursor, cursor.size); } + emit_content =3D true; } else { r =3D ttm_bo_kmap(&abo->tbo, 0, PFN_UP(abo->tbo.base.size), &abo->kmap); if (r) - goto unreserve_abo; + goto output_ib_content; =20 kptr =3D amdgpu_bo_kptr(abo); kptr +=3D offset; @@ -404,23 +443,21 @@ amdgpu_devcoredump_format(char *buffer, size_t count,= struct amdgpu_coredump_inf coredump->ibs[i].ib_size_dw * 4); =20 amdgpu_bo_kunmap(abo); + emit_content =3D true; } =20 output_ib_content: drm_printf(&p, "\nIB #%d 0x%llx %d dw\n", i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw); - for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++) - drm_printf(&p, "0x%08x\n", ib_content[j]); -unreserve_abo: - if (vm) - amdgpu_bo_unreserve(abo); -free_ib_content: + if (emit_content) { + for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++) + drm_printf(&p, "0x%08x\n", ib_content[j]); + } kvfree(ib_content); } - if (vm) { - amdgpu_bo_unreserve(root); - amdgpu_bo_unref(&root); - } + + if (locked) + drm_exec_fini(&exec); } =20 return count - iter.remain; --=20 2.54.0