From nobody Mon Jun 8 13:30:20 2026 Received: from mail-lj1-f169.google.com (mail-lj1-f169.google.com [209.85.208.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5FC33AB29D for ; Fri, 29 May 2026 06:48:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.208.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780037304; cv=none; b=O2MDmhb7TonZJc1ZTnVdiZGt0eA8nt1DeuLMtyBOp7cOI433mnlVeDcwDjO9CoVsPz5w+++ULGnW9KyNcj0sKN7XCVTLXJf/xLcO6n8vNm1wR3oYP3zGPMJvvMrXhx2o5IDL0TSjgZDTnm4KMsXpKdQp6Q6t4jqjPgh4SL6AnJ0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780037304; c=relaxed/simple; bh=Cpi20G3hUtpGa0J80DAMLB46h6zNbwjq96KGIYnVB/M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CJQ84HYC2jqlj+hkFd3duHafGYqWRUJzOqzWQYQaPNE+no8wvbaNuiEIFwl1w2PVnxLe5DEWvs7M0Y3TyQzTo3DggF/00mkvU9HRtJnxrwkiFPVSgvVjXUtJajKI4ZrJ0eQXwh6ybGxWs11KYJCw3f2wnyUa2UOonZqVirv0S1g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=deJKX4mJ; arc=none smtp.client-ip=209.85.208.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="deJKX4mJ" Received: by mail-lj1-f169.google.com with SMTP id 38308e7fff4ca-393a49d2e5eso115936701fa.2 for ; Thu, 28 May 2026 23:48:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780037301; x=1780642101; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+aMMV5ilmpFv8Pwd8WuxApxlzDJGWTTKgJEHknglssE=; b=deJKX4mJsN+T6F+YLL6nnDRmGf1a+YmI36XhEvp+sMZYKxZnJSC/X8dW5sm+42wR1K 6h2MmKxTYiGO17z7Qrf6nei5nZplwlPn+ricvRzHxBtegYMQrGHlLOzRoSRrIfI9xl8R 4JjLBckhYdbMteRfaolfthmAuti1+rQzFBtOemRtVNbFvQuRS4IxgWx8f+ZPaD5Q9rRG dKGvucEko2PL+hpMk5oK08y304n9333lzdaHeUxcJUiVMicFrvYIqeK7DQ2LN7N4S2K5 M/f3LvDh95E6D8VwnvehKqfzqGIJ6fEVCjYgmKiELZdW+E6LIhypUbKH7rK0+Tr4szCe Y7Tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780037301; x=1780642101; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=+aMMV5ilmpFv8Pwd8WuxApxlzDJGWTTKgJEHknglssE=; b=ZdLyoAiA+B63C5yk5FpDdk4R9A2+dODA7Eoc5Wq4RKil2wSyWs3QNAqXFMURgZU1Lw OKRTedVq7P9kJ0aSuY+ZDmX3ALbTv6HMqpf0nDoVQtUsRXKApHsetlWxsUK45My/yZt6 RhRsxneVajNJSgA1pin6SCIdg315dAzAZ2heM5XSv9QHYfK/JpphNbJJFtqlu1q5cNtk 045XdSrsC//R6TZ0oFdaqYcbxeAtMFygQe2itk8G1y1sr5L5ml0Wf6Wq4iA33bsk+VSV MlDAcSxOxymk29N8RjYX2cWCb8bI/RkllTBtOccpYQz/ox1zok/LCbc9//IHd0cCx5Gs OcSQ== X-Forwarded-Encrypted: i=1; AFNElJ+CqVXQbOvMvkknch2oXkuOiuGByHoCekzD5rM4XOcnbUgz0p3qokocLbzCeEbL/5EayfWH3NSiCgd6ZBY=@vger.kernel.org X-Gm-Message-State: AOJu0Yz5tfo4RngYU4M/rFFWytTOP8+upHPMK2H8bGZaP6KQodRWe8Eh t1ZqB9ftYXarar7dIHwnqzmMvM3zXQFgDDiHaFfztahsmX6UVQlsguZ3 X-Gm-Gg: Acq92OFGEi5g5wazbmysvKYf3zUhsDvJmC0pGvbmlT9B6Jq0sgKYOLFNhxb4FzzsF9N 9KAPJK/TESedxmn/BZOrRBlm23kCnVj1JNy3F1xFFsMVsHxXwNmherm5ZavKWko6ciYAzeorr62 39lg5spLI2yVNhcx0QlVjQhAbUphZNdf7Cz+jATBXY2jPAIkeHzDIItk3bR+QqI7EEnnlp7XOu0 LJRLiwCF1SV0uLhZAEaWFkKh2oEU8YYjkmcWiOsSqbgctR8QvPOMx6sEjQmVIyGp/voSHJ2ddpF 5kKqRnUnEzFygEq128mds+WXlbKvQpZJ8mGzr1sJOdlWW9HK6eVz0xoHOYhSle3BXrqGXQMOm2V UB74H1I0++06b5Aw+wS4Q/x4yKUTz71W43VybCthObL9lhwMNsgkmE555TmjJF+UA/6s+hBTwQv dbP/Uvs6CblkMaDkSk+Oh1xLQNu392neOZpMxcQihs2qY= X-Received: by 2002:a05:6512:230c:b0:5a8:f6fd:eb15 with SMTP id 2adb3069b0e04-5aa594a7f4cmr457133e87.24.1780037300718; Thu, 28 May 2026 23:48:20 -0700 (PDT) Received: from localhost ([188.234.148.119]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5aa5b79f04dsm72147e87.83.2026.05.28.23.48.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 28 May 2026 23:48:20 -0700 (PDT) From: Mikhail Gavrilov To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Pierre-Eric Pelloux-Prayer , Sumit Semwal , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov Subject: [PATCH v6 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to drm_exec Date: Fri, 29 May 2026 11:47:38 +0500 Message-ID: <20260529064740.25060-2-mikhail.v.gavrilov@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com> References: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to the caller. A caller that then needs to reserve further BOs (for example the devcoredump IB dump) ends up nesting reservation_ww_class_mutex acquires without a ww_acquire_ctx, which lockdep flags as recursive locking. Convert the helper to take a drm_exec context and lock the root PD with drm_exec_lock_obj(). Callers now run it inside a drm_exec_until_all_locked() loop and can lock additional BOs in the same ww ticket, so there is no nested ww_mutex acquire. The drm_exec context holds its own reference on the locked root BO, so the helper no longer hands a root reference back to the caller: the root output parameter is dropped, and the transient reference taken across the PASID lookup is released before returning. The only existing caller, amdgpu_vm_handle_fault(), is updated accordingly. Its is_compute_context path, which previously dropped the root reservation around svm_range_restore_pages() and re-took it, now finalises the drm_exec context and re-initialises a fresh one; behaviour is otherwise unchanged. No functional change intended for the page-fault path. Reviewed-by: Christian K=C3=B6nig Signed-off-by: Mikhail Gavrilov --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++---------- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +- 2 files changed, 58 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.c index fccd758b6699..b9d93b664daf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *d= ata, struct drm_file *filp) } =20 /** - * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pa= sid, if possible. + * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD * @adev: amdgpu device pointer - * @root: root BO of the VM * @pasid: PASID of the VM - * The caller needs to unreserve and unref the root bo on success. + * @exec: drm_exec context to lock the root PD in + * + * Must be called from within a drm_exec_until_all_locked() loop; the call= er + * runs drm_exec_retry_on_contention() afterwards. The drm_exec context ho= lds + * a reference on the root BO until it is finalised. + * + * Return: the VM on success, or NULL if the PASID has no VM, the VM is be= ing + * torn down, or locking the root PD failed. */ struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev, - struct amdgpu_bo **root, u32 pasid) + u32 pasid, struct drm_exec *exec) { unsigned long irqflags; + struct amdgpu_bo *root; struct amdgpu_vm *vm; int r; =20 xa_lock_irqsave(&adev->vm_manager.pasids, irqflags); vm =3D xa_load(&adev->vm_manager.pasids, pasid); - *root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL; + root =3D vm ? amdgpu_bo_ref(vm->root.bo) : NULL; xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags); =20 - if (!*root) + if (!root) return NULL; =20 - r =3D amdgpu_bo_reserve(*root, true); - if (r) - goto error_unref; + r =3D drm_exec_lock_obj(exec, &root->tbo.base); + if (r) { + amdgpu_bo_unref(&root); + return NULL; + } =20 /* Double check that the VM still exists */ xa_lock_irqsave(&adev->vm_manager.pasids, irqflags); vm =3D xa_load(&adev->vm_manager.pasids, pasid); - if (vm && vm->root.bo !=3D *root) + if (vm && vm->root.bo !=3D root) vm =3D NULL; xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags); - if (!vm) - goto error_unlock; + if (!vm) { + drm_exec_unlock_obj(exec, &root->tbo.base); + amdgpu_bo_unref(&root); + return NULL; + } =20 - return vm; -error_unlock: - amdgpu_bo_unreserve(*root); + /* The drm_exec context holds its own reference on the root BO. */ + amdgpu_bo_unref(&root); =20 -error_unref: - amdgpu_bo_unref(root); - return NULL; + return vm; } =20 /** @@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a= dev, u32 pasid, uint64_t ts, bool write_fault) { bool is_compute_context =3D false; - struct amdgpu_bo *root; + struct drm_exec exec; uint64_t value, flags; struct amdgpu_vm *vm; int r; =20 - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid); - if (!vm) + drm_exec_init(&exec, 0, 1); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec); + drm_exec_retry_on_contention(&exec); + if (!vm) + break; + } + if (!vm) { + drm_exec_fini(&exec); return false; + } =20 is_compute_context =3D vm->is_compute_context; =20 if (is_compute_context) { - /* Unreserve root since svm_range_restore_pages might try to reserve it.= */ - /* TODO: rework svm_range_restore_pages so that this isn't necessary. */ - amdgpu_bo_unreserve(root); + /* Release the root PD lock since svm_range_restore_pages + * might try to take it. + * TODO: rework svm_range_restore_pages so that this isn't + * necessary. + */ + drm_exec_fini(&exec); =20 if (!svm_range_restore_pages(adev, pasid, vmid, - node_id, addr >> PAGE_SHIFT, ts, write_fault)) { - amdgpu_bo_unref(&root); + node_id, addr >> PAGE_SHIFT, ts, write_fault)) return true; - } - amdgpu_bo_unref(&root); =20 /* Re-acquire the VM lock, could be that the VM was freed in between. */ - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid); - if (!vm) + drm_exec_init(&exec, 0, 1); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, pasid, &exec); + drm_exec_retry_on_contention(&exec); + if (!vm) + break; + } + if (!vm) { + drm_exec_fini(&exec); return false; + } } =20 addr /=3D AMDGPU_GPU_PAGE_SIZE; @@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *ade= v, u32 pasid, value =3D 0; } =20 - r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1); + r =3D dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1); if (r) { pr_debug("failed %d to reserve fence slot\n", r); goto error_unlock; @@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *a= dev, u32 pasid, r =3D amdgpu_vm_update_pdes(adev, vm, true); =20 error_unlock: - amdgpu_bo_unreserve(root); + drm_exec_fini(&exec); if (r < 0) dev_err(adev->dev, "Can't handle page fault (%d)\n", r); =20 - amdgpu_bo_unref(&root); - return false; } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.h index d083d7aab75c..0c6e3e0368c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev,= u32 pasid, bool write_fault); =20 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev, - struct amdgpu_bo **root, u32 pasid); + u32 pasid, struct drm_exec *exec); =20 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm); =20 --=20 2.54.0 From nobody Mon Jun 8 13:30:20 2026 Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com [209.85.167.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 89FA93ACA59 for ; Fri, 29 May 2026 06:48:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780037307; cv=none; b=lw82pOmrHcRF2eS2wnXHLgNBu6iVRgUmq2tvmn3fAQllnAPhtOJlZA/vJCwjXcKw+8bNihTIpxGk3zAUZS+q8TRF/rRxegnw/5KQYnE0t65+896l9Z+5LnmUg11BsaqyYHimo+uW0PWOFdgesHDdQ622qX40h8o/Cy9sK76Mihs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780037307; c=relaxed/simple; bh=XnmuBqyTezi+9eQjbJlkpvAZoXxxgwskhrT5L1ibFug=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=h2n6U1qLjGMBygEkUQFSJ7+WFZ+v/lC7SqZ5uDOxkxjN9vXmwIjT4W80qOijzEAErou4sOwWuAB2V4DMg8iCpYWiTWkfm3TqEiMpMauEszezrFKG/U9Jtruuzn+uBEZ911SQG9hqxJZpCeeQYXYZ0JBV33R/gQdU18nNWyLdaOo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=iy4BXyoS; arc=none smtp.client-ip=209.85.167.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="iy4BXyoS" Received: by mail-lf1-f43.google.com with SMTP id 2adb3069b0e04-5a8891febd2so17857305e87.1 for ; Thu, 28 May 2026 23:48:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780037304; x=1780642104; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pJUei/KuumAUMNhgELgEfVVLFOVaFW7qqMv2FxIHEpg=; b=iy4BXyoSqRjVW2HpSmG2MROpvHalE9UODTEVX0/uU4xcHClf4hE1ATXlWOoIYWM+K5 lNo8gtvktsFo0EOVWZVDdVKbzfRxBsQXYFsQZX3UeHkUBuwF+c5jAWlpxM2t9jdHWwr0 YrUE3/+NZF43oFW7FeNW7B9ZZ75FTIJAJMcUgwCt1HeyMmMgEqVDrHL4XP9e5TObOc2Q O6YlxwCb0FtSOjYP8Bx8eVg0VFyLta3HtjTXu02z1q95F5do5r48g5752kqLWmhwT1rC 3sxRmt9vpUPQBasfmdENVFhLsMQ15wBv5BBKeJ9UjLv7DVokbsJiAr0IWMn29uFFmgYz cXAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780037304; x=1780642104; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=pJUei/KuumAUMNhgELgEfVVLFOVaFW7qqMv2FxIHEpg=; b=UK5EqGbac3uD5as1nTA+6DXJxuhqlEJEfUGrDzN0HTOrZWvoU91PBZU0XRQbUUjShf 08yoFq2Q/R8LFJirb4mVfxOtyI0ubO0U7C5y+nvElI2RJOAuy++5WB/7pqSwuDnnBkWR 8QIgaMFr1F76gMAymq/e2ZxPZRuU8aoMOCVVqVmpktujVeoGUx1nBIos81jxBd4LO/1k 58+jrL+YHKVJGfQ/3sQapWlmdkdrXdUGQlmZZa4WyVJP2mfVAWr1xNuuUUoae5dDPq4E IAxRE0RomlPQfxUcNAdAIgcqQ/R+LJQQGkfVNxjcjTT+Eo7ZkW5CexvnSiD+vye086n6 MOtA== X-Forwarded-Encrypted: i=1; AFNElJ+X9zK9fGTrew1yDSrSiQxLblTKO/7pkwtYeSPXao1VifpMB8KhbNPlThgXdPG/cANEND5ZZcYHbm2XPRU=@vger.kernel.org X-Gm-Message-State: AOJu0Ywf4cbTtwRzMVMyz9zOl2LW5vH6keONo728ZImTWu3SdKAcm3b3 Tg68NIBIINrnnuZmD2shWsEQhqEULFfMhKWNnbaHJYGaSFlHtJqba8T7 X-Gm-Gg: Acq92OE3KVpjTKPYQBcAEgDxBu4iaJVyNKpInDtvWZzwW8sXzr3GUl2QMG2CuB1flLW 96Kbq6mNakMxGcaPq7DUOGqpqOoZUn0ZAJAXrW9yq8t2Q6UPTX11Mo/Ep5eyI0xU7rPT2GoOAfY UemWaWY4kE5Bg3SccdctbkUxigJBXc9+RlTD75KEFM84AUJ9kPOjW+ii9iJDTrvNdim9my8rEEr lRgcfJD5WHDn8TpP8Enk4cyc0wemtr1UCnz51Hm1ibilBmQPJCA8NE7K91+17jQoIG+TyzbHwzC GsP64h4fA17RRdivyWcBki5h3j1ymgcVj44/xmcFlahG031RQeXS3YyayLXrM2mZPrLjJWfFjY/ +bB5UpAJ7rekgt+3NXaCspy4SEstcB1ML4ePAC8iCzZrKctgil7ffHDboUOqIuZTiR1kjx1PE+h zWUPCMvtxrRtOhg6ILD4Wk2W0zpZUKNiiHAl5bWUlWusw= X-Received: by 2002:a05:6512:a8c:b0:5aa:106f:87bb with SMTP id 2adb3069b0e04-5aa59475a96mr400169e87.3.1780037303600; Thu, 28 May 2026 23:48:23 -0700 (PDT) Received: from localhost ([188.234.148.119]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-5aa5b79f04dsm72147e87.83.2026.05.28.23.48.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 28 May 2026 23:48:23 -0700 (PDT) From: Mikhail Gavrilov To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Pierre-Eric Pelloux-Prayer , Sumit Semwal , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov Subject: [PATCH v6 2/2] drm/amdgpu: fix recursive ww_mutex acquire in amdgpu_devcoredump_format Date: Fri, 29 May 2026 11:47:39 +0500 Message-ID: <20260529064740.25060-3-mikhail.v.gavrilov@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com> References: <20260521150841.20625-1-mikhail.v.gavrilov@gmail.com> <20260529064740.25060-1-mikhail.v.gavrilov@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When dumping IB contents from a hung job, amdgpu_devcoredump_format() acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB. Both reservations are reservation_ww_class_mutex objects and neither used a ww_acquire_ctx, which trips lockdep: WARNING: possible recursive locking detected -------------------------------------------- kworker/u128:0 is trying to acquire lock: ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] but task is already holding lock: ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu] Possible unsafe locking scenario: CPU0 ---- lock(reservation_ww_class_mutex); lock(reservation_ww_class_mutex); *** DEADLOCK *** May be due to missing lock nesting notation Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu] Call Trace: __ww_mutex_lock.constprop.0 ww_mutex_lock amdgpu_bo_reserve amdgpu_devcoredump_format+0x1594 [amdgpu] amdgpu_devcoredump_deferred_work+0xea [amdgpu] The two reservations are on different BOs in the captured trace, so the splat is a lockdep-correctness warning, not an observed deadlock. It becomes a real self-deadlock whenever the IB BO shares its dma_resv with the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()): amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket and blocks forever. With amdgpu.gpu_recovery=3D0 the timeout handler refires every ~2 s and each invocation produces this splat, drowning the kernel ring buffer. Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, move the IB dumping into a separate helper that locks the root PD and every IB BO together in a single drm_exec ticket. DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g. always-valid BOs, or two IBs backed by the same BO). Every lock is now a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex condition is gone, and the per-IB amdgpu_bo_reserve()/amdgpu_bo_unref() dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure path -- is removed. Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump") Suggested-by: Christian K=C3=B6nig Signed-off-by: Mikhail Gavrilov --- .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 215 ++++++++++-------- 1 file changed, 126 insertions(+), 89 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu= /drm/amd/amdgpu/amdgpu_dev_coredump.c index d386bc775d03..c7d43796d603 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c @@ -24,6 +24,7 @@ =20 #include #include +#include #include "amdgpu_dev_coredump.h" #include "atom.h" =20 @@ -207,23 +208,137 @@ static void amdgpu_devcoredump_fw_info(struct amdgpu= _device *adev, } } =20 +static void +amdgpu_devcoredump_print_ibs(struct drm_printer *p, + struct amdgpu_coredump_info *coredump, + bool sizing_pass) +{ + struct amdgpu_device *adev =3D coredump->adev; + struct amdgpu_bo_va_mapping *mapping; + struct amdgpu_bo *abo; + struct drm_exec exec; + struct amdgpu_vm *vm; + u32 *ib_content; + u64 va_start, offset; + u8 *kptr; + u32 off; + int r; + + /* + * On the sizing pass there is no VM to look up and no BO to lock; the + * size estimate doesn't depend on whether the IB BOs are reachable. + * Just emit the per-IB headers (the content is not written anywhere). + */ + if (sizing_pass) { + for (int i =3D 0; i < coredump->num_ibs; i++) { + drm_printf(p, "\nIB #%d 0x%llx %d dw\n", i, + coredump->ibs[i].gpu_addr, + coredump->ibs[i].ib_size_dw); + } + return; + } + + /* + * Lock the VM root PD and every IB BO together in a single drm_exec + * ticket. Reserving the IB BOs one by one while the root PD is held + * would be a recursive reservation_ww_class_mutex acquire without a + * ww_acquire_ctx, which trips lockdep and self-deadlocks for IB BOs + * that share their dma_resv with the root PD (always-valid BOs). + */ + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 1 + coredump->num_ibs); + drm_exec_until_all_locked(&exec) { + vm =3D amdgpu_vm_lock_by_pasid(adev, coredump->pasid, &exec); + if (!vm) + goto unlock; + + for (int i =3D 0; i < coredump->num_ibs; i++) { + u64 pfn =3D (coredump->ibs[i].gpu_addr & + AMDGPU_GMC_HOLE_MASK) / AMDGPU_GPU_PAGE_SIZE; + + mapping =3D amdgpu_vm_bo_lookup_mapping(vm, pfn); + if (!mapping) + continue; + + abo =3D mapping->bo_va->base.bo; + r =3D drm_exec_lock_obj(&exec, &abo->tbo.base); + drm_exec_retry_on_contention(&exec); + if (r) + goto unlock; + } + } + + for (int i =3D 0; i < coredump->num_ibs; i++) { + bool emit_content =3D false; + + ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4, + GFP_KERNEL); + if (!ib_content) + continue; + + va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK; + mapping =3D amdgpu_vm_bo_lookup_mapping(vm, + va_start / AMDGPU_GPU_PAGE_SIZE); + if (!mapping) + goto output_ib_content; + + abo =3D mapping->bo_va->base.bo; + offset =3D va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE; + + if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) { + struct amdgpu_res_cursor cursor; + + off =3D 0; + + if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM) + goto output_ib_content; + + amdgpu_res_first(abo->tbo.resource, offset, + coredump->ibs[i].ib_size_dw * 4, &cursor); + while (cursor.remaining) { + amdgpu_device_mm_access(adev, cursor.start / 4, + &ib_content[off], cursor.size / 4, + false); + off +=3D cursor.size; + amdgpu_res_next(&cursor, cursor.size); + } + emit_content =3D true; + } else { + r =3D ttm_bo_kmap(&abo->tbo, 0, PFN_UP(abo->tbo.base.size), + &abo->kmap); + if (r) + goto output_ib_content; + + kptr =3D amdgpu_bo_kptr(abo); + kptr +=3D offset; + memcpy(ib_content, kptr, coredump->ibs[i].ib_size_dw * 4); + + amdgpu_bo_kunmap(abo); + emit_content =3D true; + } + +output_ib_content: + drm_printf(p, "\nIB #%d 0x%llx %d dw\n", i, + coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw); + if (emit_content) { + for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++) + drm_printf(p, "0x%08x\n", ib_content[j]); + } + kvfree(ib_content); + } + +unlock: + drm_exec_fini(&exec); +} + static ssize_t amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredu= mp_info *coredump) { - struct amdgpu_device *adev =3D coredump->adev; struct drm_printer p; struct drm_print_iterator iter; struct amdgpu_vm_fault_info *fault_info; - struct amdgpu_bo_va_mapping *mapping; struct amdgpu_ip_block *ip_block; - struct amdgpu_res_cursor cursor; - struct amdgpu_bo *abo, *root; - uint64_t va_start, offset; struct amdgpu_ring *ring; - struct amdgpu_vm *vm; - u32 *ib_content; - uint8_t *kptr; - int ver, i, j, r; + int ver, i, j; u32 ring_idx, off; bool sizing_pass; =20 @@ -342,86 +457,8 @@ amdgpu_devcoredump_format(char *buffer, size_t count, = struct amdgpu_coredump_inf else if (coredump->reset_vram_lost) drm_printf(&p, "VRAM is lost due to GPU reset!\n"); =20 - if (coredump->num_ibs) { - /* Don't try to lookup the VM or map the BOs when calculating the - * size required to store the devcoredump. - */ - if (sizing_pass) - vm =3D NULL; - else - vm =3D amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid); - - for (int i =3D 0; i < coredump->num_ibs && (sizing_pass || vm); i++) { - ib_content =3D kvmalloc_array(coredump->ibs[i].ib_size_dw, 4, - GFP_KERNEL); - if (!ib_content) - continue; - - /* vm=3DNULL can only happen when 'sizing_pass' is true. Skip to the - * drm_printf() calls (ib_content doesn't need to be initialized - * as its content won't be written anywhere). - */ - if (!vm) - goto output_ib_content; - - va_start =3D coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK; - mapping =3D amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_= SIZE); - if (!mapping) - goto free_ib_content; - - offset =3D va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE); - abo =3D amdgpu_bo_ref(mapping->bo_va->base.bo); - r =3D amdgpu_bo_reserve(abo, false); - if (r) - goto free_ib_content; - - if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) { - off =3D 0; - - if (abo->tbo.resource->mem_type !=3D TTM_PL_VRAM) - goto unreserve_abo; - - amdgpu_res_first(abo->tbo.resource, offset, - coredump->ibs[i].ib_size_dw * 4, - &cursor); - while (cursor.remaining) { - amdgpu_device_mm_access(adev, cursor.start / 4, - &ib_content[off], cursor.size / 4, - false); - off +=3D cursor.size; - amdgpu_res_next(&cursor, cursor.size); - } - } else { - r =3D ttm_bo_kmap(&abo->tbo, 0, - PFN_UP(abo->tbo.base.size), - &abo->kmap); - if (r) - goto unreserve_abo; - - kptr =3D amdgpu_bo_kptr(abo); - kptr +=3D offset; - memcpy(ib_content, kptr, - coredump->ibs[i].ib_size_dw * 4); - - amdgpu_bo_kunmap(abo); - } - -output_ib_content: - drm_printf(&p, "\nIB #%d 0x%llx %d dw\n", - i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw); - for (int j =3D 0; j < coredump->ibs[i].ib_size_dw; j++) - drm_printf(&p, "0x%08x\n", ib_content[j]); -unreserve_abo: - if (vm) - amdgpu_bo_unreserve(abo); -free_ib_content: - kvfree(ib_content); - } - if (vm) { - amdgpu_bo_unreserve(root); - amdgpu_bo_unref(&root); - } - } + if (coredump->num_ibs) + amdgpu_devcoredump_print_ibs(&p, coredump, sizing_pass); =20 return count - iter.remain; } --=20 2.54.0