From nobody Sun May 24 21:38:44 2026
From: Mikhail Gavrilov
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: Alex Deucher, Christian König, David Airlie, Simona Vetter,
 Pierre-Eric Pelloux-Prayer, Sumit Semwal, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov
Subject: [PATCH v4 1/2] drm/amdgpu: convert amdgpu_vm_lock_by_pasid() to drm_exec
Date: Thu, 21 May 2026 15:43:32 +0500
Message-ID: <20260521104335.28978-2-mikhail.v.gavrilov@gmail.com>
In-Reply-To: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
References: <20260520151741.50575-1-mikhail.v.gavrilov@gmail.com>
 <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0

amdgpu_vm_lock_by_pasid() looks up a VM by PASID and reserves its root
PD with a bare amdgpu_bo_reserve(), returning the still-reserved root to
the caller. A caller that then needs to reserve further BOs (for example
the devcoredump IB dump) ends up nesting reservation_ww_class_mutex
acquires without a ww_acquire_ctx, which lockdep flags as recursive
locking.

Convert the helper to take a drm_exec context and lock the root PD with
drm_exec_lock_obj(). Callers now run it inside a
drm_exec_until_all_locked() loop and can lock additional BOs in the same
ww ticket, so there is no nested ww_mutex acquire.
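
For readers unfamiliar with the pattern, the lock-everything-or-retry
loop described above can be modelled in plain userspace C. This is only
a toy sketch under stated assumptions: toy_obj, toy_exec and every
toy_*() function are hypothetical stand-ins invented for illustration,
not the drm_exec API, and contention is faked with a counter instead of
real ww_mutex arbitration. As far as I understand, the kernel also takes
the contended lock first before retrying, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOCKED 8

/* Hypothetical stand-in for a buffer object's reservation lock. */
struct toy_obj {
	int owner;          /* 0 = unlocked, otherwise the holding ticket */
	int contend_budget; /* fake contention: this many attempts back off */
};

/* Hypothetical stand-in for a drm_exec context: a ticket plus locked set. */
struct toy_exec {
	int ticket;         /* must be nonzero */
	bool contended;
	size_t num_locked;
	struct toy_obj *locked[TOY_MAX_LOCKED];
};

/* Drop every lock taken under this ticket. */
static void toy_exec_unlock_all(struct toy_exec *exec)
{
	for (size_t i = 0; i < exec->num_locked; i++)
		exec->locked[i]->owner = 0;
	exec->num_locked = 0;
}

/* Lock one object in the ticket; on fake contention, flag it and back off. */
static int toy_exec_lock_obj(struct toy_exec *exec, struct toy_obj *obj)
{
	if (obj->contend_budget > 0) {
		obj->contend_budget--;
		exec->contended = true;
		return 0;
	}
	if (exec->num_locked == TOY_MAX_LOCKED)
		return -1;
	obj->owner = exec->ticket;
	exec->locked[exec->num_locked++] = obj;
	return 0;
}

/*
 * Lock the root and one extra object under the same ticket.
 * Returns the number of passes the loop needed, or -1 on error.
 */
static int toy_lock_root_and_extra(struct toy_exec *exec,
				   struct toy_obj *root,
				   struct toy_obj *extra)
{
	int passes = 0;

	do {
		/* Every pass starts from scratch: drop whatever was locked. */
		toy_exec_unlock_all(exec);
		exec->contended = false;
		passes++;

		if (toy_exec_lock_obj(exec, root))
			return -1;
		if (exec->contended)
			continue; /* plays the role of a retry-on-contention check */

		if (toy_exec_lock_obj(exec, extra))
			return -1;
	} while (exec->contended);

	return passes;
}
```

The point of the sketch is that both locks are top-level acquires under
one ticket: when the second object is contended, the first lock is
dropped and the whole set is retried rather than waiting while holding
it.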

The drm_exec context holds its own reference on the locked root BO, so
the helper no longer hands a root reference back to the caller: the root
output parameter is dropped, and the transient reference taken across
the PASID lookup is released before returning.

The only existing caller, amdgpu_vm_handle_fault(), is updated
accordingly. Its is_compute_context path, which previously dropped the
root reservation around svm_range_restore_pages() and re-took it, now
finalises the drm_exec context and re-initialises a fresh one; behaviour
is otherwise unchanged.

No functional change intended for the page-fault path.

Signed-off-by: Mikhail Gavrilov
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 91 ++++++++++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 2 files changed, 58 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9ba9de16a27a..591980907211 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2950,47 +2950,56 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 }
 
 /**
- * amdgpu_vm_lock_by_pasid - return an amdgpu_vm and its root bo from a pasid, if possible.
+ * amdgpu_vm_lock_by_pasid - look up a VM by PASID and lock its root PD
  * @adev: amdgpu device pointer
- * @root: root BO of the VM
  * @pasid: PASID of the VM
- * The caller needs to unreserve and unref the root bo on success.
+ * @exec: drm_exec context to lock the root PD in
+ *
+ * Must be called from within a drm_exec_until_all_locked() loop; the caller
+ * runs drm_exec_retry_on_contention() afterwards. The drm_exec context holds
+ * a reference on the root BO until it is finalised.
+ *
+ * Return: the VM on success, or NULL if the PASID has no VM, the VM is being
+ * torn down, or locking the root PD failed.
  */
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid)
+					  u32 pasid, struct drm_exec *exec)
 {
 	unsigned long irqflags;
+	struct amdgpu_bo *root;
 	struct amdgpu_vm *vm;
 	int r;
 
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm = xa_load(&adev->vm_manager.pasids, pasid);
-	*root = vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
+	root = vm ? amdgpu_bo_ref(vm->root.bo) : NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
 
-	if (!*root)
+	if (!root)
 		return NULL;
 
-	r = amdgpu_bo_reserve(*root, true);
-	if (r)
-		goto error_unref;
+	r = drm_exec_lock_obj(exec, &root->tbo.base);
+	if (r) {
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
 
 	/* Double check that the VM still exists */
 	xa_lock_irqsave(&adev->vm_manager.pasids, irqflags);
 	vm = xa_load(&adev->vm_manager.pasids, pasid);
-	if (vm && vm->root.bo != *root)
+	if (vm && vm->root.bo != root)
 		vm = NULL;
 	xa_unlock_irqrestore(&adev->vm_manager.pasids, irqflags);
-	if (!vm)
-		goto error_unlock;
+	if (!vm) {
+		drm_exec_unlock_obj(exec, &root->tbo.base);
+		amdgpu_bo_unref(&root);
+		return NULL;
+	}
 
-	return vm;
-error_unlock:
-	amdgpu_bo_unreserve(*root);
+	/* The drm_exec context holds its own reference on the root BO. */
+	amdgpu_bo_unref(&root);
 
-error_unref:
-	amdgpu_bo_unref(root);
-	return NULL;
+	return vm;
 }
 
 /**
@@ -3012,33 +3021,49 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
 			    uint64_t ts, bool write_fault)
 {
 	bool is_compute_context = false;
-	struct amdgpu_bo *root;
+	struct drm_exec exec;
 	uint64_t value, flags;
 	struct amdgpu_vm *vm;
 	int r;
 
-	vm = amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-	if (!vm)
+	drm_exec_init(&exec, 0, 0);
+	drm_exec_until_all_locked(&exec) {
+		vm = amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+		drm_exec_retry_on_contention(&exec);
+		if (!vm)
+			break;
+	}
+	if (!vm) {
+		drm_exec_fini(&exec);
 		return false;
+	}
 
 	is_compute_context = vm->is_compute_context;
 
 	if (is_compute_context) {
-		/* Unreserve root since svm_range_restore_pages might try to reserve it. */
-		/* TODO: rework svm_range_restore_pages so that this isn't necessary. */
-		amdgpu_bo_unreserve(root);
+		/* Release the root PD lock since svm_range_restore_pages
+		 * might try to take it.
+		 * TODO: rework svm_range_restore_pages so that this isn't
+		 * necessary.
+		 */
+		drm_exec_fini(&exec);
 
 		if (!svm_range_restore_pages(adev, pasid, vmid,
-					     node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
-			amdgpu_bo_unref(&root);
+					     node_id, addr >> PAGE_SHIFT, ts, write_fault))
 			return true;
-		}
-		amdgpu_bo_unref(&root);
 
 		/* Re-acquire the VM lock, could be that the VM was freed in between. */
-		vm = amdgpu_vm_lock_by_pasid(adev, &root, pasid);
-		if (!vm)
+		drm_exec_init(&exec, 0, 0);
+		drm_exec_until_all_locked(&exec) {
+			vm = amdgpu_vm_lock_by_pasid(adev, pasid, &exec);
+			drm_exec_retry_on_contention(&exec);
+			if (!vm)
+				break;
+		}
+		if (!vm) {
+			drm_exec_fini(&exec);
 			return false;
+		}
 	}
 
 	addr /= AMDGPU_GPU_PAGE_SIZE;
@@ -3062,7 +3087,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
 		value = 0;
 	}
 
-	r = dma_resv_reserve_fences(root->tbo.base.resv, 1);
+	r = dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve fence slot\n", r);
 		goto error_unlock;
@@ -3076,12 +3101,10 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
 	r = amdgpu_vm_update_pdes(adev, vm, true);
 
 error_unlock:
-	amdgpu_bo_unreserve(root);
+	drm_exec_fini(&exec);
 	if (r < 0)
 		dev_err(adev->dev, "Can't handle page fault (%d)\n", r);
 
-	amdgpu_bo_unref(&root);
-
 	return false;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index d083d7aab75c..0c6e3e0368c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -593,7 +593,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
 			    bool write_fault);
 
 struct amdgpu_vm *amdgpu_vm_lock_by_pasid(struct amdgpu_device *adev,
-					  struct amdgpu_bo **root, u32 pasid);
+					  u32 pasid, struct drm_exec *exec);
 
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
 
-- 
2.54.0

From nobody Sun May 24 21:38:44 2026
From: Mikhail Gavrilov
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: Alex Deucher, Christian König, David Airlie, Simona Vetter,
 Pierre-Eric Pelloux-Prayer, Sumit Semwal, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Mikhail Gavrilov
Subject: [PATCH v4 2/2] drm/amdgpu: fix recursive ww_mutex acquire in amdgpu_devcoredump_format
Date: Thu, 21 May 2026 15:43:33 +0500
Message-ID: <20260521104335.28978-3-mikhail.v.gavrilov@gmail.com>
In-Reply-To: <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
References: <20260520151741.50575-1-mikhail.v.gavrilov@gmail.com>
 <20260521104335.28978-1-mikhail.v.gavrilov@gmail.com>
X-Mailer: git-send-email 2.54.0

When dumping IB contents from a hung job, amdgpu_devcoredump_format()
acquired the VM root PD's reservation via amdgpu_vm_lock_by_pasid() and
then, for each IB, called amdgpu_bo_reserve() on the BO backing the IB.
Both reservations are reservation_ww_class_mutex objects and neither
used a ww_acquire_ctx, which trips lockdep:

  WARNING: possible recursive locking detected
  --------------------------------------------
  kworker/u128:0 is trying to acquire lock:
  ffff88838b16e1f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

  but task is already holding lock:
  ffff8882f82681f0 (reservation_ww_class_mutex){+.+.}-{4:4}, at: amdgpu_devcoredump_format+0x1594/0x23f0 [amdgpu]

  Possible unsafe locking scenario:
        CPU0
        ----
   lock(reservation_ww_class_mutex);
   lock(reservation_ww_class_mutex);

  *** DEADLOCK ***
  May be due to missing lock nesting notation

  Workqueue: events_unbound amdgpu_devcoredump_deferred_work [amdgpu]
  Call Trace:
   __ww_mutex_lock.constprop.0
   ww_mutex_lock
   amdgpu_bo_reserve
   amdgpu_devcoredump_format+0x1594 [amdgpu]
   amdgpu_devcoredump_deferred_work+0xea [amdgpu]

The two reservations are on different BOs in the captured trace, so the
splat is a lockdep-correctness warning, not an observed deadlock. It
becomes a real self-deadlock whenever the IB BO shares its dma_resv with
the root PD (the always-valid case, see amdgpu_vm_is_bo_always_valid()):
amdgpu_bo_reserve(abo) re-acquires the same ww_mutex without a ticket
and blocks forever.

With amdgpu.gpu_recovery=0 the timeout handler refires every ~2 s and
each invocation produces this splat, drowning the kernel ring buffer.

Now that amdgpu_vm_lock_by_pasid() takes a drm_exec context, lock the
root PD and every IB BO together in a single drm_exec ticket.
DRM_EXEC_IGNORE_DUPLICATES handles IB BOs that share a dma_resv (e.g.
always-valid BOs, or two IBs backed by the same BO). Every lock is now
a top-level acquire under one ww_acquire_ctx, so the recursive ww_mutex
condition is gone, and the per-IB amdgpu_bo_reserve()/amdgpu_bo_unref()
dance -- including a BO refcount leak on the amdgpu_bo_reserve() failure
path -- is removed.
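
The duplicate handling described above can be illustrated with a small
userspace model. This is a hedged sketch, not kernel code: toy_resv,
toy_bo, toy_exec, TOY_IGNORE_DUPLICATES and the toy_*() functions are
hypothetical names made up for the example. It assumes, as I understand
the ww_mutex rules, that locking a reservation already held under the
same ticket reports -EALREADY, and that the ignore-duplicates flag turns
that case into success.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dma_resv: records which ticket holds it. */
struct toy_resv {
	int owner; /* 0 = unlocked */
};

/* Hypothetical stand-in for a BO; several BOs may share one toy_resv. */
struct toy_bo {
	struct toy_resv *resv;
};

#define TOY_IGNORE_DUPLICATES 0x1

/* Hypothetical stand-in for a drm_exec context. */
struct toy_exec {
	int ticket;         /* must be nonzero */
	unsigned int flags;
};

/* Lock one BO's reservation under the context's ticket. */
static int toy_lock_bo(struct toy_exec *exec, struct toy_bo *bo)
{
	/* Same ticket already holds this reservation: a duplicate. */
	if (bo->resv->owner == exec->ticket)
		return (exec->flags & TOY_IGNORE_DUPLICATES) ? 0 : -EALREADY;
	/* Held under another ticket; the toy reports it instead of waiting. */
	if (bo->resv->owner != 0)
		return -EBUSY;
	bo->resv->owner = exec->ticket;
	return 0;
}

/* Lock the root first, then every IB BO; returns 0 or the first error. */
static int toy_lock_root_and_ibs(struct toy_exec *exec, struct toy_bo *root,
				 struct toy_bo *ibs, size_t num_ibs)
{
	int r = toy_lock_bo(exec, root);

	for (size_t i = 0; !r && i < num_ibs; i++)
		r = toy_lock_bo(exec, &ibs[i]);
	return r;
}
```

In this model an IB BO that shares its reservation with the root, or two
IBs backed by the same BO, is a harmless repeat when the flag is set and
an error otherwise; neither case can block on a lock the caller already
holds.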

Reproducer (~150 LoC libdrm_amdgpu): submit a single GFX IB containing
PACKET3_INDIRECT_BUFFER chained at GPU VA 0 and wait for the fence. The
TDR fires within ~10 s and the deferred coredump worker produces the
splat above on every invocation; with this change applied the splat is
gone.

Fixes: 7b15fc2d1f1a ("drm/amdgpu: dump job ibs in the devcoredump")
Suggested-by: Christian König
Signed-off-by: Mikhail Gavrilov
---
 .../gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c | 105 ++++++++++++------
 1 file changed, 71 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
index d386bc775d03..456ea9911d48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -24,6 +24,7 @@
 
 #include
 #include
+#include
 #include "amdgpu_dev_coredump.h"
 #include "atom.h"
 
@@ -214,13 +215,9 @@ amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredump_inf
 	struct drm_printer p;
 	struct drm_print_iterator iter;
 	struct amdgpu_vm_fault_info *fault_info;
-	struct amdgpu_bo_va_mapping *mapping;
 	struct amdgpu_ip_block *ip_block;
 	struct amdgpu_res_cursor cursor;
-	struct amdgpu_bo *abo, *root;
-	uint64_t va_start, offset;
 	struct amdgpu_ring *ring;
-	struct amdgpu_vm *vm;
 	u32 *ib_content;
 	uint8_t *kptr;
 	int ver, i, j, r;
@@ -343,43 +340,84 @@ amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredump_inf
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
 
 	if (coredump->num_ibs) {
-		/* Don't try to lookup the VM or map the BOs when calculating the
-		 * size required to store the devcoredump.
+		struct amdgpu_bo_va_mapping *mapping;
+		struct amdgpu_bo *abo;
+		struct drm_exec exec;
+		struct amdgpu_vm *vm;
+		u64 va_start, offset;
+		bool locked = false;
+
+		/*
+		 * Lock the VM root PD and every IB BO together in a single
+		 * drm_exec ticket. Reserving the IB BOs one by one while the
+		 * root PD is held would be a recursive reservation_ww_class_mutex
+		 * acquire without a ww_acquire_ctx, which trips lockdep and
+		 * self-deadlocks for IB BOs that share their dma_resv with the
+		 * root PD (always-valid BOs).
+		 *
+		 * Skip locking entirely on the sizing pass: it does not write
+		 * IB content, so the size estimate doesn't depend on whether
+		 * the BOs are reachable.
 		 */
-		if (sizing_pass)
-			vm = NULL;
-		else
-			vm = amdgpu_vm_lock_by_pasid(adev, &root, coredump->pasid);
+		if (!sizing_pass) {
+			drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES,
+				      1 + coredump->num_ibs);
+			drm_exec_until_all_locked(&exec) {
+				vm = amdgpu_vm_lock_by_pasid(adev, coredump->pasid,
+							     &exec);
+				drm_exec_retry_on_contention(&exec);
+				if (!vm)
+					break;
+
+				for (int i = 0; i < coredump->num_ibs; i++) {
+					u64 pfn;
+
+					va_start = coredump->ibs[i].gpu_addr &
+						   AMDGPU_GMC_HOLE_MASK;
+					pfn = va_start / AMDGPU_GPU_PAGE_SIZE;
+					mapping = amdgpu_vm_bo_lookup_mapping(vm, pfn);
+					if (!mapping)
+						continue;
+
+					abo = mapping->bo_va->base.bo;
+					r = drm_exec_lock_obj(&exec, &abo->tbo.base);
+					drm_exec_retry_on_contention(&exec);
+					if (r)
+						break;
+				}
+				if (r)
+					break;
+			}
+			if (vm && !r)
+				locked = true;
+			else
+				drm_exec_fini(&exec);
+		}
+
+		for (int i = 0; i < coredump->num_ibs; i++) {
+			bool emit_content = sizing_pass;
 
-		for (int i = 0; i < coredump->num_ibs && (sizing_pass || vm); i++) {
 			ib_content = kvmalloc_array(coredump->ibs[i].ib_size_dw, 4,
 						    GFP_KERNEL);
 			if (!ib_content)
 				continue;
 
-			/* vm=NULL can only happen when 'sizing_pass' is true. Skip to the
-			 * drm_printf() calls (ib_content doesn't need to be initialized
-			 * as its content won't be written anywhere).
-			 */
-			if (!vm)
+			if (!locked)
 				goto output_ib_content;
 
 			va_start = coredump->ibs[i].gpu_addr & AMDGPU_GMC_HOLE_MASK;
 			mapping = amdgpu_vm_bo_lookup_mapping(vm, va_start / AMDGPU_GPU_PAGE_SIZE);
 			if (!mapping)
-				goto free_ib_content;
+				goto output_ib_content;
 
-			offset = va_start - (mapping->start * AMDGPU_GPU_PAGE_SIZE);
-			abo = amdgpu_bo_ref(mapping->bo_va->base.bo);
-			r = amdgpu_bo_reserve(abo, false);
-			if (r)
-				goto free_ib_content;
+			abo = mapping->bo_va->base.bo;
+			offset = va_start - mapping->start * AMDGPU_GPU_PAGE_SIZE;
 
 			if (abo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS) {
 				off = 0;
 
 				if (abo->tbo.resource->mem_type != TTM_PL_VRAM)
-					goto unreserve_abo;
+					goto output_ib_content;
 
 				amdgpu_res_first(abo->tbo.resource, offset,
 						 coredump->ibs[i].ib_size_dw * 4,
@@ -391,12 +429,13 @@ amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredump_inf
 					off += cursor.size;
 					amdgpu_res_next(&cursor, cursor.size);
 				}
+				emit_content = true;
 			} else {
 				r = ttm_bo_kmap(&abo->tbo, 0,
 						PFN_UP(abo->tbo.base.size),
 						&abo->kmap);
 				if (r)
-					goto unreserve_abo;
+					goto output_ib_content;
 
 				kptr = amdgpu_bo_kptr(abo);
 				kptr += offset;
@@ -404,23 +443,21 @@ amdgpu_devcoredump_format(char *buffer, size_t count, struct amdgpu_coredump_inf
 				       coredump->ibs[i].ib_size_dw * 4);
 
 				amdgpu_bo_kunmap(abo);
+				emit_content = true;
 			}
 
 output_ib_content:
 			drm_printf(&p, "\nIB #%d 0x%llx %d dw\n",
 				   i, coredump->ibs[i].gpu_addr, coredump->ibs[i].ib_size_dw);
-			for (int j = 0; j < coredump->ibs[i].ib_size_dw; j++)
-				drm_printf(&p, "0x%08x\n", ib_content[j]);
-unreserve_abo:
-			if (vm)
-				amdgpu_bo_unreserve(abo);
-free_ib_content:
+			if (emit_content) {
+				for (int j = 0; j < coredump->ibs[i].ib_size_dw; j++)
+					drm_printf(&p, "0x%08x\n", ib_content[j]);
+			}
 			kvfree(ib_content);
 		}
-		if (vm) {
-			amdgpu_bo_unreserve(root);
-			amdgpu_bo_unref(&root);
-		}
+
+		if (locked)
+			drm_exec_fini(&exec);
 	}
 
 	return count - iter.remain;
-- 
2.54.0