From nobody Thu Oct 9 20:26:26 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 984D72DE1E2 for ; Tue, 17 Jun 2025 12:50:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164625; cv=none; b=gLnWd8aeD8YRAL3nXOy9Io/bAfxYKTOPccY9BY0Ol4susX+nmEwSSHgnBRmovjEKTurSd2FfWGhNJZ1DfkZ8ns41em+Fb1sOZiR0H59Uyht/eP6pkbl6c8t7M+LUfoRvp15x33AbahWqrosff1BPud7/CeI74zplQtja6T3GKUY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164625; c=relaxed/simple; bh=D2pX+T/9Z4AhwOQrFVvq4pwh1k+kQ/6T961Ac0+jWOg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GZiQ83m7ACLq1wzeJPPf5sHbEQ8f2FGjG7pGJfq5ptTRPPxveOuruJKoCK/wEYHIkq282AcExQ7davnGee/Jruqvx87fD/X3O8m0c6cFHvmELslTe1bHqe16uawkhVUYxYCy22bybgn7ifa8GJpPx/d9BjiOW46s/vEw0BUiAGg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=lcwt4GNH; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="lcwt4GNH" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: 
Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=yrfDzOT2kIl7hEsjyNVf8qR8CQoShiARpH5JLckqStg=; b=lcwt4GNHHejK7Yucp1nwlPtmO3 /SnV2hmq66V2+D3YE6T7zCdhrmH0mUfx/0SE4//A82ZdqXmiKrouvypINmRZV9ZTcfsJMLzfJWvSg 227qsOh1Z/JqUS7K1i+mMIx0VqUGqY72/8FIpWGhZTrPI1LgCtiAeXik8RytUYmgSNrJufgMf2NWw /CLxsJaxuhNRhpnMS/NK10TimdUqISWeHxR42QEa1bUepFs3xX0QZKWlUDl4BXN8oPONtrZtqECPZ 0L/GX4sSUSspbcxb5ilAHJQABXi964+Sfvcig3bL/zPNw6LB2l+eTAX7ccjStNzjGUGRP5GBM1W8z KSEIqD8g==; Received: from [191.204.192.64] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRVlM-004ans-Mu; Tue, 17 Jun 2025 14:50:05 +0200 From: =?UTF-8?q?Andr=C3=A9=20Almeida?= To: "Alex Deucher" , =?UTF-8?q?Christian=20K=C3=B6nig?= , siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch, "Raag Jadav" , rodrigo.vivi@intel.com, jani.nikula@linux.intel.com, Xaver Hugl , Krzysztof Karas Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, =?UTF-8?q?Andr=C3=A9=20Almeida?= Subject: [PATCH v9 1/6] drm: amdgpu: Allow NULL pointers at amdgpu_vm_put_task_info() Date: Tue, 17 Jun 2025 09:49:44 -0300 Message-ID: <20250617124949.2151549-2-andrealmeid@igalia.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com> References: <20250617124949.2151549-1-andrealmeid@igalia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Allow NULL pointers at amdgpu_vm_put_task_info(), as is common practice for "put" or "free" functions. This avoids an extra NULL check in callers. 
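For illustration only, the NULL-tolerant "put" pattern can be sketched in plain userspace C. This is not the amdgpu code: struct task_info, task_info_get_new(), task_info_put() and the destroyed counter are hypothetical names, and a bare integer stands in for the kernel's kref.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as amdgpu_task_info. */
struct task_info {
	int refcount;	/* plain int here; the kernel uses struct kref */
};

/* Counts destroyed objects so the behaviour can be observed. */
static int destroyed;

/* Allocate an object holding one reference; returns NULL on failure. */
static struct task_info *task_info_get_new(void)
{
	struct task_info *ti = calloc(1, sizeof(*ti));

	if (ti)
		ti->refcount = 1;
	return ti;
}

/* NULL-tolerant "put": a NULL argument is a no-op, like free(NULL). */
static void task_info_put(struct task_info *ti)
{
	if (!ti)
		return;

	assert(ti->refcount > 0);
	if (--ti->refcount == 0) {
		free(ti);
		destroyed++;
	}
}
```

With this shape a caller can write task_info_put(lookup(...)) without first checking whether the lookup returned NULL.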
Signed-off-by: Andr=C3=A9 Almeida Reviewed-by: Christian K=C3=B6nig --- v9: use if (task) instead of if (ZERO_OR_NULL_PTR(task)) v8: New patch --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.c index 3911c78f8282..de914a39e3f6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2447,7 +2447,8 @@ amdgpu_vm_get_vm_from_pasid(struct amdgpu_device *ade= v, u32 pasid) */ void amdgpu_vm_put_task_info(struct amdgpu_task_info *task_info) { - kref_put(&task_info->refcount, amdgpu_vm_destroy_task_info); + if (task_info) + kref_put(&task_info->refcount, amdgpu_vm_destroy_task_info); } =20 /** --=20 2.49.0 From nobody Thu Oct 9 20:26:26 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 985092DE1F7 for ; Tue, 17 Jun 2025 12:50:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164627; cv=none; b=mLn56ht+6nYN2kIWoADlwn1nlf/UmstIxZqaf+t7n8ycrNtNfIvtgtFojNmo+khpgZ3jHGSfNQgcJEYqC/W2xLcWiqa++zcBK7khNn+5Xnx+cu4q88XqscdfKSQG/5xVSS3bLwU8OD+B0WRDn6rpTe9uZ0fyeN6NfqaiMhvp/3w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164627; c=relaxed/simple; bh=SOIaepdpXz3UcMi6KhwDvhR3KJhTOWvH/emZbM9rZ7w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=js2FvV1SwpnwEru1Do6gtzEm9CBaj2u5t8yj0E1S+xi4M8SZ6+8oiPSlr7GeyaFK21cx5p/oIcKVZ1+hvsqXo2LlHrfzx0HvwugRGkGWpHcfEwCs5Cwc3k1xoudlreA4aptopK4TMucAmJBFl/+vFlKM/NJfZMhAO9pjZ/m/8w4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=hMTNzwAq; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="hMTNzwAq" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=cqvcCLuXGWcxG6oSMTaVulETfZYDdZYBrWK87jUAr00=; b=hMTNzwAqeJzZCpuTIJ3I54+FPb IIUHTq8PswZ8T1oAOi77OXc9u9ZOSza4ImeE+QmB4PGkW2j3VNWf9akbTb980uUgp0Zjuxwxdnwv9 R5TQ3gwJx5rC8Z9TYLV9RKNin51kN8ti23cXzOrg/VWssxj7KKU1hKInDTpixFacQ5grfBHeoZhPr yFo5VhujOn6Hrdh+N2J0FlIJOFo8y7aQMP86G6i/7Dl3ATmQ9jRvIMIvnroRkrix7wu8bC1vyruy8 PBVfaC3gyVZoKARxKCxchCy8k1uQrDc1cq2Ii0lzbGfcaEMU3l19kHx4SVSOxPt3o0BxU4aGzIDc5 bMVpIoLA==; Received: from [191.204.192.64] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRVlQ-004ans-LH; Tue, 17 Jun 2025 14:50:09 +0200 From: =?UTF-8?q?Andr=C3=A9=20Almeida?= To: "Alex Deucher" , =?UTF-8?q?Christian=20K=C3=B6nig?= , siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch, "Raag Jadav" , rodrigo.vivi@intel.com, jani.nikula@linux.intel.com, Xaver Hugl , Krzysztof Karas Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, 
=?UTF-8?q?Andr=C3=A9=20Almeida?= Subject: [PATCH v9 2/6] drm: amdgpu: Create amdgpu_vm_print_task_info() Date: Tue, 17 Jun 2025 09:49:45 -0300 Message-ID: <20250617124949.2151549-3-andrealmeid@igalia.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com> References: <20250617124949.2151549-1-andrealmeid@igalia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable To avoid repetitive code in amdgpu, create a function that prints the content of struct amdgpu_task_info. Signed-off-by: Andr=C3=A9 Almeida Reviewed-by: Christian K=C3=B6nig --- v8: drop the inline v7: new patch --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 5 +---- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 5 +---- drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 5 +---- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 4 +--- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 +---- 8 files changed, 18 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/= amdgpu/amdgpu_job.c index 75262ce8db27..3d887428ca2b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -124,9 +124,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(stru= ct drm_sched_job *s_job) =20 ti =3D amdgpu_vm_get_task_info_pasid(ring->adev, job->pasid); if (ti) { - dev_err(adev->dev, - "Process information: process %s pid %d thread %s pid %d\n", - ti->process_name, ti->tgid, ti->task_name, ti->pid); + amdgpu_vm_print_task_info(adev, ti); amdgpu_vm_put_task_info(ti); } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.c index de914a39e3f6..3bf63eee2d4e 100644 --- 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -3157,3 +3157,12 @@ bool amdgpu_vm_is_bo_always_valid(struct amdgpu_vm *= vm, struct amdgpu_bo *bo) { return bo && bo->tbo.base.resv =3D=3D vm->root.bo->tbo.base.resv; } + +void amdgpu_vm_print_task_info(struct amdgpu_device *adev, + struct amdgpu_task_info *task_info) +{ + dev_err(adev->dev, + " Process %s pid %d thread %s pid %d\n", + task_info->process_name, task_info->tgid, + task_info->task_name, task_info->pid); +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/a= mdgpu/amdgpu_vm.h index f3ad687125ad..9ec5d94200aa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -668,4 +668,7 @@ void amdgpu_vm_tlb_fence_create(struct amdgpu_device *a= dev, struct amdgpu_vm *vm, struct dma_fence **fence); =20 +void amdgpu_vm_print_task_info(struct amdgpu_device *adev, + struct amdgpu_task_info *task_info); + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/a= mdgpu/gmc_v10_0.c index a3e2787501f1..7923f491cf73 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -164,10 +164,7 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_d= evice *adev, entry->src_id, entry->ring_id, entry->vmid, entry->pasid); task_info =3D amdgpu_vm_get_task_info_pasid(adev, entry->pasid); if (task_info) { - dev_err(adev->dev, - " in process %s pid %d thread %s pid %d\n", - task_info->process_name, task_info->tgid, - task_info->task_name, task_info->pid); + amdgpu_vm_print_task_info(adev, task_info); amdgpu_vm_put_task_info(task_info); } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/a= mdgpu/gmc_v11_0.c index 72211409227b..f15d691e9a20 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -134,10 +134,7 @@ static int gmc_v11_0_process_interrupt(struct amdgpu_d= evice *adev, 
entry->src_id, entry->ring_id, entry->vmid, entry->pasid); task_info =3D amdgpu_vm_get_task_info_pasid(adev, entry->pasid); if (task_info) { - dev_err(adev->dev, - " in process %s pid %d thread %s pid %d)\n", - task_info->process_name, task_info->tgid, - task_info->task_name, task_info->pid); + amdgpu_vm_print_task_info(adev, task_info); amdgpu_vm_put_task_info(task_info); } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c b/drivers/gpu/drm/amd/a= mdgpu/gmc_v12_0.c index b645d3e6a6c8..de763105fdfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c @@ -127,10 +127,7 @@ static int gmc_v12_0_process_interrupt(struct amdgpu_d= evice *adev, entry->src_id, entry->ring_id, entry->vmid, entry->pasid); task_info =3D amdgpu_vm_get_task_info_pasid(adev, entry->pasid); if (task_info) { - dev_err(adev->dev, - " in process %s pid %d thread %s pid %d)\n", - task_info->process_name, task_info->tgid, - task_info->task_name, task_info->pid); + amdgpu_vm_print_task_info(adev, task_info); amdgpu_vm_put_task_info(task_info); } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/am= dgpu/gmc_v8_0.c index 99ca08e9bdb5..b45fa0cea9d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -1458,9 +1458,7 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_d= evice *adev, =20 task_info =3D amdgpu_vm_get_task_info_pasid(adev, entry->pasid); if (task_info) { - dev_err(adev->dev, " for process %s pid %d thread %s pid %d\n", - task_info->process_name, task_info->tgid, - task_info->task_name, task_info->pid); + amdgpu_vm_print_task_info(adev, task_info); amdgpu_vm_put_task_info(task_info); } =20 diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/am= dgpu/gmc_v9_0.c index 282197f4ffb1..78f65aea03f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -641,10 +641,7 @@ static int 
gmc_v9_0_process_interrupt(struct amdgpu_de= vice *adev, =20 task_info =3D amdgpu_vm_get_task_info_pasid(adev, entry->pasid); if (task_info) { - dev_err(adev->dev, - " for process %s pid %d thread %s pid %d)\n", - task_info->process_name, task_info->tgid, - task_info->task_name, task_info->pid); + amdgpu_vm_print_task_info(adev, task_info); amdgpu_vm_put_task_info(task_info); } =20 --=20 2.49.0 From nobody Thu Oct 9 20:26:26 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F6512E2EF4 for ; Tue, 17 Jun 2025 12:50:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164629; cv=none; b=UQnsbCMySeD1MJH7ycFrXTDKE67PdVBLNqbLJ2TFRG8Lu/x0nEEJ/rKNbeTOkbzCtiFFCjau+uZqgnNDi9OKrdHtBlUnKf1t6nCqwJWO7wwZisKS7DQVYR1/sBJ83jFSJmtwu9eUS68H6KUEwblj8hyWloq+96l8sVQMcOA0g5A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164629; c=relaxed/simple; bh=2v9bmDfoi+U9lc4MgFzPyKK5AMpr4I9Wh765oVNRQhs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DY0REgvI4Y3LoKgPanTvdl6ztYoH/nJEBxSVpR4qvWrR25WQNVUF5KRZLHCR6eUNnYecnh8j6w49Kn4WtXVUgi/oX8mz6hyT5+yOwHnZ5Gn6YpyWf5Aw+nctC8UTcuoKnxxPYk+tJKHe7lQ4FXYfjy8yy42BdrcxqeLdcv1csNQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=RqDLMYsu; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="RqDLMYsu" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=egqVkJr89695k6cbqDYsiYNvXLoUZ0Gdt1PL0OlA0Kg=; b=RqDLMYsuH8jU63JDTdOdev1Lsw /pl6tpnvEd2YuH3zSfPFC2emqXAtuE8A5XMkG2sccrCSxW9cy2ykqWcasfjBDsjNrRAs/dDF37ZTt q/veugERffJDWkqy9BOdRszyUudPAYuQAUmC/eYL1xzjTJXTSvT7sd3vJu+dWN+73ezdBqzsyPoua 4HI8Z21AewuLMr17urp/yT2Q92Xv4ovxQi9sDLIrlHJM3CM/URBZRXnN0nJq7JIHhS0AMIWKg1MUs rFO7F1wSuxGO/kni2LnRgzmHJ/Tjbq90KvaNXzTAyAitR/e9YZ3tqu0pyf9g1F2qiNCnbFoN5aoeq 5Ipw2Orw==; Received: from [191.204.192.64] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRVlU-004ans-Mv; Tue, 17 Jun 2025 14:50:13 +0200 From: =?UTF-8?q?Andr=C3=A9=20Almeida?= To: "Alex Deucher" , =?UTF-8?q?Christian=20K=C3=B6nig?= , siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch, "Raag Jadav" , rodrigo.vivi@intel.com, jani.nikula@linux.intel.com, Xaver Hugl , Krzysztof Karas Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, =?UTF-8?q?Andr=C3=A9=20Almeida?= Subject: [PATCH v9 3/6] drm: Create a task info option for wedge events Date: Tue, 17 Jun 2025 09:49:46 -0300 Message-ID: <20250617124949.2151549-4-andrealmeid@igalia.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com> References: <20250617124949.2151549-1-andrealmeid@igalia.com> Precedence: bulk X-Mailing-List: 
linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When a device gets wedged, it might have been caused by a guilty application. For userspace, knowing which task was involved can be useful in some situations, such as implementing a policy, logging, or giving the compositor a chance to let the user know which task was involved in the problem. This is an optional argument: when the task info is not available, the PID and TASK strings won't appear in the event string. Sometimes the PID alone isn't enough, given that the task might already be dead by the time userspace tries to look up the name for that PID, so to make life easier, also report the task's name in the uevent. Acked-by: Rodrigo Vivi (for i915 and xe) Reviewed-by: Krzysztof Karas Reviewed-by: Raag Jadav Signed-off-by: Andr=C3=A9 Almeida Acked-by: Christian K=C3=B6nig --- v8: Code style changes (Raag) v7: - Change `char *comm` to `char comm[TASK_COMM_LEN]` v6: - s/cause/involved - drop string initialization v5: - s/app/task for struct and commit message as well - move defines to drm_drv.c - validate that comm is not NULL and not empty v4: s/APP/TASK v3: Make comm_string and pid_string empty when there's no app info --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/drm_drv.c | 21 +++++++++++++++++---- drivers/gpu/drm/i915/gt/intel_reset.c | 3 ++- drivers/gpu/drm/xe/xe_device.c | 3 ++- include/drm/drm_device.h | 9 +++++++++ include/drm/drm_drv.h | 3 ++- 7 files changed, 34 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/a= md/amdgpu/amdgpu_device.c index e1bab6a96cb6..8a0f36f33f13 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -6364,7 +6364,7 @@ int amdgpu_device_gpu_recover(struct 
amdgpu_device *a= dev, atomic_set(&adev->reset_domain->reset_res, r); =20 if (!r) - drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE); + drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL); =20 return r; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/= amdgpu/amdgpu_job.c index 3d887428ca2b..0c1381b527fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -164,7 +164,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(stru= ct drm_sched_job *s_job) if (amdgpu_ring_sched_ready(ring)) drm_sched_start(&ring->sched, 0); dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name); - drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE); + drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL); goto exit; } dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name); diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 56dd61f8e05a..a994da9d9233 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -538,10 +539,15 @@ static const char *drm_get_wedge_recovery(unsigned in= t opt) } } =20 +#define WEDGE_STR_LEN 32 +#define PID_STR_LEN 15 +#define COMM_STR_LEN (TASK_COMM_LEN + 5) + /** * drm_dev_wedged_event - generate a device wedged uevent * @dev: DRM device * @method: method(s) to be used for recovery + * @info: optional information about the guilty task * * This generates a device wedged uevent for the DRM device specified by @= dev. * Recovery @method\(s) of choice will be sent in the uevent environment as @@ -554,13 +560,13 @@ static const char *drm_get_wedge_recovery(unsigned in= t opt) * * Returns: 0 on success, negative error code otherwise. 
*/ -int drm_dev_wedged_event(struct drm_device *dev, unsigned long method) +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method, + struct drm_wedge_task_info *info) { + char event_string[WEDGE_STR_LEN], pid_string[PID_STR_LEN], comm_string[CO= MM_STR_LEN]; + char *envp[] =3D { event_string, NULL, NULL, NULL }; const char *recovery =3D NULL; unsigned int len, opt; - /* Event string length up to 28+ characters with available methods */ - char event_string[32]; - char *envp[] =3D { event_string, NULL }; =20 len =3D scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=3D"); =20 @@ -582,6 +588,13 @@ int drm_dev_wedged_event(struct drm_device *dev, unsig= ned long method) drm_info(dev, "device wedged, %s\n", method =3D=3D DRM_WEDGE_RECOVERY_NON= E ? "but recovered through reset" : "needs recovery"); =20 + if (info && (info->comm[0] !=3D '\0') && (info->pid >=3D 0)) { + snprintf(pid_string, sizeof(pid_string), "PID=3D%u", info->pid); + snprintf(comm_string, sizeof(comm_string), "TASK=3D%s", info->comm); + envp[1] =3D pid_string; + envp[2] =3D comm_string; + } + return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp); } EXPORT_SYMBOL(drm_dev_wedged_event); diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/g= t/intel_reset.c index dbdcfe130ad4..ba1d8fdc3c7b 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -1448,7 +1448,8 @@ static void intel_gt_reset_global(struct intel_gt *gt, kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event); else drm_dev_wedged_event(&gt->i915->drm, - DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET); + DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET, + NULL); } =20 /** diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index c02c4c4e9412..f329613e061f 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -1168,7 +1168,8 @@ void xe_device_declare_wedged(struct 
xe_device *xe) =20 /* Notify userspace of wedged device */ drm_dev_wedged_event(&xe->drm, - DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET); + DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET, + NULL); } =20 for_each_gt(gt, xe, id) diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h index e2f894f1b90a..08b3b2467c4c 100644 --- a/include/drm/drm_device.h +++ b/include/drm/drm_device.h @@ -5,6 +5,7 @@ #include #include #include +#include =20 #include =20 @@ -30,6 +31,14 @@ struct pci_controller; #define DRM_WEDGE_RECOVERY_REBIND BIT(1) /* unbind + bind driver */ #define DRM_WEDGE_RECOVERY_BUS_RESET BIT(2) /* unbind + reset bus device += bind */ =20 +/** + * struct drm_wedge_task_info - information about the guilty task of a wed= ge dev + */ +struct drm_wedge_task_info { + pid_t pid; + char comm[TASK_COMM_LEN]; +}; + /** * enum switch_power_state - power state of drm device */ diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 63b51942d606..3f76a32d6b84 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -487,7 +487,8 @@ void drm_put_dev(struct drm_device *dev); bool drm_dev_enter(struct drm_device *dev, int *idx); void drm_dev_exit(int idx); void drm_dev_unplug(struct drm_device *dev); -int drm_dev_wedged_event(struct drm_device *dev, unsigned long method); +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method, + struct drm_wedge_task_info *info); =20 /** * drm_dev_is_unplugged - is a DRM device unplugged --=20 2.49.0 From nobody Thu Oct 9 20:26:26 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 400B528BAB1 for ; Tue, 17 Jun 2025 12:50:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1750164627; cv=none; b=tIkzDP2MTZVGy0qBhOfdMzQkWMytDU3+cbGnO95bySgPIWfiM0vuNd9gtb8E+lJBlk1CfVVT/IT3cHx7O9qJXEMCrCuAfE8sHdLJpVj4Es+0iSMKCexPIQJgufZIPm/i0N/jevgDcrg4hiatZkctV+D83otY/w/G4bQ40jVqSME= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750164627; c=relaxed/simple; bh=vDHKuEB52NE3FyntkV/XWuE9A1twU7QMvf0KaC9hsyI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MDieLvx0lVFSOn+1xz7zHSdiCJfkMTGHRa0yVaT/FYNCeKSzft5sIa1OgAzDYqaCdvNMassDAvk5L8O4MMbxLsaYvQ8sSTQFw19auQQYUlMpoBJZ1smT6EC58d3nnc5Y9jUVkdp4HSLRnuKBMQrfalmKT3weu0qXNgVCwdb2lD0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=h7zmWnCh; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="h7zmWnCh" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=5bW97YWvE2WNp84/m7LouGNWv9aZZ0XfiU91U9Q+7L4=; b=h7zmWnChsIeMDopOeedwqXEska 01HbouEu5rpdgMA42JOhgDXEynxTf+R7CHq2EwR4MQlIFxehyhSwtRt9v0udqrkpau8ijYYO2AbRo zk0XnoEXyvN3OoI415lmYQJhROiy7kI5fClkWSPUZt/W7YxytRSwCS6Nx0XGidnucqHCXitNCgmLh /monqQTtPJi48u7ILCk8O7FkezexU0Ev/qwxDgAaByEsd3JAiXDl3oQzE6bQNZrPgpAL6LA+qTkTh 
3Y4aaWoXdtvrwsP3ywoNa8ixv6p0c3eLCkk3WlxlPhMe5q7PSLUYyXlJrXue08WjF09q3X+vWsie8 uVEiTkWg==; Received: from [191.204.192.64] (helo=localhost.localdomain) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRVlY-004ans-LF; Tue, 17 Jun 2025 14:50:17 +0200 From: =?UTF-8?q?Andr=C3=A9=20Almeida?= To: "Alex Deucher" , =?UTF-8?q?Christian=20K=C3=B6nig?= , siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch, "Raag Jadav" , rodrigo.vivi@intel.com, jani.nikula@linux.intel.com, Xaver Hugl , Krzysztof Karas Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org, =?UTF-8?q?Andr=C3=A9=20Almeida?= Subject: [PATCH v9 4/6] drm/doc: Add a section about "Task information" for the wedge API Date: Tue, 17 Jun 2025 09:49:47 -0300 Message-ID: <20250617124949.2151549-5-andrealmeid@igalia.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com> References: <20250617124949.2151549-1-andrealmeid@igalia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Add a section about "Task information" for the wedge API. Reviewed-by: Krzysztof Karas Reviewed-by: Raag Jadav Signed-off-by: Andr=C3=A9 Almeida Reviewed-by: Christian K=C3=B6nig --- v5: - Change app to task in the text as well v4: - Change APP to TASK v3: - Change "app that caused ..." to "app involved ..." 
- Clarify that devcoredump has more information about what happened - Update that PID and APP will be empty if there's no app info --- Documentation/gpu/drm-uapi.rst | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst index 4863a4deb0ee..263e5a97c080 100644 --- a/Documentation/gpu/drm-uapi.rst +++ b/Documentation/gpu/drm-uapi.rst @@ -446,6 +446,23 @@ telemetry information (devcoredump, syslog). This is u= seful because the first hang is usually the most critical one which can result in consequential ha= ngs or complete wedging. =20 +Task information +---------------- + +The information about which application (if any) was involved in the device +wedging is useful for userspace if it wants to notify the user about what +happened (e.g. the compositor displays a message to the user "The <task name> +caused a graphical error and the system recovered") or to implement polici= es +(e.g. the daemon may "ban" a task that keeps resetting the device). If th= e task +information is available, the uevent will display it as ``PID=3D<pid>`` and +``TASK=3D<task name>``. Otherwise, ``PID`` and ``TASK`` will not appear in= the +event string. + +The reliability of this information is driver and hardware specific, and s= hould +be treated with caution regarding its precision. To get the big picture of= what +really happened, the devcoredump file should provide much more detai= led +information about the device state and about the event. 
+
 Consumer prerequisites
 ----------------------
 
-- 
2.49.0

From nobody Thu Oct 9 20:26:26 2025
From: André Almeida
To: "Alex Deucher", Christian König, siqueira@igalia.com, airlied@gmail.com,
 simona@ffwll.ch, "Raag Jadav", rodrigo.vivi@intel.com,
 jani.nikula@linux.intel.com, Xaver Hugl, Krzysztof Karas
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org,
 intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
 André Almeida
Subject: [PATCH v9 5/6] drm: amdgpu: Use struct drm_wedge_task_info inside of
 struct amdgpu_task_info
Date: Tue, 17 Jun 2025 09:49:48 -0300
Message-ID: <20250617124949.2151549-6-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com>
References: <20250617124949.2151549-1-andrealmeid@igalia.com>

To avoid a cast when calling drm_dev_wedged_event(), replace pid and task
name inside of struct amdgpu_task_info with struct drm_wedge_task_info.

Signed-off-by: André Almeida
Reviewed-by: Christian König
---
v7: New patch
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 12 ++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h           |  3 +--
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c           |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c         |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c          |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c      |  8 ++++----
 9 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 8e626f50b362..dac4b926e7be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1786,7 +1786,7 @@ static int amdgpu_debugfs_vm_info_show(struct seq_file *m, void *unused)
 
 		ti = amdgpu_vm_get_task_info_vm(vm);
 		if (ti) {
-			seq_printf(m, "pid:%d\tProcess:%s ----------\n", ti->pid, ti->process_name);
+			seq_printf(m, "pid:%d\tProcess:%s ----------\n", ti->task.pid, ti->process_name);
 			amdgpu_vm_put_task_info(ti);
 		}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
index 7b50741dc097..8a026bc9ea44 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev_coredump.c
@@ -220,10 +220,10 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, size_t count,
 	drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec,
 		   coredump->reset_time.tv_nsec);
 
-	if (coredump->reset_task_info.pid)
+	if (coredump->reset_task_info.task.pid)
 		drm_printf(&p, "process_name: %s PID: %d\n",
 			   coredump->reset_task_info.process_name,
-			   coredump->reset_task_info.pid);
+			   coredump->reset_task_info.task.pid);
 
 	/* SOC Information */
 	drm_printf(&p, "\nSOC Information\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 0ecc88df7208..e5e33a68d935 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -329,7 +329,7 @@ static int amdgpu_gem_object_open(struct drm_gem_object *obj,
 
 		dev_warn(adev->dev, "validate_and_fence failed: %d\n", r);
 		if (ti) {
-			dev_warn(adev->dev, "pid %d\n", ti->pid);
+			dev_warn(adev->dev, "pid %d\n", ti->task.pid);
 			amdgpu_vm_put_task_info(ti);
 		}
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 3bf63eee2d4e..0ff95a56c2ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -622,7 +622,7 @@ int amdgpu_vm_validate(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 
 			pr_warn_ratelimited("Evicted user BO is not reserved\n");
 			if (ti) {
-				pr_warn_ratelimited("pid %d\n", ti->pid);
+				pr_warn_ratelimited("pid %d\n", ti->task.pid);
 				amdgpu_vm_put_task_info(ti);
 			}
 
@@ -2508,11 +2508,11 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
 	if (!vm->task_info)
 		return;
 
-	if (vm->task_info->pid == current->pid)
+	if (vm->task_info->task.pid == current->pid)
 		return;
 
-	vm->task_info->pid = current->pid;
-	get_task_comm(vm->task_info->task_name, current);
+	vm->task_info->task.pid = current->pid;
+	get_task_comm(vm->task_info->task.comm, current);
 
 	if (current->group_leader->mm != current->mm)
 		return;
@@ -2775,7 +2775,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 
 			dev_warn(adev->dev,
 				 "VM memory stats for proc %s(%d) task %s(%d) is non-zero when fini\n",
-				 ti->process_name, ti->pid, ti->task_name, ti->tgid);
+				 ti->process_name, ti->task.pid, ti->task.comm, ti->tgid);
 		}
 
 		amdgpu_vm_put_task_info(vm->task_info);
@@ -3164,5 +3164,5 @@ void amdgpu_vm_print_task_info(struct amdgpu_device *adev,
 	dev_err(adev->dev,
 		" Process %s pid %d thread %s pid %d\n",
 		task_info->process_name, task_info->tgid,
-		task_info->task_name, task_info->pid);
+		task_info->task.comm, task_info->task.pid);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 9ec5d94200aa..fd086efd8457 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -236,9 +236,8 @@ struct amdgpu_vm_pte_funcs {
 };
 
 struct amdgpu_task_info {
+	struct drm_wedge_task_info task;
 	char process_name[TASK_COMM_LEN];
-	char task_name[TASK_COMM_LEN];
-	pid_t pid;
 	pid_t tgid;
 	struct kref refcount;
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 33ed2b158fcd..f38004e6064e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2187,7 +2187,7 @@ static int sdma_v4_0_print_iv_entry(struct amdgpu_device *adev,
 		dev_dbg_ratelimited(adev->dev,
 				   " for process %s pid %d thread %s pid %d\n",
 				   task_info->process_name, task_info->tgid,
-				   task_info->task_name, task_info->pid);
+				   task_info->task.comm, task_info->task.pid);
 		amdgpu_vm_put_task_info(task_info);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 9c169112a5e7..bcde34e4e0a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -1884,7 +1884,7 @@ static int sdma_v4_4_2_print_iv_entry(struct amdgpu_device *adev,
 	if (task_info) {
 		dev_dbg_ratelimited(adev->dev, " for process %s pid %d thread %s pid %d\n",
 				    task_info->process_name, task_info->tgid,
-				    task_info->task_name, task_info->pid);
+				    task_info->task.comm, task_info->task.pid);
 		amdgpu_vm_put_task_info(task_info);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 2b294ada3ec0..82905f3e54dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -1302,7 +1302,7 @@ void kfd_signal_reset_event(struct kfd_node *dev)
 		if (ti) {
 			dev_err(dev->adev->dev,
 				"Queues reset on process %s tid %d thread %s pid %d\n",
-				ti->process_name, ti->tgid, ti->task_name, ti->pid);
+				ti->process_name, ti->tgid, ti->task.comm, ti->task.pid);
 			amdgpu_vm_put_task_info(ti);
 		}
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 83d9384ac815..a499449fcb06 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -253,9 +253,9 @@ void kfd_smi_event_update_vmfault(struct kfd_node *dev, uint16_t pasid)
 	task_info = amdgpu_vm_get_task_info_pasid(dev->adev, pasid);
 	if (task_info) {
 		/* Report VM faults from user applications, not retry from kernel */
-		if (task_info->pid)
+		if (task_info->task.pid)
 			kfd_smi_event_add(0, dev, KFD_SMI_EVENT_VMFAULT, KFD_EVENT_FMT_VMFAULT(
-					  task_info->pid, task_info->task_name));
+					  task_info->task.pid, task_info->task.comm));
 		amdgpu_vm_put_task_info(task_info);
 	}
 }
@@ -359,8 +359,8 @@ void kfd_smi_event_process(struct kfd_process_device *pdd, bool start)
 		kfd_smi_event_add(0, pdd->dev,
 				  start ? KFD_SMI_EVENT_PROCESS_START :
 				  KFD_SMI_EVENT_PROCESS_END,
-				  KFD_EVENT_FMT_PROCESS(task_info->pid,
-				  task_info->task_name));
+				  KFD_EVENT_FMT_PROCESS(task_info->task.pid,
+				  task_info->task.comm));
 		amdgpu_vm_put_task_info(task_info);
 	}
 }
-- 
2.49.0

From nobody Thu Oct 9 20:26:26 2025
From: André Almeida
To: "Alex Deucher", Christian König, siqueira@igalia.com, airlied@gmail.com,
 simona@ffwll.ch, "Raag Jadav", rodrigo.vivi@intel.com,
 jani.nikula@linux.intel.com, Xaver Hugl, Krzysztof Karas
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org,
 intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
 André Almeida
Subject: [PATCH v9 6/6] drm/amdgpu: Make use of drm_wedge_task_info
Date: Tue, 17 Jun 2025 09:49:49 -0300
Message-ID: <20250617124949.2151549-7-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250617124949.2151549-1-andrealmeid@igalia.com>
References: <20250617124949.2151549-1-andrealmeid@igalia.com>

To notify userspace about which task (if any) caused the device to enter a
wedged state, make use of the drm_wedge_task_info parameter, filling it with
the task PID and name.

Signed-off-by: André Almeida
Reviewed-by: Christian König
---
v8:
 - Drop check before calling amdgpu_vm_put_task_info()
 - Drop local variable `info`
v7:
 - Remove struct cast, now we can use `info = &ti->task`
 - Fix struct lifetime, move amdgpu_vm_put_task_info() after
   drm_dev_wedged_event() call
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 +++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  7 +++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8a0f36f33f13..a59f194e3360 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6363,8 +6363,17 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
 	atomic_set(&adev->reset_domain->reset_res, r);
 
-	if (!r)
-		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
+	if (!r) {
+		struct amdgpu_task_info *ti = NULL;
+
+		if (job)
+			ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid);
+
+		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE,
+				     ti ? &ti->task : NULL);
+
+		amdgpu_vm_put_task_info(ti);
+	}
 
 	return r;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 0c1381b527fe..1e24590ae144 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -89,6 +89,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 {
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
+	struct drm_wedge_task_info *info = NULL;
 	struct amdgpu_task_info *ti;
 	struct amdgpu_device *adev = ring->adev;
 	int idx;
@@ -125,7 +126,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 	ti = amdgpu_vm_get_task_info_pasid(ring->adev, job->pasid);
 	if (ti) {
 		amdgpu_vm_print_task_info(adev, ti);
-		amdgpu_vm_put_task_info(ti);
+		info = &ti->task;
 	}
 
 	/* attempt a per ring reset */
@@ -164,13 +165,15 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 			if (amdgpu_ring_sched_ready(ring))
 				drm_sched_start(&ring->sched, 0);
 			dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name);
-			drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
+			drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, info);
 			goto exit;
 		}
 		dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name);
 	}
 	dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
 
+	amdgpu_vm_put_task_info(ti);
+
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		struct amdgpu_reset_context reset_context;
 		memset(&reset_context, 0, sizeof(reset_context));
-- 
2.49.0
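
The reference handling in this patch can be modelled outside the kernel. The
following is a minimal userspace sketch in C, not the amdgpu implementation:
struct task_info, task_info_get(), task_info_put(), wedged_event() and
handle_timeout() are hypothetical stand-ins, a plain integer replaces
struct kref, and the single-string event format is a simplification. It
illustrates three points from the series: the put helper accepts NULL, the
DRM-level info is embedded so that &ti->task needs no cast, and the reference
is dropped only after the event has been emitted. The sketch releases the
reference at a single exit label so that every path drops it. In the
amdgpu_job_timedout() hunk above, the put sits after the `goto exit` of the
successful ring reset branch, so it may be worth checking that this branch
also releases `ti`.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_wedge_task_info. */
struct wedge_task_info {
	int pid;
	char comm[16];
};

/* Hypothetical stand-in for struct amdgpu_task_info. */
struct task_info {
	struct wedge_task_info task;	/* embedded: &ti->task needs no cast */
	int refcount;			/* simplified replacement for struct kref */
};

static int live_objects;	/* allocations that have not been freed yet */

static struct task_info *task_info_create(int pid, const char *comm)
{
	struct task_info *ti = calloc(1, sizeof(*ti));

	if (!ti)
		return NULL;
	ti->task.pid = pid;
	snprintf(ti->task.comm, sizeof(ti->task.comm), "%s", comm);
	ti->refcount = 1;
	live_objects++;
	return ti;
}

/* Take an extra reference; a NULL owner simply yields NULL. */
static struct task_info *task_info_get(struct task_info *ti)
{
	if (ti)
		ti->refcount++;
	return ti;
}

/* NULL-tolerant put, so callers do not need their own check. */
static void task_info_put(struct task_info *ti)
{
	if (!ti)
		return;
	if (--ti->refcount == 0) {
		free(ti);
		live_objects--;
	}
}

/* Simplified event string: PID and TASK appear only when info is given. */
static void wedged_event(const struct wedge_task_info *info, char *buf, size_t len)
{
	if (info)
		snprintf(buf, len, "WEDGED=none PID=%d TASK=%s", info->pid, info->comm);
	else
		snprintf(buf, len, "WEDGED=none");
}

/* Returns 0 when the (simulated) ring reset succeeded, -1 otherwise. */
static int handle_timeout(struct task_info *owner, int ring_reset_ok,
			  char *buf, size_t len)
{
	struct task_info *ti = task_info_get(owner);	/* may be NULL */
	int ret = -1;

	buf[0] = '\0';
	if (ring_reset_ok) {
		/* info must stay valid here, so the put happens afterwards */
		wedged_event(ti ? &ti->task : NULL, buf, len);
		ret = 0;
		goto out;
	}
	/* a full device recovery would be attempted here */
out:
	task_info_put(ti);	/* single release point for every path */
	return ret;
}
```

A caller would create one owner reference with task_info_create(), pass it to
handle_timeout() for each simulated hang, and drop it with task_info_put()
when done. Passing NULL as the owner models a hang with no known task.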