From: Sunil Khatri
To: Alex Deucher, Christian König, Shashank Sharma
Cc: amd-gfx@lists.freedesktop.org, Pan@rtg-sunil-navi33.amd.com, Xinhui,
    dri-devel@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, Sunil Khatri
Subject: [PATCH v3] drm/amdgpu: add ring timeout information in devcoredump
Date: Tue, 5 Mar 2024 19:27:38 +0530
Message-Id: <20240305135738.3162878-1-sunil.khatri@amd.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add ring timeout related information to the amdgpu devcoredump file for
debugging purposes.

During the GPU recovery process, the registered call is triggered and
adds the debug information to the data file created by the devcoredump
framework under the directory /sys/class/devcoredump/devcdx/

Signed-off-by: Sunil Khatri
Reviewed-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
index a59364e9b6ed..b5fd93cc5731 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
@@ -196,6 +196,13 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, size_t count,
 		   coredump->reset_task_info.process_name,
 		   coredump->reset_task_info.pid);
 
+	if (coredump->ring) {
+		drm_printf(&p, "\nRing timed out details\n");
+		drm_printf(&p, "IP Type: %d Ring Name: %s \n",
+				coredump->ring->funcs->type,
+				coredump->ring->name);
+	}
+
 	if (coredump->reset_vram_lost)
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
 	if (coredump->adev->reset_info.num_regs) {
@@ -220,6 +227,8 @@ void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost,
 {
 	struct amdgpu_coredump_info *coredump;
 	struct drm_device *dev = adev_to_drm(adev);
+	struct amdgpu_job *job = reset_context->job;
+	struct drm_sched_job *s_job;
 
 	coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT);
 
@@ -241,6 +250,11 @@ void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost,
 		}
 	}
 
+	if (job) {
+		s_job = &job->base;
+		coredump->ring = to_amdgpu_ring(s_job->sched);
+	}
+
 	coredump->adev = adev;
 
 	ktime_get_ts64(&coredump->reset_time);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index 19899f6b9b2b..60522963aaca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -97,6 +97,7 @@ struct amdgpu_coredump_info {
 	struct amdgpu_task_info reset_task_info;
 	struct timespec64 reset_time;
 	bool reset_vram_lost;
+	struct amdgpu_ring *ring;
 };
 #endif
 
-- 
2.34.1
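
For anyone consuming the new output from user space, the following is a
minimal sketch of how the added "IP Type: ... Ring Name: ..." line might be
parsed. It is not part of the patch. The helper name parse_ring_line() and
its return convention are hypothetical, and it assumes the ring name contains
no whitespace and fits in 63 characters.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical user-space helper (not part of the patch).
 *
 * Parses one line of devcoredump text in the format emitted by the
 * drm_printf() call added above:
 *
 *     "IP Type: %d Ring Name: %s \n"
 *
 * Returns 0 and fills *ip_type and name on a match, -1 otherwise.
 * Assumption: the ring name has no embedded whitespace.
 */
int parse_ring_line(const char *line, int *ip_type, char *name, size_t name_len)
{
	char buf[64];
	int type;

	/* Both fields must convert for the line to count as a match. */
	if (sscanf(line, "IP Type: %d Ring Name: %63s", &type, buf) != 2)
		return -1;

	*ip_type = type;
	/* snprintf() truncates safely if the caller's buffer is small. */
	snprintf(name, name_len, "%s", buf);
	return 0;
}
```

A caller would typically read the devcoredump data file line by line (for
example with fgets()) and pass each line to the helper, ignoring lines for
which it returns -1. Since the patch only prints this section when
coredump->ring is set, a dump without a matching line is expected whenever the
reset was not triggered by a job.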