Instead of only triggering a wedged event for complete GPU resets,
trigger for ring resets. Regardless of the reset, it's useful for
userspace to know that it happened because the kernel will reject
further submissions from that app.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
v3: do only for ring resets, no soft recoveries
v2: Keep the wedge event in amdgpu_device_gpu_recover() and add an
extra check to avoid triggering two events.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 698e5799e542..760a720c842e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -150,6 +150,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 			if (amdgpu_ring_sched_ready(ring))
 				drm_sched_start(&ring->sched, 0);
 			dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name);
+			drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
 			goto exit;
 		}
 		dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name);
--
2.48.1
On 25.02.25 at 02:02, André Almeida wrote:
> Instead of only triggering a wedged event for complete GPU resets,
> trigger for ring resets. Regardless of the reset, it's useful for
> userspace to know that it happened because the kernel will reject
> further submissions from that app.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

Sorry for the delay, have been on sick leave for nearly two weeks.

Regards,
Christian.

> ---
> v3: do only for ring resets, no soft recoveries
> v2: Keep the wedge event in amdgpu_device_gpu_recover() and add an
> extra check to avoid triggering two events.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 698e5799e542..760a720c842e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -150,6 +150,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>  			if (amdgpu_ring_sched_ready(ring))
>  				drm_sched_start(&ring->sched, 0);
>  			dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name);
> +			drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
>  			goto exit;
>  		}
>  		dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name);
Applied. Thanks

On Tue, Mar 4, 2025 at 4:29 AM Christian König <christian.koenig@amd.com> wrote:
>
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Sorry for the delay, have been on sick leave for nearly two weeks.
>
> Regards,
> Christian.
>
> [...]
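
For readers wondering how userspace might consume the event this patch emits, here is a minimal sketch in C. It assumes the wedged event arrives as a kernel uevent whose payload is a sequence of NUL-terminated KEY=value strings, and that DRM_WEDGE_RECOVERY_NONE is reported as "WEDGED=none", as described in the DRM device wedging uAPI documentation. The helper names find_wedged_value() and wedged_needs_no_recovery() are hypothetical and are not part of the patch or of any library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scan a uevent payload (NUL-terminated "KEY=value" strings laid out
 * back to back) for the WEDGED key.
 *
 * Returns a pointer to the NUL-terminated value inside buf, or NULL if
 * the key is absent or the payload is truncated.
 *
 * Assumption: the key is spelled "WEDGED", per the DRM uAPI docs.
 */
static const char *find_wedged_value(const char *buf, size_t len)
{
	static const char key[] = "WEDGED=";
	const size_t key_len = sizeof(key) - 1;
	size_t pos = 0;

	while (pos < len) {
		const char *entry = buf + pos;
		const char *end = memchr(entry, '\0', len - pos);

		/* Entry is not terminated inside the buffer: stop parsing. */
		if (end == NULL)
			break;

		if (strncmp(entry, key, key_len) == 0)
			return entry + key_len;

		/* Skip this entry and its terminating NUL. */
		pos += (size_t)(end - entry) + 1;
	}

	return NULL;
}

/*
 * Returns 1 when the event reports the "none" recovery method, i.e. the
 * device itself needs no rebind or bus reset, and 0 otherwise.
 *
 * Assumption: "none" is the string form of DRM_WEDGE_RECOVERY_NONE.
 */
static int wedged_needs_no_recovery(const char *buf, size_t len)
{
	const char *value = find_wedged_value(buf, len);

	return value != NULL && strcmp(value, "none") == 0;
}
```

A listener would typically read such payloads from a NETLINK_KOBJECT_UEVENT socket (or through libudev) and pass each received buffer and its length to wedged_needs_no_recovery(). For a ring reset as in this patch, an application could use a positive result as a hint to recreate its context rather than expect a device-level recovery.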