If two fault IRQs arrive in quick succession, recovery work will be
queued twice.

When recovery runs a second time, it may end up killing an unrelated
context.

Prevent this by masking off interrupts when triggering recovery.
Signed-off-by: Antonino Maniscalco <antomani103@gmail.com>
---
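For illustration only, here is a minimal user-space sketch of the idea
described above. It is not driver code: struct sim_gpu, SIM_INT_FAULT and
the sim_* functions are hypothetical names invented for this note, and the
model assumes every delivered fault IRQ queues recovery once. It ignores
the real kthread worker semantics and only shows that clearing the
interrupt mask in the handler keeps a second back-to-back fault from
reaching the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names exist in the msm driver. */
#define SIM_INT_FAULT (1u << 0)

struct sim_gpu {
	uint32_t int_mask;              /* stands in for REG_A6XX_RBBM_INT_0_MASK */
	unsigned int recoveries_queued; /* times recovery work was queued */
};

/* Models the fault handler: optionally mask IRQs, then queue recovery. */
static void sim_fault_detect_irq(struct sim_gpu *gpu, bool mask_on_fault)
{
	if (mask_on_fault)
		gpu->int_mask = 0; /* assumption: 0 masks every interrupt source */

	gpu->recoveries_queued++;
}

/* Models the hardware raising a fault: the handler runs only if unmasked. */
static void sim_raise_fault(struct sim_gpu *gpu, bool mask_on_fault)
{
	assert(gpu);

	if (gpu->int_mask & SIM_INT_FAULT)
		sim_fault_detect_irq(gpu, mask_on_fault);
}

/* Raise two faults back to back; return how often recovery was queued. */
static unsigned int sim_double_fault(bool mask_on_fault)
{
	struct sim_gpu gpu = { .int_mask = SIM_INT_FAULT, .recoveries_queued = 0 };

	sim_raise_fault(&gpu, mask_on_fault);
	sim_raise_fault(&gpu, mask_on_fault);

	return gpu.recoveries_queued;
}
```

In this model, sim_double_fault(false) queues recovery twice, while
sim_double_fault(true) queues it once because the second fault is never
delivered.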
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 45dd5fd1c2bfcb0a01b71a326c7d95b0f9496d99..f8992a68df7fb77362273206859e696c1a52e02f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1727,6 +1727,9 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
 	/* Turn off the hangcheck timer to keep it from bothering us */
 	timer_delete(&gpu->hangcheck_timer);
 
+	/* Turn off interrupts to avoid triggering recovery again */
+	gpu_write(gpu, REG_A6XX_RBBM_INT_0_MASK, 0);
+
 	kthread_queue_work(gpu->worker, &gpu->recover_work);
 }
---
base-commit: ba0f4c4c0f9d0f90300578fc8d081f43be281a71
change-id: 20250821-recovery-fix-350c07a92f97
Best regards,
--
Antonino Maniscalco <antomani103@gmail.com>
On 8/21/2025 6:36 PM, Antonino Maniscalco wrote:
> If two fault IRQs arrive in short succession recovery work will be
> queued up twice.
>
> When recovery runs a second time it may end up killing an unrelated
> context.
>
> Prevent this by masking off interrupts when triggering recovery.
>
> Signed-off-by: Antonino Maniscalco <antomani103@gmail.com>

Reviewed-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 45dd5fd1c2bfcb0a01b71a326c7d95b0f9496d99..f8992a68df7fb77362273206859e696c1a52e02f 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1727,6 +1727,9 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
>  	/* Turn off the hangcheck timer to keep it from bothering us */
>  	timer_delete(&gpu->hangcheck_timer);
>
> +	/* Turn off interrupts to avoid triggering recovery again */
> +	gpu_write(gpu, REG_A6XX_RBBM_INT_0_MASK, 0);
> +
>  	kthread_queue_work(gpu->worker, &gpu->recover_work);
>  }
>
>
> ---
> base-commit: ba0f4c4c0f9d0f90300578fc8d081f43be281a71
> change-id: 20250821-recovery-fix-350c07a92f97
>
> Best regards,