Enable software signaling, on debug builds, for the stub fence,
which is always signaled. Without software signaling enabled on
this fence, the AMD GPU scheduler triggers a GPU reset because
its cleanup activity times out.

Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
Changes in v1:
1- Address Christian's comment to remove the unnecessary callback.
2- Use CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
3- The patch number has also changed; previously it was
[PATCH 3/4].
---
drivers/dma-buf/dma-fence.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 066400ed8841..2378b12538c4 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
 
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+static bool __dma_fence_enable_signaling(struct dma_fence *fence);
+#endif
+
 /*
  * fence context counter: each execution context should have its own
  * fence context, this allows checking if fences belong to the same
@@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
 			       &dma_fence_stub_ops,
 			       &dma_fence_stub_lock,
 			       0, 0);
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+		__dma_fence_enable_signaling(&dma_fence_stub);
+#endif
 		dma_fence_signal_locked(&dma_fence_stub);
 	}
 	spin_unlock(&dma_fence_stub_lock);
--
2.25.1

On 05.09.22 at 18:35, Arvind Yadav wrote:
> Here's on debug enabling software signaling for the stub fence
> which is always signaled. This fence should enable software
> signaling otherwise the AMD GPU scheduler will cause a GPU reset
> due to a GPU scheduler cleanup activity timeout.
>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>
> Changes in v1 :
> 1- Addressing Christian's comment to remove unnecessary callback.
> 2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
> 3- The version of this patch is also changed and previously
> it was [PATCH 3/4]
>
> ---
>  drivers/dma-buf/dma-fence.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 066400ed8841..2378b12538c4 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
>  static DEFINE_SPINLOCK(dma_fence_stub_lock);
>  static struct dma_fence dma_fence_stub;
>
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +static bool __dma_fence_enable_signaling(struct dma_fence *fence);
> +#endif
> +

I would rename the function to something like
dma_fence_enable_signaling_locked().

And please don't add any #ifdef if it isn't absolutely necessary. This
makes the code pretty fragile.

>  /*
>   * fence context counter: each execution context should have its own
>   * fence context, this allows checking if fences belong to the same
> @@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
>  			       &dma_fence_stub_ops,
>  			       &dma_fence_stub_lock,
>  			       0, 0);
> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
> +		__dma_fence_enable_signaling(&dma_fence_stub);
> +#endif

Alternatively in this particular case you could just set the bit
manually here since this is part of the dma_fence code anyway.

Christian.

>  		dma_fence_signal_locked(&dma_fence_stub);
>  	}
>  	spin_unlock(&dma_fence_stub_lock);

On 9/6/2022 12:39 PM, Christian König wrote:
>
>
> On 05.09.22 at 18:35, Arvind Yadav wrote:
>> Here's on debug enabling software signaling for the stub fence
>> which is always signaled. This fence should enable software
>> signaling otherwise the AMD GPU scheduler will cause a GPU reset
>> due to a GPU scheduler cleanup activity timeout.
>>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>
>> Changes in v1 :
>> 1- Addressing Christian's comment to remove unnecessary callback.
>> 2- Replacing CONFIG_DEBUG_WW_MUTEX_SLOWPATH instead of CONFIG_DEBUG_FS.
>> 3- The version of this patch is also changed and previously
>> it was [PATCH 3/4]
>>
>> ---
>>  drivers/dma-buf/dma-fence.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index 066400ed8841..2378b12538c4 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -27,6 +27,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
>>  static DEFINE_SPINLOCK(dma_fence_stub_lock);
>>  static struct dma_fence dma_fence_stub;
>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>> +static bool __dma_fence_enable_signaling(struct dma_fence *fence);
>> +#endif
>> +
>
> I would rename the function to something like
> dma_fence_enable_signaling_locked().
>
> And please don't add any #ifdef if it isn't absolutely necessary. This
> makes the code pretty fragile.
>
>>  /*
>>   * fence context counter: each execution context should have its own
>>   * fence context, this allows checking if fences belong to the same
>> @@ -136,6 +140,9 @@ struct dma_fence *dma_fence_get_stub(void)
>>  			       &dma_fence_stub_ops,
>>  			       &dma_fence_stub_lock,
>>  			       0, 0);
>> +#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>> +		__dma_fence_enable_signaling(&dma_fence_stub);
>> +#endif
>
> Alternatively in this particular case you could just set the bit
> manually here since this is part of the dma_fence code anyway.
>
> Christian.
>

As per the review comment, I will set the bit manually.

~arvind

>>  		dma_fence_signal_locked(&dma_fence_stub);
>>  	}
>>  	spin_unlock(&dma_fence_stub_lock);
>
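
For readers following the thread, here is a rough userspace model of what
"set the bit manually" could look like. It is only a sketch and not kernel
code: every model_* name is hypothetical, the flag order is assumed to follow
enum dma_fence_flag_bits in <linux/dma-fence.h>, and locking is left out
entirely. In the kernel, the equivalent step would presumably be a set_bit()
on dma_fence_stub.flags just before dma_fence_signal_locked().

```c
/* Userspace model only: hypothetical model_* names, no locking, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror the order of enum dma_fence_flag_bits. */
enum model_fence_flag_bits {
	MODEL_FENCE_FLAG_SIGNALED_BIT,
	MODEL_FENCE_FLAG_TIMESTAMP_BIT,
	MODEL_FENCE_FLAG_ENABLE_SIGNAL_BIT,
};

struct model_fence {
	unsigned long flags;
	bool initialized;	/* stands in for the "!dma_fence_stub.ops" check */
};

static struct model_fence model_stub;

/* Non-atomic stand-in for the kernel's set_bit(). */
static inline void model_set_bit(int nr, unsigned long *addr)
{
	assert(nr >= 0 && (size_t)nr < sizeof(*addr) * 8);
	*addr |= 1UL << nr;
}

/* Non-atomic stand-in for the kernel's test_bit(). */
static inline bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* Models dma_fence_get_stub(): initialize once, then always return the stub. */
struct model_fence *model_fence_get_stub(void)
{
	if (!model_stub.initialized) {
		model_stub.flags = 0;
		model_stub.initialized = true;

		/* Set the bit manually: no #ifdef, no forward declaration. */
		model_set_bit(MODEL_FENCE_FLAG_ENABLE_SIGNAL_BIT,
			      &model_stub.flags);

		/* Stands in for dma_fence_signal_locked(). */
		model_set_bit(MODEL_FENCE_FLAG_SIGNALED_BIT,
			      &model_stub.flags);
	}
	return &model_stub;
}
```

Because the stub fence lives in the same file as the flag definitions, setting
the flag directly avoids both the forward declaration and the #ifdef that the
review objected to.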