[v4] Add a few tracepoints to panthor

[PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper

Posted by Nicolas Frattaroli 1 month, 3 weeks ago

Add a function to modify an IRQ's mask. If the IRQ is currently active,
it will write to the register, otherwise it will only set the struct
member.

There's no locking done to guarantee exclusion with the other two
functions that touch the IRQ mask, and it should only be called from a
context where the circumstances guarantee no concurrent access is
performed.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
 drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index f35e52b9546a..894d28b3eb02 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
 					 panthor_ ## __name ## _irq_threaded_handler,		\
 					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
 					 pirq);							\
+}												\
+												\
+static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask)	\
+{												\
+	pirq->mask = mask;									\
+	if (!atomic_read(&pirq->suspended))							\
+		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);			\
 }
 
 extern struct workqueue_struct *panthor_cleanup_wq;

-- 
2.52.0

Re: [PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper

Posted by Boris Brezillon 1 month, 3 weeks ago

On Wed, 17 Dec 2025 15:29:38 +0100
Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:

> Add a function to modify an IRQ's mask. If the IRQ is currently active,
> it will write to the register, otherwise it will only set the struct
> member.
> 
> There's no locking done to guarantee exclusion with the other two
> functions that touch the IRQ mask, and it should only be called from a
> context where the circumstances guarantee no concurrent access is
> performed.
> 
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index f35e52b9546a..894d28b3eb02 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
>  					 panthor_ ## __name ## _irq_threaded_handler,		\
>  					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
>  					 pirq);							\
> +}												\
> +												\
> +static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask)	\
> +{												\
> +	pirq->mask = mask;									\
> +	if (!atomic_read(&pirq->suspended))							\
> +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);			\

This is racy if called outside the (threaded) IRQ handler, which I
believe is the case since its called when a trace point is enabled:

1. _irq_raw_handler() sets INT_MASK to zero to avoid receiving
interrupts from this IRQ line until the threaded handler has processed
events

2. _irq_mask_set() sets INT_MASK to something non-zero, meaning the
interrupt will be re-enabled before events have been processed

this leads to at least one spurious interrupt being received before we
set INT_MASK to zero again. Probably not the end of the world, but if
we can avoid it, that'd be better.

Also, I'd like to see if we could re-purpose panthor_irq::mask to be
the mask of events the user wants to monitor instead of a pure proxy of
INT_MASK. If we do that, and we make panthor_irq::mask an atomic_t, we
can add panthor_xxx_irq_{enable,disable}_event() helpers that would do
the atomic_{or,and} on panthor_irq::mask, and write the new value to
_INT_MASK if:
- we're processing events in the threaded handler (we would need
  another field, or we'd need to turn suspended into a state that can
  encode more than just "suspended or not")
- the device is not suspended (that test you already have)

>  }
>  
>  extern struct workqueue_struct *panthor_cleanup_wq;
>

Re: [PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper

Posted by Nicolas Frattaroli 1 month, 3 weeks ago

On Thursday, 18 December 2025 13:38:25 Central European Standard Time Boris Brezillon wrote:
> On Wed, 17 Dec 2025 15:29:38 +0100
> Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:
> 
> > Add a function to modify an IRQ's mask. If the IRQ is currently active,
> > it will write to the register, otherwise it will only set the struct
> > member.
> > 
> > There's no locking done to guarantee exclusion with the other two
> > functions that touch the IRQ mask, and it should only be called from a
> > context where the circumstances guarantee no concurrent access is
> > performed.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > index f35e52b9546a..894d28b3eb02 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > @@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
> >  					 panthor_ ## __name ## _irq_threaded_handler,		\
> >  					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
> >  					 pirq);							\
> > +}												\
> > +												\
> > +static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask)	\
> > +{												\
> > +	pirq->mask = mask;									\
> > +	if (!atomic_read(&pirq->suspended))							\
> > +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);			\
> 
> This is racy if called outside the (threaded) IRQ handler, which I
> believe is the case since its called when a trace point is enabled:
> 
> 1. _irq_raw_handler() sets INT_MASK to zero to avoid receiving
> interrupts from this IRQ line until the threaded handler has processed
> events
> 
> 2. _irq_mask_set() sets INT_MASK to something non-zero, meaning the
> interrupt will be re-enabled before events have been processed

Good catch, I only considered the case where the irq mask is modified
by a PM runtime suspend/resume, not by the actual handler functions
itself.

> this leads to at least one spurious interrupt being received before we
> set INT_MASK to zero again. Probably not the end of the world, but if
> we can avoid it, that'd be better.
> 
> Also, I'd like to see if we could re-purpose panthor_irq::mask to be
> the mask of events the user wants to monitor instead of a pure proxy of
> INT_MASK. If we do that, and we make panthor_irq::mask an atomic_t,

Yeah, I was wondering why panthor_irq::mask changed on IRQ suspend
and resume, since that information is already stored in the
"suspended" atomic.

> we can add panthor_xxx_irq_{enable,disable}_event() helpers that would
> do the atomic_{or,and} on panthor_irq::mask, and write the new value
> to _INT_MASK if:
> - we're processing events in the threaded handler (we would need
>   another field, or we'd need to turn suspended into a state that can
>   encode more than just "suspended or not")

Right, I assume synchronize_irq will not really fix the race,
since a single synchronization point does not a mutually exclusive
section make.

I thought about suspending IRQs, synchronising, then changing the
mask, then re-enabling them, but that feels potentially disruptive.
Though I may be misunderstanding the hardware here; if a bit is
disabled in the INT_MASK and the hardware would want to fire such
an interrupt, does it discard it? Because in that case, any solution
where we suspend IRQs -> wait for threaded handler to finish ->
modify mask -> resume IRQs could lead to us missing out on critical
knowledge, like that a GPU fault occurred.

In a perfect world, this hardware would use a register access
pattern that allows for race-free updates of non-overlapping bits,
like FIELD_PREP_WM16 implements. ;) But we do not live in such a
world.

> - the device is not suspended (that test you already have)
> 
> >  }
> >  
> >  extern struct workqueue_struct *panthor_cleanup_wq;
> > 
> 
> 

Thanks for the review, I'll try to come up with a solution that
won't potentially blow our legs off.

Kind regards,
Nicolas Frattaroli

Re: [PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper

Posted by Boris Brezillon 1 month, 3 weeks ago

On Thu, 18 Dec 2025 14:42:42 +0100
Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:

> On Thursday, 18 December 2025 13:38:25 Central European Standard Time Boris Brezillon wrote:
> > On Wed, 17 Dec 2025 15:29:38 +0100
> > Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:
> >   
> > > Add a function to modify an IRQ's mask. If the IRQ is currently active,
> > > it will write to the register, otherwise it will only set the struct
> > > member.
> > > 
> > > There's no locking done to guarantee exclusion with the other two
> > > functions that touch the IRQ mask, and it should only be called from a
> > > context where the circumstances guarantee no concurrent access is
> > > performed.
> > > 
> > > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > > ---
> > >  drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > > index f35e52b9546a..894d28b3eb02 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > > @@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
> > >  					 panthor_ ## __name ## _irq_threaded_handler,		\
> > >  					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
> > >  					 pirq);							\
> > > +}												\
> > > +												\
> > > +static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask)	\
> > > +{												\
> > > +	pirq->mask = mask;									\
> > > +	if (!atomic_read(&pirq->suspended))							\
> > > +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);			\  
> > 
> > This is racy if called outside the (threaded) IRQ handler, which I
> > believe is the case since its called when a trace point is enabled:
> > 
> > 1. _irq_raw_handler() sets INT_MASK to zero to avoid receiving
> > interrupts from this IRQ line until the threaded handler has processed
> > events
> > 
> > 2. _irq_mask_set() sets INT_MASK to something non-zero, meaning the
> > interrupt will be re-enabled before events have been processed  
> 
> Good catch, I only considered the case where the irq mask is modified
> by a PM runtime suspend/resume, not by the actual handler functions
> itself.
> 
> > this leads to at least one spurious interrupt being received before we
> > set INT_MASK to zero again. Probably not the end of the world, but if
> > we can avoid it, that'd be better.
> > 
> > Also, I'd like to see if we could re-purpose panthor_irq::mask to be
> > the mask of events the user wants to monitor instead of a pure proxy of
> > INT_MASK. If we do that, and we make panthor_irq::mask an atomic_t,  
> 
> Yeah, I was wondering why panthor_irq::mask changed on IRQ suspend
> and resume, since that information is already stored in the
> "suspended" atomic.

That's actually done on purpose, to avoid the threaded handler from
re-enabling the interrupts if it's still running when irq_suspend() is
called, since suspended it set to true only after the synchronize_irq()
(there a reason for that, but I don't remember it :D).

Now, if we re-purpose the suspended into a multi-state value, we can go:

	SUSPENDED => IRQ is suspended/disabled
	IDLE => active state, but no IRQ being processed
	PROCESSING => active state, the threaded handler is on its way
	SUSPENDING => about to be suspended (should be set to that at
	the beginning of irq_suspend()

> 
> > we can add panthor_xxx_irq_{enable,disable}_event() helpers that would
> > do the atomic_{or,and} on panthor_irq::mask, and write the new value
> > to _INT_MASK if:
> > - we're processing events in the threaded handler (we would need
> >   another field, or we'd need to turn suspended into a state that can
> >   encode more than just "suspended or not")  
> 
> Right, I assume synchronize_irq will not really fix the race,
> since a single synchronization point does not a mutually exclusive
> section make.

Correct.

> 
> I thought about suspending IRQs, synchronising, then changing the
> mask, then re-enabling them, but that feels potentially disruptive.

Yeah, that's a bit heavy, especially since most of the time we're going
to hit the case where the update can go in directly without impacting
the rest of the driver.

> Though I may be misunderstanding the hardware here; if a bit is
> disabled in the INT_MASK and the hardware would want to fire such
> an interrupt, does it discard it? Because in that case, any solution
> where we suspend IRQs -> wait for threaded handler to finish ->
> modify mask -> resume IRQs could lead to us missing out on critical
> knowledge, like that a GPU fault occurred.

That too.

> 
> In a perfect world, this hardware would use a register access
> pattern that allows for race-free updates of non-overlapping bits,
> like FIELD_PREP_WM16 implements. ;) But we do not live in such a
> world.

Actually, it does support that for STATUS (there's a separate CLEAR
register), but the MASK is not something you're supposed to modify
concurrently at a high rate, so it's not surprising we don't have the
same for INT_MASK. Also, atomic updates is something we can ensure at
a higher level, and the write to INT_MASK itself is atomic.

[PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper
[PATCH v4 2/3] drm/panthor: Add tracepoint for hardware utilisation changes
[PATCH v4 3/3] drm/panthor: Add gpu_job_irq tracepoint