From nobody Tue Jun 16 20:39:30 2026
From: Boris Brezillon
Date: Wed, 29 Apr 2026 11:38:28 +0200
Subject: [PATCH 01/10] drm/panthor: Make panthor_irq::state a non-atomic field
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20260429-panthor-signal-from-irq-v1-1-4b92ae4142d2@collabora.com>
References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon
X-Mailer: b4 0.14.3

The only place where panthor_irq::state is accessed without panthor_irq::mask_lock held is in the prologue
of _irq_suspend(), which is not really a fast-path. So let's simplify things by assuming panthor_irq::state must always be accessed with the mask_lock held, and add a scoped_guard() in _irq_suspend(). Signed-off-by: Boris Brezillon Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.h | 35 ++++++++++++++++------------= ---- 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/pan= thor/panthor_device.h index 4e4607bca7cc..3f91ba73829d 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -101,8 +101,12 @@ struct panthor_irq { */ spinlock_t mask_lock; =20 - /** @state: one of &enum panthor_irq_state reflecting the current state. = */ - atomic_t state; + /** + * @state: one of &enum panthor_irq_state reflecting the current state. + * + * Must be accessed with mask_lock held. + */ + enum panthor_irq_state state; }; =20 /** @@ -510,18 +514,15 @@ const char *panthor_exception_name(struct panthor_dev= ice *ptdev, static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *d= ata) \ { \ struct panthor_irq *pirq =3D data; \ - enum panthor_irq_state old_state; \ \ if (!gpu_read(pirq->iomem, INT_STAT)) \ return IRQ_NONE; \ \ guard(spinlock_irqsave)(&pirq->mask_lock); \ - old_state =3D atomic_cmpxchg(&pirq->state, \ - PANTHOR_IRQ_STATE_ACTIVE, \ - PANTHOR_IRQ_STATE_PROCESSING); \ - if (old_state !=3D PANTHOR_IRQ_STATE_ACTIVE) \ + if (pirq->state !=3D PANTHOR_IRQ_STATE_ACTIVE) \ return IRQ_NONE; \ \ + pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; \ gpu_write(pirq->iomem, INT_MASK, 0); \ return IRQ_WAKE_THREAD; \ } \ @@ -551,13 +552,10 @@ static irqreturn_t panthor_ ## __name ## _irq_threade= d_handler(int irq, void *da } \ \ scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \ - enum panthor_irq_state old_state; \ - \ - old_state =3D atomic_cmpxchg(&pirq->state, \ - PANTHOR_IRQ_STATE_PROCESSING, \ - PANTHOR_IRQ_STATE_ACTIVE); \ - 
if (old_state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) \ + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) { \ + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; \ gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ + } \ } \ \ return ret; \ @@ -566,18 +564,19 @@ static irqreturn_t panthor_ ## __name ## _irq_threade= d_handler(int irq, void *da static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *= pirq) \ { \ scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \ - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDING); \ + pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDING; \ gpu_write(pirq->iomem, INT_MASK, 0); \ } \ synchronize_irq(pirq->irq); \ - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_SUSPENDED); \ + scoped_guard(spinlock_irqsave, &pirq->mask_lock) \ + pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDED; \ } \ \ static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *p= irq) \ { \ guard(spinlock_irqsave)(&pirq->mask_lock); \ \ - atomic_set(&pirq->state, PANTHOR_IRQ_STATE_ACTIVE); \ + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; \ gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \ gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ } \ @@ -610,7 +609,7 @@ static inline void panthor_ ## __name ## _irq_enable_ev= ents(struct panthor_irq * * on the PROCESSING -> ACTIVE transition. \ * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. \ */ \ - if (atomic_read(&pirq->state) =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ } \ \ @@ -624,7 +623,7 @@ static inline void panthor_ ## __name ## _irq_disable_e= vents(struct panthor_irq * on the PROCESSING -> ACTIVE transition. \ * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. 
\ */ \ - if (atomic_read(&pirq->state) =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ } =20 --=20 2.53.0

From nobody Tue Jun 16 20:39:30 2026
From: Boris Brezillon
Date: Wed, 29 Apr 2026 11:38:29 +0200
Subject: [PATCH 02/10] drm/panthor: Move the register accessors before the IRQ helpers
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20260429-panthor-signal-from-irq-v1-2-4b92ae4142d2@collabora.com>
References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon
X-Mailer: b4 0.14.3

We're about to add an IRQ inline helper using gpu_read(). Move things around to avoid forward declarations. No functional changes. Signed-off-by: Boris Brezillon Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.h | 142 +++++++++++++++------------= ---- 1 file changed, 71 insertions(+), 71 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/pan= thor/panthor_device.h index 3f91ba73829d..768fc1992368 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -495,6 +495,77 @@ panthor_exception_is_fault(u32 exception_code) const char *panthor_exception_name(struct panthor_device *ptdev, u32 exception_code); =20 +static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data) +{ + writel(data, iomem + reg); +} + +static inline u32 gpu_read(void __iomem *iomem, u32 reg) +{ + return readl(iomem + reg); +} + +static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg) +{ + return readl_relaxed(iomem + reg); +} + +static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data) +{ + gpu_write(iomem, reg, lower_32_bits(data)); + gpu_write(iomem, reg + 4, upper_32_bits(data)); +} + +static inline u64 gpu_read64(void __iomem *iomem, u32 reg) +{ + return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32)); +} + +static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg) +{ + return (gpu_read_relaxed(iomem, reg) | + ((u64)gpu_read_relaxed(iomem, reg + 4) << 32)); +} + +static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg) +{ + u32 lo, hi1, hi2; + do { + hi1 =3D gpu_read(iomem, reg + 4); + lo =3D gpu_read(iomem, reg); + hi2 =3D gpu_read(iomem, reg + 4); + } while (hi1 !=3D hi2); + return lo | ((u64)hi2 << 32); +} + +#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)= \ + read_poll_timeout(gpu_read, val,
cond, delay_us, timeout_us, false, \ + iomem, reg) + +#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \ + timeout_us) \ + read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \ + false, iomem, reg) + +#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_u= s) \ + read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \ + iomem, reg) + +#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \ + timeout_us) \ + read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \ + false, iomem, reg) + +#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_= us, \ + timeout_us) \ + read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \ + timeout_us, false, iomem, reg) + +#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \ + timeout_us) \ + read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \ + false, iomem, reg) + #define INT_RAWSTAT 0x0 #define INT_CLEAR 0x4 #define INT_MASK 0x8 @@ -629,75 +700,4 @@ static inline void panthor_ ## __name ## _irq_disable_= events(struct panthor_irq =20 extern struct workqueue_struct *panthor_cleanup_wq; =20 -static inline void gpu_write(void __iomem *iomem, u32 reg, u32 data) -{ - writel(data, iomem + reg); -} - -static inline u32 gpu_read(void __iomem *iomem, u32 reg) -{ - return readl(iomem + reg); -} - -static inline u32 gpu_read_relaxed(void __iomem *iomem, u32 reg) -{ - return readl_relaxed(iomem + reg); -} - -static inline void gpu_write64(void __iomem *iomem, u32 reg, u64 data) -{ - gpu_write(iomem, reg, lower_32_bits(data)); - gpu_write(iomem, reg + 4, upper_32_bits(data)); -} - -static inline u64 gpu_read64(void __iomem *iomem, u32 reg) -{ - return (gpu_read(iomem, reg) | ((u64)gpu_read(iomem, reg + 4) << 32)); -} - -static inline u64 gpu_read64_relaxed(void __iomem *iomem, u32 reg) -{ - return (gpu_read_relaxed(iomem, reg) | - ((u64)gpu_read_relaxed(iomem, reg + 
4) << 32)); -} - -static inline u64 gpu_read64_counter(void __iomem *iomem, u32 reg) -{ - u32 lo, hi1, hi2; - do { - hi1 =3D gpu_read(iomem, reg + 4); - lo =3D gpu_read(iomem, reg); - hi2 =3D gpu_read(iomem, reg + 4); - } while (hi1 !=3D hi2); - return lo | ((u64)hi2 << 32); -} - -#define gpu_read_poll_timeout(iomem, reg, val, cond, delay_us, timeout_us)= \ - read_poll_timeout(gpu_read, val, cond, delay_us, timeout_us, false, \ - iomem, reg) - -#define gpu_read_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \ - timeout_us) \ - read_poll_timeout_atomic(gpu_read, val, cond, delay_us, timeout_us, \ - false, iomem, reg) - -#define gpu_read64_poll_timeout(iomem, reg, val, cond, delay_us, timeout_u= s) \ - read_poll_timeout(gpu_read64, val, cond, delay_us, timeout_us, false, \ - iomem, reg) - -#define gpu_read64_poll_timeout_atomic(iomem, reg, val, cond, delay_us, \ - timeout_us) \ - read_poll_timeout_atomic(gpu_read64, val, cond, delay_us, timeout_us, \ - false, iomem, reg) - -#define gpu_read_relaxed_poll_timeout_atomic(iomem, reg, val, cond, delay_= us, \ - timeout_us) \ - read_poll_timeout_atomic(gpu_read_relaxed, val, cond, delay_us, \ - timeout_us, false, iomem, reg) - -#define gpu_read64_relaxed_poll_timeout(iomem, reg, val, cond, delay_us, \ - timeout_us) \ - read_poll_timeout(gpu_read64_relaxed, val, cond, delay_us, timeout_us, \ - false, iomem, reg) - #endif --=20 2.53.0

From nobody Tue Jun 16 20:39:30 2026
From: Boris Brezillon
Date: Wed, 29 Apr 2026 11:38:30 +0200
Subject: [PATCH 03/10] drm/panthor: Replace the panthor_irq macro machinery by inline helpers
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20260429-panthor-signal-from-irq-v1-3-4b92ae4142d2@collabora.com>
References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com>
To: Steven Price, Liviu Dudau
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon
X-Mailer: b4 0.14.3

Now that panthor_irq contains the iomem region, there's no real need for the macro-based panthor_irq helper generation logic. We can just provide inline helpers that do the same and let the compiler optimize indirect function calls. The only extra annoyance is the fact we have to open-code the panthor_xxx_irq_threaded_handler() implementation, but those are single-line functions, so it's acceptable.
While at it, we changed the prototype of the IRQ handlers to take a panthor_irq instead of panthor_device, since that's the thing that's passed around when it comes to panthor_irq, and the panthor_device can be directly extracted from there. Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_device.h | 245 +++++++++++++++------------= ---- drivers/gpu/drm/panthor/panthor_fw.c | 22 ++- drivers/gpu/drm/panthor/panthor_gpu.c | 26 ++-- drivers/gpu/drm/panthor/panthor_mmu.c | 37 ++--- drivers/gpu/drm/panthor/panthor_pwr.c | 20 ++- 5 files changed, 183 insertions(+), 167 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/pan= thor/panthor_device.h index 768fc1992368..afa202546316 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -571,131 +571,126 @@ static inline u64 gpu_read64_counter(void __iomem *= iomem, u32 reg) #define INT_MASK 0x8 #define INT_STAT 0xc =20 -/** - * PANTHOR_IRQ_HANDLER() - Define interrupt handlers and the interrupt - * registration function. - * - * The boiler-plate to gracefully deal with shared interrupts is - * auto-generated. All you have to do is call PANTHOR_IRQ_HANDLER() - * just after the actual handler. 
The handler prototype is: - * - * void (*handler)(struct panthor_device *, u32 status); - */ -#define PANTHOR_IRQ_HANDLER(__name, __handler) \ -static irqreturn_t panthor_ ## __name ## _irq_raw_handler(int irq, void *d= ata) \ -{ \ - struct panthor_irq *pirq =3D data; \ - \ - if (!gpu_read(pirq->iomem, INT_STAT)) \ - return IRQ_NONE; \ - \ - guard(spinlock_irqsave)(&pirq->mask_lock); \ - if (pirq->state !=3D PANTHOR_IRQ_STATE_ACTIVE) \ - return IRQ_NONE; \ - \ - pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; \ - gpu_write(pirq->iomem, INT_MASK, 0); \ - return IRQ_WAKE_THREAD; \ -} \ - \ -static irqreturn_t panthor_ ## __name ## _irq_threaded_handler(int irq, vo= id *data) \ -{ \ - struct panthor_irq *pirq =3D data; \ - struct panthor_device *ptdev =3D pirq->ptdev; \ - irqreturn_t ret =3D IRQ_NONE; \ - \ - while (true) { \ - /* It's safe to access pirq->mask without the lock held here. If a new \ - * event gets added to the mask and the corresponding IRQ is pending, \ - * we'll process it right away instead of adding an extra raw -> threade= d \ - * round trip. If an event is removed and the status bit is set, it will= \ - * be ignored, just like it would have been if the mask had been adjuste= d \ - * right before the HW event kicks in. TLDR; it's all expected races we'= re \ - * covered for. 
\ - */ \ - u32 status =3D gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask; \ - \ - if (!status) \ - break; \ - \ - __handler(ptdev, status); \ - ret =3D IRQ_HANDLED; \ - } \ - \ - scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \ - if (pirq->state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) { \ - pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; \ - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ - } \ - } \ - \ - return ret; \ -} \ - \ -static inline void panthor_ ## __name ## _irq_suspend(struct panthor_irq *= pirq) \ -{ \ - scoped_guard(spinlock_irqsave, &pirq->mask_lock) { \ - pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDING; \ - gpu_write(pirq->iomem, INT_MASK, 0); \ - } \ - synchronize_irq(pirq->irq); \ - scoped_guard(spinlock_irqsave, &pirq->mask_lock) \ - pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDED; \ -} \ - \ -static inline void panthor_ ## __name ## _irq_resume(struct panthor_irq *p= irq) \ -{ \ - guard(spinlock_irqsave)(&pirq->mask_lock); \ - \ - pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; \ - gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); \ - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ -} \ - \ -static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev= , \ - struct panthor_irq *pirq, \ - int irq, u32 mask, void __iomem *iomem) \ -{ \ - pirq->ptdev =3D ptdev; \ - pirq->irq =3D irq; \ - pirq->mask =3D mask; \ - pirq->iomem =3D iomem; \ - spin_lock_init(&pirq->mask_lock); \ - panthor_ ## __name ## _irq_resume(pirq); \ - \ - return devm_request_threaded_irq(ptdev->base.dev, irq, \ - panthor_ ## __name ## _irq_raw_handler, \ - panthor_ ## __name ## _irq_threaded_handler, \ - IRQF_SHARED, KBUILD_MODNAME "-" # __name, \ - pirq); \ -} \ - \ -static inline void panthor_ ## __name ## _irq_enable_events(struct panthor= _irq *pirq, u32 mask) \ -{ \ - guard(spinlock_irqsave)(&pirq->mask_lock); \ - pirq->mask |=3D mask; \ - \ - /* The only situation where we need to write the new mask is if the IRQ i= s active. 
\ - * If it's being processed, the mask will be restored for us in _irq_thre= aded_handler() \ - * on the PROCESSING -> ACTIVE transition. \ - * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. \ - */ \ - if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ -} \ - \ -static inline void panthor_ ## __name ## _irq_disable_events(struct pantho= r_irq *pirq, u32 mask)\ -{ \ - guard(spinlock_irqsave)(&pirq->mask_lock); \ - pirq->mask &=3D ~mask; \ - \ - /* The only situation where we need to write the new mask is if the IRQ i= s active. \ - * If it's being processed, the mask will be restored for us in _irq_thre= aded_handler() \ - * on the PROCESSING -> ACTIVE transition. \ - * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. \ - */ \ - if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) \ - gpu_write(pirq->iomem, INT_MASK, pirq->mask); \ +static inline irqreturn_t panthor_irq_default_raw_handler(int irq, void *d= ata) +{ + struct panthor_irq *pirq =3D data; + + if (!gpu_read(pirq->iomem, INT_STAT)) + return IRQ_NONE; + + guard(spinlock_irqsave)(&pirq->mask_lock); + if (pirq->state !=3D PANTHOR_IRQ_STATE_ACTIVE) + return IRQ_NONE; + + pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; + gpu_write(pirq->iomem, INT_MASK, 0); + return IRQ_WAKE_THREAD; +} + +static inline irqreturn_t +panthor_irq_default_threaded_handler(void *data, + void (*slow_handler)(struct panthor_irq *, u32)) +{ + struct panthor_irq *pirq =3D data; + irqreturn_t ret =3D IRQ_NONE; + + while (true) { + /* It's safe to access pirq->mask without the lock held here. If a new + * event gets added to the mask and the corresponding IRQ is pending, + * we'll process it right away instead of adding an extra raw -> threaded + * round trip. If an event is removed and the status bit is set, it will + * be ignored, just like it would have been if the mask had been adjusted + * right before the HW event kicks in. 
TLDR; it's all expected races we'= re + * covered for. + */ + u32 status =3D gpu_read(pirq->iomem, INT_RAWSTAT) & pirq->mask; + + if (!status) + break; + + slow_handler(pirq, status); + ret =3D IRQ_HANDLED; + } + + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) { + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; + gpu_write(pirq->iomem, INT_MASK, pirq->mask); + } + } + + return ret; +} + +static inline void panthor_irq_suspend(struct panthor_irq *pirq) +{ + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDING; + gpu_write(pirq->iomem, INT_MASK, 0); + } + synchronize_irq(pirq->irq); + scoped_guard(spinlock_irqsave, &pirq->mask_lock) + pirq->state =3D PANTHOR_IRQ_STATE_SUSPENDED; +} + +static inline void panthor_irq_resume(struct panthor_irq *pirq) +{ + guard(spinlock_irqsave)(&pirq->mask_lock); + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; + gpu_write(pirq->iomem, INT_CLEAR, pirq->mask); + gpu_write(pirq->iomem, INT_MASK, pirq->mask); +} + +static inline void panthor_irq_enable_events(struct panthor_irq *pirq, u32= mask) +{ + guard(spinlock_irqsave)(&pirq->mask_lock); + pirq->mask |=3D mask; + + /* The only situation where we need to write the new mask is if the IRQ i= s active. + * If it's being processed, the mask will be restored for us in _irq_thre= aded_handler() + * on the PROCESSING -> ACTIVE transition. + * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. + */ + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) + gpu_write(pirq->iomem, INT_MASK, pirq->mask); +} + +static inline void panthor_irq_disable_events(struct panthor_irq *pirq, u3= 2 mask) +{ + guard(spinlock_irqsave)(&pirq->mask_lock); + pirq->mask &=3D ~mask; + + /* The only situation where we need to write the new mask is if the IRQ i= s active. + * If it's being processed, the mask will be restored for us in _irq_thre= aded_handler() + * on the PROCESSING -> ACTIVE transition. 
+ * If the IRQ is suspended/suspending, the mask is restored at resume tim= e. + */ + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_ACTIVE) + gpu_write(pirq->iomem, INT_MASK, pirq->mask); +} + +static inline int +panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq, + int irq, u32 mask, void __iomem *iomem, const char *name, + irqreturn_t (*threaded_handler)(int, void *data)) +{ + const char *full_name; + + pirq->ptdev =3D ptdev; + pirq->irq =3D irq; + pirq->mask =3D mask; + pirq->iomem =3D iomem; + spin_lock_init(&pirq->mask_lock); + panthor_irq_resume(pirq); + + full_name =3D devm_kasprintf(ptdev->base.dev, GFP_KERNEL, KBUILD_MODNAME = "-%s", name); + if (!full_name) + return -ENOMEM; + + return devm_request_threaded_irq(ptdev->base.dev, irq, + panthor_irq_default_raw_handler, + threaded_handler, + IRQF_SHARED, full_name, pirq); } =20 extern struct workqueue_struct *panthor_cleanup_wq; diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor= /panthor_fw.c index 986151681b24..eaf599b0a887 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -1064,8 +1064,9 @@ static void panthor_fw_init_global_iface(struct panth= or_device *ptdev) msecs_to_jiffies(PING_INTERVAL_MS)); } =20 -static void panthor_job_irq_handler(struct panthor_device *ptdev, u32 stat= us) +static void panthor_job_irq_handler(struct panthor_irq *pirq, u32 status) { + struct panthor_device *ptdev =3D pirq->ptdev; u32 duration; u64 start =3D 0; =20 @@ -1091,7 +1092,11 @@ static void panthor_job_irq_handler(struct panthor_d= evice *ptdev, u32 status) trace_gpu_job_irq(ptdev->base.dev, status, duration); } } -PANTHOR_IRQ_HANDLER(job, panthor_job_irq_handler); + +static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data) +{ + return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler= ); +} =20 static int panthor_fw_start(struct panthor_device *ptdev) { @@ -1099,8 +1104,8 @@ static int 
panthor_fw_start(struct panthor_device *pt= dev) bool timedout =3D false; =20 ptdev->fw->booted =3D false; - panthor_job_irq_enable_events(&ptdev->fw->irq, ~0); - panthor_job_irq_resume(&ptdev->fw->irq); + panthor_irq_enable_events(&ptdev->fw->irq, ~0); + panthor_irq_resume(&ptdev->fw->irq); gpu_write(fw->iomem, MCU_CONTROL, MCU_CONTROL_AUTO); =20 if (!wait_event_timeout(ptdev->fw->req_waitqueue, @@ -1210,7 +1215,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptde= v, bool on_hang) ptdev->reset.fast =3D true; } =20 - panthor_job_irq_suspend(&ptdev->fw->irq); + panthor_irq_suspend(&ptdev->fw->irq); panthor_fw_stop(ptdev); } =20 @@ -1280,7 +1285,7 @@ void panthor_fw_unplug(struct panthor_device *ptdev) if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) { /* Make sure the IRQ handler cannot be called after that point. */ if (ptdev->fw->irq.irq) - panthor_job_irq_suspend(&ptdev->fw->irq); + panthor_irq_suspend(&ptdev->fw->irq); =20 panthor_fw_stop(ptdev); } @@ -1476,8 +1481,9 @@ int panthor_fw_init(struct panthor_device *ptdev) if (irq <=3D 0) return -ENODEV; =20 - ret =3D panthor_request_job_irq(ptdev, &fw->irq, irq, 0, - ptdev->iomem + JOB_INT_BASE); + ret =3D panthor_irq_request(ptdev, &fw->irq, irq, 0, + ptdev->iomem + JOB_INT_BASE, "job", + panthor_job_irq_threaded_handler); if (ret) { drm_err(&ptdev->base, "failed to request job irq"); return ret; diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/pantho= r/panthor_gpu.c index e52c5675981f..ce208e384762 100644 --- a/drivers/gpu/drm/panthor/panthor_gpu.c +++ b/drivers/gpu/drm/panthor/panthor_gpu.c @@ -86,8 +86,9 @@ static void panthor_gpu_l2_config_set(struct panthor_devi= ce *ptdev) gpu_write(gpu->iomem, GPU_L2_CONFIG, l2_config); } =20 -static void panthor_gpu_irq_handler(struct panthor_device *ptdev, u32 stat= us) +static void panthor_gpu_irq_handler(struct panthor_irq *pirq, u32 status) { + struct panthor_device *ptdev =3D pirq->ptdev; struct panthor_gpu *gpu =3D 
ptdev->gpu; =20 gpu_write(gpu->irq.iomem, INT_CLEAR, status); @@ -116,7 +117,11 @@ static void panthor_gpu_irq_handler(struct panthor_dev= ice *ptdev, u32 status) } spin_unlock(&ptdev->gpu->reqs_lock); } -PANTHOR_IRQ_HANDLER(gpu, panthor_gpu_irq_handler); + +static irqreturn_t panthor_gpu_irq_threaded_handler(int irq, void *data) +{ + return panthor_irq_default_threaded_handler(data, panthor_gpu_irq_handler= ); +} =20 /** * panthor_gpu_unplug() - Called when the GPU is unplugged. @@ -128,7 +133,7 @@ void panthor_gpu_unplug(struct panthor_device *ptdev) =20 /* Make sure the IRQ handler is not running after that point. */ if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) - panthor_gpu_irq_suspend(&ptdev->gpu->irq); + panthor_irq_suspend(&ptdev->gpu->irq); =20 /* Wake-up all waiters. */ spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags); @@ -169,9 +174,10 @@ int panthor_gpu_init(struct panthor_device *ptdev) if (irq < 0) return irq; =20 - ret =3D panthor_request_gpu_irq(ptdev, &ptdev->gpu->irq, irq, - GPU_INTERRUPTS_MASK, - ptdev->iomem + GPU_INT_BASE); + ret =3D panthor_irq_request(ptdev, &ptdev->gpu->irq, irq, + GPU_INTERRUPTS_MASK, + ptdev->iomem + GPU_INT_BASE, "gpu", + panthor_gpu_irq_threaded_handler); if (ret) return ret; =20 @@ -182,7 +188,7 @@ int panthor_gpu_power_changed_on(struct panthor_device = *ptdev) { guard(pm_runtime_active)(ptdev->base.dev); =20 - panthor_gpu_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK= ); + panthor_irq_enable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK); =20 return 0; } @@ -191,7 +197,7 @@ void panthor_gpu_power_changed_off(struct panthor_devic= e *ptdev) { guard(pm_runtime_active)(ptdev->base.dev); =20 - panthor_gpu_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MAS= K); + panthor_irq_disable_events(&ptdev->gpu->irq, GPU_POWER_INTERRUPTS_MASK); } =20 /** @@ -424,7 +430,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev) else panthor_hw_l2_power_off(ptdev); =20 - 
panthor_gpu_irq_suspend(&ptdev->gpu->irq); + panthor_irq_suspend(&ptdev->gpu->irq); } =20 /** @@ -436,7 +442,7 @@ void panthor_gpu_suspend(struct panthor_device *ptdev) */ void panthor_gpu_resume(struct panthor_device *ptdev) { - panthor_gpu_irq_resume(&ptdev->gpu->irq); + panthor_irq_resume(&ptdev->gpu->irq); panthor_hw_l2_power_on(ptdev); } =20 diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/pantho= r/panthor_mmu.c index a7ee14986849..a0d0a9b2926f 100644 --- a/drivers/gpu/drm/panthor/panthor_mmu.c +++ b/drivers/gpu/drm/panthor/panthor_mmu.c @@ -586,17 +586,13 @@ static u32 panthor_mmu_as_fault_mask(struct panthor_d= evice *ptdev, u32 as) return BIT(as); } =20 -/* Forward declaration to call helpers within as_enable/disable */ -static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 stat= us); -PANTHOR_IRQ_HANDLER(mmu, panthor_mmu_irq_handler); - static int panthor_mmu_as_enable(struct panthor_device *ptdev, u32 as_nr, u64 transtab, u64 transcfg, u64 memattr) { struct panthor_mmu *mmu =3D ptdev->mmu; =20 - panthor_mmu_irq_enable_events(&ptdev->mmu->irq, - panthor_mmu_as_fault_mask(ptdev, as_nr)); + panthor_irq_enable_events(&ptdev->mmu->irq, + panthor_mmu_as_fault_mask(ptdev, as_nr)); =20 gpu_write64(mmu->iomem, AS_TRANSTAB(as_nr), transtab); gpu_write64(mmu->iomem, AS_MEMATTR(as_nr), memattr); @@ -614,8 +610,8 @@ static int panthor_mmu_as_disable(struct panthor_device= *ptdev, u32 as_nr, =20 lockdep_assert_held(&ptdev->mmu->as.slots_lock); =20 - panthor_mmu_irq_disable_events(&ptdev->mmu->irq, - panthor_mmu_as_fault_mask(ptdev, as_nr)); + panthor_irq_disable_events(&ptdev->mmu->irq, + panthor_mmu_as_fault_mask(ptdev, as_nr)); =20 /* Flush+invalidate RW caches, invalidate RO ones. 
*/ ret =3D panthor_gpu_flush_caches(ptdev, CACHE_CLEAN | CACHE_INV, @@ -1785,8 +1781,9 @@ static void panthor_vm_unlock_region(struct panthor_v= m *vm) mutex_unlock(&ptdev->mmu->as.slots_lock); } =20 -static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 stat= us) +static void panthor_mmu_irq_handler(struct panthor_irq *pirq, u32 status) { + struct panthor_device *ptdev =3D pirq->ptdev; struct panthor_mmu *mmu =3D ptdev->mmu; bool has_unhandled_faults =3D false; =20 @@ -1849,6 +1846,11 @@ static void panthor_mmu_irq_handler(struct panthor_d= evice *ptdev, u32 status) panthor_sched_report_mmu_fault(ptdev); } =20 +static irqreturn_t panthor_mmu_irq_threaded_handler(int irq, void *data) +{ + return panthor_irq_default_threaded_handler(data, panthor_mmu_irq_handler= ); +} + /** * panthor_mmu_suspend() - Suspend the MMU logic * @ptdev: Device. @@ -1873,7 +1875,7 @@ void panthor_mmu_suspend(struct panthor_device *ptdev) } mutex_unlock(&ptdev->mmu->as.slots_lock); =20 - panthor_mmu_irq_suspend(&ptdev->mmu->irq); + panthor_irq_suspend(&ptdev->mmu->irq); } =20 /** @@ -1892,7 +1894,7 @@ void panthor_mmu_resume(struct panthor_device *ptdev) ptdev->mmu->as.faulty_mask =3D 0; mutex_unlock(&ptdev->mmu->as.slots_lock); =20 - panthor_mmu_irq_resume(&ptdev->mmu->irq); + panthor_irq_resume(&ptdev->mmu->irq); } =20 /** @@ -1909,7 +1911,7 @@ void panthor_mmu_pre_reset(struct panthor_device *ptd= ev) { struct panthor_vm *vm; =20 - panthor_mmu_irq_suspend(&ptdev->mmu->irq); + panthor_irq_suspend(&ptdev->mmu->irq); =20 mutex_lock(&ptdev->mmu->vm.lock); ptdev->mmu->vm.reset_in_progress =3D true; @@ -1946,7 +1948,7 @@ void panthor_mmu_post_reset(struct panthor_device *pt= dev) =20 mutex_unlock(&ptdev->mmu->as.slots_lock); =20 - panthor_mmu_irq_resume(&ptdev->mmu->irq); + panthor_irq_resume(&ptdev->mmu->irq); =20 /* Restart the VM_BIND queues. 
*/ mutex_lock(&ptdev->mmu->vm.lock); @@ -3201,7 +3203,7 @@ panthor_mmu_reclaim_priv_bos(struct panthor_device *p= tdev, void panthor_mmu_unplug(struct panthor_device *ptdev) { if (!IS_ENABLED(CONFIG_PM) || pm_runtime_active(ptdev->base.dev)) - panthor_mmu_irq_suspend(&ptdev->mmu->irq); + panthor_irq_suspend(&ptdev->mmu->irq); =20 mutex_lock(&ptdev->mmu->as.slots_lock); for (u32 i =3D 0; i < ARRAY_SIZE(ptdev->mmu->as.slots); i++) { @@ -3255,9 +3257,10 @@ int panthor_mmu_init(struct panthor_device *ptdev) if (irq <=3D 0) return -ENODEV; =20 - ret =3D panthor_request_mmu_irq(ptdev, &mmu->irq, irq, - panthor_mmu_fault_mask(ptdev, ~0), - ptdev->iomem + MMU_INT_BASE); + ret =3D panthor_irq_request(ptdev, &mmu->irq, irq, + panthor_mmu_fault_mask(ptdev, ~0), + ptdev->iomem + MMU_INT_BASE, "mmu", + panthor_mmu_irq_threaded_handler); if (ret) return ret; =20 diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/pantho= r/panthor_pwr.c index 7c7f424a1436..80cf78007896 100644 --- a/drivers/gpu/drm/panthor/panthor_pwr.c +++ b/drivers/gpu/drm/panthor/panthor_pwr.c @@ -56,8 +56,9 @@ struct panthor_pwr { wait_queue_head_t reqs_acked; }; =20 -static void panthor_pwr_irq_handler(struct panthor_device *ptdev, u32 stat= us) +static void panthor_pwr_irq_handler(struct panthor_irq *pirq, u32 status) { + struct panthor_device *ptdev =3D pirq->ptdev; struct panthor_pwr *pwr =3D ptdev->pwr; =20 spin_lock(&ptdev->pwr->reqs_lock); @@ -75,7 +76,11 @@ static void panthor_pwr_irq_handler(struct panthor_devic= e *ptdev, u32 status) } spin_unlock(&ptdev->pwr->reqs_lock); } -PANTHOR_IRQ_HANDLER(pwr, panthor_pwr_irq_handler); + +static irqreturn_t panthor_pwr_irq_threaded_handler(int irq, void *data) +{ + return panthor_irq_default_threaded_handler(data, panthor_pwr_irq_handler= ); +} =20 static void panthor_pwr_write_command(struct panthor_device *ptdev, u32 co= mmand, u64 args) { @@ -453,7 +458,7 @@ void panthor_pwr_unplug(struct panthor_device *ptdev) return; =20 /* Make sure the 
IRQ handler is not running after that point. */ - panthor_pwr_irq_suspend(&ptdev->pwr->irq); + panthor_irq_suspend(&ptdev->pwr->irq); =20 /* Wake-up all waiters. */ spin_lock_irqsave(&ptdev->pwr->reqs_lock, flags); @@ -483,9 +488,10 @@ int panthor_pwr_init(struct panthor_device *ptdev) if (irq < 0) return irq; =20 - err =3D panthor_request_pwr_irq( + err =3D panthor_irq_request( ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK, - pwr->iomem + PWR_INT_BASE); + pwr->iomem + PWR_INT_BASE, "pwr", + panthor_pwr_irq_threaded_handler); if (err) return err; =20 @@ -564,7 +570,7 @@ void panthor_pwr_suspend(struct panthor_device *ptdev) if (!ptdev->pwr) return; =20 - panthor_pwr_irq_suspend(&ptdev->pwr->irq); + panthor_irq_suspend(&ptdev->pwr->irq); } =20 void panthor_pwr_resume(struct panthor_device *ptdev) @@ -572,5 +578,5 @@ void panthor_pwr_resume(struct panthor_device *ptdev) if (!ptdev->pwr) return; =20 - panthor_pwr_irq_resume(&ptdev->pwr->irq); + panthor_irq_resume(&ptdev->pwr->irq); } --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1AD9F3AC0D0 for ; Wed, 29 Apr 2026 09:39:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455554; cv=none; b=txsY+Am3hHfc/87SwyD/fmMOWTcsDnSAiLCkf6NSO7w4AzTP2puGGpVkwnTfUa+OpteZb8JjkYkxBUrht4Pdcq5ef/qW1YCyeQQIPSaAYV4O8+k+LiHEYs7pIgsZNlkL3wlKqYXLoCEPCZE9KfZnScZ8dGLnkmYFNE0zy7n4TOI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455554; c=relaxed/simple; bh=/5LHygL35NbAvEVMkMlPBVMXU0VvJhV0KqmkrOIVHxo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=ExdWDL2lEF/VNYRFwmgayxwBPBDqt8hBIGuaBGyDJ9kMI1FtRlwtVIvVVqL17ytxK+JijNYkIzZdS3QMOtjYX2pIxwLM1bpR+NVYx6Kvi16QpQrpG48IgMtNQ1MhZAKuxKoX9s8SuyM/1KaKML9FVnX4EkAtVUPx/FL143vZNhQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=OTzRwJ2I; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="OTzRwJ2I" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455536; bh=/5LHygL35NbAvEVMkMlPBVMXU0VvJhV0KqmkrOIVHxo=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=OTzRwJ2I5rdUbVCa8vsjZtiKxIS8gJ7vng9p1+UGQqU46LHviHv/tTh5R5TVIo9JN ok2uJdc2oY9aiTx8i5lqGxhK7OQCkRhDQPj3UwSHPHsmQjMc9OjDJccszSRkhI1y+3 /RY3OBRlzmZxBkwJK8icwGYCJjiV2rEz231NBmBV3+Bf2g8VQ2/ctjIc4JYb5vIprn 7rYgjMAEyLzN+iFsS3MZMeHsbpwyIdciQYU9GvXYGv6hgdVdglkwLS2jIU4SM0Qpr3 J9xb66BN7XCx3Ca/xr5J0OExz4UhLlFgay6iDTaO1RNEvTpcTOiiGRDHK0SKZ0/DMP pDu3gSl7zBWog== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 8264E17E1537; Wed, 29 Apr 2026 11:38:56 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:31 +0200 Subject: [PATCH 04/10] drm/panthor: Extend the IRQ logic to allow fast/raw IRQ handlers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: 
text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-4-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=4579; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=/5LHygL35NbAvEVMkMlPBVMXU0VvJhV0KqmkrOIVHxo=; b=3RCWSOSXMuInGiKcUQr+frvF6JokiFotvhgmUp5XLYRro0dyJYIGXzGEUdD/b/bsqI5U+xV3Z IvBFmZVvbqVCUH/U2ikhIX+LdQYgfJUphdpvP0g2jcm6M+VDv/0F5g2 X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY=

All drivers except panthor signal their fences from their interrupt handler to minimize latency. We could do the same from the threaded interrupt handler, but the latency is still quite high in that case, so let's allow components to choose the context they want their IRQ handler to run in. This takes the form of an extra fast_handler() returning an irqreturn_t that reflects whether a thread needs to be woken up. A new PANTHOR_IRQ_ADV_HANDLER() macro taking this extra fast_handler argument is added, and PANTHOR_IRQ_HANDLER() is implemented as a wrapper around PANTHOR_IRQ_ADV_HANDLER() with a default fast_handler returning IRQ_WAKE_THREAD.

The fast and slow handlers are still assumed to be mutually exclusive. When a fast handler is provided, the slow handler is expected to run only when the event can't be processed directly in the fast handler, or when the driver thinks it would be beneficial to coalesce interrupts by polling in the thread rather than re-enabling interrupts immediately.
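The fast/slow split described above can be sketched in plain userspace C. This is only an illustration of the control flow, not driver code: every mock_* name and the MOCK_IRQ_* values are hypothetical stand-ins for the kernel's irqreturn_t machinery, and the "fast_events" mask is an assumption used to decide what the fast handler may process directly.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's irqreturn_t values. */
enum mock_irqreturn {
	MOCK_IRQ_NONE,
	MOCK_IRQ_HANDLED,
	MOCK_IRQ_WAKE_THREAD,
};

struct mock_irq {
	uint32_t status;         /* pending event bits, as HW would report them */
	uint32_t fast_events;    /* events the fast handler may process directly */
	unsigned int fast_count; /* number of interrupts handled in the fast path */
	unsigned int slow_count; /* number of interrupts handled in the thread */
};

/* Fast handler: process the events directly if it can, defer otherwise. */
static enum mock_irqreturn mock_fast_handler(struct mock_irq *irq)
{
	if (!irq->status)
		return MOCK_IRQ_NONE;

	/* At least one event can't be handled here: leave it all to the thread. */
	if (irq->status & ~irq->fast_events)
		return MOCK_IRQ_WAKE_THREAD;

	irq->status = 0;
	irq->fast_count++;
	return MOCK_IRQ_HANDLED;
}

/* Slow handler: only runs when the fast handler asked for it. */
static enum mock_irqreturn mock_slow_handler(struct mock_irq *irq)
{
	irq->status = 0;
	irq->slow_count++;
	return MOCK_IRQ_HANDLED;
}

/* Mimics the core IRQ code: the thread is woken only on WAKE_THREAD. */
static enum mock_irqreturn mock_dispatch(struct mock_irq *irq)
{
	enum mock_irqreturn ret = mock_fast_handler(irq);

	if (ret == MOCK_IRQ_WAKE_THREAD)
		ret = mock_slow_handler(irq);

	return ret;
}
```

In this sketch a given interrupt is processed by exactly one of the two handlers, which matches the mutual-exclusion assumption stated above. A default fast handler would simply return the wake-thread value unconditionally.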
Signed-off-by: Boris Brezillon Reviewed-by: Liviu Dudau --- drivers/gpu/drm/panthor/panthor_device.h | 5 ++--- drivers/gpu/drm/panthor/panthor_fw.c | 1 + drivers/gpu/drm/panthor/panthor_gpu.c | 1 + drivers/gpu/drm/panthor/panthor_mmu.c | 1 + drivers/gpu/drm/panthor/panthor_pwr.c | 1 + 5 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/pan= thor/panthor_device.h index afa202546316..1c130b8394ab 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -672,6 +672,7 @@ static inline void panthor_irq_disable_events(struct pa= nthor_irq *pirq, u32 mask static inline int panthor_irq_request(struct panthor_device *ptdev, struct panthor_irq *pirq, int irq, u32 mask, void __iomem *iomem, const char *name, + irqreturn_t (*raw_handler)(int, void *data), irqreturn_t (*threaded_handler)(int, void *data)) { const char *full_name; @@ -687,9 +688,7 @@ panthor_irq_request(struct panthor_device *ptdev, struc= t panthor_irq *pirq, if (!full_name) return -ENOMEM; =20 - return devm_request_threaded_irq(ptdev->base.dev, irq, - panthor_irq_default_raw_handler, - threaded_handler, + return devm_request_threaded_irq(ptdev->base.dev, irq, raw_handler, threa= ded_handler, IRQF_SHARED, full_name, pirq); } =20 diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor= /panthor_fw.c index eaf599b0a887..8239a6951569 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -1483,6 +1483,7 @@ int panthor_fw_init(struct panthor_device *ptdev) =20 ret =3D panthor_irq_request(ptdev, &fw->irq, irq, 0, ptdev->iomem + JOB_INT_BASE, "job", + panthor_irq_default_raw_handler, panthor_job_irq_threaded_handler); if (ret) { drm_err(&ptdev->base, "failed to request job irq"); diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/pantho= r/panthor_gpu.c index ce208e384762..d0be758ea3e1 100644 --- 
a/drivers/gpu/drm/panthor/panthor_gpu.c +++ b/drivers/gpu/drm/panthor/panthor_gpu.c @@ -177,6 +177,7 @@ int panthor_gpu_init(struct panthor_device *ptdev) ret =3D panthor_irq_request(ptdev, &ptdev->gpu->irq, irq, GPU_INTERRUPTS_MASK, ptdev->iomem + GPU_INT_BASE, "gpu", + panthor_irq_default_raw_handler, panthor_gpu_irq_threaded_handler); if (ret) return ret; diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/pantho= r/panthor_mmu.c index a0d0a9b2926f..2cb07933b629 100644 --- a/drivers/gpu/drm/panthor/panthor_mmu.c +++ b/drivers/gpu/drm/panthor/panthor_mmu.c @@ -3260,6 +3260,7 @@ int panthor_mmu_init(struct panthor_device *ptdev) ret =3D panthor_irq_request(ptdev, &mmu->irq, irq, panthor_mmu_fault_mask(ptdev, ~0), ptdev->iomem + MMU_INT_BASE, "mmu", + panthor_irq_default_raw_handler, panthor_mmu_irq_threaded_handler); if (ret) return ret; diff --git a/drivers/gpu/drm/panthor/panthor_pwr.c b/drivers/gpu/drm/pantho= r/panthor_pwr.c index 80cf78007896..1efb7f3482ba 100644 --- a/drivers/gpu/drm/panthor/panthor_pwr.c +++ b/drivers/gpu/drm/panthor/panthor_pwr.c @@ -491,6 +491,7 @@ int panthor_pwr_init(struct panthor_device *ptdev) err =3D panthor_irq_request( ptdev, &pwr->irq, irq, PWR_INTERRUPTS_MASK, pwr->iomem + PWR_INT_BASE, "pwr", + panthor_irq_default_raw_handler, panthor_pwr_irq_threaded_handler); if (err) return err; --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B770F3BD25E for ; Wed, 29 Apr 2026 09:39:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455558; cv=none; 
b=Q0Ip+0Os3E4a0xfLj5ysX/i5H3t+vNyj/M3lsqj5mZE0DXOcZjOPXfF9+M0wQCn6nR8ip92TgxnFJwBZUzzfVsc/TDEg6r1ovjxSWGOvydO1fp8/SieH4tYmiEYhX+HHOPPg1YV484403q+dwyvn3gJLiaOdwxdhCzlGBhvgJIk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455558; c=relaxed/simple; bh=YEXgj6Ay4Q/wh0v7MXHI8NL2Nbu+OEpLdioOx+8oNO8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=m9+1JrNk2Asa5Vxp8GzYdssVLRjBnagI1yckJbb9Tr9P22PCmXAE7yKwmUcPMFpX1pe2AYDhneSxBCza5l9Lk/pV50B4wkfpUq2rd0hhTRW76Mm0aOIi30DpCrzkVIJCtUh5CShCt8HaHhQnCAwI99zuy47FWi1q0uBghplGRQM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=LvU2zwhg; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="LvU2zwhg" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455537; bh=YEXgj6Ay4Q/wh0v7MXHI8NL2Nbu+OEpLdioOx+8oNO8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=LvU2zwhgiC9f7XSx78xjmx+GBo+RsS4zFDd28XUwEuR9f33AJAf1axnzZQ6I4CJ5x W3gCGpc0VO1eji9AHuRFMAtFjJLQU2OU8b04gWjd+VppCGiit7VTZEYqGB9Vrq2edM k6M9vDtC0vlnLYpRZV+IPpgvHTFRShwyv0msIOaKARR1Of6u+ixgiqXw/0qywqHseJ BzcCEWV3ip3Tj/mxZ1YL8NjOp+Q76X0Ol58oRNHxK3v83zL5oU64bd600y7SgPfb0K Q+FVn9XXazZs5josJ2BTdrogHzT5ZFCHWu9XWX6OrUdMom+Z3ek8GC82bQ+EueUeJ0 yjo7JLs0EjoTA== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) 
(Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 1225F17E1553; Wed, 29 Apr 2026 11:38:57 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:32 +0200 Subject: [PATCH 05/10] drm/panthor: Make panthor_fw_{update,toggle}_reqs() callable from IRQ context Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-5-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=2195; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=YEXgj6Ay4Q/wh0v7MXHI8NL2Nbu+OEpLdioOx+8oNO8=; b=scJ7bM7R+tkKoMzuKWHjBmfyGImo5IKRuEv+Qn4rBFKNxGmid+aXq9ORO8BSyq/w5AAEwhB6W 2eu1EQmmF0HAsaXn9VQ8JaWc4AlW74jxIRk1L77XD/gOutT1XYkRdku X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY=

If we want some FW events to be processed in the interrupt path, we need the helpers manipulating the req registers to be IRQ-safe, which implies using spin_lock_irqsave() instead of spin_lock(). While at it, use guards instead of plain spin_lock()/spin_unlock() calls.
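The guard conversion mentioned above can be illustrated with a small userspace sketch. It assumes a compiler supporting the GCC/Clang cleanup attribute and uses a hypothetical mock_lock in place of a real spinlock, so it only shows the scoping behaviour, not IRQ masking. The read-modify-write expression is the one used by panthor_fw_update_reqs(); everything named mock_* or MOCK_* is made up for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a spinlock: only tracks whether it is held. */
struct mock_lock {
	int held;
};

static void mock_lock_acquire(struct mock_lock *lock)
{
	lock->held = 1;
}

/* Cleanup callback: receives a pointer to the guard variable itself. */
static void mock_lock_release(struct mock_lock **lock)
{
	(*lock)->held = 0;
}

/*
 * Scope-based guard: the lock is taken here and dropped automatically
 * when the enclosing block is left, whatever the exit path is.
 */
#define MOCK_GUARD(lockp) \
	struct mock_lock *mock_guard_ \
		__attribute__((cleanup(mock_lock_release))) = (lockp); \
	mock_lock_acquire(mock_guard_)

struct mock_iface {
	struct mock_lock lock;
	uint32_t req;
	int held_during_update; /* records the lock state seen by the update */
};

static void mock_update_reqs(struct mock_iface *iface, uint32_t val, uint32_t mask)
{
	MOCK_GUARD(&iface->lock);

	iface->held_during_update = iface->lock.held;

	/* Same masking as panthor_fw_update_reqs(): only bits in mask change. */
	iface->req = (iface->req & ~mask) | (val & mask);

	/* No explicit unlock: the guard releases the lock at end of scope. */
}
```

With a guard placed inside a do { } while (0) macro body, as in the diff below, the lock is released when that block ends, which is why the explicit unlock calls can be dropped.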
Signed-off-by: Boris Brezillon Reviewed-by: Liviu Dudau Reviewed-by: Steven Price --- drivers/gpu/drm/panthor/panthor_fw.h | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor= /panthor_fw.h index a99a9b6f4825..e56b7fe15bb3 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.h +++ b/drivers/gpu/drm/panthor/panthor_fw.h @@ -432,12 +432,11 @@ struct panthor_fw_global_iface { #define panthor_fw_toggle_reqs(__iface, __in_reg, __out_reg, __mask) \ do { \ u32 __cur_val, __new_val, __out_val; \ - spin_lock(&(__iface)->lock); \ + guard(spinlock_irqsave)(&(__iface)->lock); \ __cur_val =3D READ_ONCE((__iface)->input->__in_reg); \ __out_val =3D READ_ONCE((__iface)->output->__out_reg); \ __new_val =3D ((__out_val ^ (__mask)) & (__mask)) | (__cur_val & ~(__mas= k)); \ WRITE_ONCE((__iface)->input->__in_reg, __new_val); \ - spin_unlock(&(__iface)->lock); \ } while (0) =20 /** @@ -458,21 +457,19 @@ struct panthor_fw_global_iface { #define panthor_fw_update_reqs(__iface, __in_reg, __val, __mask) \ do { \ u32 __cur_val, __new_val; \ - spin_lock(&(__iface)->lock); \ + guard(spinlock_irqsave)(&(__iface)->lock); \ __cur_val =3D READ_ONCE((__iface)->input->__in_reg); \ __new_val =3D (__cur_val & ~(__mask)) | ((__val) & (__mask)); \ WRITE_ONCE((__iface)->input->__in_reg, __new_val); \ - spin_unlock(&(__iface)->lock); \ } while (0) =20 #define panthor_fw_update_reqs64(__iface, __in_reg, __val, __mask) \ do { \ u64 __cur_val, __new_val; \ - spin_lock(&(__iface)->lock); \ + guard(spinlock_irqsave)(&(__iface)->lock); \ __cur_val =3D READ_ONCE((__iface)->input->__in_reg); \ __new_val =3D (__cur_val & ~(__mask)) | ((__val) & (__mask)); \ WRITE_ONCE((__iface)->input->__in_reg, __new_val); \ - spin_unlock(&(__iface)->lock); \ } while (0) =20 struct panthor_fw_global_iface * --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EE76A3C344B for ; Wed, 29 Apr 2026 09:39:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455563; cv=none; b=kEKEgY5VtHSjcUPOONKTylWY2PPEMA8HVfHRgdNSX8DzkNmLgi4lYe2Y0L8NuF13BQ3aBNbh9mmvwf46M/1LIvPv92yklET4SIe7gvE5j25FTFoxWslszRwA9CELve8Y9sJrjJA5j0C+exP7sCr7t6oBbFZCF7xlTq0ZSAnVBjg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455563; c=relaxed/simple; bh=6s+deQjPnvQnKOBymns2VHD8EvTlg4hAYH0UyTNsEpA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ouPFHXX+M+TdoSzdSDzWoHl+lrbk9sBLRE/Xp1sORaXnauJktInRztN0MuTusSNqHaffy76u+a69C8KHwvrpPfanHNVgj8xrKn2Oi/Sgrao3lSgSBV3iqG2RYy/LeIEuFRH5hLhJiql3s8qngETO2n7f9tjNfGPUjK7//lsZu84= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=fKpnE6I/; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="fKpnE6I/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455538; bh=6s+deQjPnvQnKOBymns2VHD8EvTlg4hAYH0UyTNsEpA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=fKpnE6I/MskhSwH0yqQKjSD3ngJE0g0YYH21Udbn/ovwN2lbfRlM9oay5OHhbH4DR ZxKWwSMRiZYsZgQysJ41r14zdSNiBe3Zu5HOIe90RTeMTvJeRIHHY9AZx9a2Nhqe2D 
SxJMqEhP+P5fOoH+4uWwghO1eo7RObN8m6ZS2r4jXWdTLOpjf/090tMoHUKRgxvSde oTr84PYtJqlQQdD+9xmySeQ80tSWfjFObIuH59qwqgSa9pm0jiIGzTiNHObwkGkYdV 5SUGhpk6n05HD+cnsgC9kC3dCMwzSU8L9mWisF3qFf97cXLjJHpADHsLAP89sATkAJ QEkJdUZJAjZLQ== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 9513417E1562; Wed, 29 Apr 2026 11:38:57 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:33 +0200 Subject: [PATCH 06/10] drm/panthor: Prepare the scheduler logic for FW events in IRQ context Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-6-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=20098; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=6s+deQjPnvQnKOBymns2VHD8EvTlg4hAYH0UyTNsEpA=; b=1kL5iR9zCyt6KrkJ7y1ig3RIkNj6VPLNSN6IDM4sUnj1t2d9NgwcOpka4VOZZVEo0yMDCPnkc 5JwTmyhYzmqAdHHlPzLoUYcZXO4FJGMYU2MVG83Q7fbLWtak4+/wldd X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY= Add a specific spinlock for events processing, and force processing of events in the panthor_sched_report_fw_events() path rather than deferring it to a work 
item. We also fast-track fence signalling by making the job completion logic IRQ-safe. Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_sched.c | 322 +++++++++++++++-------------= ---- 1 file changed, 149 insertions(+), 173 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/pant= hor/panthor_sched.c index 5b34032deff8..c197bdc4b2c7 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -177,18 +177,6 @@ struct panthor_scheduler { */ struct work_struct sync_upd_work; =20 - /** - * @fw_events_work: Work used to process FW events outside the interrupt = path. - * - * Even if the interrupt is threaded, we need any event processing - * that require taking the panthor_scheduler::lock to be processed - * outside the interrupt path so we don't block the tick logic when - * it calls panthor_fw_{csg,wait}_wait_acks(). Since most of the - * event processing requires taking this lock, we just delegate all - * FW event processing to the scheduler workqueue. - */ - struct work_struct fw_events_work; - /** * @fw_events: Bitmask encoding pending FW events. */ @@ -254,6 +242,15 @@ struct panthor_scheduler { struct list_head waiting; } groups; =20 + /** + * @events_lock: Lock taken when processing events. + * + * This also needs to be taken when csg_slots are updated, to make sure + * the event processing logic doesn't touch groups that have left the CSG + * slot. + */ + spinlock_t events_lock; + /** * @csg_slots: FW command stream group slots. */ @@ -676,9 +673,6 @@ struct panthor_group { */ struct panthor_kernel_bo *protm_suspend_buf; =20 - /** @sync_upd_work: Work used to check/signal job fences. */ - struct work_struct sync_upd_work; - /** @tiler_oom_work: Work used to process tiler OOM events happening on t= his group. 
*/ struct work_struct tiler_oom_work; =20 @@ -999,7 +993,6 @@ static int group_bind_locked(struct panthor_group *group, u32 csg_id) { struct panthor_device *ptdev =3D group->ptdev; - struct panthor_csg_slot *csg_slot; int ret; =20 lockdep_assert_held(&ptdev->scheduler->lock); @@ -1012,9 +1005,7 @@ group_bind_locked(struct panthor_group *group, u32 cs= g_id) if (ret) return ret; =20 - csg_slot =3D &ptdev->scheduler->csg_slots[csg_id]; group_get(group); - group->csg_id =3D csg_id; =20 /* Dummy doorbell allocation: doorbell is assigned to the group and * all queues use the same doorbell. @@ -1026,7 +1017,10 @@ group_bind_locked(struct panthor_group *group, u32 c= sg_id) for (u32 i =3D 0; i < group->queue_count; i++) group->queues[i]->doorbell_id =3D csg_id + 1; =20 - csg_slot->group =3D group; + scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { + ptdev->scheduler->csg_slots[csg_id].group =3D group; + group->csg_id =3D csg_id; + } =20 return 0; } @@ -1041,7 +1035,6 @@ static int group_unbind_locked(struct panthor_group *group) { struct panthor_device *ptdev =3D group->ptdev; - struct panthor_csg_slot *slot; =20 lockdep_assert_held(&ptdev->scheduler->lock); =20 @@ -1051,9 +1044,12 @@ group_unbind_locked(struct panthor_group *group) if (drm_WARN_ON(&ptdev->base, group->state =3D=3D PANTHOR_CS_GROUP_ACTIVE= )) return -EINVAL; =20 - slot =3D &ptdev->scheduler->csg_slots[group->csg_id]; + scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { + ptdev->scheduler->csg_slots[group->csg_id].group =3D NULL; + group->csg_id =3D -1; + } + panthor_vm_idle(group->vm); - group->csg_id =3D -1; =20 /* Tiler OOM events will be re-issued next time the group is scheduled. 
*/ atomic_set(&group->tiler_oom, 0); @@ -1062,8 +1058,6 @@ group_unbind_locked(struct panthor_group *group) for (u32 i =3D 0; i < group->queue_count; i++) group->queues[i]->doorbell_id =3D -1; =20 - slot->group =3D NULL; - group_put(group); return 0; } @@ -1151,16 +1145,14 @@ queue_suspend_timeout_locked(struct panthor_queue *= queue) static void queue_suspend_timeout(struct panthor_queue *queue) { - spin_lock(&queue->fence_ctx.lock); + guard(spinlock_irqsave)(&queue->fence_ctx.lock); queue_suspend_timeout_locked(queue); - spin_unlock(&queue->fence_ctx.lock); } =20 static void queue_resume_timeout(struct panthor_queue *queue) { - spin_lock(&queue->fence_ctx.lock); - + guard(spinlock_irqsave)(&queue->fence_ctx.lock); if (queue_timeout_is_suspended(queue)) { mod_delayed_work(queue->scheduler.timeout_wq, &queue->timeout.work, @@ -1168,8 +1160,6 @@ queue_resume_timeout(struct panthor_queue *queue) =20 queue->timeout.remaining =3D MAX_SCHEDULE_TIMEOUT; } - - spin_unlock(&queue->fence_ctx.lock); } =20 /** @@ -1484,7 +1474,7 @@ cs_slot_process_fatal_event_locked(struct panthor_dev= ice *ptdev, u32 fatal; u64 info; =20 - lockdep_assert_held(&sched->lock); + lockdep_assert_held(&sched->events_lock); =20 cs_iface =3D panthor_fw_get_cs_iface(ptdev, csg_id, cs_id); fatal =3D cs_iface->output->fatal; @@ -1532,7 +1522,7 @@ cs_slot_process_fault_event_locked(struct panthor_dev= ice *ptdev, u32 fault; u64 info; =20 - lockdep_assert_held(&sched->lock); + lockdep_assert_held(&sched->events_lock); =20 cs_iface =3D panthor_fw_get_cs_iface(ptdev, csg_id, cs_id); fault =3D cs_iface->output->fault; @@ -1542,7 +1532,7 @@ cs_slot_process_fault_event_locked(struct panthor_dev= ice *ptdev, u64 cs_extract =3D queue->iface.output->extract; struct panthor_job *job; =20 - spin_lock(&queue->fence_ctx.lock); + guard(spinlock_irqsave)(&queue->fence_ctx.lock); list_for_each_entry(job, &queue->fence_ctx.in_flight_jobs, node) { if (cs_extract >=3D job->ringbuf.end) continue; @@ -1552,7 +1542,6 @@ 
cs_slot_process_fault_event_locked(struct panthor_dev= ice *ptdev, =20 dma_fence_set_error(job->done_fence, -EINVAL); } - spin_unlock(&queue->fence_ctx.lock); } =20 if (group) { @@ -1682,7 +1671,7 @@ cs_slot_process_tiler_oom_event_locked(struct panthor= _device *ptdev, struct panthor_csg_slot *csg_slot =3D &sched->csg_slots[csg_id]; struct panthor_group *group =3D csg_slot->group; =20 - lockdep_assert_held(&sched->lock); + lockdep_assert_held(&sched->events_lock); =20 if (drm_WARN_ON(&ptdev->base, !group)) return; @@ -1703,7 +1692,7 @@ static bool cs_slot_process_irq_locked(struct panthor= _device *ptdev, struct panthor_fw_cs_iface *cs_iface; u32 req, ack, events; =20 - lockdep_assert_held(&ptdev->scheduler->lock); + lockdep_assert_held(&ptdev->scheduler->events_lock); =20 cs_iface =3D panthor_fw_get_cs_iface(ptdev, csg_id, cs_id); req =3D cs_iface->input->req; @@ -1731,7 +1720,7 @@ static void csg_slot_process_idle_event_locked(struct= panthor_device *ptdev, u32 { struct panthor_scheduler *sched =3D ptdev->scheduler; =20 - lockdep_assert_held(&sched->lock); + lockdep_assert_held(&sched->events_lock); =20 sched->might_have_idle_groups =3D true; =20 @@ -1742,16 +1731,102 @@ static void csg_slot_process_idle_event_locked(str= uct panthor_device *ptdev, u32 sched_queue_delayed_work(sched, tick, 0); } =20 +static void update_fdinfo_stats(struct panthor_job *job) +{ + struct panthor_group *group =3D job->group; + struct panthor_queue *queue =3D group->queues[job->queue_idx]; + struct panthor_gpu_usage *fdinfo =3D &group->fdinfo.data; + struct panthor_job_profiling_data *slots =3D queue->profiling.slots->kmap; + struct panthor_job_profiling_data *data =3D &slots[job->profiling.slot]; + + scoped_guard(spinlock, &group->fdinfo.lock) { + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES) + fdinfo->cycles +=3D data->cycles.after - data->cycles.before; + if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP) + fdinfo->time +=3D data->time.after - 
data->time.before; + } +} + +static bool queue_check_job_completion(struct panthor_queue *queue) +{ + struct panthor_syncobj_64b *syncobj =3D NULL; + struct panthor_job *job, *job_tmp; + bool cookie, progress =3D false; + LIST_HEAD(done_jobs); + + cookie =3D dma_fence_begin_signalling(); + scoped_guard(spinlock_irqsave, &queue->fence_ctx.lock) { + list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs,= node) { + if (!syncobj) { + struct panthor_group *group =3D job->group; + + syncobj =3D group->syncobjs->kmap + + (job->queue_idx * sizeof(*syncobj)); + } + + if (syncobj->seqno < job->done_fence->seqno) + break; + + list_move_tail(&job->node, &done_jobs); + dma_fence_signal_locked(job->done_fence); + } + + if (list_empty(&queue->fence_ctx.in_flight_jobs)) { + /* If we have no job left, we cancel the timer, and reset remaining + * time to its default so it can be restarted next time + * queue_resume_timeout() is called. + */ + queue_suspend_timeout_locked(queue); + + /* If there's no job pending, we consider it progress to avoid a + * spurious timeout if the timeout handler and the sync update + * handler raced. 
+ */ + progress =3D true; + } else if (!list_empty(&done_jobs)) { + queue_reset_timeout_locked(queue); + progress =3D true; + } + } + dma_fence_end_signalling(cookie); + + list_for_each_entry_safe(job, job_tmp, &done_jobs, node) { + if (job->profiling.mask) + update_fdinfo_stats(job); + list_del_init(&job->node); + panthor_job_put(&job->base); + } + + return progress; +} + +static void group_check_job_completion(struct panthor_group *group) +{ + bool cookie; + u32 queue_idx; + + cookie =3D dma_fence_begin_signalling(); + for (queue_idx =3D 0; queue_idx < group->queue_count; queue_idx++) { + struct panthor_queue *queue =3D group->queues[queue_idx]; + + if (!queue) + continue; + + queue_check_job_completion(queue); + } + dma_fence_end_signalling(cookie); +} + static void csg_slot_sync_update_locked(struct panthor_device *ptdev, u32 csg_id) { struct panthor_csg_slot *csg_slot =3D &ptdev->scheduler->csg_slots[csg_id= ]; struct panthor_group *group =3D csg_slot->group; =20 - lockdep_assert_held(&ptdev->scheduler->lock); + lockdep_assert_held(&ptdev->scheduler->events_lock); =20 if (group) - group_queue_work(group, sync_upd); + group_check_job_completion(group); =20 sched_queue_work(ptdev->scheduler, sync_upd); } @@ -1784,7 +1859,7 @@ static void sched_process_csg_irq_locked(struct panth= or_device *ptdev, u32 csg_i struct panthor_fw_csg_iface *csg_iface; u32 ring_cs_db_mask =3D 0; =20 - lockdep_assert_held(&ptdev->scheduler->lock); + lockdep_assert_held(&ptdev->scheduler->events_lock); =20 if (drm_WARN_ON(&ptdev->base, csg_id >=3D ptdev->scheduler->csg_slot_coun= t)) return; @@ -1842,7 +1917,7 @@ static void sched_process_idle_event_locked(struct pa= nthor_device *ptdev) { struct panthor_fw_global_iface *glb_iface =3D panthor_fw_get_glb_iface(pt= dev); =20 - lockdep_assert_held(&ptdev->scheduler->lock); + lockdep_assert_held(&ptdev->scheduler->events_lock); =20 /* Acknowledge the idle event and schedule a tick. 
*/ panthor_fw_update_reqs(glb_iface, req, glb_iface->output->ack, GLB_IDLE); @@ -1858,7 +1933,7 @@ static void sched_process_global_irq_locked(struct pa= nthor_device *ptdev) struct panthor_fw_global_iface *glb_iface =3D panthor_fw_get_glb_iface(pt= dev); u32 req, ack, evts; =20 - lockdep_assert_held(&ptdev->scheduler->lock); + lockdep_assert_held(&ptdev->scheduler->events_lock); =20 req =3D READ_ONCE(glb_iface->input->req); ack =3D READ_ONCE(glb_iface->output->ack); @@ -1868,30 +1943,6 @@ static void sched_process_global_irq_locked(struct p= anthor_device *ptdev) sched_process_idle_event_locked(ptdev); } =20 -static void process_fw_events_work(struct work_struct *work) -{ - struct panthor_scheduler *sched =3D container_of(work, struct panthor_sch= eduler, - fw_events_work); - u32 events =3D atomic_xchg(&sched->fw_events, 0); - struct panthor_device *ptdev =3D sched->ptdev; - - mutex_lock(&sched->lock); - - if (events & JOB_INT_GLOBAL_IF) { - sched_process_global_irq_locked(ptdev); - events &=3D ~JOB_INT_GLOBAL_IF; - } - - while (events) { - u32 csg_id =3D ffs(events) - 1; - - sched_process_csg_irq_locked(ptdev, csg_id); - events &=3D ~BIT(csg_id); - } - - mutex_unlock(&sched->lock); -} - /** * panthor_sched_report_fw_events() - Report FW events to the scheduler. * @ptdev: Device. 
@@ -1902,8 +1953,19 @@ void panthor_sched_report_fw_events(struct panthor_d= evice *ptdev, u32 events) if (!ptdev->scheduler) return; =20 - atomic_or(events, &ptdev->scheduler->fw_events); - sched_queue_work(ptdev->scheduler, fw_events); + guard(spinlock_irqsave)(&ptdev->scheduler->events_lock); + + if (events & JOB_INT_GLOBAL_IF) { + sched_process_global_irq_locked(ptdev); + events &=3D ~JOB_INT_GLOBAL_IF; + } + + while (events) { + u32 csg_id =3D ffs(events) - 1; + + sched_process_csg_irq_locked(ptdev, csg_id); + events &=3D ~BIT(csg_id); + } } =20 static const char *fence_get_driver_name(struct dma_fence *fence) @@ -2136,7 +2198,9 @@ tick_ctx_init(struct panthor_scheduler *sched, * CSG IRQs, so we can flag the faulty queue. */ if (panthor_vm_has_unhandled_faults(group->vm)) { - sched_process_csg_irq_locked(ptdev, i); + scoped_guard(spinlock_irqsave, &sched->events_lock) { + sched_process_csg_irq_locked(ptdev, i); + } =20 /* No fatal fault reported, flag all queues as faulty. */ if (!group->fatal_queues) @@ -2183,13 +2247,13 @@ group_term_post_processing(struct panthor_group *gr= oup) if (!queue) continue; =20 - spin_lock(&queue->fence_ctx.lock); - list_for_each_entry_safe(job, tmp, &queue->fence_ctx.in_flight_jobs, nod= e) { - list_move_tail(&job->node, &faulty_jobs); - dma_fence_set_error(job->done_fence, err); - dma_fence_signal_locked(job->done_fence); + scoped_guard(spinlock_irqsave, &queue->fence_ctx.lock) { + list_for_each_entry_safe(job, tmp, &queue->fence_ctx.in_flight_jobs, no= de) { + list_move_tail(&job->node, &faulty_jobs); + dma_fence_set_error(job->done_fence, err); + dma_fence_signal_locked(job->done_fence); + } } - spin_unlock(&queue->fence_ctx.lock); =20 /* Manually update the syncobj seqno to unblock waiters. */ syncobj =3D group->syncobjs->kmap + (i * sizeof(*syncobj)); @@ -2336,8 +2400,10 @@ tick_ctx_apply(struct panthor_scheduler *sched, stru= ct panthor_sched_tick_ctx *c * any pending interrupts before we start the new * group. 
*/ - if (group->csg_id >=3D 0) + if (group->csg_id >=3D 0) { + guard(spinlock_irqsave)(&sched->events_lock); sched_process_csg_irq_locked(ptdev, group->csg_id); + } =20 group_unbind_locked(group); } @@ -2920,8 +2986,10 @@ void panthor_sched_suspend(struct panthor_device *pt= dev) =20 group_get(group); =20 - if (group->csg_id >=3D 0) + if (group->csg_id >=3D 0) { + guard(spinlock_irqsave)(&sched->events_lock); sched_process_csg_irq_locked(ptdev, group->csg_id); + } =20 group_unbind_locked(group); =20 @@ -3005,22 +3073,6 @@ void panthor_sched_post_reset(struct panthor_device = *ptdev, bool reset_failed) } } =20 -static void update_fdinfo_stats(struct panthor_job *job) -{ - struct panthor_group *group =3D job->group; - struct panthor_queue *queue =3D group->queues[job->queue_idx]; - struct panthor_gpu_usage *fdinfo =3D &group->fdinfo.data; - struct panthor_job_profiling_data *slots =3D queue->profiling.slots->kmap; - struct panthor_job_profiling_data *data =3D &slots[job->profiling.slot]; - - scoped_guard(spinlock, &group->fdinfo.lock) { - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES) - fdinfo->cycles +=3D data->cycles.after - data->cycles.before; - if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP) - fdinfo->time +=3D data->time.after - data->time.before; - } -} - void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile) { struct panthor_group_pool *gpool =3D pfile->groups; @@ -3041,80 +3093,6 @@ void panthor_fdinfo_gather_group_samples(struct pant= hor_file *pfile) xa_unlock(&gpool->xa); } =20 -static bool queue_check_job_completion(struct panthor_queue *queue) -{ - struct panthor_syncobj_64b *syncobj =3D NULL; - struct panthor_job *job, *job_tmp; - bool cookie, progress =3D false; - LIST_HEAD(done_jobs); - - cookie =3D dma_fence_begin_signalling(); - spin_lock(&queue->fence_ctx.lock); - list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, = node) { - if (!syncobj) { - struct panthor_group *group =3D job->group; 
- - syncobj =3D group->syncobjs->kmap + - (job->queue_idx * sizeof(*syncobj)); - } - - if (syncobj->seqno < job->done_fence->seqno) - break; - - list_move_tail(&job->node, &done_jobs); - dma_fence_signal_locked(job->done_fence); - } - - if (list_empty(&queue->fence_ctx.in_flight_jobs)) { - /* If we have no job left, we cancel the timer, and reset remaining - * time to its default so it can be restarted next time - * queue_resume_timeout() is called. - */ - queue_suspend_timeout_locked(queue); - - /* If there's no job pending, we consider it progress to avoid a - * spurious timeout if the timeout handler and the sync update - * handler raced. - */ - progress =3D true; - } else if (!list_empty(&done_jobs)) { - queue_reset_timeout_locked(queue); - progress =3D true; - } - spin_unlock(&queue->fence_ctx.lock); - dma_fence_end_signalling(cookie); - - list_for_each_entry_safe(job, job_tmp, &done_jobs, node) { - if (job->profiling.mask) - update_fdinfo_stats(job); - list_del_init(&job->node); - panthor_job_put(&job->base); - } - - return progress; -} - -static void group_sync_upd_work(struct work_struct *work) -{ - struct panthor_group *group =3D - container_of(work, struct panthor_group, sync_upd_work); - u32 queue_idx; - bool cookie; - - cookie =3D dma_fence_begin_signalling(); - for (queue_idx =3D 0; queue_idx < group->queue_count; queue_idx++) { - struct panthor_queue *queue =3D group->queues[queue_idx]; - - if (!queue) - continue; - - queue_check_job_completion(queue); - } - dma_fence_end_signalling(cookie); - - group_put(group); -} - struct panthor_job_ringbuf_instrs { u64 buffer[MAX_INSTRS_PER_JOB]; u32 count; @@ -3346,9 +3324,8 @@ queue_run_job(struct drm_sched_job *sched_job) job->ringbuf.end =3D job->ringbuf.start + (instrs.count * sizeof(u64)); =20 panthor_job_get(&job->base); - spin_lock(&queue->fence_ctx.lock); - list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs); - spin_unlock(&queue->fence_ctx.lock); + scoped_guard(spinlock_irqsave, 
&queue->fence_ctx.lock) + list_add_tail(&job->node, &queue->fence_ctx.in_flight_jobs); =20 /* Make sure the ring buffer is updated before the INSERT * register. @@ -3683,7 +3660,6 @@ int panthor_group_create(struct panthor_file *pfile, INIT_LIST_HEAD(&group->wait_node); INIT_LIST_HEAD(&group->run_node); INIT_WORK(&group->term_work, group_term_work); - INIT_WORK(&group->sync_upd_work, group_sync_upd_work); INIT_WORK(&group->tiler_oom_work, group_tiler_oom_work); INIT_WORK(&group->release_work, group_release_work); =20 @@ -4054,7 +4030,6 @@ void panthor_sched_unplug(struct panthor_device *ptde= v) struct panthor_scheduler *sched =3D ptdev->scheduler; =20 disable_delayed_work_sync(&sched->tick_work); - disable_work_sync(&sched->fw_events_work); disable_work_sync(&sched->sync_upd_work); =20 mutex_lock(&sched->lock); @@ -4139,7 +4114,8 @@ int panthor_sched_init(struct panthor_device *ptdev) sched->tick_period =3D msecs_to_jiffies(10); INIT_DELAYED_WORK(&sched->tick_work, tick_work); INIT_WORK(&sched->sync_upd_work, sync_upd_work); - INIT_WORK(&sched->fw_events_work, process_fw_events_work); + + spin_lock_init(&sched->events_lock); =20 ret =3D drmm_mutex_init(&ptdev->base, &sched->lock); if (ret) --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CEF4839FCAE for ; Wed, 29 Apr 2026 09:39:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455560; cv=none; b=P43T22xFdc2AoulDEiX6AbSS2hFMNK2OitW2Rx5H2Fzzu7N9eY2dXAFnbEusaLx5zPwMk+cH9zNOADr5Cz1L42f92a99V+K23FKAlUCOfcJsNCKb6NJjOAr8oaIr4VW4tOC2yQCXGfZ1pZbA+8GlxYkZk1fJM3uN2ugrUu2mRsI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1777455560; c=relaxed/simple; bh=fAowcBEu7j3pM+kSJL/CrZv1sU+IDJSbuFK/s5NqU7w=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=mIEKFv4Lt9ZAn8aQWtCapZIIDJg65Umg674ndvsaw1tjPT8sfuxXiWbtI/BwjpeKII41Ki6+dz0gKX/C9261ae+YCC6lwIiGdeMSUZpi1QbNbEDNeOB9rI10ao/Rcseqcz80irFR5gQ3oi6pPzFoLymVqwfdU9W0bpLzuv9uljQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=lMIe4TGn; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="lMIe4TGn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455538; bh=fAowcBEu7j3pM+kSJL/CrZv1sU+IDJSbuFK/s5NqU7w=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=lMIe4TGnIPAdvJ4JwzjFjxW2d7jTx8yB3Nt9R1QVE23i+O3Z4lrMfP0HYRTMrAcSt sbBQKPau29Uc2dtk7Ly0PXvJ3nskZBsXqb1SVZypNRmuzHnH5NTZUmfkI1tbBYnt5h vlmY9ay2V5h0W6E1v3BAI+vlbtylzBqXX4pVEW6ujbHnffAqal7I3L8TwaF47h1Udw Psdezn+dcTA0Vc5GvOOOLdqySjWR9qpsLoA6uqwdmuA50K549wdGS1WMkktpOe5TuT tmzvCN3Zx7RgrLEDC55voca+xku7h3c+7fIMa+JnLuUHjfvcSubYHE2rSCuXolMXg8 nmXLq3RqHGIZA== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 2770B17E157E; Wed, 29 Apr 2026 11:38:58 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:34 +0200 Subject: [PATCH 07/10] drm/panthor: Automate CSG IRQ 
processing at group unbind time Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-7-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=6906; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=fAowcBEu7j3pM+kSJL/CrZv1sU+IDJSbuFK/s5NqU7w=; b=TRCJD84U0ewxFu90IZeFqnvOzpE4aw5aGQS6+yqgYaW1saKpCiPcAT9I9lSyT+6gqaZSwkCIS OudgOqKvkvJA2vd9L8VItgw57xPEgj8lys5/sZnPxNq1SQvckaTdfWx X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY= Make the sched_process_csg_irq_locked() call part of group_unbind_locked() so we don't have to manually call it in tick_ctx_apply()/panthor_sched_suspend(). This implies moving group_[un]bind_locked() around to avoid a forward declaration. Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_sched.c | 178 +++++++++++++++-------------= ---- 1 file changed, 82 insertions(+), 96 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/pant= hor/panthor_sched.c index c197bdc4b2c7..601a9bff1485 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -982,86 +982,6 @@ group_get(struct panthor_group *group) return group; } =20 -/** - * group_bind_locked() - Bind a group to a group slot - * @group: Group. - * @csg_id: Slot. 
- * - * Return: 0 on success, a negative error code otherwise. - */ -static int -group_bind_locked(struct panthor_group *group, u32 csg_id) -{ - struct panthor_device *ptdev =3D group->ptdev; - int ret; - - lockdep_assert_held(&ptdev->scheduler->lock); - - if (drm_WARN_ON(&ptdev->base, group->csg_id !=3D -1 || csg_id >=3D MAX_CS= GS || - ptdev->scheduler->csg_slots[csg_id].group)) - return -EINVAL; - - ret =3D panthor_vm_active(group->vm); - if (ret) - return ret; - - group_get(group); - - /* Dummy doorbell allocation: doorbell is assigned to the group and - * all queues use the same doorbell. - * - * TODO: Implement LRU-based doorbell assignment, so the most often - * updated queues get their own doorbell, thus avoiding useless checks - * on queues belonging to the same group that are rarely updated. - */ - for (u32 i =3D 0; i < group->queue_count; i++) - group->queues[i]->doorbell_id =3D csg_id + 1; - - scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { - ptdev->scheduler->csg_slots[csg_id].group =3D group; - group->csg_id =3D csg_id; - } - - return 0; -} - -/** - * group_unbind_locked() - Unbind a group from a slot. - * @group: Group to unbind. - * - * Return: 0 on success, a negative error code otherwise. - */ -static int -group_unbind_locked(struct panthor_group *group) -{ - struct panthor_device *ptdev =3D group->ptdev; - - lockdep_assert_held(&ptdev->scheduler->lock); - - if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >=3D MAX= _CSGS)) - return -EINVAL; - - if (drm_WARN_ON(&ptdev->base, group->state =3D=3D PANTHOR_CS_GROUP_ACTIVE= )) - return -EINVAL; - - scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { - ptdev->scheduler->csg_slots[group->csg_id].group =3D NULL; - group->csg_id =3D -1; - } - - panthor_vm_idle(group->vm); - - /* Tiler OOM events will be re-issued next time the group is scheduled. 
*/ - atomic_set(&group->tiler_oom, 0); - cancel_work(&group->tiler_oom_work); - - for (u32 i =3D 0; i < group->queue_count; i++) - group->queues[i]->doorbell_id =3D -1; - - group_put(group); - return 0; -} - static bool group_is_idle(struct panthor_group *group) { @@ -1968,6 +1888,88 @@ void panthor_sched_report_fw_events(struct panthor_d= evice *ptdev, u32 events) } } =20 +/** + * group_bind_locked() - Bind a group to a group slot + * @group: Group. + * @csg_id: Slot. + * + * Return: 0 on success, a negative error code otherwise. + */ +static int +group_bind_locked(struct panthor_group *group, u32 csg_id) +{ + struct panthor_device *ptdev =3D group->ptdev; + int ret; + + lockdep_assert_held(&ptdev->scheduler->lock); + + if (drm_WARN_ON(&ptdev->base, group->csg_id !=3D -1 || csg_id >=3D MAX_CS= GS || + ptdev->scheduler->csg_slots[csg_id].group)) + return -EINVAL; + + ret =3D panthor_vm_active(group->vm); + if (ret) + return ret; + + group_get(group); + + /* Dummy doorbell allocation: doorbell is assigned to the group and + * all queues use the same doorbell. + * + * TODO: Implement LRU-based doorbell assignment, so the most often + * updated queues get their own doorbell, thus avoiding useless checks + * on queues belonging to the same group that are rarely updated. + */ + for (u32 i =3D 0; i < group->queue_count; i++) + group->queues[i]->doorbell_id =3D csg_id + 1; + + scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { + ptdev->scheduler->csg_slots[csg_id].group =3D group; + group->csg_id =3D csg_id; + } + + return 0; +} + +/** + * group_unbind_locked() - Unbind a group from a slot. + * @group: Group to unbind. + * + * Return: 0 on success, a negative error code otherwise. 
+ */ +static int +group_unbind_locked(struct panthor_group *group) +{ + struct panthor_device *ptdev =3D group->ptdev; + + lockdep_assert_held(&ptdev->scheduler->lock); + + if (drm_WARN_ON(&ptdev->base, group->csg_id < 0 || group->csg_id >=3D MAX= _CSGS)) + return -EINVAL; + + if (drm_WARN_ON(&ptdev->base, group->state =3D=3D PANTHOR_CS_GROUP_ACTIVE= )) + return -EINVAL; + + scoped_guard(spinlock_irqsave, &ptdev->scheduler->events_lock) { + /* Process all pending IRQs before returning the slot. */ + sched_process_csg_irq_locked(ptdev, group->csg_id); + ptdev->scheduler->csg_slots[group->csg_id].group =3D NULL; + group->csg_id =3D -1; + } + + panthor_vm_idle(group->vm); + + /* Tiler OOM events will be re-issued next time the group is scheduled. */ + atomic_set(&group->tiler_oom, 0); + cancel_work(&group->tiler_oom_work); + + for (u32 i =3D 0; i < group->queue_count; i++) + group->queues[i]->doorbell_id =3D -1; + + group_put(group); + return 0; +} + static const char *fence_get_driver_name(struct dma_fence *fence) { return "panthor"; @@ -2396,15 +2398,6 @@ tick_ctx_apply(struct panthor_scheduler *sched, stru= ct panthor_sched_tick_ctx *c /* Unbind evicted groups. */ for (prio =3D PANTHOR_CSG_PRIORITY_COUNT - 1; prio >=3D 0; prio--) { list_for_each_entry(group, &ctx->old_groups[prio], run_node) { - /* This group is gone. Process interrupts to clear - * any pending interrupts before we start the new - * group. 
- */ - if (group->csg_id >=3D 0) { - guard(spinlock_irqsave)(&sched->events_lock); - sched_process_csg_irq_locked(ptdev, group->csg_id); - } - group_unbind_locked(group); } } @@ -2970,8 +2963,6 @@ void panthor_sched_suspend(struct panthor_device *ptd= ev) =20 if (flush_caches_failed) csg_slot->group->state =3D PANTHOR_CS_GROUP_TERMINATED; - else - csg_slot_sync_update_locked(ptdev, csg_id); =20 slot_mask &=3D ~BIT(csg_id); } @@ -2986,11 +2977,6 @@ void panthor_sched_suspend(struct panthor_device *pt= dev) =20 group_get(group); =20 - if (group->csg_id >=3D 0) { - guard(spinlock_irqsave)(&sched->events_lock); - sched_process_csg_irq_locked(ptdev, group->csg_id); - } - group_unbind_locked(group); =20 drm_WARN_ON(&group->ptdev->base, !list_empty(&group->run_node)); --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0E4093C2772 for ; Wed, 29 Apr 2026 09:39:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455560; cv=none; b=CUoU/ruQEyl6RDyIwTCGBvBhAvEQVEP8NWFC/dJKYGJHKX/BZHhgvjmm3ZTR5iY3CP/4s9JufpwefXCR486ha+ZFVPYNUHRAROTJtrYQPZnKuiJuEM2e2Pyp3LGT2Zddyz0fD9TrkxDUnD3vco4IcC4wnfpf8eR2Q2U5UEQXGc8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455560; c=relaxed/simple; bh=wziwJjIxxAv4lvMAYsfdrTzTInL8uzMnt6s9npP1kEE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tnflhMVw5+qiWqHSTPrYzZjENLy6NqdjYVQiyNP/UPBUE7LRYIqOyGYYucESJpcxhJ3S7DOQvS/Q3cXgZ47doui4/bPVsUeLP0ffggs1p8l0DdNCMvQNah6EIvdVikpt/dfnd7Yma69eN8WI9PjjWb83zqsBOc6fpfGEm5uZZZc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=FSVzqITs; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="FSVzqITs" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455539; bh=wziwJjIxxAv4lvMAYsfdrTzTInL8uzMnt6s9npP1kEE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=FSVzqITsq9lYLA/chq8SkZV1UDilKKuN4SpABf48/qCn1DQATva6gWmJ3mmNJ1yFz xDlgl0PrM/CoWqYBxyIxd9ZewtXE8u5jCgf8NF0pA84OZQxx9MF6v8GfxIlO93Rsh0 9+/fSbJbfc39zEdFgMkpyCedcIAIdVKg5OYzMF3fRD41sJsGZh02gdOwRKSxXK6rYz jSyf5MdPm3iCDDJwQ1hXizJnKtpPsrmMIfAlegGscKmhBzViczae4Gt37qHiykOi2c peZAWQPy354Pw/fGxklTvsNdRXEVsn4qRkkB62TUL0F88EYl0KSBeTKK1Ju2H3ae7R BRQt5QzPeNRcQ== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id A8A1617E1580; Wed, 29 Apr 2026 11:38:58 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:35 +0200 Subject: [PATCH 08/10] drm/panthor: Automatically enable interrupts in panthor_fw_wait_acks() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-8-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: 
<20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=4411; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=wziwJjIxxAv4lvMAYsfdrTzTInL8uzMnt6s9npP1kEE=; b=H0qk+dnvemBGiEl72kN3BDM6vDpXRu99o0FDyP2ypDgiLTaWA+mYHhzDCRKOR+Tg4ynaB1MD8 TlKf7yQ47GuBABceoBkoICMq5U73GWibyR+pKSiFC/RzRZ5KJw99YGC X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY= Rather than assuming an interrupt is always expected for request acks, temporarily enable the relevant interrupts when the polling-wait failed. This should hopefully reduce the number of interrupts the CPU has to process. Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_fw.c | 34 +++++++++++++++++++----------= ---- drivers/gpu/drm/panthor/panthor_sched.c | 5 +++-- 2 files changed, 23 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor= /panthor_fw.c index 8239a6951569..f5e0ceca4130 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -1039,16 +1039,10 @@ static void panthor_fw_init_global_iface(struct pan= thor_device *ptdev) glb_iface->input->progress_timer =3D PROGRESS_TIMEOUT_CYCLES >> PROGRESS_= TIMEOUT_SCALE_SHIFT; glb_iface->input->idle_timer =3D panthor_fw_conv_timeout(ptdev, IDLE_HYST= ERESIS_US); =20 - /* Enable interrupts we care about. 
*/ - glb_iface->input->ack_irq_mask =3D GLB_CFG_ALLOC_EN | - GLB_PING | - GLB_CFG_PROGRESS_TIMER | - GLB_CFG_POWEROFF_TIMER | - GLB_IDLE_EN | - GLB_IDLE; - - if (panthor_fw_has_glb_state(ptdev)) - glb_iface->input->ack_irq_mask |=3D GLB_STATE_MASK; + /* Enable interrupts for asynchronous events that are not + * triggered by request acks. + */ + glb_iface->input->ack_irq_mask =3D GLB_IDLE; =20 panthor_fw_update_reqs(glb_iface, req, GLB_IDLE_EN | GLB_COUNTER_EN, GLB_IDLE_EN | GLB_COUNTER_EN); @@ -1318,8 +1312,8 @@ void panthor_fw_unplug(struct panthor_device *ptdev) * Return: 0 on success, -ETIMEDOUT otherwise. */ static int panthor_fw_wait_acks(const u32 *req_ptr, const u32 *ack_ptr, - wait_queue_head_t *wq, - u32 req_mask, u32 *acked, + u32 *ack_irq_mask_ptr, spinlock_t *lock, + wait_queue_head_t *wq, u32 req_mask, u32 *acked, u32 timeout_ms) { u32 ack, req =3D READ_ONCE(*req_ptr) & req_mask; @@ -1334,8 +1328,16 @@ static int panthor_fw_wait_acks(const u32 *req_ptr, = const u32 *ack_ptr, if (!ret) return 0; =20 - if (wait_event_timeout(*wq, (READ_ONCE(*ack_ptr) & req_mask) =3D=3D req, - msecs_to_jiffies(timeout_ms))) + scoped_guard(spinlock_irqsave, lock) + *ack_irq_mask_ptr |=3D req_mask; + + ret =3D wait_event_timeout(*wq, (READ_ONCE(*ack_ptr) & req_mask) =3D=3D r= eq, + msecs_to_jiffies(timeout_ms)); + + scoped_guard(spinlock_irqsave, lock) + *ack_irq_mask_ptr &=3D ~req_mask; + + if (ret) return 0; =20 /* Check one last time, in case we were not woken up for some reason. 
*/ @@ -1369,6 +1371,8 @@ int panthor_fw_glb_wait_acks(struct panthor_device *p= tdev, =20 return panthor_fw_wait_acks(&glb_iface->input->req, &glb_iface->output->ack, + &glb_iface->input->ack_irq_mask, + &glb_iface->lock, &ptdev->fw->req_waitqueue, req_mask, acked, timeout_ms); } @@ -1395,6 +1399,8 @@ int panthor_fw_csg_wait_acks(struct panthor_device *p= tdev, u32 csg_slot, =20 ret =3D panthor_fw_wait_acks(&csg_iface->input->req, &csg_iface->output->ack, + &csg_iface->input->ack_irq_mask, + &csg_iface->lock, &ptdev->fw->req_waitqueue, req_mask, acked, timeout_ms); =20 diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/pant= hor/panthor_sched.c index 601a9bff1485..2edba335f22d 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -1110,7 +1110,7 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32= csg_id, u32 cs_id) cs_iface->input->ringbuf_output =3D queue->iface.output_fw_va; cs_iface->input->config =3D CS_CONFIG_PRIORITY(queue->priority) | CS_CONFIG_DOORBELL(queue->doorbell_id); - cs_iface->input->ack_irq_mask =3D ~0; + cs_iface->input->ack_irq_mask =3D CS_FATAL | CS_FAULT | CS_TILER_OOM; panthor_fw_update_reqs(cs_iface, req, CS_IDLE_SYNC_WAIT | CS_IDLE_EMPTY | @@ -1378,7 +1378,8 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u3= 2 csg_id, u32 priority) csg_iface->input->protm_suspend_buf =3D 0; } =20 - csg_iface->input->ack_irq_mask =3D ~0; + csg_iface->input->ack_irq_mask =3D CSG_SYNC_UPDATE | CSG_IDLE | + CSG_PROGRESS_TIMER_EVENT; panthor_fw_toggle_reqs(csg_iface, doorbell_req, doorbell_ack, queue_mask); return 0; } --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC72A3B9D81 for ; Wed, 29 Apr 2026 09:39:17 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455562; cv=none; b=nIIfLdpYRvXzb6o7llVCKyM8nHLycuEdHgsXg3lLrUHS3vi0PINoNWuWh5tY7qhrzdhA+W20Rl20cq4Iqc/B4zWDNKI+Vbp/oixmdt1FWfzfFS9r7GMwFGX5Ak5OGsVPGdfsq4JCa/fWg/i+u+W+Vqsh6lhQzr2CDxIuQTzd9I0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455562; c=relaxed/simple; bh=C6CmoxCJ94dFNPX2KUluUf+swmFBOWWsAAPz9g59ZmY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XoQNEP35bZVBgcIFvNQ8IMY27VeRxK5nCdR2jOisHgP0htxEiDgdJ4zYqv8kYB9yQmy89aQ5QygwSsZPIe9Ic9yiw6UcjuvDzeVCWRaZL7eG8h3ligIxMKZJ4yC3DRHPzaThmRdnuBCcylLfAMnvrbmtSnbXtd3dei0RRNJ39wo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=QAI7mA48; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="QAI7mA48" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455539; bh=C6CmoxCJ94dFNPX2KUluUf+swmFBOWWsAAPz9g59ZmY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=QAI7mA48RIceGm9SR24HZuL4X+8x8MzILC+GDxJrKoH1+qC5BPAZSAwGoG77i1ruX v92hmWq5x9foZQH+YjkjJoSdk7/zAaZLgsDh/9grRMO/uBPOoq9TjCpQvuKOZRN4bC QTvVB48Ui2cmgs0LK1ymlNnH2ogAtdVubEOeyGE7ouXZKx2vIBD77f3TOvpFgkbJ99 9w/IsDSBOaljvPrw1HTf5mxKFPywNfsN6O1Zim784u2sBiTvKsIqKLobm++0JrCFm7 lkfFMtqTOD2Yo5bVQyTqsFveIpA3s3BO96HL4WzexRKyDm8rsakY8Rv/GIvX6nEwTI oyIcoc9902HNw== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with 
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 38FC517E1582; Wed, 29 Apr 2026 11:38:59 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:36 +0200 Subject: [PATCH 09/10] drm/panthor: Process FW events in IRQ context Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-9-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=2131; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=C6CmoxCJ94dFNPX2KUluUf+swmFBOWWsAAPz9g59ZmY=; b=vFDJN0g5xeVGO1Uc0HjlfFTk1xdt/8x7z4Vur+kCVMWc7Ca7BcYNRy/NpNXv9lH/A/r+GS8fx jO2l5UJD8sgCPrQQRXIXQR2hfB3v1tkPEOyvGxG2kfk1PjySM57BGyB X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY= Now that everything is set to allow processing FW events in IRQ context, go for it. This should reduce the dma_fence signaling latency. 
Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_fw.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor= /panthor_fw.c index f5e0ceca4130..05c632913359 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -1087,9 +1087,38 @@ static void panthor_job_irq_handler(struct panthor_i= rq *pirq, u32 status) } } =20 +static irqreturn_t panthor_job_irq_raw_handler(int irq, void *data) +{ + struct panthor_irq *pirq =3D data; + + if (!gpu_read(pirq->iomem, INT_STAT)) + return IRQ_NONE; + + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + if (pirq->state !=3D PANTHOR_IRQ_STATE_ACTIVE) + return IRQ_NONE; + + pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; + } + + panthor_job_irq_handler(pirq, gpu_read(pirq->iomem, INT_RAWSTAT)); + + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; + } + + return IRQ_HANDLED; +} + static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data) { - return panthor_irq_default_threaded_handler(data, panthor_job_irq_handler= ); + struct panthor_irq *pirq =3D data; + + /* We never return IRQ_WAKE_THREAD, so we're not supposed to be called. 
*/ + drm_WARN_ON_ONCE(&pirq->ptdev->base, + "threaded IRQ handler should never be called."); + return IRQ_NONE; } =20 static int panthor_fw_start(struct panthor_device *ptdev) @@ -1489,7 +1518,7 @@ int panthor_fw_init(struct panthor_device *ptdev) =20 ret =3D panthor_irq_request(ptdev, &fw->irq, irq, 0, ptdev->iomem + JOB_INT_BASE, "job", - panthor_irq_default_raw_handler, + panthor_job_irq_raw_handler, panthor_job_irq_threaded_handler); if (ret) { drm_err(&ptdev->base, "failed to request job irq"); --=20 2.53.0 From nobody Tue Jun 16 20:39:30 2026 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED5823C3433 for ; Wed, 29 Apr 2026 09:39:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455564; cv=none; b=VihgBFqfN+ldCzWDuTxa5fwQMVPye0bs7InQ/pmVdwrpyzjAvx3Q58f1Pr+dOXvQMU/fM/yN1iuWiXj4B68DF8ZFfozalpuOb7cOze8yVaRok2nDfpsNj5OrxL+0kJBvRdCtb5rSyAXE0NpN1WtEhwDMHcxsMVfVYPiuSWkqwzA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777455564; c=relaxed/simple; bh=Do3Vqxpw6UgsDr87mQNKL4wcFc5sA82LI8m+cuWV8xw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=OOZpiryR4DrdGsYLuRhDddF0vGMm/iAtgvPAB6oMOGC/v/hCW/IjyHdphNMiechM56QS8uAnPt1NR9wHBhH2czQZIt0h4YMGXixeozamA4FKYY1XdUpVV9oA5OMAhPDZDTXk3o5Jyl59aKFOny1OZfOtscuRDntOi/ObikbKSJM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=DiB9gmzL; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="DiB9gmzL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1777455540; bh=Do3Vqxpw6UgsDr87mQNKL4wcFc5sA82LI8m+cuWV8xw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=DiB9gmzLb62VoZ37mdS8lQIX5+w6A8/RNmL/zlMwwZUnJSCfV7zsgjmnWSDwMSUmu wje1lK1ShXmt5Xoousq834hObOMpszVbigZF886gidHRI/v+GmQ+r1lXioEmh2/Y8l sJGRSX1G0QOXrBRoQbcDICMQVAvLcv+6c4bGJvQHgu7MD+PX+tkhOr1+Edt1C9vBpP ZVlQLEclfmRwp9l8sLtqIHrTkQQPT5T9n0YVPvLkhiaXROQNl6uKWMu54hs+6nMrts bNOwYCkfR67GNUwCCF4e/th2R5kGMNKwkLjgSKG6AoCnuXF0rzDhvsWPAsonjUx/+B DQvnFAuSsxqyA== Received: from [100.64.0.11] (unknown [100.64.0.11]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id BC2A517E1584; Wed, 29 Apr 2026 11:38:59 +0200 (CEST) From: Boris Brezillon Date: Wed, 29 Apr 2026 11:38:37 +0200 Subject: [PATCH 10/10] drm/panthor: Introduce interrupt coalescing support for job IRQs Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260429-panthor-signal-from-irq-v1-10-4b92ae4142d2@collabora.com> References: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> In-Reply-To: <20260429-panthor-signal-from-irq-v1-0-4b92ae4142d2@collabora.com> To: Steven Price , Liviu Dudau Cc: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Boris Brezillon X-Mailer: b4 0.14.3 
X-Developer-Signature: v=1; a=ed25519-sha256; t=1777455534; l=12327; i=boris.brezillon@collabora.com; s=20260429; h=from:subject:message-id; bh=Do3Vqxpw6UgsDr87mQNKL4wcFc5sA82LI8m+cuWV8xw=; b=ludwheV29uZ2jn/Gpb0Wh6zbqHmv998eHc7rRHsOrI2yBFgS+QZ/06AiVKboF+X50PuIiml5b XqCsPP4qoNwBuK7PVPGMQUTm/Es2ZS0HmJQ3927KBusBZtJAoDtoRVI X-Developer-Key: i=boris.brezillon@collabora.com; a=ed25519; pk=eN+ORdOgQY7d5U+0kA8h5bf67XdD8bhKbjD/TCHexSY= Handling interrupts in the raw IRQ handler is good for latency, but might be detrimental to overall throughput, because the system keeps being interrupted to process job interrupts. Try to mitigate that with some interrupt coalescing infrastructure, where we wake up the IRQ thread once enough closely spaced interrupts have been detected. This is still experimental, which is why the feature is off by default and has to be enabled through a debugfs knob. Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_device.h | 83 +++++++++++++++++ drivers/gpu/drm/panthor/panthor_drv.c | 1 + drivers/gpu/drm/panthor/panthor_fw.c | 150 +++++++++++++++++++++++++++= ++-- drivers/gpu/drm/panthor/panthor_fw.h | 2 + 4 files changed, 231 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/pan= thor/panthor_device.h index 1c130b8394ab..e90f251f75e2 100644 --- a/drivers/gpu/drm/panthor/panthor_device.h +++ b/drivers/gpu/drm/panthor/panthor_device.h @@ -109,6 +109,48 @@ struct panthor_irq { enum panthor_irq_state state; }; =20 +/** + * struct panthor_irq_coalescing - IRQ coalescing info + */ +struct panthor_irq_coalescing { + /** + * @max_us: Maximum time in microseconds between two consecutive + * interrupts to consider coalescing. + * + * It being a u16 means we can't encode more than 65-ish msecs, but + * if we have to poll status for more than a few hundred usecs it's + * going to make the IRQ thread consume more CPU than we want. 
+ */ + u16 max_us; + + /** + * @poll_period_us: Period in microseconds at which the status is polled. + * + * It being a u16 means we can't encode more than 65-ish msecs, but + * if we have to delay each status check by more than a few usecs + * it's going to add latency we don't want. + */ + u16 poll_period_us; + + /** + * @inbounds_cnt_threshold: Minimum number of consecutive interrupts with no + * more than max_us between them to wake up the thread handler. + */ + u16 inbounds_cnt_threshold; + + /** + * @inbounds_cnt: Current number of consecutive interrupts with no more + * than max_us between them. + */ + u16 inbounds_cnt; + + /** @coalesced_cnt: Total number of interrupts coalesced. */ + u64 coalesced_cnt; + + /** @last_ts: Timestamp of the last IRQ. */ + ktime_t last_ts; +}; + /** * enum panthor_device_profiling_mode - Profiling state */ @@ -571,6 +613,47 @@ static inline u64 gpu_read64_counter(void __iomem *iom= em, u32 reg) #define INT_MASK 0x8 #define INT_STAT 0xc =20 +static inline bool +panthor_irq_coalescing_wake_thread(struct panthor_irq_coalescing *coalesci= ng) +{ + ktime_t ts; + s64 diff_ns; + + if (!coalescing->inbounds_cnt_threshold) + return false; + + ts =3D ktime_get(); + diff_ns =3D ktime_to_ns(ktime_sub(ts, coalescing->last_ts)); + if (diff_ns > coalescing->max_us * 1000) { + coalescing->inbounds_cnt =3D 1; + return false; + } + + if (coalescing->inbounds_cnt < U16_MAX) + coalescing->inbounds_cnt++; + + return coalescing->inbounds_cnt >=3D coalescing->inbounds_cnt_threshold; +} + +static inline void +panthor_irq_coalescing_update_ts(struct panthor_irq_coalescing *coalescing) +{ + if (coalescing->inbounds_cnt_threshold) + coalescing->last_ts =3D ktime_get(); +} + +static inline void +panthor_irq_coalescing_init(struct panthor_irq_coalescing *coalescing, + u16 max_us, u16 poll_period_us, u16 inbounds_cnt_threshold) +{ + coalescing->inbounds_cnt =3D 0; + coalescing->coalesced_cnt =3D 0; + coalescing->max_us =3D max_us; + coalescing->poll_period_us =3D poll_period_us; + 
coalescing->inbounds_cnt_threshold =3D inbounds_cnt_threshold; + coalescing->last_ts =3D ktime_set(0, 0); +} + static inline irqreturn_t panthor_irq_default_raw_handler(int irq, void *d= ata) { struct panthor_irq *pirq =3D data; diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/pantho= r/panthor_drv.c index 66996c9147c2..2fac5ba57f9d 100644 --- a/drivers/gpu/drm/panthor/panthor_drv.c +++ b/drivers/gpu/drm/panthor/panthor_drv.c @@ -1760,6 +1760,7 @@ static void panthor_debugfs_init(struct drm_minor *mi= nor) { panthor_mmu_debugfs_init(minor); panthor_gem_debugfs_init(minor); + panthor_fw_debugfs_init(minor); } #endif =20 diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor= /panthor_fw.c index 05c632913359..cbb7d00f0e6e 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.c +++ b/drivers/gpu/drm/panthor/panthor_fw.c @@ -6,6 +6,7 @@ #endif =20 #include +#include #include #include #include @@ -15,6 +16,7 @@ #include =20 #include +#include #include #include =20 @@ -271,6 +273,9 @@ struct panthor_fw { =20 /** @irq: Job irq data. */ struct panthor_irq irq; + + /** @irq_coalescing: Job IRQ coalescing. 
*/ + struct panthor_irq_coalescing irq_coalescing; }; =20 struct panthor_vm *panthor_fw_vm(struct panthor_device *ptdev) @@ -1090,6 +1095,8 @@ static void panthor_job_irq_handler(struct panthor_ir= q *pirq, u32 status) static irqreturn_t panthor_job_irq_raw_handler(int irq, void *data) { struct panthor_irq *pirq =3D data; + struct panthor_device *ptdev =3D pirq->ptdev; + irqreturn_t ret =3D IRQ_HANDLED; =20 if (!gpu_read(pirq->iomem, INT_STAT)) return IRQ_NONE; @@ -1101,6 +1108,9 @@ static irqreturn_t panthor_job_irq_raw_handler(int ir= q, void *data) pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; } =20 + if (panthor_irq_coalescing_wake_thread(&ptdev->fw->irq_coalescing)) + ret =3D IRQ_WAKE_THREAD; + panthor_job_irq_handler(pirq, gpu_read(pirq->iomem, INT_RAWSTAT)); =20 scoped_guard(spinlock_irqsave, &pirq->mask_lock) { @@ -1108,17 +1118,58 @@ static irqreturn_t panthor_job_irq_raw_handler(int = irq, void *data) pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; } =20 - return IRQ_HANDLED; + panthor_irq_coalescing_update_ts(&ptdev->fw->irq_coalescing); + return ret; } =20 static irqreturn_t panthor_job_irq_threaded_handler(int irq, void *data) { struct panthor_irq *pirq =3D data; + struct panthor_device *ptdev =3D pirq->ptdev; + irqreturn_t ret =3D IRQ_NONE; + u32 processed_count =3D 0; =20 - /* We never return IRQ_WAKE_THREAD, so we're not supposed to be called. */ - drm_WARN_ON_ONCE(&pirq->ptdev->base, - "threaded IRQ handler should never be called."); - return IRQ_NONE; + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + if (pirq->state !=3D PANTHOR_IRQ_STATE_ACTIVE) + return IRQ_NONE; + + gpu_write(pirq->iomem, INT_MASK, 0); + pirq->state =3D PANTHOR_IRQ_STATE_PROCESSING; + } + + while (true) { + u32 status; + + /* It's safe to access pirq->mask without the lock held here. If a new + * event gets added to the mask and the corresponding IRQ is pending, + * we'll process it right away instead of adding an extra raw -> threaded + * round trip. 
If an event is removed and the status bit is set, it will + * be ignored, just like it would have been if the mask had been adjusted + * right before the HW event kicked in. In short, these are all expected + * races that we are covered for. + */ + if (readl_poll_timeout_atomic(pirq->iomem + INT_RAWSTAT, + status, status & pirq->mask, + ptdev->fw->irq_coalescing.poll_period_us, + ptdev->fw->irq_coalescing.max_us)) + break; + + panthor_job_irq_handler(pirq, status); + ret =3D IRQ_HANDLED; + processed_count++; + } + + if (processed_count > 1) + ptdev->fw->irq_coalescing.coalesced_cnt +=3D processed_count - 1; + + scoped_guard(spinlock_irqsave, &pirq->mask_lock) { + if (pirq->state =3D=3D PANTHOR_IRQ_STATE_PROCESSING) { + pirq->state =3D PANTHOR_IRQ_STATE_ACTIVE; + gpu_write(pirq->iomem, INT_MASK, pirq->mask); + } + } + + return ret; } =20 static int panthor_fw_start(struct panthor_device *ptdev) @@ -1516,6 +1567,11 @@ int panthor_fw_init(struct panthor_device *ptdev) if (irq <=3D 0) return -ENODEV; =20 + /* Start with IRQ coalescing disabled, until we have enough proof it's + * useful and doesn't add too much CPU overhead. The parameters can + * be tweaked through the debugfs knobs. 
+ */ + panthor_irq_coalescing_init(&fw->irq_coalescing, 0, 0, 0); ret =3D panthor_irq_request(ptdev, &fw->irq, irq, 0, ptdev->iomem + JOB_INT_BASE, "job", panthor_job_irq_raw_handler, @@ -1563,6 +1619,90 @@ int panthor_fw_init(struct panthor_device *ptdev) return ret; } =20 +static ssize_t job_irq_coalescing_props_read(struct file *file, + char __user *ubuf, + size_t ubuf_size, + loff_t *ppos) +{ + struct panthor_device *ptdev =3D container_of(file->private_data, + struct panthor_device, base); + char kbuf[256] =3D {}; + int kbuf_size; + + kbuf_size =3D snprintf(kbuf, sizeof(kbuf) - 1, + "max_us=3D%u poll_period_us=3D%u inbounds_cnt_threshold=3D%u\n", + ptdev->fw->irq_coalescing.max_us, + ptdev->fw->irq_coalescing.poll_period_us, + ptdev->fw->irq_coalescing.inbounds_cnt_threshold); + if (kbuf_size > sizeof(kbuf) - 1) + kbuf_size =3D sizeof(kbuf) - 1; + + return simple_read_from_buffer(ubuf, ubuf_size, ppos, kbuf, kbuf_size); +} + +static ssize_t job_irq_coalescing_props_write(struct file *file, + const char __user *ubuf, + size_t ubuf_size, loff_t *ppos) +{ + struct panthor_device *ptdev =3D container_of(file->private_data, + struct panthor_device, base); + unsigned int max_us =3D 0, poll_period_us =3D 0, inbounds_cnt_threshold = =3D 0; + char kbuf[256] =3D {}; + int ret; + + simple_write_to_buffer(kbuf, sizeof(kbuf) - 1, ppos, ubuf, ubuf_size); + ret =3D sscanf(kbuf, + "max_us=3D%u poll_period_us=3D%u inbounds_cnt_threshold=3D%u", + &max_us, &poll_period_us, &inbounds_cnt_threshold); + if (ret !=3D 3) + return -EINVAL; + + if (max_us > U16_MAX || poll_period_us > U16_MAX || inbounds_cnt_threshol= d > U16_MAX) + return -EINVAL; + + panthor_irq_coalescing_init(&ptdev->fw->irq_coalescing, max_us, + poll_period_us, inbounds_cnt_threshold); + return ubuf_size; +} + +static const struct debugfs_short_fops job_irq_coalescing_props_fops =3D { + .read =3D job_irq_coalescing_props_read, + .write =3D job_irq_coalescing_props_write, +}; + +static ssize_t 
job_irq_coalescing_stats_read(struct file *file, + char __user *ubuf, + size_t ubuf_size, + loff_t *ppos) +{ + struct panthor_device *ptdev =3D container_of(file->private_data, + struct panthor_device, base); + char kbuf[256] =3D {}; + int kbuf_size; + + kbuf_size =3D snprintf(kbuf, sizeof(kbuf) - 1, + "inbounds_cnt=3D%u coalesced_cnt=3D%llu last_ts=3D%llu\n", + ptdev->fw->irq_coalescing.inbounds_cnt, + ptdev->fw->irq_coalescing.coalesced_cnt, + ktime_to_ns(ptdev->fw->irq_coalescing.last_ts)); + if (kbuf_size > sizeof(kbuf) - 1) + kbuf_size =3D sizeof(kbuf) - 1; + + return simple_read_from_buffer(ubuf, ubuf_size, ppos, kbuf, kbuf_size); +} + +static const struct debugfs_short_fops job_irq_coalescing_stats_fops =3D { + .read =3D job_irq_coalescing_stats_read, +}; + +void panthor_fw_debugfs_init(struct drm_minor *minor) +{ + debugfs_create_file("job_irq_coalescing_props", 0600, minor->debugfs_root, + minor->dev, &job_irq_coalescing_props_fops); + debugfs_create_file("job_irq_coalescing_stats", 0400, minor->debugfs_root, + minor->dev, &job_irq_coalescing_stats_fops); +} + MODULE_FIRMWARE("arm/mali/arch10.8/mali_csffw.bin"); MODULE_FIRMWARE("arm/mali/arch10.10/mali_csffw.bin"); MODULE_FIRMWARE("arm/mali/arch10.12/mali_csffw.bin"); diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor= /panthor_fw.h index e56b7fe15bb3..2643bd9e4ef9 100644 --- a/drivers/gpu/drm/panthor/panthor_fw.h +++ b/drivers/gpu/drm/panthor/panthor_fw.h @@ -526,4 +526,6 @@ static inline int panthor_fw_resume(struct panthor_devi= ce *ptdev) int panthor_fw_init(struct panthor_device *ptdev); void panthor_fw_unplug(struct panthor_device *ptdev); =20 +void panthor_fw_debugfs_init(struct drm_minor *minor); + #endif --=20 2.53.0
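For readers who want to experiment with the wake-up heuristic from patch 10 outside the kernel, here is a minimal userspace sketch of the decision made by panthor_irq_coalescing_wake_thread() and panthor_irq_coalescing_update_ts(). It is not part of the series: the struct coalescing type, the coalescing_*() names and the explicit now_ns argument are hypothetical stand-ins for the kernel structure and ktime_get(). It also assumes both helpers see the same timestamp, whereas the patch samples the clock a second time after the handler has run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the fields used by the heuristic. */
struct coalescing {
	uint16_t max_us;                 /* max gap between IRQs to keep counting */
	uint16_t inbounds_cnt_threshold; /* 0 disables coalescing */
	uint16_t inbounds_cnt;           /* consecutive IRQs within max_us */
	int64_t last_ts_ns;              /* timestamp of the previous IRQ */
};

/* Same decision as in the patch, with the clock passed in explicitly. */
static bool coalescing_wake_thread(struct coalescing *c, int64_t now_ns)
{
	if (!c->inbounds_cnt_threshold)
		return false;

	/* Gap too large: restart the streak at one interrupt. */
	if (now_ns - c->last_ts_ns > (int64_t)c->max_us * 1000) {
		c->inbounds_cnt = 1;
		return false;
	}

	/* Saturate instead of wrapping around. */
	if (c->inbounds_cnt < UINT16_MAX)
		c->inbounds_cnt++;

	return c->inbounds_cnt >= c->inbounds_cnt_threshold;
}

static void coalescing_update_ts(struct coalescing *c, int64_t now_ns)
{
	if (c->inbounds_cnt_threshold)
		c->last_ts_ns = now_ns;
}

/* Simulate one interrupt; returns true where the raw handler would
 * return IRQ_WAKE_THREAD.
 */
static bool coalescing_on_irq(struct coalescing *c, int64_t now_ns)
{
	bool wake = coalescing_wake_thread(c, now_ns);

	coalescing_update_ts(c, now_ns);
	return wake;
}
```

With max_us=100 and inbounds_cnt_threshold=3, three interrupts spaced 50 usecs apart make the third call request the thread, and a later interrupt arriving after a long gap resets the streak.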