[PATCH net-next v4 0/2] net: pse-pd: add poll path and LED trigger support

Carlo Szelinsky posted 2 patches 2 months ago
There is a newer version of this series
drivers/net/pse-pd/pse_core.c | 296 ++++++++++++++++++++++++++++++----
include/linux/pse-pd/pse.h    |  34 ++++
2 files changed, 299 insertions(+), 31 deletions(-)
[PATCH net-next v4 0/2] net: pse-pd: add poll path and LED trigger support
Posted by Carlo Szelinsky 2 months ago
Thanks to Kory, Oleksij, Krzysztof and Andrew for all the helpful
feedback on earlier versions - really appreciate the time you put
into reviewing this.

This series adds poll-based event detection and LED trigger support
to the PSE core subsystem.

Patch 1 introduces the poll path independently of LED support,
so it can be tested in isolation on boards with and without IRQ
configured.

Patch 2 adds LED triggers that hook into the shared event handling
path introduced by patch 1.

Note: pse_handle_events() and the existing pse_isr() pass notifs_mask
as a single unsigned long, which limits the bitmask to BITS_PER_LONG
PI lines. This is a pre-existing constraint in the IRQ path and is
sufficient for all current PSE controllers (max 48 ports vs 64-bit
unsigned long), but may need to be converted to DECLARE_BITMAP() if
future hardware exceeds this limit.

Changes since v3:
- Dropped the dt-bindings poll-interval-ms patch: the poll interval
  is a driver decision, not a hardware property (Krzysztof)
- Removed of_property_read_u32() for poll-interval-ms from
  devm_pse_poll_helper(); the 500ms default is now hardcoded but
  drivers can override pcdev->poll_interval_ms before calling the
  helper
- Rebased on net-next/main

Changes since v2:
- Based on net-next/main, added net-next subject prefix
- Added --base tree information
- Added CC for devicetree list and DT maintainers
- Collected Reviewed-by from Kory Maincent on patch 1/3
- Fixed build error when CONFIG_LEDS_TRIGGERS is disabled:
  moved LED registration before list_add(), removing the
  pcdev->pi_led_trigs = NULL assignment on conditionally
  compiled struct member (reported by kernel test robot)
- Fixed use-after-free on device unbind: poll work is now
  cancelled via devm_add_action_or_reset() to ensure correct
  devres teardown ordering (poll_work cancelled before
  poll_notifs is freed)
- Used system_freezable_wq for poll worker to prevent hardware
  access during system suspend
- Added PoDL power status and admin state checks to LED triggers
  so they work for both C33 and PoDL controller types
- Used dev_name(dev) for LED trigger names to ensure uniqueness
  across multiple PSE controllers (of_node->name can be generic)
- Added initial LED state query at registration so already-active
  ports are reflected immediately
- Added pse_led_update() calls in regulator enable/disable paths
  so ethtool admin state changes are reflected in LEDs
- Moved LED trigger registration before list_add() to prevent
  race where IRQ/poll could invoke pse_led_update() on partially
  initialized triggers

Changes since v1:
- Split single patch into 3 separate patches
- Extracted pse_handle_events() and devm_pse_poll_helper() as a
  standalone poll path (patches 1-2), testable without LED code
- Added DT binding for poll-interval-ms as a separate patch
- Renamed led-poll-interval-ms to poll-interval-ms for generic use
- Fire LED triggers from the notification path rather than a
  separate poll loop

Tested on Realtek RTL9303 with HS104 PoE chip, poll path only
(without IRQ configured). Verified PD connect/disconnect notifications
and LED trigger state changes.

Link: https://lore.kernel.org/all/20260329153124.2823980-1-github@szelinsky.de/
Link: https://lore.kernel.org/all/20260323201225.1836561-1-github@szelinsky.de/
Link: https://lore.kernel.org/all/20260314235916.2391678-1-github@szelinsky.de/

Carlo Szelinsky (2):
  net: pse-pd: add devm_pse_poll_helper()
  net: pse-pd: add LED trigger support via notification path

 drivers/net/pse-pd/pse_core.c | 296 ++++++++++++++++++++++++++++++----
 include/linux/pse-pd/pse.h    |  34 ++++
 2 files changed, 299 insertions(+), 31 deletions(-)


base-commit: b3e69fc3196fc421e26196e7792f17b0463edc6f
-- 
2.43.0
[PATCH net-next v5 0/2] net: pse-pd: add poll path and LED trigger support
Posted by Carlo Szelinsky 1 month, 2 weeks ago
Thanks to Kory, Oleksij, Krzysztof, Andrew and Jakub for all the
helpful feedback on earlier versions:-) I really appreciate the time
you put into reviewing this. I hope this is now the last round ;-)

This series adds poll-based event detection and LED trigger support
to the PSE core subsystem.

Patch 1 introduces the poll path independently of LED support,
so it can be tested in isolation on boards with and without IRQ
configured.

Patch 2 adds LED triggers that hook into the shared event handling
path introduced by patch 1.

Note: while reworking the poll teardown for v5, I noticed
pse_release_pis() (which kfree()s the pi[] array) runs before
disable_irq() / cancel_delayed_work_sync() in
pse_controller_unregister(). A concurrent IRQ or poll worker that
fires in that window would read through a freed pi[]. This is
pre-existing for the IRQ path (since fc0e6db30941, "Add support for
reporting events"); the v5 poll cancel inherits the same placement
for symmetry. Does it make sense that I send a separate fix that disables
both async sources before pse_release_pis() or am I wrong here?

Changes since v4:
- Rebased on top of "net: pse-pd: fix out-of-bounds bitmap access in
  pse_isr() on 32-bit" (5099807f335c, merged via net). Extended the
  same bitmap-pointer pattern to the new pse_handle_events() and
  pse_poll_worker() code paths so for_each_set_bit() does not read
  past a single unsigned long when nr_lines > BITS_PER_LONG (Jakub,
  Kory)
- Cancel the poll work explicitly in pse_controller_unregister(),
  next to the existing disable_irq() for the IRQ path, instead of
  relying on devm_add_action_or_reset() LIFO ordering. Makes the
  helper safe even when a driver registers it in the standard order
  (helper before devm_pse_controller_register()) (Jakub)
- struct pse_pi_led_triggers no longer wrapped in
  #if IS_ENABLED(CONFIG_LEDS_TRIGGERS); only the function bodies
  remain ifdef'd, so reference sites no longer need their own
  ifdefs (Jakub)
- Added .activate callbacks for both LED triggers. Without these,
  an LED bound to a trigger after pse_controller_register() (e.g.
  via sysfs) would stay dark until the next hardware event toggled
  state (Jakub)
- Run the post-registration initial-state pass under pcdev->lock,
  matching pse_led_update()'s documented locking contract and
  avoiding races with concurrent regulator_enable() that share
  last_delivering / last_enabled and the hardware ops (Jakub)
- On partial pse_led_triggers_register() failure, NULL out
  pcdev->pi_led_trigs so the existing early-return guard in
  pse_led_update() short-circuits any later calls onto
  partially-registered triggers (Jakub)

Changes since v3:
- Dropped the dt-bindings poll-interval-ms patch: the poll interval
  is a driver decision, not a hardware property (Krzysztof)
- Removed of_property_read_u32() for poll-interval-ms from
  devm_pse_poll_helper(); the 500ms default is now hardcoded but
  drivers can override pcdev->poll_interval_ms before calling the
  helper
- Rebased on net-next/main

Changes since v2:
- Based on net-next/main, added net-next subject prefix
- Added --base tree information
- Added CC for devicetree list and DT maintainers
- Collected Reviewed-by from Kory Maincent on patch 1/3
- Fixed build error when CONFIG_LEDS_TRIGGERS is disabled:
  moved LED registration before list_add(), removing the
  pcdev->pi_led_trigs = NULL assignment on conditionally
  compiled struct member (reported by kernel test robot)
- Fixed use-after-free on device unbind: poll work is now
  cancelled via devm_add_action_or_reset() to ensure correct
  devres teardown ordering (poll_work cancelled before
  poll_notifs is freed)
- Used system_freezable_wq for poll worker to prevent hardware
  access during system suspend
- Added PoDL power status and admin state checks to LED triggers
  so they work for both C33 and PoDL controller types
- Used dev_name(dev) for LED trigger names to ensure uniqueness
  across multiple PSE controllers (of_node->name can be generic)
- Added initial LED state query at registration so already-active
  ports are reflected immediately
- Added pse_led_update() calls in regulator enable/disable paths
  so ethtool admin state changes are reflected in LEDs
- Moved LED trigger registration before list_add() to prevent
  race where IRQ/poll could invoke pse_led_update() on partially
  initialized triggers

Changes since v1:
- Split single patch into 3 separate patches
- Extracted pse_handle_events() and devm_pse_poll_helper() as a
  standalone poll path (patches 1-2), testable without LED code
- Added DT binding for poll-interval-ms as a separate patch
- Renamed led-poll-interval-ms to poll-interval-ms for generic use
- Fire LED triggers from the notification path rather than a
  separate poll loop

Tested on Realtek RTL9303 with HS104 PoE chip, poll path only
(without IRQ configured). Verified PD connect/disconnect notifications
and LED trigger state changes.

Link: https://lore.kernel.org/all/20260410124428.809943-1-github@szelinsky.de/
Link: https://lore.kernel.org/all/20260329153124.2823980-1-github@szelinsky.de/
Link: https://lore.kernel.org/all/20260323201225.1836561-1-github@szelinsky.de/

Carlo Szelinsky (2):
  net: pse-pd: add devm_pse_poll_helper()
  net: pse-pd: add LED trigger support via notification path

 drivers/net/pse-pd/pse_core.c | 355 +++++++++++++++++++++++++++++++---
 include/linux/pse-pd/pse.h    |  32 +++
 2 files changed, 356 insertions(+), 31 deletions(-)


base-commit: 09942ddedcb960f9e78fd817ec33f501d1040c5b
--
2.43.0
[PATCH net-next v5 1/2] net: pse-pd: add devm_pse_poll_helper()
Posted by Carlo Szelinsky 1 month, 2 weeks ago
Extract the common event handling loop from pse_isr() into a shared
pse_handle_events() function, and add a generic poll-based alternative
to the IRQ path for PSE controllers that lack interrupt support or
have IRQ lines not wired on the board.

The new devm_pse_poll_helper() function sets up a delayed work that
periodically calls the driver's map_event callback to detect state
changes, feeding events into the existing ntf_fifo / pse_send_ntf_worker
notification pipeline. This reuses the same pse_irq_desc interface as
the IRQ path: the driver provides a map_event callback that populates
per-PI notification arrays.

The poll worker uses system_freezable_wq to avoid running during system
suspend when the underlying hardware (e.g. I2C bus) may be inaccessible.

Cancel the poll work explicitly in pse_controller_unregister() (next to
the existing disable_irq() for the IRQ path) so it cannot outlive the
controller it pushes events into. This avoids relying on devres LIFO
ordering between devm_pse_controller_register() and devm_pse_poll_helper()
when a driver follows the standard pattern of setting up the helper
before registering the controller.

The notifs_mask is allocated as a real bitmap via devm_bitmap_zalloc()
and passed to pse_handle_events() and the driver's map_event callback
as a pointer (mirroring the pattern from commit 5099807f335c
("net: pse-pd: fix out-of-bounds bitmap access in pse_isr() on 32-bit"))
so for_each_set_bit() does not read past a single unsigned long when
nr_lines > BITS_PER_LONG.

The poll interval defaults to 500ms, balancing responsiveness against
bus load (e.g. I2C).

Signed-off-by: Carlo Szelinsky <github@szelinsky.de>
---
 drivers/net/pse-pd/pse_core.c | 157 +++++++++++++++++++++++++++-------
 include/linux/pse-pd/pse.h    |  14 +++
 2 files changed, 140 insertions(+), 31 deletions(-)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 87aa4f4e9724..b7ffec0c942c 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -14,10 +14,17 @@
 #include <linux/regulator/driver.h>
 #include <linux/regulator/machine.h>
 #include <linux/rtnetlink.h>
+#include <linux/workqueue.h>
 #include <net/net_trackers.h>
 
 #define PSE_PW_D_LIMIT INT_MAX
 
+/* Default poll interval for controllers without IRQ support.
+ * 500ms provides a reasonable trade-off between responsiveness
+ * (event detection, PD detection) and I2C bus utilization.
+ */
+#define PSE_DEFAULT_POLL_INTERVAL_MS 500
+
 static DEFINE_MUTEX(pse_list_mutex);
 static LIST_HEAD(pse_controller_list);
 static DEFINE_XARRAY_ALLOC(pse_pw_d_map);
@@ -1118,6 +1125,8 @@ void pse_controller_unregister(struct pse_controller_dev *pcdev)
 	pse_release_pis(pcdev);
 	if (pcdev->irq)
 		disable_irq(pcdev->irq);
+	if (pcdev->polling)
+		cancel_delayed_work_sync(&pcdev->poll_work);
 	cancel_work_sync(&pcdev->ntf_work);
 	kfifo_free(&pcdev->ntf_fifo);
 	mutex_lock(&pse_list_mutex);
@@ -1239,66 +1248,104 @@ static int pse_set_config_isr(struct pse_controller_dev *pcdev, int id,
 }
 
 /**
- * pse_isr - IRQ handler for PSE
- * @irq: irq number
- * @data: pointer to user interrupt structure
+ * pse_handle_events - Process PSE events for all PIs
+ * @pcdev: a pointer to the PSE controller device
+ * @notifs: per-PI notification array
+ * @notifs_mask: bitmap of PIs with events (sized for pcdev->nr_lines)
  *
- * Return: irqreturn_t - status of IRQ
+ * Common event handling shared between IRQ and poll paths.
+ * Caller must hold pcdev->lock.
  */
-static irqreturn_t pse_isr(int irq, void *data)
+static void pse_handle_events(struct pse_controller_dev *pcdev,
+			      unsigned long *notifs,
+			      unsigned long *notifs_mask)
 {
-	struct pse_controller_dev *pcdev;
-	struct pse_irq_desc *desc;
-	struct pse_irq *h = data;
-	int ret, i;
-
-	desc = &h->desc;
-	pcdev = h->pcdev;
-
-	/* Clear notifs mask */
-	memset(h->notifs, 0, pcdev->nr_lines * sizeof(*h->notifs));
-	bitmap_zero(h->notifs_mask, pcdev->nr_lines);
-	mutex_lock(&pcdev->lock);
-	ret = desc->map_event(irq, pcdev, h->notifs, h->notifs_mask);
-	if (ret || bitmap_empty(h->notifs_mask, pcdev->nr_lines)) {
-		mutex_unlock(&pcdev->lock);
-		return IRQ_NONE;
-	}
+	int i;
 
-	for_each_set_bit(i, h->notifs_mask, pcdev->nr_lines) {
-		unsigned long notifs, rnotifs;
+	for_each_set_bit(i, notifs_mask, pcdev->nr_lines) {
+		unsigned long pi_notifs, rnotifs;
 		struct pse_ntf ntf = {};
+		int ret;
 
 		/* Do nothing PI not described */
 		if (!pcdev->pi[i].rdev)
 			continue;
 
-		notifs = h->notifs[i];
+		pi_notifs = notifs[i];
 		if (pse_pw_d_is_sw_pw_control(pcdev, pcdev->pi[i].pw_d)) {
-			ret = pse_set_config_isr(pcdev, i, notifs);
+			ret = pse_set_config_isr(pcdev, i, pi_notifs);
 			if (ret)
-				notifs |= ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR;
+				pi_notifs |= ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR;
 		}
 
-		dev_dbg(h->pcdev->dev,
-			"Sending PSE notification EVT 0x%lx\n", notifs);
+		dev_dbg(pcdev->dev,
+			"Sending PSE notification EVT 0x%lx\n", pi_notifs);
 
-		ntf.notifs = notifs;
+		ntf.notifs = pi_notifs;
 		ntf.id = i;
 		kfifo_in_spinlocked(&pcdev->ntf_fifo, &ntf, 1,
 				    &pcdev->ntf_fifo_lock);
 		schedule_work(&pcdev->ntf_work);
 
-		rnotifs = pse_to_regulator_notifs(notifs);
+		rnotifs = pse_to_regulator_notifs(pi_notifs);
 		regulator_notifier_call_chain(pcdev->pi[i].rdev, rnotifs,
 					      NULL);
 	}
+}
+
+/**
+ * pse_isr - IRQ handler for PSE
+ * @irq: irq number
+ * @data: pointer to user interrupt structure
+ *
+ * Return: irqreturn_t - status of IRQ
+ */
+static irqreturn_t pse_isr(int irq, void *data)
+{
+	struct pse_controller_dev *pcdev;
+	struct pse_irq *h = data;
+	int ret;
+
+	pcdev = h->pcdev;
 
+	/* Clear notifs mask */
+	memset(h->notifs, 0, pcdev->nr_lines * sizeof(*h->notifs));
+	bitmap_zero(h->notifs_mask, pcdev->nr_lines);
+	mutex_lock(&pcdev->lock);
+	ret = h->desc.map_event(irq, pcdev, h->notifs, h->notifs_mask);
+	if (ret || bitmap_empty(h->notifs_mask, pcdev->nr_lines)) {
+		mutex_unlock(&pcdev->lock);
+		return IRQ_NONE;
+	}
+
+	pse_handle_events(pcdev, h->notifs, h->notifs_mask);
 	mutex_unlock(&pcdev->lock);
 
 	return IRQ_HANDLED;
 }
 
+static void pse_poll_worker(struct work_struct *work)
+{
+	struct pse_controller_dev *pcdev =
+		container_of(work, struct pse_controller_dev,
+			     poll_work.work);
+	int ret;
+
+	memset(pcdev->poll_notifs, 0,
+	       pcdev->nr_lines * sizeof(*pcdev->poll_notifs));
+	bitmap_zero(pcdev->poll_notifs_mask, pcdev->nr_lines);
+	mutex_lock(&pcdev->lock);
+	ret = pcdev->poll_desc.map_event(0, pcdev, pcdev->poll_notifs,
+					 pcdev->poll_notifs_mask);
+	if (!ret && !bitmap_empty(pcdev->poll_notifs_mask, pcdev->nr_lines))
+		pse_handle_events(pcdev, pcdev->poll_notifs,
+				  pcdev->poll_notifs_mask);
+	mutex_unlock(&pcdev->lock);
+
+	queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
+			   msecs_to_jiffies(pcdev->poll_interval_ms));
+}
+
 /**
  * devm_pse_irq_helper - Register IRQ based PSE event notifier
  * @pcdev: a pointer to the PSE
@@ -1356,6 +1403,54 @@ int devm_pse_irq_helper(struct pse_controller_dev *pcdev, int irq,
 }
 EXPORT_SYMBOL_GPL(devm_pse_irq_helper);
 
+/**
+ * devm_pse_poll_helper - Register poll-based PSE event notifier
+ * @pcdev: a pointer to the PSE controller device
+ * @d: PSE event description (uses same pse_irq_desc as IRQ path)
+ *
+ * For PSE controllers without IRQ support or with IRQ not wired. Sets
+ * up a delayed work that periodically calls the driver's map_event
+ * callback to detect state changes, feeding events into the standard
+ * notification pipeline.
+ *
+ * The poll worker uses system_freezable_wq to ensure it does not run
+ * during system suspend while the hardware may be inaccessible.
+ *
+ * Return: 0 on success and errno on failure
+ */
+int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
+			 const struct pse_irq_desc *d)
+{
+	struct device *dev = pcdev->dev;
+
+	if (!d || !d->map_event || !d->name)
+		return -EINVAL;
+
+	pcdev->poll_desc = *d;
+	pcdev->poll_notifs = devm_kcalloc(dev, pcdev->nr_lines,
+					  sizeof(*pcdev->poll_notifs),
+					  GFP_KERNEL);
+	if (!pcdev->poll_notifs)
+		return -ENOMEM;
+
+	pcdev->poll_notifs_mask = devm_bitmap_zalloc(dev, pcdev->nr_lines,
+						     GFP_KERNEL);
+	if (!pcdev->poll_notifs_mask)
+		return -ENOMEM;
+
+	if (!pcdev->poll_interval_ms)
+		pcdev->poll_interval_ms = PSE_DEFAULT_POLL_INTERVAL_MS;
+
+	INIT_DELAYED_WORK(&pcdev->poll_work, pse_poll_worker);
+	pcdev->polling = true;
+
+	queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
+			   msecs_to_jiffies(pcdev->poll_interval_ms));
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devm_pse_poll_helper);
+
 /* PSE control section */
 
 static void __pse_control_release(struct kref *kref)
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 4e5696cfade7..4b5d5b11a084 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -292,6 +292,12 @@ struct pse_ntf {
  * @pi: table of PSE PIs described in this controller device
  * @no_of_pse_pi: flag set if the pse_pis devicetree node is not used
  * @irq: PSE interrupt
+ * @polling: flag indicating poll-based event detection is active
+ * @poll_interval_ms: poll interval in milliseconds
+ * @poll_work: delayed work for poll-based event detection
+ * @poll_desc: copy of the driver's event descriptor for polling
+ * @poll_notifs: per-PI notification scratch space for poll worker
+ * @poll_notifs_mask: bitmap of PIs with events for poll worker
  * @pis_prio_max: Maximum value allowed for the PSE PIs priority
  * @supp_budget_eval_strategies: budget evaluation strategies supported
  *				 by the PSE
@@ -312,6 +318,12 @@ struct pse_controller_dev {
 	struct pse_pi *pi;
 	bool no_of_pse_pi;
 	int irq;
+	bool polling;
+	unsigned int poll_interval_ms;
+	struct delayed_work poll_work;
+	struct pse_irq_desc poll_desc;
+	unsigned long *poll_notifs;
+	unsigned long *poll_notifs_mask;
 	unsigned int pis_prio_max;
 	u32 supp_budget_eval_strategies;
 	struct work_struct ntf_work;
@@ -345,6 +357,8 @@ int devm_pse_controller_register(struct device *dev,
 				 struct pse_controller_dev *pcdev);
 int devm_pse_irq_helper(struct pse_controller_dev *pcdev, int irq,
 			int irq_flags, const struct pse_irq_desc *d);
+int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
+			 const struct pse_irq_desc *d);
 
 struct pse_control *of_pse_control_get(struct device_node *node,
 				       struct phy_device *phydev);
-- 
2.43.0
Re: [PATCH net-next v5 1/2] net: pse-pd: add devm_pse_poll_helper()
Posted by Jakub Kicinski 1 month, 1 week ago
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
    net: pse-pd: add devm_pse_poll_helper()

This patch extracts the common event handling loop from pse_isr() into a
shared function and adds a poll-based alternative for PSE controllers
that lack interrupt support.

> diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
> index 87aa4f4e9724..b7ffec0c942c 100644
> --- a/drivers/net/pse-pd/pse_core.c
> +++ b/drivers/net/pse-pd/pse_core.c
[ ... ]
> @@ -1118,6 +1125,8 @@ void pse_controller_unregister(struct pse_controller_dev *pcdev)
>  	pse_release_pis(pcdev);
>  	if (pcdev->irq)
>  		disable_irq(pcdev->irq);
> +	if (pcdev->polling)
> +		cancel_delayed_work_sync(&pcdev->poll_work);

Because pse_release_pis() executes kfree(pcdev->pi) before the delayed work
is cancelled, can pse_poll_worker() run concurrently and access the freed
pcdev->pi array inside pse_handle_events() before the cancellation completes?

[ ... ]
> @@ -1356,6 +1403,54 @@ int devm_pse_irq_helper(struct pse_controller_dev *pcdev, int irq,
>  }
>  EXPORT_SYMBOL_GPL(devm_pse_irq_helper);
>  
> +/**
> + * devm_pse_poll_helper - Register poll-based PSE event notifier
> + * @pcdev: a pointer to the PSE controller device
> + * @d: PSE event description (uses same pse_irq_desc as IRQ path)
[ ... ]
> +int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
> +			 const struct pse_irq_desc *d)
> +{
> +	struct device *dev = pcdev->dev;
> +
> +	if (!d || !d->map_event || !d->name)
> +		return -EINVAL;
> +
> +	pcdev->poll_desc = *d;
> +	pcdev->poll_notifs = devm_kcalloc(dev, pcdev->nr_lines,
> +					  sizeof(*pcdev->poll_notifs),
> +					  GFP_KERNEL);
> +	if (!pcdev->poll_notifs)
> +		return -ENOMEM;
> +
> +	pcdev->poll_notifs_mask = devm_bitmap_zalloc(dev, pcdev->nr_lines,
> +						     GFP_KERNEL);
> +	if (!pcdev->poll_notifs_mask)
> +		return -ENOMEM;
> +
> +	if (!pcdev->poll_interval_ms)
> +		pcdev->poll_interval_ms = PSE_DEFAULT_POLL_INTERVAL_MS;
> +
> +	INIT_DELAYED_WORK(&pcdev->poll_work, pse_poll_worker);
> +	pcdev->polling = true;
> +
> +	queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
> +			   msecs_to_jiffies(pcdev->poll_interval_ms));

Since memory is allocated with devm_ helpers but no devm cleanup action is
registered to cancel the delayed work, what happens if a driver's probe
function encounters an error after calling this helper? Will devres free the
pcdev structure while the system workqueue still has the delayed work armed,
leading to a use-after-free when the timer expires?

The commit message mentions that drivers typically set up the helper before
registering the controller. If queue_delayed_work() schedules the worker
to run before devm_pse_controller_register() completes, will
pse_poll_worker() attempt to acquire pcdev->lock and push events into
pcdev->ntf_fifo before they are initialized?

> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(devm_pse_poll_helper);
Re: [PATCH net-next v5 1/2] net: pse-pd: add devm_pse_poll_helper()
Posted by Carlo Szelinsky 1 month ago
Hi Jakub,

Thanks for sending the AI review. I checked the code again... both
points are real, but one of them looks to me like more theory than practice.
Out of curiosity, what prompt do you use for these reviews, so I can use it 
before sending the next version? Replying inline:

On Mon, 4 May 2026 18:57:57 -0700 Jakub Kicinski wrote:
> > @@ -1118,6 +1125,8 @@ void pse_controller_unregister(struct pse_controller_dev *pcdev)
> >  	pse_release_pis(pcdev);
> >  	if (pcdev->irq)
> >  		disable_irq(pcdev->irq);
> > +	if (pcdev->polling)
> > +		cancel_delayed_work_sync(&pcdev->poll_work);
>
> Because pse_release_pis() executes kfree(pcdev->pi) before the delayed work
> is cancelled, can pse_poll_worker() run concurrently and access the freed
> pcdev->pi array inside pse_handle_events() before the cancellation completes?

Yes, seems like a real bug. The IRQ thread and the poll worker can still run
after pse_release_pis() frees pcdev->pi, and both end up in
pse_handle_events() touching pcdev->pi[i]. We need to stop them
first. v6 will reorder pse_controller_unregister() to:

	if (pcdev->irq)
		disable_irq(pcdev->irq);
	if (pcdev->polling)
		cancel_delayed_work_sync(&pcdev->poll_work);
	cancel_work_sync(&pcdev->ntf_work);
	pse_release_pis(pcdev);

The IRQ side seem to be already broken before this patch. I went
through the git log:

* fc0e6db30941 ("net: pse-pd: Add support for reporting events")
  added pse_isr but never called disable_irq() in unregister at all.
* ffef61d6d273 ("net: pse-pd: Add support for budget evaluation
  strategies") added the disable_irq() call, but put it in the
  wrong place (after pse_release_pis()).

What do you suggest to do, I could imagine: ... 

1) Send the fix as a standalone "Fixes:" patch to net first,
   then rebase v6 on top once it lands? Or fold it into v6 as
   patch 1/3? My vote is standalone since the fix stands on its
   own, but up to you.

2) Which commit for the Fixes: tag? I'd pick ffef61d6d273
   (that's where the broken order came from), but fc0e6db30941
   also works (no protection at all back then). wdyt?

> > +int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
> > +			 const struct pse_irq_desc *d)
> > +{
> [...]
> > +	queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
> > +			   msecs_to_jiffies(pcdev->poll_interval_ms));
>
> Since memory is allocated with devm_ helpers but no devm cleanup action is
> registered to cancel the delayed work, what happens if a driver's probe
> function encounters an error after calling this helper? [...]
>
> The commit message mentions that drivers typically set up the helper before
> registering the controller. If queue_delayed_work() schedules the worker
> to run before devm_pse_controller_register() completes, will
> pse_poll_worker() attempt to acquire pcdev->lock and push events into
> pcdev->ntf_fifo before they are initialized?

The probe-failure case is a real bug. If probe fails between this
helper and devm_pse_controller_register(), the work stays queued
and fires later on a pcdev whose devres memory is gone. UAF.

The "worker fires before mutex/kfifo are ready" case is real in
theory, but I don't think you can actually hit it?! Default
poll_interval_ms is 1000 ms, so pse_controller_register() would
need to take over a second to lose the race. Worth fixing to keep
the code clean, but not urgent or?

Same fix for both: move queue_delayed_work() out of the helper.
The helper just allocates the state and does INIT_DELAYED_WORK.
pse_controller_register() then arms the work as the very last
step on success, gated on pcdev->polling. So the work only ever
runs when everything is set up AND register has succeeded.

Should I fix both in v6, or just the probe-failure leak? One
change covers both, so I would suggest to do both. wdyt?

So the steps for v6 [1/2] could be:
* standalone "Fixes:" patch reordering pse_controller_unregister()
  (sent to net first, unless you suggest to fold it in)
* move queue_delayed_work() into pse_controller_register()

Do you have any other feedback, do I miss anything ?

Cheers,
Carlo
Re: [PATCH net-next v5 1/2] net: pse-pd: add devm_pse_poll_helper()
Posted by Oleksij Rempel 4 weeks, 1 day ago
On Sat, May 16, 2026 at 12:17:20PM +0200, Carlo Szelinsky wrote:
> Hi Jakub,
> 
> Thanks for sending the AI review. I checked the code again... both
> points are real, but one of them looks to me like more theory than practice.
> Out of curiosity, what prompt do you use for these reviews, so I can use it 
> before sending the next version?

Sashiko is using https://github.com/masoncl/review-prompts

I personally run pre-review with opencode + review-prompts +
 gemini-3.1-pro-preview-customtools / or claude-sonnet-4-5

After installing review-prompts to opencode config the review can be
done with "/kreview commit_id"

But sashiko still finds things, i do not know what is the difference.

> Replying inline:
> 
> On Mon, 4 May 2026 18:57:57 -0700 Jakub Kicinski wrote:
> > > @@ -1118,6 +1125,8 @@ void pse_controller_unregister(struct pse_controller_dev *pcdev)
> > >  	pse_release_pis(pcdev);
> > >  	if (pcdev->irq)
> > >  		disable_irq(pcdev->irq);
> > > +	if (pcdev->polling)
> > > +		cancel_delayed_work_sync(&pcdev->poll_work);
> >
> > Because pse_release_pis() executes kfree(pcdev->pi) before the delayed work
> > is cancelled, can pse_poll_worker() run concurrently and access the freed
> > pcdev->pi array inside pse_handle_events() before the cancellation completes?
> 
> Yes, seems like a real bug. The IRQ thread and the poll worker can still run
> after pse_release_pis() frees pcdev->pi, and both end up in
> pse_handle_events() touching pcdev->pi[i]. We need to stop them
> first. v6 will reorder pse_controller_unregister() to:
> 
> 	if (pcdev->irq)
> 		disable_irq(pcdev->irq);
> 	if (pcdev->polling)
> 		cancel_delayed_work_sync(&pcdev->poll_work);
> 	cancel_work_sync(&pcdev->ntf_work);
> 	pse_release_pis(pcdev);
> 
> The IRQ side seem to be already broken before this patch. I went
> through the git log:
> 
> * fc0e6db30941 ("net: pse-pd: Add support for reporting events")
>   added pse_isr but never called disable_irq() in unregister at all.
> * ffef61d6d273 ("net: pse-pd: Add support for budget evaluation
>   strategies") added the disable_irq() call, but put it in the
>   wrong place (after pse_release_pis()).
> 
> What do you suggest to do, I could imagine: ... 
> 
> 1) Send the fix as a standalone "Fixes:" patch to net first,
>    then rebase v6 on top once it lands? Or fold it into v6 as
>    patch 1/3? My vote is standalone since the fix stands on its
>    own, but up to you.

Yes, please send the fix to stable. In this case you will need to wait
until net-next includes stable fixes.

> 2) Which commit for the Fixes: tag? I'd pick ffef61d6d273
>    (that's where the broken order came from), but fc0e6db30941
>    also works (no protection at all back then). wdyt?

If it is introduce by ffef61d6d273, then use this for Fixes.

> > > +int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
> > > +			 const struct pse_irq_desc *d)
> > > +{
> > [...]
> > > +	queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
> > > +			   msecs_to_jiffies(pcdev->poll_interval_ms));
> >
> > Since memory is allocated with devm_ helpers but no devm cleanup action is
> > registered to cancel the delayed work, what happens if a driver's probe
> > function encounters an error after calling this helper? [...]
> >
> > The commit message mentions that drivers typically set up the helper before
> > registering the controller. If queue_delayed_work() schedules the worker
> > to run before devm_pse_controller_register() completes, will
> > pse_poll_worker() attempt to acquire pcdev->lock and push events into
> > pcdev->ntf_fifo before they are initialized?
> 
> The probe-failure case is a real bug. If probe fails between this
> helper and devm_pse_controller_register(), the work stays queued
> and fires later on a pcdev whose devres memory is gone. UAF.
> 
> The "worker fires before mutex/kfifo are ready" case is real in
> theory, but I don't think you can actually hit it?! Default
> poll_interval_ms is 1000 ms, so pse_controller_register() would
> need to take over a second to lose the race. Worth fixing to keep
> the code clean, but not urgent or?

The probe() is a process and can be outscheduled by any high prio process
at any time. May be on a single core SoC with pse-pd core build
as module and squashfs root it will make the boot load heavy enough
to reproduce it?

> Same fix for both: move queue_delayed_work() out of the helper.
> The helper just allocates the state and does INIT_DELAYED_WORK.
> pse_controller_register() then arms the work as the very last
> step on success, gated on pcdev->polling. So the work only ever
> runs when everything is set up AND register has succeeded.

Sounds good.

> Should I fix both in v6, or just the probe-failure leak? One
> change covers both, so I would suggest to do both. wdyt?

Fixes for pre-existing bugs to net, new bugs to net-next :) 

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
[PATCH net-next v5 2/2] net: pse-pd: add LED trigger support via notification path
Posted by Carlo Szelinsky 1 month, 2 weeks ago
Add per-PI "delivering" and "enabled" LED triggers to the PSE core
subsystem. LED state is updated from the shared pse_handle_events()
function whenever the IRQ or poll path detects a state change, as well
as from the regulator enable/disable paths so that host-initiated
admin state changes via ethtool are immediately reflected.

Each trigger registers an .activate callback that syncs a freshly-bound
LED to the cached state, so an LED bound after pse_controller_register()
(e.g. via sysfs) does not stay dark until the next hardware event.

The post-registration initial-state pass is run under pcdev->lock,
matching pse_led_update()'s documented locking contract and avoiding
races with concurrent regulator_enable() paths that share last_delivering
/ last_enabled and the hardware ops.

If pse_led_triggers_register() fails partway through, the partially-
registered triggers are detached by clearing pcdev->pi_led_trigs, so the
existing early-return guard in pse_led_update() is the single point of
truth and a stale led_trigger.name cannot fool the dereference path
into touching an unregistered trigger.

The per-PI trigger struct lives outside the CONFIG_LEDS_TRIGGERS ifdef
to avoid forcing every reference site to be wrapped; the LED-specific
code paths remain ifdef'd.

Link: https://lore.kernel.org/oe-kbuild-all/202603251254.o5PqMBRU-lkp@intel.com/
Link: https://lore.kernel.org/oe-kbuild-all/202603251250.cuMCk5Yv-lkp@intel.com/
Signed-off-by: Carlo Szelinsky <github@szelinsky.de>
---
 drivers/net/pse-pd/pse_core.c | 200 +++++++++++++++++++++++++++++++++-
 include/linux/pse-pd/pse.h    |  18 +++
 2 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index b7ffec0c942c..bff97211ae22 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -8,6 +8,7 @@
 #include <linux/device.h>
 #include <linux/ethtool.h>
 #include <linux/ethtool_netlink.h>
+#include <linux/leds.h>
 #include <linux/of.h>
 #include <linux/phy.h>
 #include <linux/pse-pd/pse.h>
@@ -669,6 +670,168 @@ static int _pse_pi_delivery_power_sw_pw_ctrl(struct pse_controller_dev *pcdev,
 	return 0;
 }
 
+#if IS_ENABLED(CONFIG_LEDS_TRIGGERS)
+/**
+ * pse_pi_get_states - Fetch current delivering/enabled state for a PI
+ * @pcdev: PSE controller device
+ * @id: PI index
+ * @delivering: out, set to true if PI is currently delivering power
+ * @enabled: out, set to true if PI is administratively enabled
+ *
+ * Queries hardware via the controller ops. Caller must hold pcdev->lock.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+static int pse_pi_get_states(struct pse_controller_dev *pcdev, int id,
+			     bool *delivering, bool *enabled)
+{
+	struct pse_pw_status pw_status = {};
+	struct pse_admin_state admin_state = {};
+	int ret;
+
+	ret = pcdev->ops->pi_get_pw_status(pcdev, id, &pw_status);
+	if (ret)
+		return ret;
+	ret = pcdev->ops->pi_get_admin_state(pcdev, id, &admin_state);
+	if (ret)
+		return ret;
+
+	*delivering = pw_status.c33_pw_status ==
+		ETHTOOL_C33_PSE_PW_D_STATUS_DELIVERING ||
+		pw_status.podl_pw_status ==
+		ETHTOOL_PODL_PSE_PW_D_STATUS_DELIVERING;
+	*enabled = admin_state.c33_admin_state ==
+		ETHTOOL_C33_PSE_ADMIN_STATE_ENABLED ||
+		admin_state.podl_admin_state ==
+		ETHTOOL_PODL_PSE_ADMIN_STATE_ENABLED;
+
+	return 0;
+}
+
+/**
+ * pse_led_update - Update LED triggers for a PI based on current state
+ * @pcdev: PSE controller device
+ * @id: PI index
+ *
+ * Queries the current power status and admin state of the PI and
+ * fires LED trigger events on state changes. Called from the
+ * notification path and the regulator enable/disable paths.
+ *
+ * Must be called with pcdev->lock held.
+ */
+static void pse_led_update(struct pse_controller_dev *pcdev, int id)
+{
+	struct pse_pi_led_triggers *trigs;
+	bool delivering, enabled;
+
+	if (!pcdev->pi_led_trigs)
+		return;
+
+	trigs = &pcdev->pi_led_trigs[id];
+	if (!trigs->delivering.name)
+		return;
+
+	if (pse_pi_get_states(pcdev, id, &delivering, &enabled))
+		return;
+
+	if (trigs->last_delivering != delivering) {
+		trigs->last_delivering = delivering;
+		led_trigger_event(&trigs->delivering,
+				  delivering ? LED_FULL : LED_OFF);
+	}
+
+	if (trigs->last_enabled != enabled) {
+		trigs->last_enabled = enabled;
+		led_trigger_event(&trigs->enabled,
+				  enabled ? LED_FULL : LED_OFF);
+	}
+}
+
+/* Sync a freshly-bound LED to the cached trigger state. Without these
+ * .activate callbacks, an LED bound to the trigger after
+ * pse_controller_register() (e.g. via sysfs) would stay dark until the
+ * next hardware event toggles state.
+ */
+static int pse_led_delivering_activate(struct led_classdev *led_cdev)
+{
+	struct pse_pi_led_triggers *trigs =
+		container_of(led_cdev->trigger, struct pse_pi_led_triggers,
+			     delivering);
+
+	led_set_brightness(led_cdev,
+			   trigs->last_delivering ? LED_FULL : LED_OFF);
+	return 0;
+}
+
+static int pse_led_enabled_activate(struct led_classdev *led_cdev)
+{
+	struct pse_pi_led_triggers *trigs =
+		container_of(led_cdev->trigger, struct pse_pi_led_triggers,
+			     enabled);
+
+	led_set_brightness(led_cdev,
+			   trigs->last_enabled ? LED_FULL : LED_OFF);
+	return 0;
+}
+
+static int pse_led_triggers_register(struct pse_controller_dev *pcdev)
+{
+	struct device *dev = pcdev->dev;
+	const char *dev_id;
+	int i, ret;
+
+	dev_id = dev_name(dev);
+
+	pcdev->pi_led_trigs = devm_kcalloc(dev, pcdev->nr_lines,
+					   sizeof(*pcdev->pi_led_trigs),
+					   GFP_KERNEL);
+	if (!pcdev->pi_led_trigs)
+		return -ENOMEM;
+
+	for (i = 0; i < pcdev->nr_lines; i++) {
+		struct pse_pi_led_triggers *trigs = &pcdev->pi_led_trigs[i];
+
+		/* Skip PIs not described in device tree */
+		if (!pcdev->no_of_pse_pi && !pcdev->pi[i].np)
+			continue;
+
+		trigs->delivering.name = devm_kasprintf(dev, GFP_KERNEL,
+							"pse-%s:port%d:delivering",
+							dev_id, i);
+		if (!trigs->delivering.name)
+			return -ENOMEM;
+		trigs->delivering.activate = pse_led_delivering_activate;
+
+		ret = devm_led_trigger_register(dev, &trigs->delivering);
+		if (ret) {
+			trigs->delivering.name = NULL;
+			return ret;
+		}
+
+		trigs->enabled.name = devm_kasprintf(dev, GFP_KERNEL,
+						     "pse-%s:port%d:enabled",
+						     dev_id, i);
+		if (!trigs->enabled.name)
+			return -ENOMEM;
+		trigs->enabled.activate = pse_led_enabled_activate;
+
+		ret = devm_led_trigger_register(dev, &trigs->enabled);
+		if (ret) {
+			trigs->enabled.name = NULL;
+			return ret;
+		}
+	}
+
+	return 0;
+}
+#else
+static inline void pse_led_update(struct pse_controller_dev *pcdev, int id) {}
+static int pse_led_triggers_register(struct pse_controller_dev *pcdev)
+{
+	return 0;
+}
+#endif /* CONFIG_LEDS_TRIGGERS */
+
 static int pse_pi_enable(struct regulator_dev *rdev)
 {
 	struct pse_controller_dev *pcdev = rdev_get_drvdata(rdev);
@@ -694,6 +857,7 @@ static int pse_pi_enable(struct regulator_dev *rdev)
 			pcdev->pi[id].admin_state_enabled = 1;
 			ret = 0;
 		}
+		pse_led_update(pcdev, id);
 		mutex_unlock(&pcdev->lock);
 		return ret;
 	}
@@ -701,6 +865,7 @@ static int pse_pi_enable(struct regulator_dev *rdev)
 	ret = ops->pi_enable(pcdev, id);
 	if (!ret)
 		pcdev->pi[id].admin_state_enabled = 1;
+	pse_led_update(pcdev, id);
 	mutex_unlock(&pcdev->lock);
 
 	return ret;
@@ -718,6 +883,7 @@ static int pse_pi_disable(struct regulator_dev *rdev)
 	ret = _pse_pi_disable(pcdev, id);
 	if (!ret)
 		pi->admin_state_enabled = 0;
+	pse_led_update(pcdev, id);
 
 	mutex_unlock(&pcdev->lock);
 	return 0;
@@ -1107,6 +1273,31 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
 	if (ret)
 		return ret;
 
+	ret = pse_led_triggers_register(pcdev);
+	if (ret) {
+		/* LED triggers are non-essential for power delivery; warn
+		 * and continue. NULL out the array so pse_led_update()'s
+		 * early-return guard short-circuits any later calls onto
+		 * partially-registered triggers.
+		 */
+		dev_warn(pcdev->dev, "Failed to register LED triggers: %d\n",
+			 ret);
+		pcdev->pi_led_trigs = NULL;
+	}
+
+	/* Query initial LED state for all PIs so already-active ports
+	 * are reflected immediately without waiting for a hardware event.
+	 * Hold pcdev->lock: regulators are already exposed and a
+	 * concurrent regulator_enable() would race on the hw callbacks
+	 * and on last_delivering / last_enabled.
+	 */
+	mutex_lock(&pcdev->lock);
+	for (i = 0; i < pcdev->nr_lines; i++) {
+		if (pcdev->no_of_pse_pi || pcdev->pi[i].np)
+			pse_led_update(pcdev, i);
+	}
+	mutex_unlock(&pcdev->lock);
+
 	mutex_lock(&pse_list_mutex);
 	list_add(&pcdev->list, &pse_controller_list);
 	mutex_unlock(&pse_list_mutex);
@@ -1267,7 +1458,14 @@ static void pse_handle_events(struct pse_controller_dev *pcdev,
 		struct pse_ntf ntf = {};
 		int ret;
 
-		/* Do nothing PI not described */
+		/* Update LEDs for described PIs regardless of consumer state.
+		 * LED triggers are registered at controller init, before any
+		 * PHY claims a PSE control, so rdev may still be NULL here.
+		 */
+		if (pcdev->no_of_pse_pi || pcdev->pi[i].np)
+			pse_led_update(pcdev, i);
+
+		/* Skip regulator/netlink path for PIs without consumers */
 		if (!pcdev->pi[i].rdev)
 			continue;
 
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 4b5d5b11a084..ba321e669642 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -10,6 +10,7 @@
 #include <linux/kfifo.h>
 #include <uapi/linux/ethtool.h>
 #include <uapi/linux/ethtool_netlink_generated.h>
+#include <linux/leds.h>
 #include <linux/regulator/driver.h>
 
 /* Maximum current in uA according to IEEE 802.3-2022 Table 145-1 */
@@ -266,6 +267,21 @@ struct pse_pi {
 	int pw_allocated_mW;
 };
 
+/**
+ * struct pse_pi_led_triggers - LED trigger state for a PSE PI
+ *
+ * @delivering: LED trigger for power delivering state
+ * @enabled: LED trigger for admin enabled state
+ * @last_delivering: cached delivering state for change detection
+ * @last_enabled: cached enabled state for change detection
+ */
+struct pse_pi_led_triggers {
+	struct led_trigger delivering;
+	struct led_trigger enabled;
+	bool last_delivering;
+	bool last_enabled;
+};
+
 /**
  * struct pse_ntf - PSE notification element
  *
@@ -304,6 +320,7 @@ struct pse_ntf {
  * @ntf_work: workqueue for PSE notification management
  * @ntf_fifo: PSE notifications FIFO
  * @ntf_fifo_lock: protect @ntf_fifo writer
+ * @pi_led_trigs: per-PI LED trigger state array
  */
 struct pse_controller_dev {
 	const struct pse_controller_ops *ops;
@@ -329,6 +346,7 @@ struct pse_controller_dev {
 	struct work_struct ntf_work;
 	DECLARE_KFIFO_PTR(ntf_fifo, struct pse_ntf);
 	spinlock_t ntf_fifo_lock; /* Protect @ntf_fifo writer */
+	struct pse_pi_led_triggers *pi_led_trigs;
 };
 
 /**
-- 
2.43.0
Re: [PATCH net-next v5 2/2] net: pse-pd: add LED trigger support via notification path
Posted by Jakub Kicinski 1 month, 1 week ago
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: pse-pd: add LED trigger support via notification path

This commit adds per-PI LED triggers to the PSE core subsystem to reflect
delivering and enabled states. The LED state is updated during hardware
events and administrative state changes via ethtool.

Link: https://lore.kernel.org/oe-kbuild-all/202603251254.o5PqMBRU-lkp@intel.com/
Link: https://lore.kernel.org/oe-kbuild-all/202603251250.cuMCk5Yv-lkp@intel.com/

> diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
> index b7ffec0c942c..bff97211ae22 100644
> --- a/drivers/net/pse-pd/pse_core.c
> +++ b/drivers/net/pse-pd/pse_core.c

[ ... ]

> @@ -669,6 +670,168 @@ static int _pse_pi_delivery_power_sw_pw_ctrl(struct pse_controller_dev *pcdev,
>  	return 0;
>  }
>  
> +#if IS_ENABLED(CONFIG_LEDS_TRIGGERS)

[ ... ]

> +static int pse_led_triggers_register(struct pse_controller_dev *pcdev)
> +{
> +	struct device *dev = pcdev->dev;
> +	const char *dev_id;
> +	int i, ret;
> +
> +	dev_id = dev_name(dev);
> +
> +	pcdev->pi_led_trigs = devm_kcalloc(dev, pcdev->nr_lines,
> +					   sizeof(*pcdev->pi_led_trigs),
> +					   GFP_KERNEL);

Since devm resources are released in strict LIFO order, and pi_led_trigs
is allocated here after the regulators are registered in
pse_controller_register(), will the pi_led_trigs array be freed before
the regulators are unregistered on driver unbind?

When the regulator core later unregisters the regulators and flushes
pending disables, could it invoke pse_pi_disable() and call pse_led_update(),
dereferencing the dangling pcdev->pi_led_trigs pointer?

> +	if (!pcdev->pi_led_trigs)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < pcdev->nr_lines; i++) {
> +		struct pse_pi_led_triggers *trigs = &pcdev->pi_led_trigs[i];
> +
> +		/* Skip PIs not described in device tree */
> +		if (!pcdev->no_of_pse_pi && !pcdev->pi[i].np)
> +			continue;
> +
> +		trigs->delivering.name = devm_kasprintf(dev, GFP_KERNEL,
> +							"pse-%s:port%d:delivering",
> +							dev_id, i);
> +		if (!trigs->delivering.name)
> +			return -ENOMEM;
> +		trigs->delivering.activate = pse_led_delivering_activate;
> +
> +		ret = devm_led_trigger_register(dev, &trigs->delivering);

Regulators are exposed before this code runs in pse_controller_register().
What happens if a consumer calls regulator_enable() concurrently?

If a concurrent pse_pi_enable() calls pse_led_update() right after
trigs->delivering.name is set but before devm_led_trigger_register() executes:

pse_led_update() sees pcdev->pi_led_trigs and trigs->delivering.name are
both non-NULL, and proceeds to call led_trigger_event().

Since devm_led_trigger_register() hasn't initialized the trigger's internal
led_cdevs list head yet, will led_trigger_event() attempt to iterate a
zeroed memory list and cause a NULL pointer dereference?

> +		if (ret) {
> +			trigs->delivering.name = NULL;
> +			return ret;
> +		}

[ ... ]

> @@ -1107,6 +1273,31 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
>  	if (ret)
>  		return ret;
>  
> +	ret = pse_led_triggers_register(pcdev);
> +	if (ret) {
> +		/* LED triggers are non-essential for power delivery; warn
> +		 * and continue. NULL out the array so pse_led_update()'s
> +		 * early-return guard short-circuits any later calls onto
> +		 * partially-registered triggers.
> +		 */
> +		dev_warn(pcdev->dev, "Failed to register LED triggers: %d\n",
> +			 ret);
> +		pcdev->pi_led_trigs = NULL;
> +	}

If pse_led_triggers_register() fails halfway through, the device probe still
succeeds, which means devm cleanup will not run. Could the successfully
registered LED triggers from earlier loop iterations remain registered
indefinitely with the LED subsystem?

Since pcdev->pi_led_trigs is NULLed, pse_led_update() will short-circuit.
Does this leave these globally visible triggers permanently orphaned without
ever receiving state updates?

> +
> +	/* Query initial LED state for all PIs so already-active ports
> +	 * are reflected immediately without waiting for a hardware event.
> +	 * Hold pcdev->lock: regulators are already exposed and a
> +	 * concurrent regulator_enable() would race on the hw callbacks
> +	 * and on last_delivering / last_enabled.
> +	 */
> +	mutex_lock(&pcdev->lock);

[ ... ]

> @@ -1267,7 +1458,14 @@ static void pse_handle_events(struct pse_controller_dev *pcdev,
>  		struct pse_ntf ntf = {};
>  		int ret;
>  
> -		/* Do nothing PI not described */
> +		/* Update LEDs for described PIs regardless of consumer state.
> +		 * LED triggers are registered at controller init, before any
> +		 * PHY claims a PSE control, so rdev may still be NULL here.
> +		 */
> +		if (pcdev->no_of_pse_pi || pcdev->pi[i].np)
> +			pse_led_update(pcdev, i);

The docstring for pse_led_update() requires it to be called with pcdev->lock
held. Does calling it here locklessly inside the event handler violate
that locking contract?

Could this lack of synchronization cause data races on the cached state
variables, or allow concurrent execution of the hardware callbacks
(pi_get_pw_status and pi_get_admin_state) if they are invoked at the same
time by other paths?

> +
> +		/* Skip regulator/netlink path for PIs without consumers */
>  		if (!pcdev->pi[i].rdev)
>  			continue;
Re: [PATCH net-next v5 2/2] net: pse-pd: add LED trigger support via notification path
Posted by Carlo Szelinsky 1 month ago
Hi Jakub,

Thanks for the second pass! I went through all four points and
would love to clarify some points before moving on to v6. 
Replying inline again:

On Mon, 4 May 2026 18:57:59 -0700 Jakub Kicinski wrote:
> > +	pcdev->pi_led_trigs = devm_kcalloc(dev, pcdev->nr_lines,
> > +					   sizeof(*pcdev->pi_led_trigs),
> > +					   GFP_KERNEL);
>
> Since devm resources are released in strict LIFO order, and pi_led_trigs
> is allocated here after the regulators are registered in
> pse_controller_register(), will the pi_led_trigs array be freed before
> the regulators are unregistered on driver unbind?
>
> When the regulator core later unregisters the regulators and flushes
> pending disables, could it invoke pse_pi_disable() and call pse_led_update(),
> dereferencing the dangling pcdev->pi_led_trigs pointer?

Yes, seems to be real: devm LIFO frees pi_led_trigs before the 
regulators get unregistered. If a deferred disable fires during
regulator_unregister() (via flush_work on rdev->disable_work),
pse_pi_disable() runs and pse_led_update() walks freed memory.

But for one piece I got a question: the same path also touches
pcdev->pi. pse_pi_disable() calls _pse_pi_disable(), before:
pse_led_update(), and _pse_pi_disable() derefs pcdev->pi[id]
(via pse_pi_deallocate_pw_budget(&pcdev->pi[id]) and
pcdev->pi[id].pw_d). pcdev->pi is already freed by
pse_release_pis() in pse_controller_unregister(). So setting
pi_led_trigs NULL only fixes the LED half, or?

Should v6 also handle pcdev->pi here (NULL after
kfree, plus a guard in _pse_pi_disable)? Feels like the same
class of bug as 1/2's reorder, just on the regulator-cleanup
path. Or is a separate fix outside this patch series better?

> > +	for (i = 0; i < pcdev->nr_lines; i++) {
> > +		struct pse_pi_led_triggers *trigs = &pcdev->pi_led_trigs[i];
> [...]
> > +		trigs->delivering.name = devm_kasprintf(dev, GFP_KERNEL,
> > +							"pse-%s:port%d:delivering",
> > +							dev_id, i);
> [...]
> > +		ret = devm_led_trigger_register(dev, &trigs->delivering);
>
> Regulators are exposed before this code runs in pse_controller_register().
> What happens if a consumer calls regulator_enable() concurrently?
> [...] Since devm_led_trigger_register() hasn't initialized the trigger's
> internal led_cdevs list head yet, will led_trigger_event() attempt to
> iterate a zeroed memory list and cause a NULL pointer dereference?

Right, but the window is microseconds wide (between setting
trigs->delivering.name and devm_led_trigger_register() returning).
You'd need a consumer racing during probe to hit it, right?

The fix seems to be simple: move pse_led_triggers_register() before the
regulator loop in pse_controller_register(). of_load_pse_pis()
runs earlier so pi[]/pi[i].np is already filled in, which is
all the trigger loop needs. Regulators only get exposed after
triggers are fully registered, no race window left.

Question: Do you agree with that solution for v6?

> > +	ret = pse_led_triggers_register(pcdev);
> > +	if (ret) {
> > +		/* LED triggers are non-essential for power delivery; warn
> > +		 * and continue. [...]
> > +		dev_warn(pcdev->dev, "Failed to register LED triggers: %d\n",
> > +			 ret);
> > +		pcdev->pi_led_trigs = NULL;
> > +	}
>
> If pse_led_triggers_register() fails halfway through, the device probe
> still succeeds, which means devm cleanup will not run. Could the
> successfully registered LED triggers from earlier loop iterations remain
> registered indefinitely with the LED subsystem?

Would love to clarify this one with you as well: devm cleanup does still run
on driver unbind, since devm_led_trigger_register() attaches a
per-call release action. So the partially-registered triggers
stay in sysfs until unbind, but they're not leaked across it.
And with pi_led_trigs = NULL, pse_led_update() short-circuits
so the orphans never fire either. So practically they're inert
sysfs entries until unbind, not a leak.

That said, I agree the "warn and orphan" middle ground is a bit
weird. So which path should I follow:

1) treat LED reg failure as fatal: return ret, fail probe.
Smallest change, matches the rest of the function. The
"non-essential" framing was mine, happy to drop it.
2) wrap the registration in a devres group so partial
failures roll back the triggers we did manage to register.
3) leave as-is, since devm already cleans up on unbind.

I'd go with 1) since OOM during probe is fatal for most things
anyway, but happy to do 2) or 3) if you have a preference.

> > +		/* Update LEDs for described PIs regardless of consumer state.
> [...]
> > +		if (pcdev->no_of_pse_pi || pcdev->pi[i].np)
> > +			pse_led_update(pcdev, i);
>
> The docstring for pse_led_update() requires it to be called with pcdev->lock
> held. Does calling it here locklessly inside the event handler violate
> that locking contract?

From what I see, both callers of pse_handle_events() hold pcdev->lock
across the call:

* pse_isr() takes mutex_lock(&pcdev->lock) at line 1524, then
  calls pse_handle_events() at line 1531.
* pse_poll_worker() takes mutex_lock(&pcdev->lock) at line 1547,
  then calls pse_handle_events() at line 1551.

So pse_led_update() is running with the lock held in both paths.

Tbh I don't get completely what the AI is flagging here, or is
this a false positive? If false positive, I'd still add
lockdep_assert_held(&pcdev->lock) in pse_led_update() and
pse_handle_events() to make the contract explicit and catch any
future caller that forgets, but that would be documentation,
not a bug fix.

So plan for v6 [2/2], pending your answers:
* NULL pcdev->pi_led_trigs in pse_controller_unregister()
  (and possibly NULL pcdev->pi too, depending on your answer?)
* move pse_led_triggers_register() before the regulator loop
* add lockdep_assert_held() (assuming is a false positive)
* whichever option you pick for question 3?

Depending on what we do with [1/2], I'll roll v6 with your
decisions baked in.

Sorry for the long text...

Cheers,
Carlo
Re: [PATCH net-next v5 2/2] net: pse-pd: add LED trigger support via notification path
Posted by Oleksij Rempel 4 weeks, 1 day ago
Hi Carlo,

Thank you for your work!

On Sat, May 16, 2026 at 12:17:59PM +0200, Carlo Szelinsky wrote:
> Hi Jakub,
> 
> Thanks for the second pass! I went through all four points and
> would love to clarify some points before moving on to v6. 
> Replying inline again:
> 
> On Mon, 4 May 2026 18:57:59 -0700 Jakub Kicinski wrote:
> > > +	pcdev->pi_led_trigs = devm_kcalloc(dev, pcdev->nr_lines,
> > > +					   sizeof(*pcdev->pi_led_trigs),
> > > +					   GFP_KERNEL);
> >
> > Since devm resources are released in strict LIFO order, and pi_led_trigs
> > is allocated here after the regulators are registered in
> > pse_controller_register(), will the pi_led_trigs array be freed before
> > the regulators are unregistered on driver unbind?
> >
> > When the regulator core later unregisters the regulators and flushes
> > pending disables, could it invoke pse_pi_disable() and call pse_led_update(),
> > dereferencing the dangling pcdev->pi_led_trigs pointer?
> 
> Yes, seems to be real: devm LIFO frees pi_led_trigs before the 
> regulators get unregistered. If a deferred disable fires during
> regulator_unregister() (via flush_work on rdev->disable_work),
> pse_pi_disable() runs and pse_led_update() walks freed memory.
> 
> But for one piece I got a question: the same path also touches
> pcdev->pi. pse_pi_disable() calls _pse_pi_disable(), before:
> pse_led_update(), and _pse_pi_disable() derefs pcdev->pi[id]
> (via pse_pi_deallocate_pw_budget(&pcdev->pi[id]) and
> pcdev->pi[id].pw_d). pcdev->pi is already freed by
> pse_release_pis() in pse_controller_unregister(). So setting
> pi_led_trigs NULL only fixes the LED half, or?
> 
> Should v6 also handle pcdev->pi here (NULL after
> kfree, plus a guard in _pse_pi_disable)? Feels like the same
> class of bug as 1/2's reorder, just on the regulator-cleanup
> path. Or is a separate fix outside this patch series better?

Please, send fixes for existing bugs separately for stable fixes. 

> > > +	for (i = 0; i < pcdev->nr_lines; i++) {
> > > +		struct pse_pi_led_triggers *trigs = &pcdev->pi_led_trigs[i];
> > [...]
> > > +		trigs->delivering.name = devm_kasprintf(dev, GFP_KERNEL,
> > > +							"pse-%s:port%d:delivering",
> > > +							dev_id, i);
> > [...]
> > > +		ret = devm_led_trigger_register(dev, &trigs->delivering);
> >
> > Regulators are exposed before this code runs in pse_controller_register().
> > What happens if a consumer calls regulator_enable() concurrently?
> > [...] Since devm_led_trigger_register() hasn't initialized the trigger's
> > internal led_cdevs list head yet, will led_trigger_event() attempt to
> > iterate a zeroed memory list and cause a NULL pointer dereference?
> 
> Right, but the window is microseconds wide (between setting
> trigs->delivering.name and devm_led_trigger_register() returning).
> You'd need a consumer racing during probe to hit it, right?
> 
> The fix seems to be simple: move pse_led_triggers_register() before the
> regulator loop in pse_controller_register(). of_load_pse_pis()
> runs earlier so pi[]/pi[i].np is already filled in, which is
> all the trigger loop needs. Regulators only get exposed after
> triggers are fully registered, no race window left.
> 
> Question: Do you agree with that solution for v6?

Sounds plausible.

> > > +	ret = pse_led_triggers_register(pcdev);
> > > +	if (ret) {
> > > +		/* LED triggers are non-essential for power delivery; warn
> > > +		 * and continue. [...]
> > > +		dev_warn(pcdev->dev, "Failed to register LED triggers: %d\n",
> > > +			 ret);
> > > +		pcdev->pi_led_trigs = NULL;
> > > +	}
> >
> > If pse_led_triggers_register() fails halfway through, the device probe
> > still succeeds, which means devm cleanup will not run. Could the
> > successfully registered LED triggers from earlier loop iterations remain
> > registered indefinitely with the LED subsystem?
> 
> Would love to clarify this one with you as well: devm cleanup does still run
> on driver unbind, since devm_led_trigger_register() attaches a
> per-call release action. So the partially-registered triggers
> stay in sysfs until unbind, but they're not leaked across it.
> And with pi_led_trigs = NULL, pse_led_update() short-circuits
> so the orphans never fire either. So practically they're inert
> sysfs entries until unbind, not a leak.
> 
> That said, I agree the "warn and orphan" middle ground is a bit
> weird. So which path should I follow:
> 
> 1) treat LED reg failure as fatal: return ret, fail probe.
> Smallest change, matches the rest of the function. The
> "non-essential" framing was mine, happy to drop it.
> 2) wrap the registration in a devres group so partial
> failures roll back the triggers we did manage to register.
> 3) leave as-is, since devm already cleans up on unbind.
> 
> I'd go with 1) since OOM during probe is fatal for most things
> anyway, but happy to do 2) or 3) if you have a preference.

1 feels for me as most straightforward way. I would expect to see this
kind of errors in a pre-production phase not in the end product.

> > > +		/* Update LEDs for described PIs regardless of consumer state.
> > [...]
> > > +		if (pcdev->no_of_pse_pi || pcdev->pi[i].np)
> > > +			pse_led_update(pcdev, i);
> >
> > The docstring for pse_led_update() requires it to be called with pcdev->lock
> > held. Does calling it here locklessly inside the event handler violate
> > that locking contract?
> 
> From what I see, both callers of pse_handle_events() hold pcdev->lock
> across the call:
> 
> * pse_isr() takes mutex_lock(&pcdev->lock) at line 1524, then
>   calls pse_handle_events() at line 1531.
> * pse_poll_worker() takes mutex_lock(&pcdev->lock) at line 1547,
>   then calls pse_handle_events() at line 1551.
> 
> So pse_led_update() is running with the lock held in both paths.
> 
> Tbh I don't get completely what the AI is flagging here, or is
> this a false positive?

Yes.

> If false positive, I'd still add
> lockdep_assert_held(&pcdev->lock) in pse_led_update() and
> pse_handle_events() to make the contract explicit and catch any
> future caller that forgets, but that would be documentation,
> not a bug fix.

I do not know what here went wrong, too much context?

In the raw log it looks like LLM was able to detect proper locking:
https://sashiko.dev/#/log/35678

"I'm seeing that in `pse_controller_register`, the code takes the mutex,
iterates and calls `pse_led_update`. Also, in `pse_handle_events`, this
is happening again, and this is called from `pse_isr` with
`mutex_lock(&pcdev->lock)`. And in `pse_pi_enable`, it's again called
within the mutex. So, `pse_led_update` *always* seems to be called with
the lock held. This is consistent and good."

And later:

"However, there's no evidence that `pcdev->lock` is held inside
`pse_handle_events`. Furthermore, the code uses a separate
`ntf_fifo_lock`."

We can ignore it. In case if you cares, lockdep_assert_held() sounds
good, comments can be misread. I would add it to pse_handle_events(), it
will make it fully visible for AI and detectable by CI.

> So plan for v6 [2/2], pending your answers:
> * NULL pcdev->pi_led_trigs in pse_controller_unregister()
>   (and possibly NULL pcdev->pi too, depending on your answer?)
> * move pse_led_triggers_register() before the regulator loop
> * add lockdep_assert_held() (assuming is a false positive)
> * whichever option you pick for question 3?
> 
> Depending on what we do with [1/2], I'll roll v6 with your
> decisions baked in.
> 
> Sorry for the long text...
> 
> Cheers,
> Carlo


Thank you!

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |