[PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority

Kory Maincent posted 27 patches 1 year, 2 months ago
[PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Kory Maincent 1 year, 2 months ago
From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>

This patch introduces the ability to configure the PSE PI port priority.
Port priority is utilized by PSE controllers to determine which ports to
turn off first in scenarios such as power budget exceedance.

The pis_prio_max value is used to define the maximum priority level
supported by the controller. Both the current priority and the maximum
priority are exposed to the user through the pse_ethtool_get_status call.

This patch add support for two mode of port priority modes.
1. Static Method:

   This method involves distributing power based on PD classification.
   It’s straightforward and stable, the PSE core keeping track of the
   budget and subtracting the power requested by each PD’s class.

   Advantages: Every PD gets its promised power at any time, which
   guarantees reliability.

   Disadvantages: PD classification steps are large, meaning devices
   request much more power than they actually need. As a result, the power
   supply may only operate at, say, 50% capacity, which is inefficient and
   wastes money.

   Priority max value is matching the number of PSE PIs within the PSE.

2. Dynamic Method:

   To address the inefficiencies of the static method, vendors like
   Microchip have introduced dynamic power budgeting, as seen in the
   PD692x0 firmware. This method monitors the current consumption per port
   and subtracts it from the available power budget. When the budget is
   exceeded, lower-priority ports are shut down.

   Advantages: This method optimizes resource utilization, saving costs.

   Disadvantages: Low-priority devices may experience instability.

   Priority max value is set by the PSE controller driver.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
---

Change in v3:
- Add disconnection policy.
- Add management of disabled port priority in the interrupt handler.
- Move port prio mode in the power domain instead of the PSE.

Change in v2:
- Rethink the port priority support.
---
 drivers/net/pse-pd/pse_core.c | 550 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/pse-pd/pse.h    |  63 +++++
 include/uapi/linux/ethtool.h  |  73 ++++++
 3 files changed, 676 insertions(+), 10 deletions(-)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index ff0ffbccf139..f15a693692ae 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -40,10 +40,17 @@ struct pse_control {
  * struct pse_power_domain - a PSE power domain
  * @id: ID of the power domain
  * @supply: Power supply the Power Domain
+ * @port_prio_mode: Current port priority mode of the power domain
+ * @discon_pol: Current disonnection policy of the power domain
+ * @pi_lrc_id: ID of the last recently connected PI. -1 if none. Relevant
+ *	       for static port priority mode.
  */
 struct pse_power_domain {
 	int id;
 	struct regulator *supply;
+	u32 port_prio_mode;
+	u32 discon_pol;
+	int pi_lrc_id;
 };
 
 static int of_load_single_pse_pi_pairset(struct device_node *node,
@@ -222,6 +229,33 @@ static int of_load_pse_pis(struct pse_controller_dev *pcdev)
 	return ret;
 }
 
+static void pse_pi_deallocate_pw_budget(struct pse_pi *pi)
+{
+	if (!pi->pw_d)
+		return;
+
+	regulator_free_power_budget(pi->pw_d->supply, pi->pw_allocated);
+}
+
+/* Helper returning true if the power control is managed from the software
+ * in the interrupt handler
+ */
+static bool pse_pw_d_is_sw_pw_control(struct pse_controller_dev *pcdev,
+				      struct pse_power_domain *pw_d)
+{
+	if (!pw_d)
+		return false;
+
+	if (pw_d->port_prio_mode & ETHTOOL_PSE_PORT_PRIO_STATIC)
+		return true;
+	if (pw_d->port_prio_mode == ETHTOOL_PSE_PORT_PRIO_DISABLED &&
+	    pcdev->ops->pi_enable && pcdev->ops->pi_get_pw_req &&
+	    pcdev->irq)
+		return true;
+
+	return false;
+}
+
 static int pse_pi_is_enabled(struct regulator_dev *rdev)
 {
 	struct pse_controller_dev *pcdev = rdev_get_drvdata(rdev);
@@ -234,6 +268,13 @@ static int pse_pi_is_enabled(struct regulator_dev *rdev)
 
 	id = rdev_get_id(rdev);
 	mutex_lock(&pcdev->lock);
+	if (pse_pw_d_is_sw_pw_control(pcdev, pcdev->pi[id].pw_d)) {
+		ret = pcdev->pi[id].isr_enabled &&
+		      pcdev->pi[id].admin_state_enabled;
+		mutex_unlock(&pcdev->lock);
+		return ret;
+	}
+
 	ret = ops->pi_is_enabled(pcdev, id);
 	mutex_unlock(&pcdev->lock);
 
@@ -244,7 +285,7 @@ static int pse_pi_enable(struct regulator_dev *rdev)
 {
 	struct pse_controller_dev *pcdev = rdev_get_drvdata(rdev);
 	const struct pse_controller_ops *ops;
-	int id, ret;
+	int id, ret = 0;
 
 	ops = pcdev->ops;
 	if (!ops->pi_enable)
@@ -252,6 +293,21 @@ static int pse_pi_enable(struct regulator_dev *rdev)
 
 	id = rdev_get_id(rdev);
 	mutex_lock(&pcdev->lock);
+	if (pse_pw_d_is_sw_pw_control(pcdev, pcdev->pi[id].pw_d)) {
+		/* Manage enabled status by software.
+		 * Real enable process will happen if a port is connected.
+		 */
+		if (pcdev->pi[id].isr_enabled) {
+			ret = ops->pi_enable(pcdev, id);
+			if (!ret)
+				pcdev->pi[id].admin_state_enabled = 1;
+		} else {
+			pcdev->pi[id].admin_state_enabled = 1;
+		}
+		mutex_unlock(&pcdev->lock);
+		return ret;
+	}
+
 	ret = ops->pi_enable(pcdev, id);
 	if (!ret)
 		pcdev->pi[id].admin_state_enabled = 1;
@@ -264,6 +320,7 @@ static int pse_pi_disable(struct regulator_dev *rdev)
 {
 	struct pse_controller_dev *pcdev = rdev_get_drvdata(rdev);
 	const struct pse_controller_ops *ops;
+	struct pse_pi *pi;
 	int id, ret;
 
 	ops = pcdev->ops;
@@ -272,9 +329,16 @@ static int pse_pi_disable(struct regulator_dev *rdev)
 
 	id = rdev_get_id(rdev);
 	mutex_lock(&pcdev->lock);
+
+	pi = &pcdev->pi[id];
 	ret = ops->pi_disable(pcdev, id);
-	if (!ret)
-		pcdev->pi[id].admin_state_enabled = 0;
+	if (!ret) {
+		pse_pi_deallocate_pw_budget(pi);
+		pi->admin_state_enabled = 0;
+		pi->isr_enabled = 0;
+		if (pi->pw_d && pi->pw_d->pi_lrc_id == id)
+			pi->pw_d->pi_lrc_id = -1;
+	}
 	mutex_unlock(&pcdev->lock);
 
 	return ret;
@@ -516,6 +580,8 @@ static int pse_register_pw_ds(struct pse_controller_dev *pcdev)
 		}
 
 		pw_d->supply = supply;
+		pw_d->port_prio_mode = ETHTOOL_PSE_PORT_PRIO_DISABLED;
+		pw_d->discon_pol = ETHTOOL_PSE_DISCON_ROUND_ROBIN_IDX_LOWEST_FIRST;
 		pcdev->pi[i].pw_d = pw_d;
 	}
 
@@ -539,6 +605,7 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
 	if (ret < 0)
 		return ret;
 	pcdev->id = ret;
+	pcdev->port_prio_supp_modes |= ETHTOOL_PSE_PORT_PRIO_DISABLED;
 
 	if (!pcdev->nr_lines)
 		pcdev->nr_lines = 1;
@@ -683,10 +750,279 @@ pse_control_find_phy_by_id(struct pse_controller_dev *pcdev, int id)
 		}
 	}
 	mutex_unlock(&pse_list_mutex);
-
 	return NULL;
 }
 
+static int pse_pi_disable_isr(struct pse_controller_dev *pcdev, int id,
+			      struct netlink_ext_ack *extack)
+{
+	const struct pse_controller_ops *ops = pcdev->ops;
+	struct pse_pi *pi = &pcdev->pi[id];
+	int ret;
+
+	if (!ops->pi_disable) {
+		NL_SET_ERR_MSG(extack, "PSE does not support disable control");
+		return -EOPNOTSUPP;
+	}
+
+	if (!pi->isr_enabled)
+		return 0;
+
+	if (pi->admin_state_enabled) {
+		ret = ops->pi_disable(pcdev, id);
+		if (ret) {
+			NL_SET_ERR_MSG_FMT(extack,
+					   "PI %d: disable error %d",
+					   id, ret);
+			return ret;
+		}
+	}
+
+	pse_pi_deallocate_pw_budget(pi);
+	pi->isr_enabled = 0;
+	if (pi->pw_d && pi->pw_d->pi_lrc_id == id)
+		pi->pw_d->pi_lrc_id = -1;
+
+	return 0;
+}
+
+static int _pse_disable_pi_pol(struct pse_controller_dev *pcdev, int id)
+{
+	unsigned long notifs = ETHTOOL_C33_PSE_EVENT_DISCONNECTION;
+	struct netlink_ext_ack extack = {};
+	struct phy_device *phydev;
+	int ret;
+
+	dev_dbg(pcdev->dev, "Disabling PI %d to free power budget\n", id);
+
+	NL_SET_ERR_MSG_FMT(&extack,
+			   "Disabling PI %d to free power budget", id);
+
+	ret = pse_pi_disable_isr(pcdev, id, &extack);
+	if (ret)
+		notifs |= ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR;
+	phydev = pse_control_find_phy_by_id(pcdev, id);
+	if (phydev)
+		ethnl_pse_send_ntf(phydev, notifs, &extack);
+
+	return ret;
+}
+
+static int pse_disable_pi_prio_lrc(struct pse_controller_dev *pcdev,
+				   struct pse_power_domain *pw_d,
+				   int prio)
+{
+	int lrc_id = pw_d->pi_lrc_id;
+	int ret;
+
+	if (lrc_id < 0)
+		return 0;
+
+	if (pcdev->pi[lrc_id].prio != prio)
+		return 0;
+
+	ret = _pse_disable_pi_pol(pcdev, lrc_id);
+	if (ret)
+		return ret;
+
+	/* PI disabled */
+	return 1;
+}
+
+static int pse_disable_pi_prio_round_rob_low(struct pse_controller_dev *pcdev,
+					     struct pse_power_domain *pw_d,
+					     int prio)
+{
+	int i;
+
+	for (i = 0; i < pcdev->nr_lines; i++) {
+		int ret;
+
+		if (pcdev->pi[i].prio != prio ||
+		    pcdev->pi[i].pw_d != pw_d ||
+		    !pcdev->pi[i].isr_enabled)
+			continue;
+
+		ret = _pse_disable_pi_pol(pcdev, i);
+		if (ret)
+			return ret;
+
+		/* PI disabled */
+		return 1;
+	}
+
+	/* No PI disabled */
+	return 0;
+}
+
+static int pse_disable_pi_prio_pol(struct pse_controller_dev *pcdev,
+				   struct pse_power_domain *pw_d,
+				   int prio)
+{
+	int ret;
+
+	if (pw_d->discon_pol & ETHTOOL_PSE_DISCON_LRC) {
+		ret = pse_disable_pi_prio_lrc(pcdev, pw_d, prio);
+		if (ret)
+			return ret;
+	}
+
+	if (pw_d->discon_pol &
+	    ETHTOOL_PSE_DISCON_ROUND_ROBIN_IDX_LOWEST_FIRST) {
+		ret = pse_disable_pi_prio_round_rob_low(pcdev, pw_d, prio);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int
+pse_pi_allocate_pw_budget_static_prio(struct pse_controller_dev *pcdev, int id,
+				      int pw_req, struct netlink_ext_ack *extack)
+{
+	struct pse_pi *pi = &pcdev->pi[id];
+	int ret, _prio;
+
+	_prio = pcdev->nr_lines;
+	while (regulator_request_power_budget(pi->pw_d->supply, pw_req) == -ERANGE) {
+		ret = pse_disable_pi_prio_pol(pcdev, pi->pw_d, _prio);
+		if (ret < 0)
+			return ret;
+		/* No pi disabled, decrease priority value */
+		if (!ret)
+			_prio--;
+
+		if (_prio <= pi->prio) {
+			NL_SET_ERR_MSG_FMT(extack,
+					   "PI %d: not enough power budget available",
+					   id);
+			return -ERANGE;
+		}
+	}
+
+	pi->pw_allocated = pw_req;
+	return 0;
+}
+
+static int pse_pi_allocate_pw_budget(struct pse_controller_dev *pcdev, int id,
+				     int pw_req, struct netlink_ext_ack *extack)
+{
+	struct pse_pi *pi = &pcdev->pi[id];
+	int ret;
+
+	if (!pi->pw_d)
+		return 0;
+
+	/* ETHTOOL_PSE_PORT_PRIO_STATIC */
+	if (pi->pw_d->port_prio_mode & ETHTOOL_PSE_PORT_PRIO_STATIC)
+		return pse_pi_allocate_pw_budget_static_prio(pcdev, id, pw_req,
+							     extack);
+
+	/* ETHTOOL_PSE_PORT_PRIO_DISABLED */
+	ret = regulator_request_power_budget(pi->pw_d->supply, pw_req);
+	if (ret)
+		NL_SET_ERR_MSG_FMT(extack,
+				   "PI %d: not enough power budget available",
+				   id);
+	else
+		pi->pw_allocated = pw_req;
+
+	return ret;
+}
+
+static int pse_pi_enable_isr(struct pse_controller_dev *pcdev, int id,
+			     struct netlink_ext_ack *extack)
+{
+	const struct pse_controller_ops *ops = pcdev->ops;
+	struct pse_pi *pi = &pcdev->pi[id];
+	int ret, pw_req;
+
+	if (!ops->pi_enable || !ops->pi_get_pw_req) {
+		NL_SET_ERR_MSG(extack, "PSE does not support enable control");
+		return -EOPNOTSUPP;
+	}
+
+	if (pi->isr_enabled)
+		return 0;
+
+	ret = ops->pi_get_pw_req(pcdev, id);
+	if (ret < 0)
+		return ret;
+
+	pw_req = ret;
+
+	/* Compare requested power with port power limit and use the lowest
+	 * one.
+	 */
+	if (ops->pi_get_current_limit && ops->pi_get_voltage) {
+		int uV, mW;
+		s64 tmp_64;
+
+		ret = ops->pi_get_voltage(pcdev, id);
+		if (ret < 0)
+			return ret;
+		uV = ret;
+
+		ret = ops->pi_get_current_limit(pcdev, id);
+		if (ret < 0)
+			return ret;
+
+		tmp_64 = ret;
+		tmp_64 *= uV;
+		/* mW = uV * uA / 1000000000 */
+		mW = DIV_ROUND_CLOSEST_ULL(tmp_64, 1000000000);
+		if (mW < pw_req)
+			pw_req = mW;
+	}
+
+	ret = pse_pi_allocate_pw_budget(pcdev, id, pw_req, extack);
+	if (ret)
+		return ret;
+
+	if (pi->admin_state_enabled) {
+		ret = ops->pi_enable(pcdev, id);
+		if (ret) {
+			pse_pi_deallocate_pw_budget(pi);
+			NL_SET_ERR_MSG_FMT(extack,
+					   "PI %d: enable error %d",
+					   id, ret);
+			return ret;
+		}
+	}
+
+	pi->isr_enabled = 1;
+	if (pi->pw_d)
+		pi->pw_d->pi_lrc_id = id;
+	return 0;
+}
+
+static int pse_set_config_isr(struct pse_controller_dev *pcdev, int id,
+			      unsigned long notifs,
+			      struct netlink_ext_ack *extack)
+{
+	int ret = 0;
+
+	if (notifs & ETHTOOL_PSE_PORT_PRIO_DYNAMIC)
+		return 0;
+
+	if ((notifs & ETHTOOL_C33_PSE_EVENT_DISCONNECTION) &&
+	    ((notifs & ETHTOOL_C33_PSE_EVENT_DETECTION) ||
+	     (notifs & ETHTOOL_C33_PSE_EVENT_CLASSIFICATION))) {
+		NL_SET_ERR_MSG_FMT(extack,
+				   "PI %d: error, connection and disconnection reported simultaneously",
+				   id);
+		return -EINVAL;
+	}
+
+	if (notifs & ETHTOOL_C33_PSE_EVENT_CLASSIFICATION)
+		ret = pse_pi_enable_isr(pcdev, id, extack);
+	else if (notifs & ETHTOOL_C33_PSE_EVENT_DISCONNECTION)
+		ret = pse_pi_disable_isr(pcdev, id, extack);
+
+	return ret;
+}
+
 static irqreturn_t pse_isr(int irq, void *data)
 {
 	struct netlink_ext_ack extack = {};
@@ -703,9 +1039,10 @@ static irqreturn_t pse_isr(int irq, void *data)
 	memset(h->notifs, 0, pcdev->nr_lines * sizeof(*h->notifs));
 	mutex_lock(&pcdev->lock);
 	ret = desc->map_event(irq, pcdev, h->notifs, &notifs_mask);
-	mutex_unlock(&pcdev->lock);
-	if (ret || !notifs_mask)
+	if (ret || !notifs_mask) {
+		mutex_unlock(&pcdev->lock);
 		return IRQ_NONE;
+	}
 
 	for_each_set_bit(i, &notifs_mask, pcdev->nr_lines) {
 		struct phy_device *phydev;
@@ -716,6 +1053,12 @@ static irqreturn_t pse_isr(int irq, void *data)
 			continue;
 
 		notifs = h->notifs[i];
+		if (pse_pw_d_is_sw_pw_control(pcdev, pcdev->pi[i].pw_d)) {
+			ret = pse_set_config_isr(pcdev, i, notifs, &extack);
+			if (ret)
+				notifs |= ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR;
+		}
+
 		dev_dbg(h->pcdev->dev,
 			"Sending PSE notification EVT 0x%lx\n", notifs);
 
@@ -727,6 +1070,8 @@ static irqreturn_t pse_isr(int irq, void *data)
 					      NULL);
 	}
 
+	mutex_unlock(&pcdev->lock);
+
 	return IRQ_HANDLED;
 }
 
@@ -971,6 +1316,7 @@ static int _pse_ethtool_get_status(struct pse_controller_dev *pcdev,
 				   struct pse_control_status *status)
 {
 	const struct pse_controller_ops *ops;
+	struct pse_pi *pi = &pcdev->pi[id];
 
 	ops = pcdev->ops;
 	if (!ops->ethtool_get_status) {
@@ -980,8 +1326,23 @@ static int _pse_ethtool_get_status(struct pse_controller_dev *pcdev,
 	}
 
 	status->pse_id = pcdev->id;
-	if (pcdev->pi[id].pw_d)
-		status->pw_d_id = pcdev->pi[id].pw_d->id;
+	if (pi->pw_d) {
+		status->pw_d_id = pi->pw_d->id;
+		status->c33_prio_mode = pi->pw_d->port_prio_mode;
+		status->c33_discon_pol = pi->pw_d->discon_pol;
+		switch (pi->pw_d->port_prio_mode) {
+		case ETHTOOL_PSE_PORT_PRIO_STATIC:
+			status->c33_prio_max = pcdev->nr_lines;
+			status->c33_prio = pi->prio;
+			break;
+		case ETHTOOL_PSE_PORT_PRIO_DYNAMIC:
+			status->c33_prio_max = pcdev->pis_prio_max;
+			break;
+		default:
+			break;
+		}
+		status->c33_prio_supp_modes = pcdev->port_prio_supp_modes;
+	}
 
 	return ops->ethtool_get_status(pcdev, id, extack, status);
 }
@@ -1020,8 +1381,9 @@ static int pse_ethtool_c33_set_config(struct pse_control *psec,
 	case ETHTOOL_C33_PSE_ADMIN_STATE_ENABLED:
 		/* We could have mismatch between admin_state_enabled and
 		 * state reported by regulator_is_enabled. This can occur when
-		 * the PI is forcibly turn off by the controller. Call
-		 * regulator_disable on that case to fix the counters state.
+		 * the PI is forcibly turn off by the controller or by the
+		 * interrupt context. Call regulator_disable on that case
+		 * to fix the counters state.
 		 */
 		if (psec->pcdev->pi[psec->id].admin_state_enabled &&
 		    !regulator_is_enabled(psec->ps)) {
@@ -1094,6 +1456,32 @@ int pse_ethtool_set_config(struct pse_control *psec,
 }
 EXPORT_SYMBOL_GPL(pse_ethtool_set_config);
 
+static int pse_pi_update_pw_budget(struct pse_pi *pi, int id,
+				   const unsigned int pw_limit,
+				   struct netlink_ext_ack *extack)
+{
+	int pw_diff, ret;
+
+	pw_diff = pw_limit - pi->pw_allocated;
+	if (!pw_diff) {
+		return 0;
+	} else if (pw_diff > 0) {
+		ret = regulator_request_power_budget(pi->pw_d->supply, pw_diff);
+		if (ret) {
+			NL_SET_ERR_MSG_FMT(extack,
+					   "PI %d: not enough power budget available",
+					   id);
+			return ret;
+		}
+
+	} else {
+		regulator_free_power_budget(pi->pw_d->supply, -pw_diff);
+	}
+	pi->pw_allocated = pw_limit;
+
+	return 0;
+}
+
 /**
  * pse_ethtool_set_pw_limit - set PSE control power limit
  * @psec: PSE control pointer
@@ -1130,10 +1518,152 @@ int pse_ethtool_set_pw_limit(struct pse_control *psec,
 	/* uA = mW * 1000000000 / uV */
 	uA = DIV_ROUND_CLOSEST_ULL(tmp_64, uV);
 
+	if (psec->pcdev->pi[psec->id].pw_d) {
+		ret = pse_pi_update_pw_budget(&psec->pcdev->pi[psec->id],
+					      psec->id, pw_limit, extack);
+		if (ret)
+			return ret;
+	}
+
 	return regulator_set_current_limit(psec->ps, 0, uA);
 }
 EXPORT_SYMBOL_GPL(pse_ethtool_set_pw_limit);
 
+int pse_ethtool_set_prio(struct pse_control *psec,
+			 struct netlink_ext_ack *extack,
+			 unsigned int prio)
+{
+	struct pse_controller_dev *pcdev = psec->pcdev;
+	const struct pse_controller_ops *ops;
+	int ret = 0;
+
+	if (!pcdev->pi[psec->id].pw_d) {
+		NL_SET_ERR_MSG(extack, "no power domain attached");
+		return -EOPNOTSUPP;
+	}
+
+	/* We don't want priority change in the middle of an
+	 * enable/disable call or a priority mode change
+	 */
+	mutex_lock(&pcdev->lock);
+	switch (pcdev->pi[psec->id].pw_d->port_prio_mode) {
+	case ETHTOOL_PSE_PORT_PRIO_STATIC:
+		if (prio > pcdev->nr_lines) {
+			NL_SET_ERR_MSG_FMT(extack,
+					   "priority %d exceed priority max %d",
+					   prio, pcdev->nr_lines);
+			ret = -ERANGE;
+			goto out;
+		}
+
+		pcdev->pi[psec->id].prio = prio;
+		break;
+
+	case ETHTOOL_PSE_PORT_PRIO_DYNAMIC:
+		ops = psec->pcdev->ops;
+		if (!ops->pi_set_prio) {
+			NL_SET_ERR_MSG(extack,
+				       "pse driver does not support setting port priority");
+			ret = -EOPNOTSUPP;
+			goto out;
+		}
+
+		if (prio > pcdev->pis_prio_max) {
+			NL_SET_ERR_MSG_FMT(extack,
+					   "priority %d exceed priority max %d",
+					   prio, pcdev->pis_prio_max);
+			ret = -ERANGE;
+			goto out;
+		}
+
+		ret = ops->pi_set_prio(pcdev, psec->id, prio);
+		break;
+
+	default:
+		ret = -EOPNOTSUPP;
+	}
+
+out:
+	mutex_unlock(&pcdev->lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pse_ethtool_set_prio);
+
+int pse_ethtool_set_prio_mode(struct pse_control *psec,
+			      struct netlink_ext_ack *extack,
+			      u32 prio_mode)
+{
+	struct pse_controller_dev *pcdev = psec->pcdev;
+	const struct pse_controller_ops *ops;
+	int ret = 0, i;
+
+	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
+		NL_SET_ERR_MSG(extack, "priority mode not supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (!pcdev->pi[psec->id].pw_d) {
+		NL_SET_ERR_MSG(extack, "no power domain attached");
+		return -EOPNOTSUPP;
+	}
+
+	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode */
+	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
+	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
+		NL_SET_ERR_MSG(extack,
+			       "port priority can't be enabled and disabled simultaneously");
+		return -EINVAL;
+	}
+
+	ops = psec->pcdev->ops;
+
+	/* We don't want priority mode change in the middle of an
+	 * enable/disable call
+	 */
+	mutex_lock(&pcdev->lock);
+	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;
+
+	/* Reset all priorities of the Power Domain */
+	for (i = 0; i < psec->pcdev->nr_lines; i++) {
+		if (!pcdev->pi[i].rdev ||
+		    pcdev->pi[i].pw_d != pcdev->pi[psec->id].pw_d)
+			continue;
+
+		pcdev->pi[i].prio = 0;
+
+		if (!ops->pi_set_prio)
+			continue;
+
+		if (pcdev->port_prio_supp_modes &
+		    ETHTOOL_PSE_PORT_PRIO_DYNAMIC)
+			ret = ops->pi_set_prio(pcdev, psec->id, 0);
+	}
+
+	mutex_unlock(&psec->pcdev->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pse_ethtool_set_prio_mode);
+
+int pse_ethtool_set_discon_pol(struct pse_control *psec,
+			       struct netlink_ext_ack *extack,
+			       u32 pol)
+{
+	struct pse_controller_dev *pcdev = psec->pcdev;
+
+	if (!pcdev->pi[psec->id].pw_d) {
+		NL_SET_ERR_MSG(extack, "no power domain attached");
+		return -EOPNOTSUPP;
+	}
+
+	mutex_lock(&pcdev->lock);
+	pcdev->pi[psec->id].pw_d->discon_pol = pol;
+	mutex_unlock(&psec->pcdev->lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pse_ethtool_set_discon_pol);
+
 bool pse_has_podl(struct pse_control *psec)
 {
 	return psec->pcdev->types & ETHTOOL_PSE_PODL;
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index bdf3e8c468fc..e84eba710dc8 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -9,6 +9,7 @@
 #include <linux/list.h>
 #include <uapi/linux/ethtool.h>
 #include <linux/regulator/driver.h>
+#include <linux/workqueue.h>
 
 /* Maximum current in uA according to IEEE 802.3-2022 Table 145-1 */
 #define MAX_PI_CURRENT 1920000
@@ -62,6 +63,13 @@ struct pse_control_config {
  *	is in charge of the memory allocation.
  * @c33_pw_limit_nb_ranges: number of supported power limit configuration
  *	ranges
+ * @c33_prio_supp_modes: PSE port priority modes supported. Set by PSE core.
+ * @c33_prio_mode: PSE port priority mode selected. Set by PSE core.
+ * @c33_prio_max: max priority allowed for the c33_prio variable value. Set
+ *	by PSE core.
+ * @c33_prio: priority of the PSE. Set by PSE core in case of static port
+ *	priority mode.
+ * @c33_discon_pol: PSE disconnection policy selected. Set by PSE core.
  */
 struct pse_control_status {
 	u32 pse_id;
@@ -76,6 +84,11 @@ struct pse_control_status {
 	u32 c33_avail_pw_limit;
 	struct ethtool_c33_pse_pw_limit_range *c33_pw_limit_ranges;
 	u32 c33_pw_limit_nb_ranges;
+	u32 c33_prio_supp_modes;
+	enum pse_port_prio_modes c33_prio_mode;
+	u32 c33_prio_max;
+	u32 c33_prio;
+	u32 c33_discon_pol;
 };
 
 /**
@@ -95,6 +108,10 @@ struct pse_control_status {
  *			  set_current_limit regulator callback.
  *			  Should not return an error in case of MAX_PI_CURRENT
  *			  current value set.
+ * @pi_set_prio: Configure the PSE PI priority.
+ * @pi_get_pw_req: Get the power requested by a PD before enabling the PSE PI.
+ *		   This is only relevant when an interrupt is registered using
+ *		   devm_pse_irq_helper helper.
  */
 struct pse_controller_ops {
 	int (*ethtool_get_status)(struct pse_controller_dev *pcdev,
@@ -109,6 +126,9 @@ struct pse_controller_ops {
 				    int id);
 	int (*pi_set_current_limit)(struct pse_controller_dev *pcdev,
 				    int id, int max_uA);
+	int (*pi_set_prio)(struct pse_controller_dev *pcdev, int id,
+			   unsigned int prio);
+	int (*pi_get_pw_req)(struct pse_controller_dev *pcdev, int id);
 };
 
 struct module;
@@ -143,6 +163,12 @@ struct pse_pi_pairset {
  * @rdev: regulator represented by the PSE PI
  * @admin_state_enabled: PI enabled state
  * @pw_d: Power domain of the PSE PI
+ * @prio: Priority of the PSE PI. Used in static port priority mode
+ * @isr_enabled: PSE PI power status managed by the interruption handler.
+ *		 This variable is relevant when the power enabled management
+ *		 is a managed in software like the static port priority mode.
+ * @pw_allocated: Power allocated to a PSE PI to manage power budget in
+ *	static port priority mode
  */
 struct pse_pi {
 	struct pse_pi_pairset pairset[2];
@@ -150,6 +176,9 @@ struct pse_pi {
 	struct regulator_dev *rdev;
 	bool admin_state_enabled;
 	struct pse_power_domain *pw_d;
+	int prio;
+	bool isr_enabled;
+	int pw_allocated;
 };
 
 /**
@@ -168,6 +197,8 @@ struct pse_pi {
  * @no_of_pse_pi: flag set if the pse_pis devicetree node is not used
  * @id: Index of the PSE
  * @irq: PSE interrupt
+ * @pis_prio_max: Maximum value allowed for the PSE PIs priority
+ * @port_prio_supp_modes: Bitfield of port priority mode supported by the PSE
  */
 struct pse_controller_dev {
 	const struct pse_controller_ops *ops;
@@ -183,6 +214,8 @@ struct pse_controller_dev {
 	bool no_of_pse_pi;
 	u32 id;
 	int irq;
+	unsigned int pis_prio_max;
+	u32 port_prio_supp_modes;
 };
 
 #if IS_ENABLED(CONFIG_PSE_CONTROLLER)
@@ -207,6 +240,15 @@ int pse_ethtool_set_config(struct pse_control *psec,
 int pse_ethtool_set_pw_limit(struct pse_control *psec,
 			     struct netlink_ext_ack *extack,
 			     const unsigned int pw_limit);
+int pse_ethtool_set_prio(struct pse_control *psec,
+			 struct netlink_ext_ack *extack,
+			 unsigned int prio);
+int pse_ethtool_set_prio_mode(struct pse_control *psec,
+			      struct netlink_ext_ack *extack,
+			      u32 prio_mode);
+int pse_ethtool_set_discon_pol(struct pse_control *psec,
+			       struct netlink_ext_ack *extack,
+			       u32 pol);
 
 bool pse_has_podl(struct pse_control *psec);
 bool pse_has_c33(struct pse_control *psec);
@@ -244,6 +286,27 @@ static inline int pse_ethtool_set_pw_limit(struct pse_control *psec,
 	return -EOPNOTSUPP;
 }
 
+static inline int pse_ethtool_set_prio(struct pse_control *psec,
+				       struct netlink_ext_ack *extack,
+				       unsigned int prio)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int pse_ethtool_set_prio_mode(struct pse_control *psec,
+					    struct netlink_ext_ack *extack,
+					    u32 prio_mode)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int pse_ethtool_set_discon_pol(struct pse_control *psec,
+					     struct netlink_ext_ack *extack,
+					     u32 pol)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline bool pse_has_podl(struct pse_control *psec)
 {
 	return false;
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index a4c93d6de218..b6727049840c 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -1002,11 +1002,84 @@ enum ethtool_c33_pse_pw_d_status {
  * enum ethtool_pse_events - event list of the PSE controller.
  * @ETHTOOL_PSE_EVENT_OVER_CURRENT: PSE output current is too high.
  * @ETHTOOL_PSE_EVENT_OVER_TEMP: PSE in over temperature state.
+ * @ETHTOOL_C33_PSE_EVENT_DETECTION: detection process occur on the PSE.
+ *	IEEE 802.3-2022 145.2.6 PSE detection of PDs
+ * @ETHTOOL_C33_PSE_EVENT_CLASSIFICATION: classification process occur on
+ *	the PSE. IEEE 802.3-2022 145.2.8 PSE classification of PDs and
+ *	mutual identification
+ * @ETHTOOL_C33_PSE_EVENT_DISCONNECTION: PD has been disconnected on the PSE.
+ * @ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR: PSE faced an error managing the
+ *	power control from software.
  */
 
 enum ethtool_pse_events {
 	ETHTOOL_PSE_EVENT_OVER_CURRENT =	1 << 0,
 	ETHTOOL_PSE_EVENT_OVER_TEMP =		1 << 1,
+	ETHTOOL_C33_PSE_EVENT_DETECTION =	1 << 2,
+	ETHTOOL_C33_PSE_EVENT_CLASSIFICATION =	1 << 3,
+	ETHTOOL_C33_PSE_EVENT_DISCONNECTION =	1 << 4,
+	ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR =	1 << 5,
+};
+
+/**
+ * enum pse_port_prio_modes - PSE port priority modes.
+ * @ETHTOOL_PSE_PORT_PRIO_DISABLED: Port priority disabled.
+ * @ETHTOOL_PSE_PORT_PRIO_STATIC: PSE static port priority. Port priority
+ *	based on the power requested during PD classification. This mode
+ *	is managed by the PSE core.
+ * @ETHTOOL_PSE_PORT_PRIO_DYNAMIC: PSE dynamic port priority. Port priority
+ *	based on the current consumption per ports compared to the total
+ *	power budget. This mode is managed by the PSE controller.
+ */
+
+enum pse_port_prio_modes {
+	ETHTOOL_PSE_PORT_PRIO_DISABLED	= 1 << 0,
+	ETHTOOL_PSE_PORT_PRIO_STATIC	= 1 << 1,
+	ETHTOOL_PSE_PORT_PRIO_DYNAMIC	= 1 << 2,
+};
+
+/**
+ * enum ethtool_pse_disconnection_policy - Disconnection strategies for
+ *	same-priority devices when power budget is exceeded, tailored to
+ *	specific priority modes.
+ *
+ * @ETHTOOL_PSE_DISCON_LRC: Disconnect least recently connected device.
+ *	Relevant for: ETHTOOL_PSE_PORT_PRIO_STATIC
+ *	Behavior: When multiple devices share the same priority level, the
+ *	system disconnects the device that was most recently connected.
+ *	Rationale: This strategy favors stability for longer-standing
+ *	connections, assuming that established devices may be more critical.
+ *	Use Case: Suitable for systems prioritizing stable power allocation for
+ *	existing static-priority connections, making newer devices suitable
+ *	candidates for disconnection if limits are exceeded.
+ * @ETHTOOL_PSE_DISCON_ROUND_ROBIN_IDX_LOWEST_FIRST: Disconnect based on port
+ *	index in a round-robin manner, starting with the lowest index.
+ *	Relevant for: ETHTOOL_PSE_PORT_PRIO_STATIC
+ *	Behavior: Disconnects devices sequentially based on port index, starting
+ *	with the lowest. If multiple disconnections are required, the process
+ *	continues in ascending order.
+ *	Rationale: Provides a predictable, systematic approach for
+ *	static-priority devices, making it clear which device will be
+ *	disconnected next if power limits are reached.
+ *	Use Case: Appropriate for systems where static-priority devices are
+ *	equal in role, and fairness in disconnections is prioritized.
+ *
+ * Each device can have multiple disconnection policies set as an array of
+ * priorities. When the power budget is exceeded, the policies are executed
+ * in the order defined by the user. This allows for a more nuanced and
+ * flexible approach to handling power constraints across a range of devices
+ * with similar priorities or attributes.
+ *
+ * Example Usage:
+ *   Users can specify an ordered list of policies, such as starting with
+ *   `PSE_DISCON_STATIC_CLASS_HIGHEST_FIRST` to prioritize based on class,
+ *   followed by `PSE_DISCON_LRC` to break ties based on connection time.
+ *   This ordered execution ensures that power disconnections align closely
+ *   with the system’s operational requirements and priorities.
+ */
+enum ethtool_pse_disconnection_policy {
+	ETHTOOL_PSE_DISCON_LRC = 1 << 0,
+	ETHTOOL_PSE_DISCON_ROUND_ROBIN_IDX_LOWEST_FIRST = 1 << 1,
 };
 
 /**

-- 
2.34.1

Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Oleksij Rempel 1 year, 2 months ago
On Thu, Nov 21, 2024 at 03:42:47PM +0100, Kory Maincent wrote:
> From: Kory Maincent (Dent Project) <kory.maincent@bootlin.com>
> 
> This patch introduces the ability to configure the PSE PI port priority.
> Port priority is utilized by PSE controllers to determine which ports to
> turn off first in scenarios such as power budget exceedance.
> 
> The pis_prio_max value is used to define the maximum priority level
> supported by the controller. Both the current priority and the maximum
> priority are exposed to the user through the pse_ethtool_get_status call.
> 
> This patch add support for two mode of port priority modes.

Priorit mode is too abstract for me, in this case we are talking about
Budget Evaluation Strategy.

> 1. Static Method:
> 
>    This method involves distributing power based on PD classification.
>    It’s straightforward and stable, the PSE core keeping track of the
>    budget and subtracting the power requested by each PD’s class.
> 
>    Advantages: Every PD gets its promised power at any time, which
>    guarantees reliability.
> 
>    Disadvantages: PD classification steps are large, meaning devices
>    request much more power than they actually need. As a result, the power
>    supply may only operate at, say, 50% capacity, which is inefficient and
>    wastes money.
> 
>    Priority max value is matching the number of PSE PIs within the PSE.
> 
> 2. Dynamic Method:
> 
>    To address the inefficiencies of the static method, vendors like
>    Microchip have introduced dynamic power budgeting, as seen in the
>    PD692x0 firmware. This method monitors the current consumption per port
>    and subtracts it from the available power budget. When the budget is
>    exceeded, lower-priority ports are shut down.
> 
>    Advantages: This method optimizes resource utilization, saving costs.
> 
>    Disadvantages: Low-priority devices may experience instability.
> 
>    Priority max value is set by the PSE controller driver.
> 
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> ---
> 
> Change in v3:
> - Add disconnection policy.
> - Add management of disabled port priority in the interrupt handler.
> - Move port prio mode in the power domain instead of the PSE.
> 
> Change in v2:
> - Rethink the port priority support.
> ---
>  drivers/net/pse-pd/pse_core.c | 550 +++++++++++++++++++++++++++++++++++++++++-
>  include/linux/pse-pd/pse.h    |  63 +++++
>  include/uapi/linux/ethtool.h  |  73 ++++++
>  3 files changed, 676 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
> index ff0ffbccf139..f15a693692ae 100644
> --- a/drivers/net/pse-pd/pse_core.c
> +++ b/drivers/net/pse-pd/pse_core.c
> @@ -40,10 +40,17 @@ struct pse_control {
>   * struct pse_power_domain - a PSE power domain
>   * @id: ID of the power domain
>   * @supply: Power supply the Power Domain
> + * @port_prio_mode: Current port priority mode of the power domain

Same here, it is Budget Evaluation Strategy. May be: budget_eval_strategy

> + * @discon_pol: Current disonnection policy of the power domain
> + * @pi_lrc_id: ID of the last recently connected PI. -1 if none. Relevant
> + *	       for static port priority mode.
>   */
>  struct pse_power_domain {
>  	int id;
>  	struct regulator *supply;
> +	u32 port_prio_mode;
> +	u32 discon_pol;
> +	int pi_lrc_id;

We should store all ports withing the domain in a list and process the
list backwards or forwards, depending on disconnection policy.

>  };
>  
>  static int of_load_single_pse_pi_pairset(struct device_node *node,
> @@ -222,6 +229,33 @@ static int of_load_pse_pis(struct pse_controller_dev *pcdev)
>  	return ret;
>  }
>  
> +static void pse_pi_deallocate_pw_budget(struct pse_pi *pi)
> +{
> +	if (!pi->pw_d)
> +		return;
> +
> +	regulator_free_power_budget(pi->pw_d->supply, pi->pw_allocated);
> +}
> +
> +/* Helper returning true if the power control is managed from the software
> + * in the interrupt handler
> + */

Please use function comment format. I would really love to have comments
on all new functions.

> +static bool pse_pw_d_is_sw_pw_control(struct pse_controller_dev *pcdev,
> +				      struct pse_power_domain *pw_d)
> +{
> +	if (!pw_d)
> +		return false;
> +
> +	if (pw_d->port_prio_mode & ETHTOOL_PSE_PORT_PRIO_STATIC)

here should be pw_d->port_prio_mode == ETHTOOL_PSE_PORT_PRIO_STATIC

We can't support multiple evaluation strategies per port.

> +		return true;
> +	if (pw_d->port_prio_mode == ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> +	    pcdev->ops->pi_enable && pcdev->ops->pi_get_pw_req &&
> +	    pcdev->irq)
> +		return true;
> +
> +	return false;
> +}
> +

....

> +int pse_ethtool_set_prio_mode(struct pse_control *psec,
> +			      struct netlink_ext_ack *extack,
> +			      u32 prio_mode)
> +{
> +	struct pse_controller_dev *pcdev = psec->pcdev;
> +	const struct pse_controller_ops *ops;
> +	int ret = 0, i;
> +
> +	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
> +		NL_SET_ERR_MSG(extack, "priority mode not supported");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (!pcdev->pi[psec->id].pw_d) {
> +		NL_SET_ERR_MSG(extack, "no power domain attached");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode */
> +	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> +	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
> +		NL_SET_ERR_MSG(extack,
> +			       "port priority can't be enabled and disabled simultaneously");
> +		return -EINVAL;
> +	}
> +
> +	ops = psec->pcdev->ops;
> +
> +	/* We don't want priority mode change in the middle of an
> +	 * enable/disable call
> +	 */
> +	mutex_lock(&pcdev->lock);
> +	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;

In proposed implementation we have can set policies per port, but it
will affect complete domain. This is not good. It feels like a separate
challenge with extra discussion and work. I would recommend not to
implement policy setting right now.

If you will decide to implement setting of policies anyway, then we need
to discuss the interface.
- If the policy should be done per domain, then we will need a separate
  interface to interact with domains.
  Pro: seems to be easier to implement.
- If we will go with policy per port, wich would make sense too, then
  some rework of this patch is needed.
  Pro: can combine best of both strategies: set ports with wide load
  range to static strategy and use dynamic strategy on other ports.

Right now we do not have software implementation for dynamic mode,
implementing configuration of the policies from user space can be
implemented later. It is enough to provide information about what
hard coded policy is currently used.

>  /* Maximum current in uA according to IEEE 802.3-2022 Table 145-1 */
>  #define MAX_PI_CURRENT 1920000
> @@ -62,6 +63,13 @@ struct pse_control_config {
>   *	is in charge of the memory allocation.
>   * @c33_pw_limit_nb_ranges: number of supported power limit configuration
>   *	ranges
> + * @c33_prio_supp_modes: PSE port priority modes supported. Set by PSE core.
> + * @c33_prio_mode: PSE port priority mode selected. Set by PSE core.
> + * @c33_prio_max: max priority allowed for the c33_prio variable value. Set
> + *	by PSE core.
> + * @c33_prio: priority of the PSE. Set by PSE core in case of static port
> + *	priority mode.
> + * @c33_discon_pol: PSE disconnection policy selected. Set by PSE core.

Priority configuration is not port of IEEE 802.3 specification. c33_
prefix should be removed here.

>   */
>  struct pse_control_status {
>  	u32 pse_id;
> @@ -76,6 +84,11 @@ struct pse_control_status {
>  	u32 c33_avail_pw_limit;
>  	struct ethtool_c33_pse_pw_limit_range *c33_pw_limit_ranges;
>  	u32 c33_pw_limit_nb_ranges;
> +	u32 c33_prio_supp_modes;
> +	enum pse_port_prio_modes c33_prio_mode;
> +	u32 c33_prio_max;
> +	u32 c33_prio;
> +	u32 c33_discon_pol;
>  };

....

>  struct module;
> @@ -143,6 +163,12 @@ struct pse_pi_pairset {
>   * @rdev: regulator represented by the PSE PI
>   * @admin_state_enabled: PI enabled state
>   * @pw_d: Power domain of the PSE PI
> + * @prio: Priority of the PSE PI. Used in static port priority mode
> + * @isr_enabled: PSE PI power status managed by the interruption handler.
> + *		 This variable is relevant when the power enabled management
> + *		 is a managed in software like the static port priority mode.
> + * @pw_allocated: Power allocated to a PSE PI to manage power budget in
> + *	static port priority mode
>   */
>  struct pse_pi {
>  	struct pse_pi_pairset pairset[2];
> @@ -150,6 +176,9 @@ struct pse_pi {
>  	struct regulator_dev *rdev;
>  	bool admin_state_enabled;
>  	struct pse_power_domain *pw_d;
> +	int prio;
> +	bool isr_enabled;
> +	int pw_allocated;

s/pw_allocated/pw_allocated_mw/

>  	return false;
> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index a4c93d6de218..b6727049840c 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -1002,11 +1002,84 @@ enum ethtool_c33_pse_pw_d_status {
>   * enum ethtool_pse_events - event list of the PSE controller.
>   * @ETHTOOL_PSE_EVENT_OVER_CURRENT: PSE output current is too high.
>   * @ETHTOOL_PSE_EVENT_OVER_TEMP: PSE in over temperature state.
> + * @ETHTOOL_C33_PSE_EVENT_DETECTION: detection process occur on the PSE.
> + *	IEEE 802.3-2022 145.2.6 PSE detection of PDs

33.2.5 and 145.2.6 -> 30.9.1.1.5 aPSEPowerDetectionStatus

> + * @ETHTOOL_C33_PSE_EVENT_CLASSIFICATION: classification process occur on
> + *	the PSE. IEEE 802.3-2022 145.2.8 PSE classification of PDs and
> + *	mutual identification

33.2.6 and 145.2.8 -> 30.9.1.1.8 aPSEPowerClassification

> + * @ETHTOOL_C33_PSE_EVENT_DISCONNECTION: PD has been disconnected on the PSE.

This one seems to be related to following parts of specification:
33.3.8 and 145.3.9 PD Maintain Power Signature
33.5.1.2.9 MPS Absent
30.9.1.1.20 aPSEMPSAbsentCounter

> + * @ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR: PSE faced an error managing the
> + *	power control from software.
>   */
>  
>  enum ethtool_pse_events {
>  	ETHTOOL_PSE_EVENT_OVER_CURRENT =	1 << 0,
>  	ETHTOOL_PSE_EVENT_OVER_TEMP =		1 << 1,
> +	ETHTOOL_C33_PSE_EVENT_DETECTION =	1 << 2,
> +	ETHTOOL_C33_PSE_EVENT_CLASSIFICATION =	1 << 3,
> +	ETHTOOL_C33_PSE_EVENT_DISCONNECTION =	1 << 4,
> +	ETHTOOL_PSE_EVENT_SW_PW_CONTROL_ERROR =	1 << 5,
> +};
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Kory Maincent 1 year, 2 months ago
Hello Oleksij,

Thanks for your quick reviews!

On Tue, 26 Nov 2024 09:38:27 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> > +int pse_ethtool_set_prio_mode(struct pse_control *psec,
> > +			      struct netlink_ext_ack *extack,
> > +			      u32 prio_mode)
> > +{
> > +	struct pse_controller_dev *pcdev = psec->pcdev;
> > +	const struct pse_controller_ops *ops;
> > +	int ret = 0, i;
> > +
> > +	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
> > +		NL_SET_ERR_MSG(extack, "priority mode not supported");
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	if (!pcdev->pi[psec->id].pw_d) {
> > +		NL_SET_ERR_MSG(extack, "no power domain attached");
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode
> > */
> > +	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> > +	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
> > +		NL_SET_ERR_MSG(extack,
> > +			       "port priority can't be enabled and
> > disabled simultaneously");
> > +		return -EINVAL;
> > +	}
> > +
> > +	ops = psec->pcdev->ops;
> > +
> > +	/* We don't want priority mode change in the middle of an
> > +	 * enable/disable call
> > +	 */
> > +	mutex_lock(&pcdev->lock);
> > +	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;  
> 
> In proposed implementation we have can set policies per port, but it
> will affect complete domain. This is not good. It feels like a separate
> challenge with extra discussion and work. I would recommend not to
> implement policy setting right now.
> 
> If you will decide to implement setting of policies anyway, then we need
> to discuss the interface.
> - If the policy should be done per domain, then we will need a separate
>   interface to interact with domains.
>   Pro: seems to be easier to implement.
> - If we will go with policy per port, wich would make sense too, then
>   some rework of this patch is needed.
>   Pro: can combine best of both strategies: set ports with wide load
>   range to static strategy and use dynamic strategy on other ports.
> 
> Right now we do not have software implementation for dynamic mode,
> implementing configuration of the policies from user space can be
> implemented later. It is enough to provide information about what
> hard coded policy is currently used.

There is no PSE that support static and dynamic mode indeed but the aim was to
be able to disable the budget evaluation strategy.

In fact we could have static strategy with a disconnection policy that do not
power a newly connected PD if we become over budget. This behavior would be
something similar to no budget evaluation strategy.

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Kory Maincent 1 year, 2 months ago
On Tue, 26 Nov 2024 16:31:55 +0100
Kory Maincent <kory.maincent@bootlin.com> wrote:

> Hello Oleksij,
> 
> Thanks for your quick reviews!
> 
> On Tue, 26 Nov 2024 09:38:27 +0100
> Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> 
> > > +int pse_ethtool_set_prio_mode(struct pse_control *psec,
> > > +			      struct netlink_ext_ack *extack,
> > > +			      u32 prio_mode)
> > > +{
> > > +	struct pse_controller_dev *pcdev = psec->pcdev;
> > > +	const struct pse_controller_ops *ops;
> > > +	int ret = 0, i;
> > > +
> > > +	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
> > > +		NL_SET_ERR_MSG(extack, "priority mode not supported");
> > > +		return -EOPNOTSUPP;
> > > +	}
> > > +
> > > +	if (!pcdev->pi[psec->id].pw_d) {
> > > +		NL_SET_ERR_MSG(extack, "no power domain attached");
> > > +		return -EOPNOTSUPP;
> > > +	}
> > > +
> > > +	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode
> > > */
> > > +	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> > > +	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
> > > +		NL_SET_ERR_MSG(extack,
> > > +			       "port priority can't be enabled and
> > > disabled simultaneously");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	ops = psec->pcdev->ops;
> > > +
> > > +	/* We don't want priority mode change in the middle of an
> > > +	 * enable/disable call
> > > +	 */
> > > +	mutex_lock(&pcdev->lock);
> > > +	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;    
> > 
> > In proposed implementation we have can set policies per port, but it
> > will affect complete domain. This is not good. It feels like a separate
> > challenge with extra discussion and work. I would recommend not to
> > implement policy setting right now.
> > 
> > If you will decide to implement setting of policies anyway, then we need
> > to discuss the interface.
> > - If the policy should be done per domain, then we will need a separate
> >   interface to interact with domains.
> >   Pro: seems to be easier to implement.
> > - If we will go with policy per port, wich would make sense too, then
> >   some rework of this patch is needed.
> >   Pro: can combine best of both strategies: set ports with wide load
> >   range to static strategy and use dynamic strategy on other ports.

We already talked about it but a policies per port seems irrelevant to me.
https://lore.kernel.org/netdev/ZySR75i3BEzNbjnv@pengutronix.de/
How do we compare the priority value of ports that use different budget
strategy? How do we manage in the same power domain two ports with
different budget strategies or disconnection policies?

We indeed may need a separate interface to configure the PSE power domain
budget strategies and disconnection policies.

I think not being able to set the budget evaluation strategy is not relevant
for now as we don't have PSE which could support both, but being able to
set the disconnection policies may be relevant.
If we don't add this support to this series how do we decide which is the
default disconnection policy supported?

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Oleksij Rempel 1 year, 2 months ago
On Tue, Nov 26, 2024 at 04:52:28PM +0100, Kory Maincent wrote:
> On Tue, 26 Nov 2024 16:31:55 +0100
> Kory Maincent <kory.maincent@bootlin.com> wrote:
> 
> > Hello Oleksij,
> > 
> > Thanks for your quick reviews!
> > 
> > On Tue, 26 Nov 2024 09:38:27 +0100
> > Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> > 
> > > > +int pse_ethtool_set_prio_mode(struct pse_control *psec,
> > > > +			      struct netlink_ext_ack *extack,
> > > > +			      u32 prio_mode)
> > > > +{
> > > > +	struct pse_controller_dev *pcdev = psec->pcdev;
> > > > +	const struct pse_controller_ops *ops;
> > > > +	int ret = 0, i;
> > > > +
> > > > +	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
> > > > +		NL_SET_ERR_MSG(extack, "priority mode not supported");
> > > > +		return -EOPNOTSUPP;
> > > > +	}
> > > > +
> > > > +	if (!pcdev->pi[psec->id].pw_d) {
> > > > +		NL_SET_ERR_MSG(extack, "no power domain attached");
> > > > +		return -EOPNOTSUPP;
> > > > +	}
> > > > +
> > > > +	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode
> > > > */
> > > > +	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> > > > +	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
> > > > +		NL_SET_ERR_MSG(extack,
> > > > +			       "port priority can't be enabled and
> > > > disabled simultaneously");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	ops = psec->pcdev->ops;
> > > > +
> > > > +	/* We don't want priority mode change in the middle of an
> > > > +	 * enable/disable call
> > > > +	 */
> > > > +	mutex_lock(&pcdev->lock);
> > > > +	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;    
> > > 
> > > In proposed implementation we have can set policies per port, but it
> > > will affect complete domain. This is not good. It feels like a separate
> > > challenge with extra discussion and work. I would recommend not to
> > > implement policy setting right now.
> > > 
> > > If you will decide to implement setting of policies anyway, then we need
> > > to discuss the interface.
> > > - If the policy should be done per domain, then we will need a separate
> > >   interface to interact with domains.
> > >   Pro: seems to be easier to implement.
> > > - If we will go with policy per port, wich would make sense too, then
> > >   some rework of this patch is needed.
> > >   Pro: can combine best of both strategies: set ports with wide load
> > >   range to static strategy and use dynamic strategy on other ports.
> 
> We already talked about it but a policies per port seems irrelevant to me.
> https://lore.kernel.org/netdev/ZySR75i3BEzNbjnv@pengutronix.de/
> How do we compare the priority value of ports that use different budget
> strategy? How do we manage in the same power domain two ports with
> different budget strategies or disconnection policies?

Good question :)

> We indeed may need a separate interface to configure the PSE power domain
> budget strategies and disconnection policies.

And a way to upload everything in atomic way, but I see it as
optimization and can be done separately

> I think not being able to set the budget evaluation strategy is not relevant
> for now as we don't have PSE which could support both,

Both can be implemented for TI. By constantly polling the channel
current register, it should be possible to implement dynamic strategy.

> but being able to set the disconnection policies may be relevant.
> If we don't add this support to this series how do we decide which is the
> default disconnection policy supported?

Use hard coded one ¯\_(ツ)_/¯

Anyway, assuming you will decide to implement everything per port. Here
how I assume it would work.

Budget Evaluation Strategy: We have following modes for now: disabled, static, and
dynamic.
- Disabled: In this mode, the port is excluded from active budget evaluation. It
  allows the port to violate the budget and is intended primarily for testing
  purposes.

- Static: The static method is the safest option and should be used by default.
  When the static method is enabled for a port, the classification information
  (such as power class) is used for budget evaluation.

- Dynamic: If the dynamic method is used, the software will start with the
  classification as an initial step. After that, it will begin monitoring the
  port by polling the current information for the related channel. The system
  will likely use the maximum detected current and a threshold to update or
  reduce the budget allocation for the related port.

Right now I'm not sure about manual mode - for the case where classification
is not working properly.

Disconnection Policy: can only be applied if the Budget Evaluation Strategy
is not disabled.

- Disabled: The port is not subject to disconnection under any circumstances.
  This can be used for critical devices that need to remain powered at all
  times, or for administrative purposes to avoid unexpected disconnections.
  Possible use cases include testing, allowing user space to implement a
  disconnection policy, or pinning certain ports as the highest priority that
  cannot be overwritten for critical devices. Another use case could be
  temporarily addressing misconfigured priorities: an admin could set the
  policy to disabled, adjust the priority, and use a disconnection policy
  resolution status interface (currently not implemented) to verify if the
  port would be disconnected with the updated priority settings, and then set
  the policy back to a non-disabled state.

- Port Index-Based Policy: "I Know What I Do, but I Do Not Have Permission to
  Configure Software"

  Behavior: With this approach, the port index becomes the way to determine the
  priority of connections. Users can manage priorities simply by deciding which
  port they plug into:

    Lower-Indexed Ports: These ports have higher priority, meaning devices
    plugged into these ports are protected from disconnection until absolutely
    necessary.  Higher-Indexed Ports: These ports are more likely to be
    disconnected first when a power budget violation occurs.

    Use Cases:
	Structured Environment Without Full Control: Ideal for situations where
        users understand the importance of the devices but do not have admin
        access to configure software settings.

        Mechanical Administration: Users can simply re-plug devices into
        different ports to change their priority, using the port layout itself
        as a mechanism for priority management. This allows an effective but
        simple way of reassigning criticality without touching the software
        layer.

- LRC-Based Policy: "I Have No Idea About the Architecture, and I Just Enabled
  This One Device"

    Behavior: In this case, the policy targets recent user actions, assuming
    the user has minimal context about the system's architecture. The most
    recently connected device is likely the least critical and is therefore
    disconnected first.

    Use Cases:
        Chaotic Environment: Suitable for environments where users do not
        know the priority structure and are randomly plugging in devices,
        often without understanding the power budget implications.

        Immediate Feedback: If a user connects a device and it is quickly
        disconnected, they are more likely to notice and either try connecting
        at a different time or consult someone for guidance.

- Round-Robin (RR) Policy: "I Do Not Care, but I Want You All to Suffer Equally"

    Behavior: In the Round-Robin policy, all ports are treated equally without
    any specific prioritization. The disconnection burden is distributed evenly
    among all eligible ports, ensuring that no specific port is repeatedly
    penalized. This policy ensures fairness by giving every port an equal
    chance of being disconnected.

    Use Cases:
        Fair Distribution in Shared Environments: Suitable for environments
        where all devices are of similar importance, and fairness is key. This
        ensures that no single user or device consistently bears the impact of
        a budget violation.

	Non-Critical Setup: Ideal for situations where there are no critical
        devices, and all ports should have an equal chance of being
        disconnected. It provides a simple and fair mechanism for managing
        power resources without requiring specific prioritization.

Regarding the mixing of disconnection policies within one power domain, I've
thought more about how we could implement this, particularly when we focus on
three primary types of policies: Least Recently Connected (LRC), Port
Index-Based (Index), and Round-Robin (RR).

Defined Priority Order:

To make the system straightforward and predictable, we could define a hard
execution priority order for these policies:

- LRC - Highest priority: If a port has LRC enabled, it will be considered for
  disconnection first. This ensures that the least recently connected device is
  always targeted first, which works well for devices that are likely temporary
  or less critical.

- Index-Based - Second priority: Once no more LRC ports are eligible for
  disconnection, the system evaluates Index-Based ports. Ports with a higher
  physical index will be prioritized for disconnection, allowing for some manual
  control by re-arranging device connections.

- Round-Robin (RR) - Lowest priority: Finally, Round-Robin is applied to ensure
  fairness among the remaining eligible ports. This policy cycles through the
  ports to distribute the disconnection burden evenly.

User Configuration

In terms of user configuration:

Users only need to set the top allowed priority for each port. For example, if
a port is set to LRC, it will always be considered first for disconnection
during a budget violation. The connection order of all LRC ports should be
preserved.

If a port is set to Index, it will be preserved until all LRC ports are
disconnected.

Setting a port to RR will make it the last in line for disconnection, thus
ensuring the fairest distribution when other more prioritized policies have
already been applied. However, in practice, it may never be executed if all
ports have higher priority policies.


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Kory Maincent 1 year, 2 months ago
On Wed, 27 Nov 2024 10:30:43 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> On Tue, Nov 26, 2024 at 04:52:28PM +0100, Kory Maincent wrote:
> > On Tue, 26 Nov 2024 16:31:55 +0100
> > Kory Maincent <kory.maincent@bootlin.com> wrote:
> >   
> > > Hello Oleksij,
> > > 
> > > Thanks for your quick reviews!
> > > 
> > > On Tue, 26 Nov 2024 09:38:27 +0100
> > > Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> > >   
>  [...]  
>  [...]  
> > 
> > We already talked about it but a policies per port seems irrelevant to me.
> > https://lore.kernel.org/netdev/ZySR75i3BEzNbjnv@pengutronix.de/
> > How do we compare the priority value of ports that use different budget
> > strategy? How do we manage in the same power domain two ports with
> > different budget strategies or disconnection policies?  
> 
> Good question :)
> 
> > We indeed may need a separate interface to configure the PSE power domain
> > budget strategies and disconnection policies.  
> 
> And a way to upload everything in atomic way, but I see it as
> optimization and can be done separately
> 
> > I think not being able to set the budget evaluation strategy is not relevant
> > for now as we don't have PSE which could support both,  
> 
> Both can be implemented for TI. By constantly polling the channel
> current register, it should be possible to implement dynamic strategy.
> 
> > but being able to set the disconnection policies may be relevant.
> > If we don't add this support to this series how do we decide which is the
> > default disconnection policy supported?  
> 
> Use hard coded one ¯\_(ツ)_/¯

I think we could start with disabled disconnection policy for now.
The user cans still play with the priority value which is really reasonable as
there is as many priority values as PSE ports in the static strategy.

Should we still report it in the status as there is no disconnection policy?
Maybe we could add it at the time we will support several disconnection
policies.

> In terms of user configuration:
> 
> Users only need to set the top allowed priority for each port. For example, if
> a port is set to LRC, it will always be considered first for disconnection
> during a budget violation. The connection order of all LRC ports should be
> preserved.
> 
> If a port is set to Index, it will be preserved until all LRC ports are
> disconnected.
> 
> Setting a port to RR will make it the last in line for disconnection, thus
> ensuring the fairest distribution when other more prioritized policies have
> already been applied. However, in practice, it may never be executed if all
> ports have higher priority policies.

That's a nice brainstorm! With that we will have a first idea when we would
like to really implement the disconnection policies.

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Oleksij Rempel 1 year, 2 months ago
On Wed, Nov 27, 2024 at 11:11:26AM +0100, Kory Maincent wrote:
> On Wed, 27 Nov 2024 10:30:43 +0100
> Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> 
> > On Tue, Nov 26, 2024 at 04:52:28PM +0100, Kory Maincent wrote:
> > > On Tue, 26 Nov 2024 16:31:55 +0100
> > > Kory Maincent <kory.maincent@bootlin.com> wrote:
> > >   
> > > > Hello Oleksij,
> > > > 
> > > > Thanks for your quick reviews!
> > > > 
> > > > On Tue, 26 Nov 2024 09:38:27 +0100
> > > > Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> > > >   
> >  [...]  
> >  [...]  
> > > 
> > > We already talked about it but a policies per port seems irrelevant to me.
> > > https://lore.kernel.org/netdev/ZySR75i3BEzNbjnv@pengutronix.de/
> > > How do we compare the priority value of ports that use different budget
> > > strategy? How do we manage in the same power domain two ports with
> > > different budget strategies or disconnection policies?  
> > 
> > Good question :)
> > 
> > > We indeed may need a separate interface to configure the PSE power domain
> > > budget strategies and disconnection policies.  
> > 
> > And a way to upload everything in atomic way, but I see it as
> > optimization and can be done separately
> > 
> > > I think not being able to set the budget evaluation strategy is not relevant
> > > for now as we don't have PSE which could support both,  
> > 
> > Both can be implemented for TI. By constantly polling the channel
> > current register, it should be possible to implement dynamic strategy.
> > 
> > > but being able to set the disconnection policies may be relevant.
> > > If we don't add this support to this series how do we decide which is the
> > > default disconnection policy supported?  
> > 
> > Use hard coded one ¯\_(ツ)_/¯
> 
> I think we could start with disabled disconnection policy for now.
> The user cans still play with the priority value which is really reasonable as
> there is as many priority values as PSE ports in the static strategy.

Hm, since prios without disconnecting do not make sens and it looks more like
all disconnection policies are optimizations steps for configurations with
multiple ports having same prio, i would suggest an implementation where
no same prios are allowed on multiple ports.

> Should we still report it in the status as there is no disconnection policy?
> Maybe we could add it at the time we will support several disconnection
> policies.

Yes. It would be better to have a discussion with some one having real use case.

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority
Posted by Kory Maincent 1 year, 2 months ago
On Wed, 27 Nov 2024 11:31:18 +0100
Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> On Wed, Nov 27, 2024 at 11:11:26AM +0100, Kory Maincent wrote:
> > On Wed, 27 Nov 2024 10:30:43 +0100
> > Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> >   
> > > On Tue, Nov 26, 2024 at 04:52:28PM +0100, Kory Maincent wrote:  
>  [...]  
>  [...]  
> > >  [...]  
> > >  [...]    
>  [...]  
> > > 
> > > Good question :)
> > >   
>  [...]  
> > > 
> > > And a way to upload everything in atomic way, but I see it as
> > > optimization and can be done separately
> > >   
>  [...]  
> > > 
> > > Both can be implemented for TI. By constantly polling the channel
> > > current register, it should be possible to implement dynamic strategy.
> > >   
>  [...]  
> > > 
> > > Use hard coded one ¯\_(ツ)_/¯  
> > 
> > I think we could start with disabled disconnection policy for now.
> > The user cans still play with the priority value which is really reasonable
> > as there is as many priority values as PSE ports in the static strategy.  
> 
> Hm, since prios without disconnecting do not make sens and it looks more like
> all disconnection policies are optimizations steps for configurations with
> multiple ports having same prio, i would suggest an implementation where
> no same prios are allowed on multiple ports.

This won't allow users that don't care about priorities.

I think as we support only budget strategy for now. On the case of power budget
exceeded we should disconnect all the ports that are on the lowest priority
until we get enough power. If there is no ports with a lower priority than the
newly connected we should not power on the newly connected.

With my proposition, at boot as all the ports have the same priority it is like
we don't care about priority.

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com