[PATCH net-next v4 1/2] net: ti: icssg-prueth: Add Frame Preemption MAC Merge support

Meghana Malladi posted 2 patches 1 month ago
Posted by Meghana Malladi 1 month ago
From: MD Danish Anwar <danishanwar@ti.com>

This patch introduces QoS support for the ICSSG driver. This
includes support for configuring the mqprio qdisc and IET FPE.
By default all queues are marked as express; this can be
overridden by the preemptible TC mask passed via the mqprio qdisc.

The icssg_config_ietfpe() work item takes care of configuring
IET FPE in the firmware and triggering the verify state machine
based on the MAC Merge sublayer parameters set through ethtool.
The firmware handles the cleanup after successful MAC verification.
If the remote peer fails to respond to the verify command before
the timeout (5 seconds), the firmware disables FPE.

On link up/down, the verify state machine is triggered again
based on the state of fpe_enabled and fpe_active.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
---

v4-v3:
- Call cancel_work_sync() on interface down and disable_work_sync() at module
  removal time, as suggested by Paolo <pabeni@redhat.com>
- Added INIT_WORK() inside prueth_netdev_init() instead of emac_ndo_open() as
  flagged by AI-generated review.
- Add mutex protection to serialize FPE configuration access.

 drivers/net/ethernet/ti/Makefile             |   2 +-
 drivers/net/ethernet/ti/icssg/icssg_config.h |   9 -
 drivers/net/ethernet/ti/icssg/icssg_prueth.c |  10 +
 drivers/net/ethernet/ti/icssg/icssg_prueth.h |   2 +
 drivers/net/ethernet/ti/icssg/icssg_qos.c    | 223 +++++++++++++++++++
 drivers/net/ethernet/ti/icssg/icssg_qos.h    |  60 +++++
 6 files changed, 296 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/ethernet/ti/icssg/icssg_qos.c
 create mode 100644 drivers/net/ethernet/ti/icssg/icssg_qos.h

diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index 6da50f4b7c2e..6893baf47d46 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -35,7 +35,7 @@ ti-am65-cpsw-nuss-$(CONFIG_TI_K3_AM65_CPSW_SWITCHDEV) += am65-cpsw-switchdev.o
 obj-$(CONFIG_TI_K3_AM65_CPTS) += am65-cpts.o
 
 obj-$(CONFIG_TI_ICSSG_PRUETH) += icssg-prueth.o icssg.o
-icssg-prueth-y := icssg/icssg_prueth.o icssg/icssg_switchdev.o
+icssg-prueth-y := icssg/icssg_prueth.o icssg/icssg_switchdev.o icssg/icssg_qos.o
 
 obj-$(CONFIG_TI_ICSSG_PRUETH_SR1) += icssg-prueth-sr1.o icssg.o
 icssg-prueth-sr1-y := icssg/icssg_prueth_sr1.o
diff --git a/drivers/net/ethernet/ti/icssg/icssg_config.h b/drivers/net/ethernet/ti/icssg/icssg_config.h
index 60d69744ffae..1ac202f855ed 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_config.h
+++ b/drivers/net/ethernet/ti/icssg/icssg_config.h
@@ -323,13 +323,4 @@ struct prueth_fdb_slot {
 	u8 fid;
 	u8 fid_c2;
 } __packed;
-
-enum icssg_ietfpe_verify_states {
-	ICSSG_IETFPE_STATE_UNKNOWN = 0,
-	ICSSG_IETFPE_STATE_INITIAL,
-	ICSSG_IETFPE_STATE_VERIFYING,
-	ICSSG_IETFPE_STATE_SUCCEEDED,
-	ICSSG_IETFPE_STATE_FAILED,
-	ICSSG_IETFPE_STATE_DISABLED
-};
 #endif /* __NET_TI_ICSSG_CONFIG_H */
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index 0939994c932f..fc44beda86a5 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -374,9 +374,11 @@ static void emac_adjust_link(struct net_device *ndev)
 			spin_unlock_irqrestore(&emac->lock, flags);
 			icssg_config_set_speed(emac);
 			icssg_set_port_state(emac, ICSSG_EMAC_PORT_FORWARD);
+			icssg_qos_link_up(ndev);
 
 		} else {
 			icssg_set_port_state(emac, ICSSG_EMAC_PORT_DISABLE);
+			icssg_qos_link_down(ndev);
 		}
 	}
 
@@ -1024,6 +1026,7 @@ static int emac_ndo_stop(struct net_device *ndev)
 	prueth_destroy_rxq(emac);
 
 	cancel_work_sync(&emac->rx_mode_work);
+	cancel_work_sync(&emac->qos.iet.fpe_config_task);
 
 	/* Destroying the queued work in ndo_stop() */
 	cancel_delayed_work_sync(&emac->stats_work);
@@ -1421,6 +1424,7 @@ static const struct net_device_ops emac_netdev_ops = {
 	.ndo_hwtstamp_get = icssg_ndo_get_ts_config,
 	.ndo_hwtstamp_set = icssg_ndo_set_ts_config,
 	.ndo_xsk_wakeup = prueth_xsk_wakeup,
+	.ndo_setup_tc = icssg_qos_ndo_setup_tc,
 };
 
 static int prueth_netdev_init(struct prueth *prueth,
@@ -1455,6 +1459,8 @@ static int prueth_netdev_init(struct prueth *prueth,
 
 	INIT_DELAYED_WORK(&emac->stats_work, icssg_stats_work_handler);
 
+	icssg_qos_init(ndev);
+
 	ret = pruss_request_mem_region(prueth->pruss,
 				       port == PRUETH_PORT_MII0 ?
 				       PRUSS_MEM_DRAM0 : PRUSS_MEM_DRAM1,
@@ -2230,6 +2236,8 @@ static int prueth_probe(struct platform_device *pdev)
 		}
 		unregister_netdev(prueth->registered_netdevs[i]);
 		disable_work_sync(&prueth->emac[i]->rx_mode_work);
+		disable_work_sync(&prueth->emac[i]->qos.iet.fpe_config_task);
+		mutex_destroy(&prueth->emac[i]->qos.iet.fpe_lock);
 	}
 
 netdev_exit:
@@ -2290,6 +2298,8 @@ static void prueth_remove(struct platform_device *pdev)
 		prueth->emac[i]->ndev->phydev = NULL;
 		unregister_netdev(prueth->registered_netdevs[i]);
 		disable_work_sync(&prueth->emac[i]->rx_mode_work);
+		disable_work_sync(&prueth->emac[i]->qos.iet.fpe_config_task);
+		mutex_destroy(&prueth->emac[i]->qos.iet.fpe_lock);
 	}
 
 	for (i = 0; i < PRUETH_NUM_MACS; i++) {
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
index 3d94fa5a7ac1..37de534e4d43 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
@@ -44,6 +44,7 @@
 #include "icssg_config.h"
 #include "icss_iep.h"
 #include "icssg_switch_map.h"
+#include "icssg_qos.h"
 
 #define PRUETH_MAX_MTU          (2000 - ETH_HLEN - ETH_FCS_LEN)
 #define PRUETH_MIN_PKT_SIZE     (VLAN_ETH_ZLEN)
@@ -254,6 +255,7 @@ struct prueth_emac {
 	struct bpf_prog *xdp_prog;
 	struct xdp_attachment_info xdpi;
 	int xsk_qid;
+	struct prueth_qos qos;
 };
 
 /* The buf includes headroom compatible with both skb and xdpf */
diff --git a/drivers/net/ethernet/ti/icssg/icssg_qos.c b/drivers/net/ethernet/ti/icssg/icssg_qos.c
new file mode 100644
index 000000000000..388dfcea426b
--- /dev/null
+++ b/drivers/net/ethernet/ti/icssg/icssg_qos.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Texas Instruments ICSSG PRUETH QoS submodule
+ * Copyright (C) 2023 Texas Instruments Incorporated - http://www.ti.com/
+ */
+
+#include "icssg_prueth.h"
+#include "icssg_switch_map.h"
+
+static void icssg_iet_set_preempt_mask(struct prueth_emac *emac, u8 preemptible_tcs)
+{
+	void __iomem *config = emac->dram.va + ICSSG_CONFIG_OFFSET;
+	struct prueth_qos_mqprio *p_mqprio = &emac->qos.mqprio;
+	struct tc_mqprio_qopt *qopt = &p_mqprio->mqprio.qopt;
+	int prempt_mask = 0, i;
+	u8 tc;
+
+	/* Configure the queues based on the preemptible tc map set by the user */
+	for (tc = 0; tc < p_mqprio->mqprio.qopt.num_tc; tc++) {
+		/* check if the tc is preemptive or not */
+		if (preemptible_tcs & BIT(tc)) {
+			for (i = qopt->offset[tc]; i < qopt->offset[tc] + qopt->count[tc]; i++) {
+				/* Set all the queues in this tc as preemptive queues */
+				writeb(BIT(4), config + EXPRESS_PRE_EMPTIVE_Q_MAP + i);
+				prempt_mask &= ~BIT(i);
+			}
+		} else {
+			/* Set all the queues in this tc as express queues */
+			for (i = qopt->offset[tc]; i < qopt->offset[tc] + qopt->count[tc]; i++) {
+				writeb(0, config + EXPRESS_PRE_EMPTIVE_Q_MAP + i);
+				prempt_mask |= BIT(i);
+			}
+		}
+		netdev_set_tc_queue(emac->ndev, tc, qopt->count[tc], qopt->offset[tc]);
+	}
+	writeb(prempt_mask, config + EXPRESS_PRE_EMPTIVE_Q_MASK);
+}
+
+static void icssg_config_ietfpe(struct work_struct *work)
+{
+	struct prueth_qos_iet *iet =
+		container_of(work, struct prueth_qos_iet, fpe_config_task);
+	void __iomem *config = iet->emac->dram.va + ICSSG_CONFIG_OFFSET;
+	struct prueth_qos_mqprio *p_mqprio =  &iet->emac->qos.mqprio;
+	bool enable = !!atomic_read(&iet->enable_fpe_config);
+	int ret;
+	u8 val;
+
+	if (!netif_running(iet->emac->ndev))
+		return;
+
+	mutex_lock(&iet->fpe_lock);
+
+	/* Update FPE Tx enable bit (PRE_EMPTION_ENABLE_TX) if
+	 * fpe_enabled is set to enable MM in Tx direction
+	 */
+	writeb(enable ? 1 : 0, config + PRE_EMPTION_ENABLE_TX);
+
+	/* If FPE is to be enabled, first configure MAC Verify state
+	 * machine in firmware as firmware kicks the Verify process
+	 * as soon as ICSSG_EMAC_PORT_PREMPT_TX_ENABLE command is
+	 * received.
+	 */
+	if (enable && iet->mac_verify_configure) {
+		writeb(1, config + PRE_EMPTION_ENABLE_VERIFY);
+		writew(iet->tx_min_frag_size, config + PRE_EMPTION_ADD_FRAG_SIZE_LOCAL);
+		writel(iet->verify_time_ms, config + PRE_EMPTION_VERIFY_TIME);
+	}
+
+	/* Send command to enable FPE Tx side. Rx is always enabled */
+	ret = icssg_set_port_state(iet->emac,
+				   enable ? ICSSG_EMAC_PORT_PREMPT_TX_ENABLE :
+					    ICSSG_EMAC_PORT_PREMPT_TX_DISABLE);
+	if (ret) {
+		netdev_err(iet->emac->ndev, "TX preempt %s command failed\n",
+			   str_enable_disable(enable));
+		writeb(0, config + PRE_EMPTION_ENABLE_VERIFY);
+		iet->verify_status = ICSSG_IETFPE_STATE_DISABLED;
+		goto unlock;
+	}
+
+	if (enable && iet->mac_verify_configure) {
+		ret = readb_poll_timeout(config + PRE_EMPTION_VERIFY_STATUS, iet->verify_status,
+					 (iet->verify_status == ICSSG_IETFPE_STATE_SUCCEEDED),
+					 USEC_PER_MSEC, 5 * USEC_PER_SEC);
+		if (ret) {
+			netdev_err(iet->emac->ndev,
+				   "timeout for MAC Verify: status %x\n",
+				   iet->verify_status);
+			iet->verify_status = ICSSG_IETFPE_STATE_FAILED;
+			goto unlock;
+		}
+	} else if (enable) {
+		/* Give f/w some time to update PRE_EMPTION_ACTIVE_TX state */
+		usleep_range(100, 200);
+	}
+
+	if (enable) {
+		val = readb(config + PRE_EMPTION_ACTIVE_TX);
+		if (val != 1) {
+			netdev_err(iet->emac->ndev,
+				   "F/w fails to activate IET/FPE\n");
+			goto unlock;
+		}
+		iet->fpe_active = true;
+	} else {
+		iet->fpe_active = false;
+	}
+
+	netdev_info(iet->emac->ndev, "IET FPE %s successfully\n",
+		    str_enable_disable(iet->fpe_active));
+	icssg_iet_set_preempt_mask(iet->emac, p_mqprio->preemptible_tcs);
+
+unlock:
+	mutex_unlock(&iet->fpe_lock);
+}
+
+void icssg_qos_init(struct net_device *ndev)
+{
+	struct prueth_emac *emac = netdev_priv(ndev);
+	struct prueth_qos_iet *iet = &emac->qos.iet;
+
+	/* Init work queue for IET MAC verify process */
+	iet->emac = emac;
+	INIT_WORK(&iet->fpe_config_task, icssg_config_ietfpe);
+	mutex_init(&iet->fpe_lock);
+}
+
+static int emac_tc_query_caps(struct net_device *ndev, void *type_data)
+{
+	struct tc_query_caps_base *base = type_data;
+
+	switch (base->type) {
+	case TC_SETUP_QDISC_MQPRIO: {
+		struct tc_mqprio_caps *caps = base->caps;
+
+		caps->validate_queue_counts = true;
+		return 0;
+	}
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int emac_tc_setup_mqprio(struct net_device *ndev, void *type_data)
+{
+	struct tc_mqprio_qopt_offload *mqprio = type_data;
+	struct prueth_emac *emac = netdev_priv(ndev);
+	struct tc_mqprio_qopt *qopt = &mqprio->qopt;
+	struct prueth_qos_mqprio *p_mqprio;
+	u8 num_tc = mqprio->qopt.num_tc;
+	int tc, offset, count;
+
+	p_mqprio = &emac->qos.mqprio;
+
+	if (!num_tc) {
+		netdev_reset_tc(ndev);
+		p_mqprio->preemptible_tcs = 0;
+		goto reset_tcs;
+	}
+
+	memcpy(&p_mqprio->mqprio, mqprio, sizeof(*mqprio));
+	p_mqprio->preemptible_tcs = mqprio->preemptible_tcs;
+	netdev_set_num_tc(ndev, mqprio->qopt.num_tc);
+
+	for (tc = 0; tc < num_tc; tc++) {
+		count = qopt->count[tc];
+		offset = qopt->offset[tc];
+		netdev_set_tc_queue(ndev, tc, count, offset);
+	}
+
+reset_tcs:
+	mutex_lock(&emac->qos.iet.fpe_lock);
+	icssg_iet_set_preempt_mask(emac, p_mqprio->preemptible_tcs);
+	mutex_unlock(&emac->qos.iet.fpe_lock);
+
+	return 0;
+}
+
+int icssg_qos_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
+			   void *type_data)
+{
+	switch (type) {
+	case TC_QUERY_CAPS:
+		return emac_tc_query_caps(ndev, type_data);
+	case TC_SETUP_QDISC_MQPRIO:
+		return emac_tc_setup_mqprio(ndev, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+void icssg_qos_link_up(struct net_device *ndev)
+{
+	struct prueth_emac *emac = netdev_priv(ndev);
+	struct prueth_qos_iet *iet = &emac->qos.iet;
+
+	/* Enable FPE if not active but fpe_enabled is true
+	 * and disable FPE if active but fpe_enabled is false
+	 */
+	if (!iet->fpe_active && iet->fpe_enabled) {
+		/* Schedule IET FPE enable */
+		atomic_set(&iet->enable_fpe_config, 1);
+	} else if (iet->fpe_active && !iet->fpe_enabled) {
+		/* Schedule IET FPE disable */
+		atomic_set(&iet->enable_fpe_config, 0);
+	} else {
+		return;
+	}
+	schedule_work(&iet->fpe_config_task);
+}
+
+void icssg_qos_link_down(struct net_device *ndev)
+{
+	struct prueth_emac *emac = netdev_priv(ndev);
+	struct prueth_qos_iet *iet = &emac->qos.iet;
+
+	/* disable FPE if active during link down */
+	if (iet->fpe_active) {
+		/* Schedule IET FPE disable */
+		atomic_set(&iet->enable_fpe_config, 0);
+		schedule_work(&iet->fpe_config_task);
+	}
+}
diff --git a/drivers/net/ethernet/ti/icssg/icssg_qos.h b/drivers/net/ethernet/ti/icssg/icssg_qos.h
new file mode 100644
index 000000000000..653dbb57791d
--- /dev/null
+++ b/drivers/net/ethernet/ti/icssg/icssg_qos.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2023 Texas Instruments Incorporated - http://www.ti.com/
+ */
+
+#ifndef __NET_TI_ICSSG_QOS_H
+#define __NET_TI_ICSSG_QOS_H
+
+#include <linux/atomic.h>
+#include <linux/netdevice.h>
+#include <net/pkt_sched.h>
+
+enum icssg_ietfpe_verify_states {
+	ICSSG_IETFPE_STATE_UNKNOWN = 0,
+	ICSSG_IETFPE_STATE_INITIAL,
+	ICSSG_IETFPE_STATE_VERIFYING,
+	ICSSG_IETFPE_STATE_SUCCEEDED,
+	ICSSG_IETFPE_STATE_FAILED,
+	ICSSG_IETFPE_STATE_DISABLED
+};
+
+struct prueth_qos_mqprio {
+	struct tc_mqprio_qopt_offload mqprio;
+	u8 preemptible_tcs;
+};
+
+struct prueth_qos_iet {
+	struct work_struct fpe_config_task;
+	struct prueth_emac *emac;
+	atomic_t enable_fpe_config;
+	/* Set when IET frame preemption is enabled via ethtool */
+	bool fpe_enabled;
+	/* Set when the IET MAC Verify state machine is enabled
+	 * via ethtool
+	 */
+	bool mac_verify_configure;
+	/* Min TX fragment size, set via ethtool */
+	u32 tx_min_frag_size;
+	/* wait time between verification attempts in ms (according to clause
+	 * 30.14.1.6 aMACMergeVerifyTime), set via ethtool
+	 */
+	u32 verify_time_ms;
+	/* Set if IET FPE is active */
+	bool fpe_active;
+	/* State of verification state machine */
+	enum icssg_ietfpe_verify_states verify_status;
+	/* Mutex to serialize FPE configuration access */
+	struct mutex fpe_lock;
+};
+
+struct prueth_qos {
+	struct prueth_qos_iet iet;
+	struct prueth_qos_mqprio mqprio;
+};
+
+void icssg_qos_init(struct net_device *ndev);
+void icssg_qos_link_up(struct net_device *ndev);
+void icssg_qos_link_down(struct net_device *ndev);
+int icssg_qos_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
+			   void *type_data);
+#endif /* __NET_TI_ICSSG_QOS_H */
-- 
2.43.0
Re: [PATCH net-next v4 1/2] net: ti: icssg-prueth: Add Frame Preemption MAC Merge support
Posted by Vladimir Oltean 1 month ago
Hi Meghana,

On Tue, Feb 24, 2026 at 06:18:02PM +0530, Meghana Malladi wrote:
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_qos.c b/drivers/net/ethernet/ti/icssg/icssg_qos.c
> new file mode 100644
> index 000000000000..388dfcea426b
> --- /dev/null
> +++ b/drivers/net/ethernet/ti/icssg/icssg_qos.c
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Texas Instruments ICSSG PRUETH QoS submodule
> + * Copyright (C) 2023 Texas Instruments Incorporated - http://www.ti.com/
> + */
> +
> +#include "icssg_prueth.h"
> +#include "icssg_switch_map.h"
> +
> +static void icssg_iet_set_preempt_mask(struct prueth_emac *emac, u8 preemptible_tcs)
> +{
> +	void __iomem *config = emac->dram.va + ICSSG_CONFIG_OFFSET;
> +	struct prueth_qos_mqprio *p_mqprio = &emac->qos.mqprio;
> +	struct tc_mqprio_qopt *qopt = &p_mqprio->mqprio.qopt;
> +	int prempt_mask = 0, i;
> +	u8 tc;
> +
> +	/* Configure the queues based on the preemptible tc map set by the user */
> +	for (tc = 0; tc < p_mqprio->mqprio.qopt.num_tc; tc++) {
> +		/* check if the tc is preemptive or not */
> +		if (preemptible_tcs & BIT(tc)) {
> +			for (i = qopt->offset[tc]; i < qopt->offset[tc] + qopt->count[tc]; i++) {
> +				/* Set all the queues in this tc as preemptive queues */
> +				writeb(BIT(4), config + EXPRESS_PRE_EMPTIVE_Q_MAP + i);
> +				prempt_mask &= ~BIT(i);
> +			}
> +		} else {
> +			/* Set all the queues in this tc as express queues */
> +			for (i = qopt->offset[tc]; i < qopt->offset[tc] + qopt->count[tc]; i++) {
> +				writeb(0, config + EXPRESS_PRE_EMPTIVE_Q_MAP + i);
> +				prempt_mask |= BIT(i);
> +			}
> +		}
> +		netdev_set_tc_queue(emac->ndev, tc, qopt->count[tc], qopt->offset[tc]);
> +	}
> +	writeb(prempt_mask, config + EXPRESS_PRE_EMPTIVE_Q_MASK);
> +}

Shouldn't the preemptible TCs be committed to hardware only if FPE is
active? The callers pay absolutely no regard to that.
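
To illustrate the gating, here is a minimal userspace sketch, not driver
code. It computes the express-queue bitmask with the same polarity the
patch writes to EXPRESS_PRE_EMPTIVE_Q_MASK (bit set means express), but
treats every queue as express while FPE is not active. The struct
tc_layout type, the NUM_QUEUES limit of 8 and the express_queue_mask()
helper are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_QUEUES 8	/* assumption: at most 8 TX queues, one mask bit each */

/* Hypothetical stand-in for the relevant fields of struct tc_mqprio_qopt. */
struct tc_layout {
	uint8_t num_tc;
	uint16_t offset[NUM_QUEUES];
	uint16_t count[NUM_QUEUES];
};

/*
 * Return the express-queue bitmask (bit i set means queue i is express).
 * While FPE is not active, every configured queue is reported as express,
 * regardless of the preemptible TC map requested by the user.
 */
static uint8_t express_queue_mask(const struct tc_layout *l,
				  uint8_t preemptible_tcs, bool fpe_active)
{
	uint8_t mask = 0;

	assert(l->num_tc <= NUM_QUEUES);

	for (unsigned int tc = 0; tc < l->num_tc; tc++) {
		bool preemptible = fpe_active && (preemptible_tcs & (1u << tc));
		unsigned int end = (unsigned int)l->offset[tc] + l->count[tc];

		for (unsigned int q = l->offset[tc]; q < end; q++) {
			assert(q < NUM_QUEUES);
			if (!preemptible)
				mask |= (uint8_t)(1u << q);
		}
	}
	return mask;
}
```

With two TCs of two queues each and TC 0 marked preemptible, the helper
reports queues 0-3 as express until fpe_active becomes true, and only
queues 2-3 afterwards.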

> +
> +static void icssg_config_ietfpe(struct work_struct *work)
> +{
> +	struct prueth_qos_iet *iet =
> +		container_of(work, struct prueth_qos_iet, fpe_config_task);
> +	void __iomem *config = iet->emac->dram.va + ICSSG_CONFIG_OFFSET;
> +	struct prueth_qos_mqprio *p_mqprio =  &iet->emac->qos.mqprio;
> +	bool enable = !!atomic_read(&iet->enable_fpe_config);
> +	int ret;
> +	u8 val;
> +
> +	if (!netif_running(iet->emac->ndev))
> +		return;
> +
> +	mutex_lock(&iet->fpe_lock);
> +
> +	/* Update FPE Tx enable bit (PRE_EMPTION_ENABLE_TX) if
> +	 * fpe_enabled is set to enable MM in Tx direction
> +	 */
> +	writeb(enable ? 1 : 0, config + PRE_EMPTION_ENABLE_TX);
> +
> +	/* If FPE is to be enabled, first configure MAC Verify state
> +	 * machine in firmware as firmware kicks the Verify process
> +	 * as soon as ICSSG_EMAC_PORT_PREMPT_TX_ENABLE command is
> +	 * received.
> +	 */
> +	if (enable && iet->mac_verify_configure) {
> +		writeb(1, config + PRE_EMPTION_ENABLE_VERIFY);
> +		writew(iet->tx_min_frag_size, config + PRE_EMPTION_ADD_FRAG_SIZE_LOCAL);
> +		writel(iet->verify_time_ms, config + PRE_EMPTION_VERIFY_TIME);
> +	}
> +
> +	/* Send command to enable FPE Tx side. Rx is always enabled */
> +	ret = icssg_set_port_state(iet->emac,
> +				   enable ? ICSSG_EMAC_PORT_PREMPT_TX_ENABLE :
> +					    ICSSG_EMAC_PORT_PREMPT_TX_DISABLE);
> +	if (ret) {
> +		netdev_err(iet->emac->ndev, "TX preempt %s command failed\n",
> +			   str_enable_disable(enable));
> +		writeb(0, config + PRE_EMPTION_ENABLE_VERIFY);
> +		iet->verify_status = ICSSG_IETFPE_STATE_DISABLED;
> +		goto unlock;
> +	}
> +
> +	if (enable && iet->mac_verify_configure) {
> +		ret = readb_poll_timeout(config + PRE_EMPTION_VERIFY_STATUS, iet->verify_status,
> +					 (iet->verify_status == ICSSG_IETFPE_STATE_SUCCEEDED),
> +					 USEC_PER_MSEC, 5 * USEC_PER_SEC);

You are sleeping up to 5 seconds in the system_percpu_wq kernel-wide
workqueue, blocking the kernel from making any sort of progress with
other items in this workqueue. As include/linux/workqueue.h puts it:
"Don't queue works which can run for too long.".

I guess you should allocate a driver-private workqueue on which you can
sleep as much as you want. Or make the icssg_config_ietfpe task smarter,
to be stateful and reschedule itself until the PRE_EMPTION_VERIFY_STATUS
changes, or a timeout expires. But that's more complicated.
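
The second option can be sketched as a small state machine. This is a
userspace model under stated assumptions, not a drop-in implementation:
struct verify_poll, enum poll_result and verify_poll_step() are
hypothetical names. Each call samples the status once and either
finishes, gives up, or asks to be re-armed. In the driver, POLL_AGAIN
would presumably translate into re-queueing a delayed work item (for
example with queue_delayed_work()) instead of sleeping inside the
handler.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum icssg_ietfpe_verify_states from the patch. */
enum verify_state {
	STATE_UNKNOWN = 0,
	STATE_INITIAL,
	STATE_VERIFYING,
	STATE_SUCCEEDED,
	STATE_FAILED,
	STATE_DISABLED
};

enum poll_result { POLL_AGAIN, POLL_DONE, POLL_TIMED_OUT };

/* Hypothetical per-port context kept between work invocations. */
struct verify_poll {
	uint32_t elapsed_ms;	/* time already spent waiting */
	uint32_t timeout_ms;	/* total budget, 5000 in the patch */
	uint32_t interval_ms;	/* delay before the next sample */
};

/*
 * One invocation of the work item: look at the status once and decide
 * whether to finish, give up, or request another run after interval_ms.
 * Nothing here sleeps, so the handler returns immediately every time.
 */
static enum poll_result verify_poll_step(struct verify_poll *p, uint8_t status)
{
	assert(p->interval_ms > 0);

	if (status == STATE_SUCCEEDED)
		return POLL_DONE;
	if (p->elapsed_ms >= p->timeout_ms)
		return POLL_TIMED_OUT;

	p->elapsed_ms += p->interval_ms;
	return POLL_AGAIN;
}
```

The handler stays short on every run, at the cost of carrying the elapsed
time in per-port state.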

Side note: I had this question on my mind - all contexts from which you
call schedule_work(&iet->fpe_config_task) are sleepable, so why not just
invoke icssg_config_ietfpe() via a direct function call instead?
I guess that's why - it takes too long to reasonably wait for it from
call sites like ethtool, phylink etc. I would make sure this design
decision is part of the commit message.

But let's not lie to ourselves. Having a deferred fpe_config_task
creates its own problems which you are not handling well.

Consider:
- iet->tx_min_frag_size
- iet->verify_time_ms

Writer is emac_set_mm(), reader is icssg_config_ietfpe(). But the reader
can run concurrently with the writer. This means the reader can pick up
and send to firmware settings in an inconsistent state (old tx_min_frag_size
with new verify_time_ms). Configuration which was never requested by the user.

You have a mutex &iet->fpe_lock. Does it help? No.
You have an atomic &iet->enable_fpe_config. Does it help? Also no.

Please try to think of a synchronization pattern where the config
writer, emac_set_mm(), stops or otherwise blocks out the deferred reader
while it's making changes.
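
One possible shape for such a pattern, as a userspace sketch using a
pthread mutex in place of a kernel mutex: the writer publishes all
parameters together under a lock, and the deferred reader copies a
snapshot under the same lock before talking to the firmware. struct
mm_params, struct mm_state and both helpers are hypothetical names, and
the sketch assumes the lock is held only for the copy, not for the whole
verification wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical grouping of the ethtool-provided MAC Merge parameters. */
struct mm_params {
	bool fpe_enabled;
	bool mac_verify_configure;
	uint32_t tx_min_frag_size;
	uint32_t verify_time_ms;
};

struct mm_state {
	pthread_mutex_t lock;		/* stands in for a kernel mutex */
	struct mm_params params;	/* only touched with lock held */
};

/* Writer side (the role of emac_set_mm()): publish all fields together. */
static void mm_set_params(struct mm_state *s, const struct mm_params *new_params)
{
	assert(new_params != NULL);

	pthread_mutex_lock(&s->lock);
	s->params = *new_params;
	pthread_mutex_unlock(&s->lock);
}

/* Reader side (the role of icssg_config_ietfpe()): copy out a snapshot. */
static struct mm_params mm_snapshot(struct mm_state *s)
{
	struct mm_params snap;

	pthread_mutex_lock(&s->lock);
	snap = s->params;
	pthread_mutex_unlock(&s->lock);
	return snap;
}
```

Because the reader works only from its private copy, it can never combine
an old tx_min_frag_size with a new verify_time_ms.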

In addition, schedule_work() while the work is already scheduled will do
nothing. So if the fpe_config_task takes close to 5 seconds and the user
sends multiple ethtool --set-mm requests in that time, they will be
ignored or incorrectly processed.

Also, iet->fpe_active is problematic too. It has emac_get_mm(),
icssg_qos_link_up() and icssg_qos_link_down() as readers, and
icssg_config_ietfpe() as writer. But it's not obvious what the correct
access pattern is. These things rarely work correctly by chance :(

I'm sorry that I don't have more time to untangle everything and see
what would work best. As a result, the comments above are just "some"
observations. Please try to be more deliberate with the synchronization
procedures, explain them and I am more than happy to double-check their
sanity. It's just that not much effort seems to have been put into the
current proposal.