From: Oleksij Rempel
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Andrew Lunn, Thangaraj Samynathan, Rengarajan Sundararajan
Cc: Oleksij Rempel, kernel@pengutronix.de, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, UNGLinuxDriver@microchip.com
Subject: [RFC PATCH 3/4] net: lan78xx: Enhance health reporting with workqueue and detailed flow control stats
Date: Fri, 23 Jan 2026 10:07:39 +0100
Message-ID: <20260123090741.1566469-4-o.rempel@pengutronix.de>
In-Reply-To: <20260123090741.1566469-1-o.rempel@pengutronix.de>
References: <20260123090741.1566469-1-o.rempel@pengutronix.de>

Refactor the health reporting to:

1. Introduce a dedicated work item for TX timeouts, scheduled on the
   system workqueue. This prevents calling devlink_health_report()
   (which may sleep) from an atomic context (netdev tx_timeout).

2. Track TX Pause and RX Pause frames separately in the statistics
   snapshot and the reporting context, allowing finer-grained stall
   analysis (flow control storm induced locally vs. by the link
   partner).

3. Change the devlink recovery function to call
   phylink_mac_change(dev->phylink, false) instead of lan78xx_reset().
   This leverages the newly robust link_down path, which performs the
   necessary locking and a conditional Lite Reset.

Signed-off-by: Oleksij Rempel
---
 drivers/net/usb/lan78xx.c | 133 +++++++++++++++++++++++++-------------
 1 file changed, 87 insertions(+), 46 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 9dadca4101bc..316a3a8d0534 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -425,15 +425,36 @@ struct lan78xx_stat_snapshot {
 	ktime_t time;
 
 	u64 tx_pause_total;
+	u64 rx_pause_total;
 	u64 tx_unicast_total;
 	u64 rx_total_frames;
 	u64 rx_hw_drop_total;
 	u64 rx_sw_packets_total;
 
-	u32 last_delta_pause;
+	u32 last_delta_rx_pause;
+	u32 last_delta_tx_pause;
 	u32 last_delta_drops;
 };
 
+struct lan78xx_dump_ctx {
+	const char *msg;
+	ktime_t ts; /* Timestamp of detection */
+
+	union {
+		struct {
+			u64 delta_tx_pause;
+			u64 delta_rx_pause;
+			u64 delta_rx;
+			u64 delta_hw_drop;
+			u64 delta_sw_rx;
+		} fifo;
+		struct {
+			u32 int_sts; /* The ISR's view of INT_STS */
+			u32 int_enp; /* The ISR's view of INT_ENP_CTL */
+		} err;
+	};
+};
+
 struct irq_domain_data {
 	struct irq_domain *irqdomain;
 	unsigned int phyirq;
@@ -505,27 +526,10 @@ struct lan78xx_net {
 	struct devlink_health_reporter *fifo_reporter;
 	struct devlink_health_reporter *internal_err_reporter;
 	struct lan78xx_stat_snapshot snapshot;
+	struct work_struct tx_timeout_work;
+	struct lan78xx_dump_ctx timeout_ctx;
 };
 
-struct lan78xx_dump_ctx {
-	const char *msg;
-	ktime_t ts; /* Timestamp of detection */
-
-	union {
-		struct {
-			u64 delta_pause;
-			u64 delta_rx;
-			u64 delta_hw_drop;
-			u64 delta_sw_rx;
-		} fifo;
-		struct {
-			u32 int_sts; /* The ISR's view of INT_STS */
-			u32 int_enp; /* The ISR's view of INT_ENP_CTL */
-		} err;
-	};
-};
-
-/* Register Dump Map Structure */
 struct lan78xx_reg_map {
 	u32 reg;
 	const char *name;
@@ -966,7 +970,7 @@ static void lan78xx_check_stat_rollover(struct lan78xx_net *dev,
 
 static void lan78xx_check_stat_anomalies(struct lan78xx_net *dev)
 {
-	u64 delta_pause, delta_rx, delta_hw_drop, delta_sw_rx;
+	u64 delta_tx_pause, delta_rx_pause, delta_rx, delta_hw_drop, delta_sw_rx;
 	struct lan78xx_dump_ctx ctx = {0};
 	struct lan78xx_stat_snapshot now;
 	const char *anomaly_msg = NULL;
@@ -976,6 +980,7 @@ static void lan78xx_check_stat_anomalies(struct lan78xx_net *dev)
 
 	mutex_lock(&dev->stats.access_lock);
 	now.tx_pause_total = dev->stats.curr_stat.tx_pause_frames;
+	now.rx_pause_total = dev->stats.curr_stat.rx_pause_frames;
 	now.rx_total_frames = dev->stats.curr_stat.rx_unicast_frames +
 			      dev->stats.curr_stat.rx_broadcast_frames +
 			      dev->stats.curr_stat.rx_multicast_frames;
@@ -985,17 +990,19 @@ static void lan78xx_check_stat_anomalies(struct lan78xx_net *dev)
 
 	now.rx_sw_packets_total = dev->net->stats.rx_packets;
 
-	delta_pause = now.tx_pause_total - dev->snapshot.tx_pause_total;
+	delta_tx_pause = now.tx_pause_total - dev->snapshot.tx_pause_total;
+	delta_rx_pause = now.rx_pause_total - dev->snapshot.rx_pause_total;
 	delta_rx = now.rx_total_frames - dev->snapshot.rx_total_frames;
 	delta_hw_drop = now.rx_hw_drop_total - dev->snapshot.rx_hw_drop_total;
 	delta_sw_rx = now.rx_sw_packets_total - dev->snapshot.rx_sw_packets_total;
 
-	now.last_delta_pause = (u32)delta_pause;
+	now.last_delta_tx_pause = (u32)delta_tx_pause;
+	now.last_delta_rx_pause = (u32)delta_rx_pause;
 	now.last_delta_drops = (u32)delta_hw_drop;
 
 	dev->snapshot = now;
 
-	if (delta_pause > LAN78XX_STALL_PAUSE_THRESH && delta_rx == 0) {
+	if (delta_tx_pause > LAN78XX_STALL_PAUSE_THRESH && delta_rx == 0) {
 		anomaly_msg = "Stall: Pause Storm & No RX";
 	} else if (delta_hw_drop > LAN78XX_LIVELOCK_DROP_THRESH &&
 		   delta_hw_drop > (delta_sw_rx * LAN78XX_LIVELOCK_DROP_RATIO)) {
@@ -1008,10 +1015,11 @@ static void lan78xx_check_stat_anomalies(struct lan78xx_net *dev)
 	/* 5. Reporting */
 	ctx.msg = anomaly_msg;
 	ctx.ts = now.time;
-	ctx.fifo.delta_pause = delta_pause;
-	ctx.fifo.delta_rx = delta_rx;
+	ctx.fifo.delta_tx_pause = delta_tx_pause;
+	ctx.fifo.delta_rx_pause = delta_rx_pause;
+	ctx.fifo.delta_rx = delta_rx;
 	ctx.fifo.delta_hw_drop = delta_hw_drop;
-	ctx.fifo.delta_sw_rx = delta_sw_rx;
+	ctx.fifo.delta_sw_rx = delta_sw_rx;
 
 	netdev_warn(dev->net, "%s (HW Drops: +%llu, SW RX: +%llu)\n",
 		    ctx.msg, delta_hw_drop, delta_sw_rx);
@@ -2495,6 +2503,24 @@ static void lan78xx_mac_config(struct phylink_config *config, unsigned int mode,
 			   ERR_PTR(ret));
 }
 
+static int lan78xx_configure_flowcontrol(struct lan78xx_net *dev,
+					 bool tx_pause, bool rx_pause);
+static int lan78xx_reset(struct lan78xx_net *dev);
+
+static void lan78xx_dump_status(struct lan78xx_net *dev, const char *msg)
+{
+	u32 int_sts, mac_tx, fct_tx_ctl, mac_rx, fct_rx_ctl;
+
+	lan78xx_read_reg(dev, INT_STS, &int_sts);
+	lan78xx_read_reg(dev, MAC_TX, &mac_tx);
+	lan78xx_read_reg(dev, FCT_TX_CTL, &fct_tx_ctl);
+	lan78xx_read_reg(dev, MAC_RX, &mac_rx);
+	lan78xx_read_reg(dev, FCT_RX_CTL, &fct_rx_ctl);
+
+	netdev_info(dev->net, "[%s] INT_STS: 0x%08x, MAC_TX: 0x%08x, FCT_TX: 0x%08x, MAC_RX: 0x%08x, FCT_RX: 0x%08x\n",
+		    msg, int_sts, mac_tx, fct_tx_ctl, mac_rx, fct_rx_ctl);
+}
+
 static void lan78xx_mac_link_down(struct phylink_config *config,
 				  unsigned int mode, phy_interface_t interface)
 {
@@ -4939,8 +4965,10 @@ static int lan78xx_fifo_dump(struct devlink_health_reporter *reporter,
 				  ktime_to_ns(ctx->ts));
 
 	devlink_fmsg_obj_nest_start(fmsg);
-	devlink_fmsg_u64_pair_put(fmsg, "trigger_delta_pause",
-				  ctx->fifo.delta_pause);
+	devlink_fmsg_u64_pair_put(fmsg, "trigger_delta_tx_pause",
+				  ctx->fifo.delta_tx_pause);
+	devlink_fmsg_u64_pair_put(fmsg, "trigger_delta_rx_pause",
+				  ctx->fifo.delta_rx_pause);
 	devlink_fmsg_u64_pair_put(fmsg, "trigger_delta_rx",
 				  ctx->fifo.delta_rx);
 	devlink_fmsg_u64_pair_put(fmsg, "trigger_delta_hw_drop",
@@ -4989,8 +5017,9 @@ static int lan78xx_fifo_recover(struct devlink_health_reporter *reporter,
 {
 	struct lan78xx_net *dev = devlink_health_reporter_priv(reporter);
 
-	netdev_warn(dev->net, "Recovering from FIFO stall via Lite Reset\n");
-	return lan78xx_reset(dev);
+	netdev_warn(dev->net, "Recovering via Lite Reset\n");
+	phylink_mac_change(dev->phylink, false);
+	return 0;
 }
 
 static const struct devlink_health_reporter_ops lan78xx_fifo_ops = {
@@ -5075,6 +5104,7 @@ static void lan78xx_disconnect(struct usb_interface *intf)
 
 	lan78xx_health_cleanup(dev);
 	if (dev->devlink) {
+		cancel_work_sync(&dev->tx_timeout_work);
 		devlink_unregister(dev->devlink);
 		devlink_free(dev->devlink);
 		dev->devlink = NULL;
@@ -5107,36 +5137,45 @@ static void lan78xx_disconnect(struct usb_interface *intf)
 	usb_put_dev(udev);
 }
 
+static void lan78xx_tx_timeout_work(struct work_struct *work)
+{
+	struct lan78xx_net *dev = container_of(work, struct lan78xx_net,
+					       tx_timeout_work);
+
+	devlink_health_report(dev->fifo_reporter, dev->timeout_ctx.msg,
+			      &dev->timeout_ctx);
+}
+
 static void lan78xx_tx_timeout(struct net_device *net, unsigned int txqueue)
 {
 	struct lan78xx_net *dev = netdev_priv(net);
-	struct lan78xx_dump_ctx ctx = {0};
-	s64 diff_ms;
+	s64 diff_ms = 0;
 
 	/* Calculate time since last health check */
-	ctx.ts = ktime_get_real();
-	diff_ms = ktime_ms_delta(ctx.ts, dev->snapshot.time);
+	dev->timeout_ctx.ts = ktime_get_real();
+	diff_ms = ktime_ms_delta(dev->timeout_ctx.ts, dev->snapshot.time);
 
 	/* We rely on the trend data captured during the last valid stat update
 	 * to infer the system state before the crash.
 	 */
-	if (dev->snapshot.last_delta_pause > LAN78XX_STALL_PAUSE_THRESH)
-		ctx.msg = "TX Timeout (Flow Control Storm?)";
+	if (dev->snapshot.last_delta_rx_pause > LAN78XX_STALL_PAUSE_THRESH)
+		dev->timeout_ctx.msg = "TX Timeout (Link Partner Pause Storm?)";
+	else if (dev->snapshot.last_delta_tx_pause > LAN78XX_STALL_PAUSE_THRESH)
+		dev->timeout_ctx.msg = "TX Timeout (Local Flow Control Storm?)";
 	else if (dev->snapshot.last_delta_drops > LAN78XX_TX_TIMEOUT_DROP_THRESH)
-		ctx.msg = "TX Timeout (FIFO Drop Storm?)";
+		dev->timeout_ctx.msg = "TX Timeout (FIFO Drop Storm?)";
 	else
-		ctx.msg = "TX Timeout";
+		dev->timeout_ctx.msg = "TX Timeout";
 
-	ctx.fifo.delta_pause = dev->snapshot.last_delta_pause;
-	ctx.fifo.delta_hw_drop = dev->snapshot.last_delta_drops;
+	dev->timeout_ctx.fifo.delta_rx_pause = dev->snapshot.last_delta_rx_pause;
+	dev->timeout_ctx.fifo.delta_tx_pause = dev->snapshot.last_delta_tx_pause;
+	dev->timeout_ctx.fifo.delta_hw_drop = dev->snapshot.last_delta_drops;
 
 	netdev_warn(dev->net, "%s (Last stat update: %lld ms ago)\n",
-		    ctx.msg, diff_ms);
+		    dev->timeout_ctx.msg, diff_ms);
 
-	devlink_health_report(dev->fifo_reporter, ctx.msg, &ctx);
-
-	unlink_urbs(dev, &dev->txq);
-	napi_schedule(&dev->napi);
+	/* Defer report to worker to avoid sleeping in atomic context */
+	schedule_work(&dev->tx_timeout_work);
 }
 
 static netdev_features_t lan78xx_features_check(struct sk_buff *skb,
@@ -5542,6 +5581,8 @@ static int lan78xx_probe(struct usb_interface *intf,
 	pm_runtime_set_autosuspend_delay(&udev->dev,
 					 DEFAULT_AUTOSUSPEND_DELAY);
 
+	INIT_WORK(&dev->tx_timeout_work, lan78xx_tx_timeout_work);
+
 	dev->devlink = devlink_alloc(&lan78xx_devlink_ops,
 				     sizeof(struct lan78xx_devlink_priv),
 				     &udev->dev);
-- 
2.47.3
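
For readers who want to reason about the new TX timeout classification outside the driver, the priority order used in lan78xx_tx_timeout() can be modelled in plain user-space C. This is only a sketch under stated assumptions: struct snapshot_trend and classify_tx_timeout() are hypothetical names introduced here, and the two threshold values are placeholders, since the real LAN78XX_STALL_PAUSE_THRESH and LAN78XX_TX_TIMEOUT_DROP_THRESH definitions are not part of this patch.

```c
#include <stdint.h>

/* Placeholder values (assumption): the driver's real thresholds are
 * defined elsewhere and are not shown in this patch.
 */
#define STALL_PAUSE_THRESH	100u
#define TX_TIMEOUT_DROP_THRESH	100u

/* Hypothetical stand-in for the trend fields that the patch keeps in
 * struct lan78xx_stat_snapshot.
 */
struct snapshot_trend {
	uint32_t last_delta_rx_pause;	/* pause frames received from the link partner */
	uint32_t last_delta_tx_pause;	/* pause frames sent by the local MAC */
	uint32_t last_delta_drops;	/* hardware RX drops */
};

/* Mirrors the if/else-if chain of the patched lan78xx_tx_timeout():
 * RX pause is checked first, then TX pause, then drops.
 */
static const char *classify_tx_timeout(const struct snapshot_trend *s)
{
	if (s->last_delta_rx_pause > STALL_PAUSE_THRESH)
		return "TX Timeout (Link Partner Pause Storm?)";
	if (s->last_delta_tx_pause > STALL_PAUSE_THRESH)
		return "TX Timeout (Local Flow Control Storm?)";
	if (s->last_delta_drops > TX_TIMEOUT_DROP_THRESH)
		return "TX Timeout (FIFO Drop Storm?)";
	return "TX Timeout";
}
```

Because RX pause is tested first, a snapshot in which both pause counters exceed the threshold is attributed to the link partner. The comparisons are strict, so a delta exactly equal to the threshold does not trigger the corresponding message.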