drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
Hi netdev,

This is v4 of my series containing a pair of bugfixes for the stmmac driver's
receive pipeline. These issues occur when stmmac_rx_refill() does not (fully)
succeed, which happens more frequently when free memory is low.

The first patch closes Bugzilla bug #221010 [1], where stmmac_rx() can circle
around to a still-dirty descriptor (with a NULL buffer pointer), mistake it for
a filled descriptor (due to OWN=0), and attempt to dereference the buffer.

In testing that patch, I discovered a second issue: starvation of available RX
buffers causes the NIC to stop sending interrupts; if the driver stops polling,
it will wait indefinitely for an interrupt that will never come. (Note: the
first patch makes this issue more prominent -- mostly because it lets the
system survive long enough to exhibit it -- but doesn't *cause* it.) The second
patch addresses that problem as well.

Both patches are minimal, appropriate for stable, and designated to `net`. My
focus is on small, obviously-correct, easy-to-explain changes: I'll follow up
with another patch/series (something like [2]) for `net-next` that fixes the
ring in a more robust way.

The tx and zc paths seem to have similar low-memory bugs, to be addressed in
separate series.

Regards,
Sam

---

[1] https://bugzilla.kernel.org/show_bug.cgi?id=221010
[2] https://lore.kernel.org/netdev/20260316021009.262358-4-CFSworks@gmail.com/

v4:
- Changed patch 2 to tolerate dirty stragglers up to a critical threshold (the
  same threshold tolerated by the zero-copy path), to avoid nuisance looping
  during OOM conditions (thanks Jakub)

v3: https://lore.kernel.org/netdev/20260328192503.520689-1-CFSworks@gmail.com/T/
- Rebased on latest net/main
- Changed patch 2 to require that stmmac_rx_refill() *fully* succeeds before
  exiting polling, to reduce the chance of rx drops.
- DID NOT use the CIRC_SPACE() macro as suggested by Russell: I fear that the
  perspective shift (first think of the dirty descriptors as the "work" that
  refill "consumes" -- therefore the "space" is how much stmmac_rx() may loop)
  is too counterintuitive for a stable fix, but I'll do it in v4 if reviewers
  insist.
- Updated the recipients for the series, which was invalidated in v2 due to the
  `Fixes:`

v2: https://lore.kernel.org/netdev/20260319184031.8596-1-CFSworks@gmail.com/T/
- Completely rewrote the commit message of patch 1, now assuming the reader is
  generally familiar with DMA but wholly unfamiliar with the stmmac device
  (thanks Jakub!)
- Added missing `Fixes:` to patch 2
- Moved patch 2's `int budget = limit;` decl per the reverse-xmas-tree rule
- Dropped patch 3: this was a code improvement not appropriate for stable
- Generated the series with --subject-prefix='PATCH net'

v1: https://lore.kernel.org/netdev/20260316021009.262358-1-CFSworks@gmail.com/

Sam Edwards (2):
  net: stmmac: Prevent NULL deref when RX memory exhausted
  net: stmmac: Prevent indefinite RX stall on buffer exhaustion

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--
2.52.0
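The first failure mode in the cover letter (the receive loop circling around to a still-dirty descriptor) can be illustrated with a small userspace model. This is only a sketch and not the driver code or the patch itself: the ring size, the struct layout, the explicit `dirty` counter, and the names `rx_poll()`, `rx_refill()` and `ring_fill_all()` are all hypothetical. It assumes only what the cover letter states: a dirty descriptor has OWN=0 and a NULL buffer pointer, so an unguarded loop cannot tell it apart from a received packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u            /* hypothetical ring size for the model */

struct rx_desc {
	bool own;               /* true: owned by the NIC (OWN=1) */
	void *buf;              /* NULL while the descriptor is dirty */
};

struct rx_ring {
	struct rx_desc desc[RING_SIZE];
	unsigned int cur_rx;    /* next descriptor the receive loop inspects */
	unsigned int dirty_rx;  /* oldest descriptor waiting for a refill */
	unsigned int dirty;     /* count of descriptors waiting for a refill */
};

static char dummy_buf[64];      /* stands in for a real RX buffer */

/* Pretend the NIC has filled every descriptor and handed it to the driver. */
static void ring_fill_all(struct rx_ring *r)
{
	for (unsigned int i = 0; i < RING_SIZE; i++) {
		r->desc[i].own = false;
		r->desc[i].buf = dummy_buf;
	}
	r->cur_rx = r->dirty_rx = r->dirty = 0;
}

/*
 * Consume up to 'budget' packets. Returns the number consumed, or -1 if a
 * dirty descriptor was mistaken for a received packet (in a real driver this
 * would be the NULL dereference).
 */
static int rx_poll(struct rx_ring *r, int budget, bool guard)
{
	int done = 0;

	while (done < budget) {
		struct rx_desc *d = &r->desc[r->cur_rx];

		assert(r->dirty <= RING_SIZE);
		if (guard && r->dirty == RING_SIZE)
			break;          /* every descriptor is dirty: stop */
		if (d->own)
			break;          /* NIC has not filled this one yet */
		if (d->buf == NULL)
			return -1;      /* OWN=0 but no buffer: dirty entry */

		d->buf = NULL;          /* buffer handed up; now dirty */
		r->cur_rx = (r->cur_rx + 1) % RING_SIZE;
		r->dirty++;
		done++;
	}
	return done;
}

/* Refill at most 'avail' descriptors; avail == 0 models allocation failure. */
static unsigned int rx_refill(struct rx_ring *r, unsigned int avail)
{
	unsigned int n = 0;

	while (r->dirty > 0 && n < avail) {
		struct rx_desc *d = &r->desc[r->dirty_rx];

		d->buf = dummy_buf;
		d->own = true;          /* hand the descriptor back to the NIC */
		r->dirty_rx = (r->dirty_rx + 1) % RING_SIZE;
		r->dirty--;
		n++;
	}
	return n;
}
```

In this model, three polls of four packets each with no successful refill in between reproduce the problem: with `guard` false the third poll returns -1, while with `guard` true it returns 0 and the ring can recover once `rx_refill()` finds memory again.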
On Tue, Mar 31, 2026 at 09:19:27PM -0700, Sam Edwards wrote: > Hi netdev, > > This is v4 of my series containing a pair of bugfixes for the stmmac driver's > receive pipeline. These issues occur when stmmac_rx_refill() does not (fully) > succeed, which happens more frequently when free memory is low. > > The first patch closes Bugzilla bug #221010 [1], where stmmac_rx() can circle > around to a still-dirty descriptor (with a NULL buffer pointer), mistake it for > a filled descriptor (due to OWN=0), and attempt to dereference the buffer. > > In testing that patch, I discovered a second issue: starvation of available RX > buffers causes the NIC to stop sending interrupts; if the driver stops polling, > it will wait indefinitely for an interrupt that will never come. (Note: the > first patch makes this issue more prominent -- mostly because it lets the > system survive long enough to exhibit it -- but doesn't *cause* it.) The second > patch addresses that problem as well. > > Both patches are minimal, appropriate for stable, and designated to `net`. My > focus is on small, obviously-correct, easy-to-explain changes: I'll follow up > with another patch/series (something like [2]) for `net-next` that fixes the > ring in a more robust way. > > The tx and zc paths seem to have similar low-memory bugs, to be addressed in > separate series. I've tested this on my Jetson Xavier platform. One of the issues I've had is that running iperf3 results in the receive side stalling because it runs out of descriptors. However, despite the receive ring eventually being re-filled and the hardware appropriately prodded, it steadfastly refuses to restart, despite the descriptors having been updated. What I can see is there's 40 packets in the internal FIFOs via the PRXQ[13:0] field of the ETH_MTLRXQxDR register. 
With your patches applied:

root@tegra-ubuntu:~# iperf3 -c 192.168.248.1 -R
Connecting to host 192.168.248.1, port 5201
Reverse mode, remote host 192.168.248.1 is sending
[ 5] local 192.168.248.174 port 43728 connected to 192.168.248.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 30.3 MBytes 254 Mbits/sec
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
...

The remote system says:

Accepted connection from 192.168.248.174, port 43720
[ 5] local 192.168.248.1 port 5201 connected to 192.168.248.174 port 43728
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 31.8 MBytes 266 Mbits/sec 1 1.41 KBytes
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes

This is a dwmac v5.0. I think the relevant registers are:

ETH_MTLQxICSR:
Value at address 0x02490d2c: 0x00000002

ETH_MTLRXQxDR:
Value at address 0x02490d38: 0x00280020
  PRXQ[13:0]: Number of Packets in Receive Queue = 40
  RXQSTS[1:0]: MTL Rx Queue Fill-Level status = 2
    (Rx queue fill-level above flow-control activate threshold)
  RRXSTS[1:0]: MTL Rx Queue Read Controller State = 0 (Idle)
  RWCSTS: MTL Rx Queue Write Controller Active Status = 0

ETH_DMADS1R:
Value at address 0x0249100c: 0x00006300
  RPS0[3:0]: DMA Channel Receive Process State = Running (Waiting for Rx packet)

ETH_DMACxRXDTPR:
Value at address 0x02491128: 0xffffe7d0
  RDT[31:0]: Receive Descriptor Tail Pointer
    (writes should apparently trigger DMA to start, but doesn't seem to)

ETH_DMACxCARXDR:
Value at address 0x0249114c: 0xffffe7d0
  CURRDESAPTR[31:0]: Application receive descriptor address pointer

ETH_DMACxCARXBR:
Value at address 0x0249115c: 0xffb55040
  CURRBUFAPTR[31:0]: Application receive buffer address pointer

This makes it look like it's read the descriptor at 0x7fffffe7d0.

ETH_DMAC0SR: Value at address 0x02491160: 0x00000484 ETI (early transmit interrupt)=1 RBU (receive buffer unavailable)=1 TBU (transmit buffer unavailable)=1 Clearing RBU doesn't seem to help. Descriptors: 123 [0x0000007fffffe7b0]: 0xffbba040 0x7f 0x0 0x81000000 124 [0x0000007fffffe7c0]: 0xffbb9040 0x7f 0x0 0x81000000 125 [0x0000007fffffe7d0]: 0xffb55040 0x7f 0x0 0x81000000 <--- 126 [0x0000007fffffe7e0]: 0xffc26040 0x7f 0x0 0x81000000 127 [0x0000007fffffe7f0]: 0xfff95040 0x7f 0x0 0x81000000 So they're all refilled. Any ideas? Full register dump via devmem2 (ethtool -d doesn't dump all registers): Value at address 0x02490000: 0x08072203 Value at address 0x02490004: 0x00200000 Value at address 0x02490008: 0x00010404 Value at address 0x0249000c: 0x00000000 Value at address 0x02490010: 0x00040008 Value at address 0x02490014: 0x20000008 Value at address 0x02490018: 0x00000001 Value at address 0x0249001c: 0x00000001 Value at address 0x02490020: 0x00000000 Value at address 0x02490024: 0x00000000 Value at address 0x02490028: 0x00000000 Value at address 0x0249002c: 0x00000000 Value at address 0x02490030: 0x00000000 Value at address 0x02490034: 0x00000000 Value at address 0x02490038: 0x00000000 Value at address 0x0249003c: 0x00000000 Value at address 0x02490040: 0x00000000 Value at address 0x02490044: 0x00000000 Value at address 0x02490048: 0x00000000 Value at address 0x0249004c: 0x00000000 Value at address 0x02490050: 0x01600000 Value at address 0x02490054: 0x00000000 Value at address 0x02490058: 0x00000000 Value at address 0x0249005c: 0x00000000 Value at address 0x02490060: 0x00120000 Value at address 0x02490064: 0x00000000 Value at address 0x02490068: 0x00000000 Value at address 0x0249006c: 0x00000000 Value at address 0x02490070: 0xffff0002 Value at address 0x02490074: 0x00000000 Value at address 0x02490078: 0x00000000 Value at address 0x0249007c: 0x00000000 Value at address 0x02490080: 0x00000000 Value at address 0x02490084: 0x00000000 Value at address 0x02490088: 
0x00000000 Value at address 0x0249008c: 0x00000000 Value at address 0x02490090: 0x00000001 Value at address 0x02490094: 0x00000000 Value at address 0x02490098: 0x00000000 Value at address 0x0249009c: 0x00000000 Value at address 0x024900a0: 0x00000002 Value at address 0x024900a4: 0x00000000 Value at address 0x024900a8: 0x00000000 Value at address 0x024900ac: 0x00000000 Value at address 0x024900b0: 0x00000001 Value at address 0x024900b4: 0x00001030 Value at address 0x024900b8: 0x00000000 Value at address 0x024900bc: 0x00000000 Value at address 0x024900c0: 0x00000000 Value at address 0x024900c4: 0x00000000 Value at address 0x024900c8: 0x00000000 Value at address 0x024900cc: 0x00000000 Value at address 0x024900d0: 0x003b0200 Value at address 0x024900d4: 0x03e8001e Value at address 0x024900d8: 0x000f4240 Value at address 0x024900dc: 0x0000007c Value at address 0x024900e0: 0x00000000 Value at address 0x024900e4: 0x00000000 Value at address 0x024900e8: 0x00000000 Value at address 0x024900ec: 0x00000000 Value at address 0x024900f0: 0x00000000 Value at address 0x024900f4: 0x00000000 Value at address 0x024900f8: 0x000d0000 Value at address 0x024900fc: 0x00000000 Value at address 0x02490100: 0x00000000 Value at address 0x02490104: 0x00000000 Value at address 0x02490108: 0x00000000 Value at address 0x0249010c: 0x00000000 Value at address 0x02490110: 0x00001050 Value at address 0x02490114: 0x00000000 Value at address 0x02490118: 0x00000000 Value at address 0x0249011c: 0x1bfd73f7 Value at address 0x02490120: 0x421e7a49 Value at address 0x02490124: 0x100c30c3 Value at address 0x02490128: 0x00320220 Value at address 0x02490200: 0x000e010c Value at address 0x02490204: 0x00000006 Value at address 0x02490208: 0x00000000 Value at address 0x0249020c: 0x00000000 Value at address 0x02490210: 0x00000000 Value at address 0x02490214: 0x00000000 Value at address 0x02490218: 0x00000000 Value at address 0x0249021c: 0x00000000 Value at address 0x02490220: 0x00000000 Value at address 0x02490224: 
0x00000000 Value at address 0x02490228: 0x00000000 Value at address 0x0249022c: 0x00000000 Value at address 0x02490230: 0x00000000 Value at address 0x02490234: 0x00000000 Value at address 0x02490238: 0x00000102 Value at address 0x02490d00: 0x00ff000a Value at address 0x02490d04: 0x00000000 Value at address 0x02490d08: 0x00000000 Value at address 0x02490d0c: 0x00000000 Value at address 0x02490d10: 0x00000000 Value at address 0x02490d14: 0x00000030 Value at address 0x02490d18: 0x00000000 Value at address 0x02490d1c: 0x00000000 Value at address 0x02490d20: 0x00000000 Value at address 0x02490d24: 0x00000000 Value at address 0x02490d28: 0x00000000 Value at address 0x02490d2c: 0x00000002 Value at address 0x02490d30: 0x0ff1c4e0 Value at address 0x02490d34: 0x00000000 Value at address 0x02490d38: 0x00280020 Value at address 0x02490d3c: 0x00000000 Value at address 0x02491000: 0x00000000 Value at address 0x02491004: 0x0002180e Value at address 0x02491008: 0x00000000 Value at address 0x0249100c: 0x00006300 Value at address 0x02491010: 0x00000000 Value at address 0x02491014: 0x00000000 Value at address 0x02491018: 0x00000000 Value at address 0x0249101c: 0x00000000 Value at address 0x02491020: 0x00000000 Value at address 0x02491024: 0x00000000 Value at address 0x02491028: 0x00000000 Value at address 0x0249102c: 0x00000000 Value at address 0x02491030: 0x00000000 Value at address 0x02491034: 0x00000000 Value at address 0x02491038: 0x00000000 Value at address 0x0249103c: 0x00000000 Value at address 0x02491040: 0x00000000 Value at address 0x02491044: 0x00000000 Value at address 0x02491048: 0x00000000 Value at address 0x0249104c: 0x00000000 Value at address 0x02491050: 0x00000000 Value at address 0x02491054: 0x00000000 Value at address 0x02491100: 0x00010000 Value at address 0x02491104: 0x00101011 Value at address 0x02491108: 0x00080c01 Value at address 0x0249110c: 0x00000000 Value at address 0x02491110: 0x0000007f Value at address 0x02491114: 0xffffc000 Value at address 0x02491118: 
0x0000007f Value at address 0x0249111c: 0xffffe000 Value at address 0x02491120: 0xffffc500 Value at address 0x02491124: 0x00000000 Value at address 0x02491128: 0xffffe7d0 Value at address 0x0249112c: 0x000001ff Value at address 0x02491130: 0x000001ff Value at address 0x02491134: 0x0000d041 Value at address 0x02491138: 0x000000a0 Value at address 0x0249113c: 0x000d07c0 Value at address 0x02491140: 0x00000000 Value at address 0x02491144: 0xffffc500 Value at address 0x02491148: 0x00000000 Value at address 0x0249114c: 0xffffe7d0 Value at address 0x02491150: 0x0000007f Value at address 0x02491154: 0xffc45b02 Value at address 0x02491158: 0x0000007f Value at address 0x0249115c: 0xffb55040 Value at address 0x02491160: 0x00000484 Value at address 0x02491164: 0x00000000 Value at address 0x02491168: 0x00000000 Value at address 0x0249116c: 0x00000000 Value at address 0x02491170: 0x00000000 Value at address 0x02491174: 0x00000000 Value at address 0x02491178: 0x00000000 Value at address 0x0249117c: 0x00000000 -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
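
The register values decoded in the message above can be cross-checked with a few lines of C. This is a hedged sketch: the bit positions used here (PRXQ in bits 29:16, RXQSTS in bits 5:4, RRXSTS in bits 2:1, RWCSTS in bit 0, and TBU/RBU/ETI in bits 2/7/10 of the channel status register) are assumptions chosen because they reproduce the values quoted in the message; they should be confirmed against the hardware databook. The 16-byte descriptor stride and the ring base 0x7fffffe000 are inferred from the descriptor listing (index 123 at 0x7fffffe7b0). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits hi..lo (inclusive) from a 32-bit register value. */
static uint32_t field(uint32_t reg, unsigned int hi, unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);

	assert(hi >= lo && hi < 32);
	return (reg >> lo) & mask;
}

struct mtl_rxq_debug {
	uint32_t prxq;    /* packets held in the MTL Rx queue */
	uint32_t rxqsts;  /* Rx queue fill-level status */
	uint32_t rrxsts;  /* Rx queue read controller state */
	uint32_t rwcsts;  /* Rx queue write controller active */
};

/* Assumed layout of ETH_MTLRXQxDR; see the caveat in the text above. */
static struct mtl_rxq_debug decode_mtl_rxq_debug(uint32_t reg)
{
	struct mtl_rxq_debug d = {
		.prxq   = field(reg, 29, 16),
		.rxqsts = field(reg, 5, 4),
		.rrxsts = field(reg, 2, 1),
		.rwcsts = field(reg, 0, 0),
	};
	return d;
}

/* Assumed bit positions in the DMA channel status register. */
#define DMA_STS_TBU (1u << 2)   /* transmit buffer unavailable */
#define DMA_STS_RBU (1u << 7)   /* receive buffer unavailable */
#define DMA_STS_ETI (1u << 10)  /* early transmit interrupt */

/* Address of descriptor 'index', assuming 16-byte descriptors. */
static uint64_t rx_desc_addr(uint64_t ring_base, unsigned int index)
{
	return ring_base + (uint64_t)index * 16u;
}
```

With these assumptions, 0x00280020 decodes to 40 queued packets with fill-level status 2, 0x00000484 is exactly TBU, RBU and ETI, and descriptor 125 sits at 0x7fffffe7d0, which matches both the tail pointer and the current descriptor pointer in the dump.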
Hi Russell On 02/04/2026 19:16, Russell King (Oracle) wrote: > On Tue, Mar 31, 2026 at 09:19:27PM -0700, Sam Edwards wrote: >> Hi netdev, >> >> This is v4 of my series containing a pair of bugfixes for the stmmac driver's >> receive pipeline. These issues occur when stmmac_rx_refill() does not (fully) >> succeed, which happens more frequently when free memory is low. >> >> The first patch closes Bugzilla bug #221010 [1], where stmmac_rx() can circle >> around to a still-dirty descriptor (with a NULL buffer pointer), mistake it for >> a filled descriptor (due to OWN=0), and attempt to dereference the buffer. >> >> In testing that patch, I discovered a second issue: starvation of available RX >> buffers causes the NIC to stop sending interrupts; if the driver stops polling, >> it will wait indefinitely for an interrupt that will never come. (Note: the >> first patch makes this issue more prominent -- mostly because it lets the >> system survive long enough to exhibit it -- but doesn't *cause* it.) The second >> patch addresses that problem as well. >> >> Both patches are minimal, appropriate for stable, and designated to `net`. My >> focus is on small, obviously-correct, easy-to-explain changes: I'll follow up >> with another patch/series (something like [2]) for `net-next` that fixes the >> ring in a more robust way. >> >> The tx and zc paths seem to have similar low-memory bugs, to be addressed in >> separate series. > > I've tested this on my Jetson Xavier platform. One of the issues I've > had is that running iperf3 results in the receive side stalling because > it runs out of descriptors. However, despite the receive ring > eventually being re-filled and the hardware appropriately prodded, it > steadfastly refuses to restart, despite the descriptors having been > updated. > > What I can see is there's 40 packets in the internal FIFOs via the > PRXQ[13:0] field of the ETH_MTLRXQxDR register. 
>
> With your patches applied:
>
> root@tegra-ubuntu:~# iperf3 -c 192.168.248.1 -R
> Connecting to host 192.168.248.1, port 5201
> Reverse mode, remote host 192.168.248.1 is sending
> [ 5] local 192.168.248.174 port 43728 connected to 192.168.248.1 port 5201
> [ ID] Interval Transfer Bitrate
> [ 5] 0.00-1.00 sec 30.3 MBytes 254 Mbits/sec
> [ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec
> [ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec
> [ 5] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec
> [ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> ...

Ah !! I have been struggling with that problem this week too. I stumbled
upon it while trying to test your TSO series, and at first I thought it
was because of the TSO patches, but it turns out it's not: I can reproduce
it on net-next.

The main problem for me is that it's not always reproducible; it may or
may not show up when I run iperf3 after a fresh restart.

This is on socfpga (dwmac1000), so it seems the problem exists across IP
versions. I've been on and off trying to make progress on that during the
week, but without success so far...

Maxime
On Thu, Apr 2, 2026 at 10:16 AM Russell King (Oracle) <linux@armlinux.org.uk> wrote:
> I've tested this on my Jetson Xavier platform. One of the issues I've
> had is that running iperf3 results in the receive side stalling because
> it runs out of descriptors. However, despite the receive ring
> eventually being re-filled and the hardware appropriately prodded, it
> steadfastly refuses to restart, despite the descriptors having been
> updated.

Hi Russell,

Just to make sure I understand correctly: before my patches, you've been
observing this problem on Xavier for a while (no interrupts, ring goes
dry); with my patches, the ring is refilled, but the dwmac5 doesn't resume
DMA. (Ah, just saw your follow-up email.)

> Any ideas?

Off the top of my head, my hypothesis is that dwmac5 has an additional
tripwire when the receive DMA is exhausted, and the
stmmac_set_rx_tail_ptr()/stmmac_enable_dma_reception() at the end of
stmmac_rx_refill() aren't sufficient to wake it back up. I think this is
new to dwmac5, because my RK3588 (dwmac4.20 iirc) happily resumes after
the same condition.

You gave a lot of info; thanks! I'll try to scrape up some documentation
on dwmac5 to see if there's something more stmmac_rx_refill() ought to be
doing. I think I have a Xavier NX around here somewhere, I'll see if I can
repro the problem.

Cheers,
Sam
On Thu, Apr 02, 2026 at 06:16:45PM +0100, Russell King (Oracle) wrote: > On Tue, Mar 31, 2026 at 09:19:27PM -0700, Sam Edwards wrote: > > Hi netdev, > > > > This is v4 of my series containing a pair of bugfixes for the stmmac driver's > > receive pipeline. These issues occur when stmmac_rx_refill() does not (fully) > > succeed, which happens more frequently when free memory is low. > > > > The first patch closes Bugzilla bug #221010 [1], where stmmac_rx() can circle > > around to a still-dirty descriptor (with a NULL buffer pointer), mistake it for > > a filled descriptor (due to OWN=0), and attempt to dereference the buffer. > > > > In testing that patch, I discovered a second issue: starvation of available RX > > buffers causes the NIC to stop sending interrupts; if the driver stops polling, > > it will wait indefinitely for an interrupt that will never come. (Note: the > > first patch makes this issue more prominent -- mostly because it lets the > > system survive long enough to exhibit it -- but doesn't *cause* it.) The second > > patch addresses that problem as well. > > > > Both patches are minimal, appropriate for stable, and designated to `net`. My > > focus is on small, obviously-correct, easy-to-explain changes: I'll follow up > > with another patch/series (something like [2]) for `net-next` that fixes the > > ring in a more robust way. > > > > The tx and zc paths seem to have similar low-memory bugs, to be addressed in > > separate series. > > I've tested this on my Jetson Xavier platform. One of the issues I've > had is that running iperf3 results in the receive side stalling because > it runs out of descriptors. However, despite the receive ring > eventually being re-filled and the hardware appropriately prodded, it > steadfastly refuses to restart, despite the descriptors having been > updated. I'll make it clear: this problem exists without your patches, so it is not a regression. 
On Tue, 31 Mar 2026 21:19:27 -0700 Sam Edwards wrote: > - Changed patch 2 to tolerate dirty stragglers up to a critical threshold (the > same threshold tolerated by the zero-copy path), to avoid nuisance looping > during OOM conditions (thanks Jakub) I meant we need both a threshold, and a delay :(
On Thu, Apr 2, 2026 at 8:05 AM Jakub Kicinski <kuba@kernel.org> wrote:
> I meant we need both a threshold, and a delay :(

Hi Jakub - got it: when the critical threshold is reached, allow the
NAPI instance to sleep and start a timer instead.

1) We'd either have to leave interrupts masked or let them race
against the timer. Either one is manageable, but I feel like those
interactions carry *just* enough regression risk to bump that patch to
-next.

2) Could you point out which NAPI driver best handles this situation?
I'd like to replicate its approach.

Thanks,
Sam
On Thu, 2 Apr 2026 09:53:43 -0700 Sam Edwards wrote:
> On Thu, Apr 2, 2026 at 8:05 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > I meant we need both a threshold, and a delay :(
>
> Hi Jakub - got it: when the critical threshold is reached, allow the
> NAPI instance to sleep and start a timer instead.
>
> 1) We'd either have to leave interrupts masked or let them race
> against the timer. Either one is manageable, but I feel like those
> interactions carry *just* enough regression risk to bump that patch to
> -next.
>
> 2) Could you point out which NAPI driver best handles this situation?
> I'd like to replicate its approach.

Not sure, the last few NICs I worked on had the ability for SW to
trigger IRQs exactly because of the Rx buffer depletion issue.
fbnic_napi_depletion_check() for example.

But let's not overthink it.. say we arm a timer and let the IRQ
be unmasked. The timer just runs napi_schedule(). napi_schedule()
is thread-safe, if IRQ fires with the timer armed - no problem.
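
The "threshold plus timer" idea above can be modelled in userspace to show why a racing IRQ and timer are harmless. This is a sketch under stated assumptions, not kernel code: the `rx_model` struct and all `model_*()` names are hypothetical, a C11 atomic exchange stands in for the "already scheduled" check that makes napi_schedule() safe to call from both paths, and the threshold value is whatever the caller chooses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct rx_model {
	atomic_bool scheduled;   /* stands in for NAPI's scheduled state */
	bool timer_armed;        /* retry timer pending */
	unsigned int dirty;      /* descriptors still waiting for a buffer */
	unsigned int threshold;  /* hypothetical critical dirty count */
	unsigned int polls_run;
};

static void model_init(struct rx_model *m, unsigned int dirty,
		       unsigned int threshold)
{
	atomic_init(&m->scheduled, false);
	m->timer_armed = false;
	m->dirty = dirty;
	m->threshold = threshold;
	m->polls_run = 0;
}

/* Idempotent: only the first caller since the last poll queues the poll. */
static bool model_schedule(struct rx_model *m)
{
	return !atomic_exchange(&m->scheduled, true);
}

/* RX interrupt path: just schedules the poll. */
static bool model_irq(struct rx_model *m)
{
	return model_schedule(m);
}

/* Retry timer path: also just schedules the poll. */
static bool model_timer_fire(struct rx_model *m)
{
	m->timer_armed = false;
	return model_schedule(m);
}

/*
 * One poll pass. 'bufs' is how many buffers allocation can supply this
 * time (0 models memory exhaustion). Returns true if the retry timer was
 * left armed because too many descriptors are still dirty.
 */
static bool model_poll(struct rx_model *m, unsigned int bufs)
{
	unsigned int refilled = bufs < m->dirty ? bufs : m->dirty;

	m->dirty -= refilled;
	m->polls_run++;
	atomic_store(&m->scheduled, false);  /* poll done; may be rescheduled */

	if (m->dirty >= m->threshold)
		m->timer_armed = true;       /* retry later, do not spin */
	return m->timer_armed;
}
```

If a poll ends with the ring above the threshold, the timer is armed; whichever of `model_irq()` and `model_timer_fire()` runs first queues the next poll and the other becomes a no-op, which is the property the message relies on.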