This series contains 4 independent new features:

- Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.

- Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
  selftests.

- Patch 4: selftests: check for unexpected fallback counter increments.

- Patches 5-6: record subflows in RPS table, for aRFS support.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Christoph Paasch (2):
      net: Add rfs_needed() helper
      mptcp: record subflows in RPS table

Eric Biggers (1):
      mptcp: use HMAC-SHA256 library instead of open-coded HMAC

Gang Yan (1):
      selftests: mptcp: add checks for fallback counters

Geliang Tang (2):
      mptcp: make ADD_ADDR retransmission timeout adaptive
      selftests: mptcp: remove add_addr_timeout settings

 Documentation/networking/mptcp-sysctl.rst       |   8 +-
 include/net/rps.h                               |  85 ++++++++++------
 net/mptcp/crypto.c                              |  35 +------
 net/mptcp/pm.c                                  |  28 +++++-
 net/mptcp/protocol.c                            |  21 ++++
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 126 +++++++++++++++++++++++-
 6 files changed, 231 insertions(+), 72 deletions(-)
---
base-commit: d23ad54de795ec0054f90ecb03b41e8f2c410f3a
change-id: 20250829-net-next-mptcp-misc-feat-6-18-722fa87a60f1

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>
Hello, On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote: > This series contains 4 independent new features: > > - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC. > > - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify > selftests. I just noticed that NIPA reported some issues due to these 2 patches. In short, some packets (MPTCP ADD_ADDR notifications) can now be retransmitted quicker, but some tests check MIB counters and don't expect retransmissions. If the environment is a bit slow, it is possible to have more retransmissions. We should adapt the tests to avoid false positives. Is it possible to drop just these two patches? Or do you prefer to mark the whole series as "Changes requested"? > - Patch 4: selftests: check for unexpected fallback counter increments. > > - Patches 5-6: record subflows in RPS table, for aRFS support. Cheers, Matt -- Sponsored by the NGI0 Core fund.
On Tue, 2 Sep 2025 16:29:33 +0200 Matthieu Baerts wrote: > On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote: > > This series contains 4 independent new features: > > > > - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC. > > > > - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify > > selftests. > > I just noticed that NIPA reported some issues due to these 2 patches. In > short, some packets (MPTCP ADD_ADDR notifications) can now be > retransmitted quicker, but some tests check MIB counters and don't > expect retransmissions. If the environment is a bit slow, it is possible > to have more retransmissions. We should adapt the tests to avoid false > positives. > > Is it possible to drop just these two patches? Or do you prefer to mark > the whole series as "Changes requested"? Your call, we can also apply as is. mptcp-join is ignored, anyway.
2 Sept 2025 21:09:36 Jakub Kicinski <kuba@kernel.org>: > On Tue, 2 Sep 2025 16:29:33 +0200 Matthieu Baerts wrote: >> On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote: >>> This series contains 4 independent new features: >>> >>> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC. >>> >>> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify >>> selftests. >> >> I just noticed that NIPA reported some issues due to these 2 patches. In >> short, some packets (MPTCP ADD_ADDR notifications) can now be >> retransmitted quicker, but some tests check MIB counters and don't >> expect retransmissions. If the environment is a bit slow, it is possible >> to have more retransmissions. We should adapt the tests to avoid false >> positives. >> >> Is it possible to drop just these two patches? Or do you prefer to mark >> the whole series as "Changes requested"? > > Your call, we can also apply as is. mptcp-join is ignored, anyway. I realised patch 3/6 is going to cause issues when running on older kernels, so we would need to revert it if we want to apply all patches. But if you prefer a v2 for the whole series instead of applying 1,4-6, I can also do that :) Cheers, Matt
On Tue, 2 Sep 2025 21:25:33 +0200 (GMT+02:00) Matthieu Baerts wrote: > >> I just noticed that NIPA reported some issues due to these 2 patches. In > >> short, some packets (MPTCP ADD_ADDR notifications) can now be > >> retransmitted quicker, but some tests check MIB counters and don't > >> expect retransmissions. If the environment is a bit slow, it is possible > >> to have more retransmissions. We should adapt the tests to avoid false > >> positives. > >> > >> Is it possible to drop just these two patches? Or do you prefer to mark > >> the whole series as "Changes requested"? > > > > Your call, we can also apply as is. mptcp-join is ignored, anyway. > > I realised patch 3/6 is going to cause issues when running on older > kernels, so we would need to revert it if we want to apply all patches. > > But if you prefer a v2 for the whole series instead of applying 1,4-6, > I can also do that :) Alright, please send a v2, then. Sorry for the flip-flop. -- pw-bot: cr
On Mon, 01 Sep 2025 11:39:09 +0200 Matthieu Baerts (NGI0) wrote:
> This series contains 4 independent new features:
>
> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
>
> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
>   selftests.
>
> - Patch 4: selftests: check for unexpected fallback counter increments.
>
> - Patches 5-6: record subflows in RPS table, for aRFS support.

I don't see why, but kmemleak started to hit this with the join test
2 branches ago :\ Have you seen any kmemleak issues on your side?
We also see occasional leaked skb in driver tests which makes no sense.

unreferenced object 0xffff8880029d3340 (size 3016):
  comm "softirq", pid 0, jiffies 4297316940
  hex dump (first 32 bytes):
    0a 00 01 02 0a 00 01 01 00 00 00 00 9e b8 7d 27  ..............}'
    0a 00 07 41 00 00 00 00 00 00 00 00 00 00 00 00  ...A............
  backtrace (crc 3653d88c):
    kmem_cache_alloc_noprof+0x284/0x330
    sk_prot_alloc.constprop.0+0x4e/0x1b0
    sk_clone_lock+0x4b/0x10d0
    mptcp_sk_clone_init+0x2e/0x10d0
    subflow_syn_recv_sock+0x9d1/0x1680
    tcp_check_req+0x3a4/0x1910
    tcp_v4_rcv+0x1004/0x30a0
    ip_protocol_deliver_rcu+0x82/0x350
    ip_local_deliver_finish+0x35d/0x620
    ip_local_deliver+0x19c/0x470
    ip_rcv+0xc2/0x370
    __netif_receive_skb_one_core+0x108/0x180
    process_backlog+0x3c1/0x13e0
    __napi_poll.constprop.0+0x9f/0x460
    net_rx_action+0x54f/0xda0
    handle_softirqs+0x215/0x610
Hi Jakub, On 02/09/2025 16:26, Jakub Kicinski wrote: > On Mon, 01 Sep 2025 11:39:09 +0200 Matthieu Baerts (NGI0) wrote: >> This series contains 4 independent new features: >> >> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC. >> >> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify >> selftests. >> >> - Patch 4: selftests: check for unexpected fallback counter increments. >> >> - Patches 5-6: record subflows in RPS table, for aRFS support. > > I don't see why, but kmemleak started to hit this with the join test > 2 branches ago :\ Have you seen any kmemleak issues on your side? > We also see occasional leaked skb in driver tests which makes no sense. > > unreferenced object 0xffff8880029d3340 (size 3016): > comm "softirq", pid 0, jiffies 4297316940 > hex dump (first 32 bytes): > 0a 00 01 02 0a 00 01 01 00 00 00 00 9e b8 7d 27 ..............}' > 0a 00 07 41 00 00 00 00 00 00 00 00 00 00 00 00 ...A............ > backtrace (crc 3653d88c): > kmem_cache_alloc_noprof+0x284/0x330 > sk_prot_alloc.constprop.0+0x4e/0x1b0 > sk_clone_lock+0x4b/0x10d0 > mptcp_sk_clone_init+0x2e/0x10d0 > subflow_syn_recv_sock+0x9d1/0x1680 > tcp_check_req+0x3a4/0x1910 > tcp_v4_rcv+0x1004/0x30a0 > ip_protocol_deliver_rcu+0x82/0x350 > ip_local_deliver_finish+0x35d/0x620 > ip_local_deliver+0x19c/0x470 > ip_rcv+0xc2/0x370 > __netif_receive_skb_one_core+0x108/0x180 > process_backlog+0x3c1/0x13e0 > __napi_poll.constprop.0+0x9f/0x460 > net_rx_action+0x54f/0xda0 > handle_softirqs+0x215/0x610 Thank you for this notification! No, I didn't notice that on our side. For KMemLeak, now I'm waiting 5 seconds, then I force the scan, and check for issues once. 
On NIPA, I see that there are still 2 scans + cat, and apparently, the issue was always visible during the 2nd scan: https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/279881/1-mptcp-join-sh/stdout https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/280062/1-mptcp-join-sh/stdout It is unclear why a second scan is needed and only the second one caught something. Was it the same with the strange issues you mentioned in driver tests? Do you think I should re-add the second scan + cat? When looking at the modifications of this series, it is unclear what could cause that. Cheers, Matt -- Sponsored by the NGI0 Core fund.
On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote: > It is unclear why a second scan is needed and only the second one caught > something. Was it the same with the strange issues you mentioned in > driver tests? Do you think I should re-add the second scan + cat? Not sure, cc: Catalin, from experience it seems like second scan often surfaces issues the first scan missed. > When looking at the modifications of this series, it is unclear what > could cause that. Yes, I don't think it's related to the series. For one thing the series a couple of before the first report.
On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> > It is unclear why a second scan is needed and only the second one caught
> > something. Was it the same with the strange issues you mentioned in
> > driver tests? Do you think I should re-add the second scan + cat?
>
> Not sure, cc: Catalin, from experience it seems like second scan often
> surfaces issues the first scan missed.

It's some of the kmemleak heuristics to reduce false positives. It does
a checksum of the object during scanning and only reports a leak if the
checksum is the same in two consecutive scans.

-- 
Catalin
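[Editor's illustration] The heuristic described above can be pictured with a toy model. This is only a sketch in Python, not kmemleak's actual code: the names `TrackedObject` and `scan`, the use of CRC32, and the explicit `referenced` flag are all assumptions made for illustration. The 5-second minimum age is the value mentioned in this thread. The model shows why an unreferenced object is never reported by the first scan that sees it: there is no earlier checksum to compare against.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional

# Minimum object age before a leak is reported (value taken from the thread).
MIN_AGE_SECONDS = 5.0


@dataclass
class TrackedObject:
    """Hypothetical stand-in for an allocation tracked by the detector."""
    data: bytes
    allocated_at: float
    referenced: bool = False              # True if some pointer to it was found
    last_checksum: Optional[int] = None   # checksum recorded by the previous scan


def scan(objects: Dict[str, TrackedObject], now: float) -> List[str]:
    """Return the names of objects this toy scan would report as leaks."""
    leaks = []
    for name, obj in objects.items():
        checksum = zlib.crc32(obj.data)
        changed = checksum != obj.last_checksum
        obj.last_checksum = checksum
        if obj.referenced or changed:
            # Still reachable, or modified since the last scan: not reported.
            continue
        if now - obj.allocated_at < MIN_AGE_SECONDS:
            # Too young to be reported yet.
            continue
        leaks.append(name)
    return leaks
```

In this model, calling `scan()` twice on an untouched, unreferenced object reports it only on the second call, which matches the observation that only the second scan caught the leak.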
Hi Catalin, 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>: > On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote: >> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote: >>> It is unclear why a second scan is needed and only the second one caught >>> something. Was it the same with the strange issues you mentioned in >>> driver tests? Do you think I should re-add the second scan + cat? >> >> Not sure, cc: Catalin, from experience it seems like second scan often >> surfaces issues the first scan missed. > > It's some of the kmemleak heuristics to reduce false positives. It does > a checksum of the object during scanning and only reports a leak if the > checksum is the same in two consecutive scans. Thank you for the explanation! Does that mean a scan should be triggered at the end of the tests, then wait 5 second for the grace period, then trigger another scan and check the results? Or wait 5 seconds, then trigger two consecutive scans? Cheers, Matt
On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote:
> Hi Catalin,
>
> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>:
>
> > On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
> >> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> >>> It is unclear why a second scan is needed and only the second one caught
> >>> something. Was it the same with the strange issues you mentioned in
> >>> driver tests? Do you think I should re-add the second scan + cat?
> >>
> >> Not sure, cc: Catalin, from experience it seems like second scan often
> >> surfaces issues the first scan missed.
> >
> > It's some of the kmemleak heuristics to reduce false positives. It does
> > a checksum of the object during scanning and only reports a leak if the
> > checksum is the same in two consecutive scans.
>
> Thank you for the explanation!
>
> Does that mean a scan should be triggered at the end of the tests,
> then wait 5 second for the grace period, then trigger another scan
> and check the results?
>
> Or wait 5 seconds, then trigger two consecutive scans?

The 5 seconds is the minimum age of an object before it gets reported as
a leak. It's not related to the scanning process. So you could do two
scans in succession and wait 5 seconds before checking for leaks.

However, I'd go with the first option - do a scan, wait 5 seconds and do
another. That's mostly because at the end of the scan kmemleak prints if
it found new unreferenced objects. It might not print the message if a
leaked object is younger than 5 seconds. In practice, though, the scan
may take longer, depending on how loaded your system is.

The second option works as well but waiting between them has a better
chance of removing false positives if, say, some objects are moved
between lists and two consecutive scans do not detect the list_head
change (and update the object's checksum).

-- 
Catalin
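[Editor's illustration] The first option recommended above (scan, wait 5 seconds, scan again, then read the report) could be sketched as follows. This is a minimal sketch, not the script used by NIPA or the MPTCP CI. It assumes debugfs is mounted at /sys/kernel/debug, that the caller has permission to write to the kmemleak file, and that the kernel was built with kmemleak support. The function names and the injectable `sleep` parameter are hypothetical.

```python
import time
from pathlib import Path

# Standard debugfs location of the kmemleak control file (assumes debugfs
# is mounted at /sys/kernel/debug).
KMEMLEAK = Path("/sys/kernel/debug/kmemleak")


def trigger_scan(path: Path = KMEMLEAK) -> None:
    """Ask kmemleak to run a memory scan now (same as `echo scan > file`)."""
    path.write_text("scan\n")


def scan_wait_scan(path: Path = KMEMLEAK, wait: float = 5.0,
                   sleep=time.sleep) -> str:
    """Scan, wait, scan again, then return whatever the kmemleak file reports.

    The wait lets recently allocated objects pass the minimum age, and the
    second scan lets the checksum comparison confirm unchanged objects.
    """
    trigger_scan(path)
    sleep(wait)
    trigger_scan(path)
    return path.read_text()
```

A CI job might call `scan_wait_scan()` once at the very end of the test run and fail if the returned text is non-empty. The `sleep` parameter is only there so the delay can be replaced in tests.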
2 Sept 2025 23:18:56 Catalin Marinas <catalin.marinas@arm.com>: > On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote: >> Hi Catalin, >> >> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>: >> >>> On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote: >>>> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote: >>>>> It is unclear why a second scan is needed and only the second one caught >>>>> something. Was it the same with the strange issues you mentioned in >>>>> driver tests? Do you think I should re-add the second scan + cat? >>>> >>>> Not sure, cc: Catalin, from experience it seems like second scan often >>>> surfaces issues the first scan missed. >>> >>> It's some of the kmemleak heuristics to reduce false positives. It does >>> a checksum of the object during scanning and only reports a leak if the >>> checksum is the same in two consecutive scans. >> >> Thank you for the explanation! >> >> Does that mean a scan should be triggered at the end of the tests, >> then wait 5 second for the grace period, then trigger another scan >> and check the results? >> >> Or wait 5 seconds, then trigger two consecutive scans? > > The 5 seconds is the minimum age of an object before it gets reported as > a leak. It's not related to the scanning process. So you could do two > scans in succession and wait 5 seconds before checking for leaks. > > However, I'd go with the first option - do a scan, wait 5 seconds and do > another. That's mostly because at the end of the scan kmemleak prints if > it found new unreferenced objects. It might not print the message if a > leaked object is younger than 5 seconds. In practice, though, the scan > may take longer, depending on how loaded your system is. 
> > The second option works as well but waiting between them has a better > chance of removing false positives if, say, some objects are moved > between lists and two consecutive scans do not detect the list_head > change (and update the object's checksum). Thank you for this very nice reply, that's very clear! I will then adapt our CI having CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF to do a manual scan at the very end, wait 5 seconds and do another. Cheers, Matt
On Tue, Sep 2, 2025 at 2:38 PM Matthieu Baerts <matttbe@kernel.org> wrote: > > 2 Sept 2025 23:18:56 Catalin Marinas <catalin.marinas@arm.com>: > > > On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote: > >> Hi Catalin, > >> > >> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>: > >> > >>> On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote: > >>>> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote: > >>>>> It is unclear why a second scan is needed and only the second one caught > >>>>> something. Was it the same with the strange issues you mentioned in > >>>>> driver tests? Do you think I should re-add the second scan + cat? > >>>> > >>>> Not sure, cc: Catalin, from experience it seems like second scan often > >>>> surfaces issues the first scan missed. > >>> > >>> It's some of the kmemleak heuristics to reduce false positives. It does > >>> a checksum of the object during scanning and only reports a leak if the > >>> checksum is the same in two consecutive scans. > >> > >> Thank you for the explanation! > >> > >> Does that mean a scan should be triggered at the end of the tests, > >> then wait 5 second for the grace period, then trigger another scan > >> and check the results? > >> > >> Or wait 5 seconds, then trigger two consecutive scans? > > > > The 5 seconds is the minimum age of an object before it gets reported as > > a leak. It's not related to the scanning process. So you could do two > > scans in succession and wait 5 seconds before checking for leaks. > > > > However, I'd go with the first option - do a scan, wait 5 seconds and do > > another. That's mostly because at the end of the scan kmemleak prints if > > it found new unreferenced objects. It might not print the message if a > > leaked object is younger than 5 seconds. In practice, though, the scan > > may take longer, depending on how loaded your system is. 
> > > > The second option works as well but waiting between them has a better > > chance of removing false positives if, say, some objects are moved > > between lists and two consecutive scans do not detect the list_head > > change (and update the object's checksum). > > Thank you for this very nice reply, that's very clear! > > I will then adapt our CI having CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF > to do a manual scan at the very end, wait 5 seconds and do another. FWIW - I am able to pretty reliably reproduce the kmemleak. However, I also tried adding an inline kmemleak scan to the test harness (did it once with, once without a sleep). When I do that the kmemleak disappears :-) (not saying that adding the scan isn't useful, just pointing out that this particular leak seems to be related to how quickly we iterate over the testcases) Christoph