[v1] mptcp: misc. features for v6.18

[PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts (NGI0) 5 months, 1 week ago

This series contains 4 independent new features:

- Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.

- Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
  selftests.

- Patch 4: selftests: check for unexpected fallback counter increments.

- Patches 5-6: record subflows in RPS table, for aRFS support.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Christoph Paasch (2):
      net: Add rfs_needed() helper
      mptcp: record subflows in RPS table

Eric Biggers (1):
      mptcp: use HMAC-SHA256 library instead of open-coded HMAC

Gang Yan (1):
      selftests: mptcp: add checks for fallback counters

Geliang Tang (2):
      mptcp: make ADD_ADDR retransmission timeout adaptive
      selftests: mptcp: remove add_addr_timeout settings

 Documentation/networking/mptcp-sysctl.rst       |   8 +-
 include/net/rps.h                               |  85 ++++++++++------
 net/mptcp/crypto.c                              |  35 +------
 net/mptcp/pm.c                                  |  28 +++++-
 net/mptcp/protocol.c                            |  21 ++++
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 126 +++++++++++++++++++++++-
 6 files changed, 231 insertions(+), 72 deletions(-)
---
base-commit: d23ad54de795ec0054f90ecb03b41e8f2c410f3a
change-id: 20250829-net-next-mptcp-misc-feat-6-18-722fa87a60f1

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts 5 months, 1 week ago

Hello,

On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote:
> This series contains 4 independent new features:
> 
> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
> 
> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
>   selftests.

I just noticed that NIPA reported some issues due to these 2 patches. In
short, some packets (MPTCP ADD_ADDR notifications) can now be
retransmitted quicker, but some tests check MIB counters and don't
expect retransmissions. If the environment is a bit slow, it is possible
to have more retransmissions. We should adapt the tests to avoid false
positives.

Is it possible to drop just these two patches? Or do you prefer to mark
the whole series as "Changes requested"?

> - Patch 4: selftests: check for unexpected fallback counter increments.
> 
> - Patches 5-6: record subflows in RPS table, for aRFS support.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Jakub Kicinski 5 months, 1 week ago

On Tue, 2 Sep 2025 16:29:33 +0200 Matthieu Baerts wrote:
> On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote:
> > This series contains 4 independent new features:
> > 
> > - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
> > 
> > - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
> >   selftests.  
> 
> I just noticed that NIPA reported some issues due to these 2 patches. In
> short, some packets (MPTCP ADD_ADDR notifications) can now be
> retransmitted quicker, but some tests check MIB counters and don't
> expect retransmissions. If the environment is a bit slow, it is possible
> to have more retransmissions. We should adapt the tests to avoid false
> positives.
> 
> Is it possible to drop just these two patches? Or do you prefer to mark
> the whole series as "Changes requested"?

Your call, we can also apply as is. mptcp-join is ignored, anyway.

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts 5 months, 1 week ago

2 Sept 2025 21:09:36 Jakub Kicinski <kuba@kernel.org>:

> On Tue, 2 Sep 2025 16:29:33 +0200 Matthieu Baerts wrote:
>> On 01/09/2025 11:39, Matthieu Baerts (NGI0) wrote:
>>> This series contains 4 independent new features:
>>>
>>> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
>>>
>>> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
>>>   selftests. 
>>
>> I just noticed that NIPA reported some issues due to these 2 patches. In
>> short, some packets (MPTCP ADD_ADDR notifications) can now be
>> retransmitted quicker, but some tests check MIB counters and don't
>> expect retransmissions. If the environment is a bit slow, it is possible
>> to have more retransmissions. We should adapt the tests to avoid false
>> positives.
>>
>> Is it possible to drop just these two patches? Or do you prefer to mark
>> the whole series as "Changes requested"?
>
> Your call, we can also apply as is. mptcp-join is ignored, anyway.

I realised patch 3/6 is going to cause issues when running on older
kernels, so we would need to revert it if we want to apply all patches.

But if you prefer a v2 for the whole series instead of applying 1,4-6,
I can also do that :)

Cheers,
Matt

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Jakub Kicinski 5 months, 1 week ago

On Tue, 2 Sep 2025 21:25:33 +0200 (GMT+02:00) Matthieu Baerts wrote:
> >> I just noticed that NIPA reported some issues due to these 2 patches. In
> >> short, some packets (MPTCP ADD_ADDR notifications) can now be
> >> retransmitted quicker, but some tests check MIB counters and don't
> >> expect retransmissions. If the environment is a bit slow, it is possible
> >> to have more retransmissions. We should adapt the tests to avoid false
> >> positives.
> >>
> >> Is it possible to drop just these two patches? Or do you prefer to mark
> >> the whole series as "Changes requested"?  
> >
> > Your call, we can also apply as is. mptcp-join is ignored, anyway.  
> 
> I realised patch 3/6 is going to cause issues when running on older
> kernels, so we would need to revert it if we want to apply all patches.
> 
> But if you prefer a v2 for the whole series instead of applying 1,4-6,
> I can also do that :)

Alright, please send a v2, then. Sorry for the flip-flop.
-- 
pw-bot: cr

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Jakub Kicinski 5 months, 1 week ago

On Mon, 01 Sep 2025 11:39:09 +0200 Matthieu Baerts (NGI0) wrote:
> This series contains 4 independent new features:
> 
> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
> 
> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
>   selftests.
> 
> - Patch 4: selftests: check for unexpected fallback counter increments.
> 
> - Patches 5-6: record subflows in RPS table, for aRFS support.

I don't see why, but kmemleak started to hit this with the join test
2 branches ago :\ Have you seen any kmemleak issues on your side?
We also see occasional leaked skb in driver tests which makes no sense.

unreferenced object 0xffff8880029d3340 (size 3016):
  comm "softirq", pid 0, jiffies 4297316940
  hex dump (first 32 bytes):
    0a 00 01 02 0a 00 01 01 00 00 00 00 9e b8 7d 27  ..............}'
    0a 00 07 41 00 00 00 00 00 00 00 00 00 00 00 00  ...A............
  backtrace (crc 3653d88c):
    kmem_cache_alloc_noprof+0x284/0x330
    sk_prot_alloc.constprop.0+0x4e/0x1b0
    sk_clone_lock+0x4b/0x10d0
    mptcp_sk_clone_init+0x2e/0x10d0
    subflow_syn_recv_sock+0x9d1/0x1680
    tcp_check_req+0x3a4/0x1910
    tcp_v4_rcv+0x1004/0x30a0
    ip_protocol_deliver_rcu+0x82/0x350
    ip_local_deliver_finish+0x35d/0x620
    ip_local_deliver+0x19c/0x470
    ip_rcv+0xc2/0x370
    __netif_receive_skb_one_core+0x108/0x180
    process_backlog+0x3c1/0x13e0
    __napi_poll.constprop.0+0x9f/0x460
    net_rx_action+0x54f/0xda0
    handle_softirqs+0x215/0x610

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts 5 months, 1 week ago

Hi Jakub,

On 02/09/2025 16:26, Jakub Kicinski wrote:
> On Mon, 01 Sep 2025 11:39:09 +0200 Matthieu Baerts (NGI0) wrote:
>> This series contains 4 independent new features:
>>
>> - Patch 1: use HMAC-SHA256 library instead of open-coded HMAC.
>>
>> - Patches 2-3: make ADD_ADDR retransmission timeout adaptive + simplify
>>   selftests.
>>
>> - Patch 4: selftests: check for unexpected fallback counter increments.
>>
>> - Patches 5-6: record subflows in RPS table, for aRFS support.
> 
> I don't see why, but kmemleak started to hit this with the join test
> 2 branches ago :\ Have you seen any kmemleak issues on your side?
> We also see occasional leaked skb in driver tests which makes no sense.
> 
> unreferenced object 0xffff8880029d3340 (size 3016):
>   comm "softirq", pid 0, jiffies 4297316940
>   hex dump (first 32 bytes):
>     0a 00 01 02 0a 00 01 01 00 00 00 00 9e b8 7d 27  ..............}'
>     0a 00 07 41 00 00 00 00 00 00 00 00 00 00 00 00  ...A............
>   backtrace (crc 3653d88c):
>     kmem_cache_alloc_noprof+0x284/0x330
>     sk_prot_alloc.constprop.0+0x4e/0x1b0
>     sk_clone_lock+0x4b/0x10d0
>     mptcp_sk_clone_init+0x2e/0x10d0
>     subflow_syn_recv_sock+0x9d1/0x1680
>     tcp_check_req+0x3a4/0x1910
>     tcp_v4_rcv+0x1004/0x30a0
>     ip_protocol_deliver_rcu+0x82/0x350
>     ip_local_deliver_finish+0x35d/0x620
>     ip_local_deliver+0x19c/0x470
>     ip_rcv+0xc2/0x370
>     __netif_receive_skb_one_core+0x108/0x180
>     process_backlog+0x3c1/0x13e0
>     __napi_poll.constprop.0+0x9f/0x460
>     net_rx_action+0x54f/0xda0
>     handle_softirqs+0x215/0x610

Thank you for this notification!

No, I didn't notice that on our side. For KMemLeak, now I'm waiting 5
seconds, then I force the scan, and check for issues once. On NIPA, I
see that there are still 2 scans + cat, and apparently, the issue was
always visible during the 2nd scan:


https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/279881/1-mptcp-join-sh/stdout

https://netdev-3.bots.linux.dev/vmksft-mptcp-dbg/results/280062/1-mptcp-join-sh/stdout

It is unclear why a second scan is needed and only the second one caught
something. Was it the same with the strange issues you mentioned in
driver tests? Do you think I should re-add the second scan + cat?

When looking at the modifications of this series, it is unclear what
could cause that.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Jakub Kicinski 5 months, 1 week ago

On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> It is unclear why a second scan is needed and only the second one caught
> something. Was it the same with the strange issues you mentioned in
> driver tests? Do you think I should re-add the second scan + cat?

Not sure, cc: Catalin, from experience it seems like second scan often
surfaces issues the first scan missed.

> When looking at the modifications of this series, it is unclear what
> could cause that.

Yes, I don't think it's related to the series. For one thing the series
a couple of before the first report.

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Catalin Marinas 5 months, 1 week ago

On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> > It is unclear why a second scan is needed and only the second one caught
> > something. Was it the same with the strange issues you mentioned in
> > driver tests? Do you think I should re-add the second scan + cat?
> 
> Not sure, cc: Catalin, from experience it seems like second scan often
> surfaces issues the first scan missed.

It's some of the kmemleak heuristics to reduce false positives. It does
a checksum of the object during scanning and only reports a leak if the
checksum is the same in two consecutive scans.
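
As a rough illustration of the heuristic described above, here is a toy model in shell. It is not kmemleak's code: files in a directory stand in for unreferenced objects, and the function name `toy_scan` and both directory arguments are hypothetical. An "object" is reported only when its checksum matches the one recorded by the previous scan, so a first scan never reports anything.

```shell
# Toy model only (assumption: files play the role of unreferenced objects).
# $1: directory of "objects"; $2: scratch directory holding the checksum
# recorded for each object by the previous scan.
toy_scan() {
	objs="$1"
	state="$2"
	for obj in "$objs"/*; do
		[ -f "$obj" ] || continue
		name=$(basename "$obj")
		sum=$(cksum < "$obj")
		if [ -f "$state/$name" ] && [ "$(cat "$state/$name")" = "$sum" ]; then
			# Same content in two consecutive scans: report it.
			echo "suspected leak: $name"
		else
			# New or modified since the last scan: remember it, stay quiet.
			printf '%s\n' "$sum" > "$state/$name"
		fi
	done
}
```

In this model an object that keeps changing between scans is never reported, which is the false-positive reduction mentioned above.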

-- 
Catalin

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts 5 months, 1 week ago

Hi Catalin,

2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>:

> On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
>> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
>>> It is unclear why a second scan is needed and only the second one caught
>>> something. Was it the same with the strange issues you mentioned in
>>> driver tests? Do you think I should re-add the second scan + cat?
>>
>> Not sure, cc: Catalin, from experience it seems like second scan often
>> surfaces issues the first scan missed.
>
> It's some of the kmemleak heuristics to reduce false positives. It does
> a checksum of the object during scanning and only reports a leak if the
> checksum is the same in two consecutive scans.

Thank you for the explanation!

Does that mean a scan should be triggered at the end of the tests,
then wait 5 seconds for the grace period, then trigger another scan
and check the results?

Or wait 5 seconds, then trigger two consecutive scans?

Cheers,
Matt

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Catalin Marinas 5 months, 1 week ago

On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote:
> Hi Catalin,
> 
> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>:
> 
> > On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
> >> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> >>> It is unclear why a second scan is needed and only the second one caught
> >>> something. Was it the same with the strange issues you mentioned in
> >>> driver tests? Do you think I should re-add the second scan + cat?
> >>
> >> Not sure, cc: Catalin, from experience it seems like second scan often
> >> surfaces issues the first scan missed.
> >
> > It's some of the kmemleak heuristics to reduce false positives. It does
> > a checksum of the object during scanning and only reports a leak if the
> > checksum is the same in two consecutive scans.
> 
> Thank you for the explanation!
> 
> Does that mean a scan should be triggered at the end of the tests,
> then wait 5 seconds for the grace period, then trigger another scan
> and check the results?
> 
> Or wait 5 seconds, then trigger two consecutive scans?

The 5 seconds is the minimum age of an object before it gets reported as
a leak. It's not related to the scanning process. So you could do two
scans in succession and wait 5 seconds before checking for leaks.

However, I'd go with the first option - do a scan, wait 5 seconds and do
another. That's mostly because at the end of the scan kmemleak prints if
it found new unreferenced objects. It might not print the message if a
leaked object is younger than 5 seconds. In practice, though, the scan
may take longer, depending on how loaded your system is.

The second option works as well but waiting between them has a better
chance of removing false positives if, say, some objects are moved
between lists and two consecutive scans do not detect the list_head
change (and update the object's checksum).
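
The first option above (scan, wait, scan again, then read the report) could be sketched for a test harness roughly as follows. The function name `kmemleak_check` and its two optional arguments are hypothetical, not taken from the selftests or from NIPA. The only kernel interface assumed is the kmemleak debugfs file, which accepts `scan` on write and lists suspected leaks on read.

```shell
# Sketch only: kmemleak_check [file] [wait_seconds]
# Defaults assume debugfs is mounted at /sys/kernel/debug.
kmemleak_check() {
	file="${1:-/sys/kernel/debug/kmemleak}"
	wait_s="${2:-5}"

	# Skip quietly when kmemleak is not available or not writable.
	[ -w "$file" ] || return 0

	echo scan > "$file"     # first scan
	sleep "$wait_s"         # let recent objects pass the minimum age
	echo scan > "$file"     # second scan

	report=$(cat "$file")
	if [ -n "$report" ]; then
		printf '%s\n' "$report"
		return 1
	fi
	return 0
}
```

Calling `kmemleak_check` once at the very end of a run prints any report and returns non-zero, so the caller can mark the run as failed.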

-- 
Catalin

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Matthieu Baerts 5 months, 1 week ago

2 Sept 2025 23:18:56 Catalin Marinas <catalin.marinas@arm.com>:

> On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote:
>> Hi Catalin,
>>
>> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>:
>>
>>> On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
>>>> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
>>>>> It is unclear why a second scan is needed and only the second one caught
>>>>> something. Was it the same with the strange issues you mentioned in
>>>>> driver tests? Do you think I should re-add the second scan + cat?
>>>>
>>>> Not sure, cc: Catalin, from experience it seems like second scan often
>>>> surfaces issues the first scan missed.
>>>
>>> It's some of the kmemleak heuristics to reduce false positives. It does
>>> a checksum of the object during scanning and only reports a leak if the
>>> checksum is the same in two consecutive scans.
>>
>> Thank you for the explanation!
>>
>> Does that mean a scan should be triggered at the end of the tests,
>> then wait 5 seconds for the grace period, then trigger another scan
>> and check the results?
>>
>> Or wait 5 seconds, then trigger two consecutive scans?
>
> The 5 seconds is the minimum age of an object before it gets reported as
> a leak. It's not related to the scanning process. So you could do two
> scans in succession and wait 5 seconds before checking for leaks.
>
> However, I'd go with the first option - do a scan, wait 5 seconds and do
> another. That's mostly because at the end of the scan kmemleak prints if
> it found new unreferenced objects. It might not print the message if a
> leaked object is younger than 5 seconds. In practice, though, the scan
> may take longer, depending on how loaded your system is.
>
> The second option works as well but waiting between them has a better
> chance of removing false positives if, say, some objects are moved
> between lists and two consecutive scans do not detect the list_head
> change (and update the object's checksum).

Thank you for this very nice reply, that's very clear!

I will then adapt our CI having CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF
to do a manual scan at the very end, wait 5 seconds and do another.

Cheers,
Matt

Re: [PATCH net-next 0/6] mptcp: misc. features for v6.18

Posted by Christoph Paasch 5 months, 1 week ago

On Tue, Sep 2, 2025 at 2:38 PM Matthieu Baerts <matttbe@kernel.org> wrote:
>
> 2 Sept 2025 23:18:56 Catalin Marinas <catalin.marinas@arm.com>:
>
> > On Tue, Sep 02, 2025 at 08:50:19PM +0200, Matthieu Baerts wrote:
> >> Hi Catalin,
> >>
> >> 2 Sept 2025 20:25:19 Catalin Marinas <catalin.marinas@arm.com>:
> >>
> >>> On Tue, Sep 02, 2025 at 08:27:59AM -0700, Jakub Kicinski wrote:
> >>>> On Tue, 2 Sep 2025 16:51:47 +0200 Matthieu Baerts wrote:
> >>>>> It is unclear why a second scan is needed and only the second one caught
> >>>>> something. Was it the same with the strange issues you mentioned in
> >>>>> driver tests? Do you think I should re-add the second scan + cat?
> >>>>
> >>>> Not sure, cc: Catalin, from experience it seems like second scan often
> >>>> surfaces issues the first scan missed.
> >>>
> >>> It's some of the kmemleak heuristics to reduce false positives. It does
> >>> a checksum of the object during scanning and only reports a leak if the
> >>> checksum is the same in two consecutive scans.
> >>
> >> Thank you for the explanation!
> >>
> >> Does that mean a scan should be triggered at the end of the tests,
> >> then wait 5 seconds for the grace period, then trigger another scan
> >> and check the results?
> >>
> >> Or wait 5 seconds, then trigger two consecutive scans?
> >
> > The 5 seconds is the minimum age of an object before it gets reported as
> > a leak. It's not related to the scanning process. So you could do two
> > scans in succession and wait 5 seconds before checking for leaks.
> >
> > However, I'd go with the first option - do a scan, wait 5 seconds and do
> > another. That's mostly because at the end of the scan kmemleak prints if
> > it found new unreferenced objects. It might not print the message if a
> > leaked object is younger than 5 seconds. In practice, though, the scan
> > may take longer, depending on how loaded your system is.
> >
> > The second option works as well but waiting between them has a better
> > chance of removing false positives if, say, some objects are moved
> > between lists and two consecutive scans do not detect the list_head
> > change (and update the object's checksum).
>
> Thank you for this very nice reply, that's very clear!
>
> I will then adapt our CI having CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF
> to do a manual scan at the very end, wait 5 seconds and do another.

FWIW - I am able to pretty reliably reproduce the kmemleak. However, I
also tried adding an inline kmemleak scan to the test harness (did it
once with, once without a sleep). When I do that the kmemleak
disappears :-)

(not saying that adding the scan isn't useful, just pointing out that
this particular leak seems to be related to how quickly we iterate
over the testcases)


Christoph