[PATCH net v5 0/2] mptcp: pm: fix stale ADD_ADDR anno_list entry on id 0 removal

Kalpan Jani posted 2 patches 4 days, 13 hours ago
Patches applied successfully
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20260630150632.899750-1-kalpan.jani@mpiricsoftware.com
net/mptcp/pm_kernel.c                    |  8 ++++++++
tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +++++++++++
2 files changed, 19 insertions(+)
[PATCH net v5 0/2] mptcp: pm: fix stale ADD_ADDR anno_list entry on id 0 removal
Posted by Kalpan Jani 4 days, 13 hours ago
The in-kernel MPTCP path manager can leave a stale ADD_ADDR announcement
entry alive when removing the id 0 endpoint, causing WARN_ON_ONCE() to fire
on subsequent PM reselection (issue #620).

**The Problem:**
When the in-kernel PM removes a signal endpoint, it must cancel any pending
ADD_ADDR announcement: stop the retransmit timer and unlink the anno_list
entry. For non-zero id endpoints, this is done correctly via
mptcp_nl_remove_subflow_and_signal_addr() -> mptcp_pm_remove_anno_addr() ->
mptcp_pm_announced_remove().

However, the id 0 removal path (mptcp_nl_remove_id_zero_address()) skips
this cleanup, only queuing an RM_ADDR and marking the id available. If an
ADD_ADDR for that address was previously announced but the echo hasn't
arrived yet, the anno_list entry survives.

When the kernel PM later reselects id 0 (after adding another signal
endpoint to satisfy the PM's endpoint limits), it finds this stale entry
and hits the WARN_ON_ONCE(mptcp_pm_is_kernel()) assertion in
mptcp_pm_alloc_anno_list().
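
The sequence above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: every `toy_*` name, the fixed-size array, and the `warnings` counter are hypothetical stand-ins for anno_list and the WARN_ON_ONCE() in mptcp_pm_alloc_anno_list().

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ANNO 8

/* Hypothetical stand-in for one pending ADD_ADDR announcement. */
struct toy_anno_entry {
	bool in_use;
	unsigned char id;
};

/* Hypothetical stand-in for the per-connection announce list. */
struct toy_pm {
	struct toy_anno_entry anno[TOY_MAX_ANNO];
	unsigned int warnings;	/* counts would-be WARN_ON_ONCE() hits */
};

/* Find the pending announcement for an id, or NULL if there is none. */
struct toy_anno_entry *toy_lookup(struct toy_pm *pm, unsigned char id)
{
	for (size_t i = 0; i < TOY_MAX_ANNO; i++) {
		if (pm->anno[i].in_use && pm->anno[i].id == id)
			return &pm->anno[i];
	}
	return NULL;
}

/* Announce an id. An entry that already exists is unexpected and counted. */
bool toy_announce(struct toy_pm *pm, unsigned char id)
{
	if (toy_lookup(pm, id)) {
		pm->warnings++;
		return false;
	}
	for (size_t i = 0; i < TOY_MAX_ANNO; i++) {
		if (!pm->anno[i].in_use) {
			pm->anno[i].in_use = true;
			pm->anno[i].id = id;
			return true;
		}
	}
	return false;	/* list full */
}

/* Models the non-zero id path: the pending announcement is unlinked. */
void toy_remove_with_cleanup(struct toy_pm *pm, unsigned char id)
{
	struct toy_anno_entry *entry = toy_lookup(pm, id);

	if (entry)
		entry->in_use = false;
}

/* Models the pre-fix id 0 path: the announcement is left in place. */
void toy_remove_without_cleanup(struct toy_pm *pm, unsigned char id)
{
	(void)pm;
	(void)id;
}
```

In this model, announcing id 0, removing it through `toy_remove_without_cleanup()`, and announcing it again increments `warnings`, while the same sequence through `toy_remove_with_cleanup()` does not.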

**How It Was Found:**
syzkaller detected this via an MPTCP protocol-flow harness. Three crashes
were reported between 2026/05/19 and 2026/05/22 on kernels including commit
cca95436be15 ("mptcp: remove id 0 address"). The root cause was confirmed
through code inspection: the id 0 and non-zero id removal paths are
asymmetric.

**The Fix:**
Patch 1 makes mptcp_nl_remove_id_zero_address() symmetric with the
non-zero id path by calling mptcp_pm_announced_remove() before queuing
RM_ADDR, and decrementing add_addr_signaled if the address had been
announced. The fix is minimal (8 lines) and mirrors existing correct code.
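
The ordering described for Patch 1 can be sketched in userspace as follows. This is a hedged illustration, not the patch itself: the `toy_*` names and the struct are hypothetical, and the sketch assumes the removal helper reports whether a pending announcement was actually found. The `> 0` guard is a defensive choice of this sketch only.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced model of the PM state touched by the fix. */
struct toy_pm_state {
	bool id0_announce_pending;	/* models an anno_list entry for id 0 */
	unsigned int add_addr_signaled;	/* announcements counted so far */
	unsigned int rm_addr_queued;	/* RM_ADDR notifications queued */
};

/*
 * Drop the pending announcement, if any, and report whether there was one.
 * Assumption: the real helper exposes a comparable result to its caller.
 */
bool toy_announced_remove(struct toy_pm_state *pm)
{
	bool was_pending = pm->id0_announce_pending;

	pm->id0_announce_pending = false;
	return was_pending;
}

/* Models the fixed id 0 removal: cleanup first, then queue RM_ADDR. */
void toy_remove_id_zero(struct toy_pm_state *pm)
{
	if (toy_announced_remove(pm) && pm->add_addr_signaled > 0)
		pm->add_addr_signaled--;

	pm->rm_addr_queued++;
}
```

Calling `toy_remove_id_zero()` on a state with a pending announcement clears it and lowers the counter by one; calling it again only queues another RM_ADDR.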

A subtle detail: signal endpoints are stored in anno_list with port 0, but
msk_local carries the connection's actual local port. The lookup in
mptcp_pm_announced_remove() uses use_port=true, so the port must be
cleared before the call to match the stored entry key.
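
The port mismatch is easy to reproduce in isolation. The sketch below assumes a simplified IPv4-only address and a comparison helper with a `use_port` flag; `toy_addr`, `toy_addr_equal()` and `toy_anno_key()` are hypothetical names, not the kernel's own types or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, IPv4-only stand-in for an MPTCP address. */
struct toy_addr {
	uint32_t ip;	/* host byte order, for simplicity */
	uint16_t port;	/* 0 means "no port" */
};

/* Compare two addresses; ports are only compared when use_port is set. */
bool toy_addr_equal(const struct toy_addr *a, const struct toy_addr *b,
		    bool use_port)
{
	if (a->ip != b->ip)
		return false;
	if (!use_port)
		return true;
	return a->port == b->port;
}

/*
 * Build a lookup key from the connection's local address. The argument is
 * passed by value, so the caller's copy keeps its real port.
 */
struct toy_addr toy_anno_key(struct toy_addr msk_local)
{
	msk_local.port = 0;	/* stored signal entries carry port 0 */
	return msk_local;
}
```

With a stored entry of `{ip, 0}` and a local address of `{ip, 45678}`, a `use_port` comparison fails on the raw local address and succeeds on the key returned by `toy_anno_key()`.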

Patch 2 adds a regression test integrated into mptcp_join.sh (per
maintainer feedback, avoiding new test files). The test verifies no WARNING
appears when removing id 0 from a fully-established MPTCP connection.

**Verification:**
- syzkaller: 3 crashes, issue #620 reproducible
- Docker CI: 23/24 tests pass, no new failures introduced
- Code review: Fix is minimal, correct, and mirrors existing patterns
- Credit: Tao Cui <cuitao@kylinos.cn> identified the port mismatch
  (Suggested-by; fixed between v4 and v5)

Kalpan Jani (2):
  mptcp: pm: drop pending ADD_ADDR when removing id 0 endpoint
  selftests: mptcp: add id 0 stale ADD_ADDR regression test

 net/mptcp/pm_kernel.c                    |  8 ++++++++
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +++++++++++
 2 files changed, 19 insertions(+)

---
base-commit: net/main
change-id: 20260630-mptcp-id0-stale-anno-fix-kalpanj
Re: [PATCH net v5 0/2] mptcp: pm: fix stale ADD_ADDR anno_list entry on id 0 removal
Posted by MPTCP CI 4 days, 12 hours ago
Hi Kalpan,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/28457083924

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/81b37efcf998
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1119114


If there are any issues, you can reproduce them in the same environment as
the one used by the CI, thanks to a Docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to keep the test suite
stable when run on a public CI like this one, some reported issues may not be
due to your modifications. Still, do not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)