This series includes RX path improvements built around backlog processing.

The main goals are improving RX performance _and_ increasing the
long-term maintainability.

Patches 2-4 prepare the stack for backlog processing, removing
assumptions that will no longer hold true after the backlog
introduction.

Patches 1 and 5 fix long-standing issues which are quite hard to
reproduce with the current implementation, but the 2nd one would become
very apparent with backlog usage.

Patches 6, 7 and 9 are more cleanups that will make the backlog patch a
little less huge.

Patch 8 is a somewhat unrelated cleanup, included here before I forget
about it.

The real work is done by patches 10 and 11. Patch 10 introduces the
helpers needed to manipulate the msk-level backlog, and the data
structure itself, without any actual functional change. Patch 11 finally
uses the backlog for RX skb processing. Note that MPTCP can't use the
sk_backlog, as the mptcp release callback can also release and
re-acquire the msk-level spinlock, while core backlog processing works
under the assumption that such an event is not possible.

A relevant point is memory accounting for skbs in the backlog.

It's somewhat "original" due to MPTCP constraints. Such skbs use space
from the incoming subflow receive buffer and do not explicitly use any
forward allocated memory, as we can't update the msk fwd mem while
enqueuing, nor do we want to acquire the ssk socket lock again while
processing the skbs.

Instead, the msk borrows memory from the subflow and reserves it for
the backlog - see patches 3 and 11 for the gory details.

Note that even if the skbs can sit in the backlog for an unbounded time,

---
v5 -> v6:
- added patch 1/11
- widely reworked patches 10 and 11 to avoid double accounting for
  backlog skbs and to address the fwd allocated memory criticality
  mentioned in previous iterations.

Paolo Abeni (11):
  mptcp: drop bogus optimization in __mptcp_check_push()
  mptcp: borrow forward memory from subflow
  mptcp: cleanup fallback data fin reception
  mptcp: cleanup fallback dummy mapping generation
  mptcp: fix MSG_PEEK stream corruption
  mptcp: ensure the kernel PM does not take action too late
  mptcp: do not miss early first subflow close event notification
  mptcp: make mptcp_destroy_common() static
  mptcp: drop the __mptcp_data_ready() helper
  mptcp: introduce mptcp-level backlog
  mptcp: leverage the backlog for RX packet processing

 net/mptcp/mptcp_diag.c |   3 +-
 net/mptcp/pm.c         |   4 +-
 net/mptcp/pm_kernel.c  |   2 +
 net/mptcp/protocol.c   | 363 ++++++++++++++++++++++++++++-------------
 net/mptcp/protocol.h   |  10 +-
 net/mptcp/subflow.c    |  12 +-
 6 files changed, 272 insertions(+), 122 deletions(-)

--
2.51.0
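[Editor's note: to make the design described above more concrete, here
is a minimal userspace sketch of the splice-under-lock pattern the
msk-level backlog relies on, together with the "borrowed" memory
accounting idea. All names here (mini_msk, blog_item, backlog_enqueue,
...) are illustrative placeholders, not identifiers from the actual
patches, and the real kernel code differs in many details.]

/*
 * Sketch only: producers append skb-like items under a lock while the
 * "socket" is owned by another context; the owner splices the whole
 * list out and processes it with the lock released.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct blog_item {
	struct blog_item *next;
	int truesize;		/* space charged to the subflow rcvbuf */
	int payload;
};

struct mini_msk {
	pthread_mutex_t lock;	/* stands in for the msk data lock */
	struct blog_item *head, **tail;
	int borrowed;		/* memory borrowed from the subflows */
};

static void msk_init(struct mini_msk *msk)
{
	pthread_mutex_init(&msk->lock, NULL);
	msk->head = NULL;
	msk->tail = &msk->head;
	msk->borrowed = 0;
}

/* Producer side: runs while the msk is owned by another thread. */
static void backlog_enqueue(struct mini_msk *msk, struct blog_item *it)
{
	pthread_mutex_lock(&msk->lock);
	it->next = NULL;
	*msk->tail = it;
	msk->tail = &it->next;
	/* account the space against the subflow, not msk fwd memory */
	msk->borrowed += it->truesize;
	pthread_mutex_unlock(&msk->lock);
}

/*
 * Owner side: splice the whole backlog out under the lock, then
 * process it with the lock fully released.
 */
static void backlog_process(struct mini_msk *msk)
{
	struct blog_item *list;
	int reclaimed = 0;

	pthread_mutex_lock(&msk->lock);
	list = msk->head;
	msk->head = NULL;
	msk->tail = &msk->head;
	pthread_mutex_unlock(&msk->lock);

	while (list) {
		struct blog_item *it = list;

		list = it->next;
		printf("processing payload %d\n", it->payload);
		reclaimed += it->truesize;
		free(it);
	}

	pthread_mutex_lock(&msk->lock);
	msk->borrowed -= reclaimed;	/* return the borrowed memory */
	pthread_mutex_unlock(&msk->lock);
}

int main(void)
{
	struct mini_msk msk;
	int i;

	msk_init(&msk);
	for (i = 0; i < 3; i++) {
		struct blog_item *it = malloc(sizeof(*it));

		it->truesize = 256;
		it->payload = i;
		backlog_enqueue(&msk, it);
	}
	backlog_process(&msk);
	printf("borrowed after processing: %d\n", msk.borrowed);
	return 0;
}

The key property the sketch illustrates is that, unlike the core
sk_backlog, the processing step runs with the lock released and is
therefore free to re-acquire it - which is exactly what the mptcp
release callback may do.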
Hi Paolo,

Thanks for this v6.

On Wed, 2025-10-22 at 16:31 +0200, Paolo Abeni wrote:
> This series includes RX path improvements built around backlog
> processing.

All tests have passed on my end. The only minor issues requiring
cleanup are in patch 2 and patch 9, which Matt or I can address in
subsequent revisions.

All patches LGTM.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Tested-by: Geliang Tang <geliang@kernel.org>
Hi Paolo,
Thank you for your modifications; that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/18720365057
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/61d32775fca0
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1014567
If there are any issues, you can reproduce them using the same environment as
the one used by the CI, thanks to a docker image, e.g.:
    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal
For more details:
    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that, despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Paolo,

On 22/10/2025 16:31, Paolo Abeni wrote:
> This series includes RX path improvements built around backlog
> processing.

Sorry for the delay in applying the patches.
Now in our tree:

New patches for t/upstream-net and t/upstream:
- 3a771ce1a023: mptcp: drop bogus optimization in __mptcp_check_push()
- b7ad5d80dff9: mptcp: fix MSG_PEEK stream corruption
- Results: 2a626f21446a..1ce735a9c1d6 (export-net)
- Results: 0d7b92893723..f23a9744c467 (export)

Tests are now in progress:
- export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/294a632909acc62a7656edc52770f9f59332af39/checks
- export: https://github.com/multipath-tcp/mptcp_net-next/commit/1035c9b6c98cf0d15998eb7ce53e5f81621e5116/checks

New patches for t/upstream:
- f0025b89c22c: mptcp: cleanup fallback data fin reception
- 905fec8ceabc: mptcp: cleanup fallback dummy mapping generation
- 8b29c30e82d2: mptcp: ensure the kernel PM does not take action too late
- 439f10b1648b: mptcp: do not miss early first subflow close event notification
- 00804e3abc8d: mptcp: make mptcp_destroy_common() static
- 25c9415c554d: mptcp: drop the __mptcp_data_ready() helper
- Results: f23a9744c467..eb31e166f63e (export)

Tests are now in progress:
- export: https://github.com/multipath-tcp/mptcp_net-next/commit/54c37fb023bf32f10f60fb90d27a8fa800de426f/checks

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.