I've been chasing bad/unstable performance with multiple subflows
on very high speed links.
It looks like the root cause is the current mptcp-level
congestion window handling. There are apparently a few different
sub-issues:
- the rcv_wnd is not effectively shared on the tx side, as each
subflow takes into account only the value received by the underlying
TCP connection. This is addressed in patch 1/5
- The mptcp-level offered wnd right edge is currently allowed to shrink.
Reading RFC 8684, section 3.3.4:
"""
The receive window is relative to the DATA_ACK. As in TCP, a
receiver MUST NOT shrink the right edge of the receive window (i.e.,
DATA_ACK + receive window). The receiver will use the data sequence
number to tell if a packet should be accepted at the connection
level.
"""
I read the above as: the window right-edge tracking must be
reflected on the wire, see patch 4/5.
- The offered window right edge tracking can happen concurrently on
multiple subflows, but there is no mutex protecting it. We need an
additional atomic operation, still in patch 4/5
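To make the last two points more concrete, here is a rough userspace
sketch of the idea, not the code in this series: each subflow computes
a candidate right edge (DATA_ACK + window) and publishes it with a
compare-and-exchange only if it moves the edge forward; otherwise it
announces the edge some other subflow already put on the wire. The
struct, the helper names and the race counter are hypothetical and use
C11 atomics instead of the kernel primitives; clamping to the window
field size and window scaling are left out on purpose.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection-level state shared by all subflows. */
struct conn_state {
	_Atomic uint64_t rcv_wnd_sent; /* highest right edge announced so far */
	_Atomic uint64_t race_count;   /* lost updates, in the spirit of a MIB */
};

/* True if 64-bit data sequence number a is after b (wraparound safe). */
static bool seq64_after(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) > 0;
}

/*
 * Called by a subflow about to send an ACK. Returns the window to
 * announce, relative to data_ack, so that the right edge
 * (data_ack + window) never moves backward.
 */
static uint64_t offered_window(struct conn_state *cs, uint64_t data_ack,
			       uint64_t new_win)
{
	uint64_t candidate = data_ack + new_win;
	uint64_t seen = atomic_load(&cs->rcv_wnd_sent);

	while (seq64_after(candidate, seen)) {
		/* On failure, 'seen' is refreshed with the current value. */
		if (atomic_compare_exchange_strong(&cs->rcv_wnd_sent,
						   &seen, candidate))
			return new_win;

		/* Another subflow updated the edge under us. */
		atomic_fetch_add(&cs->race_count, 1);
	}

	/* An equal or larger edge was already announced: stick to it. */
	return seen - data_ack;
}
```

With an edge already announced at 1500, a subflow calling
offered_window(&cs, 600, 400) would get 900 back instead of 400, so
the edge on the wire stays at 1500 rather than shrinking to 1000.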
This series additionally adds a few new MIBs to track all the above
(to ensure/observe that the suspected races actually take place).
I could not get access again to the host where the issue was most
noticeable; still, in the current setup the tput changes from
[6-18] Gbps to a very stable 19 Gbps.
RFC -> v1:
- added patch 3/5 to address Mat's comment, and rebased the
following patches on top of it - I hope Eric may tolerate that, it's
more a hope than a guess ;)
Paolo Abeni (5):
  mptcp: really share subflow snd_wnd
  mptcp: add mib for xmit window sharing
  tcp: allow MPTCP to update the announced window.
  mptcp: never shrink offered window
  mptcp: add more offered MIBs counter.

 include/net/mptcp.h   |  2 +-
 net/ipv4/tcp_output.c | 13 +++++-----
 net/mptcp/mib.c       |  4 +++
 net/mptcp/mib.h       |  6 +++++
 net/mptcp/options.c   | 58 +++++++++++++++++++++++++++++++++++++------
 net/mptcp/protocol.c  | 24 ++++++++++++------
 6 files changed, 84 insertions(+), 23 deletions(-)
--
2.35.1