[RFC PATCH 0/4] mptcp: improve mptcp-level window tracking

Paolo Abeni posted 4 patches 2 years ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/cover.1649672265.git.pabeni@redhat.com
Maintainers: Matthieu Baerts <matthieu.baerts@tessares.net>, "David S. Miller" <davem@davemloft.net>, Paolo Abeni <pabeni@redhat.com>, David Ahern <dsahern@kernel.org>, Jakub Kicinski <kuba@kernel.org>, Mat Martineau <mathew.j.martineau@linux.intel.com>, Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>, Eric Dumazet <edumazet@google.com>
[RFC PATCH 0/4] mptcp: improve mptcp-level window tracking
Posted by Paolo Abeni 2 years ago
I've been chasing bad/unstable performance with multiple subflows
on very high-speed links.

It looks like the root cause lies in the current mptcp-level
receive window handling. There are apparently a few different
sub-issues:

- the rcv_wnd is not effectively shared on the tx side, as each
  subflow takes into account only the value received on the underlying
  TCP connection. This is addressed in patch 1/4 (see the first sketch
  after this list)

- The mptcp-level offered window right edge is currently allowed to
  shrink. Reading RFC 8684, section 3.3.4:

"""
   The receive window is relative to the DATA_ACK.  As in TCP, a
   receiver MUST NOT shrink the right edge of the receive window (i.e.,
   DATA_ACK + receive window).  The receiver will use the data sequence
   number to tell if a packet should be accepted at the connection
   level.
"""

   I read the above as meaning we need to reflect the window right-edge
   tracking on the wire; see patch 3/4.

- The offered window right-edge updates can happen concurrently on
  multiple subflows, with no lock protecting them; we need an
  additional atomic operation - still in patch 3/4 (the second sketch
  after this list)
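To make the above more concrete, here are two simplified sketches.
Helper and field names (mptcp_wnd_end(), msk->rcv_wnd_sent, the MIB
identifiers) reflect my WIP tree and may not match the final patches
exactly.

The tx-side sharing (patch 1/4): when computing how much data a subflow
may send, clamp to the mptcp-level window offered by the peer, and let
the subflow-level TCP see the shared value, too:

static unsigned int mptcp_check_allowed_size(const struct mptcp_sock *msk,
					     struct sock *ssk, u64 data_seq,
					     unsigned int avail_size)
{
	u64 window_end = mptcp_wnd_end(msk);	/* msk-level right edge */
	u64 mptcp_snd_wnd;

	if (__mptcp_check_fallback(msk))
		return avail_size;

	/* never send beyond the mptcp-level window */
	mptcp_snd_wnd = window_end - data_seq;
	avail_size = min_t(unsigned int, mptcp_snd_wnd, avail_size);

	/* propagate the shared window to the subflow, so the plain TCP
	 * xmit path honors it, too
	 */
	if (unlikely(tcp_sk(ssk)->snd_wnd < mptcp_snd_wnd)) {
		tcp_sk(ssk)->snd_wnd = min_t(u64, U32_MAX, mptcp_snd_wnd);
		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_SNDWNDSHARED);
	}

	return avail_size;
}

The right-edge tracking (patch 3/4): every subflow writing a DSS option
funnels through something like the following, so the announced right
edge can only move forward, no matter which subflow sends the ACK:

static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
{
	struct mptcp_subflow_context *subflow;
	u64 ack_seq, rcv_wnd_old, rcv_wnd_new;
	struct mptcp_sock *msk;

	subflow = mptcp_subflow_ctx((struct sock *)tp);
	msk = mptcp_sk(subflow->conn);

	ack_seq = READ_ONCE(msk->ack_seq);
	rcv_wnd_new = ack_seq + tp->rcv_wnd;

	/* multiple subflows race here: loop until we advance the right
	 * edge, or until some other subflow advanced it even further
	 */
	rcv_wnd_old = atomic64_read(&msk->rcv_wnd_sent);
	while (after64(rcv_wnd_new, rcv_wnd_old)) {
		u64 rcv_wnd = atomic64_cmpxchg(&msk->rcv_wnd_sent,
					       rcv_wnd_old, rcv_wnd_new);

		if (rcv_wnd == rcv_wnd_old)
			return;	/* right edge advanced, window is fine */
		rcv_wnd_old = rcv_wnd;
	}

	if (rcv_wnd_old == rcv_wnd_new)
		return;

	/* the subflow-level window would shrink the announced right
	 * edge: raise it to the already-announced value and fix up the
	 * header (window scaling corner cases omitted for brevity)
	 */
	tp->rcv_wnd = min_t(u64, rcv_wnd_old - ack_seq, U32_MAX);
	th->window = htons(tp->rcv_wnd >> tp->rx_opt.rcv_wscale);
}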

This series additionally adds a few new MIB counters to track all of
the above (to verify that the suspected races actually take place).
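The new counters follow the usual mib.h/mib.c pattern; the identifiers
below are indicative only and may still be renamed:

/* net/mptcp/mib.h */
enum linux_mptcp_mib_field {
	/* ... existing entries ... */
	MPTCP_MIB_SNDWNDSHARED,	/* subflow snd_wnd clamped to the msk one */
	MPTCP_MIB_RCVWNDSHARED,	/* msk rcv_wnd reflected on the wire */
	MPTCP_MIB_RCVWNDCONFLICT,	/* lost a right-edge update race */
	__MPTCP_MIB_MAX
};

/* net/mptcp/mib.c */
static const struct snmp_mib mptcp_snmp_list[] = {
	/* ... existing entries ... */
	SNMP_MIB_ITEM("SndWndShared", MPTCP_MIB_SNDWNDSHARED),
	SNMP_MIB_ITEM("RcvWndShared", MPTCP_MIB_RCVWNDSHARED),
	SNMP_MIB_ITEM("RcvWndConflict", MPTCP_MIB_RCVWNDCONFLICT),
	SNMP_MIB_SENTINEL
};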

With this series, throughput in the critical scenario rises from
~26 Gbps (ranging between 4 and 30 Gbps) to ~43 Gbps (with a minimum
above 33 Gbps).

I guess patch 3/4 is the most debatable one, especially with respect to
RFC compliance. Any feedback is more than welcome!

Note: still in patch 3/4, I'm unsure whether the th->window update is
strictly necessary from a functional perspective (e.g. possibly the
atomic operation alone is enough); I'll try to test that, too.
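For reference, the tcp_output.c change is just plumbing: pass the
header down to the MPTCP option writer, so that it can rewrite
th->window after the DSS option has been built. Roughly (exact
signatures may differ):

static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
			      struct tcp_out_options *opts)
{
	__be32 *ptr = (__be32 *)(th + 1);

	/* ... plain TCP options are written here, advancing ptr ... */

	/* MPTCP gets the header, too, so that mptcp_set_rwin() can fix
	 * up th->window with the msk-level value
	 */
	if (unlikely(OPTION_MPTCP & opts->options))
		mptcp_write_options(th, ptr, tp, &opts->mptcp);
}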

Paolo Abeni (4):
  mptcp: really share subflow snd_wnd
  mptcp: add mib for xmit window sharing
  mptcp: never shrink offered window
  mptcp: add more offered MIBs counter.

 include/net/mptcp.h   |  2 +-
 net/ipv4/tcp_output.c |  2 +-
 net/mptcp/mib.c       |  4 +++
 net/mptcp/mib.h       |  6 +++++
 net/mptcp/options.c   | 61 +++++++++++++++++++++++++++++++++++++------
 net/mptcp/protocol.c  | 24 +++++++++++------
 6 files changed, 81 insertions(+), 18 deletions(-)

-- 
2.35.1


Re: [RFC PATCH 0/4] mptcp: improve mptcp-level window tracking
Posted by Paolo Abeni 2 years ago
On Mon, 2022-04-11 at 12:40 +0200, Paolo Abeni wrote:
> Note: still in patch 3/4, I'm unsure whether the th->window update is
> strictly necessary from a functional perspective (e.g. possibly the
> atomic operation alone is enough); I'll try to test that, too.

A couple more notes:

- it looks like the th->window update is necessary: a test run without
the relevant chunk produced significantly worse results.

- this is still (surprise, surprise!) not perfect: I can observe a low
but non-zero count of TcpExtTCPZeroWindowDrop, which should instead be
0 without an evil middlebox in the path (and is 0 with plain TCP)

- I'm wondering if we should additionally add another MIB counter,
MPTCPWANTZEROWINDOWADV, quite similar to TCPWANTZEROWINDOWADV (a rough
sketch follows)
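If so, it could be bumped from the msk-level window computation, along
these lines (mptcp_select_window() and the MPTCP_MIB_WANTZEROWINDOWADV
name are hypothetical):

static u32 mptcp_select_window(const struct sock *sk, u32 free_space,
			       u32 mss)
{
	/* mirror LINUX_MIB_TCPWANTZEROWINDOWADV: account every time the
	 * connection-level computation would like to announce a zero
	 * window
	 */
	if (free_space < mss) {
		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_WANTZEROWINDOWADV);
		return 0;
	}
	return free_space;
}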

Cheers,

Paolo