[PATCH mptcp-next v2 3/5] tcp: add mptcp join demultiplex hooks

Florian Westphal posted 5 patches 3 years, 11 months ago
There is a newer version of this series
[PATCH mptcp-next v2 3/5] tcp: add mptcp join demultiplex hooks
Posted by Florian Westphal 3 years, 11 months ago
Split from the next patch to make core tcp changes more obvious:
add a dummy function that gets called after tcp socket demux came up
empty.

This will be used by mptcp to check if a tcp syn contains an mptcp
join option with a valid token (connection id).

If so, a hidden pernet mptcp listener socket is returned and packet
resumes normally.

This patch series does not cover timewait sockets so far.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/mptcp.h | 5 +++++
 net/ipv4/tcp_ipv4.c | 4 ++++
 net/ipv6/tcp_ipv6.c | 4 ++++
 3 files changed, 13 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 8b1afd6f5cc4..5ee422b56902 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -197,6 +197,10 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 
 	return htonl(0u);
 }
+static inline struct sock *mptcp_handle_join(int af, struct sk_buff *skb)
+{
+	return NULL;
+}
 #else
 
 static inline void mptcp_init(void)
@@ -274,6 +278,7 @@ static inline int mptcp_subflow_init_cookie_req(struct request_sock *req,
 }
 
 static inline __be32 mptcp_reset_option(const struct sk_buff *skb)  { return htonl(0u); }
+static inline struct sock *mptcp_handle_join(int af, struct sk_buff *skb) { return NULL; }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 6873f46fc8ba..6e6675a09443 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2140,6 +2140,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	sk = mptcp_handle_join(AF_INET, skb);
+	if (sk)
+		goto process;
+
 	tcp_v4_fill_cb(skb, iph, th);
 
 	if (tcp_checksum_complete(skb)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0c648bf07f39..788040db8e9e 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1782,6 +1782,10 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	sk = mptcp_handle_join(AF_INET6, skb);
+	if (sk)
+		goto process;
+
 	tcp_v6_fill_cb(skb, hdr, th);
 
 	if (tcp_checksum_complete(skb)) {
-- 
2.34.1


Re: [PATCH mptcp-next v2 3/5] tcp: add mptcp join demultiplex hooks
Posted by Paolo Abeni 3 years, 11 months ago
On Thu, 2022-02-17 at 15:25 +0100, Florian Westphal wrote:
> Split from the next patch to make core tcp changes more obvious:
> add a dummy function that gets called after tcp socket demux came up
> empty.
> 
> This will be used by mptcp to check if a tcp syn contains an mptcp
> join option with a valid token (connection id).
> 
> If so, a hidden pernet mptcp listener socket is returned and packet
> resumes normally.
> 
> This patch series does not cover timewait sockets so far.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Following-up todays mtg discussion WRT filtering incoming MPJ on
existing endpoint. 

If I read correctly, subflow_syn_recv_sock() will filter the incoming
packets vs the annonced list, but only if the incoming packet's
destination port is different from the listener port.

- incoming request targeting a bad address and/or port will reach the
syn-ack before being rejected.

- incoming request targeting port 0 will likely be always accepted.

What about moving the anno_list check in __mptcp_handle_join()? Should
address the 2 above points and keep the anno_list traversals to the
bare minimum.

WDYT?

Cheers,

Paolo


Re: [PATCH mptcp-next v2 3/5] tcp: add mptcp join demultiplex hooks
Posted by Florian Westphal 3 years, 11 months ago
Paolo Abeni <pabeni@redhat.com> wrote:
> If I read correctly, subflow_syn_recv_sock() will filter the incoming
> packets vs the annonced list, but only if the incoming packet's
> destination port is different from the listener port.

Same as today, if it hit listener port then no issue...?
The listener port isn't in the announced list, so the check would fail.

> - incoming request targeting a bad address and/or port will reach the
> syn-ack before being rejected.

Yes.

> - incoming request targeting port 0 will likely be always accepted.

No, port 0 is illegal, there is an explicit check that prevents
such port from being wired up to the magic listener.

> What about moving the anno_list check in __mptcp_handle_join()? Should
> address the 2 above points and keep the anno_list traversals to the
> bare minimum.

I can have a look.

Re: [PATCH mptcp-next v2 3/5] tcp: add mptcp join demultiplex hooks
Posted by Paolo Abeni 3 years, 11 months ago
On Fri, 2022-02-18 at 08:29 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > If I read correctly, subflow_syn_recv_sock() will filter the incoming
> > packets vs the annonced list, but only if the incoming packet's
> > destination port is different from the listener port.
> 
> Same as today, if it hit listener port then no issue...?
> The listener port isn't in the announced list, so the check would fail.

Side not not related to this series: It looks like that if we have 2
MPTCP listeners on different addresses and/or ports, and no 'signal'
endpoints, the kernel will accept MPJ on either sockets for any mptcp
connection, regardless of the originating listener. I'm unsure if that
is expeted/fits the RFC fully. 
@Mat: ^^^ WDYT?

> 
> > - incoming request targeting a bad address and/or port will reach the
> > syn-ack before being rejected.
> 
> Yes.
> 
> > - incoming request targeting port 0 will likely be always accepted.
> 
> No, port 0 is illegal, there is an explicit check that prevents
> such port from being wired up to the magic listener.

whoops, I missed that check, despite the huge comment. Sorry for the
noise here.

> > What about moving the anno_list check in __mptcp_handle_join()? Should
> > address the 2 above points and keep the anno_list traversals to the
> > bare minimum.
> 
> I can have a look.

Thanks! It looks like some minor changes are needed to handle !ipv6
build.

Cheers,
Paolo