From: Geliang Tang <tanggeliang@kylinos.cn>

v9:
 - add a new patch to "add MPTCP SKB offset check in strp queue walk",
   thanks to Gang Yan for the fix.
 - add a new patch to "avoid deadlocks in read_sock path", replacing
   the "in_softirq()" check used in v8.
 - update the selftests.

v8:
 - do not hold tls_prot_ops_lock in tls_init(); otherwise, a deadlock
   occurs.
 - change return value of mptcp_stream_is_readable() as 'bool' to fix
   the "expected restricted __poll_t" warning reported by CI.
 - fixed other CI checkpatch warnings regarding excessively long lines.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1768294706.git.tanggeliang@kylinos.cn/

v7:
 - Passing an MPTCP socket to tcp_sock_rate_check_app_limited() causes
   a crash. In v7, an MPTCP version of check_app_limited() is
   implemented, which calls tcp_sock_rate_check_app_limited() for each
   subflow.
 - Register tls_tcp_ops and tls_mptcp_ops in tls_register() rather than
   in tls_init().
 - Set ctx->ops in tls_init() instead of in do_tls_setsockopt_conf().
 - Keep tls_device.c unchanged. MPTCP TLS_HW mode has not been
   implemented yet, so EOPNOTSUPP is returned in this case.
 - Also add TCP TLS tests in mptcp_join.sh.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1768284047.git.tanggeliang@kylinos.cn/

v6:
 - register each ops as Matt suggested.
 - drop sk_is_msk().
 - add tcp_sock_get_ulp/tcp_sock_set_ulp helpers.
 - set another ULP in sock_test_tcpulp as Matt suggested.
 - add tls tests using multiple subflows in mptcp_join.sh.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1767518836.git.tanggeliang@kylinos.cn/

v5:
 - As suggested by Mat and Matt, this set introduces struct
   tls_prot_ops for TLS.
 - Includes Gang Yan's patches to add MPTCP support to the TLS
   selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1766372799.git.tanggeliang@kylinos.cn/

v4:
 - split "tls: add MPTCP protocol support" into smaller, more focused
   patches.
 - a new mptcp_inq helper has been implemented instead of directly
   using mptcp_inq_hint to fix the issue mentioned in [1].
 - add sk_is_msk helper.
 - the 'expect' parameter will no longer be added to sock_test_tcpulp.
   Instead, SOCK_TEST_TCPULP items causing the tests failure will be
   directly removed.
 - remove the "TCP KTLS" tests, keeping only the MPTCP-related ones.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1765505775.git.tanggeliang@kylinos.cn/

[1]
https://patchwork.kernel.org/project/mptcp/patch/ce74452f4c095a1761ef493b767b4bd9f9c14359.1764333805.git.tanggeliang@kylinos.cn/

v3:
 - mptcp_read_sock() and mptcp_poll() are not exported, as
   mptcp_sockopt test does not use read_sock/poll interfaces. They will
   be exported when new tests are added in the future.
 - call mptcp_inq_hint in tls_device_rx_resync_new_rec(),
   tls_device_core_ctrl_rx_resync() and tls_read_flush_backlog() too.
 - update selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1763800601.git.tanggeliang@kylinos.cn/

v2:
 - fix disconnect.
 - update selftests.

This series adds KTLS support for MPTCP. Since the ULP of msk is not
being used, ULP KTLS can be directly configured onto msk without
affecting its communication.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/480

Gang Yan (2):
  tls: add MPTCP SKB offset check in strp queue walk
  mptcp: update mptcp_check_readable

Geliang Tang (8):
  tls: introduce struct tls_prot_ops
  tls: add ops in tls_context
  mptcp: avoid deadlocks in read_sock path
  mptcp: implement tls_mptcp_ops
  mptcp: update ULP getsockopt
  mptcp: enable TLS setsockopt
  selftests: mptcp: connect: set smc instead of tls
  selftests: mptcp: connect: add TLS tests

 include/linux/tcp.h                           |   2 +
 include/net/mptcp.h                           |   2 +
 include/net/tcp.h                             |   1 +
 include/net/tls.h                             |  20 +++
 net/ipv4/tcp.c                                |  87 +++++++-----
 net/mptcp/protocol.c                          | 131 +++++++++++++++---
 net/mptcp/protocol.h                          |   1 +
 net/mptcp/sockopt.c                           |  37 ++++-
 net/tls/tls_main.c                            | 107 +++++++++++++-
 net/tls/tls_strp.c                            |  31 +++--
 net/tls/tls_sw.c                              |   7 +-
 tools/testing/selftests/net/mptcp/config      |   1 +
 .../selftests/net/mptcp/mptcp_connect.c       |  49 ++++++-
 .../selftests/net/mptcp/mptcp_connect.sh      |  33 +++++
 14 files changed, 438 insertions(+), 71 deletions(-)

-- 
2.53.0
From: Geliang Tang <tanggeliang@kylinos.cn>

To extend MPTCP support based on TCP TLS, a tls_prot_ops structure has
been introduced for TLS, encapsulating TCP-specific helpers within this
structure.

Add registering, validating and finding functions for this structure to
add, validate and find a tls_prot_ops on the global list
tls_prot_ops_list.

Register TCP-specific structure tls_tcp_ops in tls_register().

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/tls.h  | 19 ++++++++++
 net/tls/tls_main.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/include/net/tls.h b/include/net/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_prot_info {
 	u16 tail_size;
 };
 
+struct tls_prot_ops {
+	int protocol;
+	struct module *owner;
+	struct list_head list;
+
+	int (*inq)(struct sock *sk);
+	int (*sendmsg_locked)(struct sock *sk, struct msghdr *msg, size_t size);
+	struct sk_buff *(*recv_skb)(struct sock *sk, u32 *off);
+	int (*read_sock)(struct sock *sk, read_descriptor_t *desc,
+			 sk_read_actor_t recv_actor);
+	void (*read_done)(struct sock *sk, size_t len);
+	u32 (*get_skb_off)(struct sk_buff *skb);
+	u32 (*get_skb_seq)(struct sk_buff *skb);
+	__poll_t (*poll)(struct file *file, struct socket *sock,
+			 struct poll_table_struct *wait);
+	bool (*epollin_ready)(const struct sock *sk);
+	void (*check_app_limited)(struct sock *sk);
+};
+
 struct tls_context {
 	/* read-only cache line */
 	struct tls_prot_info prot_info;
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ static struct proto_ops tls_proto_ops[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CON
 static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 			 const struct proto *base);
 
+static DEFINE_SPINLOCK(tls_prot_ops_lock);
+static LIST_HEAD(tls_prot_ops_list);
+
+/* Must be called with rcu read lock held */
+static struct tls_prot_ops *tls_prot_ops_find(int protocol)
+{
+	struct tls_prot_ops *ops, *ret = NULL;
+
+	list_for_each_entry_rcu(ops, &tls_prot_ops_list, list) {
+		if (ops->protocol == protocol) {
+			ret = ops;
+			break;
+		}
+	}
+
+	return ret;
+}
+
 void update_sk_prot(struct sock *sk, struct tls_context *ctx)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
@@ -XXX,XX +XXX,XX @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 	.get_info_size = tls_get_info_size,
 };
 
+static int tls_validate_prot_ops(const struct tls_prot_ops *ops)
+{
+	if (!ops->inq || !ops->sendmsg_locked ||
+	    !ops->recv_skb || !ops->read_sock || !ops->read_done ||
+	    !ops->get_skb_off || !ops->get_skb_seq ||
+	    !ops->poll || !ops->epollin_ready ||
+	    !ops->check_app_limited) {
+		pr_err("%d does not implement required ops\n", ops->protocol);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int tls_register_prot_ops(struct tls_prot_ops *ops)
+{
+	int ret;
+
+	ret = tls_validate_prot_ops(ops);
+	if (ret)
+		return ret;
+
+	spin_lock(&tls_prot_ops_lock);
+	if (tls_prot_ops_find(ops->protocol)) {
+		spin_unlock(&tls_prot_ops_lock);
+		return -EEXIST;
+	}
+	list_add_tail_rcu(&ops->list, &tls_prot_ops_list);
+	spin_unlock(&tls_prot_ops_lock);
+
+	pr_debug("tls_prot_ops %d registered\n", ops->protocol);
+	return 0;
+}
+
+static struct sk_buff *tls_tcp_recv_skb(struct sock *sk, u32 *off)
+{
+	return tcp_recv_skb(sk, tcp_sk(sk)->copied_seq, off);
+}
+
+static u32 tls_tcp_get_skb_off(struct sk_buff *skb)
+{
+	return 0;
+}
+
+static u32 tls_tcp_get_skb_seq(struct sk_buff *skb)
+{
+	return TCP_SKB_CB(skb)->seq;
+}
+
+static bool tls_tcp_epollin_ready(const struct sock *sk)
+{
+	return tcp_epollin_ready(sk, INT_MAX);
+}
+
+static struct tls_prot_ops tls_tcp_ops = {
+	.protocol = IPPROTO_TCP,
+	.inq = tcp_inq,
+	.sendmsg_locked = tcp_sendmsg_locked,
+	.recv_skb = tls_tcp_recv_skb,
+	.read_sock = tcp_read_sock,
+	.read_done = tcp_read_done,
+	.get_skb_off = tls_tcp_get_skb_off,
+	.get_skb_seq = tls_tcp_get_skb_seq,
+	.poll = tcp_poll,
+	.epollin_ready = tls_tcp_epollin_ready,
+	.check_app_limited = tcp_rate_check_app_limited,
+};
+
 static int __init tls_register(void)
 {
 	int err;
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
+	tls_register_prot_ops(&tls_tcp_ops);
+
 	return 0;
 err_strp:
 	tls_strp_dev_exit();
-- 
2.53.0
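
For readers following the registration flow in the patch above, here is a minimal userspace sketch of the validate/register/find pattern. It is not kernel code: the names demo_prot_ops, demo_ops_register and demo_ops_find are hypothetical, the list is a plain singly linked list instead of an RCU list, only one callback is modelled, and there is no locking.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct tls_prot_ops. */
struct demo_prot_ops {
	int protocol;
	struct demo_prot_ops *next;	/* stands in for the kernel list_head */
	int (*inq)(void *sk);		/* one callback is enough for the sketch */
};

static struct demo_prot_ops *demo_ops_list;

/* Linear lookup by protocol number, in the spirit of tls_prot_ops_find(). */
static struct demo_prot_ops *demo_ops_find(int protocol)
{
	struct demo_prot_ops *ops;

	for (ops = demo_ops_list; ops; ops = ops->next)
		if (ops->protocol == protocol)
			return ops;
	return NULL;
}

/* Validate first, reject duplicates, then append at the tail.
 * Assumption: single-threaded, so the spinlock/RCU part is left out.
 */
static int demo_ops_register(struct demo_prot_ops *ops)
{
	struct demo_prot_ops **tail = &demo_ops_list;

	if (!ops->inq)
		return -EINVAL;
	if (demo_ops_find(ops->protocol))
		return -EEXIST;

	while (*tail)
		tail = &(*tail)->next;
	ops->next = NULL;
	*tail = ops;
	return 0;
}

/* Dummy callback used only to satisfy validation. */
static int demo_tcp_inq(void *sk)
{
	(void)sk;
	return 0;
}
```

Registering a second ops with an already-known protocol number fails with -EEXIST, and an ops with a missing callback fails with -EINVAL before the list is touched, which is the ordering the patch uses.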
From: Geliang Tang <tanggeliang@kylinos.cn>

A pointer to struct tls_prot_ops, named 'ops', has been added to struct
tls_context. The places originally calling TCP-specific helpers have
now been modified to indirectly invoke them via 'ops' pointer in
tls_context.

In tls_init(), ctx->ops is looked up with tls_prot_ops_find() based on
the socket protocol; tls_init() fails with -EINVAL if no matching
tls_prot_ops is registered.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/tls.h  |  1 +
 net/tls/tls_main.c | 13 +++++++++----
 net/tls/tls_strp.c | 29 +++++++++++++++++++----------
 net/tls/tls_sw.c   |  7 +++++--
 4 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_context {
 	struct sock *sk;
 
 	void (*sk_destruct)(struct sock *sk);
+	const struct tls_prot_ops *ops;
 
 	union tls_crypto_context crypto_send;
 	union tls_crypto_context crypto_recv;
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ int tls_push_sg(struct sock *sk,
 	ctx->splicing_pages = true;
 	while (1) {
 		/* is sending application-limited? */
-		tcp_rate_check_app_limited(sk);
+		ctx->ops->check_app_limited(sk);
 		p = sg_page(sg);
 retry:
 		bvec_set_page(&bvec, p, size, offset);
 		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-		ret = tcp_sendmsg_locked(sk, &msg, size);
+		ret = ctx->ops->sendmsg_locked(sk, &msg, size);
 
 		if (ret != size) {
 			if (ret > 0) {
@@ -XXX,XX +XXX,XX @@ static __poll_t tls_sk_poll(struct file *file, struct socket *sock,
 	u8 shutdown;
 	int state;
 
-	mask = tcp_poll(file, sock, wait);
+	tls_ctx = tls_get_ctx(sk);
+	mask = tls_ctx->ops->poll(file, sock, wait);
 
 	state = inet_sk_state_load(sk);
 	shutdown = READ_ONCE(sk->sk_shutdown);
 	if (unlikely(state != TCP_ESTABLISHED || shutdown & RCV_SHUTDOWN))
 		return mask;
 
-	tls_ctx = tls_get_ctx(sk);
 	ctx = tls_sw_ctx_rx(tls_ctx);
 	psock = sk_psock_get(sk);
 
@@ -XXX,XX +XXX,XX @@ static int tls_init(struct sock *sk)
 	ctx->tx_conf = TLS_BASE;
 	ctx->rx_conf = TLS_BASE;
 	ctx->tx_max_payload_len = TLS_MAX_PAYLOAD_SIZE;
+	ctx->ops = tls_prot_ops_find(sk->sk_protocol);
+	if (!ctx->ops) {
+		rc = -EINVAL;
+		goto out;
+	}
 	update_sk_prot(sk, ctx);
 out:
 	write_unlock_bh(&sk->sk_callback_lock);
diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_strp.c
+++ b/net/tls/tls_strp.c
@@ -XXX,XX +XXX,XX @@ struct sk_buff *tls_strp_msg_detach(struct tls_sw_context_rx *ctx)
 int tls_strp_msg_cow(struct tls_sw_context_rx *ctx)
 {
 	struct tls_strparser *strp = &ctx->strp;
+	struct tls_context *tls_ctx;
 	struct sk_buff *skb;
 
 	if (strp->copy_mode)
@@ -XXX,XX +XXX,XX @@ int tls_strp_msg_cow(struct tls_sw_context_rx *ctx)
 	tls_strp_anchor_free(strp);
 	strp->anchor = skb;
 
-	tcp_read_done(strp->sk, strp->stm.full_len);
+	tls_ctx = tls_get_ctx(strp->sk);
+	tls_ctx->ops->read_done(strp->sk, strp->stm.full_len);
 	strp->copy_mode = 1;
 
 	return 0;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_copyin(read_descriptor_t *desc, struct sk_buff *in_skb,
 
 static int tls_strp_read_copyin(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	read_descriptor_t desc;
 
 	desc.arg.data = strp;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copyin(struct tls_strparser *strp)
 	desc.count = 1; /* give more than one skb per call */
 
 	/* sk should be locked here, so okay to do read_sock */
-	tcp_read_sock(strp->sk, &desc, tls_strp_copyin);
+	ctx->ops->read_sock(strp->sk, &desc, tls_strp_copyin);
 
 	return desc.error;
 }
 
 static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct skb_shared_info *shinfo;
 	struct page *page;
 	int need_spc, len;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 	 * to read the data out. Otherwise the connection will stall.
 	 * Without pressure threshold of INT_MAX will never be ready.
 	 */
-	if (likely(qshort && !tcp_epollin_ready(strp->sk, INT_MAX)))
+	if (likely(qshort && !ctx->ops->epollin_ready(strp->sk)))
 		return 0;
 
 	shinfo = skb_shinfo(strp->anchor);
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 {
 	unsigned int len = strp->stm.offset + strp->stm.full_len;
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct sk_buff *first, *skb;
 	u32 seq;
 
 	first = skb_shinfo(strp->anchor)->frag_list;
 	skb = first;
-	seq = TCP_SKB_CB(first)->seq;
+	seq = ctx->ops->get_skb_seq(first);
 
 	/* Make sure there's no duplicate data in the queue,
 	 * and the decrypted status matches.
@@ -XXX,XX +XXX,XX @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 		len -= skb->len;
 		skb = skb->next;
 
-		if (TCP_SKB_CB(skb)->seq != seq)
+		if (ctx->ops->get_skb_seq(skb) != seq)
 			return false;
 		if (skb_cmp_decrypted(first, skb))
 			return false;
@@ -XXX,XX +XXX,XX @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 
 static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 {
-	struct tcp_sock *tp = tcp_sk(strp->sk);
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct sk_buff *first;
 	u32 offset;
 
-	first = tcp_recv_skb(strp->sk, tp->copied_seq, &offset);
+	first = ctx->ops->recv_skb(strp->sk, &offset);
 	if (WARN_ON_ONCE(!first))
 		return;
 
@@ -XXX,XX +XXX,XX @@ static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 
 bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct strp_msg *rxm;
 	struct tls_msg *tlm;
 
@@ -XXX,XX +XXX,XX @@ bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 	DEBUG_NET_WARN_ON_ONCE(!strp->stm.full_len);
 
 	if (!strp->copy_mode && force_refresh) {
-		if (unlikely(tcp_inq(strp->sk) < strp->stm.full_len)) {
+		if (unlikely(ctx->ops->inq(strp->sk) < strp->stm.full_len)) {
 			WRITE_ONCE(strp->msg_ready, 0);
 			memset(&strp->stm, 0, sizeof(strp->stm));
 			return false;
@@ -XXX,XX +XXX,XX @@ bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 /* Called with lock held on lower socket */
 static int tls_strp_read_sock(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	int sz, inq;
 
-	inq = tcp_inq(strp->sk);
+	inq = ctx->ops->inq(strp->sk);
 	if (inq < 1)
 		return 0;
 
@@ -XXX,XX +XXX,XX @@ static void tls_strp_work(struct work_struct *w)
 
 void tls_strp_msg_done(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
+
 	WARN_ON(!strp->stm.full_len);
 
 	if (likely(!strp->copy_mode))
-		tcp_read_done(strp->sk, strp->stm.full_len);
+		ctx->ops->read_done(strp->sk, strp->stm.full_len);
 	else
 		tls_strp_flush_anchor_copy(strp);
 
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -XXX,XX +XXX,XX @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
 		       size_t len_left, size_t decrypted, ssize_t done,
 		       size_t *flushed_at)
 {
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	size_t max_rec;
 
 	if (len_left <= decrypted)
 		return false;
 
 	max_rec = prot->overhead_size - prot->tail_size + TLS_MAX_PAYLOAD_SIZE;
-	if (done - *flushed_at < SZ_128K && tcp_inq(sk) > max_rec)
+	if (done - *flushed_at < SZ_128K && tls_ctx->ops->inq(sk) > max_rec)
 		return false;
 
 	*flushed_at = done;
@@ -XXX,XX +XXX,XX @@ int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb)
 	size_t cipher_overhead;
 	size_t data_len = 0;
 	int ret;
+	u32 seq;
 
 	/* Verify that we have a full TLS header, or wait for more data */
 	if (strp->stm.offset + prot->prepend_size > skb->len)
@@ -XXX,XX +XXX,XX @@ int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb)
 		goto read_failure;
 	}
 
+	seq = tls_ctx->ops->get_skb_seq(skb);
 	tls_device_rx_resync_new_rec(strp->sk, data_len + TLS_HEADER_SIZE,
-				     TCP_SKB_CB(skb)->seq + strp->stm.offset);
+				     seq + strp->stm.offset);
 	return data_len + TLS_HEADER_SIZE;
 
 read_failure:
-- 
2.53.0
From: Gang Yan <yangang@kylinos.cn>

In MPTCP scenarios, subflow SKBs may have non-zero offsets due to
out-of-order packet handling or partial data delivery. When walking the
TLS strp queue to verify sequence numbers and decryption status, we
must also validate each SKB's offset using get_skb_off() to ensure the
queue state is consistent.

This check is specific to MPTCP; TCP does not require offset validation
as its SKBs always start at offset 0. If any SKB reports an invalid
offset, return false to indicate the queue is not in a consistent state
and trigger a resynchronization.

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/tls/tls_strp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_strp.c
+++ b/net/tls/tls_strp.c
@@ -XXX,XX +XXX,XX @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 		len -= skb->len;
 		skb = skb->next;
 
+		if (ctx->ops->get_skb_off(skb))
+			return false;
 		if (ctx->ops->get_skb_seq(skb) != seq)
 			return false;
 		if (skb_cmp_decrypted(first, skb))
-- 
2.53.0
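
The queue walk described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct demo_skb and demo_queue_ok() are hypothetical names, the SKB is reduced to a sequence number, an offset and a length, the decrypted-status comparison is omitted, and a NULL guard is added that the sketch needs but the patch does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an SKB on the strp queue. */
struct demo_skb {
	uint32_t seq;		/* what get_skb_seq() would report */
	uint32_t off;		/* what get_skb_off() would report */
	unsigned int len;
	struct demo_skb *next;
};

/* Walk the queue until 'len' bytes are covered. Every SKB after the
 * first must start at offset 0 and continue the sequence space without
 * a gap or overlap; otherwise the queue is treated as inconsistent.
 */
static bool demo_queue_ok(const struct demo_skb *first, unsigned int len)
{
	const struct demo_skb *skb = first;
	uint32_t seq = first->seq;

	while (skb->len < len) {
		seq += skb->len;
		len -= skb->len;
		skb = skb->next;
		if (!skb)		/* sketch-only guard */
			return false;
		if (skb->off)		/* the check this patch adds */
			return false;
		if (skb->seq != seq)
			return false;
	}
	return true;
}
```

With a TCP-like backend the offset is always 0, so the added check never fires; with an MPTCP-like backend a partially consumed SKB in the middle of the queue makes the walk return false.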
From: Gang Yan <yangang@kylinos.cn>

This patch makes mptcp_check_readable() aligned with TCP, and renames
it to mptcp_stream_is_readable(). It will be used in the case of KTLS:
because 'prot' will be modified, tls_sw_sock_is_readable() is expected
to be called from prot->sock_is_readable().

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/mptcp/protocol.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@ void __mptcp_unaccepted_force_close(struct sock *sk)
 	__mptcp_destroy_sock(sk);
 }
 
-static __poll_t mptcp_check_readable(struct sock *sk)
+static bool mptcp_stream_is_readable(struct sock *sk)
 {
-	return mptcp_epollin_ready(sk) ? EPOLLIN | EPOLLRDNORM : 0;
+	if (mptcp_epollin_ready(sk))
+		return true;
+	return sk_is_readable(sk);
 }
 
 static void mptcp_check_listen_stop(struct sock *sk)
@@ -XXX,XX +XXX,XX @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
 		mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
 
 	if (state != TCP_SYN_SENT && state != TCP_SYN_RECV) {
-		mask |= mptcp_check_readable(sk);
+		if (mptcp_stream_is_readable(sk))
+			mask |= EPOLLIN | EPOLLRDNORM;
 		if (shutdown & SEND_SHUTDOWN)
 			mask |= EPOLLOUT | EPOLLWRNORM;
 		else
-- 
2.53.0
From: Geliang Tang <tanggeliang@kylinos.cn>

When invoking mptcp_read_sock() from a softirq context (e.g., through
the TLS read_sock interface), calling lock_sock_fast() in
mptcp_rcv_space_adjust() or mptcp_cleanup_rbuf() can lead to deadlocks,
since the socket lock may already be held.

Replace lock_sock_fast() with spin_trylock_bh() in these functions to
make the locking attempt non-blocking. If the lock cannot be acquired,
skip the operation to avoid deadlock.

Also introduce mptcp_data_trylock() and use it in mptcp_move_skbs() to
make the data locking non-blocking in the read_sock path.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/mptcp/protocol.c | 16 ++++++++--------
 net/mptcp/protocol.h |  1 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@ static void mptcp_send_ack(struct mptcp_sock *msk)
 
 static void mptcp_subflow_cleanup_rbuf(struct sock *ssk, int copied)
 {
-	bool slow;
-
-	slow = lock_sock_fast(ssk);
+	if (!spin_trylock_bh(&ssk->sk_lock.slock))
+		return;
 	if (tcp_can_send_ack(ssk))
 		tcp_cleanup_rbuf(ssk, copied);
-	unlock_sock_fast(ssk, slow);
+	spin_unlock_bh(&ssk->sk_lock.slock);
 }
 
 static bool mptcp_subflow_could_cleanup(const struct sock *ssk, bool rx_empty)
@@ -XXX,XX +XXX,XX @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
 	 */
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk;
-		bool slow;
 
 		ssk = mptcp_subflow_tcp_sock(subflow);
-		slow = lock_sock_fast(ssk);
+		if (!spin_trylock_bh(&ssk->sk_lock.slock))
+			continue;
 		/* subflows can be added before tcp_init_transfer() */
 		if (tcp_sk(ssk)->rcvq_space.space)
 			tcp_rcvbuf_grow(ssk, copied);
-		unlock_sock_fast(ssk, slow);
+		spin_unlock_bh(&ssk->sk_lock.slock);
 	}
 }
 
@@ -XXX,XX +XXX,XX @@ static bool mptcp_move_skbs(struct sock *sk)
 	bool enqueued = false;
 	u32 moved;
 
-	mptcp_data_lock(sk);
+	if (!mptcp_data_trylock(sk))
+		return false;
 	while (mptcp_can_spool_backlog(sk, &skbs)) {
 		mptcp_data_unlock(sk);
 		enqueued |= __mptcp_move_skbs(sk, &skbs, &moved);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -XXX,XX +XXX,XX @@ struct mptcp_sock {
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
+#define mptcp_data_trylock(sk) spin_trylock_bh(&(sk)->sk_lock.slock)
 #define mptcp_data_unlock(sk) spin_unlock_bh(&(sk)->sk_lock.slock)
 
 #define mptcp_for_each_subflow(__msk, __subflow) \
-- 
2.53.0
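
The "try the lock, skip the work if it is busy" approach from the commit message can be illustrated in plain C11. This is a userspace sketch, not the kernel primitive: demo_sock, demo_trylock(), demo_unlock() and demo_cleanup_rbuf() are hypothetical names, and an atomic_flag stands in for the socket spinlock used with spin_trylock_bh().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical socket with a spinlock-like flag and a work counter. */
struct demo_sock {
	atomic_flag slock;
	int cleanups;
};

/* Non-blocking acquire: true if we now own the lock. */
static bool demo_trylock(struct demo_sock *sk)
{
	return !atomic_flag_test_and_set_explicit(&sk->slock,
						  memory_order_acquire);
}

static void demo_unlock(struct demo_sock *sk)
{
	atomic_flag_clear_explicit(&sk->slock, memory_order_release);
}

/* Best-effort cleanup: if the lock is already held (possibly by our
 * own caller), skip the work instead of spinning forever.
 * Returns true when the cleanup actually ran.
 */
static bool demo_cleanup_rbuf(struct demo_sock *sk)
{
	if (!demo_trylock(sk))
		return false;
	sk->cleanups++;
	demo_unlock(sk);
	return true;
}
```

The trade-off is visible in the return value: a skipped cleanup is not retried here, so this pattern only suits work that is safe to defer, which is the assumption the patch makes for the rbuf cleanup and receive-space adjustment.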
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch implements the MPTCP-specific struct tls_prot_ops, named
'tls_mptcp_ops'.

Note that there is a slight difference between mptcp_inq() and
mptcp_inq_hint(): mptcp_inq() does not return 1 when the socket is
closed or shut down; instead, it returns 0. Otherwise, it would break
the condition "inq < 1" in tls_strp_read_sock().

Passing an MPTCP socket to tcp_sock_rate_check_app_limited() can
trigger a crash. Here, an MPTCP version of check_app_limited() is
implemented, which calls tcp_sock_rate_check_app_limited() for each
subflow.

MPTCP TLS_HW mode is not yet implemented, so EOPNOTSUPP is returned in
this case.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/mptcp.h  |   2 +
 include/net/tcp.h    |   1 +
 net/ipv4/tcp.c       |   9 +++-
 net/mptcp/protocol.c | 106 ++++++++++++++++++++++++++++++++++++++++---
 net/tls/tls_main.c   |   6 +++
 5 files changed, 116 insertions(+), 8 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -XXX,XX +XXX,XX @@ struct mptcp_pm_ops {
 	void (*release)(struct mptcp_sock *msk);
 } ____cacheline_aligned_in_smp;
 
+extern struct tls_prot_ops tls_mptcp_ops;
+
 #ifdef CONFIG_MPTCP
 void mptcp_init(void);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -XXX,XX +XXX,XX @@ static inline int tcp_bound_to_half_wnd(struct tcp_sock *tp, int pktsize)
 /* tcp.c */
 void tcp_get_info(struct sock *, struct tcp_info *);
 
+void tcp_sock_rate_check_app_limited(struct tcp_sock *tp);
 void tcp_rate_check_app_limited(struct sock *sk);
 
 /* Read 'sendfile()'-style from a TCP socket */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -XXX,XX +XXX,XX @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 }
 
 /* If a gap is detected between sends, mark the socket application-limited. */
-void tcp_rate_check_app_limited(struct sock *sk)
+void tcp_sock_rate_check_app_limited(struct tcp_sock *tp)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
+	struct sock *sk = (struct sock *)tp;
 
 	if (/* We have less than one packet to send. */
 	    tp->write_seq - tp->snd_nxt < tp->mss_cache &&
@@ -XXX,XX +XXX,XX @@ void tcp_rate_check_app_limited(struct sock *sk)
 		tp->app_limited =
 			(tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
 }
+
+void tcp_rate_check_app_limited(struct sock *sk)
+{
+	tcp_sock_rate_check_app_limited(tcp_sk(sk));
+}
 EXPORT_SYMBOL_GPL(tcp_rate_check_app_limited);
 
 int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@
 #include <net/mptcp.h>
 #include <net/hotdata.h>
 #include <net/xfrm.h>
+#include <net/tls.h>
 #include <asm/ioctls.h>
 #include "protocol.h"
 #include "mib.h"
 
-static unsigned int mptcp_inq_hint(const struct sock *sk);
+static unsigned int mptcp_inq_hint(struct sock *sk);
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/mptcp.h>
@@ -XXX,XX +XXX,XX @@ static void mptcp_rps_record_subflows(const struct mptcp_sock *msk)
 	}
 }
 
-static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int mptcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct page_frag *pfrag;
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	/* silently ignore everything else */
 	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_FASTOPEN;
 
-	lock_sock(sk);
-
 	mptcp_rps_record_subflows(msk);
 
 	if (unlikely(inet_test_bit(DEFER_CONNECT, sk) ||
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		__mptcp_push_pending(sk, msg->msg_flags);
 
 out:
-	release_sock(sk);
 	return copied;
 
 do_error:
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	goto out;
 }
 
+static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+{
+	int ret;
+
+	lock_sock(sk);
+	ret = mptcp_sendmsg_locked(sk, msg, len);
+	release_sock(sk);
+
+	return ret;
+}
+
 static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied);
 
 static void mptcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb)
@@ -XXX,XX +XXX,XX @@ static bool mptcp_move_skbs(struct sock *sk)
 	return enqueued;
 }
 
-static unsigned int mptcp_inq_hint(const struct sock *sk)
+static int mptcp_inq(struct sock *sk)
 {
 	const struct mptcp_sock *msk = mptcp_sk(sk);
 	const struct sk_buff *skb;
@@ -XXX,XX +XXX,XX @@ static unsigned int mptcp_inq_hint(const struct sock *sk)
 		return (unsigned int)hint_val;
 	}
 
+	return 0;
+}
+
+static unsigned int mptcp_inq_hint(struct sock *sk)
+{
+	unsigned int inq = mptcp_inq(sk);
+
+	if (inq)
+		return inq;
+
 	if (sk->sk_state == TCP_CLOSE || (sk->sk_shutdown & RCV_SHUTDOWN))
 		return 1;
 
@@ -XXX,XX +XXX,XX @@ int __init mptcp_proto_v6_init(void)
 	return err;
 }
 #endif
+
+static void mptcp_read_done(struct sock *sk, size_t len)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sk_buff *skb;
+	size_t left;
+	u32 offset;
+
+	msk_owned_by_me(msk);
+
+	if (sk->sk_state == TCP_LISTEN)
+		return;
+
+	left = len;
+	while (left && (skb = mptcp_recv_skb(sk, &offset)) != NULL) {
+		int used;
+
+		used = min_t(size_t, skb->len - offset, left);
+		msk->bytes_consumed += used;
+		MPTCP_SKB_CB(skb)->offset += used;
+		MPTCP_SKB_CB(skb)->map_seq += used;
+		left -= used;
+
+		if (skb->len > offset + used)
+			break;
+
+		mptcp_eat_recv_skb(sk, skb);
+	}
+
+	mptcp_rcv_space_adjust(msk, len - left);
+
+	/* Clean up data we have read: This will do ACK frames. */
+	if (left != len)
+		mptcp_cleanup_rbuf(msk, len - left);
+}
+
+static u32 mptcp_get_skb_off(struct sk_buff *skb)
+{
+	return MPTCP_SKB_CB(skb)->offset;
+}
+
+static u32 mptcp_get_skb_seq(struct sk_buff *skb)
+{
+	return MPTCP_SKB_CB(skb)->map_seq;
+}
+
+static void mptcp_check_app_limited(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
+
+	mptcp_for_each_subflow(msk, subflow) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+		bool slow;
+
+		slow = lock_sock_fast(ssk);
+		tcp_sock_rate_check_app_limited(tcp_sk(ssk));
+		unlock_sock_fast(ssk, slow);
+	}
+}
+
+struct tls_prot_ops tls_mptcp_ops = {
+	.protocol = IPPROTO_MPTCP,
+	.inq = mptcp_inq,
+	.sendmsg_locked = mptcp_sendmsg_locked,
+	.recv_skb = mptcp_recv_skb,
+	.read_sock = mptcp_read_sock,
+	.read_done = mptcp_read_done,
+	.get_skb_off = mptcp_get_skb_off,
+	.get_skb_seq = mptcp_get_skb_seq,
+	.poll = mptcp_poll,
+	.epollin_ready = mptcp_epollin_ready,
+	.check_app_limited = mptcp_check_app_limited,
+};
+EXPORT_SYMBOL(tls_mptcp_ops);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
 		tls_sw_strparser_arm(sk, ctx);
 	}
 
+	if (conf == TLS_HW && sk->sk_protocol == IPPROTO_MPTCP)
+		return -EOPNOTSUPP;
+
 	if (tx)
 		ctx->tx_conf = conf;
 	else
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
 	tls_register_prot_ops(&tls_tcp_ops);
+#ifdef CONFIG_MPTCP
+	tls_register_prot_ops(&tls_mptcp_ops);
+#endif
 
 	return 0;
 err_strp:
-- 
2.53.0
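
To make the mptcp_inq() versus mptcp_inq_hint() distinction from the commit message concrete, here is a small userspace model. It is a sketch under assumptions: struct demo_msk and the demo_* functions are hypothetical, the queued byte count is a plain field instead of being derived from the receive queue, and demo_strp_should_read() only mirrors the "inq < 1" early return in tls_strp_read_sock().

```c
#include <stdbool.h>

/* Hypothetical, reduced view of an MPTCP socket's receive state. */
struct demo_msk {
	int queued;		/* bytes available to read */
	bool closed;		/* models sk_state == TCP_CLOSE */
	bool rcv_shutdown;	/* models sk_shutdown & RCV_SHUTDOWN */
};

/* Like mptcp_inq(): report queued bytes only, 0 when nothing is queued. */
static int demo_inq(const struct demo_msk *msk)
{
	return msk->queued;
}

/* Like mptcp_inq_hint(): same value, but 1 on a closed or shut-down
 * socket so a reader is nudged into noticing EOF.
 */
static unsigned int demo_inq_hint(const struct demo_msk *msk)
{
	unsigned int inq = demo_inq(msk);

	if (inq)
		return inq;
	if (msk->closed || msk->rcv_shutdown)
		return 1;
	return 0;
}

/* Mirrors the "inq < 1" early return in tls_strp_read_sock(). */
static bool demo_strp_should_read(int inq)
{
	return inq >= 1;
}
```

On a closed socket with an empty queue, the hint variant would report 1 and make the parser attempt a read with nothing to parse, while the plain variant reports 0 and lets the parser return early.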
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch extracts TCP_ULP getsockopt operation into a
tcp_sock_get_ulp() helper so that it can also be used in MPTCP.

Previously, TCP_ULP was obtained by calling
mptcp_getsockopt_first_sf_only() to get ULP of the first subflow. Now
that the mechanism has changed, a new helper mptcp_getsockopt_tcp_ulp()
is added to get ULP of msk.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/linux/tcp.h |  1 +
 net/ipv4/tcp.c      | 36 ++++++++++++++++++++++--------------
 net/mptcp/sockopt.c | 12 ++++++++++++
 3 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -XXX,XX +XXX,XX @@ void tcp_sock_set_quickack(struct sock *sk, int val);
 int tcp_sock_set_syncnt(struct sock *sk, int val);
 int tcp_sock_set_user_timeout(struct sock *sk, int val);
 int tcp_sock_set_maxseg(struct sock *sk, int val);
+int tcp_sock_get_ulp(struct sock *sk, sockptr_t optval, sockptr_t optlen);
 
 static inline bool dst_tcp_usec_ts(const struct dst_entry *dst)
 {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -XXX,XX +XXX,XX @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk,
 	return stats;
 }
 
+int tcp_sock_get_ulp(struct sock *sk, sockptr_t optval, sockptr_t optlen)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	int len;
+
+	if (copy_from_sockptr(&len, optlen, sizeof(int)))
+		return -EFAULT;
+	len = min_t(unsigned int, len, TCP_ULP_NAME_MAX);
+	if (!icsk->icsk_ulp_ops) {
+		len = 0;
+		if (copy_to_sockptr(optlen, &len, sizeof(int)))
+			return -EFAULT;
+		return 0;
+	}
+	if (copy_to_sockptr(optlen, &len, sizeof(int)))
+		return -EFAULT;
+	if (copy_to_sockptr(optval, icsk->icsk_ulp_ops->name, len))
+		return -EFAULT;
+	return 0;
+}
+
 int do_tcp_getsockopt(struct sock *sk, int level,
 		      int optname, sockptr_t optval, sockptr_t optlen)
 {
@@ -XXX,XX +XXX,XX @@ int do_tcp_getsockopt(struct sock *sk, int level,
 		return 0;
 
 	case TCP_ULP:
-		if (copy_from_sockptr(&len, optlen, sizeof(int)))
-			return -EFAULT;
-		len = min_t(unsigned int, len, TCP_ULP_NAME_MAX);
-		if (!icsk->icsk_ulp_ops) {
-			len = 0;
-			if (copy_to_sockptr(optlen, &len, sizeof(int)))
-				return -EFAULT;
-			return 0;
-		}
-		if (copy_to_sockptr(optlen, &len, sizeof(int)))
-			return -EFAULT;
-		if (copy_to_sockptr(optval, icsk->icsk_ulp_ops->name, len))
-			return -EFAULT;
-		return 0;
+		return tcp_sock_get_ulp(sk, optval, optlen);
 
 	case TCP_FASTOPEN_KEY: {
 		u64 key[TCP_FASTOPEN_KEY_BUF_LENGTH / sizeof(u64)];
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -XXX,XX +XXX,XX @@ static int mptcp_put_int_option(struct mptcp_sock *msk, char __user *optval,
 	return 0;
 }
 
+static int mptcp_getsockopt_tcp_ulp(struct sock *sk, char __user *optval,
+				    int __user *optlen)
+{
+	int ret;
+
+	lock_sock(sk);
+	ret = tcp_sock_get_ulp(sk, USER_SOCKPTR(optval), USER_SOCKPTR(optlen));
+	release_sock(sk);
+	return ret;
+}
+
 static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 				    char __user *optval, int __user *optlen)
 {
@@ -XXX,XX +XXX,XX @@ static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 
 	switch (optname) {
 	case TCP_ULP:
+		return mptcp_getsockopt_tcp_ulp(sk, optval, optlen);
 	case TCP_CONGESTION:
 	case TCP_INFO:
 	case TCP_CC_INFO:
-- 
2.53.0
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch extracts TCP_ULP setsockopt operation into a
tcp_sock_set_ulp() helper so that it can also be used in MPTCP.

Add MPTCP TLS setsockopt support in mptcp_setsockopt_sol_tcp(). It
allows setting the TCP_ULP option to 'tls' exclusively, and enables
configuration of the TLS_TX and TLS_RX options at the SOL_TLS level.

This option cannot be set when the socket is in CLOSE or LISTEN state.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/linux/tcp.h |  1 +
 net/ipv4/tcp.c      | 42 ++++++++++++++++++++++++------------------
 net/mptcp/sockopt.c | 25 ++++++++++++++++++++++++-
 3 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -XXX,XX +XXX,XX @@ int tcp_sock_set_syncnt(struct sock *sk, int val);
 int tcp_sock_set_user_timeout(struct sock *sk, int val);
 int tcp_sock_set_maxseg(struct sock *sk, int val);
 int tcp_sock_get_ulp(struct sock *sk, sockptr_t optval, sockptr_t optlen);
+int tcp_sock_set_ulp(struct sock *sk, sockptr_t optval, unsigned int optlen);
 
 static inline bool dst_tcp_usec_ts(const struct dst_entry *dst)
 {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -XXX,XX +XXX,XX @@ int tcp_sock_set_maxseg(struct sock *sk, int val)
 	return 0;
 }
 
+int tcp_sock_set_ulp(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	char name[TCP_ULP_NAME_MAX];
+	int err = 0;
+	size_t len;
+	int val;
+
+	if (optlen < 1)
+		return -EINVAL;
+
+	len = min_t(long, TCP_ULP_NAME_MAX - 1, optlen);
+	val = strncpy_from_sockptr(name, optval, len);
+	if (val < 0)
+		return -EFAULT;
+	name[val] = 0;
+
+	sockopt_lock_sock(sk);
+	err = tcp_set_ulp(sk, name);
+	sockopt_release_sock(sk);
+	return err;
+}
+
 /*
  * Socket option code for TCP.
  */
@@ -XXX,XX +XXX,XX @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 		sockopt_release_sock(sk);
 		return err;
 	}
-	case TCP_ULP: {
-		char name[TCP_ULP_NAME_MAX];
-
-		if (optlen < 1)
-			return -EINVAL;
-
-		val = strncpy_from_sockptr(name, optval,
-					min_t(long, TCP_ULP_NAME_MAX - 1,
-					      optlen));
-		if (val < 0)
-			return -EFAULT;
-		name[val] = 0;
-
-		sockopt_lock_sock(sk);
-		err = tcp_set_ulp(sk, name);
-		sockopt_release_sock(sk);
-		return err;
-	}
+	case TCP_ULP:
+		return tcp_sock_set_ulp(sk, optval, optlen);
 	case TCP_FASTOPEN_KEY: {
 		__u8 key[TCP_FASTOPEN_KEY_BUF_LENGTH];
 		__u8 *backup_key = NULL;
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -XXX,XX +XXX,XX @@
 #include <net/protocol.h>
 #include <net/tcp.h>
 #include <net/mptcp.h>
+#include <net/tls.h>
 #include "protocol.h"
 
 #define MIN_INFO_OPTLEN_SIZE 16
@@ -XXX,XX +XXX,XX @@ static bool mptcp_supported_sockopt(int level, int optname)
 		case TCP_FASTOPEN_CONNECT:
 		case TCP_FASTOPEN_KEY:
 		case TCP_FASTOPEN_NO_COOKIE:
+		case TCP_ULP:
 			return true;
 		}
 
@@ -XXX,XX +XXX,XX @@ static bool mptcp_supported_sockopt(int level, int optname)
 		 * TCP_REPAIR_WINDOW are not supported, better avoid this mess
 		 */
 	}
+	if (level == SOL_TLS) {
+		switch (optname) {
+		case TLS_TX:
+		case TLS_RX:
+			return true;
+		}
+	}
 	return false;
 }
 
@@ -XXX,XX +XXX,XX @@ static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
 	return ret;
 }
 
+static int mptcp_setsockopt_tcp_ulp(struct sock *sk, sockptr_t optval,
+				    unsigned int optlen)
+{
+	char ulp[4] = "";
+
+	if (copy_from_user(ulp, optval.user, 4))
+		return -EFAULT;
+	if (strcmp(ulp, "tls\0"))
+		return -EOPNOTSUPP;
+	if ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))
+		return -ENOTCONN;
+	return tcp_sock_set_ulp(sk, optval, optlen);
+}
+
 static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 				    sockptr_t optval, unsigned int optlen)
 {
@@ -XXX,XX +XXX,XX @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 
 	switch (optname) {
 	case TCP_ULP:
-		return -EOPNOTSUPP;
+		return mptcp_setsockopt_tcp_ulp(sk, optval, optlen);
 	case TCP_CONGESTION:
 		return mptcp_setsockopt_sol_tcp_congestion(msk, optval, optlen);
 	case TCP_DEFER_ACCEPT:
-- 
2.53.0
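The checks in mptcp_setsockopt_tcp_ulp() (accept only "tls", reject CLOSE and LISTEN) can be modelled in userspace. The sketch below is an illustration under assumptions, not the patch's code: model_check_ulp(), ULP_NAME_MAX and the ST_* names are hypothetical, and, unlike the posted helper, which copies a fixed 4 bytes, this variant bounds the copy by optlen and always NUL-terminates before comparing. That is one possible shape of the TCP_ULP_NAME_MAX-based approach the v10 changelog mentions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define ULP_NAME_MAX 16	/* hypothetical stand-in for TCP_ULP_NAME_MAX */

/* Numeric values follow the kernel's TCP state numbering. */
enum { ST_ESTABLISHED = 1, ST_CLOSE = 7, ST_LISTEN = 10 };

/*
 * Returns 0 if the request would be passed on to tcp_sock_set_ulp(),
 * or the negative errno the checks would produce.
 */
static int model_check_ulp(const char *optval, size_t optlen, int state)
{
	char name[ULP_NAME_MAX];
	size_t len;

	assert(optval != NULL);
	if (optlen < 1)
		return -EINVAL;

	/* Copy at most optlen bytes and always NUL-terminate. */
	len = optlen < ULP_NAME_MAX - 1 ? optlen : ULP_NAME_MAX - 1;
	memcpy(name, optval, len);
	name[len] = '\0';

	if (strcmp(name, "tls") != 0)
		return -EOPNOTSUPP;
	/* Same bitmask test as (1 << sk_state) & (TCPF_CLOSE | TCPF_LISTEN). */
	if ((1 << state) & ((1 << ST_CLOSE) | (1 << ST_LISTEN)))
		return -ENOTCONN;
	return 0;
}
```

With this shape, a caller that passes "tls" with optlen 3 (as do_ulp_so() does via strlen()) and one that passes optlen 4 including the terminator are both accepted.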
From: Geliang Tang <tanggeliang@kylinos.cn>

With KTLS being implemented, "tls" should no longer be used in
sock_test_tcpulp(), because it breaks the mptcp_connect.sh tests.
Another ULP name, "smc", is set instead in this patch.

Cc: Dust Li <dust.li@linux.alibaba.com>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/mptcp/config          | 1 +
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -XXX,XX +XXX,XX @@ CONFIG_NFT_SOCKET=m
 CONFIG_NFT_TPROXY=m
 CONFIG_SYN_COOKIES=y
 CONFIG_VETH=y
+CONFIG_TLS=y
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -XXX,XX +XXX,XX @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
 		if (ret == 0)
 			X("setsockopt");
 	} else if (proto == IPPROTO_MPTCP) {
-		ret = do_ulp_so(sock, "tls");
+		ret = do_ulp_so(sock, "smc");
 		if (ret != -1)
 			X("setsockopt");
 	}
-- 
2.53.0
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch adds MPTCP TLS tests for mptcp_connect.c/mptcp_connect.sh.

A new TLS type has been added to cfg_sockopt_types, enabled via the
parameter "-o TLS". do_setsockopt_tls() has been implemented to set TLS
parameters for both the server and client.

After adding TLS configuration, sock_test_tcpulp() needs to be updated
as getsockopt ULP may now return not only "mptcp" but also "tls".

These tests report "read: Resource temporarily unavailable" errors
occasionally, which is fixed by adding handling for EAGAIN in
copyfd_io_poll().

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 .../selftests/net/mptcp/mptcp_connect.c  | 47 ++++++++++++++++++-
 .../selftests/net/mptcp/mptcp_connect.sh | 33 +++++++++++++
 2 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -XXX,XX +XXX,XX @@
 #include <linux/time_types.h>
 #include <linux/sockios.h>
 #include <linux/compiler.h>
+#include <linux/tls.h>
 
 extern int optind;
 
@@ -XXX,XX +XXX,XX @@ struct cfg_cmsg_types {
 struct cfg_sockopt_types {
 	unsigned int transparent:1;
 	unsigned int mptfo:1;
+	unsigned int tls:1;
 };
 
 struct tcp_inq_state {
@@ -XXX,XX +XXX,XX @@ static int do_ulp_so(int sock, const char *name)
 	return setsockopt(sock, IPPROTO_TCP, TCP_ULP, name, strlen(name));
 }
 
+static void do_setsockopt_tls(int fd)
+{
+	struct tls12_crypto_info_aes_gcm_128 tls_tx = {
+		.info = {
+			.version = TLS_1_2_VERSION,
+			.cipher_type = TLS_CIPHER_AES_GCM_128,
+		},
+	};
+	struct tls12_crypto_info_aes_gcm_128 tls_rx = {
+		.info = {
+			.version = TLS_1_2_VERSION,
+			.cipher_type = TLS_CIPHER_AES_GCM_128,
+		},
+	};
+	int err;
+
+	err = do_ulp_so(fd, "tls");
+	if (err)
+		xerror("setsockopt TCP_ULP");
+
+	err = setsockopt(fd, SOL_TLS, TLS_TX, (void *)&tls_tx, sizeof(tls_tx));
+	if (err)
+		xerror("setsockopt TLS_TX");
+
+	err = setsockopt(fd, SOL_TLS, TLS_RX, (void *)&tls_rx, sizeof(tls_rx));
+	if (err)
+		xerror("setsockopt TLS_RX");
+}
+
 #define X(m) xerror("%s:%u: %s: failed for proto %d at line %u", __FILE__, __LINE__, (m), proto, line)
 static void sock_test_tcpulp(int sock, int proto, unsigned int line)
 {
@@ -XXX,XX +XXX,XX @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
 		X("getsockopt");
 
 	if (buflen > 0) {
-		if (strcmp(buf, "mptcp") != 0)
+		if (strcmp(buf, "mptcp") != 0 && strcmp(buf, "tls") != 0)
 			xerror("unexpected ULP '%s' for proto %d at line %u", buf, proto, line);
 		ret = do_ulp_so(sock, "tls");
 		if (ret == 0)
@@ -XXX,XX +XXX,XX @@ static int sock_connect_mptcp(const char * const remoteaddr,
 	}
 
 	freeaddrinfo(addr);
-	if (sock != -1)
+	if (sock != -1) {
 		SOCK_TEST_TCPULP(sock, proto);
+		if (cfg_sockopt_types.tls)
+			do_setsockopt_tls(sock);
+	}
 	return sock;
 }
 
@@ -XXX,XX +XXX,XX @@ static int copyfd_io_poll(int infd, int peerfd, int outfd,
 
 				/* Else, still have data to transmit */
 			} else if (len < 0) {
+				if (errno == EAGAIN)
+					continue;
 				if (cfg_rcv_trunc)
 					return 0;
 				perror("read");
@@ -XXX,XX +XXX,XX @@ int main_loop_s(int listensock)
 	}
 
 	SOCK_TEST_TCPULP(remotesock, 0);
+	if (cfg_sockopt_types.tls)
+		do_setsockopt_tls(remotesock);
 
 	memset(&winfo, 0, sizeof(winfo));
 	err = copyfd_io(fd, remotesock, 1, true, &winfo);
@@ -XXX,XX +XXX,XX @@ static void parse_setsock_options(const char *name)
 		return;
 	}
 
+	if (strncmp(name, "TLS", len) == 0) {
+		cfg_sockopt_types.tls = 1;
+		return;
+	}
+
 	fprintf(stderr, "Unrecognized setsockopt option %s\n", name);
 	exit(1);
 }
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index XXXXXXX..XXXXXXX 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -XXX,XX +XXX,XX @@ run_tests_disconnect()
 	connect_per_transfer=1
 }
 
+run_tests_tls()
+{
+	TEST_GROUP="TLS"
+	local lret=0
+
+	if ! mptcp_lib_kallsyms_has "mptcp_read_done"; then
+		mptcp_lib_pr_skip "TLS not supported by the kernel"
+		mptcp_lib_result_skip "${TEST_GROUP}"
+		return
+	fi
+
+	mptcp_lib_pr_info "with TLS start"
+
+	do_transfer "$ns1" "$ns2" MPTCP MPTCP "10.0.1.1" "0.0.0.0" "-o TLS"
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		ret=$lret
+		return 1
+	fi
+
+	do_transfer "$ns1" "$ns2" MPTCP MPTCP "dead:beef:1::1" "::" "-o TLS"
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		ret=$lret
+		return 1
+	fi
+
+	mptcp_lib_pr_info "with TLS end"
+}
+
 display_time()
 {
 	time_end=$(date +%s)
@@ -XXX,XX +XXX,XX @@ log_if_error "Tests with tproxy have failed"
 run_tests_disconnect
 log_if_error "Tests of the full disconnection have failed"
 
+run_tests_tls
+log_if_error "Tests with TLS have failed"
+
 display_time
 mptcp_lib_result_print_all_tap
 exit ${final_ret}
-- 
2.53.0
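The EAGAIN handling added to copyfd_io_poll() follows a common pattern for nonblocking descriptors: a read that would block is not an error, and the caller simply goes back to poll(). A small self-contained sketch of that pattern is given here; read_or_retry() and its -2 return code are hypothetical and are not part of the selftest.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read from a nonblocking fd. Returns the number of bytes read,
 * 0 on EOF, -1 on a real error, and -2 when the read would block
 * and the caller should return to its poll() loop.
 */
static ssize_t read_or_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);
	n = read(fd, buf, len);
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return -2;
	return n;
}
```

In a poll loop, the -2 case corresponds to the `continue` the patch adds: nothing was consumed, so the loop just waits for the next readiness event.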
From: Geliang Tang <tanggeliang@kylinos.cn>

v16:
 - drop rcu_head from struct tls_proto, use refcnt for lifecycle
   management.
 - add back TLS_NUM_PROTS to handle IPv4/IPv6 separately.
 - add .owner field to tls_tcp_ops and tls_mptcp_ops (THIS_MODULE).
 - add module refcounting (try_module_get / module_put) in
   tls_build_proto and tls_init.
 - add missing NULL check for tls_ctx->proto->ops in tls_sk_poll.
 - add RCU read lock protection in tls_register_prot_ops.
 - add error handling for tls_register_prot_ops calls in tls_register
   (with rollback on failure).
 - adjust MPTCP cleanup: move tcp_cleanup_ulp from mptcp_destroy_common
   to mptcp_destroy.
 - remove increase_rlimit from selftest and fix fd check.

v15:
 - patch 1: add proto parameter for tls_toe_bypass.
 - patch 1: add a proto null-check in update_sk_prot.
 - patch 1: hold mutex_lock in tls_proto_cleanup.
 - patch 14: raise the limit of file descriptor values to 4096 to avoid
   test failures.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1776469068.git.tanggeliang@kylinos.cn/

v14:
 - address review comments from sashiko
 - patch 1: add rcu for tls_proto, add tls_proto_cleanup.
 - patch 2: add unregister helper.
 - patch 3: add tls_prot_ops pointer to tls_proto, instead of
   tls_context
 - patch 5: update mptcp_get_skb_seq, using map_seq - offset, then the
   patch "tls: add skb offset check for mptcp" can be dropped.
 - patch 7: check len < 0.
 - patch 8: call tcp_cleanup_ulp in mptcp_destroy_common.
 - patch 9: replace all "tls" as "espintcp" in sock_test_tcpulp.
 - patch 10: add is_mptcp_enable helper.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1775476921.git.tanggeliang@kylinos.cn/

v13:
 - patch 1: Add new patch "add per-protocol cache" to address AI review.
 - patch 2: Hold RCU read lock in tls_prot_ops_find().
 - patch 3: Set icsk_ulp_data to NULL in error path.
 - patch 6: Use spin_is_locked() instead of lockdep_is_held() to fix
   build errors.
 - patch 9: Drop tcp_sock_set_ulp().
 - patch 11: Remove the "return" statement in ulp_sock_pair and check
   the return values of socket().
 - patch 14: Update wait_for_tcp_close().
 - patch 16: Add a max argument to init() and set it to '0' to disable
   multipath testing, so that this series does not depend on the
   "mptcp: fix stall because of data_ready" series. Multipath testing
   will be re-enabled together with that series later, as a squash-to
   patch.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1775227717.git.tanggeliang@kylinos.cn/

v12:
 - Thanks for the help from Paolo and Gang Yan, I finally solved the
   deadlock issue in read_sock. As a result, the patch "mptcp: avoid
   sleeping in read_sock path under softirq" in v11 has been dropped,
   and instead a lock_is_held interface has been added to struct
   tls_prot_ops. When MPTCP implements this interface, it not only
   checks sock_owned_by_user_nocheck(sk) as TCP does, but also needs to
   check whether the MPTCP data lock is held.
 - Update selftests to make them more stable.
 - Fix shellcheck errors for the selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1775115102.git.tanggeliang@kylinos.cn/

v11:
 - Fix memory leak errors reported by CI. In v10, these occurred in the
   shutdown_reuse test and "usleep(500000)" caused the memory leaks. In
   v11, a dedicated helper wait_for_tcp_close() has been added to
   provide an appropriate delay.
 - Drop the code that used mptcp_data_trylock() in mptcp_move_skbs() to
   fix a deadlock issue, as that deadlock no longer occurs in v11.
 - Do not add "mptcp" variable for the "tls_err" tests, adding it for
   the "tls" tests is sufficient.
 - No longer increase timeout values for poll/epoll tests, as they are
   no longer needed.
 - Add ns1 definition in mptcp_tls.sh to fix "ns1 is referenced but not
   assigned" error.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1773911536.git.tanggeliang@kylinos.cn/

v10:
 - Address comments by ai review:
   - patch 2: call tls_ctx_free(sk, ctx) and clear icsk_ulp_data before
     goto out.
   - patch 3: update commit log as "validate each SKB's offset except
     the first".
   - patch 5: add sock_owned_by_user() checks.
   - patch 7: disable device offload for MPTCP sockets.
   - patch 9: use TCP_ULP_NAME_MAX in mptcp_setsockopt_tcp_ulp(), drop
     SOL_TLS in mptcp_supported_sockopt().
 - Make .get_skb_off optional instead of mandatory, TCP does not need to
   define it.
 - Test "espintcp" ULP instead of "smc" in patch 10. "smc" ULP is
   removed recently.
 - With Gang Yan's "mptcp: fix stall because of data_ready" v3, mptcp
   tls selftests can run without failures. Now add them in this set.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1773737371.git.tanggeliang@kylinos.cn/

v9:
 - add a new patch to "add MPTCP SKB offset check in strp queue walk",
   thanks to Gang Yan for the fix.
 - add a new patch to "avoid deadlocks in read_sock path", replacing the
   "in_softirq()" check used in v8.
 - update the selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1773365606.git.tanggeliang@kylinos.cn/

v8:
 - do not hold tls_prot_ops_lock in tls_init(); otherwise, a deadlock
   occurs.
 - change return value of mptcp_stream_is_readable() as 'bool' to fix
   the "expected restricted __poll_t" warning reported by CI.
 - fixed other CI checkpatch warnings regarding excessively long lines.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1768294706.git.tanggeliang@kylinos.cn/

v7:
 - Passing an MPTCP socket to tcp_sock_rate_check_app_limited() causes a
   crash. In v7, an MPTCP version of check_app_limited() is implemented,
   which calls tcp_sock_rate_check_app_limited() for each subflow.
 - Register tls_tcp_ops and tls_mptcp_ops in tls_register() rather than
   in tls_init().
 - Set ctx->ops in tls_init() instead of in do_tls_setsockopt_conf().
 - Keep tls_device.c unchanged. MPTCP TLS_HW mode has not been
   implemented yet, so EOPNOTSUPP is returned in this case.
 - Also add TCP TLS tests in mptcp_join.sh.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1768284047.git.tanggeliang@kylinos.cn/

v6:
 - register each ops as Matt suggested.
 - drop sk_is_msk().
 - add tcp_sock_get_ulp/tcp_sock_set_ulp helpers.
 - set another ULP in sock_test_tcpulp as Matt suggested.
 - add tls tests using multiple subflows in mptcp_join.sh.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1767518836.git.tanggeliang@kylinos.cn/

v5:
 - As suggested by Mat and Matt, this set introduces struct tls_prot_ops
   for TLS.
 - Includes Gang Yan's patches to add MPTCP support to the TLS
   selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1766372799.git.tanggeliang@kylinos.cn/

v4:
 - split "tls: add MPTCP protocol support" into smaller, more focused
   patches.
 - a new mptcp_inq helper has been implemented instead of directly using
   mptcp_inq_hint to fix the issue mentioned in [1].
 - add sk_is_msk helper.
 - the 'expect' parameter will no longer be added to sock_test_tcpulp.
   Instead, SOCK_TEST_TCPULP items causing the tests failure will be
   directly removed.
 - remove the "TCP KTLS" tests, keeping only the MPTCP-related ones.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1765505775.git.tanggeliang@kylinos.cn/

[1]
https://patchwork.kernel.org/project/mptcp/patch/ce74452f4c095a1761ef493b767b4bd9f9c14359.1764333805.git.tanggeliang@kylinos.cn/

v3:
 - mptcp_read_sock() and mptcp_poll() are not exported, as mptcp_sockopt
   test does not use read_sock/poll interfaces. They will be exported
   when new tests are added in the future.
 - call mptcp_inq_hint in tls_device_rx_resync_new_rec(),
   tls_device_core_ctrl_rx_resync() and tls_read_flush_backlog() too.
 - update selftests.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1763800601.git.tanggeliang@kylinos.cn/

v2:
 - fix disconnect.
 - update selftests.

This series adds KTLS support for MPTCP. Since the ULP of msk is not
being used, ULP KTLS can be directly configured onto msk without
affecting its communication.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/480

Gang Yan (1):
  mptcp: update mptcp_check_readable

Geliang Tang (15):
  tls: add per-protocol cache to support mptcp
  tls: introduce struct tls_prot_ops
  tls: add tls_prot_ops pointer to tls_proto
  tls: replace direct protocol calls with tls_proto->ops
  mptcp: implement tls_mptcp_ops
  tls: disable device offload for mptcp sockets
  mptcp: update ulp getsockopt for tls support
  mptcp: enable ulp setsockopt for tls support
  selftests: mptcp: connect: use espintcp for ulp test
  selftests: tls: add mptcp variant for testing
  selftests: tls: increase pollin timeouts for mptcp
  selftests: tls: increase nonblocking data size for mptcp
  selftests: tls: wait close in shutdown_reuse for mptcp
  selftests: tls: add mptcp test cases
  selftests: mptcp: cover mptcp tls tests

 include/linux/tcp.h                              |   1 +
 include/net/mptcp.h                              |   2 +
 include/net/tcp.h                                |   1 +
 include/net/tls.h                                |  36 +++
 include/net/tls_toe.h                            |   3 +-
 net/ipv4/tcp.c                                   |  45 +--
 net/mptcp/protocol.c                             | 118 +++++++-
 net/mptcp/protocol.h                             |   1 +
 net/mptcp/sockopt.c                              |  53 +++-
 net/tls/tls.h                                    |   3 +-
 net/tls/tls_device.c                             |   6 +
 net/tls/tls_main.c                               | 270 +++++++++++++++---
 net/tls/tls_strp.c                               |  33 ++-
 net/tls/tls_sw.c                                 |   7 +-
 net/tls/tls_toe.c                                |   5 +-
 tools/testing/selftests/net/mptcp/.gitignore     |   1 +
 tools/testing/selftests/net/mptcp/Makefile       |   2 +
 tools/testing/selftests/net/mptcp/config         |   5 +
 .../selftests/net/mptcp/mptcp_connect.c          |   4 +-
 .../testing/selftests/net/mptcp/mptcp_tls.sh     |  62 ++++
 tools/testing/selftests/net/mptcp/tls.c          |   1 +
 tools/testing/selftests/net/tls.c                | 174 ++++++++++-
 22 files changed, 736 insertions(+), 97 deletions(-)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_tls.sh
 create mode 120000 tools/testing/selftests/net/mptcp/tls.c

-- 
2.51.0
From: Geliang Tang <tanggeliang@kylinos.cn>

The TLS ULP uses a single global array to cache base protocol
operations. When MPTCP sockets enable TLS, they overwrite this global
cache with mptcp_prot, causing active TCP TLS sockets to use
MPTCP-specific ops. This leads to type confusion and kernel panics.

Fix by replacing the global cache with a per-protocol linked list. Each
protocol (TCP, MPTCP, etc.) now has its own cached operations, stored in
struct tls_proto and referenced from tls_context.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/tls.h     |  16 ++++++
 include/net/tls_toe.h |   3 +-
 net/tls/tls.h         |   3 +-
 net/tls/tls_main.c    | 126 ++++++++++++++++++++++++++++--------------
 net/tls/tls_toe.c     |   5 +-
 5 files changed, 106 insertions(+), 47 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_prot_info {
 	u16 tail_size;
 };
 
+enum {
+	TLSV4,
+	TLSV6,
+	TLS_NUM_PROTS,
+};
+
+struct tls_proto {
+	refcount_t refcnt;
+	struct list_head list;
+	const struct proto *prot;
+	struct proto prots[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
+	struct proto_ops proto_ops[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
+};
+
 struct tls_context {
 	/* read-only cache line */
 	struct tls_prot_info prot_info;
@@ -XXX,XX +XXX,XX @@ struct tls_context {
 	struct proto *sk_proto;
 	struct sock *sk;
 
+	struct tls_proto *proto;
+
 	void (*sk_destruct)(struct sock *sk);
 
 	union tls_crypto_context crypto_send;
diff --git a/include/net/tls_toe.h b/include/net/tls_toe.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls_toe.h
+++ b/include/net/tls_toe.h
@@ -XXX,XX +XXX,XX @@ struct tls_toe_device {
 	struct kref kref;
 };
 
-int tls_toe_bypass(struct sock *sk);
+int tls_toe_bypass(struct sock *sk,
+		   struct tls_proto *proto);
 int tls_toe_hash(struct sock *sk);
 void tls_toe_unhash(struct sock *sk);
 
diff --git a/net/tls/tls.h b/net/tls/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_rec {
 int __net_init tls_proc_init(struct net *net);
 void __net_exit tls_proc_fini(struct net *net);
 
-struct tls_context *tls_ctx_create(struct sock *sk);
+struct tls_context *tls_ctx_create(struct sock *sk,
+				   struct tls_proto *proto);
 void tls_ctx_free(struct sock *sk, struct tls_context *ctx);
 void update_sk_prot(struct sock *sk, struct tls_context *ctx);
 
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_ALIAS_TCP_ULP("tls");
 
-enum {
-	TLSV4,
-	TLSV6,
-	TLS_NUM_PROTS,
-};
-
 #define CHECK_CIPHER_DESC(cipher,ci) \
 	static_assert(cipher ## _IV_SIZE <= TLS_MAX_IV_SIZE); \
 	static_assert(cipher ## _SALT_SIZE <= TLS_MAX_SALT_SIZE); \
@@ -XXX,XX +XXX,XX @@ CHECK_CIPHER_DESC(TLS_CIPHER_SM4_CCM, tls12_crypto_info_sm4_ccm);
 CHECK_CIPHER_DESC(TLS_CIPHER_ARIA_GCM_128, tls12_crypto_info_aria_gcm_128);
 CHECK_CIPHER_DESC(TLS_CIPHER_ARIA_GCM_256, tls12_crypto_info_aria_gcm_256);
 
-static const struct proto *saved_tcpv6_prot;
-static DEFINE_MUTEX(tcpv6_prot_mutex);
-static const struct proto *saved_tcpv4_prot;
-static DEFINE_MUTEX(tcpv4_prot_mutex);
-static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
-static struct proto_ops tls_proto_ops[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
+static LIST_HEAD(tls_proto_list);
+static DEFINE_MUTEX(tls_proto_mutex);
 static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 			 const struct proto *base);
 
+static struct tls_proto *tls_proto_find(const struct proto *prot)
+{
+	struct tls_proto *proto, *ret = NULL;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(proto, &tls_proto_list, list) {
+		if (proto->prot == prot) {
+			if (refcount_inc_not_zero(&proto->refcnt))
+				ret = proto;
+			break;
+		}
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static void tls_proto_cleanup(void)
+{
+	struct tls_proto *prot, *tmp;
+
+	mutex_lock(&tls_proto_mutex);
+	list_for_each_entry_safe(prot, tmp, &tls_proto_list, list) {
+		if (refcount_dec_and_test(&prot->refcnt)) {
+			list_del_rcu(&prot->list);
+			synchronize_rcu();
+			kfree(prot);
+		}
+	}
+	mutex_unlock(&tls_proto_mutex);
+}
+
 void update_sk_prot(struct sock *sk, struct tls_context *ctx)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
+	struct tls_proto *proto = ctx->proto;
+
+	if (!proto)
+		return;
 
 	WRITE_ONCE(sk->sk_prot,
-		   &tls_prots[ip_ver][ctx->tx_conf][ctx->rx_conf]);
+		   &proto->prots[ip_ver][ctx->tx_conf][ctx->rx_conf]);
 	WRITE_ONCE(sk->sk_socket->ops,
-		   &tls_proto_ops[ip_ver][ctx->tx_conf][ctx->rx_conf]);
+		   &proto->proto_ops[ip_ver][ctx->tx_conf][ctx->rx_conf]);
 }
 
 int wait_on_pending_writer(struct sock *sk, long *timeo)
@@ -XXX,XX +XXX,XX @@ void tls_ctx_free(struct sock *sk, struct tls_context *ctx)
 	if (!ctx)
 		return;
 
+	if (ctx->proto) {
+		if (refcount_dec_and_test(&ctx->proto->refcnt)) {
+			list_del_rcu(&ctx->proto->list);
+			synchronize_rcu();
+			kfree(ctx->proto);
+		}
+	}
+
 	memzero_explicit(&ctx->crypto_send, sizeof(ctx->crypto_send));
 	memzero_explicit(&ctx->crypto_recv, sizeof(ctx->crypto_recv));
 	mutex_destroy(&ctx->tx_lock);
@@ -XXX,XX +XXX,XX @@ static int tls_disconnect(struct sock *sk, int flags)
 	return -EOPNOTSUPP;
 }
 
-struct tls_context *tls_ctx_create(struct sock *sk)
+struct tls_context *tls_ctx_create(struct sock *sk,
+				   struct tls_proto *proto)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tls_context *ctx;
@@ -XXX,XX +XXX,XX @@ struct tls_context *tls_ctx_create(struct sock *sk)
 
 	mutex_init(&ctx->tx_lock);
 	ctx->sk_proto = READ_ONCE(sk->sk_prot);
+	ctx->proto = proto;
 	ctx->sk = sk;
 	/* Release semantic of rcu_assign_pointer() ensures that
 	 * ctx->sk_proto is visible before changing sk->sk_prot in
@@ -XXX,XX +XXX,XX @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
 #endif
 }
 
-static void tls_build_proto(struct sock *sk)
+static struct tls_proto *tls_build_proto(struct sock *sk)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
 	struct proto *prot = READ_ONCE(sk->sk_prot);
+	struct tls_proto *proto;
 
-	/* Build IPv6 TLS whenever the address of tcpv6 _prot changes */
-	if (ip_ver == TLSV6 &&
-	    unlikely(prot != smp_load_acquire(&saved_tcpv6_prot))) {
-		mutex_lock(&tcpv6_prot_mutex);
-		if (likely(prot != saved_tcpv6_prot)) {
-			build_protos(tls_prots[TLSV6], prot);
-			build_proto_ops(tls_proto_ops[TLSV6],
-					sk->sk_socket->ops);
-			smp_store_release(&saved_tcpv6_prot, prot);
-		}
-		mutex_unlock(&tcpv6_prot_mutex);
-	}
+	mutex_lock(&tls_proto_mutex);
+	proto = tls_proto_find(prot);
+	if (proto)
+		goto out;
 
-	if (ip_ver == TLSV4 &&
-	    unlikely(prot != smp_load_acquire(&saved_tcpv4_prot))) {
-		mutex_lock(&tcpv4_prot_mutex);
-		if (likely(prot != saved_tcpv4_prot)) {
-			build_protos(tls_prots[TLSV4], prot);
-			build_proto_ops(tls_proto_ops[TLSV4],
-					sk->sk_socket->ops);
-			smp_store_release(&saved_tcpv4_prot, prot);
-		}
-		mutex_unlock(&tcpv4_prot_mutex);
-	}
+	proto = kzalloc(sizeof(*proto), GFP_KERNEL);
+	if (!proto)
+		goto out;
+
+	proto->prot = prot;
+	refcount_set(&proto->refcnt, 2);
+	build_protos(proto->prots[ip_ver], prot);
+	build_proto_ops(proto->proto_ops[ip_ver],
+			sk->sk_socket->ops);
+	list_add_rcu(&proto->list, &tls_proto_list);
+
+out:
+	mutex_unlock(&tls_proto_mutex);
+	return proto;
 }
 
 static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
@@ -XXX,XX +XXX,XX @@ static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 
 static int tls_init(struct sock *sk)
 {
+	struct tls_proto *proto;
 	struct tls_context *ctx;
 	int rc = 0;
 
-	tls_build_proto(sk);
+	proto = tls_build_proto(sk);
+	if (!proto)
+		return -ENOMEM;
 
 #ifdef CONFIG_TLS_TOE
-	if (tls_toe_bypass(sk))
+	if (tls_toe_bypass(sk, proto)) {
+		refcount_dec(&proto->refcnt);
 		return 0;
+	}
 #endif
 
 	/* The TLS ulp is currently supported only for TCP sockets
@@ -XXX,XX +XXX,XX @@ static int tls_init(struct sock *sk)
 	 * to modify the accept implementation to clone rather then
 	 * share the ulp context.
 	 */
-	if (sk->sk_state != TCP_ESTABLISHED)
+	if (sk->sk_state != TCP_ESTABLISHED) {
+		refcount_dec(&proto->refcnt);
 		return -ENOTCONN;
+	}
 
 	/* allocate tls context */
 	write_lock_bh(&sk->sk_callback_lock);
-	ctx = tls_ctx_create(sk);
+	ctx = tls_ctx_create(sk, proto);
 	if (!ctx) {
+		refcount_dec(&proto->refcnt);
 		rc = -ENOMEM;
 		goto out;
 	}
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 
 static void __exit tls_unregister(void)
 {
+	tls_proto_cleanup();
 	tcp_unregister_ulp(&tcp_tls_ulp_ops);
 	tls_strp_dev_exit();
 	tls_device_cleanup();
diff --git a/net/tls/tls_toe.c b/net/tls/tls_toe.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_toe.c
+++ b/net/tls/tls_toe.c
@@ -XXX,XX +XXX,XX @@ static void tls_toe_sk_destruct(struct sock *sk)
 	tls_ctx_free(sk, ctx);
 }
 
-int tls_toe_bypass(struct sock *sk)
+int tls_toe_bypass(struct sock *sk,
+		   struct tls_proto *proto)
 {
 	struct tls_toe_device *dev;
 	struct tls_context *ctx;
@@ -XXX,XX +XXX,XX @@ int tls_toe_bypass(struct sock *sk)
 	spin_lock_bh(&device_spinlock);
 	list_for_each_entry(dev, &device_list, dev_list) {
 		if (dev->feature && dev->feature(dev)) {
-			ctx = tls_ctx_create(sk);
+			ctx = tls_ctx_create(sk, proto);
 			if (!ctx)
 				goto out;
 
-- 
2.51.0
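The idea behind the per-protocol cache is that entries are keyed by the base struct proto pointer, so TCP and MPTCP each get their own entry instead of overwriting a shared one. A deliberately simplified userspace sketch of that lookup-or-create pattern follows. It is an illustration only: struct proto_cache, cache_get() and cache_put() are hypothetical names, the model is single-threaded (no mutex, no RCU), and, unlike the patch, the list itself holds no reference, so an entry starts at a count of 1 rather than 2.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of struct tls_proto: one entry per base proto. */
struct proto_cache {
	const void *key;		/* stands in for const struct proto *prot */
	unsigned int refcnt;
	struct proto_cache *next;
};

static struct proto_cache *cache_head;

/* Find the entry for key and take a reference, or create a new entry. */
static struct proto_cache *cache_get(const void *key)
{
	struct proto_cache *e;

	for (e = cache_head; e; e = e->next) {
		if (e->key == key) {
			e->refcnt++;
			return e;
		}
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->key = key;
	e->refcnt = 1;
	e->next = cache_head;
	cache_head = e;
	return e;
}

/* Drop a reference; unlink and free the entry when it was the last one. */
static void cache_put(struct proto_cache *e)
{
	struct proto_cache **pp;

	assert(e->refcnt > 0);
	if (--e->refcnt)
		return;
	for (pp = &cache_head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	free(e);
}
```

Two sockets sharing the same base proto get the same entry back, while a socket with a different base proto gets a separate one, which is the property the commit message relies on to avoid the type confusion.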
From: Geliang Tang <tanggeliang@kylinos.cn>

To let TLS run over MPTCP as well as TCP, introduce struct tls_prot_ops,
which encapsulates the transport-specific helpers that TLS currently
calls directly on TCP.

Add helpers to register, validate and look up a tls_prot_ops on the
global list tls_prot_ops_list, and register the TCP-specific instance,
tls_tcp_ops, in tls_register().

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/tls.h  |  19 +++++++++
 net/tls/tls_main.c | 102 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 121 insertions(+)

diff --git a/include/net/tls.h b/include/net/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_prot_info {
 	u16 tail_size;
 };
 
+struct tls_prot_ops {
+	int protocol;
+	struct module *owner;
+	struct list_head list;
+
+	int (*inq)(struct sock *sk);
+	int (*sendmsg_locked)(struct sock *sk, struct msghdr *msg, size_t size);
+	struct sk_buff *(*recv_skb)(struct sock *sk, u32 *off);
+	bool (*lock_is_held)(struct sock *sk);
+	int (*read_sock)(struct sock *sk, read_descriptor_t *desc,
+			 sk_read_actor_t recv_actor);
+	void (*read_done)(struct sock *sk, size_t len);
+	u32 (*get_skb_seq)(struct sk_buff *skb);
+	__poll_t (*poll)(struct file *file, struct socket *sock,
+			 struct poll_table_struct *wait);
+	bool (*epollin_ready)(const struct sock *sk);
+	void (*check_app_limited)(struct sock *sk);
+};
+
 enum {
 	TLSV4,
 	TLSV6,
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ CHECK_CIPHER_DESC(TLS_CIPHER_ARIA_GCM_256, tls12_crypto_info_aria_gcm_256);
 static LIST_HEAD(tls_proto_list);
 static DEFINE_MUTEX(tls_proto_mutex);
+static LIST_HEAD(tls_prot_ops_list);
+static DEFINE_SPINLOCK(tls_prot_ops_lock);
 
 static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
 			 const struct proto *base);
@@ -XXX,XX +XXX,XX @@ static void tls_proto_cleanup(void)
 	mutex_unlock(&tls_proto_mutex);
 }
 
+static struct tls_prot_ops *tls_prot_ops_find(int protocol)
+{
+	struct tls_prot_ops *ops;
+
+	list_for_each_entry_rcu(ops, &tls_prot_ops_list, list) {
+		if (ops->protocol == protocol)
+			return ops;
+	}
+
+	return NULL;
+}
+
 void update_sk_prot(struct sock *sk, struct tls_context *ctx)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
@@ -XXX,XX +XXX,XX @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 	.get_info_size = tls_get_info_size,
 };
 
+static int tls_validate_prot_ops(const struct tls_prot_ops *ops)
+{
+	if (!ops->inq || !ops->sendmsg_locked ||
+	    !ops->recv_skb || !ops->lock_is_held ||
+	    !ops->read_sock || !ops->read_done ||
+	    !ops->get_skb_seq ||
+	    !ops->poll || !ops->epollin_ready ||
+	    !ops->check_app_limited) {
+		pr_err("%d does not implement required ops\n", ops->protocol);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int tls_register_prot_ops(struct tls_prot_ops *ops)
+{
+	int ret;
+
+	ret = tls_validate_prot_ops(ops);
+	if (ret)
+		return ret;
+
+	spin_lock(&tls_prot_ops_lock);
+	rcu_read_lock();
+	if (tls_prot_ops_find(ops->protocol)) {
+		rcu_read_unlock();
+		spin_unlock(&tls_prot_ops_lock);
+		return -EEXIST;
+	}
+	rcu_read_unlock();
+	list_add_tail_rcu(&ops->list, &tls_prot_ops_list);
+	spin_unlock(&tls_prot_ops_lock);
+
+	pr_debug("tls_prot_ops %d registered\n", ops->protocol);
+	return 0;
+}
+
+static void tls_unregister_prot_ops(struct tls_prot_ops *ops)
+{
+	spin_lock(&tls_prot_ops_lock);
+	list_del_rcu(&ops->list);
+	synchronize_rcu();
+	spin_unlock(&tls_prot_ops_lock);
+}
+
+static struct sk_buff *tls_tcp_recv_skb(struct sock *sk, u32 *off)
+{
+	return tcp_recv_skb(sk, tcp_sk(sk)->copied_seq, off);
+}
+
+static bool tls_tcp_lock_is_held(struct sock *sk)
+{
+	return sock_owned_by_user_nocheck(sk);
+}
+
+static u32 tls_tcp_get_skb_seq(struct sk_buff *skb)
+{
+	return TCP_SKB_CB(skb)->seq;
+}
+
+static bool tls_tcp_epollin_ready(const struct sock *sk)
+{
+	return tcp_epollin_ready(sk, INT_MAX);
+}
+
+static struct tls_prot_ops tls_tcp_ops = {
+	.protocol = IPPROTO_TCP,
+	.owner = THIS_MODULE,
+	.inq = tcp_inq,
+	.sendmsg_locked = tcp_sendmsg_locked,
+	.recv_skb = tls_tcp_recv_skb,
+	.lock_is_held = tls_tcp_lock_is_held,
+	.read_sock = tcp_read_sock,
+	.read_done = tcp_read_done,
+	.get_skb_seq = tls_tcp_get_skb_seq,
+	.poll = tcp_poll,
+	.epollin_ready = tls_tcp_epollin_ready,
+	.check_app_limited = tcp_rate_check_app_limited,
+};
+
 static int __init tls_register(void)
 {
 	int err;
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 	if (err)
 		goto err_strp;
 
+	err = tls_register_prot_ops(&tls_tcp_ops);
+	if (err)
+		goto err_dev;
+
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
 	return 0;
+err_dev:
+	tls_device_cleanup();
 err_strp:
 	tls_strp_dev_exit();
 err_pernet:
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 static void __exit tls_unregister(void)
 {
 	tls_proto_cleanup();
+	tls_unregister_prot_ops(&tls_tcp_ops);
 	tcp_unregister_ulp(&tcp_tls_ulp_ops);
 	tls_strp_dev_exit();
 	tls_device_cleanup();
-- 
2.51.0
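The register/validate/find pattern described in the patch above can be tried out in userspace. The following is only a minimal sketch under simplifying assumptions: struct prot_ops, ops_register(), ops_find(), ops_unregister() and the demo callbacks are hypothetical names, the table has two hooks instead of ten, and a plain singly linked list without locking stands in for the RCU-protected kernel list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct tls_prot_ops. */
struct prot_ops {
	int protocol;
	struct prot_ops *next;
	int (*inq)(int sk);
	unsigned int (*get_seq)(const void *buf);
};

static struct prot_ops *ops_list;

/* Look up the ops table registered for a protocol number, or NULL. */
static struct prot_ops *ops_find(int protocol)
{
	struct prot_ops *ops;

	for (ops = ops_list; ops; ops = ops->next)
		if (ops->protocol == protocol)
			return ops;
	return NULL;
}

/* Reject incomplete or duplicate tables up front, so that callers can
 * invoke every hook without testing it for NULL first. */
static int ops_register(struct prot_ops *ops)
{
	assert(ops != NULL);
	if (!ops->inq || !ops->get_seq)
		return -EINVAL;
	if (ops_find(ops->protocol))
		return -EEXIST;
	ops->next = ops_list;
	ops_list = ops;
	return 0;
}

/* Unlink a table; a table that was never registered is left untouched. */
static void ops_unregister(struct prot_ops *ops)
{
	struct prot_ops **pp;

	for (pp = &ops_list; *pp; pp = &(*pp)->next) {
		if (*pp == ops) {
			*pp = ops->next;
			ops->next = NULL;
			return;
		}
	}
}

/* Trivial demo hooks, only here so that a complete table can be built. */
static int demo_inq(int sk)
{
	return sk;
}

static unsigned int demo_get_seq(const void *buf)
{
	return *(const unsigned int *)buf;
}
```

Validating once at registration time is what allows the later call sites to dereference the hooks unconditionally.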
From: Geliang Tang <tanggeliang@kylinos.cn>

Add a pointer to struct tls_prot_ops, named 'ops', to struct tls_proto.
In tls_build_proto(), proto->ops is looked up by socket protocol, so it
points to either 'tls_tcp_ops' or 'tls_mptcp_ops'.

Also fix a module reference counting bug: each socket release called
module_put() without a matching get when an existing tls_proto was
reused. tls_init() now takes a reference on ops->owner for every socket.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/tls.h  |  1 +
 net/tls/tls_main.c | 25 ++++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -XXX,XX +XXX,XX @@ struct tls_proto {
 	refcount_t refcnt;
 	struct list_head list;
 	const struct proto *prot;
+	const struct tls_prot_ops *ops;
 	struct proto prots[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
 	struct proto_ops proto_ops[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
 };
 
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ static void tls_proto_cleanup(void)
 		if (refcount_dec_and_test(&prot->refcnt)) {
 			list_del_rcu(&prot->list);
 			synchronize_rcu();
+			module_put(prot->ops->owner);
 			kfree(prot);
 		}
 	}
@@ -XXX,XX +XXX,XX @@ void tls_ctx_free(struct sock *sk, struct tls_context *ctx)
 		return;
 
 	if (ctx->proto) {
+		module_put(ctx->proto->ops->owner);
 		if (refcount_dec_and_test(&ctx->proto->refcnt)) {
 			list_del_rcu(&ctx->proto->list);
 			synchronize_rcu();
+			module_put(ctx->proto->ops->owner);
 			kfree(ctx->proto);
 		}
 	}
@@ -XXX,XX +XXX,XX @@ static struct tls_proto *tls_build_proto(struct sock *sk)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
 	struct proto *prot = READ_ONCE(sk->sk_prot);
+	struct tls_prot_ops *ops;
 	struct tls_proto *proto;
 
 	mutex_lock(&tls_proto_mutex);
@@ -XXX,XX +XXX,XX @@ static struct tls_proto *tls_build_proto(struct sock *sk)
 	if (proto)
 		goto out;
 
+	rcu_read_lock();
+	ops = tls_prot_ops_find(sk->sk_protocol);
+	if (!ops || !try_module_get(ops->owner)) {
+		rcu_read_unlock();
+		goto out;
+	}
+	rcu_read_unlock();
+
 	proto = kzalloc(sizeof(*proto), GFP_KERNEL);
-	if (!proto)
+	if (!proto) {
+		module_put(ops->owner);
 		goto out;
+	}
 
 	proto->prot = prot;
+	proto->ops = ops;
 	refcount_set(&proto->refcnt, 2);
 	build_protos(proto->prots[ip_ver], prot);
 	build_proto_ops(proto->proto_ops[ip_ver],
@@ -XXX,XX +XXX,XX @@ static int tls_init(struct sock *sk)
 	if (!proto)
 		return -ENOMEM;
 
+	if (!try_module_get(proto->ops->owner)) {
+		refcount_dec(&proto->refcnt);
+		return -ENOENT;
+	}
+
 #ifdef CONFIG_TLS_TOE
 	if (tls_toe_bypass(sk, proto)) {
 		refcount_dec(&proto->refcnt);
+		module_put(proto->ops->owner);
 		return 0;
 	}
 #endif
@@ -XXX,XX +XXX,XX @@ static int tls_init(struct sock *sk)
 	 */
 	if (sk->sk_state != TCP_ESTABLISHED) {
 		refcount_dec(&proto->refcnt);
+		module_put(proto->ops->owner);
 		return -ENOTCONN;
 	}
 
@@ -XXX,XX +XXX,XX @@ static int tls_init(struct sock *sk)
 	ctx = tls_ctx_create(sk, proto);
 	if (!ctx) {
 		refcount_dec(&proto->refcnt);
+		module_put(proto->ops->owner);
 		rc = -ENOMEM;
 		goto out;
 	}
-- 
2.51.0
From: Geliang Tang <tanggeliang@kylinos.cn>

The places that used to call TCP-specific helpers directly now invoke
them indirectly through the 'ops' pointer in tls_proto, which makes the
TLS code protocol-agnostic.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/tls/tls_main.c | 10 ++++++----
 net/tls/tls_strp.c | 33 ++++++++++++++++++++++-----------
 net/tls/tls_sw.c   |  7 +++++--
 3 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ int tls_push_sg(struct sock *sk,
 	ctx->splicing_pages = true;
 	while (1) {
 		/* is sending application-limited? */
-		tcp_rate_check_app_limited(sk);
+		ctx->proto->ops->check_app_limited(sk);
 		p = sg_page(sg);
 retry:
 		bvec_set_page(&bvec, p, size, offset);
 		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-		ret = tcp_sendmsg_locked(sk, &msg, size);
+		ret = ctx->proto->ops->sendmsg_locked(sk, &msg, size);
 
 		if (ret != size) {
 			if (ret > 0) {
@@ -XXX,XX +XXX,XX @@ static __poll_t tls_sk_poll(struct file *file, struct socket *sock,
 	u8 shutdown;
 	int state;
 
-	mask = tcp_poll(file, sock, wait);
+	tls_ctx = tls_get_ctx(sk);
+	if (!tls_ctx || !tls_ctx->proto || !tls_ctx->proto->ops)
+		return 0;
+	mask = tls_ctx->proto->ops->poll(file, sock, wait);
 
 	state = inet_sk_state_load(sk);
 	shutdown = READ_ONCE(sk->sk_shutdown);
 	if (unlikely(state != TCP_ESTABLISHED || shutdown & RCV_SHUTDOWN))
 		return mask;
 
-	tls_ctx = tls_get_ctx(sk);
 	ctx = tls_sw_ctx_rx(tls_ctx);
 	psock = sk_psock_get(sk);
 
diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_strp.c
+++ b/net/tls/tls_strp.c
@@ -XXX,XX +XXX,XX @@ struct sk_buff *tls_strp_msg_detach(struct tls_sw_context_rx *ctx)
 int tls_strp_msg_cow(struct tls_sw_context_rx *ctx)
 {
 	struct tls_strparser *strp = &ctx->strp;
+	struct tls_context *tls_ctx;
 	struct sk_buff *skb;
 
 	if (strp->copy_mode)
@@ -XXX,XX +XXX,XX @@ int tls_strp_msg_cow(struct tls_sw_context_rx *ctx)
 	tls_strp_anchor_free(strp);
 	strp->anchor = skb;
 
-	tcp_read_done(strp->sk, strp->stm.full_len);
+	tls_ctx = tls_get_ctx(strp->sk);
+	tls_ctx->proto->ops->read_done(strp->sk, strp->stm.full_len);
 	strp->copy_mode = 1;
 
 	return 0;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_copyin(read_descriptor_t *desc, struct sk_buff *in_skb,
 
 static int tls_strp_read_copyin(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	read_descriptor_t desc;
 
 	desc.arg.data = strp;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copyin(struct tls_strparser *strp)
 	desc.count = 1; /* give more than one skb per call */
 
 	/* sk should be locked here, so okay to do read_sock */
-	tcp_read_sock(strp->sk, &desc, tls_strp_copyin);
+	ctx->proto->ops->read_sock(strp->sk, &desc, tls_strp_copyin);
 
 	return desc.error;
 }
 
 static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct skb_shared_info *shinfo;
 	struct page *page;
 	int need_spc, len;
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 	 * to read the data out. Otherwise the connection will stall.
 	 * Without pressure threshold of INT_MAX will never be ready.
 	 */
-	if (likely(qshort && !tcp_epollin_ready(strp->sk, INT_MAX)))
+	if (likely(qshort && !ctx->proto->ops->epollin_ready(strp->sk)))
 		return 0;
 
 	shinfo = skb_shinfo(strp->anchor);
@@ -XXX,XX +XXX,XX @@ static int tls_strp_read_copy(struct tls_strparser *strp, bool qshort)
 static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 {
 	unsigned int len = strp->stm.offset + strp->stm.full_len;
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct sk_buff *first, *skb;
 	u32 seq;
 
 	first = skb_shinfo(strp->anchor)->frag_list;
 	skb = first;
-	seq = TCP_SKB_CB(first)->seq;
+	seq = ctx->proto->ops->get_skb_seq(first);
 
 	/* Make sure there's no duplicate data in the queue,
 	 * and the decrypted status matches.
@@ -XXX,XX +XXX,XX @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 		len -= skb->len;
 		skb = skb->next;
 
-		if (TCP_SKB_CB(skb)->seq != seq)
+		if (ctx->proto->ops->get_skb_seq(skb) != seq)
 			return false;
 		if (skb_cmp_decrypted(first, skb))
 			return false;
@@ -XXX,XX +XXX,XX @@ static bool tls_strp_check_queue_ok(struct tls_strparser *strp)
 
 static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 {
-	struct tcp_sock *tp = tcp_sk(strp->sk);
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct sk_buff *first;
 	u32 offset;
 
-	first = tcp_recv_skb(strp->sk, tp->copied_seq, &offset);
+	first = ctx->proto->ops->recv_skb(strp->sk, &offset);
 	if (WARN_ON_ONCE(!first))
 		return;
 
@@ -XXX,XX +XXX,XX @@ static void tls_strp_load_anchor_with_queue(struct tls_strparser *strp, int len)
 
 bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	struct strp_msg *rxm;
 	struct tls_msg *tlm;
 
@@ -XXX,XX +XXX,XX @@ bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 	DEBUG_NET_WARN_ON_ONCE(!strp->stm.full_len);
 
 	if (!strp->copy_mode && force_refresh) {
-		if (unlikely(tcp_inq(strp->sk) < strp->stm.full_len)) {
+		if (unlikely(ctx->proto->ops->inq(strp->sk) < strp->stm.full_len)) {
 			WRITE_ONCE(strp->msg_ready, 0);
 			memset(&strp->stm, 0, sizeof(strp->stm));
 			return false;
@@ -XXX,XX +XXX,XX @@ bool tls_strp_msg_load(struct tls_strparser *strp, bool force_refresh)
 /* Called with lock held on lower socket */
 static int tls_strp_read_sock(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
 	int sz, inq;
 
-	inq = tcp_inq(strp->sk);
+	inq = ctx->proto->ops->inq(strp->sk);
 	if (inq < 1)
 		return 0;
 
@@ -XXX,XX +XXX,XX @@ void tls_strp_check_rcv(struct tls_strparser *strp)
 /* Lower sock lock held */
 void tls_strp_data_ready(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
+
 	/* This check is needed to synchronize with do_tls_strp_work.
 	 * do_tls_strp_work acquires a process lock (lock_sock) whereas
 	 * the lock held here is bh_lock_sock. The two locks can be
@@ -XXX,XX +XXX,XX @@ void tls_strp_data_ready(struct tls_strparser *strp)
 	 * allows a thread in BH context to safely check if the process
 	 * lock is held. In this case, if the lock is held, queue work.
 	 */
-	if (sock_owned_by_user_nocheck(strp->sk)) {
+	if (ctx->proto->ops->lock_is_held(strp->sk)) {
 		queue_work(tls_strp_wq, &strp->work);
 		return;
 	}
@@ -XXX,XX +XXX,XX @@ static void tls_strp_work(struct work_struct *w)
 
 void tls_strp_msg_done(struct tls_strparser *strp)
 {
+	struct tls_context *ctx = tls_get_ctx(strp->sk);
+
 	WARN_ON(!strp->stm.full_len);
 
 	if (likely(!strp->copy_mode))
-		tcp_read_done(strp->sk, strp->stm.full_len);
+		ctx->proto->ops->read_done(strp->sk, strp->stm.full_len);
 	else
 		tls_strp_flush_anchor_copy(strp);
 
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -XXX,XX +XXX,XX @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
 		       size_t len_left, size_t decrypted, ssize_t done,
 		       size_t *flushed_at)
 {
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	size_t max_rec;
 
 	if (len_left <= decrypted)
 		return false;
 
 	max_rec = prot->overhead_size - prot->tail_size + TLS_MAX_PAYLOAD_SIZE;
-	if (done - *flushed_at < SZ_128K && tcp_inq(sk) > max_rec)
+	if (done - *flushed_at < SZ_128K && tls_ctx->proto->ops->inq(sk) > max_rec)
 		return false;
 
 	*flushed_at = done;
@@ -XXX,XX +XXX,XX @@ int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb)
 	size_t cipher_overhead;
 	size_t data_len = 0;
 	int ret;
+	u32 seq;
 
 	/* Verify that we have a full TLS header, or wait for more data */
 	if (strp->stm.offset + prot->prepend_size > skb->len)
@@ -XXX,XX +XXX,XX @@ int tls_rx_msg_size(struct tls_strparser *strp, struct sk_buff *skb)
 		goto read_failure;
 	}
 
+	seq = tls_ctx->proto->ops->get_skb_seq(skb);
 	tls_device_rx_resync_new_rec(strp->sk, data_len + TLS_HEADER_SIZE,
-				     TCP_SKB_CB(skb)->seq + strp->stm.offset);
+				     seq + strp->stm.offset);
 	return data_len + TLS_HEADER_SIZE;
 
 read_failure:
-- 
2.51.0
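The queue walk in tls_strp_check_queue_ok() relies only on a per-buffer sequence number, which is why it can be routed through a get_skb_seq hook. Below is a rough userspace sketch of that walk, assuming a simplified model: struct buf, get_seq_fn, plain_seq() and queue_is_contiguous() are hypothetical names, an array replaces the skb frag list, and the decrypted-status comparison is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: a start sequence number and a length. */
struct buf {
	uint32_t seq;
	uint32_t len;
};

/* The transport-specific part: how to read a buffer's sequence number. */
typedef uint32_t (*get_seq_fn)(const struct buf *b);

static uint32_t plain_seq(const struct buf *b)
{
	return b->seq;
}

/* Return true if bufs[0..n) cover `need` bytes with no gap or overlap.
 * Each buffer after the first must start where the previous one ended. */
static bool queue_is_contiguous(const struct buf *bufs, size_t n,
				uint32_t need, get_seq_fn get_seq)
{
	size_t i = 0;
	uint32_t seq;

	assert(n > 0);
	seq = get_seq(&bufs[0]);
	while (bufs[i].len < need) {
		seq += bufs[i].len;
		need -= bufs[i].len;
		if (++i == n)
			return false;	/* ran out of buffers */
		if (get_seq(&bufs[i]) != seq)
			return false;	/* gap or duplicate data */
	}
	return true;
}
```

Swapping plain_seq() for another accessor changes the transport without touching the walk itself, which is the point of the indirection.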
From: Gang Yan <yangang@kylinos.cn>

Align mptcp_check_readable() with TCP and rename it to
mptcp_stream_is_readable(). The helper now returns a bool and also
consults sk_is_readable(). This is needed for KTLS: once TLS replaces
'prot', tls_sw_sock_is_readable() is expected to be called through
prot->sock_is_readable().

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/mptcp/protocol.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@ void __mptcp_unaccepted_force_close(struct sock *sk)
 	__mptcp_destroy_sock(sk);
 }
 
-static __poll_t mptcp_check_readable(struct sock *sk)
+static bool mptcp_stream_is_readable(struct sock *sk)
 {
-	return mptcp_epollin_ready(sk) ? EPOLLIN | EPOLLRDNORM : 0;
+	if (mptcp_epollin_ready(sk))
+		return true;
+	return sk_is_readable(sk);
 }
 
 static void mptcp_check_listen_stop(struct sock *sk)
@@ -XXX,XX +XXX,XX @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
 		mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
 
 	if (state != TCP_SYN_SENT && state != TCP_SYN_RECV) {
-		mask |= mptcp_check_readable(sk);
+		if (mptcp_stream_is_readable(sk))
+			mask |= EPOLLIN | EPOLLRDNORM;
 		if (shutdown & SEND_SHUTDOWN)
 			mask |= EPOLLOUT | EPOLLWRNORM;
 		else
-- 
2.51.0
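The effect of the change above can be sketched outside the kernel: the predicate becomes a plain boolean with a second source of readiness, and the caller translates it into poll bits. This is only an illustration; struct rx_state, stream_is_readable() and poll_mask() are hypothetical names, and the two flags stand in for what mptcp_epollin_ready() and sk_is_readable() would report.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical socket state: two independent reasons a read may succeed. */
struct rx_state {
	bool data_queued;	/* models mptcp_epollin_ready() */
	bool ulp_readable;	/* models sk_is_readable(), e.g. a TLS record */
};

/* Boolean predicate: queued data first, then the upper-layer hook. */
static bool stream_is_readable(const struct rx_state *s)
{
	if (s->data_queued)
		return true;
	return s->ulp_readable;
}

/* The caller, not the predicate, decides which poll bits to report. */
static uint32_t poll_mask(const struct rx_state *s)
{
	uint32_t mask = 0;

	if (stream_is_readable(s))
		mask |= EPOLLIN | EPOLLRDNORM;
	return mask;
}
```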
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch implements the MPTCP-specific struct tls_prot_ops, named
'tls_mptcp_ops'.

Note the slight difference between mptcp_inq() and mptcp_inq_hint():
mptcp_inq() returns 0, not 1, when the socket is closed or shut down.
Returning 1 would break the condition "inq < 1" in tls_strp_read_sock().

Passing an MPTCP socket to tcp_rate_check_app_limited() can trigger a
crash. To avoid this, split out tcp_sock_rate_check_app_limited(), which
takes a struct tcp_sock, and implement an MPTCP version of
check_app_limited() that calls it for each subflow.

The MPTCP lock_is_held implementation checks
sock_owned_by_user_nocheck(sk), as TCP does, and additionally checks
whether the MPTCP data lock is held.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/mptcp.h  |   2 +
 include/net/tcp.h    |   1 +
 net/ipv4/tcp.c       |   9 +++-
 net/mptcp/protocol.c | 108 ++++++++++++++++++++++++++++++++++++++++---
 net/mptcp/protocol.h |   1 +
 net/tls/tls_main.c   |  13 ++++++
 6 files changed, 126 insertions(+), 8 deletions(-)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -XXX,XX +XXX,XX @@ struct mptcp_pm_ops {
 	void (*release)(struct mptcp_sock *msk);
 } ____cacheline_aligned_in_smp;
 
+extern struct tls_prot_ops tls_mptcp_ops;
+
 #ifdef CONFIG_MPTCP
 void mptcp_init(void);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -XXX,XX +XXX,XX @@ static inline int tcp_bound_to_half_wnd(struct tcp_sock *tp, int pktsize)
 
 /* tcp.c */
 void tcp_get_info(struct sock *, struct tcp_info *);
+void tcp_sock_rate_check_app_limited(struct tcp_sock *tp);
 void tcp_rate_check_app_limited(struct sock *sk);
 
 /* Read 'sendfile()'-style from a TCP socket */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -XXX,XX +XXX,XX @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 }
 
 /* If a gap is detected between sends, mark the socket application-limited. */
-void tcp_rate_check_app_limited(struct sock *sk)
+void tcp_sock_rate_check_app_limited(struct tcp_sock *tp)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
+	struct sock *sk = (struct sock *)tp;
 
 	if (/* We have less than one packet to send. */
 	    tp->write_seq - tp->snd_nxt < tp->mss_cache &&
@@ -XXX,XX +XXX,XX @@ void tcp_rate_check_app_limited(struct sock *sk)
 		tp->app_limited =
 			(tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
 }
+
+void tcp_rate_check_app_limited(struct sock *sk)
+{
+	tcp_sock_rate_check_app_limited(tcp_sk(sk));
+}
 EXPORT_SYMBOL_GPL(tcp_rate_check_app_limited);
 
 int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@
 #include <net/mptcp.h>
 #include <net/hotdata.h>
 #include <net/xfrm.h>
+#include <net/tls.h>
 #include <asm/ioctls.h>
 #include "protocol.h"
 #include "mib.h"
 
-static unsigned int mptcp_inq_hint(const struct sock *sk);
+static unsigned int mptcp_inq_hint(struct sock *sk);
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/mptcp.h>
@@ -XXX,XX +XXX,XX @@ static void mptcp_rps_record_subflows(const struct mptcp_sock *msk)
 	}
 }
 
-static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int mptcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct page_frag *pfrag;
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
 			  MSG_FASTOPEN | MSG_EOR;
 
-	lock_sock(sk);
-
 	mptcp_rps_record_subflows(msk);
 
 	if (unlikely(inet_test_bit(DEFER_CONNECT, sk) ||
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	}
 
 out:
-	release_sock(sk);
 	return copied;
 
 do_error:
@@ -XXX,XX +XXX,XX @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	goto out;
 }
 
+static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+{
+	int ret;
+
+	lock_sock(sk);
+	ret = mptcp_sendmsg_locked(sk, msg, len);
+	release_sock(sk);
+
+	return ret;
+}
+
 static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied);
 
 static void mptcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb)
@@ -XXX,XX +XXX,XX @@ static bool mptcp_move_skbs(struct sock *sk)
 	return enqueued;
 }
 
-static unsigned int mptcp_inq_hint(const struct sock *sk)
+static int mptcp_inq(struct sock *sk)
 {
 	const struct mptcp_sock *msk = mptcp_sk(sk);
 	const struct sk_buff *skb;
@@ -XXX,XX +XXX,XX @@ static unsigned int mptcp_inq_hint(const struct sock *sk)
 		return (unsigned int)hint_val;
 	}
 
+	return 0;
+}
+
+static unsigned int mptcp_inq_hint(struct sock *sk)
+{
+	unsigned int inq = mptcp_inq(sk);
+
+	if (inq)
+		return inq;
+
 	if (sk->sk_state == TCP_CLOSE || (sk->sk_shutdown & RCV_SHUTDOWN))
 		return 1;
 
@@ -XXX,XX +XXX,XX @@ int __init mptcp_proto_v6_init(void)
 	return err;
 }
 #endif
+
+static bool mptcp_lock_is_held(struct sock *sk)
+{
+	return sock_owned_by_user_nocheck(sk) ||
+	       mptcp_data_is_locked(sk);
+}
+
+static void mptcp_read_done(struct sock *sk, size_t len)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sk_buff *skb;
+	size_t left;
+	u32 offset;
+
+	msk_owned_by_me(msk);
+
+	if (sk->sk_state == TCP_LISTEN)
+		return;
+
+	left = len;
+	while (left && (skb = mptcp_recv_skb(sk, &offset)) != NULL) {
+		int used;
+
+		used = min_t(size_t, skb->len - offset, left);
+		msk->bytes_consumed += used;
+		MPTCP_SKB_CB(skb)->offset += used;
+		MPTCP_SKB_CB(skb)->map_seq += used;
+		left -= used;
+
+		if (skb->len > offset + used)
+			break;
+
+		mptcp_eat_recv_skb(sk, skb);
+	}
+
+	mptcp_rcv_space_adjust(msk, len - left);
+
+	/* Clean up data we have read: This will do ACK frames. */
+	if (left != len)
+		mptcp_cleanup_rbuf(msk, len - left);
+}
+
+static u32 mptcp_get_skb_seq(struct sk_buff *skb)
+{
+	return MPTCP_SKB_CB(skb)->map_seq - MPTCP_SKB_CB(skb)->offset;
+}
+
+static void mptcp_check_app_limited(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
+
+	mptcp_for_each_subflow(msk, subflow) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+		bool slow;
+
+		slow = lock_sock_fast(ssk);
+		tcp_sock_rate_check_app_limited(tcp_sk(ssk));
+		unlock_sock_fast(ssk, slow);
+	}
+}
+
+struct tls_prot_ops tls_mptcp_ops = {
+	.protocol = IPPROTO_MPTCP,
+	.owner = THIS_MODULE,
+	.inq = mptcp_inq,
+	.sendmsg_locked = mptcp_sendmsg_locked,
+	.recv_skb = mptcp_recv_skb,
+	.lock_is_held = mptcp_lock_is_held,
+	.read_sock = mptcp_read_sock,
+	.read_done = mptcp_read_done,
+	.get_skb_seq = mptcp_get_skb_seq,
+	.poll = mptcp_poll,
+	.epollin_ready = mptcp_epollin_ready,
+	.check_app_limited = mptcp_check_app_limited,
+};
+EXPORT_SYMBOL(tls_mptcp_ops);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -XXX,XX +XXX,XX @@ struct mptcp_sock {
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
 #define mptcp_data_unlock(sk) spin_unlock_bh(&(sk)->sk_lock.slock)
+#define mptcp_data_is_locked(sk) spin_is_locked(&(sk)->sk_lock.slock)
 
 #define mptcp_for_each_subflow(__msk, __subflow)			\
 	list_for_each_entry(__subflow, &((__msk)->conn_list), node)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 	if (err)
 		goto err_dev;
 
+#ifdef CONFIG_MPTCP
+	err = tls_register_prot_ops(&tls_mptcp_ops);
+	if (err)
+		goto err_prot_ops;
+#endif
+
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
 	return 0;
+#ifdef CONFIG_MPTCP
+err_prot_ops:
+	tls_unregister_prot_ops(&tls_tcp_ops);
+#endif
 err_dev:
 	tls_device_cleanup();
 err_strp:
@@ -XXX,XX +XXX,XX @@ static int __init tls_register(void)
 static void __exit tls_unregister(void)
 {
 	tls_proto_cleanup();
+#ifdef CONFIG_MPTCP
+	tls_unregister_prot_ops(&tls_mptcp_ops);
+#endif
 	tls_unregister_prot_ops(&tls_tcp_ops);
 	tcp_unregister_ulp(&tcp_tls_ulp_ops);
 	tls_strp_dev_exit();
-- 
2.51.0
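One plausible reading of why mptcp_get_skb_seq() returns map_seq minus offset: mptcp_read_done() advances both fields by the same amount, so their difference stays at the buffer's original start sequence even after a partial read. The sketch below checks that invariant in userspace under stated assumptions; struct rx_buf, buf_start_seq() and buf_consume() are hypothetical names that model only the two control-block fields involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the MPTCP skb control block fields used here. */
struct rx_buf {
	uint64_t map_seq;	/* advanced together with offset on each read */
	uint32_t offset;	/* bytes of this buffer already consumed */
	uint32_t len;		/* total bytes in the buffer */
};

/* Start sequence of the buffer, independent of how much was consumed. */
static uint32_t buf_start_seq(const struct rx_buf *b)
{
	return (uint32_t)(b->map_seq - b->offset);
}

/* Consume up to `want` bytes and return how many were consumed.
 * Both fields move by the same amount, as in mptcp_read_done(). */
static size_t buf_consume(struct rx_buf *b, size_t want)
{
	size_t avail = b->len - b->offset;
	size_t used = want < avail ? want : avail;

	b->offset += (uint32_t)used;
	b->map_seq += used;
	return used;
}
```

With a stable start sequence, a queue walk that expects each buffer to begin where the previous one ended keeps working after part of the first buffer has been read.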
From: Geliang Tang <tanggeliang@kylinos.cn>

MPTCP TLS hardware offload is not yet implemented. Return -EOPNOTSUPP
when attempting to enable device offload on MPTCP sockets.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/tls/tls_device.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index XXXXXXX..XXXXXXX 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -XXX,XX +XXX,XX @@ int tls_set_device_offload(struct sock *sk)
 	ctx = tls_get_ctx(sk);
 	prot = &ctx->prot_info;
 
+	if (sk->sk_protocol == IPPROTO_MPTCP)
+		return -EOPNOTSUPP;
+
 	if (ctx->priv_ctx_tx)
 		return -EEXIST;
 
@@ -XXX,XX +XXX,XX @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	struct net_device *netdev;
 	int rc = 0;
 
+	if (sk->sk_protocol == IPPROTO_MPTCP)
+		return -EOPNOTSUPP;
+
 	if (ctx->crypto_recv.info.version != TLS_1_2_VERSION)
 		return -EOPNOTSUPP;
 
-- 
2.51.0
From: Geliang Tang <tanggeliang@kylinos.cn>

Extract the TCP_ULP getsockopt operation into a tcp_sock_get_ulp()
helper so that it can also be used by MPTCP.

Previously, MPTCP answered TCP_ULP by calling
mptcp_getsockopt_first_sf_only() to get the ULP of the first subflow.
Now that the ULP is configured on the msk itself, add a new helper,
mptcp_getsockopt_tcp_ulp(), to get the ULP of the msk.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/linux/tcp.h |  1 +
 net/ipv4/tcp.c      | 36 ++++++++++++++++++++++--------------
 net/mptcp/sockopt.c | 18 ++++++++++++++++++
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index XXXXXXX..XXXXXXX 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -XXX,XX +XXX,XX @@ void tcp_sock_set_quickack(struct sock *sk, int val);
 int tcp_sock_set_syncnt(struct sock *sk, int val);
 int tcp_sock_set_user_timeout(struct sock *sk, int val);
 int tcp_sock_set_maxseg(struct sock *sk, int val);
+int tcp_sock_get_ulp(struct sock *sk, sockptr_t optval, sockptr_t optlen);
 
 static inline bool dst_tcp_usec_ts(const struct dst_entry *dst)
 {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index XXXXXXX..XXXXXXX 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -XXX,XX +XXX,XX @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk,
 	return stats;
 }
 
+int tcp_sock_get_ulp(struct sock *sk, sockptr_t optval, sockptr_t optlen)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	int len;
+
+	if (copy_from_sockptr(&len, optlen, sizeof(int)))
+		return -EFAULT;
+	len = min_t(unsigned int, len, TCP_ULP_NAME_MAX);
+	if (!icsk->icsk_ulp_ops) {
+		len = 0;
+		if (copy_to_sockptr(optlen, &len, sizeof(int)))
+			return -EFAULT;
+		return 0;
+	}
+	if (copy_to_sockptr(optlen, &len, sizeof(int)))
+		return -EFAULT;
+	if (copy_to_sockptr(optval, icsk->icsk_ulp_ops->name, len))
+		return -EFAULT;
+	return 0;
+}
+
 int do_tcp_getsockopt(struct sock *sk, int level,
 		      int optname, sockptr_t optval, sockptr_t optlen)
 {
@@ -XXX,XX +XXX,XX @@ int do_tcp_getsockopt(struct sock *sk, int level,
 		return 0;
 
 	case TCP_ULP:
-		if (copy_from_sockptr(&len, optlen, sizeof(int)))
-			return -EFAULT;
-		len = min_t(unsigned int, len, TCP_ULP_NAME_MAX);
-		if (!icsk->icsk_ulp_ops) {
-			len = 0;
-			if (copy_to_sockptr(optlen, &len, sizeof(int)))
-				return -EFAULT;
-			return 0;
-		}
-		if (copy_to_sockptr(optlen, &len, sizeof(int)))
-			return -EFAULT;
-		if (copy_to_sockptr(optval, icsk->icsk_ulp_ops->name, len))
-			return -EFAULT;
-		return 0;
+		return tcp_sock_get_ulp(sk, optval, optlen);
 
 	case TCP_FASTOPEN_KEY: {
 		u64 key[TCP_FASTOPEN_KEY_BUF_LENGTH / sizeof(u64)];
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -XXX,XX +XXX,XX @@ static int mptcp_put_int_option(struct mptcp_sock *msk, char __user *optval,
 	return 0;
 }
 
+static int mptcp_getsockopt_tcp_ulp(struct sock *sk, char __user *optval,
+				    int __user *optlen)
+{
+	int ret, len;
+
+	if (copy_from_sockptr(&len, USER_SOCKPTR(optlen), sizeof(int)))
+		return -EFAULT;
+
+	if (len < 0)
+		return -EINVAL;
+
+	lock_sock(sk);
+	ret = tcp_sock_get_ulp(sk, USER_SOCKPTR(optval), USER_SOCKPTR(optlen));
+	release_sock(sk);
+	return ret;
+}
+
 static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 				    char __user *optval, int __user *optlen)
 {
@@ -XXX,XX +XXX,XX @@ static int mptcp_getsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 
 	switch (optname) {
 	case TCP_ULP:
+		return mptcp_getsockopt_tcp_ulp(sk, optval, optlen);
 	case TCP_CONGESTION:
 	case TCP_INFO:
 	case TCP_CC_INFO:
-- 
2.51.0
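From userspace, the behaviour implemented by tcp_sock_get_ulp() is reached through getsockopt(TCP_ULP). A small sketch of a caller follows; query_ulp() is a hypothetical helper name, and the fallback value for TCP_ULP is an assumption taken from the Linux UAPI header for systems whose libc headers do not define it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

#ifndef TCP_ULP
#define TCP_ULP 31	/* assumed value, from the Linux UAPI linux/tcp.h */
#endif

/* Copy the name of the ULP attached to fd into buf, NUL-terminated.
 * Returns the name length, 0 if no ULP is attached, or -1 on error. */
static int query_ulp(int fd, char *buf, size_t buflen)
{
	socklen_t len;

	if (buflen < 2)
		return -1;
	memset(buf, 0, buflen);
	len = (socklen_t)(buflen - 1);	/* keep room for the terminator */
	if (getsockopt(fd, IPPROTO_TCP, TCP_ULP, buf, &len) < 0)
		return -1;
	buf[len] = '\0';		/* the kernel may copy no terminator */
	return (int)strlen(buf);
}
```

A result of 0 corresponds to the branch in the helper that writes a zero length back when no ULP is attached.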
From: Geliang Tang <tanggeliang@kylinos.cn>

Allow MPTCP sockets to set the TCP_ULP socket option to enable TLS.

Add mptcp_setsockopt_tcp_ulp(), which only accepts "tls" as the ULP
name, validates the socket state (it must not be CLOSE or LISTEN), and
then calls tcp_set_ulp().

Include TCP_ULP in the list of supported options in
mptcp_supported_sockopt(), and handle it in mptcp_setsockopt_sol_tcp()
instead of returning -EOPNOTSUPP.

Call tcp_cleanup_ulp() in mptcp_destroy() to release the ULP module's
reference count.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/mptcp/protocol.c |  1 +
 net/mptcp/sockopt.c  | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -XXX,XX +XXX,XX @@ static void mptcp_destroy(struct sock *sk)
 	/* allow the following to close even the initial subflow */
 	msk->free_first = 1;
 	mptcp_destroy_common(msk);
+	tcp_cleanup_ulp(sk);
 	sk_sockets_allocated_dec(sk);
 }
 
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index XXXXXXX..XXXXXXX 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -XXX,XX +XXX,XX @@
 #include <net/protocol.h>
 #include <net/tcp.h>
 #include <net/mptcp.h>
+#include <net/tls.h>
 #include "protocol.h"
 
 #define MIN_INFO_OPTLEN_SIZE 16
@@ -XXX,XX +XXX,XX @@ static bool mptcp_supported_sockopt(int level, int optname)
 		case TCP_FASTOPEN_CONNECT:
 		case TCP_FASTOPEN_KEY:
 		case TCP_FASTOPEN_NO_COOKIE:
+		case TCP_ULP:
 			return true;
 		}
 
@@ -XXX,XX +XXX,XX @@ static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
 	return ret;
 }
 
+static int mptcp_setsockopt_tcp_ulp(struct sock *sk, sockptr_t optval,
+				    unsigned int optlen)
+{
+	char name[TCP_ULP_NAME_MAX];
+	int err = 0;
+	size_t len;
+	int val;
+
+	if (optlen < 1)
+		return -EINVAL;
+
+	len = min_t(long, TCP_ULP_NAME_MAX - 1, optlen);
+	val = strncpy_from_sockptr(name, optval, len);
+	if (val < 0)
+		return -EFAULT;
+	name[val] = 0;
+
+	if (strcmp(name, "tls"))
+		return -EOPNOTSUPP;
+
+	sockopt_lock_sock(sk);
+	if ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) {
+		err = -ENOTCONN;
+		goto out;
+	}
+	err = tcp_set_ulp(sk, name);
+out:
+	sockopt_release_sock(sk);
+	return err;
+}
+
 static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 				    sockptr_t optval, unsigned int optlen)
 {
@@ -XXX,XX +XXX,XX @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
 
 	switch (optname) {
 	case TCP_ULP:
-		return -EOPNOTSUPP;
+		return mptcp_setsockopt_tcp_ulp(sk, optval, optlen);
 	case TCP_CONGESTION:
 		return mptcp_setsockopt_sol_tcp_congestion(msk, optval, optlen);
 	case TCP_DEFER_ACCEPT:
-- 
2.51.0
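On the userspace side, attaching the ULP is a single setsockopt() call, the same one already used for TCP sockets. The sketch below is illustrative only: attach_tls_ulp() is a hypothetical helper name, and the TCP_ULP fallback value is an assumption taken from the Linux UAPI header.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_ULP
#define TCP_ULP 31	/* assumed value, from the Linux UAPI linux/tcp.h */
#endif

/* Ask the kernel to attach the "tls" ULP to a connected stream socket.
 * Returns 0 on success or a negative errno value on failure. */
static int attach_tls_ulp(int fd)
{
	if (setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
		return -errno;
	return 0;
}
```

Going by the patch above, an MPTCP socket in the CLOSE or LISTEN state should yield -ENOTCONN here, and any ULP name other than "tls" should yield -EOPNOTSUPP, so the call belongs after connect() or accept().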
From: Geliang Tang <tanggeliang@kylinos.cn>

Now that KTLS is implemented for MPTCP, "tls" can no longer be used in
sock_test_tcpulp() as an example of a ULP that MPTCP rejects: doing so
breaks the mptcp_connect.sh tests. Set another ULP name, "espintcp",
instead.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -XXX,XX +XXX,XX @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
 	if (buflen > 0) {
 		if (strcmp(buf, "mptcp") != 0)
 			xerror("unexpected ULP '%s' for proto %d at line %u", buf, proto, line);
-		ret = do_ulp_so(sock, "tls");
+		ret = do_ulp_so(sock, "espintcp");
 		if (ret == 0)
 			X("setsockopt");
 	} else if (proto == IPPROTO_MPTCP) {
-		ret = do_ulp_so(sock, "tls");
+		ret = do_ulp_so(sock, "espintcp");
 		if (ret != -1)
 			X("setsockopt");
 	}
-- 
2.51.0
From: Geliang Tang <tanggeliang@kylinos.cn>

To make it easy to create MPTCP sockets in the MPTCP TLS tests, a new
helper __ulp_sock_pair() takes two protocol parameters (cli_proto and
srv_proto), and ulp_sock_pair() becomes a wrapper that passes 0 for both.
The parameters are passed as the third argument of socket(): 0 creates
TCP sockets, IPPROTO_MPTCP creates MPTCP sockets.

A new "mptcp" field is added to FIXTURE_VARIANT(tls) to control whether
the tests create MPTCP sockets. Variants that set it are skipped when
/proc/sys/net/mptcp/enabled is missing or not set to 1.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/tls.c | 44 +++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -XXX,XX +XXX,XX @@
 #define TLS_PAYLOAD_MAX_LEN 16384
 #define SOL_TLS 282

+#ifndef IPPROTO_MPTCP
+#define IPPROTO_MPTCP 262
+#endif
+
 static int fips_enabled;

 struct tls_crypto_info_keys {
@@ -XXX,XX +XXX,XX @@ static void memrnd(void *s, size_t n)
 		*byte++ = rand();
 }

-static void ulp_sock_pair(struct __test_metadata *_metadata,
-			  int *fd, int *cfd, bool *notls)
+static void __ulp_sock_pair(struct __test_metadata *_metadata,
+			    int *fd, int *cfd, bool *notls,
+			    int cli_proto, int srv_proto)
 {
 	struct sockaddr_in addr;
 	socklen_t len;
@@ -XXX,XX +XXX,XX @@ static void ulp_sock_pair(struct __test_metadata *_metadata,
 	addr.sin_addr.s_addr = htonl(INADDR_ANY);
 	addr.sin_port = 0;

-	*fd = socket(AF_INET, SOCK_STREAM, 0);
-	sfd = socket(AF_INET, SOCK_STREAM, 0);
+	*fd = socket(AF_INET, SOCK_STREAM, cli_proto);
+	sfd = socket(AF_INET, SOCK_STREAM, srv_proto);

 	ret = bind(sfd, &addr, sizeof(addr));
 	ASSERT_EQ(ret, 0);
@@ -XXX,XX +XXX,XX @@ static void ulp_sock_pair(struct __test_metadata *_metadata,
 	ASSERT_EQ(ret, 0);
 }

+static void ulp_sock_pair(struct __test_metadata *_metadata,
+			  int *fd, int *cfd, bool *notls)
+{
+	__ulp_sock_pair(_metadata, fd, cfd, notls, 0, 0);
+}
+
 /* Produce a basic cmsg */
 static int tls_send_cmsg(int fd, unsigned char record_type,
 			 void *data, size_t len, int flags)
@@ -XXX,XX +XXX,XX @@ FIXTURE_VARIANT(tls)
 	uint16_t tls_version;
 	uint16_t cipher_type;
 	bool nopad, fips_non_compliant;
+	bool mptcp;
 };

 FIXTURE_VARIANT_ADD(tls, 12_aes_gcm)
@@ -XXX,XX +XXX,XX @@ FIXTURE_VARIANT_ADD(tls, 12_aria_gcm_256)
 	.cipher_type = TLS_CIPHER_ARIA_GCM_256,
 };

+static bool is_mptcp_enable(struct __test_metadata *_metadata)
+{
+	char buf[16] = { 0 };
+	ssize_t n;
+	int fd;
+
+	fd = open("/proc/sys/net/mptcp/enabled", O_RDONLY);
+	if (fd < 0)
+		return false;
+
+	n = read(fd, buf, sizeof(buf) - 1);
+	close(fd);
+	if (n <= 0)
+		return false;
+	return (atoi(buf) == 1);
+}
+
 FIXTURE_SETUP(tls)
 {
 	struct tls_crypto_info_keys tls12;
@@ -XXX,XX +XXX,XX @@ FIXTURE_SETUP(tls)
 	if (fips_enabled && variant->fips_non_compliant)
 		SKIP(return, "Unsupported cipher in FIPS mode");

+	if (variant->mptcp && !is_mptcp_enable(_metadata))
+		SKIP(return, "no MPTCP support");
+
 	tls_crypto_info_init(variant->tls_version, variant->cipher_type,
 			     &tls12, 0);

-	ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
+	__ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls,
+			variant->mptcp ? IPPROTO_MPTCP : 0,
+			variant->mptcp ? IPPROTO_MPTCP : 0);

 	if (self->notls)
 		return;
-- 
2.51.0
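Outside the selftest harness, the protocol selection described in this patch can be sketched in a few lines. The helper names stream_proto() and open_stream_socket() are hypothetical and not part of tls.c; the sketch only assumes that the third argument of socket() is 0 for TCP and IPPROTO_MPTCP for MPTCP, as the commit message states.

```c
#include <stdbool.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* same fallback value as in the patch */
#endif

/* Hypothetical helper: map a "use MPTCP" flag to the socket() protocol. */
static int stream_proto(bool mptcp)
{
	return mptcp ? IPPROTO_MPTCP : 0;
}

/*
 * Hypothetical helper: open an IPv4 stream socket.
 * Returns the fd, or -1 with errno set (for example when the running
 * kernel has no MPTCP support and mptcp is true).
 */
static int open_stream_socket(bool mptcp)
{
	return socket(AF_INET, SOCK_STREAM, stream_proto(mptcp));
}
```

A caller that gets -1 for the MPTCP case could fall back to TCP or skip, which is roughly what the SKIP() in FIXTURE_SETUP does up front by checking the sysctl.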
From: Geliang Tang <tanggeliang@kylinos.cn>

MPTCP requires longer timeouts in the pollin test due to subflow
establishment delays and slower state transitions. Increase the timeout
values to prevent false failures:

 # RUN tls.13_sm4_ccm_mptcp.pollin ...
 # tls.c:1411:pollin:Expected poll(&fd, 1, 20) (0) == 1 (1)
 # tls.c:1412:pollin:Expected fd.revents & POLLIN (0) == 1 (1)
 # pollin: Test failed
 # FAIL tls.13_sm4_ccm_mptcp.pollin
 not ok 357 tls.13_sm4_ccm_mptcp.pollin

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/tls.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -XXX,XX +XXX,XX @@ TEST_F(tls, bidir)

 TEST_F(tls, pollin)
 {
+	int timeout = variant->mptcp ? 100 : 20;
 	char const *test_str = "test_poll";
 	struct pollfd fd = { 0, 0, 0 };
 	char buf[10];
@@ -XXX,XX +XXX,XX @@ TEST_F(tls, pollin)
 	fd.fd = self->cfd;
 	fd.events = POLLIN;

-	EXPECT_EQ(poll(&fd, 1, 20), 1);
+	EXPECT_EQ(poll(&fd, 1, timeout), 1);
 	EXPECT_EQ(fd.revents & POLLIN, 1);
 	EXPECT_EQ(recv(self->cfd, buf, send_len, MSG_WAITALL), send_len);
 	/* Test timing out */
-	EXPECT_EQ(poll(&fd, 1, 20), 0);
+	EXPECT_EQ(poll(&fd, 1, timeout), 0);
 }

 TEST_F(tls, poll_wait)
-- 
2.51.0
From: Geliang Tang <tanggeliang@kylinos.cn>

Increase the data size in the nonblocking test to accommodate MPTCP's
multi-subflow behavior and ensure there is enough data for the test,
avoiding the following errors:

 # RUN tls.12_aria_gcm_mptcp.nonblocking ...
 # tls.c:1534:nonblocking:Expected 0 (0) != eagain (0)
 # nonblocking: Test failed

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/tls.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -XXX,XX +XXX,XX @@ TEST_F(tls, nonblocking)
 	int flags;
 	int res;

+	if (variant->mptcp)
+		data *= 4;
+
 	flags = fcntl(self->fd, F_GETFL, 0);
 	fcntl(self->fd, F_SETFL, flags | O_NONBLOCK);
 	fcntl(self->cfd, F_SETFL, flags | O_NONBLOCK);
-- 
2.51.0
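The failure above ("Expected 0 != eagain") means the test never saw EAGAIN. The general reason can be sketched as follows: a non-blocking sender only hits EAGAIN once it has queued more data than the socket buffers can hold, so a transfer that is too small completes without it. This is a minimal illustration of that pattern, not the selftest's logic; set_nonblocking() and send_until_eagain() are hypothetical names.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 on success. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: send chunk-sized pieces until at least 'limit'
 * bytes are queued or the socket reports EAGAIN. Returns the number of
 * bytes queued (or -1 on another error) and sets *saw_eagain to 1 if
 * the buffer filled up before 'limit' was reached.
 */
static ssize_t send_until_eagain(int fd, size_t chunk, size_t limit,
				 int *saw_eagain)
{
	char buf[4096];
	size_t total = 0;

	*saw_eagain = 0;
	if (chunk == 0 || chunk > sizeof(buf))
		chunk = sizeof(buf);
	memset(buf, 0xab, sizeof(buf));

	while (total < limit) {
		ssize_t n = send(fd, buf, chunk, 0);

		if (n < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK) {
				*saw_eagain = 1;
				break;
			}
			return -1;
		}
		total += (size_t)n;
	}
	return (ssize_t)total;
}
```

If the available buffering grows (as it plausibly does with MPTCP), 'limit' has to grow too before EAGAIN shows up, which matches the patch multiplying the data size by 4.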
From: Geliang Tang <tanggeliang@kylinos.cn>

In the shutdown_reuse test, wait after shutdown for the MPTCP socket to
reach the TCP_CLOSE state before reusing it via bind(), avoiding the
following errors:

 # RUN tls.12_aes_gcm_mptcp.shutdown_reuse ...
 # tls.c:1790:shutdown_reuse:Expected ret (-1) == 0 (0)
 # shutdown_reuse: Test failed
 # FAIL tls.12_aes_gcm_mptcp.shutdown_reuse
 not ok 14 tls.12_aes_gcm_mptcp.shutdown_reuse
 # RUN tls.13_aes_gcm_mptcp.shutdown_reuse ...
 # tls.c:1790:shutdown_reuse:Expected ret (-1) == 0 (0)
 # shutdown_reuse: Test failed
 # FAIL tls.13_aes_gcm_mptcp.shutdown_reuse
 not ok 15 tls.13_aes_gcm_mptcp.shutdown_reuse
 # RUN tls.12_chacha_mptcp.shutdown_reuse ...
 # OK tls.12_chacha_mptcp.shutdown_reuse
 ok 16 tls.12_chacha_mptcp.shutdown_reuse
 # RUN tls.13_chacha_mptcp.shutdown_reuse ...
 # OK tls.13_chacha_mptcp.shutdown_reuse
 ok 17 tls.13_chacha_mptcp.shutdown_reuse
 # RUN tls.13_sm4_gcm_mptcp.shutdown_reuse ...
 # tls.c:1790:shutdown_reuse:Expected ret (-1) == 0 (0)
 # shutdown_reuse: Test failed
 # FAIL tls.13_sm4_gcm_mptcp.shutdown_reuse
 not ok 18 tls.13_sm4_gcm_mptcp.shutdown_reuse

The TCP_CLOSE wait is applied only to the MPTCP variants, so that plain
TCP tests are not slowed down. It polls the socket state once per
millisecond, up to 1000 times.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/tls.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -XXX,XX +XXX,XX @@
 #define IPPROTO_MPTCP 262
 #endif

+#ifndef TCP_CLOSE
+#define TCP_CLOSE 7
+#endif
+
 static int fips_enabled;

 struct tls_crypto_info_keys {
@@ -XXX,XX +XXX,XX @@ TEST_F(tls, shutdown_unsent)
 	shutdown(self->cfd, SHUT_RDWR);
 }

+static bool wait_for_tcp_close(struct __test_metadata *_metadata,
+			       int fd, int max)
+{
+	struct tcp_info info;
+	socklen_t len;
+	int i, ret;
+
+	for (i = 0; i < max; i++) {
+		len = sizeof(info);
+		ret = getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len);
+		ASSERT_EQ(ret, 0);
+		if (info.tcpi_state == TCP_CLOSE)
+			return true;
+		usleep(1000);
+	}
+
+	return false;
+}
+
 TEST_F(tls, shutdown_reuse)
 {
 	struct sockaddr_in addr;
@@ -XXX,XX +XXX,XX @@ TEST_F(tls, shutdown_reuse)
 	shutdown(self->cfd, SHUT_RDWR);
 	close(self->cfd);

+	if (variant->mptcp)
+		EXPECT_TRUE(wait_for_tcp_close(_metadata, self->fd, 1000));
+
 	addr.sin_family = AF_INET;
 	addr.sin_addr.s_addr = htonl(INADDR_ANY);
 	addr.sin_port = 0;
-- 
2.51.0
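For readers who want to try the state check outside the kselftest harness, here is a minimal sketch of the same idea that reports errors through return values instead of ASSERT_EQ(). The names tcp_state() and wait_for_state() are hypothetical; the sketch assumes a Linux libc whose <netinet/tcp.h> provides struct tcp_info, TCP_INFO and TCP_CLOSE.

```c
#define _GNU_SOURCE		/* for struct tcp_info in <netinet/tcp.h> */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: return the kernel's TCP state for fd, or -1 on error. */
static int tcp_state(int fd)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
		return -1;
	return info.tcpi_state;
}

/*
 * Hypothetical helper: poll the socket state roughly once per millisecond.
 * Returns 1 once 'state' is reached, 0 if max_tries is exhausted,
 * and -1 if the state cannot be read.
 */
static int wait_for_state(int fd, int state, int max_tries)
{
	struct timespec ms = { .tv_sec = 0, .tv_nsec = 1000000 };
	int i, cur;

	for (i = 0; i < max_tries; i++) {
		cur = tcp_state(fd);
		if (cur < 0)
			return -1;
		if (cur == state)
			return 1;
		nanosleep(&ms, NULL);
	}
	return 0;
}
```

A socket that has never been connected is already in TCP_CLOSE, so wait_for_state(fd, TCP_CLOSE, 1000) returns immediately for it; after shutdown() the number of iterations depends on how quickly the connection is torn down.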
From: Geliang Tang <tanggeliang@kylinos.cn>

This patch introduces MPTCP test cases for the TLS fixture. These
"mptcp" variants are configured to create MPTCP sockets specifically for
MPTCP TLS testing purposes.

The default limit of 1024 for file descriptor values is too low for the
newly added MPTCP tests, causing accept() to fail when the fd number
exceeds 1024. Raise the limit to 4096 to avoid test failures.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/tls.c | 96 +++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -XXX,XX +XXX,XX @@ FIXTURE_VARIANT_ADD(tls, 12_aria_gcm_256)
 	.cipher_type = TLS_CIPHER_ARIA_GCM_256,
 };

+FIXTURE_VARIANT_ADD(tls, 12_aes_gcm_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_AES_GCM_128,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_aes_gcm_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_AES_GCM_128,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 12_chacha_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_CHACHA20_POLY1305,
+	.fips_non_compliant = true,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_chacha_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_CHACHA20_POLY1305,
+	.fips_non_compliant = true,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_sm4_gcm_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_SM4_GCM,
+	.fips_non_compliant = true,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_sm4_ccm_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_SM4_CCM,
+	.fips_non_compliant = true,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 12_aes_ccm_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_AES_CCM_128,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_aes_ccm_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_AES_CCM_128,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 12_aes_gcm_256_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_AES_GCM_256,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_aes_gcm_256_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_AES_GCM_256,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 13_nopad_mptcp)
+{
+	.tls_version = TLS_1_3_VERSION,
+	.cipher_type = TLS_CIPHER_AES_GCM_128,
+	.nopad = true,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 12_aria_gcm_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_ARIA_GCM_128,
+	.mptcp = true,
+};
+
+FIXTURE_VARIANT_ADD(tls, 12_aria_gcm_256_mptcp)
+{
+	.tls_version = TLS_1_2_VERSION,
+	.cipher_type = TLS_CIPHER_ARIA_GCM_256,
+	.mptcp = true,
+};
+
 static bool is_mptcp_enable(struct __test_metadata *_metadata)
 {
 	char buf[16] = { 0 };
-- 
2.51.0
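The commit message mentions raising the file descriptor limit to 4096, but the hunk shown here only adds variants, so how the limit is raised is not visible. One possible approach, given purely as an assumption, is to bump the soft RLIMIT_NOFILE from the test program itself. raise_nofile_limit() is a hypothetical helper, not something taken from the series.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: make sure the soft RLIMIT_NOFILE is at least
 * 'want', capped at the hard limit. Returns 0 on success, -1 on error.
 */
static int raise_nofile_limit(rlim_t want)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	if (rl.rlim_cur >= want)
		return 0;		/* already high enough */
	if (rl.rlim_max != RLIM_INFINITY && want > rl.rlim_max)
		want = rl.rlim_max;	/* cannot exceed the hard limit unprivileged */
	rl.rlim_cur = want;
	return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Calling raise_nofile_limit(4096) early in the program would match the number quoted in the commit message. The same effect can also be obtained from the calling shell with ulimit -n, in which case no code change is needed.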
From: Geliang Tang <tanggeliang@kylinos.cn>

The MPTCP tests in tls.c are now available. This patch adds mptcp_tls.sh
so that they are run in the MPTCP CI by default.

Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/mptcp/.gitignore  |  1 +
 tools/testing/selftests/net/mptcp/Makefile    |  2 +
 tools/testing/selftests/net/mptcp/config      |  5 ++
 .../testing/selftests/net/mptcp/mptcp_tls.sh  | 62 +++++++++++++++++++
 tools/testing/selftests/net/mptcp/tls.c       |  1 +
 5 files changed, 71 insertions(+)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_tls.sh
 create mode 120000 tools/testing/selftests/net/mptcp/tls.c

diff --git a/tools/testing/selftests/net/mptcp/.gitignore b/tools/testing/selftests/net/mptcp/.gitignore
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/.gitignore
+++ b/tools/testing/selftests/net/mptcp/.gitignore
@@ -XXX,XX +XXX,XX @@ mptcp_diag
 mptcp_inq
 mptcp_sockopt
 pm_nl_ctl
+tls
 *.pcap
diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/Makefile
+++ b/tools/testing/selftests/net/mptcp/Makefile
@@ -XXX,XX +XXX,XX @@ TEST_PROGS := \
 	mptcp_connect_splice.sh \
 	mptcp_join.sh \
 	mptcp_sockopt.sh \
+	mptcp_tls.sh \
 	pm_netlink.sh \
 	simult_flows.sh \
 	userspace_pm.sh \
@@ -XXX,XX +XXX,XX @@ TEST_GEN_FILES := \
 	mptcp_inq \
 	mptcp_sockopt \
 	pm_nl_ctl \
+	tls \
 # end of TEST_GEN_FILES

 TEST_FILES := \
diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index XXXXXXX..XXXXXXX 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -XXX,XX +XXX,XX @@ CONFIG_NFT_SOCKET=m
 CONFIG_NFT_TPROXY=m
 CONFIG_SYN_COOKIES=y
 CONFIG_VETH=y
+CONFIG_TLS=m
+CONFIG_CRYPTO_ARIA=m
+CONFIG_CRYPTO_CCM=m
+CONFIG_CRYPTO_CHACHA20POLY1305=m
+CONFIG_CRYPTO_SM4_GENERIC=m
diff --git a/tools/testing/selftests/net/mptcp/mptcp_tls.sh b/tools/testing/selftests/net/mptcp/mptcp_tls.sh
new file mode 100755
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tools/testing/selftests/net/mptcp/mptcp_tls.sh
@@ -XXX,XX +XXX,XX @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(dirname "${0}")/mptcp_lib.sh"
+
+ret=0
+ns1=""
+
+# This function is used in the cleanup trap
+#shellcheck disable=SC2317,SC2329
+cleanup()
+{
+	if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
+		kill "$pid" 2>/dev/null
+		wait "$pid" 2>/dev/null
+	fi
+
+	mptcp_lib_ns_exit "$ns1"
+}
+
+init()
+{
+	local max="${1:-4}"
+
+	mptcp_lib_ns_init ns1
+
+	mptcp_lib_pm_nl_set_limits "$ns1" "$max" "$max"
+
+	local i
+	for i in $(seq 1 "$max"); do
+		mptcp_lib_pm_nl_add_endpoint "$ns1" \
+			"127.0.0.1" flags signal port 1000"$i"
+	done
+}
+
+trap cleanup EXIT
+
+mptcp_lib_check_mptcp
+# Temporarily set max to '0' to disable multipath testing,
+# as it depends on "mptcp: fix stall because of data_ready" series of fixes.
+# It will be re-enabled together with that series later as a squash-to patch.
+init 0
+
+ip netns exec "$ns1" ./tls -v 12_aes_gcm_mptcp \
+			   -v 13_aes_gcm_mptcp \
+			   -v 12_chacha_mptcp \
+			   -v 13_chacha_mptcp \
+			   -v 13_sm4_gcm_mptcp \
+			   -v 13_sm4_ccm_mptcp \
+			   -v 12_aes_ccm_mptcp \
+			   -v 13_aes_ccm_mptcp \
+			   -v 12_aes_gcm_256_mptcp \
+			   -v 13_aes_gcm_256_mptcp \
+			   -v 13_nopad_mptcp \
+			   -v 12_aria_gcm_mptcp \
+			   -v 12_aria_gcm_256_mptcp &
+pid=$!
+wait $pid
+ret=$?
+
+mptcp_lib_result_print_all_tap
+exit $ret
diff --git a/tools/testing/selftests/net/mptcp/tls.c b/tools/testing/selftests/net/mptcp/tls.c
new file mode 120000
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tools/testing/selftests/net/mptcp/tls.c
@@ -0,0 +1 @@
+../tls.c
\ No newline at end of file
-- 
2.51.0