From nobody Sun Mar 22 08:25:59 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4238B28488F for ; Fri, 13 Mar 2026 04:13:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375191; cv=none; b=PQZTpRI/orjD67Je7DM19gFit2A0BmqidyWJPiIMvrbsKkhxAB/DMcZFvPN85x288PxfcCbO8YoThsdaeyhMErD8Z+XcWaCu7b588kZymD1wbNapRd0VskcbYGZB3BEz1IoKGCu2BpSDyv/e5SEqVhThV1PNyg5Fchg7j1cXAeI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375191; c=relaxed/simple; bh=5wAqg4Y76q9Vqv/QfZpGMwdsak39VQtYVPozE4uN6y0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k87Jd+8HY2yrSGRPFlIEXnUAUocunzpacXU/XffSNzE5D86xtS+P9ujeNfER48gItWs0ImDBasZ5r7tKbWWfyW6+C4upgrKnh3oYwbXc9apbm6gc018GftprsGS0f12lBbQpy1rNuR9DcqhLru3gm3TZkTAEHk7RGiHf7Uq+Dkc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=prr8HL5S; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="prr8HL5S" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0BAD7C19421; Fri, 13 Mar 2026 04:13:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375190; bh=5wAqg4Y76q9Vqv/QfZpGMwdsak39VQtYVPozE4uN6y0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=prr8HL5SM1iDgwv2pjRijYnJTZrywAYoIw8dJWa8M6y1TiJeJEjmANFWc7RiEYUN+ EqJzsvNR6zBzS4AlsMqb8nuxeP7/Ux0rHpR+cBVSCPfBGmx+ksjNxeTaMwT4JKdYze tFCWs3oN7oxPoQsAbPA1sRhTfrrDwWD0QDfvMF7Jx2NwnWbUnqf2HaEJSWDns7nGZT 
HYDKze6gY11Dg6hsfUHsBgEMmD2eEk1ocH1XwmEjhXnOxGU9/hqdOaMdYfhBuSac9P /Tke2lIYiWL4u0AYt1IyVfz3qwi7ya29kAX6nOj3FESHDMDenRs3LHFK1gRDqdR7/C ImqnCSb8fA+EA==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang
Subject: [RFC mptcp-next v5 1/7] mptcp: add sk_is_msk helper
Date: Fri, 13 Mar 2026 12:12:45 +0800
Message-ID:
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch introduces a sk_is_msk() helper modeled after sk_is_tcp() to
determine whether the socket is an MPTCP one.

Unlike sk_is_mptcp(), which accepts a subflow socket as its parameter,
this new helper specifically accepts an MPTCP socket parameter.

Signed-off-by: Geliang Tang
---
 include/net/mptcp.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 4cf59e83c1c5..82660374859a 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -150,6 +150,13 @@ static inline bool rsk_drop_req(const struct request_sock *req)
 	return tcp_rsk(req)->is_mptcp && tcp_rsk(req)->drop_req;
 }
 
+static inline bool sk_is_msk(const struct sock *sk)
+{
+	return sk_is_inet(sk) &&
+	       sk->sk_type == SOCK_STREAM &&
+	       sk->sk_protocol == IPPROTO_MPTCP;
+}
+
 void mptcp_space(const struct sock *ssk, int *space, int *full_space);
 bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 		       unsigned int *size, struct mptcp_out_options *opts);
@@ -258,6 +265,11 @@ static inline bool rsk_drop_req(const struct request_sock *req)
 	return false;
 }
 
+static inline bool sk_is_msk(const struct sock *sk)
+{
+	return false;
+}
+
 static inline bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 				     unsigned int *size,
 				     struct mptcp_out_options *opts)
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from
smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EE2617A300 for ; Fri, 13 Mar 2026 04:13:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375193; cv=none; b=Hp2kj+OpdUrjs6AHnB/rL7s1OBmXN1NiWPMMQuYB5hFDem9OXp37YniAnjbRnqUWP1zZfies6T5K7tW/tr+d34w1l0vB9qUHCi7IsiaEgkHgeeZgfMA1UAdBYgcLyz26iE5/ZbYi8VpWu11hQv9rvmr1/Ztctdyx4JHk2mlgJeQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375193; c=relaxed/simple; bh=9fGVPeyCC48lSRiLlOwPbwecJBzIGIxPHEt+eOWhs9E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JaPfnNovlHcGniVascOrSE37pMbQ2w/MxB/cPrnHHgMN39YgZ7RZhyhPNKdiaWWgLzpRAKPJtjJmk2rvid7ryEJIRteqzqFpqiPPl4NdcgY73txiikPVk8+X5Uc1Ur3odBs9tviHBFRShMP42gIoobsmXpyFomOtVZ4QXeKnG+4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=T4IxTeZK; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="T4IxTeZK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7C135C19421; Fri, 13 Mar 2026 04:13:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375193; bh=9fGVPeyCC48lSRiLlOwPbwecJBzIGIxPHEt+eOWhs9E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T4IxTeZKPHCInLIorT46y3IniDpHg0aFhVthGiZNomfF6ulJduJ/pR2FB7PQ/21XT uX/LU2aay48WvMGl83FUqWKemZAFjraPJzLRElDlpYgwg+Mvw/zyXq1MnIphnt6XY7 Hgh1FohO/ZxVSaq54XKQrt0O2d77a8LDq0+FSBpacSfWY1K/osqh1y21WiY0SnEqkf DxO6VeGhzJuibIGuAlwUIa+vV67ix7CC7HyTFrbESm0nW88xoaGuet1GUAm4QTzBKN 
EkVmkz/jHve5FdQkPQt8+w6tJae55GKYQxZejGeahH65mm5G29+vkfOpVnsXbtybtE GPfQjF6EzZlBA==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v5 2/7] mptcp: add sock_set_nodelay
Date: Fri, 13 Mar 2026 12:12:46 +0800
Message-ID: <1f53896c2bdd54166639b28da5e8968f1e11e2f9.1773374342.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch introduces an MPTCP-specific helper, mptcp_sock_set_nodelay,
which sets the TCP_NODELAY option for every subflow socket within an
MPTCP connection. It will be utilized on both the target and host sides
in the 'NVMe over MPTCP' implementation.

Using tcp_sock_set_nodelay() with MPTCP will cause list corruption:

 nvmet: adding nsid 1 to subsystem nqn.2014-08.org.nvmexpress.mptcpdev
 nvmet_tcp: enabling port 1234 (127.0.0.1:4420)
 slab MPTCP start ffff8880108f0b80 pointer offset 2480 size 2816
 list_add corruption. prev->next should be next (ffff8880108f1530), but was ffff8885108f1530. (prev=ffff8880108f1530).
 ------------[ cut here ]------------
 kernel BUG at lib/list_debug.c:32!
 Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
 CPU: 1 UID: 0 PID: 182 Comm: nvme Not tainted 6.16.0-rc3+ #1 PREEMPT(full)
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011

Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 include/net/mptcp.h  |  4 ++++
 net/mptcp/protocol.c | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 82660374859a..60cbf29448b0 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -244,6 +244,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 }
 
 void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
+
+void mptcp_sock_set_nodelay(struct sock *sk);
 #else
 
 static inline void mptcp_init(void)
@@ -335,6 +337,8 @@ static inline struct request_sock *mptcp_subflow_reqsk_alloc(const struct reques
 static inline __be32 mptcp_reset_option(const struct sk_buff *skb) { return htonl(0u); }
 
 static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
+
+static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7c06b8d9eb37..692111941808 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3800,6 +3800,24 @@ static void mptcp_sock_check_graft(struct sock *sk, struct sock *ssk)
 	}
 }
 
+void mptcp_sock_set_nodelay(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
+
+	lock_sock(sk);
+	msk->nodelay = true;
+	mptcp_for_each_subflow(msk, subflow) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+		lock_sock(ssk);
+		__tcp_sock_set_nodelay(ssk, true);
+		release_sock(ssk);
+	}
+	release_sock(sk);
+}
+EXPORT_SYMBOL(mptcp_sock_set_nodelay);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 45C012264C7 for ; Fri, 13 Mar 2026 04:13:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375196; cv=none; b=iSpGBMWow57OMfq5ocPaoDNSAuwc8iJY9zXESaz3qKwsHYQxd+wM571cSNgvOF1yKrLldlM36xlDRgDYXkCXdnzT7C+v4wpsm0If1FF/qgK4n980nxFF/XJBPy2wyYDicw8LN5o/+VSlzx4L3lLqK/9XA0KWtgsGt1WDmnxCsjw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375196; c=relaxed/simple; bh=F6pEaYtSEL1gE+MhT6Dp29hC7bG/DqXUJeekugLWabA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MNDZw0av03fjnVnZP6LESoQfs8IBmenKMc1r2VBr5S/DsPQ0Fkl1Ms0WgmR2a0TdpqcyL1uP5lZKpO6XYzn0LeOCM5OMFpdWS1jTLEQDg6g0CyyP6xERjdSqsTWjlKZoTdoJA1AXdpK1lbPg4HmraSLTFB6ZiUalTDWuKcTT2rg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=et42VoZ4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="et42VoZ4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9015C19424; Fri, 13 Mar 2026 04:13:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375195; bh=F6pEaYtSEL1gE+MhT6Dp29hC7bG/DqXUJeekugLWabA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=et42VoZ4tyI0Yv6nhEZ3/7tTqN0Yqg6xWfsuBFXnOUFJIdPkV2shgLJ57jpqXGOvO k2V6RScvscTa+ZJlNclGDYFFhCr04ShS5yxQIqXGM3A2HJin/b2oUiKfexxGqZA4Hk
rdeSNwrOmqZPQOSq4PlC358BNqY505bOulzwbbLX6wMHLmqEZ/hiJwIrOtjtU/PzBN NwTj5+z91mXgpQtQ+5NIvrYELcdZK6zPYA6S8ceYa+VBy0apJeWFGuwknBS+3JQB5s 7x2tmWuwhfUFcUQC3OF0BQjA0wYFZc2urueua0Ql5Y9cjtddqP9SuxJc63Qbk1gQzO LoraHfhrA7sFQ==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v5 3/7] mptcp: add sock_set_reuseaddr
Date: Fri, 13 Mar 2026 12:12:47 +0800
Message-ID: <253dbac0e981aefb100d7aacd533e04865090afb.1773374342.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch introduces a dedicated MPTCP helper, mptcp_sock_set_reuseaddr,
which sets the address reuse flag (SK_CAN_REUSE) on the first subflow
socket of an MPTCP connection and mirrors it on the MPTCP socket itself.
It will be used on the target side in the 'NVMe over MPTCP'
implementation.

Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 include/net/mptcp.h  |  4 ++++
 net/mptcp/protocol.c | 16 ++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 60cbf29448b0..63b64b7699e3 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -246,6 +246,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
 
 void mptcp_sock_set_nodelay(struct sock *sk);
+
+void mptcp_sock_set_reuseaddr(struct sock *sk);
 #else
 
 static inline void mptcp_init(void)
@@ -339,6 +341,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb) { return hto
 static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
 
 static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
+
+static inline void mptcp_sock_set_reuseaddr(struct sock *sk) { }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 692111941808..bb923c4fabd1 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3818,6 +3818,22 @@ void mptcp_sock_set_nodelay(struct sock *sk)
 }
 EXPORT_SYMBOL(mptcp_sock_set_nodelay);
 
+void mptcp_sock_set_reuseaddr(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sock *ssk;
+
+	lock_sock(sk);
+	ssk = __mptcp_nmpc_sk(msk);
+	if (IS_ERR(ssk))
+		goto unlock;
+	ssk->sk_reuse = SK_CAN_REUSE;
+	sk->sk_reuse = ssk->sk_reuse;
+unlock:
+	release_sock(sk);
+}
+EXPORT_SYMBOL(mptcp_sock_set_reuseaddr);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16E7F1DF74F for ; Fri, 13 Mar 2026 04:13:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375199; cv=none; b=DhhnPVA5NyakysDnPbNA89Jl1QhwHqvV1kDYJ4e8fhtif4OlXRwdYGfzIPYNrEmRUaXs2VLagKkxDexNpAD/KxNZRl/d83baOvVBajQAfaYEhYqH2VTd+oDzmqGbrNIw+u5XF1JywzOlQ4S4VgOHOUnyCDBZ+W9Xi0TAj44O7Dw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375199; c=relaxed/simple; bh=QonxdimWLUdY5S+w2alNfGjh2kbwr0k5aGh0us6QwPk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OFpbYIxWYr97mfiyMRj8Y8LbZgycYHjtDYhLVY0cAmDxmiDc+Gj02zZtFxibykvNCmy+/WuYvWsv2LDq2lqpu6F5JOkqVwel7rQqLNv6CpSmuwnOy60iBeqs9hq3nYfhUpLSymUcpFtnTwNLc73mBKr5ERMX/dk9x4wimOCHjLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jSH388mJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jSH388mJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85C05C19423; Fri, 13 Mar 2026 04:13:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375198; bh=QonxdimWLUdY5S+w2alNfGjh2kbwr0k5aGh0us6QwPk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jSH388mJfqxSi6StZYxnWZue5cC8nyO+gywypqVUsSOuoXT54gDiEhERtP6ULEY+U YhzAGAppZC/ehANHoBbjaU44csuLJa/Fcie1+eb9aFE7Msf0+wicf2lVTM/KXXIbnX scC2nL/QZS/rUUFF+2cC6J0+vCLxilwJZQhKrDwzktuEWd/vZwYPkmmQIBLNyGrijD riNMsaxFAWUEvD6gz8myF4jDCyWeIjXVvTSzI0KMWBAnhbufB0a0xLXhTs1cPVyJD/ R0apO1wI56SYzy2dx1K+iIjK8JGmjcQjBsI7/ItAfQw/LmcbBjLFRPETAo0s790FVz CcXqAx0xXltFg== From: Geliang Tang To: mptcp@lists.linux.dev Cc: 
Geliang Tang, Hannes Reinecke, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v5 4/7] nvmet-tcp: add mptcp support
Date: Fri, 13 Mar 2026 12:12:48 +0800
Message-ID: <3b7fb4de94235c624151cc281e4b965daabcd937.1773374342.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch adds a new NVMe target transport type, NVMF_TRTYPE_MPTCP, for
MPTCP, and defines a new nvmet_fabrics_ops named nvmet_mptcp_ops, which
is almost the same as nvmet_tcp_ops except for .type.

nvmet_tcp_add_port() checks whether disc_addr.trtype is
NVMF_TRTYPE_MPTCP to decide whether to pass IPPROTO_MPTCP to
sock_create(), creating an MPTCP socket instead of a TCP one.

nvmet_tcp_done_recv_pdu() selects between nvmet_mptcp_ops and
nvmet_tcp_ops according to the protocol of the queue's socket.

v2:
 - use trtype instead of tsas (Hannes).

v3:
 - check mptcp protocol from disc_addr.trtype instead of passing a
   parameter (Hannes).

v4:
 - check CONFIG_MPTCP.
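
The protocol selection described in this commit message can be sketched in stand-alone user-space C. This is only an illustration under stated assumptions, not the code in the patch: every `sketch_`/`SKETCH_` name is hypothetical, `SKETCH_HAVE_MPTCP` stands in for CONFIG_MPTCP, the two transport-type values mirror the NVMF_TRTYPE_* entries shown in this patch, and IPPROTO_MPTCP is defined locally as 262 in case the libc headers do not provide it.

```c
#include <assert.h>       /* static_assert */
#include <netinet/in.h>   /* IPPROTO_TCP */

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* Linux UAPI value, for libc headers that lack it */
#endif

/* Hypothetical stand-in for CONFIG_MPTCP; build with 0 to model a kernel
 * without MPTCP support. */
#ifndef SKETCH_HAVE_MPTCP
#define SKETCH_HAVE_MPTCP 1
#endif

/* Illustrative copies of the two transport types involved (assumption:
 * values taken from the enum hunk in this patch). */
enum sketch_trtype {
	SKETCH_TRTYPE_TCP   = 3,
	SKETCH_TRTYPE_MPTCP = 4,
};

static_assert(SKETCH_TRTYPE_TCP != SKETCH_TRTYPE_MPTCP,
	      "transport types must be distinct");

/* Default to TCP; switch to MPTCP only when the port asks for it and
 * MPTCP support is compiled in. */
static int sketch_proto_for_trtype(int trtype)
{
	int proto = IPPROTO_TCP;

#if SKETCH_HAVE_MPTCP
	if (trtype == SKETCH_TRTYPE_MPTCP)
		proto = IPPROTO_MPTCP;
#endif
	return proto;
}
```

Keeping IPPROTO_TCP as the default means a port configured as "mptcp" on a build without MPTCP support would quietly get a plain TCP socket in this sketch.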

Cc: Hannes Reinecke
Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 drivers/nvme/target/configfs.c |  1 +
 drivers/nvme/target/tcp.c      | 38 ++++++++++++++++++++++++++++++++--
 include/linux/nvme.h           |  1 +
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 3088e044dbcb..4b7498ffb102 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -38,6 +38,7 @@ static struct nvmet_type_name_map nvmet_transport[] = {
 	{ NVMF_TRTYPE_RDMA,	"rdma" },
 	{ NVMF_TRTYPE_FC,	"fc" },
 	{ NVMF_TRTYPE_TCP,	"tcp" },
+	{ NVMF_TRTYPE_MPTCP,	"mptcp" },
 	{ NVMF_TRTYPE_PCI,	"pci" },
 	{ NVMF_TRTYPE_LOOP,	"loop" },
 };
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index acc71a26733f..5111c0e690ee 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -212,6 +212,7 @@ static DEFINE_MUTEX(nvmet_tcp_queue_mutex);
 
 static struct workqueue_struct *nvmet_tcp_wq;
 static const struct nvmet_fabrics_ops nvmet_tcp_ops;
+static const struct nvmet_fabrics_ops nvmet_mptcp_ops;
 static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
 static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
 
@@ -1067,7 +1068,9 @@ static int nvmet_tcp_done_recv_pdu(struct nvmet_tcp_queue *queue)
 	req = &queue->cmd->req;
 	memcpy(req->cmd, nvme_cmd, sizeof(*nvme_cmd));
 
-	if (unlikely(!nvmet_req_init(req, &queue->nvme_sq, &nvmet_tcp_ops))) {
+	if (unlikely(!nvmet_req_init(req, &queue->nvme_sq,
+			sk_is_msk(queue->sock->sk) ?
+			&nvmet_mptcp_ops : &nvmet_tcp_ops))) {
 		pr_err("failed cmd %p id %d opcode %d, data_len: %d, status: %04x\n",
 			req->cmd, req->cmd->common.command_id,
 			req->cmd->common.opcode,
@@ -2034,6 +2037,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 {
 	struct nvmet_tcp_port *port;
 	__kernel_sa_family_t af;
+	int proto = IPPROTO_TCP;
 	int ret;
 
 	port = kzalloc_obj(*port);
@@ -2054,6 +2058,11 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 		goto err_port;
 	}
 
+#ifdef CONFIG_MPTCP
+	if (nport->disc_addr.trtype == NVMF_TRTYPE_MPTCP)
+		proto = IPPROTO_MPTCP;
+#endif
+
 	ret = inet_pton_with_scope(&init_net, af, nport->disc_addr.traddr,
 			nport->disc_addr.trsvcid, &port->addr);
 	if (ret) {
@@ -2068,7 +2077,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
 
 	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
-				IPPROTO_TCP, &port->sock);
+				proto, &port->sock);
 	if (ret) {
 		pr_err("failed to create a socket\n");
 		goto err_port;
@@ -2077,7 +2086,11 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 	port->sock->sk->sk_user_data = port;
 	port->data_ready = port->sock->sk->sk_data_ready;
 	port->sock->sk->sk_data_ready = nvmet_tcp_listen_data_ready;
+	sk_is_msk(port->sock->sk) ?
+	mptcp_sock_set_reuseaddr(port->sock->sk) :
 	sock_set_reuseaddr(port->sock->sk);
+	sk_is_msk(port->sock->sk) ?
+	mptcp_sock_set_nodelay(port->sock->sk) :
 	tcp_sock_set_nodelay(port->sock->sk);
 	if (so_priority > 0)
 		sock_set_priority(port->sock->sk, so_priority);
@@ -2220,6 +2233,19 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops = {
 	.host_traddr		= nvmet_tcp_host_port_addr,
 };
 
+static const struct nvmet_fabrics_ops nvmet_mptcp_ops = {
+	.owner			= THIS_MODULE,
+	.type			= NVMF_TRTYPE_MPTCP,
+	.msdbd			= 1,
+	.add_port		= nvmet_tcp_add_port,
+	.remove_port		= nvmet_tcp_remove_port,
+	.queue_response		= nvmet_tcp_queue_response,
+	.delete_ctrl		= nvmet_tcp_delete_ctrl,
+	.install_queue		= nvmet_tcp_install_queue,
+	.disc_traddr		= nvmet_tcp_disc_port_addr,
+	.host_traddr		= nvmet_tcp_host_port_addr,
+};
+
 static int __init nvmet_tcp_init(void)
 {
 	int ret;
@@ -2233,6 +2259,12 @@ static int __init nvmet_tcp_init(void)
 	if (ret)
 		goto err;
 
+	ret = nvmet_register_transport(&nvmet_mptcp_ops);
+	if (ret) {
+		nvmet_unregister_transport(&nvmet_tcp_ops);
+		goto err;
+	}
+
 	return 0;
 err:
 	destroy_workqueue(nvmet_tcp_wq);
@@ -2243,6 +2275,7 @@ static void __exit nvmet_tcp_exit(void)
 {
 	struct nvmet_tcp_queue *queue;
 
+	nvmet_unregister_transport(&nvmet_mptcp_ops);
 	nvmet_unregister_transport(&nvmet_tcp_ops);
 
 	flush_workqueue(nvmet_wq);
@@ -2262,3 +2295,4 @@ module_exit(nvmet_tcp_exit);
 MODULE_DESCRIPTION("NVMe target TCP transport driver");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS("nvmet-transport-3"); /* 3 == NVMF_TRTYPE_TCP */
+MODULE_ALIAS("nvmet-transport-4"); /* 4 == NVMF_TRTYPE_MPTCP */
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 655d194f8e72..8069667ad47e 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -68,6 +68,7 @@ enum {
 	NVMF_TRTYPE_RDMA	= 1,	/* RDMA */
 	NVMF_TRTYPE_FC		= 2,	/* Fibre Channel */
 	NVMF_TRTYPE_TCP		= 3,	/* TCP/IP */
+	NVMF_TRTYPE_MPTCP	= 4,	/* Multipath TCP */
 	NVMF_TRTYPE_LOOP	= 254,	/* Reserved for host usage */
 	NVMF_TRTYPE_MAX,
 };
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D44F117A300 for ; Fri, 13 Mar 2026 04:13:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375202; cv=none; b=ss9OaEFtK7gCPJQabyCxOgjW7axuVcSlD9rXdP+O73FG8jZcdoha/zw0m5Qo48PwERVkAS19lgRWbPnpqX6v8PxiSuAihaJj25J9SPz3jvFp1CQYoY4i2j0I5hXEaxtl+VLLlVkcUVqrDNga0IpamVLkaUcfY2+Kxe6ds/lu6H4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375202; c=relaxed/simple; bh=+ZWitJQ36qz37QpYbpOlNA2mRboYOnrPwRBY58VwNJA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ahw4btqqz5wLxyvlqD1h2nFsmlER3+5Af0uXpStajjtBhUEZaSqVIpLIcaZMduxIoIL00EfFnJigbhqbtFZyux142iWBArQ0PAw97I5ZmmQBcDwmsWq+3R4hkDZz9zyVW2IWl+ychMWM01e2BeXwPOTP0808sYHlEa8ejo76ijY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gl7XRFGf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gl7XRFGf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B2CBC19423; Fri, 13 Mar 2026 04:13:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375202; bh=+ZWitJQ36qz37QpYbpOlNA2mRboYOnrPwRBY58VwNJA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gl7XRFGf3DRBq1PxoLp3mioR1shGZL43qR85cvXDY3iWYdAPo1fOXwaDzhxoHbcvt +2p9X+/34dI+ko5gLo8KJQ069gRUSYIpadq7QY6BdsCEPXTAFlmTGoFf3pNdep+O4c mZFrRYZqMZ1KxptlUSDTW4oVJQT8BLN3RiswJYQDYOfl0Tqy1fymLCDJyZYjxWoHRX TLKJZlzNxP2IIB0mOGGuHfoHhJ7YZlheODy/R8FPUSJ9C9bHjIVLGRuIoIhJgbzw0k 
2tJ83pRp+650JBUWLy1QLR23eZwnw5Q35096zav6/+3rZUrc7ivl7VmBkfbBtKLpuh jtiCidK1yaZyQ==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v5 5/7] mptcp: add sock_set_syncnt
Date: Fri, 13 Mar 2026 12:12:49 +0800
Message-ID: <7cd0862447c2d36dc844f94b7a09a8e31f6edcd2.1773374342.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch introduces a dedicated MPTCP helper, mptcp_sock_set_syncnt,
which sets the SYN retransmission count on the first subflow socket. It
will be used on the host side in the 'NVMe over MPTCP' implementation.

Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 include/net/mptcp.h  |  7 +++++++
 net/mptcp/protocol.c | 19 +++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 63b64b7699e3..d6bb67a55f24 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -248,6 +248,8 @@ void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
 void mptcp_sock_set_nodelay(struct sock *sk);
 
 void mptcp_sock_set_reuseaddr(struct sock *sk);
+
+int mptcp_sock_set_syncnt(struct sock *sk, int val);
 #else
 
 static inline void mptcp_init(void)
@@ -343,6 +345,11 @@ static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired)
 static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
 
 static inline void mptcp_sock_set_reuseaddr(struct sock *sk) { }
+
+static inline int mptcp_sock_set_syncnt(struct sock *sk, int val)
+{
+	return 0;
+}
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index bb923c4fabd1..961f11a24277 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3834,6 +3834,25 @@ void mptcp_sock_set_reuseaddr(struct sock *sk)
 }
 EXPORT_SYMBOL(mptcp_sock_set_reuseaddr);
 
+int mptcp_sock_set_syncnt(struct sock *sk, int val)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sock *ssk;
+
+	if (val < 1 || val > MAX_TCP_SYNCNT)
+		return -EINVAL;
+
+	lock_sock(sk);
+	ssk = __mptcp_nmpc_sk(msk);
+	if (IS_ERR(ssk))
+		goto unlock;
+	WRITE_ONCE(inet_csk(ssk)->icsk_syn_retries, val);
+unlock:
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(mptcp_sock_set_syncnt);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 482F317A300 for ; Fri, 13 Mar 2026 04:13:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375206; cv=none; b=mBQcZv3K0eE311FWcEq3C0qPQLvRQp124jljD5LRwGZfvF2VCBoVC41fn+u8eF+fAqK2XzVNfDSaGYC9vWlosz9sNWfb4aAeNaLgyvqTPX1zlZqCzCNepyaXx4FwVVcAZnYSlUyBbw49nk6WXipIRJ4QfaKUvopEquBGnFd1rKg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375206; c=relaxed/simple; bh=txGusJsb2qe3LEaVFOYsAVYjDe3zgK0OofgTE7CT3bk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=paZ+xWw8FSFzBKbrdNRINZjxj0MU7jQ35ic2rlM+e9Zngf930V7HgQWc6+OB8W74L5dQ3ETCjrRYR3hyZOGGrugwEC7rJ4SLeZguUu6qBRk2AXFawQHycZ0Fh2ed2cm35JI/DZyX4rFgixaO7QExbV1yGmO5E12j6GVsWozQZtA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org
header.i=@kernel.org header.b=bkduPLD0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bkduPLD0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F1528C19424; Fri, 13 Mar 2026 04:13:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375205; bh=txGusJsb2qe3LEaVFOYsAVYjDe3zgK0OofgTE7CT3bk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bkduPLD0CJ/VqZY1QjXMkSAxRl9RCq1mvdJERqbC9WDj3lroWX32ArZ4IDloSMb6y Gcgf2Gy1+JjixFkLVIbSSQWAVIySMe7ThHdT0Jj1fD4MePhi0R4S0qztLpyFE6WaXT BoeqHPYFPFnPwIHVtHOu8GSKOsYlal6F76TdrxRdG6mrhFO8cUWkwyrmw1LoC6Jtna u5n4wRi+EvIUQN+U10rASHbkA0UgEJrOznB01LWcGqPPpTbKD7Nlpf4ID7259uGOOO KcnBsYRT0BvV+1bcTixDNIu8mp4vbhK5GjT3WaUsj5OUCqH3ZIGJHNeptjWkElgfp3 tkaKZzAD0emfQ==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang, Hannes Reinecke, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v5 6/7] nvme-tcp: add mptcp support
Date: Fri, 13 Mar 2026 12:12:50 +0800
Message-ID: <3fbbe8a82795986d3a14c3e3a89077e8605d416b.1773374342.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Geliang Tang

This patch defines a new nvmf_transport_ops named nvme_mptcp_transport,
which is almost the same as nvme_tcp_transport except for .name and
.allowed_opts.

MPTCP currently does not support TLS, so the four TLS-related options
(NVMF_OPT_TLS, NVMF_OPT_KEYRING, NVMF_OPT_TLS_KEY, and NVMF_OPT_CONCAT)
are omitted from allowed_opts. They will be added back once MPTCP TLS is
supported.
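
To make the effect of the reduced allowed_opts mask concrete, here is a minimal stand-alone sketch in user-space C. It is an illustration under stated assumptions rather than the fabrics code: the `SKETCH_` bit values are placeholders and not the real NVMF_OPT_* values, only two non-TLS options are modelled, and `sketch_opts_permitted()` is a hypothetical helper showing one way a request mask can be checked against an allowed mask.

```c
#include <assert.h>    /* static_assert */
#include <stdbool.h>

/* Placeholder option bits, chosen for this sketch only. */
#define SKETCH_OPT_TRSVCID     (1u << 0)
#define SKETCH_OPT_HDR_DIGEST  (1u << 1)
#define SKETCH_OPT_TLS         (1u << 2)
#define SKETCH_OPT_KEYRING     (1u << 3)
#define SKETCH_OPT_TLS_KEY     (1u << 4)
#define SKETCH_OPT_CONCAT      (1u << 5)

/* The four TLS-related options named in the commit message. */
#define SKETCH_TLS_OPTS \
	(SKETCH_OPT_TLS | SKETCH_OPT_KEYRING | \
	 SKETCH_OPT_TLS_KEY | SKETCH_OPT_CONCAT)

/* TCP allows the TLS options; the MPTCP mask is the same set without them. */
#define SKETCH_TCP_ALLOWED \
	(SKETCH_OPT_TRSVCID | SKETCH_OPT_HDR_DIGEST | SKETCH_TLS_OPTS)
#define SKETCH_MPTCP_ALLOWED (SKETCH_TCP_ALLOWED & ~SKETCH_TLS_OPTS)

static_assert((SKETCH_MPTCP_ALLOWED & SKETCH_TLS_OPTS) == 0,
	      "MPTCP mask must not contain any TLS option");

/* True when every bit set in 'requested' is also set in 'allowed'. */
static bool sketch_opts_permitted(unsigned int allowed, unsigned int requested)
{
	return (requested & ~allowed) == 0;
}
```

With these masks, a request carrying SKETCH_OPT_TLS passes against SKETCH_TCP_ALLOWED and fails against SKETCH_MPTCP_ALLOWED, which is the behaviour the paragraph above describes for the "mptcp" transport.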

nvme_tcp_alloc_queue() checks whether opts->transport is "mptcp" to
decide whether to pass IPPROTO_MPTCP to sock_create_kern(), creating an
MPTCP socket instead of a TCP one.

v2:
 - use 'trtype' instead of '--mptcp' (Hannes)

v3:
 - check mptcp protocol from opts->transport instead of passing a
   parameter (Hannes).

v4:
 - check CONFIG_MPTCP.

Cc: Hannes Reinecke
Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 drivers/nvme/host/tcp.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9ab3f61196a3..8ead932e7999 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1766,6 +1766,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
 	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
+	int proto = IPPROTO_TCP;
 	int ret, rcv_pdu_size;
 	struct file *sock_file;
 
@@ -1782,9 +1783,14 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 		queue->cmnd_capsule_len = sizeof(struct nvme_command) +
 						NVME_TCP_ADMIN_CCSZ;
 
+#ifdef CONFIG_MPTCP
+	if (!strcmp(ctrl->ctrl.opts->transport, "mptcp"))
+		proto = IPPROTO_MPTCP;
+#endif
+
 	ret = sock_create_kern(current->nsproxy->net_ns,
 			ctrl->addr.ss_family, SOCK_STREAM,
-			IPPROTO_TCP, &queue->sock);
+			proto, &queue->sock);
 	if (ret) {
 		dev_err(nctrl->device,
 			"failed to create socket: %d\n", ret);
@@ -1801,9 +1807,13 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 	nvme_tcp_reclassify_socket(queue->sock);
 
 	/* Single syn retry */
+	sk_is_msk(queue->sock->sk) ?
+	mptcp_sock_set_syncnt(queue->sock->sk, 1) :
 	tcp_sock_set_syncnt(queue->sock->sk, 1);
 
 	/* Set TCP no delay */
+	sk_is_msk(queue->sock->sk) ?
+	mptcp_sock_set_nodelay(queue->sock->sk) :
 	tcp_sock_set_nodelay(queue->sock->sk);
 
 	/*
@@ -3022,6 +3032,18 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
 	.create_ctrl	= nvme_tcp_create_ctrl,
 };
 
+static struct nvmf_transport_ops nvme_mptcp_transport = {
+	.name		= "mptcp",
+	.module		= THIS_MODULE,
+	.required_opts	= NVMF_OPT_TRADDR,
+	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY |
+			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
+			  NVMF_OPT_HDR_DIGEST | NVMF_OPT_DATA_DIGEST |
+			  NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
+			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE,
+	.create_ctrl	= nvme_tcp_create_ctrl,
+};
+
 static int __init nvme_tcp_init_module(void)
 {
 	unsigned int wq_flags = WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_SYSFS;
@@ -3047,6 +3069,7 @@ static int __init nvme_tcp_init_module(void)
 		atomic_set(&nvme_tcp_cpu_queues[cpu], 0);
 
 	nvmf_register_transport(&nvme_tcp_transport);
+	nvmf_register_transport(&nvme_mptcp_transport);
 	return 0;
 }
 
@@ -3054,6 +3077,7 @@ static void __exit nvme_tcp_cleanup_module(void)
 {
 	struct nvme_tcp_ctrl *ctrl;
 
+	nvmf_unregister_transport(&nvme_mptcp_transport);
 	nvmf_unregister_transport(&nvme_tcp_transport);
 
 	mutex_lock(&nvme_tcp_ctrl_mutex);
-- 
2.53.0

From nobody Sun Mar 22 08:25:59 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3477617A300 for ; Fri, 13 Mar 2026 04:13:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375209; cv=none; b=MFS2gvj4c8lodPsKMGUw29MgIzzeuyIHPHw+a0KJ55iLQ8/Ea7bzs9fTSZYSXEy7QSbXggqMSou8Bd33lPzeUG/L3f/6NeXJG2E8Xjw4W9Ym05RYbIcRA7Xkh4cCI3TxbQRP/op9wm1bhkfCPTGuH7w7bljflSUh6Rz2c4qdQyc= ARC-Message-Signature: i=1;
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773375209; c=relaxed/simple; bh=cEGyVy8n1tJGXZ/PViBX1eqnGOgVTI/bDDBTDVx79R4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BxcmDEqO9zi1XcCakvs/yeuf8BQBfDeVdFBVelmIwHTMZoeTB+bWQ0AsSvr3eF19QsA3jrP41EhU7CgNYrI3c5J9HDvEK0gxOtDrDv5ibKYHJ6PGM9UVLGxHGAyEXuYw8gmoB8KNnST+R+itQ0dqYKgEkrf+T2FWFLAUAwuCuDY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Ez5w9Wl0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Ez5w9Wl0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 735CBC19421; Fri, 13 Mar 2026 04:13:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773375208; bh=cEGyVy8n1tJGXZ/PViBX1eqnGOgVTI/bDDBTDVx79R4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ez5w9Wl0oQhINcIYrhQjsTXrJPYXyf6Ts8V8/n4YmsT+QA3KCD2hcpzoh0o8/PHGx e5JFBfvg8Ve5f0bICo4cT4x2MsBp4gXOlmo14eO8jR5bzAuqskfoBsDQLo+7/3+KYl 7Zg2bOBraLLP0APhiykd2J8anNhk8pfBOWJJy+tELs2ARjbaNRzdET2PvUk9836+mL c1M67ytXpZY01x9RonhSH/LcoEqwvVK2gsQIe56oe2N8No2h+2Zqkdbx+do15NawsK ATGEg0sdleLfW5f0yTD0/otPQnfF//18nUZBxue6KucwGEQ6XAHeULujLy1wdR7V89 vA/t6XhlNVx1Q== From: Geliang Tang To: mptcp@lists.linux.dev Cc: Geliang Tang , Nilay Shroff , Ming Lei , zhenwei pi , Hui Zhu , Gang Yan Subject: [RFC mptcp-next v5 7/7] selftests: mptcp: add NVMe over MPTCP test Date: Fri, 13 Mar 2026 12:12:51 +0800 Message-ID: <11d42b89963a3b1a8a9d39c7044dcee5cd49ffcd.1773374342.git.tanggeliang@kylinos.cn> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Geliang Tang A test case for NVMe over MPTCP 
has been implemented. It verifies the proper functionality of nvme list, discover, connect, and disconnect commands. Additionally, read/write performance has been evaluated using fio. This test simulates four NICs on both target and host sides, each rate-limited to 1000mbit (about 125MB/s) with netem. It shows that 'NVMe over MPTCP' delivered about 3.8 times (read) and 3.5 times (write) the bandwidth of standard TCP: # ./mptcp_nvme.sh tcp READ: bw=3D112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s), io=3D1123MiB (1177MB), run=3D10018-10018msec WRITE: bw=3D112MiB/s (117MB/s), 112MiB/s-112MiB/s (117MB/s-117MB/s), io=3D1118MiB (1173MB), run=3D10018-10018msec # ./mptcp_nvme.sh mptcp READ: bw=3D427MiB/s (448MB/s), 427MiB/s-427MiB/s (448MB/s-448MB/s), io=3D4286MiB (4494MB), run=3D10039-10039msec WRITE: bw=3D387MiB/s (406MB/s), 387MiB/s-387MiB/s (406MB/s-406MB/s), io=3D3885MiB (4073MB), run=3D10043-10043msec Also add NVMe iopolicy testing to mptcp_nvme.sh, with the default set to "numa". It can be set to "round-robin" or "queue-depth". # ./mptcp_nvme.sh mptcp round-robin Cc: Nilay Shroff Cc: Ming Lei Co-developed-by: zhenwei pi Signed-off-by: zhenwei pi Co-developed-by: Hui Zhu Signed-off-by: Hui Zhu Co-developed-by: Gang Yan Signed-off-by: Gang Yan Signed-off-by: Geliang Tang --- tools/testing/selftests/net/mptcp/Makefile | 1 + tools/testing/selftests/net/mptcp/config | 7 + .../testing/selftests/net/mptcp/mptcp_nvme.sh | 205 ++++++++++++++++++ 3 files changed, 213 insertions(+) create mode 100755 tools/testing/selftests/net/mptcp/mptcp_nvme.sh diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/sel= ftests/net/mptcp/Makefile index 22ba0da2adb8..7b308447a58b 100644 --- a/tools/testing/selftests/net/mptcp/Makefile +++ b/tools/testing/selftests/net/mptcp/Makefile @@ -13,6 +13,7 @@ TEST_PROGS :=3D \ mptcp_connect_sendfile.sh \ mptcp_connect_splice.sh \ mptcp_join.sh \ + mptcp_nvme.sh \ mptcp_sockopt.sh \ pm_netlink.sh \ simult_flows.sh \ diff --git a/tools/testing/selftests/net/mptcp/config 
b/tools/testing/selft= ests/net/mptcp/config index 59051ee2a986..0eee348eff8b 100644 --- a/tools/testing/selftests/net/mptcp/config +++ b/tools/testing/selftests/net/mptcp/config @@ -34,3 +34,10 @@ CONFIG_NFT_SOCKET=3Dm CONFIG_NFT_TPROXY=3Dm CONFIG_SYN_COOKIES=3Dy CONFIG_VETH=3Dy +CONFIG_CONFIGFS_FS=3Dy +CONFIG_NVME_CORE=3Dy +CONFIG_NVME_FABRICS=3Dy +CONFIG_NVME_TCP=3Dy +CONFIG_NVME_TARGET=3Dy +CONFIG_NVME_TARGET_TCP=3Dy +CONFIG_NVME_MULTIPATH=3Dy diff --git a/tools/testing/selftests/net/mptcp/mptcp_nvme.sh b/tools/testin= g/selftests/net/mptcp/mptcp_nvme.sh new file mode 100755 index 000000000000..bc201a300b72 --- /dev/null +++ b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh @@ -0,0 +1,205 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +. "$(dirname "$0")/mptcp_lib.sh" + +ret=3D0 +trtype=3D"${1:-mptcp}" +iopolicy=3D${2:-"numa"} # round-robin, queue-depth +nqn=3Dnqn.2014-08.org.nvmexpress.${trtype}dev +ns=3D1 +port=3D1234 +trsvcid=3D4420 +ns1=3D"" +ns2=3D"" + +ns1_cleanup() +{ + mount -t configfs none /sys/kernel/config + + rm -rf /sys/kernel/config/nvmet/ports/"${port}"/subsystems/"${trtype}"sub= sys + rmdir /sys/kernel/config/nvmet/ports/"${port}" + echo 0 > /sys/kernel/config/nvmet/subsystems/"${nqn}"/namespaces/"${ns}"/= enable + echo -n 0 > /sys/kernel/config/nvmet/subsystems/"${nqn}"/namespaces/"${ns= }"/device_path + rmdir /sys/kernel/config/nvmet/subsystems/"${nqn}"/namespaces/"${ns}" + rmdir /sys/kernel/config/nvmet/subsystems/"${nqn}" +} + +ns2_cleanup() +{ + nvme disconnect -n "${nqn}" || true +} + +cleanup() +{ + ip netns exec "$ns2" bash <<- EOF + $(declare -f ns2_cleanup) + ns2_cleanup + EOF + + sleep 1 + + ip netns exec "$ns1" bash <<- EOF + $(declare -f ns1_cleanup) + ns1_cleanup + EOF + + losetup -d /dev/loop100 + rm -rf /tmp/test.raw + + mptcp_lib_ns_exit "$ns1" "$ns2" + + kill "$monitor_pid_ns1" 2>/dev/null + wait "$monitor_pid_ns1" 2>/dev/null + + kill "$monitor_pid_ns2" 2>/dev/null + wait "$monitor_pid_ns2" 2>/dev/null + + unset -v 
trtype nqn ns port trsvcid +} + +init() +{ + mptcp_lib_ns_init ns1 ns2 + + # ns1 ns2 + # 10.1.1.1 10.1.1.2 + # 10.1.2.1 10.1.2.2 + # 10.1.3.1 10.1.3.2 + # 10.1.4.1 10.1.4.2 + for i in {1..4}; do + ip link add ns1eth"$i" netns "$ns1" type veth peer name ns2eth"$i" netns= "$ns2" + ip -net "$ns1" addr add 10.1."$i".1/24 dev ns1eth"$i" + ip -net "$ns1" addr add dead:beef:"$i"::1/64 dev ns1eth"$i" nodad + ip -net "$ns1" link set ns1eth"$i" up + ip -net "$ns2" addr add 10.1."$i".2/24 dev ns2eth"$i" + ip -net "$ns2" addr add dead:beef:"$i"::2/64 dev ns2eth"$i" nodad + ip -net "$ns2" link set ns2eth"$i" up + ip -net "$ns2" route add default via 10.1."$i".1 dev ns2eth"$i" metric 1= 0"$i" + ip -net "$ns2" route add default via dead:beef:"$i"::1 dev ns2eth"$i" me= tric 10"$i" + + # Add tc qdisc to both namespaces for bandwidth limiting + tc -n "$ns1" qdisc add dev ns1eth"$i" root netem rate 1000mbit + tc -n "$ns2" qdisc add dev ns2eth"$i" root netem rate 1000mbit + done + + mptcp_lib_pm_nl_set_limits "${ns1}" 8 8 + + mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.2.1 flags signal + mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.3.1 flags signal + mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.4.1 flags signal + + mptcp_lib_pm_nl_set_limits "${ns2}" 8 8 + + mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.2.2 flags subflow + mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.3.2 flags subflow + mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.4.2 flags subflow + + ip -n "${ns1}" mptcp monitor & + monitor_pid_ns1=3D$! + ip -n "${ns2}" mptcp monitor & + monitor_pid_ns2=3D$! 
+} + +run_target() +{ + mount -t configfs none /sys/kernel/config + + cd /sys/kernel/config/nvmet/subsystems || exit + mkdir -p "${nqn}" + cd "${nqn}" || exit + echo 1 > attr_allow_any_host + mkdir -p namespaces/"${ns}" + echo /dev/loop100 > namespaces/"${ns}"/device_path + echo 1 > namespaces/"${ns}"/enable + + cd /sys/kernel/config/nvmet/ports || exit + mkdir -p "${port}" + cd "${port}" || exit + echo "${trtype}" > addr_trtype + echo ipv4 > addr_adrfam + echo 0.0.0.0 > addr_traddr + echo "${trsvcid}" > addr_trsvcid + + cd subsystems || exit + ln -sf ../../../subsystems/"${nqn}" "${trtype}"subsys +} + +run_host() +{ + local traddr=3D10.1.1.1 + + echo "nvme discover -a ${traddr}" + nvme discover -t "${trtype}" -a "${traddr}" -s "${trsvcid}" + if [ $? -ne 0 ]; then + return "${KSFT_FAIL}" + fi + + echo "nvme connect" + devname=3D$(nvme connect -t "${trtype}" -a "${traddr}" -s "${trsvcid}" -n= "${nqn}" | + awk '{print $NF}') + + sleep 1 + + echo "nvme list" + nvme list + + subname=3D$(nvme list-subsys /dev/"${devname}"n1 | + grep -o 'nvme-subsys[0-9]*' | head -1) + + echo "${iopolicy}" > /sys/class/nvme-subsystem/"${subname}"/iopolicy + cat /sys/class/nvme-subsystem/"${subname}"/iopolicy + + echo "fio randread /dev/${devname}n1" + fio --name=3Dglobal --direct=3D1 --norandommap --randrepeat=3D0 --ioengin= e=3Dlibaio \ + --thread=3D1 --blocksize=3D4k --runtime=3D10 --time_based --rw=3Drand= read --numjobs=3D4 \ + --iodepth=3D256 --group_reporting --size=3D100% --name=3Dlibaio_4_256= _4k_randread \ + --size=3D4m --filename=3D/dev/"${devname}"n1 + + sleep 1 + + echo "fio randwrite /dev/${devname}n1" + fio --name=3Dglobal --direct=3D1 --norandommap --randrepeat=3D0 --ioengin= e=3Dlibaio \ + --thread=3D1 --blocksize=3D4k --runtime=3D10 --time_based --rw=3Drand= write --numjobs=3D4 \ + --iodepth=3D256 --group_reporting --size=3D100% --name=3Dlibaio_4_256= _4k_randwrite \ + --size=3D4m --filename=3D/dev/"${devname}"n1 + + nvme flush /dev/"${devname}"n1 + + sleep 1 +} + +init 
+trap cleanup EXIT + +dd if=3D/dev/zero of=3D/tmp/test.raw bs=3D1M count=3D0 seek=3D512 +losetup /dev/loop100 /tmp/test.raw + +run_test() +{ + export trtype nqn ns port trsvcid + export iopolicy + + if ! ip netns exec "$ns1" bash <<- EOF + $(declare -f run_target) + run_target + exit \$? + EOF + then + ret=3D"${KSFT_FAIL}" + fi + + if ! ip netns exec "$ns2" bash <<- EOF + $(declare -f run_host) + run_host + exit \$? + EOF + then + ret=3D"${KSFT_FAIL}" + fi +} + +run_test "$@" + +mptcp_lib_result_print_all_tap +exit "$ret" --=20 2.53.0