From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang, Hannes Reinecke, Nilay Shroff, Ming Lei, zhenwei pi, Hui Zhu, Gang Yan
Subject: [RFC mptcp-next v8 7/7] selftests: mptcp: add NVMe over MPTCP test
Date: Wed, 1 Apr 2026 20:53:45 +0800
X-Mailer: git-send-email 2.51.0
X-Mailing-List: mptcp@lists.linux.dev

From: Geliang Tang

Add a test case for NVMe over MPTCP. It verifies that the nvme discover,
connect, list and disconnect commands work, and measures random
read/write performance with fio.

The test simulates four NICs on both the target and host sides, each
rate-limited to 1000 Mbit/s (about 125 MB/s) with netem. In the runs
below, NVMe over MPTCP delivers about 3.8 times the read bandwidth and
about 3.5 times the write bandwidth of plain TCP:

 # ./mptcp_nvme.sh tcp
   READ: bw=112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s), io=1123MiB (1177MB), run=10018-10018msec
  WRITE: bw=112MiB/s (117MB/s), 112MiB/s-112MiB/s (117MB/s-117MB/s), io=1118MiB (1173MB), run=10018-10018msec

 # ./mptcp_nvme.sh mptcp
   READ: bw=427MiB/s (448MB/s), 427MiB/s-427MiB/s (448MB/s-448MB/s), io=4286MiB (4494MB), run=10039-10039msec
  WRITE: bw=387MiB/s (406MB/s), 387MiB/s-387MiB/s (406MB/s-406MB/s), io=3885MiB (4073MB), run=10043-10043msec

Also add NVMe iopolicy testing to mptcp_nvme.sh. The iopolicy is taken
from the second argument; it defaults to "numa" and can also be set to
"round-robin" or "queue-depth":

 # ./mptcp_nvme.sh mptcp round-robin

Cc: Hannes Reinecke
Cc: Nilay Shroff
Cc: Ming Lei
Co-developed-by: zhenwei pi
Signed-off-by: zhenwei pi
Co-developed-by: Hui Zhu
Signed-off-by: Hui Zhu
Co-developed-by: Gang Yan
Signed-off-by: Gang Yan
Signed-off-by: Geliang Tang
---
 tools/testing/selftests/net/mptcp/Makefile    |   1 +
 tools/testing/selftests/net/mptcp/config      |   7 +
 .../testing/selftests/net/mptcp/mptcp_lib.sh  |  12 +
 .../testing/selftests/net/mptcp/mptcp_nvme.sh | 240 ++++++++++++++++++
 4 files changed, 260 insertions(+)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_nvme.sh

diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile
index 22ba0da2adb8..7b308447a58b 100644
--- a/tools/testing/selftests/net/mptcp/Makefile
+++ b/tools/testing/selftests/net/mptcp/Makefile
@@ -13,6 +13,7 @@ TEST_PROGS := \
 	mptcp_connect_sendfile.sh \
 	mptcp_connect_splice.sh \
 	mptcp_join.sh \
+	mptcp_nvme.sh \
 	mptcp_sockopt.sh \
 	pm_netlink.sh \
 	simult_flows.sh \
diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index 59051ee2a986..0eee348eff8b 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -34,3 +34,10 @@ CONFIG_NFT_SOCKET=m
 CONFIG_NFT_TPROXY=m
 CONFIG_SYN_COOKIES=y
 CONFIG_VETH=y
+CONFIG_CONFIGFS_FS=y
+CONFIG_NVME_CORE=y
+CONFIG_NVME_FABRICS=y
+CONFIG_NVME_TCP=y
+CONFIG_NVME_TARGET=y
+CONFIG_NVME_TARGET_TCP=y
+CONFIG_NVME_MULTIPATH=y
diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
index 5fea7e7df628..62e01afc50ed 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
@@ -526,6 +526,18 @@ mptcp_lib_check_tools() {
 				exit ${KSFT_SKIP}
 			fi
 			;;
+		"nvme")
+			if ! nvme --version &> /dev/null; then
+				mptcp_lib_pr_skip "Could not run test without nvme tool"
+				exit ${KSFT_SKIP}
+			fi
+			;;
+		"fio")
+			if ! fio -h &> /dev/null; then
+				mptcp_lib_pr_skip "Could not run test without fio tool"
+				exit ${KSFT_SKIP}
+			fi
+			;;
 		*)
 			mptcp_lib_pr_fail "Internal error: unsupported tool: ${tool}"
 			exit ${KSFT_FAIL}
diff --git a/tools/testing/selftests/net/mptcp/mptcp_nvme.sh b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
new file mode 100755
index 000000000000..101536b66b9d
--- /dev/null
+++ b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
@@ -0,0 +1,240 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(dirname "$0")/mptcp_lib.sh"
+
+ret=0
+trtype="${1:-mptcp}"
+iopolicy=${2:-"numa"} # round-robin, queue-depth
+nqn="nqn.2014-08.org.nvmexpress.${trtype}dev.$$.${RANDOM}"
+ns=1
+port=$((RANDOM % 10000 + 20000))
+trsvcid=$((RANDOM % 64512 + 1024))
+ns1=""
+ns2=""
+temp_file=$(mktemp /tmp/test.XXXXXX.raw)
+loop_dev=""
+
+ns1_cleanup()
+{
+	mount -t configfs none /sys/kernel/config
+
+	pushd /sys/kernel/config/nvmet || exit 1
+	rm -rf ports/"${port}"/subsystems/"${trtype}"subsys
+	rmdir ports/"${port}"
+	echo 0 > subsystems/"${nqn}"/namespaces/"${ns}"/enable
+	echo -n 0 > subsystems/"${nqn}"/namespaces/"${ns}"/device_path
+	rmdir subsystems/"${nqn}"/namespaces/"${ns}"
+	rmdir subsystems/"${nqn}"
+	popd || exit 1
+}
+
+ns2_cleanup()
+{
+	nvme disconnect -n "${nqn}" || true
+}
+
+cleanup()
+{
+	ip netns exec "$ns2" bash <<- EOF
+		$(declare -f ns2_cleanup)
+		ns2_cleanup
+	EOF
+
+	sleep 1
+
+	ip netns exec "$ns1" bash <<- EOF
+		$(declare -f ns1_cleanup)
+		ns1_cleanup
+	EOF
+
+	if [ -n "${loop_dev}" ] && [ -b "${loop_dev}" ]; then
+		losetup -d "${loop_dev}" 2>/dev/null || true
+	fi
+	rm -rf "${temp_file}"
+
+	mptcp_lib_ns_exit "$ns1" "$ns2"
+
+	kill "$monitor_pid_ns1" 2>/dev/null
+	wait "$monitor_pid_ns1" 2>/dev/null
+
+	kill "$monitor_pid_ns2" 2>/dev/null
+	wait "$monitor_pid_ns2" 2>/dev/null
+
+	unset -v trtype nqn ns port trsvcid
+}
+
+init()
+{
+	mptcp_lib_ns_init ns1 ns2
+
+	#      ns1         ns2
+	#   10.1.1.1    10.1.1.2
+	#   10.1.2.1    10.1.2.2
+	#   10.1.3.1    10.1.3.2
+	#   10.1.4.1    10.1.4.2
+	for i in {1..4}; do
+		ip link add ns1eth"$i" netns "$ns1" type veth peer \
+			name ns2eth"$i" netns "$ns2"
+		ip -net "$ns1" addr add 10.1."$i".1/24 dev ns1eth"$i"
+		ip -net "$ns1" addr add dead:beef:"$i"::1/64 \
+			dev ns1eth"$i" nodad
+		ip -net "$ns1" link set ns1eth"$i" up
+		ip -net "$ns2" addr add 10.1."$i".2/24 dev ns2eth"$i"
+		ip -net "$ns2" addr add dead:beef:"$i"::2/64 \
+			dev ns2eth"$i" nodad
+		ip -net "$ns2" link set ns2eth"$i" up
+		ip -net "$ns2" route add default via 10.1."$i".1 \
+			dev ns2eth"$i" metric 10"$i"
+		ip -net "$ns2" route add default via dead:beef:"$i"::1 \
+			dev ns2eth"$i" metric 10"$i"
+
+		# Add tc qdisc to both namespaces for bandwidth limiting
+		tc -n "$ns1" qdisc add dev ns1eth"$i" root netem rate 1000mbit
+		tc -n "$ns2" qdisc add dev ns2eth"$i" root netem rate 1000mbit
+	done
+
+	mptcp_lib_pm_nl_set_limits "${ns1}" 8 8
+
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.2.1 flags signal
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.3.1 flags signal
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.4.1 flags signal
+
+	mptcp_lib_pm_nl_set_limits "${ns2}" 8 8
+
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.2.2 flags subflow
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.3.2 flags subflow
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.4.2 flags subflow
+
+	ip -n "${ns1}" mptcp monitor &
+	monitor_pid_ns1=$!
+	ip -n "${ns2}" mptcp monitor &
+	monitor_pid_ns2=$!
+}
+
+run_target()
+{
+	mount -t configfs none /sys/kernel/config
+
+	cd /sys/kernel/config/nvmet/subsystems || exit
+	mkdir -p "${nqn}"
+	cd "${nqn}" || exit
+	echo 1 > attr_allow_any_host
+	mkdir -p namespaces/"${ns}"
+	echo "${loop_dev}" > namespaces/"${ns}"/device_path
+	echo 1 > namespaces/"${ns}"/enable
+
+	cd /sys/kernel/config/nvmet/ports || exit
+	mkdir -p "${port}"
+	cd "${port}" || exit
+	echo "${trtype}" > addr_trtype
+	echo ipv4 > addr_adrfam
+	echo 0.0.0.0 > addr_traddr
+	echo "${trsvcid}" > addr_trsvcid
+
+	cd subsystems || exit
+	ln -sf ../../../subsystems/"${nqn}" "${trtype}"subsys
+}
+
+run_host()
+{
+	local traddr=10.1.1.1
+	local output
+	local devname
+	local subname
+
+	echo "nvme discover -a ${traddr}"
+	nvme discover -t "${trtype}" -a "${traddr}" -s "${trsvcid}"
+	if [ $? -ne 0 ]; then
+		return 1
+	fi
+
+	echo "nvme connect"
+	output=$(nvme connect -t "${trtype}" -a "${traddr}" \
+		-s "${trsvcid}" -n "${nqn}" 2>&1)
+	if [ $? -ne 0 ]; then
+		echo "nvme connect failed: $output" >&2
+		return 1
+	fi
+
+	devname=$(echo "$output" | awk '{print $NF}')
+	if [ -z "$devname" ]; then
+		echo "Failed to parse device name from output: $output" >&2
+		return 1
+	fi
+
+	sleep 1
+
+	echo "nvme list"
+	nvme list
+
+	subname=$(nvme list-subsys /dev/"${devname}"n1 |
+		grep -o 'nvme-subsys[0-9]*' | head -1)
+
+	echo "${iopolicy}" > /sys/class/nvme-subsystem/"${subname}"/iopolicy
+	cat /sys/class/nvme-subsystem/"${subname}"/iopolicy
+
+	echo "fio randread /dev/${devname}n1"
+	fio --name=global --direct=1 --norandommap --randrepeat=0 \
+		--ioengine=libaio --thread=1 --blocksize=4k --runtime=10 \
+		--time_based --rw=randread --numjobs=4 --iodepth=256 \
+		--group_reporting --size=100% --name=libaio_4_256_4k_randread \
+		--filename=/dev/"${devname}"n1
+	if [ $? -ne 0 ]; then
+		return 1
+	fi
+
+	sleep 1
+
+	echo "fio randwrite /dev/${devname}n1"
+	fio --name=global --direct=1 --norandommap --randrepeat=0 \
+		--ioengine=libaio --thread=1 --blocksize=4k --runtime=10 \
+		--time_based --rw=randwrite --numjobs=4 --iodepth=256 \
+		--group_reporting --size=100% --name=libaio_4_256_4k_randwrite \
+		--filename=/dev/"${devname}"n1
+	if [ $? -ne 0 ]; then
+		return 1
+	fi
+
+	nvme flush /dev/"${devname}"n1
+}
+
+mptcp_lib_check_tools nvme fio
+
+init
+trap cleanup EXIT
+
+dd if=/dev/zero of="${temp_file}" bs=1M count=0 seek=512
+loop_dev=$(losetup -f --show "${temp_file}")
+
+run_test()
+{
+	export trtype nqn ns port trsvcid
+	export loop_dev temp_file
+	export iopolicy
+
+	if ! ip netns exec "$ns1" bash <<- EOF
+		$(declare -f run_target)
+		run_target
+		exit \$?
+	EOF
+	then
+		ret="${KSFT_FAIL}"
+	fi
+
+	if ! ip netns exec "$ns2" bash <<- EOF
+		$(declare -f run_host)
+		run_host
+		exit \$?
+	EOF
+	then
+		ret="${KSFT_FAIL}"
+	fi
+
+	sleep 1
+}
+
+run_test "$@"
+
+mptcp_lib_result_print_all_tap
+exit "$ret"
-- 
2.51.0