From nobody Fri Oct 25 19:33:44 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0BE5517755; Mon, 21 Aug 2023 22:25:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6CDCC433CB; Mon, 21 Aug 2023 22:25:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1692656722; bh=ZegXcvFdUUUbQrQhYHQhTMtH3PwvZ1GQ2Ih8qmeynjw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=TrW6DIpLvGwjbWQZYqE/4N10iaYht1a2I04DSKCrM0jJ1sfKQDBa4lVkyRlfULrEb XfPLcfrnmAhFyjVLMZt/VUGHqSCMpOK9Ucj2QtE0e7tdle87txEIKsveTLCSmOsRZH xRajB0du2UG+uBpJ7jpCiP+oeA5o8dpj+TO2ush/GFiEB4b4lHGOINSkLThSm9VYMd GwOt8AEm85cMyugDYTvbdCBC3YsHPDZ07dRNHbHBUsRI9rduw2NV2MDExIsQeaKu33 Ij8LVSaV7MNqkgiDy0slWoi9diEdM1ZN0Y3rOo8jFtgjI/eqz47+eVMfY7Alu1bxwP CvI2fAGbVBI7g== From: Mat Martineau Date: Mon, 21 Aug 2023 15:25:12 -0700 Subject: [PATCH net-next 01/10] mptcp: refactor push_pending logic Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20230821-upstream-net-next-20230818-v1-1-0c860fb256a8@kernel.org> References: <20230821-upstream-net-next-20230818-v1-0-0c860fb256a8@kernel.org> In-Reply-To: <20230821-upstream-net-next-20230818-v1-0-0c860fb256a8@kernel.org> To: Matthieu Baerts , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev, Geliang Tang , Mat Martineau X-Mailer: b4 0.12.3 From: Geliang Tang To support redundant package schedulers more easily, this patch refactors __mptcp_push_pending() logic from: For each dfrag: While sends succeed: Call the scheduler (selects subflow and msk->snd_burst) Update subflow locks (push/release/acquire as needed) Send the dfrag data with mptcp_sendmsg_frag() Update already_sent, snd_nxt, snd_burst Update msk->first_pending Push/release on final subflow -> While first_pending isn't empty: Call the scheduler (selects subflow and msk->snd_burst) Update subflow locks (push/release/acquire as needed) For each pending dfrag: While sends succeed: Send the dfrag data with mptcp_sendmsg_frag() Update already_sent, snd_nxt, snd_burst Update msk->first_pending Break if required by msk->snd_burst / etc Push/release on final subflow Refactors __mptcp_subflow_push_pending logic from: For each dfrag: While sends succeed: Call the scheduler (selects subflow and msk->snd_burst) Send the dfrag data with mptcp_subflow_delegate(), break Send the dfrag data with mptcp_sendmsg_frag() Update dfrag->already_sent, msk->snd_nxt, msk->snd_burst Update msk->first_pending -> While first_pending isn't empty: Call the scheduler (selects subflow and msk->snd_burst) Send the dfrag data with mptcp_subflow_delegate(), break Send the dfrag data with mptcp_sendmsg_frag() For each pending dfrag: While sends succeed: Send the dfrag data with mptcp_sendmsg_frag() Update already_sent, snd_nxt, snd_burst Update msk->first_pending Break if required by msk->snd_burst / etc Move the duplicate code from __mptcp_push_pending() and __mptcp_subflow_push_pending() into a new helper function, named __subflow_push_pending(). Simplify __mptcp_push_pending() and __mptcp_subflow_push_pending() by invoking this helper. Also move the burst check conditions out of the function mptcp_subflow_get_send(), check them in __subflow_push_pending() in the inner "for each pending dfrag" loop. Reviewed-by: Mat Martineau Signed-off-by: Geliang Tang Signed-off-by: Mat Martineau --- net/mptcp/protocol.c | 153 +++++++++++++++++++++++++++--------------------= ---- 1 file changed, 81 insertions(+), 72 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 6019a3cf1625..29c662ffcd05 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1386,14 +1386,6 @@ static struct sock *mptcp_subflow_get_send(struct mp= tcp_sock *msk) sk_stream_memory_free(msk->first) ? msk->first : NULL; } =20 - /* re-use last subflow, if the burst allow that */ - if (msk->last_snd && msk->snd_burst > 0 && - sk_stream_memory_free(msk->last_snd) && - mptcp_subflow_active(mptcp_subflow_ctx(msk->last_snd))) { - mptcp_set_timeout(sk); - return msk->last_snd; - } - /* pick the subflow with the lower wmem/wspace ratio */ for (i =3D 0; i < SSK_MODE_MAX; ++i) { send_info[i].ssk =3D NULL; @@ -1499,57 +1491,86 @@ void mptcp_check_and_set_pending(struct sock *sk) mptcp_sk(sk)->push_pending |=3D BIT(MPTCP_PUSH_PENDING); } =20 -void __mptcp_push_pending(struct sock *sk, unsigned int flags) +static int __subflow_push_pending(struct sock *sk, struct sock *ssk, + struct mptcp_sendmsg_info *info) { - struct sock *prev_ssk =3D NULL, *ssk =3D NULL; struct mptcp_sock *msk =3D mptcp_sk(sk); - struct mptcp_sendmsg_info info =3D { - .flags =3D flags, - }; - bool do_check_data_fin =3D false; struct mptcp_data_frag *dfrag; - int len; + int len, copied =3D 0, err =3D 0; =20 while ((dfrag =3D mptcp_send_head(sk))) { - info.sent =3D dfrag->already_sent; - info.limit =3D dfrag->data_len; + info->sent =3D dfrag->already_sent; + info->limit =3D dfrag->data_len; len =3D dfrag->data_len - dfrag->already_sent; while (len > 0) { int ret =3D 0; =20 - prev_ssk =3D ssk; - ssk =3D mptcp_subflow_get_send(msk); - - /* First check. If the ssk has changed since - * the last round, release prev_ssk - */ - if (ssk !=3D prev_ssk && prev_ssk) - mptcp_push_release(prev_ssk, &info); - if (!ssk) - goto out; - - /* Need to lock the new subflow only if different - * from the previous one, otherwise we are still - * helding the relevant lock - */ - if (ssk !=3D prev_ssk) - lock_sock(ssk); - - ret =3D mptcp_sendmsg_frag(sk, ssk, dfrag, &info); + ret =3D mptcp_sendmsg_frag(sk, ssk, dfrag, info); if (ret <=3D 0) { - if (ret =3D=3D -EAGAIN) - continue; - mptcp_push_release(ssk, &info); + err =3D copied ? : ret; goto out; } =20 - do_check_data_fin =3D true; - info.sent +=3D ret; + info->sent +=3D ret; + copied +=3D ret; len -=3D ret; =20 mptcp_update_post_push(msk, dfrag, ret); } WRITE_ONCE(msk->first_pending, mptcp_send_next(sk)); + + if (msk->snd_burst <=3D 0 || + !sk_stream_memory_free(ssk) || + !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) { + err =3D copied; + goto out; + } + mptcp_set_timeout(sk); + } + err =3D copied; + +out: + return err; +} + +void __mptcp_push_pending(struct sock *sk, unsigned int flags) +{ + struct sock *prev_ssk =3D NULL, *ssk =3D NULL; + struct mptcp_sock *msk =3D mptcp_sk(sk); + struct mptcp_sendmsg_info info =3D { + .flags =3D flags, + }; + bool do_check_data_fin =3D false; + + while (mptcp_send_head(sk)) { + int ret =3D 0; + + prev_ssk =3D ssk; + ssk =3D mptcp_subflow_get_send(msk); + + /* First check. If the ssk has changed since + * the last round, release prev_ssk + */ + if (ssk !=3D prev_ssk && prev_ssk) + mptcp_push_release(prev_ssk, &info); + if (!ssk) + goto out; + + /* Need to lock the new subflow only if different + * from the previous one, otherwise we are still + * helding the relevant lock + */ + if (ssk !=3D prev_ssk) + lock_sock(ssk); + + ret =3D __subflow_push_pending(sk, ssk, &info); + if (ret <=3D 0) { + if (ret =3D=3D -EAGAIN) + continue; + mptcp_push_release(ssk, &info); + goto out; + } + do_check_data_fin =3D true; } =20 /* at this point we held the socket lock for the last subflow we used */ @@ -1570,42 +1591,30 @@ static void __mptcp_subflow_push_pending(struct soc= k *sk, struct sock *ssk, bool struct mptcp_sendmsg_info info =3D { .data_lock_held =3D true, }; - struct mptcp_data_frag *dfrag; struct sock *xmit_ssk; - int len, copied =3D 0; + int copied =3D 0; =20 info.flags =3D 0; - while ((dfrag =3D mptcp_send_head(sk))) { - info.sent =3D dfrag->already_sent; - info.limit =3D dfrag->data_len; - len =3D dfrag->data_len - dfrag->already_sent; - while (len > 0) { - int ret =3D 0; - - /* check for a different subflow usage only after - * spooling the first chunk of data - */ - xmit_ssk =3D first ? ssk : mptcp_subflow_get_send(msk); - if (!xmit_ssk) - goto out; - if (xmit_ssk !=3D ssk) { - mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), - MPTCP_DELEGATE_SEND); - goto out; - } - - ret =3D mptcp_sendmsg_frag(sk, ssk, dfrag, &info); - if (ret <=3D 0) - goto out; + while (mptcp_send_head(sk)) { + int ret =3D 0; =20 - info.sent +=3D ret; - copied +=3D ret; - len -=3D ret; - first =3D false; - - mptcp_update_post_push(msk, dfrag, ret); + /* check for a different subflow usage only after + * spooling the first chunk of data + */ + xmit_ssk =3D first ? ssk : mptcp_subflow_get_send(msk); + if (!xmit_ssk) + goto out; + if (xmit_ssk !=3D ssk) { + mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), + MPTCP_DELEGATE_SEND); + goto out; } - WRITE_ONCE(msk->first_pending, mptcp_send_next(sk)); + + ret =3D __subflow_push_pending(sk, ssk, &info); + first =3D false; + if (ret <=3D 0) + break; + copied +=3D ret; } =20 out: --=20 2.41.0