-------- Forwarded message -------
From: gang.yan@linux.dev mailto:gang.yan@linux.dev
To: "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E >
Sent: April 3, 2026 at 2:44 PM
Subject: Re: [PATCH mptcp-net v5 4/5] mptcp: fix the dead_lock in mptcp_data_ready
April 2, 2026 at 4:02 AM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
>
> On 4/1/26 10:54 AM, Gang Yan wrote:
>
>
> From: Gang Yan <yangang@kylinos.cn>
>
> This patch defers mptcp_check_data_fin from __mptcp_move_skbs to avoid
> deadlock.
>
> When processing backlogged data in softirq context, __mptcp_move_skbs
> directly calls mptcp_check_data_fin, which can lead to a deadlock if
> the msk socket is in TCP_FIN_WAIT2 state. In that case,
> mptcp_check_data_fin calls mptcp_shutdown_subflows, which attempts to
> lock the subflow socket with lock_sock_fast() - a sleeping function
> that cannot be called in atomic context (softirq).
>
> The fix for such issue should be squashed inside the patch generating
> the issue itself.
>
>
> The correct pattern is already used in move_skbs_to_msk: if a data fin
> is pending, schedule the work to be processed later in process context
> rather than handling it directly.
>
> Reported-by: Geliang Tang <geliang@kernel.org>
> Co-developed-by: Geliang Tang <geliang@kernel.org>
> Signed-off-by: Geliang Tang <geliang@kernel.org>
> Signed-off-by: Gang Yan <yangang@kylinos.cn>
> ---
> Notes:
>
> Hi Matt,Paolo:
>
> Here is the response of AI review for this patch:
>
> - Typo problem in subject line:
> Already fixed in v5.
>
> - Remove else branch:
> It demonstrates a TOCTOU race, though the race window is extremely
> narrow. The else branch is useless even without the race. So I
> removed it in v5. WDYT?
>
> Thanks,
> Gang
>
> Changelog:
> v5:
> - Fix typo in title.
> - Remove the else branch.
>
> ---
> net/mptcp/protocol.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 326b5d4d79e7..88e19fa61b84 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2247,8 +2247,9 @@ static bool __mptcp_move_skbs(struct sock *sk, struct list_head *skbs, u32 *delt
> }
>
> __mptcp_ofo_queue(msk);
> - if (moved)
> - mptcp_check_data_fin((struct sock *)msk);
> + if (moved && mptcp_pending_data_fin(sk, NULL))
> + mptcp_schedule_work(sk);
>
> Scheduling the worker when we are not in BH could introduce problematic
> delays. Instead you could factor out a ____mptcp_move_skbs() helper (a
> better name would be nice!) not including the mptcp_check_data_fin()
> call, use the latter in mptcp_data_ready() and in such function
> eventually schedule the mptcp work as needed.
>
Hi Paolo,
Thanks for your review.
I'm a little bit confused. Does this mean the 'move_skbs_to_msk' should be
fixed too? like:
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 6ce0d18c86b6..e90c8621683c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -869,13 +869,6 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, struct sock *ssk)
if (unlikely(ssk->sk_err))
__mptcp_subflow_error_report(sk, ssk);
- /* If the moves have caught up with the DATA_FIN sequence number
- * it's time to ack the DATA_FIN and change socket state, but
- * this is not a good place to change state. Let the workqueue
- * do it.
- */
- if (mptcp_pending_data_fin(sk, NULL))
- mptcp_schedule_work(sk);
return moved;
}
@@ -909,6 +902,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
struct mptcp_sock *msk = mptcp_sk(sk);
+ bool moved = false;
/* The peer can send data while we are shutting down this
* subflow at subflow destruction time, but we must avoid enqueuing
@@ -920,13 +914,17 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
mptcp_data_lock(sk);
mptcp_rcv_rtt_update(msk, subflow);
if (!sock_owned_by_user(sk)) {
+ moved = move_skbs_to_msk(msk, ssk);
/* Wake-up the reader only for in-sequence data */
- if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk))
+ if (moved && mptcp_epollin_ready(sk))
sk->sk_data_ready(sk);
} else {
__mptcp_move_skbs_from_subflow(msk, ssk, false);
}
mptcp_data_unlock(sk);
+
+ if (mptcp_pending_data_fin(sk, NULL))
+ mptcp_check_data_fin(sk);
}
Looking forward for your reply
Very thanks
Gang
>
> /P
>
© 2016 - 2026 Red Hat, Inc.