From nobody Thu Nov 27 12:35:52 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 623C4139D for ; Thu, 13 Nov 2025 00:11:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992667; cv=none; b=mpnIZhkEPOuUaoi6BFS1bVW85Uo/ywuFSl2hzLXXswlX4p3cPhHzcPySp604Zv7g+cRtuCUyugsYrKckf0uhvK1leI0sIztsKD+RIBKgh+spAaaZ3NfdSNT67tMST6VnUDq6x8Em+3JuOKMvvlGTiaBBYhHPkChUN+BHlbeOIUI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992667; c=relaxed/simple; bh=0YMgWiYBF6yn1Ng5wQZdIio18el3SYEsrQIRUvPYOBI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=NuVkSvS58rgTtdMVwzzlUVUcTnsSnA0teOdUZtuh6ZufJ5vsoUFGrnbTC9o89+/P7eT32UklNy4jUAEg4c1MJ6L8cwFbsmfO/Gn0BiAbn9+Q0kJtUucMCEWTbUE/C3D/j9lmap6NNNJrUQtiitlgQi/xbBPkb0GKPpRAGBze9M4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=DpU8R9AB; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DpU8R9AB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1762992663; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qIdG2aOjc1nyZ6RoNo9HwYB9lkd7OQSVaoQZpHTMP+w=; b=DpU8R9ABqvCYzT/bVhwfRoqW1ELXmcgLOZstw4hbTYpwwFx6sZWNUr2tWCi8Zt1JsZotJ3 WNKb+y6864ZKSdYmjP7sTkZR3+1jctaJuFDC7pb7m88JwcVevIvM9QsOt8i0BQsSPt4U38 FxO+HZRWKJo2WTMd4srLOvdlRRstfFU= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-631-uoUxyaf_N9Cxa-BmppLPIg-1; Wed, 12 Nov 2025 19:11:01 -0500 X-MC-Unique: uoUxyaf_N9Cxa-BmppLPIg-1 X-Mimecast-MFC-AGG-ID: uoUxyaf_N9Cxa-BmppLPIg_1762992661 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E25651956095 for ; Thu, 13 Nov 2025 00:11:00 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.33.120]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 0E6D830044E0 for ; Thu, 13 Nov 2025 00:10:59 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v3 mptcp-net 1/3] mptcp: fix grafting corner case Date: Thu, 13 Nov 2025 01:10:50 +0100 Message-ID: <4696be966622c9d340e8bfa4728b219b7cac1d1b.1762992570.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: 7oB5g7wa5EwYOtoHR_2cdj-n3_I_OJgjoO727XwrF8U_1762992661 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: 
text/plain; charset="utf-8"; x-default="true" If a passive MPTCP socket creates active subflows while still unaccepted, __mptcp_subflow_connect() will try to graft such subflows to the msk, but the msk struct socket is not yet initialized at that point: the subflows will misbehave. Address the issue by always trying to graft the subflow in mptcp_finish_join(), regardless of the subflow itself being active or passive. To avoid races with accept(), access the msk->sk_socket under the callback lock. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 8965abb94b81..1b3c5fd01600 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -913,12 +913,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk= , struct sock *ssk) mptcp_subflow_joined(msk, ssk); spin_unlock_bh(&msk->fallback_lock); =20 - /* attach to msk socket only after we are sure we will deal with it - * at close time - */ - if (sk->sk_socket && !ssk->sk_socket) - mptcp_sock_graft(ssk, sk->sk_socket); - mptcp_subflow_ctx(ssk)->subflow_id =3D msk->subflow_id++; mptcp_sockopt_sync_locked(msk, ssk); mptcp_stop_tout_timer(sk); @@ -3734,6 +3728,20 @@ void mptcp_sock_graft(struct sock *sk, struct socket= *parent) write_unlock_bh(&sk->sk_callback_lock); } =20 +static void mptcp_check_graft(struct sock *sk, struct sock *ssk) +{ + struct socket *sock; + + if (ssk->sk_socket) + return; + + write_lock_bh(&sk->sk_callback_lock); + sock =3D sk->sk_socket; + write_lock_bh(&sk->sk_callback_lock); + if (sock) + mptcp_sock_graft(ssk, sock); +} + bool mptcp_finish_join(struct sock *ssk) { struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); @@ -3758,6 +3766,7 @@ bool mptcp_finish_join(struct sock *ssk) } mptcp_subflow_joined(msk, ssk); spin_unlock_bh(&msk->fallback_lock); + mptcp_check_graft(parent, ssk); mptcp_propagate_sndbuf(parent, ssk); return true; } @@ 
-3767,6 +3776,8 @@ bool mptcp_finish_join(struct sock *ssk) goto err_prohibited; } =20 + mptcp_check_graft(parent, ssk); + /* If we can't acquire msk socket lock here, let the release callback * handle it */ --=20 2.51.1 From nobody Thu Nov 27 12:35:52 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8018D2110 for ; Thu, 13 Nov 2025 00:11:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992668; cv=none; b=BVdL9o7oXmqkaLgd+aSwUxkC/hIWkDbZKfEBddq9uzfBvCZ+SojoKIWj0HoS2l056IUbN7SFxnjqiaI4L3VxaCMQkygmCudVHJ476m/RYMmMMG8TNmM6+IVbYgXjftakQjvCWomn7uJx8v8SOHgNVK4n03t6T/AlWcP9Oyz9QS0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992668; c=relaxed/simple; bh=1q9Rw2BrF7TzDCyWRsgI2hYlJCwCj2rFqfnb7sIFCYI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=j6mwaP7lzk8QiPNpxSP0IxmmiR0BcvY46gR3laI9SA/O4RHlzYRVA/SrP1yZYpNIqi/4M/xwxfsnUhD0Rbnm5s0CtIOBQfmA0N2u4FwDrORinLHoDHKv+D4l4DTHCAMo/G2+CqrO+yUgMSen/5oIIFhs6PREqYbb5QGvIofZF74= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=KlrXWlBV; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KlrXWlBV" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1762992665; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uQuXFpC9aM5kuonXi2IVrHQ5YkA84CxRe3xKW/nnH5k=; b=KlrXWlBVTJKHC+YJlqe7MrthcMn8F9nLTSM4X+RuiQa/YuaPPpyHQzG4p8eUUelby6Jud4 dBW9SJ6DxN+fBjXgUhx/jxyMwqDrZIEL/JYdICrIkMjPRiS86m94aPbkGzrCzdHycbIn+Z V1hg6A8g5Pg7E6Ey1ibvtU8Iz1cIghQ= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-495-i2hPL86_Md-p-5zX77z11g-1; Wed, 12 Nov 2025 19:11:03 -0500 X-MC-Unique: i2hPL86_Md-p-5zX77z11g-1 X-Mimecast-MFC-AGG-ID: i2hPL86_Md-p-5zX77z11g_1762992662 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5C31518011EF for ; Thu, 13 Nov 2025 00:11:02 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.33.120]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 7CDAA30044E0 for ; Thu, 13 Nov 2025 00:11:01 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v3 mptcp-net 2/3] Squash-to: "mptcp: fix memcg accounting for passive sockets" Date: Thu, 13 Nov 2025 01:10:51 +0100 Message-ID: <60e9d25298d9c8da731e3e0a86788b4ca5aabfb9.1762992570.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: 
MIMEDefang 3.4.1 on 10.30.177.4 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: BsCqpEmpCj0DzLK8vEAc3vYtPihj8OPZbphkBm3c8kE_1762992662 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" __mptcp_inherit_memcg() is currently invoked by mptcp_graph_subflows() with the wrong GFP flags, as lock_sock_fast() can yield atomic scope. Since this is not the most extreme fast path, use plain lock_sock() instead. Additionally ensure the CG is correctly set even for active subflows of not yet accepted passive msk. Finally fix a typo in the mentioned helper name. Signed-off-by: Paolo Abeni --- I'm sorry for the bad directions again. I had to rework completely the next patch due to several races still present, and that required the change here, quite unexpected otherwise. --- net/mptcp/protocol.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 1b3c5fd01600..addd8025d235 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -916,8 +916,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk,= struct sock *ssk) mptcp_subflow_ctx(ssk)->subflow_id =3D msk->subflow_id++; mptcp_sockopt_sync_locked(msk, ssk); mptcp_stop_tout_timer(sk); - __mptcp_inherit_cgrp_data(sk, ssk); - __mptcp_inherit_memcg(sk, ssk, GFP_ATOMIC); __mptcp_propagate_sndbuf(sk, ssk); return true; } @@ -3737,9 +3735,13 @@ static void mptcp_check_graft(struct sock *sk, struc= t sock *ssk) =20 write_lock_bh(&sk->sk_callback_lock); sock =3D sk->sk_socket; - write_lock_bh(&sk->sk_callback_lock); - if (sock) + write_unlock_bh(&sk->sk_callback_lock); + + if (sock) { mptcp_sock_graft(ssk, sock); + __mptcp_inherit_cgrp_data(sk, ssk); + __mptcp_inherit_memcg(sk, ssk, GFP_ATOMIC); + } } =20 bool mptcp_finish_join(struct sock *ssk) @@ -4052,18 +4054,17 @@ static int mptcp_listen(struct socket *sock, int ba= cklog) return err; } =20 -static 
void mptcp_graph_subflows(struct sock *sk) +static void mptcp_graft_subflows(struct sock *sk) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk =3D mptcp_sk(sk); =20 mptcp_for_each_subflow(msk, subflow) { struct sock *ssk =3D mptcp_subflow_tcp_sock(subflow); - bool slow; =20 - slow =3D lock_sock_fast(ssk); + lock_sock(ssk); =20 - /* set ssk->sk_socket of accept()ed flows to mptcp socket. + /* Set ssk->sk_socket of accept()ed flows to mptcp socket. * This is needed so NOSPACE flag can be set from tcp stack. */ if (!ssk->sk_socket) @@ -4071,7 +4072,7 @@ static void mptcp_graph_subflows(struct sock *sk) =20 __mptcp_inherit_cgrp_data(sk, ssk); __mptcp_inherit_memcg(sk, ssk, GFP_KERNEL); - unlock_sock_fast(ssk, slow); + release_sock(ssk); } } =20 @@ -4122,7 +4123,7 @@ static int mptcp_stream_accept(struct socket *sock, s= truct socket *newsock, msk =3D mptcp_sk(newsk); msk->in_accept_queue =3D 0; =20 - mptcp_graph_subflows(newsk); + mptcp_graft_subflows(newsk); mptcp_rps_record_subflows(msk); =20 /* Do late cleanup for the first subflow as necessary. 
Also --=20 2.51.1 From nobody Thu Nov 27 12:35:52 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DFBE810F2 for ; Thu, 13 Nov 2025 00:11:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992668; cv=none; b=oVcVVKx7GIl2yzZQRxpmoYFWvRfAuc0LZMNf8O+fBfM9pMdaLhq6Fs/iL7UKb5W/KGv038xqU5lzNqzLo9mmVM3k8NW27y3ky2pcua94RLvIjww6dc9ab9GiX6JYyk44meguYqlXTZWGkwR0Lw204sDmbv122oyaA8hPavLCMuw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762992668; c=relaxed/simple; bh=KE1DJZwMoyR1ElDfpz+h1vqDvjHnc6npxfqejYYzdEo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=Rc87xwxHkv5C3yLfISNLpMjBvUdKU2DXt2bZA4Qvf0cBY4mjFBL90snliXpcjx77LpDJb/mzD6oI2GUbgI96RC27q+MaL5IkGOWUkRTGu5z/qECzNBX2dBUHlJnZEVgnNtc5eaEGEbf58tCM5rtuV/x68k5DLxHbucxn4PBeinI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=DlUltSMq; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DlUltSMq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1762992666; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VUPf97zcN4CGTMgWAhbPgHtPUModoV424KfSMBZSl4o=; b=DlUltSMqi9dfBt3HVc8IvNYVUopGTe0lGWzG0giy27gug/VyzDNOicYVu7PpiL27rHUfDn a0vTWsG6OUcbAj+1xpekMP6rLMlO/28hR4v1bJvctifDn2qYqZBtuU8XTB4OPPCFNyqQhS 3QpMjfZJ5im1kkF5Ch4tVRFeRL046tA= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-258-ZAy97YZVMymEo0us69mm1A-1; Wed, 12 Nov 2025 19:11:04 -0500 X-MC-Unique: ZAy97YZVMymEo0us69mm1A-1 X-Mimecast-MFC-AGG-ID: ZAy97YZVMymEo0us69mm1A_1762992664 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id F03A21956096 for ; Thu, 13 Nov 2025 00:11:03 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.44.33.120]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 1B3B430044E9 for ; Thu, 13 Nov 2025 00:11:02 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v3 mptcp-net 3/3] Squash-to: "mptcp: leverage the backlog for RX packet processing" Date: Thu, 13 Nov 2025 01:10:52 +0100 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: vh8snmPIJZFoVVPOmPJ__WmeoHIDDzXKrQivS7H6S5o_1762992664 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; 
x-default="true" If a subflow receives data before gaining the memcg while the msk socket lock is held at accept time, or the PM locks the msk socket while still unaccepted and subflows push data to it at the same time, mptcp_graft_subflows() can complete with a non-empty backlog. The msk will try to borrow such memory, but some of the skbs there were not memcg charged. When the msk eventually returns such memory, we should hit the same splat as #597. [even if so far I was unable to replicate this scenario] This patch tries to address this potential issue by: - explicitly keeping track of the amount of memory added to the backlog not CG accounted - additionally accounting for such memory at accept time - preventing any subflow from adding memory to the backlog not CG accounted after the above flush Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 64 +++++++++++++++++++++++++++++++++++++++++--- net/mptcp/protocol.h | 1 + 2 files changed, 61 insertions(+), 4 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index addd8025d235..abf0edc4b888 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -658,6 +658,7 @@ static void __mptcp_add_backlog(struct sock *sk, { struct mptcp_sock *msk =3D mptcp_sk(sk); struct sk_buff *tail =3D NULL; + struct sock *ssk =3D skb->sk; bool fragstolen; int delta; =20 @@ -671,18 +672,26 @@ static void __mptcp_add_backlog(struct sock *sk, tail =3D list_last_entry(&msk->backlog_list, struct sk_buff, list); =20 if (tail && MPTCP_SKB_CB(skb)->map_seq =3D=3D MPTCP_SKB_CB(tail)->end_seq= && - skb->sk =3D=3D tail->sk && + ssk =3D=3D tail->sk && __mptcp_try_coalesce(sk, tail, skb, &fragstolen, &delta)) { skb->truesize -=3D delta; kfree_skb_partial(skb, fragstolen); __mptcp_subflow_lend_fwdmem(subflow, delta); - WRITE_ONCE(msk->backlog_len, msk->backlog_len + delta); - return; + goto account; } =20 list_add_tail(&skb->list, &msk->backlog_list); mptcp_subflow_lend_fwdmem(subflow, skb); - 
+ WRITE_ONCE(msk->backlog_len, msk->backlog_len + skb->truesize); + delta =3D skb->truesize; + +account: + WRITE_ONCE(msk->backlog_len, msk->backlog_len + delta); + + /* Possibly not accept()ed yet, keep track of memory not CG + * accounted, mptcp_graft_subflows() will handle it. + */ + if (!ssk->sk_memcg) + msk->backlog_unaccounted +=3D delta; } =20 static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, @@ -2154,6 +2163,12 @@ static bool mptcp_can_spool_backlog(struct sock *sk,= struct list_head *skbs) { struct mptcp_sock *msk =3D mptcp_sk(sk); =20 + /* After CG initialization, subflows should never add skbs before + * gaining the CG themselves. + */ + DEBUG_NET_WARN_ON_ONCE(msk->backlog_unaccounted && sk->sk_socket && + mem_cgroup_from_sk(sk)); + /* Don't spool the backlog if the rcvbuf is full. */ if (list_empty(&msk->backlog_list) || sk_rmem_alloc_get(sk) > sk->sk_rcvbuf) @@ -4059,6 +4074,22 @@ static void mptcp_graft_subflows(struct sock *sk) struct mptcp_subflow_context *subflow; struct mptcp_sock *msk =3D mptcp_sk(sk); =20 + if (mem_cgroup_sockets_enabled) { + LIST_HEAD(join_list); + + /* Subflows joining after __inet_accept() will get the + * mem CG properly initialized at mptcp_finish_join() time, + * but subflows pending in join_list need explicit + * initialization before flushing `backlog_unaccounted` + * or we can get unexpected unaccounted memory later. 
+ */ + mptcp_data_lock(sk); + list_splice_init(&msk->join_list, &join_list); + mptcp_data_unlock(sk); + + __mptcp_flush_join_list(sk, &join_list); + } + mptcp_for_each_subflow(msk, subflow) { struct sock *ssk =3D mptcp_subflow_tcp_sock(subflow); =20 @@ -4070,10 +4101,35 @@ static void mptcp_graft_subflows(struct sock *sk) if (!ssk->sk_socket) mptcp_sock_graft(ssk, sk->sk_socket); =20 + if (!mem_cgroup_sk_enabled(sk)) + goto unlock; + __mptcp_inherit_cgrp_data(sk, ssk); __mptcp_inherit_memcg(sk, ssk, GFP_KERNEL); + +unlock: release_sock(ssk); } + + if (mem_cgroup_sk_enabled(sk)) { + gfp_t gfp =3D GFP_KERNEL | __GFP_NOFAIL; + int amt; + + /* Account the backlog memory; prior accept() is aware of + * fwd and rmem only + */ + mptcp_data_lock(sk); + amt =3D sk_mem_pages(sk->sk_forward_alloc + + msk->backlog_unaccounted + + atomic_read(&sk->sk_rmem_alloc)) - + sk_mem_pages(sk->sk_forward_alloc + + atomic_read(&sk->sk_rmem_alloc)); + msk->backlog_unaccounted =3D 0; + mptcp_data_unlock(sk); + + if (amt) + mem_cgroup_sk_charge(sk, amt, gfp); + } } =20 static int mptcp_stream_accept(struct socket *sock, struct socket *newsock, diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 161b704be16b..199f28f3dd5e 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -360,6 +360,7 @@ struct mptcp_sock { =20 struct list_head backlog_list; /* protected by the data lock */ u32 backlog_len; + u32 backlog_unaccounted; }; =20 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock) --=20 2.51.1
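Note on the accept-time charge in patch 3/3: the amount charged is the difference between two page counts, not the raw byte count in backlog_unaccounted. The sketch below is a userspace model of that arithmetic only, not kernel code. It assumes sk_mem_pages() rounds a byte count up to whole pages and assumes 4 KiB pages; the names model_mem_pages() and model_backlog_charge() are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Assumption: 4 KiB pages; the real page size is architecture dependent. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for sk_mem_pages(): bytes rounded up to whole pages. */
static int model_mem_pages(int amt)
{
	assert(amt >= 0);
	return (amt + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}

/*
 * Pages still to be charged for the unaccounted backlog: pages needed once
 * the backlog bytes are included, minus the pages already covered by the
 * forward-allocated and receive memory alone.
 */
static int model_backlog_charge(int fwd_alloc, int rmem_alloc,
				int backlog_unaccounted)
{
	return model_mem_pages(fwd_alloc + backlog_unaccounted + rmem_alloc) -
	       model_mem_pages(fwd_alloc + rmem_alloc);
}
```

With these assumptions, 500 unaccounted bytes on top of 3000 already tracked bytes fit in the same page and need no extra charge, while a single unaccounted byte on top of exactly 4096 tracked bytes crosses a page boundary and costs one page. That is why the patch only calls the charge helper when the computed amount is non-zero.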