From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1AE0E1FF7A1 for ; Fri, 6 Dec 2024 12:10:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487046; cv=none; b=P1IS+KNxkUVdlDfnlOJLBtUOYTfMOGjesVOW7Zcyg65zd+Hnb9Nd9MiDVuA9XXmHnElA+7leo+pn/L+Eai4AOPGFYw0qOdAdsByfigRhPLuyxGw8sLpM8yN41zMVARavzdtae7bhwXsr/320I76btD5zhZhjz8XMeK8xUoO8hMw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487046; c=relaxed/simple; bh=nh0JS+9Fb1CZPzBhYUtvUU3O+CP21dHcWnWjQCZq1iI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=TCEpCnfQUicS3arttj9icb2XrfY6RVCP6Yvg8KAyvakqQjvYb+Fbzv1dmCXotkDKin0UlbQnuUK/ghtZSAFxFKXh4AmwRPUZn+0w5GsfoiYOjDfa9UEKxxxfUvZfpAmc/iaihJilDAWeO09l5p6tetMS3xQqWn6zaFvPyPgr/Gc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=RwjReOVz; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="RwjReOVz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487043; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mIziDAgzhno3pF7TeaMOb+Tqv9Hx8iYCJIuWwtOGo6g=; b=RwjReOVz7jIdVQqYyTTGyfXidmTneprw8ERUjsUUn3cVImvqSZQeU/bVk3IdoK5AYSx8Lo LvVZEWUgFxJ+9cgyJgUhUmUCAYfFt+Cynu0tGUg8p0jOcotCs+gumHAAtn59mmlUrGKUyF QoMO1Wf32tDloUrSShEf/bcLGfGfC9g= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-354-04d2w_tzNDie22N2yM2bEA-1; Fri, 06 Dec 2024 07:10:41 -0500 X-MC-Unique: 04d2w_tzNDie22N2yM2bEA-1 X-Mimecast-MFC-AGG-ID: 04d2w_tzNDie22N2yM2bEA Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 501D1195423C for ; Fri, 6 Dec 2024 12:10:38 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 7A0BD1956095 for ; Fri, 6 Dec 2024 12:10:37 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 1/7] mptcp: prevent excessive coalescing on receive Date: Fri, 6 Dec 2024 13:10:34 +0100 Message-ID: <9c5947ad3b55695cb6000900acad5471b7314195.1733486870.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: VOZ2moFR3Qcaad15UcJwYcGN6BxlIKXjWVVg_VuyKbs_1733487041 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: 
text/plain; charset="utf-8"; x-default="true" Currently the skb size after coalescing is only limited by the skb layout (the skb must not carry frag_list). A single coalesced skb covering several MSS can potentially fill the receive buffer completely. In such a case, the send window drops to zero until the receive buffer is empty again, badly affecting throughput. Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- No Fixes tag because the problem is not very visible in practice currently, but will be apparent after the rx refactor. Still, I hope this could positively affect the simult flows self-tests --- net/mptcp/protocol.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f768aa4473fb..fd9593f85a98 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -136,6 +136,7 @@ static bool mptcp_try_coalesce(struct sock *sk, struct = sk_buff *to, int delta; =20 if (MPTCP_SKB_CB(from)->offset || + ((to->len + from->len) > (sk->sk_rcvbuf >> 3)) || !skb_try_coalesce(to, from, &fragstolen, &delta)) return false; =20 --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8FB881FF7A1 for ; Fri, 6 Dec 2024 12:10:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487049; cv=none; b=Juw22C88Q2AEwLQgqhcVR5OBLSMhbvO1SHtd4RUnqqYSeRcmFJQG5HZNT6tguDFEkskY4c2OZBN6663hChAysIY15Cj+cMIn12KjXatNorLxql0LlZ+Qyol5L8XpFmGGHTEJjh9sV+wYdimlAAlR1MVfGCoKhyRtQWic17Yi+LI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487049; c=relaxed/simple; bh=gQANsrNymzzuRiXAg2YHwp6A1C/OzXBmjJ4b0vcUnxw=; 
h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=liX/2kEtaYkOYcSwuQzGksUWqk/dF9XeEpIJQENS0b5op8c4Q+0lQh4cOOzbqJEq+OnHLsOLeJViLst/Py3h0O9b+Bv2ER6Jo4erf0yp/lwLCNs0F/aDVLFhzUO4wajF3YsKgTCDqKrOUmwPLLLBn1VQhgkJCV9qX7YOoNY3jqY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Nf83wfcl; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Nf83wfcl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487046; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lk8Jgnx5qUsYIcDhKvKBMF1KyfJ9lbg9HpKbd3xGwqM=; b=Nf83wfcl54rmoChFcZfPLrR055ceUyBf0WY7GCW501I7C5xtD/wdql7A1qxn0GGtN+HvHH R71px32+KO2KlbVFWiFWyYY8vdgTxr3Zn4ICP0EQU65YOs1kYIVHGMOvnKsXZO8VCaI/pS rkbFhcbNFMuQhAJJBt0ZsrtTKjpXdV8= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-651-4LQVov5LPhKjqmc-dRrInQ-1; Fri, 06 Dec 2024 07:10:45 -0500 X-MC-Unique: 4LQVov5LPhKjqmc-dRrInQ-1 X-Mimecast-MFC-AGG-ID: 4LQVov5LPhKjqmc-dRrInQ Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 
bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id BC4681956089 for ; Fri, 6 Dec 2024 12:10:44 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id EFF731955F3F for ; Fri, 6 Dec 2024 12:10:43 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 2/7] tcp: fix recvbuffer adjust on sleeping rcvmsg Date: Fri, 6 Dec 2024 13:10:41 +0100 Message-ID: <23213d7ff1d80061cbac28414a683fd9d7688cf9.1733486870.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: zQxgSRo1Nq9E7JdRu72eKE6ymDrmDRBXxtowb_LrJPw_1733487044 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" If recvmsg() blocks after receiving some data, e.g. due to SO_RCVLOWAT, the MPTCP code attempts multiple times to adjust the receive buffer size, each time wrongly accounting the cumulative amount of received data instead of only the delta. Address the issue by moving mptcp_rcv_space_adjust() just after the data reception and passing it only the just-received bytes. This also removes an unneeded difference between the TCP and MPTCP RX code path implementations. 
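As a rough user-space illustration of the accounting problem described above (a sketch, not kernel code): `struct rcv_space` and the `account_*` helpers are hypothetical names introduced here, and the model assumes the adjust routine simply adds its argument to a running counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the msk receive-space accounting state. */
struct rcv_space {
	long copied;	/* bytes accounted in the current measurement window */
};

/* Account `copied` bytes; zero or negative values are ignored. */
static void rcv_space_adjust(struct rcv_space *rs, int copied)
{
	assert(rs != NULL);
	if (copied <= 0)
		return;
	rs->copied += copied;
}

/* Old call placement: the running total is passed before every sleep
 * and once more after the receive loop ends.
 */
static long account_cumulative(const int *chunks, size_t n)
{
	struct rcv_space rs = { 0 };
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		total += chunks[i];
		if (i + 1 < n)
			rcv_space_adjust(&rs, total);	/* about to block again */
	}
	rcv_space_adjust(&rs, total);	/* call after the loop */
	return rs.copied;
}

/* New call placement: each batch is accounted once, right after reception. */
static long account_delta(const int *chunks, size_t n)
{
	struct rcv_space rs = { 0 };
	size_t i;

	for (i = 0; i < n; i++)
		rcv_space_adjust(&rs, chunks[i]);
	return rs.copied;
}
```

With three batches of 100, 200 and 300 bytes, where the reader sleeps after each of the first two, the cumulative placement accounts 100 + 300 + 600 = 1000 bytes although only 600 were received; the delta placement accounts exactly 600.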
Fixes: 581302298524 ("mptcp: error out earlier on disconnect") Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- Not strictly related to the refactor, found while investigating the CI failure --- net/mptcp/protocol.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index fd9593f85a98..bca8c2c046c3 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1940,6 +1940,8 @@ static int mptcp_sendmsg(struct sock *sk, struct msgh= dr *msg, size_t len) goto out; } =20 +static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied); + static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, struct msghdr *msg, size_t len, int flags, @@ -1993,6 +1995,7 @@ static int __mptcp_recvmsg_mskq(struct mptcp_sock *ms= k, break; } =20 + mptcp_rcv_space_adjust(msk, copied); return copied; } =20 @@ -2269,7 +2272,6 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, } =20 pr_debug("block timeout %ld\n", timeo); - mptcp_rcv_space_adjust(msk, copied); err =3D sk_wait_data(sk, &timeo, NULL); if (err < 0) { err =3D copied ? 
: err; @@ -2277,8 +2279,6 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, } } =20 - mptcp_rcv_space_adjust(msk, copied); - out_err: if (cmsg_flags && copied >=3D 0) { if (cmsg_flags & MPTCP_CMSG_TS) --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACD0B1FF7A1 for ; Fri, 6 Dec 2024 12:10:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487057; cv=none; b=irx1j/Eq0rJNGZ1LveMERUfSV4vG+duqxt9z97RBGWWzGPsTd4ntyZ4nzhzMvYWpb8bHimvvkCVGOC3KN5ZQuEgIrY7k4sBvXFJnDpMgT4lobVEhwtCQ2hCpmIO+o3Z4Ltljnwk2QUKhAbqK6zTdgJ6BjpBeiqok3Yv57e+LQe8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487057; c=relaxed/simple; bh=9TTSho1zKZX7E/HlQiibtnxacoc/i/pQ42ZFMOxzCRA=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=rfhoj7ik84h15jdQPdamLkJC+jNeONA560DS/+Pb6fTWx7Mldpsr7IWlNjccEH2WyXaOTYdkCNbUuhrfVOcbE4WVyLqaJkYkztIsQol4ClrjYlQ+cSp4ke3lvr73iMHCou3c2KuRyCLwJTIn9L6BGWMIyZksiGdZWKHNG7es5Fc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=NxruYWNP; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NxruYWNP" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487054; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L1h7GX4QIhVML5zb8rD8wWeDqRe2mzYX00Ccxks2Qvw=; b=NxruYWNPJuQMnNzVeSil99z/Gh7WTAtU+oTT76xddkVIL2RePVkUcUsNVXu3SyQtf7FZgA +wToIce8Ajh/VX6V7gUB2Tn24QWIzj2IBqK/kTmc3Z3kHk41meivpIWgQ75mFxleVY1OG+ X+97dYxdaZZef/ff008CC0QJ6UXViL0= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-189-w4m0YmA1MvW7CsoQ0adVnw-1; Fri, 06 Dec 2024 07:10:53 -0500 X-MC-Unique: w4m0YmA1MvW7CsoQ0adVnw-1 X-Mimecast-MFC-AGG-ID: w4m0YmA1MvW7CsoQ0adVnw Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 819161955D9A for ; Fri, 6 Dec 2024 12:10:52 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B6548300019E for ; Fri, 6 Dec 2024 12:10:51 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 3/7] mptcp: don't always assume copied data in mptcp_cleanup_rbuf() Date: Fri, 6 Dec 2024 13:10:47 +0100 Message-ID: <045648bc48950d4d1e78b295131f08b655179c08.1733486870.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 
3.4.1 on 10.30.177.4 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: akKWu5azS7ppk6C_nNszPLwYwPgQJCSYbePtbd4OLno_1733487052 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" In some corner cases the MPTCP protocol can end up invoking mptcp_cleanup_rbuf() when no data has been copied, but that helper assumes the opposite. Explicitly drop that assumption and perform the costly call only when strictly needed, i.e. before releasing the msk socket lock. Fixes: fd8976790a6c ("mptcp: be careful on MPTCP-level ack.") Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- net/mptcp/protocol.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index bca8c2c046c3..690614816749 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -529,13 +529,13 @@ static void mptcp_send_ack(struct mptcp_sock *msk) mptcp_subflow_send_ack(mptcp_subflow_tcp_sock(subflow)); } =20 -static void mptcp_subflow_cleanup_rbuf(struct sock *ssk) +static void mptcp_subflow_cleanup_rbuf(struct sock *ssk, int copied) { bool slow; =20 slow =3D lock_sock_fast(ssk); if (tcp_can_send_ack(ssk)) - tcp_cleanup_rbuf(ssk, 1); + tcp_cleanup_rbuf(ssk, copied); unlock_sock_fast(ssk, slow); } =20 @@ -552,7 +552,7 @@ static bool mptcp_subflow_could_cleanup(const struct so= ck *ssk, bool rx_empty) (ICSK_ACK_PUSHED2 | ICSK_ACK_PUSHED))); } =20 -static void mptcp_cleanup_rbuf(struct mptcp_sock *msk) +static void mptcp_cleanup_rbuf(struct mptcp_sock *msk, int copied) { int old_space =3D READ_ONCE(msk->old_wspace); struct mptcp_subflow_context *subflow; @@ -560,14 +560,14 @@ static void mptcp_cleanup_rbuf(struct mptcp_sock *msk) int space =3D __mptcp_space(sk); bool cleanup, rx_empty; =20 - cleanup =3D (space > 0) && (space >=3D (old_space << 1)); - rx_empty =3D !__mptcp_rmem(sk); + cleanup =3D (space > 0) && (space >=3D (old_space << 1)) 
&& copied; + rx_empty =3D !__mptcp_rmem(sk) && copied; =20 mptcp_for_each_subflow(msk, subflow) { struct sock *ssk =3D mptcp_subflow_tcp_sock(subflow); =20 if (cleanup || mptcp_subflow_could_cleanup(ssk, rx_empty)) - mptcp_subflow_cleanup_rbuf(ssk); + mptcp_subflow_cleanup_rbuf(ssk, copied); } } =20 @@ -2221,9 +2221,6 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, =20 copied +=3D bytes_read; =20 - /* be sure to advertise window change */ - mptcp_cleanup_rbuf(msk); - if (skb_queue_empty(&msk->receive_queue) && __mptcp_move_skbs(msk)) continue; =20 @@ -2272,6 +2269,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, } =20 pr_debug("block timeout %ld\n", timeo); + mptcp_cleanup_rbuf(msk, copied); err =3D sk_wait_data(sk, &timeo, NULL); if (err < 0) { err =3D copied ? : err; @@ -2279,6 +2277,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, } } =20 + mptcp_cleanup_rbuf(msk, copied); + out_err: if (cmsg_flags && copied >=3D 0) { if (cmsg_flags & MPTCP_CMSG_TS) --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0AF6B1FF7A1 for ; Fri, 6 Dec 2024 12:11:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487065; cv=none; b=NVet54ydMBhsQSwtxCaWrACw4ZsJzUwHZpZpB6d4A50ZioyT9+wiSB4UcT8fb9zQ3gra04Q+qAGDR7R2klcN0FXLWebmaMOOZU0Yn1QKO/tP7BMoqt5Av+StR16zo40zdw+rD6G1PN6d40gRiV9glUSxz7VaDWCaj8nMAhoVkT4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487065; c=relaxed/simple; bh=DvQdryvpYGYbciXu4RM0QTa41Fp39OU7gU58Ab9TmFQ=; 
h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=GHEUZqbQb4bRFS45tbz4c4WTmIdESN4rlDiGgypZBsUfCfnrhPeQ4ZEwTtiUZU/+RGIEfEbTrK4oCtSFK4lEGLI3E+VqapxhY6Llq3paLbS/mwanEMKL3km3w4ZZHOW8E8UM+eHcnPzgaTsINVp8oT3NH8ix9oERxVpsQJTycoc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=AAmCukHy; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="AAmCukHy" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487061; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SKFeTDlZpv4iRoURERTidb2hJTsEoHN8u4bNK8zmEVc=; b=AAmCukHyCL4tm236AR3nlDrPYk+LJcbl3TUTlsOHHXt27U+SmEAlYN8YVfEhNoiv/sryIq M+t1zVxWtHBQlRDoAP+OR4aXvJj1gf0yreUTCC5lum4jGgbX2zl+DTpuNImxrYGnVDzzSy e1ptLpTklxGzkK7alePA95HXmgsAahI= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-459-Lni6XAklOy2WBKN0PifNeQ-1; Fri, 06 Dec 2024 07:10:59 -0500 X-MC-Unique: Lni6XAklOy2WBKN0PifNeQ-1 X-Mimecast-MFC-AGG-ID: Lni6XAklOy2WBKN0PifNeQ Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 
bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id D94021955F40 for ; Fri, 6 Dec 2024 12:10:58 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 1E6ED19560A0 for ; Fri, 6 Dec 2024 12:10:57 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 4/7] mptcp: consolidate subflow cleanup Date: Fri, 6 Dec 2024 13:10:55 +0100 Message-ID: <6d5225e24b2ddd80a356714ee3655b2d2734a92d.1733486870.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: 2HEbfIUNyK4rUZuXYQ3AlQDCKX5bIarfBVjNCgvv0jA_1733487059 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" Consolidate all the cleanup actions requiring the worker into a single helper and ensure the dummy data fin creation for fallback sockets is performed only when the tcp rx queue is empty. There are no functional changes intended, but this will simplify the next patch, where the tcp rx queue spooling could be delayed at release_cb time. 
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- net/mptcp/subflow.c | 33 ++++++++++++++++++--------------- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index fd021cf8286e..2926bdf88e42 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1271,7 +1271,12 @@ static void mptcp_subflow_discard_data(struct sock *= ssk, struct sk_buff *skb, subflow->map_valid =3D 0; } =20 -/* sched mptcp worker to remove the subflow if no more data is pending */ +static bool subflow_is_done(const struct sock *sk) +{ + return sk->sk_shutdown & RCV_SHUTDOWN || sk->sk_state =3D=3D TCP_CLOSE; +} + +/* sched mptcp worker for subflow cleanup if no more data is pending */ static void subflow_sched_work_if_closed(struct mptcp_sock *msk, struct so= ck *ssk) { struct sock *sk =3D (struct sock *)msk; @@ -1281,8 +1286,18 @@ static void subflow_sched_work_if_closed(struct mptc= p_sock *msk, struct sock *ss inet_sk_state_load(sk) !=3D TCP_ESTABLISHED))) return; =20 - if (skb_queue_empty(&ssk->sk_receive_queue) && - !test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) + if (!skb_queue_empty(&ssk->sk_receive_queue)) + return; + + if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) + mptcp_schedule_work(sk); + + /* when the fallback subflow closes the rx side, trigger a 'dummy' + * ingress data fin, so that the msk state will follow along + */ + if (__mptcp_check_fallback(msk) && subflow_is_done(ssk) && + msk->first =3D=3D ssk && + mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true)) mptcp_schedule_work(sk); } =20 @@ -1842,11 +1857,6 @@ static void __subflow_state_change(struct sock *sk) rcu_read_unlock(); } =20 -static bool subflow_is_done(const struct sock *sk) -{ - return sk->sk_shutdown & RCV_SHUTDOWN || sk->sk_state =3D=3D TCP_CLOSE; -} - static void subflow_state_change(struct sock *sk) { struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(sk); @@ -1873,13 +1883,6 @@ static void 
subflow_state_change(struct sock *sk) subflow_error_report(sk); =20 subflow_sched_work_if_closed(mptcp_sk(parent), sk); - - /* when the fallback subflow closes the rx side, trigger a 'dummy' - * ingress data fin, so that the msk state will follow along - */ - if (__mptcp_check_fallback(msk) && subflow_is_done(sk) && msk->first =3D= =3D sk && - mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true)) - mptcp_schedule_work(parent); } =20 void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *list= ener_ssk) --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD11E1FF7A1 for ; Fri, 6 Dec 2024 12:11:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487070; cv=none; b=ZSDZJeu1tGcIhlWVBJ1WjxZkqBwol/Se4dCIvZN/fxzaFle5WlHn6wSLXD62PsAlGWrRM1UtCbd1evVl3TX9JWWDvwlP6LSVGtuKq3aX7qEqjrAXcNzIsA+hyq/TGBLSINRrfUKeSVbUlUYOcFQXK0pQkzhEZwXsrNroCWdhtL4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487070; c=relaxed/simple; bh=q8kT0ilAN4Jxf7TTMd/mGbup5Y7SM0eBbpdHdo+WiC4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=oLmCSP2KjEF7vBEAeCYVvTMr7octOs6y4LMm1As8l0GoHsL2MtlE7G/+TLJT++//GzG+nbfFWOldGLtZ7hUYtLFpbSvnm0TnsyMZ4LRIhfvc44l9FEh2TjziLh1jXrbagSRLh12uidXZQAHgnr9MzXBrm58eZeJJdq0TNIuu9JE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Wtcsfn7O; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Wtcsfn7O" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487067; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mKPGNuV23Yjj2+7zpNuZAaxI1QusZJAmixlLGMnNB50=; b=Wtcsfn7OlF78CXgAfrBUb2OhvlGPUhj1sJWAi0XltzEChWnrQNEH+LkCWE113HQ7Gqnsqb 6tszL1AiLLIjH0OYgcuyjwZrJ/BSkoS3UHBs331g77rDzmrnHXGpvtC0lIRAiNneeS5lsG vC2TXiQ9dC4ivcythnPr4E7TjYPACYs= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-213-Ty0YH1ARPgGPT3YkLy7tXA-1; Fri, 06 Dec 2024 07:11:06 -0500 X-MC-Unique: Ty0YH1ARPgGPT3YkLy7tXA-1 X-Mimecast-MFC-AGG-ID: Ty0YH1ARPgGPT3YkLy7tXA Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5C86F19560A3 for ; Fri, 6 Dec 2024 12:11:05 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8F1831955F3F for ; Fri, 6 Dec 2024 12:11:04 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 5/7] mptcp: move the whole rx path under msk socket 
lock protection Date: Fri, 6 Dec 2024 13:11:01 +0100 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: Bal3So4AVH40MC0NMVITWDS9svURUZ19Qcyq1pCr6E0_1733487065 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After commit c2e6048fa1cf ("mptcp: fix race in release_cb") we can move the whole MPTCP rx path under the socket lock, leveraging the release_cb. We can drop a bunch of spin_lock pairs in the receive functions, use a single receive queue and invoke __mptcp_move_skbs only when subflows ask for it. This will allow more cleanup in the next patch. Some changes are worth specific mention: The msk rcvbuf update now always happens under both the msk and the subflow socket locks: we can drop a bunch of ONCE annotations and consolidate the checks. When the skbs move is delayed at msk release callback time, even the msk rcvbuf update is delayed; additionally take care of such action in __mptcp_move_skbs(). 
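The deferral pattern described above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel implementation: `struct model_sock` and the `model_*` helpers are hypothetical names, a plain boolean stands in for the MPTCP_DEQUEUE bit, and the release-side handling is an assumption about how the deferred move is picked up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the msk socket; not the kernel struct. */
struct model_sock {
	bool owned_by_user;	/* a process-context caller holds the socket lock */
	bool dequeue_pending;	/* models the MPTCP_DEQUEUE bit in cb_flags */
	int subflow_bytes;	/* data still sitting in the subflow rx queue */
	int msk_bytes;		/* data already moved to the msk receive queue */
};

/* Move everything pending from the subflow to the msk queue. */
static void model_move_skbs(struct model_sock *sk)
{
	sk->msk_bytes += sk->subflow_bytes;
	sk->subflow_bytes = 0;
}

/* Subflow data-ready path: move now if nobody owns the socket, else defer. */
static void model_data_ready(struct model_sock *sk, int bytes)
{
	assert(bytes >= 0);
	sk->subflow_bytes += bytes;
	if (!sk->owned_by_user)
		model_move_skbs(sk);
	else
		sk->dequeue_pending = true;
}

/* The lock owner releases the socket: run the deferred move, if any. */
static void model_release(struct model_sock *sk)
{
	sk->owned_by_user = false;
	if (sk->dequeue_pending) {
		sk->dequeue_pending = false;
		model_move_skbs(sk);
	}
}
```

In this model, data arriving while the socket is owned stays in the subflow queue and only sets the pending flag; the move happens once, when the owner releases the socket, so no extra locking is needed around the msk queue.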
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- v1 -> v2: - cleanup && fixup msk sk_rcvbuf update - fix missed wakeup due to badly placed __mptcp_move_skbs(); move it from __mptcp_recvmsg_mskq into mptcp_recvmsg() - added missing '\n' in debug message format string - keep 'snd_una' && friends update under the mptcp data lock --- net/mptcp/fastopen.c | 2 + net/mptcp/protocol.c | 123 ++++++++++++++++++++----------------------- net/mptcp/protocol.h | 2 +- 3 files changed, 61 insertions(+), 66 deletions(-) diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c index a29ff901df75..fb945c0d50bf 100644 --- a/net/mptcp/fastopen.c +++ b/net/mptcp/fastopen.c @@ -49,6 +49,7 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptc= p_subflow_context *subf MPTCP_SKB_CB(skb)->has_rxtstamp =3D TCP_SKB_CB(skb)->has_rxtstamp; =20 mptcp_data_lock(sk); + DEBUG_NET_WARN_ON_ONCE(sock_owned_by_user_nocheck(sk)); =20 mptcp_set_owner_r(skb, sk); __skb_queue_tail(&sk->sk_receive_queue, skb); @@ -65,6 +66,7 @@ void __mptcp_fastopen_gen_msk_ackseq(struct mptcp_sock *m= sk, struct mptcp_subflo struct sock *sk =3D (struct sock *)msk; struct sk_buff *skb; =20 + DEBUG_NET_WARN_ON_ONCE(sock_owned_by_user_nocheck(sk)); skb =3D skb_peek_tail(&sk->sk_receive_queue); if (skb) { WARN_ON_ONCE(MPTCP_SKB_CB(skb)->end_seq); diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 690614816749..42d04de32560 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -643,18 +643,6 @@ static bool __mptcp_move_skbs_from_subflow(struct mptc= p_sock *msk, bool more_data_avail; struct tcp_sock *tp; bool done =3D false; - int sk_rbuf; - - sk_rbuf =3D READ_ONCE(sk->sk_rcvbuf); - - if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) { - int ssk_rbuf =3D READ_ONCE(ssk->sk_rcvbuf); - - if (unlikely(ssk_rbuf > sk_rbuf)) { - WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf); - sk_rbuf =3D ssk_rbuf; - } - } =20 pr_debug("msk=3D%p ssk=3D%p\n", msk, ssk); tp =3D tcp_sk(ssk); @@ -722,7 +710,7 @@ static bool 
__mptcp_move_skbs_from_subflow(struct mptcp= _sock *msk, WRITE_ONCE(tp->copied_seq, seq); more_data_avail =3D mptcp_subflow_data_available(ssk); =20 - if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) { + if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf) { done =3D true; break; } @@ -846,11 +834,30 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, = struct sock *ssk) return moved > 0; } =20 +static void __mptcp_rcvbuf_update(struct sock *sk, struct sock *ssk) +{ + if (unlikely(ssk->sk_rcvbuf > sk->sk_rcvbuf)) + WRITE_ONCE(sk->sk_rcvbuf, ssk->sk_rcvbuf); +} + +static void __mptcp_data_ready(struct sock *sk, struct sock *ssk) +{ + struct mptcp_sock *msk =3D mptcp_sk(sk); + + __mptcp_rcvbuf_update(sk, ssk); + + /* over limit? can't append more skbs to msk, Also, no need to wake-up*/ + if (__mptcp_rmem(sk) > sk->sk_rcvbuf) + return; + + /* Wake-up the reader only for in-sequence data */ + if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk)) + sk->sk_data_ready(sk); +} + void mptcp_data_ready(struct sock *sk, struct sock *ssk) { struct mptcp_subflow_context *subflow =3D mptcp_subflow_ctx(ssk); - struct mptcp_sock *msk =3D mptcp_sk(sk); - int sk_rbuf, ssk_rbuf; =20 /* The peer can send data while we are shutting down this * subflow at msk destruction time, but we must avoid enqueuing @@ -859,19 +866,11 @@ void mptcp_data_ready(struct sock *sk, struct sock *s= sk) if (unlikely(subflow->disposable)) return; =20 - ssk_rbuf =3D READ_ONCE(ssk->sk_rcvbuf); - sk_rbuf =3D READ_ONCE(sk->sk_rcvbuf); - if (unlikely(ssk_rbuf > sk_rbuf)) - sk_rbuf =3D ssk_rbuf; - - /* over limit? 
can't append more skbs to msk, Also, no need to wake-up*/ - if (__mptcp_rmem(sk) > sk_rbuf) - return; - - /* Wake-up the reader only for in-sequence data */ mptcp_data_lock(sk); - if (move_skbs_to_msk(msk, ssk) && mptcp_epollin_ready(sk)) - sk->sk_data_ready(sk); + if (!sock_owned_by_user(sk)) + __mptcp_data_ready(sk, ssk); + else + __set_bit(MPTCP_DEQUEUE, &mptcp_sk(sk)->cb_flags); mptcp_data_unlock(sk); } =20 @@ -1942,16 +1941,17 @@ static int mptcp_sendmsg(struct sock *sk, struct ms= ghdr *msg, size_t len) =20 static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied); =20 -static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, +static int __mptcp_recvmsg_mskq(struct sock *sk, struct msghdr *msg, size_t len, int flags, struct scm_timestamping_internal *tss, int *cmsg_flags) { + struct mptcp_sock *msk =3D mptcp_sk(sk); struct sk_buff *skb, *tmp; int copied =3D 0; =20 - skb_queue_walk_safe(&msk->receive_queue, skb, tmp) { + skb_queue_walk_safe(&sk->sk_receive_queue, skb, tmp) { u32 offset =3D MPTCP_SKB_CB(skb)->offset; u32 data_len =3D skb->len - offset; u32 count =3D min_t(size_t, len - copied, data_len); @@ -1986,7 +1986,7 @@ static int __mptcp_recvmsg_mskq(struct mptcp_sock *ms= k, /* we will bulk release the skb memory later */ skb->destructor =3D NULL; WRITE_ONCE(msk->rmem_released, msk->rmem_released + skb->truesize); - __skb_unlink(skb, &msk->receive_queue); + __skb_unlink(skb, &sk->sk_receive_queue); __kfree_skb(skb); msk->bytes_consumed +=3D count; } @@ -2111,54 +2111,46 @@ static void __mptcp_update_rmem(struct sock *sk) WRITE_ONCE(msk->rmem_released, 0); } =20 -static void __mptcp_splice_receive_queue(struct sock *sk) +static bool __mptcp_move_skbs(struct sock *sk) { + struct mptcp_subflow_context *subflow; struct mptcp_sock *msk =3D mptcp_sk(sk); - - skb_queue_splice_tail_init(&sk->sk_receive_queue, &msk->receive_queue); -} - -static bool __mptcp_move_skbs(struct mptcp_sock *msk) -{ - struct sock *sk =3D (struct sock *)msk; unsigned int 
moved =3D 0; bool ret, done; =20 + /* verify we can move any data from the subflow, eventually updating */ + if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) + mptcp_for_each_subflow(msk, subflow) + __mptcp_rcvbuf_update(sk, subflow->tcp_sock); + + if (__mptcp_rmem(sk) > sk->sk_rcvbuf) + return false; + do { struct sock *ssk =3D mptcp_subflow_recv_lookup(msk); bool slowpath; =20 - /* we can have data pending in the subflows only if the msk - * receive buffer was full at subflow_data_ready() time, - * that is an unlikely slow path. - */ - if (likely(!ssk)) + if (unlikely(!ssk)) break; =20 slowpath =3D lock_sock_fast(ssk); - mptcp_data_lock(sk); __mptcp_update_rmem(sk); done =3D __mptcp_move_skbs_from_subflow(msk, ssk, &moved); - mptcp_data_unlock(sk); =20 if (unlikely(ssk->sk_err)) __mptcp_error_report(sk); unlock_sock_fast(ssk, slowpath); } while (!done); =20 - /* acquire the data lock only if some input data is pending */ ret =3D moved > 0; if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) || - !skb_queue_empty_lockless(&sk->sk_receive_queue)) { - mptcp_data_lock(sk); + !skb_queue_empty(&sk->sk_receive_queue)) { __mptcp_update_rmem(sk); ret |=3D __mptcp_ofo_queue(msk); - __mptcp_splice_receive_queue(sk); - mptcp_data_unlock(sk); } if (ret) mptcp_check_data_fin((struct sock *)msk); - return !skb_queue_empty(&msk->receive_queue); + return ret; } =20 static unsigned int mptcp_inq_hint(const struct sock *sk) @@ -2166,7 +2158,7 @@ static unsigned int mptcp_inq_hint(const struct sock = *sk) const struct mptcp_sock *msk =3D mptcp_sk(sk); const struct sk_buff *skb; =20 - skb =3D skb_peek(&msk->receive_queue); + skb =3D skb_peek(&sk->sk_receive_queue); if (skb) { u64 hint_val =3D READ_ONCE(msk->ack_seq) - MPTCP_SKB_CB(skb)->map_seq; =20 @@ -2212,7 +2204,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, while (copied < len) { int err, bytes_read; =20 - bytes_read =3D __mptcp_recvmsg_mskq(msk, msg, len - copied, flags, &tss,= &cmsg_flags); + bytes_read 
=3D __mptcp_recvmsg_mskq(sk, msg, len - copied, flags, &tss, = &cmsg_flags); if (unlikely(bytes_read < 0)) { if (!copied) copied =3D bytes_read; @@ -2221,7 +2213,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, =20 copied +=3D bytes_read; =20 - if (skb_queue_empty(&msk->receive_queue) && __mptcp_move_skbs(msk)) + if (skb_queue_empty(&sk->sk_receive_queue) && __mptcp_move_skbs(sk)) continue; =20 /* only the MPTCP socket status is relevant here. The exit @@ -2247,7 +2239,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, /* race breaker: the shutdown could be after the * previous receive queue check */ - if (__mptcp_move_skbs(msk)) + if (__mptcp_move_skbs(sk)) continue; break; } @@ -2291,9 +2283,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msgh= dr *msg, size_t len, } } =20 - pr_debug("msk=3D%p rx queue empty=3D%d:%d copied=3D%d\n", - msk, skb_queue_empty_lockless(&sk->sk_receive_queue), - skb_queue_empty(&msk->receive_queue), copied); + pr_debug("msk=3D%p rx queue empty=3D%d copied=3D%d\n", + msk, skb_queue_empty(&sk->sk_receive_queue), copied); =20 release_sock(sk); return copied; @@ -2820,7 +2811,6 @@ static void __mptcp_init_sock(struct sock *sk) INIT_LIST_HEAD(&msk->join_list); INIT_LIST_HEAD(&msk->rtx_queue); INIT_WORK(&msk->work, mptcp_worker); - __skb_queue_head_init(&msk->receive_queue); msk->out_of_order_queue =3D RB_ROOT; msk->first_pending =3D NULL; WRITE_ONCE(msk->rmem_fwd_alloc, 0); @@ -3403,12 +3393,8 @@ void mptcp_destroy_common(struct mptcp_sock *msk, un= signed int flags) mptcp_for_each_subflow_safe(msk, subflow, tmp) __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags); =20 - /* move to sk_receive_queue, sk_stream_kill_queues will purge it */ - mptcp_data_lock(sk); - skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue); __skb_queue_purge(&sk->sk_receive_queue); skb_rbtree_purge(&msk->out_of_order_queue); - mptcp_data_unlock(sk); =20 /* move 
all the rx fwd alloc into the sk_mem_reclaim_final in * inet_sock_destruct() will dispose it @@ -3451,7 +3437,8 @@ void __mptcp_check_push(struct sock *sk, struct sock = *ssk) =20 #define MPTCP_FLAGS_PROCESS_CTX_NEED (BIT(MPTCP_PUSH_PENDING) | \ BIT(MPTCP_RETRANSMIT) | \ - BIT(MPTCP_FLUSH_JOIN_LIST)) + BIT(MPTCP_FLUSH_JOIN_LIST) | \ + BIT(MPTCP_DEQUEUE)) =20 /* processes deferred events and flush wmem */ static void mptcp_release_cb(struct sock *sk) @@ -3485,6 +3472,11 @@ static void mptcp_release_cb(struct sock *sk) __mptcp_push_pending(sk, 0); if (flags & BIT(MPTCP_RETRANSMIT)) __mptcp_retrans(sk); + if ((flags & BIT(MPTCP_DEQUEUE)) && __mptcp_move_skbs(sk)) { + /* notify ack seq update */ + mptcp_cleanup_rbuf(msk, 0); + sk->sk_data_ready(sk); + } =20 cond_resched(); spin_lock_bh(&sk->sk_lock.slock); @@ -3722,7 +3714,8 @@ static int mptcp_ioctl(struct sock *sk, int cmd, int = *karg) return -EINVAL; =20 lock_sock(sk); - __mptcp_move_skbs(msk); + if (__mptcp_move_skbs(sk)) + mptcp_cleanup_rbuf(msk, 0); *karg =3D mptcp_inq_hint(sk); release_sock(sk); break; diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index b4c72a73594f..ad940cc1f26f 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -124,6 +124,7 @@ #define MPTCP_FLUSH_JOIN_LIST 5 #define MPTCP_SYNC_STATE 6 #define MPTCP_SYNC_SNDBUF 7 +#define MPTCP_DEQUEUE 8 =20 struct mptcp_skb_cb { u64 map_seq; @@ -322,7 +323,6 @@ struct mptcp_sock { struct work_struct work; struct sk_buff *ooo_last_skb; struct rb_root out_of_order_queue; - struct sk_buff_head receive_queue; struct list_head conn_list; struct list_head rtx_queue; struct mptcp_data_frag *first_pending; --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 602BB1FF7A1 for ; Fri, 6 Dec 2024 
12:11:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487076; cv=none; b=hugH2RL00HnjIYqmDn6zOapiUyiNLhEH+NrLvV2E5s4DO6Hc6RfZLK0+6EVmWUBw34gMOoSJd1hMS3NDrtU2HxhRY76w2bIrqeDfK/Wr5wDBSJ4QHvlRcL7I7C9Viw24d+wlue4fw3FMCrfStn/0s+hAL279bCo/aAqKAb/+BM8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487076; c=relaxed/simple; bh=N5EAGa8ayyMc9yxv8vM4o2y/bFRDyp68LmtLa89ZyQU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; b=El6DRRQeKVBzFyN27IfP0vGgxekCIDIg2pM+gO52mm2LEblPEcYydZ4YQUWfX0NSMR77hIbcKRdGP7sAnVYr0vaK26wyCZpQxLdNz+VJG/XCOGX5wNjsjJ1e1VlN2vLRCsmysNZHhB3pvgmeC81jfJpmq2oWTgghj2Zjx4Q6vVo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=RgtZekYB; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="RgtZekYB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487073; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QbHFfINypOIognEw9lvtbTAUS0vFOuIKEc0jegXnfW0=; b=RgtZekYBTEzyUjal6vPaUHxDQn0EN4A5XTJTXysOkFI+CZCP49ARyhVNOn212PyaSuB+sc UNfOyPbF+998BSedyj4M8xFjOk7fqCpMdzuoeG4mBhui1AtHaZevuVBRvHVlF/sW52AxCm ftXzoUlFdCg3TSbyVw7iKVBn0q6uFkk= Received: from 
mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-571-g4r0PpvbMfuaaLPMNASjQQ-1; Fri, 06 Dec 2024 07:11:12 -0500 X-MC-Unique: g4r0PpvbMfuaaLPMNASjQQ-1 X-Mimecast-MFC-AGG-ID: g4r0PpvbMfuaaLPMNASjQQ Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A0E6A1956053 for ; Fri, 6 Dec 2024 12:11:11 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D30B319560A0 for ; Fri, 6 Dec 2024 12:11:10 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 6/7] mptcp: cleanup mem accounting. Date: Fri, 6 Dec 2024 13:11:08 +0100 Message-ID: <152364cf85476115c84f435d76a8f04da9e2d089.1733486870.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: U4IS2z8emBwCMJtN_JPlnBk38rox1CDOrxnvcvAvyzY_1733487071 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After the previous patch, updating sk_forward_memory is cheap and we can drop a lot of complexity from the MPTCP memory accounting, removing the custom fwd mem allocations for rmem.
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- v1 -> v2: - keep 'snd_una' and recovery-related fields under the msk data lock - dropped unneeded code in __mptcp_move_skbs() --- net/mptcp/fastopen.c | 2 +- net/mptcp/protocol.c | 115 +++---------------------------------------- net/mptcp/protocol.h | 4 +- 3 files changed, 10 insertions(+), 111 deletions(-) diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c index fb945c0d50bf..b0f1dddfb143 100644 --- a/net/mptcp/fastopen.c +++ b/net/mptcp/fastopen.c @@ -51,7 +51,7 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptc= p_subflow_context *subf mptcp_data_lock(sk); DEBUG_NET_WARN_ON_ONCE(sock_owned_by_user_nocheck(sk)); =20 - mptcp_set_owner_r(skb, sk); + skb_set_owner_r(skb, sk); __skb_queue_tail(&sk->sk_receive_queue, skb); mptcp_sk(sk)->bytes_received +=3D skb->len; =20 diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 42d04de32560..4f27b0cafac5 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -118,17 +118,6 @@ static void mptcp_drop(struct sock *sk, struct sk_buff= *skb) __kfree_skb(skb); } =20 -static void mptcp_rmem_fwd_alloc_add(struct sock *sk, int size) -{ - WRITE_ONCE(mptcp_sk(sk)->rmem_fwd_alloc, - mptcp_sk(sk)->rmem_fwd_alloc + size); -} - -static void mptcp_rmem_charge(struct sock *sk, int size) -{ - mptcp_rmem_fwd_alloc_add(sk, -size); -} - static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to, struct sk_buff *from) { @@ -150,7 +139,7 @@ static bool mptcp_try_coalesce(struct sock *sk, struct = sk_buff *to, * negative one */ atomic_add(delta, &sk->sk_rmem_alloc); - mptcp_rmem_charge(sk, delta); + sk_mem_charge(sk, delta); kfree_skb_partial(from, fragstolen); =20 return true; @@ -165,44 +154,6 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *= msk, struct sk_buff *to, return mptcp_try_coalesce((struct sock *)msk, to, from); } =20 -static void __mptcp_rmem_reclaim(struct sock *sk, int amount) -{ - amount >>=3D PAGE_SHIFT; - 
mptcp_rmem_charge(sk, amount << PAGE_SHIFT); - __sk_mem_reduce_allocated(sk, amount); -} - -static void mptcp_rmem_uncharge(struct sock *sk, int size) -{ - struct mptcp_sock *msk =3D mptcp_sk(sk); - int reclaimable; - - mptcp_rmem_fwd_alloc_add(sk, size); - reclaimable =3D msk->rmem_fwd_alloc - sk_unused_reserved_mem(sk); - - /* see sk_mem_uncharge() for the rationale behind the following schema */ - if (unlikely(reclaimable >=3D PAGE_SIZE)) - __mptcp_rmem_reclaim(sk, reclaimable); -} - -static void mptcp_rfree(struct sk_buff *skb) -{ - unsigned int len =3D skb->truesize; - struct sock *sk =3D skb->sk; - - atomic_sub(len, &sk->sk_rmem_alloc); - mptcp_rmem_uncharge(sk, len); -} - -void mptcp_set_owner_r(struct sk_buff *skb, struct sock *sk) -{ - skb_orphan(skb); - skb->sk =3D sk; - skb->destructor =3D mptcp_rfree; - atomic_add(skb->truesize, &sk->sk_rmem_alloc); - mptcp_rmem_charge(sk, skb->truesize); -} - /* "inspired" by tcp_data_queue_ofo(), main differences: * - use mptcp seqs * - don't cope with sacks @@ -315,25 +266,7 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *ms= k, struct sk_buff *skb) =20 end: skb_condense(skb); - mptcp_set_owner_r(skb, sk); -} - -static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int siz= e) -{ - struct mptcp_sock *msk =3D mptcp_sk(sk); - int amt, amount; - - if (size <=3D msk->rmem_fwd_alloc) - return true; - - size -=3D msk->rmem_fwd_alloc; - amt =3D sk_mem_pages(size); - amount =3D amt << PAGE_SHIFT; - if (!__sk_mem_raise_allocated(sk, size, amt, SK_MEM_RECV)) - return false; - - mptcp_rmem_fwd_alloc_add(sk, amount); - return true; + skb_set_owner_r(skb, sk); } =20 static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk, @@ -351,7 +284,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, st= ruct sock *ssk, skb_orphan(skb); =20 /* try to fetch required memory from subflow */ - if (!mptcp_rmem_schedule(sk, ssk, skb->truesize)) { + if (!sk_rmem_schedule(sk, skb, skb->truesize)) { 
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED); goto drop; } @@ -375,7 +308,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, st= ruct sock *ssk, if (tail && mptcp_try_coalesce(sk, tail, skb)) return true; =20 - mptcp_set_owner_r(skb, sk); + skb_set_owner_r(skb, sk); __skb_queue_tail(&sk->sk_receive_queue, skb); return true; } else if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) { @@ -1983,9 +1916,10 @@ static int __mptcp_recvmsg_mskq(struct sock *sk, } =20 if (!(flags & MSG_PEEK)) { - /* we will bulk release the skb memory later */ + /* avoid the indirect call, we know the destructor is sock_wfree */ skb->destructor =3D NULL; - WRITE_ONCE(msk->rmem_released, msk->rmem_released + skb->truesize); + atomic_sub(skb->truesize, &sk->sk_rmem_alloc); + sk_mem_uncharge(sk, skb->truesize); __skb_unlink(skb, &sk->sk_receive_queue); __kfree_skb(skb); msk->bytes_consumed +=3D count; @@ -2099,18 +2033,6 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock= *msk, int copied) msk->rcvq_space.time =3D mstamp; } =20 -static void __mptcp_update_rmem(struct sock *sk) -{ - struct mptcp_sock *msk =3D mptcp_sk(sk); - - if (!msk->rmem_released) - return; - - atomic_sub(msk->rmem_released, &sk->sk_rmem_alloc); - mptcp_rmem_uncharge(sk, msk->rmem_released); - WRITE_ONCE(msk->rmem_released, 0); -} - static bool __mptcp_move_skbs(struct sock *sk) { struct mptcp_subflow_context *subflow; @@ -2134,7 +2056,6 @@ static bool __mptcp_move_skbs(struct sock *sk) break; =20 slowpath =3D lock_sock_fast(ssk); - __mptcp_update_rmem(sk); done =3D __mptcp_move_skbs_from_subflow(msk, ssk, &moved); =20 if (unlikely(ssk->sk_err)) @@ -2142,12 +2063,7 @@ static bool __mptcp_move_skbs(struct sock *sk) unlock_sock_fast(ssk, slowpath); } while (!done); =20 - ret =3D moved > 0; - if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) || - !skb_queue_empty(&sk->sk_receive_queue)) { - __mptcp_update_rmem(sk); - ret |=3D __mptcp_ofo_queue(msk); - } + ret =3D moved > 0 || __mptcp_ofo_queue(msk); if (ret) 
mptcp_check_data_fin((struct sock *)msk); return ret; @@ -2813,8 +2729,6 @@ static void __mptcp_init_sock(struct sock *sk) INIT_WORK(&msk->work, mptcp_worker); msk->out_of_order_queue =3D RB_ROOT; msk->first_pending =3D NULL; - WRITE_ONCE(msk->rmem_fwd_alloc, 0); - WRITE_ONCE(msk->rmem_released, 0); msk->timer_ival =3D TCP_RTO_MIN; msk->scaling_ratio =3D TCP_DEFAULT_SCALING_RATIO; =20 @@ -3040,8 +2954,6 @@ static void __mptcp_destroy_sock(struct sock *sk) =20 sk->sk_prot->destroy(sk); =20 - WARN_ON_ONCE(READ_ONCE(msk->rmem_fwd_alloc)); - WARN_ON_ONCE(msk->rmem_released); sk_stream_kill_queues(sk); xfrm_sk_free_policy(sk); =20 @@ -3399,8 +3311,6 @@ void mptcp_destroy_common(struct mptcp_sock *msk, uns= igned int flags) /* move all the rx fwd alloc into the sk_mem_reclaim_final in * inet_sock_destruct() will dispose it */ - sk_forward_alloc_add(sk, msk->rmem_fwd_alloc); - WRITE_ONCE(msk->rmem_fwd_alloc, 0); mptcp_token_destroy(msk); mptcp_pm_free_anno_list(msk); mptcp_free_local_addr_list(msk); @@ -3496,8 +3406,6 @@ static void mptcp_release_cb(struct sock *sk) if (__test_and_clear_bit(MPTCP_SYNC_SNDBUF, &msk->cb_flags)) __mptcp_sync_sndbuf(sk); } - - __mptcp_update_rmem(sk); } =20 /* MP_JOIN client subflow must wait for 4th ack before sending any data: @@ -3668,12 +3576,6 @@ static void mptcp_shutdown(struct sock *sk, int how) __mptcp_wr_shutdown(sk); } =20 -static int mptcp_forward_alloc_get(const struct sock *sk) -{ - return READ_ONCE(sk->sk_forward_alloc) + - READ_ONCE(mptcp_sk(sk)->rmem_fwd_alloc); -} - static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v) { const struct sock *sk =3D (void *)msk; @@ -3832,7 +3734,6 @@ static struct proto mptcp_prot =3D { .hash =3D mptcp_hash, .unhash =3D mptcp_unhash, .get_port =3D mptcp_get_port, - .forward_alloc_get =3D mptcp_forward_alloc_get, .stream_memory_free =3D mptcp_stream_memory_free, .sockets_allocated =3D &mptcp_sockets_allocated, =20 diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 
ad940cc1f26f..a0d46b69746d 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -278,7 +278,6 @@ struct mptcp_sock { u64 rcv_data_fin_seq; u64 bytes_retrans; u64 bytes_consumed; - int rmem_fwd_alloc; int snd_burst; int old_wspace; u64 recovery_snd_nxt; /* in recovery mode accept up to this seq; @@ -293,7 +292,6 @@ struct mptcp_sock { u32 last_ack_recv; unsigned long timer_ival; u32 token; - int rmem_released; unsigned long flags; unsigned long cb_flags; bool recovery; /* closing subflow write queue reinjected */ @@ -384,7 +382,7 @@ static inline void msk_owned_by_me(const struct mptcp_s= ock *msk) */ static inline int __mptcp_rmem(const struct sock *sk) { - return atomic_read(&sk->sk_rmem_alloc) - READ_ONCE(mptcp_sk(sk)->rmem_rel= eased); + return atomic_read(&sk->sk_rmem_alloc); } =20 static inline int mptcp_win_from_space(const struct sock *sk, int space) --=20 2.45.2 From nobody Wed Jan 8 06:16:40 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DBF741FF7A1 for ; Fri, 6 Dec 2024 12:11:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487083; cv=none; b=kGnvK/KsYXz8MFlpVPtehtyqWpQybivGoV1VKg6jFBbtPTssqmAmYqKkoxYMuhIB4PBxsncmLOfm+sXd9c+j0aeOusOmzb9PuZJZhmuYFCr06402kgzwLDPfIjT3pFgAthoEkgwyEVAG0xVHsaimV8qXLLNe6loEgT/IbT51wEk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733487083; c=relaxed/simple; bh=OP2x2sT4/evFes6aikOpvmLL6ZF2P8+lSi+wylxgd6Y=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:content-type; 
b=oT6nMRn+Jy4kfBRl27EN9gOFH7vJYVUuflNIOhn+SQudX0CIcEo7+cJ7IZmsYs6+dgiriWePuBNdtHCbwK/NWhKfCPK2cgOCKr6JMyfQYnV6Pz62UOXWuEzj2lUztCLH9V/XrsmsSm7z1WUClUgFQBw4QaOxvb2cTeb8xhAy3/s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=MPd2kW+z; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MPd2kW+z" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1733487080; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lhRAL4quweAJyOzJHKhn3E8lg7d7AH45OOR5g93Eb/w=; b=MPd2kW+zh3k07gFyIobG40AvWw7WK9gI1rxQXShBzMNoxOaPGMSycWArBhFl0o40OMGSE5 ApR7ahFWkHfnKwTITG4zRUCvwQgEefRoHFU2ThfxvdJt1zKgt0DyEpjoR/Uxr9N4RKiSp2 Jnp/yTOahScUOGReZdtEBDurJf/FlBU= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-38-uz6Pm4JENuiNZKIUouOXyg-1; Fri, 06 Dec 2024 07:11:19 -0500 X-MC-Unique: uz6Pm4JENuiNZKIUouOXyg-1 X-Mimecast-MFC-AGG-ID: uz6Pm4JENuiNZKIUouOXyg Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) 
(No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B34E51956054 for ; Fri, 6 Dec 2024 12:11:17 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.243]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id E7B651956095 for ; Fri, 6 Dec 2024 12:11:16 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 7/7] net: dismiss sk_forward_alloc_get() Date: Fri, 6 Dec 2024 13:11:14 +0100 Message-ID: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-MFC-PROC-ID: iispmeqHp93JvUMGC_cLVWn3V0ESviRXyaev2-kAiqs_1733487077 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After the previous patch we can remove the forward_alloc_get proto callback, basically reverting commit 292e6077b040 ("net: introduce sk_forward_alloc_get()") and commit 66d58f046c9d ("net: use sk_forward_alloc_get() in sk_get_meminfo()"). 
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- include/net/sock.h | 13 ------------- net/core/sock.c | 2 +- net/ipv4/af_inet.c | 2 +- net/ipv4/inet_diag.c | 2 +- net/sched/em_meta.c | 2 +- 5 files changed, 4 insertions(+), 17 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 7464e9f9f47c..a5c28a4f0263 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1283,10 +1283,6 @@ struct proto { unsigned int inuse_idx; #endif =20 -#if IS_ENABLED(CONFIG_MPTCP) - int (*forward_alloc_get)(const struct sock *sk); -#endif - bool (*stream_memory_free)(const struct sock *sk, int wake); bool (*sock_is_readable)(struct sock *sk); /* Memory pressure */ @@ -1347,15 +1343,6 @@ int sock_load_diag_module(int family, int protocol); =20 INDIRECT_CALLABLE_DECLARE(bool tcp_stream_memory_free(const struct sock *s= k, int wake)); =20 -static inline int sk_forward_alloc_get(const struct sock *sk) -{ -#if IS_ENABLED(CONFIG_MPTCP) - if (sk->sk_prot->forward_alloc_get) - return sk->sk_prot->forward_alloc_get(sk); -#endif - return READ_ONCE(sk->sk_forward_alloc); -} - static inline bool __sk_stream_memory_free(const struct sock *sk, int wake) { if (READ_ONCE(sk->sk_wmem_queued) >=3D READ_ONCE(sk->sk_sndbuf)) diff --git a/net/core/sock.c b/net/core/sock.c index 74729d20cd00..06b732604bf2 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3858,7 +3858,7 @@ void sk_get_meminfo(const struct sock *sk, u32 *mem) mem[SK_MEMINFO_RCVBUF] =3D READ_ONCE(sk->sk_rcvbuf); mem[SK_MEMINFO_WMEM_ALLOC] =3D sk_wmem_alloc_get(sk); mem[SK_MEMINFO_SNDBUF] =3D READ_ONCE(sk->sk_sndbuf); - mem[SK_MEMINFO_FWD_ALLOC] =3D sk_forward_alloc_get(sk); + mem[SK_MEMINFO_FWD_ALLOC] =3D READ_ONCE(sk->sk_forward_alloc); mem[SK_MEMINFO_WMEM_QUEUED] =3D READ_ONCE(sk->sk_wmem_queued); mem[SK_MEMINFO_OPTMEM] =3D atomic_read(&sk->sk_omem_alloc); mem[SK_MEMINFO_BACKLOG] =3D READ_ONCE(sk->sk_backlog.len); diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 8095e82de808..a460ef3a2b0b 100644 
--- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -153,7 +153,7 @@ void inet_sock_destruct(struct sock *sk) WARN_ON_ONCE(atomic_read(&sk->sk_rmem_alloc)); WARN_ON_ONCE(refcount_read(&sk->sk_wmem_alloc)); WARN_ON_ONCE(sk->sk_wmem_queued); - WARN_ON_ONCE(sk_forward_alloc_get(sk)); + WARN_ON_ONCE(sk->sk_forward_alloc); =20 kfree(rcu_dereference_protected(inet->inet_opt, 1)); dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1)); diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 321acc8abf17..efe2a085cf68 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -282,7 +282,7 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_conn= ection_sock *icsk, struct inet_diag_meminfo minfo =3D { .idiag_rmem =3D sk_rmem_alloc_get(sk), .idiag_wmem =3D READ_ONCE(sk->sk_wmem_queued), - .idiag_fmem =3D sk_forward_alloc_get(sk), + .idiag_fmem =3D READ_ONCE(sk->sk_forward_alloc), .idiag_tmem =3D sk_wmem_alloc_get(sk), }; =20 diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c index 8996c73c9779..3f2e707a11d1 100644 --- a/net/sched/em_meta.c +++ b/net/sched/em_meta.c @@ -460,7 +460,7 @@ META_COLLECTOR(int_sk_fwd_alloc) *err =3D -1; return; } - dst->value =3D sk_forward_alloc_get(sk); + dst->value =3D READ_ONCE(sk->sk_forward_alloc); } =20 META_COLLECTOR(int_sk_sndbuf) --=20 2.45.2