From nobody Tue Apr 23 17:25:59 2024
From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next v2 1/3] tcp: define macros for a couple reclaim thresholds
Date: Fri, 15 Oct 2021 17:39:37 +0200
Message-Id: <0d3f747f3a1dcdc67e71fc93cc1d64eee9e6b803.1634312286.git.pabeni@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A following patch is going to implement a similar reclaim schema for the
MPTCP protocol, with different locking. Let's define a couple of macros
for the thresholds in use, so that the later code will be easier to
maintain.
Signed-off-by: Paolo Abeni
Acked-by: Mat Martineau
---
 include/net/sock.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index d08ab55fa4a0..9c5d0502090f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1568,6 +1568,12 @@ static inline void sk_mem_charge(struct sock *sk, int size)
 	sk->sk_forward_alloc -= size;
 }
 
+/* the following macros control memory reclaiming in
+ * sk_mem_uncharge()
+ */
+#define SK_RECLAIM_THRESHOLD	(1 << 21)
+#define SK_RECLAIM_CHUNK	(1 << 20)
+
 static inline void sk_mem_uncharge(struct sock *sk, int size)
 {
 	int reclaimable;
@@ -1584,8 +1590,8 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
 	 * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
 	 * no need to hold that much forward allocation anyway.
 	 */
-	if (unlikely(reclaimable >= 1 << 21))
-		__sk_mem_reclaim(sk, 1 << 20);
+	if (unlikely(reclaimable >= SK_RECLAIM_THRESHOLD))
+		__sk_mem_reclaim(sk, SK_RECLAIM_CHUNK);
 }
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
-- 
2.26.3

From nobody Tue Apr 23 17:25:59 2024
From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next v2 2/3] net: introduce sk_forward_alloc_get()
Date: Fri, 15 Oct 2021 17:39:38 +0200
Message-Id: <944348e29bd564a3e45d7ab6683c4e64ef48aec1.1634312286.git.pabeni@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A later patch will change the MPTCP memory accounting schema so that
MPTCP sockets encode the total amount of forward-allocated memory in
two separate fields (one for tx and one for rx). MPTCP sockets will use
their own helper to provide the accurate amount of fwd allocated
memory.

To allow the above, this patch adds a new, optional, sk method to fetch
the fwd memory, wraps the call in a new helper, and uses it where
appropriate.
Signed-off-by: Paolo Abeni
Acked-by: Mat Martineau
---
This schema was suggested a long time ago by Eric in a completely
different context:

https://marc.info/?l=linux-netdev&m=147516056204838&w=2

I'm unsure if that is still applicable/valid.
---
 include/net/sock.h   | 11 +++++++++++
 net/ipv4/af_inet.c   |  2 +-
 net/ipv4/inet_diag.c |  2 +-
 net/sched/em_meta.c  |  2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 9c5d0502090f..86ea60df6e84 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1205,6 +1205,8 @@ struct proto {
 	unsigned int		inuse_idx;
 #endif
 
+	int			(*forward_alloc_get)(const struct sock *sk);
+
 	bool			(*stream_memory_free)(const struct sock *sk, int wake);
 	bool			(*stream_memory_read)(const struct sock *sk);
 	/* Memory pressure */
@@ -1212,6 +1214,7 @@ struct proto {
 	void			(*leave_memory_pressure)(struct sock *sk);
 	atomic_long_t		*memory_allocated;	/* Current allocated memory. */
 	struct percpu_counter	*sockets_allocated;	/* Current number of sockets. */
+
 	/*
 	 * Pressure flag: try to collapse.
 	 * Technical note: it is used by multiple contexts non atomically.
@@ -1289,6 +1292,14 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
 
 INDIRECT_CALLABLE_DECLARE(bool tcp_stream_memory_free(const struct sock *sk, int wake));
 
+static inline int sk_forward_alloc_get(const struct sock *sk)
+{
+	if (!sk->sk_prot->forward_alloc_get)
+		return sk->sk_forward_alloc;
+
+	return sk->sk_prot->forward_alloc_get(sk);
+}
+
 static inline bool __sk_stream_memory_free(const struct sock *sk, int wake)
 {
 	if (READ_ONCE(sk->sk_wmem_queued) >= READ_ONCE(sk->sk_sndbuf))
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 37e69fd9246c..55f862abe811 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -150,7 +150,7 @@ void inet_sock_destruct(struct sock *sk)
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
 	WARN_ON(refcount_read(&sk->sk_wmem_alloc));
 	WARN_ON(sk->sk_wmem_queued);
-	WARN_ON(sk->sk_forward_alloc);
+	WARN_ON(sk_forward_alloc_get(sk));
 
 	kfree(rcu_dereference_protected(inet->inet_opt, 1));
 	dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1));
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index ef7897226f08..c8fa6e7f7d12 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -271,7 +271,7 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,
 		struct inet_diag_meminfo minfo = {
 			.idiag_rmem = sk_rmem_alloc_get(sk),
 			.idiag_wmem = READ_ONCE(sk->sk_wmem_queued),
-			.idiag_fmem = sk->sk_forward_alloc,
+			.idiag_fmem = sk_forward_alloc_get(sk),
 			.idiag_tmem = sk_wmem_alloc_get(sk),
 		};
 
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index 46254968d390..0a04468b7314 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -457,7 +457,7 @@ META_COLLECTOR(int_sk_fwd_alloc)
 		*err = -1;
 		return;
 	}
-	dst->value = sk->sk_forward_alloc;
+	dst->value = sk_forward_alloc_get(sk);
 }
 
 META_COLLECTOR(int_sk_sndbuf)
-- 
2.26.3

From nobody Tue Apr 23 17:25:59 2024
From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next v2 3/3] mptcp: allocate fwd memory separately on the rx and tx path
Date: Fri, 15 Oct 2021 17:39:39 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

All of the MPTCP receive path is protected by the msk socket spinlock.
As a consequence, the tx path has to play a few tricks to allocate the
forward memory without acquiring the spinlock multiple times, making
the overall TX path quite complex.

This patch cleans up the tx path by using completely separate forward
memory allocation for the rx and the tx paths.
The forward memory allocated in the rx path is now accounted in
msk->rmem_fwd_alloc and is (still) protected by the msk socket
spinlock. To cope with the above, we provide a few MPTCP-specific
variants of the helpers to charge, uncharge, reclaim and free the
forward memory in the receive path.

msk->sk_forward_alloc now accounts only the forward memory for the tx
path; we can use the plain core sock helpers to manipulate it and drop
quite a bit of complexity.

On memory pressure, reclaim both rx and tx fwd memory.

Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
---
v1 -> v2:
 - fixed WARN_ON(sk_forward_alloc_get(sk)) splat on shutdown caused by
   non empty ooo queue

RFC -> v1:
 - fix comment indent (Mat)
 - use macros instead of magic number (Mat)
 - set rmem_fwd_alloc at init time (Mat)
 - declare static __mptcp_rmem_reclaim
 - use forward_alloc_get()
---
 net/mptcp/protocol.c | 225 ++++++++++++++++++-------------------------
 net/mptcp/protocol.h |  15 +--
 2 files changed, 95 insertions(+), 145 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 21716392e754..f1186d25aec9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -126,6 +126,11 @@ static void mptcp_drop(struct sock *sk, struct sk_buff *skb)
 	__kfree_skb(skb);
 }
 
+static void mptcp_rmem_charge(struct sock *sk, int size)
+{
+	mptcp_sk(sk)->rmem_fwd_alloc -= size;
+}
+
 static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to,
			       struct sk_buff *from)
 {
@@ -142,7 +147,7 @@ static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to,
 	MPTCP_SKB_CB(to)->end_seq = MPTCP_SKB_CB(from)->end_seq;
 	kfree_skb_partial(from, fragstolen);
 	atomic_add(delta, &sk->sk_rmem_alloc);
-	sk_mem_charge(sk, delta);
+	mptcp_rmem_charge(sk, delta);
 	return true;
 }
 
@@ -155,6 +160,44 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *msk, struct sk_buff *to,
 	return mptcp_try_coalesce((struct sock *)msk, to, from);
 }
 
+static void __mptcp_rmem_reclaim(struct sock *sk, int amount)
+{
+	amount >>= SK_MEM_QUANTUM_SHIFT;
+	mptcp_sk(sk)->rmem_fwd_alloc -= amount << SK_MEM_QUANTUM_SHIFT;
+	__sk_mem_reduce_allocated(sk, amount);
+}
+
+static void mptcp_rmem_uncharge(struct sock *sk, int size)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	int reclaimable;
+
+	msk->rmem_fwd_alloc += size;
+	reclaimable = msk->rmem_fwd_alloc - sk_unused_reserved_mem(sk);
+
+	/* see sk_mem_uncharge() for the rationale behind the following schema */
+	if (unlikely(reclaimable >= SK_RECLAIM_THRESHOLD))
+		__mptcp_rmem_reclaim(sk, SK_RECLAIM_CHUNK);
+}
+
+static void mptcp_rfree(struct sk_buff *skb)
+{
+	unsigned int len = skb->truesize;
+	struct sock *sk = skb->sk;
+
+	atomic_sub(len, &sk->sk_rmem_alloc);
+	mptcp_rmem_uncharge(sk, len);
+}
+
+static void mptcp_set_owner_r(struct sk_buff *skb, struct sock *sk)
+{
+	skb_orphan(skb);
+	skb->sk = sk;
+	skb->destructor = mptcp_rfree;
+	atomic_add(skb->truesize, &sk->sk_rmem_alloc);
+	mptcp_rmem_charge(sk, skb->truesize);
+}
+
 /* "inspired" by tcp_data_queue_ofo(), main differences:
  * - use mptcp seqs
  * - don't cope with sacks
@@ -267,7 +310,29 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
 
 end:
 	skb_condense(skb);
-	skb_set_owner_r(skb, sk);
+	mptcp_set_owner_r(skb, sk);
+}
+
+static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int size)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	int amt, amount;
+
+	if (size < msk->rmem_fwd_alloc)
+		return true;
+
+	amt = sk_mem_pages(size);
+	amount = amt << SK_MEM_QUANTUM_SHIFT;
+	msk->rmem_fwd_alloc += amount;
+	if (!__sk_mem_raise_allocated(sk, size, amt, SK_MEM_RECV)) {
+		if (ssk->sk_forward_alloc < amount) {
+			msk->rmem_fwd_alloc -= amount;
+			return false;
+		}
+
+		ssk->sk_forward_alloc -= amount;
+	}
+	return true;
 }
 
 static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
@@ -285,15 +350,8 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
 	skb_orphan(skb);
 
 	/* try to fetch required memory from subflow */
-	if (!sk_rmem_schedule(sk, skb, skb->truesize)) {
-		int amount = sk_mem_pages(skb->truesize) << SK_MEM_QUANTUM_SHIFT;
-
-		if (ssk->sk_forward_alloc < amount)
-			goto drop;
-
-		ssk->sk_forward_alloc -= amount;
-		sk->sk_forward_alloc += amount;
-	}
+	if (!mptcp_rmem_schedule(sk, ssk, skb->truesize))
+		goto drop;
 
 	has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp;
 
@@ -313,7 +371,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
 		if (tail && mptcp_try_coalesce(sk, tail, skb))
 			return true;
 
-		skb_set_owner_r(skb, sk);
+		mptcp_set_owner_r(skb, sk);
 		__skb_queue_tail(&sk->sk_receive_queue, skb);
 		return true;
 	} else if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) {
@@ -908,122 +966,20 @@ static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk,
 	       df->data_seq + df->data_len == msk->write_seq;
 }
 
-static int mptcp_wmem_with_overhead(int size)
-{
-	return size + ((sizeof(struct mptcp_data_frag) * size) >> PAGE_SHIFT);
-}
-
-static void __mptcp_wmem_reserve(struct sock *sk, int size)
-{
-	int amount = mptcp_wmem_with_overhead(size);
-	struct mptcp_sock *msk = mptcp_sk(sk);
-
-	WARN_ON_ONCE(msk->wmem_reserved);
-	if (WARN_ON_ONCE(amount < 0))
-		amount = 0;
-
-	if (amount <= sk->sk_forward_alloc)
-		goto reserve;
-
-	/* under memory pressure try to reserve at most a single page
-	 * otherwise try to reserve the full estimate and fallback
-	 * to a single page before entering the error path
-	 */
-	if ((tcp_under_memory_pressure(sk) && amount > PAGE_SIZE) ||
-	    !sk_wmem_schedule(sk, amount)) {
-		if (amount <= PAGE_SIZE)
-			goto nomem;
-
-		amount = PAGE_SIZE;
-		if (!sk_wmem_schedule(sk, amount))
-			goto nomem;
-	}
-
-reserve:
-	msk->wmem_reserved = amount;
-	sk->sk_forward_alloc -= amount;
-	return;
-
-nomem:
-	/* we will wait for memory on next allocation */
-	msk->wmem_reserved = -1;
-}
-
-static void __mptcp_update_wmem(struct sock *sk)
+static void __mptcp_mem_reclaim_partial(struct sock *sk)
 {
-	struct mptcp_sock *msk = mptcp_sk(sk);
+	int reclaimable = mptcp_sk(sk)->rmem_fwd_alloc - sk_unused_reserved_mem(sk);
 
 	lockdep_assert_held_once(&sk->sk_lock.slock);
 
-	if (!msk->wmem_reserved)
-		return;
-
-	if (msk->wmem_reserved < 0)
-		msk->wmem_reserved = 0;
-	if (msk->wmem_reserved > 0) {
-		sk->sk_forward_alloc += msk->wmem_reserved;
-		msk->wmem_reserved = 0;
-	}
-}
-
-static bool mptcp_wmem_alloc(struct sock *sk, int size)
-{
-	struct mptcp_sock *msk = mptcp_sk(sk);
-
-	/* check for pre-existing error condition */
-	if (msk->wmem_reserved < 0)
-		return false;
-
-	if (msk->wmem_reserved >= size)
-		goto account;
-
-	mptcp_data_lock(sk);
-	if (!sk_wmem_schedule(sk, size)) {
-		mptcp_data_unlock(sk);
-		return false;
-	}
-
-	sk->sk_forward_alloc -= size;
-	msk->wmem_reserved += size;
-	mptcp_data_unlock(sk);
-
-account:
-	msk->wmem_reserved -= size;
-	return true;
-}
-
-static void mptcp_wmem_uncharge(struct sock *sk, int size)
-{
-	struct mptcp_sock *msk = mptcp_sk(sk);
-
-	if (msk->wmem_reserved < 0)
-		msk->wmem_reserved = 0;
-	msk->wmem_reserved += size;
-}
-
-static void __mptcp_mem_reclaim_partial(struct sock *sk)
-{
-	lockdep_assert_held_once(&sk->sk_lock.slock);
-	__mptcp_update_wmem(sk);
+	__mptcp_rmem_reclaim(sk, reclaimable - 1);
 	sk_mem_reclaim_partial(sk);
 }
 
 static void mptcp_mem_reclaim_partial(struct sock *sk)
 {
-	struct mptcp_sock *msk = mptcp_sk(sk);
-
-	/* if we are experiencing a transint allocation error,
-	 * the forward allocation memory has been already
-	 * released
-	 */
-	if (msk->wmem_reserved < 0)
-		return;
-
 	mptcp_data_lock(sk);
-	sk->sk_forward_alloc += msk->wmem_reserved;
-	sk_mem_reclaim_partial(sk);
-	msk->wmem_reserved = sk->sk_forward_alloc;
-	sk->sk_forward_alloc = 0;
+	__mptcp_mem_reclaim_partial(sk);
 	mptcp_data_unlock(sk);
 }
 
@@ -1664,7 +1620,6 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 	/* __mptcp_alloc_tx_skb could have released some wmem and we are
 	 * not going to flush it via release_sock()
 	 */
-	__mptcp_update_wmem(sk);
 	if (copied) {
 		tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
			 info.size_goal);
@@ -1701,7 +1656,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	/* silently ignore everything else */
 	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL;
 
-	mptcp_lock_sock(sk, __mptcp_wmem_reserve(sk, min_t(size_t, 1 << 20, len)));
+	lock_sock(sk);
 
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 
@@ -1749,17 +1704,17 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		psize = min_t(size_t, psize, msg_data_left(msg));
 		total_ts = psize + frag_truesize;
 
-		if (!mptcp_wmem_alloc(sk, total_ts))
+		if (!sk_wmem_schedule(sk, total_ts))
 			goto wait_for_memory;
 
 		if (copy_page_from_iter(dfrag->page, offset, psize,
					&msg->msg_iter) != psize) {
-			mptcp_wmem_uncharge(sk, psize + frag_truesize);
 			ret = -EFAULT;
 			goto out;
 		}
 
 		/* data successfully copied into the write queue */
+		sk->sk_forward_alloc -= total_ts;
 		copied += psize;
 		dfrag->data_len += psize;
 		frag_truesize += psize;
@@ -1956,7 +1911,7 @@ static void __mptcp_update_rmem(struct sock *sk)
 		return;
 
 	atomic_sub(msk->rmem_released, &sk->sk_rmem_alloc);
-	sk_mem_uncharge(sk, msk->rmem_released);
+	mptcp_rmem_uncharge(sk, msk->rmem_released);
 	WRITE_ONCE(msk->rmem_released, 0);
 }
 
@@ -2024,7 +1979,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	if (unlikely(flags & MSG_ERRQUEUE))
 		return inet_recv_error(sk, msg, len, addr_len);
 
-	mptcp_lock_sock(sk, __mptcp_splice_receive_queue(sk));
+	lock_sock(sk);
 	if (unlikely(sk->sk_state == TCP_LISTEN)) {
 		copied = -ENOTCONN;
 		goto out_err;
@@ -2504,7 +2459,7 @@ static int __mptcp_init_sock(struct sock *sk)
 	__skb_queue_head_init(&msk->receive_queue);
 	msk->out_of_order_queue = RB_ROOT;
 	msk->first_pending = NULL;
-	msk->wmem_reserved = 0;
+	msk->rmem_fwd_alloc = 0;
 	WRITE_ONCE(msk->rmem_released, 0);
 	msk->timer_ival = TCP_RTO_MIN;
 
@@ -2719,7 +2674,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
 
 	sk->sk_prot->destroy(sk);
 
-	WARN_ON_ONCE(msk->wmem_reserved);
+	WARN_ON_ONCE(msk->rmem_fwd_alloc);
 	WARN_ON_ONCE(msk->rmem_released);
 	sk_stream_kill_queues(sk);
 	xfrm_sk_free_policy(sk);
@@ -2954,8 +2909,14 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
 
 	/* move to sk_receive_queue, sk_stream_kill_queues will purge it */
 	skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue);
-
+	__skb_queue_purge(&sk->sk_receive_queue);
 	skb_rbtree_purge(&msk->out_of_order_queue);
+
+	/* move all the rx fwd alloc into the sk_mem_reclaim_final in
+	 * inet_sock_destruct() will dispose it
+	 */
+	sk->sk_forward_alloc += msk->rmem_fwd_alloc;
+	msk->rmem_fwd_alloc = 0;
 	mptcp_token_destroy(msk);
 	mptcp_pm_free_anno_list(msk);
 }
@@ -3037,10 +2998,6 @@ static void mptcp_release_cb(struct sock *sk)
 	if (test_and_clear_bit(MPTCP_ERROR_REPORT, &mptcp_sk(sk)->flags))
 		__mptcp_error_report(sk);
 
-	/* push_pending may touch wmem_reserved, ensure we do the cleanup
-	 * later
-	 */
-	__mptcp_update_wmem(sk);
 	__mptcp_update_rmem(sk);
 }
 
@@ -3190,6 +3147,11 @@ static void mptcp_shutdown(struct sock *sk, int how)
 		__mptcp_wr_shutdown(sk);
 }
 
+static int mptcp_forward_alloc_get(const struct sock *sk)
+{
+	return sk->sk_forward_alloc + mptcp_sk(sk)->rmem_fwd_alloc;
+}
+
 static struct proto mptcp_prot = {
 	.name		= "MPTCP",
 	.owner		= THIS_MODULE,
@@ -3207,6 +3169,7 @@ static struct proto mptcp_prot = {
 	.hash		= mptcp_hash,
 	.unhash		= mptcp_unhash,
 	.get_port	= mptcp_get_port,
+	.forward_alloc_get	= mptcp_forward_alloc_get,
 	.sockets_allocated	= &mptcp_sockets_allocated,
 	.memory_allocated	= &tcp_memory_allocated,
 	.memory_pressure	= &tcp_memory_pressure,
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 7379ab580a7e..cfb374634a83 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -227,7 +227,7 @@ struct mptcp_sock {
 	u64		ack_seq;
 	u64		rcv_wnd_sent;
 	u64		rcv_data_fin_seq;
-	int		wmem_reserved;
+	int		rmem_fwd_alloc;
 	struct sock	*last_snd;
 	int		snd_burst;
 	int		old_wspace;
@@ -272,19 +272,6 @@ struct mptcp_sock {
 	char		ca_name[TCP_CA_NAME_MAX];
 };
 
-#define mptcp_lock_sock(___sk, cb) do {					\
-	struct sock *__sk = (___sk); /* silence macro reuse warning */	\
-	might_sleep();							\
-	spin_lock_bh(&__sk->sk_lock.slock);				\
-	if (__sk->sk_lock.owned)					\
-		__lock_sock(__sk);					\
-	cb;								\
-	__sk->sk_lock.owned = 1;					\
-	spin_unlock(&__sk->sk_lock.slock);				\
-	mutex_acquire(&__sk->sk_lock.dep_map, 0, 0, _RET_IP_);		\
-	local_bh_enable();						\
-} while (0)
-
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
 #define mptcp_data_unlock(sk) spin_unlock_bh(&(sk)->sk_lock.slock)
 
-- 
2.26.3