From nobody Sat Feb 7 17:19:29 2026
From: Paolo Abeni <pabeni@redhat.com>
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next 1/3] tcp: define macros for a couple reclaim thresholds
Date: Fri, 15 Oct 2021 11:25:23 +0200
Message-Id: <0d3f747f3a1dcdc67e71fc93cc1d64eee9e6b803.1634289695.git.pabeni@redhat.com>
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A following patch is going to implement a similar reclaim scheme for the
MPTCP protocol, with different locking. Let's define a couple of macros
for the thresholds in use, so that the later code will be easier to
maintain.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/sock.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index d08ab55fa4a0..9c5d0502090f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1568,6 +1568,12 @@ static inline void sk_mem_charge(struct sock *sk, int size)
         sk->sk_forward_alloc -= size;
 }
 
+/* the following macros control memory reclaiming in
+ * sk_mem_uncharge()
+ */
+#define SK_RECLAIM_THRESHOLD  (1 << 21)
+#define SK_RECLAIM_CHUNK      (1 << 20)
+
 static inline void sk_mem_uncharge(struct sock *sk, int size)
 {
         int reclaimable;
@@ -1584,8 +1590,8 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
          * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
          * no need to hold that much forward allocation anyway.
          */
-        if (unlikely(reclaimable >= 1 << 21))
-                __sk_mem_reclaim(sk, 1 << 20);
+        if (unlikely(reclaimable >= SK_RECLAIM_THRESHOLD))
+                __sk_mem_reclaim(sk, SK_RECLAIM_CHUNK);
 }
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
-- 
2.26.3
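Illustration (not part of the patch): the two macros encode a simple hysteresis. Freed bytes first top up the per-socket forward-allocation reserve, and only once that reserve grows past the 2 MiB threshold is a single 1 MiB chunk returned to the shared accounting, so the hot uncharge path rarely touches global state. The stand-alone, user-space C sketch below models that behaviour with hypothetical names standing in for sk_mem_uncharge() and the new macros; it is a sketch of the idea, not kernel code.

#include <stdio.h>

#define RECLAIM_THRESHOLD (1 << 21)     /* 2 MiB: start returning memory */
#define RECLAIM_CHUNK     (1 << 20)     /* 1 MiB: amount returned at once */

static long fwd_alloc;          /* stand-in for sk->sk_forward_alloc */
static long returned_to_pool;   /* bytes handed back to the shared accounting */

/* model of an uncharge: freed bytes refill the local reserve first */
static void uncharge(int size)
{
        fwd_alloc += size;
        if (fwd_alloc >= RECLAIM_THRESHOLD) {
                fwd_alloc -= RECLAIM_CHUNK;     /* keep the rest as local reserve */
                returned_to_pool += RECLAIM_CHUNK;
        }
}

int main(void)
{
        /* freeing 1024 x 4 KiB buffers updates the shared side only 3 times */
        for (int i = 0; i < 1024; i++)
                uncharge(4096);
        printf("local reserve: %ld, returned: %ld\n", fwd_alloc, returned_to_pool);
        return 0;
}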
From nobody Sat Feb 7 17:19:29 2026
From: Paolo Abeni <pabeni@redhat.com>
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next 2/3] net: introduce sk_forward_alloc_get()
Date: Fri, 15 Oct 2021 11:25:24 +0200
Message-Id: <944348e29bd564a3e45d7ab6683c4e64ef48aec1.1634289695.git.pabeni@redhat.com>
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A later patch will change the MPTCP memory accounting scheme in such a
way that MPTCP sockets will encode the total amount of forward-allocated
memory in two separate fields (one for tx and one for rx). MPTCP sockets
will use their own helper to provide the accurate amount of
forward-allocated memory.

To allow the above, this patch adds a new, optional, socket method to
fetch the forward-allocated memory, wraps the call in a new helper and
uses it where appropriate.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
This scheme was suggested a long time ago by Eric in a completely
different context:

https://marc.info/?l=linux-netdev&m=147516056204838&w=2

I'm unsure whether it is still applicable/valid.
---
 include/net/sock.h   | 11 +++++++++++
 net/ipv4/af_inet.c   |  2 +-
 net/ipv4/inet_diag.c |  2 +-
 net/sched/em_meta.c  |  2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 9c5d0502090f..86ea60df6e84 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1205,6 +1205,8 @@ struct proto {
         unsigned int            inuse_idx;
 #endif
 
+        int                     (*forward_alloc_get)(const struct sock *sk);
+
         bool                    (*stream_memory_free)(const struct sock *sk, int wake);
         bool                    (*stream_memory_read)(const struct sock *sk);
         /* Memory pressure */
@@ -1212,6 +1214,7 @@ struct proto {
         void                    (*leave_memory_pressure)(struct sock *sk);
         atomic_long_t           *memory_allocated;      /* Current allocated memory. */
         struct percpu_counter   *sockets_allocated;     /* Current number of sockets. */
+
         /*
          * Pressure flag: try to collapse.
          * Technical note: it is used by multiple contexts non atomically.
@@ -1289,6 +1292,14 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
 
 INDIRECT_CALLABLE_DECLARE(bool tcp_stream_memory_free(const struct sock *sk, int wake));
 
+static inline int sk_forward_alloc_get(const struct sock *sk)
+{
+        if (!sk->sk_prot->forward_alloc_get)
+                return sk->sk_forward_alloc;
+
+        return sk->sk_prot->forward_alloc_get(sk);
+}
+
 static inline bool __sk_stream_memory_free(const struct sock *sk, int wake)
 {
         if (READ_ONCE(sk->sk_wmem_queued) >= READ_ONCE(sk->sk_sndbuf))
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 37e69fd9246c..55f862abe811 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -150,7 +150,7 @@ void inet_sock_destruct(struct sock *sk)
         WARN_ON(atomic_read(&sk->sk_rmem_alloc));
         WARN_ON(refcount_read(&sk->sk_wmem_alloc));
         WARN_ON(sk->sk_wmem_queued);
-        WARN_ON(sk->sk_forward_alloc);
+        WARN_ON(sk_forward_alloc_get(sk));
 
         kfree(rcu_dereference_protected(inet->inet_opt, 1));
         dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1));
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index ef7897226f08..c8fa6e7f7d12 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -271,7 +271,7 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,
                 struct inet_diag_meminfo minfo = {
                         .idiag_rmem = sk_rmem_alloc_get(sk),
                         .idiag_wmem = READ_ONCE(sk->sk_wmem_queued),
-                        .idiag_fmem = sk->sk_forward_alloc,
+                        .idiag_fmem = sk_forward_alloc_get(sk),
                         .idiag_tmem = sk_wmem_alloc_get(sk),
                 };
 
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index 46254968d390..0a04468b7314 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -457,7 +457,7 @@ META_COLLECTOR(int_sk_fwd_alloc)
                 *err = -1;
                 return;
         }
-        dst->value = sk->sk_forward_alloc;
+        dst->value = sk_forward_alloc_get(sk);
 }
 
 META_COLLECTOR(int_sk_sndbuf)
-- 
2.26.3
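Illustration (not part of the patch): the new hook follows a common pattern, an optional per-protocol accessor with a fallback to the generic field when the protocol does not override it. A minimal, self-contained C sketch of that pattern follows; the my_proto/my_sock types and names are hypothetical stand-ins, not the kernel structures.

#include <stdio.h>

struct my_sock;

struct my_proto {
        /* optional hook: only protocols with special accounting set it */
        int (*forward_alloc_get)(const struct my_sock *sk);
};

struct my_sock {
        const struct my_proto *prot;
        int forward_alloc;      /* generic forward-allocated memory */
        int extra_fwd_alloc;    /* e.g. a separate rx pool, protocol specific */
};

/* generic helper: use the protocol hook when present, else the plain field */
static int sock_forward_alloc_get(const struct my_sock *sk)
{
        if (!sk->prot->forward_alloc_get)
                return sk->forward_alloc;
        return sk->prot->forward_alloc_get(sk);
}

/* a protocol that keeps two pools reports their sum, as MPTCP does later */
static int split_forward_alloc_get(const struct my_sock *sk)
{
        return sk->forward_alloc + sk->extra_fwd_alloc;
}

static const struct my_proto plain_proto = { 0 };
static const struct my_proto split_proto = { .forward_alloc_get = split_forward_alloc_get };

int main(void)
{
        struct my_sock a = { &plain_proto, 4096, 0 };
        struct my_sock b = { &split_proto, 4096, 8192 };

        printf("plain=%d split=%d\n",
               sock_forward_alloc_get(&a), sock_forward_alloc_get(&b));
        return 0;
}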
From nobody Sat Feb 7 17:19:29 2026
From: Paolo Abeni <pabeni@redhat.com>
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next 3/3] mptcp: allocate fwd memory separately on the rx and tx path
Date: Fri, 15 Oct 2021 11:25:25 +0200
Message-Id: <7c175c9533f66651a3641830683ce804f3915c3b.1634289695.git.pabeni@redhat.com>
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

charset="utf-8" All the mptcp receive path is protected by the msk socket spinlock. As consequence, the tx path has to play a few tricks to allocate the forward memory without acquiring the spinlock multile times, making the overall TX path quite complex. This patch tries to clean-up a bit the tx path, using completely separated fwd memory allocation, for the rx and the tx path. The forward memory allocated in the rx path is now accounted in msk->rmem_fwd_alloc and is (still) protected by the msk socket spinlock. To cope with the above we provied a few MPTCP-specific variant for the helpers to charge, uncharge, reclaim and free the forward memory in the receive path. msk->sk_forward_alloc now accounts only the forward memory for the tx path, we can use the plain core sock helper to manipulate it and drop quite a bit of complexity. On memory pressure reclaim both rx and tx fwd memory. Signed-off-by: Paolo Abeni --- RFC -> v1: - fix comment indent (Mat) - use macros instead of magic number (Mat) - set rmem_fwd_alloc at init time (Mat) - declare static __mptcp_rmem_reclaim - use forward_alloc_get() --- net/mptcp/protocol.c | 220 ++++++++++++++++++------------------------- net/mptcp/protocol.h | 15 +-- 2 files changed, 91 insertions(+), 144 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 21716392e754..d68685b3a2fc 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -126,6 +126,11 @@ static void mptcp_drop(struct sock *sk, struct sk_buff= *skb) __kfree_skb(skb); } =20 +static void mptcp_rmem_charge(struct sock *sk, int size) +{ + mptcp_sk(sk)->rmem_fwd_alloc -=3D size; +} + static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to, struct sk_buff *from) { @@ -142,7 +147,7 @@ static bool mptcp_try_coalesce(struct sock *sk, struct = sk_buff *to, MPTCP_SKB_CB(to)->end_seq =3D MPTCP_SKB_CB(from)->end_seq; kfree_skb_partial(from, fragstolen); atomic_add(delta, &sk->sk_rmem_alloc); - sk_mem_charge(sk, delta); + mptcp_rmem_charge(sk, delta); return true; } =20 @@ -155,6 +160,44 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *= msk, struct sk_buff *to, return mptcp_try_coalesce((struct sock *)msk, to, from); } =20 +static void __mptcp_rmem_reclaim(struct sock *sk, int amount) +{ + amount >>=3D SK_MEM_QUANTUM_SHIFT; + mptcp_sk(sk)->rmem_fwd_alloc -=3D amount << SK_MEM_QUANTUM_SHIFT; + __sk_mem_reduce_allocated(sk, amount); +} + +static void mptcp_rmem_uncharge(struct sock *sk, int size) +{ + struct mptcp_sock *msk =3D mptcp_sk(sk); + int reclaimable; + + msk->rmem_fwd_alloc +=3D size; + reclaimable =3D msk->rmem_fwd_alloc - sk_unused_reserved_mem(sk); + + /* see sk_mem_uncharge() for the rationale behind the following schema */ + if (unlikely(reclaimable >=3D SK_RECLAIM_THRESHOLD)) + __mptcp_rmem_reclaim(sk, SK_RECLAIM_CHUNK); +} + +static void mptcp_rfree(struct sk_buff *skb) +{ + unsigned int len =3D skb->truesize; + struct sock *sk =3D skb->sk; + + atomic_sub(len, &sk->sk_rmem_alloc); + mptcp_rmem_uncharge(sk, len); +} + +static void mptcp_set_owner_r(struct sk_buff *skb, struct sock *sk) +{ + skb_orphan(skb); + skb->sk =3D sk; + skb->destructor =3D mptcp_rfree; + atomic_add(skb->truesize, &sk->sk_rmem_alloc); + mptcp_rmem_charge(sk, skb->truesize); +} + /* "inspired" by tcp_data_queue_ofo(), main differences: * - use mptcp seqs * - don't cope with sacks @@ -267,7 +310,29 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *ms= k, struct sk_buff *skb) =20 end: skb_condense(skb); - skb_set_owner_r(skb, sk); + mptcp_set_owner_r(skb, sk); +} + 
+static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int size)
+{
+        struct mptcp_sock *msk = mptcp_sk(sk);
+        int amt, amount;
+
+        if (size < msk->rmem_fwd_alloc)
+                return true;
+
+        amt = sk_mem_pages(size);
+        amount = amt << SK_MEM_QUANTUM_SHIFT;
+        msk->rmem_fwd_alloc += amount;
+        if (!__sk_mem_raise_allocated(sk, size, amt, SK_MEM_RECV)) {
+                if (ssk->sk_forward_alloc < amount) {
+                        msk->rmem_fwd_alloc -= amount;
+                        return false;
+                }
+
+                ssk->sk_forward_alloc -= amount;
+        }
+        return true;
 }
 
 static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
@@ -285,15 +350,8 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
         skb_orphan(skb);
 
         /* try to fetch required memory from subflow */
-        if (!sk_rmem_schedule(sk, skb, skb->truesize)) {
-                int amount = sk_mem_pages(skb->truesize) << SK_MEM_QUANTUM_SHIFT;
-
-                if (ssk->sk_forward_alloc < amount)
-                        goto drop;
-
-                ssk->sk_forward_alloc -= amount;
-                sk->sk_forward_alloc += amount;
-        }
+        if (!mptcp_rmem_schedule(sk, ssk, skb->truesize))
+                goto drop;
 
         has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp;
 
@@ -313,7 +371,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk,
                 if (tail && mptcp_try_coalesce(sk, tail, skb))
                         return true;
 
-                skb_set_owner_r(skb, sk);
+                mptcp_set_owner_r(skb, sk);
                 __skb_queue_tail(&sk->sk_receive_queue, skb);
                 return true;
         } else if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) {
@@ -908,122 +966,20 @@ static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk,
                 df->data_seq + df->data_len == msk->write_seq;
 }
 
-static int mptcp_wmem_with_overhead(int size)
-{
-        return size + ((sizeof(struct mptcp_data_frag) * size) >> PAGE_SHIFT);
-}
-
-static void __mptcp_wmem_reserve(struct sock *sk, int size)
-{
-        int amount = mptcp_wmem_with_overhead(size);
-        struct mptcp_sock *msk = mptcp_sk(sk);
-
-        WARN_ON_ONCE(msk->wmem_reserved);
-        if (WARN_ON_ONCE(amount < 0))
-                amount = 0;
-
-        if (amount <= sk->sk_forward_alloc)
-                goto reserve;
-
-        /* under memory pressure try to reserve at most a single page
-         * otherwise try to reserve the full estimate and fallback
-         * to a single page before entering the error path
-         */
-        if ((tcp_under_memory_pressure(sk) && amount > PAGE_SIZE) ||
-            !sk_wmem_schedule(sk, amount)) {
-                if (amount <= PAGE_SIZE)
-                        goto nomem;
-
-                amount = PAGE_SIZE;
-                if (!sk_wmem_schedule(sk, amount))
-                        goto nomem;
-        }
-
-reserve:
-        msk->wmem_reserved = amount;
-        sk->sk_forward_alloc -= amount;
-        return;
-
-nomem:
-        /* we will wait for memory on next allocation */
-        msk->wmem_reserved = -1;
-}
-
-static void __mptcp_update_wmem(struct sock *sk)
+static void __mptcp_mem_reclaim_partial(struct sock *sk)
 {
-        struct mptcp_sock *msk = mptcp_sk(sk);
+        int reclaimable = mptcp_sk(sk)->rmem_fwd_alloc - sk_unused_reserved_mem(sk);
 
         lockdep_assert_held_once(&sk->sk_lock.slock);
 
-        if (!msk->wmem_reserved)
-                return;
-
-        if (msk->wmem_reserved < 0)
-                msk->wmem_reserved = 0;
-        if (msk->wmem_reserved > 0) {
-                sk->sk_forward_alloc += msk->wmem_reserved;
-                msk->wmem_reserved = 0;
-        }
-}
-
-static bool mptcp_wmem_alloc(struct sock *sk, int size)
-{
-        struct mptcp_sock *msk = mptcp_sk(sk);
-
-        /* check for pre-existing error condition */
-        if (msk->wmem_reserved < 0)
-                return false;
-
-        if (msk->wmem_reserved >= size)
-                goto account;
-
-        mptcp_data_lock(sk);
-        if (!sk_wmem_schedule(sk, size)) {
-                mptcp_data_unlock(sk);
-                return false;
-        }
-
-        sk->sk_forward_alloc -= size;
-        msk->wmem_reserved += size;
-        mptcp_data_unlock(sk);
-
-account:
-        msk->wmem_reserved -= size;
-        return true;
-}
-
-static void mptcp_wmem_uncharge(struct sock *sk, int size)
-{
-        struct mptcp_sock *msk = mptcp_sk(sk);
-
-        if (msk->wmem_reserved < 0)
-                msk->wmem_reserved = 0;
-        msk->wmem_reserved += size;
-}
-
-static void __mptcp_mem_reclaim_partial(struct sock *sk)
-{
-        lockdep_assert_held_once(&sk->sk_lock.slock);
-        __mptcp_update_wmem(sk);
+        __mptcp_rmem_reclaim(sk, reclaimable - 1);
         sk_mem_reclaim_partial(sk);
 }
 
 static void mptcp_mem_reclaim_partial(struct sock *sk)
 {
-        struct mptcp_sock *msk = mptcp_sk(sk);
-
-        /* if we are experiencing a transint allocation error,
-         * the forward allocation memory has been already
-         * released
-         */
-        if (msk->wmem_reserved < 0)
-                return;
-
         mptcp_data_lock(sk);
-        sk->sk_forward_alloc += msk->wmem_reserved;
-        sk_mem_reclaim_partial(sk);
-        msk->wmem_reserved = sk->sk_forward_alloc;
-        sk->sk_forward_alloc = 0;
+        __mptcp_mem_reclaim_partial(sk);
         mptcp_data_unlock(sk);
 }
 
@@ -1664,7 +1620,6 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
         /* __mptcp_alloc_tx_skb could have released some wmem and we are
          * not going to flush it via release_sock()
          */
-        __mptcp_update_wmem(sk);
         if (copied) {
                 tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
                          info.size_goal);
@@ -1701,7 +1656,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
         /* silently ignore everything else */
         msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL;
 
-        mptcp_lock_sock(sk, __mptcp_wmem_reserve(sk, min_t(size_t, 1 << 20, len)));
+        lock_sock(sk);
 
         timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 
@@ -1749,17 +1704,17 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
                 psize = min_t(size_t, psize, msg_data_left(msg));
                 total_ts = psize + frag_truesize;
 
-                if (!mptcp_wmem_alloc(sk, total_ts))
+                if (!sk_wmem_schedule(sk, total_ts))
                         goto wait_for_memory;
 
                 if (copy_page_from_iter(dfrag->page, offset, psize,
                                         &msg->msg_iter) != psize) {
-                        mptcp_wmem_uncharge(sk, psize + frag_truesize);
                         ret = -EFAULT;
                         goto out;
                 }
 
                 /* data successfully copied into the write queue */
+                sk->sk_forward_alloc -= total_ts;
                 copied += psize;
                 dfrag->data_len += psize;
                 frag_truesize += psize;
@@ -1956,7 +1911,7 @@ static void __mptcp_update_rmem(struct sock *sk)
                 return;
 
         atomic_sub(msk->rmem_released, &sk->sk_rmem_alloc);
-        sk_mem_uncharge(sk, msk->rmem_released);
+        mptcp_rmem_uncharge(sk, msk->rmem_released);
         WRITE_ONCE(msk->rmem_released, 0);
 }
 
@@ -2024,7 +1979,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
         if (unlikely(flags & MSG_ERRQUEUE))
                 return inet_recv_error(sk, msg, len, addr_len);
 
-        mptcp_lock_sock(sk, __mptcp_splice_receive_queue(sk));
+        lock_sock(sk);
         if (unlikely(sk->sk_state == TCP_LISTEN)) {
                 copied = -ENOTCONN;
                 goto out_err;
@@ -2504,7 +2459,7 @@ static int __mptcp_init_sock(struct sock *sk)
         __skb_queue_head_init(&msk->receive_queue);
         msk->out_of_order_queue = RB_ROOT;
         msk->first_pending = NULL;
-        msk->wmem_reserved = 0;
+        msk->rmem_fwd_alloc = 0;
         WRITE_ONCE(msk->rmem_released, 0);
         msk->timer_ival = TCP_RTO_MIN;
 
@@ -2719,7 +2674,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
 
         sk->sk_prot->destroy(sk);
 
-        WARN_ON_ONCE(msk->wmem_reserved);
+        WARN_ON_ONCE(msk->rmem_fwd_alloc);
         WARN_ON_ONCE(msk->rmem_released);
         sk_stream_kill_queues(sk);
         xfrm_sk_free_policy(sk);
@@ -2954,6 +2909,9 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
 
         /* move to sk_receive_queue, sk_stream_kill_queues will purge it */
         skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue);
+        __skb_queue_purge(&sk->sk_receive_queue);
+        sk->sk_forward_alloc += msk->rmem_fwd_alloc;
+        msk->rmem_fwd_alloc = 0;
 
         skb_rbtree_purge(&msk->out_of_order_queue);
         mptcp_token_destroy(msk);
@@ -3037,10 +2995,6 @@ static void mptcp_release_cb(struct sock *sk)
         if (test_and_clear_bit(MPTCP_ERROR_REPORT, &mptcp_sk(sk)->flags))
                 __mptcp_error_report(sk);
 
-        /* push_pending may touch wmem_reserved, ensure we do the cleanup
-         * later
-         */
-        __mptcp_update_wmem(sk);
         __mptcp_update_rmem(sk);
 }
 
@@ -3190,6 +3144,11 @@ static void mptcp_shutdown(struct sock *sk, int how)
                 __mptcp_wr_shutdown(sk);
 }
 
+static int mptcp_forward_alloc_get(const struct sock *sk)
+{
+        return sk->sk_forward_alloc + mptcp_sk(sk)->rmem_fwd_alloc;
+}
+
 static struct proto mptcp_prot = {
         .name           = "MPTCP",
         .owner          = THIS_MODULE,
@@ -3207,6 +3166,7 @@ static struct proto mptcp_prot = {
         .hash           = mptcp_hash,
         .unhash         = mptcp_unhash,
         .get_port       = mptcp_get_port,
+        .forward_alloc_get      = mptcp_forward_alloc_get,
         .sockets_allocated      = &mptcp_sockets_allocated,
         .memory_allocated       = &tcp_memory_allocated,
         .memory_pressure        = &tcp_memory_pressure,
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 7379ab580a7e..cfb374634a83 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -227,7 +227,7 @@ struct mptcp_sock {
         u64             ack_seq;
         u64             rcv_wnd_sent;
         u64             rcv_data_fin_seq;
-        int             wmem_reserved;
+        int             rmem_fwd_alloc;
         struct sock     *last_snd;
         int             snd_burst;
         int             old_wspace;
@@ -272,19 +272,6 @@ struct mptcp_sock {
         char            ca_name[TCP_CA_NAME_MAX];
 };
 
-#define mptcp_lock_sock(___sk, cb) do {                                 \
-        struct sock *__sk = (___sk); /* silence macro reuse warning */  \
-        might_sleep();                                                  \
-        spin_lock_bh(&__sk->sk_lock.slock);                             \
-        if (__sk->sk_lock.owned)                                        \
-                __lock_sock(__sk);                                      \
-        cb;                                                             \
-        __sk->sk_lock.owned = 1;                                        \
-        spin_unlock(&__sk->sk_lock.slock);                              \
-        mutex_acquire(&__sk->sk_lock.dep_map, 0, 0, _RET_IP_);          \
-        local_bh_enable();                                              \
-} while (0)
-
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
 #define mptcp_data_unlock(sk) spin_unlock_bh(&(sk)->sk_lock.slock)
 
-- 
2.26.3
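Illustration (not part of the patch): after this series the msk keeps two forward-allocation pools, sk_forward_alloc for the tx path (handled by the plain core helpers) and rmem_fwd_alloc for the rx path (handled by the new mptcp_rmem_* helpers under the msk data lock), and forward_alloc_get() reports their sum. The rough, user-space C sketch below walks through the rx-side schedule/charge/uncharge life cycle; the names are hypothetical, the pool is page granular, and locking plus the memory-pressure failure path are omitted, so it only illustrates the bookkeeping, not the actual kernel implementation.

#include <stdio.h>

#define PAGE_SZ           4096
#define RECLAIM_THRESHOLD (1 << 21)
#define RECLAIM_CHUNK     (1 << 20)

static long pool_pages;         /* stand-in for the protocol-wide accounted pages */
static long rmem_fwd_alloc;     /* stand-in for msk->rmem_fwd_alloc, in bytes */

/* take whole pages from the shared pool when the local reserve is too small */
static void rmem_schedule(int size)
{
        if (size < rmem_fwd_alloc)
                return;
        long pages = (size + PAGE_SZ - 1) / PAGE_SZ;
        pool_pages += pages;
        rmem_fwd_alloc += pages * PAGE_SZ;
}

/* an skb entering the receive queue consumes part of the local reserve */
static void rmem_charge(int size)
{
        rmem_fwd_alloc -= size;
}

/* freeing an skb refills the reserve; give one chunk back past the threshold */
static void rmem_uncharge(int size)
{
        rmem_fwd_alloc += size;
        if (rmem_fwd_alloc >= RECLAIM_THRESHOLD) {
                rmem_fwd_alloc -= RECLAIM_CHUNK;
                pool_pages -= RECLAIM_CHUNK / PAGE_SZ;
        }
}

int main(void)
{
        for (int i = 0; i < 2048; i++) {        /* queue 2048 x 1500-byte skbs */
                rmem_schedule(1500);
                rmem_charge(1500);
        }
        for (int i = 0; i < 2048; i++)          /* then free them all */
                rmem_uncharge(1500);
        printf("reserve=%ld bytes, pool=%ld pages\n", rmem_fwd_alloc, pool_pages);
        return 0;
}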