From: Prasanna S Panchamukhi
To: netfilter-devel@vger.kernel.org
Cc: panchamukhi@arista.com, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Simon Horman, Jonathan Corbet,
 Shuah Khan, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
 netdev@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, coreteam@netfilter.org
Subject: [PATCH net-next v2] netfilter: conntrack: expose gc_scan_interval_max via sysctl
Date: Thu, 12 Mar 2026 15:31:57 -0700
Message-ID: <20260312223157.25083-1-panchamukhi@arista.com>
X-Mailer: git-send-email 2.50.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The conntrack garbage collection worker uses an adaptive algorithm that
adjusts the scan interval based on the average timeout of tracked
entries. The upper bound of this interval is hardcoded as
GC_SCAN_INTERVAL_MAX (60 seconds).

Expose the upper bound as a new sysctl,
net.netfilter.nf_conntrack_gc_scan_interval_max, so it can be tuned at
runtime without rebuilding the kernel. The default remains 60 seconds to
preserve existing behavior. The sysctl is global and read-only in
non-init network namespaces, consistent with nf_conntrack_max and
nf_conntrack_buckets.

In environments where long-lived offloaded flows dominate the table, the
adaptive average drifts toward the maximum, delaying cleanup of
short-lived expired entries such as those in TCP CLOSE state (10s
timeout).
Adding a sysctl for the maximum GC scan interval allows tuning it to
the environment.

Signed-off-by: Prasanna S Panchamukhi
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Simon Horman
cc: Jonathan Corbet
cc: Shuah Khan
cc: Pablo Neira Ayuso
cc: Florian Westphal
cc: Phil Sutter
cc: netdev@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-kernel@vger.kernel.org
to: netfilter-devel@vger.kernel.org
cc: coreteam@netfilter.org
---
 Documentation/networking/nf_conntrack-sysctl.rst | 13 +++++++++++++
 include/net/netfilter/nf_conntrack.h             |  1 +
 net/netfilter/nf_conntrack_core.c                | 10 +++++++---
 net/netfilter/nf_conntrack_standalone.c          | 10 ++++++++++
 4 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index 35f889259fcd..0e79f6ad1062 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -64,6 +64,19 @@ nf_conntrack_frag6_timeout - INTEGER (seconds)
 
 	Time to keep an IPv6 fragment in memory.
 
+nf_conntrack_gc_scan_interval_max - INTEGER (seconds)
+	default 60
+
+	Maximum interval between garbage collection scans of the connection
+	tracking table. The GC worker uses an adaptive algorithm that adjusts
+	the scan interval based on average entry timeouts; this parameter caps
+	the upper bound. Lower values cause expired entries (e.g. connections
+	in CLOSE state) to be cleaned up faster, at the cost of slightly more
+	CPU usage. Consider tuning this on systems with high connection churn
+	(e.g. NAT gateways, load balancers) where expired entries accumulate
+	and cause the conntrack table to fill up. Minimum value is 1.
+	This sysctl is only writeable in the initial net namespace.
+
 nf_conntrack_generic_timeout - INTEGER (seconds)
 	default 600
 
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index bc42dd0e10e6..0449577f322e 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -331,6 +331,7 @@ extern struct hlist_nulls_head *nf_conntrack_hash;
 extern unsigned int nf_conntrack_htable_size;
 extern seqcount_spinlock_t nf_conntrack_generation;
 extern unsigned int nf_conntrack_max;
+extern unsigned int nf_conntrack_gc_scan_interval_max;
 
 /* must be called with rcu read lock held */
 static inline void
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 27ce5fda8993..8647e6824cec 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -91,7 +91,7 @@ static DEFINE_MUTEX(nf_conntrack_mutex);
  * allowing non-idle machines to wakeup more often when needed.
  */
 #define GC_SCAN_INITIAL_COUNT	100
-#define GC_SCAN_INTERVAL_INIT	GC_SCAN_INTERVAL_MAX
+#define GC_SCAN_INTERVAL_INIT	READ_ONCE(nf_conntrack_gc_scan_interval_max)
 
 #define GC_SCAN_MAX_DURATION	msecs_to_jiffies(10)
 #define GC_SCAN_EXPIRED_MAX	(64000u / HZ)
@@ -204,6 +204,9 @@ EXPORT_SYMBOL_GPL(nf_conntrack_htable_size);
 
 unsigned int nf_conntrack_max __read_mostly;
 EXPORT_SYMBOL_GPL(nf_conntrack_max);
+
+unsigned int nf_conntrack_gc_scan_interval_max __read_mostly = GC_SCAN_INTERVAL_MAX;
+
 seqcount_spinlock_t nf_conntrack_generation __read_mostly;
 static siphash_aligned_key_t nf_conntrack_hash_rnd;
 
@@ -1515,6 +1518,7 @@ static void gc_worker(struct work_struct *work)
 	unsigned int i, hashsz, nf_conntrack_max95 = 0;
 	u32 end_time, start_time = nfct_time_stamp;
 	struct conntrack_gc_work *gc_work;
+	unsigned long gc_scan_max = READ_ONCE(nf_conntrack_gc_scan_interval_max);
 	unsigned int expired_count = 0;
 	unsigned long next_run;
 	s32 delta_time;
@@ -1568,7 +1572,7 @@ static void gc_worker(struct work_struct *work)
 				delta_time = nfct_time_stamp - gc_work->start_time;
 
 				/* re-sched immediately if total cycle time is exceeded */
-				next_run = delta_time < (s32)GC_SCAN_INTERVAL_MAX;
+				next_run = delta_time < (s32)gc_scan_max;
 				goto early_exit;
 			}
 
@@ -1630,7 +1634,7 @@ static void gc_worker(struct work_struct *work)
 
 	gc_work->next_bucket = 0;
 
-	next_run = clamp(next_run, GC_SCAN_INTERVAL_MIN, GC_SCAN_INTERVAL_MAX);
+	next_run = clamp(next_run, GC_SCAN_INTERVAL_MIN, gc_scan_max);
 
 	delta_time = max_t(s32, nfct_time_stamp - gc_work->start_time, 1);
 	if (next_run > (unsigned long)delta_time)
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 207b240b14e5..f8cab779763f 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -637,6 +637,7 @@ enum nf_ct_sysctl_index {
 	NF_SYSCTL_CT_PROTO_TIMEOUT_GRE,
 	NF_SYSCTL_CT_PROTO_TIMEOUT_GRE_STREAM,
 #endif
+	NF_SYSCTL_CT_GC_SCAN_INTERVAL_MAX,
 
 	NF_SYSCTL_CT_LAST_SYSCTL,
 };
@@ -920,6 +921,14 @@ static struct ctl_table nf_ct_sysctl_table[] = {
 		.proc_handler	= proc_dointvec_jiffies,
 	},
 #endif
+	[NF_SYSCTL_CT_GC_SCAN_INTERVAL_MAX] = {
+		.procname	= "nf_conntrack_gc_scan_interval_max",
+		.data		= &nf_conntrack_gc_scan_interval_max,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_jiffies,
+		.extra1		= SYSCTL_ONE,
+	},
 };
 
 static struct ctl_table nf_ct_netfilter_table[] = {
@@ -1043,6 +1052,7 @@ static int nf_conntrack_standalone_init_sysctl(struct net *net)
 		table[NF_SYSCTL_CT_MAX].mode = 0444;
 		table[NF_SYSCTL_CT_EXPECT_MAX].mode = 0444;
 		table[NF_SYSCTL_CT_BUCKETS].mode = 0444;
+		table[NF_SYSCTL_CT_GC_SCAN_INTERVAL_MAX].mode = 0444;
 	}
 
 	cnet->sysctl_header = register_net_sysctl_sz(net, "net/netfilter",
-- 
2.50.1 (Apple Git-155)