From: Maoyi Xie
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com
Cc: dsahern@kernel.org, kuznet@ms2.inr.ac.ru, willemb@google.com, willemdebruijn.kernel@gmail.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH net v6] ipv6: flowlabel: enforce per-netns limit for unprivileged callers
Date: Sat, 2 May 2026 23:09:18 +0800
Message-Id: <20260502150918.4171847-1-maoyi.xie@ntu.edu.sg>

fl_size, fl_ht and ip6_fl_lock in net/ipv6/ip6_flowlabel.c are file
scope and shared across netns. mem_check() reads fl_size to decide
whether to deny non-CAP_NET_ADMIN callers; capable() runs against
init_user_ns, so an unprivileged user in any non-init userns can push
fl_size past FL_MAX_SIZE - FL_MAX_SIZE/4 and starve every other
unprivileged userns on the host.

Add struct netns_ipv6::flowlabel_count, bumped and decremented next to
fl_size in fl_intern, ip6_fl_gc and ip6_fl_purge. The new field is
placed in the existing 4-byte hole after ipmr_seq, so struct netns_ipv6
stays the same size on 64-bit builds.

Bump FL_MAX_SIZE from 4096 to 8192. It has been 4096 since the file was
added; machines and connection counts have grown.

mem_check() folds an extra per-netns ceiling into the existing
non-CAP_NET_ADMIN conditional.
The ceiling is half of the total budget that unprivileged callers have
ever been able to use, i.e. (FL_MAX_SIZE - FL_MAX_SIZE/4) / 2 = 3072
entries. With FL_MAX_SIZE doubled, this preserves the original per-user
reach (~3K, what an unprivileged caller could already obtain before
this change) while forcing an attacker to spread allocations across at
least two netns to exhaust the global non-CAP_NET_ADMIN budget.

CAP_NET_ADMIN against init_user_ns still bypasses both caps.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Willem de Bruijn
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie
Reviewed-by: Willem de Bruijn
---
v6 (this submission, addressing v5 review by Willem):
  - Rebased onto current net (resolves the conflict on
    include/net/netns/ipv6.h that v5 hit. ipmr_seq is now atomic_t but
    remains 4 bytes, so flowlabel_count still fills the 4-byte hole
    after it).
  - Restored fl_free() to its original position in both ip6_fl_gc() and
    ip6_fl_purge(). v5 had moved fl_free() after the new atomic_dec()
    to avoid the use-after-free on fl->fl_net. v6 instead caches
    fl->fl_net into a local before fl_free() in ip6_fl_gc(), and uses
    the net argument already in scope in ip6_fl_purge().

v5: replaced the per-netns ceiling FL_MAX_SIZE/8 with the computed
unpriv_user_limit = (FL_MAX_SIZE - FL_MAX_SIZE/4)/2, which evaluates to
3072. v4's FL_MAX_SIZE/8 = 1024 would have reduced the per-user budget
below the ~3K an unprivileged caller could already obtain before any of
this work, defeating the reason FL_MAX_SIZE was doubled in the first
place.

v4: addressed Willem's v3 review on netdev. Dropped the
flowlabel_has_excl cacheline argument in favour of "fills the existing
4-byte hole after ipmr_seq", and reordered
atomic_dec(&...flowlabel_count) to sit immediately after
atomic_dec(&fl_size) in ip6_fl_gc and ip6_fl_purge.

v3: addressed Willem's review on the private security@ thread.
Merged FL_MAX_SIZE doubling, dropped test data, moved flowlabel_count
near ipmr_seq, inlined fl->fl_net in ip6_fl_gc.

v2: per-netns counter + cap, sent to security@ as a 2-patch series.

v1: fix-shape sketch in original disclosure.

 include/net/netns/ipv6.h |  1 +
 net/ipv6/ip6_flowlabel.c | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 499e42881..ef698f5fa 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -119,6 +119,7 @@ struct netns_ipv6 {
 	struct fib_notifier_ops *notifier_ops;
 	struct fib_notifier_ops *ip6mr_notifier_ops;
 	atomic_t ipmr_seq;
+	atomic_t flowlabel_count;
 	struct {
 		struct hlist_head head;
 		spinlock_t lock;
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index c92f98c6f..28e43718d 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -36,7 +36,7 @@
 /* FL hash table */
 
 #define FL_MAX_PER_SOCK 32
-#define FL_MAX_SIZE 4096
+#define FL_MAX_SIZE 8192
 #define FL_HASH_MASK 255
 #define FL_HASH(l) (ntohl(l)&FL_HASH_MASK)
 
@@ -161,9 +161,12 @@ static void ip6_fl_gc(struct timer_list *unused)
 					fl->expires = ttd;
 				ttd = fl->expires;
 				if (time_after_eq(now, ttd)) {
+					struct net *net = fl->fl_net;
+
 					*flp = fl->next;
 					fl_free(fl);
 					atomic_dec(&fl_size);
+					atomic_dec(&net->ipv6.flowlabel_count);
 					continue;
 				}
 				if (!sched || time_before(ttd, sched))
@@ -197,6 +200,7 @@ static void __net_exit ip6_fl_purge(struct net *net)
 				*flp = fl->next;
 				fl_free(fl);
 				atomic_dec(&fl_size);
+				atomic_dec(&net->ipv6.flowlabel_count);
 				continue;
 			}
 			flp = &fl->next;
@@ -245,6 +249,7 @@ static struct ip6_flowlabel *fl_intern(struct net *net,
 	fl->next = fl_ht[FL_HASH(fl->label)];
 	rcu_assign_pointer(fl_ht[FL_HASH(fl->label)], fl);
 	atomic_inc(&fl_size);
+	atomic_inc(&net->ipv6.flowlabel_count);
 	spin_unlock_bh(&ip6_fl_lock);
 	rcu_read_unlock();
 	return NULL;
@@ -464,6 +469,9 @@ fl_create(struct net *net, struct sock *sk, struct in6_flowlabel_req *freq,
 
 static int mem_check(struct sock *sk)
 {
+	const int unpriv_total_limit = FL_MAX_SIZE - (FL_MAX_SIZE / 4);
+	const int unpriv_user_limit = unpriv_total_limit / 2;
+	struct net *net = sock_net(sk);
 	int room = FL_MAX_SIZE - atomic_read(&fl_size);
 	struct ipv6_fl_socklist *sfl;
 	int count = 0;
@@ -478,7 +486,9 @@ static int mem_check(struct sock *sk)
 
 	if (room <= 0 ||
 	    ((count >= FL_MAX_PER_SOCK ||
-	      (count > 0 && room < FL_MAX_SIZE/2) || room < FL_MAX_SIZE/4) &&
+	      (count > 0 && room < FL_MAX_SIZE/2) ||
+	      room < FL_MAX_SIZE/4 ||
+	      atomic_read(&net->ipv6.flowlabel_count) >= unpriv_user_limit) &&
 	     !capable(CAP_NET_ADMIN)))
 		return -ENOBUFS;
 
-- 
2.34.1
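
Editorial illustration (not part of the patch): the admission rule that
the patched mem_check() applies can be modelled in userspace to check the
arithmetic quoted in the commit message. The block below is a minimal
sketch under stated assumptions, not kernel code. fl_denied() and its
parameters are hypothetical names: global_count stands in for fl_size,
netns_count for net->ipv6.flowlabel_count, sock_count for the length of
the per-socket flow label list, and net_admin for the result of
capable(CAP_NET_ADMIN). The two macros carry the values from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define FL_MAX_PER_SOCK 32
#define FL_MAX_SIZE     8192	/* value after the patch */

/* Same arithmetic as unpriv_total_limit / unpriv_user_limit in the patch. */
enum {
	UNPRIV_TOTAL_LIMIT = FL_MAX_SIZE - (FL_MAX_SIZE / 4),
	UNPRIV_USER_LIMIT  = UNPRIV_TOTAL_LIMIT / 2,
};

/* Compile-time check of the figures quoted in the commit message. */
static_assert(UNPRIV_TOTAL_LIMIT == 6144, "global unprivileged budget");
static_assert(UNPRIV_USER_LIMIT == 3072, "per-netns unprivileged ceiling");

/*
 * Hypothetical userspace model of the patched mem_check() condition.
 * Returns true where the kernel code would return -ENOBUFS.
 */
bool fl_denied(int global_count, int netns_count, int sock_count,
	       bool net_admin)
{
	int room = FL_MAX_SIZE - global_count;

	/* A full table is refused for every caller, privileged or not. */
	if (room <= 0)
		return true;

	/* CAP_NET_ADMIN bypasses all of the remaining checks. */
	if (net_admin)
		return false;

	return sock_count >= FL_MAX_PER_SOCK ||
	       (sock_count > 0 && room < FL_MAX_SIZE / 2) ||
	       room < FL_MAX_SIZE / 4 ||
	       netns_count >= UNPRIV_USER_LIMIT;
}
```

Under this model, fl_denied(3072, 3072, 0, false) is true while
fl_denied(3072, 0, 0, false) is false: a netns that has reached its
ceiling is refused, but a caller in a second netns can still allocate,
which matches the "at least two netns" statement in the commit message.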