From: Anand Kumar Shaw
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    martin.lau@linux.dev, shuah@kernel.org,
    linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
    Anand Kumar Shaw
Subject: [PATCH] bpf: expose original requested max_entries in bpf_map_info
Date: Sat, 14 Feb 2026 17:24:31 +0530
Message-Id: <20260214115431.55574-1-anandkrshawheritage@gmail.com>

When creating LRU hash maps with BPF_F_NO_COMMON_LRU, the kernel silently
rounds max_entries up to a multiple of num_possible_cpus() in
htab_map_alloc() to ensure each per-CPU LRU list has at least one element.
However, the original value requested by the caller is lost --
map->max_entries is overwritten with the rounded value.

This creates a problem for userspace map managers (e.g., Cilium) that
reconcile map parameters against their configured values. When the
kernel-reported max_entries differs from the originally requested value,
the reconciliation logic detects a "mismatch" and may enter an infinite
delete-recreate loop, as seen in production incidents where
non-power-of-2 CPU counts caused small but persistent rounding
differences.

Add a new 'requested_max_entries' field to struct bpf_map
(kernel-internal) and struct bpf_map_info (UAPI) that preserves the
caller's original max_entries value.
The field is set in bpf_map_init_from_attr() before any
map-type-specific adjustments, and exposed to userspace via
BPF_OBJ_GET_INFO_BY_FD.

This is a purely additive, backward-compatible change:

- Old callers that don't know about the new field see it zeroed (via
  memset in bpf_map_get_info_by_fd) and can safely ignore it.
- New callers can compare requested_max_entries vs max_entries to
  detect kernel adjustments and avoid false reconciliation mismatches.

The new field is placed between max_entries (u32) and map_extra (u64) in
struct bpf_map, filling the existing alignment padding hole -- no
increase in struct size.

Also update the BPF_F_NO_COMMON_LRU documentation to describe the
rounding behavior and the new field.

Selftests are included covering LRU hash maps with and without
BPF_F_NO_COMMON_LRU, LRU per-CPU hash maps with BPF_F_NO_COMMON_LRU,
and regular hash maps.

Signed-off-by: Anand Kumar Shaw
---
 include/linux/bpf.h                                |   1 +
 include/uapi/linux/bpf.h                           |  12 ++
 kernel/bpf/syscall.c                               |   2 +
 tools/include/uapi/linux/bpf.h                     |  12 ++
 .../prog_tests/map_requested_max_entries.c         | 134 ++++++++++++++++++
 5 files changed, 161 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_requested_max_entries.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cd9b96434..8606b2c40 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -304,6 +304,7 @@ struct bpf_map {
 	u32 key_size;
 	u32 value_size;
 	u32 max_entries;
+	u32 requested_max_entries; /* original max_entries before kernel adjustment */
 	u64 map_extra; /* any per-map-type extra fields */
 	u32 map_flags;
 	u32 id;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b76..39cd781c2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1405,6 +1405,12 @@ enum {
  * which can scale and perform better.
  * Note, the LRU nodes (including free nodes) cannot be moved
  * across different LRU lists.
+ *
+ * When this flag is set, the kernel rounds max_entries up to a multiple
+ * of num_possible_cpus() so that each per-CPU LRU list has at least one
+ * element. The actual (possibly adjusted) value is reported via
+ * bpf_map_info.max_entries, while the original requested value is
+ * preserved in bpf_map_info.requested_max_entries.
  */
 	BPF_F_NO_COMMON_LRU	= (1U << 1),
 /* Specify numa node during map creation */
@@ -6717,6 +6723,12 @@ struct bpf_map_info {
 	__u64 map_extra;
 	__aligned_u64 hash;
 	__u32 hash_size;
+	/* Original max_entries as requested by the caller. May differ from
+	 * max_entries if the kernel adjusted it (e.g., rounded up to a
+	 * multiple of num_possible_cpus() for per-CPU LRU hash maps when
+	 * BPF_F_NO_COMMON_LRU is set).
+	 */
+	__u32 requested_max_entries;
 } __attribute__((aligned(8)));
 
 struct bpf_btf_info {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index dd89bf809..66a518f3a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -439,6 +439,7 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
 	map->key_size = attr->key_size;
 	map->value_size = attr->value_size;
 	map->max_entries = attr->max_entries;
+	map->requested_max_entries = attr->max_entries;
 	map->map_flags = bpf_map_flags_retain_permanent(attr->map_flags);
 	map->numa_node = bpf_map_attr_numa_node(attr);
 	map->map_extra = attr->map_extra;
@@ -5301,6 +5302,7 @@ static int bpf_map_get_info_by_fd(struct file *file,
 	info.key_size = map->key_size;
 	info.value_size = map->value_size;
 	info.max_entries = map->max_entries;
+	info.requested_max_entries = map->requested_max_entries;
 	info.map_flags = map->map_flags;
 	info.map_extra = map->map_extra;
 	memcpy(info.name, map->name, sizeof(map->name));
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887..bea369e10 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1405,6 +1405,12 @@ enum {
  * which can scale and perform better.
  * Note, the LRU nodes (including free nodes) cannot be moved
  * across different LRU lists.
+ *
+ * When this flag is set, the kernel rounds max_entries up to a multiple
+ * of num_possible_cpus() so that each per-CPU LRU list has at least one
+ * element. The actual (possibly adjusted) value is reported via
+ * bpf_map_info.max_entries, while the original requested value is
+ * preserved in bpf_map_info.requested_max_entries.
  */
 	BPF_F_NO_COMMON_LRU	= (1U << 1),
 /* Specify numa node during map creation */
@@ -6717,6 +6723,12 @@ struct bpf_map_info {
 	__u64 map_extra;
 	__aligned_u64 hash;
 	__u32 hash_size;
+	/* Original max_entries as requested by the caller. May differ from
+	 * max_entries if the kernel adjusted it (e.g., rounded up to a
+	 * multiple of num_possible_cpus() for per-CPU LRU hash maps when
+	 * BPF_F_NO_COMMON_LRU is set).
+	 */
+	__u32 requested_max_entries;
 } __attribute__((aligned(8)));
 
 struct bpf_btf_info {
diff --git a/tools/testing/selftests/bpf/prog_tests/map_requested_max_entries.c b/tools/testing/selftests/bpf/prog_tests/map_requested_max_entries.c
new file mode 100644
index 000000000..e54e88326
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/map_requested_max_entries.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Test that bpf_map_info.requested_max_entries correctly reports the
+ * original max_entries value requested by the caller, even when the
+ * kernel adjusts max_entries internally (e.g., rounding up for per-CPU
+ * LRU hash maps with BPF_F_NO_COMMON_LRU).
+ */
+#include
+#include
+
+static void test_lru_hash_no_common_lru(void)
+{
+	LIBBPF_OPTS(bpf_map_create_opts, opts);
+	struct bpf_map_info info = {};
+	__u32 info_len = sizeof(info);
+	/* Use a prime number to guarantee rounding on any SMP system */
+	__u32 requested = 997;
+	int map_fd, err;
+
+	opts.map_flags = BPF_F_NO_COMMON_LRU;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_LRU_HASH, "test_lru_pcpu",
+				sizeof(__u32), sizeof(__u32),
+				requested, &opts);
+	if (!ASSERT_GE(map_fd, 0, "bpf_map_create"))
+		return;
+
+	err = bpf_map_get_info_by_fd(map_fd, &info, &info_len);
+	if (!ASSERT_OK(err, "bpf_map_get_info_by_fd"))
+		goto out;
+
+	ASSERT_EQ(info.requested_max_entries, requested,
+		  "requested_max_entries");
+	ASSERT_GE(info.max_entries, requested,
+		  "max_entries >= requested");
+
+out:
+	close(map_fd);
+}
+
+static void test_lru_percpu_hash_no_common_lru(void)
+{
+	LIBBPF_OPTS(bpf_map_create_opts, opts);
+	struct bpf_map_info info = {};
+	__u32 info_len = sizeof(info);
+	__u32 requested = 997;
+	int map_fd, err;
+
+	opts.map_flags = BPF_F_NO_COMMON_LRU;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_LRU_PERCPU_HASH,
+				"test_lru_pcpu_v",
+				sizeof(__u32), sizeof(__u32),
+				requested, &opts);
+	if (!ASSERT_GE(map_fd, 0, "bpf_map_create"))
+		return;
+
+	err = bpf_map_get_info_by_fd(map_fd, &info, &info_len);
+	if (!ASSERT_OK(err, "bpf_map_get_info_by_fd"))
+		goto out;
+
+	ASSERT_EQ(info.requested_max_entries, requested,
+		  "requested_max_entries");
+	ASSERT_GE(info.max_entries, requested,
+		  "max_entries >= requested");
+
+out:
+	close(map_fd);
+}
+
+static void test_lru_hash_common_lru(void)
+{
+	struct bpf_map_info info = {};
+	__u32 info_len = sizeof(info);
+	__u32 requested = 997;
+	int map_fd, err;
+
+	/* Without BPF_F_NO_COMMON_LRU, max_entries should not be rounded */
+	map_fd = bpf_map_create(BPF_MAP_TYPE_LRU_HASH, "test_lru_common",
+				sizeof(__u32), sizeof(__u32),
+				requested, NULL);
+	if (!ASSERT_GE(map_fd, 0, "bpf_map_create"))
+		return;
+
+	err = bpf_map_get_info_by_fd(map_fd, &info, &info_len);
+	if (!ASSERT_OK(err, "bpf_map_get_info_by_fd"))
+		goto out;
+
+	ASSERT_EQ(info.requested_max_entries, requested,
+		  "requested_max_entries");
+	ASSERT_EQ(info.max_entries, requested,
+		  "max_entries == requested (no rounding)");
+
+out:
+	close(map_fd);
+}
+
+static void test_hash_map(void)
+{
+	struct bpf_map_info info = {};
+	__u32 info_len = sizeof(info);
+	__u32 requested = 256;
+	int map_fd, err;
+
+	/* Regular hash map: max_entries should equal requested */
+	map_fd = bpf_map_create(BPF_MAP_TYPE_HASH, "test_hash",
+				sizeof(__u32), sizeof(__u32),
+				requested, NULL);
+	if (!ASSERT_GE(map_fd, 0, "bpf_map_create"))
+		return;
+
+	err = bpf_map_get_info_by_fd(map_fd, &info, &info_len);
+	if (!ASSERT_OK(err, "bpf_map_get_info_by_fd"))
+		goto out;
+
+	ASSERT_EQ(info.requested_max_entries, requested,
+		  "requested_max_entries");
+	ASSERT_EQ(info.max_entries, requested,
+		  "max_entries == requested");
+
+out:
+	close(map_fd);
+}
+
+void test_map_requested_max_entries(void)
+{
+	if (test__start_subtest("lru_hash_no_common_lru"))
+		test_lru_hash_no_common_lru();
+	if (test__start_subtest("lru_percpu_hash_no_common_lru"))
+		test_lru_percpu_hash_no_common_lru();
+	if (test__start_subtest("lru_hash_common_lru"))
+		test_lru_hash_common_lru();
+	if (test__start_subtest("hash_map"))
+		test_hash_map();
+}
-- 
2.39.5 (Apple Git-154)
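
For readers who want to see the rounding and the reconciliation check from the commit message in isolation, here is a minimal userspace sketch. It is not part of the patch. The struct map_info_view and the helpers round_to_cpus() and map_matches_config() are hypothetical names made up for illustration; the struct only stands in for the two bpf_map_info fields discussed above, and the rounding helper approximates the behavior the commit message describes rather than copying kernel code. Treating a zero requested_max_entries as "field not reported" is likewise an assumption about how a caller might handle older kernels.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two bpf_map_info fields discussed above. */
struct map_info_view {
	uint32_t max_entries;           /* value the kernel actually uses */
	uint32_t requested_max_entries; /* assumed 0 when the kernel lacks the field */
};

/*
 * Round n up to a multiple of ncpus (ncpus must be non-zero).
 * If rounding up would not fit in 32 bits, round down instead.
 */
uint32_t round_to_cpus(uint32_t n, uint32_t ncpus)
{
	uint64_t up = ((uint64_t)n + ncpus - 1) / ncpus * ncpus;

	if (up > UINT32_MAX)
		return n / ncpus * ncpus;
	return (uint32_t)up;
}

/*
 * Reconciliation check: prefer the caller's original request when the
 * kernel reports it, otherwise fall back to comparing max_entries.
 */
bool map_matches_config(uint32_t configured, const struct map_info_view *info)
{
	if (info->requested_max_entries != 0)
		return info->requested_max_entries == configured;
	return info->max_entries == configured;
}
```

With 6 possible CPUs, round_to_cpus(997, 6) gives 1002. A manager configured for 997 that compares only max_entries sees a mismatch on every pass, while map_matches_config() accepts the map once requested_max_entries is reported as 997.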