From: Donglin Peng <pengdonglin@xiaomi.com>
Implement lazy validation of BTF type ordering to enable efficient
binary search for sorted BTF while maintaining linear search fallback
for unsorted cases.
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Song Liu <song@kernel.org>
Cc: Xiaoqin Zhang <zhangxiaoqin@xiaomi.com>
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
---
kernel/bpf/btf.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 65 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 66cb739a0598..33c327d3cac3 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
return total;
}
+/* Anonymous types (with empty names) are considered greater than named types
+ * and are sorted after them. Two anonymous types are considered equal. Named
+ * types are compared lexicographically.
+ */
+static int btf_compare_type_names(const void *a, const void *b, void *priv)
+{
+ struct btf *btf = (struct btf *)priv;
+ const struct btf_type *ta = btf_type_by_id(btf, *(__u32 *)a);
+ const struct btf_type *tb = btf_type_by_id(btf, *(__u32 *)b);
+ const char *na, *nb;
+
+ if (!ta->name_off && tb->name_off)
+ return 1;
+ if (ta->name_off && !tb->name_off)
+ return -1;
+ if (!ta->name_off && !tb->name_off)
+ return 0;
+
+ na = btf_name_by_offset(btf, ta->name_off);
+ nb = btf_name_by_offset(btf, tb->name_off);
+ return strcmp(na, nb);
+}
+
+/* Verifies that BTF types are sorted in ascending order according to their
+ * names, with named types appearing before anonymous types. If the ordering
+ * is correct, counts the number of named types and updates the BTF object's
+ * nr_sorted_types field.
+ *
+ * Return: true if types are properly sorted, false otherwise
+ */
+static bool btf_check_sorted(struct btf *btf)
+{
+ const struct btf_type *t;
+ int i, n, k = 0, nr_sorted_types;
+
+ if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
+ goto out;
+ btf->nr_sorted_types = 0;
+
+ if (btf->nr_types < 2)
+ goto out;
+
+ nr_sorted_types = 0;
+ n = btf_nr_types(btf) - 1;
+ for (i = btf_start_id(btf); i < n; i++) {
+ k = i + 1;
+ if (btf_compare_type_names(&i, &k, btf) > 0)
+ goto out;
+
+ t = btf_type_by_id(btf, i);
+ if (t->name_off)
+ nr_sorted_types++;
+ }
+
+ t = btf_type_by_id(btf, k);
+ if (t->name_off)
+ nr_sorted_types++;
+ if (nr_sorted_types)
+ btf->nr_sorted_types = nr_sorted_types;
+
+out:
+ return btf->nr_sorted_types > 0;
+}
+
/* Performs binary search within specified type ID range to find the leftmost
* BTF type matching the given name. The search assumes types are sorted by
* name in lexicographical order within the specified range.
@@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
goto out;
}
- if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
+ if (btf_check_sorted((struct btf *)btf)) {
/* binary search */
bool skip_first;
 		s32 start_id, end_id;
--
2.34.1
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 66cb739a0..33c327d3c 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> return total;
> }
>
> +/* Verifies that BTF types are sorted in ascending order according to their
> + * names, with named types appearing before anonymous types. If the ordering
> + * is correct, counts the number of named types and updates the BTF object's
> + * nr_sorted_types field.
> + *
> + * Return: true if types are properly sorted, false otherwise
> + */
> +static bool btf_check_sorted(struct btf *btf)
> +{
> + const struct btf_type *t;
> + int i, n, k = 0, nr_sorted_types;
> +
> + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> + goto out;
> + btf->nr_sorted_types = 0;
^
Can multiple threads race when writing to btf->nr_sorted_types here?
Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
btf_find_by_name_kind() receives a const pointer but casts away the const
to call btf_check_sorted(). The function bpf_find_btf_id() calls
btf_find_by_name_kind() without holding any locks (line 737), and later
explicitly unlocks before calling it again (lines 756-757).

This means multiple threads can concurrently enter btf_check_sorted() and
write to btf->nr_sorted_types. While the validation logic is idempotent
and all threads would compute the same value, the concurrent writes to the
same memory location without synchronization could trigger KCSAN warnings.

Should this use atomic operations, or should the validation be performed
under a lock during BTF initialization before the BTF becomes visible to
multiple threads?
> +
> + if (btf->nr_types < 2)
> + goto out;
> +
> + nr_sorted_types = 0;
> + n = btf_nr_types(btf) - 1;
> + for (i = btf_start_id(btf); i < n; i++) {
> + k = i + 1;
> + if (btf_compare_type_names(&i, &k, btf) > 0)
> + goto out;
> +
> + t = btf_type_by_id(btf, i);
> + if (t->name_off)
> + nr_sorted_types++;
> + }
> +
> + t = btf_type_by_id(btf, k);
> + if (t->name_off)
> + nr_sorted_types++;
> + if (nr_sorted_types)
> + btf->nr_sorted_types = nr_sorted_types;
^
The same race condition applies to this write of nr_sorted_types as well.
> +
> +out:
> + return btf->nr_sorted_types > 0;
> +}
[ ... ]
> @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> goto out;
> }
>
> - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> + if (btf_check_sorted((struct btf *)btf)) {
^
The const cast here enables the concurrent writes discussed above. Is
there a reason to mark the btf parameter as const if we're modifying it?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19137195500
On Thu, Nov 6, 2025 at 9:47 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 66cb739a0..33c327d3c 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> > return total;
> > }
> >
> > +/* Verifies that BTF types are sorted in ascending order according to their
> > + * names, with named types appearing before anonymous types. If the ordering
> > + * is correct, counts the number of named types and updates the BTF object's
> > + * nr_sorted_types field.
> > + *
> > + * Return: true if types are properly sorted, false otherwise
> > + */
> > +static bool btf_check_sorted(struct btf *btf)
> > +{
> > + const struct btf_type *t;
> > + int i, n, k = 0, nr_sorted_types;
> > +
> > + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> > + goto out;
> > + btf->nr_sorted_types = 0;
> ^
>
> Can multiple threads race when writing to btf->nr_sorted_types here?
> Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
> btf_find_by_name_kind() receives a const pointer but casts away the const
> to call btf_check_sorted(). The function bpf_find_btf_id() calls
> btf_find_by_name_kind() without holding any locks (line 737), and later
> explicitly unlocks before calling it again (lines 756-757).
>
> This means multiple threads can concurrently enter btf_check_sorted() and
> write to btf->nr_sorted_types. While the validation logic is idempotent
> and all threads would compute the same value, the concurrent writes to the
> same memory location without synchronization could trigger KCSAN warnings.
>
> Should this use atomic operations, or should the validation be performed
> under a lock during BTF initialization before the BTF becomes visible to
> multiple threads?
Hi, is it necessary to address this issue?
For example, by using atomic_try_cmpxchg or WRITE_ONCE/READ_ONCE?
atomic_try_cmpxchg can prevent racing writes but requires an atomic
variable, while WRITE_ONCE/READ_ONCE only avoids the KCSAN warnings.
Since the race condition is unlikely to cause critical issues, I suggest
using WRITE_ONCE/READ_ONCE.
>
> > +
> > + if (btf->nr_types < 2)
> > + goto out;
> > +
> > + nr_sorted_types = 0;
> > + n = btf_nr_types(btf) - 1;
> > + for (i = btf_start_id(btf); i < n; i++) {
> > + k = i + 1;
> > + if (btf_compare_type_names(&i, &k, btf) > 0)
> > + goto out;
> > +
> > + t = btf_type_by_id(btf, i);
> > + if (t->name_off)
> > + nr_sorted_types++;
> > + }
> > +
> > + t = btf_type_by_id(btf, k);
> > + if (t->name_off)
> > + nr_sorted_types++;
> > + if (nr_sorted_types)
> > + btf->nr_sorted_types = nr_sorted_types;
> ^
>
> The same race condition applies to this write of nr_sorted_types as well.
>
> > +
> > +out:
> > + return btf->nr_sorted_types > 0;
> > +}
>
> [ ... ]
>
> > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> > goto out;
> > }
> >
> > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > + if (btf_check_sorted((struct btf *)btf)) {
> ^
>
> The const cast here enables the concurrent writes discussed above. Is
> there a reason to mark the btf parameter as const if we're modifying it?
Hi team, is casting away const an acceptable approach for our codebase?
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19137195500
On Fri, 2025-11-07 at 15:08 +0800, Donglin Peng wrote:
> On Thu, Nov 6, 2025 at 9:47 PM <bot+bpf-ci@kernel.org> wrote:
> >
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index 66cb739a0..33c327d3c 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> > > return total;
> > > }
> > >
> > > +/* Verifies that BTF types are sorted in ascending order
> > > according to their
> > > + * names, with named types appearing before anonymous types. If
> > > the ordering
> > > + * is correct, counts the number of named types and updates the
> > > BTF object's
> > > + * nr_sorted_types field.
> > > + *
> > > + * Return: true if types are properly sorted, false otherwise
> > > + */
> > > +static bool btf_check_sorted(struct btf *btf)
> > > +{
> > > + const struct btf_type *t;
> > > + int i, n, k = 0, nr_sorted_types;
> > > +
> > > + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> > > + goto out;
> > > + btf->nr_sorted_types = 0;
> > ^
> >
> > Can multiple threads race when writing to btf->nr_sorted_types
> > here?
> > Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
> > btf_find_by_name_kind() receives a const pointer but casts away the
> > const
> > to call btf_check_sorted(). The function bpf_find_btf_id() calls
> > btf_find_by_name_kind() without holding any locks (line 737), and
> > later
> > explicitly unlocks before calling it again (lines 756-757).
> >
> > This means multiple threads can concurrently enter
> > btf_check_sorted() and
> > write to btf->nr_sorted_types. While the validation logic is
> > idempotent
> > and all threads would compute the same value, the concurrent writes
> > to the
> > same memory location without synchronization could trigger KCSAN
> > warnings.
> >
> > Should this use atomic operations, or should the validation be
> > performed
> > under a lock during BTF initialization before the BTF becomes
> > visible to
> > multiple threads?
>
> Hi, is it necessary to address this issue?
> For example, by using atomic_try_cmpxchg or WRITE/READ_ONCE? Using
> atomic_try_cmpxchg can prevent race conditions on writes but requires
> an atomic
> variable, while WRITE_ONCE/READ_ONCE can avoid KCSAN warnings. Since
> the race condition is unlikely to cause critical issues, I suggest
> using
> WRITE_ONCE/READ_ONCE.
Probably use WRITE_ONCE/READ_ONCE?
> > > +
> > > + if (btf->nr_types < 2)
> > > + goto out;
> > > +
> > > + nr_sorted_types = 0;
> > > + n = btf_nr_types(btf) - 1;
> > > + for (i = btf_start_id(btf); i < n; i++) {
> > > + k = i + 1;
> > > + if (btf_compare_type_names(&i, &k, btf) > 0)
> > > + goto out;
> > > +
> > > + t = btf_type_by_id(btf, i);
> > > + if (t->name_off)
> > > + nr_sorted_types++;
> > > + }
> > > +
> > > + t = btf_type_by_id(btf, k);
> > > + if (t->name_off)
> > > + nr_sorted_types++;
> > > + if (nr_sorted_types)
> > > + btf->nr_sorted_types = nr_sorted_types;
> > ^
> >
> > The same race condition applies to this write of nr_sorted_types as
> > well.
> >
> > > +
> > > +out:
> > > + return btf->nr_sorted_types > 0;
> > > +}
> >
> > [ ... ]
> >
> > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf
> > > *btf, const char *name, u8 kind)
> > > goto out;
> > > }
> > >
> > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > + if (btf_check_sorted((struct btf *)btf)) {
> > ^
> >
> > The const cast here enables the concurrent writes discussed above.
> > Is
> > there a reason to mark the btf parameter as const if we're
> > modifying it?
>
> Hi team, is casting away const an acceptable approach for our
> codebase?
Casting away const is undefined behaviour, e.g. see paragraph 6.7.3.6
N1570 ISO/IEC 9899:201x Programming languages — C.
Both of the problems above can be avoided if kernel will do sorted
check non-lazily. But Andrii and Alexei seem to like that property.
>
> >
> >
> > ---
> > AI reviewed your patch. Please fix the bug or email reply why it's
> > not a bug.
> > See:
> > https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> >
> > CI run summary:
> > https://github.com/kernel-patches/bpf/actions/runs/19137195500
On Fri, Nov 7, 2025 at 10:19 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2025-11-07 at 15:08 +0800, Donglin Peng wrote:
> > On Thu, Nov 6, 2025 at 9:47 PM <bot+bpf-ci@kernel.org> wrote:
> > >
> > > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > > index 66cb739a0..33c327d3c 100644
> > > > --- a/kernel/bpf/btf.c
> > > > +++ b/kernel/bpf/btf.c
> > > > @@ -552,6 +552,70 @@ u32 btf_nr_types(const struct btf *btf)
> > > > return total;
> > > > }
> > > >
> > > > +/* Verifies that BTF types are sorted in ascending order
> > > > according to their
> > > > + * names, with named types appearing before anonymous types. If
> > > > the ordering
> > > > + * is correct, counts the number of named types and updates the
> > > > BTF object's
> > > > + * nr_sorted_types field.
> > > > + *
> > > > + * Return: true if types are properly sorted, false otherwise
> > > > + */
> > > > +static bool btf_check_sorted(struct btf *btf)
> > > > +{
> > > > + const struct btf_type *t;
> > > > + int i, n, k = 0, nr_sorted_types;
> > > > +
> > > > + if (likely(btf->nr_sorted_types != BTF_NEED_SORT_CHECK))
> > > > + goto out;
> > > > + btf->nr_sorted_types = 0;
> > > ^
> > >
> > > Can multiple threads race when writing to btf->nr_sorted_types
> > > here?
> > > Looking at btf_find_by_name_kind()->btf_check_sorted(), I see that
> > > btf_find_by_name_kind() receives a const pointer but casts away the
> > > const
> > > to call btf_check_sorted(). The function bpf_find_btf_id() calls
> > > btf_find_by_name_kind() without holding any locks (line 737), and
> > > later
> > > explicitly unlocks before calling it again (lines 756-757).
> > >
> > > This means multiple threads can concurrently enter
> > > btf_check_sorted() and
> > > write to btf->nr_sorted_types. While the validation logic is
> > > idempotent
> > > and all threads would compute the same value, the concurrent writes
> > > to the
> > > same memory location without synchronization could trigger KCSAN
> > > warnings.
> > >
> > > Should this use atomic operations, or should the validation be
> > > performed
> > > under a lock during BTF initialization before the BTF becomes
> > > visible to
> > > multiple threads?
> >
> > Hi, is it necessary to address this issue?
> > For example, by using atomic_try_cmpxchg or WRITE/READ_ONCE? Using
> > atomic_try_cmpxchg can prevent race conditions on writes but requires
> > an atomic
> > variable, while WRITE_ONCE/READ_ONCE can avoid KCSAN warnings. Since
> > the race condition is unlikely to cause critical issues, I suggest
> > using
> > WRITE_ONCE/READ_ONCE.
>
> Probably use WRITE_ONCE/READ_ONCE?
>
> > > > +
> > > > + if (btf->nr_types < 2)
> > > > + goto out;
> > > > +
> > > > + nr_sorted_types = 0;
> > > > + n = btf_nr_types(btf) - 1;
> > > > + for (i = btf_start_id(btf); i < n; i++) {
> > > > + k = i + 1;
> > > > + if (btf_compare_type_names(&i, &k, btf) > 0)
> > > > + goto out;
> > > > +
> > > > + t = btf_type_by_id(btf, i);
> > > > + if (t->name_off)
> > > > + nr_sorted_types++;
> > > > + }
> > > > +
> > > > + t = btf_type_by_id(btf, k);
> > > > + if (t->name_off)
> > > > + nr_sorted_types++;
> > > > + if (nr_sorted_types)
> > > > + btf->nr_sorted_types = nr_sorted_types;
> > > ^
> > >
> > > The same race condition applies to this write of nr_sorted_types as
> > > well.
> > >
> > > > +
> > > > +out:
> > > > + return btf->nr_sorted_types > 0;
> > > > +}
> > >
> > > [ ... ]
> > >
> > > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct btf
> > > > *btf, const char *name, u8 kind)
> > > > goto out;
> > > > }
> > > >
> > > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > > + if (btf_check_sorted((struct btf *)btf)) {
> > > ^
> > >
> > > The const cast here enables the concurrent writes discussed above.
> > > Is
> > > there a reason to mark the btf parameter as const if we're
> > > modifying it?
> >
> > Hi team, is casting away const an acceptable approach for our
> > codebase?
>
> Casting away const is undefined behaviour, e.g. see paragraph 6.7.3.6
> N1570 ISO/IEC 9899:201x Programming languages — C.
>
> Both of the problems above can be avoided if kernel will do sorted
> check non-lazily. But Andrii and Alexei seem to like that property.
Ihor is going to move BTF manipulations into resolve_btfid.
Sorting of BTF should be in resolve_btfid as well.
This way the build process will guarantee that BTF is sorted
to the kernel liking. So the kernel doesn't even need to check
that BTF is sorted.
On Fri, 2025-11-07 at 10:54 -0800, Alexei Starovoitov wrote:
[...]
> > > > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct
> > > > > btf
> > > > > *btf, const char *name, u8 kind)
> > > > > goto out;
> > > > > }
> > > > >
> > > > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > > > + if (btf_check_sorted((struct btf *)btf)) {
> > > > ^
> > > >
> > > > The const cast here enables the concurrent writes discussed
> > > > above.
> > > > Is
> > > > there a reason to mark the btf parameter as const if we're
> > > > modifying it?
> > >
> > > Hi team, is casting away const an acceptable approach for our
> > > codebase?
> >
> > Casting away const is undefined behaviour, e.g. see paragraph
> > 6.7.3.6
> > N1570 ISO/IEC 9899:201x Programming languages — C.
> >
> > Both of the problems above can be avoided if kernel will do sorted
> > check non-lazily. But Andrii and Alexei seem to like that property.
>
> Ihor is going to move BTF manipulations into resolve_btfid.
> Sorting of BTF should be in resolve_btfid as well.
> This way the build process will guarantee that BTF is sorted
> to the kernel liking. So the kernel doesn't even need to check
> that BTF is sorted.
This would be great.
Does this imply that module BTFs are sorted too?
On Fri, Nov 7, 2025 at 10:58 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2025-11-07 at 10:54 -0800, Alexei Starovoitov wrote:
>
> [...]
>
> > > > > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const struct
> > > > > > btf
> > > > > > *btf, const char *name, u8 kind)
> > > > > > goto out;
> > > > > > }
> > > > > >
> > > > > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > > > > + if (btf_check_sorted((struct btf *)btf)) {
> > > > > ^
> > > > >
> > > > > The const cast here enables the concurrent writes discussed
> > > > > above.
> > > > > Is
> > > > > there a reason to mark the btf parameter as const if we're
> > > > > modifying it?
> > > >
> > > > Hi team, is casting away const an acceptable approach for our
> > > > codebase?
> > >
> > > Casting away const is undefined behaviour, e.g. see paragraph
> > > 6.7.3.6
> > > N1570 ISO/IEC 9899:201x Programming languages — C.
> > >
> > > Both of the problems above can be avoided if kernel will do sorted
> > > check non-lazily. But Andrii and Alexei seem to like that property.
> >
> > Ihor is going to move BTF manipulations into resolve_btfid.
> > Sorting of BTF should be in resolve_btfid as well.
> > This way the build process will guarantee that BTF is sorted
> > to the kernel liking. So the kernel doesn't even need to check
> > that BTF is sorted.
>
> This would be great.
> Does this imply that module BTFs are sorted too?
Yes. The module build is supposed to use the kernel build tree where
kernel BTF expectations will match resolve_btfid actions.
Just like compiler and config flags should be the same.
On Fri, 2025-11-07 at 11:01 -0800, Alexei Starovoitov wrote:
> On Fri, Nov 7, 2025 at 10:58 AM Eduard Zingerman <eddyz87@gmail.com>
> wrote:
> >
> > On Fri, 2025-11-07 at 10:54 -0800, Alexei Starovoitov wrote:
> >
> > [...]
> >
> > > > > > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const
> > > > > > > struct
> > > > > > > btf
> > > > > > > *btf, const char *name, u8 kind)
> > > > > > > goto out;
> > > > > > > }
> > > > > > >
> > > > > > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > > > > > + if (btf_check_sorted((struct btf *)btf)) {
> > > > > > ^
> > > > > >
> > > > > > The const cast here enables the concurrent writes discussed
> > > > > > above.
> > > > > > Is
> > > > > > there a reason to mark the btf parameter as const if we're
> > > > > > modifying it?
> > > > >
> > > > > Hi team, is casting away const an acceptable approach for our
> > > > > codebase?
> > > >
> > > > Casting away const is undefined behaviour, e.g. see paragraph
> > > > 6.7.3.6
> > > > N1570 ISO/IEC 9899:201x Programming languages — C.
> > > >
> > > > Both of the problems above can be avoided if kernel will do
> > > > sorted
> > > > check non-lazily. But Andrii and Alexei seem to like that
> > > > property.
> > >
> > > Ihor is going to move BTF manipulations into resolve_btfid.
> > > Sorting of BTF should be in resolve_btfid as well.
> > > This way the build process will guarantee that BTF is sorted
> > > to the kernel liking. So the kernel doesn't even need to check
> > > that BTF is sorted.
> >
> > This would be great.
> > Does this imply that module BTFs are sorted too?
>
> Yes. The module build is supposed to use the kernel build tree where
> kernel BTF expectations will match resolve_btfid actions.
> Just like compiler and config flags should be the same.
There is also program BTF. E.g. btf_find_by_name_kind() is called for
program BTF in bpf_check_attach_target(). I think it would be fine to
check program BTF for being sorted at the BTF load time.
On Sat, Nov 8, 2025 at 3:51 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2025-11-07 at 11:01 -0800, Alexei Starovoitov wrote:
> > On Fri, Nov 7, 2025 at 10:58 AM Eduard Zingerman <eddyz87@gmail.com>
> > wrote:
> > >
> > > On Fri, 2025-11-07 at 10:54 -0800, Alexei Starovoitov wrote:
> > >
> > > [...]
> > >
> > > > > > > > @@ -610,7 +674,7 @@ s32 btf_find_by_name_kind(const
> > > > > > > > struct
> > > > > > > > btf
> > > > > > > > *btf, const char *name, u8 kind)
> > > > > > > > goto out;
> > > > > > > > }
> > > > > > > >
> > > > > > > > - if (btf->nr_sorted_types != BTF_NEED_SORT_CHECK) {
> > > > > > > > + if (btf_check_sorted((struct btf *)btf)) {
> > > > > > > ^
> > > > > > >
> > > > > > > The const cast here enables the concurrent writes discussed
> > > > > > > above.
> > > > > > > Is
> > > > > > > there a reason to mark the btf parameter as const if we're
> > > > > > > modifying it?
> > > > > >
> > > > > > Hi team, is casting away const an acceptable approach for our
> > > > > > codebase?
> > > > >
> > > > > Casting away const is undefined behaviour, e.g. see paragraph
> > > > > 6.7.3.6
> > > > > N1570 ISO/IEC 9899:201x Programming languages — C.
> > > > >
> > > > > Both of the problems above can be avoided if kernel will do
> > > > > sorted
> > > > > check non-lazily. But Andrii and Alexei seem to like that
> > > > > property.
> > > >
> > > > Ihor is going to move BTF manipulations into resolve_btfid.
> > > > Sorting of BTF should be in resolve_btfid as well.
> > > > This way the build process will guarantee that BTF is sorted
> > > > to the kernel liking. So the kernel doesn't even need to check
> > > > that BTF is sorted.
> > >
> > > This would be great.
> > > Does this imply that module BTFs are sorted too?
> >
> > Yes. The module build is supposed to use the kernel build tree where
> > kernel BTF expectations will match resolve_btfid actions.
> > Just like compiler and config flags should be the same.
>
> There is also program BTF. E.g. btf_find_by_name_kind() is called for
> program BTF in bpf_check_attach_target(). I think it would be fine to
> check program BTF for being sorted at the BTF load time.
[Resending in plain text format - previous HTML email was rejected]
Thanks for the feedback. Based on the previous discussions, I plan
to implement the following changes in the next version:
1. Modify the btf__permute interface to adopt the ID map approach, as
suggested by Andrii.
2. Remove the lazy sort check and move the verification to the BTF
parsing phase. This addresses two concerns: potential race conditions
with write operations and const-cast issues. The overhead is negligible
(approximately 1.4ms for vmlinux BTF).
3. Invoke the btf__permute interface to implement BTF sorting in resolve_btfids.
I welcome any further suggestions.
Thanks,
Donglin
On Mon, 2025-11-10 at 09:42 +0800, Donglin Peng wrote:

[...]

> [[Resending in plain text format - previous HTML email was rejected]
>
> Thanks for the feedback. Based on the previous discussions, I plan
> to implement the following changes in the next version:
>
> 1. Modify the btf__permute interface to adopt the ID map approach, as
> suggested by Andrii.
>
> 2. Remove the lazy sort check and move the verification to the BTF
> parsing phase. This addresses two concerns: potential race conditions
> with write operations and const-cast issues. The overhead is negligible
> (approximately 1.4ms for vmlinux BTF).
>
> 3. Invoke the btf__permute interface to implement BTF sorting in resolve_btfids.
>
> I welcome any further suggestions.

Hi Donglin,

I think this summarizes the discussion pretty well.
One thing to notice about (2): if sorting is done by resolve_btfids,
there is no need to check for BTF being sorted in vmlinux BTF.
So, maybe it's a good idea to skip this check for it, as Alexei suggested
(but not for programs BTF).

Thanks,
Eduard.
On Tue, Nov 11, 2025 at 4:44 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2025-11-10 at 09:42 +0800, Donglin Peng wrote:
>
> [...]
>
> > [[Resending in plain text format - previous HTML email was rejected]
> >
> > Thanks for the feedback. Based on the previous discussions, I plan
> > to implement the following changes in the next version:
> >
> > 1. Modify the btf__permute interface to adopt the ID map approach, as
> > suggested by Andrii.
> >
> > 2. Remove the lazy sort check and move the verification to the BTF
> > parsing phase. This addresses two concerns: potential race conditions
> > with write operations and const-cast issues. The overhead is negligible
> > (approximately 1.4ms for vmlinux BTF).
> >
> > 3. Invoke the btf__permute interface to implement BTF sorting in resolve_btfids.
> >
> > I welcome any further suggestions.
>
> Hi Donglin,
>
> I think this summarizes the discussion pretty well.
> One thing to notice about (2): if sorting is done by resolve_btfids,
> there is no need to check for BTF being sorted in vmlinux BTF.
> So, maybe it's a good idea to skip this check for it, as Alexei suggested
> (but not for programs BTF).

Thanks. I noticed that we still need an additional iteration in
btf_parse_base() and btf_parse_module() to compute nr_sorted_types
for lookup performance optimization.

Thanks,
Donglin

>
> Thanks,
> Eduard.