[PATCH bpf-next v2 1/2] bpf: Fix tnum_overlap to check for zero mask intersection

KaFai Wan posted 2 patches 3 months, 1 week ago
Posted by KaFai Wan 3 months, 1 week ago
Syzbot reported a kernel warning due to a range invariant violation in
the BPF verifier. The issue occurs when tnum_overlap() fails to detect
that two tnums don't have any overlapping bits.

The problematic BPF program:
   0: call bpf_get_prandom_u32
   1: r6 = r0
   2: r6 &= 0xFFFFFFFFFFFFFFF0
   3: r7 = r0
   4: r7 &= 0x07
   5: r7 -= 0xFF
   6: if r6 == r7 goto <exit>

After instruction 5, R7 has the range:
   R7: u64=[0xffffffffffffff01, 0xffffffffffffff08] var_off=(0xffffffffffffff00; 0xf)

R6 and R7 can never be equal: R6 is always a multiple of 16, while every
value in R7's u64 range has a nonzero low nibble. However,
is_branch_taken() fails to recognize this, causing the verifier to
refine register bounds and trigger a range bounds violation:

   6: if r6 == r7 goto <exit>
   true_reg1: u64=[0xffffffffffffff01, 0xffffffffffffff00] var_off=(0xffffffffffffff00, 0x0)
   true_reg2: u64=[0xffffffffffffff01, 0xffffffffffffff00] var_off=(0xffffffffffffff00, 0x0)

The root cause is that tnum_overlap() doesn't properly handle the case
where the masks have no overlapping bits.

Fix this by adding an early check for zero mask intersection in tnum_overlap().

Reported-by: syzbot+c950cc277150935cc0b5@syzkaller.appspotmail.com
Fixes: f41345f47fb2 ("bpf: Use tnums for JEQ/JNE is_branch_taken logic")
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
---
 kernel/bpf/tnum.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index f8e70e9c3998..1a75b7c9a73a 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -163,6 +163,8 @@ bool tnum_overlap(struct tnum a, struct tnum b)
 {
 	u64 mu;
 
+	if (a.mask && b.mask && !(a.mask & b.mask))
+		return false;
 	mu = ~a.mask & ~b.mask;
 	return (a.value & mu) == (b.value & mu);
 }
-- 
2.43.0
Re: [PATCH bpf-next v2 1/2] bpf: Fix tnum_overlap to check for zero mask intersection
Posted by bot+bpf-ci@kernel.org 3 months, 1 week ago
```
commit 16cbc4520eb13d2065d145c76b97fb93035d81e8
Author: KaFai Wan <kafai.wan@linux.dev>

bpf: Fix tnum_overlap to check for zero mask intersection

This patch adds an early return check in tnum_overlap() to detect
non-overlapping tnums when their masks have no overlapping bits. The
fix addresses a syzbot report where the verifier failed to recognize
that two tnums couldn't represent the same value.

Reported-by: syzbot+c950cc277150935cc0b5@syzkaller.appspotmail.com

> diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> index f8e70e9c3..1a75b7c9a 100644
> --- a/kernel/bpf/tnum.c
> +++ b/kernel/bpf/tnum.c
> @@ -163,6 +163,8 @@ bool tnum_overlap(struct tnum a, struct tnum b)
>  {
>  	u64 mu;
>
> +	if (a.mask && b.mask && !(a.mask & b.mask))
> +		return false;
                              ^^^^^

Does this logic correctly detect non-overlapping tnums? Consider this
counter-example:

  a = (value=0x5, mask=0xA)  // binary: value=0101, mask=1010
  b = (value=0xA, mask=0x5)  // binary: value=1010, mask=0101

The masks are disjoint: (0xA & 0x5) == 0, so the new check returns
false (no overlap).

However, these tnums actually represent overlapping sets:
  a represents: {5, 7, 13, 15}  // 0b0101, 0b0111, 0b1101, 0b1111
  b represents: {10, 11, 14, 15} // 0b1010, 0b1011, 0b1110, 0b1111

Both sets contain 15, so they do overlap. Can this cause incorrect
verifier behavior when is_branch_taken() calls tnum_overlap() to
determine if JEQ/JNE branches are reachable?

>  	mu = ~a.mask & ~b.mask;
>  	return (a.value & mu) == (b.value & mu);
>  }


```

---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: `bpf: Fix tnum_overlap to check for zero mask intersection`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18880108453
Re: [PATCH bpf-next v2 1/2] bpf: Fix tnum_overlap to check for zero mask intersection
Posted by Eduard Zingerman 3 months, 1 week ago
On Tue, 2025-10-28 at 15:45 +0000, bot+bpf-ci@kernel.org wrote:
> ```
> commit 16cbc4520eb13d2065d145c76b97fb93035d81e8
> Author: KaFai Wan <kafai.wan@linux.dev>
> 
> bpf: Fix tnum_overlap to check for zero mask intersection
> 
> This patch adds an early return check in tnum_overlap() to detect
> non-overlapping tnums when their masks have no overlapping bits. The
> fix addresses a syzbot report where the verifier failed to recognize
> that two tnums couldn't represent the same value.
> 
> Reported-by: syzbot+c950cc277150935cc0b5@syzkaller.appspotmail.com
> 
> > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> > index f8e70e9c3..1a75b7c9a 100644
> > --- a/kernel/bpf/tnum.c
> > +++ b/kernel/bpf/tnum.c
> > @@ -163,6 +163,8 @@ bool tnum_overlap(struct tnum a, struct tnum b)
> >  {
> >  	u64 mu;
> > 
> > +	if (a.mask && b.mask && !(a.mask & b.mask))
> > +		return false;
>                               ^^^^^
> 
> Does this logic correctly detect non-overlapping tnums? Consider this
> counter-example:
> 
>   a = (value=0x5, mask=0xA)  // binary: value=0101, mask=1010
>   b = (value=0xA, mask=0x5)  // binary: value=1010, mask=0101
> 
> The masks are disjoint: (0xA & 0x5) == 0, so the new check returns
> false (no overlap).
> 
> However, these tnums actually represent overlapping sets:
>   a represents: {5, 7, 13, 15}  // 0b0101, 0b0111, 0b1101, 0b1111
>   b represents: {10, 11, 14, 15} // 0b1010, 0b1011, 0b1110, 0b1111
> 
> Both sets contain 15, so they do overlap. Can this cause incorrect
> verifier behavior when is_branch_taken() calls tnum_overlap() to
> determine if JEQ/JNE branches are reachable?

This is a legit bug, please address.

> >  	mu = ~a.mask & ~b.mask;
> >  	return (a.value & mu) == (b.value & mu);
> >  }
> 
> 
> ```
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> In-Reply-To-Subject: `bpf: Fix tnum_overlap to check for zero mask intersection`
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18880108453