[RFC PATCH] libbpf: Change hash_combine parameters from long to __u32

Sidong Yang posted 1 patch 1 week ago
tools/lib/bpf/btf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[RFC PATCH] libbpf: Change hash_combine parameters from long to __u32
Posted by Sidong Yang 1 week ago
The hash_combine() could be trapped when compiled with sanitizer like "zig cc".
This patch changes parameters to __u32 to fix it.

Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
---
 tools/lib/bpf/btf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 8befb8103e32..11ccb5aa4958 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3548,7 +3548,7 @@ struct btf_dedup {
 	struct strset *strs_set;
 };
 
-static long hash_combine(long h, long value)
+static __u32 hash_combine(__u32 h, __u32 value)
 {
 	return h * 31 + value;
 }
-- 
2.42.0
Re: [RFC PATCH] libbpf: Change hash_combine parameters from long to __u32
Posted by Andrii Nakryiko 6 days, 20 hours ago
On Fri, Nov 15, 2024 at 2:51 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
>
> The hash_combine() could be trapped when compiled with sanitizer like "zig cc".
> This patch changes parameters to __u32 to fix it.

Can you please elaborate? What exactly are you fixing? "Undefined"
signed integer overflow? I can consider changing long to unsigned
long, but I don't think we should downgrade from long all the way to
32-bit u32. I'd rather keep all those 64 bits for hash.

pw-bot: cr

>
> Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
> ---
>  tools/lib/bpf/btf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 8befb8103e32..11ccb5aa4958 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -3548,7 +3548,7 @@ struct btf_dedup {
>         struct strset *strs_set;
>  };
>
> -static long hash_combine(long h, long value)
> +static __u32 hash_combine(__u32 h, __u32 value)
>  {
>         return h * 31 + value;
>  }
> --
> 2.42.0
>
>
Re: [RFC PATCH] libbpf: Change hash_combine parameters from long to __u32
Posted by Sidong Yang 6 days, 16 hours ago
On Fri, Nov 15, 2024 at 11:57:24AM -0800, Andrii Nakryiko wrote:
> On Fri, Nov 15, 2024 at 2:51 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
> >
> > The hash_combine() could be trapped when compiled with sanitizer like "zig cc".
> > This patch changes parameters to __u32 to fix it.
> 
> Can you please elaborate? What exactly are you fixing? "Undefined"
> signed integer overflow? I can consider changing long to unsigned
> long, but I don't think we should downgrade from long all the way to
> 32-bit u32. I'd rather keep all those 64 bits for hash.

Hi, Andrii.

Actually I'm using libbpf-rs with maturin build that makes python package for
rust. It seems that it uses zig cc for cross compilation. It compiles libbpf
like this command.

CC="zig cc" make CFLAGS="-fsanitize-trap"

And hash_combine's result is like below.

0000000000063860 <hash_combine>:
   63860:       55                      push   %rbp
   63861:       48 89 e5                mov    %rsp,%rbp
   63864:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
   63868:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
   6386c:       b8 1f 00 00 00          mov    $0x1f,%eax
   63871:       48 0f af 45 f8          imul   -0x8(%rbp),%rax
   63876:       48 89 45 e8             mov    %rax,-0x18(%rbp)
   6387a:       0f 90 c0                seto   %al
   6387d:       34 ff                   xor    $0xff,%al
   6387f:       a8 01                   test   $0x1,%al
   63881:       0f 85 05 00 00 00       jne    6388c <hash_combine+0x2c>
-> 63887:       67 0f b9 40 0c          ud1    0xc(%eax),%eax
   6388c:       48 8b 45 e8             mov    -0x18(%rbp),%rax
   63890:       48 03 45 f0             add    -0x10(%rbp),%rax
   63894:       48 89 45 e0             mov    %rax,-0x20(%rbp)
   63898:       0f 90 c0                seto   %al
   6389b:       34 ff                   xor    $0xff,%al
   6389d:       a8 01                   test   $0x1,%al
   6389f:       0f 85 04 00 00 00       jne    638a9 <hash_combine+0x49>
   638a5:       67 0f b9 00             ud1    (%eax),%eax
   638a9:       48 8b 45 e0             mov    -0x20(%rbp),%rax
   638ad:       5d                      pop    %rbp
   638ae:       c3                      ret   
   638af:       90                      nop

When I'm using libbpf-rs, it receives SIGILL for ud1 instruction.
It seems more appropriate to use u64 instead of u32, doesn't it?
I'll work on it.

Thanks,
Sidong
> 
> pw-bot: cr
> 
> >
> > Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
> > ---
> >  tools/lib/bpf/btf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 8befb8103e32..11ccb5aa4958 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -3548,7 +3548,7 @@ struct btf_dedup {
> >         struct strset *strs_set;
> >  };
> >
> > -static long hash_combine(long h, long value)
> > +static __u32 hash_combine(__u32 h, __u32 value)
> >  {
> >         return h * 31 + value;
> >  }
> > --
> > 2.42.0
> >
> >
Re: [RFC PATCH] libbpf: Change hash_combine parameters from long to __u32
Posted by Yonghong Song 6 days, 12 hours ago


On 11/15/24 4:36 PM, Sidong Yang wrote:
> On Fri, Nov 15, 2024 at 11:57:24AM -0800, Andrii Nakryiko wrote:
>> On Fri, Nov 15, 2024 at 2:51 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
>>> The hash_combine() could be trapped when compiled with sanitizer like "zig cc".
>>> This patch changes parameters to __u32 to fix it.
>> Can you please elaborate? What exactly are you fixing? "Undefined"
>> signed integer overflow? I can consider changing long to unsigned
>> long, but I don't think we should downgrade from long all the way to
>> 32-bit u32. I'd rather keep all those 64 bits for hash.
> Hi, Andrii.
>
> Actually I'm using libbpf-rs with maturin build that makes python package for
> rust. It seems that it uses zig cc for cross compilation. It compiles libbpf
> like this command.
>
> CC="zig cc" make CFLAGS="-fsanitize-trap"
>
> And hash_combine's result is like below.
>
> 0000000000063860 <hash_combine>:
>     63860:       55                      push   %rbp
>     63861:       48 89 e5                mov    %rsp,%rbp
>     63864:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
>     63868:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
>     6386c:       b8 1f 00 00 00          mov    $0x1f,%eax
>     63871:       48 0f af 45 f8          imul   -0x8(%rbp),%rax
>     63876:       48 89 45 e8             mov    %rax,-0x18(%rbp)
>     6387a:       0f 90 c0                seto   %al
>     6387d:       34 ff                   xor    $0xff,%al
>     6387f:       a8 01                   test   $0x1,%al
>     63881:       0f 85 05 00 00 00       jne    6388c <hash_combine+0x2c>
> -> 63887:       67 0f b9 40 0c          ud1    0xc(%eax),%eax
>     6388c:       48 8b 45 e8             mov    -0x18(%rbp),%rax
>     63890:       48 03 45 f0             add    -0x10(%rbp),%rax
>     63894:       48 89 45 e0             mov    %rax,-0x20(%rbp)
>     63898:       0f 90 c0                seto   %al
>     6389b:       34 ff                   xor    $0xff,%al
>     6389d:       a8 01                   test   $0x1,%al
>     6389f:       0f 85 04 00 00 00       jne    638a9 <hash_combine+0x49>
>     638a5:       67 0f b9 00             ud1    (%eax),%eax
>     638a9:       48 8b 45 e0             mov    -0x20(%rbp),%rax
>     638ad:       5d                      pop    %rbp
>     638ae:       c3                      ret
>     638af:       90                      nop
>
> When I'm using libbpf-rs, it receives SIGILL for ud1 instruction.
> It seems more appropriate to use u64 instead of u32, doesn't it?
> I'll work on it.

Yes, this is due to potential integer overflow.

I tried with clang with additional flags
    -fsanitize=signed-integer-overflow -fsanitize-trap=all
and disable inlining for hash_combine().
The asm code (the code is compiled with -O2)

0000000000007cb0 <hash_combine>:
     7cb0: 48 6b c7 1f                   imulq   $0x1f, %rdi, %rax
     7cb4: 70 06                         jo      0x7cbc <hash_combine+0xc>
     7cb6: 48 01 f0                      addq    %rsi, %rax
     7cb9: 70 06                         jo      0x7cc1 <hash_combine+0x11>
     7cbb: c3                            retq
     7cbc: 67 0f b9 40 0c                ud1l    0xc(%eax), %eax
     7cc1: 67 0f b9 00                   ud1l    (%eax), %eax
     7cc5: 66 66 2e 0f 1f 84 00 00 00 00 00      nopw    %cs:(%rax,%rax)

Here 'jo' means 'jump if overflow'.
So if overflow happens, 'ud1l' will execute and dump error.

Changing 'long' type to 'unsigned long' should fix the problem.

>
> Thanks,
> Sidong
>> pw-bot: cr
>>
>>> Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
>>> ---
>>>   tools/lib/bpf/btf.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index 8befb8103e32..11ccb5aa4958 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -3548,7 +3548,7 @@ struct btf_dedup {
>>>          struct strset *strs_set;
>>>   };
>>>
>>> -static long hash_combine(long h, long value)
>>> +static __u32 hash_combine(__u32 h, __u32 value)
>>>   {
>>>          return h * 31 + value;
>>>   }
>>> --
>>> 2.42.0
>>>
>>>

Re: [RFC PATCH] libbpf: Change hash_combine parameters from long to __u32
Posted by Sidong Yang 6 days, 9 hours ago
On Fri, Nov 15, 2024 at 08:39:05PM -0800, Yonghong Song wrote:
> 
> 
> 
> On 11/15/24 4:36 PM, Sidong Yang wrote:
> > On Fri, Nov 15, 2024 at 11:57:24AM -0800, Andrii Nakryiko wrote:
> > > On Fri, Nov 15, 2024 at 2:51 AM Sidong Yang <sidong.yang@furiosa.ai> wrote:
> > > > The hash_combine() could be trapped when compiled with sanitizer like "zig cc".
> > > > This patch changes parameters to __u32 to fix it.
> > > Can you please elaborate? What exactly are you fixing? "Undefined"
> > > signed integer overflow? I can consider changing long to unsigned
> > > long, but I don't think we should downgrade from long all the way to
> > > 32-bit u32. I'd rather keep all those 64 bits for hash.
> > Hi, Andrii.
> > 
> > Actually I'm using libbpf-rs with maturin build that makes python package for
> > rust. It seems that it uses zig cc for cross compilation. It compiles libbpf
> > like this command.
> > 
> > CC="zig cc" make CFLAGS="-fsanitize-trap"
> > 
> > And hash_combine's result is like below.
> > 
> > 0000000000063860 <hash_combine>:
> >     63860:       55                      push   %rbp
> >     63861:       48 89 e5                mov    %rsp,%rbp
> >     63864:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
> >     63868:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
> >     6386c:       b8 1f 00 00 00          mov    $0x1f,%eax
> >     63871:       48 0f af 45 f8          imul   -0x8(%rbp),%rax
> >     63876:       48 89 45 e8             mov    %rax,-0x18(%rbp)
> >     6387a:       0f 90 c0                seto   %al
> >     6387d:       34 ff                   xor    $0xff,%al
> >     6387f:       a8 01                   test   $0x1,%al
> >     63881:       0f 85 05 00 00 00       jne    6388c <hash_combine+0x2c>
> > -> 63887:       67 0f b9 40 0c          ud1    0xc(%eax),%eax
> >     6388c:       48 8b 45 e8             mov    -0x18(%rbp),%rax
> >     63890:       48 03 45 f0             add    -0x10(%rbp),%rax
> >     63894:       48 89 45 e0             mov    %rax,-0x20(%rbp)
> >     63898:       0f 90 c0                seto   %al
> >     6389b:       34 ff                   xor    $0xff,%al
> >     6389d:       a8 01                   test   $0x1,%al
> >     6389f:       0f 85 04 00 00 00       jne    638a9 <hash_combine+0x49>
> >     638a5:       67 0f b9 00             ud1    (%eax),%eax
> >     638a9:       48 8b 45 e0             mov    -0x20(%rbp),%rax
> >     638ad:       5d                      pop    %rbp
> >     638ae:       c3                      ret
> >     638af:       90                      nop
> > 
> > When I'm using libbpf-rs, it receives SIGILL for ud1 instruction.
> > It seems more appropriate to use u64 instead of u32, doesn't it?
> > I'll work on it.
> 
> Yes, this is due to potential integer overflow.
> 
> I tried with clang with additional flags
>    -fsanitize=signed-integer-overflow -fsanitize-trap=all
> and disable inlining for hash_combine().
> The asm code (the code is compiled with -O2)
> 
> 0000000000007cb0 <hash_combine>:
>     7cb0: 48 6b c7 1f                   imulq   $0x1f, %rdi, %rax
>     7cb4: 70 06                         jo      0x7cbc <hash_combine+0xc>
>     7cb6: 48 01 f0                      addq    %rsi, %rax
>     7cb9: 70 06                         jo      0x7cc1 <hash_combine+0x11>
>     7cbb: c3                            retq
>     7cbc: 67 0f b9 40 0c                ud1l    0xc(%eax), %eax
>     7cc1: 67 0f b9 00                   ud1l    (%eax), %eax
>     7cc5: 66 66 2e 0f 1f 84 00 00 00 00 00      nopw    %cs:(%rax,%rax)
> 
> Here 'jo' means 'jump if overflow'.
> So if overflow happens, 'ud1l' will execute and dump error.
> 
> Changing 'long' type to 'unsigned long' should fix the problem.

Agree, unsigned long will be good. thanks.

> 
> > 
> > Thanks,
> > Sidong
> > > pw-bot: cr
> > > 
> > > > Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
> > > > ---
> > > >   tools/lib/bpf/btf.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > index 8befb8103e32..11ccb5aa4958 100644
> > > > --- a/tools/lib/bpf/btf.c
> > > > +++ b/tools/lib/bpf/btf.c
> > > > @@ -3548,7 +3548,7 @@ struct btf_dedup {
> > > >          struct strset *strs_set;
> > > >   };
> > > > 
> > > > -static long hash_combine(long h, long value)
> > > > +static __u32 hash_combine(__u32 h, __u32 value)
> > > >   {
> > > >          return h * 31 + value;
> > > >   }
> > > > --
> > > > 2.42.0
> > > > 
> > > > 
>