答复: [????] Re: divide error in x86 and cputime

Li,Rongqing posted 1 patch 3 months ago
答复: [????] Re: divide error in x86 and cputime
Posted by Li,Rongqing 3 months ago


> On a second thought, this
> 
>     mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> 0x09e00900);
>                         stime               rtime
> stime + utime
> 
> looks suspicious:
> 
> 	- stime > stime + utime
> 
> 	- rtime = 0xfffd213aabd74626 is absurdly huge
> 
> so perhaps there is another problem?
> 

it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit

non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0

so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6dab4854..db0c273 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
                goto update;
        }

+       if (stime > (stime + utime)) {
+               goto update;
+       }
+
        stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
        /*
         * Because mul_u64_u64_div_u64() can approximate on some


Thanks

-Li


> Oleg.
> 
> On 07/08, Oleg Nesterov wrote:
> >
> > On 07/07, Li,Rongqing wrote:
> > >
> > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> >
> > ...
> >
> > > It caused by a process with many threads running very long, and
> > > utime+stime overflowed 64bit, then cause the below div
> > >
> > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > 0x09e00900);
> > >
> > > I see the comments of mul_u64_u64_div_u64() say:
> > >
> > > Will generate an #DE when the result doesn't fit u64, could fix with
> > > an __ex_table[] entry when it becomes an issu
> > >
> > > Seem __ex_table[] entry for div does not work ?
> >
> > Well, the current version doesn't have an __ex_table[] entry for div...
> >
> > I do not know what can/should we do in this case... Perhaps
> >
> > 	static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > 	{
> > 		int ok = 0;
> > 		u64 q;
> >
> > 		asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> > 			_ASM_EXTABLE(1b, 2b)
> > 			: "=a" (q), "+r" (ok)
> > 			: "a" (a), "rm" (mul), "rm" (div)
> > 			: "rdx");
> >
> > 		return ok ? q : -1ul;
> > 	}
> >
> > ?
> >
> > Should return ULLONG_MAX on #DE.
> >
> > Oleg.

答复: [????] Re: divide error in x86 and cputime
Posted by Li,Rongqing 3 months ago
> non-x86 system maybe has same issue, once (stime + utime) overflows 64bit,
> mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
> 

Correct this, mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x009e00900) oflib/math/div64.c returns 0xffffffffffffffff


> so to cputime, could cputime_adjust() return stime if stime if stime + utime is
> overflow
> 
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index
> 6dab4854..db0c273 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct
> prev_cputime *prev,
>                 goto update;
>         }
> 
> +       if (stime > (stime + utime)) {
> +               goto update;
> +       }
> +
>         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
>         /*
>          * Because mul_u64_u64_div_u64() can approximate on some
> 
> 
> Thanks
> 
> -Li
> 
> 
> > Oleg.
> >
> > On 07/08, Oleg Nesterov wrote:
> > >
> > > On 07/07, Li,Rongqing wrote:
> > > >
> > > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> > >
> > > ...
> > >
> > > > It caused by a process with many threads running very long, and
> > > > utime+stime overflowed 64bit, then cause the below div
> > > >
> > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > > 0x09e00900);
> > > >
> > > > I see the comments of mul_u64_u64_div_u64() say:
> > > >
> > > > Will generate an #DE when the result doesn't fit u64, could fix
> > > > with an __ex_table[] entry when it becomes an issu
> > > >
> > > > Seem __ex_table[] entry for div does not work ?
> > >
> > > Well, the current version doesn't have an __ex_table[] entry for div...
> > >
> > > I do not know what can/should we do in this case... Perhaps
> > >
> > > 	static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > > 	{
> > > 		int ok = 0;
> > > 		u64 q;
> > >
> > > 		asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> > > 			_ASM_EXTABLE(1b, 2b)
> > > 			: "=a" (q), "+r" (ok)
> > > 			: "a" (a), "rm" (mul), "rm" (div)
> > > 			: "rdx");
> > >
> > > 		return ok ? q : -1ul;
> > > 	}
> > >
> > > ?
> > >
> > > Should return ULLONG_MAX on #DE.
> > >
> > > Oleg.

Re: [????] Re: divide error in x86 and cputime
Posted by Steven Rostedt 3 months ago
On Mon, 7 Jul 2025 23:41:14 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:

> > On a second thought, this
> > 
> >     mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > 0x09e00900);
> >                         stime               rtime
> > stime + utime
> > 
> > looks suspicious:
> > 
> > 	- stime > stime + utime
> > 
> > 	- rtime = 0xfffd213aabd74626 is absurdly huge
> > 
> > so perhaps there is another problem?
> >   
> 
> it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit
> 
> non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
> 
> so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow
> 
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 6dab4854..db0c273 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
>                 goto update;
>         }
> 
> +       if (stime > (stime + utime)) {
> +               goto update;
> +       }
> +
>         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
>         /*
>          * Because mul_u64_u64_div_u64() can approximate on some
> 

Are you running 5.10.0? Because a diff of 5.10.238 from 5.10.0 gives:

@@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
        }
 
        stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
+       /*
+        * Because mul_u64_u64_div_u64() can approximate on some
+        * achitectures; enforce the constraint that: a*b/(b+c) <= a.
+        */
+       if (unlikely(stime > rtime))
+               stime = rtime;
 
 update:


Thus the result is what's getting screwed up.

-- Steve
答复: [????] Re: [????] Re: divide error in x86 and cputime
Posted by Li,Rongqing 3 months ago
> On Mon, 7 Jul 2025 23:41:14 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
> 
> > > On a second thought, this
> > >
> > >     mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > 0x09e00900);
> > >                         stime               rtime
> > > stime + utime
> > >
> > > looks suspicious:
> > >
> > > 	- stime > stime + utime
> > >
> > > 	- rtime = 0xfffd213aabd74626 is absurdly huge
> > >
> > > so perhaps there is another problem?
> > >
> >
> > it happened when a process with 236 busy polling threads , run about
> > 904 days, the total time will overflow the 64bit
> >
> > non-x86 system maybe has same issue, once (stime + utime) overflows
> > 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division
> > by 0
> >
> > so to cputime, could cputime_adjust() return stime if stime if stime +
> > utime is overflow
> >
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index
> > 6dab4854..db0c273 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr,
> struct prev_cputime *prev,
> >                 goto update;
> >         }
> >
> > +       if (stime > (stime + utime)) {
> > +               goto update;
> > +       }
> > +
> >         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> >         /*
> >          * Because mul_u64_u64_div_u64() can approximate on some
> >
> 
> Are you running 5.10.0? Because a diff of 5.10.238 from 5.10.0 gives:
> 
> @@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct
> prev_cputime *prev,
>         }
> 
>         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> +       /*
> +        * Because mul_u64_u64_div_u64() can approximate on some
> +        * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> +        */
> +       if (unlikely(stime > rtime))
> +               stime = rtime;


My 5.10 has not this patch " sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime ",
but I am sure this patch can not fix this overflow issue, Since division error happened in mul_u64_u64_div_u64()

Thanks

-Li


> 
>  update:
> 
> 
> Thus the result is what's getting screwed up.
> 
> -- Steve
Re: [????] Re: [????] Re: divide error in x86 and cputime
Posted by Steven Rostedt 3 months ago
On Tue, 8 Jul 2025 00:10:54 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:

> >         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > +       /*
> > +        * Because mul_u64_u64_div_u64() can approximate on some
> > +        * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > +        */
> > +       if (unlikely(stime > rtime))
> > +               stime = rtime;  
> 
> 
> My 5.10 has not this patch " sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime ",
> but I am sure this patch can not fix this overflow issue, Since division error happened in mul_u64_u64_div_u64()

Have you tried it? Or are you just making an assumption?

How can you be so sure? Did you even *look* at the commit?

    sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
    
    In extreme test scenarios:
    the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
    utime = 18446744073709518790 ns, rtime = 135989749728000 ns
    
    In cputime_adjust() process, stime is greater than rtime due to
    mul_u64_u64_div_u64() precision problem.
    before call mul_u64_u64_div_u64(),
    stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
    after call mul_u64_u64_div_u64(),
    stime = 135989949653530
    
    unsigned reversion occurs because rtime is less than stime.
    utime = rtime - stime = 135989749728000 - 135989949653530
                          = -199925530
                          = (u64)18446744073709518790
    
    Trigger condition:
      1). User task run in kernel mode most of time
      2). ARM64 architecture
      3). TICK_CPU_ACCOUNTING=y
          CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
    
    Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime


When stime ends up greater than rtime, it causes utime to go NEGATIVE!

That means *YES* it can overflow a u64 number. That's your bug.

Next time, look to see if there's fixes in the code that is triggering
issues for you and test them out, before bothering upstream.

Goodbye.

-- Steve
Re: [????] Re: [????] Re: divide error in x86 and cputime
Posted by David Laight 3 months ago
On Mon, 7 Jul 2025 20:30:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 8 Jul 2025 00:10:54 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
> 
> > >         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > +       /*
> > > +        * Because mul_u64_u64_div_u64() can approximate on some
> > > +        * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > > +        */
> > > +       if (unlikely(stime > rtime))
> > > +               stime = rtime;    
> > 
> > 
> > My 5.10 has not this patch " sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime ",
> > but I am sure this patch can not fix this overflow issue, Since division error happened in mul_u64_u64_div_u64()  
> 
> Have you tried it? Or are you just making an assumption?
> 
> How can you be so sure? Did you even *look* at the commit?

It can't be relevant.
That change is after the mul_u64_u64_div_u64() call that trapped.
It is also not relevant for x86-64 because it uses the asm version.

At some point mul_u64_u64_div_u64() got changed to be accurate (and slow)
so that check isn't needed any more.

	David
答复: [????] Re: [????] Re: [????] Re: divide error in x86 and cputime
Posted by Li,Rongqing 3 months ago
> > On Tue, 8 Jul 2025 00:10:54 +0000
> > "Li,Rongqing" <lirongqing@baidu.com> wrote:
> >
> > > >         stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > > +       /*
> > > > +        * Because mul_u64_u64_div_u64() can approximate on some
> > > > +        * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > > > +        */
> > > > +       if (unlikely(stime > rtime))
> > > > +               stime = rtime;
> > >
> > >
> > > My 5.10 has not this patch " sched/cputime: Fix
> > > mul_u64_u64_div_u64() precision for cputime ", but I am sure this
> > > patch can not fix this overflow issue, Since division error happened
> > > in mul_u64_u64_div_u64()
> >
> > Have you tried it? Or are you just making an assumption?
> >
> > How can you be so sure? Did you even *look* at the commit?
> 
> It can't be relevant.
> That change is after the mul_u64_u64_div_u64() call that trapped.
> It is also not relevant for x86-64 because it uses the asm version.
> 
> At some point mul_u64_u64_div_u64() got changed to be accurate (and slow) so
> that check isn't needed any more.
> 

I see this patch not relevant

Thank you very much for your confirmation

-Li
答复: [????] Re: [????] Re: [????] Re: divide error in x86 and cputime
Posted by Li,Rongqing 3 months ago
> Have you tried it? Or are you just making an assumption?
> 
> How can you be so sure? Did you even *look* at the commit?
> 
>     sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
> 
>     In extreme test scenarios:
>     the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
>     utime = 18446744073709518790 ns, rtime = 135989749728000 ns
> 
>     In cputime_adjust() process, stime is greater than rtime due to
>     mul_u64_u64_div_u64() precision problem.
>     before call mul_u64_u64_div_u64(),
>     stime = 175136586720000, rtime = 135989749728000, utime =
> 1416780000.
>     after call mul_u64_u64_div_u64(),
>     stime = 135989949653530
> 
>     unsigned reversion occurs because rtime is less than stime.
>     utime = rtime - stime = 135989749728000 - 135989949653530
>                           = -199925530
>                           = (u64)18446744073709518790
> 

I will try to tested this patch, But I think it is different case;

Stime is not greater than rtime in my case, (stime= 0x69f98da9ba980c00, rtime= 0xfffd213aabd74626, stime+utime= 0x9e00900. So utime should be 0x960672564f47fd00 ), and this overflow process with 236 busy poll threads running about 904 day, so I think these times are correct


Thanks

-Li

>     Trigger condition:
>       1). User task run in kernel mode most of time
>       2). ARM64 architecture
>       3). TICK_CPU_ACCOUNTING=y
>           CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
> 
>     Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime
> 
> 
> When stime ends up greater than rtime, it causes utime to go NEGATIVE!
> 
> That means *YES* it can overflow a u64 number. That's your bug.
> 
> Next time, look to see if there's fixes in the code that is triggering issues for you
> and test them out, before bothering upstream.
> 
> Goodbye.
> 
> -- Steve
Re: divide error in x86 and cputime
Posted by Steven Rostedt 3 months ago
On Tue, 8 Jul 2025 01:17:50 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:

> Stime is not greater than rtime in my case, (stime= 0x69f98da9ba980c00,
> rtime= 0xfffd213aabd74626, stime+utime= 0x9e00900. So utime should be
> 0x960672564f47fd00 ), and this overflow process with 236 busy poll
> threads running about 904 day, so I think these times are correct
> 

But look at rtime, it is *negative*. So maybe that fix isn't going to fix
this bug, but rtime is most definitely screwed up. That value is:

  0xfffd213aabd74626 = (u64)18445936184654251558 = (s64)-807889055300058

There's no way run time should be 584 years in nanoseconds.

So if it's not fixed by that commit, it's a bug that happened before you even
got to the mul_u64_u64_div_u64() function. Touching that is only putting a
band-aid on the symptom, you haven't touched the real bug.

I bet there's likely another fix between what you are using and 5.10.238.
There's 31,101 commits between those two. You are using a way old kernel
without any fixes to it. It is known to be buggy. You will hit bugs with
it. No need to tell us about it.

-- Steve