From: Li RongQing <lirongqing@baidu.com>
The sum of utime and stime can overflow to 0 when a process with many
threads runs for more than 2^64 ns in total.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
kernel/sched/cputime.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6dab4854..c35fc4c 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -579,7 +579,8 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
goto update;
}
- stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
+ if (likely(stime + utime))
+ stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
/*
* Because mul_u64_u64_div_u64() can approximate on some
* achitectures; enforce the constraint that: a*b/(b+c) <= a.
--
2.9.4
> > The sum of utime and stime can overflow to 0 when a process with many
> > threads runs for more than 2^64 ns in total.
> >
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>

Ping

Thanks
-Li

> ---
> kernel/sched/cputime.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 6dab4854..c35fc4c 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,7 +579,8 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> 		goto update;
> 	}
>
> -	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> +	if (likely(stime + utime))
> +		stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> 	/*
> 	 * Because mul_u64_u64_div_u64() can approximate on some
> 	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> --
> 2.9.4
On Mon, 16 Jun 2025 at 13:46, Li,Rongqing <lirongqing@baidu.com> wrote:
>
> > The sum of utime and stime can overflow to 0 when a process with many
> > threads runs for more than 2^64 ns in total.

Theoretical is the right word; if all 2^32 possible threads belong to the
process, we can get an overflow to 0 after ~4 sec of run time per thread.
But then how long would it take to have those 2^32 threads each run for
4 sec on one system ...

It would be good to get numbers showing how realistic (or not) it is to
reach this value.

> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
>
> Ping
>
> Thanks
>
> -Li
> Theoretical is the right word; if all 2^32 possible threads belong to the
> process, we can get an overflow to 0 after ~4 sec of run time per thread.
> But then how long would it take to have those 2^32 threads each run for
> 4 sec on one system ...
>
> It would be good to get numbers showing how realistic (or not) it is to
> reach this value.

2^64 ns is about 584 years, so if a process with 1000 busy-polling threads
runs on a machine with more than 1000 CPUs, the counter will overflow in
about half a year.

Thanks
-Li
On Mon, 16 Jun 2025 14:51:56 +0200 Vincent Guittot <vincent.guittot@linaro.org> wrote:
> Theoretical is the right word; if all 2^32 possible threads belong to
> the process, we can get an overflow to 0 after ~4 sec of run time per
> thread. But then how long would it take to have those 2^32 threads
> each run for 4 sec on one system ...
>
> It would be good to get numbers showing how realistic (or not) it is
> to reach this value.

I did wonder, when re-writing mul_u64_u64_div_u64(), how common this path
is and whether both stime and utime could be zero.

The current mul_u64_u64_div_u64() is particularly horrid on 32bit.
(Note that it no longer generates an approximate result.)
On 32bit x86 the worst case (lots of 1 bits in the result) is ~900 clocks;
my new version takes ~130 for pretty much all (large) values.
That is in userspace with cmov; without cmov it will be worse.
I also think the kernel has one less register to play with - %ebp.

Other architectures are likely to be worse; sh[rl]d makes double-length
shifts less painful - especially when combined with cmov.

See: https://www.spinics.net/lists/kernel/msg5723178.html

	David