[PATCH] watchdog/softlockup: Fix sample ring index wrap in need_counting_irqs()

shengminghu512 posted 1 patch 2 weeks, 4 days ago
[PATCH] watchdog/softlockup: Fix sample ring index wrap in need_counting_irqs()
Posted by shengminghu512 2 weeks, 4 days ago
From: Shengming Hu <hu.shengming@zte.com.cn>

cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
ring buffer. need_counting_irqs() currently wraps the index using
NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.

Use NUM_SAMPLE_PERIODS for the wrap so the ring math stays correct even if
NUM_HARDIRQ_REPORT or NUM_SAMPLE_PERIODS changes.

Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
---
 kernel/watchdog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b4d5fbdb9..7d675781b 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -550,7 +550,7 @@ static bool need_counting_irqs(void)
 	u8 util;
 	int tail = __this_cpu_read(cpustat_tail);
 
-	tail = (tail + NUM_HARDIRQ_REPORT - 1) % NUM_HARDIRQ_REPORT;
+	tail = (tail + NUM_SAMPLE_PERIODS - 1) % NUM_SAMPLE_PERIODS;
 	util = __this_cpu_read(cpustat_util[tail][STATS_HARDIRQ]);
 	return util > HARDIRQ_PERCENT_THRESH;
 }
-- 
2.25.1

Re: [PATCH] watchdog/softlockup: Fix sample ring index wrap in need_counting_irqs()
Posted by Petr Mladek 1 week, 2 days ago
On Mon 2026-01-19 21:59:05, shengminghu512 wrote:
> From: Shengming Hu <hu.shengming@zte.com.cn>
> 
> cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
> ring buffer. need_counting_irqs() currently wraps the index using
> NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.
> 
> Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if
> the NUM_HARDIRQ_REPORT or  NUM_SAMPLE_PERIODS changes.
> 
> ---
>  kernel/watchdog.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index b4d5fbdb9..7d675781b 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -550,7 +550,7 @@ static bool need_counting_irqs(void)
>  	u8 util;
>  	int tail = __this_cpu_read(cpustat_tail);
>  
> -	tail = (tail + NUM_HARDIRQ_REPORT - 1) % NUM_HARDIRQ_REPORT;
> +	tail = (tail + NUM_SAMPLE_PERIODS - 1) % NUM_SAMPLE_PERIODS;
>  	util = __this_cpu_read(cpustat_util[tail][STATS_HARDIRQ]);
>  	return util > HARDIRQ_PERCENT_THRESH;

Great catch! It makes perfect sense.

NUM_HARDIRQ_REPORT sizes a different array, irq_counts_sorted[], which
holds the most frequent IRQs. This code was added by the same commit
that added that array, which would explain the mistake.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Andrew, I assume that you would take it...

Best Regards,
Petr

Re: [PATCH] watchdog/softlockup: Fix sample ring index wrap in need_counting_irqs()
Posted by Andrew Morton 1 week, 2 days ago
On Wed, 28 Jan 2026 18:52:11 +0100 Petr Mladek <pmladek@suse.com> wrote:

> On Mon 2026-01-19 21:59:05, shengminghu512 wrote:
> > From: Shengming Hu <hu.shengming@zte.com.cn>
> > 
> > cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
> > ring buffer. need_counting_irqs() currently wraps the index using
> > NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.
> > 
> > Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if
> > the NUM_HARDIRQ_REPORT or  NUM_SAMPLE_PERIODS changes.
> > 
> > ---
> >  kernel/watchdog.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index b4d5fbdb9..7d675781b 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -550,7 +550,7 @@ static bool need_counting_irqs(void)
> >  	u8 util;
> >  	int tail = __this_cpu_read(cpustat_tail);
> >  
> > -	tail = (tail + NUM_HARDIRQ_REPORT - 1) % NUM_HARDIRQ_REPORT;
> > +	tail = (tail + NUM_SAMPLE_PERIODS - 1) % NUM_SAMPLE_PERIODS;
> >  	util = __this_cpu_read(cpustat_util[tail][STATS_HARDIRQ]);
> >  	return util > HARDIRQ_PERCENT_THRESH;
> 
> Great catch! It makes perfect sense.
> 
> The NUM_HARDIRQ_REPORT is used for another array (irq_counts_sorted[])
> with the most frequent IRQs. This code was added with the same commit
> which added the other array. It would explain the mistake.
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>

Fixes: e9a9292e2368 ("watchdog/softlockup: Report the most frequent
interrupts"), yes?

What are the runtime effects of this?  "most frequent interrupts" data
is messed up?

I'm assuming we want to fix earlier kernels, so cc:stable?

> Andrew, I assume that you would take it...

Sure, I can queue it.  e9a9292e2368 was merged by tglx so he might want
to take it - if so I'll drop the mm.git copy if/when this appears in
linux-next.
Re: [PATCH] watchdog/softlockup: Fix sample ring index wrap in need_counting_irqs()
Posted by Petr Mladek 1 week, 1 day ago
On Wed 2026-01-28 10:13:39, Andrew Morton wrote:
> On Wed, 28 Jan 2026 18:52:11 +0100 Petr Mladek <pmladek@suse.com> wrote:
> 
> > On Mon 2026-01-19 21:59:05, shengminghu512 wrote:
> > > From: Shengming Hu <hu.shengming@zte.com.cn>
> > > 
> > > cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized
> > > ring buffer. need_counting_irqs() currently wraps the index using
> > > NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS.
> > > 
> > > Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if
> > > the NUM_HARDIRQ_REPORT or  NUM_SAMPLE_PERIODS changes.
> > > 
> > > ---
> > >  kernel/watchdog.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index b4d5fbdb9..7d675781b 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -550,7 +550,7 @@ static bool need_counting_irqs(void)
> > >  	u8 util;
> > >  	int tail = __this_cpu_read(cpustat_tail);
> > >  
> > > -	tail = (tail + NUM_HARDIRQ_REPORT - 1) % NUM_HARDIRQ_REPORT;
> > > +	tail = (tail + NUM_SAMPLE_PERIODS - 1) % NUM_SAMPLE_PERIODS;
> > >  	util = __this_cpu_read(cpustat_util[tail][STATS_HARDIRQ]);
> > >  	return util > HARDIRQ_PERCENT_THRESH;
> > 
> > Great catch! It makes perfect sense.
> > 
> > The NUM_HARDIRQ_REPORT is used for another array (irq_counts_sorted[])
> > with the most frequent IRQs. This code was added with the same commit
> > which added the other array. It would explain the mistake.
> > 
> > Reviewed-by: Petr Mladek <pmladek@suse.com>
> 
> Fixes: e9a9292e2368 ("watchdog/softlockup: Report the most frequent
> interrupts"), yes?

Yes.

> What are the runtime effects of this?  "most frequent interrupts" data
> is messed up?

It does not have any effect at the moment because both
NUM_HARDIRQ_REPORT and NUM_SAMPLE_PERIODS are defined as '5'.

It is rather a proactive fix. The old code might cause an out-of-bounds
access to cpustat_util[] if anyone increases NUM_HARDIRQ_REPORT in the
future. The two constants serve different purposes.

> I'm assuming we want to fix earlier kernels, so cc:stable?

Good point. It would be good to have in stable.

> > Andrew, I assume that you would take it...
> 
> Sure, I can queue it.  e9a9292e2368 was merged by tglx so he might want
> to take it - if so I'll drop the mm.git copy if/when this appears in
> linux-next.

Thanks for taking it.

Best Regards,
Petr