[PATCH] watchdog/hardlockup: set watchdog_hardlockup_warned to true as early as possible

Liu Song posted 1 patch 2 years, 1 month ago
[PATCH] watchdog/hardlockup: set watchdog_hardlockup_warned to true as early as possible
Posted by Liu Song 2 years, 1 month ago
Since we want to ensure that hardlockups are printed only once, it is
necessary to set "watchdog_hardlockup_warned" to true as early as possible.

Signed-off-by: Liu Song <liusong@linux.alibaba.com>
---
 kernel/watchdog.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 25d5627a6580..c4795f2d148c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -180,6 +180,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 		/* Only print hardlockups once. */
 		if (per_cpu(watchdog_hardlockup_warned, cpu))
 			return;
+		else
+			per_cpu(watchdog_hardlockup_warned, cpu) = true;
 
 		pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n", cpu);
 		print_modules();
@@ -206,8 +208,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 
 		if (hardlockup_panic)
 			nmi_panic(regs, "Hard LOCKUP");
-
-		per_cpu(watchdog_hardlockup_warned, cpu) = true;
 	} else {
 		per_cpu(watchdog_hardlockup_warned, cpu) = false;
 	}
-- 
2.19.1.6.gb485710b
Re: [PATCH] watchdog/hardlockup: set watchdog_hardlockup_warned to true as early as possible
Posted by Andrew Morton 2 years, 1 month ago
On Sun,  6 Aug 2023 00:01:44 +0800 Liu Song <liusong@linux.alibaba.com> wrote:

> Since we want to ensure that hardlockups are printed only once, it is
> necessary to set "watchdog_hardlockup_warned" to true as early as possible.
> 
> ...
>
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -180,6 +180,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>  		/* Only print hardlockups once. */
>  		if (per_cpu(watchdog_hardlockup_warned, cpu))
>  			return;
> +		else
> +			per_cpu(watchdog_hardlockup_warned, cpu) = true;

The "else" is unneeded.
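
For illustration only (this is not part of the posted patch): a minimal
user-space sketch of the warn-once check without the "else". The array
`warned`, the counter `reports`, the constant NR_CPUS_SKETCH and the function
report_lockup_once() are hypothetical stand-ins for the per-CPU flag and the
reporting code in kernel/watchdog.c.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU watchdog_hardlockup_warned flag. */
static bool warned[NR_CPUS_SKETCH];
/* Counts emitted reports so the behavior can be checked. */
static int reports;

/* Hypothetical simplification of the lockup branch of the check. */
static void report_lockup_once(unsigned int cpu)
{
	/* Only print hardlockups once. */
	if (warned[cpu])
		return;
	/* No "else" needed: the return above has already left the function. */
	warned[cpu] = true;

	printf("Watchdog detected hard LOCKUP on cpu %u\n", cpu);
	reports++;
}
```

Calling report_lockup_once() twice for the same CPU emits a single report,
since the early return already guards the assignment that follows it.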
  
>  		pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n", cpu);
>  		print_modules();
> @@ -206,8 +208,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>  
>  		if (hardlockup_panic)
>  			nmi_panic(regs, "Hard LOCKUP");
> -
> -		per_cpu(watchdog_hardlockup_warned, cpu) = true;
>  	} else {
>  		per_cpu(watchdog_hardlockup_warned, cpu) = false;
>  	}

When resending, please tell us some more about the effects of the
change.  Presumably there are circumstances in which excess output is
produced?  If so, describe these circumstances and the observed
effects.
Re: [PATCH] watchdog/hardlockup: set watchdog_hardlockup_warned to true as early as possible
Posted by Liu Song 2 years, 1 month ago
On 2023/8/6 01:17, Andrew Morton wrote:
> When resending, please tell us some more about the effects of the
> change.  Presumably there are circumstances in which excess output is
> produced?  If so, describe these circumstances and the observed
> effects.

Hi,

I haven't found duplicate warnings in a real environment.

However, since the system is already in an abnormal state when a hard
lockup occurs, it seems more reasonable to set "watchdog_hardlockup_warned"
to true right away, rather than waiting for all kinds of information to be
printed.


Thanks

Re: [PATCH] watchdog/hardlockup: set watchdog_hardlockup_warned to true as early as possible
Posted by Petr Mladek 2 years, 1 month ago
On Sun 2023-08-06 10:52:57, Liu Song wrote:
> 
> On 2023/8/6 01:17, Andrew Morton wrote:
> > When resending, please tell us some more about the effects of the
> > change.  Presumably there are circumstances in which excess output is
> > produced?  If so, describe these circumstances and the observed
> > effects.
> 
> Hi,
> 
> I haven't found duplicate warnings in a real environment.
> 
> However, since the system is already in an abnormal state when a hard
> lockup occurs, it seems more reasonable to set "watchdog_hardlockup_warned"
> to true right away, rather than waiting for all kinds of information to be
> printed.

I believe that this is not needed.

watchdog_hardlockup_check(cpu, regs) is called on a CPU periodically.
There are two callers:

   + buddy detector checks the particular CPU when the softlockup's
     hrtimer callback is called. See watchdog_hardlockup_kick()
     in watchdog_timer_fn().

   + perf detector checks the particular CPU from a perf callback,
     see watchdog_overflow_callback().

Neither timer nor perf callbacks can be nested. They are naturally
serialized on a given CPU. So, races are not possible in this case.
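
To make the serialization argument concrete, here is a small user-space
model (a sketch under assumptions, not kernel code). It assumes each check
runs to completion before the next one starts on the same CPU, and it does
not model the hardlockup_panic path. The names cpu_state, flag_order,
check_once() and count_reports() are hypothetical.

```c
#include <stdbool.h>

enum flag_order { SET_EARLY, SET_LATE };

struct cpu_state {
	bool warned;	/* stand-in for watchdog_hardlockup_warned */
	int reports;	/* number of lockup reports emitted */
};

/* One non-nested check, modeled on the if/else structure in the patch. */
static void check_once(struct cpu_state *s, bool locked_up,
		       enum flag_order order)
{
	if (!locked_up) {
		s->warned = false;
		return;
	}

	/* Only report once per lockup. */
	if (s->warned)
		return;

	if (order == SET_EARLY)
		s->warned = true;

	s->reports++;	/* stands in for pr_emerg(), print_modules(), ... */

	if (order == SET_LATE)
		s->warned = true;
}

/* Run the checks back to back, as serialized callbacks would. */
static int count_reports(const bool *samples, int n, enum flag_order order)
{
	struct cpu_state s = { false, 0 };

	for (int i = 0; i < n; i++)
		check_once(&s, samples[i], order);
	return s.reports;
}
```

In this model both orderings give the same report count for any sequence of
samples, because nothing can observe the flag between the early and the late
assignment.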

Best Regards,
Petr