[PATCHv5 3/3] watchdog/softlockup: add SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob

Bitao Hu posted 3 patches 2 years ago
There is a newer version of this series
Posted by Bitao Hu 2 years ago
The interrupt storm detection mechanism we implemented requires a
considerable amount of global storage space when configured for
the maximum number of CPUs.
Therefore, add a SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob that
defaults to "yes" if the max number of CPUs is <= 128.

Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
---
 kernel/watchdog.c |  2 +-
 lib/Kconfig.debug | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 26dc1ad86276..1595e4a94774 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -338,7 +338,7 @@ __setup("watchdog_thresh=", watchdog_thresh_setup);
 
 static void __lockup_detector_cleanup(void);
 
-#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR_INTR_STORM
 #define NUM_STATS_GROUPS	5
 #define NUM_STATS_PER_GROUP	4
 enum stats_per_group {
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 975a07f9f1cc..74002ba7c42d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1029,6 +1029,19 @@ config SOFTLOCKUP_DETECTOR
 	  chance to run.  The current stack trace is displayed upon
 	  detection and the system will stay locked up.
 
+config SOFTLOCKUP_DETECTOR_INTR_STORM
+	bool "Detect Interrupt Storm in Soft Lockups"
+	depends on SOFTLOCKUP_DETECTOR && IRQ_TIME_ACCOUNTING
+	default y if NR_CPUS <= 128
+	help
+	  Say Y here to enable the kernel to detect interrupt storm
+	  during "soft lockups".
+
+	  "soft lockups" can be caused by a variety of reasons. If one is caused by
+	  an interrupt storm, then the storming interrupts will not be on the
+	  callstack. To detect this case, it is necessary to report the CPU stats
+	  and the interrupt counts during the "soft lockups".
+
 config BOOTPARAM_SOFTLOCKUP_PANIC
 	bool "Panic (Reboot) On Soft Lockups"
 	depends on SOFTLOCKUP_DETECTOR
-- 
2.37.1 (Apple Git-137.1)
Re: [PATCHv5 3/3] watchdog/softlockup: add SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob
Posted by Doug Anderson 2 years ago
Hi,

On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@linux.alibaba.com> wrote:
>
> The interrupt storm detection mechanism we implemented requires a
> considerable amount of global storage space when configured for
> the maximum number of CPUs.
> Therefore, adding a SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob that
> defaults to "yes" if the max number of CPUs is <= 128.
>
> Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
> ---
>  kernel/watchdog.c |  2 +-
>  lib/Kconfig.debug | 13 +++++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)

IMO this should be squashed into patch #1, though I won't insist.


> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 26dc1ad86276..1595e4a94774 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -338,7 +338,7 @@ __setup("watchdog_thresh=", watchdog_thresh_setup);
>
>  static void __lockup_detector_cleanup(void);
>
> -#ifdef CONFIG_IRQ_TIME_ACCOUNTING
> +#ifdef CONFIG_SOFTLOCKUP_DETECTOR_INTR_STORM
>  #define NUM_STATS_GROUPS       5
>  #define NUM_STATS_PER_GROUP    4
>  enum stats_per_group {
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 975a07f9f1cc..74002ba7c42d 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1029,6 +1029,19 @@ config SOFTLOCKUP_DETECTOR
>           chance to run.  The current stack trace is displayed upon
>           detection and the system will stay locked up.
>
> +config SOFTLOCKUP_DETECTOR_INTR_STORM
> +       bool "Detect Interrupt Storm in Soft Lockups"
> +       depends on SOFTLOCKUP_DETECTOR && IRQ_TIME_ACCOUNTING
> +       default y if NR_CPUS <= 128
> +       help
> +         Say Y here to enable the kernel to detect interrupt storm
> +         during "soft lockups".
> +
> +         "soft lockups" can be caused by a variety of reasons. If one is caused by
> +         an interrupt storm, then the storming interrupts will not be on the
> +         callstack. To detect this case, it is necessary to report the CPU stats
> +         and the interrupt counts during the "soft lockups".

It's probably not terribly important, but I notice that the other help
text in this file is generally wrapped to 80 columns. Even though the
kernel has relaxed the 80 column rule a bit, it still feels like this
could easily be wrapped to 80 columns without sacrificing any
readability.
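
For illustration only, the help text wrapped to 80 columns might look
roughly like the sketch below. It keeps the wording from the patch
unchanged and only moves the line breaks. It assumes the usual Kconfig
help indentation of one tab plus two spaces, with the tab counted as
8 columns.

```
config SOFTLOCKUP_DETECTOR_INTR_STORM
	bool "Detect Interrupt Storm in Soft Lockups"
	depends on SOFTLOCKUP_DETECTOR && IRQ_TIME_ACCOUNTING
	default y if NR_CPUS <= 128
	help
	  Say Y here to enable the kernel to detect interrupt storm
	  during "soft lockups".

	  "soft lockups" can be caused by a variety of reasons. If one is
	  caused by an interrupt storm, then the storming interrupts will not
	  be on the callstack. To detect this case, it is necessary to report
	  the CPU stats and the interrupt counts during the "soft lockups".
```

With that indentation, the longest line above ends at column 77.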

In any case:

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Re: [PATCHv5 3/3] watchdog/softlockup: add SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob
Posted by Bitao Hu 2 years ago

On 2024/2/7 05:42, Doug Anderson wrote:
> Hi,
> 
> On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@linux.alibaba.com> wrote:
>>
>> The interrupt storm detection mechanism we implemented requires a
>> considerable amount of global storage space when configured for
>> the maximum number of CPUs.
>> Therefore, adding a SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob that
>> defaults to "yes" if the max number of CPUs is <= 128.
>>
>> Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
>> ---
>>   kernel/watchdog.c |  2 +-
>>   lib/Kconfig.debug | 13 +++++++++++++
>>   2 files changed, 14 insertions(+), 1 deletion(-)
> 
> IMO this should be squashed into patch #1, though I won't insist.
Agreed.
> 
> 
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 26dc1ad86276..1595e4a94774 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -338,7 +338,7 @@ __setup("watchdog_thresh=", watchdog_thresh_setup);
>>
>>   static void __lockup_detector_cleanup(void);
>>
>> -#ifdef CONFIG_IRQ_TIME_ACCOUNTING
>> +#ifdef CONFIG_SOFTLOCKUP_DETECTOR_INTR_STORM
>>   #define NUM_STATS_GROUPS       5
>>   #define NUM_STATS_PER_GROUP    4
>>   enum stats_per_group {
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 975a07f9f1cc..74002ba7c42d 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -1029,6 +1029,19 @@ config SOFTLOCKUP_DETECTOR
>>            chance to run.  The current stack trace is displayed upon
>>            detection and the system will stay locked up.
>>
>> +config SOFTLOCKUP_DETECTOR_INTR_STORM
>> +       bool "Detect Interrupt Storm in Soft Lockups"
>> +       depends on SOFTLOCKUP_DETECTOR && IRQ_TIME_ACCOUNTING
>> +       default y if NR_CPUS <= 128
>> +       help
>> +         Say Y here to enable the kernel to detect interrupt storm
>> +         during "soft lockups".
>> +
>> +         "soft lockups" can be caused by a variety of reasons. If one is caused by
>> +         an interrupt storm, then the storming interrupts will not be on the
>> +         callstack. To detect this case, it is necessary to report the CPU stats
>> +         and the interrupt counts during the "soft lockups".
> 
> It's probably not terribly important, but I notice that the other help
> text in this file is generally wrapped to 80 columns. Even though the
> kernel has relaxed the 80 column rule a bit, it still feels like this
> could easily be wrapped to 80 columns without sacrificing any
> readability.
OK.
> 
> In any case:
> 
> Reviewed-by: Douglas Anderson <dianders@chromium.org>