[PATCH] watchdog: Fix watchdog may detect false positive of softlockup

Luo Gengkun posted 1 patch 8 months ago
There is a newer version of this series
kernel/watchdog.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
[PATCH] watchdog: Fix watchdog may detect false positive of softlockup
Posted by Luo Gengkun 8 months ago
The watchdog may dectect false positive of softlockup because of stop
softlockup after update watchdog_thresh. The problem can be described as
follow:

 # We asuume previous watchdog_thresh is 60, so the timer is coming every
 # 24s.
echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
|
+------>+ update watchdog_thresh (We are in kernel now)
	|
	|
	+------>+ watchdog hrtimer (irq context: detect softlockup)
		|
		|
	+-------+
	|
	|
	+ softlockup_stop_all

As showed above, there is a window between update watchdog_thresh and
softlockup_stop_all. During this window, if a timer is coming, a false
positive of softlockup will happen. To fix this problem, use a shadow
variable to store the new value and write back to watchdog_thresh after
softlockup_stop_all.

Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
---
 kernel/watchdog.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 9fa2af9dbf2c..80e5a77e92fd 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -47,6 +47,7 @@ int __read_mostly watchdog_user_enabled = 1;
 static int __read_mostly watchdog_hardlockup_user_enabled = WATCHDOG_HARDLOCKUP_DEFAULT;
 static int __read_mostly watchdog_softlockup_user_enabled = 1;
 int __read_mostly watchdog_thresh = 10;
+static int __read_mostly watchdog_thresh_shadow;
 static int __read_mostly watchdog_hardlockup_available;
 
 struct cpumask watchdog_cpumask __read_mostly;
@@ -876,6 +877,7 @@ static void __lockup_detector_reconfigure(void)
 	watchdog_hardlockup_stop();
 
 	softlockup_stop_all();
+	watchdog_thresh = READ_ONCE(watchdog_thresh_shadow);
 	set_sample_period();
 	lockup_detector_update_enable();
 	if (watchdog_enabled && watchdog_thresh)
@@ -1035,10 +1037,12 @@ static int proc_watchdog_thresh(const struct ctl_table *table, int write,
 
 	mutex_lock(&watchdog_mutex);
 
-	old = READ_ONCE(watchdog_thresh);
+	watchdog_thresh_shadow = READ_ONCE(watchdog_thresh);
+
+	old = watchdog_thresh_shadow;
 	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 
-	if (!err && write && old != READ_ONCE(watchdog_thresh))
+	if (!err && write && old != READ_ONCE(watchdog_thresh_shadow))
 		proc_watchdog_update();
 
 	mutex_unlock(&watchdog_mutex);
@@ -1080,7 +1084,7 @@ static const struct ctl_table watchdog_sysctls[] = {
 	},
 	{
 		.procname	= "watchdog_thresh",
-		.data		= &watchdog_thresh,
+		.data		= &watchdog_thresh_shadow,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_watchdog_thresh,
-- 
2.34.1
Re: [PATCH] watchdog: Fix watchdog may detect false positive of softlockup
Posted by Andrew Morton 8 months ago
On Wed, 16 Apr 2025 01:39:22 +0000 Luo Gengkun <luogengkun@huaweicloud.com> wrote:

> The watchdog may dectect false positive of softlockup because of stop
> softlockup after update watchdog_thresh. The problem can be described as
> follow:
> 
>  # We asuume previous watchdog_thresh is 60, so the timer is coming every
>  # 24s.
> echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
> |
> +------>+ update watchdog_thresh (We are in kernel now)
> 	|
> 	|
> 	+------>+ watchdog hrtimer (irq context: detect softlockup)
> 		|
> 		|
> 	+-------+
> 	|
> 	|
> 	+ softlockup_stop_all
> 
> As showed above, there is a window between update watchdog_thresh and
> softlockup_stop_all. During this window, if a timer is coming, a false
> positive of softlockup will happen. To fix this problem, use a shadow
> variable to store the new value and write back to watchdog_thresh after
> softlockup_stop_all.
> 

Changelog is a bit hard to follow.  I asked gemini.google.com to clean
it up and it produced this:

: The watchdog may detect a false positive softlockup due to stopping the
: softlockup detection after updating `watchdog_thresh`.  The problem can
: be described as follows:
: 
: ```
: # Assume the previous watchdog_thresh is 60, so the timer triggers every 24 seconds.
: echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
: |
: +------>+ Update watchdog_thresh (Kernel space)
: 	|
: 	|
: 	+------>+ Watchdog hrtimer (irq context: detect softlockup)
: 		|
: 		|
: 	+-------+
: 	|
: 	|
: 	+ softlockup_stop_all
: ```
: 
: As shown above, there is a window between updating `watchdog_thresh`
: and `softlockup_stop_all`.  During this window, if a timer triggers, a
: false positive softlockup can occur.  To fix this problem, a shadow
: variable should be used to store the new value, and this value should
: be written back to `watchdog_thresh` only after `softlockup_stop_all`
: has completed.  

I don't know how accurate this is - please check&fix it and consider
incorporating the result?

> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -47,6 +47,7 @@ int __read_mostly watchdog_user_enabled = 1;
>  static int __read_mostly watchdog_hardlockup_user_enabled = WATCHDOG_HARDLOCKUP_DEFAULT;
>  static int __read_mostly watchdog_softlockup_user_enabled = 1;
>  int __read_mostly watchdog_thresh = 10;
> +static int __read_mostly watchdog_thresh_shadow;
>  static int __read_mostly watchdog_hardlockup_available;
>  
>  struct cpumask watchdog_cpumask __read_mostly;
> @@ -876,6 +877,7 @@ static void __lockup_detector_reconfigure(void)
>  	watchdog_hardlockup_stop();
>  
>  	softlockup_stop_all();
> +	watchdog_thresh = READ_ONCE(watchdog_thresh_shadow);

I expect a reader of this code will wonder "what's all this
watchdog_thresh_shadow stuff".  Can you please add a few small comments
to explain why we're doing this?
Re: [PATCH] watchdog: Fix watchdog may detect false positive of softlockup
Posted by Luo Gengkun 8 months ago
On 2025/4/16 10:31, Andrew Morton wrote:
> On Wed, 16 Apr 2025 01:39:22 +0000 Luo Gengkun <luogengkun@huaweicloud.com> wrote:
>
>> The watchdog may dectect false positive of softlockup because of stop
>> softlockup after update watchdog_thresh. The problem can be described as
>> follow:
>>
>>   # We asuume previous watchdog_thresh is 60, so the timer is coming every
>>   # 24s.
>> echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
>> |
>> +------>+ update watchdog_thresh (We are in kernel now)
>> 	|
>> 	|
>> 	+------>+ watchdog hrtimer (irq context: detect softlockup)
>> 		|
>> 		|
>> 	+-------+
>> 	|
>> 	|
>> 	+ softlockup_stop_all
>>
>> As showed above, there is a window between update watchdog_thresh and
>> softlockup_stop_all. During this window, if a timer is coming, a false
>> positive of softlockup will happen. To fix this problem, use a shadow
>> variable to store the new value and write back to watchdog_thresh after
>> softlockup_stop_all.
>>
> Changelog is a bit hard to follow.  I asked gemini.google.com to clean
> it up and it produced this:
>
> : The watchdog may detect a false positive softlockup due to stopping the
> : softlockup detection after updating `watchdog_thresh`.  The problem can
> : be described as follows:
> :
> : ```
> : # Assume the previous watchdog_thresh is 60, so the timer triggers every 24 seconds.
> : echo 10 > /proc/sys/kernel/watchdog_thresh (User space)
> : |
> : +------>+ Update watchdog_thresh (Kernel space)
> : 	|
> : 	|
> : 	+------>+ Watchdog hrtimer (irq context: detect softlockup)
> : 		|
> : 		|
> : 	+-------+
> : 	|
> : 	|
> : 	+ softlockup_stop_all
> : ```
> :
> : As shown above, there is a window between updating `watchdog_thresh`
> : and `softlockup_stop_all`.  During this window, if a timer triggers, a
> : false positive softlockup can occur.  To fix this problem, a shadow
> : variable should be used to store the new value, and this value should
> : be written back to `watchdog_thresh` only after `softlockup_stop_all`
> : has completed.
>
> I don't know how accurate this is - please check&fix it and consider
> incorporating the result?
>
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -47,6 +47,7 @@ int __read_mostly watchdog_user_enabled = 1;
>>   static int __read_mostly watchdog_hardlockup_user_enabled = WATCHDOG_HARDLOCKUP_DEFAULT;
>>   static int __read_mostly watchdog_softlockup_user_enabled = 1;
>>   int __read_mostly watchdog_thresh = 10;
>> +static int __read_mostly watchdog_thresh_shadow;
>>   static int __read_mostly watchdog_hardlockup_available;
>>   
>>   struct cpumask watchdog_cpumask __read_mostly;
>> @@ -876,6 +877,7 @@ static void __lockup_detector_reconfigure(void)
>>   	watchdog_hardlockup_stop();
>>   
>>   	softlockup_stop_all();
>> +	watchdog_thresh = READ_ONCE(watchdog_thresh_shadow);
> I expect a reader of this code will wonder "what's all this
> watchdog_thresh_shadow stuff".  Can you please add a few small comments
> to explain why we're doing this?

The following testcase may help to understand this problem.
---------------------------------------------
echo RT_RUNTIME_SHARE > /sys/kernel/debug/sched/features
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
echo 0 > /sys/kernel/debug/sched/fair_server/cpu3/runtime
echo 60 > /proc/sys/kernel/watchdog_thresh
taskset -c 3 chrt -r 99 ./test &
echo 10 > /proc/sys/kernel/watchdog_thresh &
---------------------------------------------
This testcase first cancels the throttle rt task. Change the value of
watchdog_thresh to 60, and run a rt task on cpu3. The final command will
be blocked because the kworker:3, smp_call_on_cpu will put work on it,
cannot be selected by the scheduler due to the existence of real-time
threads. A softlockup will be detected on cpu3 after a while. The
reason is the watchdog_thresh is changed to 10, but the watchdog_timer_fn
still runs every 24 seconds, because the kworker:3
has not chance to stop previous hrtimer on cpu3.

The above case indicates that updating watchdog_thresh before doing
softlockup_stop_all is insecure, although there is not rt task on cpu3,
once the timer irq arrives, a false positive of softlockup is reported.

What this patch does, is first update the new thresh to
watchdog_thresh_shadow, which acts like a temporary variable, and write
back to watchdog_thresh after softlockup_stop_all is finished. By this way,
no more false positive case will be reported.

>
Re: [PATCH] watchdog: Fix watchdog may detect false positive of softlockup
Posted by Andrew Morton 8 months ago
On Wed, 16 Apr 2025 14:44:52 +0800 Luo Gengkun <luogengkun@huaweicloud.com> wrote:

> The following testcase may help to understand this problem.

OK.  Please

a) include this info in the v2 changelog

b) consider whether you believe the fix should be backported into
   -stable kernels and if so, add a Cc: <stable@vger.kernel.org> and,
   if appropriate, add a Fixes: target

c) Place a couple of comments in the code to help readers understand
   why we're doing this

d) try to amke the changelog a little less puzzling ;)

e) retest and resend!

Thanks.