[PATCH] watchdog: when watchdog_enabled is 0, let (soft,nmi)switch remain 1 after we read them in proc

Tio Zhang posted 1 patch 1 year, 5 months ago
kernel/watchdog.c | 10 ++++++++++
1 file changed, 10 insertions(+)
[PATCH] watchdog: when watchdog_enabled is 0, let (soft,nmi)switch remain 1 after we read them in proc
Posted by Tio Zhang 1 year, 5 months ago
For users set "watchdog_user_enabled=0" but remaining
"(soft,nmi)watchdog_user_enabled=1". Watchdog threads(,nmi watchdog)
rework only if users reset "watchdog_user_enabled=1" without printing
those watchdog swicthes. Otherwise (soft,nmi)watchdog_user_enabled
will turn to 0 because of printing their values (It makes sense to print 0
since they do not work any more, but it does not make sense to let user's
swicthes change to 0 only by prints).

And after that, watchdog only should work again by doing:
(soft,nmi)watchdog_user_enabled=1
*** can't print, or everything go back to 0 again ***
watchdog_user_enabled=1

So this patch fixes this situation:

| name                       | value
|----------------------------|--------------------------
| watchdog_enabled           | 0
|----------------------------|--------------------------
| nmi_watchdog_user_enabled  | 1
|----------------------------|--------------------------
| soft_watchdog_user_enabled | 1
|----------------------------|--------------------------
| watchdog_user_enabled      | 0
|----------------------------|--------------------------
    
            cat /proc/sys/kernel/*watchdog
                          |
                          |
                          V
| name                       | value
|----------------------------|--------------------------
| watchdog_enabled           | 0
|----------------------------|--------------------------
| nmi_watchdog_user_enabled  | 0
|----------------------------|--------------------------
| soft_watchdog_user_enabled | 0
|----------------------------|--------------------------
| watchdog_user_enabled      | 0
|----------------------------|--------------------------


Signed-off-by: Tio Zhang <tiozhang@didiglobal.com>
---
 kernel/watchdog.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 51915b44ac73..42e69e83e76d 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -995,8 +995,18 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
 		 * On read synchronize the userspace interface. This is a
 		 * racy snapshot.
 		 */
+		old = READ_ONCE(*param);
 		*param = (watchdog_enabled & which) != 0;
 		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+		/*
+		 * When "old" is 1 and watchdog_enabled is 0,
+		 * it should not be change to 0 for printing
+		 * nmi_watchdog_user_enabled or soft_watchdog_user_enabled.
+		 * So after we print it as 0,
+		 * we should recover it to 1.
+		 */
+		if (old && !watchdog_enabled)
+			*param = old;
 	} else {
 		old = READ_ONCE(*param);
 		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-- 
2.17.1
Re: [PATCH] watchdog: when watchdog_enabled is 0, let (soft,nmi)switch remain 1 after we read them in proc
Posted by 张元瀚 Tio Zhang 1 year, 5 months ago
Hi, 

Ping :)

// get_maintainer.pl does not tell who are the maintainers or reviewers of kernel/watchdog.c
Add commit signers and sched maintainers to the CC list.

On 8/22/24 下午3:09, "张元瀚 Tio Zhang" <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>> wrote:


For users set "watchdog_user_enabled=0" but remaining
"(soft,nmi)watchdog_user_enabled=1". Watchdog threads(,nmi watchdog)
rework only if users reset "watchdog_user_enabled=1" without printing
those watchdog swicthes. Otherwise (soft,nmi)watchdog_user_enabled
will turn to 0 because of printing their values (It makes sense to print 0
since they do not work any more, but it does not make sense to let user's
swicthes change to 0 only by prints).


And after that, watchdog only should work again by doing:
(soft,nmi)watchdog_user_enabled=1
*** can't print, or everything go back to 0 again ***
watchdog_user_enabled=1


So this patch fixes this situation:


| name | value
|----------------------------|--------------------------
| watchdog_enabled | 0
|----------------------------|--------------------------
| nmi_watchdog_user_enabled | 1
|----------------------------|--------------------------
| soft_watchdog_user_enabled | 1
|----------------------------|--------------------------
| watchdog_user_enabled | 0
|----------------------------|--------------------------


cat /proc/sys/kernel/*watchdog
|
|
V
| name | value
|----------------------------|--------------------------
| watchdog_enabled | 0
|----------------------------|--------------------------
| nmi_watchdog_user_enabled | 0
|----------------------------|--------------------------
| soft_watchdog_user_enabled | 0
|----------------------------|--------------------------
| watchdog_user_enabled | 0
|----------------------------|--------------------------




Signed-off-by: Tio Zhang <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>>
---
kernel/watchdog.c | 10 ++++++++++
1 file changed, 10 insertions(+)


diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 51915b44ac73..42e69e83e76d 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -995,8 +995,18 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
* On read synchronize the userspace interface. This is a
* racy snapshot.
*/
+ old = READ_ONCE(*param);
*param = (watchdog_enabled & which) != 0;
err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ /*
+ * When "old" is 1 and watchdog_enabled is 0,
+ * it should not be change to 0 for printing
+ * nmi_watchdog_user_enabled or soft_watchdog_user_enabled.
+ * So after we print it as 0,
+ * we should recover it to 1.
+ */
+ if (old && !watchdog_enabled)
+ *param = old;
} else {
old = READ_ONCE(*param);
err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-- 
2.17.1





Re: [PATCH] watchdog: when watchdog_enabled is 0, let (soft,nmi)switch remain 1 after we read them in proc
Posted by Doug Anderson 1 year, 5 months ago
Hi,

On Wed, Sep 4, 2024 at 2:23 AM 张元瀚 Tio Zhang <tiozhang@didiglobal.com> wrote:
>
> Hi,
>
> Ping :)
>
> // get_maintainer.pl does not tell who are the maintainers or reviewers of kernel/watchdog.c
> Add commit signers and sched maintainers to the CC list.

FWIW, people in the Linux community usually don't like "top posting"...

In any case, I saw your original patch but I struggled to make sense
of the explanation and so I left it for later and then never got back
to it. I suspect that the explanation needs to be a little clearer.


> On 8/22/24 下午3:09, "张元瀚 Tio Zhang" <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>> wrote:
>
>
> For users set "watchdog_user_enabled=0" but remaining
> "(soft,nmi)watchdog_user_enabled=1". Watchdog threads(,nmi watchdog)
> rework only if users reset "watchdog_user_enabled=1" without printing
> those watchdog swicthes. Otherwise (soft,nmi)watchdog_user_enabled
> will turn to 0 because of printing their values (It makes sense to print 0
> since they do not work any more, but it does not make sense to let user's
> swicthes change to 0 only by prints).
>
>
> And after that, watchdog only should work again by doing:
> (soft,nmi)watchdog_user_enabled=1
> *** can't print, or everything go back to 0 again ***
> watchdog_user_enabled=1
>
>
> So this patch fixes this situation:
>
>
> | name | value
> |----------------------------|--------------------------
> | watchdog_enabled | 0
> |----------------------------|--------------------------
> | nmi_watchdog_user_enabled | 1
> |----------------------------|--------------------------
> | soft_watchdog_user_enabled | 1
> |----------------------------|--------------------------
> | watchdog_user_enabled | 0
> |----------------------------|--------------------------
>
>
> cat /proc/sys/kernel/*watchdog
> |
> |
> V
> | name | value
> |----------------------------|--------------------------
> | watchdog_enabled | 0
> |----------------------------|--------------------------
> | nmi_watchdog_user_enabled | 0
> |----------------------------|--------------------------
> | soft_watchdog_user_enabled | 0
> |----------------------------|--------------------------
> | watchdog_user_enabled | 0
> |----------------------------|--------------------------
>
>
>
>
> Signed-off-by: Tio Zhang <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>>
> ---
> kernel/watchdog.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 51915b44ac73..42e69e83e76d 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -995,8 +995,18 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
> * On read synchronize the userspace interface. This is a
> * racy snapshot.
> */
> + old = READ_ONCE(*param);

Given that the first thing that both the "if" and "else" case do now
is to set "old" then it feels like it could move outside the "if"
statement.

Even though the existing code does it, I'm also a little doubtful that
a READ_ONCE() is really needed here. You're protected by a mutex so
you don't need to worry about other CPUs and it would be really
confusing to me if the compiler could optimize things in a way that
the READ_ONCE() was needed.


> *param = (watchdog_enabled & which) != 0;
> err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> + /*
> + * When "old" is 1 and watchdog_enabled is 0,
> + * it should not be change to 0 for printing
> + * nmi_watchdog_user_enabled or soft_watchdog_user_enabled.
> + * So after we print it as 0,
> + * we should recover it to 1.
> + */
> + if (old && !watchdog_enabled)
> + *param = old;

I'm confused. Why all this extra logic? This should just
_unconditionally_ restore "*param" to "old", right? The only reason
the code hacked "*param" in the first place was so that it could use
the common proc_dointvec_minmax() helper function. Aside from the
common helper function working there is never a reason to muck with
"*param" if "!write" so it should just always restore.

-Doug
Re: [PATCH] watchdog: when watchdog_enabled is 0, let (soft,nmi)switch remain 1 after we read them in proc
Posted by Yuanhan Zhang 1 year, 5 months ago
Hi Doug,

Doug Anderson <dianders@chromium.org> 于2024年9月5日周四 03:39写道:
>
> Hi,
>
> On Wed, Sep 4, 2024 at 2:23 AM 张元瀚 Tio Zhang <tiozhang@didiglobal.com> wrote:
> >
> > Hi,
> >
> > Ping :)
> >
> > // get_maintainer.pl does not tell who are the maintainers or reviewers of kernel/watchdog.c
> > Add commit signers and sched maintainers to the CC list.
>
> FWIW, people in the Linux community usually don't like "top posting"...
>
> In any case, I saw your original patch but I struggled to make sense
> of the explanation and so I left it for later and then never got back
> to it. I suspect that the explanation needs to be a little clearer.

Thanks for your time reviewing and sorry for my poor explanation,
I update the explanation with an example to reproduce the issue.
Please see [patch v2]

>
>
> > On 8/22/24 下午3:09, "张元瀚 Tio Zhang" <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>> wrote:
> >
> >
> > For users set "watchdog_user_enabled=0" but remaining
> > "(soft,nmi)watchdog_user_enabled=1". Watchdog threads(,nmi watchdog)
> > rework only if users reset "watchdog_user_enabled=1" without printing
> > those watchdog swicthes. Otherwise (soft,nmi)watchdog_user_enabled
> > will turn to 0 because of printing their values (It makes sense to print 0
> > since they do not work any more, but it does not make sense to let user's
> > swicthes change to 0 only by prints).
> >
> >
> > And after that, watchdog only should work again by doing:
> > (soft,nmi)watchdog_user_enabled=1
> > *** can't print, or everything go back to 0 again ***
> > watchdog_user_enabled=1
> >
> >
> > So this patch fixes this situation:
> >
> >
> > | name | value
> > |----------------------------|--------------------------
> > | watchdog_enabled | 0
> > |----------------------------|--------------------------
> > | nmi_watchdog_user_enabled | 1
> > |----------------------------|--------------------------
> > | soft_watchdog_user_enabled | 1
> > |----------------------------|--------------------------
> > | watchdog_user_enabled | 0
> > |----------------------------|--------------------------
> >
> >
> > cat /proc/sys/kernel/*watchdog
> > |
> > |
> > V
> > | name | value
> > |----------------------------|--------------------------
> > | watchdog_enabled | 0
> > |----------------------------|--------------------------
> > | nmi_watchdog_user_enabled | 0
> > |----------------------------|--------------------------
> > | soft_watchdog_user_enabled | 0
> > |----------------------------|--------------------------
> > | watchdog_user_enabled | 0
> > |----------------------------|--------------------------
> >
> >
> >
> >
> > Signed-off-by: Tio Zhang <tiozhang@didiglobal.com <mailto:tiozhang@didiglobal.com>>
> > ---
> > kernel/watchdog.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> >
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index 51915b44ac73..42e69e83e76d 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -995,8 +995,18 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
> > * On read synchronize the userspace interface. This is a
> > * racy snapshot.
> > */
> > + old = READ_ONCE(*param);
>
> Given that the first thing that both the "if" and "else" case do now
> is to set "old" then it feels like it could move outside the "if"
> statement.

Done in [patch v2]

>
> Even though the existing code does it, I'm also a little doubtful that
> a READ_ONCE() is really needed here. You're protected by a mutex so
> you don't need to worry about other CPUs and it would be really
> confusing to me if the compiler could optimize things in a way that
> the READ_ONCE() was needed.

I'm also confused, the only reason I keep 'READ_ONCE' is to align with
existing code. I remove 'READ_ONCE' in [patch v2].

>
>
> > *param = (watchdog_enabled & which) != 0;
> > err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > + /*
> > + * When "old" is 1 and watchdog_enabled is 0,
> > + * it should not be change to 0 for printing
> > + * nmi_watchdog_user_enabled or soft_watchdog_user_enabled.
> > + * So after we print it as 0,
> > + * we should recover it to 1.
> > + */
> > + if (old && !watchdog_enabled)
> > + *param = old;
>
> I'm confused. Why all this extra logic? This should just
> _unconditionally_ restore "*param" to "old", right? The only reason
> the code hacked "*param" in the first place was so that it could use
> the common proc_dointvec_minmax() helper function. Aside from the
> common helper function working there is never a reason to muck with
> "*param" if "!write" so it should just always restore.

Yes actually the if statement is useless, my original purpose writing this
is trying to make the explanation more clear...
Remove in [patch v2]

>
> -Doug

Thanks,
[PATCH v2 1/1] kernel/watchdog: always restore watchdog_softlockup(,hardlockup)_user_enabled after proc show
Posted by Tio Zhang 1 year, 5 months ago
Otherwise when watchdog_enabled becomes 0,
watchdog_softlockup(,hardlockup)_user_enabled will changes to 0 after
proc show.

Steps to reproduce:

  step 1:
  # cat /proc/sys/kernel/*watchdog
  1
  1
  1

  | name                             | value
  |----------------------------------|--------------------------
  | watchdog_enabled                 | 1
  |----------------------------------|--------------------------
  | watchdog_hardlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_softlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_user_enabled            | 1
  |----------------------------------|--------------------------

  step 2:
  # echo 0 > /proc/sys/kernel/watchdog

  | name                             | value
  |----------------------------------|--------------------------
  | watchdog_enabled                 | 0
  |----------------------------------|--------------------------
  | watchdog_hardlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_softlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_user_enabled            | 0
  |----------------------------------|--------------------------

  step 3:
  # cat /proc/sys/kernel/*watchdog
  0
  0
  0

  | name                             | value
  |----------------------------------|--------------------------
  | watchdog_enabled                 | 0
  |----------------------------------|--------------------------
  | watchdog_hardlockup_user_enabled | 0
  |----------------------------------|--------------------------
  | watchdog_softlockup_user_enabled | 0
  |----------------------------------|--------------------------
  | watchdog_user_enabled            | 0
  |----------------------------------|--------------------------

  step 4:
  # echo 1 > /proc/sys/kernel/watchdog

  | name                             | value
  |----------------------------------|--------------------------
  | watchdog_enabled                 | 0
  |----------------------------------|--------------------------
  | watchdog_hardlockup_user_enabled | 0
  |----------------------------------|--------------------------
  | watchdog_softlockup_user_enabled | 0
  |----------------------------------|--------------------------
  | watchdog_user_enabled            | 0
  |----------------------------------|--------------------------

  step 5:
  # cat /proc/sys/kernel/*watchdog
  0
  0
  0

If we dont do "step 3", do "step 4" right after "step 2", it will be

  | name                             | value
  |----------------------------------|--------------------------
  | watchdog_enabled                 | 1
  |----------------------------------|--------------------------
  | watchdog_hardlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_softlockup_user_enabled | 1
  |----------------------------------|--------------------------
  | watchdog_user_enabled            | 1
  |----------------------------------|--------------------------

then everything works correctly.

So this patch fix "step 3"'s value into

| name                             | value
|----------------------------------|--------------------------
| watchdog_enabled                 | 0
|----------------------------------|--------------------------
| watchdog_hardlockup_user_enabled | 1
|----------------------------------|--------------------------
| watchdog_softlockup_user_enabled | 1
|----------------------------------|--------------------------
| watchdog_user_enabled            | 0
|----------------------------------|--------------------------

And still print 0 as before.

Signed-off-by: Tio Zhang <tiozhang@didiglobal.com>
---
 kernel/watchdog.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 51915b44ac73..ade69b2319f6 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -990,6 +990,7 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
 
 	mutex_lock(&watchdog_mutex);
 
+	old = *param;
 	if (!write) {
 		/*
 		 * On read synchronize the userspace interface. This is a
@@ -997,8 +998,8 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
 		 */
 		*param = (watchdog_enabled & which) != 0;
 		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+		*param = old;
 	} else {
-		old = READ_ONCE(*param);
 		err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 		if (!err && old != READ_ONCE(*param))
 			proc_watchdog_update();
-- 
2.17.1
Re: [PATCH v2 1/1] kernel/watchdog: always restore watchdog_softlockup(,hardlockup)_user_enabled after proc show
Posted by Doug Anderson 1 year, 4 months ago
Hi,

On Fri, Sep 6, 2024 at 2:47 AM Tio Zhang <tiozhang@didiglobal.com> wrote:
>
> Otherwise when watchdog_enabled becomes 0,
> watchdog_softlockup(,hardlockup)_user_enabled will changes to 0 after
> proc show.
>
> Steps to reproduce:
>
>   step 1:
>   # cat /proc/sys/kernel/*watchdog
>   1
>   1
>   1
>
>   | name                             | value
>   |----------------------------------|--------------------------
>   | watchdog_enabled                 | 1
>   |----------------------------------|--------------------------
>   | watchdog_hardlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_softlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_user_enabled            | 1
>   |----------------------------------|--------------------------
>
>   step 2:
>   # echo 0 > /proc/sys/kernel/watchdog
>
>   | name                             | value
>   |----------------------------------|--------------------------
>   | watchdog_enabled                 | 0
>   |----------------------------------|--------------------------
>   | watchdog_hardlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_softlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_user_enabled            | 0
>   |----------------------------------|--------------------------
>
>   step 3:
>   # cat /proc/sys/kernel/*watchdog
>   0
>   0
>   0
>
>   | name                             | value
>   |----------------------------------|--------------------------
>   | watchdog_enabled                 | 0
>   |----------------------------------|--------------------------
>   | watchdog_hardlockup_user_enabled | 0
>   |----------------------------------|--------------------------
>   | watchdog_softlockup_user_enabled | 0
>   |----------------------------------|--------------------------
>   | watchdog_user_enabled            | 0
>   |----------------------------------|--------------------------
>
>   step 4:
>   # echo 1 > /proc/sys/kernel/watchdog
>
>   | name                             | value
>   |----------------------------------|--------------------------
>   | watchdog_enabled                 | 0
>   |----------------------------------|--------------------------
>   | watchdog_hardlockup_user_enabled | 0
>   |----------------------------------|--------------------------
>   | watchdog_softlockup_user_enabled | 0
>   |----------------------------------|--------------------------
>   | watchdog_user_enabled            | 0
>   |----------------------------------|--------------------------
>
>   step 5:
>   # cat /proc/sys/kernel/*watchdog
>   0
>   0
>   0
>
> If we dont do "step 3", do "step 4" right after "step 2", it will be
>
>   | name                             | value
>   |----------------------------------|--------------------------
>   | watchdog_enabled                 | 1
>   |----------------------------------|--------------------------
>   | watchdog_hardlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_softlockup_user_enabled | 1
>   |----------------------------------|--------------------------
>   | watchdog_user_enabled            | 1
>   |----------------------------------|--------------------------
>
> then everything works correctly.
>
> So this patch fix "step 3"'s value into
>
> | name                             | value
> |----------------------------------|--------------------------
> | watchdog_enabled                 | 0
> |----------------------------------|--------------------------
> | watchdog_hardlockup_user_enabled | 1
> |----------------------------------|--------------------------
> | watchdog_softlockup_user_enabled | 1
> |----------------------------------|--------------------------
> | watchdog_user_enabled            | 0
> |----------------------------------|--------------------------
>
> And still print 0 as before.
>
> Signed-off-by: Tio Zhang <tiozhang@didiglobal.com>
> ---
>  kernel/watchdog.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

The description is pretty verbose but also much clearer. The patch
looks good to me now.

Reviewed-by: Douglas Anderson <dianders@chromium.org>