From: Chen Ridong <chenridong@huawei.com>
Unlike memcg OOM, which is relatively common, global OOM events are rare
and typically indicate that the entire system is under severe memory
pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
process") added the touch_softlockup_watchdog in the global OOM handler to
suppress the soft lockup issues. However, while this change can suppress
soft lockup warnings, it does not address RCU stalls, which can still be
detected and may cause unnecessary disturbances. Simply remove the
modification from the global OOM handler.
Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
mm/oom_kill.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 25923cfec9c6..2d8b27604ef8 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -44,7 +44,6 @@
 #include <linux/init.h>
 #include <linux/mmu_notifier.h>
 #include <linux/cred.h>
-#include <linux/nmi.h>
 
 #include <asm/tlb.h>
 #include "internal.h"
@@ -431,15 +430,10 @@ static void dump_tasks(struct oom_control *oc)
 		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
 	else {
 		struct task_struct *p;
-		int i = 0;
 
 		rcu_read_lock();
-		for_each_process(p) {
-			/* Avoid potential softlockup warning */
-			if ((++i & 1023) == 0)
-				touch_softlockup_watchdog();
+		for_each_process(p)
 			dump_task(p, oc);
-		}
 		rcu_read_unlock();
 	}
 }
--
2.34.1
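
For readers unfamiliar with the idiom in the removed hunk, the periodic check can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: touch_watchdog_stub() and walk() are hypothetical names, and the stub merely counts calls where the kernel would call touch_softlockup_watchdog().

```c
/* Hypothetical stand-in for touch_softlockup_watchdog(); it only counts calls. */
static unsigned long touches;

static void touch_watchdog_stub(void)
{
	touches++;
}

/*
 * Iterate nr_items times and call the stub once every 1024 iterations,
 * using the same test as the removed code. Returns the number of calls.
 */
static unsigned long walk(unsigned long nr_items)
{
	unsigned long n;
	int i = 0;

	touches = 0;
	for (n = 0; n < nr_items; n++) {
		/* True when the low 10 bits of i are zero, i.e. i is a multiple of 1024. */
		if ((++i & 1023) == 0)
			touch_watchdog_stub();
	}
	return touches;
}
```

Under these assumptions, walk(5000) invokes the stub four times, at iterations 1024, 2048, 3072 and 4096. As the commit message notes, resetting the soft lockup watchdog this way does nothing for the RCU stall detector, because the whole walk still runs inside a single rcu_read_lock() section.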
On Wed 12-02-25 02:57:07, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> Unlike memcg OOM, which is relatively common, global OOM events are rare
> and typically indicate that the entire system is under severe memory
> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
> process") added the touch_softlockup_watchdog in the global OOM handler to
> suppress the soft lockup issues. However, while this change can suppress
> soft lockup warnings, it does not address RCU stalls, which can still be
> detected and may cause unnecessary disturbances. Simply remove the
> modification from the global OOM handler.
>
> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
But this is not really fixing anything, is it? While this doesn't
address a potential RCU stall it doesn't address any actual problem.
So why do we want to do this?
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
> mm/oom_kill.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 25923cfec9c6..2d8b27604ef8 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -44,7 +44,6 @@
>  #include <linux/init.h>
>  #include <linux/mmu_notifier.h>
>  #include <linux/cred.h>
> -#include <linux/nmi.h>
>
>  #include <asm/tlb.h>
>  #include "internal.h"
> @@ -431,15 +430,10 @@ static void dump_tasks(struct oom_control *oc)
>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>  	else {
>  		struct task_struct *p;
> -		int i = 0;
>
>  		rcu_read_lock();
> -		for_each_process(p) {
> -			/* Avoid potential softlockup warning */
> -			if ((++i & 1023) == 0)
> -				touch_softlockup_watchdog();
> +		for_each_process(p)
>  			dump_task(p, oc);
> -		}
>  		rcu_read_unlock();
>  	}
>  }
> --
> 2.34.1
--
Michal Hocko
SUSE Labs
On 2025/2/12 16:57, Michal Hocko wrote:
> On Wed 12-02-25 02:57:07, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> Unlike memcg OOM, which is relatively common, global OOM events are rare
>> and typically indicate that the entire system is under severe memory
>> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
>> process") added the touch_softlockup_watchdog in the global OOM handler to
>> suppress the soft lockup issues. However, while this change can suppress
>> soft lockup warnings, it does not address RCU stalls, which can still be
>> detected and may cause unnecessary disturbances. Simply remove the
>> modification from the global OOM handler.
>>
>> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
>
> But this is not really fixing anything, is it? While this doesn't
> address a potential RCU stall it doesn't address any actual problem.
> So why do we want to do this?
>
[1]
https://lore.kernel.org/cgroups/0d9ea655-5c1a-4ba9-9eeb-b45d74cc68d0@huaweicloud.com/
As previously discussed, the work I have done on the global OOM is 'half
of the job'. Based on our discussions, I thought that it would be best
to abandon this approach for global OOM. Therefore, I am sending this
patch to revert the changes.
Or just leave it?
Best regards,
Ridong
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>> ---
>> mm/oom_kill.c | 8 +-------
>> 1 file changed, 1 insertion(+), 7 deletions(-)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 25923cfec9c6..2d8b27604ef8 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -44,7 +44,6 @@
>>  #include <linux/init.h>
>>  #include <linux/mmu_notifier.h>
>>  #include <linux/cred.h>
>> -#include <linux/nmi.h>
>>
>>  #include <asm/tlb.h>
>>  #include "internal.h"
>> @@ -431,15 +430,10 @@ static void dump_tasks(struct oom_control *oc)
>>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>>  	else {
>>  		struct task_struct *p;
>> -		int i = 0;
>>
>>  		rcu_read_lock();
>> -		for_each_process(p) {
>> -			/* Avoid potential softlockup warning */
>> -			if ((++i & 1023) == 0)
>> -				touch_softlockup_watchdog();
>> +		for_each_process(p)
>>  			dump_task(p, oc);
>> -		}
>>  		rcu_read_unlock();
>>  	}
>>  }
>> --
>> 2.34.1
>
On 2/12/25 10:19, Chen Ridong wrote:
>
>
> On 2025/2/12 16:57, Michal Hocko wrote:
>> On Wed 12-02-25 02:57:07, Chen Ridong wrote:
>>> From: Chen Ridong <chenridong@huawei.com>
>>>
>>> Unlike memcg OOM, which is relatively common, global OOM events are rare
>>> and typically indicate that the entire system is under severe memory
>>> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
>>> process") added the touch_softlockup_watchdog in the global OOM handler to
>>> suppress the soft lockup issues. However, while this change can suppress
>>> soft lockup warnings, it does not address RCU stalls, which can still be
>>> detected and may cause unnecessary disturbances. Simply remove the
>>> modification from the global OOM handler.
>>>
>>> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
>>
>> But this is not really fixing anything, is it? While this doesn't
>> address a potential RCU stall it doesn't address any actual problem.
>> So why do we want to do this?
>>
>
>
> [1]
> https://lore.kernel.org/cgroups/0d9ea655-5c1a-4ba9-9eeb-b45d74cc68d0@huaweicloud.com/
>
> As previously discussed, the work I have done on the global OOM is 'half
> of the job'. Based on our discussions, I thought that it would be best
> to abandon this approach for global OOM. Therefore, I am sending this
> patch to revert the changes.
>
> Or just leave it?
I suggested that part doesn't need to be in the patch, but if it was merged
with it, we can just leave it there. Thanks.
On 2025/2/12 17:34, Vlastimil Babka wrote:
> On 2/12/25 10:19, Chen Ridong wrote:
>>
>>
>> On 2025/2/12 16:57, Michal Hocko wrote:
>>> On Wed 12-02-25 02:57:07, Chen Ridong wrote:
>>>> From: Chen Ridong <chenridong@huawei.com>
>>>>
>>>> Unlike memcg OOM, which is relatively common, global OOM events are rare
>>>> and typically indicate that the entire system is under severe memory
>>>> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
>>>> process") added the touch_softlockup_watchdog in the global OOM handler to
>>>> suppress the soft lockup issues. However, while this change can suppress
>>>> soft lockup warnings, it does not address RCU stalls, which can still be
>>>> detected and may cause unnecessary disturbances. Simply remove the
>>>> modification from the global OOM handler.
>>>>
>>>> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
>>>
>>> But this is not really fixing anything, is it? While this doesn't
>>> address a potential RCU stall it doesn't address any actual problem.
>>> So why do we want to do this?
>>>
>>
>>
>> [1]
>> https://lore.kernel.org/cgroups/0d9ea655-5c1a-4ba9-9eeb-b45d74cc68d0@huaweicloud.com/
>>
>> As previously discussed, the work I have done on the global OOM is 'half
>> of the job'. Based on our discussions, I thought that it would be best
>> to abandon this approach for global OOM. Therefore, I am sending this
>> patch to revert the changes.
>>
>> Or just leave it?
>
> I suggested that part doesn't need to be in the patch, but if it was merged
> with it, we can just leave it there. Thanks.
I see. Thank you very much.
Best regards,
Ridong
On Wed 12-02-25 10:34:06, Vlastimil Babka wrote:
> On 2/12/25 10:19, Chen Ridong wrote:
> >
> >
> > On 2025/2/12 16:57, Michal Hocko wrote:
> >> On Wed 12-02-25 02:57:07, Chen Ridong wrote:
> >>> From: Chen Ridong <chenridong@huawei.com>
> >>>
> >>> Unlike memcg OOM, which is relatively common, global OOM events are rare
> >>> and typically indicate that the entire system is under severe memory
> >>> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
> >>> process") added the touch_softlockup_watchdog in the global OOM handler to
> >>> suppress the soft lockup issues. However, while this change can suppress
> >>> soft lockup warnings, it does not address RCU stalls, which can still be
> >>> detected and may cause unnecessary disturbances. Simply remove the
> >>> modification from the global OOM handler.
> >>>
> >>> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
> >>
> >> But this is not really fixing anything, is it? While this doesn't
> >> address a potential RCU stall it doesn't address any actual problem.
> >> So why do we want to do this?
> >>
> >
> >
> > [1]
> > https://lore.kernel.org/cgroups/0d9ea655-5c1a-4ba9-9eeb-b45d74cc68d0@huaweicloud.com/
> >
> > As previously discussed, the work I have done on the global OOM is 'half
> > of the job'. Based on our discussions, I thought that it would be best
> > to abandon this approach for global OOM. Therefore, I am sending this
> > patch to revert the changes.
> >
> > Or just leave it?
>
> I suggested that part doesn't need to be in the patch, but if it was merged
> with it, we can just leave it there. Thanks.
Agreed!
--
Michal Hocko
SUSE Labs
On 2025/2/12 10:57, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> Unlike memcg OOM, which is relatively common, global OOM events are rare
> and typically indicate that the entire system is under severe memory
> pressure. The commit ade81479c7dd ("memcg: fix soft lockup in the OOM
> process") added the touch_softlockup_watchdog in the global OOM handler to
> suppress the soft lockup issues. However, while this change can suppress
> soft lockup warnings, it does not address RCU stalls, which can still be
> detected and may cause unnecessary disturbances. Simply remove the
> modification from the global OOM handler.
>
> Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
> mm/oom_kill.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 25923cfec9c6..2d8b27604ef8 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -44,7 +44,6 @@
>  #include <linux/init.h>
>  #include <linux/mmu_notifier.h>
>  #include <linux/cred.h>
> -#include <linux/nmi.h>
>
>  #include <asm/tlb.h>
>  #include "internal.h"
> @@ -431,15 +430,10 @@ static void dump_tasks(struct oom_control *oc)
>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>  	else {
>  		struct task_struct *p;
> -		int i = 0;
>
>  		rcu_read_lock();
> -		for_each_process(p) {
> -			/* Avoid potential softlockup warning */
> -			if ((++i & 1023) == 0)
> -				touch_softlockup_watchdog();
> +		for_each_process(p)
>  			dump_task(p, oc);
> -		}
>  		rcu_read_unlock();
>  	}
>  }
Add discussion link:
https://lore.kernel.org/cgroups/0d9ea655-5c1a-4ba9-9eeb-b45d74cc68d0@huaweicloud.com/