[PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process

zhongjinji posted 3 patches 3 weeks, 2 days ago
There is a newer version of this series
[PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process
Posted by zhongjinji 3 weeks, 2 days ago
OOM killer is a mechanism that selects and kills processes when the system
runs out of memory to reclaim resources and keep the system stable.
However, the oom victim cannot terminate on its own when it is frozen,
because __thaw_task() only thaws one thread of the victim, while
the other threads remain in the frozen state.

This change will thaw the entire victim process when OOM occurs,
ensuring that the oom victim can terminate on its own.

Signed-off-by: zhongjinji <zhongjinji@honor.com>
---
 mm/oom_kill.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 25923cfec9c6..ffa50a1f0132 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -772,12 +772,11 @@ static void mark_oom_victim(struct task_struct *tsk)
 		mmgrab(tsk->signal->oom_mm);
 
 	/*
-	 * Make sure that the task is woken up from uninterruptible sleep
+	 * Make sure that the process is woken up from uninterruptible sleep
 	 * if it is frozen because OOM killer wouldn't be able to free
-	 * any memory and livelock. freezing_slow_path will tell the freezer
-	 * that TIF_MEMDIE tasks should be ignored.
+	 * any memory and livelock.
 	 */
-	__thaw_task(tsk);
+	thaw_oom_process(tsk);
 	atomic_inc(&oom_victims);
 	cred = get_task_cred(tsk);
 	trace_mark_victim(tsk, cred->uid.val);
-- 
2.17.1
Re: [PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process
Posted by Michal Hocko 3 weeks, 2 days ago
On Tue 09-09-25 17:06:58, zhongjinji wrote:
> OOM killer is a mechanism that selects and kills processes when the system
> runs out of memory to reclaim resources and keep the system stable.
> However, the oom victim cannot terminate on its own when it is frozen,
> because __thaw_task() only thaws one thread of the victim, while
> the other threads remain in the frozen state.
> 
> This change will thaw the entire victim process when OOM occurs,
> ensuring that the oom victim can terminate on its own.

fold this into patch 1.

> 
> Signed-off-by: zhongjinji <zhongjinji@honor.com>
> ---
>  mm/oom_kill.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 25923cfec9c6..ffa50a1f0132 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -772,12 +772,11 @@ static void mark_oom_victim(struct task_struct *tsk)
>  		mmgrab(tsk->signal->oom_mm);
>  
>  	/*
> -	 * Make sure that the task is woken up from uninterruptible sleep
> +	 * Make sure that the process is woken up from uninterruptible sleep
>  	 * if it is frozen because OOM killer wouldn't be able to free
> -	 * any memory and livelock. freezing_slow_path will tell the freezer
> -	 * that TIF_MEMDIE tasks should be ignored.
> +	 * any memory and livelock.
>  	 */
> -	__thaw_task(tsk);
> +	thaw_oom_process(tsk);
>  	atomic_inc(&oom_victims);
>  	cred = get_task_cred(tsk);
>  	trace_mark_victim(tsk, cred->uid.val);
> -- 
> 2.17.1

-- 
Michal Hocko
SUSE Labs
Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Posted by zhongjinji 3 weeks, 2 days ago
> On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > OOM killer is a mechanism that selects and kills processes when the system
> > runs out of memory to reclaim resources and keep the system stable.
> > However, the oom victim cannot terminate on its own when it is frozen,
> > because __thaw_task() only thaws one thread of the victim, while
> > the other threads remain in the frozen state.
> > 
> > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > introduce thaw_oom_process() to properly thaw OOM victims.
> 
> You will need s@thaw_oom_process@thaw_processes@

The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
thawed thread will be set, which means this function can only be used to
thaw processes terminated by the OOM killer.

thaw_processes has already been defined in kernel/power/process.c.
Would it be better to use thaw_process instead?

I am concerned that others might misunderstand the thaw_process function.
thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
used to thaw processes killed by the OOM killer.

If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
regardless of the cgroup state. Should we add a function to set the TIF_MEMDIE
state for all threads, like the implementation below?

-/*
- * thaw_oom_process - thaw the OOM victim process
- * @p: process to be thawed
- *
- * Sets TIF_MEMDIE for all threads in the process group and thaws them.
- * Threads with TIF_MEMDIE are ignored by the freezer.
- */
-void thaw_oom_process(struct task_struct *p)
+void thaw_process(struct task_struct *p)
 {
        struct task_struct *t;

        rcu_read_lock();
        for_each_thread(p, t) {
-               set_tsk_thread_flag(t, TIF_MEMDIE);
                __thaw_task(t);
        }
        rcu_read_unlock();
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 52d285da5ba4..67b65b249757 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -753,6 +753,17 @@ static inline void queue_oom_reaper(struct task_struct *tsk)
 }
 #endif /* CONFIG_MMU */

+void mark_oom_victim_die(struct task_struct *p)
+{
+       struct task_struct *t;
+
+       rcu_read_lock();
+       for_each_thread(p, t) {
+               set_tsk_thread_flag(t, TIF_MEMDIE);
+       }
+       rcu_read_unlock();
+}
+
 /**
  * mark_oom_victim - mark the given task as OOM victim
  * @tsk: task to mark
@@ -782,7 +793,8 @@ static void mark_oom_victim(struct task_struct *tsk)
         * if it is frozen because OOM killer wouldn't be able to free
         * any memory and livelock.
         */
-       thaw_oom_process(tsk);
+       mark_oom_victim_die(tsk);
+       thaw_process(tsk);

> I would also add the caller in this patch.
>
> > Signed-off-by: zhongjinji <zhongjinji@honor.com>
> 
> Other than that looks good to me. With the above fixed feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Posted by Michal Hocko 3 weeks, 2 days ago
On Tue 09-09-25 19:41:31, zhongjinji wrote:
> > On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > > OOM killer is a mechanism that selects and kills processes when the system
> > > runs out of memory to reclaim resources and keep the system stable.
> > > However, the oom victim cannot terminate on its own when it is frozen,
> > > because __thaw_task() only thaws one thread of the victim, while
> > > the other threads remain in the frozen state.
> > > 
> > > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > > introduce thaw_oom_process() to properly thaw OOM victims.
> > 
> > You will need s@thaw_oom_process@thaw_processes@
> 
> The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
> thawed thread will be set, which means this function can only be used to
> thaw processes terminated by the OOM killer.

Just do not set the flag inside the function. I would even say do not
set TIF_MEMDIE to the rest of the thread group at all. More on that
below

> thaw_processes has already been defined in kernel/power/process.c.
> Would it be better to use thaw_process instead?

Sorry I meant thaw_process as thaw_processes is handling all the
processes.

> I am concerned that others might misunderstand the thaw_process function.
> thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
> used to thaw processes killed by the OOM killer.

And that is the reason why it shouldn't be doing that. It should thaw
the whole thread group. That's it.
 
> If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
> regardless of the cgroup state.

Why would that be the case. TIF_MEMDIE should only denote the victim
should be able to access memory reserves. Why the whole thread group
needs that? While more threads could be caught in the allocation path
this is a sort of boost at best. It cannot guarantee any forward
progress and we have kept marking only the first thread that way without
any issues.

> Should we add a function to set the TIF_MEMDIE
> state for all threads, like the implementation below?
> 
> -/*
> - * thaw_oom_process - thaw the OOM victim process
> - * @p: process to be thawed
> - *
> - * Sets TIF_MEMDIE for all threads in the process group and thaws them.
> - * Threads with TIF_MEMDIE are ignored by the freezer.
> - */
> -void thaw_oom_process(struct task_struct *p)
> +void thaw_process(struct task_struct *p)
>  {
>         struct task_struct *t;
> 
>         rcu_read_lock();
>         for_each_thread(p, t) {
> -               set_tsk_thread_flag(t, TIF_MEMDIE);
>                 __thaw_task(t);
>         }
>         rcu_read_unlock();
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 52d285da5ba4..67b65b249757 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -753,6 +753,17 @@ static inline void queue_oom_reaper(struct task_struct *tsk)
>  }
>  #endif /* CONFIG_MMU */
> 
> +void mark_oom_victim_die(struct task_struct *p)
> +{
> +       struct task_struct *t;
> +
> +       rcu_read_lock();
> +       for_each_thread(p, t) {
> +               set_tsk_thread_flag(t, TIF_MEMDIE);
> +       }
> +       rcu_read_unlock();
> +}
> +
>  /**
>   * mark_oom_victim - mark the given task as OOM victim
>   * @tsk: task to mark
> @@ -782,7 +793,8 @@ static void mark_oom_victim(struct task_struct *tsk)
>          * if it is frozen because OOM killer wouldn't be able to free
>          * any memory and livelock.
>          */
> -       thaw_oom_process(tsk);
> +       mark_oom_victim_die(tsk);
> +       thaw_process(tsk);
> 
> > I would also add the caller in this patch.
> >
> > > Signed-off-by: zhongjinji <zhongjinji@honor.com>
> > 
> > Other than that looks good to me. With the above fixed feel free to add
> > Acked-by: Michal Hocko <mhocko@suse.com>

-- 
Michal Hocko
SUSE Labs
Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Posted by zhongjinji 3 weeks, 2 days ago
> On Tue 09-09-25 19:41:31, zhongjinji wrote:
> > > On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > > > OOM killer is a mechanism that selects and kills processes when the system
> > > > runs out of memory to reclaim resources and keep the system stable.
> > > > However, the oom victim cannot terminate on its own when it is frozen,
> > > > because __thaw_task() only thaws one thread of the victim, while
> > > > the other threads remain in the frozen state.
> > > > 
> > > > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > > > introduce thaw_oom_process() to properly thaw OOM victims.
> > > 
> > > You will need s@thaw_oom_process@thaw_processes@
> > 
> > The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
> > thawed thread will be set, which means this function can only be used to
> > thaw processes terminated by the OOM killer.
> 
> Just do not set the flag inside the function. I would even say do not
> set TIF_MEMDIE to the rest of the thread group at all. More on that
> below
> 
> > thaw_processes has already been defined in kernel/power/process.c.
> > Would it be better to use thaw_process instead?
> 
> Sorry I meant thaw_process as thaw_processes is handling all the
> processes.
> 
> > I am concerned that others might misunderstand the thaw_process function.
> > thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
> > used to thaw processes killed by the OOM killer.
> 
> And that is the reason why it shouldn't be doing that. It should thaw
> the whole thread group. That's it.
>  
> > If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
> > regardless of the cgroup state.
> 
> Why would that be the case. TIF_MEMDIE should only denote the victim
> should be able to access memory reserves. Why the whole thread group
> needs that? While more threads could be caught in the allocation path
> this is a sort of boost at best. It cannot guarantee any forward
> progress and we have kept marking only the first thread that way without
> any issues.

When a process is frozen, all its threads enter __refrigerator() (in kernel/freezer.c). 
When __thaw_task is called, the threads are woken up and check the freezing(current) 
state (in __refrigerator). The freezing check is implemented via freezing_slow_path. 
When TIF_MEMDIE is set for a thread, freezing_slow_path will return false, allowing 
the thread to exit the infinite loop in __refrigerator(), and thus the thread will 
be thawed.

The following code can explain how TIF_MEMDIE works in thread thawing.
__refrigerator
  for (;;) {
    freezing = freezing(current)
      freezing_slow_path
        if (test_tsk_thread_flag(p, TIF_MEMDIE))
          return false;
    if (!freezing)
      break;
    schedule();
  }

Since thread_info is not shared within a thread group, TIF_MEMDIE for each thread  
must be set so that all threads can be thawed.
Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Posted by Michal Hocko 3 weeks, 2 days ago
On Tue 09-09-25 21:51:52, zhongjinji wrote:
> > On Tue 09-09-25 19:41:31, zhongjinji wrote:
> > > > On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > > > > OOM killer is a mechanism that selects and kills processes when the system
> > > > > runs out of memory to reclaim resources and keep the system stable.
> > > > > However, the oom victim cannot terminate on its own when it is frozen,
> > > > > because __thaw_task() only thaws one thread of the victim, while
> > > > > the other threads remain in the frozen state.
> > > > > 
> > > > > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > > > > introduce thaw_oom_process() to properly thaw OOM victims.
> > > > 
> > > > You will need s@thaw_oom_process@thaw_processes@
> > > 
> > > The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
> > > thawed thread will be set, which means this function can only be used to
> > > thaw processes terminated by the OOM killer.
> > 
> > Just do not set the flag inside the function. I would even say do not
> > set TIF_MEMDIE to the rest of the thread group at all. More on that
> > below
> > 
> > > thaw_processes has already been defined in kernel/power/process.c.
> > > Would it be better to use thaw_process instead?
> > 
> > Sorry I meant thaw_process as thaw_processes is handling all the
> > processes.
> > 
> > > I am concerned that others might misunderstand the thaw_process function.
> > > thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
> > > used to thaw processes killed by the OOM killer.
> > 
> > And that is the reason why it shouldn't be doing that. It should thaw
> > the whole thread group. That's it.
> >  
> > > If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
> > > regardless of the cgroup state.
> > 
> > Why would that be the case. TIF_MEMDIE should only denote the victim
> > should be able to access memory reserves. Why the whole thread group
> > needs that? While more threads could be caught in the allocation path
> > this is a sort of boost at best. It cannot guarantee any forward
> > progress and we have kept marking only the first thread that way without
> > any issues.
> 
> When a process is frozen, all its threads enter __refrigerator() (in kernel/freezer.c). 
> When __thaw_task is called, the threads are woken up and check the freezing(current) 
> state (in __refrigerator). The freezing check is implemented via freezing_slow_path. 
> When TIF_MEMDIE is set for a thread, freezing_slow_path will return false, allowing 
> the thread to exit the infinite loop in __refrigerator(), and thus the thread will 
> be thawed.
> 
> The following code can explain how TIF_MEMDIE works in thread thawing.
> __refrigerator
>   for (;;) {
>     freezing = freezing(current)
>       freezing_slow_path
>         if (test_tsk_thread_flag(p, TIF_MEMDIE))
>           return false;
>     if (!freezing)
>       break;
>     schedule();
>   }

OK, I see. We could deal with that by checking tsk_is_oom_victim()
instead of TIF_MEMDIE

> Since thread_info is not shared within a thread group, TIF_MEMDIE for each thread  
> must be set so that all threads can be thawed.

-- 
Michal Hocko
SUSE Labs
Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Posted by zhongjinji 3 weeks, 2 days ago
> > > On Tue 09-09-25 19:41:31, zhongjinji wrote:
> > > > > On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > > > > > OOM killer is a mechanism that selects and kills processes when the system
> > > > > > runs out of memory to reclaim resources and keep the system stable.
> > > > > > However, the oom victim cannot terminate on its own when it is frozen,
> > > > > > because __thaw_task() only thaws one thread of the victim, while
> > > > > > the other threads remain in the frozen state.
> > > > > > 
> > > > > > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > > > > > introduce thaw_oom_process() to properly thaw OOM victims.
> > > > > 
> > > > > You will need s@thaw_oom_process@thaw_processes@
> > > > 
> > > > The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
> > > > thawed thread will be set, which means this function can only be used to
> > > > thaw processes terminated by the OOM killer.
> > > 
> > > Just do not set the flag inside the function. I would even say do not
> > > set TIF_MEMDIE to the rest of the thread group at all. More on that
> > > below
> > > 
> > > > thaw_processes has already been defined in kernel/power/process.c.
> > > > Would it be better to use thaw_process instead?
> > > 
> > > Sorry I meant thaw_process as thaw_processes is handling all the
> > > processes.
> > > 
> > > > I am concerned that others might misunderstand the thaw_process function.
> > > > thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
> > > > used to thaw processes killed by the OOM killer.
> > > 
> > > And that is the reason why it shouldn't be doing that. It should thaw
> > > the whole thread group. That's it.
> > >  
> > > > If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
> > > > regardless of the cgroup state.
> > > 
> > > Why would that be the case. TIF_MEMDIE should only denote the victim
> > > should be able to access memory reserves. Why the whole thread group
> > > needs that? While more threads could be caught in the allocation path
> > > this is a sort of boost at best. It cannot guarantee any forward
> > > progress and we have kept marking only the first thread that way without
> > > any issues.
> > 
> > When a process is frozen, all its threads enter __refrigerator() (in kernel/freezer.c). 
> > When __thaw_task is called, the threads are woken up and check the freezing(current) 
> > state (in __refrigerator). The freezing check is implemented via freezing_slow_path. 
> > When TIF_MEMDIE is set for a thread, freezing_slow_path will return false, allowing 
> > the thread to exit the infinite loop in __refrigerator(), and thus the thread will 
> > be thawed.
> > 
> > The following code can explain how TIF_MEMDIE works in thread thawing.
> > __refrigerator
> >   for (;;) {
> >     freezing = freezing(current)
> >       freezing_slow_path
> >         if (test_tsk_thread_flag(p, TIF_MEMDIE))
> >           return false;
> >     if (!freezing)
> >       break;
> >     schedule();
> >   }
> 
> OK, I see. We could deal with that by checking tsk_is_oom_victim()
> instead of TIF_MEMDIE

Thank you, this looks great. It seems that oom_reserves_allowed implies
that tsk_is_oom_victim is not always effective (in page_alloc.c).
I will check it.
Re: [PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process
Posted by Suren Baghdasaryan 3 weeks, 2 days ago
On Tue, Sep 9, 2025 at 2:15 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Tue 09-09-25 17:06:58, zhongjinji wrote:
> > OOM killer is a mechanism that selects and kills processes when the system
> > runs out of memory to reclaim resources and keep the system stable.
> > However, the oom victim cannot terminate on its own when it is frozen,
> > because __thaw_task() only thaws one thread of the victim, while
> > the other threads remain in the frozen state.
> >
> > This change will thaw the entire victim process when OOM occurs,
> > ensuring that the oom victim can terminate on its own.
>
> fold this into patch 1.

+1
With that done,
Reviewed-by: Suren Baghdasaryan <surenb@google.com>

>
> >
> > Signed-off-by: zhongjinji <zhongjinji@honor.com>
> > ---
> >  mm/oom_kill.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 25923cfec9c6..ffa50a1f0132 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -772,12 +772,11 @@ static void mark_oom_victim(struct task_struct *tsk)
> >               mmgrab(tsk->signal->oom_mm);
> >
> >       /*
> > -      * Make sure that the task is woken up from uninterruptible sleep
> > +      * Make sure that the process is woken up from uninterruptible sleep
> >        * if it is frozen because OOM killer wouldn't be able to free
> > -      * any memory and livelock. freezing_slow_path will tell the freezer
> > -      * that TIF_MEMDIE tasks should be ignored.
> > +      * any memory and livelock.
> >        */
> > -     __thaw_task(tsk);
> > +     thaw_oom_process(tsk);
> >       atomic_inc(&oom_victims);
> >       cred = get_task_cred(tsk);
> >       trace_mark_victim(tsk, cred->uid.val);
> > --
> > 2.17.1
>
> --
> Michal Hocko
> SUSE Labs