Patch 1 does not delay the OOM reaper when the victim is frozen. Patch 2
makes the OOM reaper and exit_mmap() traverse the maple tree in opposite
orders, to reduce the PTE lock contention caused by unmapping the same
VMA.

About patch 1:

Patch 1 uses frozen() to check the frozen state of a single thread to
determine whether a process is frozen, rather than checking all threads,
because the frozen state of all threads in a process will eventually be
consistent. There is no need to strictly confirm that all threads are
frozen; it is only necessary to check whether the process has been
frozen or is about to be frozen.

If a frozen process cannot be thawed promptly, the two-second delay
before OOM reaping cannot guarantee that robust futexes will not be
reaped, so processes holding robust futexes should not be frozen. This
patch does not make issue [1] worse.

About patch 2:

I tested the changes of patch 2 on Android. The reproduction steps are
as follows: start a process, kill it the way the OOM killer does, and
explicitly queue it for the OOM reaper.

The perf data with patch 1 applied but not patch 2:

|--99.74%-- oom_reaper
|    |--76.67%-- unmap_page_range
|    |    |--33.70%-- __pte_offset_map_lock
|    |    |    |--98.46%-- _raw_spin_lock
|    |    |--27.61%-- free_swap_and_cache_nr
|    |    |--16.40%-- folio_remove_rmap_ptes
|    |    |--12.25%-- tlb_flush_mmu
|    |--12.61%-- tlb_finish_mmu

The perf data with both patch 1 and patch 2 applied:

|--98.84%-- oom_reaper
|    |--53.45%-- unmap_page_range
|    |    |--24.29%-- [hit in function]
|    |    |--48.06%-- folio_remove_rmap_ptes
|    |    |--17.99%-- tlb_flush_mmu
|    |    |--1.72%-- __pte_offset_map_lock
|    |
|    |--30.43%-- tlb_finish_mmu

The data shows that contention on the PTE spinlock is intense when the
OOM reaper and exit_mmap() traverse the tree along the same path: the
share of unmap_page_range spent in __pte_offset_map_lock drops from
33.70% to 1.72% once they traverse in opposite orders.

On low-memory Android devices, high memory pressure often requires
killing processes to free memory, which is generally accepted on Android.
lmkd, a userspace daemon that actively kills processes, needs to call
process_mrelease() asynchronously to release memory from the processes
it kills, much as the OOM reaper does. OOM events are also not rare.
Reducing lock contention in __oom_reap_task_mm() is therefore
worthwhile.

Link: https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u [1]

---

v4 -> v5:
1. Detect the frozen state of the process instead of checking the futex
   state, as special handling of futex locks should be avoided during
   OOM kill [2].
2. Use mas_find_rev() to traverse the VMA tree instead of vma_prev(),
   because vma_prev() may skip the first VMA and should not be used
   here. [3]
3. Check should_delay_oom_reap() only in queue_oom_reaper(), since it is
   not a hot path. [4]

v4 link:
https://lore.kernel.org/linux-mm/20250814135555.17493-1-zhongjinji@honor.com/

v3 -> v4:
1. Rename check_robust_futex() to process_has_robust_futex() for clearer
   intent.
2. Because the delay_reap parameter was added to task_will_free_mem(),
   the function is renamed to should_reap_task() to better clarify its
   purpose.
3. Add should_delay_oom_reap() to decide whether to delay OOM reap.
4. Modify the OOM reaper to traverse the maple tree in reverse order;
   see patch 3 for details.
These changes improve code readability and enhance OOM reaper behavior.

v3 link:
https://lore.kernel.org/all/20250804030341.18619-1-zhongjinji@honor.com/
https://lore.kernel.org/all/20250804030341.18619-2-zhongjinji@honor.com/

v2 -> v3:
1. Fix the Subject prefix, changing it from futex to mm/oom_kill.

v2 link:
https://lore.kernel.org/linux-mm/20250801153649.23244-1-zhongjinji@honor.com/
https://lore.kernel.org/linux-mm/20250801153649.23244-2-zhongjinji@honor.com/

v1 -> v2:
1. Check the robust_list of all threads instead of just a single thread.
v1 link:
https://lore.kernel.org/linux-mm/20250731102904.8615-1-zhongjinji@honor.com/

Reference:
https://lore.kernel.org/linux-mm/aKRWtjRhE_HgFlp2@tiehlicka/ [2]
https://lore.kernel.org/linux-mm/26larxehoe3a627s4fxsqghriwctays4opm4hhme3uk7ybjc5r@pmwh4s4yv7lm/ [3]
https://lore.kernel.org/linux-mm/d5013a33-c08a-44c5-a67f-9dc8fd73c969@lucifer.local/ [4]

zhongjinji (2):
  mm/oom_kill: Do not delay oom reaper when the victim is frozen
  mm/oom_kill: Have the OOM reaper and exit_mmap() traverse the maple
    tree in opposite order

 mm/oom_kill.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 3 deletions(-)

-- 
2.17.1