mm/oom_kill.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
An overview of the relationship between patch 1 and patch 2:

With patch 1 applied, the OOM reaper is no longer delayed when the victim
process is frozen. If the victim process is thawed in time, the OOM reaper
and the exit_mmap() thread may run concurrently, which can lead to
significant spinlock contention. Patch 2 mitigates this by having the OOM
reaper traverse the VMA maple tree in reverse order, which reduces the
likelihood of such lock contention.

The test data below was collected on Android. It shows that when the OOM
reaper and exit_mmap() run at the same time, pte spinlock contention becomes
more intense. Both threads then run for longer, which means higher system
load. It also shows that reverse-order traversal of the VMA maple tree by
the OOM reaper significantly reduces pte spinlock contention and lowers the
load (measured by process running time) of oom_reaper and exit_mmap() by
roughly 30%.

Perf data with patch 1 applied but not patch 2:

|--99.74%-- oom_reaper
|  |--76.67%-- unmap_page_range
|  |  |--33.70%-- __pte_offset_map_lock
|  |  |  |--98.46%-- _raw_spin_lock
|  |  |--27.61%-- free_swap_and_cache_nr
|  |  |--16.40%-- folio_remove_rmap_ptes
|  |  |--12.25%-- tlb_flush_mmu
|  |--12.61%-- tlb_finish_mmu

Perf data with patch 1 and patch 2 applied:

|--98.84%-- oom_reaper
|  |--53.45%-- unmap_page_range
|  |  |--24.29%-- [hit in function]
|  |  |--48.06%-- folio_remove_rmap_ptes
|  |  |--17.99%-- tlb_flush_mmu
|  |  |--1.72%-- __pte_offset_map_lock
|  |
|  |--30.43%-- tlb_finish_mmu

Process running time from the same test:

With oom reaper (reverse traverse):
Thread            TID     State     Wall duration (ms)
RxComputationT    13708   Running   60.69
oom_reaper        81      Running   46.49
Total (ms): 107.18

With oom reaper:
Thread            TID     State     Wall duration (ms)
vdp:vidtask:m     14040   Running   81.85
oom_reaper        81      Running   69.32
Total (ms): 151.17

Without oom reaper:
Thread            TID     State     Wall duration (ms)
tp-background     12424   Running   106.02
Total (ms): 106.02

Note: RxComputationT, vdp:vidtask:m, and tp-background are threads of the
same process, and they are the last threads to exit.

---
v5 -> v6:
- Use mas_for_each_rev() for VMA traversal [6]
- Simplify the judgment of whether to delay in queue_oom_reaper() [7]
- Refine changelog to better capture the essence of the changes [8]
- Use READ_ONCE(tsk->frozen) instead of checking mm and additional
  checks inside for_each_process(), as it is sufficient [9]
- Add report tags because fengbaopeng and tianxiaobin reported the
  high load issue of the reaper

v4 -> v5:
- Detect frozen state directly, avoid special futex handling. [3]
- Use mas_find_rev() for VMA traversal to avoid skipping entries. [4]
- Only check should_delay_oom_reap() in queue_oom_reaper(). [5]

v3 -> v4:
- Renamed functions and parameters for clarity. [2]
- Added should_delay_oom_reap() for OOM reap decisions.
- Traverse maple tree in reverse for improved behavior.

v2 -> v3:
- Fixed Subject prefix error.

v1 -> v2:
- Check robust_list for all threads, not just one. [1]

Reference:
[1] https://lore.kernel.org/linux-mm/u3mepw3oxj7cywezna4v72y2hvyc7bafkmsbirsbfuf34zpa7c@b23sc3rvp2gp/
[2] https://lore.kernel.org/linux-mm/87cy99g3k6.ffs@tglx/
[3] https://lore.kernel.org/linux-mm/aKRWtjRhE_HgFlp2@tiehlicka/
[4] https://lore.kernel.org/linux-mm/26larxehoe3a627s4fxsqghriwctays4opm4hhme3uk7ybjc5r@pmwh4s4yv7lm/
[5] https://lore.kernel.org/linux-mm/d5013a33-c08a-44c5-a67f-9dc8fd73c969@lucifer.local/
[6] https://lore.kernel.org/linux-mm/nwh7gegmvoisbxlsfwslobpbqku376uxdj2z32owkbftvozt3x@4dfet73fh2yy/
[7] https://lore.kernel.org/linux-mm/af4edeaf-d3c9-46a9-a300-dbaf5936e7d6@lucifer.local/
[8] https://lore.kernel.org/linux-mm/aK71W1ITmC_4I_RY@tiehlicka/
[9] https://lore.kernel.org/linux-mm/jzzdeczuyraup2zrspl6b74muf3bly2a3acejfftcldfmz4ekk@s5mcbeim34my/

The earlier post:
v5: https://lore.kernel.org/linux-mm/20250825133855.30229-1-zhongjinji@honor.com/
v4: https://lore.kernel.org/linux-mm/20250814135555.17493-1-zhongjinji@honor.com/
v3: https://lore.kernel.org/linux-mm/20250804030341.18619-1-zhongjinji@honor.com/
v2: https://lore.kernel.org/linux-mm/20250801153649.23244-1-zhongjinji@honor.com/
v1: https://lore.kernel.org/linux-mm/20250731102904.8615-1-zhongjinji@honor.com/

zhongjinji (2):
  mm/oom_kill: Do not delay oom reaper when the victim is frozen
  mm/oom_kill: The OOM reaper traverses the VMA maple tree in reverse
    order

 mm/oom_kill.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

--
2.17.1