The main reason for the prolonged exit of a background process is the
time-consuming release of its swap entries. The proportion of swap memory
occupied by a background process increases the longer it stays in the
background; after a period of time this value can reach 60% or more. In
addition, the relatively long path for releasing swap entries further
lengthens the time the exiting process needs to release them.

In the multiple-background-applications scenario, launching a large-memory
application such as the camera can push the system into a low-memory
state, which triggers the killing of several background processes at the
same time. Because these exiting processes occupy multiple CPUs and run
concurrently, the foreground application is starved of CPU and may lag.

To solve this problem, we introduce an asynchronous swap-memory release
mechanism for multiple exiting processes: the swap entries occupied by the
exiting processes are isolated and cached, then handed over to an
asynchronous kworker that completes the release (a minimal sketch of the
idea follows below the cover letter). This allows the exiting processes to
finish quickly and give back their CPU resources. We have validated this
change on products and achieved the expected benefits.

It offers several benefits:
1. Alleviates the high system CPU load caused by multiple exiting
   processes running simultaneously.
2. Reduces lock contention on the swap-entry free path, since a single
   asynchronous kworker replaces several exiting processes running in
   parallel.
3. Releases the memory occupied by exiting processes more efficiently.

Zhiguo Jiang (2):
  mm: move task_is_dying to h headfile
  mm: tlb: multiple exiting processes's swap entries async release

 include/asm-generic/tlb.h |  50 +++++++
 include/linux/mm_types.h  |  58 ++++++++
 include/linux/oom.h       |   6 +
 mm/memcontrol.c           |   6 -
 mm/memory.c               |   3 +-
 mm/mmu_gather.c           | 297 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 413 insertions(+), 7 deletions(-)
 mode change 100644 => 100755 include/asm-generic/tlb.h
 mode change 100644 => 100755 include/linux/mm_types.h
 mode change 100644 => 100755 include/linux/oom.h
 mode change 100644 => 100755 mm/memcontrol.c
 mode change 100644 => 100755 mm/memory.c
 mode change 100644 => 100755 mm/mmu_gather.c

--
2.39.0
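Below is a minimal, self-contained sketch of the idea, not the actual
patch (which, per the diffstat, lives mainly in mm/mmu_gather.c and the
tlb headers): a dying task batches the swap entries it would otherwise
free inline and hands them to an unbound kworker, which then pays the
cost of free_swap_and_cache(). The names swap_release_work,
swap_release_workfn, async_swap_release and SWAP_BATCH_NR are
hypothetical and only illustrate the shape of the mechanism.

/*
 * Minimal sketch only, not the actual series: batch the swap entries of an
 * exiting task and free them from an unbound kworker so the exit path does
 * not pay for free_swap_and_cache() itself. All names below are hypothetical.
 */
#include <linux/mm_types.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/swap.h>
#include <linux/workqueue.h>

#define SWAP_BATCH_NR 512	/* arbitrary batch size for this sketch */

struct swap_release_work {
	struct work_struct work;
	unsigned int nr;
	swp_entry_t entries[SWAP_BATCH_NR];
};

static void swap_release_workfn(struct work_struct *work)
{
	struct swap_release_work *srw =
		container_of(work, struct swap_release_work, work);
	unsigned int i;

	/* The slow part (cluster/swap_info locks, zram I/O) now runs here. */
	for (i = 0; i < srw->nr; i++)
		free_swap_and_cache(srw->entries[i]);

	kfree(srw);
}

/*
 * Called from the teardown path of a dying task instead of freeing each
 * swap entry inline. Returns false if the handoff could not be queued,
 * so the caller can fall back to the normal synchronous free.
 */
static bool async_swap_release(swp_entry_t *entries, unsigned int nr)
{
	struct swap_release_work *srw;

	if (!nr || nr > SWAP_BATCH_NR)
		return false;

	srw = kmalloc(sizeof(*srw), GFP_ATOMIC);
	if (!srw)
		return false;

	memcpy(srw->entries, entries, nr * sizeof(*entries));
	srw->nr = nr;
	INIT_WORK(&srw->work, swap_release_workfn);
	queue_work(system_unbound_wq, &srw->work);
	return true;
}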
On Tue, Jul 30, 2024 at 7:44 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>
> The main reason for the prolonged exit of a background process is the
> time-consuming release of its swap entries. The proportion of swap memory
> occupied by a background process increases the longer it stays in the
> background; after a period of time this value can reach 60% or more.

Do you know the reason? Could they be contending for a cluster lock or
something? Is there any perf data or flamegraph available here?

> In addition, the relatively long path for releasing swap entries further
> lengthens the time the exiting process needs to release them.
>
> In the multiple-background-applications scenario, launching a large-memory
> application such as the camera can push the system into a low-memory
> state, which triggers the killing of several background processes at the
> same time. Because these exiting processes occupy multiple CPUs and run
> concurrently, the foreground application is starved of CPU and may lag.
>
> To solve this problem, we introduce an asynchronous swap-memory release
> mechanism for multiple exiting processes: the swap entries occupied by the
> exiting processes are isolated and cached, then handed over to an
> asynchronous kworker that completes the release. This allows the exiting
> processes to finish quickly and give back their CPU resources. We have
> validated this change on products and achieved the expected benefits.
>
> It offers several benefits:
> 1. Alleviates the high system CPU load caused by multiple exiting
>    processes running simultaneously.
> 2. Reduces lock contention on the swap-entry free path, since a single
>    asynchronous kworker replaces several exiting processes running in
>    parallel.

Do you have data on which lock is affected? Could it be a cluster lock?

> 3. Releases the memory occupied by exiting processes more efficiently.
>
> Zhiguo Jiang (2):
>   mm: move task_is_dying to h headfile
>   mm: tlb: multiple exiting processes's swap entries async release
>
>  include/asm-generic/tlb.h |  50 +++++++
>  include/linux/mm_types.h  |  58 ++++++++
>  include/linux/oom.h       |   6 +
>  mm/memcontrol.c           |   6 -
>  mm/memory.c               |   3 +-
>  mm/mmu_gather.c           | 297 ++++++++++++++++++++++++++++++++++++++
>  6 files changed, 413 insertions(+), 7 deletions(-)
>  mode change 100644 => 100755 include/asm-generic/tlb.h
>  mode change 100644 => 100755 include/linux/mm_types.h
>  mode change 100644 => 100755 include/linux/oom.h
>  mode change 100644 => 100755 mm/memcontrol.c
>  mode change 100644 => 100755 mm/memory.c
>  mode change 100644 => 100755 mm/mmu_gather.c

Can you check your local filesystem to determine why you're running
the chmod command?

>
> --
> 2.39.0
>

Thanks
Barry
On 2024/7/31 10:18, Barry Song wrote:
> On Tue, Jul 30, 2024 at 7:44 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>> The main reason for the prolonged exit of a background process is the
>> time-consuming release of its swap entries. The proportion of swap memory
>> occupied by a background process increases the longer it stays in the
>> background; after a period of time this value can reach 60% or more.
> Do you know the reason? Could they be contending for a cluster lock or
> something? Is there any perf data or flamegraph available here?
Hi,

Test data for how much physical memory an application occupies at
different points in time while it stays in the background:

Test platform: 8GB RAM
Test procedure: after booting up, start 15 applications, then observe the
physical memory occupied by the last-launched application at different
points in time in the background.

foreground - abbreviated FG
background - abbreviated BG

The app launched last: com.qiyi.video

| memory type   | FG 5s  | BG 5s  | BG 1min | BG 3min | BG 5min | BG 10min | BG 15min |
-----------------------------------------------------------------------------------------
| VmRSS(KB)     | 453832 | 252300 | 207724  | 206776  | 204364  | 199944   | 199748   |
| RssAnon(KB)   | 247348 |  99296 |  71816  |  71484  |  71268  |  67808   |  67660   |
| RssFile(KB)   | 205536 | 152020 | 134956  | 134340  | 132144  | 131184   | 131136   |
| RssShmem(KB)  |   1048 |    984 |    952  |    952  |    952  |    952   |    952   |
| VmSwap(KB)    | 202692 | 334852 | 362332  | 362664  | 362880  | 366340   | 366488   |
| Swap ratio(%) | 30.87% | 57.03% | 63.56%  | 63.69%  | 63.97%  | 64.69%   | 64.72%   |

The app launched last: com.netease.sky.vivo

| memory type   | FG 5s  | BG 5s  | BG 1min | BG 3min | BG 5min | BG 10min | BG 15min |
-----------------------------------------------------------------------------------------
| VmRSS(KB)     | 435424 | 403564 | 403200  | 401688  | 402996  | 396372   | 396268   |
| RssAnon(KB)   | 151616 | 117252 | 117244  | 115888  | 117088  | 110780   | 110684   |
| RssFile(KB)   | 281672 | 284192 | 283836  | 283680  | 283788  | 283472   | 283464   |
| RssShmem(KB)  |   2136 |   2120 |   2120  |   2120  |   2120  |   2120   |   2120   |
| VmSwap(KB)    | 546584 | 559920 | 559928  | 561284  | 560084  | 566392   | 566488   |
| Swap ratio(%) | 55.66% | 58.11% | 58.14%  | 58.29%  | 58.16%  | 58.83%   | 58.84%   |

Perf data for one exiting background process:

| interfaces                   | cost(ms) | exe(ms) | average(ms) | run counts |
--------------------------------------------------------------------------------
| do_signal                    | 791.813  | 0       | 791.813     | 1          |
| get_signal                   | 791.813  | 0       | 791.813     | 1          |
| do_group_exit                | 791.813  | 0       | 791.813     | 1          |
| do_exit                      | 791.813  | 0.148   | 791.813     | 1          |
| exit_mm                      | 577.859  | 0       | 577.859     | 1          |
| __mmput                      | 577.859  | 0.202   | 577.859     | 1          |
| exit_mmap                    | 577.497  | 1.806   | 192.499     | 3          |
| __oom_reap_task_mm           | 562.869  | 2.695   | 562.869     | 1          |
| unmap_page_range             | 562.07   | 3.185   | 20.817      | 27         |
| zap_pte_range                | 558.645  | 123.958 | 15.518      | 36         |
| free_swap_and_cache          | 433.381  | 28.831  | 6.879       | 63         |
| free_swap_slot               | 403.568  | 4.876   | 4.248       | 95         |
| swapcache_free_entries       | 398.292  | 3.578   | 3.588       | 111        |
| swap_entry_free              | 393.863  | 13.953  | 3.176       | 124        |
| swap_range_free              | 372.602  | 202.478 | 1.791       | 208        |
| $x.204 [zram]                | 132.389  | 0.341   | 0.33        | 401        |
| zram_reset_device            | 131.888  | 22.376  | 0.326       | 405        |
| obj_free                     | 80.101   | 29.517  | 0.21        | 381        |
| zs_create_pool               | 29.381   | 2.772   | 0.124       | 237        |
| clear_shadow_from_swap_cache | 22.846   | 22.686  | 0.11        | 208        |
| __put_page                   | 19.317   | 10.088  | 0.105       | 184        |
| pr_memcg_info                | 13.038   | 1.181   | 0.11        | 118        |
| free_pcp_prepare             | 9.229    | 0.812   | 0.094       | 98         |
| xxx_memcg_out                | 9.223    | 4.746   | 0.098       | 94         |
| free_pgtables                | 8.813    | 3.302   | 8.813       | 1          |
| zs_compact                   | 8.617    | 8.43    | 0.097       | 89         |
| kmem_cache_free              | 7.483    | 4.595   | 0.084       | 89         |
| __mem_cgroup_uncharge_swap   | 6.348    | 3.03    | 0.086       | 74         |
| $x.178 [zsmalloc]            | 6.182    | 0.32    | 0.09        | 69         |
| $x.182 [zsmalloc]            | 5.019    | 0.08    | 0.088       | 57         |

cost - total time consumption.
exe  - total actual execution time.

From the perf data we can see that the main reason for the prolonged exit
of a background process is the time-consuming release of its swap entries.
The release is slow not only because of the swap cluster lock, but also
because of the swap_slots lock and the swap_info lock, and additionally
because of the time spent in the zram/swapdisk free path.
>
>> In addition, the relatively long path for releasing swap entries further
>> lengthens the time the exiting process needs to release them.
>>
>> In the multiple-background-applications scenario, launching a large-memory
>> application such as the camera can push the system into a low-memory
>> state, which triggers the killing of several background processes at the
>> same time. Because these exiting processes occupy multiple CPUs and run
>> concurrently, the foreground application is starved of CPU and may lag.
>>
>> To solve this problem, we introduce an asynchronous swap-memory release
>> mechanism for multiple exiting processes: the swap entries occupied by the
>> exiting processes are isolated and cached, then handed over to an
>> asynchronous kworker that completes the release. This allows the exiting
>> processes to finish quickly and give back their CPU resources. We have
>> validated this change on products and achieved the expected benefits.
>>
>> It offers several benefits:
>> 1. Alleviates the high system CPU load caused by multiple exiting
>>    processes running simultaneously.
>> 2. Reduces lock contention on the swap-entry free path, since a single
>>    asynchronous kworker replaces several exiting processes running in
>>    parallel.
> Do you have data on which lock is affected? Could it be a cluster lock?
Swap entry release is time-consuming not only because of the cluster lock,
but also because of the swap_slots lock and the swap_info lock, and
additionally because of the zram/swapdisk free path itself. In short, the
swap entry release path is relatively long compared to the file and
anonymous folio release paths.
>
>> 3. Releases the memory occupied by exiting processes more efficiently.
>>
>> Zhiguo Jiang (2):
>>   mm: move task_is_dying to h headfile
>>   mm: tlb: multiple exiting processes's swap entries async release
>>
>>  include/asm-generic/tlb.h |  50 +++++++
>>  include/linux/mm_types.h  |  58 ++++++++
>>  include/linux/oom.h       |   6 +
>>  mm/memcontrol.c           |   6 -
>>  mm/memory.c               |   3 +-
>>  mm/mmu_gather.c           | 297 ++++++++++++++++++++++++++++++++++++++
>>  6 files changed, 413 insertions(+), 7 deletions(-)
>>  mode change 100644 => 100755 include/asm-generic/tlb.h
>>  mode change 100644 => 100755 include/linux/mm_types.h
>>  mode change 100644 => 100755 include/linux/oom.h
>>  mode change 100644 => 100755 mm/memcontrol.c
>>  mode change 100644 => 100755 mm/memory.c
>>  mode change 100644 => 100755 mm/mmu_gather.c
> Can you check your local filesystem to determine why you're running
> the chmod command?
Ok, I will check it carefully.

Thanks
Zhiguo
>
>> --
>> 2.39.0
>>
> Thanks
> Barry
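For readers following the perf data above, here is a hedged sketch of the
call-site half of the idea, pairing with the kworker sketch shown after
the cover letter: on the exit path, swap entries found while zapping PTEs
are only recorded and deferred, so the long measured chain
(free_swap_and_cache -> free_swap_slot -> swapcache_free_entries ->
swap_entry_free -> swap_range_free -> zram) runs in the kworker instead
of the exiting task. zap_swap_entry() and the task_dying flag are
hypothetical; the real series gates on a task_is_dying()-style check and
hooks into the mmu_gather/tlb path, per the patch titles and diffstat.

/*
 * Hedged call-site sketch only; reuses SWAP_BATCH_NR, async_swap_release()
 * and free_swap_and_cache() from the earlier sketch.
 */
static void zap_swap_entry(swp_entry_t entry, bool task_dying,
			   swp_entry_t *batch, unsigned int *nr)
{
	if (!task_dying) {
		/* Normal unmap path: free synchronously, as today. */
		free_swap_and_cache(entry);
		return;
	}

	/* Exit path: only record the entry; the kworker frees it later. */
	batch[(*nr)++] = entry;
	if (*nr < SWAP_BATCH_NR)
		return;

	if (!async_swap_release(batch, *nr)) {
		/* Handoff failed: free this batch synchronously after all. */
		unsigned int i;

		for (i = 0; i < *nr; i++)
			free_swap_and_cache(batch[i]);
	}
	*nr = 0;
}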