From: cgel.zte@gmail.com
X-Google-Original-From: yang.yang29@zte.com.cn
To: akpm@linux-foundation.org, david@redhat.com
Cc: corbet@lwn.net, bsingharora@gmail.com, mike.kravetz@oracle.com,
    yang.yang29@zte.com.cn, wang.yong12@zte.com.cn, peterz@infradead.org,
    jiang.xuexin@zte.com.cn, sfr@canb.auug.org.au,
    thomas.orgis@uni-hamburg.de, ran.xiaokai@zte.com.cn,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH] delayacct: track delays from write-protect copy
Date: Fri, 8 Apr 2022 10:37:10 +0000
Message-Id: <20220408103708.2495882-1-yang.yang29@zte.com.cn>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Yang Yang

Delay accounting does not track the delay of write-protect copy. When
tasks trigger many write-protect copies (including COW and unsharing of
anonymous pages [1]), they may spend a significant amount of time waiting
for them. Tracking the delay of tasks in write-protect copy helps users
evaluate the impact of using KSM, fork() or GUP.

Also update tools/accounting/getdelays.c:

    / # ./getdelays -dl -p 231
    print delayacct stats ON
    listen forever
    PID     231

    CPU             count     real total  virtual total    delay total  delay average
                     6247     1859000000     2154070021     1674255063          0.268ms
    IO              count    delay total  delay average
                        0              0              0ms
    SWAP            count    delay total  delay average
                        0              0              0ms
    RECLAIM         count    delay total  delay average
                        0              0              0ms
    THRASHING       count    delay total  delay average
                        0              0              0ms
    COMPACT         count    delay total  delay average
                        3          72758              0ms
    WPCOPY          count    delay total  delay average
                     3635      271567604              0ms

[1] commit 31cc5bc4af70("mm: support GUP-triggered unsharing of anonymous pages")

Signed-off-by: Yang Yang
Reviewed-by: David Hildenbrand
Reviewed-by: Jiang Xuexin
Reviewed-by: Ran Xiaokai
Reviewed-by: wangyong
Reported-by: kernel test robot
---
 Documentation/accounting/delay-accounting.rst |  5 +++-
 include/linux/delayacct.h                     | 28 +++++++++++++++++++
 include/uapi/linux/taskstats.h                |  6 +++-
 kernel/delayacct.c                            | 16 +++++++++++
 mm/hugetlb.c                                  |  7 +++++
 mm/memory.c                                   |  8 ++++++
 tools/accounting/getdelays.c                  |  8 +++++-
 7 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/Documentation/accounting/delay-accounting.rst b/Documentation/accounting/delay-accounting.rst
index 197fe319cbec..241d1a87f2cd 100644
--- a/Documentation/accounting/delay-accounting.rst
+++ b/Documentation/accounting/delay-accounting.rst
@@ -15,6 +15,7 @@ c) swapping in pages
 d) memory reclaim
 e) thrashing page cache
 f) direct compact
+g) write-protect copy
 
 and makes these statistics available to userspace through
 the taskstats interface.
@@ -48,7 +49,7 @@ this structure. See
 for a description of the fields pertaining to delay accounting.
 It will generally be in the form of counters returning the cumulative
 delay seen for cpu, sync block I/O, swapin, memory reclaim, thrash page
-cache, direct compact etc.
+cache, direct compact, write-protect copy etc.
 
 Taking the difference of two successive readings of a given
 counter (say cpu_delay_total) for a task will give the delay
@@ -117,6 +118,8 @@ Get sum of delays, since system boot, for all pids with tgid 5::
 			    0              0              0ms
 	COMPACT         count    delay total  delay average
 			    0              0              0ms
+	WPCOPY          count    delay total  delay average
+			    0              0              0ms
 
 Get IO accounting for pid 1, it works only with -p::
 
diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h
index 6b16a6930a19..58aea2d7385c 100644
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -45,9 +45,13 @@ struct task_delay_info {
 	u64 compact_start;
 	u64 compact_delay;	/* wait for memory compact */
 
+	u64 wpcopy_start;
+	u64 wpcopy_delay;	/* wait for write-protect copy */
+
 	u32 freepages_count;	/* total count of memory reclaim */
 	u32 thrashing_count;	/* total count of thrash waits */
 	u32 compact_count;	/* total count of memory compact */
+	u32 wpcopy_count;	/* total count of write-protect copy */
 };
 #endif
 
@@ -75,6 +79,8 @@ extern void __delayacct_swapin_start(void);
 extern void __delayacct_swapin_end(void);
 extern void __delayacct_compact_start(void);
 extern void __delayacct_compact_end(void);
+extern void __delayacct_wpcopy_start(void);
+extern void __delayacct_wpcopy_end(void);
 
 static inline void delayacct_tsk_init(struct task_struct *tsk)
 {
@@ -191,6 +197,24 @@ static inline void delayacct_compact_end(void)
 		__delayacct_compact_end();
 }
 
+static inline void delayacct_wpcopy_start(void)
+{
+	if (!static_branch_unlikely(&delayacct_key))
+		return;
+
+	if (current->delays)
+		__delayacct_wpcopy_start();
+}
+
+static inline void delayacct_wpcopy_end(void)
+{
+	if (!static_branch_unlikely(&delayacct_key))
+		return;
+
+	if (current->delays)
+		__delayacct_wpcopy_end();
+}
+
 #else
 static inline void delayacct_init(void)
 {}
@@ -225,6 +249,10 @@ static inline void delayacct_compact_start(void)
 {}
 static inline void delayacct_compact_end(void)
 {}
+static inline void delayacct_wpcopy_start(void)
+{}
+static inline void delayacct_wpcopy_end(void)
+{}
 
 #endif /* CONFIG_TASK_DELAY_ACCT */
 
diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h
index 736154171489..a7f5b11a8f1b 100644
--- a/include/uapi/linux/taskstats.h
+++ b/include/uapi/linux/taskstats.h
@@ -34,7 +34,7 @@
  */
 
 
-#define TASKSTATS_VERSION	12
+#define TASKSTATS_VERSION	13
 #define TS_COMM_LEN		32	/* should be >= TASK_COMM_LEN
 					 * in linux/sched.h */
 
@@ -194,6 +194,10 @@ struct taskstats {
 	__u64	ac_exe_dev;	/* program binary device ID */
 	__u64	ac_exe_inode;	/* program binary inode number */
 	/* v12 end */
+
+	/* v13: Delay waiting for write-protect copy */
+	__u64	wpcopy_count;
+	__u64	wpcopy_delay_total;
 };
 
 
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 2c1e18f7c5cf..164ed9ef77a3 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -177,11 +177,14 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
 	d->thrashing_delay_total = (tmp < d->thrashing_delay_total) ? 0 : tmp;
 	tmp = d->compact_delay_total + tsk->delays->compact_delay;
 	d->compact_delay_total = (tmp < d->compact_delay_total) ? 0 : tmp;
+	tmp = d->wpcopy_delay_total + tsk->delays->wpcopy_delay;
+	d->wpcopy_delay_total = (tmp < d->wpcopy_delay_total) ? 0 : tmp;
 	d->blkio_count += tsk->delays->blkio_count;
 	d->swapin_count += tsk->delays->swapin_count;
 	d->freepages_count += tsk->delays->freepages_count;
 	d->thrashing_count += tsk->delays->thrashing_count;
 	d->compact_count += tsk->delays->compact_count;
+	d->wpcopy_count += tsk->delays->wpcopy_count;
 	raw_spin_unlock_irqrestore(&tsk->delays->lock, flags);
 
 	return 0;
@@ -249,3 +252,16 @@ void __delayacct_compact_end(void)
 		      &current->delays->compact_delay,
 		      &current->delays->compact_count);
 }
+
+void __delayacct_wpcopy_start(void)
+{
+	current->delays->wpcopy_start = local_clock();
+}
+
+void __delayacct_wpcopy_end(void)
+{
+	delayacct_end(&current->delays->lock,
+		      &current->delays->wpcopy_start,
+		      &current->delays->wpcopy_delay,
+		      &current->delays->wpcopy_count);
+}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fb5a549169ce..b131d44741dd 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5173,6 +5173,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	pte = huge_ptep_get(ptep);
 	old_page = pte_page(pte);
 
+	delayacct_wpcopy_start();
+
 retry_avoidcopy:
 	/*
 	 * If no-one else is actually using this page, we're the exclusive
@@ -5183,6 +5185,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 			page_move_anon_rmap(old_page, vma);
 		if (likely(!unshare))
 			set_huge_ptep_writable(vma, haddr, ptep);
+
+		delayacct_wpcopy_end();
 		return 0;
 	}
 	VM_BUG_ON_PAGE(PageAnon(old_page) && PageAnonExclusive(old_page),
@@ -5252,6 +5256,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 			 * race occurs while re-acquiring page table
 			 * lock, and our job is done.
 			 */
+			delayacct_wpcopy_end();
 			return 0;
 		}
 
@@ -5310,6 +5315,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	put_page(old_page);
 
 	spin_lock(ptl); /* Caller expects lock to be held */
+
+	delayacct_wpcopy_end();
 	return ret;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index a82bf21be5e3..aad64c51c175 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3008,6 +3008,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	int page_copied = 0;
 	struct mmu_notifier_range range;
 
+	delayacct_wpcopy_start();
+
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
 
@@ -3032,6 +3034,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 			put_page(new_page);
 			if (old_page)
 				put_page(old_page);
+
+			delayacct_wpcopy_end();
 			return 0;
 		}
 	}
@@ -3138,12 +3142,16 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 			free_swap_cache(old_page);
 		put_page(old_page);
 	}
+
+	delayacct_wpcopy_end();
 	return page_copied && !unshare ? VM_FAULT_WRITE : 0;
 oom_free_new:
 	put_page(new_page);
 oom:
 	if (old_page)
 		put_page(old_page);
+
+	delayacct_wpcopy_end();
 	return VM_FAULT_OOM;
 }
 
diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
index 11e86739456d..e83e6e47a21e 100644
--- a/tools/accounting/getdelays.c
+++ b/tools/accounting/getdelays.c
@@ -207,6 +207,8 @@ static void print_delayacct(struct taskstats *t)
 	       "THRASHING%12s%15s%15s\n"
 	       "      %15llu%15llu%15llums\n"
 	       "COMPACT  %12s%15s%15s\n"
+	       "      %15llu%15llu%15llums\n"
+	       "WPCOPY   %12s%15s%15s\n"
 	       "      %15llu%15llu%15llums\n",
 	       "count", "real total", "virtual total",
 	       "delay total", "delay average",
@@ -234,7 +236,11 @@ static void print_delayacct(struct taskstats *t)
 	       "count", "delay total", "delay average",
 	       (unsigned long long)t->compact_count,
 	       (unsigned long long)t->compact_delay_total,
-	       average_ms(t->compact_delay_total, t->compact_count));
+	       average_ms(t->compact_delay_total, t->compact_count),
+	       "count", "delay total", "delay average",
+	       (unsigned long long)t->wpcopy_count,
+	       (unsigned long long)t->wpcopy_delay_total,
+	       average_ms(t->wpcopy_delay_total, t->wpcopy_count));
 }
 
 static void task_context_switch_counts(struct taskstats *t)
-- 
2.25.1