From nobody Tue Dec 30 16:38:18 2025
Message-ID: <20231113233502.563575851@redhat.com>
Date: Mon, 13 Nov 2023 20:34:21 -0300
From: Marcelo Tosatti
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Michal Hocko, Vlastimil Babka, Andrew Morton, David Hildenbrand,
 Peter Xu, Marcelo Tosatti
Subject: [patch 1/2] mm: vmstat: introduce node_page_state_pages_snapshot
References: <20231113233420.446465795@redhat.com>

Introduce _snapshot variants of node_page_state and
node_page_state_pages, similar to zone_page_state_snapshot. To be used
by the next patch.

Signed-off-by: Marcelo Tosatti

---
 include/linux/vmstat.h |    4 ++++
 mm/vmstat.c            |   28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

Index: linux/mm/vmstat.c
===================================================================
--- linux.orig/mm/vmstat.c
+++ linux/mm/vmstat.c
@@ -1031,6 +1031,34 @@ unsigned long node_page_state(struct pgl
 
 	return node_page_state_pages(pgdat, item);
 }
+
+/*
+ * Determine the per node value of a stat item, snapshot version
+ * (see comment on top zone_page_state_snapshot).
+ */
+unsigned long node_page_state_pages_snapshot(struct pglist_data *pgdat,
+					     enum node_stat_item item)
+{
+	long x = atomic_long_read(&pgdat->vm_stat[item]);
+#ifdef CONFIG_SMP
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		x += per_cpu_ptr(pgdat->per_cpu_nodestats, cpu)->vm_node_stat_diff[item];
+
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
+}
+
+unsigned long node_page_state_snapshot(struct pglist_data *pgdat,
+				       enum node_stat_item item)
+{
+	VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
+
+	return node_page_state_pages_snapshot(pgdat, item);
+}
 #endif
 
 #ifdef CONFIG_COMPACTION
Index: linux/include/linux/vmstat.h
===================================================================
--- linux.orig/include/linux/vmstat.h
+++ linux/include/linux/vmstat.h
@@ -262,6 +262,10 @@ extern unsigned long node_page_state(str
 					enum node_stat_item item);
 extern unsigned long node_page_state_pages(struct pglist_data *pgdat,
 					   enum node_stat_item item);
+extern unsigned long node_page_state_snapshot(struct pglist_data *pgdat,
+					      enum node_stat_item item);
+extern unsigned long node_page_state_pages_snapshot(struct pglist_data *pgdat,
+						    enum node_stat_item item);
 extern void fold_vm_numa_events(void);
 #else
 #define sum_zone_node_page_state(node, item) global_zone_page_state(item)

From nobody Tue Dec 30 16:38:18 2025
Message-ID: <20231113233502.587879658@redhat.com>
Date: Mon, 13 Nov 2023 20:34:22 -0300
From: Marcelo Tosatti
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Michal Hocko, Vlastimil Babka, Andrew Morton, David Hildenbrand,
 Peter Xu, Marcelo Tosatti
Subject: [patch 2/2] mm: vmstat: use node_page_state_snapshot in too_many_isolated
References: <20231113233420.446465795@redhat.com>

A customer reported seeing processes hung at too_many_isolated, while
analysis indicated that the problem occurred due to out of sync per-CPU
stats (see below).

Fix is to use node_page_state_snapshot to avoid the stale values.

2136 static unsigned long
2137 shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
2138                      struct scan_control *sc, enum lru_list lru)
2139 {
:
2145         bool file = is_file_lru(lru);
:
2147         struct pglist_data *pgdat = lruvec_pgdat(lruvec);
:
2150         while (unlikely(too_many_isolated(pgdat, file, sc))) {
2151                 if (stalled)
2152                         return 0;
2153
2154                 /* wait a bit for the reclaimer. */
2155                 msleep(100);   <--- some processes were sleeping here, with pending SIGKILL.
2156                 stalled = true;
2157
2158                 /* We are about to die and free our memory. Return now. */
2159                 if (fatal_signal_pending(current))
2160                         return SWAP_CLUSTER_MAX;
2161         }

msleep() is only reached when too_many_isolated() reports too many
isolated pages:

2019 static int too_many_isolated(struct pglist_data *pgdat, int file,
2020                              struct scan_control *sc)
2021 {
:
2030         if (file) {
2031                 inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
2032                 isolated = node_page_state(pgdat, NR_ISOLATED_FILE);
2033         } else {
:
2046         return isolated > inactive;

The return value was true since:

crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_INACTIVE_FILE]
$8 = {
  counter = 1
}
crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_ISOLATED_FILE]
$9 = {
  counter = 2
}

while per_cpu stats had:

crash> p ((struct pglist_data *) 0xffff00817fffe580)->per_cpu_nodestats
$85 = (struct per_cpu_nodestat *) 0xffff8000118832e0

crash> p/x 0xffff8000118832e0 + __per_cpu_offset[42]
$86 = 0xffff00917fcc32e0
crash> p ((struct per_cpu_nodestat *) 0xffff00917fcc32e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
$87 = -1 '\377'

crash> p/x 0xffff8000118832e0 + __per_cpu_offset[44]
$89 = 0xffff00917fe032e0
crash> p ((struct per_cpu_nodestat *) 0xffff00917fe032e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
$91 = -1 '\377'

Folding just these two pending -1 deltas into the global counter gives
2 - 1 - 1 = 0 isolated file pages, so isolated > inactive would no
longer hold.

It seems that processes were trapped in a direct reclaim/compaction loop
because these nodes had free pages at or below the min watermark.

crash> kmem -z | grep -A 3 Normal
:
NODE: 4  ZONE: 1  ADDR: ffff00817fffec40  NAME: "Normal"
  SIZE: 8454144  PRESENT: 98304  MIN/LOW/HIGH: 68/166/264
  VM_STAT:
    NR_FREE_PAGES: 68
--
NODE: 5  ZONE: 1  ADDR: ffff00897fffec40  NAME: "Normal"
  SIZE: 118784  MIN/LOW/HIGH: 82/200/318
  VM_STAT:
    NR_FREE_PAGES: 45
--
NODE: 6  ZONE: 1  ADDR: ffff00917fffec40  NAME: "Normal"
  SIZE: 118784  MIN/LOW/HIGH: 82/200/318
  VM_STAT:
    NR_FREE_PAGES: 53
--
NODE: 7  ZONE: 1  ADDR: ffff00997fbbec40  NAME: "Normal"
  SIZE: 118784  MIN/LOW/HIGH: 82/200/318
  VM_STAT:
    NR_FREE_PAGES: 52

Signed-off-by: Marcelo Tosatti

---
 mm/compaction.c |    6 +++---
 mm/vmscan.c     |    8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

Index: linux/mm/compaction.c
===================================================================
--- linux.orig/mm/compaction.c
+++ linux/mm/compaction.c
@@ -791,11 +791,11 @@ static bool too_many_isolated(struct com
 
 	unsigned long active, inactive, isolated;
 
-	inactive = node_page_state(pgdat, NR_INACTIVE_FILE) +
+	inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_FILE) +
 			node_page_state(pgdat, NR_INACTIVE_ANON);
-	active = node_page_state(pgdat, NR_ACTIVE_FILE) +
+	active = node_page_state_snapshot(pgdat, NR_ACTIVE_FILE) +
 			node_page_state(pgdat, NR_ACTIVE_ANON);
-	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
+	isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_FILE) +
 			node_page_state(pgdat, NR_ISOLATED_ANON);
 
 	/*
Index: linux/mm/vmscan.c
===================================================================
--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1756,11 +1756,11 @@ static int too_many_isolated(struct pgli
 		return 0;
 
 	if (file) {
-		inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
-		isolated = node_page_state(pgdat, NR_ISOLATED_FILE);
+		inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_FILE);
+		isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_FILE);
 	} else {
-		inactive = node_page_state(pgdat, NR_INACTIVE_ANON);
-		isolated = node_page_state(pgdat, NR_ISOLATED_ANON);
+		inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_ANON);
+		isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_ANON);
 	}
 
 	/*
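
For readers who want to reproduce the stale-counter effect outside the
kernel, here is a minimal user-space sketch. It is not kernel code: every
name in it (model_counter, model_mod, model_read, model_snapshot,
MODEL_NR_CPUS, MODEL_THRESHOLD) is hypothetical, and the fixed threshold
of 2 is an assumption chosen only to keep the example small. It models a
global counter plus per-CPU deltas, and compares a global-only read with
a read that also sums the pending deltas, using the values from the
crash output above (global isolated = 2, two CPUs each holding -1).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MODEL_NR_CPUS   4
#define MODEL_THRESHOLD 2	/* assumed: fold a delta once |delta| exceeds this */

struct model_counter {
	long global;				/* stands in for vm_stat[item] */
	signed char diff[MODEL_NR_CPUS];	/* stands in for vm_node_stat_diff[item] */
};

/* Update from one CPU: stays local until the delta crosses the threshold. */
static void model_mod(struct model_counter *c, int cpu, int delta)
{
	int x = c->diff[cpu] + delta;

	if (x > MODEL_THRESHOLD || x < -MODEL_THRESHOLD) {
		c->global += x;
		x = 0;
	}
	c->diff[cpu] = (signed char)x;
}

/* Cheap read: global part only, clamped at zero. May be stale. */
static unsigned long model_read(const struct model_counter *c)
{
	long x = c->global;

	return x < 0 ? 0 : (unsigned long)x;
}

/* Snapshot read: global part plus every pending per-CPU delta. */
static unsigned long model_snapshot(const struct model_counter *c)
{
	long x = c->global;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += c->diff[cpu];

	return x < 0 ? 0 : (unsigned long)x;
}

/* Same comparison as the quoted too_many_isolated() excerpt. */
static bool model_too_many_isolated(unsigned long isolated,
				    unsigned long inactive)
{
	return isolated > inactive;
}

/* Replays the reported state and a small threshold example; returns 0. */
int model_self_check(void)
{
	struct model_counter isolated = { .global = 2, .diff = { -1, -1, 0, 0 } };
	struct model_counter inactive = { .global = 1 };
	struct model_counter c = { 0 };

	/* Global-only reads see 2 > 1 and would keep the caller sleeping. */
	assert(model_too_many_isolated(model_read(&isolated),
				       model_read(&inactive)));
	/* Snapshot reads see 0 > 1, which is false. */
	assert(!model_too_many_isolated(model_snapshot(&isolated),
					model_snapshot(&inactive)));

	/* Two increments stay in the per-CPU delta; the third folds them. */
	model_mod(&c, 0, 1);
	model_mod(&c, 0, 1);
	assert(model_read(&c) == 0 && model_snapshot(&c) == 2);
	model_mod(&c, 0, 1);
	assert(model_read(&c) == 3 && model_snapshot(&c) == 3);
	return 0;
}
```

The snapshot read costs one pass over all CPUs, which is why it is only
worth paying on slow paths such as the too_many_isolated() check rather
than on every counter read.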