From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95F16C27C40 for ; Wed, 22 Nov 2023 23:25:32 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344498AbjKVXZd (ORCPT ); Wed, 22 Nov 2023 18:25:33 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232389AbjKVXZb (ORCPT ); Wed, 22 Nov 2023 18:25:31 -0500
Received: from out-186.mta0.migadu.com (out-186.mta0.migadu.com [91.218.175.186]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F06A11F for ; Wed, 22 Nov 2023 15:25:28 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695526; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YRuFFmQdC+GEPgfI4SIAkxSEqgU4CuzXncjpbXVOgmA=; b=OYAGGmY34QArkmiHjXDebJ27PVWXyHMvnQZdKucHfTgjB1wcWYWZBFp8TmHRA8jCBIi8v4 yvW1Y2FRd872H0c4q3qoiHR5jVSk8jX1HncoAHfmzh1blP3uywzK+6Ze2zyzcQKJnohAOI ZqfJdF5DZ+e1jm6U7nga+mewJb4vhXs=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet
Subject: [PATCH 1/7] seq_buf: seq_buf_human_readable_u64()
Date: Wed, 22 Nov 2023 18:25:06 -0500
Message-ID: <20231122232515.177833-2-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List:
linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This adds a seq_buf wrapper for string_get_size().

Signed-off-by: Kent Overstreet
---
 include/linux/seq_buf.h |  4 ++++
 lib/seq_buf.c           | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/seq_buf.h b/include/linux/seq_buf.h
index 5fb1f12c33f9..dfcd0f367d6a 100644
--- a/include/linux/seq_buf.h
+++ b/include/linux/seq_buf.h
@@ -171,4 +171,8 @@ seq_buf_bprintf(struct seq_buf *s, const char *fmt, const u32 *binary);
 
 void seq_buf_do_printk(struct seq_buf *s, const char *lvl);
 
+enum string_size_units;
+void seq_buf_human_readable_u64(struct seq_buf *s, u64 v,
+				const enum string_size_units units);
+
 #endif /* _LINUX_SEQ_BUF_H */
diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 010c730ca7fc..9d4e4d5f43b4 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -425,3 +425,13 @@ int seq_buf_hex_dump(struct seq_buf *s, const char *prefix_str, int prefix_type,
 	}
 	return 0;
 }
+
+void seq_buf_human_readable_u64(struct seq_buf *s, u64 v, const enum string_size_units units)
+{
+	char *buf;
+	size_t size = seq_buf_get_buf(s, &buf);
+	int wrote = string_get_size(v, 1, units, buf, size);
+
+	seq_buf_commit(s, wrote);
+}
+EXPORT_SYMBOL(seq_buf_human_readable_u64);
-- 
2.42.0

From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73315C61DAB for ; Wed, 22 Nov 2023 23:25:34 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344562AbjKVXZg (ORCPT ); Wed, 22 Nov 2023 18:25:36 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229453AbjKVXZc (ORCPT ); Wed, 22 Nov 2023 18:25:32 -0500
Received: from out-178.mta0.migadu.com
(out-178.mta0.migadu.com [IPv6:2001:41d0:1004:224b::b2]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6A67199 for ; Wed, 22 Nov 2023 15:25:28 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695527; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ce6mQdeWYA7JZiZSelXeyRte1bT790UpclDUySrWbbI=; b=RwDGCsBxUOAeC14xj7ENLjAvR8IPMtphog88OnO0JhzfsIcs+mraOc5OHgWapTIGcLSRsp Eu6N2oJ5sNyByGKgzowShHq/oDCtx/SHNWZszz3Jq5vB9+XfTDyzs5kpuj8NlrxAXhVCaG bk8karUNujFdsVJa4kR7ZHnGQ3NZheM=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet , Andrew Morton , Qi Zheng , Roman Gushchin
Subject: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
Date: Wed, 22 Nov 2023 18:25:07 -0500
Message-ID: <20231122232515.177833-3-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This adds a new callback method to shrinkers which they can use to
describe anything relevant to memory reclaim about their internal state,
for example object dirtiness.

This patch also adds shrinkers_to_text(), which reports on the top 10
shrinkers - by object count - in sorted order, to be used in OOM
reporting.
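
For readers following along outside the kernel tree, the bounded top-N
selection that shrinkers_to_text() performs can be sketched in userspace
C. This is an illustration only and not part of the patch: struct entry
and top_n_insert() are hypothetical names. The array is kept sorted in
ascending order, so a report can walk it from the last element back to
the first to print the largest entries first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "a shrinker and its object count". */
struct entry {
	const char *name;
	unsigned long count;
};

/*
 * Offer e to top[0..*nr), an array sorted ascending by count that holds
 * at most cap entries. Once the array is full, the smallest entry is
 * dropped to make room, or e is ignored if it is smaller than everything
 * already kept.
 */
void top_n_insert(struct entry *top, size_t *nr, size_t cap, struct entry e)
{
	size_t i;

	assert(cap > 0 && *nr <= cap);

	/* Find the first slot whose count is larger than the new one. */
	for (i = 0; i < *nr; i++)
		if (e.count < top[i].count)
			break;

	if (*nr < cap) {
		/* Room left: shift the larger entries up by one. */
		memmove(&top[i + 1], &top[i], sizeof(top[0]) * (*nr - i));
		(*nr)++;
	} else if (i) {
		/* Full: drop top[0] and shift the smaller entries down. */
		i--;
		memmove(&top[0], &top[1], sizeof(top[0]) * i);
	} else {
		/* Full and e is smaller than every kept entry. */
		return;
	}

	top[i] = e;
}
```

Each insertion costs at most one pass over the small array and needs no
allocation, which is a reasonable fit for a reporting path that may run
under memory pressure.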
Cc: Andrew Morton
Cc: Qi Zheng
Cc: Roman Gushchin
Cc: linux-mm@kvack.org
Signed-off-by: Kent Overstreet
---
 include/linux/shrinker.h |  6 +++-
 mm/shrinker.c            | 73 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 1a00be90d93a..968c55474e78 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -24,6 +24,8 @@ struct shrinker_info {
 	struct shrinker_info_unit *unit[];
 };
 
+struct seq_buf;
+
 /*
  * This struct is used to pass information from page reclaim to the shrinkers.
  * We consolidate the values for easier extension later.
@@ -80,10 +82,12 @@ struct shrink_control {
  * @flags determine the shrinker abilities, like numa awareness
  */
 struct shrinker {
+	const char *name;
 	unsigned long (*count_objects)(struct shrinker *,
 				       struct shrink_control *sc);
 	unsigned long (*scan_objects)(struct shrinker *,
 				      struct shrink_control *sc);
+	void (*to_text)(struct seq_buf *, struct shrinker *);
 
 	long batch; /* reclaim batch size, 0 = default */
 	int seeks; /* seeks to recreate an obj */
@@ -110,7 +114,6 @@ struct shrinker {
 #endif
 #ifdef CONFIG_SHRINKER_DEBUG
 	int debugfs_id;
-	const char *name;
 	struct dentry *debugfs_entry;
 #endif
 	/* objs pending delete, per node */
@@ -135,6 +138,7 @@ __printf(2, 3)
 struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...);
 void shrinker_register(struct shrinker *shrinker);
 void shrinker_free(struct shrinker *shrinker);
+void shrinkers_to_text(struct seq_buf *);
 
 static inline bool shrinker_try_get(struct shrinker *shrinker)
 {
diff --git a/mm/shrinker.c b/mm/shrinker.c
index dd91eab43ed3..4976dbac4c83 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -1,8 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
+#include
 #include
+#include
 #include
-#include
 #include
 
 #include "internal.h"
@@ -807,3 +808,73 @@ void shrinker_free(struct shrinker *shrinker)
 	call_rcu(&shrinker->rcu,
shrinker_free_rcu_cb);
 }
 EXPORT_SYMBOL_GPL(shrinker_free);
+
+void shrinker_to_text(struct seq_buf *out, struct shrinker *shrinker)
+{
+	struct shrink_control sc = { .gfp_mask = GFP_KERNEL, };
+
+	seq_buf_puts(out, shrinker->name);
+	seq_buf_printf(out, " objects: %lu\n", shrinker->count_objects(shrinker, &sc));
+
+	if (shrinker->to_text) {
+		shrinker->to_text(out, shrinker);
+		seq_buf_puts(out, "\n");
+	}
+}
+
+/**
+ * shrinkers_to_text - Report on shrinkers with highest usage
+ *
+ * This reports on the top 10 shrinkers, by object counts, in sorted order:
+ * intended to be used for OOM reporting.
+ */
+void shrinkers_to_text(struct seq_buf *out)
+{
+	struct shrinker *shrinker;
+	struct shrinker_by_mem {
+		struct shrinker *shrinker;
+		unsigned long mem;
+	} shrinkers_by_mem[10];
+	int i, nr = 0;
+
+	if (!mutex_trylock(&shrinker_mutex)) {
+		seq_buf_puts(out, "(couldn't take shrinker lock)");
+		return;
+	}
+
+	list_for_each_entry(shrinker, &shrinker_list, list) {
+		struct shrink_control sc = { .gfp_mask = GFP_KERNEL, };
+		unsigned long mem = shrinker->count_objects(shrinker, &sc);
+
+		if (!mem || mem == SHRINK_STOP || mem == SHRINK_EMPTY)
+			continue;
+
+		for (i = 0; i < nr; i++)
+			if (mem < shrinkers_by_mem[i].mem)
+				break;
+
+		if (nr < ARRAY_SIZE(shrinkers_by_mem)) {
+			memmove(&shrinkers_by_mem[i + 1],
+				&shrinkers_by_mem[i],
+				sizeof(shrinkers_by_mem[0]) * (nr - i));
+			nr++;
+		} else if (i) {
+			i--;
+			memmove(&shrinkers_by_mem[0],
+				&shrinkers_by_mem[1],
+				sizeof(shrinkers_by_mem[0]) * i);
+		} else {
+			continue;
+		}
+
+		shrinkers_by_mem[i] = (struct shrinker_by_mem) {
+			.shrinker = shrinker,
+			.mem = mem,
+		};
+	}
+
+	for (i = nr - 1; i >= 0; --i)
+		shrinker_to_text(out, shrinkers_by_mem[i].shrinker);
+
+	mutex_unlock(&shrinker_mutex);
+}
-- 
2.42.0

From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15B1EC61D97 for ; Wed, 22 Nov 2023 23:25:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344600AbjKVXZj (ORCPT ); Wed, 22 Nov 2023 18:25:39 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344409AbjKVXZd (ORCPT ); Wed, 22 Nov 2023 18:25:33 -0500
Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [IPv6:2001:41d0:1004:224b::ac]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B610D110 for ; Wed, 22 Nov 2023 15:25:29 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695528; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=h7yT/gNa8WVPOQ61JKqXqNgHJRrqgXjHTIF9eRajh2g=; b=O6Wa1Pq7Futp9HUKsVGvdg+Z5w+n3MO/N9v0rDnM9VMU80eqJlKtDs5MCdlB/RLuq3zB75 DFQ+Bbi5//mR819xx4Wrz6FRF25hy8ONIL+PG+xyufurdr4QS6+jDEsypsEnexxM7IRVd+ TH/mv9J3tjkNle2GRCxH+jXEE/7wAlg=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet , Andrew Morton , Qi Zheng , Roman Gushchin
Subject: [PATCH 3/7] mm: shrinker: Add new stats for .to_text()
Date: Wed, 22 Nov 2023 18:25:08 -0500
Message-ID: <20231122232515.177833-4-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Add a few new shrinker stats.

number of objects requested to free, number of objects freed:
  Shrinkers won't necessarily free all objects requested for a variety
  of reasons, but if the two counts are wildly different something is
  likely amiss.

.scan_objects runtime:
  If one shrinker is taking an excessive amount of time to free
  objects that will block kswapd from running other shrinkers.

Cc: Andrew Morton
Cc: Qi Zheng
Cc: Roman Gushchin
Cc: linux-mm@kvack.org
Signed-off-by: Kent Overstreet
---
 include/linux/shrinker.h |  5 +++++
 mm/shrinker.c            | 18 +++++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 968c55474e78..497a7e8e348c 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -118,6 +118,11 @@ struct shrinker {
 #endif
 	/* objs pending delete, per node */
 	atomic_long_t *nr_deferred;
+
+	atomic_long_t objects_requested_to_free;
+	atomic_long_t objects_freed;
+	atomic_long_t last_freed;
+	atomic64_t ns_run;
 };
 #define DEFAULT_SEEKS 2 /* A good number if you don't know better.
*/
 
diff --git a/mm/shrinker.c b/mm/shrinker.c
index 4976dbac4c83..acfc3f92f552 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -430,13 +430,20 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	       total_scan >= freeable) {
 		unsigned long ret;
 		unsigned long nr_to_scan = min(batch_size, total_scan);
+		u64 start_time = ktime_get_ns();
+
+		atomic_long_add(nr_to_scan, &shrinker->objects_requested_to_free);
 
 		shrinkctl->nr_to_scan = nr_to_scan;
 		shrinkctl->nr_scanned = nr_to_scan;
 		ret = shrinker->scan_objects(shrinker, shrinkctl);
+
+		atomic64_add(ktime_get_ns() - start_time, &shrinker->ns_run);
 		if (ret == SHRINK_STOP)
 			break;
 		freed += ret;
+		atomic_long_add(ret, &shrinker->objects_freed);
+		atomic_long_set(&shrinker->last_freed, ret);
 
 		count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned);
 		total_scan -= shrinkctl->nr_scanned;
@@ -812,9 +819,18 @@ EXPORT_SYMBOL_GPL(shrinker_free);
 void shrinker_to_text(struct seq_buf *out, struct shrinker *shrinker)
 {
 	struct shrink_control sc = { .gfp_mask = GFP_KERNEL, };
+	unsigned long nr_freed = atomic_long_read(&shrinker->objects_freed);
 
 	seq_buf_puts(out, shrinker->name);
-	seq_buf_printf(out, " objects: %lu\n", shrinker->count_objects(shrinker, &sc));
+	seq_buf_putc(out, '\n');
+
+	seq_buf_printf(out, "objects: %lu", shrinker->count_objects(shrinker, &sc));
+	seq_buf_printf(out, "requested to free: %lu", atomic_long_read(&shrinker->objects_requested_to_free));
+	seq_buf_printf(out, "objects freed: %lu", nr_freed);
+	seq_buf_printf(out, "last freed: %lu", atomic_long_read(&shrinker->last_freed));
+	seq_buf_printf(out, "ns per object freed: %llu", nr_freed
+		       ?
div64_ul(atomic64_read(&shrinker->ns_run), nr_freed)
+		       : 0);
 
 	if (shrinker->to_text) {
 		shrinker->to_text(out, shrinker);
-- 
2.42.0

From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA644C61D97 for ; Wed, 22 Nov 2023 23:25:46 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344696AbjKVXZr (ORCPT ); Wed, 22 Nov 2023 18:25:47 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344575AbjKVXZg (ORCPT ); Wed, 22 Nov 2023 18:25:36 -0500
Received: from out-175.mta0.migadu.com (out-175.mta0.migadu.com [91.218.175.175]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C180A199 for ; Wed, 22 Nov 2023 15:25:30 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RfcqUFguaDDZw+gf+d0LSy3BejslhVkAhwRYXLWAYww=; b=KVh3wxKIo4UY3M0nKsfWud9MNgWoL2E3pERfHvOTCAkXnvsURoSqAEjUmPalhXUSNLrOps qL0Vq27zCcEV27nYL27phxtsnLF4sUXNJKSd1c90XA5LM5BNTphc9LQscyej8ZjxlFAQtj zdogso5LeCN9DCAJz2BtPcz5+k96adw=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet , Andrew Morton , Qi Zheng , Roman Gushchin
Subject: [PATCH 4/7] mm: Centralize & improve oom reporting in show_mem.c
Date: Wed, 22 Nov 2023 18:25:09 -0500
Message-ID: <20231122232515.177833-5-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This patch:
 - Changes show_mem() to always report on slab usage
 - Instead of reporting on all slabs, we only report on top 10 slabs,
   and in sorted order
 - Also reports on shrinkers, with the new shrinkers_to_text().
   Shrinkers need to be included in OOM/allocation failure reporting
   because they're responsible for memory reclaim - if a shrinker isn't
   giving up its memory, we need to know which one and why.

More OOM reporting can be moved to show_mem.c and improved, this patch
is only a start.

New example output on OOM/memory allocation failure:

00177 Mem-Info:
00177 active_anon:13706 inactive_anon:32266 isolated_anon:16
00177 active_file:1653 inactive_file:1822 isolated_file:0
00177 unevictable:0 dirty:0 writeback:0
00177 slab_reclaimable:6242 slab_unreclaimable:11168
00177 mapped:3824 shmem:3 pagetables:1266 bounce:0
00177 kernel_misc_reclaimable:0
00177 free:4362 free_pcp:35 free_cma:0
00177 Node 0 active_anon:54824kB inactive_anon:129064kB active_file:6612kB inactive_file:7288kB unevictable:0kB isolated(anon):64kB isolated(file):0kB mapped:15296kB dirty:0kB writeback:0kB shmem:12kB writeback_tmp:0kB kernel_stack:3392kB pagetables:5064kB all_unreclaimable? no
00177 DMA free:2232kB boost:0kB min:88kB low:108kB high:128kB reserved_highatomic:0KB active_anon:2924kB inactive_anon:6596kB active_file:428kB inactive_file:384kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
00177 lowmem_reserve[]: 0 426 426 426
00177 DMA32 free:15092kB boost:5836kB min:8432kB low:9080kB high:9728kB reserved_highatomic:0KB active_anon:52196kB inactive_anon:122392kB active_file:6176kB inactive_file:7068kB unevictable:0kB writepending:0kB present:507760kB managed:441816kB mlocked:0kB bounce:0kB free_pcp:72kB local_pcp:0kB free_cma:0kB
00177 lowmem_reserve[]: 0 0 0 0
00177 DMA: 284*4kB (UM) 53*8kB (UM) 21*16kB (U) 11*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2248kB
00177 DMA32: 2765*4kB (UME) 375*8kB (UME) 57*16kB (UM) 5*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15132kB
00177 4656 total pagecache pages
00177 1031 pages in swap cache
00177 Swap cache stats: add 6572399, delete 6572173, find 488603/3286476
00177 Free swap = 509112kB
00177 Total swap = 2097148kB
00177 130938 pages RAM
00177 0 pages HighMem/MovableOnly
00177 16644 pages reserved
00177 Unreclaimable slab info:
00177 9p-fcall-cache    total: 8.25 MiB active:
8.25 MiB
00177 kernfs_node_cache total: 2.15 MiB active: 2.15 MiB
00177 kmalloc-64        total: 2.08 MiB active: 2.07 MiB
00177 task_struct       total: 1.95 MiB active: 1.95 MiB
00177 kmalloc-4k        total: 1.50 MiB active: 1.50 MiB
00177 signal_cache      total: 1.34 MiB active: 1.34 MiB
00177 kmalloc-2k        total: 1.16 MiB active: 1.16 MiB
00177 bch_inode_info    total: 1.02 MiB active: 922 KiB
00177 perf_event        total: 1.02 MiB active: 1.02 MiB
00177 biovec-max        total: 992 KiB active: 960 KiB
00177 Shrinkers:
00177 super_cache_scan: objects: 127
00177 super_cache_scan: objects: 106
00177 jbd2_journal_shrink_scan: objects: 32
00177 ext4_es_scan: objects: 32
00177 bch2_btree_cache_scan: objects: 8
00177 nr nodes: 24
00177 nr dirty: 0
00177 cannibalize lock: 0000000000000000
00177
00177 super_cache_scan: objects: 8
00177 super_cache_scan: objects: 1

Cc: Andrew Morton
Cc: Qi Zheng
Cc: Roman Gushchin
Cc: linux-mm@kvack.org
Signed-off-by: Kent Overstreet
---
 mm/oom_kill.c    | 23 ---------------------
 mm/show_mem.c    | 20 +++++++++++++++++++
 mm/slab.h        |  6 ++++--
 mm/slab_common.c | 52 +++++++++++++++++++++++++++++++++++++++---------
 4 files changed, 67 insertions(+), 34 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e6071fde34a..4b825a2b353f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -168,27 +168,6 @@ static bool oom_unkillable_task(struct task_struct *p)
 	return false;
 }
 
-/*
- * Check whether unreclaimable slab amount is greater than
- * all user memory(LRU pages).
- * dump_unreclaimable_slab() could help in the case that
- * oom due to too much unreclaimable slab used by kernel.
-*/
-static bool should_dump_unreclaim_slab(void)
-{
-	unsigned long nr_lru;
-
-	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
-		 global_node_page_state(NR_INACTIVE_ANON) +
-		 global_node_page_state(NR_ACTIVE_FILE) +
-		 global_node_page_state(NR_INACTIVE_FILE) +
-		 global_node_page_state(NR_ISOLATED_ANON) +
-		 global_node_page_state(NR_ISOLATED_FILE) +
-		 global_node_page_state(NR_UNEVICTABLE);
-
-	return (global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B) > nr_lru);
-}
-
 /**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
@@ -462,8 +441,6 @@ static void dump_header(struct oom_control *oc)
 		mem_cgroup_print_oom_meminfo(oc->memcg);
 	else {
 		__show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask, gfp_zone(oc->gfp_mask));
-		if (should_dump_unreclaim_slab())
-			dump_unreclaimable_slab();
 	}
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc);
diff --git a/mm/show_mem.c b/mm/show_mem.c
index ba0808d6917f..ab258ab1161c 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -12,10 +12,12 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
 #include "internal.h"
+#include "slab.h"
 #include "swap.h"
 
 atomic_long_t _totalram_pages __read_mostly;
@@ -401,6 +403,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
 {
 	unsigned long total = 0, reserved = 0, highmem = 0;
 	struct zone *zone;
+	char *buf;
 
 	printk("Mem-Info:\n");
 	show_free_areas(filter, nodemask, max_zone_idx);
@@ -423,4 +426,21 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
 #ifdef CONFIG_MEMORY_FAILURE
 	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
 #endif
+
+	buf = kmalloc(4096, GFP_ATOMIC);
+	if (buf) {
+		struct seq_buf s;
+
+		printk("Unreclaimable slab info:\n");
+		seq_buf_init(&s, buf, 4096);
+		dump_unreclaimable_slab(&s);
+		printk("%s", seq_buf_str(&s));
+
+		printk("Shrinkers:\n");
+		seq_buf_init(&s, buf, 4096);
+		shrinkers_to_text(&s);
+		
printk("%s", seq_buf_str(&s));
+
+		kfree(buf);
+	}
 }
diff --git a/mm/slab.h b/mm/slab.h
index 3d07fb428393..c16358a8424c 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -818,10 +818,12 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 		if ((__n = get_node(__s, __node)))
 
 
+struct seq_buf;
+
 #if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)
-void dump_unreclaimable_slab(void);
+void dump_unreclaimable_slab(struct seq_buf *);
 #else
-static inline void dump_unreclaimable_slab(void)
+static inline void dump_unreclaimable_slab(struct seq_buf *out)
 {
 }
 #endif
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8d431193c273..1eadff512422 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "internal.h"
@@ -1295,10 +1296,15 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-void dump_unreclaimable_slab(void)
+void dump_unreclaimable_slab(struct seq_buf *out)
 {
 	struct kmem_cache *s;
 	struct slabinfo sinfo;
+	struct slab_by_mem {
+		struct kmem_cache *s;
+		size_t total, active;
+	} slabs_by_mem[10], n;
+	int i, nr = 0;
 
 	/*
 	 * Here acquiring slab_mutex is risky since we don't prefer to get
@@ -1308,24 +1314,52 @@ void dump_unreclaimable_slab(void)
 	 * without acquiring the mutex.
	 */
 	if (!mutex_trylock(&slab_mutex)) {
-		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
+		seq_buf_puts(out, "excessive unreclaimable slab but cannot dump stats\n");
 		return;
 	}
 
-	pr_info("Unreclaimable slab info:\n");
-	pr_info("Name Used Total\n");
-
 	list_for_each_entry(s, &slab_caches, list) {
 		if (s->flags & SLAB_RECLAIM_ACCOUNT)
 			continue;
 
 		get_slabinfo(s, &sinfo);
 
-		if (sinfo.num_objs > 0)
-			pr_info("%-17s %10luKB %10luKB\n", s->name,
-				(sinfo.active_objs * s->size) / 1024,
-				(sinfo.num_objs * s->size) / 1024);
+		if (!sinfo.num_objs)
+			continue;
+
+		n.s = s;
+		n.total = sinfo.num_objs * s->size;
+		n.active = sinfo.active_objs * s->size;
+
+		for (i = 0; i < nr; i++)
+			if (n.total < slabs_by_mem[i].total)
+				break;
+
+		if (nr < ARRAY_SIZE(slabs_by_mem)) {
+			memmove(&slabs_by_mem[i + 1],
+				&slabs_by_mem[i],
+				sizeof(slabs_by_mem[0]) * (nr - i));
+			nr++;
+		} else if (i) {
+			i--;
+			memmove(&slabs_by_mem[0],
+				&slabs_by_mem[1],
+				sizeof(slabs_by_mem[0]) * i);
+		} else {
+			continue;
+		}
+
+		slabs_by_mem[i] = n;
+	}
+
+	for (i = nr - 1; i >= 0; --i) {
+		seq_buf_printf(out, "%-17s total: ", slabs_by_mem[i].s->name);
+		seq_buf_human_readable_u64(out, slabs_by_mem[i].total, STRING_UNITS_2);
+		seq_buf_printf(out, " active: ");
+		seq_buf_human_readable_u64(out, slabs_by_mem[i].active, STRING_UNITS_2);
+		seq_buf_putc(out, '\n');
 	}
+
 	mutex_unlock(&slab_mutex);
 }
 
-- 
2.42.0

From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5233C27C40 for ; Wed, 22 Nov 2023 23:25:43 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344613AbjKVXZp (ORCPT ); Wed, 22 Nov 2023 18:25:45 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57272 "EHLO lindbergh.monkeyblade.net"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344558AbjKVXZf (ORCPT ); Wed, 22 Nov 2023 18:25:35 -0500
Received: from out-170.mta0.migadu.com (out-170.mta0.migadu.com [91.218.175.170]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6286419D for ; Wed, 22 Nov 2023 15:25:31 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hhFyhh1rGVdN8KU7zA1YO/xWC9r61TLiUr+vYD/9YWg=; b=BHd1Y+nn87jfgs0JPV5W9mQnj/jbSljivlaWBWB01hjhj/kWczcqBvz5Z6L0HpvjZCtsmd /dVQdVskpae/TXe1lhJkyqwhQBEja9RxyeQj/qFWuwDRlz1zUC1oNymeFeLzQPPjOqHaw5 +p/p0qYljzIyMnmljGtQZZIcQd8+3JE=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet
Subject: [PATCH 5/7] mm: shrinker: Add shrinker_to_text() to debugfs interface
Date: Wed, 22 Nov 2023 18:25:10 -0500
Message-ID: <20231122232515.177833-6-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Previously, we added shrinker_to_text() and hooked it up to the OOM
report - now, the same report is available via debugfs.

Signed-off-by: Kent Overstreet
---
 include/linux/shrinker.h |  1 +
 mm/shrinker_debug.c      | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 497a7e8e348c..b8e57afabbcc 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -143,6 +143,7 @@ __printf(2, 3)
 struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...);
 void shrinker_register(struct shrinker *shrinker);
 void shrinker_free(struct shrinker *shrinker);
+void shrinker_to_text(struct seq_buf *, struct shrinker *);
 void shrinkers_to_text(struct seq_buf *);
 
 static inline bool shrinker_try_get(struct shrinker *shrinker)
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 12ea5486a3e9..39342aa9f4ca 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -2,6 +2,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -159,6 +160,21 @@ static const struct file_operations shrinker_debugfs_scan_fops = {
 	.write = shrinker_debugfs_scan_write,
 };
 
+static int shrinker_debugfs_report_show(struct seq_file *m, void *v)
+{
+	struct shrinker *shrinker = m->private;
+	char *bufp;
+	size_t buflen = seq_get_buf(m, &bufp);
+	struct seq_buf out;
+
+	seq_buf_init(&out, bufp, buflen);
+	shrinker_to_text(&out, shrinker);
+	seq_commit(m, seq_buf_used(&out));
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_report);
+
 int shrinker_debugfs_add(struct shrinker *shrinker)
 {
 	struct dentry *entry;
@@ -190,6 +206,8 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 			    &shrinker_debugfs_count_fops);
 	debugfs_create_file("scan", 0220, entry, shrinker,
 			    &shrinker_debugfs_scan_fops);
+	debugfs_create_file("report", 0440, entry, shrinker,
+			    &shrinker_debugfs_report_fops);
 	return 0;
 }
 
-- 
2.42.0

From nobody Wed Dec 17 19:01:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org
(vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C21AAC61D9C for ; Wed, 22 Nov 2023 23:25:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344409AbjKVXZo (ORCPT ); Wed, 22 Nov 2023 18:25:44 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57288 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344564AbjKVXZg (ORCPT ); Wed, 22 Nov 2023 18:25:36 -0500
Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [91.218.175.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D21251AB for ; Wed, 22 Nov 2023 15:25:31 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695530; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+xIe2DoE21qrt9LSNcAUwIwIwHsNrxaH9AQLMP04McM=; b=o3CvX/g1Mo37OAG87L/6Vu3aVXW2glLs+ZCCGUrP4H0SA8cPtcuDyYy22kVuLNkWtpIA9Q /HbvDCodA7ny8mu2ew1QagbLzcTglJHQRpbARFoW4RvMBKCAQNlLos+/qEMrNpU12NLKky vaSlk3dtsvcNkFHaJ0XGnT1dL407Frk=
From: Kent Overstreet
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Kent Overstreet
Subject: [PATCH 6/7] bcachefs: shrinker.to_text() methods
Date: Wed, 22 Nov 2023 18:25:11 -0500
Message-ID: <20231122232515.177833-7-kent.overstreet@linux.dev>
In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev>
References: <20231122232515.177833-1-kent.overstreet@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This adds shrinker.to_text() methods for our shrinkers and hooks them up
to our existing to_text() functions.
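
The wrappers follow a borrow/format/commit pattern: take the unused tail
of the output buffer, let an existing formatter write into it, then
commit the number of bytes it produced. A userspace sketch of that
pattern follows, for illustration only. struct textbuf,
textbuf_get_buf(), textbuf_commit(), describe_cache() and
cache_to_text() are hypothetical stand-ins; they do not reproduce the
kernel's seq_buf or the bcachefs printbuf APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bounded text buffer; not the kernel's struct seq_buf. */
struct textbuf {
	char *buf;
	size_t size; /* total capacity, including room for the trailing NUL */
	size_t len;  /* bytes written so far */
};

/* Hand out the unused tail of the buffer and return its size in bytes. */
size_t textbuf_get_buf(struct textbuf *t, char **bufp)
{
	assert(t->len <= t->size);
	*bufp = t->buf + t->len;
	return t->size - t->len;
}

/* Record that a nested writer filled in num bytes of the borrowed tail. */
void textbuf_commit(struct textbuf *t, size_t num)
{
	assert(num <= t->size - t->len);
	t->len += num;
}

/* A nested formatter that only knows about a raw (pointer, length) pair. */
size_t describe_cache(char *out, size_t avail, unsigned nr_nodes, unsigned nr_dirty)
{
	int n;

	if (!avail)
		return 0;

	n = snprintf(out, avail, "nr nodes: %u\nnr dirty: %u\n", nr_nodes, nr_dirty);
	if (n < 0)
		return 0;

	/* snprintf() reports the untruncated length; clamp to what was stored. */
	return (size_t)n < avail ? (size_t)n : avail - 1;
}

/* Adapter in the spirit of the to_text() wrappers: borrow, format, commit. */
void cache_to_text(struct textbuf *t, unsigned nr_nodes, unsigned nr_dirty)
{
	char *tail;
	size_t avail = textbuf_get_buf(t, &tail);

	textbuf_commit(t, describe_cache(tail, avail, nr_nodes, nr_dirty));
}
```

The clamp in describe_cache() is the part worth noting in this sketch:
snprintf() returns the length it would have written, so committing that
value unchecked would let the recorded length run past what was actually
stored.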

Signed-off-by: Kent Overstreet
---
 fs/bcachefs/btree_cache.c     | 13 +++++++++++++
 fs/bcachefs/btree_key_cache.c | 14 ++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/fs/bcachefs/btree_cache.c b/fs/bcachefs/btree_cache.c
index 47e7770d0583..aab279dff944 100644
--- a/fs/bcachefs/btree_cache.c
+++ b/fs/bcachefs/btree_cache.c
@@ -13,6 +13,7 @@
 
 #include
 #include
+#include
 
 const char * const bch2_btree_node_flags[] = {
 #define x(f) #f,
@@ -392,6 +393,17 @@ static unsigned long bch2_btree_cache_count(struct shrinker *shrink,
 	return btree_cache_can_free(bc);
 }
 
+static void bch2_btree_cache_shrinker_to_text(struct seq_buf *s, struct shrinker *shrink)
+{
+	struct bch_fs *c = shrink->private_data;
+	char *cbuf;
+	size_t buflen = seq_buf_get_buf(s, &cbuf);
+	struct printbuf out = PRINTBUF_EXTERN(cbuf, buflen);
+
+	bch2_btree_cache_to_text(&out, c);
+	seq_buf_commit(s, out.pos);
+}
+
 void bch2_fs_btree_cache_exit(struct bch_fs *c)
 {
 	struct btree_cache *bc = &c->btree_cache;
@@ -478,6 +490,7 @@ int bch2_fs_btree_cache_init(struct bch_fs *c)
 	bc->shrink = shrink;
 	shrink->count_objects = bch2_btree_cache_count;
 	shrink->scan_objects = bch2_btree_cache_scan;
+	shrink->to_text = bch2_btree_cache_shrinker_to_text;
 	shrink->seeks = 4;
 	shrink->private_data = c;
 	shrinker_register(shrink);
diff --git a/fs/bcachefs/btree_key_cache.c b/fs/bcachefs/btree_key_cache.c
index 4402cf068c56..e14e9b4cd029 100644
--- a/fs/bcachefs/btree_key_cache.c
+++ b/fs/bcachefs/btree_key_cache.c
@@ -13,6 +13,7 @@
 #include "trace.h"
 
 #include
+#include
 
 static inline bool btree_uses_pcpu_readers(enum btree_id id)
 {
@@ -1028,6 +1029,18 @@ void bch2_fs_btree_key_cache_init_early(struct btree_key_cache *c)
 	INIT_LIST_HEAD(&c->freed_nonpcpu);
 }
 
+static void bch2_btree_key_cache_shrinker_to_text(struct seq_buf *s, struct shrinker *shrink)
+{
+	struct bch_fs *c = shrink->private_data;
+	struct btree_key_cache *bc = &c->btree_key_cache;
+	char *cbuf;
+	
size_t buflen =3D seq_buf_get_buf(s, &cbuf); + struct printbuf out =3D PRINTBUF_EXTERN(cbuf, buflen); + + bch2_btree_key_cache_to_text(&out, bc); + seq_buf_commit(s, out.pos); +} + int bch2_fs_btree_key_cache_init(struct btree_key_cache *bc) { struct bch_fs *c =3D container_of(bc, struct bch_fs, btree_key_cache); @@ -1051,6 +1064,7 @@ int bch2_fs_btree_key_cache_init(struct btree_key_cac= he *bc) shrink->seeks =3D 0; shrink->count_objects =3D bch2_btree_key_cache_count; shrink->scan_objects =3D bch2_btree_key_cache_scan; + shrink->to_text =3D bch2_btree_key_cache_shrinker_to_text; shrink->private_data =3D c; shrinker_register(shrink); return 0; --=20 2.42.0 From nobody Wed Dec 17 19:01:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3588C61D97 for ; Wed, 22 Nov 2023 23:25:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344713AbjKVXZv (ORCPT ); Wed, 22 Nov 2023 18:25:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235163AbjKVXZk (ORCPT ); Wed, 22 Nov 2023 18:25:40 -0500 Received: from out-173.mta0.migadu.com (out-173.mta0.migadu.com [IPv6:2001:41d0:1004:224b::ad]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4CFF1B3 for ; Wed, 22 Nov 2023 15:25:32 -0800 (PST) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1700695531; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1NMfXBsonhz+ezAg3sKaNXQeyap2g8yZZq48bhQ7Wec=; b=L/6Ssih/O7QhT9069T31VThCTNukc3MMvLU+/VMAZ3jnXRJbm6rGc+TMETY0offdQtngVp 6jDxcKMISPFQdNLC7MDtwdDuraODTV+7GC8ozeT/P6g2dMkS3qcUJPEzMFbmowqy23RWPc TvwAmvR6OWymxacMWlUCPuncKjfbn0Y= From: Kent Overstreet To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Daniel Hill , Kent Overstreet Subject: [PATCH 7/7] bcachefs: add counters for failed shrinker reclaim Date: Wed, 22 Nov 2023 18:25:12 -0500 Message-ID: <20231122232515.177833-8-kent.overstreet@linux.dev> In-Reply-To: <20231122232515.177833-1-kent.overstreet@linux.dev> References: <20231122232515.177833-1-kent.overstreet@linux.dev> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Daniel Hill This adds distinct counters for every reason the btree node shrinker can fail to free an object - if our shrinker isn't making progress, this will tell us why. 
Signed-off-by: Daniel Hill Signed-off-by: Kent Overstreet --- fs/bcachefs/btree_cache.c | 91 +++++++++++++++++++++++++++++---------- fs/bcachefs/btree_cache.h | 2 +- fs/bcachefs/btree_types.h | 10 +++++ fs/bcachefs/sysfs.c | 2 +- 4 files changed, 81 insertions(+), 24 deletions(-) diff --git a/fs/bcachefs/btree_cache.c b/fs/bcachefs/btree_cache.c index aab279dff944..72dea90e12fa 100644 --- a/fs/bcachefs/btree_cache.c +++ b/fs/bcachefs/btree_cache.c @@ -15,6 +15,12 @@ #include #include =20 +#define BTREE_CACHE_NOT_FREED_INCREMENT(counter) \ +do { \ + if (shrinker_counter) \ + bc->not_freed_##counter++; \ +} while (0) + const char * const bch2_btree_node_flags[] =3D { #define x(f) #f, BTREE_FLAGS() @@ -202,7 +208,7 @@ static inline struct btree *btree_cache_find(struct btr= ee_cache *bc, * this version is for btree nodes that have already been freed (we're not * reaping a real btree node) */ -static int __btree_node_reclaim(struct bch_fs *c, struct btree *b, bool fl= ush) +static int __btree_node_reclaim(struct bch_fs *c, struct btree *b, bool fl= ush, bool shrinker_counter) { struct btree_cache *bc =3D &c->btree_cache; int ret =3D 0; @@ -212,38 +218,64 @@ static int __btree_node_reclaim(struct bch_fs *c, str= uct btree *b, bool flush) if (b->flags & ((1U << BTREE_NODE_dirty)| (1U << BTREE_NODE_read_in_flight)| (1U << BTREE_NODE_write_in_flight))) { - if (!flush) + if (!flush) { + if (btree_node_dirty(b)) + BTREE_CACHE_NOT_FREED_INCREMENT(dirty); + else if (btree_node_read_in_flight(b)) + BTREE_CACHE_NOT_FREED_INCREMENT(read_in_flight); + else if (btree_node_write_in_flight(b)) + BTREE_CACHE_NOT_FREED_INCREMENT(write_in_flight); return -BCH_ERR_ENOMEM_btree_node_reclaim; + } =20 /* XXX: waiting on IO with btree cache lock held */ bch2_btree_node_wait_on_read(b); bch2_btree_node_wait_on_write(b); } =20 - if (!six_trylock_intent(&b->c.lock)) + if (!six_trylock_intent(&b->c.lock)) { + BTREE_CACHE_NOT_FREED_INCREMENT(lock_intent); return 
-BCH_ERR_ENOMEM_btree_node_reclaim; + } =20 - if (!six_trylock_write(&b->c.lock)) + if (!six_trylock_write(&b->c.lock)) { + BTREE_CACHE_NOT_FREED_INCREMENT(lock_write); goto out_unlock_intent; + } =20 /* recheck under lock */ if (b->flags & ((1U << BTREE_NODE_read_in_flight)| (1U << BTREE_NODE_write_in_flight))) { - if (!flush) + if (!flush) { + if (btree_node_read_in_flight(b)) + BTREE_CACHE_NOT_FREED_INCREMENT(read_in_flight); + else if (btree_node_write_in_flight(b)) + BTREE_CACHE_NOT_FREED_INCREMENT(write_in_flight); goto out_unlock; + } six_unlock_write(&b->c.lock); six_unlock_intent(&b->c.lock); goto wait_on_io; } =20 - if (btree_node_noevict(b) || - btree_node_write_blocked(b) || - btree_node_will_make_reachable(b)) + if (btree_node_noevict(b)) { + BTREE_CACHE_NOT_FREED_INCREMENT(noevict); + goto out_unlock; + } + if (btree_node_write_blocked(b)) { + BTREE_CACHE_NOT_FREED_INCREMENT(write_blocked); + goto out_unlock; + } + if (btree_node_will_make_reachable(b)) { + BTREE_CACHE_NOT_FREED_INCREMENT(will_make_reachable); goto out_unlock; + } =20 if (btree_node_dirty(b)) { - if (!flush) + if (!flush) { + BTREE_CACHE_NOT_FREED_INCREMENT(dirty); goto out_unlock; + } /* * Using the underscore version because we don't want to compact * bsets after the write, since this node is about to be evicted @@ -273,14 +305,14 @@ static int __btree_node_reclaim(struct bch_fs *c, str= uct btree *b, bool flush) goto out; } =20 -static int btree_node_reclaim(struct bch_fs *c, struct btree *b) +static int btree_node_reclaim(struct bch_fs *c, struct btree *b, bool shri= nker_counter) { - return __btree_node_reclaim(c, b, false); + return __btree_node_reclaim(c, b, false, shrinker_counter); } =20 static int btree_node_write_and_reclaim(struct bch_fs *c, struct btree *b) { - return __btree_node_reclaim(c, b, true); + return __btree_node_reclaim(c, b, true, false); } =20 static unsigned long bch2_btree_cache_scan(struct shrinker *shrink, @@ -328,11 +360,12 @@ static unsigned long 
bch2_btree_cache_scan(struct shr= inker *shrink, if (touched >=3D nr) goto out; =20 - if (!btree_node_reclaim(c, b)) { + if (!btree_node_reclaim(c, b, true)) { btree_node_data_free(c, b); six_unlock_write(&b->c.lock); six_unlock_intent(&b->c.lock); freed++; + bc->freed++; } } restart: @@ -341,9 +374,11 @@ static unsigned long bch2_btree_cache_scan(struct shri= nker *shrink, =20 if (btree_node_accessed(b)) { clear_btree_node_accessed(b); - } else if (!btree_node_reclaim(c, b)) { + bc->not_freed_access_bit++; + } else if (!btree_node_reclaim(c, b, true)) { freed++; btree_node_data_free(c, b); + bc->freed++; =20 bch2_btree_node_hash_remove(bc, b); six_unlock_write(&b->c.lock); @@ -400,7 +435,7 @@ static void bch2_btree_cache_shrinker_to_text(struct se= q_buf *s, struct shrinker size_t buflen =3D seq_buf_get_buf(s, &cbuf); struct printbuf out =3D PRINTBUF_EXTERN(cbuf, buflen); =20 - bch2_btree_cache_to_text(&out, c); + bch2_btree_cache_to_text(&out, &c->btree_cache); seq_buf_commit(s, out.pos); } =20 @@ -564,7 +599,7 @@ static struct btree *btree_node_cannibalize(struct bch_= fs *c) struct btree *b; =20 list_for_each_entry_reverse(b, &bc->live, list) - if (!btree_node_reclaim(c, b)) + if (!btree_node_reclaim(c, b, false)) return b; =20 while (1) { @@ -600,7 +635,7 @@ struct btree *bch2_btree_node_mem_alloc(struct btree_tr= ans *trans, bool pcpu_rea * disk node. Check the freed list before allocating a new one: */ list_for_each_entry(b, freed, list) - if (!btree_node_reclaim(c, b)) { + if (!btree_node_reclaim(c, b, false)) { list_del_init(&b->list); goto got_node; } @@ -626,7 +661,7 @@ struct btree *bch2_btree_node_mem_alloc(struct btree_tr= ans *trans, bool pcpu_rea * the list. 
Check if there's any freed nodes there: */ list_for_each_entry(b2, &bc->freeable, list) - if (!btree_node_reclaim(c, b2)) { + if (!btree_node_reclaim(c, b2, false)) { swap(b->data, b2->data); swap(b->aux_data, b2->aux_data); btree_node_to_freedlist(bc, b2); @@ -1222,9 +1257,21 @@ void bch2_btree_node_to_text(struct printbuf *out, s= truct bch_fs *c, const struc stats.failed); } =20 -void bch2_btree_cache_to_text(struct printbuf *out, const struct bch_fs *c) +void bch2_btree_cache_to_text(struct printbuf *out, const struct btree_cac= he *bc) { - prt_printf(out, "nr nodes:\t\t%u\n", c->btree_cache.used); - prt_printf(out, "nr dirty:\t\t%u\n", atomic_read(&c->btree_cache.dirty)); - prt_printf(out, "cannibalize lock:\t%p\n", c->btree_cache.alloc_lock); + prt_printf(out, "nr nodes:\t\t%u\n", bc->used); + prt_printf(out, "nr dirty:\t\t%u\n", atomic_read(&bc->dirty)); + prt_printf(out, "cannibalize lock:\t%p\n", bc->alloc_lock); + + prt_printf(out, "freed:\t\t\t\t%u\n", bc->freed); + prt_printf(out, "not freed, dirty:\t\t%u\n", bc->not_freed_dirty); + prt_printf(out, "not freed, write in flight:\t%u\n", bc->not_freed_write_= in_flight); + prt_printf(out, "not freed, read in flight:\t%u\n", bc->not_freed_read_in= _flight); + prt_printf(out, "not freed, lock intent failed:\t%u\n", bc->not_freed_loc= k_intent); + prt_printf(out, "not freed, lock write failed:\t%u\n", bc->not_freed_lock= _write); + prt_printf(out, "not freed, access bit:\t\t%u\n", bc->not_freed_access_bi= t); + prt_printf(out, "not freed, no evict failed:\t%u\n", bc->not_freed_noevic= t); + prt_printf(out, "not freed, write blocked:\t%u\n", bc->not_freed_write_bl= ocked); + prt_printf(out, "not freed, will make reachable:\t%u\n", bc->not_freed_wi= ll_make_reachable); + } diff --git a/fs/bcachefs/btree_cache.h b/fs/bcachefs/btree_cache.h index cfb80b201d61..bfe1d7482cbc 100644 --- a/fs/bcachefs/btree_cache.h +++ b/fs/bcachefs/btree_cache.h @@ -126,6 +126,6 @@ static inline struct btree *btree_node_root(struct 
bch_= fs *c, struct btree *b) const char *bch2_btree_id_str(enum btree_id); void bch2_btree_pos_to_text(struct printbuf *, struct bch_fs *, const stru= ct btree *); void bch2_btree_node_to_text(struct printbuf *, struct bch_fs *, const str= uct btree *); -void bch2_btree_cache_to_text(struct printbuf *, const struct bch_fs *); +void bch2_btree_cache_to_text(struct printbuf *, const struct btree_cache = *); =20 #endif /* _BCACHEFS_BTREE_CACHE_H */ diff --git a/fs/bcachefs/btree_types.h b/fs/bcachefs/btree_types.h index 2326bceb34f8..14983e778756 100644 --- a/fs/bcachefs/btree_types.h +++ b/fs/bcachefs/btree_types.h @@ -162,6 +162,16 @@ struct btree_cache { /* Number of elements in live + freeable lists */ unsigned used; unsigned reserve; + unsigned freed; + unsigned not_freed_lock_intent; + unsigned not_freed_lock_write; + unsigned not_freed_dirty; + unsigned not_freed_read_in_flight; + unsigned not_freed_write_in_flight; + unsigned not_freed_noevict; + unsigned not_freed_write_blocked; + unsigned not_freed_will_make_reachable; + unsigned not_freed_access_bit; atomic_t dirty; struct shrinker *shrink; =20 diff --git a/fs/bcachefs/sysfs.c b/fs/bcachefs/sysfs.c index ab743115f169..264c46b456c2 100644 --- a/fs/bcachefs/sysfs.c +++ b/fs/bcachefs/sysfs.c @@ -402,7 +402,7 @@ SHOW(bch2_fs) bch2_btree_updates_to_text(out, c); =20 if (attr =3D=3D &sysfs_btree_cache) - bch2_btree_cache_to_text(out, c); + bch2_btree_cache_to_text(out, &c->btree_cache); =20 if (attr =3D=3D &sysfs_btree_key_cache) bch2_btree_key_cache_to_text(out, &c->btree_key_cache); --=20 2.42.0
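The counting scheme in patch 7/7 can be sketched in userspace as below. This is only a simplified illustration, not the bcachefs code: `struct cache_stats`, `struct node`, `NOT_FREED_INCREMENT()`, `node_reclaim()`, `scan()` and `stats_to_text()` are hypothetical stand-ins for `struct btree_cache`, `struct btree`, `BTREE_CACHE_NOT_FREED_INCREMENT()`, `__btree_node_reclaim()`, `bch2_btree_cache_scan()` and `bch2_btree_cache_to_text()`, and locking and I/O are left out entirely. It shows the two points the patch relies on: each failure path bumps its own counter, and the counters only move when the caller is the shrinker.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the counters added to struct btree_cache. */
struct cache_stats {
	unsigned freed;
	unsigned not_freed_dirty;
	unsigned not_freed_noevict;
	unsigned not_freed_access_bit;
};

/* Hypothetical, heavily simplified node: flags only, no locks or data. */
struct node {
	bool dirty;
	bool noevict;
	bool accessed;
};

/*
 * Same shape as BTREE_CACHE_NOT_FREED_INCREMENT() in the patch: expects
 * `stats` and `shrinker_counter` to be in scope at the call site.
 */
#define NOT_FREED_INCREMENT(counter)			\
do {							\
	if (shrinker_counter)				\
		stats->not_freed_##counter++;		\
} while (0)

/* Returns 0 if the node can be reclaimed, -1 (and counts the reason) if not. */
static int node_reclaim(struct cache_stats *stats, const struct node *n,
			bool shrinker_counter)
{
	if (n->dirty) {
		NOT_FREED_INCREMENT(dirty);
		return -1;
	}
	if (n->noevict) {
		NOT_FREED_INCREMENT(noevict);
		return -1;
	}
	return 0;
}

/* Shrinker-style scan: the only caller that passes shrinker_counter = true. */
static unsigned scan(struct cache_stats *stats, struct node *nodes, unsigned nr)
{
	unsigned freed = 0;

	for (unsigned i = 0; i < nr; i++) {
		if (nodes[i].accessed) {
			/* Recently used: clear the bit, give it another pass. */
			nodes[i].accessed = false;
			stats->not_freed_access_bit++;
		} else if (!node_reclaim(stats, &nodes[i], true)) {
			freed++;
			stats->freed++;
		}
	}
	return freed;
}

/* Report the counters, one per line, in the spirit of the to_text() output. */
static void stats_to_text(FILE *out, const struct cache_stats *stats)
{
	fprintf(out, "freed:\t\t\t%u\n", stats->freed);
	fprintf(out, "not freed, dirty:\t%u\n", stats->not_freed_dirty);
	fprintf(out, "not freed, no evict:\t%u\n", stats->not_freed_noevict);
	fprintf(out, "not freed, access bit:\t%u\n", stats->not_freed_access_bit);
}
```

Callers that are not the shrinker (the cannibalize and allocation paths in the patch) would pass `false`, so their failed reclaim attempts do not skew the numbers; a counter that keeps growing while `freed` stays flat then points at the reason the shrinker is not making progress.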