From nobody Tue Dec 16 19:56:59 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07CD9C71153 for ; Thu, 24 Aug 2023 03:37:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239621AbjHXDgp (ORCPT ); Wed, 23 Aug 2023 23:36:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39752 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239615AbjHXDg0 (ORCPT ); Wed, 23 Aug 2023 23:36:26 -0400 Received: from mail-pf1-x434.google.com (mail-pf1-x434.google.com [IPv6:2607:f8b0:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D46010F3 for ; Wed, 23 Aug 2023 20:36:24 -0700 (PDT) Received: by mail-pf1-x434.google.com with SMTP id d2e1a72fcca58-68a56ed12c0so754619b3a.0 for ; Wed, 23 Aug 2023 20:36:24 -0700 (PDT) X-Received: by 2002:a05:6a20:938d:b0:13c:bda3:79c3 with SMTP id x13-20020a056a20938d00b0013cbda379c3mr17537865pzh.4.1692848183684; Wed, 23 Aug 2023 20:36:23 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.146]) by smtp.gmail.com with ESMTPSA id p16-20020a62ab10000000b0068b6137d144sm2996570pff.30.2023.08.23.20.36.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 20:36:23 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev, joel@joelfernandes.org, christian.koenig@amd.com, daniel@ffwll.ch Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, dri-devel@lists.freedesktop.org, linux-fsdevel@vger.kernel.org, Qi Zheng , Muchun Song Subject: [PATCH v3 1/4] mm: move some shrinker-related function declarations to mm/internal.h Date: Thu, 24 Aug 2023 11:35:36 +0800 Message-Id: <20230824033539.34570-2-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230824033539.34570-1-zhengqi.arch@bytedance.com> References: <20230824033539.34570-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The following functions are only used inside the mm subsystem, so it's better to move their declarations to the mm/internal.h file. 1. shrinker_debugfs_add() 2. shrinker_debugfs_detach() 3. shrinker_debugfs_remove() Signed-off-by: Qi Zheng Reviewed-by: Muchun Song --- include/linux/shrinker.h | 19 ------------------- mm/internal.h | 26 ++++++++++++++++++++++++++ mm/shrinker_debug.c | 2 ++ 3 files changed, 28 insertions(+), 19 deletions(-) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 224293b2dd06..8dc15aa37410 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -106,28 +106,9 @@ extern void free_prealloced_shrinker(struct shrinker *= shrinker); extern void synchronize_shrinkers(void); =20 #ifdef CONFIG_SHRINKER_DEBUG -extern int shrinker_debugfs_add(struct shrinker *shrinker); -extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, - int *debugfs_id); -extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, - int debugfs_id); extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinke= r, const char *fmt, ...); #else /* CONFIG_SHRINKER_DEBUG */ -static inline int shrinker_debugfs_add(struct shrinker *shrinker) -{ - return 0; -} -static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shri= nker, - int *debugfs_id) -{ - *debugfs_id =3D -1; - return NULL; -} -static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry, - int debugfs_id) -{ -} static inline __printf(2, 3) int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ..= .) 
{ diff --git a/mm/internal.h b/mm/internal.h index 7499b5ea1cf6..f30bb60e7790 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1155,4 +1155,30 @@ struct vma_prepare { struct vm_area_struct *remove; struct vm_area_struct *remove2; }; + +/* shrinker related functions */ + +#ifdef CONFIG_SHRINKER_DEBUG +extern int shrinker_debugfs_add(struct shrinker *shrinker); +extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, + int *debugfs_id); +extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, + int debugfs_id); +#else /* CONFIG_SHRINKER_DEBUG */ +static inline int shrinker_debugfs_add(struct shrinker *shrinker) +{ + return 0; +} +static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shri= nker, + int *debugfs_id) +{ + *debugfs_id =3D -1; + return NULL; +} +static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry, + int debugfs_id) +{ +} +#endif /* CONFIG_SHRINKER_DEBUG */ + #endif /* __MM_INTERNAL_H */ diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c index 3ab53fad8876..ee0cddb4530f 100644 --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -6,6 +6,8 @@ #include #include =20 +#include "internal.h" + /* defined in vmscan.c */ extern struct rw_semaphore shrinker_rwsem; extern struct list_head shrinker_list; --=20 2.30.2 From nobody Tue Dec 16 19:56:59 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A69CC71145 for ; Thu, 24 Aug 2023 03:37:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239641AbjHXDhV (ORCPT ); Wed, 23 Aug 2023 23:37:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239634AbjHXDhF (ORCPT ); Wed, 23 Aug 2023 23:37:05 -0400 Received: from 
mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A483310F3 for ; Wed, 23 Aug 2023 20:36:37 -0700 (PDT) Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-56f8334f15eso162998a12.1 for ; Wed, 23 Aug 2023 20:36:37 -0700 (PDT) X-Received: by 2002:a05:6a00:234b:b0:68a:6cbe:35a7 with SMTP id 
j11-20020a056a00234b00b0068a6cbe35a7mr6976163pfj.2.1692848196915; Wed, 23 Aug 2023 20:36:36 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.146]) by smtp.gmail.com with ESMTPSA id p16-20020a62ab10000000b0068b6137d144sm2996570pff.30.2023.08.23.20.36.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 20:36:36 -0700 (PDT) From: Qi Zheng To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev, joel@joelfernandes.org, christian.koenig@amd.com, daniel@ffwll.ch Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, dri-devel@lists.freedesktop.org, linux-fsdevel@vger.kernel.org, Qi Zheng , Muchun Song Subject: [PATCH v3 2/4] mm: vmscan: move shrinker-related code into a separate file Date: Thu, 24 Aug 2023 11:35:37 +0800 Message-Id: <20230824033539.34570-3-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230824033539.34570-1-zhengqi.arch@bytedance.com> References: <20230824033539.34570-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The mm/vmscan.c file is too large, so separate the shrinker-related code from it into a separate file. No functional changes. 
Signed-off-by: Qi Zheng Reviewed-by: Muchun Song --- mm/Makefile | 4 +- mm/internal.h | 2 + mm/shrinker.c | 709 ++++++++++++++++++++++++++++++++++++++++++++++++++ mm/vmscan.c | 701 ------------------------------------------------- 4 files changed, 713 insertions(+), 703 deletions(-) create mode 100644 mm/shrinker.c diff --git a/mm/Makefile b/mm/Makefile index ec65984e2ade..33873c8aedb3 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -48,8 +48,8 @@ endif =20 obj-y :=3D filemap.o mempool.o oom_kill.o fadvise.o \ maccess.o page-writeback.o folio-compat.o \ - readahead.o swap.o truncate.o vmscan.o shmem.o \ - util.o mmzone.o vmstat.o backing-dev.o \ + readahead.o swap.o truncate.o vmscan.o shrinker.o \ + shmem.o util.o mmzone.o vmstat.o backing-dev.o \ mm_init.o percpu.o slab_common.o \ compaction.o show_mem.o shmem_quota.o\ interval_tree.o list_lru.o workingset.o \ diff --git a/mm/internal.h b/mm/internal.h index f30bb60e7790..5d4697612073 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1157,6 +1157,8 @@ struct vma_prepare { }; =20 /* shrinker related functions */ +unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memc= g, + int priority); =20 #ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); diff --git a/mm/shrinker.c b/mm/shrinker.c new file mode 100644 index 000000000000..043c87ccfab4 --- /dev/null +++ b/mm/shrinker.c @@ -0,0 +1,709 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include + +#include "internal.h" + +LIST_HEAD(shrinker_list); +DECLARE_RWSEM(shrinker_rwsem); + +#ifdef CONFIG_MEMCG +static int shrinker_nr_max; + +/* The shrinker_info is expanded in a batch of BITS_PER_LONG */ +static inline int shrinker_map_size(int nr_items) +{ + return (DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long)); +} + +static inline int shrinker_defer_size(int nr_items) +{ + return (round_up(nr_items, BITS_PER_LONG) * sizeof(atomic_long_t)); +} + +void free_shrinker_info(struct 
mem_cgroup *memcg) +{ + struct mem_cgroup_per_node *pn; + struct shrinker_info *info; + int nid; + + for_each_node(nid) { + pn =3D memcg->nodeinfo[nid]; + info =3D rcu_dereference_protected(pn->shrinker_info, true); + kvfree(info); + rcu_assign_pointer(pn->shrinker_info, NULL); + } +} + +int alloc_shrinker_info(struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + int nid, size, ret =3D 0; + int map_size, defer_size =3D 0; + + down_write(&shrinker_rwsem); + map_size =3D shrinker_map_size(shrinker_nr_max); + defer_size =3D shrinker_defer_size(shrinker_nr_max); + size =3D map_size + defer_size; + for_each_node(nid) { + info =3D kvzalloc_node(sizeof(*info) + size, GFP_KERNEL, nid); + if (!info) { + free_shrinker_info(memcg); + ret =3D -ENOMEM; + break; + } + info->nr_deferred =3D (atomic_long_t *)(info + 1); + info->map =3D (void *)info->nr_deferred + defer_size; + info->map_nr_max =3D shrinker_nr_max; + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); + } + up_write(&shrinker_rwsem); + + return ret; +} + +static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *me= mcg, + int nid) +{ + return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info, + lockdep_is_held(&shrinker_rwsem)); +} + +static int expand_one_shrinker_info(struct mem_cgroup *memcg, + int map_size, int defer_size, + int old_map_size, int old_defer_size, + int new_nr_max) +{ + struct shrinker_info *new, *old; + struct mem_cgroup_per_node *pn; + int nid; + int size =3D map_size + defer_size; + + for_each_node(nid) { + pn =3D memcg->nodeinfo[nid]; + old =3D shrinker_info_protected(memcg, nid); + /* Not yet online memcg */ + if (!old) + return 0; + + /* Already expanded this shrinker_info */ + if (new_nr_max <=3D old->map_nr_max) + continue; + + new =3D kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid); + if (!new) + return -ENOMEM; + + new->nr_deferred =3D (atomic_long_t *)(new + 1); + new->map =3D (void *)new->nr_deferred + defer_size; + new->map_nr_max =3D 
new_nr_max; + + /* map: set all old bits, clear all new bits */ + memset(new->map, (int)0xff, old_map_size); + memset((void *)new->map + old_map_size, 0, map_size - old_map_size); + /* nr_deferred: copy old values, clear all new values */ + memcpy(new->nr_deferred, old->nr_deferred, old_defer_size); + memset((void *)new->nr_deferred + old_defer_size, 0, + defer_size - old_defer_size); + + rcu_assign_pointer(pn->shrinker_info, new); + kvfree_rcu(old, rcu); + } + + return 0; +} + +static int expand_shrinker_info(int new_id) +{ + int ret =3D 0; + int new_nr_max =3D round_up(new_id + 1, BITS_PER_LONG); + int map_size, defer_size =3D 0; + int old_map_size, old_defer_size =3D 0; + struct mem_cgroup *memcg; + + if (!root_mem_cgroup) + goto out; + + lockdep_assert_held(&shrinker_rwsem); + + map_size =3D shrinker_map_size(new_nr_max); + defer_size =3D shrinker_defer_size(new_nr_max); + old_map_size =3D shrinker_map_size(shrinker_nr_max); + old_defer_size =3D shrinker_defer_size(shrinker_nr_max); + + memcg =3D mem_cgroup_iter(NULL, NULL, NULL); + do { + ret =3D expand_one_shrinker_info(memcg, map_size, defer_size, + old_map_size, old_defer_size, + new_nr_max); + if (ret) { + mem_cgroup_iter_break(NULL, memcg); + goto out; + } + } while ((memcg =3D mem_cgroup_iter(NULL, memcg, NULL)) !=3D NULL); +out: + if (!ret) + shrinker_nr_max =3D new_nr_max; + + return ret; +} + +void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id) +{ + if (shrinker_id >=3D 0 && memcg && !mem_cgroup_is_root(memcg)) { + struct shrinker_info *info; + + rcu_read_lock(); + info =3D rcu_dereference(memcg->nodeinfo[nid]->shrinker_info); + if (!WARN_ON_ONCE(shrinker_id >=3D info->map_nr_max)) { + /* Pairs with smp mb in shrink_slab() */ + smp_mb__before_atomic(); + set_bit(shrinker_id, info->map); + } + rcu_read_unlock(); + } +} + +static DEFINE_IDR(shrinker_idr); + +static int prealloc_memcg_shrinker(struct shrinker *shrinker) +{ + int id, ret =3D -ENOMEM; + + if (mem_cgroup_disabled()) + 
return -ENOSYS; + + down_write(&shrinker_rwsem); + /* This may call shrinker, so it must use down_read_trylock() */ + id =3D idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL); + if (id < 0) + goto unlock; + + if (id >=3D shrinker_nr_max) { + if (expand_shrinker_info(id)) { + idr_remove(&shrinker_idr, id); + goto unlock; + } + } + shrinker->id =3D id; + ret =3D 0; +unlock: + up_write(&shrinker_rwsem); + return ret; +} + +static void unregister_memcg_shrinker(struct shrinker *shrinker) +{ + int id =3D shrinker->id; + + BUG_ON(id < 0); + + lockdep_assert_held(&shrinker_rwsem); + + idr_remove(&shrinker_idr, id); +} + +static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + + info =3D shrinker_info_protected(memcg, nid); + return atomic_long_xchg(&info->nr_deferred[shrinker->id], 0); +} + +static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrin= ker, + struct mem_cgroup *memcg) +{ + struct shrinker_info *info; + + info =3D shrinker_info_protected(memcg, nid); + return atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]); +} + +void reparent_shrinker_deferred(struct mem_cgroup *memcg) +{ + int i, nid; + long nr; + struct mem_cgroup *parent; + struct shrinker_info *child_info, *parent_info; + + parent =3D parent_mem_cgroup(memcg); + if (!parent) + parent =3D root_mem_cgroup; + + /* Prevent from concurrent shrinker_info expand */ + down_read(&shrinker_rwsem); + for_each_node(nid) { + child_info =3D shrinker_info_protected(memcg, nid); + parent_info =3D shrinker_info_protected(parent, nid); + for (i =3D 0; i < child_info->map_nr_max; i++) { + nr =3D atomic_long_read(&child_info->nr_deferred[i]); + atomic_long_add(nr, &parent_info->nr_deferred[i]); + } + } + up_read(&shrinker_rwsem); +} +#else +static int prealloc_memcg_shrinker(struct shrinker *shrinker) +{ + return -ENOSYS; +} + +static void unregister_memcg_shrinker(struct shrinker *shrinker) +{ +} + +static long 
xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker, + struct mem_cgroup *memcg) +{ + return 0; +} + +static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrin= ker, + struct mem_cgroup *memcg) +{ + return 0; +} +#endif /* CONFIG_MEMCG */ + +static long xchg_nr_deferred(struct shrinker *shrinker, + struct shrink_control *sc) +{ + int nid =3D sc->nid; + + if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) + nid =3D 0; + + if (sc->memcg && + (shrinker->flags & SHRINKER_MEMCG_AWARE)) + return xchg_nr_deferred_memcg(nid, shrinker, + sc->memcg); + + return atomic_long_xchg(&shrinker->nr_deferred[nid], 0); +} + + +static long add_nr_deferred(long nr, struct shrinker *shrinker, + struct shrink_control *sc) +{ + int nid =3D sc->nid; + + if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) + nid =3D 0; + + if (sc->memcg && + (shrinker->flags & SHRINKER_MEMCG_AWARE)) + return add_nr_deferred_memcg(nr, nid, shrinker, + sc->memcg); + + return atomic_long_add_return(nr, &shrinker->nr_deferred[nid]); +} + +#define SHRINK_BATCH 128 + +static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, + struct shrinker *shrinker, int priority) +{ + unsigned long freed =3D 0; + unsigned long long delta; + long total_scan; + long freeable; + long nr; + long new_nr; + long batch_size =3D shrinker->batch ? shrinker->batch + : SHRINK_BATCH; + long scanned =3D 0, next_deferred; + + freeable =3D shrinker->count_objects(shrinker, shrinkctl); + if (freeable =3D=3D 0 || freeable =3D=3D SHRINK_EMPTY) + return freeable; + + /* + * copy the current shrinker scan count into a local variable + * and zero it so that other concurrent shrinker invocations + * don't also do this scanning work. + */ + nr =3D xchg_nr_deferred(shrinker, shrinkctl); + + if (shrinker->seeks) { + delta =3D freeable >> priority; + delta *=3D 4; + do_div(delta, shrinker->seeks); + } else { + /* + * These objects don't require any IO to create. 
Trim + * them aggressively under memory pressure to keep + * them from causing refetches in the IO caches. + */ + delta =3D freeable / 2; + } + + total_scan =3D nr >> priority; + total_scan +=3D delta; + total_scan =3D min(total_scan, (2 * freeable)); + + trace_mm_shrink_slab_start(shrinker, shrinkctl, nr, + freeable, delta, total_scan, priority); + + /* + * Normally, we should not scan less than batch_size objects in one + * pass to avoid too frequent shrinker calls, but if the slab has less + * than batch_size objects in total and we are really tight on memory, + * we will try to reclaim all available objects, otherwise we can end + * up failing allocations although there are plenty of reclaimable + * objects spread over several slabs with usage less than the + * batch_size. + * + * We detect the "tight on memory" situations by looking at the total + * number of objects we want to scan (total_scan). If it is greater + * than the total number of objects on slab (freeable), we must be + * scanning at high prio and therefore should try to reclaim as much as + * possible. + */ + while (total_scan >=3D batch_size || + total_scan >=3D freeable) { + unsigned long ret; + unsigned long nr_to_scan =3D min(batch_size, total_scan); + + shrinkctl->nr_to_scan =3D nr_to_scan; + shrinkctl->nr_scanned =3D nr_to_scan; + ret =3D shrinker->scan_objects(shrinker, shrinkctl); + if (ret =3D=3D SHRINK_STOP) + break; + freed +=3D ret; + + count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned); + total_scan -=3D shrinkctl->nr_scanned; + scanned +=3D shrinkctl->nr_scanned; + + cond_resched(); + } + + /* + * The deferred work is increased by any new work (delta) that wasn't + * done, decreased by old deferred work that was done now. + * + * And it is capped to two times of the freeable items. 
+ */ + next_deferred =3D max_t(long, (nr + delta - scanned), 0); + next_deferred =3D min(next_deferred, (2 * freeable)); + + /* + * move the unused scan count back into the shrinker in a + * manner that handles concurrent updates. + */ + new_nr =3D add_nr_deferred(next_deferred, shrinker, shrinkctl); + + trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, tot= al_scan); + return freed; +} + +#ifdef CONFIG_MEMCG +static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, + struct mem_cgroup *memcg, int priority) +{ + struct shrinker_info *info; + unsigned long ret, freed =3D 0; + int i; + + if (!mem_cgroup_online(memcg)) + return 0; + + if (!down_read_trylock(&shrinker_rwsem)) + return 0; + + info =3D shrinker_info_protected(memcg, nid); + if (unlikely(!info)) + goto unlock; + + for_each_set_bit(i, info->map, info->map_nr_max) { + struct shrink_control sc =3D { + .gfp_mask =3D gfp_mask, + .nid =3D nid, + .memcg =3D memcg, + }; + struct shrinker *shrinker; + + shrinker =3D idr_find(&shrinker_idr, i); + if (unlikely(!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))) { + if (!shrinker) + clear_bit(i, info->map); + continue; + } + + /* Call non-slab shrinkers even though kmem is disabled */ + if (!memcg_kmem_online() && + !(shrinker->flags & SHRINKER_NONSLAB)) + continue; + + ret =3D do_shrink_slab(&sc, shrinker, priority); + if (ret =3D=3D SHRINK_EMPTY) { + clear_bit(i, info->map); + /* + * After the shrinker reported that it had no objects to + * free, but before we cleared the corresponding bit in + * the memcg shrinker map, a new object might have been + * added. To make sure, we have the bit set in this + * case, we invoke the shrinker one more time and reset + * the bit if it reports that it is not empty anymore. 
+ * The memory barrier here pairs with the barrier in + * set_shrinker_bit(): + * + * list_lru_add() shrink_slab_memcg() + * list_add_tail() clear_bit() + * + * set_bit() do_shrink_slab() + */ + smp_mb__after_atomic(); + ret =3D do_shrink_slab(&sc, shrinker, priority); + if (ret =3D=3D SHRINK_EMPTY) + ret =3D 0; + else + set_shrinker_bit(memcg, nid, i); + } + freed +=3D ret; + + if (rwsem_is_contended(&shrinker_rwsem)) { + freed =3D freed ? : 1; + break; + } + } +unlock: + up_read(&shrinker_rwsem); + return freed; +} +#else /* !CONFIG_MEMCG */ +static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, + struct mem_cgroup *memcg, int priority) +{ + return 0; +} +#endif /* CONFIG_MEMCG */ + +/** + * shrink_slab - shrink slab caches + * @gfp_mask: allocation context + * @nid: node whose slab caches to target + * @memcg: memory cgroup whose slab caches to target + * @priority: the reclaim priority + * + * Call the shrink functions to age shrinkable caches. + * + * @nid is passed along to shrinkers with SHRINKER_NUMA_AWARE set, + * unaware shrinkers will receive a node id of 0 instead. + * + * @memcg specifies the memory cgroup to target. Unaware shrinkers + * are called only if it is the root cgroup. + * + * @priority is sc->priority, we take the number of objects and >> by prio= rity + * in order to get the scan target. + * + * Returns the number of reclaimed slab objects. + */ +unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memc= g, + int priority) +{ + unsigned long ret, freed =3D 0; + struct shrinker *shrinker; + + /* + * The root memcg might be allocated even though memcg is disabled + * via "cgroup_disable=3Dmemory" boot parameter. This could make + * mem_cgroup_is_root() return false, then just run memcg slab + * shrink, but skip global shrink. This may result in premature + * oom. 
+ */ + if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg)) + return shrink_slab_memcg(gfp_mask, nid, memcg, priority); + + if (!down_read_trylock(&shrinker_rwsem)) + goto out; + + list_for_each_entry(shrinker, &shrinker_list, list) { + struct shrink_control sc =3D { + .gfp_mask =3D gfp_mask, + .nid =3D nid, + .memcg =3D memcg, + }; + + ret =3D do_shrink_slab(&sc, shrinker, priority); + if (ret =3D=3D SHRINK_EMPTY) + ret =3D 0; + freed +=3D ret; + /* + * Bail out if someone want to register a new shrinker to + * prevent the registration from being stalled for long periods + * by parallel ongoing shrinking. + */ + if (rwsem_is_contended(&shrinker_rwsem)) { + freed =3D freed ? : 1; + break; + } + } + + up_read(&shrinker_rwsem); +out: + cond_resched(); + return freed; +} + +/* + * Add a shrinker callback to be called from the vm. + */ +static int __prealloc_shrinker(struct shrinker *shrinker) +{ + unsigned int size; + int err; + + if (shrinker->flags & SHRINKER_MEMCG_AWARE) { + err =3D prealloc_memcg_shrinker(shrinker); + if (err !=3D -ENOSYS) + return err; + + shrinker->flags &=3D ~SHRINKER_MEMCG_AWARE; + } + + size =3D sizeof(*shrinker->nr_deferred); + if (shrinker->flags & SHRINKER_NUMA_AWARE) + size *=3D nr_node_ids; + + shrinker->nr_deferred =3D kzalloc(size, GFP_KERNEL); + if (!shrinker->nr_deferred) + return -ENOMEM; + + return 0; +} + +#ifdef CONFIG_SHRINKER_DEBUG +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + va_list ap; + int err; + + va_start(ap, fmt); + shrinker->name =3D kvasprintf_const(GFP_KERNEL, fmt, ap); + va_end(ap); + if (!shrinker->name) + return -ENOMEM; + + err =3D __prealloc_shrinker(shrinker); + if (err) { + kfree_const(shrinker->name); + shrinker->name =3D NULL; + } + + return err; +} +#else +int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...) 
+{ + return __prealloc_shrinker(shrinker); +} +#endif + +void free_prealloced_shrinker(struct shrinker *shrinker) +{ +#ifdef CONFIG_SHRINKER_DEBUG + kfree_const(shrinker->name); + shrinker->name =3D NULL; +#endif + if (shrinker->flags & SHRINKER_MEMCG_AWARE) { + down_write(&shrinker_rwsem); + unregister_memcg_shrinker(shrinker); + up_write(&shrinker_rwsem); + return; + } + + kfree(shrinker->nr_deferred); + shrinker->nr_deferred =3D NULL; +} + +void register_shrinker_prepared(struct shrinker *shrinker) +{ + down_write(&shrinker_rwsem); + list_add_tail(&shrinker->list, &shrinker_list); + shrinker->flags |=3D SHRINKER_REGISTERED; + shrinker_debugfs_add(shrinker); + up_write(&shrinker_rwsem); +} + +static int __register_shrinker(struct shrinker *shrinker) +{ + int err =3D __prealloc_shrinker(shrinker); + + if (err) + return err; + register_shrinker_prepared(shrinker); + return 0; +} + +#ifdef CONFIG_SHRINKER_DEBUG +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) +{ + va_list ap; + int err; + + va_start(ap, fmt); + shrinker->name =3D kvasprintf_const(GFP_KERNEL, fmt, ap); + va_end(ap); + if (!shrinker->name) + return -ENOMEM; + + err =3D __register_shrinker(shrinker); + if (err) { + kfree_const(shrinker->name); + shrinker->name =3D NULL; + } + return err; +} +#else +int register_shrinker(struct shrinker *shrinker, const char *fmt, ...) 
+{ + return __register_shrinker(shrinker); +} +#endif +EXPORT_SYMBOL(register_shrinker); + +/* + * Remove one + */ +void unregister_shrinker(struct shrinker *shrinker) +{ + struct dentry *debugfs_entry; + int debugfs_id; + + if (!(shrinker->flags & SHRINKER_REGISTERED)) + return; + + down_write(&shrinker_rwsem); + list_del(&shrinker->list); + shrinker->flags &=3D ~SHRINKER_REGISTERED; + if (shrinker->flags & SHRINKER_MEMCG_AWARE) + unregister_memcg_shrinker(shrinker); + debugfs_entry =3D shrinker_debugfs_detach(shrinker, &debugfs_id); + up_write(&shrinker_rwsem); + + shrinker_debugfs_remove(debugfs_entry, debugfs_id); + + kfree(shrinker->nr_deferred); + shrinker->nr_deferred =3D NULL; +} +EXPORT_SYMBOL(unregister_shrinker); + +/** + * synchronize_shrinkers - Wait for all running shrinkers to complete. + * + * This is equivalent to calling unregister_shrink() and register_shrinker= (), + * but atomically and with less overhead. This is useful to guarantee that= all + * shrinker invocations have seen an update, before freeing memory, simila= r to + * rcu. 
+ */ +void synchronize_shrinkers(void) +{ + down_write(&shrinker_rwsem); + up_write(&shrinker_rwsem); +} +EXPORT_SYMBOL(synchronize_shrinkers); diff --git a/mm/vmscan.c b/mm/vmscan.c index 6f13394b112e..62bbfea11835 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -35,7 +35,6 @@ #include #include #include -#include #include #include #include @@ -188,246 +187,7 @@ struct scan_control { */ int vm_swappiness =3D 60; =20 -LIST_HEAD(shrinker_list); -DECLARE_RWSEM(shrinker_rwsem); - #ifdef CONFIG_MEMCG -static int shrinker_nr_max; - -/* The shrinker_info is expanded in a batch of BITS_PER_LONG */ -static inline int shrinker_map_size(int nr_items) -{ - return (DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long)); -} - -static inline int shrinker_defer_size(int nr_items) -{ - return (round_up(nr_items, BITS_PER_LONG) * sizeof(atomic_long_t)); -} - -static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *me= mcg, - int nid) -{ - return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info, - lockdep_is_held(&shrinker_rwsem)); -} - -static int expand_one_shrinker_info(struct mem_cgroup *memcg, - int map_size, int defer_size, - int old_map_size, int old_defer_size, - int new_nr_max) -{ - struct shrinker_info *new, *old; - struct mem_cgroup_per_node *pn; - int nid; - int size =3D map_size + defer_size; - - for_each_node(nid) { - pn =3D memcg->nodeinfo[nid]; - old =3D shrinker_info_protected(memcg, nid); - /* Not yet online memcg */ - if (!old) - return 0; - - /* Already expanded this shrinker_info */ - if (new_nr_max <=3D old->map_nr_max) - continue; - - new =3D kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid); - if (!new) - return -ENOMEM; - - new->nr_deferred =3D (atomic_long_t *)(new + 1); - new->map =3D (void *)new->nr_deferred + defer_size; - new->map_nr_max =3D new_nr_max; - - /* map: set all old bits, clear all new bits */ - memset(new->map, (int)0xff, old_map_size); - memset((void *)new->map + old_map_size, 0, map_size - old_map_size); 
-		/* nr_deferred: copy old values, clear all new values */
-		memcpy(new->nr_deferred, old->nr_deferred, old_defer_size);
-		memset((void *)new->nr_deferred + old_defer_size, 0,
-		       defer_size - old_defer_size);
-
-		rcu_assign_pointer(pn->shrinker_info, new);
-		kvfree_rcu(old, rcu);
-	}
-
-	return 0;
-}
-
-void free_shrinker_info(struct mem_cgroup *memcg)
-{
-	struct mem_cgroup_per_node *pn;
-	struct shrinker_info *info;
-	int nid;
-
-	for_each_node(nid) {
-		pn = memcg->nodeinfo[nid];
-		info = rcu_dereference_protected(pn->shrinker_info, true);
-		kvfree(info);
-		rcu_assign_pointer(pn->shrinker_info, NULL);
-	}
-}
-
-int alloc_shrinker_info(struct mem_cgroup *memcg)
-{
-	struct shrinker_info *info;
-	int nid, size, ret = 0;
-	int map_size, defer_size = 0;
-
-	down_write(&shrinker_rwsem);
-	map_size = shrinker_map_size(shrinker_nr_max);
-	defer_size = shrinker_defer_size(shrinker_nr_max);
-	size = map_size + defer_size;
-	for_each_node(nid) {
-		info = kvzalloc_node(sizeof(*info) + size, GFP_KERNEL, nid);
-		if (!info) {
-			free_shrinker_info(memcg);
-			ret = -ENOMEM;
-			break;
-		}
-		info->nr_deferred = (atomic_long_t *)(info + 1);
-		info->map = (void *)info->nr_deferred + defer_size;
-		info->map_nr_max = shrinker_nr_max;
-		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
-	}
-	up_write(&shrinker_rwsem);
-
-	return ret;
-}
-
-static int expand_shrinker_info(int new_id)
-{
-	int ret = 0;
-	int new_nr_max = round_up(new_id + 1, BITS_PER_LONG);
-	int map_size, defer_size = 0;
-	int old_map_size, old_defer_size = 0;
-	struct mem_cgroup *memcg;
-
-	if (!root_mem_cgroup)
-		goto out;
-
-	lockdep_assert_held(&shrinker_rwsem);
-
-	map_size = shrinker_map_size(new_nr_max);
-	defer_size = shrinker_defer_size(new_nr_max);
-	old_map_size = shrinker_map_size(shrinker_nr_max);
-	old_defer_size = shrinker_defer_size(shrinker_nr_max);
-
-	memcg = mem_cgroup_iter(NULL, NULL, NULL);
-	do {
-		ret = expand_one_shrinker_info(memcg, map_size, defer_size,
-					       old_map_size, old_defer_size,
-					       new_nr_max);
-		if (ret) {
-			mem_cgroup_iter_break(NULL, memcg);
-			goto out;
-		}
-	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
-out:
-	if (!ret)
-		shrinker_nr_max = new_nr_max;
-
-	return ret;
-}
-
-void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
-{
-	if (shrinker_id >= 0 && memcg && !mem_cgroup_is_root(memcg)) {
-		struct shrinker_info *info;
-
-		rcu_read_lock();
-		info = rcu_dereference(memcg->nodeinfo[nid]->shrinker_info);
-		if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) {
-			/* Pairs with smp mb in shrink_slab() */
-			smp_mb__before_atomic();
-			set_bit(shrinker_id, info->map);
-		}
-		rcu_read_unlock();
-	}
-}
-
-static DEFINE_IDR(shrinker_idr);
-
-static int prealloc_memcg_shrinker(struct shrinker *shrinker)
-{
-	int id, ret = -ENOMEM;
-
-	if (mem_cgroup_disabled())
-		return -ENOSYS;
-
-	down_write(&shrinker_rwsem);
-	/* This may call shrinker, so it must use down_read_trylock() */
-	id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL);
-	if (id < 0)
-		goto unlock;
-
-	if (id >= shrinker_nr_max) {
-		if (expand_shrinker_info(id)) {
-			idr_remove(&shrinker_idr, id);
-			goto unlock;
-		}
-	}
-	shrinker->id = id;
-	ret = 0;
-unlock:
-	up_write(&shrinker_rwsem);
-	return ret;
-}
-
-static void unregister_memcg_shrinker(struct shrinker *shrinker)
-{
-	int id = shrinker->id;
-
-	BUG_ON(id < 0);
-
-	lockdep_assert_held(&shrinker_rwsem);
-
-	idr_remove(&shrinker_idr, id);
-}
-
-static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker,
-				   struct mem_cgroup *memcg)
-{
-	struct shrinker_info *info;
-
-	info = shrinker_info_protected(memcg, nid);
-	return atomic_long_xchg(&info->nr_deferred[shrinker->id], 0);
-}
-
-static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker,
-				  struct mem_cgroup *memcg)
-{
-	struct shrinker_info *info;
-
-	info = shrinker_info_protected(memcg, nid);
-	return atomic_long_add_return(nr, &info->nr_deferred[shrinker->id]);
-}
-
-void reparent_shrinker_deferred(struct mem_cgroup *memcg)
-{
-	int i, nid;
-	long nr;
-	struct mem_cgroup *parent;
-	struct shrinker_info *child_info, *parent_info;
-
-	parent = parent_mem_cgroup(memcg);
-	if (!parent)
-		parent = root_mem_cgroup;
-
-	/* Prevent from concurrent shrinker_info expand */
-	down_read(&shrinker_rwsem);
-	for_each_node(nid) {
-		child_info = shrinker_info_protected(memcg, nid);
-		parent_info = shrinker_info_protected(parent, nid);
-		for (i = 0; i < child_info->map_nr_max; i++) {
-			nr = atomic_long_read(&child_info->nr_deferred[i]);
-			atomic_long_add(nr, &parent_info->nr_deferred[i]);
-		}
-	}
-	up_read(&shrinker_rwsem);
-}
 
 /* Returns true for reclaim through cgroup limits or cgroup interfaces. */
 static bool cgroup_reclaim(struct scan_control *sc)
@@ -468,27 +228,6 @@ static bool writeback_throttling_sane(struct scan_control *sc)
 	return false;
 }
 #else
-static int prealloc_memcg_shrinker(struct shrinker *shrinker)
-{
-	return -ENOSYS;
-}
-
-static void unregister_memcg_shrinker(struct shrinker *shrinker)
-{
-}
-
-static long xchg_nr_deferred_memcg(int nid, struct shrinker *shrinker,
-				   struct mem_cgroup *memcg)
-{
-	return 0;
-}
-
-static long add_nr_deferred_memcg(long nr, int nid, struct shrinker *shrinker,
-				  struct mem_cgroup *memcg)
-{
-	return 0;
-}
-
 static bool cgroup_reclaim(struct scan_control *sc)
 {
 	return false;
@@ -557,39 +296,6 @@ static void flush_reclaim_state(struct scan_control *sc)
 	}
 }
 
-static long xchg_nr_deferred(struct shrinker *shrinker,
-			     struct shrink_control *sc)
-{
-	int nid = sc->nid;
-
-	if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
-		nid = 0;
-
-	if (sc->memcg &&
-	    (shrinker->flags & SHRINKER_MEMCG_AWARE))
-		return xchg_nr_deferred_memcg(nid, shrinker,
-					      sc->memcg);
-
-	return atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
-}
-
-
-static long add_nr_deferred(long nr, struct shrinker *shrinker,
-			    struct shrink_control *sc)
-{
-	int nid = sc->nid;
-
-	if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
-		nid = 0;
-
-	if (sc->memcg &&
-	    (shrinker->flags & SHRINKER_MEMCG_AWARE))
-		return add_nr_deferred_memcg(nr, nid, shrinker,
-					     sc->memcg);
-
-	return atomic_long_add_return(nr, &shrinker->nr_deferred[nid]);
-}
-
 static bool can_demote(int nid, struct scan_control *sc)
 {
 	if (!numa_demotion_enabled)
@@ -671,413 +377,6 @@ static unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru,
 	return size;
 }
 
-/*
- * Add a shrinker callback to be called from the vm.
- */
-static int __prealloc_shrinker(struct shrinker *shrinker)
-{
-	unsigned int size;
-	int err;
-
-	if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
-		err = prealloc_memcg_shrinker(shrinker);
-		if (err != -ENOSYS)
-			return err;
-
-		shrinker->flags &= ~SHRINKER_MEMCG_AWARE;
-	}
-
-	size = sizeof(*shrinker->nr_deferred);
-	if (shrinker->flags & SHRINKER_NUMA_AWARE)
-		size *= nr_node_ids;
-
-	shrinker->nr_deferred = kzalloc(size, GFP_KERNEL);
-	if (!shrinker->nr_deferred)
-		return -ENOMEM;
-
-	return 0;
-}
-
-#ifdef CONFIG_SHRINKER_DEBUG
-int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
-{
-	va_list ap;
-	int err;
-
-	va_start(ap, fmt);
-	shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
-	va_end(ap);
-	if (!shrinker->name)
-		return -ENOMEM;
-
-	err = __prealloc_shrinker(shrinker);
-	if (err) {
-		kfree_const(shrinker->name);
-		shrinker->name = NULL;
-	}
-
-	return err;
-}
-#else
-int prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...)
-{
-	return __prealloc_shrinker(shrinker);
-}
-#endif
-
-void free_prealloced_shrinker(struct shrinker *shrinker)
-{
-#ifdef CONFIG_SHRINKER_DEBUG
-	kfree_const(shrinker->name);
-	shrinker->name = NULL;
-#endif
-	if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
-		down_write(&shrinker_rwsem);
-		unregister_memcg_shrinker(shrinker);
-		up_write(&shrinker_rwsem);
-		return;
-	}
-
-	kfree(shrinker->nr_deferred);
-	shrinker->nr_deferred = NULL;
-}
-
-void register_shrinker_prepared(struct shrinker *shrinker)
-{
-	down_write(&shrinker_rwsem);
-	list_add_tail(&shrinker->list, &shrinker_list);
-	shrinker->flags |= SHRINKER_REGISTERED;
-	shrinker_debugfs_add(shrinker);
-	up_write(&shrinker_rwsem);
-}
-
-static int __register_shrinker(struct shrinker *shrinker)
-{
-	int err = __prealloc_shrinker(shrinker);
-
-	if (err)
-		return err;
-	register_shrinker_prepared(shrinker);
-	return 0;
-}
-
-#ifdef CONFIG_SHRINKER_DEBUG
-int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
-{
-	va_list ap;
-	int err;
-
-	va_start(ap, fmt);
-	shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
-	va_end(ap);
-	if (!shrinker->name)
-		return -ENOMEM;
-
-	err = __register_shrinker(shrinker);
-	if (err) {
-		kfree_const(shrinker->name);
-		shrinker->name = NULL;
-	}
-	return err;
-}
-#else
-int register_shrinker(struct shrinker *shrinker, const char *fmt, ...)
-{
-	return __register_shrinker(shrinker);
-}
-#endif
-EXPORT_SYMBOL(register_shrinker);
-
-/*
- * Remove one
- */
-void unregister_shrinker(struct shrinker *shrinker)
-{
-	struct dentry *debugfs_entry;
-	int debugfs_id;
-
-	if (!(shrinker->flags & SHRINKER_REGISTERED))
-		return;
-
-	down_write(&shrinker_rwsem);
-	list_del(&shrinker->list);
-	shrinker->flags &= ~SHRINKER_REGISTERED;
-	if (shrinker->flags & SHRINKER_MEMCG_AWARE)
-		unregister_memcg_shrinker(shrinker);
-	debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id);
-	up_write(&shrinker_rwsem);
-
-	shrinker_debugfs_remove(debugfs_entry, debugfs_id);
-
-	kfree(shrinker->nr_deferred);
-	shrinker->nr_deferred = NULL;
-}
-EXPORT_SYMBOL(unregister_shrinker);
-
-/**
- * synchronize_shrinkers - Wait for all running shrinkers to complete.
- *
- * This is equivalent to calling unregister_shrink() and register_shrinker(),
- * but atomically and with less overhead. This is useful to guarantee that all
- * shrinker invocations have seen an update, before freeing memory, similar to
- * rcu.
- */
-void synchronize_shrinkers(void)
-{
-	down_write(&shrinker_rwsem);
-	up_write(&shrinker_rwsem);
-}
-EXPORT_SYMBOL(synchronize_shrinkers);
-
-#define SHRINK_BATCH 128
-
-static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
-				    struct shrinker *shrinker, int priority)
-{
-	unsigned long freed = 0;
-	unsigned long long delta;
-	long total_scan;
-	long freeable;
-	long nr;
-	long new_nr;
-	long batch_size = shrinker->batch ? shrinker->batch
-					  : SHRINK_BATCH;
-	long scanned = 0, next_deferred;
-
-	freeable = shrinker->count_objects(shrinker, shrinkctl);
-	if (freeable == 0 || freeable == SHRINK_EMPTY)
-		return freeable;
-
-	/*
-	 * copy the current shrinker scan count into a local variable
-	 * and zero it so that other concurrent shrinker invocations
-	 * don't also do this scanning work.
-	 */
-	nr = xchg_nr_deferred(shrinker, shrinkctl);
-
-	if (shrinker->seeks) {
-		delta = freeable >> priority;
-		delta *= 4;
-		do_div(delta, shrinker->seeks);
-	} else {
-		/*
-		 * These objects don't require any IO to create. Trim
-		 * them aggressively under memory pressure to keep
-		 * them from causing refetches in the IO caches.
-		 */
-		delta = freeable / 2;
-	}
-
-	total_scan = nr >> priority;
-	total_scan += delta;
-	total_scan = min(total_scan, (2 * freeable));
-
-	trace_mm_shrink_slab_start(shrinker, shrinkctl, nr,
-				   freeable, delta, total_scan, priority);
-
-	/*
-	 * Normally, we should not scan less than batch_size objects in one
-	 * pass to avoid too frequent shrinker calls, but if the slab has less
-	 * than batch_size objects in total and we are really tight on memory,
-	 * we will try to reclaim all available objects, otherwise we can end
-	 * up failing allocations although there are plenty of reclaimable
-	 * objects spread over several slabs with usage less than the
-	 * batch_size.
-	 *
-	 * We detect the "tight on memory" situations by looking at the total
-	 * number of objects we want to scan (total_scan). If it is greater
-	 * than the total number of objects on slab (freeable), we must be
-	 * scanning at high prio and therefore should try to reclaim as much as
-	 * possible.
-	 */
-	while (total_scan >= batch_size ||
-	       total_scan >= freeable) {
-		unsigned long ret;
-		unsigned long nr_to_scan = min(batch_size, total_scan);
-
-		shrinkctl->nr_to_scan = nr_to_scan;
-		shrinkctl->nr_scanned = nr_to_scan;
-		ret = shrinker->scan_objects(shrinker, shrinkctl);
-		if (ret == SHRINK_STOP)
-			break;
-		freed += ret;
-
-		count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned);
-		total_scan -= shrinkctl->nr_scanned;
-		scanned += shrinkctl->nr_scanned;
-
-		cond_resched();
-	}
-
-	/*
-	 * The deferred work is increased by any new work (delta) that wasn't
-	 * done, decreased by old deferred work that was done now.
-	 *
-	 * And it is capped to two times of the freeable items.
-	 */
-	next_deferred = max_t(long, (nr + delta - scanned), 0);
-	next_deferred = min(next_deferred, (2 * freeable));
-
-	/*
-	 * move the unused scan count back into the shrinker in a
-	 * manner that handles concurrent updates.
-	 */
-	new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl);
-
-	trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan);
-	return freed;
-}
-
-#ifdef CONFIG_MEMCG
-static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
-			struct mem_cgroup *memcg, int priority)
-{
-	struct shrinker_info *info;
-	unsigned long ret, freed = 0;
-	int i;
-
-	if (!mem_cgroup_online(memcg))
-		return 0;
-
-	if (!down_read_trylock(&shrinker_rwsem))
-		return 0;
-
-	info = shrinker_info_protected(memcg, nid);
-	if (unlikely(!info))
-		goto unlock;
-
-	for_each_set_bit(i, info->map, info->map_nr_max) {
-		struct shrink_control sc = {
-			.gfp_mask = gfp_mask,
-			.nid = nid,
-			.memcg = memcg,
-		};
-		struct shrinker *shrinker;
-
-		shrinker = idr_find(&shrinker_idr, i);
-		if (unlikely(!shrinker || !(shrinker->flags & SHRINKER_REGISTERED))) {
-			if (!shrinker)
-				clear_bit(i, info->map);
-			continue;
-		}
-
-		/* Call non-slab shrinkers even though kmem is disabled */
-		if (!memcg_kmem_online() &&
-		    !(shrinker->flags & SHRINKER_NONSLAB))
-			continue;
-
-		ret = do_shrink_slab(&sc, shrinker, priority);
-		if (ret == SHRINK_EMPTY) {
-			clear_bit(i, info->map);
-			/*
-			 * After the shrinker reported that it had no objects to
-			 * free, but before we cleared the corresponding bit in
-			 * the memcg shrinker map, a new object might have been
-			 * added. To make sure, we have the bit set in this
-			 * case, we invoke the shrinker one more time and reset
-			 * the bit if it reports that it is not empty anymore.
-			 * The memory barrier here pairs with the barrier in
-			 * set_shrinker_bit():
-			 *
-			 * list_lru_add()     shrink_slab_memcg()
-			 *   list_add_tail()    clear_bit()
-			 *   <MB>               <MB>
-			 *   set_bit()          do_shrink_slab()
-			 */
-			smp_mb__after_atomic();
-			ret = do_shrink_slab(&sc, shrinker, priority);
-			if (ret == SHRINK_EMPTY)
-				ret = 0;
-			else
-				set_shrinker_bit(memcg, nid, i);
-		}
-		freed += ret;
-
-		if (rwsem_is_contended(&shrinker_rwsem)) {
-			freed = freed ? : 1;
-			break;
-		}
-	}
-unlock:
-	up_read(&shrinker_rwsem);
-	return freed;
-}
-#else /* CONFIG_MEMCG */
-static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
-			struct mem_cgroup *memcg, int priority)
-{
-	return 0;
-}
-#endif /* CONFIG_MEMCG */
-
-/**
- * shrink_slab - shrink slab caches
- * @gfp_mask: allocation context
- * @nid: node whose slab caches to target
- * @memcg: memory cgroup whose slab caches to target
- * @priority: the reclaim priority
- *
- * Call the shrink functions to age shrinkable caches.
- *
- * @nid is passed along to shrinkers with SHRINKER_NUMA_AWARE set,
- * unaware shrinkers will receive a node id of 0 instead.
- *
- * @memcg specifies the memory cgroup to target. Unaware shrinkers
- * are called only if it is the root cgroup.
- *
- * @priority is sc->priority, we take the number of objects and >> by priority
- * in order to get the scan target.
- *
- * Returns the number of reclaimed slab objects.
- */
-static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
-				 struct mem_cgroup *memcg,
-				 int priority)
-{
-	unsigned long ret, freed = 0;
-	struct shrinker *shrinker;
-
-	/*
-	 * The root memcg might be allocated even though memcg is disabled
-	 * via "cgroup_disable=memory" boot parameter. This could make
-	 * mem_cgroup_is_root() return false, then just run memcg slab
-	 * shrink, but skip global shrink. This may result in premature
-	 * oom.
-	 */
-	if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
-		return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
-
-	if (!down_read_trylock(&shrinker_rwsem))
-		goto out;
-
-	list_for_each_entry(shrinker, &shrinker_list, list) {
-		struct shrink_control sc = {
-			.gfp_mask = gfp_mask,
-			.nid = nid,
-			.memcg = memcg,
-		};
-
-		ret = do_shrink_slab(&sc, shrinker, priority);
-		if (ret == SHRINK_EMPTY)
-			ret = 0;
-		freed += ret;
-		/*
-		 * Bail out if someone want to register a new shrinker to
-		 * prevent the registration from being stalled for long periods
-		 * by parallel ongoing shrinking.
-		 */
-		if (rwsem_is_contended(&shrinker_rwsem)) {
-			freed = freed ? : 1;
-			break;
-		}
-	}
-
-	up_read(&shrinker_rwsem);
-out:
-	cond_resched();
-	return freed;
-}
-
 static unsigned long drop_slab_node(int nid)
 {
 	unsigned long freed = 0;
-- 
2.30.2

From: Qi Zheng
To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev,
	djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev, joel@joelfernandes.org, christian.koenig@amd.com, daniel@ffwll.ch
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, dri-devel@lists.freedesktop.org, linux-fsdevel@vger.kernel.org, Qi Zheng, Muchun Song
Subject: [PATCH v3 3/4] mm: shrinker: remove redundant shrinker_rwsem in debugfs operations
Date: Thu, 24 Aug 2023 11:35:38 +0800
Message-Id: <20230824033539.34570-4-zhengqi.arch@bytedance.com>
In-Reply-To: <20230824033539.34570-1-zhengqi.arch@bytedance.com>
References: <20230824033539.34570-1-zhengqi.arch@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

debugfs_remove_recursive() waits for debugfs_file_put() to return, so the
shrinker cannot be freed while a debugfs operation (such as
shrinker_debugfs_count_show() or shrinker_debugfs_scan_write()) is still
running. There is therefore no need to hold shrinker_rwsem during debugfs
operations.
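
The lifetime argument above can be illustrated with a small userspace
sketch. It is only an analogy, assuming POSIX threads: struct guarded_file
and the guarded_file_*() helpers are hypothetical names loosely modelled on
the debugfs_file_get()/debugfs_file_put() pairing, not the debugfs
implementation. Removal blocks new users and then waits for the running ones
to drain, so whatever the file operations touch may be freed only afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a file's "active users" bookkeeping. */
struct guarded_file {
	pthread_mutex_t lock;
	pthread_cond_t drained;
	int active_users;	/* operations currently running */
	bool removed;		/* set once removal has started */
};

static void guarded_file_init(struct guarded_file *f)
{
	pthread_mutex_init(&f->lock, NULL);
	pthread_cond_init(&f->drained, NULL);
	f->active_users = 0;
	f->removed = false;
}

/* "get": refuse new users once removal has begun. */
static bool guarded_file_get(struct guarded_file *f)
{
	bool ok;

	pthread_mutex_lock(&f->lock);
	ok = !f->removed;
	if (ok)
		f->active_users++;
	pthread_mutex_unlock(&f->lock);
	return ok;
}

/* "put": wake the remover when the last user leaves. */
static void guarded_file_put(struct guarded_file *f)
{
	pthread_mutex_lock(&f->lock);
	if (--f->active_users == 0)
		pthread_cond_broadcast(&f->drained);
	pthread_mutex_unlock(&f->lock);
}

/* "remove": block new users, then wait for the running ones. */
static void guarded_file_remove(struct guarded_file *f)
{
	pthread_mutex_lock(&f->lock);
	f->removed = true;
	while (f->active_users > 0)
		pthread_cond_wait(&f->drained, &f->lock);
	pthread_mutex_unlock(&f->lock);
}

static int object_alive = 1;	/* stands in for the shrinker */
static int op_saw_alive = -1;

/* A file operation that runs between get and put. */
static void *file_op(void *arg)
{
	struct guarded_file *f = arg;

	op_saw_alive = object_alive;	/* safe: removal is still waiting */
	guarded_file_put(f);
	return NULL;
}

/* Returns 0 if the operation ran entirely before the object was "freed". */
int run_drain_demo(void)
{
	struct guarded_file f;
	pthread_t t;
	int rc;

	guarded_file_init(&f);
	if (!guarded_file_get(&f))	/* reference taken for the operation */
		return -1;
	rc = pthread_create(&t, NULL, file_op, &f);
	assert(rc == 0);

	guarded_file_remove(&f);	/* returns only after the put */
	object_alive = 0;		/* now the object may go away */
	pthread_join(t, NULL);

	/* After removal, no new operation can start. */
	return (op_saw_alive == 1 && !guarded_file_get(&f)) ? 0 : -1;
}
```

Calling run_drain_demo() from a test harness should return 0: the operation
always observes the live object, and no new operation can start once
removal has completed.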
Signed-off-by: Qi Zheng
Reviewed-by: Muchun Song
---
 mm/shrinker_debug.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index ee0cddb4530f..e4ce509f619e 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -51,17 +51,12 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
 	struct mem_cgroup *memcg;
 	unsigned long total;
 	bool memcg_aware;
-	int ret, nid;
+	int ret = 0, nid;
 
 	count_per_node = kcalloc(nr_node_ids, sizeof(unsigned long), GFP_KERNEL);
 	if (!count_per_node)
 		return -ENOMEM;
 
-	ret = down_read_killable(&shrinker_rwsem);
-	if (ret) {
-		kfree(count_per_node);
-		return ret;
-	}
 	rcu_read_lock();
 
 	memcg_aware = shrinker->flags & SHRINKER_MEMCG_AWARE;
@@ -94,7 +89,6 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
 	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
 
 	rcu_read_unlock();
-	up_read(&shrinker_rwsem);
 
 	kfree(count_per_node);
 	return ret;
@@ -119,7 +113,6 @@ static ssize_t shrinker_debugfs_scan_write(struct file *file,
 	struct mem_cgroup *memcg = NULL;
 	int nid;
 	char kbuf[72];
-	ssize_t ret;
 
 	read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
 	if (copy_from_user(kbuf, buf, read_len))
@@ -148,12 +141,6 @@ static ssize_t shrinker_debugfs_scan_write(struct file *file,
 		return -EINVAL;
 	}
 
-	ret = down_read_killable(&shrinker_rwsem);
-	if (ret) {
-		mem_cgroup_put(memcg);
-		return ret;
-	}
-
 	sc.nid = nid;
 	sc.memcg = memcg;
 	sc.nr_to_scan = nr_to_scan;
@@ -161,7 +148,6 @@ static ssize_t shrinker_debugfs_scan_write(struct file *file,
 
 	shrinker->scan_objects(shrinker, &sc);
 
-	up_read(&shrinker_rwsem);
 	mem_cgroup_put(memcg);
 
 	return size;
-- 
2.30.2

From: Qi Zheng
To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev, joel@joelfernandes.org, christian.koenig@amd.com,
	daniel@ffwll.ch
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, dri-devel@lists.freedesktop.org, linux-fsdevel@vger.kernel.org, Qi Zheng, Muchun Song, Daniel Vetter
Subject: [PATCH v3 4/4] drm/ttm: introduce pool_shrink_rwsem
Date: Thu, 24 Aug 2023 11:35:39 +0800
Message-Id: <20230824033539.34570-5-zhengqi.arch@bytedance.com>
In-Reply-To: <20230824033539.34570-1-zhengqi.arch@bytedance.com>
References: <20230824033539.34570-1-zhengqi.arch@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, synchronize_shrinkers() is used only by the TTM pool, which only
requires that no shrinkers run in parallel.

Once the lockless slab shrink is implemented with the RCU+refcount method,
neither shrinker_rwsem nor synchronize_rcu() can guarantee that all shrinker
invocations have seen an update before memory is freed. So introduce a new
pool_shrink_rwsem and use it to implement a private
ttm_pool_synchronize_shrinkers() that serves the same purpose.
Signed-off-by: Qi Zheng
Reviewed-by: Muchun Song
Reviewed-by: Christian König
Acked-by: Daniel Vetter
---
 drivers/gpu/drm/ttm/ttm_pool.c | 17 ++++++++++++++++-
 include/linux/shrinker.h       |  1 -
 mm/shrinker.c                  | 15 ---------------
 3 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index cddb9151d20f..648ca70403a7 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -74,6 +74,7 @@ static struct ttm_pool_type global_dma32_uncached[MAX_ORDER + 1];
 static spinlock_t shrinker_lock;
 static struct list_head shrinker_list;
 static struct shrinker mm_shrinker;
+static DECLARE_RWSEM(pool_shrink_rwsem);
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
@@ -317,6 +318,7 @@ static unsigned int ttm_pool_shrink(void)
 	unsigned int num_pages;
 	struct page *p;
 
+	down_read(&pool_shrink_rwsem);
 	spin_lock(&shrinker_lock);
 	pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list);
 	list_move_tail(&pt->shrinker_list, &shrinker_list);
@@ -329,6 +331,7 @@ static unsigned int ttm_pool_shrink(void)
 	} else {
 		num_pages = 0;
 	}
+	up_read(&pool_shrink_rwsem);
 
 	return num_pages;
 }
@@ -572,6 +575,18 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev,
 }
 EXPORT_SYMBOL(ttm_pool_init);
 
+/**
+ * ttm_pool_synchronize_shrinkers - Wait for all running shrinkers to complete.
+ *
+ * This is useful to guarantee that all shrinker invocations have seen an
+ * update, before freeing memory, similar to rcu.
+ */
+static void ttm_pool_synchronize_shrinkers(void)
+{
+	down_write(&pool_shrink_rwsem);
+	up_write(&pool_shrink_rwsem);
+}
+
 /**
  * ttm_pool_fini - Cleanup a pool
  *
@@ -593,7 +608,7 @@ void ttm_pool_fini(struct ttm_pool *pool)
 	/* We removed the pool types from the LRU, but we need to also make sure
 	 * that no shrinker is concurrently freeing pages from the pool.
 	 */
-	synchronize_shrinkers();
+	ttm_pool_synchronize_shrinkers();
 }
 EXPORT_SYMBOL(ttm_pool_fini);
 
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 8dc15aa37410..6b5843c3b827 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -103,7 +103,6 @@ extern int __printf(2, 3) register_shrinker(struct shrinker *shrinker,
 					    const char *fmt, ...);
 extern void unregister_shrinker(struct shrinker *shrinker);
 extern void free_prealloced_shrinker(struct shrinker *shrinker);
-extern void synchronize_shrinkers(void);
 
 #ifdef CONFIG_SHRINKER_DEBUG
 extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinker,
diff --git a/mm/shrinker.c b/mm/shrinker.c
index 043c87ccfab4..a16cd448b924 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -692,18 +692,3 @@ void unregister_shrinker(struct shrinker *shrinker)
 	shrinker->nr_deferred = NULL;
 }
 EXPORT_SYMBOL(unregister_shrinker);
-
-/**
- * synchronize_shrinkers - Wait for all running shrinkers to complete.
- *
- * This is equivalent to calling unregister_shrink() and register_shrinker(),
- * but atomically and with less overhead. This is useful to guarantee that all
- * shrinker invocations have seen an update, before freeing memory, similar to
- * rcu.
- */
-void synchronize_shrinkers(void)
-{
-	down_write(&shrinker_rwsem);
-	up_write(&shrinker_rwsem);
-}
-EXPORT_SYMBOL(synchronize_shrinkers);
-- 
2.30.2