From nobody Tue Apr 28 01:11:08 2026
From: Miaohe Lin
Subject: [PATCH v2 1/3] mm/swapfile: make security_vm_enough_memory_mm() work as expected
Date: Wed, 8 Jun 2022 22:40:29 +0800
Message-ID: <20220608144031.829-2-linmiaohe@huawei.com>
In-Reply-To: <20220608144031.829-1-linmiaohe@huawei.com>
References: <20220608144031.829-1-linmiaohe@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

security_vm_enough_memory_mm() checks whether a process has enough memory
to allocate a new virtual mapping, and it counts total_swap_pages as
available memory. swapoff calls it to make sure there is enough memory to
hold the pages that are currently swapped out. But at that point
total_swap_pages still includes the swap space that is being swapped off,
so the check succeeds even when there is no memory to hold the swapped-out
pages, because total_swap_pages is always greater than or equal to
p->pages.

To fix this, subtract p->pages from total_swap_pages first, and then check
whether there is enough memory for the in-use swap pages.

Signed-off-by: Miaohe Lin
Reviewed-by: David Hildenbrand
---
 mm/swapfile.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index ec4c1b276691..d2bead7b8b70 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2398,6 +2398,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	struct filename *pathname;
 	int err, found = 0;
 	unsigned int old_block_size;
+	unsigned int inuse_pages;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -2428,9 +2429,13 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 		spin_unlock(&swap_lock);
 		goto out_dput;
 	}
-	if (!security_vm_enough_memory_mm(current->mm, p->pages))
-		vm_unacct_memory(p->pages);
+
+	total_swap_pages -= p->pages;
+	inuse_pages = READ_ONCE(p->inuse_pages);
+	if (!security_vm_enough_memory_mm(current->mm, inuse_pages))
+		vm_unacct_memory(inuse_pages);
 	else {
+		total_swap_pages += p->pages;
 		err = -ENOMEM;
 		spin_unlock(&swap_lock);
 		goto out_dput;
@@ -2453,7 +2458,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	}
 	plist_del(&p->list, &swap_active_head);
 	atomic_long_sub(p->pages, &nr_swap_pages);
-	total_swap_pages -= p->pages;
 	p->flags &= ~SWP_WRITEOK;
 	spin_unlock(&p->lock);
 	spin_unlock(&swap_lock);
-- 
2.23.0

From nobody Tue Apr 28 01:11:08 2026
From: Miaohe Lin
Subject: [PATCH v2 2/3] mm/swapfile: fix possible data races of inuse_pages
Date: Wed, 8 Jun 2022 22:40:30 +0800
Message-ID: <20220608144031.829-3-linmiaohe@huawei.com>
In-Reply-To: <20220608144031.829-1-linmiaohe@huawei.com>
References: <20220608144031.829-1-linmiaohe@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

si->inuse_pages can still be accessed concurrently. The plain reads
outside the si->lock critical section, i.e. in swap_show() and
si_swapinfo(), result in data races. Annotate them with READ_ONCE(). The
races themselves should be harmless because the values are only used for
showing swap info.

Signed-off-by: Miaohe Lin
Reviewed-by: David Hildenbrand
---
 mm/swapfile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index d2bead7b8b70..3fa26f6971e9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2646,7 +2646,7 @@ static int swap_show(struct seq_file *swap, void *v)
 	}
 
 	bytes = si->pages << (PAGE_SHIFT - 10);
-	inuse = si->inuse_pages << (PAGE_SHIFT - 10);
+	inuse = READ_ONCE(si->inuse_pages) << (PAGE_SHIFT - 10);
 
 	file = si->swap_file;
 	len = seq_file_path(swap, file, " \t\n\\");
@@ -3265,7 +3265,7 @@ void si_swapinfo(struct sysinfo *val)
 		struct swap_info_struct *si = swap_info[type];
 
 		if ((si->flags & SWP_USED) && !(si->flags & SWP_WRITEOK))
-			nr_to_be_unused += si->inuse_pages;
+			nr_to_be_unused += READ_ONCE(si->inuse_pages);
 	}
 	val->freeswap = atomic_long_read(&nr_swap_pages) + nr_to_be_unused;
 	val->totalswap = total_swap_pages + nr_to_be_unused;
-- 
2.23.0

From nobody Tue Apr 28 01:11:08 2026
From: Miaohe Lin
Subject: [PATCH v2 3/3] mm/swap: remove swap_cache_info statistics
Date: Wed, 8 Jun 2022 22:40:31 +0800
Message-ID: <20220608144031.829-4-linmiaohe@huawei.com>
In-Reply-To: <20220608144031.829-1-linmiaohe@huawei.com>
References: <20220608144031.829-1-linmiaohe@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The swap_cache_info statistics cannot easily be used to tune system
performance because they are not easily accessible. They also do not
provide really useful information when an OOM occurs. Removing these
statistics also helps mitigate unneeded cacheline contention on the
global swap_cache_info.

Suggested-by: David Hildenbrand
Signed-off-by: Miaohe Lin
Acked-by: "Huang, Ying"
Reviewed-by: David Hildenbrand
Reviewed-by: Muchun Song
---
 mm/swap_state.c | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 0a2021fc55ad..41c6a6053d5c 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -59,24 +59,11 @@ static bool enable_vma_readahead __read_mostly = true;
 #define GET_SWAP_RA_VAL(vma)					\
 	(atomic_long_read(&(vma)->swap_readahead_info) ? : 4)
 
-#define INC_CACHE_INFO(x)	data_race(swap_cache_info.x++)
-#define ADD_CACHE_INFO(x, nr)	data_race(swap_cache_info.x += (nr))
-
-static struct {
-	unsigned long add_total;
-	unsigned long del_total;
-	unsigned long find_success;
-	unsigned long find_total;
-} swap_cache_info;
-
 static atomic_t swapin_readahead_hits = ATOMIC_INIT(4);
 
 void show_swap_cache_info(void)
 {
 	printk("%lu pages in swap cache\n", total_swapcache_pages());
-	printk("Swap cache stats: add %lu, delete %lu, find %lu/%lu\n",
-		swap_cache_info.add_total, swap_cache_info.del_total,
-		swap_cache_info.find_success, swap_cache_info.find_total);
 	printk("Free swap = %ldkB\n",
 		get_nr_swap_pages() << (PAGE_SHIFT - 10));
 	printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
@@ -133,7 +120,6 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry,
 		address_space->nrpages += nr;
 		__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
 		__mod_lruvec_page_state(page, NR_SWAPCACHE, nr);
-		ADD_CACHE_INFO(add_total, nr);
 unlock:
 		xas_unlock_irq(&xas);
 	} while (xas_nomem(&xas, gfp));
@@ -172,7 +158,6 @@ void __delete_from_swap_cache(struct page *page,
 	address_space->nrpages -= nr;
 	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
 	__mod_lruvec_page_state(page, NR_SWAPCACHE, -nr);
-	ADD_CACHE_INFO(del_total, nr);
 }
 
 /**
@@ -348,12 +333,10 @@ struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
 	page = find_get_page(swap_address_space(entry), swp_offset(entry));
 	put_swap_device(si);
 
-	INC_CACHE_INFO(find_total);
 	if (page) {
 		bool vma_ra = swap_use_vma_readahead();
 		bool readahead;
 
-		INC_CACHE_INFO(find_success);
 		/*
 		 * At the moment, we don't support PG_readahead for anon THP
 		 * so let's bail out rather than confusing the readahead stat.
-- 
2.23.0
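
Illustration for patch 1/3 of this series: a minimal userspace sketch in C
of why the order of the total_swap_pages update matters. This is not kernel
code. toy_enough_memory(), free_ram_pages and the two swapoff_check_*()
helpers are hypothetical stand-ins, and the real
security_vm_enough_memory_mm() applies overcommit policy that is not
modeled here; the sketch only assumes that swap is counted as available
memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; the names only mirror the patch. */
static long free_ram_pages;   /* assumption: stand-in for free RAM */
static long total_swap_pages; /* swap space counted as available memory */

/* Toy stand-in for security_vm_enough_memory_mm(): 0 means "enough". */
static int toy_enough_memory(long pages)
{
	return pages <= free_ram_pages + total_swap_pages ? 0 : -1;
}

/*
 * Old ordering: the device being removed is still part of
 * total_swap_pages, so asking for dev_pages can never fail.
 */
static bool swapoff_check_old(long dev_pages)
{
	return toy_enough_memory(dev_pages) == 0;
}

/*
 * New ordering: take the device out of the total first, then ask only
 * for the pages that are actually in use. Roll back on failure.
 */
static bool swapoff_check_new(long dev_pages, long inuse_pages)
{
	assert(inuse_pages <= dev_pages);

	total_swap_pages -= dev_pages;
	if (toy_enough_memory(inuse_pages) == 0)
		return true;

	total_swap_pages += dev_pages; /* swapoff is refused */
	return false;
}
```

With 10 free RAM pages and a single 1000-page swap device holding 500
in-use pages, the old check passes although the 500 pages have nowhere to
go, while the new check refuses and restores total_swap_pages.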