From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
 chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com,
 ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com,
 ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v5 1/9] mm: swap: extend swap_shmem_alloc() to support batch
 SWAP_MAP_SHMEM flag setting
Date: Mon, 12 Aug 2024 15:42:02 +0800
Message-Id: <99f64115d04b285e009580eb177352c57119ffd0.1723434324.git.baolin.wang@linux.alibaba.com>

To support shmem large folio swap operations, add a new parameter to
swap_shmem_alloc() so that a single call can batch-set the SWAP_MAP_SHMEM
flag on multiple contiguous shmem swap entries. While we are at it, use
folio_nr_pages() to get the number of pages of the folio as a preparation.
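For illustration, a hedged sketch of the intended caller-side semantics
(not code from this patch): one batched call now covers the folio's whole
contiguous swap range, where each entry previously needed its own call:

	swp_entry_t swap = folio->swap;		/* first entry of the range */
	int nr_pages = folio_nr_pages(folio);

	/*
	 * Sets SWAP_MAP_SHMEM on all nr_pages contiguous entries,
	 * swp_offset(swap) .. swp_offset(swap) + nr_pages - 1, by
	 * passing nr_pages down to __swap_duplicate().
	 */
	swap_shmem_alloc(swap, nr_pages);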
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Barry Song
---
 include/linux/swap.h | 4 ++--
 mm/shmem.c           | 6 ++++--
 mm/swapfile.c        | 4 ++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1c8f844a9f0f..248db1dd7812 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -481,7 +481,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry);
 extern swp_entry_t get_swap_page_of_type(int);
 extern int get_swap_pages(int n, swp_entry_t swp_entries[], int order);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
-extern void swap_shmem_alloc(swp_entry_t);
+extern void swap_shmem_alloc(swp_entry_t, int);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t entry, int nr);
 extern void swap_free_nr(swp_entry_t entry, int nr_pages);
@@ -548,7 +548,7 @@ static inline int add_swap_count_continuation(swp_entry_t swp, gfp_t gfp_mask)
 	return 0;
 }
 
-static inline void swap_shmem_alloc(swp_entry_t swp)
+static inline void swap_shmem_alloc(swp_entry_t swp, int nr)
 {
 }
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 4a5254bfd610..22cdc10f27ea 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1452,6 +1452,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	swp_entry_t swap;
 	pgoff_t index;
+	int nr_pages;
 
 	/*
 	 * Our capabilities prevent regular writeback or sync from ever calling
@@ -1484,6 +1485,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	}
 
 	index = folio->index;
+	nr_pages = folio_nr_pages(folio);
 
 	/*
	 * This is somewhat ridiculous, but without plumbing a SWAP_MAP_FALLOC
@@ -1536,8 +1538,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	if (add_to_swap_cache(folio, swap,
 			__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN,
 			NULL) == 0) {
-		shmem_recalc_inode(inode, 0, 1);
-		swap_shmem_alloc(swap);
+		shmem_recalc_inode(inode, 0, nr_pages);
+		swap_shmem_alloc(swap, nr_pages);
 		shmem_delete_from_page_cache(folio, swp_to_radix_entry(swap));
 
 		mutex_unlock(&shmem_swaplist_mutex);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6de12d712c7e..1caeee676696 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3604,9 +3604,9 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
  * Help swapoff by noting that swap entry belongs to shmem/tmpfs
  * (in which case its reference count is never incremented).
  */
-void swap_shmem_alloc(swp_entry_t entry)
+void swap_shmem_alloc(swp_entry_t entry, int nr)
 {
-	__swap_duplicate(entry, SWAP_MAP_SHMEM, 1);
+	__swap_duplicate(entry, SWAP_MAP_SHMEM, nr);
 }
 
 /*
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 2/9] mm: shmem: extend shmem_partial_swap_usage() to
 support large folio swap
Date: Mon, 12 Aug 2024 15:42:03 +0800
Message-Id: <60b130b9fc3e422bb91293a172c2113c85e9233a.1723434324.git.baolin.wang@linux.alibaba.com>

To support shmem large folio swapout in the following patches, use
xa_get_order() to get the order of the swap entry when calculating the
swap usage of shmem.
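As a worked example (illustrative numbers, not from the patch): an
order-4 value entry occupies a single multi-index slot in the mapping
but represents 16 swapped-out pages, and must be counted as such:

	/* an order-4 swap entry at index 64 covers indices 64..79 */
	if (xa_is_value(page))
		swapped += 1 << xa_get_order(xas.xa, xas.xa_index); /* += 16 */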
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 22cdc10f27ea..02fb188d627f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -890,7 +890,7 @@ unsigned long shmem_partial_swap_usage(struct address_space *mapping,
 		if (xas_retry(&xas, page))
 			continue;
 		if (xa_is_value(page))
-			swapped++;
+			swapped += 1 << xa_get_order(xas.xa, xas.xa_index);
 		if (xas.xa_index == max)
 			break;
 		if (need_resched()) {
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 3/9] mm: shmem: return the number of pages being freed in
 shmem_free_swap
Date: Mon, 12 Aug 2024 15:42:04 +0800
Message-Id: <9623e863c83d749d5ab407f6fdf0a8e5a3bdf052.1723434324.git.baolin.wang@linux.alibaba.com>

From: Daniel Gomez <da.gomez@samsung.com>

Both shmem_free_swap() callers expect the number of pages being freed.
In the large folio context, this needs to support values larger than the
current 0 (used to mean one page freed) and -ENOENT (used to mean zero
pages freed). In preparation for large folio adoption, make the
shmem_free_swap() routine return the number of pages being freed, so
that returning 0 now means zero pages were freed. While we are at it,
switch to free_swap_and_cache_nr() to free large-order swap entries
(change by Baolin Wang).
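A hedged sketch of the new caller contract (caller-side only, not taken
verbatim from the diff below):

	long freed = shmem_free_swap(mapping, index, radswap);

	if (!freed) {
		/* entry was replaced under us: nothing freed, retry */
	} else {
		/* an order-N entry frees 1 << N pages in one call */
		nr_swaps_freed += freed;
	}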
Suggested-by: Matthew Wilcox
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 02fb188d627f..d0d54939da48 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -856,18 +856,22 @@ static void shmem_delete_from_page_cache(struct folio *folio, void *radswap)
 }
 
 /*
- * Remove swap entry from page cache, free the swap and its page cache.
+ * Remove swap entry from page cache, free the swap and its page cache. Returns
+ * the number of pages being freed. 0 means entry not found in XArray (0 pages
+ * being freed).
  */
-static int shmem_free_swap(struct address_space *mapping,
-			   pgoff_t index, void *radswap)
+static long shmem_free_swap(struct address_space *mapping,
+			    pgoff_t index, void *radswap)
 {
+	int order = xa_get_order(&mapping->i_pages, index);
 	void *old;
 
 	old = xa_cmpxchg_irq(&mapping->i_pages, index, radswap, NULL, 0);
 	if (old != radswap)
-		return -ENOENT;
-	free_swap_and_cache(radix_to_swp_entry(radswap));
-	return 0;
+		return 0;
+	free_swap_and_cache_nr(radix_to_swp_entry(radswap), 1 << order);
+
+	return 1 << order;
 }
 
 /*
@@ -1019,7 +1023,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 			if (xa_is_value(folio)) {
 				if (unfalloc)
 					continue;
-				nr_swaps_freed += !shmem_free_swap(mapping,
+				nr_swaps_freed += shmem_free_swap(mapping,
 							indices[i], folio);
 				continue;
 			}
@@ -1086,14 +1090,17 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 			folio = fbatch.folios[i];
 
 			if (xa_is_value(folio)) {
+				long swaps_freed;
+
 				if (unfalloc)
 					continue;
-				if (shmem_free_swap(mapping, indices[i], folio)) {
+				swaps_freed = shmem_free_swap(mapping, indices[i], folio);
+				if (!swaps_freed) {
 					/* Swap was replaced by page: retry */
 					index = indices[i];
 					break;
 				}
-				nr_swaps_freed++;
+				nr_swaps_freed += swaps_freed;
 				continue;
 			}
 
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 4/9] mm: filemap: use xa_get_order() to get the swap entry
 order
Date: Mon, 12 Aug 2024 15:42:05 +0800
Message-Id: <6876d55145c1cc80e79df7884aa3a62e397b101d.1723434324.git.baolin.wang@linux.alibaba.com>
In the following patches, shmem will support swapping out large folios,
which means the shmem mappings may contain large-order swap entries, so
use xa_get_order() to get the folio order of the shmem swap entry and
update '*start' correctly.
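For example (illustrative numbers): with an order-2 swap entry at index
40, find_get_entries() must advance '*start' past the whole entry rather
than by a single page:

	/* a value entry of order 2 at index 40 covers indices 40..43 */
	nr = 1 << xa_get_order(&mapping->i_pages, indices[idx]);	/* 4 */
	*start = indices[idx] + nr;					/* 44, not 41 */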
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/filemap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 4130be74f6fd..4c312aab8b1f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2056,6 +2056,8 @@ unsigned find_get_entries(struct address_space *mapping, pgoff_t *start,
 		folio = fbatch->folios[idx];
 		if (!xa_is_value(folio))
 			nr = folio_nr_pages(folio);
+		else
+			nr = 1 << xa_get_order(&mapping->i_pages, indices[idx]);
 		*start = indices[idx] + nr;
 	}
 	return folio_batch_count(fbatch);
@@ -2120,6 +2122,8 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t *start,
 		folio = fbatch->folios[idx];
 		if (!xa_is_value(folio))
 			nr = folio_nr_pages(folio);
+		else
+			nr = 1 << xa_get_order(&mapping->i_pages, indices[idx]);
 		*start = indices[idx] + nr;
 	}
 	return folio_batch_count(fbatch);
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 5/9] mm: shmem: use swap_free_nr() to free shmem swap
 entries
Date: Mon, 12 Aug 2024 15:42:06 +0800
Message-Id: <342207fa679fc88a447dac2e101ad79e6050fe79.1723434324.git.baolin.wang@linux.alibaba.com>

As a preparation for supporting shmem large folio swapout, use
swap_free_nr() to free the contiguous swap entries of a shmem large
folio when the large folio has been swapped in from the swap cache. In
addition, the index should be rounded down to the number of pages when
adding the swapin folio into the pagecache.
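A worked example of the index alignment (illustrative numbers): a fault
at index 35 that hits a swapped-out 16-page folio must insert the folio
at its aligned base index and free the whole entry range in one call:

	/* nr_pages == 16, faulting index == 35 */
	pgoff_t base = round_down(index, nr_pages);	/* 32, the folio's first index */

	/* the folio now covers indices 32..47; free all 16 entries at once */
	swap_free_nr(swap, nr_pages);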
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index d0d54939da48..f6bab42180ea 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1961,6 +1961,7 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 	struct address_space *mapping = inode->i_mapping;
 	swp_entry_t swapin_error;
 	void *old;
+	int nr_pages;
 
 	swapin_error = make_poisoned_swp_entry();
 	old = xa_cmpxchg_irq(&mapping->i_pages, index,
@@ -1969,6 +1970,7 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 	if (old != swp_to_radix_entry(swap))
 		return;
 
+	nr_pages = folio_nr_pages(folio);
 	folio_wait_writeback(folio);
 	delete_from_swap_cache(folio);
 	/*
@@ -1976,8 +1978,8 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 	 * won't be 0 when inode is released and thus trigger WARN_ON(i_blocks)
 	 * in shmem_evict_inode().
 	 */
-	shmem_recalc_inode(inode, -1, -1);
-	swap_free(swap);
+	shmem_recalc_inode(inode, -nr_pages, -nr_pages);
+	swap_free_nr(swap, nr_pages);
 }
 
 /*
@@ -1996,7 +1998,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	swp_entry_t swap;
-	int error;
+	int error, nr_pages;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
 	swap = radix_to_swp_entry(*foliop);
@@ -2043,6 +2045,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 	folio_wait_writeback(folio);
+	nr_pages = folio_nr_pages(folio);
 
 	/*
	 * Some architectures may have to restore extra metadata to the
@@ -2056,19 +2059,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 
-	error = shmem_add_to_page_cache(folio, mapping, index,
+	error = shmem_add_to_page_cache(folio, mapping,
+					round_down(index, nr_pages),
 					swp_to_radix_entry(swap), gfp);
 	if (error)
 		goto failed;
 
-	shmem_recalc_inode(inode, 0, -1);
+	shmem_recalc_inode(inode, 0, -nr_pages);
 
 	if (sgp == SGP_WRITE)
 		folio_mark_accessed(folio);
 
 	delete_from_swap_cache(folio);
 	folio_mark_dirty(folio);
-	swap_free(swap);
+	swap_free_nr(swap, nr_pages);
 	put_swap_device(si);
 
 	*foliop = folio;
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 6/9] mm: shmem: support large folio allocation for
 shmem_replace_folio()
Date: Mon, 12 Aug 2024 15:42:07 +0800

To support large folio swapin for shmem in the following patches, add
large folio allocation for the new replacement folio in
shmem_replace_folio(). Moreover, large folios occupy N consecutive
entries in the swap cache instead of using multi-index entries as the
page cache does, so we should replace each consecutive entry in the
swap cache rather than calling shmem_replace_entry(). Also update the
statistics and the folio reference count using the number of pages in
the folio.
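The layout difference driving this change, as a hedged sketch: each of
the old folio's N swap cache slots is verified and replaced individually
under the xarray lock:

	XA_STATE(xas, &swap_mapping->i_pages, swap_index);
	int i;

	for (i = 0; i < nr_pages; i++) {
		if (xas_load(&xas) != old)	/* every slot must still hold 'old' */
			break;			/* bail out, e.g. with -ENOENT */
		xas_store(&xas, new);		/* replace this order-0 slot */
		xas_next(&xas);			/* advance to the next slot */
	}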
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 54 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index f6bab42180ea..d94f02ad7bd1 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1889,28 +1889,24 @@ static bool shmem_should_replace_folio(struct folio *folio, gfp_t gfp)
 static int shmem_replace_folio(struct folio **foliop, gfp_t gfp,
 				struct shmem_inode_info *info, pgoff_t index)
 {
-	struct folio *old, *new;
-	struct address_space *swap_mapping;
-	swp_entry_t entry;
-	pgoff_t swap_index;
-	int error;
-
-	old = *foliop;
-	entry = old->swap;
-	swap_index = swap_cache_index(entry);
-	swap_mapping = swap_address_space(entry);
+	struct folio *new, *old = *foliop;
+	swp_entry_t entry = old->swap;
+	struct address_space *swap_mapping = swap_address_space(entry);
+	pgoff_t swap_index = swap_cache_index(entry);
+	XA_STATE(xas, &swap_mapping->i_pages, swap_index);
+	int nr_pages = folio_nr_pages(old);
+	int error = 0, i;
 
 	/*
	 * We have arrived here because our zones are constrained, so don't
	 * limit chance of success by further cpuset and node constraints.
	 */
 	gfp &= ~GFP_CONSTRAINT_MASK;
-	VM_BUG_ON_FOLIO(folio_test_large(old), old);
-	new = shmem_alloc_folio(gfp, 0, info, index);
+	new = shmem_alloc_folio(gfp, folio_order(old), info, index);
 	if (!new)
 		return -ENOMEM;
 
-	folio_get(new);
+	folio_ref_add(new, nr_pages);
 	folio_copy(new, old);
 	flush_dcache_folio(new);
 
@@ -1920,18 +1916,25 @@ static int shmem_replace_folio(struct folio **foliop, gfp_t gfp,
 	new->swap = entry;
 	folio_set_swapcache(new);
 
-	/*
-	 * Our caller will very soon move newpage out of swapcache, but it's
-	 * a nice clean interface for us to replace oldpage by newpage there.
-	 */
+	/* Swap cache still stores N entries instead of a high-order entry */
 	xa_lock_irq(&swap_mapping->i_pages);
-	error = shmem_replace_entry(swap_mapping, swap_index, old, new);
+	for (i = 0; i < nr_pages; i++) {
+		void *item = xas_load(&xas);
+
+		if (item != old) {
+			error = -ENOENT;
+			break;
+		}
+
+		xas_store(&xas, new);
+		xas_next(&xas);
+	}
 	if (!error) {
 		mem_cgroup_replace_folio(old, new);
-		__lruvec_stat_mod_folio(new, NR_FILE_PAGES, 1);
-		__lruvec_stat_mod_folio(new, NR_SHMEM, 1);
-		__lruvec_stat_mod_folio(old, NR_FILE_PAGES, -1);
-		__lruvec_stat_mod_folio(old, NR_SHMEM, -1);
+		__lruvec_stat_mod_folio(new, NR_FILE_PAGES, nr_pages);
+		__lruvec_stat_mod_folio(new, NR_SHMEM, nr_pages);
+		__lruvec_stat_mod_folio(old, NR_FILE_PAGES, -nr_pages);
+		__lruvec_stat_mod_folio(old, NR_SHMEM, -nr_pages);
 	}
 	xa_unlock_irq(&swap_mapping->i_pages);
 
@@ -1951,7 +1954,12 @@ static int shmem_replace_folio(struct folio **foliop, gfp_t gfp,
 	old->private = NULL;
 
 	folio_unlock(old);
-	folio_put_refs(old, 2);
+	/*
	 * The old folio is removed from the swap cache: drop the 'nr_pages'
	 * references, as well as the one temporary reference taken from the
	 * swap cache.
	 */
+	folio_put_refs(old, nr_pages + 1);
 	return error;
 }
 
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 7/9] mm: shmem: drop folio reference count using
 'nr_pages' in shmem_delete_from_page_cache()
Date: Mon, 12 Aug 2024 15:42:08 +0800

To support large folio swapin/swapout for shmem in the following
patches, drop the folio's reference count by the number of pages
contained in the folio when a shmem folio is deleted from the shmem
pagecache after being added to the swap cache.
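The reference balance this restores, as a hedged sketch:
shmem_add_to_page_cache() takes 'nr' references on an nr-page folio, so
removal must drop the same count:

	int nr = folio_nr_pages(folio);

	/*
	 * Insertion did folio_ref_add(folio, nr); dropping a single
	 * reference here would leak nr - 1 references for a large folio.
	 */
	folio_put_refs(folio, nr);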
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index d94f02ad7bd1..345e25425e37 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -851,7 +851,7 @@ static void shmem_delete_from_page_cache(struct folio *folio, void *radswap)
 	__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
 	__lruvec_stat_mod_folio(folio, NR_SHMEM, -nr);
 	xa_unlock_irq(&mapping->i_pages);
-	folio_put(folio);
+	folio_put_refs(folio, nr);
 	BUG_ON(error);
 }
 
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 8/9] mm: shmem: split large entry if the swapin folio is
 not large
Date: Mon, 12 Aug 2024 15:42:09 +0800
Message-Id: <4a0f12f27c54a62eb4d9ca1265fed3a62531a63e.1723434324.git.baolin.wang@linux.alibaba.com>

Now the swap device can only swap in order-0 folios, even though a large
folio may have been swapped out. This requires us to split the large
entry previously saved in the shmem pagecache in order to support
swapping in small folios.
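A worked example of the post-split entry recalculation (illustrative
numbers): for a fault at index 35 on an order-4 entry whose first swap
offset is S, shmem_split_large_entry() returns split_order == 4 and the
fault must then target the order-0 entry three slots into the range:

	/* index == 35, split_order == 4, aligned base index == 32 */
	pgoff_t offset = index - round_down(index, 1 << split_order);	/* 3 */

	/* the order-0 entry for index 35 lives at swap offset S + 3 */
	swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);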
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index 345e25425e37..996062dc196b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1990,6 +1990,81 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 	swap_free_nr(swap, nr_pages);
 }
 
+static int shmem_split_large_entry(struct inode *inode, pgoff_t index,
+				   swp_entry_t swap, int new_order, gfp_t gfp)
+{
+	struct address_space *mapping = inode->i_mapping;
+	XA_STATE_ORDER(xas, &mapping->i_pages, index, new_order);
+	void *alloced_shadow = NULL;
+	int alloced_order = 0, i;
+
+	for (;;) {
+		int order = -1, split_order = 0;
+		void *old = NULL;
+
+		xas_lock_irq(&xas);
+		old = xas_load(&xas);
+		if (!xa_is_value(old) || swp_to_radix_entry(swap) != old) {
+			xas_set_err(&xas, -EEXIST);
+			goto unlock;
+		}
+
+		order = xas_get_order(&xas);
+
+		/* Swap entry may have changed before we re-acquire the lock */
+		if (alloced_order &&
+		    (old != alloced_shadow || order != alloced_order)) {
+			xas_destroy(&xas);
+			alloced_order = 0;
+		}
+
+		/* Try to split large swap entry in pagecache */
+		if (order > 0 && order > new_order) {
+			if (!alloced_order) {
+				split_order = order;
+				goto unlock;
+			}
+			xas_split(&xas, old, order);
+
+			/*
+			 * Re-set the swap entry after splitting, and the swap
+			 * offset of the original large entry must be continuous.
+			 */
+			for (i = 0; i < 1 << order; i += (1 << new_order)) {
+				pgoff_t aligned_index = round_down(index, 1 << order);
+				swp_entry_t tmp;
+
+				tmp = swp_entry(swp_type(swap), swp_offset(swap) + i);
+				__xa_store(&mapping->i_pages, aligned_index + i,
+					   swp_to_radix_entry(tmp), 0);
+			}
+		}
+
+unlock:
+		xas_unlock_irq(&xas);
+
+		/* split needed, alloc here and retry. */
+		if (split_order) {
+			xas_split_alloc(&xas, old, split_order, gfp);
+			if (xas_error(&xas))
+				goto error;
+			alloced_shadow = old;
+			alloced_order = split_order;
+			xas_reset(&xas);
+			continue;
+		}
+
+		if (!xas_nomem(&xas, gfp))
+			break;
+	}
+
+error:
+	if (xas_error(&xas))
+		return xas_error(&xas);
+
+	return alloced_order;
+}
+
 /*
  * Swap in the folio pointed to by *foliop.
  * Caller has to make sure that *foliop contains a valid swapped folio.
@@ -2026,12 +2101,37 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
+		int split_order;
+
 		/* Or update major stats only when swapin succeeds?? */
 		if (fault_type) {
 			*fault_type |= VM_FAULT_MAJOR;
 			count_vm_event(PGMAJFAULT);
 			count_memcg_event_mm(fault_mm, PGMAJFAULT);
 		}
+
+		/*
		 * Now swap device can only swap in order 0 folio, then we
		 * should split the large swap entry stored in the pagecache
		 * if necessary.
		 */
+		split_order = shmem_split_large_entry(inode, index, swap, 0, gfp);
+		if (split_order < 0) {
+			error = split_order;
+			goto failed;
+		}
+
+		/*
		 * If the large swap entry has already been split, it is
		 * necessary to recalculate the new swap entry based on
		 * the old order alignment.
		 */
+		if (split_order > 0) {
+			pgoff_t offset = index - round_down(index, 1 << split_order);
+
+			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+		}
+
 		/* Here we actually start the io */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
-- 
2.39.3

From nobody Tue Dec 16 19:53:34 2025
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v5 9/9] mm: shmem: support large folio swap out
Date: Mon, 12 Aug 2024 15:42:10 +0800

Shmem will support large folio allocation [1] [2] to get better
performance; however, memory reclaim still splits the precious large
folios when trying to swap out shmem, which may lead to memory
fragmentation and fails to take advantage of large folios for shmem.

Moreover, the swap code already supports swapping out large folios
without splitting them, hence this patch set supports large folio
swap-out for shmem.

Note the i915_gem_shmem driver still needs its folios to be split when
swapping, so add a new flag 'split_large_folio' to writeback_control to
indicate splitting the large folio.

[1] https://lore.kernel.org/all/cover.1717495894.git.baolin.wang@linux.alibaba.com/
[2] https://lore.kernel.org/all/20240515055719.32577-1-da.gomez@samsung.com/
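For testing, a hedged userspace sketch that can drive this path
(assumptions: a recent glibc exposing memfd_create() and MADV_PAGEOUT,
shmem THP enabled via /sys/kernel/mm/transparent_hugepage/shmem_enabled,
and active swap): it backs a mapping with shmem, dirties it, then asks
reclaim to page it out, which reaches shmem_writepage():

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		size_t len = 2UL << 20;			/* one PMD-sized shmem region */
		int fd = memfd_create("shmem-swapout", 0);

		if (fd < 0 || ftruncate(fd, len) < 0)
			return 1;

		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		if (p == MAP_FAILED)
			return 1;

		memset(p, 0x5a, len);			/* dirty the whole region */

		/* MADV_PAGEOUT drives reclaim: pageout() -> shmem_writepage() */
		if (madvise(p, len, MADV_PAGEOUT) < 0)
			perror("madvise");

		return 0;
	}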
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  1 +
 include/linux/writeback.h                 |  4 +++
 mm/shmem.c                                | 12 ++++++---
 mm/vmscan.c                               | 32 ++++++++++++++++++-----
 4 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index c5e1c718a6d2..c66cb9c585e1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -308,6 +308,7 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
 		.range_start = 0,
 		.range_end = LLONG_MAX,
 		.for_reclaim = 1,
+		.split_large_folio = 1,
 	};
 	unsigned long i;
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 1a54676d843a..10100e22d5c6 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -63,6 +63,7 @@ struct writeback_control {
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned for_sync:1;		/* sync(2) WB_SYNC_ALL writeback */
 	unsigned unpinned_netfs_wb:1;	/* Cleared I_PINNING_NETFS_WB */
+	unsigned split_large_folio:1;	/* Split large folio for shmem writeback */
 
 	/*
	 * When writeback IOs are bounced through async layers, only the
@@ -79,6 +80,9 @@ struct writeback_control {
	 */
 	struct swap_iocb **swap_plug;
 
+	/* Target list for splitting a large folio */
+	struct list_head *list;
+
 	/* internal fields used by the ->writepages implementation: */
 	struct folio_batch fbatch;
 	pgoff_t index;
diff --git a/mm/shmem.c b/mm/shmem.c
index 996062dc196b..50aeb03c4d34 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -795,7 +795,6 @@ static int shmem_add_to_page_cache(struct folio *folio,
 	VM_BUG_ON_FOLIO(index != round_down(index, nr), folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapbacked(folio), folio);
-	VM_BUG_ON(expected && folio_test_large(folio));
 
 	folio_ref_add(folio, nr);
 	folio->mapping = mapping;
@@ -1482,10 +1481,11 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
	 * "force", drivers/gpu/drm/i915/gem/i915_gem_shmem.c gets huge pages,
	 * and its shmem_writeback() needs them to be split when swapping.
	 */
-	if (folio_test_large(folio)) {
+	if (wbc->split_large_folio && folio_test_large(folio)) {
+try_split:
 		/* Ensure the subpages are still dirty */
 		folio_test_set_dirty(folio);
-		if (split_huge_page(page) < 0)
+		if (split_huge_page_to_list_to_order(page, wbc->list, 0))
 			goto redirty;
 		folio = page_folio(page);
 		folio_clear_dirty(folio);
@@ -1527,8 +1527,12 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	}
 
 	swap = folio_alloc_swap(folio);
-	if (!swap.val)
+	if (!swap.val) {
+		if (nr_pages > 1)
+			goto try_split;
+
 		goto redirty;
+	}
 
 	/*
	 * Add inode to shmem_unuse()'s list of swapped-out inodes,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 96ce889ea3d0..ba7b67218caf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -628,7 +628,7 @@ typedef enum {
  * Calls ->writepage().
  */
 static pageout_t pageout(struct folio *folio, struct address_space *mapping,
-			 struct swap_iocb **plug)
+			 struct swap_iocb **plug, struct list_head *folio_list)
 {
 	/*
	 * If the folio is dirty, only perform writeback if that write
@@ -676,6 +676,16 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			.swap_plug = plug,
 		};
 
+		/*
		 * The large shmem folio can be split if CONFIG_THP_SWAP is
		 * not enabled or contiguous swap entries are failed to
		 * allocate.
		 */
+		if (shmem_mapping(mapping) && folio_test_large(folio)) {
+			wbc.list = folio_list;
+			wbc.split_large_folio = !IS_ENABLED(CONFIG_THP_SWAP);
+		}
+
 		folio_set_reclaim(folio);
 		res = mapping->a_ops->writepage(&folio->page, &wbc);
 		if (res < 0)
@@ -1257,11 +1267,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					goto activate_locked_split;
 				}
 			}
-		} else if (folio_test_swapbacked(folio) &&
-			   folio_test_large(folio)) {
-			/* Split shmem folio */
-			if (split_folio_to_list(folio, folio_list))
-				goto keep_locked;
 		}
 
 		/*
@@ -1362,12 +1367,25 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
			 * starts and then write it out here.
			 */
 			try_to_unmap_flush_dirty();
-			switch (pageout(folio, mapping, &plug)) {
+			switch (pageout(folio, mapping, &plug, folio_list)) {
 			case PAGE_KEEP:
 				goto keep_locked;
 			case PAGE_ACTIVATE:
+				/*
				 * If shmem folio is split when writeback to swap,
				 * the tail pages will make their own pass through
				 * this function and be accounted then.
				 */
+				if (nr_pages > 1 && !folio_test_large(folio)) {
+					sc->nr_scanned -= (nr_pages - 1);
+					nr_pages = 1;
+				}
+
 				goto activate_locked;
 			case PAGE_SUCCESS:
+				if (nr_pages > 1 && !folio_test_large(folio)) {
+					sc->nr_scanned -= (nr_pages - 1);
+					nr_pages = 1;
+				}
 				stat->nr_pageout += nr_pages;
 
 				if (folio_test_writeback(folio))
-- 
2.39.3