From nobody Tue Dec 16 19:53:34 2025 Received: from out30-113.freemail.mail.aliyun.com (out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 30411210F8 for ; Tue, 18 Jun 2024 06:54:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.113 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693674; cv=none; b=lU8mOrSIx1IpUaNA8ztfI7DTMMQhDowSHp2dwjhxG15dq3C/g8V2cj5NdjzGNKbyXtl3hHTEwpbkJo7Rye5Ef3eXAw9X5l+yVybXXkUWF060J0G6Mb6oKbUycndUw4mNo73HMgSILH8/WzG24hQyafayKRCBZ9Xc07fjYZvqGHI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693674; c=relaxed/simple; bh=89pIoAXavdFIN8K8yf6MCqWgIZXiI9HwlIwNMVvrXm4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=bkRAT9s1jvDsAU/vzVwVdHfsuZy8DdxxTv2ujZjh/Xx+pEP3OhBV7iNzSAWu3rZlsvjr9c5oDlFhXRlEbvkl8IRzP0aokoFu/cGbM9mzGOM3ki2hnVXLuPDGqXUqZsQ1S3mTsEUcIF3+v4R3yBjKMq+lNKbIzzXYql0mSrpIHNM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=jPV10q7o; arc=none smtp.client-ip=115.124.30.113 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="jPV10q7o" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693670; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=yoIYIBJL5vLTtJOJekDK/vHyVKGOuJ66AVuG1N/Iti4=; 
b=jPV10q7olMSR8lQVPiAVl3vfo2TAPTkGrKgZXo3xtI56kZoEUXzLPXwZzQvh6nki6PLOEXw94WwZmqvUs/4USnES2yUHRFaJE4DD7xLqPz+v6F1kg7IKhG8NAfmRLKE1qRS4XxzEby7hfRHnSm2LBa+0ZDgXtbcn7yD9ShaN4+A= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037067112;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jKUXF_1718693668; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jKUXF_1718693668) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:28 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 1/9] mm: vmscan: add validation before splitting shmem large folio Date: Tue, 18 Jun 2024 14:54:13 +0800 Message-Id: <413866f2f2d996cfbb5c906467b36c96b1a82e2d.1718690645.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a check that swap space is available before splitting a shmem large folio, to avoid a redundant split, since the shmem folio cannot be written to the swap device in this case. 
Signed-off-by: Baolin Wang --- mm/vmscan.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/mm/vmscan.c b/mm/vmscan.c index 1807e5d95dda..61465f92283f 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1237,6 +1237,14 @@ static unsigned int shrink_folio_list(struct list_he= ad *folio_list, } } else if (folio_test_swapbacked(folio) && folio_test_large(folio)) { + + /* + * Do not split shmem folio if no swap memory + * available. + */ + if (!total_swap_pages) + goto activate_locked; + /* Split shmem folio */ if (split_folio_to_list(folio, folio_list)) goto keep_locked; --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA57A7D07E for ; Tue, 18 Jun 2024 06:54:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693680; cv=none; b=Fol8vPJ+37r8vg5BLpdq2p6kmH1X6Tyv1vUU3str2x5gdznjUW7OQ9kJ5ybTcDHD7F4o6vGVIaUSWVhVHwpjjoi7hPE02QAM2QnnHNCVMfeZ3zNB4ce7CjMZOP9pCEbaAc9pM32Yr1R79UHRVuvtthu/jtA2qBH/OUGHqf0VIYQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693680; c=relaxed/simple; bh=BWI5TFADp95k290UyR2q8Ifcha58gdQYoHqG8YBDuVM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Y0+2D2YN518aMGonagQScj4U3UgibkOOMVVLxr25ge1+vzAs29c6LnYnvDTy4/8PZTaG0LGmnARfBG0mwjo8g5/9JEzgVFAk+UpyCFDCREw3qiCR0trlN+bOqHWRw6uoXfDS/iS1565uIpsgWT154XAnGTlxlNiZu57oIeUT6Og= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=SdmM4p/h; arc=none 
smtp.client-ip=115.124.30.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="SdmM4p/h" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693671; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=A1nuL6+6NmFkbTac9imOaGsyA3d5PGdnrfoW9BMoYYs=; b=SdmM4p/hL0pBaDI+D+wT/aVXDvnuD6x5K84xMmQFz4Uf9psxbNi3uFDaAVevcy0lwdNiRqmJ9Dx4cIDYkzCGog8zVFzD+UnRRgDBX00WtMKfZ0ouoZqDNpJWLS9QxPxHZ3nnUVEWp0Ft4QVLsSltfear6YK1JDHRaDF0W0PsVW0= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032014031;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jH7lG_1718693669; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jH7lG_1718693669) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:29 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 2/9] mm: swap: extend swap_shmem_alloc() to support batch SWAP_MAP_SHMEM flag setting Date: Tue, 18 Jun 2024 14:54:14 +0800 Message-Id: <4909c4bf2f28665c6a35cd7e77990c5defc3d7fb.1718690645.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" To support shmem large folio swap operations, add a new parameter to swap_shmem_alloc() that allows batch SWAP_MAP_SHMEM flag setting for shmem swap entries. While we are at it, use folio_nr_pages() to get the number of pages in the folio, as preparation. Signed-off-by: Baolin Wang --- include/linux/swap.h | 4 +- mm/shmem.c | 6 ++- mm/swapfile.c | 98 +++++++++++++++++++++++--------------------- 3 files changed, 57 insertions(+), 51 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index d33ce740b695..bffb2281840d 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -474,7 +474,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t en= try); extern swp_entry_t get_swap_page_of_type(int); extern int get_swap_pages(int n, swp_entry_t swp_entries[], int order); extern int add_swap_count_continuation(swp_entry_t, gfp_t); -extern void swap_shmem_alloc(swp_entry_t); +extern void swap_shmem_alloc(swp_entry_t, int); extern int swap_duplicate(swp_entry_t); extern int swapcache_prepare(swp_entry_t); extern void swap_free_nr(swp_entry_t entry, int nr_pages); @@ -541,7 +541,7 @@ static inline int add_swap_count_continuation(swp_entry= _t swp, gfp_t gfp_mask) return 0; } =20 -static inline void swap_shmem_alloc(swp_entry_t swp) +static inline void swap_shmem_alloc(swp_entry_t swp, int nr) { } =20 diff --git a/mm/shmem.c b/mm/shmem.c index d09c6bf1f28a..b90965486631 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1432,6 +1432,7 @@ static int shmem_writepage(struct page *page, struct = writeback_control *wbc) struct shmem_sb_info *sbinfo =3D SHMEM_SB(inode->i_sb); swp_entry_t swap; pgoff_t index; + int nr_pages; =20 /* * Our capabilities prevent regular writeback or sync from ever calling @@ -1464,6 +1465,7 @@ static int shmem_writepage(struct page *page, struct = writeback_control *wbc) } =20 index =3D folio->index; + nr_pages =3D folio_nr_pages(folio); =20 /* * This is somewhat 
ridiculous, but without plumbing a SWAP_MAP_FALLOC @@ -1516,8 +1518,8 @@ static int shmem_writepage(struct page *page, struct = writeback_control *wbc) if (add_to_swap_cache(folio, swap, __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN, NULL) =3D=3D 0) { - shmem_recalc_inode(inode, 0, 1); - swap_shmem_alloc(swap); + shmem_recalc_inode(inode, 0, nr_pages); + swap_shmem_alloc(swap, nr_pages); shmem_delete_from_page_cache(folio, swp_to_radix_entry(swap)); =20 mutex_unlock(&shmem_swaplist_mutex); diff --git a/mm/swapfile.c b/mm/swapfile.c index 9c6d8e557c0f..1dde413264e2 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3362,62 +3362,58 @@ void si_swapinfo(struct sysinfo *val) * - swap-cache reference is requested but the entry is not used. -> ENOENT * - swap-mapped reference requested but needs continued swap count. -> EN= OMEM */ -static int __swap_duplicate(swp_entry_t entry, unsigned char usage) +static int __swap_duplicate(struct swap_info_struct *p, unsigned long offs= et, + int nr, unsigned char usage) { - struct swap_info_struct *p; struct swap_cluster_info *ci; - unsigned long offset; unsigned char count; unsigned char has_cache; - int err; + int err, i; =20 - p =3D swp_swap_info(entry); - - offset =3D swp_offset(entry); ci =3D lock_cluster_or_swap_info(p, offset); =20 - count =3D p->swap_map[offset]; - - /* - * swapin_readahead() doesn't check if a swap entry is valid, so the - * swap entry could be SWAP_MAP_BAD. Check here with lock held. 
- */ - if (unlikely(swap_count(count) =3D=3D SWAP_MAP_BAD)) { - err =3D -ENOENT; - goto unlock_out; - } - - has_cache =3D count & SWAP_HAS_CACHE; - count &=3D ~SWAP_HAS_CACHE; - err =3D 0; - - if (usage =3D=3D SWAP_HAS_CACHE) { + for (i =3D 0; i < nr; i++) { + count =3D p->swap_map[offset + i]; =20 - /* set SWAP_HAS_CACHE if there is no cache and entry is used */ - if (!has_cache && count) - has_cache =3D SWAP_HAS_CACHE; - else if (has_cache) /* someone else added cache */ - err =3D -EEXIST; - else /* no users remaining */ + /* + * swapin_readahead() doesn't check if a swap entry is valid, so the + * swap entry could be SWAP_MAP_BAD. Check here with lock held. + */ + if (unlikely(swap_count(count) =3D=3D SWAP_MAP_BAD)) { err =3D -ENOENT; + break; + } =20 - } else if (count || has_cache) { + has_cache =3D count & SWAP_HAS_CACHE; + count &=3D ~SWAP_HAS_CACHE; + err =3D 0; + + if (usage =3D=3D SWAP_HAS_CACHE) { + /* set SWAP_HAS_CACHE if there is no cache and entry is used */ + if (!has_cache && count) + has_cache =3D SWAP_HAS_CACHE; + else if (has_cache) /* someone else added cache */ + err =3D -EEXIST; + else /* no users remaining */ + err =3D -ENOENT; + } else if (count || has_cache) { + if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) + count +=3D usage; + else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) + err =3D -EINVAL; + else if (swap_count_continued(p, offset + i, count)) + count =3D COUNT_CONTINUED; + else + err =3D -ENOMEM; + } else + err =3D -ENOENT; /* unused swap entry */ =20 - if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) - count +=3D usage; - else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) - err =3D -EINVAL; - else if (swap_count_continued(p, offset, count)) - count =3D COUNT_CONTINUED; - else - err =3D -ENOMEM; - } else - err =3D -ENOENT; /* unused swap entry */ + if (err) + break; =20 - if (!err) - WRITE_ONCE(p->swap_map[offset], count | has_cache); + WRITE_ONCE(p->swap_map[offset + i], count | has_cache); + } =20 -unlock_out: 
unlock_cluster_or_swap_info(p, ci); return err; } @@ -3426,9 +3422,12 @@ static int __swap_duplicate(swp_entry_t entry, unsig= ned char usage) * Help swapoff by noting that swap entry belongs to shmem/tmpfs * (in which case its reference count is never incremented). */ -void swap_shmem_alloc(swp_entry_t entry) +void swap_shmem_alloc(swp_entry_t entry, int nr) { - __swap_duplicate(entry, SWAP_MAP_SHMEM); + struct swap_info_struct *p =3D swp_swap_info(entry); + unsigned long offset =3D swp_offset(entry); + + __swap_duplicate(p, offset, nr, SWAP_MAP_SHMEM); } =20 /* @@ -3440,9 +3439,11 @@ void swap_shmem_alloc(swp_entry_t entry) */ int swap_duplicate(swp_entry_t entry) { + struct swap_info_struct *p =3D swp_swap_info(entry); + unsigned long offset =3D swp_offset(entry); int err =3D 0; =20 - while (!err && __swap_duplicate(entry, 1) =3D=3D -ENOMEM) + while (!err && __swap_duplicate(p, offset, 1, 1) =3D=3D -ENOMEM) err =3D add_swap_count_continuation(entry, GFP_ATOMIC); return err; } @@ -3457,7 +3458,10 @@ int swap_duplicate(swp_entry_t entry) */ int swapcache_prepare(swp_entry_t entry) { - return __swap_duplicate(entry, SWAP_HAS_CACHE); + struct swap_info_struct *p =3D swp_swap_info(entry); + unsigned long offset =3D swp_offset(entry); + + return __swap_duplicate(p, offset, 1, SWAP_HAS_CACHE); } =20 void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry) --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5DBA173477 for ; Tue, 18 Jun 2024 06:54:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.119 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693677; cv=none; 
b=UZFM2hIctg+W4Ah/w6+z3kQCOKtEzSJuH6N3igEso7nS447RxEZtpum3HZlFCEf+F0Y2rdVukvw0k5FXeAHXVyKEfpZ4Y/CTly0lfTMWMwiM9Ckp1jlgfE8CfzzDbMVVnLqWrgufOVoYYaYjRtn56iQ6+wAA0JxgXy/K+mEMuPU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693677; c=relaxed/simple; bh=9w3+BGqMOeVAeOQIAOTwGksYjI6niZWaad8y/5Mil1k=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=WCDITY1T9yYaeQ52ozNqaXBbyumjTwGKEBNqAM+TKp4TI+296f21VEi/PTmnyO5ifI26CAxPLUgVu+Iy6ax3vnnqsJjsHbN/BMBJlJtxDm6dSOi8pU0qvT9047t//jLHGBSwuXp9N2uIhJKCMuNC6025+d79vieIpRwstA/oUxM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=yRYct/UF; arc=none smtp.client-ip=115.124.30.119 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="yRYct/UF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693673; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=eA6kyRQ9XEpy05GdpwvXfCDvN4imeajWIAzdeARsgKo=; b=yRYct/UFaYNmmeZK5qcyA8QOZ3CAsn+smy8uqjBqePugsHlZ44p8OjReZj4Xu5pUhK+XvsQ9P7e4q6fPIbCx5rSgGPfU09vJUsnvUnFKZLhPFiI6aTOcuv7WAFYnecuylPRwEOqP8w9uuiH0IHfX+QYLZI9M0w3GKBDi/+BSEMg= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R531e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045046011;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jH7ld_1718693670; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jH7ld_1718693670) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:30 +0800 From: 
Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 3/9] mm: shmem: extend shmem_partial_swap_usage() to support large folio swap Date: Tue, 18 Jun 2024 14:54:15 +0800 Message-Id: <61c49050a8aeffecade8c2c407280bf6b92f359e.1718690645.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To support shmem large folio swapout in the following patches, use xa_get_order() to get the order of each swap entry when calculating the swap usage of shmem. 
Signed-off-by: Baolin Wang --- mm/shmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index b90965486631..012a06ef39aa 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -870,7 +870,7 @@ unsigned long shmem_partial_swap_usage(struct address_s= pace *mapping, if (xas_retry(&xas, page)) continue; if (xa_is_value(page)) - swapped++; + swapped +=3D 1 << xa_get_order(xas.xa, xas.xa_index); if (xas.xa_index =3D=3D max) break; if (need_resched()) { --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF52F73537 for ; Tue, 18 Jun 2024 06:54:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.132 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693678; cv=none; b=busqpHe1DBgEpihVpD4xF0ZqlW33eP7+JRK/6Yy/Wl76NKbSpimbaX2d7MRgOAXO6J86wZXVyfI9bPRkQ+BklZRJHB093cvjYnUAb+/wNeF6kUHWVJrEoeKxvfMSNPVn+6WyLiAxlZJfxfOK1ngh5iFT1eLvKYOz1hPmLJdkr7E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693678; c=relaxed/simple; bh=y/bLAyJVcfx/xNGxU0XeAS0dLKu3XWOVnCuwHcKPo3M=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DFMgGAbLgyoCLGbgaL0TyoeQM6qv9vtxGayu/XUrttiK9WODe6+c24ileCYS/qFmRw3SZ2QSly0ZP7te9YtM8WRknn5IqiG7+zLJ3WcyDHoAdHwjqEI2+5rZh4bl6+vr6GrMpASKKxEDpCtuFhybJY13hamOr1EwBybaW6GK45s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=xnbiavfM; arc=none smtp.client-ip=115.124.30.132 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="xnbiavfM" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693673; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=oU/PcFIl06vSDqEtjZujhmAzGqRrrYre+wfF0/f2UwU=; b=xnbiavfMUiKo/s15gGly/qLRpFpIiM4Dhzae0MlQh12ZoF06eMrjOyODaVyxZGoOMO+mAHpAzHXXre3gJQEdNhx0R3KgcomLgoVm2hpcdBZu+MLregMlZhThhHG8hLohmUyA/8RgvCggVGpXyb0hESFV2hfeLQ3hZxFtKu53ntE= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R861e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045075189;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jFFL._1718693671; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jFFL._1718693671) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:31 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 4/9] mm: shmem: return number of pages being freed in shmem_free_swap Date: Tue, 18 Jun 2024 14:54:16 +0800 Message-Id: <6e567bfbb6bac0bfbdd63e60718bd1aaf8eade6c.1718690645.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Daniel Gomez Both shmem_free_swap callers expect the number of pages 
being freed. With large folios, this needs to support values other than 0 (used as 1 page being freed) and -ENOENT (used as 0 pages being freed). In preparation for large folio adoption, make the shmem_free_swap() routine return the number of pages being freed, so that returning 0 now means 0 pages were freed. While we are at it, switch to free_swap_and_cache_nr() to free large-order swap entries (this part is by Baolin Wang). Suggested-by: Matthew Wilcox Signed-off-by: Daniel Gomez Signed-off-by: Baolin Wang --- mm/shmem.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 012a06ef39aa..a73d2da54897 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -836,18 +836,22 @@ static void shmem_delete_from_page_cache(struct folio= *folio, void *radswap) } =20 /* - * Remove swap entry from page cache, free the swap and its page cache. + * Remove swap entry from page cache, free the swap and its page cache. Re= turns + * the number of pages being freed. 0 means entry not found in XArray (0 p= ages + * being freed). 
*/ -static int shmem_free_swap(struct address_space *mapping, - pgoff_t index, void *radswap) +static long shmem_free_swap(struct address_space *mapping, + pgoff_t index, void *radswap) { + int order =3D xa_get_order(&mapping->i_pages, index); void *old; =20 old =3D xa_cmpxchg_irq(&mapping->i_pages, index, radswap, NULL, 0); if (old !=3D radswap) - return -ENOENT; - free_swap_and_cache(radix_to_swp_entry(radswap)); - return 0; + return 0; + free_swap_and_cache_nr(radix_to_swp_entry(radswap), 1 << order); + + return 1 << order; } =20 /* @@ -999,7 +1003,7 @@ static void shmem_undo_range(struct inode *inode, loff= _t lstart, loff_t lend, if (xa_is_value(folio)) { if (unfalloc) continue; - nr_swaps_freed +=3D !shmem_free_swap(mapping, + nr_swaps_freed +=3D shmem_free_swap(mapping, indices[i], folio); continue; } @@ -1066,14 +1070,17 @@ static void shmem_undo_range(struct inode *inode, l= off_t lstart, loff_t lend, folio =3D fbatch.folios[i]; =20 if (xa_is_value(folio)) { + long swaps_freed; + if (unfalloc) continue; - if (shmem_free_swap(mapping, indices[i], folio)) { + swaps_freed =3D shmem_free_swap(mapping, indices[i], folio); + if (!swaps_freed) { /* Swap was replaced by page: retry */ index =3D indices[i]; break; } - nr_swaps_freed++; + nr_swaps_freed +=3D swaps_freed; continue; } =20 --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-100.freemail.mail.aliyun.com (out30-100.freemail.mail.aliyun.com [115.124.30.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 895DE74BE1 for ; Tue, 18 Jun 2024 06:54:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.100 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693678; cv=none; 
b=ACZWt6P+zHX/AAHGcXfEz04kYPSyJTRh0QaPQpdl1C3mrwhrDq4nv+zF/RAj7Yhi67abwzhx7iHVOgKZGM8PEOuaOqzoMYMEqCWWU4Aih6HoB4k8h74qly5XxgWTR+t9OX2JDp0RHUUwUScbrKjTJv11iU670D/r2Dfa7iw3e+U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693678; c=relaxed/simple; bh=0j0o/JJqbkPuCd3w9wnczyFRRX9ombob99obZa5CSFk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=S+ehJBFyg7zHUCqjHqZtMm55ZkQdJWTOmbveKeDqtJRJc0LOxcy7Vnlpfl1oepJJal/Lz4wnlz1AqrWHy3bUD+uGBhxUApGdWVYlz+SjsF+Pdq2NKZqjfYu0kvO3MaBzv9wZ1s26m8MhTF5EoSKnNJWjrX9UmUZmZwwycpww3ho= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=X12BZvlz; arc=none smtp.client-ip=115.124.30.100 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="X12BZvlz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693674; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=OhkcTOgQo6MdSZjxEddX9KsK/pU+Rag0OqwoL0KeI6s=; b=X12BZvlzYEHEGso0VQ/0+4DyjCHOUQDR4VUOMb7d3FLyjaGdz8II8RV33NR1meY0QMi3smKKLbnxU/mVKFM6R3oJjHN7N448TfTieXUwHABLq5tNnqqGuRwoRNeoeV3DLkwpB22rOldX+iVkxJbVOOO32E7JM3d4JhduUmgfw4k= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R881e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045046011;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jFFLh_1718693672; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jFFLh_1718693672) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:32 +0800 From: 
Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 5/9] mm: filemap: use xa_get_order() to get the swap entry order Date: Tue, 18 Jun 2024 14:54:17 +0800 Message-Id: X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In the following patches, shmem will support swapping out large folios, which means shmem mappings may contain large-order swap entries, so use xa_get_order() to get the order of the shmem swap entry and update '*start' correctly. 
Signed-off-by: Baolin Wang --- mm/filemap.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 876cc64aadd7..ecc785758aaf 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2056,6 +2056,8 @@ unsigned find_get_entries(struct address_space *mappi= ng, pgoff_t *start, folio =3D fbatch->folios[idx]; if (!xa_is_value(folio)) nr =3D folio_nr_pages(folio); + else + nr =3D 1 << xa_get_order(&mapping->i_pages, indices[idx]); *start =3D indices[idx] + nr; } return folio_batch_count(fbatch); @@ -2120,6 +2122,8 @@ unsigned find_lock_entries(struct address_space *mapp= ing, pgoff_t *start, folio =3D fbatch->folios[idx]; if (!xa_is_value(folio)) nr =3D folio_nr_pages(folio); + else + nr =3D 1 << xa_get_order(&mapping->i_pages, indices[idx]); *start =3D indices[idx] + nr; } return folio_batch_count(fbatch); --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-97.freemail.mail.aliyun.com (out30-97.freemail.mail.aliyun.com [115.124.30.97]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3B5A13A276 for ; Tue, 18 Jun 2024 06:54:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.97 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693684; cv=none; b=OVBGuXJlWB/xa5z/P8Xfa+4zau7LRbYLwnGNmxL1rFkKU3gf4B/35MWm2/ubndBSIXTS+P9MGTkXlQgAIeNTpSXhLCwFDCdxTBLvu8Ef7vjzy+siA/ZCpOXOvWJtAaWW/mqoGJo8pGNn8MNgP/vn8KxtkCNbNnBQjqpbsg6usBI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693684; c=relaxed/simple; bh=oAeBF5uoFhvlx+e/P4CdH7gVwx6K9SJHEsXMg/F3Cy8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=m8qUuV4G7sYgSSc9MqsVnkm1BUs4HxD1PNby9l0oOioyYKbHnp8BN8HJKc/WypXfOuXTZUmaMWA1Jk2T6lFxeKal4Wx4ENKvl6TNhL1ZL2BUTaCVpDDDf4987OZ2A8JDJDoYuiP1blftBkS8S48Ts3ipXd15GII0hsQNFJj6pt4= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=Zec3z7y8; arc=none smtp.client-ip=115.124.30.97 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="Zec3z7y8" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693675; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=Y0p1KmYvp9sydPHpYYHL11MSmx/npKoL7TRhS1PPQNA=; b=Zec3z7y8JJvzLm7WW1yS+ga26w9+4x6ND726snHCSXHo2uY+geDg1sWmzlXTUCqXIogRjryfOi7zTMh9Za8xhQtMqD4KHVrWmnj/cnhbgD9xMS9c10Nyqrv4+8grFwigY7HifLlC+hkf7tB8UuLrFGwcEDW7oOFr4IBLtcIW8Hg= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R981e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037067111;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jKUYu_1718693673; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jKUYu_1718693673) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:33 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 6/9] mm: shmem: use swap_free_nr() to free shmem swap entries Date: Tue, 18 Jun 2024 14:54:18 +0800 Message-Id: <373a0d959930494b83f68169eddb62d0d49a29cf.1718690645.git.baolin.wang@linux.alibaba.com> 
X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation for supporting shmem large folio swapout, use swap_free_nr() to free the contiguous swap entries of a shmem large folio when the large folio is swapped in from the swap cache. In addition, the index should be rounded down to a multiple of the folio's number of pages when adding the swapped-in folio to the page cache. Signed-off-by: Baolin Wang --- mm/shmem.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index a73d2da54897..4d7996962388 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1950,6 +1950,7 @@ static void shmem_set_folio_swapin_error(struct inode= *inode, pgoff_t index, struct address_space *mapping =3D inode->i_mapping; swp_entry_t swapin_error; void *old; + int nr_pages; =20 swapin_error =3D make_poisoned_swp_entry(); old =3D xa_cmpxchg_irq(&mapping->i_pages, index, @@ -1958,6 +1959,7 @@ static void shmem_set_folio_swapin_error(struct inode= *inode, pgoff_t index, if (old !=3D swp_to_radix_entry(swap)) return; =20 + nr_pages =3D folio_nr_pages(folio); folio_wait_writeback(folio); delete_from_swap_cache(folio); /* @@ -1965,8 +1967,8 @@ static void shmem_set_folio_swapin_error(struct inode= *inode, pgoff_t index, * won't be 0 when inode is released and thus trigger WARN_ON(i_blocks) * in shmem_evict_inode(). 
*/ - shmem_recalc_inode(inode, -1, -1); - swap_free(swap); + shmem_recalc_inode(inode, -nr_pages, -nr_pages); + swap_free_nr(swap, nr_pages); } =20 /* @@ -1985,7 +1987,7 @@ static int shmem_swapin_folio(struct inode *inode, pg= off_t index, struct swap_info_struct *si; struct folio *folio =3D NULL; swp_entry_t swap; - int error; + int error, nr_pages; =20 VM_BUG_ON(!*foliop || !xa_is_value(*foliop)); swap =3D radix_to_swp_entry(*foliop); @@ -2032,6 +2034,7 @@ static int shmem_swapin_folio(struct inode *inode, pg= off_t index, goto failed; } folio_wait_writeback(folio); + nr_pages =3D folio_nr_pages(folio); =20 /* * Some architectures may have to restore extra metadata to the @@ -2045,19 +2048,20 @@ static int shmem_swapin_folio(struct inode *inode, = pgoff_t index, goto failed; } =20 - error =3D shmem_add_to_page_cache(folio, mapping, index, + error =3D shmem_add_to_page_cache(folio, mapping, + round_down(index, nr_pages), swp_to_radix_entry(swap), gfp); if (error) goto failed; =20 - shmem_recalc_inode(inode, 0, -1); + shmem_recalc_inode(inode, 0, -nr_pages); =20 if (sgp =3D=3D SGP_WRITE) folio_mark_accessed(folio); =20 delete_from_swap_cache(folio); folio_mark_dirty(folio); - swap_free(swap); + swap_free_nr(swap, nr_pages); put_swap_device(si); =20 *foliop =3D folio; --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4156778676 for ; Tue, 18 Jun 2024 06:54:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693680; cv=none; 
b=bPnSeHvZ1Df0Wtbwej/H+Jp7rsa2QyTqRBYo2kljbw2D8pIGdXT5zgZ91ZheBvk9WXjfLTfosk4TciJOpERyaf71FTrzek+pBZG+E7/FZf6LEs0laIYSw2b3LsXgniwx1l9dF1POXQpFa0YIayaITiZn3BQFOqGeIw+LNPFBR9w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693680; c=relaxed/simple; bh=b8LuyynHN3NsClPyf8jZS4sH1SoZUwxuViazv8I7w1Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=A3KS+eobtbhuw7HGZ0J/nDzoXyYpP9ysDekn2k20TxvUvZD/e5HHftZN6kZ6ub27/+nPgKlFyr5s0Zy99N2DVDFwhKEW0xZt/jxYUTD4gHfWCt1hJ8py+tdD9QxXLYFb9Sugxs0WWarCblMv0M8LDg2w64WVTebop9QtcJJuank= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=a2KWH0XI; arc=none smtp.client-ip=115.124.30.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="a2KWH0XI" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693676; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=koLTHCsWXBAa3N27f7lrGDpaxVfvYWdW4ehMIBy6BWA=; b=a2KWH0XIyyGXX0IJR7KJRMXwHPjSTQyPFdqJvj180MWe561lG62uXZ7FRST7HhVxf44mnaYwJineiUzGU/oUGkK1deKfhunH4GrlMm2vZowgDqG2RkMYQ7m/v/ijRGm3N9RfjISejOvYrtB0RTHaFrHjDe5B2VrFW2NTYPTc4Y8= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045046011;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jKUZG_1718693674; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jKUZG_1718693674) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:34 +0800 From: 
Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 7/9] mm: shmem: support large folio allocation for shmem_replace_folio() Date: Tue, 18 Jun 2024 14:54:19 +0800 Message-Id: <75d1c1f2fbc88dce476a53fa90f085b3bebb83c5.1718690645.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To support large folio swapin for shmem in the following patches, add large folio allocation for the new replacement folio in shmem_replace_folio(). Moreover, a large folio occupies N consecutive entries in the swap cache instead of a multi-index entry as in the page cache, so replace each of the consecutive entries in the swap cache instead of using shmem_replace_entry(). Also update the statistics and the folio reference count using the number of pages in the folio. 
Signed-off-by: Baolin Wang --- mm/shmem.c | 53 ++++++++++++++++++++++++++++++----------------------- 1 file changed, 30 insertions(+), 23 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 4d7996962388..c0a9253f3a99 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1878,28 +1878,24 @@ static bool shmem_should_replace_folio(struct folio= *folio, gfp_t gfp) static int shmem_replace_folio(struct folio **foliop, gfp_t gfp, struct shmem_inode_info *info, pgoff_t index) { - struct folio *old, *new; - struct address_space *swap_mapping; - swp_entry_t entry; - pgoff_t swap_index; - int error; - - old =3D *foliop; - entry =3D old->swap; - swap_index =3D swap_cache_index(entry); - swap_mapping =3D swap_address_space(entry); + struct folio *new, *old =3D *foliop; + swp_entry_t entry =3D old->swap; + struct address_space *swap_mapping =3D swap_address_space(entry); + pgoff_t swap_index =3D swap_cache_index(entry); + XA_STATE(xas, &swap_mapping->i_pages, swap_index); + int nr_pages =3D folio_nr_pages(old); + int error =3D 0, i; =20 /* * We have arrived here because our zones are constrained, so don't * limit chance of success by further cpuset and node constraints. */ gfp &=3D ~GFP_CONSTRAINT_MASK; - VM_BUG_ON_FOLIO(folio_test_large(old), old); - new =3D shmem_alloc_folio(gfp, 0, info, index); + new =3D shmem_alloc_folio(gfp, folio_order(old), info, index); if (!new) return -ENOMEM; =20 - folio_get(new); + folio_ref_add(new, nr_pages); folio_copy(new, old); flush_dcache_folio(new); =20 @@ -1909,18 +1905,24 @@ static int shmem_replace_folio(struct folio **folio= p, gfp_t gfp, new->swap =3D entry; folio_set_swapcache(new); =20 - /* - * Our caller will very soon move newpage out of swapcache, but it's - * a nice clean interface for us to replace oldpage by newpage there. 
- */ + /* Swap cache still stores N entries instead of a high-order entry */ xa_lock_irq(&swap_mapping->i_pages); - error =3D shmem_replace_entry(swap_mapping, swap_index, old, new); + for (i =3D 0; i < nr_pages; i++) { + void *item =3D xas_store(&xas, new); + + if (item !=3D old) { + error =3D -ENOENT; + break; + } + + xas_next(&xas); + } if (!error) { mem_cgroup_replace_folio(old, new); - __lruvec_stat_mod_folio(new, NR_FILE_PAGES, 1); - __lruvec_stat_mod_folio(new, NR_SHMEM, 1); - __lruvec_stat_mod_folio(old, NR_FILE_PAGES, -1); - __lruvec_stat_mod_folio(old, NR_SHMEM, -1); + __lruvec_stat_mod_folio(new, NR_FILE_PAGES, nr_pages); + __lruvec_stat_mod_folio(new, NR_SHMEM, nr_pages); + __lruvec_stat_mod_folio(old, NR_FILE_PAGES, -nr_pages); + __lruvec_stat_mod_folio(old, NR_SHMEM, -nr_pages); } xa_unlock_irq(&swap_mapping->i_pages); =20 @@ -1940,7 +1942,12 @@ static int shmem_replace_folio(struct folio **foliop= , gfp_t gfp, old->private =3D NULL; =20 folio_unlock(old); - folio_put_refs(old, 2); + /* + * The old folio are removed from swap cache, drop the 'nr_pages' + * reference, as well as one temporary reference getting from swap + * cache. 
+ */ + folio_put_refs(old, nr_pages + 1); return error; } =20 --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-110.freemail.mail.aliyun.com (out30-110.freemail.mail.aliyun.com [115.124.30.110]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A7C3811FE for ; Tue, 18 Jun 2024 06:54:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.110 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693681; cv=none; b=B/ECWyu92vQFqmmkkdDd2CyM7XinXrG+t/Dofl6YwJHwvGy5oy5XG8BmL82iQSwphlmQTTXWspofgWPS/XaTCWYpmkOyDPoSNsmMg8xq5igtQKYxDJErwYrBScgfw1R4gyDwpSHauGfpy3aeL3pgjpSL8SCPyClQcnt4CUzb7x0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693681; c=relaxed/simple; bh=oVnV3o1DiHJTJv/4w0ienh3h1L9X2UkzNvbAiS7nYa4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CZ/FFDfoCEEgJEro7KZyyAlj4arr8nTMlwvTyBV2xELfY2/39jbOS1Sfm3yINN60LRxZwHMMSz8nOuITm+XUIvVNyKUYk9APk3E2/QL7tJqq3e7IzAQwq7xJaqfHdNcMq8YvsKtbiAJiERqFKrSuKmKuOsugW52E7K5DYjg1SAU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=CgkwSa/o; arc=none smtp.client-ip=115.124.30.110 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="CgkwSa/o" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693677; h=From:To:Subject:Date:Message-Id:MIME-Version; 
bh=BtIWGiPeiPtgyrkhSZFohxCr8LroRfij9iZfhgtdvvw=; b=CgkwSa/ooabLtWd/cLwAJuu20RX1FEuE9lSzNYgBr2taB5xDtS28QXDFYEn8NeopdM2Z+Xr50q1ABEZcIQuX5K+cpA2Q9OSpaUI2//4vHdzfZw2gLJ7BEBIUMVIl8MHK6cxqdNM3TFf6mZvJNQ7Av0Rexzf3vQXkoBeLfsrVE+U= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033068173054;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jFFMY_1718693675; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jFFMY_1718693675) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:35 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 8/9] mm: shmem: drop folio reference count using 'nr_pages' in shmem_delete_from_page_cache() Date: Tue, 18 Jun 2024 14:54:20 +0800 Message-Id: X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To support large folio swapin/swapout for shmem in the following patches, drop the folio's reference count by the number of pages contained in the folio when a shmem folio is deleted from shmem pagecache after adding into swap cache. 
Signed-off-by: Baolin Wang --- mm/shmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index c0a9253f3a99..9a35ee7e7f40 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -831,7 +831,7 @@ static void shmem_delete_from_page_cache(struct folio *= folio, void *radswap) __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr); __lruvec_stat_mod_folio(folio, NR_SHMEM, -nr); xa_unlock_irq(&mapping->i_pages); - folio_put(folio); + folio_put_refs(folio, nr); BUG_ON(error); } =20 --=20 2.39.3 From nobody Tue Dec 16 19:53:34 2025 Received: from out30-100.freemail.mail.aliyun.com (out30-100.freemail.mail.aliyun.com [115.124.30.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7BA2E131E33 for ; Tue, 18 Jun 2024 06:54:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.100 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693682; cv=none; b=sNRHaPaQHBVLw0Sti+EnWigZFdiKxDhP9yLNmo8qnNIT54NRhysfgCWqHl5GZi7JnulZgrHNCDVq8eOAuq889w8cejd1PlyRgR5NNymA2osZznBb4oO69+ukouS7fXQYAQpQR2qjk8jlSJV4888qhqadLQcZP1L3xs0PxH0DQYc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718693682; c=relaxed/simple; bh=Mh9+JA8BouLeObhnXDllLi20vYXVUfEOul24tJO2drk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=SBirudBZPSAmNspf0rcfx/kDwx+6D8AYt6EIGA4mot1lP6PNG+aV0DgfuD5IXsHazvj5MFB1O4PMKnMQ8BTho0oAM9Q9kpPH63darNe+1gdnJhN7lhB/9Sd/ronrBHakNQwu90fXj7nCfYIM/ead5UG2lHECUiUpUZza155wrkw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=j/tpHpF6; arc=none smtp.client-ip=115.124.30.100 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="j/tpHpF6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1718693678; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=X2IkWrVoIA4HwkKR/y14knPLhy9dqitMrcGqnXrDEBs=; b=j/tpHpF65loG60eh+FQ8RHML/IKZYpV/JEf/Hx58eIP8cx3s14PxNksyLAxXDZHV66RxDHtvU64yjJqf5rVxSRL49c4gbYI5Ah9wNGB2P61OY4jEe9fJYbLSMDHxmydBJBSiG9krryj/mWAZYprvRPwyMMEKgLvU8y0aYTraplc= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R891e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045046011;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=17;SR=0;TI=SMTPD_---0W8jH7nR_1718693676; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0W8jH7nR_1718693676) by smtp.aliyun-inc.com; Tue, 18 Jun 2024 14:54:36 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, chrisl@kernel.org, ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com, da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 9/9] mm: shmem: support large folio swap out Date: Tue, 18 Jun 2024 14:54:21 +0800 Message-Id: X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Shmem will support large folio allocation [1] [2] to get a better performan= ce, however, the memory reclaim still splits the precious large folios when try= 
ing to swap out shmem, which may lead to memory fragmentation and means shmem cannot take advantage of large folios. Moreover, the swap code already supports swapping out large folios without splitting, hence this patch set supports large folio swap out for shmem. Note that the i915_gem_shmem driver still needs large folios to be split when swapping, thus add a new flag 'split_large_folio' to writeback_control to indicate that the large folio should be split. [1] https://lore.kernel.org/all/cover.1717495894.git.baolin.wang@linux.alibaba.com/ [2] https://lore.kernel.org/all/20240515055719.32577-1-da.gomez@samsung.com/ Signed-off-by: Baolin Wang --- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 1 + include/linux/writeback.h | 1 + mm/shmem.c | 3 +-- mm/vmscan.c | 14 ++++++++++++-- 4 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i9= 15/gem/i915_gem_shmem.c index c5e1c718a6d2..c66cb9c585e1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -308,6 +308,7 @@ void __shmem_writeback(size_t size, struct address_spac= e *mapping) .range_start =3D 0, .range_end =3D LLONG_MAX, .for_reclaim =3D 1, + .split_large_folio =3D 1, }; unsigned long i; =20 diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 112d806ddbe4..6f2599244ae0 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -63,6 +63,7 @@ struct writeback_control { unsigned range_cyclic:1; /* range_start is cyclic */ unsigned for_sync:1; /* sync(2) WB_SYNC_ALL writeback */ unsigned unpinned_netfs_wb:1; /* Cleared I_PINNING_NETFS_WB */ + unsigned split_large_folio:1; /* Split large folio for shmem writeback */ =20 /* * When writeback IOs are bounced through async layers, only the diff --git a/mm/shmem.c b/mm/shmem.c index 9a35ee7e7f40..2c951d936fc1 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -775,7 +775,6 @@ static int shmem_add_to_page_cache(struct folio *folio, 
VM_BUG_ON_FOLIO(index !=3D round_down(index, nr), folio); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(!folio_test_swapbacked(folio), folio); - VM_BUG_ON(expected && folio_test_large(folio)); =20 folio_ref_add(folio, nr); folio->mapping =3D mapping; @@ -1462,7 +1461,7 @@ static int shmem_writepage(struct page *page, struct = writeback_control *wbc) * "force", drivers/gpu/drm/i915/gem/i915_gem_shmem.c gets huge pages, * and its shmem_writeback() needs them to be split when swapping. */ - if (folio_test_large(folio)) { + if (wbc->split_large_folio && folio_test_large(folio)) { /* Ensure the subpages are still dirty */ folio_test_set_dirty(folio); if (split_huge_page(page) < 0) diff --git a/mm/vmscan.c b/mm/vmscan.c index 61465f92283f..fd503506262d 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1245,8 +1245,12 @@ static unsigned int shrink_folio_list(struct list_he= ad *folio_list, if (!total_swap_pages) goto activate_locked; =20 - /* Split shmem folio */ - if (split_folio_to_list(folio, folio_list)) + /* + * Only split shmem folio when CONFIG_THP_SWAP + * is not enabled. + */ + if (!IS_ENABLED(CONFIG_THP_SWAP) && + split_folio_to_list(folio, folio_list)) goto keep_locked; } =20 @@ -1348,10 +1352,16 @@ static unsigned int shrink_folio_list(struct list_h= ead *folio_list, * starts and then write it out here. */ try_to_unmap_flush_dirty(); +try_pageout: switch (pageout(folio, mapping, &plug)) { case PAGE_KEEP: goto keep_locked; case PAGE_ACTIVATE: + if (shmem_mapping(mapping) && folio_test_large(folio) && + !split_folio_to_list(folio, folio_list)) { + nr_pages =3D 1; + goto try_pageout; + } goto activate_locked; case PAGE_SUCCESS: stat->nr_pageout +=3D nr_pages; --=20 2.39.3