From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Matthew Wilcox, Hugh Dickins, Chris Li, Barry Song,
    Baoquan He, Nhat Pham, Kemeng Shi, Baolin Wang, Ying Huang,
    Johannes Weiner, David Hildenbrand, Yosry Ahmed, Lorenzo Stoakes,
    Zi Yan, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v2 02/15] mm, swap: use unified helper for swap cache look up
Date: Sat, 6 Sep 2025 03:13:44 +0800
Message-ID: <20250905191357.78298-3-ryncsn@gmail.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20250905191357.78298-1-ryncsn@gmail.com>
References: <20250905191357.78298-1-ryncsn@gmail.com>
Reply-To: Kairui Song

From: Kairui Song

The swap cache lookup helper swap_cache_get_folio currently updates
readahead statistics as well, so callers that are not doing swapin
from any VMA or mapping are forced to use the filemap helpers instead
and have to access the swap cache space directly.

So decouple the readahead update from the swap cache lookup: move the
readahead update into a standalone helper and let callers invoke it
only when they actually do readahead. Then convert all swap cache
lookups to use swap_cache_get_folio.

After this commit, only three special cases still access the swap
cache space directly: huge memory splitting, migration, and shmem
replacing, because they need to lock the XArray. The following
commits will wrap their accesses to the swap cache with special
helpers too.

Worth noting, dropbehind is currently not supported for anon folios,
so we will never see a dropbehind folio in the swap cache. The
unified helper can be updated later to handle that.

While at it, add proper kerneldoc for the touched helpers.

No functional change.
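As an illustrative sketch of the new calling convention (assuming a
swapin path that already holds a reference on the swap device, with
"entry", "vma" and "address" taken from the fault context as in the
do_swap_page() hunk below):

	struct folio *folio;

	/* Plain lookup: readahead statistics are no longer touched here. */
	folio = swap_cache_get_folio(entry);
	if (folio) {
		/*
		 * Only swapin paths that do readahead call the update
		 * helper; other users (mincore, swapoff, userfaultfd
		 * move, ...) simply skip this step.
		 */
		swap_update_readahead(folio, vma, address);
	}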
Signed-off-by: Kairui Song
Acked-by: Chris Li
Acked-by: Nhat Pham
Reviewed-by: Baolin Wang
Reviewed-by: Barry Song
Acked-by: David Hildenbrand
---
 mm/memory.c      |   6 ++-
 mm/mincore.c     |   3 +-
 mm/shmem.c       |   4 +-
 mm/swap.h        |  13 ++++--
 mm/swap_state.c  | 109 +++++++++++++++++++++++++----------------------
 mm/swapfile.c    |  11 +++--
 mm/userfaultfd.c |   5 +--
 7 files changed, 81 insertions(+), 70 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d9de6c056179..10ef528a5f44 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4660,9 +4660,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (unlikely(!si))
 		goto out;
 
-	folio = swap_cache_get_folio(entry, vma, vmf->address);
-	if (folio)
+	folio = swap_cache_get_folio(entry);
+	if (folio) {
+		swap_update_readahead(folio, vma, vmf->address);
 		page = folio_file_page(folio, swp_offset(entry));
+	}
 	swapcache = folio;
 
 	if (!folio) {
diff --git a/mm/mincore.c b/mm/mincore.c
index 2f3e1816a30d..8ec4719370e1 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -76,8 +76,7 @@ static unsigned char mincore_swap(swp_entry_t entry, bool shmem)
 		if (!si)
 			return 0;
 	}
-	folio = filemap_get_entry(swap_address_space(entry),
-				  swap_cache_index(entry));
+	folio = swap_cache_get_folio(entry);
 	if (shmem)
 		put_swap_device(si);
 	/* The swap cache space contains either folio, shadow or NULL */
diff --git a/mm/shmem.c b/mm/shmem.c
index 2df26f4d6e60..4e27e8e5da3b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2354,7 +2354,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	}
 
 	/* Look it up and read it in.. */
-	folio = swap_cache_get_folio(swap, NULL, 0);
+	folio = swap_cache_get_folio(swap);
 	if (!folio) {
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			/* Direct swapin skipping swap cache & readahead */
@@ -2379,6 +2379,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			count_vm_event(PGMAJFAULT);
 			count_memcg_event_mm(fault_mm, PGMAJFAULT);
 		}
+	} else {
+		swap_update_readahead(folio, NULL, 0);
 	}
 
 	if (order > folio_order(folio)) {
diff --git a/mm/swap.h b/mm/swap.h
index 1ae44d4193b1..efb6d7ff9f30 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -62,8 +62,7 @@ void delete_from_swap_cache(struct folio *folio);
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
 				  unsigned long end);
 void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr);
-struct folio *swap_cache_get_folio(swp_entry_t entry,
-		struct vm_area_struct *vma, unsigned long addr);
+struct folio *swap_cache_get_folio(swp_entry_t entry);
 struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		struct vm_area_struct *vma, unsigned long addr,
 		struct swap_iocb **plug);
@@ -74,6 +73,8 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
 		struct mempolicy *mpol, pgoff_t ilx);
 struct folio *swapin_readahead(swp_entry_t entry, gfp_t flag,
 			       struct vm_fault *vmf);
+void swap_update_readahead(struct folio *folio, struct vm_area_struct *vma,
+			   unsigned long addr);
 
 static inline unsigned int folio_swap_flags(struct folio *folio)
 {
@@ -159,6 +160,11 @@ static inline struct folio *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
 	return NULL;
 }
 
+static inline void swap_update_readahead(struct folio *folio,
+		struct vm_area_struct *vma, unsigned long addr)
+{
+}
+
 static inline int swap_writeout(struct folio *folio,
 		struct swap_iocb **swap_plug)
 {
@@ -169,8 +175,7 @@ static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entr
 {
 }
 
-static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
-		struct vm_area_struct *vma, unsigned long addr)
+static inline struct folio *swap_cache_get_folio(swp_entry_t entry)
 {
 	return NULL;
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 99513b74b5d8..68ec531d0f2b 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -69,6 +69,27 @@ void show_swap_cache_info(void)
 	printk("Total swap = %lukB\n", K(total_swap_pages));
 }
 
+/**
+ * swap_cache_get_folio - Looks up a folio in the swap cache.
+ * @entry: swap entry used for the lookup.
+ *
+ * A found folio will be returned unlocked and with its refcount increased.
+ *
+ * Context: Caller must ensure @entry is valid and protect the swap device
+ * with reference count or locks.
+ * Return: Returns the found folio on success, NULL otherwise. The caller
+ * must lock and check if the folio still matches the swap entry before
+ * use.
+ */
+struct folio *swap_cache_get_folio(swp_entry_t entry)
+{
+	struct folio *folio = filemap_get_folio(swap_address_space(entry),
+						swap_cache_index(entry));
+	if (IS_ERR(folio))
+		return NULL;
+	return folio;
+}
+
 void *get_shadow_from_swap_cache(swp_entry_t entry)
 {
 	struct address_space *address_space = swap_address_space(entry);
@@ -272,55 +293,43 @@ static inline bool swap_use_vma_readahead(void)
 	return READ_ONCE(enable_vma_readahead) && !atomic_read(&nr_rotate_swap);
 }
 
-/*
- * Lookup a swap entry in the swap cache. A found folio will be returned
- * unlocked and with its refcount incremented - we rely on the kernel
- * lock getting page table operations atomic even if we drop the folio
- * lock before returning.
- *
- * Caller must lock the swap device or hold a reference to keep it valid.
+/**
+ * swap_update_readahead - Update the readahead statistics of VMA or globally.
+ * @folio: the swap cache folio that just got hit.
+ * @vma: the VMA that should be updated, could be NULL for global update.
+ * @addr: the addr that triggered the swapin, ignored if @vma is NULL.
  */
-struct folio *swap_cache_get_folio(swp_entry_t entry,
-		struct vm_area_struct *vma, unsigned long addr)
+void swap_update_readahead(struct folio *folio, struct vm_area_struct *vma,
+			   unsigned long addr)
 {
-	struct folio *folio;
-
-	folio = filemap_get_folio(swap_address_space(entry), swap_cache_index(entry));
-	if (!IS_ERR(folio)) {
-		bool vma_ra = swap_use_vma_readahead();
-		bool readahead;
+	bool readahead, vma_ra = swap_use_vma_readahead();
 
-		/*
-		 * At the moment, we don't support PG_readahead for anon THP
-		 * so let's bail out rather than confusing the readahead stat.
-		 */
-		if (unlikely(folio_test_large(folio)))
-			return folio;
-
-		readahead = folio_test_clear_readahead(folio);
-		if (vma && vma_ra) {
-			unsigned long ra_val;
-			int win, hits;
-
-			ra_val = GET_SWAP_RA_VAL(vma);
-			win = SWAP_RA_WIN(ra_val);
-			hits = SWAP_RA_HITS(ra_val);
-			if (readahead)
-				hits = min_t(int, hits + 1, SWAP_RA_HITS_MAX);
-			atomic_long_set(&vma->swap_readahead_info,
-					SWAP_RA_VAL(addr, win, hits));
-		}
-
-		if (readahead) {
-			count_vm_event(SWAP_RA_HIT);
-			if (!vma || !vma_ra)
-				atomic_inc(&swapin_readahead_hits);
-		}
-	} else {
-		folio = NULL;
+	/*
+	 * At the moment, we don't support PG_readahead for anon THP
+	 * so let's bail out rather than confusing the readahead stat.
+	 */
+	if (unlikely(folio_test_large(folio)))
+		return;
+
+	readahead = folio_test_clear_readahead(folio);
+	if (vma && vma_ra) {
+		unsigned long ra_val;
+		int win, hits;
+
+		ra_val = GET_SWAP_RA_VAL(vma);
+		win = SWAP_RA_WIN(ra_val);
+		hits = SWAP_RA_HITS(ra_val);
+		if (readahead)
+			hits = min_t(int, hits + 1, SWAP_RA_HITS_MAX);
+		atomic_long_set(&vma->swap_readahead_info,
+				SWAP_RA_VAL(addr, win, hits));
 	}
 
-	return folio;
+	if (readahead) {
+		count_vm_event(SWAP_RA_HIT);
+		if (!vma || !vma_ra)
+			atomic_inc(&swapin_readahead_hits);
+	}
 }
 
 struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
@@ -336,14 +345,10 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	*new_page_allocated = false;
 	for (;;) {
 		int err;
-		/*
-		 * First check the swap cache. Since this is normally
-		 * called after swap_cache_get_folio() failed, re-calling
-		 * that would confuse statistics.
-		 */
-		folio = filemap_get_folio(swap_address_space(entry),
-					  swap_cache_index(entry));
-		if (!IS_ERR(folio))
+
+		/* Check the swap cache in case the folio is already there */
+		folio = swap_cache_get_folio(entry);
+		if (folio)
 			goto got_folio;
 
 		/*
diff --git a/mm/swapfile.c b/mm/swapfile.c
index a7ffabbe65ef..4b8ab2cb49ca 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -213,15 +213,14 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
 				 unsigned long offset, unsigned long flags)
 {
 	swp_entry_t entry = swp_entry(si->type, offset);
-	struct address_space *address_space = swap_address_space(entry);
 	struct swap_cluster_info *ci;
 	struct folio *folio;
 	int ret, nr_pages;
 	bool need_reclaim;
 
 again:
-	folio = filemap_get_folio(address_space, swap_cache_index(entry));
-	if (IS_ERR(folio))
+	folio = swap_cache_get_folio(entry);
+	if (!folio)
 		return 0;
 
 	nr_pages = folio_nr_pages(folio);
@@ -2131,7 +2130,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		pte_unmap(pte);
 		pte = NULL;
 
-		folio = swap_cache_get_folio(entry, vma, addr);
+		folio = swap_cache_get_folio(entry);
 		if (!folio) {
 			struct vm_fault vmf = {
 				.vma = vma,
@@ -2357,8 +2356,8 @@ static int try_to_unuse(unsigned int type)
 	       (i = find_next_to_unuse(si, i)) != 0) {
 
 		entry = swp_entry(type, i);
-		folio = filemap_get_folio(swap_address_space(entry), swap_cache_index(entry));
-		if (IS_ERR(folio))
+		folio = swap_cache_get_folio(entry);
+		if (!folio)
 			continue;
 
 		/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 50aaa8dcd24c..af61b95c89e4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1489,9 +1489,8 @@ static long move_pages_ptes(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd
 		 * separately to allow proper handling.
 		 */
 		if (!src_folio)
-			folio = filemap_get_folio(swap_address_space(entry),
-						  swap_cache_index(entry));
-		if (!IS_ERR_OR_NULL(folio)) {
+			folio = swap_cache_get_folio(entry);
+		if (folio) {
 			if (folio_test_large(folio)) {
 				ret = -EBUSY;
 				folio_put(folio);
-- 
2.51.0