From nobody Tue Dec 2 00:46:00 2025
From: Kairui Song
Date: Tue, 25 Nov 2025 03:13:50 +0800
Subject: [PATCH v3 07/19] mm/shmem: never bypass the swap cache for SWP_SYNCHRONOUS_IO
Message-Id: <20251125-swap-table-p2-v3-7-33f54f707a5c@tencent.com>
References: <20251125-swap-table-p2-v3-0-33f54f707a5c@tencent.com>
In-Reply-To: <20251125-swap-table-p2-v3-0-33f54f707a5c@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Baoquan He, Barry Song, Chris Li, Nhat Pham,
 Yosry Ahmed, David Hildenbrand, Johannes Weiner, Youngjun Park,
 Hugh Dickins, Baolin Wang, Ying Huang, Kemeng Shi, Lorenzo Stoakes,
 "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org, Kairui Song

From: Kairui Song <kasong@tencent.com>

Now that the overhead of the swap cache is trivial to none, bypassing
the swap cache is no longer a valid optimization. We have already
removed the cache-bypass swapin path for anonymous memory; do the same
for shmem. Many helpers and functions can be dropped as a result.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
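Note for reviewers: the sketch below is not part of the patch. It is a
standalone userspace C model of the contract this patch assumes from
swapin_folio() (introduced earlier in this series); the names and the
single-slot "swap cache" are invented for illustration. The real
function installs the newly allocated, already-charged folio into the
swap cache and reads it in; on a race it returns either the folio that
the winning swapin cached, or NULL when the race cannot be resolved.

#include <stdio.h>
#include <stdlib.h>
#include <stdatomic.h>

struct folio { int id; };

/* One swap-cache slot standing in for the entry being swapped in. */
static _Atomic(struct folio *) swap_cache_slot;

/*
 * Model of swapin_folio(entry, new): atomically claim the cache slot
 * with our freshly allocated folio. Returns 'new' on success, the
 * winner's folio if another swapin claimed the slot first, or NULL
 * when the race cannot be resolved (the caller maps that to -EEXIST
 * and retries at a smaller order, as shmem_swap_alloc_folio() does).
 */
static struct folio *swapin_folio_model(struct folio *new)
{
	struct folio *expected = NULL;

	if (atomic_compare_exchange_strong(&swap_cache_slot, &expected, new))
		return new;	/* we own the entry; I/O would proceed */
	return expected;	/* lost the race; reuse the cached folio */
}

int main(void)
{
	struct folio *new = malloc(sizeof(*new));
	struct folio *res;

	new->id = 1;
	res = swapin_folio_model(new);
	if (res == new)
		printf("installed folio %d into the swap cache\n", res->id);
	else if (res)
		printf("raced: drop ours, reuse cached folio %d\n", res->id);
	else
		printf("swapin failed: fall back to a smaller order\n");
	free(new);
	return 0;
}

Because every swapin now goes through the swap cache, the shmem path no
longer needs the skip_swapcache bookkeeping removed below.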
 mm/shmem.c    | 65 +++++++++++++++++------------------------------------------
 mm/swap.h     |  4 ----
 mm/swapfile.c | 35 +++++++++-----------------------
 3 files changed, 27 insertions(+), 77 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index ad18172ff831..d08248fd67ff 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2001,10 +2001,9 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 		swp_entry_t entry, int order, gfp_t gfp)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct folio *new, *swapcache;
 	int nr_pages = 1 << order;
-	struct folio *new;
 	gfp_t alloc_gfp;
-	void *shadow;
 
 	/*
 	 * We have arrived here because our zones are constrained, so don't
@@ -2044,34 +2043,19 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 		goto fallback;
 	}
 
-	/*
-	 * Prevent parallel swapin from proceeding with the swap cache flag.
-	 *
-	 * Of course there is another possible concurrent scenario as well,
-	 * that is to say, the swap cache flag of a large folio has already
-	 * been set by swapcache_prepare(), while another thread may have
-	 * already split the large swap entry stored in the shmem mapping.
-	 * In this case, shmem_add_to_page_cache() will help identify the
-	 * concurrent swapin and return -EEXIST.
-	 */
-	if (swapcache_prepare(entry, nr_pages)) {
+	swapcache = swapin_folio(entry, new);
+	if (swapcache != new) {
 		folio_put(new);
-		new = ERR_PTR(-EEXIST);
-		/* Try smaller folio to avoid cache conflict */
-		goto fallback;
+		if (!swapcache) {
+			/*
+			 * The new folio is charged already, swapin can
+			 * only fail due to another raced swapin.
+			 */
+			new = ERR_PTR(-EEXIST);
+			goto fallback;
+		}
 	}
-
-	__folio_set_locked(new);
-	__folio_set_swapbacked(new);
-	new->swap = entry;
-
-	memcg1_swapin(entry, nr_pages);
-	shadow = swap_cache_get_shadow(entry);
-	if (shadow)
-		workingset_refault(new, shadow);
-	folio_add_lru(new);
-	swap_read_folio(new, NULL);
-	return new;
+	return swapcache;
 fallback:
 	/* Order 0 swapin failed, nothing to fallback to, abort */
 	if (!order)
@@ -2161,8 +2145,7 @@ static int shmem_replace_folio(struct folio **foliop, gfp_t gfp,
 }
 
 static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
-					 struct folio *folio, swp_entry_t swap,
-					 bool skip_swapcache)
+					 struct folio *folio, swp_entry_t swap)
 {
 	struct address_space *mapping = inode->i_mapping;
 	swp_entry_t swapin_error;
@@ -2178,8 +2161,7 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 
 	nr_pages = folio_nr_pages(folio);
 	folio_wait_writeback(folio);
-	if (!skip_swapcache)
-		swap_cache_del_folio(folio);
+	swap_cache_del_folio(folio);
 	/*
 	 * Don't treat swapin error folio as alloced. Otherwise inode->i_blocks
 	 * won't be 0 when inode is released and thus trigger WARN_ON(i_blocks)
@@ -2279,7 +2261,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	softleaf_t index_entry;
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
-	bool skip_swapcache = false;
 	int error, nr_pages, order;
 	pgoff_t offset;
 
@@ -2322,7 +2303,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			folio = NULL;
 			goto failed;
 		}
-		skip_swapcache = true;
 	} else {
 		/* Cached swapin only supports order 0 folio */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
@@ -2378,9 +2358,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	 * and swap cache folios are never partially freed.
 	 */
 	folio_lock(folio);
-	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
-	    shmem_confirm_swap(mapping, index, swap) < 0 ||
-	    folio->swap.val != swap.val) {
+	if (!folio_matches_swap_entry(folio, swap) ||
+	    shmem_confirm_swap(mapping, index, swap) < 0) {
 		error = -EEXIST;
 		goto unlock;
 	}
@@ -2412,12 +2391,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	if (sgp == SGP_WRITE)
 		folio_mark_accessed(folio);
 
-	if (skip_swapcache) {
-		folio->swap.val = 0;
-		swapcache_clear(si, swap, nr_pages);
-	} else {
-		swap_cache_del_folio(folio);
-	}
+	swap_cache_del_folio(folio);
 	folio_mark_dirty(folio);
 	swap_free_nr(swap, nr_pages);
 	put_swap_device(si);
@@ -2428,14 +2402,11 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	if (shmem_confirm_swap(mapping, index, swap) < 0)
 		error = -EEXIST;
 	if (error == -EIO)
-		shmem_set_folio_swapin_error(inode, index, folio, swap,
-					     skip_swapcache);
+		shmem_set_folio_swapin_error(inode, index, folio, swap);
 unlock:
 	if (folio)
 		folio_unlock(folio);
 failed_nolock:
-	if (skip_swapcache)
-		swapcache_clear(si, folio->swap, folio_nr_pages(folio));
 	if (folio)
 		folio_put(folio);
 	put_swap_device(si);
diff --git a/mm/swap.h b/mm/swap.h
index 214e7d041030..e0f05babe13a 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -403,10 +403,6 @@ static inline int swap_writeout(struct folio *folio,
 	return 0;
 }
 
-static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
-{
-}
-
 static inline struct folio *swap_cache_get_folio(swp_entry_t entry)
 {
 	return NULL;
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ee6bb37ab174..5853db044031 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1610,22 +1610,6 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	return NULL;
 }
 
-static void swap_entries_put_cache(struct swap_info_struct *si,
-				   swp_entry_t entry, int nr)
-{
-	unsigned long offset = swp_offset(entry);
-	struct swap_cluster_info *ci;
-
-	ci = swap_cluster_lock(si, offset);
-	if (swap_only_has_cache(si, offset, nr)) {
-		swap_entries_free(si, ci, entry, nr);
-	} else {
-		for (int i = 0; i < nr; i++, entry.val++)
-			swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
-	}
-	swap_cluster_unlock(ci);
-}
-
 static bool swap_entries_put_map(struct swap_info_struct *si,
 				 swp_entry_t entry, int nr)
 {
@@ -1761,13 +1745,21 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
 void put_swap_folio(struct folio *folio, swp_entry_t entry)
 {
 	struct swap_info_struct *si;
+	struct swap_cluster_info *ci;
+	unsigned long offset = swp_offset(entry);
 	int size = 1 << swap_entry_order(folio_order(folio));
 
 	si = _swap_info_get(entry);
 	if (!si)
 		return;
 
-	swap_entries_put_cache(si, entry, size);
+	ci = swap_cluster_lock(si, offset);
+	if (swap_only_has_cache(si, offset, size))
+		swap_entries_free(si, ci, entry, size);
+	else
+		for (int i = 0; i < size; i++, entry.val++)
+			swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
+	swap_cluster_unlock(ci);
 }
 
 int __swap_count(swp_entry_t entry)
@@ -3780,15 +3772,6 @@ int swapcache_prepare(swp_entry_t entry, int nr)
 	return __swap_duplicate(entry, SWAP_HAS_CACHE, nr);
 }
 
-/*
- * Caller should ensure entries belong to the same folio so
- * the entries won't span cross cluster boundary.
- */
-void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
-{
-	swap_entries_put_cache(si, entry, nr);
-}
-
 /*
  * add_swap_count_continuation - called when a swap count is duplicated
  * beyond SWAP_MAP_MAX, it allocates a new page and links that to the entry's

-- 
2.52.0
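An aside on the put_swap_folio() change: with swapcache_clear() and
swap_entries_put_cache() gone, the cache-release logic lives only in
put_swap_folio(). The standalone userspace C sketch below models that
logic under invented, much-simplified semantics (a flat array of count
bytes, no clusters or locking): if every entry backing the folio holds
only the SWAP_HAS_CACHE reference, the whole range is freed in one
batch; otherwise each entry merely drops its cache reference.

#include <stdbool.h>
#include <stdio.h>

#define SWAP_HAS_CACHE	0x40	/* borrowed flag name, modeled value */
#define NR_ENTRIES	4

static unsigned char swap_map[NR_ENTRIES];

/* True if all nr entries from offset hold only the cache reference. */
static bool swap_only_has_cache(int offset, int nr)
{
	for (int i = 0; i < nr; i++)
		if (swap_map[offset + i] != SWAP_HAS_CACHE)
			return false;
	return true;
}

static void put_swap_folio_model(int offset, int size)
{
	if (swap_only_has_cache(offset, size)) {
		for (int i = 0; i < size; i++)
			swap_map[offset + i] = 0;	/* batch free */
		printf("freed %d entries in one batch\n", size);
	} else {
		for (int i = 0; i < size; i++)
			swap_map[offset + i] &= ~SWAP_HAS_CACHE;
		printf("dropped the cache ref on %d entries\n", size);
	}
}

int main(void)
{
	for (int i = 0; i < NR_ENTRIES; i++)
		swap_map[i] = SWAP_HAS_CACHE;
	put_swap_folio_model(0, NR_ENTRIES);	/* cache-only: batch free */

	for (int i = 0; i < NR_ENTRIES; i++)
		swap_map[i] = SWAP_HAS_CACHE;
	swap_map[2] |= 1;	/* entry 2 still mapped somewhere */
	put_swap_folio_model(0, NR_ENTRIES);	/* mixed: per-entry put */
	return 0;
}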