From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi,
	Chris Li, Nhat Pham, Baoquan He, Barry Song,
	linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 4/8] mm/shmem, swap: tidy up swap entry splitting
Date: Mon, 28 Jul 2025 15:53:02 +0800
Message-ID: <20250728075306.12704-5-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>
Reply-To: Kairui Song

From: Kairui Song

Instead of keeping different paths for splitting the entry before the
swapin starts, move the entry splitting to after the swapin has put the
folio in the swap cache (or set the SWAP_HAS_CACHE bit). This way we
only need one place and one unified way to split the large entry.
Whenever swapin brings in a folio smaller than the shmem swap entry,
split the entry and recalculate the entry and index for verification.

This removes duplicated code and function calls, reduces LOC, and makes
the split less racy, as it is now guarded by the swap cache, so there
is a lower chance of repeated faults due to a raced split.

The compiler is also able to optimize the code further. bloat-o-meter
results with GCC 14:

With DEBUG_SECTION_MISMATCH (-fno-inline-functions-called-once):

./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-143 (-143)
Function                                     old     new   delta
shmem_swapin_folio                          2358    2215    -143
Total: Before=32933, After=32790, chg -0.43%

With !DEBUG_SECTION_MISMATCH:

add/remove: 0/1 grow/shrink: 1/0 up/down: 1069/-749 (320)
Function                                     old     new   delta
shmem_swapin_folio                          2871    3940   +1069
shmem_split_large_entry.isra                 749       -    -749
Total: Before=32806, After=33126, chg +0.98%

Since shmem_split_large_entry is now only called in one place, the
compiler will either generate more compact code or inline it for better
performance.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
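[Editor's note, not part of the patch: below is a minimal, self-contained
userspace sketch of the offset recalculation the commit message and the
hunks describe, assuming plain-integer stand-ins for pgoff_t and the swap
offset. The helper names (round_down_pow2, swapin_target_offset) are
hypothetical, not kernel API; the kernel's round_down() is modeled with a
mask, which matches its behavior for power-of-two sizes.]

#include <stdio.h>

/* Userspace stand-in for the kernel's pgoff_t. */
typedef unsigned long pgoff_t;

/* Equivalent of the kernel's round_down(index, 1 << order) for
 * power-of-two alignment: clear the low `order` bits. */
static pgoff_t round_down_pow2(pgoff_t index, int order)
{
	return index & ~((1UL << order) - 1);
}

/* Hypothetical helper mirroring the recalculation in the patch: the
 * faulting index is realigned to the start of the large (order > 0)
 * entry, and the distance from that start is added to the entry's
 * base swap offset to find the order-0 entry to swap in. */
static unsigned long swapin_target_offset(pgoff_t index,
					  unsigned long entry_offset,
					  int order)
{
	pgoff_t aligned = round_down_pow2(index, order);

	return entry_offset + (index - aligned);
}

int main(void)
{
	/* An order-4 (16-page) entry starting at swap offset 0x100:
	 * a fault at mapping index 0x23 lands on page 3 (0-based) of
	 * that entry, so the order-0 swapin must target offset 0x103. */
	printf("%#lx\n", swapin_target_offset(0x23, 0x100, 4));
	return 0;
}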
 mm/shmem.c | 56 ++++++++++++++++++++++--------------------------------
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 881d440eeebb..e089de25cf6a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2303,14 +2303,16 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct address_space *mapping = inode->i_mapping;
 	struct mm_struct *fault_mm = vma ? vma->vm_mm : NULL;
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	swp_entry_t swap, index_entry;
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
-	swp_entry_t swap;
 	int error, nr_pages, order, split_order;
+	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-	swap = radix_to_swp_entry(*foliop);
+	index_entry = radix_to_swp_entry(*foliop);
+	swap = index_entry;
 	*foliop = NULL;
 
 	if (is_poisoned_swp_entry(swap))
@@ -2358,46 +2360,35 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		}
 
 		/*
-		 * Now swap device can only swap in order 0 folio, then we
-		 * should split the large swap entry stored in the pagecache
-		 * if necessary.
-		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
-		if (split_order < 0) {
-			error = split_order;
-			goto failed;
-		}
-
-		/*
-		 * If the large swap entry has already been split, it is
+		 * Now swap device can only swap in order 0 folio, it is
 		 * necessary to recalculate the new swap entry based on
-		 * the old order alignment.
+		 * the offset, as the swapin index might be unaligned.
 		 */
-		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
+		if (order) {
+			offset = index - round_down(index, 1 << order);
 			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
 		}
 
-		/* Here we actually start the io */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
 			error = -ENOMEM;
 			goto failed;
 		}
-	} else if (order > folio_order(folio)) {
+	}
+alloced:
+	if (order > folio_order(folio)) {
 		/*
-		 * Swap readahead may swap in order 0 folios into swapcache
+		 * Swapin may get smaller folios for various reasons:
+		 * it may fall back to order 0 due to memory pressure or a
+		 * race, and swap readahead may swap in order 0 folios into swapcache
 		 * asynchronously, while the shmem mapping can still stores
 		 * large swap entries. In such cases, we should split the
 		 * large swap entry to prevent possible data corruption.
 		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
+		split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
 		if (split_order < 0) {
-			folio_put(folio);
-			folio = NULL;
 			error = split_order;
-			goto failed;
+			goto failed_nolock;
 		}
 
 		/*
@@ -2406,16 +2397,14 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		 * the old order alignment.
 		 */
 		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+			offset = index - round_down(index, 1 << split_order);
+			swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
 		}
 	} else if (order < folio_order(folio)) {
 		swap.val = round_down(swap.val, 1 << folio_order(folio));
 		index = round_down(index, 1 << folio_order(folio));
 	}
 
-alloced:
 	/*
 	 * We have to do this with the folio locked to prevent races.
 	 * The shmem_confirm_swap below only checks if the first swap
@@ -2479,12 +2468,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	shmem_set_folio_swapin_error(inode, index, folio, swap,
 				     skip_swapcache);
 unlock:
-	if (skip_swapcache)
-		swapcache_clear(si, swap, folio_nr_pages(folio));
-	if (folio) {
+	if (folio)
 		folio_unlock(folio);
+failed_nolock:
+	if (skip_swapcache)
+		swapcache_clear(si, folio->swap, folio_nr_pages(folio));
+	if (folio)
 		folio_put(folio);
-	}
 	put_swap_device(si);
 
 	return error;
-- 
2.50.1