From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song, stable@vger.kernel.org
Subject: [PATCH v6 1/8] mm/shmem, swap: improve cached mTHP handling and fix potential hang
Date: Mon, 28 Jul 2025 15:52:59 +0800
Message-ID: <20250728075306.12704-2-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

The current swap-in code assumes that, when a swap entry in the shmem
mapping is order 0, its cached folios (if present) must be order 0
too, which turns out not to be always correct.

The problem is that shmem_split_large_entry is called before verifying
that the folio will eventually be swapped in. One possible race is:

CPU1                                      CPU2
shmem_swapin_folio
/* swap in of order > 0 swap entry S1 */
folio = swap_cache_get_folio
/* folio = NULL */
order = xa_get_order
/* order > 0 */
folio = shmem_swap_alloc_folio
/* mTHP alloc failure, folio = NULL */
<... Interrupted ...>
                                          shmem_swapin_folio
                                          /* S1 is swapped in */
                                          shmem_writeout
                                          /* S1 is swapped out, folio cached */
shmem_split_large_entry(..., S1)
/* S1 is split, but the folio covering it has order > 0 now */

Now any following swapin of S1 will hang: `xa_get_order` returns 0,
while the folio lookup returns a folio with order > 0. The
`xa_get_order(&mapping->i_pages, index) != folio_order(folio)` check
will never pass, so the swap-in keeps returning -EEXIST. And this
looks fragile.

So fix this up by allowing a larger folio to be seen in the swap
cache, and by checking that the whole shmem mapping range covered by
the swapin holds the expected swap values upon inserting the folio.
Also drop the redundant tree walks before the insertion.

This actually improves performance, as it avoids two redundant Xarray
tree walks in the hot path. The only side effect is that in the
failure path, shmem may redundantly reallocate a few folios, causing
temporary slight memory pressure.

It is worth noting that it may seem the order and value check before
inserting would help reduce lock contention, but that is not true. The
swap cache layer ensures that a raced swapin will either see a swap
cache folio or fail to do a swapin (we have the SWAP_HAS_CACHE bit
even if the swap cache is bypassed), so holding the folio lock and
checking the folio flag is already good enough to avoid the lock
contention. The chance that a folio passes the swap entry value check
while the shmem mapping slot has changed should be very low.
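As a rough sketch of the validation idea (simplified from the
shmem_add_to_page_cache() change below, with locking and the xas retry
loop elided), the insertion now walks every conflicting slot in the
range and verifies it holds the next expected swap entry:

        /* `swap` is the expected first entry, `nr` the folio size */
        iter = swap;
        xas_for_each_conflict(&xas, entry) {
                /* every slot must hold the next expected swap entry */
                if (!expected || entry != swp_to_radix_entry(iter))
                        return -EEXIST;
                iter.val += 1 << xas_get_order(&xas);
        }
        /* the walked range must cover exactly `nr` pages from `swap` */
        if (expected && iter.val - nr != swap.val)
                return -EEXIST;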
Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
Signed-off-by: Kairui Song
Reviewed-by: Kemeng Shi
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
Cc: stable@vger.kernel.org
---
 mm/shmem.c | 39 ++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 7570a24e0ae4..1d0fd266c29b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -891,7 +891,9 @@ static int shmem_add_to_page_cache(struct folio *folio,
                                    pgoff_t index, void *expected, gfp_t gfp)
 {
         XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio));
-        long nr = folio_nr_pages(folio);
+        unsigned long nr = folio_nr_pages(folio);
+        swp_entry_t iter, swap;
+        void *entry;
 
         VM_BUG_ON_FOLIO(index != round_down(index, nr), folio);
         VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
@@ -903,14 +905,25 @@ static int shmem_add_to_page_cache(struct folio *folio,
 
         gfp &= GFP_RECLAIM_MASK;
         folio_throttle_swaprate(folio, gfp);
+        swap = radix_to_swp_entry(expected);
 
         do {
+                iter = swap;
                 xas_lock_irq(&xas);
-                if (expected != xas_find_conflict(&xas)) {
-                        xas_set_err(&xas, -EEXIST);
-                        goto unlock;
+                xas_for_each_conflict(&xas, entry) {
+                        /*
+                         * The range must either be empty, or filled with
+                         * expected swap entries. Shmem swap entries are never
+                         * partially freed without split of both entry and
+                         * folio, so there shouldn't be any holes.
+                         */
+                        if (!expected || entry != swp_to_radix_entry(iter)) {
+                                xas_set_err(&xas, -EEXIST);
+                                goto unlock;
+                        }
+                        iter.val += 1 << xas_get_order(&xas);
                 }
-                if (expected && xas_find_conflict(&xas)) {
+                if (expected && iter.val - nr != swap.val) {
                         xas_set_err(&xas, -EEXIST);
                         goto unlock;
                 }
@@ -2359,7 +2372,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                         error = -ENOMEM;
                         goto failed;
                 }
-        } else if (order != folio_order(folio)) {
+        } else if (order > folio_order(folio)) {
                 /*
                  * Swap readahead may swap in order 0 folios into swapcache
                  * asynchronously, while the shmem mapping can still store
@@ -2384,15 +2397,23 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
                         swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
                 }
+        } else if (order < folio_order(folio)) {
+                swap.val = round_down(swap.val, 1 << folio_order(folio));
+                index = round_down(index, 1 << folio_order(folio));
         }
 
 alloced:
-        /* We have to do this with folio locked to prevent races */
+        /*
+         * We have to do this with the folio locked to prevent races.
+         * The shmem_confirm_swap below only checks if the first swap
+         * entry matches the folio, that's enough to ensure the folio
+         * is not used outside of shmem, as shmem swap entries
+         * and swap cache folios are never partially freed.
+         */
         folio_lock(folio);
         if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
-            folio->swap.val != swap.val ||
             !shmem_confirm_swap(mapping, index, swap) ||
-            xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
+            folio->swap.val != swap.val) {
                 error = -EEXIST;
                 goto unlock;
         }
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song, Dev Jain
Subject: [PATCH v6 2/8] mm/shmem, swap: avoid redundant Xarray lookup during swapin
Date: Mon, 28 Jul 2025 15:53:00 +0800
Message-ID: <20250728075306.12704-3-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

Currently shmem calls xa_get_order to get the swap radix entry order,
requiring a full tree walk. This can easily be combined with the swap
entry value check (shmem_confirm_swap) to avoid the duplicated lookup
and to abort early if the entry is gone already, which should improve
performance.
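As an illustrative sketch of the new calling convention (not a
complete excerpt; error paths trimmed), shmem_confirm_swap now hands
back the entry order from the same walk, so callers no longer need a
separate xa_get_order pass:

        order = shmem_confirm_swap(mapping, index, swap);
        if (unlikely(order < 0))
                return -EEXIST; /* entry is gone or replaced, caller retries */
        /* `order` is valid here and is reused for the swapin below */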
Signed-off-by: Kairui Song
Reviewed-by: Kemeng Shi
Reviewed-by: Dev Jain
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 1d0fd266c29b..da8edb363c75 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -512,15 +512,27 @@ static int shmem_replace_entry(struct address_space *mapping,
 
 /*
  * Sometimes, before we decide whether to proceed or to fail, we must check
- * that an entry was not already brought back from swap by a racing thread.
+ * that an entry was not already brought back or split by a racing thread.
  *
  * Checking folio is not enough: by the time a swapcache folio is locked, it
  * might be reused, and again be swapcache, using the same swap as before.
+ * Returns the swap entry's order if it is still present, else returns -1.
  */
-static bool shmem_confirm_swap(struct address_space *mapping,
-                               pgoff_t index, swp_entry_t swap)
+static int shmem_confirm_swap(struct address_space *mapping, pgoff_t index,
+                              swp_entry_t swap)
 {
-        return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
+        XA_STATE(xas, &mapping->i_pages, index);
+        int ret = -1;
+        void *entry;
+
+        rcu_read_lock();
+        do {
+                entry = xas_load(&xas);
+                if (entry == swp_to_radix_entry(swap))
+                        ret = xas_get_order(&xas);
+        } while (xas_retry(&xas, entry));
+        rcu_read_unlock();
+        return ret;
 }
 
 /*
@@ -2293,16 +2305,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                 return -EIO;
 
         si = get_swap_device(swap);
-        if (!si) {
-                if (!shmem_confirm_swap(mapping, index, swap))
+        order = shmem_confirm_swap(mapping, index, swap);
+        if (unlikely(!si)) {
+                if (order < 0)
                         return -EEXIST;
                 else
                         return -EINVAL;
         }
+        if (unlikely(order < 0)) {
+                put_swap_device(si);
+                return -EEXIST;
+        }
 
         /* Look it up and read it in.. */
         folio = swap_cache_get_folio(swap, NULL, 0);
-        order = xa_get_order(&mapping->i_pages, index);
         if (!folio) {
                 int nr_pages = 1 << order;
                 bool fallback_order0 = false;
@@ -2412,7 +2428,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
          */
         folio_lock(folio);
         if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
-            !shmem_confirm_swap(mapping, index, swap) ||
+            shmem_confirm_swap(mapping, index, swap) < 0 ||
             folio->swap.val != swap.val) {
                 error = -EEXIST;
                 goto unlock;
@@ -2460,7 +2476,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
         *foliop = folio;
         return 0;
 failed:
-        if (!shmem_confirm_swap(mapping, index, swap))
+        if (shmem_confirm_swap(mapping, index, swap) < 0)
                 error = -EEXIST;
         if (error == -EIO)
                 shmem_set_folio_swapin_error(inode, index, folio, swap,
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 3/8] mm/shmem, swap: tidy up THP swapin checks
Date: Mon, 28 Jul 2025 15:53:01 +0800
Message-ID: <20250728075306.12704-4-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

Move all THP swapin related checks under CONFIG_TRANSPARENT_HUGEPAGE,
so they will be trimmed off by the compiler if not needed. And add a
WARN if shmem sees an order > 0 entry when CONFIG_TRANSPARENT_HUGEPAGE
is disabled; that should never happen unless things went very wrong.

There should be no observable feature change except the newly added
WARN.
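As a minimal sketch of the pattern this patch applies (a generic
illustration, not the full function body), IS_ENABLED lets the
compiler discard the THP-only branch entirely when the config is off:

        if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
                /* !THP build: a large entry here means a bug elsewhere */
                if (WARN_ON_ONCE(order))
                        return ERR_PTR(-EINVAL);
        } else if (order) {
                /* THP-only checks live here and compile away otherwise */
        }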
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 39 ++++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index da8edb363c75..881d440eeebb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2017,26 +2017,38 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
                 swp_entry_t entry, int order, gfp_t gfp)
 {
         struct shmem_inode_info *info = SHMEM_I(inode);
+        int nr_pages = 1 << order;
         struct folio *new;
         void *shadow;
-        int nr_pages;
 
         /*
          * We have arrived here because our zones are constrained, so don't
          * limit chance of success with further cpuset and node constraints.
          */
         gfp &= ~GFP_CONSTRAINT_MASK;
-        if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && order > 0) {
-                gfp_t huge_gfp = vma_thp_gfp_mask(vma);
+        if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+                if (WARN_ON_ONCE(order))
+                        return ERR_PTR(-EINVAL);
+        } else if (order) {
+                /*
+                 * If uffd is active for the vma, we need per-page fault
+                 * fidelity to maintain the uffd semantics, then fallback
+                 * to swapin order-0 folio, as well as for zswap case.
+                 * Any existing sub folio in the swap cache also blocks
+                 * mTHP swapin.
+                 */
+                if ((vma && unlikely(userfaultfd_armed(vma))) ||
+                    !zswap_never_enabled() ||
+                    non_swapcache_batch(entry, nr_pages) != nr_pages)
+                        return ERR_PTR(-EINVAL);
 
-                gfp = limit_gfp_mask(huge_gfp, gfp);
+                gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
         }
 
         new = shmem_alloc_folio(gfp, order, info, index);
         if (!new)
                 return ERR_PTR(-ENOMEM);
 
-        nr_pages = folio_nr_pages(new);
         if (mem_cgroup_swapin_charge_folio(new, vma ? vma->vm_mm : NULL,
                                            gfp, entry)) {
                 folio_put(new);
@@ -2320,9 +2332,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
         /* Look it up and read it in.. */
         folio = swap_cache_get_folio(swap, NULL, 0);
         if (!folio) {
-                int nr_pages = 1 << order;
-                bool fallback_order0 = false;
-
                 /* Or update major stats only when swapin succeeds?? */
                 if (fault_type) {
                         *fault_type |= VM_FAULT_MAJOR;
@@ -2330,20 +2339,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                         count_memcg_event_mm(fault_mm, PGMAJFAULT);
                 }
 
-                /*
-                 * If uffd is active for the vma, we need per-page fault
-                 * fidelity to maintain the uffd semantics, then fallback
-                 * to swapin order-0 folio, as well as for zswap case.
-                 * Any existing sub folio in the swap cache also blocks
-                 * mTHP swapin.
-                 */
-                if (order > 0 && ((vma && unlikely(userfaultfd_armed(vma))) ||
-                                  !zswap_never_enabled() ||
-                                  non_swapcache_batch(swap, nr_pages) != nr_pages))
-                        fallback_order0 = true;
-
-                /* Skip swapcache for synchronous device. */
-                if (!fallback_order0 && data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+                if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
                         folio = shmem_swap_alloc_folio(inode, vma, index, swap,
                                                        order, gfp);
                         if (!IS_ERR(folio)) {
                                 skip_swapcache = true;
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 4/8] mm/shmem, swap: tidy up swap entry splitting
Date: Mon, 28 Jul 2025 15:53:02 +0800
Message-ID: <20250728075306.12704-5-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

Instead of keeping different paths for splitting the entry before the
swapin starts, move the entry splitting to after the swapin has put
the folio in the swap cache (or set the SWAP_HAS_CACHE bit). This way
we only need one place and one unified way to split the large entry.
Whenever swapin brings in a folio smaller than the shmem swap entry,
split the entry and recalculate the entry and index for verification.

This removes duplicated code and function calls, reduces LOC, and the
split is less racy as it's now guarded by the swap cache, so it will
have a lower chance of repeated faults due to a raced split. The
compiler is also able to optimize the code further. bloat-o-meter
results with GCC 14:

With DEBUG_SECTION_MISMATCH (-fno-inline-functions-called-once):
./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-143 (-143)
Function                                     old     new   delta
shmem_swapin_folio                          2358    2215    -143
Total: Before=32933, After=32790, chg -0.43%

With !DEBUG_SECTION_MISMATCH:
add/remove: 0/1 grow/shrink: 1/0 up/down: 1069/-749 (320)
Function                                     old     new   delta
shmem_swapin_folio                          2871    3940   +1069
shmem_split_large_entry.isra                 749       -    -749
Total: Before=32806, After=33126, chg +0.98%

Since shmem_split_large_entry is only called in one place now, the
compiler will either generate more compact code or inline it for
better performance.
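A rough sketch of the unified flow (paraphrased from the hunks below;
error handling elided): the split happens only after a folio is
secured, and the sub-entry is then recomputed from the split order:

        /* folio came from the swap cache or a direct swapin */
        if (order > folio_order(folio)) {
                split_order = shmem_split_large_entry(inode, index,
                                                      index_entry, gfp);
                if (split_order > 0) {
                        /* recompute the sub-entry under the new alignment */
                        offset = index - round_down(index, 1 << split_order);
                        swap = swp_entry(swp_type(swap),
                                         swp_offset(index_entry) + offset);
                }
        }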
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
 mm/shmem.c | 56 ++++++++++++++++++++++--------------------------------
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 881d440eeebb..e089de25cf6a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2303,14 +2303,16 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
         struct address_space *mapping = inode->i_mapping;
         struct mm_struct *fault_mm = vma ? vma->vm_mm : NULL;
         struct shmem_inode_info *info = SHMEM_I(inode);
+        swp_entry_t swap, index_entry;
         struct swap_info_struct *si;
         struct folio *folio = NULL;
         bool skip_swapcache = false;
-        swp_entry_t swap;
         int error, nr_pages, order, split_order;
+        pgoff_t offset;
 
         VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-        swap = radix_to_swp_entry(*foliop);
+        index_entry = radix_to_swp_entry(*foliop);
+        swap = index_entry;
         *foliop = NULL;
 
         if (is_poisoned_swp_entry(swap))
@@ -2358,46 +2360,35 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                 }
 
                 /*
-                 * Now swap device can only swap in order 0 folio, then we
-                 * should split the large swap entry stored in the pagecache
-                 * if necessary.
-                 */
-                split_order = shmem_split_large_entry(inode, index, swap, gfp);
-                if (split_order < 0) {
-                        error = split_order;
-                        goto failed;
-                }
-
-                /*
-                 * If the large swap entry has already been split, it is
+                 * Now swap device can only swap in order 0 folio, it is
                  * necessary to recalculate the new swap entry based on
-                 * the old order alignment.
+                 * the offset, as the swapin index might be unaligned.
                  */
-                if (split_order > 0) {
-                        pgoff_t offset = index - round_down(index, 1 << split_order);
-
+                if (order) {
+                        offset = index - round_down(index, 1 << order);
                         swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
                 }
 
-                /* Here we actually start the io */
                 folio = shmem_swapin_cluster(swap, gfp, info, index);
                 if (!folio) {
                         error = -ENOMEM;
                         goto failed;
                 }
-        } else if (order > folio_order(folio)) {
+        }
+alloced:
+        if (order > folio_order(folio)) {
                 /*
-                 * Swap readahead may swap in order 0 folios into swapcache
+                 * Swapin may get smaller folios due to various reasons:
+                 * It may fallback to order 0 due to memory pressure or race,
+                 * swap readahead may swap in order 0 folios into swapcache
                  * asynchronously, while the shmem mapping can still store
                  * large swap entries. In such cases, we should split the
                  * large swap entry to prevent possible data corruption.
                  */
-                split_order = shmem_split_large_entry(inode, index, swap, gfp);
+                split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
                 if (split_order < 0) {
-                        folio_put(folio);
-                        folio = NULL;
                         error = split_order;
-                        goto failed;
+                        goto failed_nolock;
                 }
 
                 /*
@@ -2406,16 +2397,14 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                  * the old order alignment.
                  */
                 if (split_order > 0) {
-                        pgoff_t offset = index - round_down(index, 1 << split_order);
-
-                        swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+                        offset = index - round_down(index, 1 << split_order);
+                        swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
                 }
         } else if (order < folio_order(folio)) {
                 swap.val = round_down(swap.val, 1 << folio_order(folio));
                 index = round_down(index, 1 << folio_order(folio));
         }
 
-alloced:
         /*
          * We have to do this with the folio locked to prevent races.
          * The shmem_confirm_swap below only checks if the first swap
@@ -2479,12 +2468,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                 shmem_set_folio_swapin_error(inode, index, folio, swap,
                                              skip_swapcache);
 unlock:
-        if (skip_swapcache)
-                swapcache_clear(si, swap, folio_nr_pages(folio));
-        if (folio) {
+        if (folio)
                 folio_unlock(folio);
+failed_nolock:
+        if (skip_swapcache)
+                swapcache_clear(si, folio->swap, folio_nr_pages(folio));
+        if (folio)
                 folio_put(folio);
-        }
         put_swap_device(si);
 
         return error;
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 5/8] mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
Date: Mon, 28 Jul 2025 15:53:03 +0800
Message-ID: <20250728075306.12704-6-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

For SWP_SYNCHRONOUS_IO devices, if a cache-bypassing THP swapin failed
due to reasons like memory pressure, a partially conflicting swap
cache, or ZSWAP being enabled, shmem would fall back to a cached
order 0 swapin.

Right now the swap cache still has a non-trivial overhead, and
readahead is not helpful for SWP_SYNCHRONOUS_IO devices, so we should
always skip the readahead and swap cache even if the swapin falls back
to order 0. So handle the fallback logic without falling back to the
cached read.
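A condensed sketch of the fallback retry this patch adds inside
shmem_swap_alloc_folio (paraphrased from the hunks below; labels and
variables match the diff):

        fallback:
                /* order 0 already failed: nothing smaller to try, abort */
                if (!order)
                        return new;
                /* aim at the exact sub-entry, then retry as order 0 */
                entry.val += index - round_down(index, nr_pages);
                alloc_gfp = gfp;
                nr_pages = 1;
                order = 0;
                goto retry;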
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e089de25cf6a..6bcca287e173 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2019,6 +2019,7 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
         struct shmem_inode_info *info = SHMEM_I(inode);
         int nr_pages = 1 << order;
         struct folio *new;
+        gfp_t alloc_gfp;
         void *shadow;
 
         /*
@@ -2026,6 +2027,7 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
          * limit chance of success with further cpuset and node constraints.
          */
         gfp &= ~GFP_CONSTRAINT_MASK;
+        alloc_gfp = gfp;
         if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
                 if (WARN_ON_ONCE(order))
                         return ERR_PTR(-EINVAL);
@@ -2040,19 +2042,22 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
                 if ((vma && unlikely(userfaultfd_armed(vma))) ||
                     !zswap_never_enabled() ||
                     non_swapcache_batch(entry, nr_pages) != nr_pages)
-                        return ERR_PTR(-EINVAL);
+                        goto fallback;
 
-                gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
+                alloc_gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
+        }
+retry:
+        new = shmem_alloc_folio(alloc_gfp, order, info, index);
+        if (!new) {
+                new = ERR_PTR(-ENOMEM);
+                goto fallback;
         }
-
-        new = shmem_alloc_folio(gfp, order, info, index);
-        if (!new)
-                return ERR_PTR(-ENOMEM);
 
         if (mem_cgroup_swapin_charge_folio(new, vma ? vma->vm_mm : NULL,
-                                           gfp, entry)) {
+                                           alloc_gfp, entry)) {
                 folio_put(new);
-                return ERR_PTR(-ENOMEM);
+                new = ERR_PTR(-ENOMEM);
+                goto fallback;
         }
 
         /*
@@ -2067,7 +2072,9 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
          */
         if (swapcache_prepare(entry, nr_pages)) {
                 folio_put(new);
-                return ERR_PTR(-EEXIST);
+                new = ERR_PTR(-EEXIST);
+                /* Try smaller folio to avoid cache conflict */
+                goto fallback;
         }
 
         __folio_set_locked(new);
@@ -2081,6 +2088,15 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
         folio_add_lru(new);
         swap_read_folio(new, NULL);
         return new;
+fallback:
+        /* Order 0 swapin failed, nothing to fallback to, abort */
+        if (!order)
+                return new;
+        entry.val += index - round_down(index, nr_pages);
+        alloc_gfp = gfp;
+        nr_pages = 1;
+        order = 0;
+        goto retry;
 }
 
 /*
@@ -2350,13 +2366,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                 }
 
                 /*
-                 * Fallback to swapin order-0 folio unless the swap entry
-                 * already exists.
+                 * Direct swapin handled order 0 fallback already,
+                 * if it failed, abort.
                  */
                 error = PTR_ERR(folio);
                 folio = NULL;
-                if (error == -EEXIST)
-                        goto failed;
+                goto failed;
         }
 
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 6/8] mm/shmem, swap: simplify swapin path and result handling
Date: Mon, 28 Jul 2025 15:53:04 +0800
Message-ID: <20250728075306.12704-7-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

Slightly tidy up the different handling of swapin and error handling
for SWP_SYNCHRONOUS_IO and non-SWP_SYNCHRONOUS_IO devices. Now swapin
will always use either shmem_swap_alloc_folio or shmem_swapin_cluster,
then check the result. Simplify the control flow and avoid a redundant
goto label.
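The resulting control flow, condensed as a sketch (error-code
assignment and the order recalculation are elided; the real hunks are
below):

        if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
                /* direct swapin, skipping swap cache & readahead */
                folio = shmem_swap_alloc_folio(inode, vma, index,
                                               swap, order, gfp);
                if (IS_ERR(folio))
                        goto failed;
                skip_swapcache = true;
        } else {
                /* cached swapin, order 0 folio only */
                folio = shmem_swapin_cluster(swap, gfp, info, index);
                if (!folio)
                        goto failed;
        }
        /* one shared result check follows for both paths */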
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 45 +++++++++++++++++++--------------------------
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 6bcca287e173..72b6370a8e81 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2357,40 +2357,33 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
                         count_memcg_event_mm(fault_mm, PGMAJFAULT);
                 }
 
-                /* Skip swapcache for synchronous device. */
                 if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+                        /* Direct swapin skipping swap cache & readahead */
                         folio = shmem_swap_alloc_folio(inode, vma, index, swap,
                                                        order, gfp);
-                        if (!IS_ERR(folio)) {
-                                skip_swapcache = true;
-                                goto alloced;
+                        if (IS_ERR(folio)) {
+                                error = PTR_ERR(folio);
+                                folio = NULL;
+                                goto failed;
                         }
-
+                        skip_swapcache = true;
+                } else {
                         /*
-                         * Direct swapin handled order 0 fallback already,
-                         * if it failed, abort.
+                         * Cached swapin only supports order 0 folio, it is
+                         * necessary to recalculate the new swap entry based on
+                         * the offset, as the swapin index might be unaligned.
                          */
-                        error = PTR_ERR(folio);
-                        folio = NULL;
-                        goto failed;
-                }
-
-                /*
-                 * Now swap device can only swap in order 0 folio, it is
-                 * necessary to recalculate the new swap entry based on
-                 * the offset, as the swapin index might be unaligned.
-                 */
-                if (order) {
-                        offset = index - round_down(index, 1 << order);
-                        swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-                }
+                        if (order) {
+                                offset = index - round_down(index, 1 << order);
+                                swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+                        }
 
-                folio = shmem_swapin_cluster(swap, gfp, info, index);
-                if (!folio) {
-                        error = -ENOMEM;
-                        goto failed;
+                        folio = shmem_swapin_cluster(swap, gfp, info, index);
+                        if (!folio) {
+                                error = -ENOMEM;
+                                goto failed;
+                        }
                 }
         }
-alloced:
         if (order > folio_order(folio)) {
                 /*
                  * Swapin may get smaller folios due to various reasons:
-- 
2.50.1

From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v6 7/8] mm/shmem, swap: rework swap entry and index calculation for large swapin
Date: Mon, 28 Jul 2025 15:53:05 +0800
Message-ID: <20250728075306.12704-8-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

Instead of calculating the swap entry differently in different swapin
paths, calculate it early, before the swap cache lookup, and use that
value for the lookup and the later swapin. And after swapin has
brought in a folio, simply round the entry down to the size of the
folio.

This is simple and effective enough to verify the swap value. A
folio's swap entry is always aligned to its size. Any kind of parallel
split or race is acceptable, because the final shmem_add_to_page_cache
ensures that all entries covered by the folio are correct, and thus
there will be no data corruption.

This also prevents false positive cache lookups. If a shmem read
request's index points to the middle of a large swap entry, shmem
previously tried the swap cache lookup using the large swap entry's
starting value (which is the first sub swap entry of this large
entry). That leads to false positive lookup results if only the first
few swap entries are cached but the actual requested swap entry
pointed to by the index is uncached. This is not a rare event, as swap
readahead always tries to cache order 0 folios when possible.

And this shouldn't cause any increased repeated faults. Instead, no
matter how the shmem mapping is split in parallel, as long as the
mapping still contains the right entries, the swapin will succeed.

The final object size and stack usage are also reduced due to
simplified code:

./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-145 (-145)
Function                                     old     new   delta
shmem_swapin_folio                          4056    3911    -145
Total: Before=33242, After=33097, chg -0.44%

Stack usage (Before vs After):
mm/shmem.c:2314:12:shmem_swapin_folio   264     static
mm/shmem.c:2314:12:shmem_swapin_folio   256     static

And while at it, round down the index too if the swap entry is rounded
down. The index is used either for folio reallocation or for
confirming the mapping content. In either case, it should be aligned
with the swap folio.
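A small sketch of the early sub-entry calculation this patch moves in
front of the swap cache lookup (taken in simplified form from the hunk
below):

        /*
         * index may point into the middle of a large entry:
         * derive the exact sub-entry before any lookup.
         */
        if (order) {
                offset = index - round_down(index, 1 << order);
                swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
        }
        folio = swap_cache_get_folio(swap, NULL, 0);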
This shouldn't cause any increase in repeated faults either: no matter
how the shmem mapping is split in parallel, as long as the mapping
still contains the right entries, the swapin will succeed.

The final object size and stack usage are also reduced thanks to the
simplified code:

./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-145 (-145)
Function                                     old     new   delta
shmem_swapin_folio                          4056    3911    -145
Total: Before=33242, After=33097, chg -0.44%

Stack usage (before vs after):
mm/shmem.c:2314:12:shmem_swapin_folio       264     static
mm/shmem.c:2314:12:shmem_swapin_folio       256     static

While at it, also round down the index if the swap entry is rounded
down. The index is used either for folio reallocation or for confirming
the mapping content; in either case, it should be aligned with the swap
folio.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
 mm/shmem.c | 67 +++++++++++++++++++++++++++---------------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 72b6370a8e81..aed5da693855 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2302,7 +2302,7 @@ static int shmem_split_large_entry(struct inode *inode, pgoff_t index,
 	if (xas_error(&xas))
 		return xas_error(&xas);
 
-	return entry_order;
+	return 0;
 }
 
 /*
@@ -2323,7 +2323,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
-	int error, nr_pages, order, split_order;
+	int error, nr_pages, order;
 	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
@@ -2331,11 +2331,11 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	swap = index_entry;
 	*foliop = NULL;
 
-	if (is_poisoned_swp_entry(swap))
+	if (is_poisoned_swp_entry(index_entry))
 		return -EIO;
 
-	si = get_swap_device(swap);
-	order = shmem_confirm_swap(mapping, index, swap);
+	si = get_swap_device(index_entry);
+	order = shmem_confirm_swap(mapping, index, index_entry);
 	if (unlikely(!si)) {
 		if (order < 0)
 			return -EEXIST;
@@ -2347,6 +2347,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EEXIST;
 	}
 
+	/* index may point to the middle of a large entry, get the sub entry */
+	if (order) {
+		offset = index - round_down(index, 1 << order);
+		swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+	}
+
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
@@ -2359,7 +2365,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			/* Direct swapin skipping swap cache & readahead */
-			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
+			folio = shmem_swap_alloc_folio(inode, vma, index,
+						       index_entry, order, gfp);
 			if (IS_ERR(folio)) {
 				error = PTR_ERR(folio);
 				folio = NULL;
@@ -2367,16 +2374,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			}
 			skip_swapcache = true;
 		} else {
-			/*
-			 * Cached swapin only supports order 0 folio, it is
-			 * necessary to recalculate the new swap entry based on
-			 * the offset, as the swapin index might be unalgined.
-			 */
-			if (order) {
-				offset = index - round_down(index, 1 << order);
-				swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-			}
-
+			/* Cached swapin only supports order 0 folio */
 			folio = shmem_swapin_cluster(swap, gfp, info, index);
 			if (!folio) {
 				error = -ENOMEM;
@@ -2384,6 +2382,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			}
 		}
 	}
+
 	if (order > folio_order(folio)) {
 		/*
 		 * Swapin may get smaller folios due to various reasons:
@@ -2393,24 +2392,25 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		 * large swap entries. In such cases, we should split the
 		 * large swap entry to prevent possible data corruption.
 		 */
-		split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
-		if (split_order < 0) {
-			error = split_order;
+		error = shmem_split_large_entry(inode, index, index_entry, gfp);
+		if (error)
 			goto failed_nolock;
-		}
+	}
 
-		/*
-		 * If the large swap entry has already been split, it is
-		 * necessary to recalculate the new swap entry based on
-		 * the old order alignment.
-		 */
-		if (split_order > 0) {
-			offset = index - round_down(index, 1 << split_order);
-			swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
-		}
-	} else if (order < folio_order(folio)) {
-		swap.val = round_down(swap.val, 1 << folio_order(folio));
-		index = round_down(index, 1 << folio_order(folio));
+	/*
+	 * If the folio is large, round down swap and index by folio size.
+	 * No matter what race occurs, the swap layer ensures we either get
+	 * a valid folio that has its swap entry aligned by size, or a
+	 * temporarily invalid one which we'll abort very soon and retry.
+	 *
+	 * shmem_add_to_page_cache ensures the whole range contains expected
+	 * entries and prevents any corruption, so any race split is fine
+	 * too, it will succeed as long as the entries are still there.
+	 */
+	nr_pages = folio_nr_pages(folio);
+	if (nr_pages > 1) {
+		swap.val = round_down(swap.val, nr_pages);
+		index = round_down(index, nr_pages);
 	}
 
 	/*
@@ -2446,8 +2446,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 
-	error = shmem_add_to_page_cache(folio, mapping,
-					round_down(index, nr_pages),
+	error = shmem_add_to_page_cache(folio, mapping, index,
 					swp_to_radix_entry(swap), gfp);
 	if (error)
 		goto failed;
-- 
2.50.1
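As a side note on the round-down rule above: a minimal userspace sketch
(made-up values and a stand-in round_down(), not the kernel code) of
how both the swap value and the index get aligned to the folio once a
large folio comes back from swapin:

#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's round_down(). */
#define round_down(x, y) ((x) & ~((unsigned long)(y) - 1))

int main(void)
{
	unsigned long index = 5;	/* faulting index (made up) */
	unsigned long swap_val = 1029;	/* sub swap entry value (made up) */
	unsigned long nr_pages = 4;	/* swapin returned an order-2 folio */

	/* A folio's swap entry is aligned to its size, so rounding
	 * both values down makes them name the folio's first subpage. */
	if (nr_pages > 1) {
		swap_val = round_down(swap_val, nr_pages);
		index = round_down(index, nr_pages);
	}

	assert(index % nr_pages == 0 && swap_val % nr_pages == 0);
	printf("index=%lu swap_val=%lu\n", index, swap_val); /* 4 1028 */
	return 0;
}

With one aligned (index, swap) pair, a single shmem_add_to_page_cache()
call can then verify the whole range covered by the folio.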
From nobody Sun Oct 5 23:40:43 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi,
 Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org,
 Kairui Song
Subject: [PATCH v6 8/8] mm/shmem, swap: fix major fault counting
Date: Mon, 28 Jul 2025 15:53:06 +0800
Message-ID: <20250728075306.12704-9-ryncsn@gmail.com>
In-Reply-To: <20250728075306.12704-1-ryncsn@gmail.com>
References: <20250728075306.12704-1-ryncsn@gmail.com>

From: Kairui Song

If the swapin failed, don't update the major fault count. A
long-standing comment has questioned whether the stats should only be
updated when swapin succeeds; with the previous cleanups, we can
finally fix it.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index aed5da693855..41eb4aa60be5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2356,13 +2356,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
-		/* Or update major stats only when swapin succeeds?? */
-		if (fault_type) {
-			*fault_type |= VM_FAULT_MAJOR;
-			count_vm_event(PGMAJFAULT);
-			count_memcg_event_mm(fault_mm, PGMAJFAULT);
-		}
-
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			/* Direct swapin skipping swap cache & readahead */
 			folio = shmem_swap_alloc_folio(inode, vma, index,
@@ -2381,6 +2374,11 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			goto failed;
 		}
 	}
+		if (fault_type) {
+			*fault_type |= VM_FAULT_MAJOR;
+			count_vm_event(PGMAJFAULT);
+			count_memcg_event_mm(fault_mm, PGMAJFAULT);
+		}
 	}
 
 	if (order > folio_order(folio)) {
-- 
2.50.1
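For readers skimming the series, the net effect of this last patch can
be sketched in a few lines of plain C (stand-in helpers and types, not
the actual shmem code): the major fault counters are bumped only once
the swapin path has actually produced a folio.

#include <errno.h>
#include <stddef.h>

struct folio { int unused; };

/* Illustrative stand-ins for the real swapin path and counters. */
static struct folio *do_swapin(void) { return NULL; /* pretend failure */ }
static void count_major_fault(void) { /* PGMAJFAULT bump elided */ }

static int swapin_and_count(unsigned int *fault_type)
{
	struct folio *folio = do_swapin();

	if (!folio)
		return -ENOMEM;		/* failed swapin: stats untouched */

	if (fault_type)
		count_major_fault();	/* success: now count the major fault */
	return 0;
}

int main(void)
{
	unsigned int fault_type = 0;
	return swapin_and_count(&fault_type) == -ENOMEM ? 0 : 1;
}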