From nobody Thu Dec 18 03:32:20 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song, stable@vger.kernel.org
Subject: [PATCH v5 1/8] mm/shmem, swap: improve cached mTHP handling and fix potential hang
Date: Thu, 10 Jul 2025 11:36:59 +0800
Message-ID: <20250710033706.71042-2-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

The current swap-in code assumes that, when a swap entry in the shmem
mapping is order 0, its cached folios (if present) must be order 0 too,
which turns out not to be always correct.

The problem is that shmem_split_large_entry is called before verifying
that the folio will eventually be swapped in. One possible race:

CPU1                                  CPU2
shmem_swapin_folio
/* swap in of order > 0 swap entry S1 */
folio = swap_cache_get_folio
/* folio = NULL */
order = xa_get_order
/* order > 0 */
folio = shmem_swap_alloc_folio
/* mTHP alloc failure, folio = NULL */
<... Interrupted ...>
                                      shmem_swapin_folio
                                      /* S1 is swapped in */
                                      shmem_writeout
                                      /* S1 is swapped out, folio cached */
shmem_split_large_entry(..., S1)
/* S1 is split, but the folio covering it has order > 0 now */

Now any following swapin of S1 will hang: xa_get_order returns 0, while
the folio lookup returns a folio with order > 0. The
`xa_get_order(&mapping->i_pages, index) != folio_order(folio)` check
will always fail, causing swap-in to keep returning -EEXIST. The check
is also fragile.

Fix this by allowing a larger folio to be seen in the swap cache, and
checking that the whole shmem mapping range covered by the swapin holds
the right swap values upon inserting the folio; drop the redundant tree
walks before the insertion. This actually improves performance, as it
avoids two redundant Xarray tree walks in the hot path. The only side
effect is that, in the failure path, shmem may redundantly reallocate a
few folios, causing temporary slight memory pressure.

Worth noting: it may seem that the order and value check before
inserting would help reduce lock contention, but it does not. The swap
cache layer ensures that a raced swapin will either see a swap cache
folio or fail the swapin (the SWAP_HAS_CACHE bit is set even when the
swap cache is bypassed), so holding the folio lock and checking the
folio flags is already good enough for avoiding lock contention. The
chance that a folio passes the swap entry value check while the shmem
mapping slot has changed should be very low.
Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
Signed-off-by: Kairui Song
Reviewed-by: Kemeng Shi
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
Cc:
---
 mm/shmem.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 334b7b4a61a0..e3c9a1365ff4 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -884,7 +884,9 @@ static int shmem_add_to_page_cache(struct folio *folio,
 				   pgoff_t index, void *expected, gfp_t gfp)
 {
 	XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio));
-	long nr = folio_nr_pages(folio);
+	unsigned long nr = folio_nr_pages(folio);
+	swp_entry_t iter, swap;
+	void *entry;
 
 	VM_BUG_ON_FOLIO(index != round_down(index, nr), folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
@@ -896,14 +898,24 @@ static int shmem_add_to_page_cache(struct folio *folio,
 
 	gfp &= GFP_RECLAIM_MASK;
 	folio_throttle_swaprate(folio, gfp);
+	swap = iter = radix_to_swp_entry(expected);
 
 	do {
 		xas_lock_irq(&xas);
-		if (expected != xas_find_conflict(&xas)) {
-			xas_set_err(&xas, -EEXIST);
-			goto unlock;
+		xas_for_each_conflict(&xas, entry) {
+			/*
+			 * The range must either be empty, or filled with
+			 * expected swap entries. Shmem swap entries are never
+			 * partially freed without split of both entry and
+			 * folio, so there shouldn't be any holes.
+			 */
+			if (!expected || entry != swp_to_radix_entry(iter)) {
+				xas_set_err(&xas, -EEXIST);
+				goto unlock;
+			}
+			iter.val += 1 << xas_get_order(&xas);
 		}
-		if (expected && xas_find_conflict(&xas)) {
+		if (expected && iter.val - nr != swap.val) {
 			xas_set_err(&xas, -EEXIST);
 			goto unlock;
 		}
@@ -2323,7 +2335,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			error = -ENOMEM;
 			goto failed;
 		}
-	} else if (order != folio_order(folio)) {
+	} else if (order > folio_order(folio)) {
 		/*
 		 * Swap readahead may swap in order 0 folios into swapcache
 		 * asynchronously, while the shmem mapping can still stores
@@ -2348,15 +2360,15 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
 		}
+	} else if (order < folio_order(folio)) {
+		swap.val = round_down(swap.val, 1 << folio_order(folio));
 	}
 
 alloced:
 	/* We have to do this with folio locked to prevent races */
 	folio_lock(folio);
 	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
-	    folio->swap.val != swap.val ||
-	    !shmem_confirm_swap(mapping, index, swap) ||
-	    xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
+	    folio->swap.val != swap.val) {
 		error = -EEXIST;
 		goto unlock;
 	}
-- 
2.50.0
From nobody Thu Dec 18 03:32:20 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song, Dev Jain
Subject: [PATCH v5 2/8] mm/shmem, swap: avoid redundant Xarray lookup during swapin
Date: Thu, 10 Jul 2025 11:37:00 +0800
Message-ID: <20250710033706.71042-3-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

Currently shmem calls xa_get_order to get the swap radix entry order,
requiring a full tree walk. This can easily be combined with the swap
entry value check (shmem_confirm_swap) to avoid the duplicated lookup,
which should improve performance.

Signed-off-by: Kairui Song
Reviewed-by: Kemeng Shi
Reviewed-by: Dev Jain
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e3c9a1365ff4..85ecc6709b5f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -505,15 +505,27 @@ static int shmem_replace_entry(struct address_space *mapping,
 
 /*
  * Sometimes, before we decide whether to proceed or to fail, we must check
- * that an entry was not already brought back from swap by a racing thread.
+ * that an entry was not already brought back or split by a racing thread.
  *
  * Checking folio is not enough: by the time a swapcache folio is locked, it
  * might be reused, and again be swapcache, using the same swap as before.
+ * Returns the swap entry's order if it is still present, else returns -1.
  */
-static bool shmem_confirm_swap(struct address_space *mapping,
-			       pgoff_t index, swp_entry_t swap)
+static int shmem_confirm_swap(struct address_space *mapping, pgoff_t index,
+			      swp_entry_t swap)
 {
-	return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
+	XA_STATE(xas, &mapping->i_pages, index);
+	int ret = -1;
+	void *entry;
+
+	rcu_read_lock();
+	do {
+		entry = xas_load(&xas);
+		if (entry == swp_to_radix_entry(swap))
+			ret = xas_get_order(&xas);
+	} while (xas_retry(&xas, entry));
+	rcu_read_unlock();
+	return ret;
 }
 
 /*
@@ -2256,16 +2268,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EIO;
 
 	si = get_swap_device(swap);
-	if (!si) {
-		if (!shmem_confirm_swap(mapping, index, swap))
+	order = shmem_confirm_swap(mapping, index, swap);
+	if (unlikely(!si)) {
+		if (order < 0)
 			return -EEXIST;
 		else
 			return -EINVAL;
 	}
+	if (unlikely(order < 0)) {
+		put_swap_device(si);
+		return -EEXIST;
+	}
 
 	/* Look it up and read it in..
 	 */
 	folio = swap_cache_get_folio(swap, NULL, 0);
-	order = xa_get_order(&mapping->i_pages, index);
 	if (!folio) {
 		int nr_pages = 1 << order;
 		bool fallback_order0 = false;
 
@@ -2415,7 +2431,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	*foliop = folio;
 	return 0;
 failed:
-	if (!shmem_confirm_swap(mapping, index, swap))
+	if (shmem_confirm_swap(mapping, index, swap) < 0)
 		error = -EEXIST;
 	if (error == -EIO)
 		shmem_set_folio_swapin_error(inode, index, folio, swap,
-- 
2.50.0

From nobody Thu Dec 18 03:32:20 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 3/8] mm/shmem, swap: tidy up THP swapin checks
Date: Thu, 10 Jul 2025 11:37:01 +0800
Message-ID: <20250710033706.71042-4-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

Move all THP swapin related checks under CONFIG_TRANSPARENT_HUGEPAGE,
so they will be trimmed off by the compiler if not needed.

Also add a WARN if shmem sees an order > 0 entry when
CONFIG_TRANSPARENT_HUGEPAGE is disabled; that should never happen unless
things went very wrong.
There should be no observable feature change except the newly added WARN.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 39 ++++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 85ecc6709b5f..d8c872ab3570 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1980,26 +1980,38 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 		swp_entry_t entry, int order, gfp_t gfp)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	int nr_pages = 1 << order;
 	struct folio *new;
 	void *shadow;
-	int nr_pages;
 
 	/*
 	 * We have arrived here because our zones are constrained, so don't
 	 * limit chance of success with further cpuset and node constraints.
 	 */
 	gfp &= ~GFP_CONSTRAINT_MASK;
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && order > 0) {
-		gfp_t huge_gfp = vma_thp_gfp_mask(vma);
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+		if (WARN_ON_ONCE(order))
+			return ERR_PTR(-EINVAL);
+	} else if (order) {
+		/*
+		 * If uffd is active for the vma, we need per-page fault
+		 * fidelity to maintain the uffd semantics, then fallback
+		 * to swapin order-0 folio, as well as for zswap case.
+		 * Any existing sub folio in the swap cache also blocks
+		 * mTHP swapin.
+		 */
+		if ((vma && unlikely(userfaultfd_armed(vma))) ||
+		    !zswap_never_enabled() ||
+		    non_swapcache_batch(entry, nr_pages) != nr_pages)
+			return ERR_PTR(-EINVAL);
 
-		gfp = limit_gfp_mask(huge_gfp, gfp);
+		gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
 	}
 
 	new = shmem_alloc_folio(gfp, order, info, index);
 	if (!new)
 		return ERR_PTR(-ENOMEM);
 
-	nr_pages = folio_nr_pages(new);
 	if (mem_cgroup_swapin_charge_folio(new, vma ? vma->vm_mm : NULL,
 					   gfp, entry)) {
 		folio_put(new);
@@ -2283,9 +2295,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	/* Look it up and read it in..
 	 */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
-		int nr_pages = 1 << order;
-		bool fallback_order0 = false;
-
 		/* Or update major stats only when swapin succeeds?? */
 		if (fault_type) {
 			*fault_type |= VM_FAULT_MAJOR;
@@ -2293,20 +2302,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			count_memcg_event_mm(fault_mm, PGMAJFAULT);
 		}
 
-		/*
-		 * If uffd is active for the vma, we need per-page fault
-		 * fidelity to maintain the uffd semantics, then fallback
-		 * to swapin order-0 folio, as well as for zswap case.
-		 * Any existing sub folio in the swap cache also blocks
-		 * mTHP swapin.
-		 */
-		if (order > 0 && ((vma && unlikely(userfaultfd_armed(vma))) ||
-		    !zswap_never_enabled() ||
-		    non_swapcache_batch(swap, nr_pages) != nr_pages))
-			fallback_order0 = true;
-
 		/* Skip swapcache for synchronous device. */
-		if (!fallback_order0 && data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			folio = shmem_swap_alloc_folio(inode, vma, index,
 						       swap, order, gfp);
 			if (!IS_ERR(folio)) {
 				skip_swapcache = true;
-- 
2.50.0

From nobody Thu Dec 18 03:32:20 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 4/8] mm/shmem, swap: tidy up swap entry splitting
Date: Thu, 10 Jul 2025 11:37:02 +0800
Message-ID: <20250710033706.71042-5-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

Instead of keeping different paths of splitting the entry before the
swap in starts, move the entry splitting to after swapin has put the
folio in the swap cache (or set the SWAP_HAS_CACHE bit). This way we
only need one place and one unified way to split the large entry.
Whenever swapin brings in a folio smaller than the shmem swap entry,
split the entry and recalculate the entry and index for verification.

This removes duplicated code and function calls, reduces LOC, and makes
the split less racy, as it is now guarded by the swap cache, so it has
a lower chance of repeated faults due to a raced split. The compiler is
also able to optimize the code further; bloat-o-meter results with
GCC 14:

With DEBUG_SECTION_MISMATCH (-fno-inline-functions-called-once):
./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-82 (-82)
Function                                     old     new   delta
shmem_swapin_folio                          2361    2279     -82
Total: Before=33151, After=33069, chg -0.25%

With !DEBUG_SECTION_MISMATCH:
./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/1 grow/shrink: 1/0 up/down: 949/-750 (199)
Function                                     old     new   delta
shmem_swapin_folio                          2878    3827    +949
shmem_split_large_entry.isra                 750       -    -750
Total: Before=33086, After=33285, chg +0.60%

Since shmem_split_large_entry is only called in one place now, the
compiler will either generate more compact code or inline it for better
performance.
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
 mm/shmem.c | 56 ++++++++++++++++++++++--------------------------------
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index d8c872ab3570..97db1097f7de 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2266,14 +2266,16 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct address_space *mapping = inode->i_mapping;
 	struct mm_struct *fault_mm = vma ? vma->vm_mm : NULL;
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	swp_entry_t swap, index_entry;
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
-	swp_entry_t swap;
 	int error, nr_pages, order, split_order;
+	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
-	swap = radix_to_swp_entry(*foliop);
+	index_entry = radix_to_swp_entry(*foliop);
+	swap = index_entry;
 	*foliop = NULL;
 
 	if (is_poisoned_swp_entry(swap))
@@ -2321,46 +2323,35 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		}
 
 		/*
-		 * Now swap device can only swap in order 0 folio, then we
-		 * should split the large swap entry stored in the pagecache
-		 * if necessary.
-		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
-		if (split_order < 0) {
-			error = split_order;
-			goto failed;
-		}
-
-		/*
-		 * If the large swap entry has already been split, it is
+		 * Now swap device can only swap in order 0 folio, it is
 		 * necessary to recalculate the new swap entry based on
-		 * the old order alignment.
+		 * the offset, as the swapin index might be unalgined.
 		 */
-		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
+		if (order) {
+			offset = index - round_down(index, 1 << order);
 			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
 		}
 
-		/* Here we actually start the io */
 		folio = shmem_swapin_cluster(swap, gfp, info, index);
 		if (!folio) {
 			error = -ENOMEM;
 			goto failed;
 		}
-	} else if (order > folio_order(folio)) {
+	}
+alloced:
+	if (order > folio_order(folio)) {
 		/*
-		 * Swap readahead may swap in order 0 folios into swapcache
+		 * Swapin may get smaller folios due to various reasons:
+		 * It may fallback to order 0 due to memory pressure or race,
+		 * swap readahead may swap in order 0 folios into swapcache
 		 * asynchronously, while the shmem mapping can still stores
 		 * large swap entries. In such cases, we should split the
 		 * large swap entry to prevent possible data corruption.
 		 */
-		split_order = shmem_split_large_entry(inode, index, swap, gfp);
+		split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
 		if (split_order < 0) {
-			folio_put(folio);
-			folio = NULL;
 			error = split_order;
-			goto failed;
+			goto failed_nolock;
 		}
 
 		/*
@@ -2369,15 +2360,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		 * the old order alignment.
 		 */
 		if (split_order > 0) {
-			pgoff_t offset = index - round_down(index, 1 << split_order);
-
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+			offset = index - round_down(index, 1 << split_order);
+			swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
 		}
 	} else if (order < folio_order(folio)) {
 		swap.val = round_down(swap.val, 1 << folio_order(folio));
 	}
 
-alloced:
 	/* We have to do this with folio locked to prevent races */
 	folio_lock(folio);
 	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
@@ -2434,12 +2423,13 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	shmem_set_folio_swapin_error(inode, index, folio, swap,
 				     skip_swapcache);
 unlock:
-	if (skip_swapcache)
-		swapcache_clear(si, swap, folio_nr_pages(folio));
-	if (folio) {
+	if (folio)
 		folio_unlock(folio);
+failed_nolock:
+	if (skip_swapcache)
+		swapcache_clear(si, folio->swap, folio_nr_pages(folio));
+	if (folio)
 		folio_put(folio);
-	}
 	put_swap_device(si);
 
 	return error;
-- 
2.50.0

From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 5/8] mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
Date: Thu, 10 Jul 2025 11:37:03 +0800
Message-ID: <20250710033706.71042-6-ryncsn@gmail.com>
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>
From: Kairui Song

For SWP_SYNCHRONOUS_IO devices, if a cache-bypassing THP swapin fails
due to reasons like memory pressure, a partially conflicting swap
cache, or ZSWAP being enabled, shmem falls back to cached order 0
swapin.

Right now the swap cache still has a non-trivial overhead, and
readahead is not helpful for SWP_SYNCHRONOUS_IO devices, so we should
always skip the readahead and swap cache even if the swapin falls back
to order 0. So handle the fallback logic without resorting to the
cached read.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 97db1097f7de..847e6f128485 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1982,6 +1982,7 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	int nr_pages = 1 << order;
 	struct folio *new;
+	gfp_t alloc_gfp;
 	void *shadow;
 
 	/*
@@ -1989,6 +1990,7 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 	 * limit chance of success with further cpuset and node constraints.
 	 */
 	gfp &= ~GFP_CONSTRAINT_MASK;
+	alloc_gfp = gfp;
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
 		if (WARN_ON_ONCE(order))
 			return ERR_PTR(-EINVAL);
@@ -2003,19 +2005,22 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 		if ((vma && unlikely(userfaultfd_armed(vma))) ||
 		    !zswap_never_enabled() ||
 		    non_swapcache_batch(entry, nr_pages) != nr_pages)
-			return ERR_PTR(-EINVAL);
+			goto fallback;
 
-		gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
+		alloc_gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);
+	}
+retry:
+	new = shmem_alloc_folio(alloc_gfp, order, info, index);
+	if (!new) {
+		new = ERR_PTR(-ENOMEM);
+		goto fallback;
 	}
-
-	new = shmem_alloc_folio(gfp, order, info, index);
-	if (!new)
-		return ERR_PTR(-ENOMEM);
 
 	if (mem_cgroup_swapin_charge_folio(new, vma ? vma->vm_mm : NULL,
-					   gfp, entry)) {
+					   alloc_gfp, entry)) {
 		folio_put(new);
-		return ERR_PTR(-ENOMEM);
+		new = ERR_PTR(-ENOMEM);
+		goto fallback;
 	}
 
 	/*
@@ -2030,7 +2035,9 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 	 */
 	if (swapcache_prepare(entry, nr_pages)) {
 		folio_put(new);
-		return ERR_PTR(-EEXIST);
+		new = ERR_PTR(-EEXIST);
+		/* Try smaller folio to avoid cache conflict */
+		goto fallback;
 	}
 
 	__folio_set_locked(new);
@@ -2044,6 +2051,15 @@ static struct folio *shmem_swap_alloc_folio(struct inode *inode,
 	folio_add_lru(new);
 	swap_read_folio(new, NULL);
 	return new;
+fallback:
+	/* Order 0 swapin failed, nothing to fallback to, abort */
+	if (!order)
+		return new;
+	entry.val += index - round_down(index, nr_pages);
+	alloc_gfp = gfp;
+	nr_pages = 1;
+	order = 0;
+	goto retry;
 }
 
 /*
@@ -2313,13 +2329,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		}
 
 		/*
-		 * Fallback to swapin order-0 folio unless the swap entry
-		 * already exists.
+		 * Direct swapin handled order 0 fallback already,
+		 * if it failed, abort.
 		 */
 		error = PTR_ERR(folio);
 		folio = NULL;
-		if (error == -EEXIST)
-			goto failed;
+		goto failed;
 	}
 
 	/*
-- 
2.50.0
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 6/8] mm/shmem, swap: simplify swapin path and result handling
Date: Thu, 10 Jul 2025 11:37:04 +0800
Message-ID: <20250710033706.71042-7-ryncsn@gmail.com>
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

Slightly tidy up the different swapin paths and their error handling
for SWP_SYNCHRONOUS_IO and non-SWP_SYNCHRONOUS_IO devices. Swapin now
always uses either shmem_swap_alloc_folio or shmem_swapin_cluster, then
checks the result. This simplifies the control flow and avoids a
redundant goto label.
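The shape of the resulting control flow can be pictured with a small Python sketch. This is a hypothetical model of the idea only (the names `swapin_folio`, `alloc_folio`, and `swapin_cluster` are invented here, not the kernel's): both paths produce a folio-or-failure, which is checked once instead of through two separate error paths:

```python
def swapin_folio(synchronous, alloc_folio, swapin_cluster):
    """Pick one of the two swapin paths, then funnel both into a
    single result check (replacing the redundant goto label)."""
    folio = alloc_folio() if synchronous else swapin_cluster()
    if folio is None:
        raise MemoryError("swapin failed")  # one failure path for both
    return folio

assert swapin_folio(True, lambda: "direct", lambda: "cached") == "direct"
assert swapin_folio(False, lambda: "direct", lambda: "cached") == "cached"
```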
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 45 +++++++++++++++++++--------------------------
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 847e6f128485..80f5b8c73eb8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2320,40 +2320,33 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			count_memcg_event_mm(fault_mm, PGMAJFAULT);
 		}
 
-		/* Skip swapcache for synchronous device. */
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
+			/* Direct mTHP swapin skipping swap cache & readhaed */
 			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
-			if (!IS_ERR(folio)) {
-				skip_swapcache = true;
-				goto alloced;
+			if (IS_ERR(folio)) {
+				error = PTR_ERR(folio);
+				folio = NULL;
+				goto failed;
 			}
-
+			skip_swapcache = true;
+		} else {
 			/*
-			 * Direct swapin handled order 0 fallback already,
-			 * if it failed, abort.
+			 * Cached swapin only supports order 0 folio, it is
+			 * necessary to recalculate the new swap entry based on
+			 * the offset, as the swapin index might be unalgined.
 			 */
-			error = PTR_ERR(folio);
-			folio = NULL;
-			goto failed;
-		}
-
-		/*
-		 * Now swap device can only swap in order 0 folio, it is
-		 * necessary to recalculate the new swap entry based on
-		 * the offset, as the swapin index might be unalgined.
-		 */
-		if (order) {
-			offset = index - round_down(index, 1 << order);
-			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-		}
+			if (order) {
+				offset = index - round_down(index, 1 << order);
+				swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+			}
 
-		folio = shmem_swapin_cluster(swap, gfp, info, index);
-		if (!folio) {
-			error = -ENOMEM;
-			goto failed;
+			folio = shmem_swapin_cluster(swap, gfp, info, index);
+			if (!folio) {
+				error = -ENOMEM;
+				goto failed;
+			}
 		}
 	}
-alloced:
 	if (order > folio_order(folio)) {
 		/*
 		 * Swapin may get smaller folios due to various reasons:
-- 
2.50.0
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 7/8] mm/shmem, swap: rework swap entry and index calculation for large swapin
Date: Thu, 10 Jul 2025 11:37:05 +0800
Message-ID: <20250710033706.71042-8-ryncsn@gmail.com>
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>

From: Kairui Song

Instead of calculating the swap entry differently in different swapin
paths, calculate it early, before the swap cache lookup, and use that
value for both the lookup and the later swapin.
And after swapin has brought in a folio, simply round the entry down to
the size of the folio. This is simple and effective enough to verify
the swap value: a folio's swap entry is always aligned to its size. Any
kind of parallel split or race is acceptable, because the final
shmem_add_to_page_cache ensures that all entries covered by the folio
are correct, and thus there will be no data corruption.

This also prevents false positive cache lookups. If a shmem read
request's index points to the middle of a large swap entry, previously
shmem would try the swap cache lookup using the large swap entry's
starting value (the first sub swap entry of this large entry). This
leads to false positive lookup results if only the first few swap
entries are cached but the actual requested swap entry pointed to by
the index is uncached. This is not a rare event, as swap readahead
always tries to cache order 0 folios when possible.

And this shouldn't cause any increase in repeated faults. No matter how
the shmem mapping is split in parallel, as long as the mapping still
contains the right entries, the swapin will succeed.

The final object size and stack usage are also reduced due to the
simplified code:

./scripts/bloat-o-meter mm/shmem.o.old mm/shmem.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-233 (-233)
Function                                     old     new   delta
shmem_swapin_folio                          4040    3807    -233
Total: Before=33152, After=32919, chg -0.70%

Stack usage (Before vs After):
mm/shmem.c:2277:12:shmem_swapin_folio   264     static
mm/shmem.c:2277:12:shmem_swapin_folio   256     static

And while at it, round down the index too if the swap entry is rounded
down. The index is used either for folio reallocation or for confirming
the mapping content. In either case, it should be aligned with the swap
folio.
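The rounding described above can be sketched in a few lines of Python. This is an illustrative model under the stated invariant (a folio's swap entry is aligned to its size); `align_to_folio` is an invented name, not a kernel function:

```python
def round_down(x, n):
    return x - (x % n)

def align_to_folio(swap_val, index, nr_pages):
    """Round both the swap entry value and the mapping index down to the
    folio boundary, so they match the folio that spans nr_pages slots."""
    if nr_pages > 1:
        swap_val = round_down(swap_val, nr_pages)
        index = round_down(index, nr_pages)
    return swap_val, index

# A 4-page folio found via swap slot 103 / index 7 actually covers
# slots 100..103 and indexes 4..7, so both round down to the start.
assert align_to_folio(103, 7, 4) == (100, 4)
```

An order-0 (single-page) folio is left untouched, matching the `nr_pages > 1` guard in the patch.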
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
---
 mm/shmem.c | 66 ++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 80f5b8c73eb8..9c50607ac455 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2265,7 +2265,7 @@ static int shmem_split_large_entry(struct inode *inode, pgoff_t index,
 	if (xas_error(&xas))
 		return xas_error(&xas);
 
-	return entry_order;
+	return 0;
 }
 
 /*
@@ -2286,7 +2286,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	struct swap_info_struct *si;
 	struct folio *folio = NULL;
 	bool skip_swapcache = false;
-	int error, nr_pages, order, split_order;
+	int error, nr_pages, order;
 	pgoff_t offset;
 
 	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
@@ -2294,11 +2294,11 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	swap = index_entry;
 	*foliop = NULL;
 
-	if (is_poisoned_swp_entry(swap))
+	if (is_poisoned_swp_entry(index_entry))
 		return -EIO;
 
-	si = get_swap_device(swap);
-	order = shmem_confirm_swap(mapping, index, swap);
+	si = get_swap_device(index_entry);
+	order = shmem_confirm_swap(mapping, index, index_entry);
 	if (unlikely(!si)) {
 		if (order < 0)
 			return -EEXIST;
@@ -2310,6 +2310,12 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EEXIST;
 	}
 
+	/* index may point to the middle of a large entry, get the sub entry */
+	if (order) {
+		offset = index - round_down(index, 1 << order);
+		swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+	}
+
 	/* Look it up and read it in..
 	 */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
@@ -2322,7 +2328,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			/* Direct mTHP swapin skipping swap cache & readhaed */
-			folio = shmem_swap_alloc_folio(inode, vma, index, swap, order, gfp);
+			folio = shmem_swap_alloc_folio(inode, vma, index,
+						       index_entry, order, gfp);
 			if (IS_ERR(folio)) {
 				error = PTR_ERR(folio);
 				folio = NULL;
@@ -2330,16 +2337,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 			}
 			skip_swapcache = true;
 		} else {
-			/*
-			 * Cached swapin only supports order 0 folio, it is
-			 * necessary to recalculate the new swap entry based on
-			 * the offset, as the swapin index might be unalgined.
-			 */
-			if (order) {
-				offset = index - round_down(index, 1 << order);
-				swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
-			}
-
+			/* Cached swapin only supports order 0 folio */
 			folio = shmem_swapin_cluster(swap, gfp, info, index);
 			if (!folio) {
 				error = -ENOMEM;
@@ -2356,23 +2354,25 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		 * large swap entries. In such cases, we should split the
 		 * large swap entry to prevent possible data corruption.
 		 */
-		split_order = shmem_split_large_entry(inode, index, index_entry, gfp);
-		if (split_order < 0) {
-			error = split_order;
+		error = shmem_split_large_entry(inode, index, index_entry, gfp);
+		if (error)
 			goto failed_nolock;
-		}
+	}
 
-		/*
-		 * If the large swap entry has already been split, it is
-		 * necessary to recalculate the new swap entry based on
-		 * the old order alignment.
-		 */
-		if (split_order > 0) {
-			offset = index - round_down(index, 1 << split_order);
-			swap = swp_entry(swp_type(swap), swp_offset(index_entry) + offset);
-		}
-	} else if (order < folio_order(folio)) {
-		swap.val = round_down(swap.val, 1 << folio_order(folio));
+	/*
+	 * If the folio is large, round down swap and index by folio size.
+	 * No matter what race occurs, the swap layer ensures we either get
+	 * a valid folio that has its swap entry aligned by size, or a
+	 * temporarily invalid one which we'll abort very soon and retry.
+	 *
+	 * shmem_add_to_page_cache ensures the whole range contains expected
+	 * entries and prevents any corruption, so any race split is fine
+	 * too, it will succeed as long as the entries are still there.
+	 */
+	nr_pages = folio_nr_pages(folio);
+	if (nr_pages > 1) {
+		swap.val = round_down(swap.val, nr_pages);
+		index = round_down(index, nr_pages);
 	}
 
 	/* We have to do this with folio locked to prevent races */
@@ -2387,7 +2387,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 	folio_wait_writeback(folio);
-	nr_pages = folio_nr_pages(folio);
 
 	/*
 	 * Some architectures may have to restore extra metadata to the
@@ -2401,8 +2400,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		goto failed;
 	}
 
-	error = shmem_add_to_page_cache(folio, mapping,
-					round_down(index, nr_pages),
+	error = shmem_add_to_page_cache(folio, mapping, index,
 					swp_to_radix_entry(swap), gfp);
 	if (error)
 		goto failed;
-- 
2.50.0
From nobody Thu Dec 18 03:32:20 2025
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Hugh Dickins, Baolin Wang, Matthew Wilcox, Kemeng Shi, Chris Li, Nhat Pham, Baoquan He, Barry Song, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v5 8/8] mm/shmem, swap: fix major fault counting
Date: Thu, 10 Jul 2025 11:37:06 +0800
Message-ID: <20250710033706.71042-9-ryncsn@gmail.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20250710033706.71042-1-ryncsn@gmail.com>
References: <20250710033706.71042-1-ryncsn@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Kairui Song

If the swapin failed, don't update the major fault count. There is a
long-standing comment questioning whether it should be done this way;
now, with the previous cleanups in place, we can finally fix it.

Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/shmem.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 9c50607ac455..f97c4e9f821d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2319,13 +2319,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
 	if (!folio) {
-		/* Or update major stats only when swapin succeeds?? */
-		if (fault_type) {
-			*fault_type |= VM_FAULT_MAJOR;
-			count_vm_event(PGMAJFAULT);
-			count_memcg_event_mm(fault_mm, PGMAJFAULT);
-		}
-
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
 			/* Direct mTHP swapin skipping swap cache & readahead */
 			folio = shmem_swap_alloc_folio(inode, vma, index,
@@ -2344,6 +2337,11 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 				goto failed;
 			}
 		}
+		if (fault_type) {
+			*fault_type |= VM_FAULT_MAJOR;
+			count_vm_event(PGMAJFAULT);
+			count_memcg_event_mm(fault_mm, PGMAJFAULT);
+		}
 	}
 	if (order > folio_order(folio)) {
 		/*
-- 
2.50.0