From: Kairui Song
Date: Tue, 25 Nov 2025 03:13:49 +0800
Subject: [PATCH v3 06/19] mm, swap: free the swap cache after folio is mapped
Message-Id: <20251125-swap-table-p2-v3-6-33f54f707a5c@tencent.com>
References: <20251125-swap-table-p2-v3-0-33f54f707a5c@tencent.com>
In-Reply-To: <20251125-swap-table-p2-v3-0-33f54f707a5c@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Baoquan He, Barry Song, Chris Li, Nhat Pham,
 Yosry Ahmed, David Hildenbrand, Johannes Weiner, Youngjun Park,
 Hugh Dickins, Baolin Wang, Ying Huang, Kemeng Shi, Lorenzo Stoakes,
 "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org, Kairui Song

From: Kairui Song

To reduce repeated faults caused by parallel swapins of the same PTE,
remove the folio from the swap cache only after it is mapped. This way,
new faults on the same swap PTE are much more likely to find the folio
still in the swap cache and wait on its lock.

This does not eliminate all swapin races: an ongoing swapin fault may
still see an empty swap cache. That is harmless, because the PTE is
changed before the swap cache is cleared, so such a fault will simply
return without triggering a repeated fault.
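For clarity, the resulting ordering at the tail of do_swap_page() looks
roughly like the sketch below. This is a simplified reading pieced
together from the hunks in the diff below plus the surrounding upstream
context (the set_ptes() line is from that context), not the literal
code; reference-count setup, error paths and the ksm/swapcache-mismatch
handling are omitted:

	/* Map the folio first, while it is still in the swap cache. */
	set_ptes(vma->vm_mm, address, vmf->pte, pte, nr_pages);
	arch_do_swap_page_nr(vma->vm_mm, vma, address,
			pte, pte, nr_pages);

	/*
	 * Only now release the swap entry and, when we are likely the
	 * exclusive user, the swap cache. A racing fault on the same
	 * PTE either finds the folio in the swap cache and waits on
	 * the folio lock, or finds an already-present PTE and returns.
	 */
	swap_free_nr(entry, nr_pages);
	if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
		folio_free_swap(folio);

	folio_unlock(folio);

This is also why should_try_to_free_swap() grows an extra_refs
parameter: with the folio already mapped, the page tables hold nr_pages
references where the fault path previously held a single one, so the
expected reference count becomes extra_refs + folio_nr_pages(folio).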
Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/memory.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 3f707275d540..ce9f56f77ae5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4362,6 +4362,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
 static inline bool should_try_to_free_swap(struct swap_info_struct *si,
 					   struct folio *folio,
 					   struct vm_area_struct *vma,
+					   unsigned int extra_refs,
 					   unsigned int fault_flags)
 {
 	if (!folio_test_swapcache(folio))
@@ -4384,7 +4385,7 @@ static inline bool should_try_to_free_swap(struct swap_info_struct *si,
 	 * reference only in case it's likely that we'll be the exclusive user.
 	 */
 	return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
-		folio_ref_count(folio) == (1 + folio_nr_pages(folio));
+		folio_ref_count(folio) == (extra_refs + folio_nr_pages(folio));
 }
 
 static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
@@ -4936,15 +4937,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 */
 	arch_swap_restore(folio_swap(entry, folio), folio);
 
-	/*
-	 * Remove the swap entry and conditionally try to free up the swapcache.
-	 * We're already holding a reference on the page but haven't mapped it
-	 * yet.
-	 */
-	swap_free_nr(entry, nr_pages);
-	if (should_try_to_free_swap(si, folio, vma, vmf->flags))
-		folio_free_swap(folio);
-
 	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
 	add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages);
 	pte = mk_pte(page, vma->vm_page_prot);
@@ -4998,6 +4990,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	arch_do_swap_page_nr(vma->vm_mm, vma, address,
 			pte, pte, nr_pages);
 
+	/*
+	 * Remove the swap entry and conditionally try to free up the swapcache.
+	 * Do it after mapping, so raced page faults will likely see the folio
+	 * in swap cache and wait on the folio lock.
+	 */
+	swap_free_nr(entry, nr_pages);
+	if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+		folio_free_swap(folio);
+
 	folio_unlock(folio);
 	if (unlikely(folio != swapcache)) {
 		/*

-- 
2.52.0