From: Kairui Song <kasong@tencent.com>
Date: Fri, 05 Dec 2025 03:29:14 +0800
Subject: [PATCH v4 06/19] mm, swap: free the swap cache after folio is mapped
Message-Id: <20251205-swap-table-p2-v4-6-cb7e28a26a40@tencent.com>
References: <20251205-swap-table-p2-v4-0-cb7e28a26a40@tencent.com>
In-Reply-To: <20251205-swap-table-p2-v4-0-cb7e28a26a40@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Baoquan He, Barry Song, Chris Li, Nhat Pham,
    Yosry Ahmed, David Hildenbrand, Johannes Weiner, Youngjun Park,
    Hugh Dickins, Baolin Wang, Ying Huang, Kemeng Shi, Lorenzo Stoakes,
    "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org,
    Kairui Song <kasong@tencent.com>
X-Mailer: b4 0.14.3

From: Kairui Song <kasong@tencent.com>

To reduce repeated faults due to parallel swapins of the same PTE, remove
the folio from the swap cache only after it is mapped, so new faults on
the same swap PTE are much more likely to find the folio still in the swap
cache and wait on its lock.

This does not eliminate all swapin races: an ongoing swapin fault may
still see an empty swap cache. That is harmless, as the PTE is changed
before the swap cache is cleared, so such a fault will fail the
pte_same() recheck and return without triggering any repeated faults.

Since should_try_to_free_swap() now runs after the folio is mapped, it
takes an extra_refs argument so the caller can account for the references
held by the new page table mappings in the exclusive-user check.

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Baoquan He
---
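Reviewer note, not part of the commit message: below is a minimal sketch
of the ordering argument. It is hypothetical userspace code, not kernel
code; fault_state and racer_outcome() are invented stand-ins for the swap
PTE state, the swap cache lookup, and the pte_same() recheck in
do_swap_page(), used only to show what a racing fault can observe in the
window between the two steps under each ordering.

/*
 * Toy model of the race window, NOT kernel code. The two booleans and
 * racer_outcome() are invented stand-ins; they only illustrate which
 * intermediate state a second fault on the same swap PTE can see.
 */
#include <stdbool.h>
#include <stdio.h>

struct fault_state {
	bool pte_present;   /* swap PTE already replaced with a real PTE */
	bool in_swap_cache; /* folio still reachable via the swap cache */
};

/* What a racing fault on the same (former) swap PTE would do. */
static const char *racer_outcome(struct fault_state s)
{
	if (s.in_swap_cache)
		return "folio found: wait on the folio lock, then the "
		       "pte_same() recheck sees the new PTE and we return";
	if (s.pte_present)
		return "swap cache miss, but the pte_same() recheck "
		       "fails: return without repeating any work";
	return "swap cache miss and the swap PTE is unchanged: "
	       "the whole swapin is repeated";
}

int main(void)
{
	/* Window with the old order (free swap cache, then map): */
	struct fault_state old_win = { false, false };
	/* Window with the new order (map, then free swap cache): */
	struct fault_state new_win = { true, true };
	/* State once either sequence has fully completed: */
	struct fault_state done = { true, false };

	printf("old-order window: %s\n", racer_outcome(old_win));
	printf("new-order window: %s\n", racer_outcome(new_win));
	printf("after completion: %s\n", racer_outcome(done));
	return 0;
}

Only the old-order window forces a repeated swapin; the reordering
replaces it with a window where the racer either waits on the folio lock
or bails out on the pte_same() recheck.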
 mm/memory.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 3f707275d540..ce9f56f77ae5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4362,6 +4362,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
 static inline bool should_try_to_free_swap(struct swap_info_struct *si,
 					   struct folio *folio,
 					   struct vm_area_struct *vma,
+					   unsigned int extra_refs,
 					   unsigned int fault_flags)
 {
 	if (!folio_test_swapcache(folio))
@@ -4384,7 +4385,7 @@ static inline bool should_try_to_free_swap(struct swap_info_struct *si,
 	 * reference only in case it's likely that we'll be the exclusive user.
 	 */
 	return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
-		folio_ref_count(folio) == (1 + folio_nr_pages(folio));
+		folio_ref_count(folio) == (extra_refs + folio_nr_pages(folio));
 }
 
 static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
@@ -4936,15 +4937,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 */
 	arch_swap_restore(folio_swap(entry, folio), folio);
 
-	/*
-	 * Remove the swap entry and conditionally try to free up the swapcache.
-	 * We're already holding a reference on the page but haven't mapped it
-	 * yet.
-	 */
-	swap_free_nr(entry, nr_pages);
-	if (should_try_to_free_swap(si, folio, vma, vmf->flags))
-		folio_free_swap(folio);
-
 	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
 	add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages);
 	pte = mk_pte(page, vma->vm_page_prot);
@@ -4998,6 +4990,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	arch_do_swap_page_nr(vma->vm_mm, vma, address,
 			pte, pte, nr_pages);
 
+	/*
+	 * Remove the swap entry and conditionally try to free up the swapcache.
+	 * Do it after mapping, so raced page faults will likely see the folio
+	 * in swap cache and wait on the folio lock.
+	 */
+	swap_free_nr(entry, nr_pages);
+	if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+		folio_free_swap(folio);
+
 	folio_unlock(folio);
 	if (unlikely(folio != swapcache)) {
 		/*

-- 
2.52.0