From nobody Thu Oct 2 03:27:34 2025 Received: from fout-a7-smtp.messagingengine.com (fout-a7-smtp.messagingengine.com [103.168.172.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E52A3302147 for ; Tue, 23 Sep 2025 11:03:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.150 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625401; cv=none; b=uPbsUvm/rw7ePdSYlT9fqX64fLc570o5XIj1kShNoBUR3nf72NqrgexIOnCpa2TBf6YdGllJ4ihEDKeSq7Gq7+SWM379BjRf1FMCCQVBTM6mxoRffpcw8xB3uNYa5odHAG0hVmQnstOss4cLwzqfJZODqJzIhkXoBqG5htwpv1k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625401; c=relaxed/simple; bh=7QCXYWOHwiU+p9/s4J6qDPLNsZQ85NJ6/lMLx1KnvP4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aQzgglBNrHueo+4jCnoIFe4PG6ZdC1doHqFfQG0rlSCvOswdYJKYLcsn0iBw/UQ3WAFAmiLTlYpXunirqyXRdY9WPXCUnZCWnDZk4oOpCQDfO7G71Ac19WADxH6WzY1WpKQcnAlUGMreeK89FqSrMnL8mIg2BA92GgG29sK7/vw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name; spf=pass smtp.mailfrom=shutemov.name; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b=AujJnepm; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=gBi3iRA2; arc=none smtp.client-ip=103.168.172.150 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shutemov.name Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b="AujJnepm"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="gBi3iRA2" Received: from phl-compute-01.internal 
(phl-compute-01.internal [10.202.2.41]) by mailfout.phl.internal (Postfix) with ESMTP id DA7A3EC00A0; Tue, 23 Sep 2025 07:03:18 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-01.internal (MEProxy); Tue, 23 Sep 2025 07:03:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov.name; h=cc:cc:content-transfer-encoding:content-type:date:date:from :from:in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to; s=fm3; t=1758625398; x= 1758711798; bh=9h5a9glrsc1Q9FewltBTBylrIxsnIQry21VQ9ojF/aI=; b=A ujJnepmyPZhX+WU2im5VqmQB3MvkTLXHej6jJZyzkNWzDWqCFIyVSqoAGpSMAboq +b/122qzuZjnXko4oM/kCcXjJg14d1MLno35d5juDolhe13UfmmeHhiKC9rYmeCS M/kFWQUxZvOuVThjgjJ4lW3bBxuFGd2Uz41duKqu31Sij7XXs2XFzehKo65XRSZS dqzL9n//3JFY9Hka9eHl4kv5rC5m8jm7oCXvu/Ts6Y+a636NULVzhcBIoph2mRcH 7yf99Nt+Ur1WAkArYEems7E5zX00MaJhlxjUclyJT08c46yXcKtlNjKaDqlajZWp dWklv5pxm2ETssA8/JMSg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; t=1758625398; x=1758711798; bh=9 h5a9glrsc1Q9FewltBTBylrIxsnIQry21VQ9ojF/aI=; b=gBi3iRA2F5JjgzmV+ BbK3yEUYy5dVuMOaEa3mlJoVNS12UA9rh2MueCMl1WmzTCjNFt+McWKcbUJfSKNa v02KmchecKc2952pY4HUFLGYscXlbISs1v1pwCsknGlBf2XJ/iOP9QLthZL8cyPg slKOMCzS5kOd/5slgK1S4xJHSoMdoHunlQGB07mQQSOb3eG6CKdAieOgkubfEyvb 3v1W43vpahyJVPkQcLS3+xUFbnpc6p6nwyg2XKbu013ks/dlfMtmT4YfTgnElYmJ yltiLz72DPp78PrQ4ntMvtfApoVMWushhaT6knITFRgN/lYSJdgrLc3uGMM+WgE0 6E8EQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggdeitdehiecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug 
hrpefhvfevufffkffojghfggfgsedtkeertdertddtnecuhfhrohhmpefmihhrhihlucfu hhhuthhsvghmrghuuceokhhirhhilhhlsehshhhuthgvmhhovhdrnhgrmhgvqeenucggtf frrghtthgvrhhnpeegveehtdfgvdfhudegffeuuddvgeevjefhveevgefhvdevieevteei vdehjefhjeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhroh hmpehkihhrihhllhesshhhuhhtvghmohhvrdhnrghmvgdpnhgspghrtghpthhtohepudek pdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegrkhhpmheslhhinhhugidqfhhouh hnuggrthhiohhnrdhorhhgpdhrtghpthhtohepuggrvhhiugesrhgvughhrghtrdgtohhm pdhrtghpthhtohephhhughhhugesghhoohhglhgvrdgtohhmpdhrtghpthhtohepfihilh hlhiesihhnfhhrrgguvggrugdrohhrghdprhgtphhtthhopehlohhrvghniihordhsthho rghkvghssehorhgrtghlvgdrtghomhdprhgtphhtthhopehlihgrmhdrhhhofihlvghtth esohhrrggtlhgvrdgtohhmpdhrtghpthhtohepvhgsrggskhgrsehsuhhsvgdrtgiipdhr tghpthhtoheprhhpphhtsehkvghrnhgvlhdrohhrghdprhgtphhtthhopehsuhhrvghnsg esghhoohhglhgvrdgtohhm X-ME-Proxy: Feedback-ID: ie3994620:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 23 Sep 2025 07:03:17 -0400 (EDT) From: Kiryl Shutsemau To: Andrew Morton , David Hildenbrand , Hugh Dickins , Matthew Wilcox Cc: Lorenzo Stoakes , "Liam R. 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Rik van Riel , Harry Yoo , Johannes Weiner , Shakeel Butt , Baolin Wang , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kiryl Shutsemau Subject: [PATCHv3 1/5] mm/rmap: Fix a mlock race condition in folio_referenced_one() Date: Tue, 23 Sep 2025 12:03:06 +0100 Message-ID: <20250923110310.689126-2-kirill@shutemov.name> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20250923110310.689126-1-kirill@shutemov.name> References: <20250923110310.689126-1-kirill@shutemov.name> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kiryl Shutsemau The mlock_vma_folio() function requires the page table lock to be held in order to safely mlock the folio. However, folio_referenced_one() mlocks large folios outside of the page_vma_mapped_walk() loop, where the page table lock has already been dropped. Rework the mlock logic to use the same code path inside the loop for both large and small folios. Use PVMW_PGTABLE_CROSSED to detect when the folio is mapped across a page table boundary. 
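For readers following along, the per-iteration decision described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct walk_model, enum mlock_step and mlock_decision() are hypothetical names invented for illustration, not kernel API. The fields stand in for pvmw.pte, pvmw.nr_pages and the PVMW_PGTABLE_CROSSED flag, and ptes_seen is the running count of PTEs after the increment for the current iteration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the walk state; not struct page_vma_mapped_walk. */
struct walk_model {
	bool pte_mapped;      /* models pvmw.pte != NULL (PTE-level mapping) */
	int nr_pages;         /* models pvmw.nr_pages (pages in the folio) */
	bool pgtable_crossed; /* models PVMW_PGTABLE_CROSSED set in pvmw.flags */
};

enum mlock_step { KEEP_WALKING, MLOCK_NOW };

/*
 * Decide what to do for one VM_LOCKED iteration of the walk.
 * ptes_seen counts the PTEs visited so far, including the current one.
 */
static enum mlock_step mlock_decision(const struct walk_model *w, int ptes_seen)
{
	/* PTE-mapped folio that is not fully covered yet: keep walking. */
	if (w->pte_mapped && ptes_seen != w->nr_pages)
		return KEEP_WALKING;

	/* The current page table lock covers only part of the PTEs. */
	if (w->pgtable_crossed)
		return KEEP_WALKING;

	/* Fully mapped under a single page table lock: mlock is safe. */
	return MLOCK_NOW;
}
```

In this model a small folio (nr_pages of 1) reaches MLOCK_NOW on its first PTE, and a mapping that is not PTE-level skips the PTE count check, so small and large folios share one code path as the commit message describes.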
Signed-off-by: Kiryl Shutsemau Reviewed-by: Shakeel Butt --- mm/rmap.c | 59 ++++++++++++++++++++----------------------------------- 1 file changed, 21 insertions(+), 38 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 568198e9efc2..3d0235f332de 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -851,34 +851,34 @@ static bool folio_referenced_one(struct folio *folio, { struct folio_referenced_arg *pra =3D arg; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); - int referenced =3D 0; - unsigned long start =3D address, ptes =3D 0; + int ptes =3D 0, referenced =3D 0; =20 while (page_vma_mapped_walk(&pvmw)) { address =3D pvmw.address; =20 if (vma->vm_flags & VM_LOCKED) { - if (!folio_test_large(folio) || !pvmw.pte) { - /* Restore the mlock which got missed */ - mlock_vma_folio(folio, vma); - page_vma_mapped_walk_done(&pvmw); - pra->vm_flags |=3D VM_LOCKED; - return false; /* To break the loop */ - } - /* - * For large folio fully mapped to VMA, will - * be handled after the pvmw loop. - * - * For large folio cross VMA boundaries, it's - * expected to be picked by page reclaim. But - * should skip reference of pages which are in - * the range of VM_LOCKED vma. As page reclaim - * should just count the reference of pages out - * the range of VM_LOCKED vma. - */ ptes++; pra->mapcount--; - continue; + + /* Only mlock fully mapped pages */ + if (pvmw.pte && ptes !=3D pvmw.nr_pages) + continue; + + /* + * All PTEs must be protected by page table lock in + * order to mlock the page. + * + * If page table boundary has been cross, current ptl + * only protect part of ptes. 
+ */ + if (pvmw.flags & PVMW_PGTABLE_CROSSSED) + continue; + + /* Restore the mlock which got missed */ + mlock_vma_folio(folio, vma); + page_vma_mapped_walk_done(&pvmw); + pra->vm_flags |=3D VM_LOCKED; + return false; /* To break the loop */ } =20 /* @@ -914,23 +914,6 @@ static bool folio_referenced_one(struct folio *folio, pra->mapcount--; } =20 - if ((vma->vm_flags & VM_LOCKED) && - folio_test_large(folio) && - folio_within_vma(folio, vma)) { - unsigned long s_align, e_align; - - s_align =3D ALIGN_DOWN(start, PMD_SIZE); - e_align =3D ALIGN_DOWN(start + folio_size(folio) - 1, PMD_SIZE); - - /* folio doesn't cross page table boundary and fully mapped */ - if ((s_align =3D=3D e_align) && (ptes =3D=3D folio_nr_pages(folio))) { - /* Restore the mlock which got missed */ - mlock_vma_folio(folio, vma); - pra->vm_flags |=3D VM_LOCKED; - return false; /* To break the loop */ - } - } - if (referenced) folio_clear_idle(folio); if (folio_test_clear_young(folio)) --=20 2.50.1 From nobody Thu Oct 2 03:27:34 2025 Received: from fhigh-a5-smtp.messagingengine.com (fhigh-a5-smtp.messagingengine.com [103.168.172.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C09F321268 for ; Tue, 23 Sep 2025 11:03:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625404; cv=none; b=XzZk4PLOc/zX7lDfUR+aZ1f1gx30ltgWGEfyoWtJjvFfHImOOyfTCr2rIdd0AJfZQpsba7N7IX06qeA13L4Gfg0wIKICyroPm9cG4P06Ej13sYCetMc4KrTcQitn5J1XaYxYD4C2SKVQp4QTu/NQvPtMsxMiKIGhZgzsx6PiYZo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625404; c=relaxed/simple; bh=FO+dE9pku24uLN8N4NL+QT6mBqrNJmrCztRwBtxu04o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Eb5xfinEgfnTJ5hwzzPUwdwKdWZDKNQws+Tz13BdiLbZITyF1AEuuSJrLh7uinWXDJG8E6WlHNQwcICoXqZf7ZFL/QG4uHFvh91iVKgXL8mCZMN4akZdlBRHDKNB64RKJbuGRMqOGhVK1jvEHdHKNLP1CfYA5gI/im75bWP2xUA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name; spf=pass smtp.mailfrom=shutemov.name; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b=X94OxRck; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=eBykFhZL; arc=none smtp.client-ip=103.168.172.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shutemov.name Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b="X94OxRck"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="eBykFhZL" Received: from phl-compute-09.internal (phl-compute-09.internal [10.202.2.49]) by mailfhigh.phl.internal (Postfix) with ESMTP id 13793140007F; Tue, 23 Sep 2025 07:03:21 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-09.internal (MEProxy); Tue, 23 Sep 2025 07:03:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov.name; h=cc:cc:content-transfer-encoding:content-type:date:date:from :from:in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to; s=fm3; t=1758625401; x= 1758711801; bh=BDBl7oJGV3fWpTYYjjz2yGgNlw4zY/i9bG+IwFihDok=; b=X 94OxRck9RS3KXkLR2kgprjzcjR2TMNPXUrpCff15/MeG3qjwC5e622rJ3fpKHdhs 9+asmgb0zhY7IX7Ck0dG4w5SnIfuh2/G+v+NqNl5c+s5Jpz32JVn7Rws+F1fszDg oVPUtP/0taH+vQs5dI+7uWH2Osh2RRHdM9Y190RPr4AhCB2FGJEhgPTej2+RF1A8 Aj3pdGm6lzGIBc/crz1NsHHkcUOhYs8jkv6KxWVRfZipfpP3vHMlrQN0zlxxE9u5 H5nXk+GsiciJ5LFP6dRl+oGdJy1MEH49sXBBPyD5o8y19D2V96nyaKPk2xtThu89 Rg65mVi+JhShKcqtUb/yQ== DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; t=1758625401; x=1758711801; bh=B DBl7oJGV3fWpTYYjjz2yGgNlw4zY/i9bG+IwFihDok=; b=eBykFhZL/pnhgwzO6 a0BLPrLSjos+DE2PcwOb9kjK8H0CQ57EunTyPSfwI3Ad2MsNxprbjJUfWzDR00hx HP2OnZVn1FXOjZ2DRhmqA2Tft2UZ6ZowqItiSBUe6nAOvaWzBAnfEZDKQfpJsKoR KAxydWbQfKGUdq7qp9OhmkjZakt/K2TZhETisaqg7D4RxL1I3Cgo8sbJqnfP9Y9M +/5xDPT8m99Kl/AbzHYwIaT6DXye7uiRk1cLL+cb8AXhAL1ocpFrZOPIiED0VoFC 6EbfEX5AFNvt9j0HPFRyEK92PFqVbB0193DO1URMQgakrEoW5Lc7Zx6jgGQGgOKH jdQ0g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggdeitdehiecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpefhvfevufffkffojghfggfgsedtkeertdertddtnecuhfhrohhmpefmihhrhihlucfu hhhuthhsvghmrghuuceokhhirhhilhhlsehshhhuthgvmhhovhdrnhgrmhgvqeenucggtf frrghtthgvrhhnpeegveehtdfgvdfhudegffeuuddvgeevjefhveevgefhvdevieevteei vdehjefhjeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhroh hmpehkihhrihhllhesshhhuhhtvghmohhvrdhnrghmvgdpnhgspghrtghpthhtohepudek pdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegrkhhpmheslhhinhhugidqfhhouh hnuggrthhiohhnrdhorhhgpdhrtghpthhtohepuggrvhhiugesrhgvughhrghtrdgtohhm pdhrtghpthhtohephhhughhhugesghhoohhglhgvrdgtohhmpdhrtghpthhtohepfihilh hlhiesihhnfhhrrgguvggrugdrohhrghdprhgtphhtthhopehlohhrvghniihordhsthho rghkvghssehorhgrtghlvgdrtghomhdprhgtphhtthhopehlihgrmhdrhhhofihlvghtth esohhrrggtlhgvrdgtohhmpdhrtghpthhtohepvhgsrggskhgrsehsuhhsvgdrtgiipdhr tghpthhtoheprhhpphhtsehkvghrnhgvlhdrohhrghdprhgtphhtthhopehsuhhrvghnsg esghhoohhglhgvrdgtohhm X-ME-Proxy: Feedback-ID: ie3994620:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 23 Sep 2025 07:03:19 -0400 
(EDT) From: Kiryl Shutsemau To: Andrew Morton , David Hildenbrand , Hugh Dickins , Matthew Wilcox Cc: Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Rik van Riel , Harry Yoo , Johannes Weiner , Shakeel Butt , Baolin Wang , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kiryl Shutsemau Subject: [PATCHv3 2/5] mm/rmap: mlock large folios in try_to_unmap_one() Date: Tue, 23 Sep 2025 12:03:07 +0100 Message-ID: <20250923110310.689126-3-kirill@shutemov.name> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20250923110310.689126-1-kirill@shutemov.name> References: <20250923110310.689126-1-kirill@shutemov.name> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kiryl Shutsemau Currently, try_to_unmap_one() only tries to mlock small folios. Use logic similar to folio_referenced_one() to mlock large folios: only do this for fully mapped folios and under the page table lock that protects all page table entries. Signed-off-by: Kiryl Shutsemau Reviewed-by: Shakeel Butt --- mm/rmap.c | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 3d0235f332de..a55c3bf41287 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1870,6 +1870,7 @@ static bool try_to_unmap_one(struct folio *folio, str= uct vm_area_struct *vma, unsigned long nr_pages =3D 1, end_addr; unsigned long pfn; unsigned long hsz =3D 0; + int ptes =3D 0; =20 /* * When racing against e.g. zap_pte_range() on another cpu, @@ -1910,10 +1911,34 @@ static bool try_to_unmap_one(struct folio *folio, s= truct vm_area_struct *vma, */ if (!(flags & TTU_IGNORE_MLOCK) && (vma->vm_flags & VM_LOCKED)) { + ptes++; + + /* + * Set 'ret' to indicate the page cannot be unmapped. 
+ * + * Do not jump to walk_abort immediately as additional + * iteration might be required to detect fully mapped + * folio an mlock it. + */ + ret =3D false; + + /* Only mlock fully mapped pages */ + if (pvmw.pte && ptes !=3D pvmw.nr_pages) + continue; + + /* + * All PTEs must be protected by page table lock in + * order to mlock the page. + * + * If page table boundary has been cross, current ptl + * only protect part of ptes. + */ + if (pvmw.flags & PVMW_PGTABLE_CROSSSED) + goto walk_done; + /* Restore the mlock which got missed */ - if (!folio_test_large(folio)) - mlock_vma_folio(folio, vma); - goto walk_abort; + mlock_vma_folio(folio, vma); + goto walk_done; } =20 if (!pvmw.pte) { --=20 2.50.1 From nobody Thu Oct 2 03:27:34 2025 Received: from fhigh-a5-smtp.messagingengine.com (fhigh-a5-smtp.messagingengine.com [103.168.172.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A5FAE3218AE for ; Tue, 23 Sep 2025 11:03:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625405; cv=none; b=s/umUGnaIIAOKqdUcukfWs/3ZgDAI5Hwjh6BeqtDB1zDGynZXQr1Mn4y63UHvcZps5k1SrH6fMkcqjCicjqOwgu+tE+bDF91m6e7ugkS13H1f0JB98YFNXdQB90TSAYmJ+51zUl5Nq1bjIa+K8wj3cK5VRIcxZGQgm7V52gSyww= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625405; c=relaxed/simple; bh=YdfniaI8JtjbmRmFL2P1+yzqvWlKFWQCM5NeBuYYjgk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=e0Diy1aydnO40aw7LYZJVCLl4axgMjwETklWFiDXv81710HfvDLObx0zulUX+M3YH2YC6xarTsr/SBo9fwOE7ejtzZ1wuJhkIsjC5xJffsg94t/wegvbH6k+dTJ2hMnThMyL+1uVFS+MIrhRzduZqYYWfRMLIL/aSdJmEwJVSiY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name; spf=pass smtp.mailfrom=shutemov.name; dkim=pass 
(2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b=fxIDwv6N; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=I//iNIR+; arc=none smtp.client-ip=103.168.172.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shutemov.name Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b="fxIDwv6N"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="I//iNIR+" Received: from phl-compute-02.internal (phl-compute-02.internal [10.202.2.42]) by mailfhigh.phl.internal (Postfix) with ESMTP id A4A551400065; Tue, 23 Sep 2025 07:03:22 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-02.internal (MEProxy); Tue, 23 Sep 2025 07:03:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov.name; h=cc:cc:content-transfer-encoding:content-type:date:date:from :from:in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to; s=fm3; t=1758625402; x= 1758711802; bh=l+30FsNxlli8PpC3TpepLFbX1IePGlSZR2KUwpff5IU=; b=f xIDwv6Nw4l1i4iTtFMICkXa78Jm5zksgps7DQdUgDIPo5eTMQqU9njEZ9XF6Bg4m FM/TGuKU8UQdvip7vxG43or1wsjsA1SubWgf33EyJBU6CK8XAfDfsi8jjJvx0eeP joIojexJ0zE7/K/5ZHLuh+XNYA/yFxsG6X/r9ppf7OHdl2gAWRVPcngV4xgg+/RW D2ypxesd6iMcTTtihjMZy/mOIr8pCn2GKKM/2mlR7Q3FUr4KPNTODNqcR6gBhO6A XRoc1mbLXIHqFeOhHBcQu2QgA55sTIdgcPxqR4PdiCB208tKzfK34ofjXj3FXq46 /AcOJ632kvY7wI40P1nMA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; t=1758625402; x=1758711802; bh=l 
+30FsNxlli8PpC3TpepLFbX1IePGlSZR2KUwpff5IU=; b=I//iNIR+jivZO6pAj 7qpoyTpnVPMx0lJKVOtzBIMdyHPNoyaW/LnJxPgtCFlZfszbAnmybSLIyWkrWi5l L3w//lhrF7bf7mQ7yV9gM4cp7BQwzzjotYKrFRCMlPLBivtykk3V47f82kPv34+1 iaYXjOb6rOxV+uZZo/bqBgwGq/7g7ZlcVc8W+vfDavEnPsdu4vutcA0v8Pw/cjys UtFmFHwRO/z9zhF0LjwIxY3oxmd/kQDvZmqm2pm6DuWahTEbmSfwvPXD5ELugbW0 ddtgnSLFDuVkNiwpgrJZY48U2QNHrpLKc+JX/1WF8ETiLiqW0EQBQP1fAb8ofC/g EhaaQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggdeitdehiecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpefhvfevufffkffojghfggfgsedtkeertdertddtnecuhfhrohhmpefmihhrhihlucfu hhhuthhsvghmrghuuceokhhirhhilhhlsehshhhuthgvmhhovhdrnhgrmhgvqeenucggtf frrghtthgvrhhnpeegveehtdfgvdfhudegffeuuddvgeevjefhveevgefhvdevieevteei vdehjefhjeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhroh hmpehkihhrihhllhesshhhuhhtvghmohhvrdhnrghmvgdpnhgspghrtghpthhtohepudek pdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegrkhhpmheslhhinhhugidqfhhouh hnuggrthhiohhnrdhorhhgpdhrtghpthhtohepuggrvhhiugesrhgvughhrghtrdgtohhm pdhrtghpthhtohephhhughhhugesghhoohhglhgvrdgtohhmpdhrtghpthhtohepfihilh hlhiesihhnfhhrrgguvggrugdrohhrghdprhgtphhtthhopehlohhrvghniihordhsthho rghkvghssehorhgrtghlvgdrtghomhdprhgtphhtthhopehlihgrmhdrhhhofihlvghtth esohhrrggtlhgvrdgtohhmpdhrtghpthhtohepvhgsrggskhgrsehsuhhsvgdrtgiipdhr tghpthhtoheprhhpphhtsehkvghrnhgvlhdrohhrghdprhgtphhtthhopehsuhhrvghnsg esghhoohhglhgvrdgtohhm X-ME-Proxy: Feedback-ID: ie3994620:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 23 Sep 2025 07:03:22 -0400 (EDT) From: Kiryl Shutsemau To: Andrew Morton , David Hildenbrand , Hugh Dickins , Matthew Wilcox Cc: Lorenzo Stoakes , "Liam R. 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Rik van Riel , Harry Yoo , Johannes Weiner , Shakeel Butt , Baolin Wang , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kiryl Shutsemau Subject: [PATCHv3 3/5] mm/fault: Try to map the entire file folio in finish_fault() Date: Tue, 23 Sep 2025 12:03:08 +0100 Message-ID: <20250923110310.689126-4-kirill@shutemov.name> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20250923110310.689126-1-kirill@shutemov.name> References: <20250923110310.689126-1-kirill@shutemov.name> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kiryl Shutsemau The finish_fault() function uses per-page fault for file folios. This only occurs for file folios smaller than PMD_SIZE. The comment suggests that this approach prevents RSS inflation. However, it only prevents RSS accounting. The folio is still mapped to the process, and the fact that it is mapped by a single PTE does not affect memory pressure. Additionally, the kernel's ability to map large folios as PMD if they are large enough does not support this argument. When possible, map large folios in one shot. This reduces the number of minor page faults and allows for TLB coalescing. Mapping large folios at once will allow the rmap code to mlock it on add, as it will recognize that it is fully mapped and mlocking is safe. 
Signed-off-by: Kiryl Shutsemau Reviewed-by: Shakeel Butt Reviewed-by: Baolin Wang --- mm/memory.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 0ba4f6b71847..812a7d9f6531 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5386,13 +5386,8 @@ vm_fault_t finish_fault(struct vm_fault *vmf) =20 nr_pages =3D folio_nr_pages(folio); =20 - /* - * Using per-page fault to maintain the uffd semantics, and same - * approach also applies to non shmem/tmpfs faults to avoid - * inflating the RSS of the process. - */ - if (!vma_is_shmem(vma) || unlikely(userfaultfd_armed(vma)) || - unlikely(needs_fallback)) { + /* Using per-page fault to maintain the uffd semantics */ + if (unlikely(userfaultfd_armed(vma)) || unlikely(needs_fallback)) { nr_pages =3D 1; } else if (nr_pages > 1) { pgoff_t idx =3D folio_page_idx(folio, page); --=20 2.50.1 From nobody Thu Oct 2 03:27:34 2025 Received: from fhigh-a5-smtp.messagingengine.com (fhigh-a5-smtp.messagingengine.com [103.168.172.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC106321F35 for ; Tue, 23 Sep 2025 11:03:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625407; cv=none; b=iN7xsjjiRP9KzucdJz5Q/08KueXDCh+NQ36gOZ2HlzJ5EBCgbgnKZudHbdN+VggrVMYTVSnR4JJD02r9KOQv9sqWpeYa/cxU8ijS7YJEF51Ra+kf012hpwaVBnmu11F4MiQYtnr2LLtIcBPSTKgZHS6vtm2sZmsuHpWhyP4bfLI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758625407; c=relaxed/simple; bh=lafj3u5MfFxC2O+ZOStOOH0oQz92l1iYxy2O9Q0JtL4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=l/DxkVdYwJvkxUMRHCdRc0UWSq9RB8MgkdHbJwb9Df8XlXa2ETaNZY2tytCGJOkY55G8hkTLUcLYaoJMvK3FZ3xdF8EslgTP2B4LlRpwDe0vtcGYHL1+Qm0KvWiwIxCvFVyRlW+jZLl1e6mmyCagQcmLw1uHoAYEgP4KbjPQhCE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name; spf=pass smtp.mailfrom=shutemov.name; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b=g2S6l6L3; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=Y/X2fY6S; arc=none smtp.client-ip=103.168.172.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shutemov.name Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=shutemov.name header.i=@shutemov.name header.b="g2S6l6L3"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="Y/X2fY6S" Received: from phl-compute-10.internal (phl-compute-10.internal [10.202.2.50]) by mailfhigh.phl.internal (Postfix) with ESMTP id EED3E1400062; Tue, 23 Sep 2025 07:03:24 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-10.internal (MEProxy); Tue, 23 Sep 2025 07:03:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov.name; h=cc:cc:content-transfer-encoding:content-type:date:date:from :from:in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to; s=fm3; t=1758625404; x= 1758711804; bh=jFlzJ6561wUPMtQD7f6sb4ugc+4wKUHkJSRo02E7hiw=; b=g 2S6l6L3MiMKNrVVk78Zc42IA/qVS5hU2W8/SmfolFj6PmyajazMKSkaPdK2jcjyZ usnOBTZQbkHfJCtyI/PliNUHMJnR5hZg5trr2AWDnCqIglt1AR55qHhnFPWUvFbQ dSRQ55n08Chzf/Tq1y1PnH1f4dx+PK4KRYOk+C7Rd/491iIIt0JxlVSEJQKH1f1B KO6tKHNVXXOIOVB06LKo9hZ8UcWU7yQdSYjNkl0hqmOv3eypaRwkYBQLCPjbDmM9 Y4vDPril2mcwcPg6iU8x7qFVwujTznEjp1C8tTlSVRltrxw4cAM7zkdsUp2g69su Bt64h3faz0B1DWzo9r2Rw== DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; t=1758625404; x=1758711804; bh=j FlzJ6561wUPMtQD7f6sb4ugc+4wKUHkJSRo02E7hiw=; b=Y/X2fY6SixOFXq11V ZjqoBW7hM6wonlm2NFcIwz8Xn8bf7NEmxX1w3YOt0Q2FcP4y57fFgc1Ngh5OjOYb f6YQaS41vTRwyDzwi4H5uOM4UWeLRhcRrTWq3kNOs4BA2yyO4fg1F+VsiCMuIP2v VooPr3YPERizes8Mh0aPKt231ZLTAz8LfyPTOrP+EyhHOEU5Go/7i/2drxRb+Myb 2/W0i8uWuAhEUYAiqk1ZsomTEWeu/w8KPLMb+pilZgBkyFmHlncIXyVGeXEz1Phf lMNBcro7voHeT7Z0m0/GVh21R+GzzewTFl3S7GtcDyzWV6c1VPmMGW2fGnyb/pIX zgmvA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggdeitdehiecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpefhvfevufffkffojghfggfgsedtkeertdertddtnecuhfhrohhmpefmihhrhihlucfu hhhuthhsvghmrghuuceokhhirhhilhhlsehshhhuthgvmhhovhdrnhgrmhgvqeenucggtf frrghtthgvrhhnpeegveehtdfgvdfhudegffeuuddvgeevjefhveevgefhvdevieevteei vdehjefhjeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhroh hmpehkihhrihhllhesshhhuhhtvghmohhvrdhnrghmvgdpnhgspghrtghpthhtohepudek pdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegrkhhpmheslhhinhhugidqfhhouh hnuggrthhiohhnrdhorhhgpdhrtghpthhtohepuggrvhhiugesrhgvughhrghtrdgtohhm pdhrtghpthhtohephhhughhhugesghhoohhglhgvrdgtohhmpdhrtghpthhtohepfihilh hlhiesihhnfhhrrgguvggrugdrohhrghdprhgtphhtthhopehlohhrvghniihordhsthho rghkvghssehorhgrtghlvgdrtghomhdprhgtphhtthhopehlihgrmhdrhhhofihlvghtth esohhrrggtlhgvrdgtohhmpdhrtghpthhtohepvhgsrggskhgrsehsuhhsvgdrtgiipdhr tghpthhtoheprhhpphhtsehkvghrnhgvlhdrohhrghdprhgtphhtthhopehsuhhrvghnsg esghhoohhglhgvrdgtohhm X-ME-Proxy: Feedback-ID: ie3994620:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 23 Sep 2025 07:03:23 -0400 
(EDT) From: Kiryl Shutsemau To: Andrew Morton , David Hildenbrand , Hugh Dickins , Matthew Wilcox Cc: Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Rik van Riel , Harry Yoo , Johannes Weiner , Shakeel Butt , Baolin Wang , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kiryl Shutsemau Subject: [PATCHv3 4/5] mm/filemap: Map entire large folio faultaround Date: Tue, 23 Sep 2025 12:03:09 +0100 Message-ID: <20250923110310.689126-5-kirill@shutemov.name> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20250923110310.689126-1-kirill@shutemov.name> References: <20250923110310.689126-1-kirill@shutemov.name> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kiryl Shutsemau Currently, the kernel only maps the part of a large folio that fits into the start_pgoff/end_pgoff range. Map the entire folio where possible. This matches the finish_fault() behaviour that the user hits on a cold page cache. Mapping large folios at once will allow the rmap code to mlock them on add, as it will recognize that they are fully mapped and that mlocking is safe. Signed-off-by: Kiryl Shutsemau --- mm/filemap.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 751838ef05e5..26cae577ba23 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3643,6 +3643,21 @@ static vm_fault_t filemap_map_folio_range(struct vm_= fault *vmf, struct page *page =3D folio_page(folio, start); unsigned int count =3D 0; pte_t *old_ptep =3D vmf->pte; + unsigned long addr0; + + /* + * Map the large folio fully where possible. + * + * The folio must not cross VMA or page table boundary. 
+	 */
+	addr0 = addr - start * PAGE_SIZE;
+	if (folio_within_vma(folio, vmf->vma) &&
+	    (addr0 & PMD_MASK) == ((addr0 + folio_size(folio) - 1) & PMD_MASK)) {
+		vmf->pte -= start;
+		page -= start;
+		addr = addr0;
+		nr_pages = folio_nr_pages(folio);
+	}
 
 	do {
 		if (PageHWPoison(page + count))
-- 
2.50.1

From nobody Thu Oct 2 03:27:34 2025
From: Kiryl Shutsemau
To: Andrew Morton, David Hildenbrand, Hugh Dickins, Matthew Wilcox
Cc: Lorenzo Stoakes, "Liam R.
 Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan,
 Michal Hocko, Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
 Baolin Wang, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Kiryl Shutsemau
Subject: [PATCHv3 5/5] mm/rmap: Improve mlock tracking for large folios
Date: Tue, 23 Sep 2025 12:03:10 +0100
Message-ID: <20250923110310.689126-6-kirill@shutemov.name>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20250923110310.689126-1-kirill@shutemov.name>
References: <20250923110310.689126-1-kirill@shutemov.name>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Kiryl Shutsemau

The kernel currently does not mlock large folios when adding them to
rmap, stating that it is difficult to confirm that the folio is fully
mapped and safe to mlock. This leads to a significant undercount of
Mlocked in /proc/meminfo, causing problems in production where the stat
was used to estimate system utilization and determine if load shedding
is required.

However, nowadays the caller passes the number of pages of the folio
that are getting mapped, making it easy to check if the entire folio is
mapped to the VMA.

mlock the folio on rmap if it is fully mapped to the VMA.

Mlocked in /proc/meminfo can still undercount, but the value is closer
to the truth and is useful for userspace.

Signed-off-by: Kiryl Shutsemau
Acked-by: David Hildenbrand
Acked-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Baolin Wang
---
 mm/rmap.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index a55c3bf41287..d5b40800198c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1462,12 +1462,12 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio,
 	}
 
 	/*
-	 * For large folio, only mlock it if it's fully mapped to VMA. It's
-	 * not easy to check whether the large folio is fully mapped to VMA
-	 * here. Only mlock normal 4K folio and leave page reclaim to handle
-	 * large folio.
+	 * Only mlock it if the folio is fully mapped to the VMA.
+	 *
+	 * Partially mapped folios can be split on reclaim and part outside
+	 * of mlocked VMA can be evicted or freed.
 	 */
-	if (!folio_test_large(folio))
+	if (folio_nr_pages(folio) == nr_pages)
 		mlock_vma_folio(folio, vma);
 }
 
@@ -1603,8 +1603,13 @@ static __always_inline void __folio_add_file_rmap(struct folio *folio,
 	nr = __folio_add_rmap(folio, page, nr_pages, vma, level, &nr_pmdmapped);
 	__folio_mod_stat(folio, nr, nr_pmdmapped);
 
-	/* See comments in folio_add_anon_rmap_*() */
-	if (!folio_test_large(folio))
+	/*
+	 * Only mlock it if the folio is fully mapped to the VMA.
+	 *
+	 * Partially mapped folios can be split on reclaim and part outside
+	 * of mlocked VMA can be evicted or freed.
+	 */
+	if (folio_nr_pages(folio) == nr_pages)
 		mlock_vma_folio(folio, vma);
 }
 
-- 
2.50.1
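
For readers following the series, the two conditions it relies on can be
tried out in userspace. What follows is only a rough sketch, not kernel
code: the SKETCH_* constants assume x86-64 with 4 KiB pages and 2 MiB PMD
coverage, and both helper names are hypothetical. It also leaves out the
folio_within_vma() test that the filemap change performs alongside the
page table boundary check.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for x86-64 with 4 KiB pages; other configurations differ. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (1UL << 21)
#define SKETCH_PMD_MASK  (~(SKETCH_PMD_SIZE - 1))

/*
 * Hypothetical helper mirroring the boundary test in the filemap change:
 * true if the first and last byte of [addr0, addr0 + size) fall under the
 * same PMD entry, so the whole range is served by one PTE page table.
 */
static bool sketch_within_one_pmd(unsigned long addr0, unsigned long size)
{
	assert(size > 0);
	return (addr0 & SKETCH_PMD_MASK) ==
	       ((addr0 + size - 1) & SKETCH_PMD_MASK);
}

/*
 * Hypothetical helper mirroring the rmap-side test: the folio counts as
 * fully mapped only when the caller maps every one of its pages at once.
 */
static bool sketch_fully_mapped(unsigned long folio_pages,
				unsigned long nr_pages)
{
	return folio_pages == nr_pages;
}
```

Under these assumptions, a 16-page folio starting at 0x40200000 stays
inside one PMD range, while the same folio starting at 0x403f8000 crosses
into the next one and would keep the old partial mapping. Separately, a
16-page folio mapped with nr_pages == 4 is not treated as fully mapped.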