From nobody Mon Jun 8 10:56:42 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2AF2037F8A5 for ; Fri, 29 May 2026 17:23:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075432; cv=none; b=iAZBiLXIk+Vd/W+Qb38vv3V+MnIOmVt/9PlqsCi8OgbBiDFYCMCtucTZRKUEoIGVv9WWiN181B/dr0PNPNoztEyGCPrET+2F9bjdPq+8fOGk4VUhn3trXuv+wBHxmB7OpzSiIsRQWjMEltpBgDtc8NAT8B6a8dJ3Fols1nuaPxA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075432; c=relaxed/simple; bh=ABzuEml0mCVNKmxYCKN7v2NAXbUNo7uCb+UM5KYetKg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=muExzbLMO5PnLLCyaF8IIrxGsRpVq1DwZB6ZCNfe2kbO7PSCiAyjprVhua1Qy8BIASml65tTXHq7HQYnSLc0WVNakMYMzElciGCx8kg2oPIDk1GGgs9d91xbWS+qdC8LiTreHTZNKWxIc+Yk2lP75UO5u53xtmNmyF9d7rY55sM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=m0EN5Fp1; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="m0EN5Fp1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 350791F00893; Fri, 29 May 2026 17:23:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780075430; bh=NIA7ICYvGFCKjBA05z3aKIhDosxqa3wRNXTiub6kjQ0=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=m0EN5Fp1wgMMtKKVDrC6Hk80Fy4Ycw0Ru/G7OLjPkCluucgOffteXI04vIgtNpRNa O4XSvqAj/V84SCyTcTtoFj9+r8RPRVSuhdPSoLHmeU36Xd6TILQRt8zA5AFOTNQvWI W6FuS+Cly53kO7blhb+MSR6agxLIg40CqtxrP2t6NOhx16l7BzQDNQ8tSEcCsj/m01 
sKdMCUcdQ4fPweWiBkMPqhcT+ZfrtKl5imLYRv0XkVUGkmuxmvyguACCAmINzK5vjM Ro74tA7TYcm3jVJnekRpbm/5PTmPoffPROuivaFGwFiR2RY+J+kuiw+E7c5XG2ysZ3 VXeWl46mOXxJw== Received: from phl-compute-06.internal (phl-compute-06.internal [10.202.2.46]) by mailfauth.phl.internal (Postfix) with ESMTP id 91168F4006D; Fri, 29 May 2026 13:23:49 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-06.internal (MEProxy); Fri, 29 May 2026 13:23:49 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTEZfpMr5/eRRE28w8SXJrP2dhbWJDto6huRdd/cnfVbAAhK4j6L5mXvQtvbZW1Tgn qdHjS1xVDGkdt7hr1SYRochmR7yuP2KsuqbhH6HkebbwRdIZkVYqIRv2TfGI9SaQtUiic8 G8SBlRV6/L2Xtb05bm6QmOQTKoj3b/RGprk0F6woL4YfqntUvh+0D5sL6ruauatQhdAfx4 sxg873kOfAaKAVmot+PkxhAMe2YWyy/aSlFAam2C/bHdTY9DLsPBVhceLLRJtCvIR2Q+zM kaoJA5O2aRPotfcyz8zPVtzBiOjMLe5Pw3NzV/bo4OKaUt+lobJ5eojHQXZMuqCslD6AI+ BM13zq6jVvyTjG84RmS69Gp7SlyBSnqMSsWkjT2xuM0NzyUJK9KfpZb9fp9Tb4IMDIwXk7 CW9rEHFz3n7rqGfGkRLBuDVxWgAiEytI7zge31faM1twZ4SThN2MqD4Jq6x7NArlBmmcZ1 1jmGcq+7qkHVGuGRgmeWzw2ya9tZ+i2vtU1PD2oGlxweR0ChEX5h0JNoIroqVDNTWlqbkW 6Ov950KZCVmBG5Y+O/oz6+M8rCJ1tOtvD92ut1C6kj8aOcYhnOiEpacbmRsCZSAmmtOgCa pJ4i4UFvDKOfwsz6uUiLA9nxWUx2WzdSYxWfEtsXCpVkkYp6atlK6KqqwDPQ X-ME-Proxy: Feedback-ID: i10464835:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 29 May 2026 13:23:48 -0400 (EDT) From: "Kiryl Shutsemau (Meta)" To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , "Liam R. 
Howlett" , Vlastimil Babka , Jann Horn , Pedro Falcato , =?UTF-8?q?Micha=C5=82=20Miros=C5=82aw?= , Muhammad Usama Anjum , Stephen Rothwell , Arnd Bergmann , linux-fsdevel@vger.kernel.org Subject: [PATCH 1/6] fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race Date: Fri, 29 May 2026 18:23:25 +0100 Message-ID: <20260529172331.356655-2-kas@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529172331.356655-1-kas@kernel.org> References: <20260529172331.356655-1-kas@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" make_uffd_wp_huge_pte() arms the UFFD_WP bit on a present HugeTLB PTE by calling huge_ptep_modify_prot_commit() with a ptent snapshot that was fetched without the corresponding huge_ptep_modify_prot_start(). The start helper is what atomically clears the entry so the kernel-owned snapshot stays consistent until the commit; without it, the hardware may set Dirty or Accessed in the live PTE between the original read and the commit, and huge_ptep_modify_prot_commit() (whose generic implementation just calls set_huge_pte_at()) then writes the stale snapshot back over the live hardware bits, losing the update. The non-hugetlb sibling make_uffd_wp_pte() does this correctly via ptep_modify_prot_start() / ptep_modify_prot_commit(). Mirror that pattern for the present-PTE branch. The migration case stays as-is -- migration entries are non-present, so there's no hardware update to race against. 
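For illustration only, the lost update can be modelled in user space. This is a sketch, not kernel code: the model_* names, the bit layout and the single-threaded interleaving are assumptions made for the example, and a C11 atomic word stands in for the live PTE.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for the model; not a real PTE encoding. */
#define MODEL_PTE_PRESENT  (1u << 0)
#define MODEL_PTE_DIRTY    (1u << 1)
#define MODEL_PTE_UFFD_WP  (1u << 2)

typedef _Atomic uint32_t model_pte_t;

/* "Hardware" sets Dirty only while the entry is present; true if it landed. */
static bool model_hw_set_dirty(model_pte_t *pte)
{
	uint32_t old = atomic_load(pte);

	while (old & MODEL_PTE_PRESENT) {
		if (atomic_compare_exchange_weak(pte, &old, old | MODEL_PTE_DIRTY))
			return true;
	}
	return false;
}

/* Old pattern: commit a snapshot that was read without a "start". */
static uint32_t model_arm_wp_stale(model_pte_t *pte, bool *hw_landed)
{
	uint32_t snap = atomic_load(pte);

	*hw_landed = model_hw_set_dirty(pte);		/* racing update lands here */
	atomic_store(pte, snap | MODEL_PTE_UFFD_WP);	/* stale snapshot overwrites Dirty */
	return atomic_load(pte);
}

/* Fixed pattern: "start" clears the entry, so nothing can land before commit. */
static uint32_t model_arm_wp_start_commit(model_pte_t *pte, bool *hw_landed)
{
	uint32_t old = atomic_exchange(pte, 0);		/* entry is now non-present */

	*hw_landed = model_hw_set_dirty(pte);		/* no-op: entry not present */
	atomic_store(pte, old | MODEL_PTE_UFFD_WP);
	return atomic_load(pte);
}
```

In the model, the stale variant ends with UFFD_WP set and Dirty cleared even though the hardware update landed. The start/commit variant keeps the hardware out of the window, and a later hardware update then sets Dirty next to UFFD_WP.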
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optional= ly clear info about PTEs") Cc: stable@vger.kernel.org Reported-by: Sashiko AI review Signed-off-by: Kiryl Shutsemau Reviewed-by: Dev Jain Reviewed-by: Lorenzo Stoakes --- fs/proc/task_mmu.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 1e3a15bf46f4..e21a38ac745b 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -2610,12 +2610,16 @@ static void make_uffd_wp_huge_pte(struct vm_area_st= ruct *vma, if (softleaf_is_hwpoison(entry) || softleaf_is_marker(entry)) return; =20 - if (softleaf_is_migration(entry)) + if (softleaf_is_migration(entry)) { set_huge_pte_at(vma->vm_mm, addr, ptep, pte_swp_mkuffd_wp(ptent), psize); - else - huge_ptep_modify_prot_commit(vma, addr, ptep, ptent, - huge_pte_mkuffd_wp(ptent)); + } else { + pte_t old_pte, new_pte; + + old_pte =3D huge_ptep_modify_prot_start(vma, addr, ptep); + new_pte =3D huge_pte_mkuffd_wp(old_pte); + huge_ptep_modify_prot_commit(vma, addr, ptep, old_pte, new_pte); + } } #endif /* CONFIG_HUGETLB_PAGE */ =20 --=20 2.54.0 From nobody Mon Jun 8 10:56:42 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0790F3806CE for ; Fri, 29 May 2026 17:23:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075439; cv=none; b=TvW5ZC7nU/owgK86x+xMBjN3Gla+LpD/d56zXYkauGqUR3o7FtZkD1JJRBPot/Vf2RrJoOSfhadYsTaW2sssWA+o/4JBLxQjBzYIkFy0QEJk1zJhsR/QRoxGKlUUe8sFVwFrW4BXuWzUshu5pmzQqYFVGvSeDAts3ybBxNk0+gg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075439; c=relaxed/simple; 
bh=YrpAz0FXPAhElqfhnnykZc7iFUIiRTe7Cl78ip7p8Ng=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hZTDq5Lgc1UMwKhALncRW2lSKdS4wMwbdhZSIRVw9FErhI0HE3PMwJ+2qvhDXm44HzMOmGFvus8QiraHGL0xPxtAI3JTZQBpqjWALVPvV+zxF5N2Iogl1FI1b838q8AIzfC8FIuTwoqLejOw1HEwX51OrUWV4H87OXAHNw8lQ0g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=cWeRzBIZ; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="cWeRzBIZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3B34F1F00898; Fri, 29 May 2026 17:23:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780075437; bh=9aQovhm+rAVLhPNqkReF29glHPCC8PiSoid5QftmxW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=cWeRzBIZzSr4LzJXWgtJgNAGF/7YupqxAqkBmWbnMkzVMV3dSy7GDJ8YriLuOoeG4 67szeU0c8RcYEEvSwIc4nJdUpCYgnqWqxI6+d6a2+lQmg//LXZX7a5fb3W8zExUmFF oDrGqIsm66TRKqAaY2WyvN/vYRv76JaGMjlb+s2z3adiwXQlnktUPdc05Cq1YdBCGo EIhjmFiVeiTf8+xrslPvU7NOKxAHbVWUdHUaakqhZfEAeEbJNoMXSdfRyL2ce4Lscq tUyxwgKhvUIrVZH4CVXuPg6Zuy+NWZCSrfLeGtwELpchHKUOB0OJyyojVeyGnUPSxa +FLLxXOkGr97A== Received: from phl-compute-04.internal (phl-compute-04.internal [10.202.2.44]) by mailfauth.phl.internal (Postfix) with ESMTP id 8DF7DF4006D; Fri, 29 May 2026 13:23:56 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-04.internal (MEProxy); Fri, 29 May 2026 13:23:56 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTEZfpMr5/eRRE28w8SXJrP2dhbWJDto6huRdd/cnfVbAAhK4j6L5mXvQtvbZW1Tgn qdHjS1xVDGkdt7hr1SYRochmR7yuP2KsuqbhH6HkebbwRdIZkVYqIRv2TfGI9SaQtUiic8 G8SBlRV6/L2Xtb05bm6QmOQTKoj3b/RGprk0F6woL4YfqntUvh+0D5sL6ruauatQhdAfx4 sxg873kOfAaKAVmot+PkxhAMe2YWyy/aSlFAam2C/bHdTY9DLsPBVhceLLRJtCvIR2Q+zM kaoJA5O2aRPotfcyz8zPVtzBiOjMLe5Pw3NzV/bo4OKaUt+lobJ5eojHQXZMuqCslD6ATr 
oxy1zzjt2em+zJsRq8POPUsy4CV7PoZHrsAai4kRVafuktEUWzwlzoahhiUOPOMqVMULxU OoqIdoLTAo5WAAqArSXoEXRD/SkEHUTjv/Yu5n5CM1khoLP/Zj8mVWMYbGav2f4ELOap51 nyHTSXHFliTGg10ueqLZO/lhBP77GXxnum5X6OeJxQqeRKO2shHkz87JQ/rk7Rn5rKHBRa JzC28BFu5dhwdGTo/2Cd5xWskrDBHDkUH6cTFcJZCSe2jSZeG16EWNT/whd9WMvw6wbbQ8 BPKfQrAB/lp/adhD0u/Nww3O04VYhY8zwOvVM5i4c504T4ICiektIhpy1pKQ X-ME-Proxy: Feedback-ID: i10464835:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 29 May 2026 13:23:55 -0400 (EDT) From: "Kiryl Shutsemau (Meta)" To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , "Liam R. Howlett" , Vlastimil Babka , Jann Horn , Pedro Falcato , =?UTF-8?q?Micha=C5=82=20Miros=C5=82aw?= , Muhammad Usama Anjum , Arnd Bergmann , Andrei Vagin , linux-fsdevel@vger.kernel.org Subject: [PATCH 2/6] fs/proc/task_mmu: use huge_page_size() in pagemap_scan_hugetlb_entry() Date: Fri, 29 May 2026 18:23:26 +0100 Message-ID: <20260529172331.356655-3-kas@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529172331.356655-1-kas@kernel.org> References: <20260529172331.356655-1-kas@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The partial-page check compares against HPAGE_SIZE (PMD_SIZE), which is wrong for gigantic hugetlb hstates (e.g. 1G). The walker hands the callback a huge_page_size()-sized range, never start + HPAGE_SIZE, so the comparison always declares it partial and aborts the WP. Compare against the actual hstate's page size. 
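As a rough user-space illustration of the arithmetic (not kernel code), the two checks can be compared directly. The model_* names are hypothetical, and the 2 MiB and 1 GiB sizes are assumed x86-64 style values.

```c
#include <stdbool.h>

/* Assumed sizes for the example; hypothetical names. */
#define MODEL_PMD_HPAGE_SIZE	(2ULL << 20)	/* 2 MiB, stands in for HPAGE_SIZE */
#define MODEL_GIGANTIC_SIZE	(1ULL << 30)	/* 1 GiB hstate */

/* Old check: always compares against the PMD-level size. */
static bool model_is_partial_old(unsigned long long start,
				 unsigned long long end)
{
	return end != start + MODEL_PMD_HPAGE_SIZE;
}

/* Fixed check: compares against the page size of the VMA's hstate. */
static bool model_is_partial_fixed(unsigned long long start,
				   unsigned long long end,
				   unsigned long long psize)
{
	return end != start + psize;
}
```

For a full 1 GiB range the old check reports "partial" while the fixed one does not; for a 2 MiB hstate both agree.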
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optional= ly clear info about PTEs") Cc: stable@vger.kernel.org Reported-by: Sashiko AI review Signed-off-by: Kiryl Shutsemau Reviewed-by: Dev Jain Reviewed-by: Lorenzo Stoakes --- fs/proc/task_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e21a38ac745b..1489c67e88f7 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -2960,7 +2960,7 @@ static int pagemap_scan_hugetlb_entry(pte_t *ptep, un= signed long hmask, if (~categories & PAGE_IS_WRITTEN) goto out_unlock; =20 - if (end !=3D start + HPAGE_SIZE) { + if (end !=3D start + huge_page_size(hstate_vma(vma))) { /* Partial HugeTLB page WP isn't possible. */ pagemap_scan_backout_range(p, start, end); p->arg.walk_end =3D start; --=20 2.54.0 From nobody Mon Jun 8 10:56:42 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6E5737FF7A; Fri, 29 May 2026 17:24:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075447; cv=none; b=LVN72c7Nt+SYXWfxle2IQPNpegXJM03BO0UVFYt7Ytdakm/Aw2f1YaC0HzmG9+2KpNPh/ti0xBBhL4YNNPal8JblsLyIj3+Fl0DhvXVYO2ocOvbg8m4d/KcnxdizUsTDj8M9azl5r+9rqyWgw+RyLSiuNtYNsZYbyp59TqR577M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075447; c=relaxed/simple; bh=UwLNADmc/f1Ohq1rz7sGbukVglOCoVqv/s7Ixqt+HRA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fGvQG5/8MUOAkjakq2+LCyoYbSb38wuz2zBEDKuIOHHZRDK2fJla19n07pCKrc64omQGy0bFhUZmedQ5y1QOttJnieVVWfR64H9/g9Rg0mimcyNVXlnN6ceAuXMwNz8Ij5Gm2swk1qTH8C6V4hcoZrMs8L9tUmEq4PqLe+2J344= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XvWVTOzE; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="XvWVTOzE" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DE9C91F00899; Fri, 29 May 2026 17:24:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780075446; bh=nYuIV9AvDafYDcSbmI0+95lKITkmfWI6WJtx/j8RINo=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=XvWVTOzEc6sLLsdd4agf0l6vVfQ3dDH3V6mSedWMZ0sIe8WfAkKZgTwvAtegYhe1J /9O9mfzQVuRCoqpPEx7Az7/6HCtpkbMVElfEZx42ZaEquV8f5HlHw3OG4INX4RrbGC F/mpfWR5y73pWB9956SiFHQhTs7Xwfwl1zTpqorsQ+Y6GURSENcSb3wICGdSBxJBu+ 4CwnPFXu99Z5zip652MehYlCylZTvq/OckJhvgowMP1xFkJuZb9Kd+VLPG9SJz/dPg 8uKCgilCOIw/6NI5HsFb77mwYdpIL4yyrKNh6UJUf0OhgRWqKEKKXtgFuK3jGqvdb6 Ug4vkXyw7FjmQ== Received: from phl-compute-02.internal (phl-compute-02.internal [10.202.2.42]) by mailfauth.phl.internal (Postfix) with ESMTP id 4A2CAF4006D; Fri, 29 May 2026 13:24:05 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-02.internal (MEProxy); Fri, 29 May 2026 13:24:05 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTEZfpMr5/eRRE28w8SXJrP2dhbWJDto6huRdd/cnfVbAAhK4j6L5mXvQtvbZW1Tgn qdHjS1xVDGkdt7hr1SYRochmR7yuP2KsuqbhH6HkebbwRdIZkVYqIRv2TfGI9SaQtUiic8 G8SBlRV6/L2Xtb05bm6QmOQTKoj3b/RGprk0F6woL4YfqntUvh+0D5sL6ruauatQhdAfx4 sxg873kOfAaKAVmot+PkxhAMe2YWyy/aSlFAam2C/bHdTY9DLsPBVhceLLRJtCvIR2Q+zM kaoJA5O2aRPotfcyz8zPVtzBiOjMLe5Pw3NzV/bo4OKaUt+lobJ5eojHQXZMuqCslD6AnS ok5QtyObWq6j+DXuni1rlZOw3XtlbsdrBtLk/PduNPZnMWbxiseSW1kQIaGiYoG0Owi9Jp 9XjI1ZxuqSyNya4Ifrta5xPSeY6MbudbjLqoQOxnqqoSfmGjibIBRAdMYKp0lTqE92dtia gIKy/RhhuXxoarlGCcIglkcPQ/ZRLxd6pcw2uj5oCWiJXoJC0ZVSRe+qltKVCgafAsu4hU rYsD0f1RyR4nUE9eGm/7OvFVKPnlAstWfYnjFnH6iAP4Dv8Dm284OQXfbNkidxkyt9DHu6 +bwThVFcMbvJHZ6h4DjLhbKj4GHVfcB11PnXN1U61Tttbh5oo2wkrZpwBzwA X-ME-Proxy: 
Feedback-ID: i10464835:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 29 May 2026 13:24:03 -0400 (EDT) From: "Kiryl Shutsemau (Meta)" To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , "Liam R. Howlett" , Vlastimil Babka , Jann Horn , Pedro Falcato , =?UTF-8?q?Micha=C5=82=20Miros=C5=82aw?= , Muhammad Usama Anjum , Andrei Vagin , Stephen Rothwell , linux-fsdevel@vger.kernel.org Subject: [PATCH 3/6] fs/proc/task_mmu: fix hugetlb self-deadlock in pagemap_scan_pte_hole() Date: Fri, 29 May 2026 18:23:27 +0100 Message-ID: <20260529172331.356655-4-kas@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529172331.356655-1-kas@kernel.org> References: <20260529172331.356655-1-kas@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" A PAGEMAP_SCAN ioctl requesting PM_SCAN_WP_MATCHING on a hugetlb VMA hangs the calling thread, unkillably, as soon as the scan reaches an unpopulated part of the range: do_pagemap_scan() walk_page_range() walk_hugetlb_range() hugetlb_vma_lock_read() # take the vma lock for read ... pagemap_scan_pte_hole() # ... ->pte_hole() for a hole uffd_wp_range() change_protection() hugetlb_change_protection() hugetlb_vma_lock_write() # ... and block taking it for wri= te walk_hugetlb_range() holds the hugetlb vma lock for read across the whole walk. A present entry goes to ->hugetlb_entry(); an unpopulated one goes to ->pte_hole(), i.e. pagemap_scan_pte_hole(). To write-protect the hole that handler calls uffd_wp_range(), which on a hugetlb VMA reaches hugetlb_change_protection() and takes the same vma lock for write. The thread then blocks in down_write() waiting for the read lock it is itself holding. 
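The shape of the self-deadlock can be sketched with a toy reader/writer lock in user space. This is only a model under stated assumptions: the model_* names are hypothetical, and the try-style helpers return false where the real rwsem would block.

```c
#include <stdbool.h>

/* Toy lock state; a hypothetical stand-in for the hugetlb vma lock. */
struct model_rwlock {
	unsigned int readers;
	bool writer;
};

static bool model_try_read(struct model_rwlock *l)
{
	if (l->writer)
		return false;
	l->readers++;
	return true;
}

static void model_read_unlock(struct model_rwlock *l)
{
	l->readers--;
}

/* A writer needs exclusive access: no readers at all, the caller included. */
static bool model_try_write(struct model_rwlock *l)
{
	if (l->writer || l->readers)
		return false;
	l->writer = true;
	return true;
}

/* Mirrors the call chain: the walker holds read, the hole handler wants write. */
static bool model_hole_wp_can_proceed(struct model_rwlock *l)
{
	bool ok;

	if (!model_try_read(l))		/* walk_hugetlb_range() */
		return false;
	ok = model_try_write(l);	/* hugetlb_change_protection() */
	if (ok)
		l->writer = false;
	model_read_unlock(l);
	return ok;
}
```

The write attempt can never succeed while the same thread still holds its read reference, which is where the real code blocks forever.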
The populated path avoids this: pagemap_scan_hugetlb_entry() write-protects the entry inline under the page-table lock and never enters hugetlb_change_protection(). Do the same for holes. Fault in the page table and install the uffd-wp marker directly with make_uffd_wp_huge_pte() under the page-table lock, rather than routing through uffd_wp_range(). That is the same sequence hugetlb_change_protection() runs for an unpopulated entry, minus the vma write lock -- which is safe to skip because PMD sharing is disabled on uffd-wp VMAs (hugetlb_unshare_all_pmds() runs at registration), leaving nothing for that lock to serialise against. Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optional= ly clear info about PTEs") Cc: stable@vger.kernel.org Reported-by: Sashiko AI review Signed-off-by: Kiryl Shutsemau Assisted-by: Claude:claude-opus-4-8 --- fs/proc/task_mmu.c | 59 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 58 insertions(+), 1 deletion(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 1489c67e88f7..06fb94a965ff 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -2977,8 +2977,62 @@ static int pagemap_scan_hugetlb_entry(pte_t *ptep, u= nsigned long hmask, =20 return ret; } + +/* + * Write-protect the unpopulated hugetlb entries covering [addr, end) by + * installing uffd-wp markers inline, exactly as pagemap_scan_hugetlb_entr= y() + * does for populated entries. + * + * walk_hugetlb_range() currently calls ->pte_hole() once per huge page, s= o the + * loop normally runs a single iteration; it is written to cover the full = range + * in case the walker ever coalesces adjacent holes. + * + * The obvious route -- uffd_wp_range() -> hugetlb_change_protection() -- + * cannot be used here: it takes hugetlb_vma_lock_write(), but the page-ta= ble + * walker (walk_hugetlb_range()) already holds hugetlb_vma_lock_read() on = the + * same VMA, so the scanning thread would deadlock against itself. 
PMD sha= ring + * is disabled on uffd-wp VMAs (hugetlb_unshare_all_pmds() at registration= ), so + * the vma lock guards nothing that matters for these entries anyway. + */ +static int pagemap_scan_hugetlb_hole_wp(struct vm_area_struct *vma, + unsigned long addr, unsigned long end) +{ + struct hstate *h =3D hstate_vma(vma); + unsigned long psize =3D huge_page_size(h); + struct mm_struct *mm =3D vma->vm_mm; + spinlock_t *ptl; + pte_t *ptep; + pte_t pte; + + for (addr =3D ALIGN_DOWN(addr, psize); addr < end; addr +=3D psize) { + ptep =3D huge_pte_alloc(mm, vma, addr, psize); + if (!ptep) + return -ENOMEM; + + i_mmap_lock_write(vma->vm_file->f_mapping); + ptl =3D huge_pte_lock(h, mm, ptep); + pte =3D huge_ptep_get(mm, addr, ptep); + make_uffd_wp_huge_pte(vma, addr, ptep, pte); + /* + * A none entry has no cached translation, so installing the + * marker needs no TLB flush. Flush only if a fault populated + * the entry between huge_pte_alloc() and the page table lock. + */ + if (!huge_pte_none(pte)) + flush_hugetlb_tlb_range(vma, addr, addr + psize); + spin_unlock(ptl); + i_mmap_unlock_write(vma->vm_file->f_mapping); + } + + return 0; +} #else #define pagemap_scan_hugetlb_entry NULL +static int pagemap_scan_hugetlb_hole_wp(struct vm_area_struct *vma, + unsigned long addr, unsigned long end) +{ + return 0; +} #endif =20 static int pagemap_scan_pte_hole(unsigned long addr, unsigned long end, @@ -2998,7 +3052,10 @@ static int pagemap_scan_pte_hole(unsigned long addr,= unsigned long end, if (~p->arg.flags & PM_SCAN_WP_MATCHING) return ret; =20 - err =3D uffd_wp_range(vma, addr, end - addr, true); + if (is_vm_hugetlb_page(vma)) + err =3D pagemap_scan_hugetlb_hole_wp(vma, addr, end); + else + err =3D uffd_wp_range(vma, addr, end - addr, true); if (err < 0) ret =3D err; =20 --=20 2.54.0 From nobody Mon Jun 8 10:56:42 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBA6833B6CB for ; Fri, 29 May 2026 17:24:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075455; cv=none; b=q0+fkHZM03niDevoVY5s960THKTpOd4a1TNRe66XPm+3lLdbHYim2qGlaqJ/iD5Hopgv3Yd5y1CcME2i5jEQBxOfZ8lZU7xYeSZdcdZVwS2/MAQ/dzlbv6Cfx2TeioITuvDCRXt6/5QtpVk6QV/6agnh4RyLMz2oHrXjTiNL9sI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075455; c=relaxed/simple; bh=lRg8zaWuIQB7a0E/S0QVAKxbOsaItBa8/W7RVnYmB3E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=g7nIo2jXxOGtkZJjFk1Lv3ZRM0RLUXSOIo3UyjpfQphLCE2UxE8FPBCGzIc4wVVUGjiKr+XsVc2lAY+XZmZFhhmPRWrnvOO++hpm36Pb+QbTmYkJNFUv9hHF9JGHLyOnWQS5RKOHicr5HyxGzloltyWiemf0yGU0nISO5i7dzYQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YTtJGEh7; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YTtJGEh7" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 308531F00898; Fri, 29 May 2026 17:24:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780075454; bh=fbbpli+hXkiBIhpDsuI5TYzprk7eDfizGvmgjVMK96o=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=YTtJGEh7Mm/KsO0ad+cS2miLHGEUbr39u5u2xM9Y2h6nuK4ndN/FjmfSHlvrQqKqy DYekQYcuQZpKYD9Ml3nEoHVWaVJxiCL3/7FC2HsSvdCL0+h5jKZs3U4VixQUlD0xC+ Nn8UIwuIFl2ymSweMZP8GYYRuS2/K4OchuOOIlUxY58xBxjlZbc88XLYpQBvRxX6ni bv5ZEb0lsX7Dv8yfnRxp2PdKyLROn5vzsCTTU0oVnrlzg/jQroWFamkOu7gRq8GV+c c9PsRwf4abnjdO0fmtbKBelE6GrrKZf7u4Yodc+682Zyca5zJ/aX9K0rYDDBg8zRgJ 2nlzus/Po9QSg== Received: from phl-compute-01.internal (phl-compute-01.internal [10.202.2.41]) by 
mailfauth.phl.internal (Postfix) with ESMTP id 83AC1F4006E; Fri, 29 May 2026 13:24:13 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-01.internal (MEProxy); Fri, 29 May 2026 13:24:13 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTEZfpMr5/eRRE28w8SXJrP2dhbWJDto6huRdd/cnfVbAAhK4j6L5mXvQtvbZW1Tgn qdHjS1xVDGkdt7hr1SYRochmR7yuP2KsuqbhH6HkebbwRdIZkVYqIRv2TfGI9SaQtUiic8 G8SBlRV6/L2Xtb05bm6QmOQTKoj3b/RGprk0F6woL4YfqntUvh+0D5sL6ruauatQhdAfx4 sxg873kOfAaKAVmot+PkxhAMe2YWyy/aSlFAam2C/bHdTY9DLsPBVhceLLRJtCvIR2Q+zM kaoJA5O2aRPotfcyz8zPVtzBiOjMLe5Pw3NzV/bo4OKaUt+lobJ5eojHQXZMuqCslD6ADP cnvxXMLJ+GGO+YsdKG/VtAzip3oPKBED00TGmZboavLxb51m+v659bwaubg2JY1xQjHuTg UmCM8Q2oSZdxrvMk3bWtd5KMX1M2Pd6UArm6jSDfw81nsFHKAMQ4fbbfRC+vk9N/Vgfi7V oqaul2hByG2eXK5j2n6x3xF395psx+G0RTb5h9LdHo131yNJ4WEPG72PtkjD4woebIeM/+ o5O0e2DGUKzrXWs7x4haLKZGLWCkhawOH+sDfIB69Q8admleEF9xdNHiVEu2gjwY2cS4MD 0S3PRQnsWFPiUiwqDqufDTz1neUdguM5GjjW9YxmtF3DDBpmERQbvTJuH76A X-ME-Proxy: Feedback-ID: i10464835:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 29 May 2026 13:24:11 -0400 (EDT) From: "Kiryl Shutsemau (Meta)" To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , Zi Yan , Baolin Wang , "Liam R. 
Howlett" , Nico Pache , Ryan Roberts , Dev Jain , Barry Song , Lance Yang , Balbir Singh , Matthew Brost Subject: [PATCH 4/6] mm/huge_memory: preserve pmd_swp_uffd_wp on device-private PMD downgrade Date: Fri, 29 May 2026 18:23:28 +0100 Message-ID: <20260529172331.356655-5-kas@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529172331.356655-1-kas@kernel.org> References: <20260529172331.356655-1-kas@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" change_non_present_huge_pmd() rewrites a writable device-private PMD swap entry into a readable one without carrying pmd_swp_uffd_wp() across. The PTE-level change_softleaf_pte() does this correctly; mirror that here, matching what copy_huge_pmd() does for the fork path. Without the carry, a plain mprotect() over a UFFD_WP-marked device-private THP strips the bit and the trap is bypassed on swap-in. 
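A small user-space model of the downgrade, for illustration only. The encoding, bit positions and model_* names are assumptions made for the example and do not match any real architecture's swap PMD layout.

```c
#include <stdint.h>

/* Hypothetical encoding: bit 0 = uffd-wp, bits 1-2 = type, rest = offset. */
#define MODEL_SWP_UFFD_WP	(1ULL << 0)
#define MODEL_TYPE_MASK		(3ULL << 1)
#define MODEL_TYPE_DEV_WRITE	(1ULL << 1)
#define MODEL_TYPE_DEV_READ	(2ULL << 1)
#define MODEL_OFFSET_SHIFT	3

static uint64_t model_make_entry(uint64_t type, uint64_t offset)
{
	return type | (offset << MODEL_OFFSET_SHIFT);
}

/* Old behaviour: rebuild the readable entry from the offset alone. */
static uint64_t model_downgrade_old(uint64_t pmd)
{
	return model_make_entry(MODEL_TYPE_DEV_READ, pmd >> MODEL_OFFSET_SHIFT);
}

/* Fixed behaviour: carry the uffd-wp bit over to the new entry. */
static uint64_t model_downgrade_fixed(uint64_t pmd)
{
	uint64_t newpmd = model_downgrade_old(pmd);

	if (pmd & MODEL_SWP_UFFD_WP)
		newpmd |= MODEL_SWP_UFFD_WP;
	return newpmd;
}
```

Both variants produce a readable entry with the same offset; only the fixed one still has the uffd-wp bit when the original entry carried it.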
Fixes: 368076f52ebe ("mm/huge_memory: add device-private THP support to PMD= operations") Cc: stable@vger.kernel.org Reported-by: Sashiko AI review Signed-off-by: Kiryl Shutsemau Reviewed-by: Balbir Singh --- mm/huge_memory.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 42b86e8ab7c0..b7c895b1d366 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2663,6 +2663,8 @@ static void change_non_present_huge_pmd(struct mm_str= uct *mm, } else if (softleaf_is_device_private_write(entry)) { entry =3D make_readable_device_private_entry(swp_offset(entry)); newpmd =3D swp_entry_to_pmd(entry); + if (pmd_swp_uffd_wp(*pmd)) + newpmd =3D pmd_swp_mkuffd_wp(newpmd); } else { newpmd =3D *pmd; } --=20 2.54.0 From nobody Mon Jun 8 10:56:42 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 49694384CCD for ; Fri, 29 May 2026 17:24:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075459; cv=none; b=rVY/CEf4cA/7HwxDUE+HmZL55C68qCWbqEJX7wCOHPjEZHtDynrIDefVhLrC4TCcvhW5dtqUK3ze8+X+8LdmDUYp/049/xKfLa1MZCc4SmPDoNO4J8tnJhkX58EXKSwgD8W+7WVMp/2KonYk7vPc3utjq51sRkEVlEB4F1vuDZY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780075459; c=relaxed/simple; bh=nxvHsHswlUDcFohXoUUYbb3RCJcKckvf5pmldrrkpR8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RK7sUBI+vSRGBz5FWLwnBIzAFUp87cgpYqBgnJTrBkimD2hI8bBkfhGS/RAeUfzs5hbo76ozY+RtdS5dnCsD3zXnMBC3LtWjU6Wh+r8A84WMVmL8UY4jrQpiVwYllvkqgseJSKY4LIAL90VgkBAMzVxpHRbWj6/sTy7eo6vq32M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b=n+LQqFiO; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="n+LQqFiO" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A37341F00898; Fri, 29 May 2026 17:24:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780075458; bh=XeG9LCEJFa4u0fWY5Mnfc8ZMY6z2gJj23YToDXVIM3Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=n+LQqFiOrrE+y48YJVShBHh09EBlf/KfbOr/+DglsfsudubBwGEcOuV/86RGkBrEW PO6UUw2W6zT0zBEnQSu0Mn+8+UaYRFX9sMEaz6cdgqlrFuFBPk344udWQUkT8rSii9 jiF1+OfFFcXmtEWu15POIX+CzsIUL3BrmUf2+ifllCG3sRQ2EMMxeMU1mSwKK3SDug ueU37TNVC07ovUNAdqN5P5gGDp4EjChGLK8e4Fpm6Bdg6Z604AdB4N0sLMtSgJyyU5 yaDLEeYqkHRjH1BVc062zkNLzMQsUzzvh+rm8R/VQ/vbVLf0GQ4k+bEEtAA0t/q7+i 95rhGgBX6x0wg== Received: from phl-compute-01.internal (phl-compute-01.internal [10.202.2.41]) by mailfauth.phl.internal (Postfix) with ESMTP id 0EDD9F4006D; Fri, 29 May 2026 13:24:17 -0400 (EDT) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-01.internal (MEProxy); Fri, 29 May 2026 13:24:17 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTFrgX5MM3DUSJ7xzqsXiT9K7Mp6hFpPFfxmC0MA+iN7AjFohwvWx7oxJHZ+5oTsa/ eNprXeJ/674yJe1dHbWF7GZb1XnToVb0R4autdIDJiozLrBStYEb9fpXa/2JrApUqbrre1 LdBuyJn9i8qKcJffyvK1YIhy0raIMBiWiOnHjwA5r86k+Zhi6bSohRFuPzPhOdP14jnWJI /xiWZuCo09JkwaiWtT4Gp2J4EUMHafSQnln9uaxNuylCfDh3CVEdSw0SaHR74GjGjlbPAf 5lIf5oHRJw+wgkoNP7G5YvX9mPChAnbwGYXnMrRN1deEijcEXsq+k9G0Lzj5Q5OMaNJVHB 4OOwyJkq6mBqr043+j7zRMjiXH5kV6zXX7/bgSmKaz/4+CS2nFJJ/vmFn/GaYqZ+ipcj+9 zaCSc4zsH8zhnIMSsXMMe8+k11/OpeFmN4szkrGrgFc21nOEwGIKONCBFrCimY714mJP5y eKIrJe8HvjiBOndrTcALefXl4v4JeJuPsrnHxXfOBEoBkarzNgrbvXTpxFmHceYD/BAmlo sLsztvo3cGZYIOqO78ihV6vneu2kX8mZI5NKqYckQd3GswUUBMSNKmUqO3ygONGUdHdS59 BfN7xsE7X4zcmiX30qSzkdNfO6eHulZVz1QW+CycnZ7Z16VTgUR+37tDLTVg X-ME-Proxy: Feedback-ID: i10464835:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 
29 May 2026 13:24:16 -0400 (EDT) From: "Kiryl Shutsemau (Meta)" To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , Peter Xu , Mike Kravetz , Andrea Arcangeli , Jerome Glisse Subject: [PATCH 5/6] userfaultfd: gate must_wait writability check on pte_present() Date: Fri, 29 May 2026 18:23:29 +0100 Message-ID: <20260529172331.356655-6-kas@kernel.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260529172331.356655-1-kas@kernel.org> References: <20260529172331.356655-1-kas@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" userfaultfd_must_wait() and userfaultfd_huge_must_wait() read the PTE without taking the page table lock and then apply pte_write() / huge_pte_write() to it. Those accessors decode bits from the present encoding only; on a swap or migration entry they read the offset bits that happen to share the same position and return an undefined result. The intent of the check is "is this fault still WP-blocked?". A non-marker swap entry means the page is in transit -- the userfault context the original fault delivered against is no longer the same, and the swap-in or migration completion path will re-deliver a fresh fault if userspace still needs to handle it. Worst case under the current code the garbage write bit says "wait", and the thread stays asleep until a UFFDIO_WAKE that may never arrive. Gate the writability check on pte_present() so the lockless re-check only inspects present-PTE bits when the entry is actually present. The non-present, non-marker case returns "don't wait" and lets the fault path retry. 
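The effect of the gate can be sketched in user space as follows. The encoding and model_* names are hypothetical, and the none and marker checks that precede the writability test in the real functions are left out of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: the swap offset overlaps the write bit. */
#define MODEL_PTE_PRESENT	(1ULL << 0)
#define MODEL_PTE_WRITE		(1ULL << 1)
#define MODEL_SWP_OFFSET_SHIFT	1

/* Non-present entry: present bit clear, offset stored from bit 1 up. */
static uint64_t model_swap_entry(uint64_t offset)
{
	return offset << MODEL_SWP_OFFSET_SHIFT;
}

static bool model_pte_write(uint64_t pte)
{
	return (pte & MODEL_PTE_WRITE) != 0;
}

/* Old WP re-check: trusts the write bit whatever kind of entry it is. */
static bool model_must_wait_old(uint64_t pte)
{
	return !model_pte_write(pte);
}

/* Fixed re-check: only a present entry has a meaningful write bit. */
static bool model_must_wait_fixed(uint64_t pte)
{
	if (!(pte & MODEL_PTE_PRESENT))
		return false;
	return !model_pte_write(pte);
}
```

In this model the old check's answer for a swap entry depends on an unrelated offset bit, while the fixed check always answers "don't wait" for it and behaves as before for present entries.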
Fixes: 369cd2121be4 ("userfaultfd: hugetlbfs: userfaultfd_huge_must_wait fo= r hugepmd ranges") Fixes: 63b2d4174c4a ("userfaultfd: wp: add the writeprotect API to userfaul= tfd ioctl") Cc: stable@vger.kernel.org Reported-by: Sashiko AI review Signed-off-by: Kiryl Shutsemau Reviewed-by: Lorenzo Stoakes Reviewed-by: Mike Rapoport (Microsoft) --- mm/userfaultfd.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 35b206cc9aa6..f6d2a1c67019 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -2535,6 +2535,15 @@ static inline bool userfaultfd_huge_must_wait(struct= userfaultfd_ctx *ctx, /* UFFD PTE markers require userspace to resolve the fault. */ if (pte_is_uffd_marker(pte)) return true; + /* + * Concurrent migration may have replaced the present PTE with a + * non-marker swap entry between fault delivery and this lockless + * re-check. huge_pte_write() on a swap entry decodes random offset + * bits, so gate it on pte_present(). The migration completion path + * will re-deliver the fault if it still needs userspace. + */ + if (!pte_present(pte)) + return false; /* * If VMA has UFFD WP faults enabled and WP fault, wait for userspace to * resolve the fault. @@ -2621,6 +2630,17 @@ static inline bool userfaultfd_must_wait(struct user= faultfd_ctx *ctx, /* UFFD PTE markers require userspace to resolve the fault. */ if (pte_is_uffd_marker(ptent)) goto out; + /* + * Concurrent swap-out / migration may have replaced the present PTE + * with a non-marker swap entry between fault delivery and this + * lockless re-check. pte_write() on a swap entry decodes random + * offset bits, so gate it on pte_present(). The page-in path will + * re-deliver the fault if it still needs userspace. + */ + if (!pte_present(ptent)) { + ret =3D false; + goto out; + } /* * If VMA has UFFD WP faults enabled and WP fault, wait for userspace to * resolve the fault. 
-- 
2.54.0

From nobody Mon Jun 8 10:56:42 2026
From: "Kiryl Shutsemau (Meta)"
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lorenzo Stoakes , Mike Rapoport , David Hildenbrand , "Kiryl Shutsemau (Meta)" , stable@vger.kernel.org, Sashiko AI review , "Liam R.
Howlett" , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Peter Xu , Pedro Falcato , Alice Ryhl
Subject: [PATCH 6/6] userfaultfd: build __VMA_UFFD_FLAGS from config-gated masks
Date: Fri, 29 May 2026 18:23:30 +0100
Message-ID: <20260529172331.356655-7-kas@kernel.org>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260529172331.356655-1-kas@kernel.org>
References: <20260529172331.356655-1-kas@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The VMA flags bitmap is a single word today: NUM_VMA_FLAG_BITS is
BITS_PER_LONG, so on 32-bit vma_flags_t holds only 32 bits. (The bitmap
type exists so this can grow past BITS_PER_LONG later; until it does,
anything declared above the first word is out of range on 32-bit.) The
bit enum nevertheless declares some bits unconditionally above
BITS_PER_LONG -- VMA_UFFD_MINOR_BIT is 41, with VM_UFFD_MINOR ==
VM_NONE on 32-bit so no VMA actually carries the bit.

__VMA_UFFD_FLAGS feeds VMA_UFFD_MINOR_BIT to mk_vma_flags()
unconditionally. On 32-bit that becomes __set_bit(41, &one_long), a
write one word past the end of the single-word bitmap. The compiler
folds the out-of-bounds store with wraparound (1UL << (41 % 32) == bit
9) into the first word; bit 9 is already in __VMA_UFFD_FLAGS so the
mask happens to come out right today, but it is an out-of-bounds write
all the same, and any high-numbered bit whose mod-BITS_PER_LONG
position is otherwise unused would silently OR an extra bit into the
mask.

Rather than feed bit numbers that may not exist on the current build to
mk_vma_flags(), build the mask from whole per-mode masks that collapse
to EMPTY_VMA_FLAGS when their feature is unavailable.
Add mk_vma_flags_from_masks() for that, and define VMA_UFFD_MISSING /
_WP / _MINOR alongside the VM_UFFD_* flags, gating VMA_UFFD_MINOR on
the same config as VM_UFFD_MINOR (which implies 64BIT, where bit 41
fits). An out-of-range bit is then never materialised, on any arch, and
the in-range fast path stays a compile-time constant.

Fixes: 9ea35a25d51b ("mm: introduce VMA flags bitmap type")
Cc: stable@vger.kernel.org
Reported-by: Sashiko AI review
Suggested-by: Lorenzo Stoakes
Signed-off-by: Kiryl Shutsemau
Assisted-by: Claude:claude-opus-4-8
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Mike Rapoport (Microsoft)
---
 include/linux/mm.h            | 39 +++++++++++++++++++++++++++++++++++
 include/linux/userfaultfd_k.h |  4 ++--
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0f2612a70fb1..485df9c2dbdd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -496,6 +496,21 @@ enum {
 #else
 #define VM_UFFD_MINOR VM_NONE
 #endif
+
+/*
+ * vma_flags_t masks for the userfaultfd VMA flags. VMA_UFFD_MINOR is gated on
+ * the same config as VM_UFFD_MINOR -- which implies 64BIT, where the bit fits
+ * -- so an out-of-range bit is never fed to mk_vma_flags() on a build whose
+ * bitmap cannot hold it.
+ */
+#define VMA_UFFD_MISSING mk_vma_flags(VMA_UFFD_MISSING_BIT)
+#define VMA_UFFD_WP mk_vma_flags(VMA_UFFD_WP_BIT)
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
+#define VMA_UFFD_MINOR mk_vma_flags(VMA_UFFD_MINOR_BIT)
+#else
+#define VMA_UFFD_MINOR EMPTY_VMA_FLAGS
+#endif
+
 #ifdef CONFIG_64BIT
 #define VM_ALLOW_ANY_UNCACHED INIT_VM_FLAG(ALLOW_ANY_UNCACHED)
 #define VM_SEALED INIT_VM_FLAG(SEALED)
@@ -1238,6 +1253,30 @@ static __always_inline void vma_flags_set_mask(vma_flags_t *flags,
 #define vma_flags_set(flags, ...) \
 	vma_flags_set_mask(flags, mk_vma_flags(__VA_ARGS__))
 
+static __always_inline vma_flags_t __mk_vma_flags_from_masks(size_t count,
+		const vma_flags_t *masks)
+{
+	vma_flags_t flags = EMPTY_VMA_FLAGS;
+	size_t i;
+
+	for (i = 0; i < count; i++)
+		vma_flags_set_mask(&flags, masks[i]);
+	return flags;
+}
+
+/*
+ * Combine pre-computed vma_flags_t masks into one value, e.g.:
+ *
+ *	vma_flags_t flags = mk_vma_flags_from_masks(VMA_UFFD_WP, VMA_UFFD_MINOR);
+ *
+ * Unlike mk_vma_flags(), which takes bit numbers, this takes whole masks --
+ * each of which may be EMPTY_VMA_FLAGS when its feature is unavailable -- so a
+ * bit that does not exist on the current build is never materialised.
+ */
+#define mk_vma_flags_from_masks(...) \
+	__mk_vma_flags_from_masks(COUNT_ARGS(__VA_ARGS__), \
+			(const vma_flags_t []){__VA_ARGS__})
+
 /* Clear all of the to-clear flags in flags, non-atomically. */
 static __always_inline void vma_flags_clear_mask(vma_flags_t *flags,
 		vma_flags_t to_clear)
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 3ec8e1071673..68edac4dcd78 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -23,8 +23,8 @@
 /* The set of all possible UFFD-related VM flags. */
 #define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR)
 
-#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT, \
-		VMA_UFFD_MINOR_BIT)
+#define __VMA_UFFD_FLAGS mk_vma_flags_from_masks(VMA_UFFD_MISSING, VMA_UFFD_WP, \
+		VMA_UFFD_MINOR)
 
 /*
  * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
-- 
2.54.0
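The arithmetic in the 6/6 commit message can be checked with a userspace sketch of a single 32-bit flags word. Everything named toy_* / TOY_* is hypothetical and is not the kernel API. Bit numbers 9 and 12 for the MISSING and WP flags are assumptions chosen so that bit 9 is already in the mask, as the message states; 41 is the VMA_UFFD_MINOR_BIT value from the message. The wraparound is modelled with an explicit modulo rather than by performing an out-of-bounds store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_WORD    32u
#define TOY_UFFD_MISSING_BIT 9u		/* assumed bit number */
#define TOY_UFFD_WP_BIT      12u	/* assumed bit number */
#define TOY_UFFD_MINOR_BIT   41u	/* value quoted in the commit message */
#define TOY_HAVE_UFFD_MINOR  0		/* models a 32-bit build without MINOR */

typedef uint32_t toy_flags_t;		/* single-word flags bitmap */
#define TOY_EMPTY_FLAGS ((toy_flags_t)0)

/* Where an out-of-range bit lands if the store wraps modulo the word size. */
static inline toy_flags_t toy_wrapped_bit(unsigned int bit)
{
	return (toy_flags_t)1 << (bit % TOY_BITS_PER_WORD);
}

/* Build a mask from a bit number that must fit in the word. */
static inline toy_flags_t toy_mask_from_bit(unsigned int bit)
{
	assert(bit < TOY_BITS_PER_WORD);
	return (toy_flags_t)1 << bit;
}

/* Per-mode masks; the MINOR mask collapses to empty when unavailable. */
#define TOY_MASK_MISSING toy_mask_from_bit(TOY_UFFD_MISSING_BIT)
#define TOY_MASK_WP      toy_mask_from_bit(TOY_UFFD_WP_BIT)
#if TOY_HAVE_UFFD_MINOR
#define TOY_MASK_MINOR   toy_mask_from_bit(TOY_UFFD_MINOR_BIT)
#else
#define TOY_MASK_MINOR   TOY_EMPTY_FLAGS
#endif

/* OR together `count` pre-computed masks. */
static inline toy_flags_t toy_flags_from_masks_n(size_t count,
						 const toy_flags_t *masks)
{
	toy_flags_t flags = TOY_EMPTY_FLAGS;

	for (size_t i = 0; i < count; i++)
		flags |= masks[i];
	return flags;
}

/* Variadic front end; the count comes from the compound literal's size. */
#define toy_flags_from_masks(...) \
	toy_flags_from_masks_n( \
		sizeof((const toy_flags_t[]){__VA_ARGS__}) / sizeof(toy_flags_t), \
		(const toy_flags_t[]){__VA_ARGS__})

/* Counterpart of __VMA_UFFD_FLAGS in this model. */
static inline toy_flags_t toy_uffd_flags(void)
{
	return toy_flags_from_masks(TOY_MASK_MISSING, TOY_MASK_WP,
				    TOY_MASK_MINOR);
}
```

Here toy_wrapped_bit(41) lands on bit 9, which the mask already contains, so the wrap is invisible. A hypothetical bit 45 would land on bit 13, which toy_uffd_flags() does not contain, so the same wrap would add a stray bit. Building from masks avoids both cases because the out-of-range bit number is never turned into a mask on this configuration.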