hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() → hugetlb_vma_assert_locked()
because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix this by skipping VMAs without shareable locks, using a goto so that
anything acquired by the trylock is still released on the skip path.
Reported-by: syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
Changes in v2:
- Use goto to unlock after trylock, avoiding lock leaks (Andrew Morton)
- Add comment explaining why non-shareable VMAs are skipped (Andrew Morton)
---
fs/hugetlbfs/inode.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9e0625167517..9fa7c72ac1a6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -488,6 +488,14 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		if (!hugetlb_vma_trylock_write(vma))
 			continue;
+		/*
+		 * Skip VMAs without shareable locks. Per the design in commit
+		 * 40549ba8f8e0, these will be handled by remove_inode_hugepages()
+		 * called after this function with proper locking.
+		 */
+		if (!__vma_shareable_lock(vma))
+			goto skip;
+
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
@@ -498,7 +506,8 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		 * vmas. Therefore, lock is not held when calling
 		 * unmap_hugepage_range for private vmas.
 		 */
-		hugetlb_vma_unlock_write(vma);
+skip:
+		hugetlb_vma_unlock_write(vma);
 	}
 }
--
2.43.0
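
For readers following along, the failure mode described in the commit
message can be modelled in user space. This is only a sketch under stated
assumptions: struct model_vma, its shareable flag and the model_*()
helpers are hypothetical stand-ins that simplify the description above,
not the kernel's lock implementation. It shows a trylock that reports
success without holding anything for a VMA that has no lock, so a later
"is it locked?" check fails.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for a VMA; not the kernel's struct. */
struct model_vma {
	bool shareable;	/* assumption: true if the VMA has its own lock */
	bool locked;	/* state of that lock, meaningful only if shareable */
};

/*
 * Models the behaviour described in the commit message: returns 1 both
 * when the lock was taken and when there is no lock to take, and
 * returns 0 on contention.
 */
static int model_trylock_write(struct model_vma *vma)
{
	if (!vma->shareable)
		return 1;	/* "success", but nothing is held */
	if (vma->locked)
		return 0;	/* contended: the caller should skip this VMA */
	vma->locked = true;
	return 1;
}

/* Releases the lock if there is one; a no-op otherwise. */
static void model_unlock_write(struct model_vma *vma)
{
	if (vma->shareable)
		vma->locked = false;
}

/* What a strict "assert locked" check would evaluate to. */
static bool model_is_locked(const struct model_vma *vma)
{
	return vma->shareable && vma->locked;
}
```

In this model a caller that treats a return value of 1 as "lock held"
reaches the strict check with model_is_locked() false for any VMA whose
shareable flag is clear, which is the situation the WARNING above reports.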
On Fri, Sep 26, 2025 at 09:02:54AM +0530, Deepanshu Kartikey wrote:
> hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
> operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
> use new vma_lock for pmd sharing synchronization"), if the trylock fails
> or the VMA has no lock, it should skip that VMA. Any remaining mapped
> pages are handled by remove_inode_hugepages() which is called after
> hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
> unmapping success.
For the past few days I've been seeing failures on Raspberry Pi 4 in
the hugetlb-madvise kselftest in -next which bisect to this patch.
The test reports:
# # -------------------------
# # running ./hugetlb-madvise
# # -------------------------
# # Unexpected number of free huge pages line 252
# # [FAIL]
# not ok 6 hugetlb-madvise # exit=1
Full log:
https://lava.sirena.org.uk/scheduler/job/1913276#L1803
Bisect log:
# bad: [7396732143a22b42bb97710173d598aaf50daa89] Add linux-next specific files for 20251002
# good: [9d3bc72cc0a9791bf4910ef854b2c3dd61af3bbf] Merge branch 'for-rc' of https://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl.git
# good: [d4ecae56a8c7d3287a5bcdb2d65f7102ee580ab6] clk: mediatek: Add MT8196 mcu clock support
# good: [4c134c2a5f3db29afe35b2d30e39bb6d867b08da] um: Indent time-travel help messages
# good: [bf1af4f6e62878e053d20cd71267aed8dfb3e715] perf arm-spe: Downsample all sample types equally
# good: [e414334883f4835058ca06f934bc4988eb9cd9e6] Merge branch 'next/dt' into for-next
# good: [54653bb3ec83d1f717adab6108db82a3966d19ee] clk: renesas: rzv2h: remove round_rate() in favor of determine_rate()
# good: [87a877de367d835b527d1086f75727123ef85fc4] KVM: x86: Rename handle_fastpath_set_msr_irqoff() to handle_fastpath_wrmsr()
# good: [c26675447faff8c4ddc1dc5d2cd28326b8181aaf] KVM: x86: Zero XSTATE components on INIT by iterating over supported features
# good: [6684aba0780da9f505c202f27e68ee6d18c0aa66] XArray: Add extra debugging check to xas_lock and friends
git bisect start '7396732143a22b42bb97710173d598aaf50daa89' '9d3bc72cc0a9791bf4910ef854b2c3dd61af3bbf' 'd4ecae56a8c7d3287a5bcdb2d65f7102ee580ab6' '4c134c2a5f3db29afe35b2d30e39bb6d867b08da' 'bf1af4f6e62878e053d20cd71267aed8dfb3e715' 'e414334883f4835058ca06f934bc4988eb9cd9e6' '54653bb3ec83d1f717adab6108db82a3966d19ee' '87a877de367d835b527d1086f75727123ef85fc4' 'c26675447faff8c4ddc1dc5d2cd28326b8181aaf' '6684aba0780da9f505c202f27e68ee6d18c0aa66'
# test job: [d4ecae56a8c7d3287a5bcdb2d65f7102ee580ab6] https://lava.sirena.org.uk/scheduler/job/1907306
# test job: [4c134c2a5f3db29afe35b2d30e39bb6d867b08da] https://lava.sirena.org.uk/scheduler/job/1903298
# test job: [bf1af4f6e62878e053d20cd71267aed8dfb3e715] https://lava.sirena.org.uk/scheduler/job/1900552
# test job: [e414334883f4835058ca06f934bc4988eb9cd9e6] https://lava.sirena.org.uk/scheduler/job/1904803
# test job: [54653bb3ec83d1f717adab6108db82a3966d19ee] https://lava.sirena.org.uk/scheduler/job/1900685
# test job: [87a877de367d835b527d1086f75727123ef85fc4] https://lava.sirena.org.uk/scheduler/job/1697972
# test job: [c26675447faff8c4ddc1dc5d2cd28326b8181aaf] https://lava.sirena.org.uk/scheduler/job/1698132
# test job: [6684aba0780da9f505c202f27e68ee6d18c0aa66] https://lava.sirena.org.uk/scheduler/job/1738722
# test job: [7396732143a22b42bb97710173d598aaf50daa89] https://lava.sirena.org.uk/scheduler/job/1913276
# bad: [7396732143a22b42bb97710173d598aaf50daa89] Add linux-next specific files for 20251002
git bisect bad 7396732143a22b42bb97710173d598aaf50daa89
# test job: [74fc450198cf792e3db35ea4d49197a467233373] https://lava.sirena.org.uk/scheduler/job/1913848
# bad: [74fc450198cf792e3db35ea4d49197a467233373] Merge branch 'main' of https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect bad 74fc450198cf792e3db35ea4d49197a467233373
# test job: [db484ff3fff1fafa0017cdd017795bec09ace5e4] https://lava.sirena.org.uk/scheduler/job/1913993
# bad: [db484ff3fff1fafa0017cdd017795bec09ace5e4] Merge branch 'docs-next' of git://git.lwn.net/linux.git
git bisect bad db484ff3fff1fafa0017cdd017795bec09ace5e4
# test job: [7d942c9d9660e6808dcd835c4c73ad5405cc5518] https://lava.sirena.org.uk/scheduler/job/1914055
# bad: [7d942c9d9660e6808dcd835c4c73ad5405cc5518] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git
git bisect bad 7d942c9d9660e6808dcd835c4c73ad5405cc5518
# test job: [db03d3c83bdb21667392d1596fafdfb38325c2a0] https://lava.sirena.org.uk/scheduler/job/1914176
# bad: [db03d3c83bdb21667392d1596fafdfb38325c2a0] Merge branch 'dma-mapping-for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux.git
git bisect bad db03d3c83bdb21667392d1596fafdfb38325c2a0
# test job: [84a7a9823e73fe3c0adcc4780fa7a091981048ef] https://lava.sirena.org.uk/scheduler/job/1914247
# good: [84a7a9823e73fe3c0adcc4780fa7a091981048ef] mm/shmem, swap: remove redundant error handling for replacing folio
git bisect good 84a7a9823e73fe3c0adcc4780fa7a091981048ef
# test job: [c7416f37e4d31fb28ac4ed584b13037e69a22dbe] https://lava.sirena.org.uk/scheduler/job/1914387
# bad: [c7416f37e4d31fb28ac4ed584b13037e69a22dbe] Merge branch 'mm-nonmm-stable' of https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad c7416f37e4d31fb28ac4ed584b13037e69a22dbe
# test job: [3dfd02c900379d209ac9dcac24b4a61d8478842a] https://lava.sirena.org.uk/scheduler/job/1914497
# good: [3dfd02c900379d209ac9dcac24b4a61d8478842a] hugetlb: increase number of reserving hugepages via cmdline
git bisect good 3dfd02c900379d209ac9dcac24b4a61d8478842a
# test job: [fe7a283b39160153b6d1bd7f61b0a9d5d44987a8] https://lava.sirena.org.uk/scheduler/job/1915206
# good: [fe7a283b39160153b6d1bd7f61b0a9d5d44987a8] ocfs2: add suballoc slot check in ocfs2_validate_inode_block()
git bisect good fe7a283b39160153b6d1bd7f61b0a9d5d44987a8
# test job: [74058c0a9fc8b2b4d5f4a0ef7ee2cfa66a9e49cf] https://lava.sirena.org.uk/scheduler/job/1916011
# good: [74058c0a9fc8b2b4d5f4a0ef7ee2cfa66a9e49cf] Squashfs: fix uninit-value in squashfs_get_parent
git bisect good 74058c0a9fc8b2b4d5f4a0ef7ee2cfa66a9e49cf
# test job: [9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b] https://lava.sirena.org.uk/scheduler/job/1916064
# good: [9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b] Squashfs: reject negative file sizes in squashfs_read_inode()
git bisect good 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b
# test job: [fb552b2425cf8f16c9c72229a972d1744b24d855] https://lava.sirena.org.uk/scheduler/job/1916102
# good: [fb552b2425cf8f16c9c72229a972d1744b24d855] alloc_tag: fix boot failure due to NULL pointer dereference
git bisect good fb552b2425cf8f16c9c72229a972d1744b24d855
# test job: [81e78b7ec61e89e8bab9736551839f79b063614c] https://lava.sirena.org.uk/scheduler/job/1916193
# bad: [81e78b7ec61e89e8bab9736551839f79b063614c] mm: convert folio_page() back to a macro
git bisect bad 81e78b7ec61e89e8bab9736551839f79b063614c
# test job: [1acc369373008b9eeb930fbb47847c0693055553] https://lava.sirena.org.uk/scheduler/job/1916218
# bad: [1acc369373008b9eeb930fbb47847c0693055553] mm/khugepaged: use start_addr/addr for improved readability
git bisect bad 1acc369373008b9eeb930fbb47847c0693055553
# test job: [dd83609b88986f4add37c0871c3434310652ebd5] https://lava.sirena.org.uk/scheduler/job/1916225
# bad: [dd83609b88986f4add37c0871c3434310652ebd5] hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
git bisect bad dd83609b88986f4add37c0871c3434310652ebd5
# first bad commit: [dd83609b88986f4add37c0871c3434310652ebd5] hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
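
The failing check above compares a count of free huge pages against an
expected value. As a rough illustration of how such a count can be
obtained, the sketch below parses the HugePages_Free field out of a
/proc/meminfo-style text buffer. The function name is hypothetical and
this is not the selftest's own helper; it only assumes the usual
"HugePages_Free: N" line format.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value of the "HugePages_Free:" field found in a
 * /proc/meminfo-style text buffer, or -1 if the field is absent.
 * Hypothetical helper for illustration only.
 */
static long parse_hugepages_free(const char *meminfo)
{
	static const char key[] = "HugePages_Free:";
	const char *line = meminfo;

	while (line && *line) {
		long val;

		/* Match the key at the start of a line, then read the number. */
		if (!strncmp(line, key, sizeof(key) - 1) &&
		    sscanf(line + sizeof(key) - 1, "%ld", &val) == 1)
			return val;

		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return -1;
}
```

A test in this style would sample the value before and after the
operation under test and fail if the difference is not what it expects,
which is why pages that stay mapped after a hole punch show up as an
"unexpected number of free huge pages".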
On Fri, Oct 03, 2025 at 11:57:35AM +0100, Mark Brown wrote:
> On Fri, Sep 26, 2025 at 09:02:54AM +0530, Deepanshu Kartikey wrote:
> > hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
> > operations. As per the original design in commit 40549ba8f8e0 ("hugetlb:
> > use new vma_lock for pmd sharing synchronization"), if the trylock fails
> > or the VMA has no lock, it should skip that VMA. Any remaining mapped
> > pages are handled by remove_inode_hugepages() which is called after
> > hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
> > unmapping success.
>
> For the past few days I've been seeing failures on Raspberry Pi 4 in
> the hugetlb-madvise kselftest in -next which bisect to this patch.
> The test reports:
>
> # # -------------------------
> # # running ./hugetlb-madvise
> # # -------------------------
> # # Unexpected number of free huge pages line 252
> # # [FAIL]
> # not ok 6 hugetlb-madvise # exit=1
This issue is now present in mainline:
Raspberry Pi 4: https://lava.sirena.org.uk/scheduler/job/1976561#L1798
Orion O6: https://lava.sirena.org.uk/scheduler/job/1977081#L1779
and still bisects to this patch.
On Mon, 20 Oct 2025 18:52:11 +0100 Mark Brown <broonie@kernel.org> wrote:
> > For the past few days I've been seeing failures on Raspberry Pi 4 in
> > the hugetlb-madvise kselftest in -next which bisect to this patch.
> > The test reports:
> >
> > # # -------------------------
> > # # running ./hugetlb-madvise
> > # # -------------------------
> > # # Unexpected number of free huge pages line 252
> > # # [FAIL]
> > # not ok 6 hugetlb-madvise # exit=1
>
> This issue is now present in mainline:
>
> Raspberry Pi 4: https://lava.sirena.org.uk/scheduler/job/1976561#L1798
> Orion O6: https://lava.sirena.org.uk/scheduler/job/1977081#L1779
>
> and still bisects to this patch.
Thanks. Were you able to test the proposed fix?
From: Deepanshu Kartikey <kartikey406@gmail.com>
Subject: hugetlbfs: move lock assertions after early returns in huge_pmd_unshare()
Date: Tue, 14 Oct 2025 17:03:44 +0530
When hugetlb_vmdelete_list() processes VMAs during truncate operations, it
may encounter VMAs where huge_pmd_unshare() is called without the required
shareable lock. This triggers an assertion failure in
hugetlb_vma_assert_locked().
The previous fix in commit dd83609b8898 ("hugetlbfs: skip VMAs without
shareable locks in hugetlb_vmdelete_list") skipped entire VMAs without
shareable locks to avoid the assertion. However, this prevented pages
from being unmapped and freed, causing a regression in
fallocate(PUNCH_HOLE) operations where pages were not freed immediately,
as reported by Mark Brown.
Instead of checking locks in the caller or skipping VMAs, move the lock
assertions in huge_pmd_unshare() to after the early return checks. The
assertions are only needed when actual PMD unsharing work will be
performed. If the function returns early because sz != PMD_SIZE or the
PMD is not shared, no locks are required and assertions should not fire.
This approach reverts the VMA skipping logic from commit dd83609b8898
("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list")
while moving the assertions to avoid the assertion failure, keeping all
the logic within huge_pmd_unshare() itself and allowing page unmapping and
freeing to proceed for all VMAs.
Link: https://lkml.kernel.org/r/20251014113344.21194-1-kartikey406@gmail.com
Fixes: dd83609b8898 ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reported-by: <syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com>
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Oscar Salvador <osalvador@suse.de>
Tested-by: <syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/hugetlbfs/inode.c | 9 ---------
mm/hugetlb.c | 5 ++---
2 files changed, 2 insertions(+), 12 deletions(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-move-lock-assertions-after-early-returns-in-huge_pmd_unshare
+++ a/fs/hugetlbfs/inode.c
@@ -478,14 +478,6 @@ hugetlb_vmdelete_list(struct rb_root_cac
 		if (!hugetlb_vma_trylock_write(vma))
 			continue;
-		/*
-		 * Skip VMAs without shareable locks. Per the design in commit
-		 * 40549ba8f8e0, these will be handled by remove_inode_hugepages()
-		 * called after this function with proper locking.
-		 */
-		if (!__vma_shareable_lock(vma))
-			goto skip;
-
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
@@ -496,7 +488,6 @@ hugetlb_vmdelete_list(struct rb_root_cac
 		 * vmas. Therefore, lock is not held when calling
 		 * unmap_hugepage_range for private vmas.
 		 */
-skip:
 		hugetlb_vma_unlock_write(vma);
 	}
 }
--- a/mm/hugetlb.c~hugetlbfs-move-lock-assertions-after-early-returns-in-huge_pmd_unshare
+++ a/mm/hugetlb.c
@@ -7614,13 +7614,12 @@ int huge_pmd_unshare(struct mm_struct *m
 	p4d_t *p4d = p4d_offset(pgd, addr);
 	pud_t *pud = pud_offset(p4d, addr);
-	i_mmap_assert_write_locked(vma->vm_file->f_mapping);
-	hugetlb_vma_assert_locked(vma);
 	if (sz != PMD_SIZE)
 		return 0;
 	if (!ptdesc_pmd_is_shared(virt_to_ptdesc(ptep)))
 		return 0;
-
+	i_mmap_assert_write_locked(vma->vm_file->f_mapping);
+	hugetlb_vma_assert_locked(vma);
 	pud_clear(pud);
 	/*
 	 * Once our caller drops the rmap lock, some other process might be
_
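
The reordering in the mm/hugetlb.c hunk can be illustrated with a small
user-space model. This is a sketch under stated assumptions, not the
kernel function: model_pmd_unshare(), MODEL_PMD_SIZE and the boolean
parameters are hypothetical, and a plain assert() stands in for the
lock assertions. It shows the lock requirement being checked only on
the path that would actually unshare a page table.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PMD_SIZE (2UL << 20)	/* assumption: 2 MiB, for illustration */

/*
 * Hypothetical model of the reordered checks. Returns 1 if the shared
 * page table would be unshared, 0 if the function returns early.
 * The lock requirement is asserted only on the path that does work.
 */
static int model_pmd_unshare(unsigned long sz, bool pmd_shared, bool lock_held)
{
	if (sz != MODEL_PMD_SIZE)
		return 0;	/* early return: no lock needed */
	if (!pmd_shared)
		return 0;	/* early return: no lock needed */

	assert(lock_held);	/* only now must the caller hold the lock */
	return 1;
}
```

With the assertion placed before the two early returns instead, the
first two calls in a sequence such as (4096, true, false) and
(MODEL_PMD_SIZE, false, false) would abort even though no unsharing
takes place, which mirrors the spurious warning this patch avoids.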
On Tue, Oct 21, 2025 at 02:10:47PM -0700, Andrew Morton wrote:
> On Mon, 20 Oct 2025 18:52:11 +0100 Mark Brown <broonie@kernel.org> wrote:
> > This issue is now present in mainline:
> >
> > Raspberry Pi 4: https://lava.sirena.org.uk/scheduler/job/1976561#L1798
> > Orion O6: https://lava.sirena.org.uk/scheduler/job/1977081#L1779
> >
> > and still bisects to this patch.
> Thanks. Were you able to test the proposed fix?
I didn't, there were a lot of new versions in quite a short period.