From: Roman Gushchin
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Andrew Morton, Roman Gushchin, Jann Horn,
	Peter Zijlstra, Will Deacon, "Aneesh Kumar K.V", Nick Piggin,
	Hugh Dickins, linux-arch@vger.kernel.org
Subject: [PATCH v2] mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
Date: Wed, 22 Jan 2025 23:27:16 +0000
Message-ID: <20250122232716.1321171-1-roman.gushchin@linux.dev>

Commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas") added
a forced TLB flush to tlb_end_vma(), which is required to avoid a race
between munmap() and unmap_mapping_range(). However, it also added
overhead to other paths where tlb_end_vma() is used but vmas are not
removed, e.g. madvise(MADV_DONTNEED).

Fix this by moving the TLB flush out of tlb_end_vma() and into
free_pgtables(), somewhat similar to the stable version of the original
commit: e.g. stable commit 895428ee124a ("mm: Force TLB flush for PFNMAP
mappings before unlink_file_vma()").

Note that if tlb->fullmm is set, no flush is required, as the whole mm
is about to be destroyed.
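As an illustration (not part of the patch), the no-unmap path that used
to pay the forced flush can be exercised from userspace with a sketch
like the one below; the mapping size and iteration count are arbitrary:

	/* sketch: repeatedly zap PTEs without removing the vma */
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		const size_t len = 64 << 20;	/* 64 MiB, arbitrary */
		char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (buf == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		for (int i = 0; i < 64; i++) {
			memset(buf, 1, len);			/* populate PTEs */
			madvise(buf, len, MADV_DONTNEED);	/* zap, vma stays */
		}

		munmap(buf, len);
		return 0;
	}

Each madvise(MADV_DONTNEED) call zaps the range through the mmu_gather
machinery and reaches tlb_end_vma() without removing the vma, so it
never goes through free_pgtables(), where the forced flush now lives.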
v2:
 - moved vma_pfn flag handling into tlb.h (by Peter Z.)
 - added comments (by Peter Z.)
 - fixed the vma_pfn flag setting (by Hugh D.)

Suggested-by: Jann Horn
Signed-off-by: Roman Gushchin
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: "Aneesh Kumar K.V"
Cc: Andrew Morton
Cc: Nick Piggin
Cc: Hugh Dickins
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
---
 include/asm-generic/tlb.h | 41 ++++++++++++++++++++++++++-------------
 mm/memory.c               |  7 +++++++
 2 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 709830274b75..fbe31f49a5af 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -449,7 +449,14 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
 	 */
 	tlb->vma_huge = is_vm_hugetlb_page(vma);
 	tlb->vma_exec = !!(vma->vm_flags & VM_EXEC);
-	tlb->vma_pfn  = !!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP));
+
+	/*
+	 * vma_pfn is checked and cleared by tlb_flush_mmu_pfnmap()
+	 * for a set of vmas, so it should be set if at least one vma
+	 * has the VM_PFNMAP or VM_MIXEDMAP flag set.
+	 */
+	if (vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))
+		tlb->vma_pfn = 1;
 }
 
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
@@ -466,6 +473,22 @@ static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 	__tlb_reset_range(tlb);
 }
 
+static inline void tlb_flush_mmu_pfnmap(struct mmu_gather *tlb)
+{
+	/*
+	 * VM_PFNMAP and VM_MIXEDMAP maps are fragile because the core mm
+	 * doesn't track the page mapcount -- there might not be page-frames
+	 * for these PFNs after all. Force flush TLBs for such ranges to avoid
+	 * munmap() vs unmap_mapping_range() races.
+	 * Ensure we have no stale TLB entries by the time this mapping is
+	 * removed from the rmap.
+	 */
+	if (unlikely(!tlb->fullmm && tlb->vma_pfn)) {
+		tlb_flush_mmu_tlbonly(tlb);
+		tlb->vma_pfn = 0;
+	}
+}
+
 static inline void tlb_remove_page_size(struct mmu_gather *tlb,
 					struct page *page, int page_size)
 {
@@ -549,22 +572,14 @@ static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *
 
 static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
-	if (tlb->fullmm)
+	if (tlb->fullmm || IS_ENABLED(CONFIG_MMU_GATHER_MERGE_VMAS))
 		return;
 
 	/*
-	 * VM_PFNMAP is more fragile because the core mm will not track the
-	 * page mapcount -- there might not be page-frames for these PFNs after
-	 * all. Force flush TLBs for such ranges to avoid munmap() vs
-	 * unmap_mapping_range() races.
+	 * Do a TLB flush and reset the range at VMA boundaries; this avoids
+	 * the ranges growing with the unused space between consecutive VMAs.
 	 */
-	if (tlb->vma_pfn || !IS_ENABLED(CONFIG_MMU_GATHER_MERGE_VMAS)) {
-		/*
-		 * Do a TLB flush and reset the range at VMA boundaries; this avoids
-		 * the ranges growing with the unused space between consecutive VMAs.
-		 */
-		tlb_flush_mmu_tlbonly(tlb);
-	}
+	tlb_flush_mmu_tlbonly(tlb);
 }
 
 /*
diff --git a/mm/memory.c b/mm/memory.c
index 398c031be9ba..c2a9effb2e32 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -365,6 +365,13 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 {
 	struct unlink_vma_file_batch vb;
 
+	/*
+	 * VM_PFNMAP and VM_MIXEDMAP maps require special handling here:
+	 * force flush TLBs for such ranges to avoid munmap() vs
+	 * unmap_mapping_range() races.
+	 */
+	tlb_flush_mmu_pfnmap(tlb);
+
 	do {
 		unsigned long addr = vma->vm_start;
 		struct vm_area_struct *next;
-- 
2.48.1.262.g85cc9f2d1e-goog
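
For readers skimming the diff, the net effect is roughly the following
(simplified from the hunks above; declarations and unrelated context
omitted):

	/* after the patch, simplified */
	static inline void tlb_end_vma(struct mmu_gather *tlb,
				       struct vm_area_struct *vma)
	{
		if (tlb->fullmm || IS_ENABLED(CONFIG_MMU_GATHER_MERGE_VMAS))
			return;
		/* flush at VMA boundaries so ranges don't grow across gaps */
		tlb_flush_mmu_tlbonly(tlb);
	}

	void free_pgtables(struct mmu_gather *tlb, ...)
	{
		/*
		 * One forced flush covering all VM_PFNMAP/VM_MIXEDMAP vmas,
		 * skipped when tlb->fullmm is set or no such vma was seen.
		 */
		tlb_flush_mmu_pfnmap(tlb);
		/* ... unlink vmas from the rmap and free page tables ... */
	}

That is, the per-vma forced flush is gone from tlb_end_vma(), and the
correctness requirement (no stale TLB entries by the time the mapping
is unlinked from the rmap) is met by a single flush in free_pgtables().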