From nobody Fri Jun 19 14:43:11 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 1/6] mm: rmap: fix cache flush on THP pages
Date: Sun, 3 Apr 2022 13:39:52 +0800
Message-Id: <20220403053957.10770-2-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

flush_cache_page() only flushes a PAGE_SIZE-sized range from the cache.
However, for a THP it covers only the head page, not the whole huge page.
Replace it with flush_cache_range() to fix this issue. No problems caused
by this have been found so far, perhaps because few architectures have
virtually indexed caches.

Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song
Reviewed-by: Yang Shi
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 mm/rmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index fc46a3d7b704..723682ddb9e8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -970,7 +970,8 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
 				continue;
 
-			flush_cache_page(vma, address, folio_pfn(folio));
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			entry = pmdp_invalidate(vma, address, pmd);
 			entry = pmd_wrprotect(entry);
 			entry = pmd_mkclean(entry);
-- 
2.11.0

From nobody Fri Jun 19 14:43:11 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 2/6] dax: fix cache flush on PMD-mapped pages
Date: Sun, 3 Apr 2022 13:39:53 +0800
Message-Id: <20220403053957.10770-3-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

flush_cache_page() only flushes a PAGE_SIZE-sized range from the cache.
For a THP it covers only the head page, not the whole huge page.
Replace it with flush_cache_range() to fix this issue. This is only a
documentation issue: it concerns properly documenting the expected use
of cache flushing before modifying the pmd.
However, in practice this is not a problem, because DAX is not available
on architectures with virtually indexed caches, per:

  commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 67a08a32fccb..a372304c9695 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -845,7 +845,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
 				goto unlock_pmd;
 
-			flush_cache_page(vma, address, pfn);
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			pmd = pmdp_invalidate(vma, address, pmdp);
 			pmd = pmd_wrprotect(pmd);
 			pmd = pmd_mkclean(pmd);
-- 
2.11.0

From nobody Fri Jun 19 14:43:11 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v7 3/6] mm: rmap: introduce pfn_mkclean_range() to cleans PTEs
Date: Sun, 3 Apr 2022 13:39:54 +0800
Message-Id: <20220403053957.10770-4-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

page_mkclean_one() is supposed to be used with a pfn that has an
associated struct page, but not all pfns (e.g. DAX) have a struct page.
Introduce a new function, pfn_mkclean_range(), to clean the PTEs
(including PMDs) that map a range of pfns which have no struct page
associated with them. This helper will be used by DAX in a later patch
of this series to make pfns clean.
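
Illustration only, not part of the patch: because a pfn without a struct
page cannot be located through page->index, the helper has to find the
range in a vma from a file page offset alone. A minimal userspace sketch
of that offset-to-address arithmetic, assuming 4 KiB pages and using
hypothetical sketch_* names that stand in for the kernel types, might
look like this:

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12            /* assumption: 4 KiB pages */
#define SKETCH_EFAULT ((uint64_t)-14)   /* stand-in for the kernel's -EFAULT */

/* Simplified stand-in for the vm_area_struct fields used here. */
struct sketch_vma {
	uint64_t vm_start;  /* first user virtual address of the mapping */
	uint64_t vm_end;    /* first address past the end of the mapping */
	uint64_t vm_pgoff;  /* file offset of vm_start, in pages */
};

/*
 * Return the user virtual address at which the file range
 * [pgoff, pgoff + nr_pages) first appears in the vma, or SKETCH_EFAULT
 * if the vma maps no part of that range.
 */
static uint64_t sketch_pgoff_address(uint64_t pgoff, uint64_t nr_pages,
				     const struct sketch_vma *vma)
{
	uint64_t address;

	if (pgoff >= vma->vm_pgoff) {
		address = vma->vm_start +
			  ((pgoff - vma->vm_pgoff) << SKETCH_PAGE_SHIFT);
		/* The range starts beyond the end of the vma (or wrapped). */
		if (address < vma->vm_start || address >= vma->vm_end)
			address = SKETCH_EFAULT;
	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
		/* The range starts before the vma but reaches into it. */
		address = vma->vm_start;
	} else {
		/* The range ends before the vma begins. */
		address = SKETCH_EFAULT;
	}
	return address;
}
```

With a vma that maps only file page 1 at 0x200000, a 512-page range
starting at offset 0 resolves to 0x200000, while a single page at
offset 0 yields the error value.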

Signed-off-by: Muchun Song
---
 include/linux/rmap.h |  3 +++
 mm/internal.h        | 26 +++++++++++++--------
 mm/rmap.c            | 65 +++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b58ddb8b2220..a6ec0d3e40c1 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -263,6 +263,9 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  */
 int folio_mkclean(struct folio *);
 
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma);
+
 void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
 
 /*
diff --git a/mm/internal.h b/mm/internal.h
index f45292dc4ef5..664e6d48607c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -516,26 +516,22 @@ void mlock_page_drain(int cpu);
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 
 /*
- * At what user virtual address is page expected in vma?
- * Returns -EFAULT if all of the page is outside the range of vma.
- * If page is a compound head, the entire compound page is considered.
+ * Return the start of user virtual address at the specific offset within
+ * a vma.
  */
 static inline unsigned long
-vma_address(struct page *page, struct vm_area_struct *vma)
+vma_pgoff_address(pgoff_t pgoff, unsigned long nr_pages,
+		  struct vm_area_struct *vma)
 {
-	pgoff_t pgoff;
 	unsigned long address;
 
-	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
-	pgoff = page_to_pgoff(page);
 	if (pgoff >= vma->vm_pgoff) {
 		address = vma->vm_start +
 			((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 		/* Check for address beyond vma (or wrapped through 0?) */
 		if (address < vma->vm_start || address >= vma->vm_end)
 			address = -EFAULT;
-	} else if (PageHead(page) &&
-		   pgoff + compound_nr(page) - 1 >= vma->vm_pgoff) {
+	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
 		/* Test above avoids possibility of wrap to 0 on 32-bit */
 		address = vma->vm_start;
 	} else {
@@ -545,6 +541,18 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 }
 
 /*
+ * Return the start of user virtual address of a page within a vma.
+ * Returns -EFAULT if all of the page is outside the range of vma.
+ * If page is a compound head, the entire compound page is considered.
+ */
+static inline unsigned long
+vma_address(struct page *page, struct vm_area_struct *vma)
+{
+	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
+	return vma_pgoff_address(page_to_pgoff(page), compound_nr(page), vma);
+}
+
+/*
  * Then at what user virtual address will none of the range be found in vma?
  * Assumes that vma_address() already returned a good starting address.
  */
diff --git a/mm/rmap.c b/mm/rmap.c
index 723682ddb9e8..ad5cf0e45a73 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -929,12 +929,12 @@ int folio_referenced(struct folio *folio, int is_locked,
 	return pra.referenced;
 }
 
-static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
-			    unsigned long address, void *arg)
+static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int cleaned = 0;
+	struct vm_area_struct *vma = pvmw->vma;
 	struct mmu_notifier_range range;
-	int *cleaned = arg;
+	unsigned long address = pvmw->address;
 
 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
@@ -942,16 +942,16 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 	 */
 	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
 				0, vma, vma->vm_mm, address,
-				vma_address_end(&pvmw));
+				vma_address_end(pvmw));
 	mmu_notifier_invalidate_range_start(&range);
 
-	while (page_vma_mapped_walk(&pvmw)) {
+	while (page_vma_mapped_walk(pvmw)) {
 		int ret = 0;
 
-		address = pvmw.address;
-		if (pvmw.pte) {
+		address = pvmw->address;
+		if (pvmw->pte) {
 			pte_t entry;
-			pte_t *pte = pvmw.pte;
+			pte_t *pte = pvmw->pte;
 
 			if (!pte_dirty(*pte) && !pte_write(*pte))
 				continue;
@@ -964,7 +964,7 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			ret = 1;
 		} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-			pmd_t *pmd = pvmw.pmd;
+			pmd_t *pmd = pvmw->pmd;
 			pmd_t entry;
 
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
@@ -991,11 +991,22 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
 		if (ret)
-			(*cleaned)++;
+			cleaned++;
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
 
+	return cleaned;
+}
+
+static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
+			     unsigned long address, void *arg)
+{
+	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int *cleaned = arg;
+
+	*cleaned += page_vma_mkclean_one(&pvmw);
+
 	return true;
 }
 
@@ -1033,6 +1044,38 @@ int folio_mkclean(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_mkclean);
 
 /**
+ * pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of
+ *                     [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff)
+ *                     within the @vma of shared mappings. And since clean PTEs
+ *                     should also be readonly, write protects them too.
+ * @pfn: start pfn.
+ * @nr_pages: number of physically contiguous pages starting with @pfn.
+ * @pgoff: page offset that the @pfn mapped with.
+ * @vma: vma that @pfn mapped within.
+ *
+ * Returns the number of cleaned PTEs (including PMDs).
+ */
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma)
+{
+	struct page_vma_mapped_walk pvmw = {
+		.pfn		= pfn,
+		.nr_pages	= nr_pages,
+		.pgoff		= pgoff,
+		.vma		= vma,
+		.flags		= PVMW_SYNC,
+	};
+
+	if (invalid_mkclean_vma(vma, NULL))
+		return 0;
+
+	pvmw.address = vma_pgoff_address(pgoff, nr_pages, vma);
+	VM_BUG_ON_VMA(pvmw.address == -EFAULT, vma);
+
+	return page_vma_mkclean_one(&pvmw);
+}
+
+/**
  * page_move_anon_rmap - move a page to our anon_vma
  * @page:	the page to move to our anon_vma
  * @vma:	the vma the page belongs to
-- 
2.11.0

From nobody Fri Jun 19 14:43:11 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v7 4/6] mm: pvmw: add support for walking devmap pages
Date: Sun, 3 Apr 2022 13:39:55 +0800
Message-Id: <20220403053957.10770-5-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

page_vma_mapped_walk() cannot currently be used to check whether a huge
devmap page is mapped into a vma. Add support for walking huge devmap
pages so that DAX can use it in the next patch.

Signed-off-by: Muchun Song
---
 mm/page_vma_mapped.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 1187f9c1ec5b..3da82bf65de8 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -210,16 +210,10 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		 */
 		pmde = READ_ONCE(*pvmw->pmd);
 
-		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde) ||
+		    (pmd_present(pmde) && pmd_devmap(pmde))) {
 			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
 			pmde = *pvmw->pmd;
-			if (likely(pmd_trans_huge(pmde))) {
-				if (pvmw->flags & PVMW_MIGRATION)
-					return not_found(pvmw);
-				if (!check_pmd(pmd_pfn(pmde), pvmw))
-					return not_found(pvmw);
-				return true;
-			}
 			if (!pmd_present(pmde)) {
 				swp_entry_t entry;
 
@@ -232,6 +226,13 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					return not_found(pvmw);
 				return true;
 			}
+			if (likely(pmd_trans_huge(pmde) || pmd_devmap(pmde))) {
+				if (pvmw->flags & PVMW_MIGRATION)
+					return not_found(pvmw);
+				if (!check_pmd(pmd_pfn(pmde), pvmw))
+					return not_found(pvmw);
+				return true;
+			}
 			/* THP pmd was split under us: handle on pte level */
 			spin_unlock(pvmw->ptl);
 			pvmw->ptl = NULL;
-- 
2.11.0

From nobody Fri Jun 19 14:43:11 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 5/6] dax: fix missing writeprotect the pte entry
Date: Sun, 3 Apr 2022 13:39:56 +0800
Message-Id: <20220403053957.10770-6-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

Currently dax_entry_mkclean() fails to clean and write protect the pte
entry within a DAX PMD entry during an *sync operation. This can result
in data loss in the following sequence:

  1) process A mmap write to DAX PMD, dirtying PMD radix tree entry and
     making the pmd entry dirty and writeable.
  2) process B mmap with the @offset (e.g. 4K) and @length (e.g. 4K)
     write to the same file, dirtying PMD radix tree entry (already
     done in 1)) and making the pte entry dirty and writeable.
  3) fsync, flushing out PMD data and cleaning the radix tree entry. We
     currently fail to mark the pte entry as clean and write protected
     since the vma of process B is not covered in dax_entry_mkclean().
  4) process B writes to the pte. These writes don't cause any page
     faults since the pte entry is dirty and writeable. The radix tree
     entry remains clean.
  5) fsync, which fails to flush the dirty PMD data because the radix
     tree entry was clean.
  6) crash - dirty data that should have been fsync'd as part of 5)
     could still have been in the processor cache, and is lost.

Use pfn_mkclean_range() to clean the pfns and fix this issue.
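
Illustration only, not part of the patch: step 3 goes wrong because the
lookup covers only the first page offset of the PMD entry, so a vma that
maps a later page of the same entry is never visited. A minimal
userspace sketch of that overlap test, using a linear scan and
hypothetical sketch_* names in place of the kernel's interval tree,
might look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in: a vma described only by the file pages it maps. */
struct sketch_mapping {
	uint64_t first_pgoff;  /* first file page mapped */
	uint64_t last_pgoff;   /* last file page mapped (inclusive) */
};

/*
 * Count the mappings that overlap the file range [first, last]. This is
 * the set of vmas that a lookup over that range would visit.
 */
static size_t sketch_count_visited(const struct sketch_mapping *maps,
				   size_t n, uint64_t first, uint64_t last)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		/* Two inclusive ranges overlap if each starts before the other ends. */
		if (maps[i].first_pgoff <= last && maps[i].last_pgoff >= first)
			visited++;
	}
	return visited;
}
```

For process A mapping file pages 0 to 511 and process B mapping only
page 1, a lookup over [0, 0] visits one mapping, whereas a lookup over
[0, 511] visits both.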

Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 99 ++++++++--------------------------------------------------------
 1 file changed, 12 insertions(+), 87 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a372304c9695..1ac12e877f4f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #define CREATE_TRACE_POINTS
@@ -789,96 +790,12 @@ static void *dax_insert_entry(struct xa_state *xas,
 	return entry;
 }
 
-static inline
-unsigned long pgoff_address(pgoff_t pgoff, struct vm_area_struct *vma)
-{
-	unsigned long address;
-
-	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
-	return address;
-}
-
-/* Walk all mappings of a given index of a file and writeprotect them */
-static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
-		unsigned long pfn)
-{
-	struct vm_area_struct *vma;
-	pte_t pte, *ptep = NULL;
-	pmd_t *pmdp = NULL;
-	spinlock_t *ptl;
-
-	i_mmap_lock_read(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		struct mmu_notifier_range range;
-		unsigned long address;
-
-		cond_resched();
-
-		if (!(vma->vm_flags & VM_SHARED))
-			continue;
-
-		address = pgoff_address(index, vma);
-
-		/*
-		 * follow_invalidate_pte() will use the range to call
-		 * mmu_notifier_invalidate_range_start() on our behalf before
-		 * taking any lock.
-		 */
-		if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
-					  &pmdp, &ptl))
-			continue;
-
-		/*
-		 * No need to call mmu_notifier_invalidate_range() as we are
-		 * downgrading page table protection not changing it to point
-		 * to a new page.
-		 *
-		 * See Documentation/vm/mmu_notifier.rst
-		 */
-		if (pmdp) {
-#ifdef CONFIG_FS_DAX_PMD
-			pmd_t pmd;
-
-			if (pfn != pmd_pfn(*pmdp))
-				goto unlock_pmd;
-			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
-				goto unlock_pmd;
-
-			flush_cache_range(vma, address,
-					  address + HPAGE_PMD_SIZE);
-			pmd = pmdp_invalidate(vma, address, pmdp);
-			pmd = pmd_wrprotect(pmd);
-			pmd = pmd_mkclean(pmd);
-			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
-unlock_pmd:
-#endif
-			spin_unlock(ptl);
-		} else {
-			if (pfn != pte_pfn(*ptep))
-				goto unlock_pte;
-			if (!pte_dirty(*ptep) && !pte_write(*ptep))
-				goto unlock_pte;
-
-			flush_cache_page(vma, address, pfn);
-			pte = ptep_clear_flush(vma, address, ptep);
-			pte = pte_wrprotect(pte);
-			pte = pte_mkclean(pte);
-			set_pte_at(vma->vm_mm, address, ptep, pte);
-unlock_pte:
-			pte_unmap_unlock(ptep, ptl);
-		}
-
-		mmu_notifier_invalidate_range_end(&range);
-	}
-	i_mmap_unlock_read(mapping);
-}
-
 static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 		struct address_space *mapping, void *entry)
 {
-	unsigned long pfn, index, count;
+	unsigned long pfn, index, count, end;
 	long ret = 0;
+	struct vm_area_struct *vma;
 
 	/*
 	 * A page got tagged dirty in DAX mapping? Something is seriously
@@ -936,8 +853,16 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 	pfn = dax_to_pfn(entry);
 	count = 1UL << dax_entry_order(entry);
 	index = xas->xa_index & ~(count - 1);
+	end = index + count - 1;
+
+	/* Walk all mappings of a given index of a file and writeprotect them */
+	i_mmap_lock_read(mapping);
+	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) {
+		pfn_mkclean_range(pfn, count, index, vma);
+		cond_resched();
+	}
+	i_mmap_unlock_read(mapping);
 
-	dax_entry_mkclean(mapping, index, pfn);
 	dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE);
 	/*
 	 * After we have flushed the cache, we can clear the dirty tag. There
-- 
2.11.0

From nobody Fri Jun 19 14:43:11 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 58E93C433F5
	for ; Sun, 3 Apr 2022 05:42:20 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S242601AbiDCFoJ (ORCPT ); Sun, 3 Apr 2022 01:44:09 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59780 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S240883AbiDCFnz (ORCPT ); Sun, 3 Apr 2022 01:43:55 -0400
Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com
	[IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix)
	with ESMTPS id DDB4B381BD for ; Sat, 2 Apr 2022 22:41:53 -0700 (PDT)
Received: by mail-pf1-x42e.google.com with SMTP id h24so1783548pfo.6
	for ; Sat, 02 Apr 2022 22:41:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=bytedance-com.20210112.gappssmtp.com; s=20210112;
	h=from:to:cc:subject:date:message-id:in-reply-to:references
	:mime-version:content-transfer-encoding;
	bh=pquxx3qH2nuqmnC96VcgoZZu7sN0crimqByO4QYpxLI=;
	b=GHXKTG43kCR2umCkJhnnOQOd88LUpFg6cViEu00MVxJOQ2rWmTplbsf/IitrWYgSHd
	rZxvN/NrO90+8edNTpRLUynQfaGmARDAqr7zPpmiFS11Zrt4utAean69PdTj48CP2KO0
	C5qyBBPUy5KOw8921uFQf2iytsaH6zuZ3pyqapbrDeq88E4SSxXCc7EyvkNl5vo54R1l
	fCVL47Tl6nwHugt2GPYTiUmBUhoaThzL8bnjc4/Z2pj3JDrHgH71JOIbTI+YZfc35Oa3
	KcBQBW6V5LTYIJwGJ+N5eQgB5TKPUw42FBmGGibFyKJz3RR6hG8pbyWZyxD5D0pLvbQX
	D4HQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20210112;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references:mime-version:content-transfer-encoding;
	bh=pquxx3qH2nuqmnC96VcgoZZu7sN0crimqByO4QYpxLI=;
	b=F6XUEOZnHHZ0rNh6llw8IdpQ/1rMuGvYfHhkHriX6/tN604Yzt2tPjVVAN6/UHVEWV
	THKXAAEI8M/950zX27sc0+SMza3FlFlzyUHq7eA2e9Da5HgVzmVckn0UM/F8OF9QnaEo
	A1qtuFVaGKhRw4uGtIZp6GGDTP1+DLVJiumuNue/iDL4+r8vgCdvMy606xYW+laGVFBY
	YHxWGHfJ97am6x0ftfMR7sZKdHAOENyfgWwIqkLXkcywZD79FCEOUBV5yWwIwLe7lVd/
	9udUoxb9MXI83PGeZaRyTsw5PUhERr//eOHL1UxX1AVdYxLo1UPEupld8P0T5cKkTCyE
	rQFg==
X-Gm-Message-State: AOAM533QKvGk/R2eNQVFVHydAzFWsUBO+F3zDdT5xrXEtLCiMYfTTkve
	r7F18mm60eKk+EsPzVygadBg4A==
X-Google-Smtp-Source: ABdhPJx2ieH8dT2jPBxFH/WIhN69bGTVTj3Q7+m7G0MbpiKUgbWZgnzMYMsn0NXOrADeD3t2AWF1qA==
X-Received: by 2002:a62:e215:0:b0:4fa:87f1:dc16 with SMTP id
	a21-20020a62e215000000b004fa87f1dc16mr18414877pfi.19.1648964513366;
	Sat, 02 Apr 2022 22:41:53 -0700 (PDT)
Received: from FVFYT0MHHV2J.bytedance.net ([139.177.225.245])
	by smtp.gmail.com with ESMTPSA id
	a38-20020a056a001d2600b004f70d5e92basm8262479pfx.34.2022.04.02.22.41.46
	(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
	Sat, 02 Apr 2022 22:41:53 -0700 (PDT)
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
	viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
	shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
	xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
	zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	duanxiongchun@bytedance.com, smuchun@gmail.com,
	Muchun Song , Christoph Hellwig
Subject: [PATCH v7 6/6] mm: simplify follow_invalidate_pte()
Date: Sun, 3 Apr 2022 13:39:57 +0800
Message-Id: <20220403053957.10770-7-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The only user (DAX) of range and pmdpp parameters of
follow_invalidate_pte() is gone, so it is safe to remove them and make it
static to simplify the code. This is a revert of the following commits:

  097963959594 ("mm: add follow_pte_pmd()")
  a4d1a8852513 ("dax: update to new mmu_notifier semantic")

follow_invalidate_pte() now has only one caller, so fold it into
follow_pte() and remove it.

Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 include/linux/mm.h |  3 --
 mm/memory.c        | 81 ++++++++++++++++--------------------------------------
 2 files changed, 23 insertions(+), 61 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9bada4096ac..be7ec4c37ebe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1871,9 +1871,6 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
 	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index cc6968dc8e4e..84f7250e6cd1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4964,9 +4964,29 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp)
+/**
+ * follow_pte - look up PTE at a user virtual address
+ * @mm: the mm_struct of the target address space
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
+ * should be taken for read.
+ *
+ * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
+ * it is not a good general-purpose API.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+	       pte_t **ptepp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4989,35 +5009,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	pmd = pmd_offset(pud, address);
 	VM_BUG_ON(pmd_trans_huge(*pmd));
 
-	if (pmd_huge(*pmd)) {
-		if (!pmdpp)
-			goto out;
-
-		if (range) {
-			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
-						NULL, mm, address & PMD_MASK,
-						(address & PMD_MASK) + PMD_SIZE);
-			mmu_notifier_invalidate_range_start(range);
-		}
-		*ptlp = pmd_lock(mm, pmd);
-		if (pmd_huge(*pmd)) {
-			*pmdpp = pmd;
-			return 0;
-		}
-		spin_unlock(*ptlp);
-		if (range)
-			mmu_notifier_invalidate_range_end(range);
-	}
-
 	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;
 
-	if (range) {
-		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
-					address & PAGE_MASK,
-					(address & PAGE_MASK) + PAGE_SIZE);
-		mmu_notifier_invalidate_range_start(range);
-	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
 	if (!pte_present(*ptep))
 		goto unlock;
@@ -5025,38 +5019,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	return 0;
 unlock:
 	pte_unmap_unlock(ptep, *ptlp);
-	if (range)
-		mmu_notifier_invalidate_range_end(range);
 out:
 	return -EINVAL;
 }
-
-/**
- * follow_pte - look up PTE at a user virtual address
- * @mm: the mm_struct of the target address space
- * @address: user virtual address
- * @ptepp: location to store found PTE
- * @ptlp: location to store the lock for the PTE
- *
- * On a successful return, the pointer to the PTE is stored in @ptepp;
- * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
- *
- * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
- * should be taken for read.
- *
- * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
- * it is not a good general-purpose API.
- *
- * Return: zero on success, -ve otherwise.
- */
-int follow_pte(struct mm_struct *mm, unsigned long address,
-	       pte_t **ptepp, spinlock_t **ptlp)
-{
-	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
-}
 EXPORT_SYMBOL_GPL(follow_pte);
 
 /**
-- 
2.11.0