From nobody Sun Jun 21 10:14:44 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v6 1/6] mm: rmap: fix cache flush on THP pages
Date: Tue, 29 Mar 2022 21:48:48 +0800
Message-Id: <20220329134853.68403-2-songmuchun@bytedance.com>
In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com>
References: <20220329134853.68403-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

flush_cache_page() only flushes a PAGE_SIZE-sized range from the cache.
However, in a THP that range covers only the head page, not the remaining
pages. Replace it with flush_cache_range() to fix this issue. No problems
have been reported because of this so far, perhaps because few
architectures have virtually indexed caches.

Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song
Reviewed-by: Yang Shi
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 mm/rmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index fc46a3d7b704..723682ddb9e8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -970,7 +970,8 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
 				continue;
 
-			flush_cache_page(vma, address, folio_pfn(folio));
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			entry = pmdp_invalidate(vma, address, pmd);
 			entry = pmd_wrprotect(entry);
 			entry = pmd_mkclean(entry);
-- 
2.11.0

From nobody Sun Jun 21 10:14:44 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v6 2/6] dax: fix cache flush on PMD-mapped pages
Date: Tue, 29 Mar 2022 21:48:49 +0800
Message-Id: <20220329134853.68403-3-songmuchun@bytedance.com>
In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com>
References: <20220329134853.68403-1-songmuchun@bytedance.com>

flush_cache_page() only flushes a PAGE_SIZE-sized range from the cache.
For a PMD-mapped page that range covers only the head page, not the
remaining pages. Replace it with flush_cache_range() to fix this issue.
This is just a documentation issue with respect to properly documenting
the expected usage of cache flushing before modifying the pmd.
In practice, though, this is not a problem, because DAX is not available
on architectures with virtually indexed caches, per:

  commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 67a08a32fccb..a372304c9695 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -845,7 +845,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
 				goto unlock_pmd;
 
-			flush_cache_page(vma, address, pfn);
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			pmd = pmdp_invalidate(vma, address, pmdp);
 			pmd = pmd_wrprotect(pmd);
 			pmd = pmd_mkclean(pmd);
-- 
2.11.0

From nobody Sun Jun 21 10:14:44 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v6 3/6] mm: rmap: introduce pfn_mkclean_range() to cleans PTEs
Date: Tue, 29 Mar 2022 21:48:50 +0800
Message-Id: <20220329134853.68403-4-songmuchun@bytedance.com>
In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com>
References: <20220329134853.68403-1-songmuchun@bytedance.com>

page_mkclean_one() is supposed to be used with a pfn that has an
associated struct page, but not all pfns (e.g. DAX) have a struct page.
Introduce a new function, pfn_mkclean_range(), to clean the PTEs
(including PMDs) that map a range of pfns which have no struct page
associated with them. This helper will be used by DAX in a later patch
of this series to make pfns clean.
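As a rough illustration of the address computation that pfn_mkclean_range()
relies on, the following is a user-space sketch of the vma_pgoff_address()
logic from the mm/internal.h hunk of this patch. It is not kernel code:
struct vma_model, MODEL_PAGE_SHIFT (4K pages assumed) and MODEL_EFAULT are
stand-ins invented for the example, and the body of the final else branch is
not visible in the hunk, so it is assumed here to return -EFAULT.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT and EFAULT. */
#define MODEL_PAGE_SHIFT 12UL
#define MODEL_EFAULT 14UL

/* Hypothetical, minimal model of the vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;	/* first user virtual address of the vma */
	unsigned long vm_end;	/* first address past the end of the vma */
	unsigned long vm_pgoff;	/* file page offset mapped at vm_start */
};

/*
 * Model of vma_pgoff_address(): return the user virtual address at which
 * the file range [pgoff, pgoff + nr_pages) starts inside the vma, or
 * -MODEL_EFAULT if no part of the range is mapped by the vma.
 */
static unsigned long vma_pgoff_address_model(unsigned long pgoff,
					     unsigned long nr_pages,
					     const struct vma_model *vma)
{
	unsigned long address;

	assert(vma->vm_start < vma->vm_end);

	if (pgoff >= vma->vm_pgoff) {
		address = vma->vm_start +
			((pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
		/* Range starts beyond the vma (or the sum wrapped). */
		if (address < vma->vm_start || address >= vma->vm_end)
			address = -MODEL_EFAULT;
	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
		/* Range starts before the vma but reaches into it. */
		address = vma->vm_start;
	} else {
		/* Assumed: range ends before the vma starts. */
		address = -MODEL_EFAULT;
	}
	return address;
}
```

For a vma with vm_pgoff = 1, a 512-page range starting at pgoff 0 resolves
to vm_start, while a single page at pgoff 0 yields the error value. The
first case is the one the old struct-page based vma_address() only handled
for compound head pages.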
Signed-off-by: Muchun Song
---
 include/linux/rmap.h |  3 +++
 mm/internal.h        | 26 +++++++++++++--------
 mm/rmap.c            | 65 +++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b58ddb8b2220..a6ec0d3e40c1 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -263,6 +263,9 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  */
 int folio_mkclean(struct folio *);
 
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma);
+
 void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
 
 /*
diff --git a/mm/internal.h b/mm/internal.h
index f45292dc4ef5..ff873944749f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -516,26 +516,22 @@ void mlock_page_drain(int cpu);
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 
 /*
- * At what user virtual address is page expected in vma?
- * Returns -EFAULT if all of the page is outside the range of vma.
- * If page is a compound head, the entire compound page is considered.
+ * * Return the start of user virtual address at the specific offset within
+ * a vma.
  */
 static inline unsigned long
-vma_address(struct page *page, struct vm_area_struct *vma)
+vma_pgoff_address(pgoff_t pgoff, unsigned long nr_pages,
+		  struct vm_area_struct *vma)
 {
-	pgoff_t pgoff;
 	unsigned long address;
 
-	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
-	pgoff = page_to_pgoff(page);
 	if (pgoff >= vma->vm_pgoff) {
 		address = vma->vm_start +
 			((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 		/* Check for address beyond vma (or wrapped through 0?) */
 		if (address < vma->vm_start || address >= vma->vm_end)
 			address = -EFAULT;
-	} else if (PageHead(page) &&
-		   pgoff + compound_nr(page) - 1 >= vma->vm_pgoff) {
+	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
 		/* Test above avoids possibility of wrap to 0 on 32-bit */
 		address = vma->vm_start;
 	} else {
@@ -545,6 +541,18 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 }
 
 /*
+ * Return the start of user virtual address of a page within a vma.
+ * Returns -EFAULT if all of the page is outside the range of vma.
+ * If page is a compound head, the entire compound page is considered.
+ */
+static inline unsigned long
+vma_address(struct page *page, struct vm_area_struct *vma)
+{
+	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
+	return vma_pgoff_address(page_to_pgoff(page), compound_nr(page), vma);
+}
+
+/*
  * Then at what user virtual address will none of the range be found in vma?
  * Assumes that vma_address() already returned a good starting address.
  */
diff --git a/mm/rmap.c b/mm/rmap.c
index 723682ddb9e8..ad5cf0e45a73 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -929,12 +929,12 @@ int folio_referenced(struct folio *folio, int is_locked,
 	return pra.referenced;
 }
 
-static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
-			    unsigned long address, void *arg)
+static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int cleaned = 0;
+	struct vm_area_struct *vma = pvmw->vma;
 	struct mmu_notifier_range range;
-	int *cleaned = arg;
+	unsigned long address = pvmw->address;
 
 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
@@ -942,16 +942,16 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 	 */
 	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
 				0, vma, vma->vm_mm, address,
-				vma_address_end(&pvmw));
+				vma_address_end(pvmw));
 	mmu_notifier_invalidate_range_start(&range);
 
-	while (page_vma_mapped_walk(&pvmw)) {
+	while (page_vma_mapped_walk(pvmw)) {
 		int ret = 0;
 
-		address = pvmw.address;
-		if (pvmw.pte) {
+		address = pvmw->address;
+		if (pvmw->pte) {
 			pte_t entry;
-			pte_t *pte = pvmw.pte;
+			pte_t *pte = pvmw->pte;
 
 			if (!pte_dirty(*pte) && !pte_write(*pte))
 				continue;
@@ -964,7 +964,7 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			ret = 1;
 		} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-			pmd_t *pmd = pvmw.pmd;
+			pmd_t *pmd = pvmw->pmd;
 			pmd_t entry;
 
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
@@ -991,11 +991,22 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
 		if (ret)
-			(*cleaned)++;
+			cleaned++;
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
 
+	return cleaned;
+}
+
+static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
+			     unsigned long address, void *arg)
+{
+	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int *cleaned = arg;
+
+	*cleaned += page_vma_mkclean_one(&pvmw);
+
 	return true;
 }
 
@@ -1033,6 +1044,38 @@ int folio_mkclean(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_mkclean);
 
 /**
+ * pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of
+ *                     [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff)
+ *                     within the @vma of shared mappings. And since clean PTEs
+ *                     should also be readonly, write protects them too.
+ * @pfn: start pfn.
+ * @nr_pages: number of physically contiguous pages srarting with @pfn.
+ * @pgoff: page offset that the @pfn mapped with.
+ * @vma: vma that @pfn mapped within.
+ *
+ * Returns the number of cleaned PTEs (including PMDs).
+ */
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma)
+{
+	struct page_vma_mapped_walk pvmw = {
+		.pfn		= pfn,
+		.nr_pages	= nr_pages,
+		.pgoff		= pgoff,
+		.vma		= vma,
+		.flags		= PVMW_SYNC,
+	};
+
+	if (invalid_mkclean_vma(vma, NULL))
+		return 0;
+
+	pvmw.address = vma_pgoff_address(pgoff, nr_pages, vma);
+	VM_BUG_ON_VMA(pvmw.address == -EFAULT, vma);
+
+	return page_vma_mkclean_one(&pvmw);
+}
+
+/**
  * page_move_anon_rmap - move a page to our anon_vma
  * @page:	the page to move to our anon_vma
  * @vma:	the vma the page belongs to
-- 
2.11.0

From nobody Sun Jun 21 10:14:44 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v6 4/6] mm: pvmw: add support for walking devmap pages
Date: Tue, 29 Mar 2022 21:48:51 +0800
Message-Id: <20220329134853.68403-5-songmuchun@bytedance.com>
In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com>
References: <20220329134853.68403-1-songmuchun@bytedance.com>

page_vma_mapped_walk() cannot currently be used to check whether a huge
devmap page is mapped into a vma. Add support for walking huge devmap
pages so that DAX can use it in the next patch.

Signed-off-by: Muchun Song
---
 mm/page_vma_mapped.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 1187f9c1ec5b..b3bf802a6435 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -210,16 +210,9 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		 */
 		pmde = READ_ONCE(*pvmw->pmd);
 
-		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+		if (pmd_leaf(pmde) || is_pmd_migration_entry(pmde)) {
 			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
 			pmde = *pvmw->pmd;
-			if (likely(pmd_trans_huge(pmde))) {
-				if (pvmw->flags & PVMW_MIGRATION)
-					return not_found(pvmw);
-				if (!check_pmd(pmd_pfn(pmde), pvmw))
-					return not_found(pvmw);
-				return true;
-			}
 			if (!pmd_present(pmde)) {
 				swp_entry_t entry;
 
@@ -232,6 +225,13 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					return not_found(pvmw);
 				return true;
 			}
+			if (likely(pmd_trans_huge(pmde) || pmd_devmap(pmde))) {
+				if (pvmw->flags & PVMW_MIGRATION)
+					return not_found(pvmw);
+				if (!check_pmd(pmd_pfn(pmde), pvmw))
+					return not_found(pvmw);
+				return true;
+			}
 			/* THP pmd was split under us: handle on pte level */
 			spin_unlock(pvmw->ptl);
 			pvmw->ptl = NULL;
-- 
2.11.0

From nobody Sun Jun 21 10:14:44 2026
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v6 5/6] dax: fix missing writeprotect the pte entry
Date: Tue, 29 Mar 2022 21:48:52 +0800
Message-Id: <20220329134853.68403-6-songmuchun@bytedance.com>
In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com>
References: <20220329134853.68403-1-songmuchun@bytedance.com>

Currently dax_entry_mkclean() fails to clean and write protect the pte
entry within a DAX PMD entry during an *sync operation. This can result
in data loss in the following sequence:

  1) process A mmap write to DAX PMD, dirtying PMD radix tree entry and
     making the pmd entry dirty and writeable.
  2) process B mmap with the @offset (e.g. 4K) and @length (e.g. 4K)
     write to the same file, dirtying PMD radix tree entry (already
     done in 1)) and making the pte entry dirty and writeable.
  3) fsync, flushing out PMD data and cleaning the radix tree entry. We
     currently fail to mark the pte entry as clean and write protected
     since the vma of process B is not covered in dax_entry_mkclean().
  4) process B writes to the pte. These writes don't cause any page
     faults since the pte entry is dirty and writeable. The radix tree
     entry remains clean.
  5) fsync, which fails to flush the dirty PMD data because the radix tree
     entry was clean.
  6) crash - dirty data that should have been fsync'd as part of 5) could
     still have been in the processor cache, and is lost.

Use pfn_mkclean_range() to clean the pfns to fix this issue.
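To make step 3) concrete, the following user-space sketch models why the
old single-index lookup skips process B's vma while a lookup over the whole
entry finds it. It is only a model under stated assumptions: struct
vma_range_model, lookup_hits() and writeback_range() are hypothetical names,
lookup_hits() imitates the overlap test that vma_interval_tree_foreach()
applies to file page offsets, and an order-9 entry (a 2M PMD with 4K pages)
is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the file page offsets covered by one vma. */
struct vma_range_model {
	unsigned long first_pgoff;
	unsigned long last_pgoff;
};

/*
 * Model of the overlap test behind vma_interval_tree_foreach(): a vma is
 * visited when its pgoff range intersects the closed range [first, last].
 */
static bool lookup_hits(const struct vma_range_model *vma,
			unsigned long first, unsigned long last)
{
	assert(first <= last);
	return vma->first_pgoff <= last && first <= vma->last_pgoff;
}

/*
 * Mirror of the index/end arithmetic in dax_writeback_one(): align the
 * xarray index down to the entry size and compute the last covered pgoff.
 */
static void writeback_range(unsigned long xa_index, unsigned int order,
			    unsigned long *index, unsigned long *end)
{
	unsigned long count = 1UL << order;

	*index = xa_index & ~(count - 1);
	*end = *index + count - 1;
}
```

With process B mapping only pgoff 1, writeback of the PMD entry yields
index = 0 and end = 511. A lookup over [index, index] misses B's vma, which
is the case the old code got wrong, whereas a lookup over [index, end]
visits it.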
Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 99 ++++++++--------------------------------------------------------------
 1 file changed, 12 insertions(+), 87 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a372304c9695..1ac12e877f4f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #define CREATE_TRACE_POINTS
@@ -789,96 +790,12 @@ static void *dax_insert_entry(struct xa_state *xas,
 	return entry;
 }
 
-static inline
-unsigned long pgoff_address(pgoff_t pgoff, struct vm_area_struct *vma)
-{
-	unsigned long address;
-
-	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
-	return address;
-}
-
-/* Walk all mappings of a given index of a file and writeprotect them */
-static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
-		unsigned long pfn)
-{
-	struct vm_area_struct *vma;
-	pte_t pte, *ptep = NULL;
-	pmd_t *pmdp = NULL;
-	spinlock_t *ptl;
-
-	i_mmap_lock_read(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		struct mmu_notifier_range range;
-		unsigned long address;
-
-		cond_resched();
-
-		if (!(vma->vm_flags & VM_SHARED))
-			continue;
-
-		address = pgoff_address(index, vma);
-
-		/*
-		 * follow_invalidate_pte() will use the range to call
-		 * mmu_notifier_invalidate_range_start() on our behalf before
-		 * taking any lock.
-		 */
-		if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
-					  &pmdp, &ptl))
-			continue;
-
-		/*
-		 * No need to call mmu_notifier_invalidate_range() as we are
-		 * downgrading page table protection not changing it to point
-		 * to a new page.
-		 *
-		 * See Documentation/vm/mmu_notifier.rst
-		 */
-		if (pmdp) {
-#ifdef CONFIG_FS_DAX_PMD
-			pmd_t pmd;
-
-			if (pfn != pmd_pfn(*pmdp))
-				goto unlock_pmd;
-			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
-				goto unlock_pmd;
-
-			flush_cache_range(vma, address,
-					  address + HPAGE_PMD_SIZE);
-			pmd = pmdp_invalidate(vma, address, pmdp);
-			pmd = pmd_wrprotect(pmd);
-			pmd = pmd_mkclean(pmd);
-			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
-unlock_pmd:
-#endif
-			spin_unlock(ptl);
-		} else {
-			if (pfn != pte_pfn(*ptep))
-				goto unlock_pte;
-			if (!pte_dirty(*ptep) && !pte_write(*ptep))
-				goto unlock_pte;
-
-			flush_cache_page(vma, address, pfn);
-			pte = ptep_clear_flush(vma, address, ptep);
-			pte = pte_wrprotect(pte);
-			pte = pte_mkclean(pte);
-			set_pte_at(vma->vm_mm, address, ptep, pte);
-unlock_pte:
-			pte_unmap_unlock(ptep, ptl);
-		}
-
-		mmu_notifier_invalidate_range_end(&range);
-	}
-	i_mmap_unlock_read(mapping);
-}
-
 static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 		struct address_space *mapping, void *entry)
 {
-	unsigned long pfn, index, count;
+	unsigned long pfn, index, count, end;
 	long ret = 0;
+	struct vm_area_struct *vma;
 
 	/*
 	 * A page got tagged dirty in DAX mapping? Something is seriously
@@ -936,8 +853,16 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 	pfn = dax_to_pfn(entry);
 	count = 1UL << dax_entry_order(entry);
 	index = xas->xa_index & ~(count - 1);
+	end = index + count - 1;
+
+	/* Walk all mappings of a given index of a file and writeprotect them */
+	i_mmap_lock_read(mapping);
+	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) {
+		pfn_mkclean_range(pfn, count, index, vma);
+		cond_resched();
+	}
+	i_mmap_unlock_read(mapping);
 
-	dax_entry_mkclean(mapping, index, pfn);
 	dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE);
 	/*
 	 * After we have flushed the cache, we can clear the dirty tag.
There --=20 2.11.0 From nobody Sun Jun 21 10:14:44 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66C06C433EF for ; Tue, 29 Mar 2022 13:50:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237588AbiC2NwJ (ORCPT ); Tue, 29 Mar 2022 09:52:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237570AbiC2NwB (ORCPT ); Tue, 29 Mar 2022 09:52:01 -0400 Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com [IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE081186F9C for ; Tue, 29 Mar 2022 06:50:17 -0700 (PDT) Received: by mail-pf1-x42e.google.com with SMTP id h19so14967023pfv.1 for ; Tue, 29 Mar 2022 06:50:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pquxx3qH2nuqmnC96VcgoZZu7sN0crimqByO4QYpxLI=; b=oQLhz53anwt4ILImkOxJ+rt/T/W68dglu9K98xYhD8rw+vwfthUTW0dlUF8vK73ffS Kv3IyJwBAManxnSc7qMd6HVrXGOcrmlEd9PUFM1/o4FKfPU7W/CIfqGtKeBwLXPu1Q4t rxxXG/rtme18YD+1HWkiEXrBu84RD8iXJNM00tyT9mm1bqKI5JGbw4yzwyfnFxAVrXbd Yk9O3W0irr+zwnhCLF3Xcvf/NWQURV1ee5Et+3KdaXBWcOjGFYjeMP2EgatcjC/6D/uG tQoLY1hSo0kNxBNJcz1UQn+rraUqyPegGACymxj/UrrDw/JhKQZPhtl4M52HOW6NNfsf RFKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pquxx3qH2nuqmnC96VcgoZZu7sN0crimqByO4QYpxLI=; b=kT+/nLO3Jbq85sNX3CqQwE7Y88W73I70LfcfgCbHnIebtq/JsVI6NHfQRNB87FXg6t 
yNHzst2nP9dWJ+BQlON7qF+SGAg/higghCsir9YCQGlV2sNI+vHpFgowuYS8sAF4inXt TVhqwq4W+Jopj5tMld+8jCmN8pS/tQEfIJGnK2IxiLIJrzUwo3FkqlvuNlr1gQiylhn6 4A2biSp7ezvRaKFWbDJ+S4pYYhl7A/C0kfWWfrO3BKbe+okr31Yv7eDVqgCIyg6rHGI7 hd7lWqz+/iHblQ+VU00KtxzLC48prRaUSp/bucyEhXsxowGcKlfM85vFAXCl8eb2kwUI t7UQ== X-Gm-Message-State: AOAM531YWhRsjQHeQaMiWcW8N+h+KgGojiNgawurwVFKbWuuy64VFfIc RFXcUcfs1UpLHdxVOuR/a0db7Q== X-Google-Smtp-Source: ABdhPJyTEtUT0mAGxDz5bcPBQWC+tSTA07XI+vJ4sxeO4APu0XKS8tsBFbZC6aGY6wMBrL/C65Wt8Q== X-Received: by 2002:a63:f418:0:b0:382:b4f6:5f95 with SMTP id g24-20020a63f418000000b00382b4f65f95mr2084659pgi.619.1648561817153; Tue, 29 Mar 2022 06:50:17 -0700 (PDT) Received: from FVFYT0MHHV2J.bytedance.net ([139.177.225.239]) by smtp.gmail.com with ESMTPSA id o14-20020a056a0015ce00b004fab49cd65csm20911293pfu.205.2022.03.29.06.50.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Mar 2022 06:50:16 -0700 (PDT) From: Muchun Song To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz, viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com, shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com, xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com, zwisler@kernel.org, hch@infradead.org Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song , Christoph Hellwig Subject: [PATCH v6 6/6] mm: simplify follow_invalidate_pte() Date: Tue, 29 Mar 2022 21:48:53 +0800 Message-Id: <20220329134853.68403-7-songmuchun@bytedance.com> X-Mailer: git-send-email 2.32.0 (Apple Git-132) In-Reply-To: <20220329134853.68403-1-songmuchun@bytedance.com> References: <20220329134853.68403-1-songmuchun@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The only user (DAX) of range and pmdpp parameters of 
follow_invalidate_pte() is gone, so it is safe to remove them and make it
static to simplify the code. This is a revert of the following commits:

  097963959594 ("mm: add follow_pte_pmd()")
  a4d1a8852513 ("dax: update to new mmu_notifier semantic")

There is only one caller of follow_invalidate_pte(), so just fold it into
follow_pte() and remove it.

Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 include/linux/mm.h |  3 --
 mm/memory.c        | 81 ++++++++++++++++--------------------------------------
 2 files changed, 23 insertions(+), 61 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9bada4096ac..be7ec4c37ebe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1871,9 +1871,6 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
 	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index cc6968dc8e4e..84f7250e6cd1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4964,9 +4964,29 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp)
+/**
+ * follow_pte - look up PTE at a user virtual address
+ * @mm: the mm_struct of the target address space
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
+ * should be taken for read.
+ *
+ * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
+ * it is not a good general-purpose API.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+	       pte_t **ptepp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4989,35 +5009,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	pmd = pmd_offset(pud, address);
 	VM_BUG_ON(pmd_trans_huge(*pmd));
 
-	if (pmd_huge(*pmd)) {
-		if (!pmdpp)
-			goto out;
-
-		if (range) {
-			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
-						NULL, mm, address & PMD_MASK,
-						(address & PMD_MASK) + PMD_SIZE);
-			mmu_notifier_invalidate_range_start(range);
-		}
-		*ptlp = pmd_lock(mm, pmd);
-		if (pmd_huge(*pmd)) {
-			*pmdpp = pmd;
-			return 0;
-		}
-		spin_unlock(*ptlp);
-		if (range)
-			mmu_notifier_invalidate_range_end(range);
-	}
-
 	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;
 
-	if (range) {
-		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
-					address & PAGE_MASK,
-					(address & PAGE_MASK) + PAGE_SIZE);
-		mmu_notifier_invalidate_range_start(range);
-	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
 	if (!pte_present(*ptep))
 		goto unlock;
@@ -5025,38 +5019,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	return 0;
 unlock:
 	pte_unmap_unlock(ptep, *ptlp);
-	if (range)
-		mmu_notifier_invalidate_range_end(range);
 out:
 	return -EINVAL;
 }
-
-/**
- * follow_pte - look up PTE at a user virtual address
- * @mm: the mm_struct of the target address space
- * @address: user virtual address
- * @ptepp: location to store found PTE
- * @ptlp: location to store the lock for the PTE
- *
- * On a successful return, the pointer to the PTE is stored in @ptepp;
- * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
- *
- * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
- * should be taken for read.
- *
- * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
- * it is not a good general-purpose API.
- *
- * Return: zero on success, -ve otherwise.
- */
-int follow_pte(struct mm_struct *mm, unsigned long address,
-	       pte_t **ptepp, spinlock_t **ptlp)
-{
-	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
-}
 EXPORT_SYMBOL_GPL(follow_pte);
 
 /**
-- 
2.11.0