From: Miaohe Lin
Subject: [PATCH v2 1/4] mm/migration: reduce the rcu lock duration
Date: Mon, 25 Apr 2022 21:27:20 +0800
Message-ID: <20220425132723.34824-2-linmiaohe@huawei.com>
In-Reply-To: <20220425132723.34824-1-linmiaohe@huawei.com>
List-ID: linux-kernel@vger.kernel.org

rcu_read_lock is required for grabbing the task refcount, but it is not
needed for ptrace_may_access. So we can release the rcu lock as soon as
the task refcount has been grabbed successfully, reducing the rcu hold
time.

Reviewed-by: Muchun Song
Reviewed-by: Christoph Hellwig
Signed-off-by: Miaohe Lin
Cc: Huang Ying
Cc: David Howells
Cc: Christoph Lameter
---
 mm/migrate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index b2678279eb43..b779646665fe 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1902,17 +1902,16 @@ static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
 		return ERR_PTR(-ESRCH);
 	}
 	get_task_struct(task);
+	rcu_read_unlock();
 
 	/*
 	 * Check if this process has the right to modify the specified
 	 * process. Use the regular "ptrace_may_access()" checks.
 	 */
 	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
-		rcu_read_unlock();
 		mm = ERR_PTR(-EPERM);
 		goto out;
 	}
-	rcu_read_unlock();
 
 	mm = ERR_PTR(security_task_movememory(task));
 	if (IS_ERR(mm))
-- 
2.23.0
From: Miaohe Lin
Subject: [PATCH v2 2/4] mm/migration: remove unneeded lock page and PageMovable check
Date: Mon, 25 Apr 2022 21:27:21 +0800
Message-ID: <20220425132723.34824-3-linmiaohe@huawei.com>
In-Reply-To: <20220425132723.34824-1-linmiaohe@huawei.com>

When a non-lru movable page is freed from under us, __ClearPageMovable
must already have been done. Even if it has not been, calling
ClearPageIsolated here does no harm, as the page is about to be freed
anyway. So we can remove the unneeded lock page and PageMovable check
here.

Signed-off-by: Miaohe Lin
Reviewed-by: Christoph Hellwig
---
 mm/migrate.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index b779646665fe..0fc4651b3e39 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1093,12 +1093,8 @@ static int unmap_and_move(new_page_t get_new_page,
 		/* page was freed from under us. So we are done. */
 		ClearPageActive(page);
 		ClearPageUnevictable(page);
-		if (unlikely(__PageMovable(page))) {
-			lock_page(page);
-			if (!PageMovable(page))
-				ClearPageIsolated(page);
-			unlock_page(page);
-		}
+		if (unlikely(__PageMovable(page)))
+			ClearPageIsolated(page);
 		goto out;
 	}
 
-- 
2.23.0

From: Miaohe Lin
Subject: [PATCH v2 3/4] mm/migration: return errno when isolate_huge_page failed
Date: Mon, 25 Apr 2022 21:27:22 +0800
Message-ID: <20220425132723.34824-4-linmiaohe@huawei.com>
In-Reply-To: <20220425132723.34824-1-linmiaohe@huawei.com>
We might fail to isolate a huge page, e.g. because the page is under
migration and HPageMigratable has been cleared. In that case we should
return -EBUSY rather than always returning 1, which could confuse the
user. Also make the prototype of isolate_huge_page consistent with
isolate_lru_page to improve readability.

Fixes: e8db67eb0ded ("mm: migrate: move_pages() supports thp migration")
Suggested-by: Huang Ying
Signed-off-by: Miaohe Lin
---
 include/linux/hugetlb.h |  6 +++---
 mm/gup.c                |  2 +-
 mm/hugetlb.c            | 11 +++++------
 mm/memory-failure.c     |  2 +-
 mm/mempolicy.c          |  2 +-
 mm/migrate.c            |  5 +++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 04f0186b089b..306d6ef3fa22 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -170,7 +170,7 @@ bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
 						vm_flags_t vm_flags);
 long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 						long freed);
-bool isolate_huge_page(struct page *page, struct list_head *list);
+int isolate_huge_page(struct page *page, struct list_head *list);
 int get_hwpoison_huge_page(struct page *page, bool *hugetlb);
 int get_huge_page_for_hwpoison(unsigned long pfn, int flags);
 void putback_active_hugepage(struct page *page);
@@ -376,9 +376,9 @@ static inline pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
 	return NULL;
 }
 
-static inline bool isolate_huge_page(struct page *page, struct list_head *list)
+static inline int isolate_huge_page(struct page *page, struct list_head *list)
 {
-	return false;
+	return -EBUSY;
 }
 
 static inline int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
diff --git a/mm/gup.c b/mm/gup.c
index 5c17d4816441..c15d41636e8e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1869,7 +1869,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 		 * Try to move out any movable page before pinning the range.
 		 */
 		if (folio_test_hugetlb(folio)) {
-			if (!isolate_huge_page(&folio->page,
+			if (isolate_huge_page(&folio->page,
 						&movable_page_list))
 				isolation_error_count++;
 			continue;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 74c9964c1b11..098f81e8550d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2766,8 +2766,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * Fail with -EBUSY if not possible.
 		 */
 		spin_unlock_irq(&hugetlb_lock);
-		if (!isolate_huge_page(old_page, list))
-			ret = -EBUSY;
+		ret = isolate_huge_page(old_page, list);
 		spin_lock_irq(&hugetlb_lock);
 		goto free_new;
 	} else if (!HPageFreed(old_page)) {
@@ -2843,7 +2842,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	if (hstate_is_gigantic(h))
 		return -ENOMEM;
 
-	if (page_count(head) && isolate_huge_page(head, list))
+	if (page_count(head) && !isolate_huge_page(head, list))
 		ret = 0;
 	else if (!page_count(head))
 		ret = alloc_and_dissolve_huge_page(h, head, list);
@@ -6940,15 +6939,15 @@ follow_huge_pgd(struct mm_struct *mm, unsigned long address, pgd_t *pgd, int flags)
 	return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
 }
 
-bool isolate_huge_page(struct page *page, struct list_head *list)
+int isolate_huge_page(struct page *page, struct list_head *list)
 {
-	bool ret = true;
+	int ret = 0;
 
 	spin_lock_irq(&hugetlb_lock);
 	if (!PageHeadHuge(page) ||
 	    !HPageMigratable(page) ||
 	    !get_page_unless_zero(page)) {
-		ret = false;
+		ret = -EBUSY;
 		goto unlock;
 	}
 	ClearHPageMigratable(page);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1d117190c350..a83d32bbc567 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2203,7 +2203,7 @@ static bool isolate_page(struct page *page, struct list_head *pagelist)
 	bool lru = PageLRU(page);
 
 	if (PageHuge(page)) {
-		isolated = isolate_huge_page(page, pagelist);
+		isolated = !isolate_huge_page(page, pagelist);
 	} else {
 		if (lru)
 			isolated = !isolate_lru_page(page);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e4f125e48cc4..a4467c4e9f8d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -602,7 +602,7 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
 	if (flags & (MPOL_MF_MOVE_ALL) ||
 	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
-		if (!isolate_huge_page(page, qp->pagelist) &&
+		if (isolate_huge_page(page, qp->pagelist) &&
 		    (flags & MPOL_MF_STRICT))
 			/*
 			 * Failed to isolate page but allow migrating pages
diff --git a/mm/migrate.c b/mm/migrate.c
index 0fc4651b3e39..c937a496239b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1628,8 +1628,9 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
 
 	if (PageHuge(page)) {
 		if (PageHead(page)) {
-			isolate_huge_page(page, pagelist);
-			err = 1;
+			err = isolate_huge_page(page, pagelist);
+			if (!err)
+				err = 1;
 		}
 	} else {
 		struct page *head;
-- 
2.23.0
From: Miaohe Lin
Subject: [PATCH v2 4/4] mm/migration: fix potential pte_unmap on an not mapped pte
Date: Mon, 25 Apr 2022 21:27:23 +0800
Message-ID: <20220425132723.34824-5-linmiaohe@huawei.com>
In-Reply-To: <20220425132723.34824-1-linmiaohe@huawei.com>

__migration_entry_wait and migration_entry_wait_on_locked assume the pte
is always mapped by the caller. But this is not the case when they are
called from migration_entry_wait_huge and follow_huge_pmd. Add a
hugetlbfs variant that calls hugetlb_migration_entry_wait(ptep == NULL)
to fix this issue.

Fixes: 30dad30922cc ("mm: migration: add migrate_entry_wait_huge()")
Suggested-by: David Hildenbrand
Signed-off-by: Miaohe Lin
Reviewed-by: David Hildenbrand
---
 include/linux/swapops.h | 12 ++++++++----
 mm/hugetlb.c            |  4 ++--
 mm/migrate.c            | 23 +++++++++++++++++++----
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 30cded849ee4..862e5a2053b1 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -244,8 +244,10 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 					spinlock_t *ptl);
 extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					unsigned long address);
-extern void migration_entry_wait_huge(struct vm_area_struct *vma,
-		struct mm_struct *mm, pte_t *pte);
+#ifdef CONFIG_HUGETLB_PAGE
+extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
+extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
+#endif
 #else
 static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 {
@@ -271,8 +273,10 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 					spinlock_t *ptl) { }
 static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					unsigned long address) { }
-static inline void migration_entry_wait_huge(struct vm_area_struct *vma,
-		struct mm_struct *mm, pte_t *pte) { }
+#ifdef CONFIG_HUGETLB_PAGE
+static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
+static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
+#endif
 static inline int is_writable_migration_entry(swp_entry_t entry)
 {
 	return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 098f81e8550d..994361ec75e0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5689,7 +5689,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 */
 	entry = huge_ptep_get(ptep);
 	if (unlikely(is_hugetlb_entry_migration(entry))) {
-		migration_entry_wait_huge(vma, mm, ptep);
+		migration_entry_wait_huge(vma, ptep);
 		return 0;
 	} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 		return VM_FAULT_HWPOISON_LARGE |
@@ -6907,7 +6907,7 @@ follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 	} else {
 		if (is_hugetlb_entry_migration(pte)) {
 			spin_unlock(ptl);
-			__migration_entry_wait(mm, (pte_t *)pmd, ptl);
+			__migration_entry_wait_huge((pte_t *)pmd, ptl);
 			goto retry;
 		}
 		/*
diff --git a/mm/migrate.c b/mm/migrate.c
index c937a496239b..7b31d0b06977 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -315,13 +315,28 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 	__migration_entry_wait(mm, ptep, ptl);
 }
 
-void migration_entry_wait_huge(struct vm_area_struct *vma,
-		struct mm_struct *mm, pte_t *pte)
+#ifdef CONFIG_HUGETLB_PAGE
+void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
 {
-	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), mm, pte);
-	__migration_entry_wait(mm, pte, ptl);
+	pte_t pte;
+
+	spin_lock(ptl);
+	pte = huge_ptep_get(ptep);
+
+	if (unlikely(!is_hugetlb_entry_migration(pte)))
+		spin_unlock(ptl);
+	else
+		migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
 }
 
+void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
+{
+	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
+
+	__migration_entry_wait_huge(pte, ptl);
+}
+#endif
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
 {
-- 
2.23.0