From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 3C234207DF1 for ; Mon, 16 Dec 2024 16:51:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367908; cv=none; b=Fppn6JXTopZ88TUkrz28KpgMK6TTccCY7xAJj7F66HPtF5bD/uOQM6pPb3bx1+rOFp9fMNFngxDoeYdyqwAi5pNCJ3tFf9m4aBXEnxdUWpF/WftKuLdqk5/kZsEsicto9sEuT9HfPXvQ3CVJO+IUyTJnzxZO3IERNtEVD19IQQY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367908; c=relaxed/simple; bh=4+oEou68DnPE82MGvfHbNQ7lqyWYnBbY4LDya451PpY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ZkUzn3+7zvTwcME/dtaHmRGcG3Eg9YYg6TBe3vc4tWNoQfI6srWxqoXzK319agMgIihRS37/bZ9xtZ3P/OF1pS5QZcFyZJFOrrCyZnUpMO/qJW8Ugncz32RG0Bc77f7Qssn6j0MDZ0D+nKmhBVeOTlO70x2c7EFX1Sn884h+sqI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8152F11FB; Mon, 16 Dec 2024 08:52:13 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AC6DC3F528; Mon, 16 Dec 2024 08:51:34 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 01/12] khugepaged: Rename hpage_collapse_scan_pmd() -> ptes() Date: Mon, 16 Dec 2024 22:20:54 +0530 Message-Id: <20241216165105.56185-2-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Rename prior to generalizing the collapse function. Signed-off-by: Dev Jain --- mm/khugepaged.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 99dc995aac11..95643e6e5f31 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -605,7 +605,7 @@ static int __collapse_huge_page_isolate(struct vm_area_= struct *vma, folio =3D page_folio(page); VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio); =20 - /* See hpage_collapse_scan_pmd(). */ + /* See hpage_collapse_scan_ptes(). 
*/ if (folio_likely_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && @@ -991,7 +991,7 @@ static int check_pmd_still_valid(struct mm_struct *mm, =20 /* * Bring missing pages in from swap, to complete THP collapse. - * Only done if hpage_collapse_scan_pmd believes it is worthwhile. + * Only done if hpage_collapse_scan_ptes believes it is worthwhile. * * Called and returns without pte mapped or spinlocks held. * Returns result: if not SCAN_SUCCEED, mmap_lock has been released. @@ -1263,7 +1263,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, return result; } =20 -static int hpage_collapse_scan_pmd(struct mm_struct *mm, +static int hpage_collapse_scan_ptes(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, bool *mmap_locked, struct collapse_control *cc) @@ -2457,7 +2457,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned = int pages, int *result, mmap_read_unlock(mm); } } else { - *result =3D hpage_collapse_scan_pmd(mm, vma, + *result =3D hpage_collapse_scan_ptes(mm, vma, khugepaged_scan.address, &mmap_locked, cc); } =20 @@ -2792,7 +2792,7 @@ int madvise_collapse(struct vm_area_struct *vma, stru= ct vm_area_struct **prev, cc); fput(file); } else { - result =3D hpage_collapse_scan_pmd(mm, vma, addr, + result =3D hpage_collapse_scan_ptes(mm, vma, addr, &mmap_locked, cc); } if (!mmap_locked) --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4F241207DF1 for ; Mon, 16 Dec 2024 16:51:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367919; cv=none; b=VO3DKMtHJlxE6QEsjyGIT0ulbAEG5IrQQnASR1f62fbYkQT352rjhHI4MBn42IQACc83jR2O31kfkJfj5kCSUStmB6pxegQSKhSR/Rq4TBiGve75a2L2WTnnD2bme6tLzqNfKMPHDSTuPDKrn2PttTMqq94QW9dOx1yWPox1aN8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367919; c=relaxed/simple; bh=mq5qjwjE+ui40jJfnEACvGFzl1xWmIFAoM5IxTU3rAA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=dN852zo271KdHfcf24Z5svPg7YkwcAVgdH1Z9crboYlINhheKTDaCWFAbCwjNh2YUr0RnHBQMd0kz36tMz0mBppLwG7nJag+5V8VFSYDbWHmsLYUWGjvSiB5gaJq3S92vs2IdYMefn68HJBCWYjhUlQJ4FqRlN1SNgDovUB1vBk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C211B1692; Mon, 16 Dec 2024 08:52:24 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0E60D3F528; Mon, 16 Dec 2024 08:51:45 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, 
ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 02/12] khugepaged: Generalize alloc_charge_folio() Date: Mon, 16 Dec 2024 22:20:55 +0530 Message-Id: <20241216165105.56185-3-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Pass order to alloc_charge_folio() and update mTHP statistics. Signed-off-by: Dev Jain --- include/linux/huge_mm.h | 2 ++ mm/huge_memory.c | 4 ++++ mm/khugepaged.c | 13 +++++++++---- 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 93e509b6c00e..8b6d0fed99b3 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -119,6 +119,8 @@ enum mthp_stat_item { MTHP_STAT_ANON_FAULT_ALLOC, MTHP_STAT_ANON_FAULT_FALLBACK, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE, + MTHP_STAT_ANON_COLLAPSE_ALLOC, + MTHP_STAT_ANON_COLLAPSE_ALLOC_FAILED, MTHP_STAT_ZSWPOUT, MTHP_STAT_SWPIN, MTHP_STAT_SWPIN_FALLBACK, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2da5520bfe24..2e582fad4c77 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -615,6 +615,8 @@ static struct kobj_attribute _name##_attr =3D __ATTR_RO= (_name) DEFINE_MTHP_STAT_ATTR(anon_fault_alloc, MTHP_STAT_ANON_FAULT_ALLOC); DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK); DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FAL= LBACK_CHARGE); +DEFINE_MTHP_STAT_ATTR(anon_collapse_alloc, MTHP_STAT_ANON_COLLAPSE_ALLOC); +DEFINE_MTHP_STAT_ATTR(anon_collapse_alloc_failed, MTHP_STAT_ANON_COLLAPSE_= ALLOC_FAILED); DEFINE_MTHP_STAT_ATTR(zswpout, MTHP_STAT_ZSWPOUT); DEFINE_MTHP_STAT_ATTR(swpin, MTHP_STAT_SWPIN); DEFINE_MTHP_STAT_ATTR(swpin_fallback, MTHP_STAT_SWPIN_FALLBACK); @@ -636,6 +638,8 @@ static struct attribute *anon_stats_attrs[] =3D { &anon_fault_alloc_attr.attr, &anon_fault_fallback_attr.attr, &anon_fault_fallback_charge_attr.attr, + &anon_collapse_alloc_attr.attr, + &anon_collapse_alloc_failed_attr.attr, #ifndef CONFIG_SHMEM &zswpout_attr.attr, &swpin_attr.attr, diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 95643e6e5f31..02cd424b8e48 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1073,21 +1073,26 @@ static int __collapse_huge_page_swapin(struct mm_st= ruct *mm, } =20 static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm, - struct collapse_control *cc) + int order, struct collapse_control *cc) { gfp_t gfp =3D (cc->is_khugepaged ? 
alloc_hugepage_khugepaged_gfpmask() : GFP_TRANSHUGE); int node =3D hpage_collapse_find_target_node(cc); struct folio *folio; =20 - folio =3D __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask); + folio =3D __folio_alloc(gfp, order, node, &cc->alloc_nmask); if (!folio) { *foliop =3D NULL; count_vm_event(THP_COLLAPSE_ALLOC_FAILED); + if (order !=3D HPAGE_PMD_ORDER) + count_mthp_stat(order, MTHP_STAT_ANON_COLLAPSE_ALLOC_FAILED); return SCAN_ALLOC_HUGE_PAGE_FAIL; } =20 count_vm_event(THP_COLLAPSE_ALLOC); + if (order !=3D HPAGE_PMD_ORDER) + count_mthp_stat(order, MTHP_STAT_ANON_COLLAPSE_ALLOC); + if (unlikely(mem_cgroup_charge(folio, mm, gfp))) { folio_put(folio); *foliop =3D NULL; @@ -1124,7 +1129,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, */ mmap_read_unlock(mm); =20 - result =3D alloc_charge_folio(&folio, mm, cc); + result =3D alloc_charge_folio(&folio, mm, order, cc); if (result !=3D SCAN_SUCCEED) goto out_nolock; =20 @@ -1850,7 +1855,7 @@ static int collapse_file(struct mm_struct *mm, unsign= ed long addr, VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem); VM_BUG_ON(start & (HPAGE_PMD_NR - 1)); =20 - result =3D alloc_charge_folio(&new_folio, mm, cc); + result =3D alloc_charge_folio(&new_folio, mm, HPAGE_PMD_ORDER, cc); if (result !=3D SCAN_SUCCEED) goto out; =20 --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 8DD2A139B for ; Mon, 16 Dec 2024 16:52:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367930; cv=none; b=sp5htQitfJgI0aeWArCTL3kuDQo/bnh06GjudFJj9GLwQ3vuYzN9jqO3sTKzjt3qA+IywAobyVbPazlWwgmK7lE4WVzKdnRyJrZRRYeygat4KJSz8Lb+FbLP56T2TDwCKNw9KrkUDoWjqABq0oOoKOJKn8dk+8Pl44Atp74u4l4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367930; c=relaxed/simple; bh=v76QClb9U9m30SebRAPamXow+5zI76bUsPRY7vqJ/xE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Qvos2qs4izlxXR2m34lu1ZdmKiQKwFMhcMziF+hr+3YX37dHh4Yd8+DX3eI4MgcZ2Q/oVo47E3sWDbnLl1KgJovm4J1hgGjVTSR3iVDXCc1vwveB9bCRNmH6cEf/UdND7BtmS5zuByMkFiJVu+YiV3YcIIeNMVVyaK4tKE8KDcQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4136B16F2; Mon, 16 Dec 2024 08:52:36 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 527BB3F528; Mon, 16 Dec 2024 08:51:57 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, 
wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 03/12] khugepaged: Generalize hugepage_vma_revalidate() Date: Mon, 16 Dec 2024 22:20:56 +0530 Message-Id: <20241216165105.56185-4-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Post retaking the lock, it must be checked that the VMA is suitable for our scan order. Hence, generalize hugepage_vma_revalidate(). Signed-off-by: Dev Jain --- mm/khugepaged.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 02cd424b8e48..2f0601795471 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -918,7 +918,7 @@ static int hpage_collapse_find_target_node(struct colla= pse_control *cc) =20 static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long add= ress, bool expect_anon, - struct vm_area_struct **vmap, + struct vm_area_struct **vmap, int order, struct collapse_control *cc) { struct vm_area_struct *vma; @@ -931,9 +931,9 @@ static int hugepage_vma_revalidate(struct mm_struct *mm= , unsigned long address, if (!vma) return SCAN_VMA_NULL; =20 - if (!thp_vma_suitable_order(vma, address, PMD_ORDER)) + if (!thp_vma_suitable_order(vma, address, order)) return SCAN_ADDRESS_RANGE; - if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_flags, PMD_ORDER)) + if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_flags, order)) return SCAN_VMA_CHECK; /* * Anon VMA expected, the address may be unmapped then @@ -1134,7 +1134,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, goto out_nolock; =20 mmap_read_lock(mm); - result =3D hugepage_vma_revalidate(mm, address, true, &vma, cc); + result =3D hugepage_vma_revalidate(mm, address, true, &vma, order, cc); if (result !=3D SCAN_SUCCEED) { mmap_read_unlock(mm); goto out_nolock; @@ -1168,7 +1168,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, * mmap_lock. 
*/ mmap_write_lock(mm); - result =3D hugepage_vma_revalidate(mm, address, true, &vma, cc); + result =3D hugepage_vma_revalidate(mm, address, true, &vma, order, cc); if (result !=3D SCAN_SUCCEED) goto out_up_write; /* check if the pmd is still valid */ @@ -2776,7 +2776,7 @@ int madvise_collapse(struct vm_area_struct *vma, stru= ct vm_area_struct **prev, mmap_read_lock(mm); mmap_locked =3D true; result =3D hugepage_vma_revalidate(mm, addr, false, &vma, - cc); + HPAGE_PMD_ORDER, cc); if (result !=3D SCAN_SUCCEED) { last_fail =3D result; goto out_nolock; --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 1C003206261 for ; Mon, 16 Dec 2024 16:52:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367941; cv=none; b=V9dX5x1ZELeUITjRfCkro/r8uhgLTM1LQRZ92TxcoeQ4rtH+vqmGxSJcp35cQSMdtG1ru/IHHNsF74Ma3sL2Bx08Ty9ZrYijFBOPKiS8mfDaHhHoAYt8UrnH1IkdYhcWFljK4CnoLmHTgdGWh8DPhhn2GDEBNOSe8HH1Exf5hHA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367941; c=relaxed/simple; bh=C70+LWn6tFJ84D4PU6SU9Shm+vO2wQu5vxL1j6bexdg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Xo6Kh4OkciQEGDKABjmHo8vogWy5FGAHMJFmzpDRVQoyrql30EyyNUnFXBaYKxPZOVm1EoUrizj+YbOmnRrd5gyO9VnTpB65S03fyzpdLlm2stkpWObgkTIIukLbI0NoeaZuN5ZMuU9a5CkEX1FG0lq+SWE6Y92pUEYajzgXhRY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 81C5F16F3; Mon, 16 Dec 2024 08:52:47 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C5BF03F528; Mon, 16 Dec 2024 08:52:08 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 04/12] khugepaged: Generalize __collapse_huge_page_swapin() Date: Mon, 16 Dec 2024 22:20:57 +0530 Message-Id: <20241216165105.56185-5-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" If any 
PTE in our scan range is a swap entry, then use do_swap_page() to sw= ap-in the corresponding folio. Signed-off-by: Dev Jain --- mm/khugepaged.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 2f0601795471..f52dae7d5179 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -998,17 +998,17 @@ static int check_pmd_still_valid(struct mm_struct *mm, */ static int __collapse_huge_page_swapin(struct mm_struct *mm, struct vm_area_struct *vma, - unsigned long haddr, pmd_t *pmd, - int referenced) + unsigned long addr, pmd_t *pmd, + int referenced, int order) { int swapped_in =3D 0; vm_fault_t ret =3D 0; - unsigned long address, end =3D haddr + (HPAGE_PMD_NR * PAGE_SIZE); + unsigned long address, end =3D addr + ((1UL << order) * PAGE_SIZE); int result; pte_t *pte =3D NULL; spinlock_t *ptl; =20 - for (address =3D haddr; address < end; address +=3D PAGE_SIZE) { + for (address =3D addr; address < end; address +=3D PAGE_SIZE) { struct vm_fault vmf =3D { .vma =3D vma, .address =3D address, @@ -1153,7 +1153,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, * that case. Continuing to collapse causes inconsistency. */ result =3D __collapse_huge_page_swapin(mm, vma, address, pmd, - referenced); + referenced, order); if (result !=3D SCAN_SUCCEED) goto out_nolock; } --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5360D206261 for ; Mon, 16 Dec 2024 16:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367953; cv=none; b=WtCxzlnlkdG83CmDVRpGSchi2bPQo9shbkO5tm0WcP9fRGwZ304+hw9gMwUHId4ntyrIl7jpU2kc0vb/OhQi4jZ8IqZ6wC7T97eUpeLWhlddLN3t6VwYa6RIsA45yR8o2Avs4VrdB77Hx1WyofVTLUlhOYnbJATDH8SijQQiQo0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367953; c=relaxed/simple; bh=256Ahx58gErL3CDuIbymk9CjuoQB/zV6T5dPnCoRtk8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=k8MwqpnNhh4pUii5gotAYBGnlJk/tBRoziM5n6ulamWhBW/xdNDS4hY50ZWrCqxjb6mvdUC/uWaVSCXHRV3pNXc6ft+xCS/q7MkX7PfGeHDBJIzbmIOXXwL5cjm6KfJ7XETNClaut+A/s0BHODUN5qbLI0SE3C+xcoFDEX025x0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 905511AED; Mon, 16 Dec 2024 08:52:58 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0F31C3F528; Mon, 16 Dec 2024 08:52:19 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, 
ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 05/12] khugepaged: Generalize __collapse_huge_page_isolate() Date: Mon, 16 Dec 2024 22:20:58 +0530 Message-Id: <20241216165105.56185-6-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Scale down the scan range and the sysfs tunables according to the scan orde= r, and isolate the folios. Signed-off-by: Dev Jain --- mm/khugepaged.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index f52dae7d5179..de044b1f83d4 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -564,15 +564,18 @@ static int __collapse_huge_page_isolate(struct vm_are= a_struct *vma, unsigned long address, pte_t *pte, struct collapse_control *cc, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, int order) { - struct page *page =3D NULL; - struct folio *folio =3D NULL; - pte_t *_pte; + unsigned int max_ptes_shared =3D khugepaged_max_ptes_shared >> (HPAGE_PMD= _ORDER - order); + unsigned int max_ptes_none =3D khugepaged_max_ptes_none >> (HPAGE_PMD_ORD= ER - order); int none_or_zero =3D 0, shared =3D 0, result =3D SCAN_FAIL, referenced = =3D 0; + struct folio *folio =3D NULL; + struct page *page =3D NULL; bool writable =3D false; + pte_t *_pte; =20 - for (_pte =3D pte; _pte < pte + HPAGE_PMD_NR; + + for (_pte =3D pte; _pte < pte + (1UL << order); _pte++, address +=3D PAGE_SIZE) { pte_t pteval =3D ptep_get(_pte); if (pte_none(pteval) || (pte_present(pteval) && @@ -580,7 +583,7 @@ static int __collapse_huge_page_isolate(struct vm_area_= struct *vma, ++none_or_zero; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || - none_or_zero <=3D khugepaged_max_ptes_none)) { + none_or_zero <=3D max_ptes_none)) { continue; } else { result =3D SCAN_EXCEED_NONE_PTE; @@ -609,7 +612,7 @@ static int __collapse_huge_page_isolate(struct vm_area_= struct *vma, if (folio_likely_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && - shared > khugepaged_max_ptes_shared) { + shared > max_ptes_shared) { result =3D SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out; @@ -1200,7 +1203,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, pte =3D pte_offset_map_lock(mm, &_pmd, address, &pte_ptl); if (pte) { result =3D __collapse_huge_page_isolate(vma, address, pte, cc, - &compound_pagelist); + &compound_pagelist, order); spin_unlock(pte_ptl); } else { result =3D SCAN_PMD_NULL; --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id EC57B20967E for ; Mon, 16 Dec 2024 16:52:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367964; cv=none; 
b=RpIpnTnkN3UWmFXzz50P0ZdDX15xj+IUQi2wMxX8RL93Ubr44jPtPpla8/CqBKJD0sz4ydILLSUyjenZQRhwULjtZxrtzFx23MyDTXhVanZxUsuB8A5DZSFKEEUutGVSMtu8Og1uW9iP8tfo1ekr6nN97NW+purRSb8yITzE6NE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367964; c=relaxed/simple; bh=s4Dcm7BxivCXfxFTi4K7SM7RPhBnJFy1cF3ksI6fZ6Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=iVfsv5YrKXEgqtLdk4B/jx0iy+4B5hqJj0entLT49s+mduqc+5+9bRUz2IIovn3iZ5r2VT9BnfT7WzjvGHt9YJLRSJgUp5x9iTPUoyyKHjCe0LhUhqYr6KMdLywflIrKLCihhEwr+LFkOcR8xUk4I5+zbzmq8M8BlEYeLUlPrB8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2894116F8; Mon, 16 Dec 2024 08:53:10 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 364F13F528; Mon, 16 Dec 2024 08:52:30 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 06/12] khugepaged: Generalize __collapse_huge_page_copy_failed() Date: Mon, 16 Dec 2024 22:20:59 +0530 Message-Id: <20241216165105.56185-7-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Upon failure, we repopulate the PMD in case of PMD-THP collapse. Hence, make this logic specific for PMD case. Signed-off-by: Dev Jain --- mm/khugepaged.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index de044b1f83d4..886c76816963 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -766,7 +766,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, int order) { spinlock_t *pmd_ptl; =20 @@ -776,14 +776,16 @@ static void __collapse_huge_page_copy_failed(pte_t *p= te, * pages. Since pages are still isolated and locked here, * acquiring anon_vma_lock_write is unnecessary. 
*/ - pmd_ptl =3D pmd_lock(vma->vm_mm, pmd); - pmd_populate(vma->vm_mm, pmd, pmd_pgtable(orig_pmd)); - spin_unlock(pmd_ptl); + if (order =3D=3D HPAGE_PMD_ORDER) { + pmd_ptl =3D pmd_lock(vma->vm_mm, pmd); + pmd_populate(vma->vm_mm, pmd, pmd_pgtable(orig_pmd)); + spin_unlock(pmd_ptl); + } /* * Release both raw and compound pages isolated * in __collapse_huge_page_isolate. */ - release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist); + release_pte_pages(pte, pte + (1UL << order), compound_pagelist); } =20 /* @@ -834,7 +836,7 @@ static int __collapse_huge_page_copy(pte_t *pte, struct= folio *folio, compound_pagelist); else __collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma, - compound_pagelist); + compound_pagelist, order); =20 return result; } --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 87E66206261 for ; Mon, 16 Dec 2024 16:52:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367975; cv=none; b=VO/fdcIAuRXD3SuiLbCUYJhjrAJy+3hz1mw68m6NO+X2raAuGignKGnCxg6rGjy2VoaVHc5XpDCiiGYmFa1m+ZPPpFoxIW0eMbyjGfcp2sQkmm6AZzt2pD1jfxoEzpFhE9jzAsQ+QxhxqySCz2yCMmJLE+lou5XU1PIqT55P84o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367975; c=relaxed/simple; bh=uC5ZXq3Aad3MK7iTcNO4KgggSYyI1G5HepkvikAssAY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=g6zIKwQHsZq0tCic3YxpYNx7vfZJP8jlD792I1feTarMa+AK16fDyI6eu/n8RqjdcPRUz4IBO6fsHsLQOlWpEZ1mbGNXHRftjyktJxgYjfBa2wPQI1HIWa4Twz1UeUsnZMgEaAkTIQVAKMjLDmWJNAnci7BUV28rlyQGd41YorI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 42BAF16F8; Mon, 16 Dec 2024 08:53:21 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A83103F528; Mon, 16 Dec 2024 08:52:42 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 07/12] khugepaged: Scan PTEs order-wise Date: Mon, 16 Dec 2024 22:21:00 +0530 Message-Id: <20241216165105.56185-8-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Scan the PTEs order-wise, using the mask of suitable orders for this VMA derived in conjunction with sysfs THP settings. Scale down the tunables; in case of collapse failure, we drop down to the next order. Otherwise, we try= to jump to the highest possible order and then start a fresh scan. Note that madvise(MADV_COLLAPSE) has not been generalized. Signed-off-by: Dev Jain --- mm/khugepaged.c | 84 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 69 insertions(+), 15 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 886c76816963..078794aa3335 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -20,6 +20,7 @@ #include #include #include +#include =20 #include #include @@ -1111,7 +1112,7 @@ static int alloc_charge_folio(struct folio **foliop, = struct mm_struct *mm, } =20 static int collapse_huge_page(struct mm_struct *mm, unsigned long address, - int referenced, int unmapped, + int referenced, int unmapped, int order, struct collapse_control *cc) { LIST_HEAD(compound_pagelist); @@ -1278,38 +1279,59 @@ static int hpage_collapse_scan_ptes(struct mm_struc= t *mm, unsigned long address, bool *mmap_locked, struct collapse_control *cc) { - pmd_t *pmd; - pte_t *pte, *_pte; - int result =3D SCAN_FAIL, referenced =3D 0; - int none_or_zero =3D 0, shared =3D 0; - struct page *page =3D NULL; + unsigned int max_ptes_shared, max_ptes_none, max_ptes_swap; + int referenced, shared, none_or_zero, unmapped; + unsigned long _address, org_address =3D address; struct folio *folio =3D NULL; - unsigned long _address; - spinlock_t *ptl; - int node =3D NUMA_NO_NODE, unmapped =3D 0; + struct page *page =3D NULL; + int node =3D NUMA_NO_NODE; + int result =3D SCAN_FAIL; bool writable =3D false; + unsigned long orders; + pte_t *pte, *_pte; + spinlock_t *ptl; + pmd_t *pmd; + int order; =20 VM_BUG_ON(address & ~HPAGE_PMD_MASK); =20 + orders =3D thp_vma_allowable_orders(vma, vma->vm_flags, + TVA_IN_PF | TVA_ENFORCE_SYSFS, BIT(PMD_ORDER + 1) - 1); + orders =3D thp_vma_suitable_orders(vma, address, orders); + order =3D highest_order(orders); + + /* MADV_COLLAPSE needs to work irrespective of sysfs setting */ + if (!cc->is_khugepaged) + order =3D HPAGE_PMD_ORDER; + +scan_pte_range: + + max_ptes_shared =3D khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - orde= r); + max_ptes_none =3D khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order); + max_ptes_swap =3D khugepaged_max_ptes_swap >> (HPAGE_PMD_ORDER - order); + referenced =3D 0, shared =3D 0, none_or_zero =3D 0, unmapped =3D 0; + + /* Check pmd after taking mmap lock */ result =3D find_pmd_or_thp_or_none(mm, address, &pmd); if (result !=3D SCAN_SUCCEED) goto out; =20 memset(cc->node_load, 0, sizeof(cc->node_load)); nodes_clear(cc->alloc_nmask); + pte =3D pte_offset_map_lock(mm, pmd, address, &ptl); if (!pte) { result =3D SCAN_PMD_NULL; goto out; } =20 - for (_address =3D address, _pte =3D pte; _pte < pte + HPAGE_PMD_NR; + for (_address =3D address, _pte =3D pte; _pte < pte + (1UL << order); _pte++, _address +=3D PAGE_SIZE) { pte_t pteval =3D ptep_get(_pte); if (is_swap_pte(pteval)) { ++unmapped; if (!cc->is_khugepaged || - unmapped <=3D khugepaged_max_ptes_swap) { + unmapped <=3D max_ptes_swap) { /* * Always be strict with uffd-wp * enabled swap entries. 
Please see @@ -1330,7 +1352,7 @@ static int hpage_collapse_scan_ptes(struct mm_struct = *mm, ++none_or_zero; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || - none_or_zero <=3D khugepaged_max_ptes_none)) { + none_or_zero <=3D max_ptes_none)) { continue; } else { result =3D SCAN_EXCEED_NONE_PTE; @@ -1375,7 +1397,7 @@ static int hpage_collapse_scan_ptes(struct mm_struct = *mm, if (folio_likely_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && - shared > khugepaged_max_ptes_shared) { + shared > max_ptes_shared) { result =3D SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out_unmap; @@ -1432,7 +1454,7 @@ static int hpage_collapse_scan_ptes(struct mm_struct = *mm, result =3D SCAN_PAGE_RO; } else if (cc->is_khugepaged && (!referenced || - (unmapped && referenced < HPAGE_PMD_NR / 2))) { + (unmapped && referenced < (1UL << order) / 2))) { result =3D SCAN_LACK_REFERENCED_PAGE; } else { result =3D SCAN_SUCCEED; @@ -1441,9 +1463,41 @@ static int hpage_collapse_scan_ptes(struct mm_struct= *mm, pte_unmap_unlock(pte, ptl); if (result =3D=3D SCAN_SUCCEED) { result =3D collapse_huge_page(mm, address, referenced, - unmapped, cc); + unmapped, order, cc); /* collapse_huge_page will return with the mmap_lock released */ *mmap_locked =3D false; + + /* Immediately exit on exhaustion of range */ + if (_address =3D=3D org_address + (PAGE_SIZE << HPAGE_PMD_ORDER)) + goto out; + } + if (result !=3D SCAN_SUCCEED) { + + /* Go to the next order. */ + order =3D next_order(&orders, order); + if (order < 2) + goto out; + goto maybe_mmap_lock; + } else { + address =3D _address; + pte =3D _pte; + + + /* Get highest order possible starting from address */ + order =3D count_trailing_zeros(address >> PAGE_SHIFT); + + /* This needs to be present in the mask too */ + if (!(orders & (1UL << order))) + order =3D next_order(&orders, order); + if (order < 2) + goto out; + +maybe_mmap_lock: + if (!(*mmap_locked)) { + mmap_read_lock(mm); + *mmap_locked =3D true; + } + goto scan_pte_range; } out: trace_mm_khugepaged_scan_pmd(mm, &folio->page, writable, referenced, --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 369A7207A03 for ; Mon, 16 Dec 2024 16:53:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367986; cv=none; b=Z0eIVbCF+0vdKVzQMQbz+1DchdZUDJ2/xsvfFkPKt13Ds4yhUBMuxl8TDAekyF7fArZWzarBw4oCstOvI365rIEqNAxK5LcPM+hvVB3U7OkFjcTT4SjRDSsWw3MDfksfS2vEmXj8rV1y8epFKNZ3nscSHWo8KRyTXcQ/4lo0Bd4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367986; c=relaxed/simple; bh=4YEoph5wgjSnLtk0qNr+xrEdds8hSnIuYMICDvRWMwo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=k8Ax2mjj5mEDaC3ujqPTNj7bJ9aqWX9GceIVUKa+gGGZns9NxbR/OeqNJxck28Qeil/2OjWNnYCrTI6EXyUvE71Op2kvy7055ntmD+6RmEyKxVbqn682iNKh3ohpDG6WbzCt86SxYpHFU6yHEs+unpokNWCCUddq1jeh9WP7PA0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com 
(Postfix) with ESMTP id 8BFE216F8; Mon, 16 Dec 2024 08:53:32 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C07E13F528; Mon, 16 Dec 2024 08:52:53 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 08/12] khugepaged: Abstract PMD-THP collapse Date: Mon, 16 Dec 2024 22:21:01 +0530 Message-Id: <20241216165105.56185-9-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Abstract away taking the mmap_lock exclusively, copying page contents, and setting the PMD, into vma_collapse_anon_folio_pmd(). Signed-off-by: Dev Jain --- mm/khugepaged.c | 119 +++++++++++++++++++++++++++--------------------- 1 file changed, 66 insertions(+), 53 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 078794aa3335..88beebef773e 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1111,58 +1111,17 @@ static int alloc_charge_folio(struct folio **foliop= , struct mm_struct *mm, return SCAN_SUCCEED; } =20 -static int collapse_huge_page(struct mm_struct *mm, unsigned long address, - int referenced, int unmapped, int order, - struct collapse_control *cc) +static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long= address, + struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd, + struct folio *folio) { + struct mmu_notifier_range range; + spinlock_t *pmd_ptl, *pte_ptl; LIST_HEAD(compound_pagelist); - pmd_t *pmd, _pmd; - pte_t *pte; pgtable_t pgtable; - struct folio *folio; - spinlock_t *pmd_ptl, *pte_ptl; - int result =3D SCAN_FAIL; - struct vm_area_struct *vma; - struct mmu_notifier_range range; - - VM_BUG_ON(address & ~HPAGE_PMD_MASK); - - /* - * Before allocating the hugepage, release the mmap_lock read lock. - * The allocation can take potentially a long time if it involves - * sync compaction, and we do not need to hold the mmap_lock during - * that. We will recheck the vma after taking it again in write mode. 
- */ - mmap_read_unlock(mm); - - result =3D alloc_charge_folio(&folio, mm, order, cc); - if (result !=3D SCAN_SUCCEED) - goto out_nolock; - - mmap_read_lock(mm); - result =3D hugepage_vma_revalidate(mm, address, true, &vma, order, cc); - if (result !=3D SCAN_SUCCEED) { - mmap_read_unlock(mm); - goto out_nolock; - } - - result =3D find_pmd_or_thp_or_none(mm, address, &pmd); - if (result !=3D SCAN_SUCCEED) { - mmap_read_unlock(mm); - goto out_nolock; - } - - if (unmapped) { - /* - * __collapse_huge_page_swapin will return with mmap_lock - * released when it fails. So we jump out_nolock directly in - * that case. Continuing to collapse causes inconsistency. - */ - result =3D __collapse_huge_page_swapin(mm, vma, address, pmd, - referenced, order); - if (result !=3D SCAN_SUCCEED) - goto out_nolock; - } + int result; + pmd_t _pmd; + pte_t *pte; =20 mmap_read_unlock(mm); /* @@ -1174,7 +1133,8 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, * mmap_lock. */ mmap_write_lock(mm); - result =3D hugepage_vma_revalidate(mm, address, true, &vma, order, cc); + + result =3D hugepage_vma_revalidate(mm, address, true, &vma, HPAGE_PMD_ORD= ER, cc); if (result !=3D SCAN_SUCCEED) goto out_up_write; /* check if the pmd is still valid */ @@ -1206,7 +1166,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, pte =3D pte_offset_map_lock(mm, &_pmd, address, &pte_ptl); if (pte) { result =3D __collapse_huge_page_isolate(vma, address, pte, cc, - &compound_pagelist, order); + &compound_pagelist, HPAGE_PMD_ORDER); spin_unlock(pte_ptl); } else { result =3D SCAN_PMD_NULL; @@ -1262,11 +1222,64 @@ static int collapse_huge_page(struct mm_struct *mm,= unsigned long address, deferred_split_folio(folio, false); spin_unlock(pmd_ptl); =20 - folio =3D NULL; - result =3D SCAN_SUCCEED; out_up_write: mmap_write_unlock(mm); + return result; +} + +static int collapse_huge_page(struct mm_struct *mm, unsigned long address, + int referenced, int unmapped, int order, + struct collapse_control *cc) +{ + struct vm_area_struct *vma; + int result =3D SCAN_FAIL; + struct folio *folio; + pmd_t *pmd; + + /* + * Before allocating the hugepage, release the mmap_lock read lock. + * The allocation can take potentially a long time if it involves + * sync compaction, and we do not need to hold the mmap_lock during + * that. We will recheck the vma after taking it again in write mode. + */ + mmap_read_unlock(mm); + + result =3D alloc_charge_folio(&folio, mm, order, cc); + if (result !=3D SCAN_SUCCEED) + goto out_nolock; + + mmap_read_lock(mm); + result =3D hugepage_vma_revalidate(mm, address, true, &vma, order, cc); + if (result !=3D SCAN_SUCCEED) { + mmap_read_unlock(mm); + goto out_nolock; + } + + result =3D find_pmd_or_thp_or_none(mm, address, &pmd); + if (result !=3D SCAN_SUCCEED) { + mmap_read_unlock(mm); + goto out_nolock; + } + + if (unmapped) { + /* + * __collapse_huge_page_swapin will return with mmap_lock + * released when it fails. So we jump out_nolock directly in + * that case. Continuing to collapse causes inconsistency. 
+ */ + result =3D __collapse_huge_page_swapin(mm, vma, address, pmd, + referenced, order); + if (result !=3D SCAN_SUCCEED) + goto out_nolock; + } + + if (order =3D=3D HPAGE_PMD_ORDER) + result =3D vma_collapse_anon_folio_pmd(mm, address, vma, cc, pmd, folio); + + if (result =3D=3D SCAN_SUCCEED) + folio =3D NULL; + out_nolock: if (folio) folio_put(folio); --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5C73C14B96E for ; Mon, 16 Dec 2024 16:53:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367998; cv=none; b=TaUQ1d8AMg/5cXEDOeKiXUEt3CLKHGsqc3Hrnr3DgnCeProiLPCV/GhPy+qhmgdeAvCrlsNli1ve0DDAU4vXqHytFsLhTPlbxChva2EIFaGyRdqF8wk90zIVBFDhXRDSLOhqHhU5gC4HUe8ZiPdO7Y1IubBYlNb5SaE3RwY1Bv4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734367998; c=relaxed/simple; bh=/lM/lm4WngV8L+aveUlJ0+j1Jt+cUCYWF6sWZAKhYFY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=rzpv1CF1PghbuRG0L7OnbgTQxIUTSBGYPjWVve+zwNqRBmNYicv6YyWdHtOymcmfpidkDGeBDW0KEyUF/KLDZ/KANtN0tfiow4yKxdQ6cBskPABMTqrSwEa+8ic1wkb8+DFvUlEehZ/BrRrcJrO8KgHm5gtp325s7bzZoSNP1eM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CAC2C16F8; Mon, 16 Dec 2024 08:53:43 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 1C5503F528; Mon, 16 Dec 2024 08:53:04 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 09/12] khugepaged: Introduce vma_collapse_anon_folio() Date: Mon, 16 Dec 2024 22:21:02 +0530 Message-Id: <20241216165105.56185-10-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In contrast to PMD-collapse, we do not need to operate on two levels of pag= etable simultaneously. 
Therefore, downgrade the mmap lock from write to read mode. Still take the anon_vma lock in exclusive mode so as not to waste time in the rmap path, which is anyway going to fail since the PTEs are going to be changed. Under the PTL, copy page contents, clear the PTEs, remove folio pins, and (try to) unmap the old folios. Set the PTEs to the new folio using the set_ptes() API. Signed-off-by: Dev Jain --- Note: I have been trying hard to get rid of the locks in here: we are still taking the PTL around the page copying; dropping the PTL and taking it after the copying can lead to a deadlock, for example:

khugepaged                     madvise(MADV_COLD)
folio_lock()
                               lock(ptl)
lock(ptl)
                               folio_lock()

We can create a locked folio list, altogether drop both the locks, take the PTL, do everything which __collapse_huge_page_isolate() does *except* the isolation, and again try locking folios, but then it will reduce the efficiency of khugepaged and almost looks like a forced solution :) Please note the following discussion if anyone is interested: https://lore.kernel.org/all/66bb7496-a445-4ad7-8e56-4f2863465c54@arm.com/ (Apologies for not CCing the mailing list from the start) mm/khugepaged.c | 108 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 87 insertions(+), 21 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 88beebef773e..8040b130e677 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -714,24 +714,28 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, int order) { struct folio *src, *tmp; pte_t *_pte; pte_t pteval; =20 - for (_pte =3D pte; _pte < pte + HPAGE_PMD_NR; + for (_pte =3D pte; _pte < pte + (1UL << order); _pte++, address +=3D PAGE_SIZE) { pteval =3D ptep_get(_pte); if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1); if (is_zero_pfn(pte_pfn(pteval))) { - /* - * ptl mostly unnecessary. - */ - spin_lock(ptl); - ptep_clear(vma->vm_mm, address, _pte); - spin_unlock(ptl); + if (order =3D=3D HPAGE_PMD_ORDER) { + /* + * ptl mostly unnecessary. + */ + spin_lock(ptl); + ptep_clear(vma->vm_mm, address, _pte); + spin_unlock(ptl); + } else { + ptep_clear(vma->vm_mm, address, _pte); + } ksm_might_unmap_zero_page(vma->vm_mm, pteval); } } else { @@ -740,15 +744,20 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte, src =3D page_folio(src_page); if (!folio_test_large(src)) release_pte_folio(src); - /* - * ptl mostly unnecessary, but preempt has to - * be disabled to update the per-cpu stats - * inside folio_remove_rmap_pte(). - */ - spin_lock(ptl); - ptep_clear(vma->vm_mm, address, _pte); - folio_remove_rmap_pte(src, src_page, vma); - spin_unlock(ptl); + if (order =3D=3D HPAGE_PMD_ORDER) { + /* + * ptl mostly unnecessary, but preempt has to + * be disabled to update the per-cpu stats + * inside folio_remove_rmap_pte(). 
+ */ + spin_lock(ptl); + ptep_clear(vma->vm_mm, address, _pte); + folio_remove_rmap_pte(src, src_page, vma); + spin_unlock(ptl); + } else { + ptep_clear(vma->vm_mm, address, _pte); + folio_remove_rmap_pte(src, src_page, vma); + } free_page_and_swap_cache(src_page); } } @@ -807,7 +816,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte, static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, int order) { unsigned int i; int result =3D SCAN_SUCCEED; @@ -815,7 +824,7 @@ static int __collapse_huge_page_copy(pte_t *pte, struct= folio *folio, /* * Copying pages' contents is subject to memory poison at any iteration. */ - for (i =3D 0; i < HPAGE_PMD_NR; i++) { + for (i =3D 0; i < (1 << order); i++) { pte_t pteval =3D ptep_get(pte + i); struct page *page =3D folio_page(folio, i); unsigned long src_addr =3D address + i * PAGE_SIZE; @@ -834,7 +843,7 @@ static int __collapse_huge_page_copy(pte_t *pte, struct= folio *folio, =20 if (likely(result =3D=3D SCAN_SUCCEED)) __collapse_huge_page_copy_succeeded(pte, vma, address, ptl, - compound_pagelist); + compound_pagelist, order); else __collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma, compound_pagelist, order); @@ -1196,7 +1205,7 @@ static int vma_collapse_anon_folio_pmd(struct mm_stru= ct *mm, unsigned long addre =20 result =3D __collapse_huge_page_copy(pte, folio, pmd, _pmd, vma, address, pte_ptl, - &compound_pagelist); + &compound_pagelist, HPAGE_PMD_ORDER); pte_unmap(pte); if (unlikely(result !=3D SCAN_SUCCEED)) goto out_up_write; @@ -1228,6 +1237,61 @@ static int vma_collapse_anon_folio_pmd(struct mm_str= uct *mm, unsigned long addre return result; } =20 +/* Enter with mmap read lock */ +static int vma_collapse_anon_folio(struct mm_struct *mm, unsigned long add= ress, + struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd, + struct folio *folio, int order) +{ + int result; + struct mmu_notifier_range range; + spinlock_t *pte_ptl; + LIST_HEAD(compound_pagelist); + pte_t *pte; + pte_t entry; + int nr_pages =3D folio_nr_pages(folio); + + anon_vma_lock_write(vma->anon_vma); + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, address, + address + (PAGE_SIZE << order)); + mmu_notifier_invalidate_range_start(&range); + + pte =3D pte_offset_map_lock(mm, pmd, address, &pte_ptl); + if (pte) + result =3D __collapse_huge_page_isolate(vma, address, pte, cc, + &compound_pagelist, order); + else + result =3D SCAN_PMD_NULL; + + if (unlikely(result !=3D SCAN_SUCCEED)) + goto out_up_read; + + anon_vma_unlock_write(vma->anon_vma); + + __folio_mark_uptodate(folio); + entry =3D mk_pte(&folio->page, vma->vm_page_prot); + entry =3D maybe_mkwrite(entry, vma); + + result =3D __collapse_huge_page_copy(pte, folio, pmd, *pmd, + vma, address, pte_ptl, + &compound_pagelist, order); + if (unlikely(result !=3D SCAN_SUCCEED)) + goto out_up_read; + + folio_ref_add(folio, nr_pages - 1); + folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); + folio_add_lru_vma(folio, vma); + deferred_split_folio(folio, false); + set_ptes(mm, address, pte, entry, nr_pages); + update_mmu_cache_range(NULL, vma, address, pte, nr_pages); + pte_unmap_unlock(pte, pte_ptl); + mmu_notifier_invalidate_range_end(&range); + result =3D SCAN_SUCCEED; + +out_up_read: + mmap_read_unlock(mm); + return result; +} + static int collapse_huge_page(struct mm_struct *mm, unsigned long address, 
int referenced, int unmapped, int order, struct collapse_control *cc) @@ -1276,6 +1340,8 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, =20 if (order =3D=3D HPAGE_PMD_ORDER) result =3D vma_collapse_anon_folio_pmd(mm, address, vma, cc, pmd, folio); + else + result =3D vma_collapse_anon_folio(mm, address, vma, cc, pmd, folio, ord= er); =20 if (result =3D=3D SCAN_SUCCEED) folio =3D NULL; --=20 2.30.2 From nobody Tue Feb 10 09:22:13 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 858E814B96E for ; Mon, 16 Dec 2024 16:53:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734368009; cv=none; b=UsuHmfyh1QJcZ8aaKCFJxRah7HxuGML5WIyk0qNt1GkFW93I62KzsSDuz/EMsswfZvjqDf8CWVKaVW2p2AWxnNxKo/1HsI+M2/OYeFRU0NDQ/7WQVgmwFEy2VyYUfIYxSvVF+G5WrnXer9LWQtcUW6mVhWvvBaP9rx0iTPzZKik= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734368009; c=relaxed/simple; bh=x8/lax7AnEY8aIMV7rJN+ovhV7eHuZNGumeyQOCxLVE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XnK6lHb3eh+pGcn+IbMf9Pl/EwPu4AyTuhVlOCkVkxHVhy98FYBCiK8r5JOC56LRzRixeXk7H62ts2bOeJ4pxCrFYuN0GZNwoyfiEBFb762435GrLKtw96qX5QBebmR6G+5062yntjJyppY/lCYi1RBqYIylZ/qm5gchzeI4s8g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 32FFE16F8; Mon, 16 Dec 2024 08:53:55 -0800 (PST) Received: from K4MQJ0H1H2.arm.com (unknown [10.163.78.212]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 552383F528; Mon, 16 Dec 2024 08:53:16 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [RFC PATCH 10/12] khugepaged: Skip PTE range if a larger mTHP is already mapped Date: Mon, 16 Dec 2024 22:21:03 +0530 Message-Id: <20241216165105.56185-11-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com> References: <20241216165105.56185-1-dev.jain@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We may hit a situation wherein we have a larger folio mapped. 
 mm/khugepaged.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 8040b130e677..47e7c476b893 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -33,6 +33,7 @@ enum scan_result {
 	SCAN_PMD_NULL,
 	SCAN_PMD_NONE,
 	SCAN_PMD_MAPPED,
+	SCAN_PTE_MAPPED,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_EXCEED_SWAP_PTE,
 	SCAN_EXCEED_SHARED_PTE,
@@ -609,6 +610,11 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		folio = page_folio(page);
 		VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
 
+		if (order != HPAGE_PMD_ORDER && folio_order(folio) >= order) {
+			result = SCAN_PTE_MAPPED;
+			goto out;
+		}
+
 		/* See hpage_collapse_scan_ptes(). */
 		if (folio_likely_mapped_shared(folio)) {
 			++shared;
@@ -1369,6 +1375,7 @@ static int hpage_collapse_scan_ptes(struct mm_struct *mm,
 	unsigned long orders;
 	pte_t *pte, *_pte;
 	spinlock_t *ptl;
+	int found_order;
 	pmd_t *pmd;
 	int order;
 
@@ -1467,6 +1474,24 @@ static int hpage_collapse_scan_ptes(struct mm_struct *mm,
 			goto out_unmap;
 		}
 
+		found_order = folio_order(folio);
+
+		/*
+		 * No point in scanning further. Two options: if this folio
+		 * was hit somewhere in the middle of the scan, drop down
+		 * the order. Or, skip completely to the end of this folio.
+		 * The latter gives us a higher order to start with, with at
+		 * most 1 << order PTEs not collapsed; the former may force
+		 * us to end up going below order 2 and exiting.
+		 */
+		if (order != HPAGE_PMD_ORDER && found_order >= order) {
+			result = SCAN_PTE_MAPPED;
+			_address += (PAGE_SIZE << found_order);
+			_pte += (1UL << found_order);
+			pte_unmap_unlock(pte, ptl);
+			goto decide_order;
+		}
+
 		/*
 		 * We treat a single page as shared if any part of the THP
 		 * is shared. "False negatives" from
@@ -1550,6 +1575,10 @@ static int hpage_collapse_scan_ptes(struct mm_struct *mm,
 		if (_address == org_address + (PAGE_SIZE << HPAGE_PMD_ORDER))
 			goto out;
 	}
+	/* A larger folio was mapped; it will be skipped in the next iteration */
+	if (result == SCAN_PTE_MAPPED)
+		goto decide_order;
+
 	if (result != SCAN_SUCCEED) {
 
 		/* Go to the next order. */
@@ -1558,6 +1587,8 @@ static int hpage_collapse_scan_ptes(struct mm_struct *mm,
 			goto out;
 		goto maybe_mmap_lock;
 	} else {
+
+decide_order:
 		address = _address;
 		pte = _pte;
 
-- 
2.30.2

From nobody Tue Feb 10 09:22:13 2026
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
 cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com,
 hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
 peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 ziy@nvidia.com, jglisse@google.com, surenb@google.com,
 vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com,
 jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Dev Jain
Subject: [RFC PATCH 11/12] khugepaged: Enable sysfs to control order of
 collapse
Date: Mon, 16 Dec 2024 22:21:04 +0530
Message-Id: <20241216165105.56185-12-dev.jain@arm.com>
In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com>
References: <20241216165105.56185-1-dev.jain@arm.com>

Activate khugepaged for anonymous collapse if even a single mTHP order is
enabled. Note that we still scan VMAs only when they are PMD-aligned/sized,
for ease of implementation.

Signed-off-by: Dev Jain
---
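
A note on the BIT(PMD_ORDER + 1) - 1 mask used in the hunks below: it has
bits 0 through PMD_ORDER set, so thp_vma_allowable_orders() now accepts a
VMA if any order up to and including the PMD order is permitted. A
standalone sketch (PMD_ORDER of 9 assumes 4KiB base pages):

#include <stdio.h>

#define BIT(n)		(1UL << (n))
#define PMD_ORDER	9	/* assumption: 4KiB base pages */

int main(void)
{
	/* Bits 0..PMD_ORDER set: with PMD_ORDER == 9 this is 0x3ff,
	 * i.e. every order from 0 up to the PMD order. */
	unsigned long orders = BIT(PMD_ORDER + 1) - 1;

	printf("orders mask: %#lx\n", orders);	/* prints 0x3ff */
	return 0;
}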
 mm/khugepaged.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 47e7c476b893..ffc4d5aef991 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -414,24 +414,20 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
 		test_bit(MMF_DISABLE_THP, &mm->flags);
 }
 
-static bool hugepage_pmd_enabled(void)
+static bool thp_enabled(void)
 {
 	/*
 	 * We cover the anon, shmem and the file-backed case here; file-backed
 	 * hugepages, when configured in, are determined by the global control.
-	 * Anon pmd-sized hugepages are determined by the pmd-size control.
+	 * Anon mTHPs are determined by the per-size control.
 	 * Shmem pmd-sized hugepages are also determined by its pmd-size control,
 	 * except when the global shmem_huge is set to SHMEM_HUGE_DENY.
 	 */
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && hugepage_global_enabled())
 		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_always))
-		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_madvise))
-		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
-	    hugepage_global_enabled())
+	if (huge_anon_orders_always || huge_anon_orders_madvise ||
+	    (huge_anon_orders_inherit && hugepage_global_enabled()))
 		return true;
 	if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled())
 		return true;
@@ -474,9 +470,9 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    hugepage_pmd_enabled()) {
-		if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
-					    PMD_ORDER))
+	    thp_enabled()) {
+		if (thp_vma_allowable_orders(vma, vm_flags, TVA_ENFORCE_SYSFS,
+					     BIT(PMD_ORDER + 1) - 1))
 			__khugepaged_enter(vma->vm_mm);
 	}
 }
@@ -2586,8 +2582,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			progress++;
 			break;
 		}
-		if (!thp_vma_allowable_order(vma, vma->vm_flags,
-					TVA_ENFORCE_SYSFS, PMD_ORDER)) {
+		if (!thp_vma_allowable_orders(vma, vma->vm_flags,
+					TVA_ENFORCE_SYSFS, BIT(PMD_ORDER + 1) - 1)) {
 skip:
 			progress++;
 			continue;
@@ -2611,6 +2607,11 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				  khugepaged_scan.address + HPAGE_PMD_SIZE > hend);
 		if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
+			if (!thp_vma_allowable_order(vma, vma->vm_flags,
+					TVA_ENFORCE_SYSFS, PMD_ORDER)) {
+				khugepaged_scan.address += HPAGE_PMD_SIZE;
+				continue;
+			}
 			struct file *file = get_file(vma->vm_file);
 			pgoff_t pgoff = linear_page_index(vma,
 					khugepaged_scan.address);
@@ -2689,7 +2690,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 
 static int khugepaged_has_work(void)
 {
-	return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled();
+	return !list_empty(&khugepaged_scan.mm_head) && thp_enabled();
 }
 
 static int khugepaged_wait_event(void)
@@ -2762,7 +2763,7 @@ static void khugepaged_wait_work(void)
 		return;
 	}
 
-	if (hugepage_pmd_enabled())
+	if (thp_enabled())
 		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
 }
 
@@ -2793,7 +2794,7 @@ static void set_recommended_min_free_kbytes(void)
 	int nr_zones = 0;
 	unsigned long recommended_min;
 
-	if (!hugepage_pmd_enabled()) {
+	if (!thp_enabled()) {
 		calculate_min_free_kbytes();
 		goto update_wmarks;
 	}
@@ -2843,7 +2844,7 @@ int start_stop_khugepaged(void)
 	int err = 0;
 
 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_pmd_enabled()) {
+	if (thp_enabled()) {
 		if (!khugepaged_thread)
 			khugepaged_thread = kthread_run(khugepaged, NULL,
 							"khugepaged");
@@ -2869,7 +2870,7 @@ int start_stop_khugepaged(void)
 void khugepaged_min_free_kbytes_update(void)
 {
 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_pmd_enabled() && khugepaged_thread)
+	if (thp_enabled() && khugepaged_thread)
 		set_recommended_min_free_kbytes();
 	mutex_unlock(&khugepaged_mutex);
 }
-- 
2.30.2

From nobody Tue Feb 10 09:22:13 2026
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
 cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com,
 hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
 peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 ziy@nvidia.com, jglisse@google.com, surenb@google.com,
 vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com,
 jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Dev Jain
Subject: [RFC PATCH 12/12] selftests/mm: khugepaged: Enlighten for mTHP
 collapse
Date: Mon, 16 Dec 2024 22:21:05 +0530
Message-Id: <20241216165105.56185-13-dev.jain@arm.com>
In-Reply-To: <20241216165105.56185-1-dev.jain@arm.com>
References: <20241216165105.56185-1-dev.jain@arm.com>

One of the testcases triggers a CoW on the 255th page (0-indexed) with
max_ptes_shared = 256. This leads to pages 0-254 (255 in number) being
unshared, with 257 pages left shared, exceeding the constraint. Suppose we
run the test as ./khugepaged -s 2: khugepaged then starts collapsing the
range to order-2 folios, since PMD-collapse will fail due to the
constraint. When the scan reaches the 254-257 PTE range, at least one PTE
in this range is writable, with the other 3 being read-only, so khugepaged
collapses the range into an order-2 mTHP, resulting in 3 extra PTEs
getting unshared. After this, we encounter a 4-sized chunk of read-only
PTEs, and mTHP collapse stops according to the scaled constraint, but the
number of shared PTEs has now come under the constraint for PMD-sized
THPs. Therefore, the next khugepaged scan will be able to collapse this
range into a PMD-mapped hugepage, leading to failure of this subtest. Fix
this by reducing the CoW range.

Note: The only objective of this patch is to make the test work for the
PMD case; no extension has been made for testing mTHPs.

Signed-off-by: Dev Jain
---
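
To see why shrinking the CoW range fixes the subtest, the arithmetic can be
worked through with concrete numbers (a standalone sketch, not part of the
selftest; 4KiB pages, hence hpage_pmd_nr == 512, with max_ptes_shared == 256
and anon_order == 2 are assumptions matching the scenario above):

#include <stdio.h>

int main(void)
{
	int hpage_pmd_nr = 512;		/* assumption: 4KiB pages */
	int max_ptes_shared = 256;
	int anon_order = 2;		/* ./khugepaged -s 2 */
	int fault_nr_pages = 1 << anon_order;

	/* Old CoW range: one page short of what PMD collapse allows,
	 * leaving 257 shared PTEs; an order-2 collapse then unshares
	 * 3 more behind the test's back, dropping the shared count
	 * below 256 and re-enabling PMD collapse. */
	int old_cow = hpage_pmd_nr - max_ptes_shared - 1;

	/* New CoW range ends on an order-2 aligned boundary, so no
	 * mTHP collapse can unshare extra PTEs. */
	int new_cow = hpage_pmd_nr - max_ptes_shared - fault_nr_pages;

	printf("old: CoW %d pages, %d left shared\n",
	       old_cow, hpage_pmd_nr - old_cow);	/* 255, 257 */
	printf("new: CoW %d pages, %d left shared\n",
	       new_cow, hpage_pmd_nr - new_cow);	/* 252, 260 */
	return 0;
}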
 tools/testing/selftests/mm/khugepaged.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
index 8a4d34cce36b..143c4ad9f6a1 100644
--- a/tools/testing/selftests/mm/khugepaged.c
+++ b/tools/testing/selftests/mm/khugepaged.c
@@ -981,6 +981,7 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o
 static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops *ops)
 {
 	int max_ptes_shared = thp_read_num("khugepaged/max_ptes_shared");
+	int fault_nr_pages = is_anon(ops) ? 1 << anon_order : 1;
 	int wstatus;
 	void *p;
 
@@ -997,8 +998,8 @@ static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops
 		fail("Fail");
 
 	printf("Trigger CoW on page %d of %d...",
-	       hpage_pmd_nr - max_ptes_shared - 1, hpage_pmd_nr);
-	ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - 1) * page_size);
+	       hpage_pmd_nr - max_ptes_shared - fault_nr_pages, hpage_pmd_nr);
+	ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - fault_nr_pages) * page_size);
 	if (ops->check_huge(p, 0))
 		success("OK");
 	else
-- 
2.30.2