From nobody Sun Jun 14 15:49:59 2026
From: Baolin Wang
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 01/11] mm: khugepaged: add max_ptes_none check in collapse_file()
Date: Wed, 10 Jun 2026 18:29:09 +0800
Message-ID: <2ae023e4c9f5270482c99f244af5d3a40fc49165.1781083630.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.43.5
X-Mailing-List: linux-kernel@vger.kernel.org

Similar to anonymous folio collapse, we should also check 'max_ptes_none'
when trying to collapse shmem/file folios. This also serves as preparation
for shmem mTHP collapse in the following patches.
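
For readers who want to see the shape of the check in isolation, here is a
minimal userspace sketch. It is not the kernel code: the function name
range_within_none_limit() and the bool array standing in for the page cache
are hypothetical, and only the "count holes, bail out once the limit is
exceeded" logic is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model: present[i] says whether page i of the
 * candidate range is populated. Returns false as soon as the number of
 * holes exceeds max_ptes_none, mirroring the early bail-out in the patch.
 */
static bool range_within_none_limit(const bool *present, size_t nr_pages,
				    unsigned int max_ptes_none)
{
	unsigned int nr_none = 0;

	assert(present != NULL);
	for (size_t i = 0; i < nr_pages; i++) {
		if (present[i])
			continue;
		/* Same pre-increment comparison as the patch uses. */
		if (++nr_none > max_ptes_none)
			return false;
	}
	return true;
}
```

With two holes in an eight-page range, the sketch accepts the range when
max_ptes_none is 2 and rejects it when max_ptes_none is 1.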

Signed-off-by: Baolin Wang
Reviewed-by: Nico Pache
---
 mm/khugepaged.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b12187709f6d..631459172e19 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2234,6 +2234,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 		struct file *file, pgoff_t start, struct collapse_control *cc)
 {
+	const unsigned int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
 	struct address_space *mapping = file->f_mapping;
 	struct page *dst;
 	struct folio *folio, *tmp, *new_folio;
@@ -2299,7 +2300,13 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 						goto xa_locked;
 					}
 				}
-				nr_none++;
+
+				if (++nr_none > max_ptes_none) {
+					result = SCAN_EXCEED_NONE_PTE;
+					count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
+					goto xa_locked;
+				}
+
 				index++;
 				continue;
 			}
-- 
2.47.3

From nobody Sun Jun 14 15:49:59 2026
From: Baolin Wang
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 02/11] mm: khugepaged: generalize collapse_file() for shmem mTHP support
Date: Wed, 10 Jun 2026 18:29:10 +0800
Message-ID: <1274846e121e74f8db53950bec64f8f1938f2ec9.1781083630.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.43.5
X-Mailing-List: linux-kernel@vger.kernel.org

Generalize collapse_file() to take the collapse order as a parameter, in
preparation for future shmem mTHP collapse. No functional changes in this
patch.

Signed-off-by: Baolin Wang
---
 mm/khugepaged.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 631459172e19..4adc8c6de062 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2214,6 +2214,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * @file: file that collapse on
  * @start: collapse start address
  * @cc: collapse context and scratchpad
+ * @order: folio order being collapsed to
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
@@ -2232,15 +2233,17 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *    + unlock and free huge page;
  */
 static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
-		struct file *file, pgoff_t start, struct collapse_control *cc)
+		struct file *file, pgoff_t start, struct collapse_control *cc,
+		int order)
 {
-	const unsigned int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
+	const unsigned int max_ptes_none = collapse_max_ptes_none(cc, NULL, order);
 	struct address_space *mapping = file->f_mapping;
+	const unsigned long nr_pages = 1UL << order;
 	struct page *dst;
 	struct folio *folio, *tmp, *new_folio;
-	pgoff_t index = 0, end = start + HPAGE_PMD_NR;
+	pgoff_t index = 0, end = start + nr_pages;
 	LIST_HEAD(pagelist);
-	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
+	XA_STATE_ORDER(xas, &mapping->i_pages, start, order);
 	enum scan_result result = SCAN_SUCCEED;
 	int nr_none = 0;
 	bool is_shmem = shmem_file(file);
@@ -2252,9 +2255,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	 * mapping, the shmem check can be removed.
 	 */
 	VM_WARN_ON_ONCE(!is_shmem && !mapping_pmd_folio_support(mapping));
-	VM_WARN_ON_ONCE(start & (HPAGE_PMD_NR - 1));
+	VM_WARN_ON_ONCE(start & (nr_pages - 1));
 
-	result = alloc_charge_folio(&new_folio, mm, cc, HPAGE_PMD_ORDER);
+	result = alloc_charge_folio(&new_folio, mm, cc, order);
 	if (result != SCAN_SUCCEED)
 		goto out;
 
@@ -2591,12 +2594,12 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	}
 
 	if (is_shmem) {
-		lruvec_stat_mod_folio(new_folio, NR_SHMEM, HPAGE_PMD_NR);
+		lruvec_stat_mod_folio(new_folio, NR_SHMEM, nr_pages);
 		lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR);
 	} else {
 		lruvec_stat_mod_folio(new_folio, NR_FILE_THPS, HPAGE_PMD_NR);
 	}
-	lruvec_stat_mod_folio(new_folio, NR_FILE_PAGES, HPAGE_PMD_NR);
+	lruvec_stat_mod_folio(new_folio, NR_FILE_PAGES, nr_pages);
 
 	/*
 	 * Mark new_folio as uptodate before inserting it into the
@@ -2604,14 +2607,14 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	 * unwritten page.
 	 */
 	folio_mark_uptodate(new_folio);
-	folio_ref_add(new_folio, HPAGE_PMD_NR - 1);
+	folio_ref_add(new_folio, nr_pages - 1);
 
 	if (is_shmem)
 		folio_mark_dirty(new_folio);
 	folio_add_lru(new_folio);
 
 	/* Join all the small entries into a single multi-index entry. */
-	xas_set_order(&xas, start, HPAGE_PMD_ORDER);
+	xas_set_order(&xas, start, order);
 	xas_store(&xas, new_folio);
 	WARN_ON_ONCE(xas_error(&xas));
 	xas_unlock_irq(&xas);
@@ -2666,7 +2669,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	folio_put(new_folio);
 out:
 	VM_BUG_ON(!list_empty(&pagelist));
-	trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, HPAGE_PMD_NR, result);
+	trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, nr_pages, result);
 	return result;
 }
 
@@ -2769,7 +2772,7 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
-			result = collapse_file(mm, addr, file, start, cc);
+			result = collapse_file(mm, addr, file, start, cc, HPAGE_PMD_ORDER);
 		}
 	}
 
-- 
2.47.3

From nobody Sun Jun 14 15:49:59 2026
From: Baolin Wang
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 03/11] mm: khugepaged: add an order check for PMD-sized THP statistics
Date: Wed, 10 Jun 2026 18:29:11 +0800
Message-ID: <6e8cefeb420866a68353fe8decdb4162c877205e.1781083630.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.43.5
X-Mailing-List: linux-kernel@vger.kernel.org

To support shmem mTHP collapse in the following patches, add a PMD order
check so that the PMD-sized THP statistics are only updated for PMD-sized
collapses. No functional changes.

Signed-off-by: Baolin Wang
---
 mm/khugepaged.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4adc8c6de062..0c8dfbd48410 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2595,8 +2595,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 
 	if (is_shmem) {
 		lruvec_stat_mod_folio(new_folio, NR_SHMEM, nr_pages);
-		lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR);
-	} else {
+		if (is_pmd_order(order))
+			lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR);
+	} else if (is_pmd_order(order)) {
 		lruvec_stat_mod_folio(new_folio, NR_FILE_THPS, HPAGE_PMD_NR);
 	}
 	lruvec_stat_mod_folio(new_folio, NR_FILE_PAGES, nr_pages);
-- 
2.47.3

From nobody Sun Jun 14 15:49:59 2026
From: Baolin Wang
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 04/11] mm: khugepaged: add shmem mTHP collapse support
Date: Wed, 10 Jun 2026 18:29:12 +0800
Message-ID: <3895c09794739bbfae99d9e021ffacf8a51fb8e2.1781083630.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.43.5
X-Mailing-List: linux-kernel@vger.kernel.org

Khugepaged already supports anonymous mTHP collapse. Similarly, let
khugepaged also support shmem mTHP collapse.

The strategy for shmem mTHP collapse follows that of anonymous mTHP
collapse: track present pages via a bitmap while scanning PMD ranges for
collapse candidates. After the scan completes, use the bitmap to determine
the most efficient mTHP order to collapse to. Scale 'max_ptes_none' by the
attempted collapse order to determine the minimum fill threshold for
eligibility.

Similarly, shmem mTHP collapse rejects regions containing swapped-out pages
to avoid creep.

Currently, collapse_pte_mapped_thp() does not build the mapping for mTHP,
because we still expect to establish the mTHP mapping via refault under the
control of fault_around. So collapse_pte_mapped_thp() remains responsible
only for building the mapping for PMD-sized THP, which is reasonable and
keeps things simple.

Note that we do not need to remove pte page tables for shmem mTHP collapse.
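
The bitmap-plus-scaled-threshold strategy described above can be sketched
in userspace as follows. This is only an illustration under stated
assumptions: PMD_ORDER is fixed at 9 (4K pages), the helper names are
hypothetical, and the right-shift scaling rule in scaled_max_ptes_none() is
an assumed form of "scale by the attempted collapse order", not necessarily
what collapse_max_ptes_none() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMD_ORDER 9			/* assumption: 4K pages, 2M PMD */
#define PMD_NR    (1u << PMD_ORDER)

/* Assumed scaling rule: shrink the PMD-level limit along with the order. */
static unsigned int scaled_max_ptes_none(unsigned int max_ptes_none, int order)
{
	assert(order >= 0 && order <= PMD_ORDER);
	assert(max_ptes_none < PMD_NR);
	return max_ptes_none >> (PMD_ORDER - order);
}

/* Count present pages in [offset, offset + nr) of a PMD_NR-bit bitmap. */
static unsigned int count_present(const uint64_t *bitmap, unsigned int offset,
				  unsigned int nr)
{
	unsigned int n = 0;

	for (unsigned int i = offset; i < offset + nr; i++)
		n += (unsigned int)((bitmap[i / 64] >> (i % 64)) & 1u);
	return n;
}

/* True if the aligned region at 'offset' is full enough for 'order'. */
static bool region_eligible(const uint64_t *bitmap, unsigned int offset,
			    int order, unsigned int max_ptes_none)
{
	unsigned int nr = 1u << order;

	assert(offset % nr == 0 && offset + nr <= PMD_NR);
	return count_present(bitmap, offset, nr) >=
	       nr - scaled_max_ptes_none(max_ptes_none, order);
}
```

A caller would walk the enabled orders from largest to smallest and attempt
a collapse at the first order for which region_eligible() holds at a given
aligned offset.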

Signed-off-by: Baolin Wang
---
 mm/khugepaged.c | 115 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 91 insertions(+), 24 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0c8dfbd48410..818d51915748 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -135,6 +135,10 @@ static struct khugepaged_scan khugepaged_scan = {
 	.mm_head = LIST_HEAD_INIT(khugepaged_scan.mm_head),
 };
 
+static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
+				      struct file *file, pgoff_t start,
+				      struct collapse_control *cc, int order);
+
 #ifdef CONFIG_SYSFS
 static ssize_t scan_sleep_millisecs_show(struct kobject *kobj,
 					 struct kobj_attribute *attr,
@@ -1487,6 +1491,7 @@ static unsigned int max_order_from_offset(unsigned int offset)
  * mTHP.
  */
 static enum scan_result mthp_collapse(struct mm_struct *mm,
+		struct file *file, pgoff_t start,
 		unsigned long address, int referenced, int unmapped,
 		struct collapse_control *cc, unsigned long enabled_orders)
 {
@@ -1512,8 +1517,12 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
 		enum scan_result ret;
 
 		collapse_address = address + offset * PAGE_SIZE;
-		ret = collapse_huge_page(mm, collapse_address, referenced,
-					 unmapped, cc, order);
+		if (file)
+			ret = collapse_file(mm, collapse_address, file,
+					    start + offset, cc, order);
+		else
+			ret = collapse_huge_page(mm, collapse_address,
+						 referenced, unmapped, cc, order);
 
 		switch (ret) {
 		/* Cases where we continue to next collapse candidate */
@@ -1521,6 +1530,7 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
 			collapsed += nr_ptes;
 			fallthrough;
 		case SCAN_PTE_MAPPED_HUGEPAGE:
+		case SCAN_PAGE_COMPOUND:
 			goto next_offset;
 		/* Cases where lower orders might still succeed */
 		case SCAN_ALLOC_HUGE_PAGE_FAIL:
@@ -1774,7 +1784,7 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
 	if (result == SCAN_SUCCEED) {
 		/* collapse_huge_page expects the lock to be dropped before calling */
 		mmap_read_unlock(mm);
-		result = mthp_collapse(mm, start_addr, referenced,
+		result = mthp_collapse(mm, NULL, 0, start_addr, referenced,
 				       unmapped, cc, enabled_orders);
 		/* mmap_lock was released above, set lock_dropped */
 		*lock_dropped = true;
@@ -2306,7 +2316,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 
 				if (++nr_none > max_ptes_none) {
 					result = SCAN_EXCEED_NONE_PTE;
-					count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
+					if (is_pmd_order(order))
+						count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
+					count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_NONE);
 					goto xa_locked;
 				}
 
@@ -2316,6 +2328,19 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 
 			if (xa_is_value(folio) || !folio_test_uptodate(folio)) {
 				xas_unlock_irq(&xas);
+
+				/*
+				 * TODO: Support swapin without leading to further mTHP
+				 * collapses. Currently bringing in new pages via swapin may
+				 * cause a future higher order collapse on a rescan of the same
+				 * range.
+				 */
+				if (!is_pmd_order(order)) {
+					count_mthp_stat(order, MTHP_STAT_COLLAPSE_EXCEED_SWAP);
+					result = SCAN_EXCEED_SWAP_PTE;
+					goto xa_unlocked;
+				}
+
 				/* swap in or instantiate fallocated page */
 				if (shmem_get_folio(mapping->host, index, 0,
 						&folio, SGP_NOALLOC)) {
@@ -2399,6 +2424,18 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 			goto out_unlock;
 		}
 
+		/*
+		 * If the folio order is greater than or equal to the collapse
+		 * order, there is no need to continue attempting to collapse.
+		 * Return SCAN_PAGE_COMPOUND instead of SCAN_PTE_MAPPED_HUGEPAGE,
+		 * so that the mapping can be built under the control of
+		 * fault_around when refaulting.
+		 */
+		if (folio_order(folio) >= order) {
+			result = SCAN_PAGE_COMPOUND;
+			goto out_unlock;
+		}
+
 		if (folio_mapping(folio) != mapping) {
 			result = SCAN_TRUNCATED;
 			goto out_unlock;
@@ -2621,12 +2658,11 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 	xas_unlock_irq(&xas);
 
 	/*
-	 * Remove pte page tables, so we can re-fault the page as huge.
-	 * If MADV_COLLAPSE, adjust result to call try_collapse_pte_mapped_thp().
+	 * Remove pte page tables for PMD-sized THP collapse, so we can
+	 * re-fault the page as huge.
 	 */
-	retract_page_tables(mapping, start);
-	if (cc && !cc->is_khugepaged)
-		result = SCAN_PTE_MAPPED_HUGEPAGE;
+	if (is_pmd_order(order))
+		retract_page_tables(mapping, start);
 	folio_unlock(new_folio);
 
 	/*
@@ -2675,22 +2711,35 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 }
 
 static enum scan_result collapse_scan_file(struct mm_struct *mm,
-		unsigned long addr, struct file *file, pgoff_t start,
-		struct collapse_control *cc)
+		struct vm_area_struct *vma, unsigned long addr,
+		struct file *file, pgoff_t start, struct collapse_control *cc)
 {
-	const unsigned int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
+	enum tva_type tva_flags = cc->is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
+	unsigned int max_ptes_none = collapse_max_ptes_none(cc, NULL, HPAGE_PMD_ORDER);
 	const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
-	struct folio *folio = NULL;
 	struct address_space *mapping = file->f_mapping;
 	XA_STATE(xas, &mapping->i_pages, start);
-	int present, swap;
-	int node = NUMA_NO_NODE;
 	enum scan_result result = SCAN_SUCCEED;
+	unsigned long enabled_orders, nr_pages;
+	struct folio *folio = NULL;
+	int node = NUMA_NO_NODE;
+	int present, swap;
+	pgoff_t pgoff;
 
 	present = 0;
 	swap = 0;
+	bitmap_zero(cc->mthp_present_ptes, MAX_PTRS_PER_PTE);
 	memset(cc->node_load, 0, sizeof(cc->node_load));
 	nodes_clear(cc->alloc_nmask);
+
+	enabled_orders = collapse_possible_orders(vma, vma->vm_flags, tva_flags);
+	/*
+	 * If PMD is the only enabled order, enforce max_ptes_none, otherwise
+	 * scan all pages to populate the bitmap for mTHP collapse.
+	 */
+	if (enabled_orders != BIT(HPAGE_PMD_ORDER))
+		max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
+
 	rcu_read_lock();
 	xas_for_each(&xas, folio, start + HPAGE_PMD_NR - 1) {
 		if (xas_retry(&xas, folio))
@@ -2754,7 +2803,17 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 		 * is just too costly...
 		 */
 
-		present += folio_nr_pages(folio);
+		nr_pages = folio_nr_pages(folio);
+		present += nr_pages;
+
+		/*
+		 * If there are folios present, keep track of them in the bitmap
+		 * for file/shmem mTHP collapse.
+		 */
+		pgoff = max_t(pgoff_t, start, folio->index) - start;
+		nr_pages = min_t(int, HPAGE_PMD_NR - pgoff, nr_pages);
+		bitmap_set(cc->mthp_present_ptes, pgoff, nr_pages);
+
 		folio_put(folio);
 
 		if (need_resched()) {
@@ -2768,15 +2827,23 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 	else
 		cc->progress += HPAGE_PMD_NR;
 
-	if (result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - max_ptes_none) {
-			result = SCAN_EXCEED_NONE_PTE;
-			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
-		} else {
-			result = collapse_file(mm, addr, file, start, cc, HPAGE_PMD_ORDER);
-		}
+	if (result != SCAN_SUCCEED)
+		goto out;
+
+	if (present < HPAGE_PMD_NR - max_ptes_none) {
+		result = SCAN_EXCEED_NONE_PTE;
+		count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
+		count_mthp_stat(HPAGE_PMD_ORDER,
+				MTHP_STAT_COLLAPSE_EXCEED_NONE);
+		goto out;
 	}
 
+	result = mthp_collapse(mm, file, start, addr, 0, 0, cc, enabled_orders);
+	if (result == SCAN_SUCCEED && !cc->is_khugepaged) {
+		/* If MADV_COLLAPSE, adjust result to call collapse_pte_mapped_thp(). */
+		result = SCAN_PTE_MAPPED_HUGEPAGE;
+	}
+out:
 	trace_mm_khugepaged_scan_file(mm, folio, file, present, swap, result);
 	return result;
 }
@@ -2808,7 +2875,7 @@ static enum scan_result collapse_single_pmd(unsigned long addr,
 		mmap_read_unlock(mm);
 		*lock_dropped = true;
 retry:
-		result = collapse_scan_file(mm, addr, file, pgoff, cc);
+		result = collapse_scan_file(mm, vma, addr, file, pgoff, cc);
 
 		/*
 		 * For MADV_COLLAPSE, when encountering dirty pages, try to writeback,
-- 
2.47.3

From nobody Sun Jun 14 15:49:59 2026
From: Baolin Wang
To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 05/11] mm: shmem: run khugepaged for all shmem mTHP orders
Date: Wed, 10 Jun 2026 18:29:13 +0800
Message-ID: <593cc1b1a876dc0f89bb7e3585db7001b44c0061.1781083630.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.43.5
X-Mailing-List: linux-kernel@vger.kernel.org

When only non-PMD-sized mTHP is enabled (such as only 64K mTHP enabled), we should also allow
kicking khugepaged to attempt scanning and collapsing 64K shmem mTHP.
Modify shmem_hpage_pmd_enabled() to support shmem mTHP collapse, and
while we are at it, rename it to make the function name more clear.

Signed-off-by: Baolin Wang
---
 include/linux/shmem_fs.h |  4 ++--
 mm/khugepaged.c          |  2 +-
 mm/shmem.c               | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index acb8dd961b45..1ec358b40c9b 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -131,7 +131,7 @@ int shmem_unuse(unsigned int type);
 unsigned long shmem_allowable_huge_orders(struct inode *inode,
 				struct vm_area_struct *vma, pgoff_t index,
 				loff_t write_end, bool shmem_huge_force);
-bool shmem_hpage_pmd_enabled(void);
+bool shmem_hpage_enabled(void);
 #else
 static inline unsigned long shmem_allowable_huge_orders(struct inode *inode,
 				struct vm_area_struct *vma, pgoff_t index,
@@ -140,7 +140,7 @@ static inline unsigned long shmem_allowable_huge_orders(struct inode *inode,
 	return 0;
 }
 
-static inline bool shmem_hpage_pmd_enabled(void)
+static inline bool shmem_hpage_enabled(void)
 {
 	return false;
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 818d51915748..75b18ec4a6c3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -532,7 +532,7 @@ static bool hugepage_enabled(void)
 		return true;
 	if (anon_hpage_enabled())
 		return true;
-	if (shmem_hpage_pmd_enabled())
+	if (shmem_hpage_enabled())
 		return true;
 	return false;
 }
diff --git a/mm/shmem.c b/mm/shmem.c
index 56c23a7b15c7..a8d30a123b1f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1792,17 +1792,17 @@ static struct folio *shmem_swapin_cluster(swp_entry_t swap, gfp_t gfp,
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-bool shmem_hpage_pmd_enabled(void)
+bool shmem_hpage_enabled(void)
 {
 	if (shmem_huge == SHMEM_HUGE_DENY)
 		return false;
-	if (test_bit(HPAGE_PMD_ORDER, &huge_shmem_orders_always))
+	if (READ_ONCE(huge_shmem_orders_always))
 		return true;
-	if
(test_bit(HPAGE_PMD_ORDER, &huge_shmem_orders_madvise)) + if (READ_ONCE(huge_shmem_orders_madvise)) return true; - if (test_bit(HPAGE_PMD_ORDER, &huge_shmem_orders_within_size)) + if (READ_ONCE(huge_shmem_orders_within_size)) return true; - if (test_bit(HPAGE_PMD_ORDER, &huge_shmem_orders_inherit) && + if (READ_ONCE(huge_shmem_orders_inherit) && shmem_huge !=3D SHMEM_HUGE_NEVER) return true; =20 --=20 2.47.3 From nobody Sun Jun 14 15:49:59 2026 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 260943F0763; Wed, 10 Jun 2026 10:29:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087381; cv=none; b=IAPP57rYHb+QCXbf3RUPoxRlmjsAgtjni1CvbgURS2Wxp74iHgkjDwWFtn8iKz+6j/mo0EzMhEY4oIT30PlDuJzhfRjabaM0iMwpob562v6WWIApPoh6DWkjnclW5xXAKQaDdsgYAfDAvzgBk2VWHMzSy9jhR9ERly4pNGBdVKE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087381; c=relaxed/simple; bh=UtQCH3uy20HwxtFrwP88iG9vYt2++QDI+vc0wtIbR4w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AVvy2WD6XiOg49OgZ29aJ/YbOglwIcxE5By/HtlbVLzL4lcC7wN+oICQ9z/n271jpxa1UMLnQHGynJJDo5BrQbV3yAlu8vr5uPy9umMkwB5Ivh4a9LdmTZ8FifOGQRrgEqOf1c7xi/vh+RK18ZX0lckAthd4JxfjB0DS0ySD6uw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=spSEVkFJ; arc=none smtp.client-ip=115.124.30.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="spSEVkFJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087376; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=zdfnlijOCrbir+VZ2n7kQKRLZHQg8II97GPm1C9tmhA=; b=spSEVkFJNbbSA0wXxQWDGpgZYN8zo1W2nNG+hiSbNvKSj3qmuz1t6pwb33/TW0Uq6qANO3sKj0cCHuZK6XcIffzZ8ck3bEhAJLaBguX9V4g5u+EgZ5biCChI1Guxuc+O3bQaezr1TdXJID/skf2+5WPvBPY1C7hHiaR01aL/4y8= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R861e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aHmiG_1781087374; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0X4aHmiG_1781087374 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:34 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 06/11] mm: khugepaged: allow khugepaged to check all shmem mTHP-sized orders Date: Wed, 10 Jun 2026 18:29:14 +0800 Message-ID: X-Mailer: git-send-email 2.43.5 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We are now ready to enable shmem mTHP collapse, allowing thp_vma_allowable_orders() to check all permissible shmem large orders. 
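
For illustration only (this sketch is not part of the patch): the order
selection done by collapse_possible_orders() can be modelled in plain
userspace C. The names candidate_orders(), highest_order_in(),
enum vma_kind and PMD_ORDER_SKETCH are hypothetical. The two mask macros
only assume the general shape of THP_ORDERS_ALL_ANON and
THP_ORDERS_ALL_FILE_DEFAULT (anon excluding orders 0 and 1, file/shmem
excluding order 0, both capped here at an assumed PMD order of 9); the
real definitions live in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real definitions live in the kernel headers. */
#define BIT(n)            (1UL << (n))
#define PMD_ORDER_SKETCH  9	/* assumption: 4K base pages, 2M PMD */

/* Assumed mask shapes: anon skips orders 0-1, file/shmem skips order 0. */
#define ORDERS_ALL_ANON  ((BIT(PMD_ORDER_SKETCH + 1) - 1) & ~(BIT(0) | BIT(1)))
#define ORDERS_ALL_FILE  ((BIT(PMD_ORDER_SKETCH + 1) - 1) & ~BIT(0))

enum vma_kind { VMA_KIND_ANON, VMA_KIND_SHMEM, VMA_KIND_FILE };

/*
 * Models the if / else-if / else chain: only a khugepaged scan of an
 * anonymous or shmem VMA gets the mTHP orders, everything else is
 * limited to the PMD order.
 */
static unsigned long candidate_orders(enum vma_kind kind, bool from_khugepaged)
{
	if (from_khugepaged && kind == VMA_KIND_ANON)
		return ORDERS_ALL_ANON;
	if (from_khugepaged && kind == VMA_KIND_SHMEM)
		return ORDERS_ALL_FILE;
	return BIT(PMD_ORDER_SKETCH);
}

/* Index of the highest set bit, i.e. the largest order in the mask. */
static int highest_order_in(unsigned long orders)
{
	int order = -1;

	assert(orders != 0);
	while (orders) {
		orders >>= 1;
		order++;
	}
	return order;
}
```

Under these assumed masks, a khugepaged scan of a shmem VMA gets orders
1 through 9 as candidates, while any non-khugepaged caller still gets
only the PMD order.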
Signed-off-by: Baolin Wang
---
 mm/khugepaged.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 75b18ec4a6c3..a87918b7e18c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -578,9 +578,14 @@ static unsigned long collapse_possible_orders(struct vm_area_struct *vma,
 {
 	unsigned long orders;
 
-	/* If khugepaged is scanning an anonymous vma, allow mTHP collapse */
+	/*
+	 * If khugepaged is scanning an anonymous or shmem vma,
+	 * allow mTHP collapse.
+	 */
 	if ((tva_flags == TVA_KHUGEPAGED) && vma_is_anonymous(vma))
 		orders = THP_ORDERS_ALL_ANON;
+	else if ((tva_flags == TVA_KHUGEPAGED) && vma_is_shmem(vma))
+		orders = THP_ORDERS_ALL_FILE_DEFAULT;
 	else
 		orders = BIT(HPAGE_PMD_ORDER);
 
-- 
2.47.3

From nobody Sun Jun 14 15:49:59 2026
Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1D143F7878;
	Wed, 10 Jun 2026 10:29:44 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1781087386; cv=none;
	b=SU57mo3Exf0nFjGbijKEJO8ZzOy1jnCvXD4Ud69Mh7rINpmm+ytXy+fgroyHY0KdJgiU2OShLCixkkIA+OKt1/xOATF+XFvE8ZcGRX0/2wTOqb6bbeMYengrD32J/HraXtoa+KhxqMr5Piic9G0wBNf7ch5cjG11/m7UD1BoM78=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1781087386; c=relaxed/simple;
	bh=wHobtzCR1hOAstkHCejifG+Mt/lK6w99MKs1TZrg2bo=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=pwU6PUru177AXIcHdHGxg1Pxxhzd2cT26VNLfLAmEfqRC0MIiyfiVaglX73tYfvW6Ynzcv/OIaWWtQtH6n34EO/GqOUbVMd/3ykNTNFJJmbDdfDgSuzai3OpKIsmIjSq4wh3ruh22kqnTcJ8En26OgIo9XATNb17Y62MoVPqp/Y=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none)
header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=D29d5QTk; arc=none smtp.client-ip=115.124.30.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="D29d5QTk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087377; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=CJahy7zB3OklBuIOghVRfBNSCYpp64IAvawu3272VZE=; b=D29d5QTkxb/C2yMCfF/Lbh7dMlS7VhvdNp2Xuv7PPCKnKnSzvVDWNeH2Y2Q/oJWDz/3eGM5W9E1JBo9apytKPLpSx7OCMnP5mbY9DHAguqos+lcODcb/x/FJAwPz1n/Oo+sJE0BA3Xd4vwFzQuHPuLtf1Qi53oGrwSxfzeScfZA= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045133197;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aHmig_1781087375; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0X4aHmig_1781087375 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:35 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 07/11] mm: khugepaged: skip large folios that don't need to be collapsed Date: Wed, 10 Jun 2026 18:29:15 +0800 Message-ID: <9e05e5ca589e1156f75078951efe874118f88587.1781083630.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: References: 
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

If a VMA already maps large folios after a successful mTHP collapse, we
can skip folios whose order has reached or exceeds 'highest_enabled_order'
when scanning the VMA range again, as they cannot be collapsed any
further. This avoids wasting CPU cycles.

Signed-off-by: Baolin Wang
---
 mm/khugepaged.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a87918b7e18c..a9664ac26f11 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2724,12 +2724,12 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 	const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
 	struct address_space *mapping = file->f_mapping;
 	XA_STATE(xas, &mapping->i_pages, start);
+	unsigned int highest_enabled_order = 0;
 	enum scan_result result = SCAN_SUCCEED;
 	unsigned long enabled_orders, nr_pages;
 	struct folio *folio = NULL;
 	int node = NUMA_NO_NODE;
 	int present, swap;
-	pgoff_t pgoff;
 
 	present = 0;
 	swap = 0;
@@ -2738,6 +2738,9 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 	nodes_clear(cc->alloc_nmask);
 
 	enabled_orders = collapse_possible_orders(vma, vma->vm_flags, tva_flags);
+	if (enabled_orders > 0)
+		highest_enabled_order = highest_order(enabled_orders);
+
 	/*
 	 * If PMD is the only enabled order, enforce max_ptes_none, otherwise
 	 * scan all pages to populate the bitmap for mTHP collapse.
@@ -2814,10 +2817,17 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 		/*
 		 * If there are folios present, keep track of it in the bitmap
 		 * for file/shmem mTHP collapse.
+		 *
+		 * Skip those folios whose order has already exceeded the
+		 * 'highest_enabled_order', meaning they cannot be collapsed
+		 * into larger order folios.
 		 */
-		pgoff = max_t(pgoff_t, start, folio->index) - start;
-		nr_pages = min_t(int, HPAGE_PMD_NR - pgoff, nr_pages);
-		bitmap_set(cc->mthp_present_ptes, pgoff, nr_pages);
+		if (folio_order(folio) < highest_enabled_order) {
+			pgoff_t pgoff = max_t(pgoff_t, start, folio->index) - start;
+
+			nr_pages = min_t(int, HPAGE_PMD_NR - pgoff, nr_pages);
+			bitmap_set(cc->mthp_present_ptes, pgoff, nr_pages);
+		}
 
 		folio_put(folio);
 
@@ -2843,6 +2853,11 @@ static enum scan_result collapse_scan_file(struct mm_struct *mm,
 		goto out;
 	}
 
+	if (bitmap_empty(cc->mthp_present_ptes, MAX_PTRS_PER_PTE)) {
+		result = SCAN_FAIL;
+		goto out;
+	}
+
 	result = mthp_collapse(mm, file, start, addr, 0, 0, cc, enabled_orders);
 	if (result == SCAN_SUCCEED && !cc->is_khugepaged) {
 		/* If MADV_COLLAPSE, adjust result to call collapse_pte_mapped_thp().
*/ --=20 2.47.3 From nobody Sun Jun 14 15:49:59 2026 Received: from out30-111.freemail.mail.aliyun.com (out30-111.freemail.mail.aliyun.com [115.124.30.111]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D30853F164A; Wed, 10 Jun 2026 10:29:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.111 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087389; cv=none; b=nJFeayQpsiC3dqbTrusKB1oxETOI8g0Q7CxpRgOTSP5k4G68uds9JUljsRZXjhGFquzPnJAVYdIjMhNEAEMYia/6SF/MvHm8A1NAREeNv+PIdtb7V++K0QkLI26kxscJzHCj/7xhLGZ1vkLWpDuXoCTJ6LBg7cBj0GP19ZirBwA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087389; c=relaxed/simple; bh=Q6ZsVjM/mdnToVXTY6XWCc7GaRcTellzi0E4ctLj0Vo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pMv8ZGMMC7+LRzJ1MMPz+WTyvKB8zqVMo65RqSjumV1ktFo2OilwsGTpXSKjOqOg7l1+c8amM2KCFeCi+Qy+75a8JqPx4q4xinDGCSrhbKmB4RgsSsnS3a1xsdmHjeNrOjJMmBA4+mzLvvW0kXudlUlPQucDTR+AAON771Hl6Ks= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=xYEFHfS+; arc=none smtp.client-ip=115.124.30.111 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="xYEFHfS+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087377; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=vrIcOXzwuiScXs6fHd7aW1v6ogQNMzf9EWC7m6bJ12Q=; 
b=xYEFHfS+dn0MNHk7UeH9npfsxLbL/bBbMeDq81onqylfWjSKTYvdl6HknsYGXUvdWOcznqp4PacvdW2cn+wdHuM0ckw4hFOZf3/PEraHumIDA9y2dSBCiCZu6pdDHOixPkTdho4D7KYzgh/SBqi8Q9++yjpJ7kWtqQ9E5BbeQfE= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R531e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032089153;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aU09U_1781087376; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0X4aU09U_1781087376 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:36 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 08/11] selftests: mm: extend the check_huge() to support mTHP check Date: Wed, 10 Jun 2026 18:29:16 +0800 Message-ID: X-Mailer: git-send-email 2.43.5 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To support checking for various sized mTHPs during mTHP collapse, extend the check_huge() function prototype to accept two new parameters specifying the address range and mTHP size, in preparation for the following patches. No functional changes. 
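
For illustration only (not part of this patch): a minimal userspace
sketch of the extended four-argument check_huge() call convention. The
names mem_ops_sketch, seen_args, fake_check_huge() and check_pmd_range()
are hypothetical, and the fake backend merely records its arguments and
applies a simple bound instead of parsing smaps as the real helpers do.

```c
#include <stdbool.h>

/* Hypothetical ops table using the extended four-argument signature. */
struct mem_ops_sketch {
	bool (*check_huge)(void *addr, unsigned long size,
			   int nr_hpages, unsigned long hpage_size);
};

/* Arguments seen by the fake checker, recorded for inspection. */
struct seen_args {
	unsigned long size;
	int nr_hpages;
	unsigned long hpage_size;
};

static struct seen_args last_call;

/* Fake backend: records its arguments instead of parsing smaps. */
static bool fake_check_huge(void *addr, unsigned long size,
			    int nr_hpages, unsigned long hpage_size)
{
	(void)addr;
	last_call.size = size;
	last_call.nr_hpages = nr_hpages;
	last_call.hpage_size = hpage_size;
	/* A range can hold at most size / hpage_size huge pages. */
	return hpage_size != 0 &&
	       (unsigned long)nr_hpages <= size / hpage_size;
}

/* PMD-sized call pattern: the range covers nr_hpages PMD-sized pages. */
static bool check_pmd_range(const struct mem_ops_sketch *ops, void *p,
			    int nr_hpages, int expect_nr,
			    unsigned long pmd_size)
{
	return ops->check_huge(p, (unsigned long)nr_hpages * pmd_size,
			       expect_nr, pmd_size);
}
```

Passing the range size separately from the huge page size is what lets a
later caller ask about a size smaller than the PMD size over the same
range without changing the ops signature again.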
Signed-off-by: Baolin Wang --- .../selftests/mm/folio_split_race_test.c | 2 +- tools/testing/selftests/mm/khugepaged.c | 66 ++++++++++--------- .../testing/selftests/mm/prctl_thp_disable.c | 2 +- tools/testing/selftests/mm/soft-dirty.c | 2 +- .../selftests/mm/split_huge_page_test.c | 14 ++-- tools/testing/selftests/mm/uffd-common.c | 4 +- tools/testing/selftests/mm/vm_util.c | 6 +- tools/testing/selftests/mm/vm_util.h | 6 +- 8 files changed, 55 insertions(+), 47 deletions(-) diff --git a/tools/testing/selftests/mm/folio_split_race_test.c b/tools/tes= ting/selftests/mm/folio_split_race_test.c index 6329e37fff4c..45b84f7b364e 100644 --- a/tools/testing/selftests/mm/folio_split_race_test.c +++ b/tools/testing/selftests/mm/folio_split_race_test.c @@ -182,7 +182,7 @@ static uint64_t run_iteration(void) for (i =3D 0; i < TOTAL_PAGES; i++) fill_page(mmap_base, i); =20 - if (!check_huge_shmem(mmap_base, NR_PMD_PAGE, pmd_pagesize)) + if (!check_huge_shmem(mmap_base, FILE_SIZE, NR_PMD_PAGE, pmd_pagesize)) ksft_exit_fail_msg("No shmem THP is allocated\n"); =20 if (pthread_barrier_init(&ctl.barrier, NULL, NUM_READER_THREADS + 1) !=3D= 0) diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selfte= sts/mm/khugepaged.c index 10e8dedcb087..f69be6be0ecd 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -51,7 +51,7 @@ struct mem_ops { void *(*setup_area)(int nr_hpages); void (*cleanup_area)(void *p, unsigned long size); void (*fault)(void *p, unsigned long start, unsigned long end); - bool (*check_huge)(void *addr, int nr_hpages); + bool (*check_huge)(void *addr, unsigned long size, int nr_hpages, unsigne= d long hpage_size); const char *name; }; =20 @@ -276,7 +276,7 @@ static void *alloc_hpage(struct mem_ops *ops) ksft_print_msg("Allocate huge page..."); if (madvise_collapse_retry(p, hpage_pmd_size)) ksft_exit_fail_perror("madvise(MADV_COLLAPSE)"); - if (!ops->check_huge(p, 1)) + if (!ops->check_huge(p, 
hpage_pmd_size, 1, hpage_pmd_size)) ksft_exit_fail_perror("madvise(MADV_COLLAPSE)"); if (madvise(p, hpage_pmd_size, MADV_HUGEPAGE)) ksft_exit_fail_perror("madvise(MADV_HUGEPAGE)"); @@ -310,9 +310,10 @@ static void anon_fault(void *p, unsigned long start, u= nsigned long end) fill_memory(p, start, end); } =20 -static bool anon_check_huge(void *addr, int nr_hpages) +static bool anon_check_huge(void *addr, unsigned long size, + int nr_hpages, unsigned long hpage_size) { - return check_huge_anon(addr, nr_hpages, hpage_pmd_size); + return check_huge_anon(addr, size, nr_hpages, hpage_size); } =20 static void *file_setup_area_common(int nr_hpages, enum file_setup_ops set= up) @@ -412,13 +413,14 @@ static void file_fault_write(void *p, unsigned long s= tart, unsigned long end) ksft_exit_fail_perror("madvise(MADV_POPULATE_WRITE)"); } =20 -static bool file_check_huge(void *addr, int nr_hpages) +static bool file_check_huge(void *addr, unsigned long size, + int nr_hpages, unsigned long hpage_size) { switch (finfo.type) { case VMA_FILE: - return check_huge_file(addr, nr_hpages, hpage_pmd_size); + return check_huge_file(addr, size, nr_hpages, hpage_size); case VMA_SHMEM: - return check_huge_shmem(addr, nr_hpages, hpage_pmd_size); + return check_huge_shmem(addr, size, nr_hpages, hpage_size); default: exit(EXIT_FAILURE); return false; @@ -448,9 +450,10 @@ static void shmem_cleanup_area(void *p, unsigned long = size) close(finfo.fd); } =20 -static bool shmem_check_huge(void *addr, int nr_hpages) +static bool shmem_check_huge(void *addr, unsigned long size, + int nr_hpages, unsigned long hpage_size) { - return check_huge_shmem(addr, nr_hpages, hpage_pmd_size); + return check_huge_shmem(addr, size, nr_hpages, hpage_size); } =20 static struct mem_ops __anon_ops =3D { @@ -533,7 +536,7 @@ static void __madvise_collapse(const char *msg, char *p= , int nr_hpages, ret =3D madvise_collapse_retry(p, nr_hpages * hpage_pmd_size); if (((bool)ret) =3D=3D expect) fail("Fail: Bad return value"); - 
else if (!ops->check_huge(p, expect ? nr_hpages : 0)) + else if (!ops->check_huge(p, nr_hpages * hpage_pmd_size, expect ? nr_hpag= es : 0, hpage_pmd_size)) fail("Fail: check_huge()"); else success("OK"); @@ -545,7 +548,7 @@ static void madvise_collapse(const char *msg, char *p, = int nr_hpages, struct mem_ops *ops, bool expect) { /* Sanity check */ - if (!ops->check_huge(p, 0)) + if (!ops->check_huge(p, nr_hpages * hpage_pmd_size, 0, hpage_pmd_size)) ksft_exit_fail_msg("Unexpected huge page\n"); __madvise_collapse(msg, p, nr_hpages, ops, expect); } @@ -554,11 +557,12 @@ static void madvise_collapse(const char *msg, char *p= , int nr_hpages, static bool wait_for_scan(const char *msg, char *p, int nr_hpages, struct mem_ops *ops) { + unsigned long size =3D nr_hpages * hpage_pmd_size; int full_scans; int timeout =3D 6; /* 3 seconds */ =20 /* Sanity check */ - if (!ops->check_huge(p, 0)) + if (!ops->check_huge(p, size, 0, hpage_pmd_size)) ksft_exit_fail_msg("Unexpected huge page\n"); =20 madvise(p, nr_hpages * hpage_pmd_size, MADV_HUGEPAGE); @@ -568,7 +572,7 @@ static bool wait_for_scan(const char *msg, char *p, int= nr_hpages, =20 ksft_print_msg("%s...", msg); while (timeout--) { - if (ops->check_huge(p, nr_hpages)) + if (ops->check_huge(p, size, nr_hpages, hpage_pmd_size)) break; if (thp_read_num("khugepaged/full_scans") >=3D full_scans) break; @@ -582,6 +586,8 @@ static bool wait_for_scan(const char *msg, char *p, int= nr_hpages, static void khugepaged_collapse(const char *msg, char *p, int nr_hpages, struct mem_ops *ops, bool expect) { + unsigned long size =3D nr_hpages * hpage_pmd_size; + /* * read&write file collapse fails since khugepaged does not flush * the target dirty folios @@ -605,7 +611,7 @@ static void khugepaged_collapse(const char *msg, char *= p, int nr_hpages, if (ops !=3D &__anon_ops) ops->fault(p, 0, nr_hpages * hpage_pmd_size); =20 - if (ops->check_huge(p, expect ? nr_hpages : 0)) + if (ops->check_huge(p, size, expect ? 
nr_hpages : 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -634,7 +640,7 @@ static void alloc_at_fault(void) p =3D alloc_mapping(1); *p =3D 1; ksft_print_msg("Allocate huge page on fault..."); - if (check_huge_anon(p, 1, hpage_pmd_size)) + if (check_huge_anon(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -643,7 +649,7 @@ static void alloc_at_fault(void) =20 madvise(p, page_size, MADV_DONTNEED); ksft_print_msg("Split huge PMD on MADV_DONTNEED..."); - if (check_huge_anon(p, 0, hpage_pmd_size)) + if (check_huge_anon(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -815,7 +821,7 @@ static void collapse_single_pte_entry_compound(struct c= ollapse_context *c, struc madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE); ksft_print_msg("Split huge page leaving single PTE mapping compound page.= .."); madvise(p + page_size, hpage_pmd_size - page_size, MADV_DONTNEED); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -836,7 +842,7 @@ static void collapse_full_of_compound(struct collapse_c= ontext *c, struct mem_ops ksft_print_msg("Split huge page leaving single PTE page table full of com= pound pages..."); madvise(p, page_size, MADV_NOHUGEPAGE); madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -858,7 +864,7 @@ static void collapse_compound_extreme(struct collapse_c= ontext *c, struct mem_ops for (i =3D 0; i < hpage_pmd_nr; i++) { madvise(BASE_ADDR, hpage_pmd_size, MADV_HUGEPAGE); ops->fault(BASE_ADDR, 0, hpage_pmd_size); - if (!ops->check_huge(BASE_ADDR, 1)) + if (!ops->check_huge(BASE_ADDR, hpage_pmd_size, 1, hpage_pmd_size)) ksft_exit_fail_msg("Failed to allocate huge page\n"); madvise(BASE_ADDR, hpage_pmd_size, MADV_NOHUGEPAGE); =20 @@ -881,7 +887,7 @@ static void collapse_compound_extreme(struct collapse_c= ontext *c, 
struct mem_ops =20 ops->cleanup_area(BASE_ADDR, hpage_pmd_size); ops->fault(p, 0, hpage_pmd_size); - if (!ops->check_huge(p, 1)) + if (!ops->check_huge(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -903,7 +909,7 @@ static void collapse_fork(struct collapse_context *c, s= truct mem_ops *ops) =20 ksft_print_msg("Allocate small page..."); ops->fault(p, 0, page_size); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -911,7 +917,7 @@ static void collapse_fork(struct collapse_context *c, s= truct mem_ops *ops) ksft_print_msg("Share small page over fork()..."); if (!fork()) { /* Do not touch settings on child exit */ - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -929,7 +935,7 @@ static void collapse_fork(struct collapse_context *c, s= truct mem_ops *ops) exit_status =3D WEXITSTATUS(wstatus); =20 ksft_print_msg("Check if parent still has small page..."); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -947,7 +953,7 @@ static void collapse_fork_compound(struct collapse_cont= ext *c, struct mem_ops *o ksft_print_msg("Share huge page over fork()..."); if (!fork()) { /* Do not touch settings on child exit */ - if (ops->check_huge(p, 1)) + if (ops->check_huge(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -955,7 +961,7 @@ static void collapse_fork_compound(struct collapse_cont= ext *c, struct mem_ops *o ksft_print_msg("Split huge page PMD in child process..."); madvise(p, page_size, MADV_NOHUGEPAGE); madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -976,7 +982,7 @@ static void collapse_fork_compound(struct collapse_cont= ext *c, struct mem_ops *o exit_status =3D 
WEXITSTATUS(wstatus); =20 ksft_print_msg("Check if parent still has huge page..."); - if (ops->check_huge(p, 1)) + if (ops->check_huge(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -995,7 +1001,7 @@ static void collapse_max_ptes_shared(struct collapse_c= ontext *c, struct mem_ops ksft_print_msg("Share huge page over fork()..."); if (!fork()) { /* Do not touch settings on child exit */ - if (ops->check_huge(p, 1)) + if (ops->check_huge(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -1003,7 +1009,7 @@ static void collapse_max_ptes_shared(struct collapse_= context *c, struct mem_ops ksft_print_msg("Trigger CoW on page %d of %d...", hpage_pmd_nr - max_ptes_shared - 1, hpage_pmd_nr); ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - 1) * page_size); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -1016,7 +1022,7 @@ static void collapse_max_ptes_shared(struct collapse_= context *c, struct mem_ops hpage_pmd_nr - max_ptes_shared, hpage_pmd_nr); ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared) * page_size); - if (ops->check_huge(p, 0)) + if (ops->check_huge(p, hpage_pmd_size, 0, hpage_pmd_size)) success("OK"); else fail("Fail"); @@ -1034,7 +1040,7 @@ static void collapse_max_ptes_shared(struct collapse_= context *c, struct mem_ops exit_status =3D WEXITSTATUS(wstatus); =20 ksft_print_msg("Check if parent still has huge page..."); - if (ops->check_huge(p, 1)) + if (ops->check_huge(p, hpage_pmd_size, 1, hpage_pmd_size)) success("OK"); else fail("Fail"); diff --git a/tools/testing/selftests/mm/prctl_thp_disable.c b/tools/testing= /selftests/mm/prctl_thp_disable.c index d8d9d1de57b8..82c6e96ea6eb 100644 --- a/tools/testing/selftests/mm/prctl_thp_disable.c +++ b/tools/testing/selftests/mm/prctl_thp_disable.c @@ -67,7 +67,7 @@ static int test_mmap_thp(enum thp_collapse_type madvise_b= uf, size_t pmdsize) /* HACK: make sure we have a separate 
VMA that we can check reliably. */ mprotect(mem, pmdsize, PROT_READ); =20 - ret =3D check_huge_anon(mem, 1, pmdsize); + ret =3D check_huge_anon(mem, pmdsize, 1, pmdsize); munmap(mmap_mem, mmap_size); return ret; } diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selfte= sts/mm/soft-dirty.c index fb1864a68e1c..e198facf78bb 100644 --- a/tools/testing/selftests/mm/soft-dirty.c +++ b/tools/testing/selftests/mm/soft-dirty.c @@ -103,7 +103,7 @@ static void test_hugepage(int pagemap_fd, int pagesize) for (i =3D 0; i < hpage_len; i++) map[i] =3D (char)i; =20 - if (check_huge_anon(map, 1, hpage_len)) { + if (check_huge_anon(map, hpage_len, 1, hpage_len)) { ksft_test_result_pass("Test %s huge page allocation\n", __func__); =20 clear_softdirty(); diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/test= ing/selftests/mm/split_huge_page_test.c index 32b991472f74..4cc70873a674 100644 --- a/tools/testing/selftests/mm/split_huge_page_test.c +++ b/tools/testing/selftests/mm/split_huge_page_test.c @@ -296,7 +296,7 @@ static void verify_rss_anon_split_huge_page_all_zeroes(= char *one_page, int nr_hp unsigned long rss_anon_before, rss_anon_after; size_t i; =20 - if (!check_huge_anon(one_page, nr_hpages, pmd_pagesize)) + if (!check_huge_anon(one_page, nr_hpages * pmd_pagesize, nr_hpages, pmd_p= agesize)) ksft_exit_fail_msg("No THP is allocated\n"); =20 rss_anon_before =3D rss_anon(); @@ -311,7 +311,7 @@ static void verify_rss_anon_split_huge_page_all_zeroes(= char *one_page, int nr_hp if (one_page[i] !=3D (char)0) ksft_exit_fail_msg("%ld byte corrupted\n", i); =20 - if (!check_huge_anon(one_page, 0, pmd_pagesize)) + if (!check_huge_anon(one_page, nr_hpages * pmd_pagesize, 0, pmd_pagesize)) ksft_exit_fail_msg("Still AnonHugePages not split\n"); =20 rss_anon_after =3D rss_anon(); @@ -347,7 +347,7 @@ static void split_pmd_thp_to_order(int order) for (i =3D 0; i < len; i++) one_page[i] =3D (char)i; =20 - if (!check_huge_anon(one_page, 4, 
pmd_pagesize)) + if (!check_huge_anon(one_page, 4 * pmd_pagesize, 4, pmd_pagesize)) ksft_exit_fail_msg("No THP is allocated\n"); =20 /* split all THPs */ @@ -366,7 +366,7 @@ static void split_pmd_thp_to_order(int order) (pmd_order + 1))) ksft_exit_fail_msg("Unexpected THP split\n"); =20 - if (!check_huge_anon(one_page, 0, pmd_pagesize)) + if (!check_huge_anon(one_page, 4 * pmd_pagesize, 0, pmd_pagesize)) ksft_exit_fail_msg("Still AnonHugePages not split\n"); =20 ksft_test_result_pass("Split huge pages to order %d successful\n", order); @@ -393,7 +393,7 @@ static void split_pte_mapped_thp(void) for (i =3D 0; i < thp_area_size; i++) thp_area[i] =3D (char)i; =20 - if (!check_huge_anon(thp_area, nr_thps, pmd_pagesize)) { + if (!check_huge_anon(thp_area, nr_thps * pmd_pagesize, nr_thps, pmd_pages= ize)) { ksft_test_result_skip("Not all THPs allocated\n"); goto out; } @@ -657,7 +657,7 @@ static int create_pagecache_thp_and_fd(const char *test= file, size_t fd_size, =20 force_read_pages(*addr, fd_size / pmd_pagesize, pmd_pagesize); =20 - if (!check_huge_file(*addr, fd_size / pmd_pagesize, pmd_pagesize)) { + if (!check_huge_file(*addr, fd_size, fd_size / pmd_pagesize, pmd_pagesize= )) { ksft_print_msg("No large pagecache folio generated, please provide a fil= esystem supporting large folio\n"); munmap(*addr, fd_size); close(*fd); @@ -735,7 +735,7 @@ static void split_thp_in_pagecache_to_order_at(size_t f= d_size, goto out; } =20 - if (!check_huge_file(addr, 0, pmd_pagesize)) { + if (!check_huge_file(addr, fd_size, 0, pmd_pagesize)) { ksft_print_msg("Still FilePmdMapped not split\n"); err =3D EXIT_FAILURE; goto out; diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selft= ests/mm/uffd-common.c index edd02328f77b..777f276044e2 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -194,7 +194,9 @@ static void shmem_alias_mapping(uffd_global_test_opts_t= *gopts, __u64 *start, =20 static void 
shmem_check_pmd_mapping(uffd_global_test_opts_t *gopts, void *= p, int expect_nr_hpages) { - if (!check_huge_shmem(gopts->area_dst_alias, expect_nr_hpages, + unsigned long size =3D expect_nr_hpages * read_pmd_pagesize(); + + if (!check_huge_shmem(gopts->area_dst_alias, size, expect_nr_hpages, read_pmd_pagesize())) err("Did not find expected %d number of hugepages", expect_nr_hpages); diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests= /mm/vm_util.c index 311fc5b4513e..b43adfa92116 100644 --- a/tools/testing/selftests/mm/vm_util.c +++ b/tools/testing/selftests/mm/vm_util.c @@ -247,17 +247,17 @@ bool __check_huge(void *addr, char *pattern, int nr_h= pages, return thp =3D=3D (nr_hpages * (hpage_size >> 10)); } =20 -bool check_huge_anon(void *addr, int nr_hpages, uint64_t hpage_size) +bool check_huge_anon(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size) { return __check_huge(addr, "AnonHugePages: ", nr_hpages, hpage_size); } =20 -bool check_huge_file(void *addr, int nr_hpages, uint64_t hpage_size) +bool check_huge_file(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size) { return __check_huge(addr, "FilePmdMapped:", nr_hpages, hpage_size); } =20 -bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size) +bool check_huge_shmem(void *addr, unsigned long size, int nr_hpages, uint6= 4_t hpage_size) { return __check_huge(addr, "ShmemPmdMapped:", nr_hpages, hpage_size); } diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests= /mm/vm_util.h index ea8fc8fdf0eb..40c6d8c4f1b8 100644 --- a/tools/testing/selftests/mm/vm_util.h +++ b/tools/testing/selftests/mm/vm_util.h @@ -90,9 +90,9 @@ void clear_softdirty(void); bool check_for_pattern(FILE *fp, const char *pattern, char *buf, size_t le= n); uint64_t read_pmd_pagesize(void); unsigned long rss_anon(void); -bool check_huge_anon(void *addr, int nr_hpages, uint64_t hpage_size); -bool check_huge_file(void *addr, int nr_hpages, uint64_t 
hpage_size); -bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size); +bool check_huge_anon(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size); +bool check_huge_file(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size); +bool check_huge_shmem(void *addr, unsigned long size, int nr_hpages, uint6= 4_t hpage_size); int64_t allocate_transhuge(void *ptr, int pagemap_fd); int pageflags_get(unsigned long pfn, int kpageflags_fd, uint64_t *flags); =20 --=20 2.47.3 From nobody Sun Jun 14 15:49:59 2026 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E848A3F65EE; Wed, 10 Jun 2026 10:29:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087385; cv=none; b=injbB3XEQXaiJ+wm0eg6L8h/PxlYh6yvdaEWgDD4equzYjQrIfbcx+8kp/PbJZQYeZ+jEwXswGs/ufBFUjIPOXBrnIfhNHOshyyzwZRpDoWh+SDD2IPm2N0qtQZx0AKiqlSAxzvmkrDdVb9RpYs+jCsKOhnp7qcaWfwpX/BygKI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087385; c=relaxed/simple; bh=u5wopZ8EqnQPL18jqmNttZXnWP/NXU4kSzqnuPxmvx0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=c9nfu2+eR6f3iRKvnuJkUiXKoaUjUam0cAPqJ9i7CF0dEnvPRdfu13q2mXVUf8bwRjEc4dIYrIq1VcpJkNT9vAYeGgDNAiYEVXC44GnHkaEF+lA3CNZcjrcWP80OiWpkwdIlZ7ry+J7QYhD+Fbc2F8osOMKqGj76KSoDBsbIoOE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=iNjVhvkz; arc=none smtp.client-ip=115.124.30.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="iNjVhvkz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087378; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=uH2kHOlqaXzUSXC626PClBD0HY2gB+0EeBu4k4Dn1Fs=; b=iNjVhvkzMNnBuidlIdidDF/LCiO9FkuuHIwvaAe6S2NY5WHf7lJy8dJ0qLtYVOyS9DS6lo1Sj9qOlrqJWl8Q7T5yaRzeQKxIqly8cClSsYC97qp3AzJsHzQAnSM8JY1K845yZiVFHuHrLlrtVqHYwAyCKjBRZytyl8FlvdPM4ls= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037009110;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aHmjA_1781087376; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0X4aHmjA_1781087376 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:37 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 09/11] selftests: mm: move gather_after_split_folio_orders() into vm_util.c file Date: Wed, 10 Jun 2026 18:29:17 +0800 Message-ID: X-Mailer: git-send-email 2.43.5 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move gather_after_split_folio_orders() to vm_util.c as a helper function in preparation for implementing checks for mTHP collapse. 
While we are at it, rename this function to indicate that it is not only used for large folio splits. No functional changes. Signed-off-by: Baolin Wang --- .../selftests/mm/split_huge_page_test.c | 125 +----------------- tools/testing/selftests/mm/vm_util.c | 123 +++++++++++++++++ tools/testing/selftests/mm/vm_util.h | 2 + 3 files changed, 126 insertions(+), 124 deletions(-) diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/test= ing/selftests/mm/split_huge_page_test.c index 4cc70873a674..86a603692826 100644 --- a/tools/testing/selftests/mm/split_huge_page_test.c +++ b/tools/testing/selftests/mm/split_huge_page_test.c @@ -104,129 +104,6 @@ static bool is_backed_by_folio(char *vaddr, int order= , int pagemap_fd, return false; } =20 -static int vaddr_pageflags_get(char *vaddr, int pagemap_fd, int kpageflags= _fd, - uint64_t *flags) -{ - unsigned long pfn; - - pfn =3D pagemap_get_pfn(pagemap_fd, vaddr); - - /* non-present PFN */ - if (pfn =3D=3D -1UL) - return 1; - - if (pageflags_get(pfn, kpageflags_fd, flags)) - return -1; - - return 0; -} - -/* - * gather_after_split_folio_orders - scan through [vaddr_start, len) and r= ecord - * folio orders - * - * @vaddr_start: start vaddr - * @len: range length - * @pagemap_fd: file descriptor to /proc//pagemap - * @kpageflags_fd: file descriptor to /proc/kpageflags - * @orders: output folio order array - * @nr_orders: folio order array size - * - * gather_after_split_folio_orders() scan through [vaddr_start, len) and c= heck - * all folios within the range and record their orders. All order-0 pages = will - * be recorded. Non-present vaddr is skipped. - * - * NOTE: the function is used to check folio orders after a split is perfo= rmed, - * so it assumes [vaddr_start, len) fully maps to after-split folios withi= n that - * range. 
- * - * Return: 0 - no error, -1 - unhandled cases - */ -static int gather_after_split_folio_orders(char *vaddr_start, size_t len, - int pagemap_fd, int kpageflags_fd, int orders[], int nr_orders) -{ - uint64_t page_flags =3D 0; - int cur_order =3D -1; - char *vaddr; - - if (pagemap_fd =3D=3D -1 || kpageflags_fd =3D=3D -1) - return -1; - if (!orders) - return -1; - if (nr_orders <=3D 0) - return -1; - - for (vaddr =3D vaddr_start; vaddr < vaddr_start + len;) { - char *next_folio_vaddr; - int status; - - status =3D vaddr_pageflags_get(vaddr, pagemap_fd, kpageflags_fd, - &page_flags); - if (status < 0) - return -1; - - /* skip non present vaddr */ - if (status =3D=3D 1) { - vaddr +=3D psize(); - continue; - } - - /* all order-0 pages with possible false postive (non folio) */ - if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) { - orders[0]++; - vaddr +=3D psize(); - continue; - } - - /* skip non thp compound pages */ - if (!(page_flags & KPF_THP)) { - vaddr +=3D psize(); - continue; - } - - /* vpn points to part of a THP at this point */ - if (page_flags & KPF_COMPOUND_HEAD) - cur_order =3D 1; - else { - vaddr +=3D psize(); - continue; - } - - next_folio_vaddr =3D vaddr + (1UL << (cur_order + pshift())); - - if (next_folio_vaddr >=3D vaddr_start + len) - break; - - while ((status =3D vaddr_pageflags_get(next_folio_vaddr, - pagemap_fd, kpageflags_fd, - &page_flags)) >=3D 0) { - /* - * non present vaddr, next compound head page, or - * order-0 page - */ - if (status =3D=3D 1 || - (page_flags & KPF_COMPOUND_HEAD) || - !(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) { - if (cur_order < nr_orders) { - orders[cur_order]++; - cur_order =3D -1; - vaddr =3D next_folio_vaddr; - } - break; - } - - cur_order++; - next_folio_vaddr =3D vaddr + (1UL << (cur_order + pshift())); - } - - if (status < 0) - return status; - } - if (cur_order > 0 && cur_order < nr_orders) - orders[cur_order]++; - return 0; -} - static int check_after_split_folio_orders(char 
*vaddr_start, size_t len, int pagemap_fd, int kpageflags_fd, int orders[], int nr_orders) { @@ -240,7 +117,7 @@ static int check_after_split_folio_orders(char *vaddr_s= tart, size_t len, ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders"); =20 memset(vaddr_orders, 0, sizeof(int) * nr_orders); - status =3D gather_after_split_folio_orders(vaddr_start, len, pagemap_fd, + status =3D gather_folio_orders(vaddr_start, len, pagemap_fd, kpageflags_fd, vaddr_orders, nr_orders); if (status) ksft_exit_fail_msg("gather folio info failed\n"); diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests= /mm/vm_util.c index b43adfa92116..b08bf655ab23 100644 --- a/tools/testing/selftests/mm/vm_util.c +++ b/tools/testing/selftests/mm/vm_util.c @@ -194,6 +194,129 @@ unsigned long rss_anon(void) return rss_anon; } =20 +static int vaddr_pageflags_get(char *vaddr, int pagemap_fd, int kpageflags= _fd, + uint64_t *flags) +{ + unsigned long pfn; + + pfn =3D pagemap_get_pfn(pagemap_fd, vaddr); + + /* non-present PFN */ + if (pfn =3D=3D -1UL) + return 1; + + if (pageflags_get(pfn, kpageflags_fd, flags)) + return -1; + + return 0; +} + +/* + * gather_folio_orders - scan through [vaddr_start, len) and record + * folio orders + * + * @vaddr_start: start vaddr + * @len: range length + * @pagemap_fd: file descriptor to /proc//pagemap + * @kpageflags_fd: file descriptor to /proc/kpageflags + * @orders: output folio order array + * @nr_orders: folio order array size + * + * gather_folio_orders() scan through [vaddr_start, len) and check + * all folios within the range and record their orders. All order-0 pages = will + * be recorded. Non-present vaddr is skipped. + * + * NOTE: the function is used to check folio orders after a split is perfo= rmed, + * so it assumes [vaddr_start, len) fully maps to after-split folios withi= n that + * range.
+ * + * Return: 0 - no error, -1 - unhandled cases + */ +int gather_folio_orders(char *vaddr_start, size_t len, + int pagemap_fd, int kpageflags_fd, int orders[], int nr_orders) +{ + uint64_t page_flags =3D 0; + int cur_order =3D -1; + char *vaddr; + + if (pagemap_fd =3D=3D -1 || kpageflags_fd =3D=3D -1) + return -1; + if (!orders) + return -1; + if (nr_orders <=3D 0) + return -1; + + for (vaddr =3D vaddr_start; vaddr < vaddr_start + len;) { + char *next_folio_vaddr; + int status; + + status =3D vaddr_pageflags_get(vaddr, pagemap_fd, kpageflags_fd, + &page_flags); + if (status < 0) + return -1; + + /* skip non present vaddr */ + if (status =3D=3D 1) { + vaddr +=3D psize(); + continue; + } + + /* all order-0 pages with possible false positive (non folio) */ + if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) { + orders[0]++; + vaddr +=3D psize(); + continue; + } + + /* skip non thp compound pages */ + if (!(page_flags & KPF_THP)) { + vaddr +=3D psize(); + continue; + } + + /* vpn points to part of a THP at this point */ + if (page_flags & KPF_COMPOUND_HEAD) + cur_order =3D 1; + else { + vaddr +=3D psize(); + continue; + } + + next_folio_vaddr =3D vaddr + (1UL << (cur_order + pshift())); + + if (next_folio_vaddr >=3D vaddr_start + len) + break; + + while ((status =3D vaddr_pageflags_get(next_folio_vaddr, + pagemap_fd, kpageflags_fd, + &page_flags)) >=3D 0) { + /* + * non present vaddr, next compound head page, or + * order-0 page + */ + if (status =3D=3D 1 || + (page_flags & KPF_COMPOUND_HEAD) || + !(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) { + if (cur_order < nr_orders) { + orders[cur_order]++; + cur_order =3D -1; + vaddr =3D next_folio_vaddr; + } + break; + } + + cur_order++; + next_folio_vaddr =3D vaddr + (1UL << (cur_order + pshift())); + } + + if (status < 0) + return status; + } + if (cur_order > 0 && cur_order < nr_orders) + orders[cur_order]++; + return 0; +} + char *__get_smap_entry(void *addr, const char *pattern, char *buf, size_t =
len) { int ret; diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests= /mm/vm_util.h index 40c6d8c4f1b8..19d8568a8553 100644 --- a/tools/testing/selftests/mm/vm_util.h +++ b/tools/testing/selftests/mm/vm_util.h @@ -95,6 +95,8 @@ bool check_huge_file(void *addr, unsigned long size, int = nr_hpages, uint64_t hpa bool check_huge_shmem(void *addr, unsigned long size, int nr_hpages, uint6= 4_t hpage_size); int64_t allocate_transhuge(void *ptr, int pagemap_fd); int pageflags_get(unsigned long pfn, int kpageflags_fd, uint64_t *flags); +int gather_folio_orders(char *vaddr_start, size_t len, + int pagemap_fd, int kpageflags_fd, int orders[], int nr_orders); =20 int uffd_register(int uffd, void *addr, uint64_t len, bool miss, bool wp, bool minor); --=20 2.47.3 From nobody Sun Jun 14 15:49:59 2026 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14F693F8ED9; Wed, 10 Jun 2026 10:29:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087391; cv=none; b=rffN7LS+HP3ENmkK95fTvjiSTNWwkzI/3/eNo0jgLs4wQ1mz/2Rf2KigFIOMTJ95LmXkVGO8aYnrEpG1QLa9xbdaGOI4Fi/NDrhC31Kb7rXV6vvWU6WLvI0tT04PLc7/OoMcr6vD0BGqH3+Yl4bj1Q+rT96GJ4g+31H2c+Acb3Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087391; c=relaxed/simple; bh=c8WQ7LEz5A/sQu8HpNo5YaboATo2t5p+f8IX3oQHR1o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LQo3E3qlLu4Fnwg9nTE3dl1pMDDFuuBaJUTj0rSrffXc3mlyRKWdD8BRgTr17wmLg3yJnmSOMHoI/59/UZavTLAUDHrUqVD+2XIu3b9rbV6Gbw2YZBvZw4cwBkzQgq5m8v6vFyu9S6JIgxDra7xa1WhMLZHRLRtyieylSY9yIhE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=TDMe8O6G; arc=none smtp.client-ip=115.124.30.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="TDMe8O6G" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087379; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=bh4A8aWubwzVF623+x5XXbydK4sCvs2okQ9F54Wtcac=; b=TDMe8O6G1oe6F8WV1Xaed0cxTwbxdHF/lrD8SBejZXJgmQ4Gn7JsGziPJuLpbEaxR+tNBGfbewL0YGxa29nNt+sSVTlpFuRoAPHEPYAKN40jiPlXxOEHGD93c0NbZ6GGGzqkcjMuT+ryxjSGtFTFaoJUrUJJkBbTOfRCpm3nHUU= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R651e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037009110;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aI8Xd_1781087377; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0X4aI8Xd_1781087377 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:37 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 10/11] selftests: mm: implement the mTHP-sized hugepage check helpers Date: Wed, 10 Jun 2026 18:29:18 +0800 Message-ID: <6f79e2194d5d4c5179fa954ea4968e8a76e931c2.1781083630.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: References: 
Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implement mTHP-sized hugepage checking helpers using gather_folio_orders(). Also rename the existing PMD-sized huge page check function to __check_pmd_huge() for clarity. Signed-off-by: Baolin Wang --- tools/testing/selftests/mm/vm_util.c | 55 ++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests= /mm/vm_util.c index b08bf655ab23..6d464d49f164 100644 --- a/tools/testing/selftests/mm/vm_util.c +++ b/tools/testing/selftests/mm/vm_util.c @@ -15,6 +15,10 @@ #define SMAP_FILE_PATH "/proc/self/smaps" #define STATUS_FILE_PATH "/proc/self/status" #define MAX_LINE_LENGTH 500 +#define PAGEMAP_PATH "/proc/self/pagemap" +#define KPAGEFLAGS_PATH "/proc/kpageflags" +#define GET_ORDER(nr_pages) (31 - __builtin_clz(nr_pages)) +#define NR_ORDERS 20 =20 unsigned int __page_size; unsigned int __page_shift; @@ -352,7 +356,7 @@ char *__get_smap_entry(void *addr, const char *pattern,= char *buf, size_t len) return entry; } =20 -bool __check_huge(void *addr, char *pattern, int nr_hpages, +static bool __check_pmd_huge(void *addr, char *pattern, int nr_hpages, uint64_t hpage_size) { char buffer[MAX_LINE_LENGTH]; @@ -370,19 +374,62 @@ bool __check_huge(void *addr, char *pattern, int nr_h= pages, return thp =3D=3D (nr_hpages * (hpage_size >> 10)); } =20 +static bool check_large_folios(void *addr, unsigned long size, int nr_hpag= es, uint64_t hpage_size) +{ + int pagesize =3D getpagesize(); + int order =3D GET_ORDER(hpage_size / pagesize); + int pagemap_fd, kpageflags_fd; + int orders[NR_ORDERS], status; + bool ret =3D false; + + memset(orders, 0, sizeof(int) * NR_ORDERS); + + pagemap_fd =3D open(PAGEMAP_PATH, O_RDONLY); + if (pagemap_fd =3D=3D -1) + ksft_exit_fail_msg("read pagemap 
fail\n"); + + kpageflags_fd =3D open(KPAGEFLAGS_PATH, O_RDONLY); + if (kpageflags_fd =3D=3D -1) { + close(pagemap_fd); + ksft_exit_fail_msg("read kpageflags fail\n"); + } + + status =3D gather_folio_orders(addr, size, pagemap_fd, + kpageflags_fd, orders, NR_ORDERS); + if (status) + goto out; + + if (orders[order] =3D=3D nr_hpages) + ret =3D true; + +out: + close(pagemap_fd); + close(kpageflags_fd); + return ret; +} + bool check_huge_anon(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size) { - return __check_huge(addr, "AnonHugePages: ", nr_hpages, hpage_size); + if (hpage_size =3D=3D read_pmd_pagesize()) + return __check_pmd_huge(addr, "AnonHugePages: ", nr_hpages, hpage_size); + + return check_large_folios(addr, size, nr_hpages, hpage_size); } =20 bool check_huge_file(void *addr, unsigned long size, int nr_hpages, uint64= _t hpage_size) { - return __check_huge(addr, "FilePmdMapped:", nr_hpages, hpage_size); + if (hpage_size =3D=3D read_pmd_pagesize()) + return __check_pmd_huge(addr, "FilePmdMapped:", nr_hpages, hpage_size); + + return check_large_folios(addr, size, nr_hpages, hpage_size); } =20 bool check_huge_shmem(void *addr, unsigned long size, int nr_hpages, uint6= 4_t hpage_size) { - return __check_huge(addr, "ShmemPmdMapped:", nr_hpages, hpage_size); + if (hpage_size =3D=3D read_pmd_pagesize()) + return __check_pmd_huge(addr, "ShmemPmdMapped:", nr_hpages, hpage_size); + + return check_large_folios(addr, size, nr_hpages, hpage_size); } =20 int64_t allocate_transhuge(void *ptr, int pagemap_fd) --=20 2.47.3 From nobody Sun Jun 14 15:49:59 2026 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 737563F6612; Wed, 10 Jun 2026 10:29:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.130 
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087385; cv=none; b=l6KmdUnLxikFjoMn8Sv+vyc5vMQaUv84KEprFYhhM59FrrzffSmsxXbvL8Wj6pul/tMjHmJ14EH/fKkvz0mK3j/XM4VmaFqmt9o5t4kjCpppsnBlQkWLqth17v9l9X7digIqSuawh34kFHaXoC37n36lRlBpzpobUzMZ/6terZo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781087385; c=relaxed/simple; bh=fB15VNdZCyWcSQmTlCl6G3NqZ0dVluGjmSnUsffWFrE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Sln1b/+CW3KvHC0URKTCFKmhCtI8CiuaoJWfzeh0EwrXuN289uUjsGPJ5e0UsyUKfCNMi8w1fCoQVsX5V7cx7EaoxyGKwWbN2H8dd5qmxK6BHKFhQLmJEq6zIlwI4d82QYI8zn7gbYIfA1Br3h3hEKAANx4jJf5vegfPvqsjsXY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=Kqpng4mo; arc=none smtp.client-ip=115.124.30.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="Kqpng4mo" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1781087380; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=oeDhCWxyn2Hq9Tl6jnQX/f83a0uxpv/FV7uC2BVaW8M=; b=Kqpng4mor28w2Kw5Q21+kB3hzTM8Rsi/RF2XMq5GVDScLQdjQf0EBS9oyYZtdYon9axb/YpB/G9oR3eGmRmuIkmrfc0bX66dVBxAC+QckjeLd3gxR2sV8db0g31tFDpDfwtDYPZB9wvaFF8A2YOqDByLxZ8T0DK8u97IUmfZ5Qk= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032089153;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0X4aU0AL_1781087378; Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com 
fp:SMTPD_---0X4aU0AL_1781087378 cluster:ay36) by smtp.aliyun-inc.com; Wed, 10 Jun 2026 18:29:38 +0800 From: Baolin Wang To: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org, hughd@google.com Cc: willy@infradead.org, ziy@nvidia.com, liam@infradead.org, npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v2 11/11] selftests: mm: add mTHP collapse test cases Date: Wed, 10 Jun 2026 18:29:19 +0800 Message-ID: X-Mailer: git-send-email 2.43.5 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a new command 'mthp_khugepaged' for mTHP collapse, along with the '-c' parameter to specify the collapse order. Also add mTHP collapse test cases: 'collapse_full' and 'collapse_single_mthp' for both anonymous pages and shmem, and 'collapse_empty' for anonymous pages. All khugepaged test cases pass.
Signed-off-by: Baolin Wang --- tools/testing/selftests/mm/khugepaged.c | 135 +++++++++++++++++++--- tools/testing/selftests/mm/run_vmtests.sh | 4 + 2 files changed, 120 insertions(+), 19 deletions(-) diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selfte= sts/mm/khugepaged.c index f69be6be0ecd..8975be5b7b2f 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -26,9 +26,11 @@ =20 #define BASE_ADDR ((void *)(1UL << 30)) static unsigned long hpage_pmd_size; +static int hpage_pmd_order; static unsigned long page_size; static int hpage_pmd_nr; static int anon_order; +static int collapse_order; =20 #define PID_SMAPS "/proc/self/smaps" #define TEST_FILE "collapse_test_file" @@ -69,6 +71,7 @@ struct collapse_context { }; =20 static struct collapse_context *khugepaged_context; +static struct collapse_context *mthp_khugepaged_context; static struct collapse_context *madvise_context; =20 struct file_info { @@ -554,25 +557,25 @@ static void madvise_collapse(const char *msg, char *p= , int nr_hpages, } =20 #define TICK 500000 -static bool wait_for_scan(const char *msg, char *p, int nr_hpages, - struct mem_ops *ops) +static bool wait_for_scan(const char *msg, char *p, unsigned long size, + int nr_hpages, int collap_order, struct mem_ops *ops) { - unsigned long size =3D nr_hpages * hpage_pmd_size; + unsigned long hpage_size =3D page_size << collap_order; int full_scans; int timeout =3D 6; /* 3 seconds */ =20 /* Sanity check */ - if (!ops->check_huge(p, size, 0, hpage_pmd_size)) + if (!ops->check_huge(p, size, 0, hpage_size)) ksft_exit_fail_msg("Unexpected huge page\n"); =20 - madvise(p, nr_hpages * hpage_pmd_size, MADV_HUGEPAGE); + madvise(p, size, MADV_HUGEPAGE); =20 /* Wait until the second full_scan completed */ full_scans =3D thp_read_num("khugepaged/full_scans") + 2; =20 ksft_print_msg("%s...", msg); while (timeout--) { - if (ops->check_huge(p, size, nr_hpages, hpage_pmd_size)) + if (ops->check_huge(p, 
size, nr_hpages, hpage_size)) break; if (thp_read_num("khugepaged/full_scans") >=3D full_scans) break; @@ -595,7 +598,7 @@ static void khugepaged_collapse(const char *msg, char *= p, int nr_hpages, if (!is_tmpfs(ops) && ops =3D=3D &__read_write_file_write_ops) expect =3D false; =20 - if (wait_for_scan(msg, p, nr_hpages, ops)) { + if (wait_for_scan(msg, p, size, nr_hpages, hpage_pmd_order, ops)) { if (expect) fail("Timeout"); else @@ -617,12 +620,65 @@ static void khugepaged_collapse(const char *msg, char= *p, int nr_hpages, fail("Fail"); } =20 +static void mthp_khugepaged_collapse(const char *msg, char *p, int nr_hpag= es, + struct mem_ops *ops, bool expect) +{ + unsigned long hpage_size =3D page_size << collapse_order; + struct thp_settings settings =3D *thp_current_settings(); + /* mTHP collapse only allocates PMD sized memory */ + unsigned long size =3D hpage_pmd_size; + + /* Set mTHP setting for mTHP collapse */ + if (ops =3D=3D &__anon_ops) { + settings.thp_enabled =3D THP_NEVER; + settings.hugepages[collapse_order].enabled =3D THP_ALWAYS; + } else if (ops =3D=3D &__shmem_ops) { + settings.shmem_enabled =3D SHMEM_NEVER; + settings.shmem_hugepages[collapse_order].enabled =3D SHMEM_ALWAYS; + } + + thp_push_settings(&settings); + + if (wait_for_scan(msg, p, size, nr_hpages, collapse_order, ops)) { + if (expect) + fail("Timeout"); + else + success("OK"); + + /* Restore THP settings for mTHP collapse. */ + thp_pop_settings(); + return; + } + + /* + * For file and shmem memory, khugepaged only retracts pte entries after + * putting the new hugepage in the page cache. The hugepage must be + * subsequently refaulted to install the pmd mapping for the mm. + */ + if (ops !=3D &__anon_ops) + ops->fault(p, 0, nr_hpages * hpage_size); + + if (ops->check_huge(p, size, expect ? nr_hpages : 0, hpage_size)) + success("OK"); + else + fail("Fail"); + + /* Restore THP settings for mTHP collapse.
*/ + thp_pop_settings(); +} + static struct collapse_context __khugepaged_context =3D { .collapse =3D &khugepaged_collapse, .enforce_pte_scan_limits =3D true, .name =3D "khugepaged", }; =20 +static struct collapse_context __mthp_khugepaged_context =3D { + .collapse =3D &mthp_khugepaged_collapse, + .enforce_pte_scan_limits =3D true, + .name =3D "mthp_khugepaged", +}; + static struct collapse_context __madvise_context =3D { .collapse =3D &madvise_collapse, .enforce_pte_scan_limits =3D false, @@ -661,10 +717,17 @@ static void alloc_at_fault(void) static void collapse_full(struct collapse_context *c, struct mem_ops *ops) { void *p; - int nr_hpages =3D 4; + int nr_pmds =3D 4, nr_hpages =3D 4; unsigned long size =3D nr_hpages * hpage_pmd_size; =20 - p =3D ops->setup_area(nr_hpages); + /* Only try 1 PMD sized range for mTHP collapse. */ + if (c =3D=3D &__mthp_khugepaged_context) { + nr_pmds =3D 1; + nr_hpages =3D 1 << (hpage_pmd_order - collapse_order); + size =3D hpage_pmd_size; + } + + p =3D ops->setup_area(nr_pmds); ops->fault(p, 0, size); c->collapse("Collapse multiple fully populated PTE table", p, nr_hpages, ops, true); @@ -676,10 +739,31 @@ static void collapse_full(struct collapse_context *c,= struct mem_ops *ops) =20 static void collapse_empty(struct collapse_context *c, struct mem_ops *ops) { + int nr_hpages =3D 1; + void *p; + + if (c =3D=3D &__mthp_khugepaged_context) + nr_hpages =3D 1 << (hpage_pmd_order - collapse_order); + + p =3D ops->setup_area(1); + c->collapse("Do not collapse empty PTE table", p, nr_hpages, ops, false); + ops->cleanup_area(p, hpage_pmd_size); + ksft_test_result_report(exit_status, "%s\n", __func__); +} + +static void collapse_single_mthp(struct collapse_context *c, struct mem_op= s *ops) +{ + unsigned long hpage_size =3D page_size << collapse_order; void *p; =20 p =3D ops->setup_area(1); - c->collapse("Do not collapse empty PTE table", p, 1, ops, false); + /* + * Only fault collapse_order sized ranges, and only check 1 + * 
collapse_order sized huge page. + */ + ops->fault(p, 0, hpage_size); + c->collapse("Collapse PTE table with half PTE entries present", + p, 1, ops, true); ops->cleanup_area(p, hpage_pmd_size); ksft_test_result_report(exit_status, "%s\n", __func__); } @@ -1081,8 +1165,8 @@ static void madvise_retracted_page_tables(struct coll= apse_context *c, ops->fault(p, 0, size); =20 /* Let khugepaged collapse and leave pmd cleared */ - if (wait_for_scan("Collapse and leave PMD cleared", p, nr_hpages, - ops)) { + if (wait_for_scan("Collapse and leave PMD cleared", p, size, nr_hpages, + hpage_pmd_order, ops)) { fail("Timeout"); return; } @@ -1098,7 +1182,7 @@ static void usage(void) { fprintf(stderr, "\nUsage: ./khugepaged [OPTIONS] [dir]\n\n"); fprintf(stderr, "\t\t: :\n"); - fprintf(stderr, "\t\t: [all|khugepaged|madvise]\n"); + fprintf(stderr, "\t\t: [all|khugepaged|mthp_khugepaged|madvise]\= n"); fprintf(stderr, "\t\t: [all|anon|file|shmem]\n"); fprintf(stderr, "\n\t\"file,all\" mem_type requires [dir] argument\n"); fprintf(stderr, "\n\t\"file,all\" mem_type requires a file system\n"); @@ -1109,6 +1193,7 @@ static void usage(void) fprintf(stderr, "\t\t-h: This help message.\n"); fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n"); fprintf(stderr, "\t\t Defaults to 0. 
Use this size for anon or shmem a= llocations.\n"); + fprintf(stderr, "\t\t-c: collapse order for mTHP collapse, expressed as p= age order.\n"); exit(1); } =20 @@ -1118,11 +1203,14 @@ static void parse_test_type(int argc, char **argv) char *buf; const char *token; =20 - while ((opt =3D getopt(argc, argv, "s:h")) !=3D -1) { + while ((opt =3D getopt(argc, argv, "s:c:h")) !=3D -1) { switch (opt) { case 's': anon_order =3D atoi(optarg); break; + case 'c': + collapse_order =3D atoi(optarg); + break; case 'h': default: usage(); @@ -1148,6 +1236,10 @@ static void parse_test_type(int argc, char **argv) madvise_context =3D &__madvise_context; } else if (!strcmp(token, "khugepaged")) { khugepaged_context =3D &__khugepaged_context; + } else if (!strcmp(token, "mthp_khugepaged")) { + mthp_khugepaged_context =3D &__mthp_khugepaged_context; + if (collapse_order =3D=3D 0 || collapse_order >=3D hpage_pmd_order) + usage(); } else if (!strcmp(token, "madvise")) { madvise_context =3D &__madvise_context; } else { @@ -1213,7 +1305,6 @@ static int nr_test_cases; =20 int main(int argc, char **argv) { - int hpage_pmd_order; struct thp_settings default_settings =3D { .thp_enabled =3D THP_MADVISE, .thp_defrag =3D THP_DEFRAG_ALWAYS, @@ -1239,10 +1330,6 @@ int main(int argc, char **argv) if (!thp_is_enabled()) ksft_exit_skip("Transparent Hugepages not available\n"); =20 - parse_test_type(argc, argv); - - setbuf(stdout, NULL); - page_size =3D getpagesize(); hpage_pmd_size =3D read_pmd_pagesize(); if (!hpage_pmd_size) @@ -1250,6 +1337,10 @@ int main(int argc, char **argv) hpage_pmd_nr =3D hpage_pmd_size / page_size; hpage_pmd_order =3D __builtin_ctz(hpage_pmd_nr); =20 + parse_test_type(argc, argv); + + setbuf(stdout, NULL); + default_settings.khugepaged.max_ptes_none =3D hpage_pmd_nr - 1; default_settings.khugepaged.max_ptes_swap =3D hpage_pmd_nr / 8; default_settings.khugepaged.max_ptes_shared =3D hpage_pmd_nr / 2; @@ -1267,6 +1358,8 @@ int main(int argc, char **argv) TEST(collapse_full, 
khugepaged_context, read_write_file_read_ops); TEST(collapse_full, khugepaged_context, read_write_file_write_ops); TEST(collapse_full, khugepaged_context, shmem_ops); + TEST(collapse_full, mthp_khugepaged_context, anon_ops); + TEST(collapse_full, mthp_khugepaged_context, shmem_ops); TEST(collapse_full, madvise_context, anon_ops); TEST(collapse_full, madvise_context, read_only_file_ops); TEST(collapse_full, madvise_context, read_write_file_read_ops); @@ -1274,8 +1367,12 @@ int main(int argc, char **argv) TEST(collapse_full, madvise_context, shmem_ops); =20 TEST(collapse_empty, khugepaged_context, anon_ops); + TEST(collapse_empty, mthp_khugepaged_context, anon_ops); TEST(collapse_empty, madvise_context, anon_ops); =20 + TEST(collapse_single_mthp, mthp_khugepaged_context, anon_ops); + TEST(collapse_single_mthp, mthp_khugepaged_context, shmem_ops); + TEST(collapse_single_pte_entry, khugepaged_context, anon_ops); TEST(collapse_single_pte_entry, khugepaged_context, read_only_file_ops); TEST(collapse_single_pte_entry, khugepaged_context, read_write_file_read_= ops); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/self= tests/mm/run_vmtests.sh index 8c296dedf047..c0f4f3e5f1f1 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -411,6 +411,10 @@ CATEGORY=3D"thp" run_test ./khugepaged all:shmem =20 CATEGORY=3D"thp" run_test ./khugepaged -s 4 all:shmem =20 +CATEGORY=3D"thp" run_test ./khugepaged -c 4 mthp_khugepaged:anon + +CATEGORY=3D"thp" run_test ./khugepaged -c 4 mthp_khugepaged:shmem + # Try to create XFS if not provided if [ -z "${SPLIT_HUGE_PAGE_TEST_XFS_PATH}" ]; then if test_selected "thp"; then --=20 2.47.3
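For readers following the order arithmetic across patches 10 and 11, the two computations can be sketched in isolation. This is a minimal standalone sketch, not code from the series: order_of() and nr_mthp_per_pmd() are hypothetical names. The first mirrors the GET_ORDER() macro added to vm_util.c, the second mirrors the "1 << (hpage_pmd_order - collapse_order)" expression used in collapse_full() and collapse_empty(). It assumes a 32-bit unsigned int and a GCC/Clang-style __builtin_clz().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring GET_ORDER() from vm_util.c:
 * floor(log2(nr_pages)) for a 32-bit value.
 * nr_pages must be non-zero, because __builtin_clz(0) is undefined.
 */
static int order_of(unsigned int nr_pages)
{
	return 31 - __builtin_clz(nr_pages);
}

/*
 * Hypothetical helper: number of collapse_order-sized folios expected in
 * one fully populated PMD-sized range. Assumes collapse_order is smaller
 * than pmd_order, which parse_test_type() enforces for mthp_khugepaged.
 */
static int nr_mthp_per_pmd(int pmd_order, int collapse_order)
{
	return 1 << (pmd_order - collapse_order);
}
```

As an illustration, assuming 4 KiB base pages and a 2 MiB PMD (order 9), running with "-c 4" selects 64 KiB folios, so nr_mthp_per_pmd(9, 4) gives the 32 folios that collapse_full() would expect check_huge() to find in a single PMD range.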