From: Nico Pache
To: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, rostedt@goodmis.org,
	mhiramat@kernel.org, mathieu.desnoyers@efficios.com, david@redhat.com,
	baohua@kernel.org, baolin.wang@linux.alibaba.com, ryan.roberts@arm.com,
	willy@infradead.org, peterx@redhat.com, ziy@nvidia.com,
	wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com, raquini@redhat.com,
	dev.jain@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
	tiwai@suse.de, will@kernel.org, dave.hansen@linux.intel.com,
	jack@suse.cz, cl@gentwo.org, jglisse@google.com, surenb@google.com,
	zokeefe@google.com, hannes@cmpxchg.org, rientjes@google.com,
	mhocko@suse.com, rdunlap@infradead.org, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com
Subject: [PATCH v5 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support
Date: Mon, 28 Apr 2025 12:12:12 -0600
Message-ID: <20250428181218.85925-7-npache@redhat.com>
In-Reply-To: <20250428181218.85925-1-npache@redhat.com>
References: <20250428181218.85925-1-npache@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

khugepaged scans anonymous PMD ranges for potential collapse to a
hugepage. To add mTHP support, we use this scan to instead record which
chunks of the PMD range are utilized.

khugepaged_scan_bitmap() uses an explicit stack of scan states to walk,
in a recursive binary-split fashion, a bitmap that represents chunks of
utilized regions, so that we can determine which mTHP size fits best.
The following patch sets this bitmap while scanning the anon PMD.

max_ptes_none is used as a scale to determine how "full" an order must
be before it is considered for collapse.

When an order is set to "always", always collapse to that order
greedily, without considering the number of bits set.
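The stack-based walk described above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel implementation: the names `scan_state`, `region_weight` and `scan_bitmap_sketch` are hypothetical, the bitmap is a plain `unsigned int` covering 8 chunks, and the "more than half the chunks set" rule stands in for the max_ptes_none scaling and the "always" check used by the patch.

```c
/* Illustrative sizes, assumed for this sketch only. */
#define MIN_ORDER	2	/* smallest order considered */
#define TOP_ORDER	5	/* small stand-in for HPAGE_PMD_ORDER */
#define LEVELS		(TOP_ORDER - MIN_ORDER)

struct scan_state {
	int order;	/* order relative to MIN_ORDER */
	int offset;	/* index of the region's first chunk */
};

/* Count the set bits in chunks [offset, offset + nr) of the bitmap. */
static int region_weight(unsigned int bitmap, int offset, int nr)
{
	int i, weight = 0;

	for (i = 0; i < nr; i++)
		weight += (bitmap >> (offset + i)) & 1;
	return weight;
}

/*
 * Walk the bitmap from the largest region down, using an explicit stack
 * instead of recursion. Returns the number of pages "collapsed".
 * Hypothetical policy: collapse when more than half the chunks are set.
 */
int scan_bitmap_sketch(unsigned int bitmap)
{
	struct scan_state stack[LEVELS + 2];
	struct scan_state state;
	int top = -1, collapsed = 0;
	int nr, half;

	stack[++top] = (struct scan_state){ LEVELS, 0 };

	while (top >= 0) {
		state = stack[top--];
		nr = 1 << state.order;

		if (region_weight(bitmap, state.offset, nr) > nr / 2) {
			/* Region is full enough: take it whole, do not split. */
			collapsed += 1 << (state.order + MIN_ORDER);
			continue;
		}

		if (state.order > 0) {
			/* Push the upper half first so the lower half pops first. */
			half = nr / 2;
			stack[++top] = (struct scan_state){ state.order - 1,
							    state.offset + half };
			stack[++top] = (struct scan_state){ state.order - 1,
							    state.offset };
		}
	}
	return collapsed;
}
```

With the bitmap 0x0F (the lower four of eight chunks set), the root region is not more than half full, so it is split; the lower half is then taken whole and scan_bitmap_sketch() returns 16 pages. The stack never holds more than LEVELS + 1 entries, because each pop pushes at most two entries one level down.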
Signed-off-by: Nico Pache
---
 include/linux/khugepaged.h |  4 ++
 mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
 2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 1f46046080f5..18fe6eb5051d 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -1,6 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _LINUX_KHUGEPAGED_H
 #define _LINUX_KHUGEPAGED_H
+#define KHUGEPAGED_MIN_MTHP_ORDER 2
+#define KHUGEPAGED_MIN_MTHP_NR (1<mthp_bitmap_stack[++top] = (struct scan_bit_state)
+		{ HPAGE_PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER, 0 };
+
+	while (top >= 0) {
+		state = cc->mthp_bitmap_stack[top--];
+		order = state.order + KHUGEPAGED_MIN_MTHP_ORDER;
+		offset = state.offset;
+		num_chunks = 1 << (state.order);
+		// Skip mTHP orders that are not enabled
+		if (!test_bit(order, &enabled_orders))
+			goto next;
+
+		// copy the relevant section to a new bitmap
+		bitmap_shift_right(cc->mthp_bitmap_temp, cc->mthp_bitmap, offset,
+				  MTHP_BITMAP_SIZE);
+
+		bits_set = bitmap_weight(cc->mthp_bitmap_temp, num_chunks);
+		threshold_bits = (HPAGE_PMD_NR - khugepaged_max_ptes_none - 1)
+				>> (HPAGE_PMD_ORDER - state.order);
+
+		//Check if the region is "almost full" based on the threshold
+		if (bits_set > threshold_bits || is_pmd_only
+			|| test_bit(order, &huge_anon_orders_always)) {
+			ret = collapse_huge_page(mm, address, referenced, unmapped, cc,
+					mmap_locked, order, offset * KHUGEPAGED_MIN_MTHP_NR);
+			if (ret == SCAN_SUCCEED) {
+				collapsed += (1 << order);
+				continue;
+			}
+		}
+
+next:
+		if (state.order > 0) {
+			next_order = state.order - 1;
+			mid_offset = offset + (num_chunks / 2);
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, mid_offset };
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, offset };
+		}
+	}
+	return collapsed;
+}
+
 static int khugepaged_scan_pmd(struct mm_struct *mm,
 				   struct vm_area_struct *vma,
 				   unsigned long address, bool *mmap_locked,
@@ -1445,9 +1523,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	pte_unmap_unlock(pte, ptl);
 	if (result == SCAN_SUCCEED) {
 		result = collapse_huge_page(mm, address, referenced,
-					    unmapped, cc);
-		/* collapse_huge_page will return with the mmap_lock released */
-		*mmap_locked = false;
+					    unmapped, cc, mmap_locked, HPAGE_PMD_ORDER, 0);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, &folio->page, writable, referenced,
-- 
2.48.1
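
For anyone checking the threshold computation in khugepaged_scan_bitmap(), the standalone sketch below reproduces only the arithmetic of the threshold_bits expression from the hunk above. The values of HPAGE_PMD_ORDER and HPAGE_PMD_NR are assumptions for a 4 KiB-page configuration with 2 MiB PMDs, and the function names `threshold_bits_for` and `is_full_enough` are hypothetical. The is_pmd_only and "always" shortcuts in the patch are deliberately left out.

```c
/* Assumed values for a 4 KiB-page, 2 MiB-PMD configuration. */
#define HPAGE_PMD_ORDER	9
#define HPAGE_PMD_NR	(1 << HPAGE_PMD_ORDER)

/*
 * Same expression as in the patch: state_order is the order relative to
 * KHUGEPAGED_MIN_MTHP_ORDER, so the result is a count of chunks.
 */
int threshold_bits_for(int max_ptes_none, int state_order)
{
	return (HPAGE_PMD_NR - max_ptes_none - 1)
		>> (HPAGE_PMD_ORDER - state_order);
}

/* A region qualifies when strictly more chunks are set than the threshold. */
int is_full_enough(int bits_set, int max_ptes_none, int state_order)
{
	return bits_set > threshold_bits_for(max_ptes_none, state_order);
}
```

Under these assumptions, max_ptes_none = 255 gives a threshold of 64 chunks for the full PMD range (state order 7, 128 chunks), 4 chunks for state order 3 (8 chunks), and 0 for a single chunk. With max_ptes_none = 511 the threshold is 0 at every order, so a single set bit is enough.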