From: Vernon Yang
To: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, npache@redhat.com, baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vernon Yang
Subject: [PATCH 1/4] mm: khugepaged: add trace_mm_khugepaged_scan event
Date: Mon, 15 Dec 2025 17:04:16 +0800
Message-ID: <20251215090419.174418-2-yanglincheng@kylinos.cn>
In-Reply-To: <20251215090419.174418-1-yanglincheng@kylinos.cn>
References: <20251215090419.174418-1-yanglincheng@kylinos.cn>

Add the mm_khugepaged_scan trace event so that the total time of a full
khugepaged scan and the total number of pages scanned can be tracked.

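As a rough illustration of how the event could be consumed, the sketch below folds event payloads into per-pass totals in userspace. It assumes only the payload layout given by the TP_printk() format string in this patch; the names scan_totals and scan_totals_feed are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical accumulator for one full khugepaged pass. */
struct scan_totals {
	long pages;		/* sum of the "progress" fields seen so far */
	unsigned int events;	/* number of events folded in */
};

/*
 * Fold one event payload into the totals. The payload layout follows the
 * TP_printk() format string of the patch: "mm=%p, progress=%d, full=%d".
 * Returns 1 if the event closes a full scan, 0 if it does not, and -1 if
 * the payload does not match the expected layout.
 */
int scan_totals_feed(struct scan_totals *t, const char *payload)
{
	int progress, full;

	assert(t && payload);

	/* Skip the mm pointer text; only the two integers are needed. */
	if (sscanf(payload, "mm=%*[^,], progress=%d, full=%d",
		   &progress, &full) != 2)
		return -1;

	t->pages += progress;
	t->events++;
	return full ? 1 : 0;
}
```

A caller would typically reset the totals after a return value of 1 and take the elapsed time of the pass from the timestamps that the tracing infrastructure attaches to each record.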
Signed-off-by: Vernon Yang
---
 include/trace/events/huge_memory.h | 24 ++++++++++++++++++++++++
 mm/khugepaged.c                    |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dd94d14a2427..b2824c2f8238 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -237,5 +237,29 @@ TRACE_EVENT(mm_khugepaged_collapse_file,
 		__print_symbolic(__entry->result, SCAN_STATUS))
 );
 
+TRACE_EVENT(mm_khugepaged_scan,
+
+	TP_PROTO(struct mm_struct *mm, int progress, bool full),
+
+	TP_ARGS(mm, progress, full),
+
+	TP_STRUCT__entry(
+		__field(struct mm_struct *, mm)
+		__field(int, progress)
+		__field(bool, full)
+	),
+
+	TP_fast_assign(
+		__entry->mm = mm;
+		__entry->progress = progress;
+		__entry->full = full;
+	),
+
+	TP_printk("mm=%p, progress=%d, full=%d",
+		__entry->mm,
+		__entry->progress,
+		__entry->full)
+);
+
 #endif /* __HUGE_MEMORY_H */
 #include
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index abe54f0043c7..0598a19a98cc 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2516,6 +2516,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		collect_mm_slot(slot);
 	}
 
+	trace_mm_khugepaged_scan(mm, progress, khugepaged_scan.mm_slot == NULL);
+
 	return progress;
 }
 
-- 
2.51.0

From: Vernon Yang
To: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, npache@redhat.com, baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vernon Yang
Subject: [PATCH 2/4] mm: khugepaged: remove mm when all memory has been collapsed
Date: Mon, 15 Dec 2025 17:04:17 +0800
Message-ID: <20251215090419.174418-3-yanglincheng@kylinos.cn>
In-Reply-To: <20251215090419.174418-1-yanglincheng@kylinos.cn>
References: <20251215090419.174418-1-yanglincheng@kylinos.cn>

The following data was traced by bpftrace on a desktop system. After the
system had been left idle for 10 minutes after booting, many
SCAN_PMD_MAPPED and SCAN_PMD_NONE results were observed during a full
scan by khugepaged.

@scan_pmd_status[1]: 1           ## SCAN_SUCCEED
@scan_pmd_status[4]: 158         ## SCAN_PMD_MAPPED
@scan_pmd_status[3]: 174         ## SCAN_PMD_NONE
total progress size: 701 MB
Total time         : 440 seconds ## include khugepaged_scan_sleep_millisecs

The khugepaged_scan list holds every task that supports collapsing into
hugepages. As long as a task is not destroyed, khugepaged does not remove
it from the khugepaged_scan list. As a result, a task that has already
collapsed all of its memory regions into hugepages is still scanned by
khugepaged, which wastes CPU time for no benefit. Because of
khugepaged_scan_sleep_millisecs (default 10s), scanning a large number of
such tasks also takes a long time, so the tasks that can really benefit
are scanned later.

After applying this patch, when all memory is either SCAN_PMD_MAPPED or
SCAN_PMD_NONE, the mm is automatically removed from khugepaged's scan
list. If a page fault or MADV_HUGEPAGE occurs again, the mm is added back
to khugepaged.

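The removal rule can be pictured with a small userspace model. This is only a sketch of the decision as described in this patch, not kernel code: the enum values are illustrative, the names model_result, model_result_sets_maybe_collapse and model_keep_mm are hypothetical, and an exiting mm (which is always freed) is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of the scan results named in the patch. The enumerator values are
 * illustrative and do not match the kernel's enum scan_result.
 */
enum model_result {
	MODEL_SUCCEED,
	MODEL_PMD_NULL,
	MODEL_PMD_NONE,
	MODEL_PMD_MAPPED,
	MODEL_PTE_MAPPED_HUGEPAGE,
	MODEL_OTHER,		/* stands in for every other result */
};

/* Mirrors the switch statement added to khugepaged_scan_mm_slot(). */
bool model_result_sets_maybe_collapse(enum model_result r)
{
	switch (r) {
	case MODEL_PMD_NULL:
	case MODEL_PMD_NONE:
	case MODEL_PMD_MAPPED:
	case MODEL_PTE_MAPPED_HUGEPAGE:
		return false;	/* nothing left to collapse in this range */
	default:
		return true;	/* includes MODEL_SUCCEED, as in the patch */
	}
}

/*
 * Returns true if the mm stays on the scan list after a full pass over it.
 * thp_disabled models MMF_DISABLE_THP_COMPLETELY, which forces the slot to
 * be kept.
 */
bool model_keep_mm(const enum model_result *results, size_t n,
		   bool thp_disabled)
{
	bool maybe_collapse = false;
	size_t i;

	for (i = 0; i < n; i++)
		if (model_result_sets_maybe_collapse(results[i]))
			maybe_collapse = true;

	if (thp_disabled)
		maybe_collapse = true;

	return maybe_collapse;
}
```

Note that a single SCAN_SUCCEED in a pass keeps the mm on the list for one more pass, so in this model the mm from the trace above would be dropped only after a later pass that reports nothing but the settled results.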
Signed-off-by: Vernon Yang
---
 mm/khugepaged.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0598a19a98cc..1ec1af5be3c8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -115,6 +115,7 @@ struct khugepaged_scan {
 	struct list_head mm_head;
 	struct mm_slot *mm_slot;
 	unsigned long address;
+	bool maybe_collapse;
 };
 
 static struct khugepaged_scan khugepaged_scan = {
@@ -1420,22 +1421,19 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 	return result;
 }
 
-static void collect_mm_slot(struct mm_slot *slot)
+static void collect_mm_slot(struct mm_slot *slot, bool maybe_collapse)
 {
 	struct mm_struct *mm = slot->mm;
 
 	lockdep_assert_held(&khugepaged_mm_lock);
 
-	if (hpage_collapse_test_exit(mm)) {
+	if (hpage_collapse_test_exit(mm) || !maybe_collapse) {
 		/* free mm_slot */
 		hash_del(&slot->hash);
 		list_del(&slot->mm_node);
 
-		/*
-		 * Not strictly needed because the mm exited already.
-		 *
-		 * mm_flags_clear(MMF_VM_HUGEPAGE, mm);
-		 */
+		if (!maybe_collapse)
+			mm_flags_clear(MMF_VM_HUGEPAGE, mm);
 
 		/* khugepaged_mm_lock actually not necessary for the below */
 		mm_slot_free(mm_slot_cache, slot);
@@ -2397,6 +2395,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				     struct mm_slot, mm_node);
 		khugepaged_scan.address = 0;
 		khugepaged_scan.mm_slot = slot;
+		khugepaged_scan.maybe_collapse = false;
 	}
 	spin_unlock(&khugepaged_mm_lock);
 
@@ -2470,8 +2469,18 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 						khugepaged_scan.address, &mmap_locked, cc);
 			}
 
-			if (*result == SCAN_SUCCEED)
+			switch (*result) {
+			case SCAN_PMD_NULL:
+			case SCAN_PMD_NONE:
+			case SCAN_PMD_MAPPED:
+			case SCAN_PTE_MAPPED_HUGEPAGE:
+				break;
+			case SCAN_SUCCEED:
 				++khugepaged_pages_collapsed;
+				fallthrough;
+			default:
+				khugepaged_scan.maybe_collapse = true;
+			}
 
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;
@@ -2500,6 +2509,11 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 	 * if we scanned all vmas of this mm.
 	 */
 	if (hpage_collapse_test_exit(mm) || !vma) {
+		bool maybe_collapse = khugepaged_scan.maybe_collapse;
+
+		if (mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm))
+			maybe_collapse = true;
+
 		/*
 		 * Make sure that if mm_users is reaching zero while
 		 * khugepaged runs here, khugepaged_exit will find
@@ -2508,12 +2522,13 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		if (!list_is_last(&slot->mm_node, &khugepaged_scan.mm_head)) {
 			khugepaged_scan.mm_slot = list_next_entry(slot, mm_node);
 			khugepaged_scan.address = 0;
+			khugepaged_scan.maybe_collapse = false;
 		} else {
 			khugepaged_scan.mm_slot = NULL;
 			khugepaged_full_scans++;
 		}
 
-		collect_mm_slot(slot);
+		collect_mm_slot(slot, maybe_collapse);
 	}
 
 	trace_mm_khugepaged_scan(mm, progress, khugepaged_scan.mm_slot == NULL);
@@ -2616,7 +2631,7 @@ static int khugepaged(void *none)
 	slot = khugepaged_scan.mm_slot;
 	khugepaged_scan.mm_slot = NULL;
 	if (slot)
-		collect_mm_slot(slot);
+		collect_mm_slot(slot, true);
 	spin_unlock(&khugepaged_mm_lock);
 	return 0;
 }
-- 
2.51.0

From: Vernon Yang
To: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, npache@redhat.com, baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vernon Yang
Subject: [PATCH 3/4] mm: khugepaged: move mm to list tail when MADV_COLD/MADV_FREE
Date: Mon, 15 Dec 2025 17:04:18 +0800
Message-ID: <20251215090419.174418-4-yanglincheng@kylinos.cn>
In-Reply-To: <20251215090419.174418-1-yanglincheng@kylinos.cn>
References: <20251215090419.174418-1-yanglincheng@kylinos.cn>

For example, create three tasks: hot1 -> cold -> hot2. After all three
tasks are created, each allocates 128 MB of memory. The hot1 and hot2
tasks continuously access their 128 MB of memory, while the cold task
accesses its memory only briefly and then calls madvise(MADV_COLD).
However, khugepaged still scans the cold task first and only scans the
hot2 task after it has finished scanning the cold task.

So if the user has explicitly informed us via MADV_COLD/MADV_FREE that
this memory is cold or will be freed, it is appropriate for khugepaged to
scan it only at the latest possible moment, thereby avoiding unnecessary
scan and collapse operations and reducing wasted CPU time.

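The reordering can be sketched in userspace with a minimal circular list. This is an illustrative model only: list_node, task_slot, slot_move_tail and slot_at are hypothetical names, and the "skip the slot currently being scanned" check is taken from the khugepaged_move_tail() helper added by this patch.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

void list_append(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

struct task_slot {
	struct list_node node;	/* first member, so a node pointer can be cast back */
	const char *name;
};

/* Move slot to the tail unless it is the slot currently being scanned. */
void slot_move_tail(struct list_node *head, struct task_slot *slot,
		    const struct task_slot *scanning)
{
	if (slot == scanning)
		return;
	list_unlink(&slot->node);
	list_append(&slot->node, head);
}

/* Return the slot at position idx in scan order, or NULL if out of range. */
struct task_slot *slot_at(struct list_node *head, size_t idx)
{
	struct list_node *n = head->next;

	for (; n != head && idx > 0; idx--)
		n = n->next;
	return n == head ? NULL : (struct task_slot *)n;
}
```

Starting from the order hot1 -> cold -> hot2, moving the cold slot to the tail gives hot1 -> hot2 -> cold, so hot2 is reached before cold in the next pass.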
Here are the performance test results
(for Throughput, higher is better; for the other metrics, lower is better):

Testing on x86_64 machine:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total accesses time |  3.14 sec     |  2.92 sec     |  -7.01% |
| cycles per access   |  4.91         |  2.07         | -57.84% |
| Throughput          |  104.38 M/sec |  112.12 M/sec |  +7.42% |
| dTLB-load-misses    |  288966432    |  1292908      | -99.55% |

Testing on qemu-system-x86_64 -enable-kvm:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total accesses time |  3.35 sec     |  2.96 sec     | -11.64% |
| cycles per access   |  7.23         |  2.12         | -70.68% |
| Throughput          |  97.88 M/sec  |  110.76 M/sec | +13.16% |
| dTLB-load-misses    |  237406497    |  3189194      | -98.66% |

Signed-off-by: Vernon Yang
---
 include/linux/khugepaged.h |  1 +
 mm/khugepaged.c            | 14 ++++++++++++++
 mm/madvise.c               |  3 +++
 3 files changed, 18 insertions(+)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index eb1946a70cff..726e99de84e9 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -15,6 +15,7 @@ extern void __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter_vma(struct vm_area_struct *vma,
 				 vm_flags_t vm_flags);
+void khugepaged_move_tail(struct mm_struct *mm);
 extern void khugepaged_min_free_kbytes_update(void);
 extern bool current_is_khugepaged(void);
 extern int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1ec1af5be3c8..91836dda2015 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -468,6 +468,20 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 	}
 }
 
+void khugepaged_move_tail(struct mm_struct *mm)
+{
+	struct mm_slot *slot;
+
+	if (!mm_flags_test(MMF_VM_HUGEPAGE, mm))
+		return;
+
+	spin_lock(&khugepaged_mm_lock);
+	slot = mm_slot_lookup(mm_slots_hash, mm);
+	if (slot && khugepaged_scan.mm_slot != slot)
+		list_move_tail(&slot->mm_node, &khugepaged_scan.mm_head);
+	spin_unlock(&khugepaged_mm_lock);
+}
+
 void __khugepaged_exit(struct mm_struct *mm)
 {
 	struct mm_slot *slot;
diff --git a/mm/madvise.c b/mm/madvise.c
index fb1c86e630b6..3f9ca7af2c82 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -608,6 +608,8 @@ static long madvise_cold(struct madvise_behavior *madv_behavior)
 	madvise_cold_page_range(&tlb, madv_behavior);
 	tlb_finish_mmu(&tlb);
 
+	khugepaged_move_tail(vma->vm_mm);
+
 	return 0;
 }
 
@@ -835,6 +837,7 @@ static int madvise_free_single_vma(struct madvise_behavior *madv_behavior)
 			&walk_ops, tlb);
 	tlb_end_vma(tlb, vma);
 	mmu_notifier_invalidate_range_end(&range);
+	khugepaged_move_tail(mm);
 	return 0;
 }
 
-- 
2.51.0

From: Vernon Yang
To: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, npache@redhat.com, baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vernon Yang
Subject: [PATCH 4/4] mm: khugepaged: set to next mm direct when mm has MMF_DISABLE_THP_COMPLETELY
Date: Mon, 15 Dec 2025 17:04:19 +0800
Message-ID: <20251215090419.174418-5-yanglincheng@kylinos.cn>
In-Reply-To: <20251215090419.174418-1-yanglincheng@kylinos.cn>
References: <20251215090419.174418-1-yanglincheng@kylinos.cn>

When an mm with the MMF_DISABLE_THP_COMPLETELY flag is detected during
scanning, set khugepaged_scan.mm_slot directly to the next mm_slot to
avoid redundant operations.

Signed-off-by: Vernon Yang
---
 mm/khugepaged.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 91836dda2015..a8723eea12f1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2432,6 +2432,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 
 		cond_resched();
 		if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
+			vma = NULL;
 			progress++;
 			break;
 		}
@@ -2452,8 +2453,10 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			bool mmap_locked = true;
 
 			cond_resched();
-			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
+			if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
+				vma = NULL;
 				goto breakouterloop;
+			}
 
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
@@ -2470,8 +2473,10 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				fput(file);
 				if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
 					mmap_read_lock(mm);
-					if (hpage_collapse_test_exit_or_disable(mm))
+					if (hpage_collapse_test_exit_or_disable(mm)) {
+						vma = NULL;
 						goto breakouterloop;
+					}
 					*result = collapse_pte_mapped_thp(mm,
 						khugepaged_scan.address, false);
 					if (*result == SCAN_PMD_MAPPED)
-- 
2.51.0