From: Vernon Yang
To: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com
Cc: ziy@nvidia.com, dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev,
	richard.weiyang@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Vernon Yang
Subject: [PATCH v2 3/4] mm: khugepaged: set VM_NOHUGEPAGE flag when MADV_COLD/MADV_FREE
Date: Mon, 29 Dec 2025 13:51:50 +0800
Message-ID: <20251229055151.54887-4-yanglincheng@kylinos.cn>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251229055151.54887-1-yanglincheng@kylinos.cn>
References: <20251229055151.54887-1-yanglincheng@kylinos.cn>

For example, create three tasks: hot1 -> cold -> hot2. After all three
tasks are created, each allocates 128 MB of memory. The hot1 and hot2
tasks continuously access their 128 MB, while the cold task accesses its
memory only briefly and then calls madvise(MADV_COLD). However,
khugepaged still scans the cold task first and only scans the hot2 task
after it has finished scanning the cold task.

So if the user has explicitly informed us via MADV_COLD/MADV_FREE that
this memory is cold or will be freed, it is appropriate for khugepaged to
simply skip it, thereby avoiding unnecessary scan and collapse operations
and reducing wasted CPU time.
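As a rough illustration of the cold task described above, the following is a minimal userspace sketch, not the actual test program (which is not part of this patch). The helper name cold_task_once, the fill byte, and the fallback definition of MADV_COLD are assumptions made for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLD
#define MADV_COLD 20	/* Linux UAPI value; assumed here if libc headers lack it */
#endif

/*
 * Hypothetical helper: map len bytes of anonymous memory, touch it once,
 * then advise the kernel that the range is cold.
 *
 * Returns 0 if the advice was accepted, 1 if madvise() rejected it
 * (for example on a kernel without MADV_COLD), -1 if mmap() failed.
 */
static int cold_task_once(size_t len)
{
	unsigned char *buf;
	int ret;

	assert(len > 0);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* The brief access phase: fault in every page exactly once. */
	memset(buf, 0x5a, len);

	/* Tell the kernel this range is not expected to be used soon. */
	ret = madvise(buf, len, MADV_COLD) == 0 ? 0 : 1;
	if (ret)
		fprintf(stderr, "madvise(MADV_COLD): %s\n", strerror(errno));

	munmap(buf, len);
	return ret;
}
```

Calling cold_task_once(128UL << 20) would match the 128 MB size used in the example. A real cold task would keep the mapping alive instead of unmapping it, so that khugepaged still has a VMA to consider.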

Here are the performance test results (for Throughput, higher is better;
for all other metrics, lower is better):

Testing on x86_64 machine:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total accesses time |  3.14 sec     |  2.93 sec     |  -6.69% |
| cycles per access   |  4.96         |  2.21         | -55.44% |
| Throughput          |  104.38 M/sec |  111.89 M/sec |  +7.19% |
| dTLB-load-misses    |  284814532    |  69597236     | -75.56% |

Testing on qemu-system-x86_64 -enable-kvm:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total accesses time |  3.35 sec     |  2.96 sec     | -11.64% |
| cycles per access   |  7.29         |  2.07         | -71.60% |
| Throughput          |  97.67 M/sec  |  110.77 M/sec | +13.41% |
| dTLB-load-misses    |  241600871    |  3216108      | -98.67% |

Signed-off-by: Vernon Yang
---
 mm/madvise.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index b617b1be0f53..3a48d725a3fc 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1360,11 +1360,8 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
 		return madvise_remove(madv_behavior);
 	case MADV_WILLNEED:
 		return madvise_willneed(madv_behavior);
-	case MADV_COLD:
-		return madvise_cold(madv_behavior);
 	case MADV_PAGEOUT:
 		return madvise_pageout(madv_behavior);
-	case MADV_FREE:
 	case MADV_DONTNEED:
 	case MADV_DONTNEED_LOCKED:
 		return madvise_dontneed_free(madv_behavior);
@@ -1378,6 +1375,18 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
 
 	/* The below behaviours update VMAs via madvise_update_vma(). */
 
+	case MADV_COLD:
+		error = madvise_cold(madv_behavior);
+		if (error)
+			goto out;
+		new_flags = (new_flags & ~VM_HUGEPAGE) | VM_NOHUGEPAGE;
+		break;
+	case MADV_FREE:
+		error = madvise_dontneed_free(madv_behavior);
+		if (error)
+			goto out;
+		new_flags = (new_flags & ~VM_HUGEPAGE) | VM_NOHUGEPAGE;
+		break;
 	case MADV_NORMAL:
 		new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ;
 		break;
@@ -1756,7 +1765,6 @@ static enum madvise_lock_mode get_lock_mode(struct madvise_behavior *madv_behavi
 	switch (madv_behavior->behavior) {
 	case MADV_REMOVE:
 	case MADV_WILLNEED:
-	case MADV_COLD:
 	case MADV_PAGEOUT:
 	case MADV_POPULATE_READ:
 	case MADV_POPULATE_WRITE:
@@ -1766,7 +1774,6 @@ static enum madvise_lock_mode get_lock_mode(struct madvise_behavior *madv_behavi
 	case MADV_GUARD_REMOVE:
 	case MADV_DONTNEED:
 	case MADV_DONTNEED_LOCKED:
-	case MADV_FREE:
 		return MADVISE_VMA_READ_LOCK;
 	default:
 		return MADVISE_MMAP_WRITE_LOCK;
-- 
2.51.0