From nobody Mon Apr 6 20:28:35 2026 Received: from mail-dl1-f73.google.com (mail-dl1-f73.google.com [74.125.82.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 926223644C6 for ; Tue, 17 Mar 2026 23:07:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788845; cv=none; b=kDoP2pyxvqL1xLGLpmto9BwSXoTL2xEPLQ+HHaNGL+/BQUiB+UaWQi71SEg3ms1IoUB9dPXEaAW90Dr7QSg/5YrfcHEzVTEb6sixx9GSOv7hjOH3L6QIaPz1tIalRbKQVevOswpXG2r6QJlt2LlZ+MMip8UcJtcX7xtSHLSwtYA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788845; c=relaxed/simple; bh=YScM6Kr7HMm7e7moOv23pU8w0YmvqNuDylDtSGuk148=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=P6KSwqerRHcoHxJwZpFAkwKb42qf4PCjhrthJjm9oKhH3akrrdNtB1iRtPMGigzskAPs7Yz3R2parE8fEMI6f77E9zpBOHLua7nHyQ6jzt24YZWfombX2BAdyrCIaartzazlD3EXaSHd5BEMe2xloFOzI7/iIGKBghsRXhSQcv8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=iNmTIL67; arc=none smtp.client-ip=74.125.82.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="iNmTIL67" Received: by mail-dl1-f73.google.com with SMTP id a92af1059eb24-12711ec96fbso122572429c88.0 for ; Tue, 17 Mar 2026 16:07:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1773788844; x=1774393644; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=HrgwEjKu+QJdHEXo5hFZmjJR7NuvkUAEBDxhWygoT50=; b=iNmTIL67p1EKa+bS8YZjmL8jMIA4RhXiVDgMJL14pn0v6le9h+i7uFAMeSoLdo4iE6 e0aqtrtEOaIguuFg7Vw/h0fiwV1B8BUVFD+3b/eB+uyc07oLCzKvpaBGwWuljHrci49X EDEgMe1ETLO3NILIFGWrziJBzqskS7xGxzbtkdHmETP6BtIOnA+hwV/aLcVf28IbdYXk NElcKwOPrdix8XzKclunWmi36esOZUpbW/vBx5QOF4KQJOo3VAZjnAYwtuLYvaXU6ydV J40Vz9a7iQ/BibgAQipqIZenJ+AR2vSoZ/h5N05EZIRc+eVVB4WjgiPc2//1hG55BTwL betg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773788844; x=1774393644; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=HrgwEjKu+QJdHEXo5hFZmjJR7NuvkUAEBDxhWygoT50=; b=cvadRsht6xZbVZnlEIjGGi0pLX4TYQnozN480K+HqUoUJlbaRanWdrlfCz64SrI2RL r7a+6rk7UR+8UMf+6F33pYLzTDatcW7Jmruo/vpQdvjmeD91jPW7d03dgjO1OqaHAA+U 8PAIyT43ZsbmaDd16gg1qjc7o2GezNT77ts+u2uFeW6LHOc6KKTUxk0YhV6IITwDem8Q gcWUaqPwkqhLK5enUYO7n4Txrm0M4D4ghwdtYLOhyAP7z1DFBdJZUsfnqnwrma+sN4bV u/qhvoJV2/BHBbR9l0RF4NYZ+HgQLcDOh8jTjfQ50Gxosf/2F0Xt6P3AA7tCscr2nRWo C52w== X-Forwarded-Encrypted: i=1; AJvYcCVLq1bBoakEU0CLQk9JHhOf/opuFNdc2EOtmygz33IfTlpliziHClLZzxMDgHR7ornZObBYZKfyY67L/qw=@vger.kernel.org X-Gm-Message-State: AOJu0YzJYXu5F3qvSm7lFgpN78NV3mClYtAiRIAoyoQAr4ryhgDReDMn iNm0QBfMrkLQB8H0/Zw3Kg8Lex2r89JUtJW2ThTT3+FXHNs0T51RNNmSjCE2bwuF82gG17t+vcD AMAmZlsNNso3BMg== X-Received: from dlbep9.prod.google.com ([2002:a05:7022:1089:b0:128:e027:28aa]) (user=bingjiao job=prod-delivery.src-stubby-dispatcher) by 2002:a05:7022:619c:b0:127:3915:76b2 with SMTP id a92af1059eb24-129a71795ccmr770593c88.27.1773788843379; Tue, 17 Mar 2026 16:07:23 -0700 (PDT) Date: Tue, 17 Mar 2026 23:07:00 +0000 In-Reply-To: <20260317230720.990329-1-bingjiao@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 
1.0
References: <20260317230720.990329-1-bingjiao@google.com>
X-Mailer: git-send-email 2.53.0.851.ga537e3e6e9-goog
Message-ID: <20260317230720.990329-2-bingjiao@google.com>
Subject: [PATCH 1/3] mm/memcontrol: fix reclaim_options leak in try_charge_memcg()
From: Bing Jiao
To: linux-mm@kvack.org
Cc: Bing Jiao, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Shakeel Butt, Muchun Song, Andrew Morton, David Rientjes, Yosry Ahmed,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Chris Li,
 Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
 Youngjun Park, David Hildenbrand, Qi Zheng, Lorenzo Stoakes,
 Axel Rasmussen, Yuanchu Xie, Wei Xu, Joshua Hahn
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

In try_charge_memcg(), the 'reclaim_options' variable is initialized
once at the start of the function. However, the function contains a
retry loop. If reclaim_options is modified during an iteration (e.g.,
when a memsw limit is encountered), the modified state persists into
subsequent retries.

This can lead to incorrect reclaim behavior: after a retry, anon pages
cannot be reclaimed even though memsw still has quota available.

Fix this by moving the initialization of 'reclaim_options' inside the
retry loop, ensuring a clean state for every reclaim attempt.
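
For illustration only, the stale-state problem can be modelled in user
space. The sketch below is not kernel code: SKETCH_MAY_SWAP and
options_on_last_attempt() are hypothetical names, and the loop stands in
for the retry path of try_charge_memcg() under the assumption that
hitting the memsw limit clears the may-swap bit for that attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; it only mirrors the kernel name. */
#define SKETCH_MAY_SWAP (1u << 1)

/*
 * Model a charge path that retries 'attempts' times. hit_memsw[i] says
 * whether attempt i ran into the memsw limit. Returns the options that
 * the last attempt would pass to reclaim (0 if there were no attempts).
 */
static unsigned int options_on_last_attempt(const bool *hit_memsw,
					    int attempts,
					    bool reinit_each_retry)
{
	unsigned int opts = SKETCH_MAY_SWAP;	/* initialized once */
	unsigned int used = 0;

	assert(hit_memsw != NULL || attempts == 0);

	for (int i = 0; i < attempts; i++) {
		if (reinit_each_retry)
			opts = SKETCH_MAY_SWAP;	/* clean state per retry */
		if (hit_memsw[i])
			opts &= ~SKETCH_MAY_SWAP; /* assumed memsw behavior */
		used = opts;
	}
	return used;
}
```

With hit_memsw = { true, false }, the second attempt runs without the
may-swap bit when the options are initialized once, and with the bit set
when they are reinitialized on every retry.
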
Fixes: 73b73bac90d9 ("mm: vmpressure: don't count proactive reclaim in vmpressure")
Signed-off-by: Bing Jiao
Reviewed-by: Yosry Ahmed
---
 mm/memcontrol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a47fb68dd65f..303ac622d22d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2558,7 +2558,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	struct page_counter *counter;
 	unsigned long nr_reclaimed;
 	bool passed_oom = false;
-	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP;
+	unsigned int reclaim_options;
 	bool drained = false;
 	bool raised_max_event = false;
 	unsigned long pflags;
@@ -2572,6 +2572,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 		/* Avoid the refill and flush of the older stock */
 		batch = nr_pages;
 
+	reclaim_options = MEMCG_RECLAIM_MAY_SWAP;
 	if (!do_memsw_account() ||
 	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
 		if (page_counter_try_charge(&memcg->memory, batch, &counter))
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Mon Apr 6 20:28:35 2026
Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 05E0E3E556F for ; Tue, 17 Mar 2026 23:07:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788847; cv=none; b=emvK/XiYOYYwo9cMwMlM4ishLP6gaiJylCeQbNHh27x1inGWHph8EmG5nvmYxNKctys9GIJCRCuoHVk5MkQU8yoGbAlKOhP5oPHJ2hEelJUnwxdFxMEcqAAMYejpV7m8ADoeWQ4bqHvJ7uklNKXl9vZp4FItbl4fm80aS6A0Ncc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788847; c=relaxed/simple; bh=lTY2DkVlB1ZqmArZtsCW+40/HO85obKkV4Bbav50FdY=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: 
To:Cc:Content-Type; b=jjuqBg4mmM2vH7HdU+S1aEQdpZJQMUB0xwA+Cg+9QzO+lW4ut7WnymNFyGfbqyWk82JvBYCh1b5l8b3ylIRZHnvVJKSm51jLEcw+RZ18UOVkvHr0948MWPJluz4IYkVEE+f9Ab4nqZ5vpGnT+5MC5jU0ZCgJByM7MnsfVbpk7Qg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=muesGX3u; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="muesGX3u" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2b064f043adso17920075ad.0 for ; Tue, 17 Mar 2026 16:07:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1773788845; x=1774393645; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=By9SVSNyV82E5eeo5rQU1kGPzHgbL0ThBDCP4IlBKCg=; b=muesGX3uZ4sPoOWXsfwkJ0bxJTO3/DnReO0HtDdsSTAWX0sKLtEFLVvSjQi8tyXgNE NnXn2mUvxDExVOJJkF3ZNnfMXXIM/6vYssOp2tEEeBffq7PiI3TZ4CJIpBaFGysUB2VY 77bcubsbrlT3unBOEm90JZ3j9KUB+8xXzOHAt+SXGo7ChYx0izEaRF8Wd7qRpar2m2rh SRcp+Z5UPGBBb192UxDAN3k9ASktuSMPi5oYHd7F9VeEiBRdG9san6bk3iaPed/ofNEm 7G8R6JDMbphBKCc8N9bxEls8sKIfm8UhPfi4X75WkENOx1xaVMPgxae/gccB0HuYpGm0 HkLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773788845; x=1774393645; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=By9SVSNyV82E5eeo5rQU1kGPzHgbL0ThBDCP4IlBKCg=; b=V2VBJZB7ctO31CXg0nF0LdexBw66eiWPV3idXVRigi8YovHUzO9m6lzlvV0LjQKzKV 
LKZQMh31iZi1vf/xT58sDk9Qbz8IGd2XVQes7tOGmj3F/wcmlZqq6ZbnH9mptcWVYvmF ptvPo0VK7Uc+bj4bwKFL1X79DJCbBRqj+X7Vxgz1WeIaygvDsqfixjnpaqXUrr0W9O5F K4BL5OraSw+sWMjrJawT0PozXgQIphmfVbeJ0qm1EuAE15yjMmkDvEWM81I94JnS56RC L4gBs/ptKo2lIRInFUL16DjpesFBE5ZqcCd+10IY2gZxjpERa1qL7mmIUWTOjVvayphl h4sg== X-Forwarded-Encrypted: i=1; AJvYcCUDDjYhlUMLKerVUnzWosSMSLuuMp+kWVAFBepVfQHwxlc+328tIOwKRwCJepCpac7/+1dRjDCW/xaLbrw=@vger.kernel.org X-Gm-Message-State: AOJu0YzX0h9OduAmoU3F39CPNwuKDQuVr6c4HqivbKrc6Y/xHwXxVbkN OXgDqwvXU4k561cqxGP3y1HA8UVEzo81jHAth2zEVFxIkeUS8aBE7CUVMiN3yo3SC39Gs992nTp wf+DrgGfkC7lGpQ== X-Received: from plbli7.prod.google.com ([2002:a17:903:2947:b0:2b0:4e8e:5c09]) (user=bingjiao job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:f544:b0:2ae:4732:285a with SMTP id d9443c01a7336-2b06e332b4fmr12404475ad.3.1773788845155; Tue, 17 Mar 2026 16:07:25 -0700 (PDT) Date: Tue, 17 Mar 2026 23:07:01 +0000 In-Reply-To: <20260317230720.990329-1-bingjiao@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0
References: <20260317230720.990329-1-bingjiao@google.com>
X-Mailer: git-send-email 2.53.0.851.ga537e3e6e9-goog
Message-ID: <20260317230720.990329-3-bingjiao@google.com>
Subject: [PATCH 2/3] mm/memcontrol: disable demotion in memcg direct reclaim
From: Bing Jiao
To: linux-mm@kvack.org
Cc: Bing Jiao, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Shakeel Butt, Muchun Song, Andrew Morton, David Rientjes, Yosry Ahmed,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Chris Li,
 Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
 Youngjun Park, David Hildenbrand, Qi Zheng, Lorenzo Stoakes,
 Axel Rasmussen, Yuanchu Xie, Wei Xu, Joshua Hahn
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

NUMA demotion counts towards reclaim targets in shrink_folio_list(), but
it does not reduce the total memory usage of a memcg.
In memcg direct reclaim paths (e.g., charge-triggered reclaim or manual
limit writes), where demotion is currently allowed, this leads to "fake
progress": the reclaim loop concludes that it has satisfied the memory
request without actually reducing the cgroup's charge.

This can result in inefficient reclaim loops, wasted CPU, all pages
being moved to far-tier nodes, and potentially premature OOM kills when
the cgroup is under memory pressure but demotion is still possible.

Introduce the MEMCG_RECLAIM_NO_DEMOTION flag to disable demotion in
these memcg-specific reclaim paths. This ensures that reclaim progress
is only counted when memory is actually freed or swapped out.

Signed-off-by: Bing Jiao
---
 include/linux/swap.h |  1 +
 mm/memcontrol-v1.c   | 10 ++++++++--
 mm/memcontrol.c      | 16 +++++++++++-----
 mm/vmscan.c          |  1 +
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7a09df6977a5..e83897a6dc72 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -356,6 +356,7 @@ unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone
 
 #define MEMCG_RECLAIM_MAY_SWAP (1 << 1)
 #define MEMCG_RECLAIM_PROACTIVE (1 << 2)
+#define MEMCG_RECLAIM_NO_DEMOTION (1 << 3)
 #define MIN_SWAPPINESS 0
 #define MAX_SWAPPINESS 200
 
diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index 433bba9dfe71..3cb600e28e5b 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -1466,6 +1466,10 @@ static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
 	int ret;
 	bool limits_invariant;
 	struct page_counter *counter = memsw ? &memcg->memsw : &memcg->memory;
+	unsigned int reclaim_options = MEMCG_RECLAIM_NO_DEMOTION;
+
+	if (!memsw)
+		reclaim_options |= MEMCG_RECLAIM_MAY_SWAP;
 
 	do {
 		if (signal_pending(current)) {
@@ -1500,7 +1504,7 @@ static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
 		}
 
 		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
-				memsw ? 0 : MEMCG_RECLAIM_MAY_SWAP, NULL)) {
+				reclaim_options, NULL)) {
 			ret = -EBUSY;
 			break;
 		}
@@ -1520,6 +1524,8 @@ static int mem_cgroup_resize_max(struct mem_cgroup *memcg,
 static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 {
 	int nr_retries = MAX_RECLAIM_RETRIES;
+	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP |
+				       MEMCG_RECLAIM_NO_DEMOTION;
 
 	/* we call try-to-free pages for make this cgroup empty */
 	lru_add_drain_all();
@@ -1532,7 +1538,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 			return -EINTR;
 
 		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL,
-						  MEMCG_RECLAIM_MAY_SWAP, NULL))
+						  reclaim_options, NULL))
 			nr_retries--;
 	}
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 303ac622d22d..fcf1cd0da643 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2287,6 +2287,8 @@ static unsigned long reclaim_high(struct mem_cgroup *memcg,
 				  gfp_t gfp_mask)
 {
 	unsigned long nr_reclaimed = 0;
+	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP |
+				       MEMCG_RECLAIM_NO_DEMOTION;
 
 	do {
 		unsigned long pflags;
@@ -2300,7 +2302,7 @@ static unsigned long reclaim_high(struct mem_cgroup *memcg,
 		psi_memstall_enter(&pflags);
 		nr_reclaimed += try_to_free_mem_cgroup_pages(memcg, nr_pages,
 							gfp_mask,
-							MEMCG_RECLAIM_MAY_SWAP,
+							reclaim_options,
 							NULL);
 		psi_memstall_leave(&pflags);
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
@@ -2572,7 +2574,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 		/* Avoid the refill and flush of the older stock */
 		batch = nr_pages;
 
-	reclaim_options = MEMCG_RECLAIM_MAY_SWAP;
+	reclaim_options = MEMCG_RECLAIM_MAY_SWAP | MEMCG_RECLAIM_NO_DEMOTION;
 	if (!do_memsw_account() ||
 	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
 		if (page_counter_try_charge(&memcg->memory, batch, &counter))
@@ -2610,7 +2612,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 
 	psi_memstall_enter(&pflags);
 	nr_reclaimed = try_to_free_mem_cgroup_pages(mem_over_limit, nr_pages,
-						    gfp_mask, reclaim_options, NULL);
+						    gfp_mask, reclaim_options, NULL);
 	psi_memstall_leave(&pflags);
 
 	if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
@@ -4638,6 +4640,8 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
 	unsigned int nr_retries = MAX_RECLAIM_RETRIES;
+	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP |
+				       MEMCG_RECLAIM_NO_DEMOTION;
 	bool drained = false;
 	unsigned long high;
 	int err;
@@ -4669,7 +4673,7 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
 		}
 
 		reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
-					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP, NULL);
+					GFP_KERNEL, reclaim_options, NULL);
 
 		if (!reclaimed && !nr_retries--)
 			break;
@@ -4690,6 +4694,8 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
 	unsigned int nr_reclaims = MAX_RECLAIM_RETRIES;
+	unsigned int reclaim_options = MEMCG_RECLAIM_MAY_SWAP |
+				       MEMCG_RECLAIM_NO_DEMOTION;
 	bool drained = false;
 	unsigned long max;
 	int err;
@@ -4721,7 +4727,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 
 		if (nr_reclaims) {
 			if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - max,
-					GFP_KERNEL, MEMCG_RECLAIM_MAY_SWAP, NULL))
+					GFP_KERNEL, reclaim_options, NULL))
 				nr_reclaims--;
 			continue;
 		}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33287ba4a500..7a8617ba1748 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6809,6 +6809,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 		.may_unmap = 1,
 		.may_swap = !!(reclaim_options & MEMCG_RECLAIM_MAY_SWAP),
 		.proactive = !!(reclaim_options & MEMCG_RECLAIM_PROACTIVE),
+		.no_demotion = !!(reclaim_options & MEMCG_RECLAIM_NO_DEMOTION),
 	};
 	/*
 	 * Traverse the ZONELIST_FALLBACK zonelist of the current node to put
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Mon Apr 6 20:28:35 2026
Received: from mail-dl1-f73.google.com 
(mail-dl1-f73.google.com [74.125.82.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 33D7E3FA5D3 for ; Tue, 17 Mar 2026 23:07:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788849; cv=none; b=i/0mxAl0kSYOrYS5db2SB8EtTQb8Q7gxtCd2ZmmYJPRvrSF7OIjtNmHFMczEpwCFOc3kkGz7m1gV8Qe7de+btXrba4imU+sm7lkFw42Cq/XwBGHX4Bur3nfGluMqkavZon7JxZWjbkl6TcE0VQ0yOYr1H8jUINd3O59NAOtje7c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773788849; c=relaxed/simple; bh=ZfWE2COIH+ldGUo90wdzJ/MxDEymNSaHSsDjounjNKw=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ZrMq1fDXkZp9bQjcvASk3mbMOq45nNjXHGCNkOM0aPnSplH1WQUP/XsbxOfn5rVldDeAqgeGFuhheNhWhC8LX8lV42Qgy+hMbknyJ0dXMc+gMKjxSqFdgqw2qI0/TMnmYsbiwhphFWrDuVGrD9rSBRd8tuqEiiKMiDKkqOZ+mrE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=D5UjtZbv; arc=none smtp.client-ip=74.125.82.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--bingjiao.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="D5UjtZbv" Received: by mail-dl1-f73.google.com with SMTP id a92af1059eb24-128edc72e5bso5636631c88.1 for ; Tue, 17 Mar 2026 16:07:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1773788847; x=1774393647; darn=vger.kernel.org; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=2sLkrHizaGZBp890B7+nzygesvVkoY/tFf4Qx2iHrc0=; b=D5UjtZbv3NJS6j8ZtyKUljmjPFX3Bd52ycLqWXTKwyBlR+Ebja4J3fpcSYNqFAF8/k ox4ickECjfEHg8yBCX1MJiyB755Qk0eClW52q7t1+dLc4gyHnBS70m8PESLg5N2Nj5a9 jKlx8WzWsxun/9DUxHMDrCr7GVZoRMImoMC/WLnkhRusFdAGAE/eSZ/FeN5yJmCOlLxb 0ZD1jOvHx8+1/cEpfnhFMjkhIrJsK8pri6gSKxLQYR+it8VbuDZ2LRp9V94oYyNVpvgB r9LY0IgvBFpsAPS+FhGqkUGRX+nhupv4+cFFE5xErxQkadnfIwGzQXz0R6C//bPvNqKS tlWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773788847; x=1774393647; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=2sLkrHizaGZBp890B7+nzygesvVkoY/tFf4Qx2iHrc0=; b=CDpjDFNkyHCFZjpAOI6qtmXt6bMoTR6qxKgV3gAij1Ua0jo5dz1XG+by4yzX7paHR2 yg0pfKCI/xQX5TTONMY4JEjgADur2xmNdovFGXgIS9y7avXd8fIOtgtby6OLGdLgodde yckWOPBagdlrWztUqNsMnFpZBhsOK3sjZCKnFe4qpTOk7VRJWuUvDb14YfDDxjCmNdGd aB/MShs7fBorQQgTooCrI/3zZ+wBvC4NGn23UKQC67TgPg25zpdOIBrIcrObJ5YPfInL V3iF3swmX20xM/6/E0eCWzNgyt6bqyT/mE3F2stbtsan0/2/mQFeaTMRLmMfpWzulS7L JjaQ== X-Forwarded-Encrypted: i=1; AJvYcCVF2Et2AZcJoOVDNqkNoJT5z4rcBBln+qTpFLg0Hgw48bPgByGhfyrOGDeGtu3vC6P1N54zjv2fGfcGEIg=@vger.kernel.org X-Gm-Message-State: AOJu0Yzc5VJ99HYa1QGETt7sKGCD8b5u+8a/3nzcqAt4XFr9Z8fpl7M7 KMgYTUJZcqXI+YWKTSRDNdZViNSXevo69rrgR5j1DlDp0xVvmsHmVSrfE9VyWJNw3v5UYtcVu2f 9f+WU9gz39IBUsA== X-Received: from dlbrn2.prod.google.com ([2002:a05:7022:1502:b0:128:d185:c6ff]) (user=bingjiao job=prod-delivery.src-stubby-dispatcher) by 2002:a05:7022:2218:b0:119:e569:fb9b with SMTP id a92af1059eb24-129a7111483mr694227c88.10.1773788846767; Tue, 17 Mar 2026 16:07:26 -0700 (PDT) Date: Tue, 17 Mar 2026 23:07:02 +0000 In-Reply-To: <20260317230720.990329-1-bingjiao@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: 
<20260317230720.990329-1-bingjiao@google.com>
X-Mailer: git-send-email 2.53.0.851.ga537e3e6e9-goog
Message-ID: <20260317230720.990329-4-bingjiao@google.com>
Subject: [PATCH 3/3] mm/vmscan: add demote= option to proactive reclaim
From: Bing Jiao
To: linux-mm@kvack.org
Cc: Bing Jiao, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Shakeel Butt, Muchun Song, Andrew Morton, David Rientjes, Yosry Ahmed,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Chris Li,
 Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
 Youngjun Park, David Hildenbrand, Qi Zheng, Lorenzo Stoakes,
 Axel Rasmussen, Yuanchu Xie, Wei Xu, Joshua Hahn
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

In tiered-memory systems, proactive memory reclaim (via the cgroup
memory.reclaim interface) can demote pages to a lower memory tier
before eventually reclaiming them to swap.

Add a 'demote=%u' option to memory.reclaim to allow users to control
this behavior. Setting 'demote=1' enables demotion, while 'demote=0'
disables it. By default, demote is disabled (0).

This change ensures that proactive reclaim behaves consistently with
cgroup limit-based reclaim (e.g., memory.high), where the goal is
typically to reduce the overall memory footprint rather than migrating
it to slower tiers.
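
As a rough user-space illustration of the intended semantics (not the
kernel implementation), the sketch below parses only the 'demote=<uint>'
token and derives the reclaim flags from it. The SKETCH_* values and
sketch_reclaim_flags() are hypothetical names; the kernel uses
match_token()/match_uint() rather than strtok()/sscanf(), and the real
interface also accepts other options that this sketch rejects.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values for this sketch; they mirror the kernel names. */
#define SKETCH_MAY_SWAP    (1u << 1)
#define SKETCH_PROACTIVE   (1u << 2)
#define SKETCH_NO_DEMOTION (1u << 3)

/*
 * Parse space-separated options, modelling only "demote=<uint>".
 * Returns 0 and stores the resulting flags, or -1 on an unknown or
 * malformed token.
 */
static int sketch_reclaim_flags(const char *args, unsigned int *flags)
{
	char buf[128];
	char *tok;
	unsigned int allow_demotion = 0;	/* default: demotion disabled */

	assert(args != NULL && flags != NULL);

	if (strlen(args) >= sizeof(buf))
		return -1;
	strcpy(buf, args);	/* strtok() modifies its input */

	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
		char trailing;

		/* Exactly one conversion: a number with nothing after it. */
		if (sscanf(tok, "demote=%u%c", &allow_demotion, &trailing) != 1)
			return -1;
	}

	*flags = SKETCH_MAY_SWAP | SKETCH_PROACTIVE;
	if (!allow_demotion)
		*flags |= SKETCH_NO_DEMOTION;
	return 0;
}
```

In this model an empty option string and "demote=0" both yield
SKETCH_NO_DEMOTION, while "demote=1" leaves it clear.
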
Signed-off-by: Bing Jiao
---
 mm/vmscan.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7a8617ba1748..80194270fa2e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -7878,11 +7878,13 @@ static unsigned long __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask,
 enum {
 	MEMORY_RECLAIM_SWAPPINESS = 0,
 	MEMORY_RECLAIM_SWAPPINESS_MAX,
+	MEMORY_RECLAIM_ALLOW_DEMOTION,
 	MEMORY_RECLAIM_NULL,
 };
 static const match_table_t tokens = {
 	{ MEMORY_RECLAIM_SWAPPINESS, "swappiness=%d"},
 	{ MEMORY_RECLAIM_SWAPPINESS_MAX, "swappiness=max"},
+	{ MEMORY_RECLAIM_ALLOW_DEMOTION, "demote=%u"},
 	{ MEMORY_RECLAIM_NULL, NULL },
 };
 
@@ -7890,6 +7892,7 @@ int user_proactive_reclaim(char *buf,
 			   struct mem_cgroup *memcg, pg_data_t *pgdat)
 {
 	unsigned int nr_retries = MAX_RECLAIM_RETRIES;
+	unsigned int allow_demotion = 0;
 	unsigned long nr_to_reclaim, nr_reclaimed = 0;
 	int swappiness = -1;
 	char *old_buf, *start;
@@ -7922,6 +7925,10 @@ int user_proactive_reclaim(char *buf,
 		case MEMORY_RECLAIM_SWAPPINESS_MAX:
 			swappiness = SWAPPINESS_ANON_ONLY;
 			break;
+		case MEMORY_RECLAIM_ALLOW_DEMOTION:
+			if (match_uint(&args[0], &allow_demotion))
+				return -EINVAL;
+			break;
 		default:
 			return -EINVAL;
 		}
@@ -7947,6 +7954,8 @@ int user_proactive_reclaim(char *buf,
 
 			reclaim_options = MEMCG_RECLAIM_MAY_SWAP |
 					  MEMCG_RECLAIM_PROACTIVE;
+			if (!allow_demotion)
+				reclaim_options |= MEMCG_RECLAIM_NO_DEMOTION;
 			reclaimed = try_to_free_mem_cgroup_pages(memcg,
 						 batch_size, gfp_mask,
 						 reclaim_options,
@@ -7962,6 +7971,7 @@ int user_proactive_reclaim(char *buf,
 				.may_unmap = 1,
 				.may_swap = 1,
 				.proactive = 1,
+				.no_demotion = !(allow_demotion),
 			};
 
 			if (test_and_set_bit_lock(PGDAT_RECLAIM_LOCKED,
-- 
2.53.0.851.ga537e3e6e9-goog