[PATCH v2 0/4] mm/zswap: Implement per-cgroup proactive writeback

Hao Jia posted 4 patches 2 weeks ago
There is a newer version of this series
Posted by Hao Jia 2 weeks ago
From: Hao Jia <jiahao1@lixiang.com>

Zswap currently writes back pages to backing swap reactively, triggered
either by the shrinker or by the pool reaching its size limit. Although
proactive memory reclaim can automatically write back a portion of zswap
pages via the shrinker, it cannot explicitly control the amount of
writeback for a specific memory cgroup. Moreover, proactive memory reclaim
may not always be triggered during a steady state.

In certain scenarios, it is desirable to trigger writeback in advance to
free up memory. For example, users may want to prepare for an upcoming
memory-intensive workload by flushing cold memory to the backing storage
when the system is relatively idle.

This patch series introduces a "zswap_writeback_only" key to the memory.reclaim
cgroup interface, allowing users to proactively write back cold compressed
pages from zswap to the backing swap device. When specified, this key
bypasses standard memory reclaim and exclusively performs proactive zswap
writeback up to the requested budget. If omitted, the default reclaim
behavior remains unchanged.

Example usage:
  # Write back 100MB of pages from zswap to the backing swap
  echo "100M zswap_writeback_only" > memory.reclaim
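
For scripting, the one-liner above could be wrapped roughly as follows. This is only a sketch: the "zswap_writeback_only" key exists only on kernels carrying this series, and CGROUP_ROOT, zswap_wb_request and zswap_wb_trigger are illustrative names, not part of the patches.

```shell
# Sketch only: "zswap_writeback_only" is the key proposed by this series and
# is rejected by kernels without these patches.
# CGROUP_ROOT and both function names are illustrative assumptions.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Build the request string for a given budget, e.g. "100M".
zswap_wb_request() {
    printf '%s zswap_writeback_only\n' "$1"
}

# Write the request to <cgroup>/memory.reclaim.
# Returns non-zero if the file cannot be opened or the write fails.
zswap_wb_trigger() {
    cgroup="$1"
    budget="$2"
    zswap_wb_request "$budget" > "$CGROUP_ROOT/$cgroup/memory.reclaim"
}
```

With this in place, `zswap_wb_trigger workload 100M` would issue the same request as the example above against a (hypothetical) cgroup named "workload".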

Patch 1: Move the global zswap shrink cursor into struct mem_cgroup as a
  per-memcg zswap_wb_iter, so patch 2 can scope writeback to a given memcg
  and make forward progress across its subtree on repeated invocations.
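
From userspace, the "forward progress on repeated invocations" property suggests that a large budget can be issued as several smaller requests. A minimal sketch, assuming a failed write means the requested budget was not met (as with regular memory.reclaim, which is not confirmed here for the new key); CGROUP_ROOT and zswap_wb_chunked are hypothetical names:

```shell
# Sketch only: relies on the "zswap_writeback_only" key from this series.
# CGROUP_ROOT and zswap_wb_chunked are illustrative assumptions.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Issue a total budget (in MiB) as a series of smaller requests.
# Prints the number of MiB successfully requested before stopping.
zswap_wb_chunked() {
    cgroup="$1"
    total_mb="$2"
    chunk_mb="$3"
    done_mb=0

    # A non-positive chunk size would never make progress.
    if [ "$chunk_mb" -le 0 ]; then
        return 1
    fi

    while [ "$done_mb" -lt "$total_mb" ]; do
        step=$((total_mb - done_mb))
        if [ "$step" -gt "$chunk_mb" ]; then
            step="$chunk_mb"
        fi
        # Stop at the first failed write instead of retrying forever.
        if ! printf '%sM zswap_writeback_only\n' "$step" \
                > "$CGROUP_ROOT/$cgroup/memory.reclaim"; then
            break
        fi
        done_mb=$((done_mb + step))
    done
    echo "$done_mb"
}
```

For example, `zswap_wb_chunked workload 100 10` would issue up to ten 10M requests; each request resumes from wherever the per-memcg cursor was left by the previous one.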

Patch 2: Extend the memory.reclaim cgroup v2 interface with a new
  "zswap_writeback_only" key, allowing users to trigger proactive zswap
  writeback up to a requested budget.

Patch 3: Add a zswpwb_proactive counter to memory.stat and /proc/vmstat
  to track the number of writebacks triggered by proactive writeback.
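
The new counter can be used to check how much a given request actually wrote back. A rough sketch, assuming zswpwb_proactive appears in memory.stat in the usual "key value" form; memcg_stat and zswpwb_proactive_delta are hypothetical helper names:

```shell
# Sketch only: zswpwb_proactive is the counter proposed by patch 3 and is
# absent on kernels without this series. Helper names are assumptions.

# Print the value of one key from a memory.stat-style file.
# Prints nothing and returns 1 if the key is not present.
memcg_stat() {
    awk -v key="$2" '$1 == key { print $2; found = 1 } END { exit !found }' "$1"
}

# Run a command and print how much zswpwb_proactive grew while it ran.
# Usage: zswpwb_proactive_delta <memory.stat path> <command> [args...]
zswpwb_proactive_delta() {
    stat_file="$1"
    shift
    before=$(memcg_stat "$stat_file" zswpwb_proactive) || return 1
    "$@"
    after=$(memcg_stat "$stat_file" zswpwb_proactive) || return 1
    echo $((after - before))
}
```

The second helper takes the command to run as its trailing arguments, so it can wrap a plain write to memory.reclaim or any other trigger.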

Patch 4: Add tests for zswap proactive writeback.

v1->v2:
    - As suggested by Yosry and Nhat, extend the memory.reclaim cgroup v2
      interface with a "zswap_writeback_only" key instead of adding a new
      dedicated cgroup interface.
    - Update the zswap documentation and add selftests for proactive writeback.

[v1] https://lore.kernel.org/all/20260511105149.75584-1-jiahao.kernel@gmail.com

Hao Jia (4):
  mm/zswap: Make shrink_worker writeback cursor per-memcg
  mm/zswap: Implement proactive writeback
  mm/zswap: Add per-memcg stat for proactive writeback
  selftests/cgroup: Add tests for zswap proactive writeback

 Documentation/admin-guide/cgroup-v2.rst     |  22 +-
 Documentation/admin-guide/mm/zswap.rst      |  11 +-
 include/linux/memcontrol.h                  |   3 +
 include/linux/vm_event_item.h               |   1 +
 include/linux/zswap.h                       |  16 ++
 mm/memcontrol.c                             |   4 +
 mm/vmscan.c                                 |  14 +
 mm/vmstat.c                                 |   1 +
 mm/zswap.c                                  | 292 +++++++++++++++++---
 tools/testing/selftests/cgroup/test_zswap.c | 161 ++++++++++-
 10 files changed, 470 insertions(+), 55 deletions(-)

-- 
2.34.1
Re: [PATCH v2 0/4] mm/zswap: Implement per-cgroup proactive writeback
Posted by Andrew Morton 2 weeks ago
On Mon, 25 May 2026 20:22:38 +0800 Hao Jia <jiahao.kernel@gmail.com> wrote:

> Zswap currently writes back pages to backing swap reactively, triggered
> either by the shrinker or by the pool reaching its size limit. Although
> proactive memory reclaim can automatically write back a portion of zswap
> pages via the shrinker, it cannot explicitly control the amount of
> writeback for a specific memory cgroup. Moreover, proactive memory reclaim
> may not always be triggered during a steady state.
> 
> In certain scenarios, it is desirable to trigger writeback in advance to
> free up memory. For example, users may want to prepare for an upcoming
> memory-intensive workload by flushing cold memory to the backing storage
> when the system is relatively idle.
> 
> This patch series introduces a "zswap_writeback_only" key to memory.reclaim
> cgroup interface, allowing users to proactively write back cold compressed
> pages from zswap to the backing swap device. When specified, this key
> bypasses standard memory reclaim and exclusively performs proactive zswap
> writeback up to the requested budget. If omitted, the default reclaim
> behavior remains unchanged.

Thanks.  AI review found a few things to complain about, one of them
described as "preexisting".
Re: [PATCH v2 0/4] mm/zswap: Implement per-cgroup proactive writeback
Posted by Hao Jia 1 week, 6 days ago

On 2026/5/26 03:24, Andrew Morton wrote:
> On Mon, 25 May 2026 20:22:38 +0800 Hao Jia <jiahao.kernel@gmail.com> wrote:
> 
>> Zswap currently writes back pages to backing swap reactively, triggered
>> either by the shrinker or by the pool reaching its size limit. Although
>> proactive memory reclaim can automatically write back a portion of zswap
>> pages via the shrinker, it cannot explicitly control the amount of
>> writeback for a specific memory cgroup. Moreover, proactive memory reclaim
>> may not always be triggered during a steady state.
>>
>> In certain scenarios, it is desirable to trigger writeback in advance to
>> free up memory. For example, users may want to prepare for an upcoming
>> memory-intensive workload by flushing cold memory to the backing storage
>> when the system is relatively idle.
>>
>> This patch series introduces a "zswap_writeback_only" key to memory.reclaim
>> cgroup interface, allowing users to proactively write back cold compressed
>> pages from zswap to the backing swap device. When specified, this key
>> bypasses standard memory reclaim and exclusively performs proactive zswap
>> writeback up to the requested budget. If omitted, the default reclaim
>> behavior remains unchanged.
> 
> Thanks.  AI review found a few things to complain about, one of them
> described as "preexisting".
> 

Thanks Andrew.  I have replied to the AI's review comments in a separate 
email and posted v3.
https://lore.kernel.org/all/20260526114601.67041-1-jiahao.kernel@gmail.com

Thanks,
Hao
Re: [PATCH v2 0/4] mm/zswap: Implement per-cgroup proactive writeback
Posted by Yosry Ahmed 1 week, 6 days ago
On Tue, May 26, 2026 at 4:56 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>
>
>
> On 2026/5/26 03:24, Andrew Morton wrote:
> > On Mon, 25 May 2026 20:22:38 +0800 Hao Jia <jiahao.kernel@gmail.com> wrote:
> >
> >> Zswap currently writes back pages to backing swap reactively, triggered
> >> either by the shrinker or by the pool reaching its size limit. Although
> >> proactive memory reclaim can automatically write back a portion of zswap
> >> pages via the shrinker, it cannot explicitly control the amount of
> >> writeback for a specific memory cgroup. Moreover, proactive memory reclaim
> >> may not always be triggered during a steady state.
> >>
> >> In certain scenarios, it is desirable to trigger writeback in advance to
> >> free up memory. For example, users may want to prepare for an upcoming
> >> memory-intensive workload by flushing cold memory to the backing storage
> >> when the system is relatively idle.
> >>
> >> This patch series introduces a "zswap_writeback_only" key to memory.reclaim
> >> cgroup interface, allowing users to proactively write back cold compressed
> >> pages from zswap to the backing swap device. When specified, this key
> >> bypasses standard memory reclaim and exclusively performs proactive zswap
> >> writeback up to the requested budget. If omitted, the default reclaim
> >> behavior remains unchanged.
> >
> > Thanks.  AI review found a few things to complain about, one of them
> > described as "preexisting".
> >
>
> Thanks Andrew.  I have replied to the AI's review comments in a separate
> email and posted v3.
> https://lore.kernel.org/all/20260526114601.67041-1-jiahao.kernel@gmail.com

Generally speaking, please give time for reviewers to take a look
before sending a new version. Less than a day is usually too fast
(unless you're iterating super fast with the reviewers). Review
feedback does not have to be addressed immediately, usually wait for a
bit to collect as much feedback as possible before spinning a new
version.

I will take a look at v3 soon, thank you.
Re: [PATCH v2 0/4] mm/zswap: Implement per-cgroup proactive writeback
Posted by Hao Jia 1 week, 5 days ago

On 2026/5/27 02:55, Yosry Ahmed wrote:
> On Tue, May 26, 2026 at 4:56 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>>
>>
>> On 2026/5/26 03:24, Andrew Morton wrote:
>>> On Mon, 25 May 2026 20:22:38 +0800 Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>
>>>> Zswap currently writes back pages to backing swap reactively, triggered
>>>> either by the shrinker or by the pool reaching its size limit. Although
>>>> proactive memory reclaim can automatically write back a portion of zswap
>>>> pages via the shrinker, it cannot explicitly control the amount of
>>>> writeback for a specific memory cgroup. Moreover, proactive memory reclaim
>>>> may not always be triggered during a steady state.
>>>>
>>>> In certain scenarios, it is desirable to trigger writeback in advance to
>>>> free up memory. For example, users may want to prepare for an upcoming
>>>> memory-intensive workload by flushing cold memory to the backing storage
>>>> when the system is relatively idle.
>>>>
>>>> This patch series introduces a "zswap_writeback_only" key to memory.reclaim
>>>> cgroup interface, allowing users to proactively write back cold compressed
>>>> pages from zswap to the backing swap device. When specified, this key
>>>> bypasses standard memory reclaim and exclusively performs proactive zswap
>>>> writeback up to the requested budget. If omitted, the default reclaim
>>>> behavior remains unchanged.
>>>
>>> Thanks.  AI review found a few things to complain about, one of them
>>> described as "preexisting".
>>>
>>
>> Thanks Andrew.  I have replied to the AI's review comments in a separate
>> email and posted v3.
>> https://lore.kernel.org/all/20260526114601.67041-1-jiahao.kernel@gmail.com
> 
> Generally speaking, please give time for reviewers to take a look
> before sending a new version. Less than a day is usually too fast
> (unless you're iterating super fast with the reviewers). Review
> feedback does not have to be addressed immediately, usually wait for a
> bit to collect as much feedback as possible before spinning a new
> version.
>

Thanks for the advice, Yosry. Got it.

> I will take a look at v3 soon, thank you.

Appreciate you taking a look at v3.

Thanks,
Hao