A filesystem writeback performance issue was discovered by repeatedly
running CPU hotplug operations while a process in a cgroup with memory
and io controllers enabled wrote to an ext4 file in a loop.
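For reference, a rough reproducer sketch (illustrative only; the file path
and CPU number are assumptions rather than the exact setup used): run the
writer below inside a cgroup with the memory and io controllers enabled,
while another task repeatedly writes 0 and 1 to
/sys/devices/system/cpu/cpu1/online to offline and online the CPU.

  /* dirty-writer.c: keeps dirtying an ext4-backed file in a loop */
  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
	char buf[4096];
	int fd = open("/mnt/ext4/dirty-file", O_WRONLY | O_CREAT, 0644);

	if (fd < 0)
		return 1;
	memset(buf, 'a', sizeof(buf));
	for (;;) {
		/* keep the per-cpu NR_FILE_DIRTY delta non-zero */
		if (pwrite(fd, buf, sizeof(buf), 0) < 0)
			return 1;
	}
  }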
When a CPU is offlined, the memcg_hotplug_cpu_dead() callback function
flushes the per-cpu vmstats counters. However, instead of applying each
per-cpu counter once to every cgroup in the hierarchy, the counter is
applied repeatedly to the nested cgroup alone. Under certain conditions,
the per-cpu NR_FILE_DIRTY counter is routinely positive during hotplug
events, so the dirty file count is artificially inflated. Once the dirty
file count grows past the dirty_freerun_ceiling(), balance_dirty_pages()
starts a background writeback each time a file page is marked dirty
within the nested cgroup.
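To make the double-counting concrete, here is a small userspace model of
the flush loop (plain C, not kernel code; names are illustrative). For a
per-cpu delta x flushed on behalf of a cgroup nested two levels below the
root, the buggy loop credits the leaf three times and its ancestors not at
all, while the fixed loop credits every level exactly once:

  #include <stdio.h>

  struct cg {
	struct cg *parent;
	long nr_file_dirty;
  };

  int main(void)
  {
	struct cg root = { NULL, 0 }, a = { &root, 0 }, b = { &a, 0 };
	struct cg *mi;
	long x = 5;

	/* buggy 5.10 loop: walks the hierarchy but always charges the leaf */
	for (mi = &b; mi; mi = mi->parent)
		b.nr_file_dirty += x;		/* should be mi->nr_file_dirty */
	printf("buggy: b=%ld a=%ld root=%ld\n",
	       b.nr_file_dirty, a.nr_file_dirty, root.nr_file_dirty);
	/* prints: buggy: b=15 a=0 root=0 */

	b.nr_file_dirty = 0;

	/* fixed loop: each level receives the delta exactly once */
	for (mi = &b; mi; mi = mi->parent)
		mi->nr_file_dirty += x;
	printf("fixed: b=%ld a=%ld root=%ld\n",
	       b.nr_file_dirty, a.nr_file_dirty, root.nr_file_dirty);
	/* prints: fixed: b=5 a=5 root=5 */
	return 0;
  }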
This change fixes memcg_hotplug_cpu_dead() so that the per-cpu vmstats
and vmevents counters are applied once to each cgroup in the hierarchy,
similar to __mod_memcg_state() and __count_memcg_events().
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Andrew Guerrero <ajgja@amazon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com>
---
Hey all,
This patch is intended for the 5.10 longterm release branch. It will not
apply cleanly to mainline, where the issue is inadvertently fixed by a larger
series of changes in later release branches:
a3d4c05a4474 ("mm: memcontrol: fix cpuhotplug statistics flushing").
In 5.15, the counter flushing code is removed entirely. Backporting that may
also be a viable option here, though it is a larger change.
Thanks!
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 142b4d5e08fe..8e085a4f45b7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2394,7 +2394,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
x = this_cpu_xchg(memcg->vmstats_percpu->stat[i], 0);
if (x)
for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
- atomic_long_add(x, &memcg->vmstats[i]);
+ atomic_long_add(x, &mi->vmstats[i]);
if (i >= NR_VM_NODE_STAT_ITEMS)
continue;
@@ -2417,7 +2417,7 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
x = this_cpu_xchg(memcg->vmstats_percpu->events[i], 0);
if (x)
for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
- atomic_long_add(x, &memcg->vmevents[i]);
+ atomic_long_add(x, &mi->vmevents[i]);
}
}
base-commit: c30b4019ea89633d790f0bfcbb03234f0d006f87
--
2.47.3
On Sat, Sep 06, 2025 at 03:21:08AM +0000, Andrew Guerrero wrote:
> This patch is intended for the 5.10 longterm release branch. It will not
> apply cleanly to mainline, where the issue is inadvertently fixed by a larger
> series of changes in later release branches:
> a3d4c05a4474 ("mm: memcontrol: fix cpuhotplug statistics flushing").

Why can't we take those instead?

> In 5.15, the counter flushing code is removed entirely. Backporting that may
> also be a viable option here, though it is a larger change.

If it's not needed anymore, why not just remove it with the upstream commits
as well?

thanks,

greg k-h