Hello,
In cgroup1, "memory.stat" (via the function memcg1_stat_format) already
contains two lines
  hierarchical_memory_limit %llu
  hierarchical_memsw_limit %llu
which let userland find the effective cgroup limits easily and cheaply.
Without them, userland has to open+read+close the file "memory.max" and/or
"memory.swap.max" in every parent directory of a nested cgroup.
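For illustration, the per-level walk that userland needs today could look roughly like the sketch below. This is not part of the patch: the function names (parse_limit, read_limit, effective_limit) are hypothetical, the cgroup2 mount point is passed in by the caller, and an unreadable file is simply treated as "no limit".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse the contents of a cgroup2 limit file: "max" or a byte count. */
static uint64_t parse_limit(const char *text)
{
	assert(text != NULL);
	if (strncmp(text, "max", 3) == 0)
		return UINT64_MAX;
	return strtoull(text, NULL, 10);
}

/* Read one limit file; a missing or unreadable file counts as unlimited. */
static uint64_t read_limit(const char *dir, const char *file)
{
	char path[4096], buf[64];
	uint64_t limit = UINT64_MAX;
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s", dir, file);
	f = fopen(path, "r");
	if (!f)
		return limit;
	if (fgets(buf, sizeof(buf), f))
		limit = parse_limit(buf);
	fclose(f);
	return limit;
}

/*
 * Walk from cgroup_dir up to (but not including) root and take the
 * minimum of the per-level limits: one open+read+close per level.
 * "root" is assumed to be the cgroup2 mount point.
 */
static uint64_t effective_limit(const char *root, const char *cgroup_dir,
				const char *file)
{
	char dir[4096];
	uint64_t limit = UINT64_MAX;
	size_t root_len = strlen(root);

	snprintf(dir, sizeof(dir), "%s", cgroup_dir);
	while (strlen(dir) > root_len && strncmp(dir, root, root_len) == 0) {
		uint64_t cur = read_limit(dir, file);
		char *slash;

		if (cur < limit)
			limit = cur;
		slash = strrchr(dir, '/');
		if (!slash)
			break;
		*slash = '\0';	/* step up to the parent directory */
	}
	return limit;
}
```

A caller would use something like effective_limit("/sys/fs/cgroup", "/sys/fs/cgroup/a/b", "memory.max"), where the paths are only examples. The cost grows with the nesting depth, which is what a single pre-computed value would avoid.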
For cgroup1 it was implemented by:
memcg: show real limit under hierarchy mode
https://github.com/torvalds/linux/commit/fee7b548e6f2bd4bfd03a1a45d3afd593de7d5e9
Date: Wed Jan 7 18:08:26 2009 -0800
For cgroup2 it has been missing so far. This patch is a copy-paste of the
cgroup1 code with s/memsw/swap/, since cgroup1 tracks memory+swap while
cgroup2 tracks swap separately. I have added the new lines to the end of
"memory.stat" to avoid possible compatibility problems with existing code
parsing that file.
Jan Kratochvil
Signed-off-by: Jan Kratochvil (Azul) <jkratochvil@azul.com>
mm/memcontrol.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 46d8d0211..2631dd810 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1636,6 +1636,8 @@ static inline unsigned long memcg_page_state_local_output(
 static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 {
 	int i;
+	unsigned long memory, swap;
+	struct mem_cgroup *mi;
 
 	/*
 	 * Provide statistics on the state of the memory subsystem as
@@ -1682,6 +1684,17 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 			       memcg_events(memcg, memcg_vm_event_stat[i]));
 	}
 
+	/* Hierarchical information */
+	memory = swap = PAGE_COUNTER_MAX;
+	for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) {
+		memory = min(memory, READ_ONCE(mi->memory.max));
+		swap = min(swap, READ_ONCE(mi->swap.max));
+	}
+	seq_buf_printf(s, "hierarchical_memory_limit %llu\n",
+		       (u64)memory * PAGE_SIZE);
+	seq_buf_printf(s, "hierarchical_swap_limit %llu\n",
+		       (u64)swap * PAGE_SIZE);
+
 	/* The above should easily fit into one page */
 	WARN_ON_ONCE(seq_buf_has_overflowed(s));
 }
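
On the consuming side, a userland reader of the two lines added by the diff above might look roughly like this sketch. The helper name stat_lookup is hypothetical, and the caller is assumed to have already read the whole "memory.stat" file into a NUL-terminated buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "key value" in memory.stat-style text (one pair per line).
 * Returns true and stores the value if the key is present.
 */
static bool stat_lookup(const char *text, const char *key, uint64_t *value)
{
	size_t key_len;
	const char *line;

	assert(text && key && value);
	key_len = strlen(key);
	for (line = text; line && *line; ) {
		const char *nl;

		/* Match the whole key, followed by the separating space. */
		if (strncmp(line, key, key_len) == 0 && line[key_len] == ' ') {
			*value = strtoull(line + key_len + 1, NULL, 10);
			return true;
		}
		nl = strchr(line, '\n');
		line = nl ? nl + 1 : NULL;
	}
	return false;
}
```

Looking up "hierarchical_memory_limit" and "hierarchical_swap_limit" this way takes a single read of one file, regardless of how deeply the cgroup is nested. A false return lets the caller fall back on kernels without the patch.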

Hello.

Something like this would come in quite handy.

On Mon, Feb 12, 2024 at 12:10:38PM +0800, "Jan Kratochvil (Azul)" <jkratochvil@azul.com> wrote:
> which are useful for userland to easily and performance-wise find out the
> effective cgroup limits being applied.

And it is the only way to figure them out from inside a cgroupns.

> But for cgroup2 it has been missing so far, this is just a copy-paste of the
> cgroup1 code while changing s/memsw/swap/ as that is what cgroup1 vs. cgroup2
> tracks. I have added it to the end of "memory.stat" to prevent possible
> compatibility problems with existing code parsing that file.

I was thinking of memory.max.effective (and others).

- no need to (possibly) flush stats when reading memory.stat
- can be generalized also for pids controller (and other "limiting" controllers)
- analogous to precedent of cpuset.cpus.effective

Whereas, using v1 approach in v2:
- memory.stat mixes true stats and limits,
- memory.stat is hierarchical by default, no need for the prefix.

What do you think of the separate .effective file(s)?

Thanks
Michal

On 2/12/24 10:00, Michal Koutný wrote:
> Hello.
>
> Something like this would come quite handy.
>
> On Mon, Feb 12, 2024 at 12:10:38PM +0800, "Jan Kratochvil (Azul)" <jkratochvil@azul.com> wrote:
>> which are useful for userland to easily and performance-wise find out the
>> effective cgroup limits being applied.
> And the only way to figure out inside cgroupns.
>
>> But for cgroup2 it has been missing so far, this is just a copy-paste of the
>> cgroup1 code while changing s/memsw/swap/ as that is what cgroup1 vs. cgroup2
>> tracks. I have added it to the end of "memory.stat" to prevent possible
>> compatibility problems with existing code parsing that file.
> I was thinking of memory.max.effective (and others).
>
> - no need to (possibly flush) stats when reading memory.stat
> - can be generalized also for pids controller (and other "limiting" controllers)
> - analogous to precedent of cpuset.cpus.effective
>
> Whereas, using v1 approach in v2:
> - memory.stat mixes true stats and limits,
> - memory.stat is hierarchical by default, no need for the prefix.
>
> What do you think of the separate .effective file(s)?

This is certainly a good alternative.

Cheers,
Longman