[RFC PATCH] memcg: adjust memcg used to charge for new simple_xattrs objects

Vasily Averin posted 1 patch 3 years, 7 months ago
fs/kernfs/inode.c | 31 ++++++++++++++++++++++++-------
mm/shmem.c        | 10 +++++++++-
2 files changed, 33 insertions(+), 8 deletions(-)
[RFC PATCH] memcg: adjust memcg used to charge for new simple_xattrs objects
Posted by Vasily Averin 3 years, 7 months ago
The patch was announced here:
https://lore.kernel.org/all/62188f37-f816-08e9-cdd5-8df23131746d@openvz.org/
"3) adjust active memcg for simple_xattr accounting
  sysfs and tmpfs are in-memory file system, 
  for extended attributes they uses simple_xattr infrastructure.
  The patch forces sys_set[f]xattr calls to account xattr object
  to predictable memcg: for kernfs memcg will be taken from kernfs node,
  for shmem -- from shmem_info.
  Like 1) case, this allows to understand which memcg should be used
  for object accounting and simplify debugging of possible troubles."

It was compiled but was not tested yet.
---
sys_set[f]xattr uses simple_xattr infrastructure to create a new
extended attribute for in-memory file systems like sysfs and tmpfs.
Number and size of allocated objects are controlled by user space,
they are always living in memory and its lifetime is indefinitely long.
Therefore this memory should be properly accounted.

By default new memory is accounted to memcg of creator process.
As a result, neighboring xattrs of the same inode can be charged to
different memcgs. This looks unexpected and makes hard the
investigation of the memcg accounting issues.

This patch adjust memcg used for such allocations. For kernfs
it gives memcg from kernfs node, for shmem -- from shmem_info.
This allows to cahrge all inode-sepcific objects to the same
memory cgroup.

Signed-off-by: Vasily Averin <vvs@openvz.org>
---
 fs/kernfs/inode.c | 31 ++++++++++++++++++++++++-------
 mm/shmem.c        | 10 +++++++++-
 2 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 3d783d80f5da..975532b32e7c 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -14,6 +14,7 @@
 #include <linux/slab.h>
 #include <linux/xattr.h>
 #include <linux/security.h>
+#include <linux/memcontrol.h>
 
 #include "kernfs-internal.h"
 
@@ -335,8 +336,16 @@ static int kernfs_vfs_xattr_set(const struct xattr_handler *handler,
 {
 	const char *name = xattr_full_name(handler, suffix);
 	struct kernfs_node *kn = inode->i_private;
+	struct mem_cgroup *old, *memcg;
+	int ret;
+
+	memcg = mem_cgroup_or_root(get_mem_cgroup_from_obj(kn));
+	old = set_active_memcg(memcg);
 
-	return kernfs_xattr_set(kn, name, value, size, flags);
+	ret = kernfs_xattr_set(kn, name, value, size, flags);
+	set_active_memcg(old);
+	mem_cgroup_put(memcg);
+	return ret;
 }
 
 static int kernfs_vfs_user_xattr_add(struct kernfs_node *kn,
@@ -403,21 +412,29 @@ static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
 	const char *full_name = xattr_full_name(handler, suffix);
 	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_iattrs *attrs;
+	struct mem_cgroup *old, *memcg;
+	int ret = -ENOMEM;
 
 	if (!(kernfs_root(kn)->flags & KERNFS_ROOT_SUPPORT_USER_XATTR))
 		return -EOPNOTSUPP;
 
+	memcg = mem_cgroup_or_root(get_mem_cgroup_from_obj(kn));
+	old = set_active_memcg(memcg);
+
 	attrs = kernfs_iattrs(kn);
 	if (!attrs)
-		return -ENOMEM;
+		goto out;
 
 	if (value)
-		return kernfs_vfs_user_xattr_add(kn, full_name, &attrs->xattrs,
-						 value, size, flags);
-	else
-		return kernfs_vfs_user_xattr_rm(kn, full_name, &attrs->xattrs,
+		ret = kernfs_vfs_user_xattr_add(kn, full_name, &attrs->xattrs,
 						value, size, flags);
-
+	else
+		ret = kernfs_vfs_user_xattr_rm(kn, full_name, &attrs->xattrs,
+					       value, size, flags);
+out:
+	set_active_memcg(old);
+	mem_cgroup_put(memcg);
+	return ret;
 }
 
 static const struct xattr_handler kernfs_trusted_xattr_handler = {
diff --git a/mm/shmem.c b/mm/shmem.c
index 5783f11351bb..291c15774340 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3255,9 +3255,17 @@ static int shmem_xattr_handler_set(const struct xattr_handler *handler,
 				   size_t size, int flags)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct mem_cgroup *old, *memcg;
+	int ret;
+
+	memcg = mem_cgroup_or_root(get_mem_cgroup_from_obj(info));
+	old = set_active_memcg(memcg);
 
 	name = xattr_full_name(handler, name);
-	return simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
+	ret = simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
+	set_active_memcg(old);
+	mem_cgroup_put(memcg);
+	return ret;
 }
 
 static const struct xattr_handler shmem_security_xattr_handler = {
-- 
2.34.1
Re: [RFC PATCH] memcg: adjust memcg used to charge for new simple_xattrs objects
Posted by Michal Koutný 3 years, 7 months ago
On Thu, Aug 18, 2022 at 12:10:45PM +0300, Vasily Averin <vvs@openvz.org> wrote:
> sys_set[f]xattr uses simple_xattr infrastructure to create a new
> extended attribute for in-memory file systems like sysfs and tmpfs.
> Number and size of allocated objects are controlled by user space,
> they are always living in memory and its lifetime is indefinitely long.
> Therefore this memory should be properly accounted.
> 
> By default new memory is accounted to memcg of creator process.

despite objects aren't bound to this process lifetime.

(I think this was the main argument for this approach and should be in
the commit message then.)

> As a result, neighboring xattrs of the same inode can be charged to
> different memcgs. This looks unexpected and makes hard the
> investigation of the memcg accounting issues.
> 
> This patch adjust memcg used for such allocations. For kernfs
> it gives memcg from kernfs node, for shmem -- from shmem_info.
> This allows to cahrge all inode-sepcific objects to the same
> memory cgroup.

IIUC you intend to inherit association from shmem_inode_info (i.e.
whoever created the inode). shmem_inode_cachep has SLAB_ACCOUNT, so it's valid.

Thanks,
Michal
Re: [RFC PATCH] memcg: adjust memcg used to charge for new simple_xattrs objects
Posted by Vasily Averin 3 years, 7 months ago
On 8/18/22 15:27, Michal Koutný wrote:
> On Thu, Aug 18, 2022 at 12:10:45PM +0300, Vasily Averin <vvs@openvz.org> wrote:
>> sys_set[f]xattr uses simple_xattr infrastructure to create a new
>> extended attribute for in-memory file systems like sysfs and tmpfs.
>> Number and size of allocated objects are controlled by user space,
>> they are always living in memory and its lifetime is indefinitely long.
>> Therefore this memory should be properly accounted.
>>
>> By default new memory is accounted to memcg of creator process.
> 
> despite objects aren't bound to this process lifetime.
> 
> (I think this was the main argument for this approach and should be in
> the commit message then.)

Thank you for the remark, I'll update patch description in the next version

>> As a result, neighboring xattrs of the same inode can be charged to
>> different memcgs. This looks unexpected and makes hard the
>> investigation of the memcg accounting issues.
>>
>> This patch adjust memcg used for such allocations. For kernfs
>> it gives memcg from kernfs node, for shmem -- from shmem_info.
>> This allows to cahrge all inode-sepcific objects to the same
>> memory cgroup.
> 
> IIUC you intend to inherit association from shmem_inode_info (i.e.
> whoever created the inode). shmem_inode_cachep has SLAB_ACCOUNT, so it's valid.

Yes, you are right, I'll clarify this in next patch version.

Thank you,
	Vasily Averin