Slab caches marked with SLAB_ACCOUNT force accounting for every
allocation from the cache, even if the __GFP_ACCOUNT flag is not passed.
Unfortunately, at the moment this flag is not visible in ftrace output,
which makes it difficult to analyze the accounted allocations.

This patch adds the __GFP_ACCOUNT flag to the ftrace output for
allocations from slab caches marked with SLAB_ACCOUNT.
Signed-off-by: Vasily Averin <vvs@openvz.org>
---
mm/slab.c | 3 +++
mm/slub.c | 3 +++
2 files changed, 6 insertions(+)
diff --git a/mm/slab.c b/mm/slab.c
index 0edb474edef1..4c3da8dfcbdb 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
 {
 	void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
 
+	if (cachep->flags & SLAB_ACCOUNT)
+		flags |= __GFP_ACCOUNT;
+
 	trace_kmem_cache_alloc(_RET_IP_, ret,
 			       cachep->object_size, cachep->size, flags);
diff --git a/mm/slub.c b/mm/slub.c
index ed5c2c03a47a..670bbfef9e49 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3231,6 +3231,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
 {
 	void *ret = slab_alloc(s, lru, gfpflags, _RET_IP_, s->object_size);
 
+	if (s->flags & SLAB_ACCOUNT)
+		gfpflags |= __GFP_ACCOUNT;
+
 	trace_kmem_cache_alloc(_RET_IP_, ret, s->object_size,
 			       s->size, gfpflags);
--
2.25.1
On Mon, May 16, 2022 at 09:53:32PM +0300, Vasily Averin wrote:
> Slab caches marked with SLAB_ACCOUNT force accounting for every
> allocation from this cache even if __GFP_ACCOUNT flag is not passed.
> Unfortunately, at the moment this flag is not visible in ftrace output,
> and this makes it difficult to analyze the accounted allocations.
>
> This patch adds the __GFP_ACCOUNT flag for allocations from slab caches
> marked with SLAB_ACCOUNT to the ftrace output.
>
> Signed-off-by: Vasily Averin <vvs@openvz.org>
> ---
> mm/slab.c | 3 +++
> mm/slub.c | 3 +++
> 2 files changed, 6 insertions(+)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 0edb474edef1..4c3da8dfcbdb 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
> {
> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
>
> + if (cachep->flags & SLAB_ACCOUNT)
> + flags |= __GFP_ACCOUNT;
> +
> trace_kmem_cache_alloc(_RET_IP_, ret,
> cachep->object_size, cachep->size, flags);
>
> diff --git a/mm/slub.c b/mm/slub.c
> index ed5c2c03a47a..670bbfef9e49 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3231,6 +3231,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
> {
> void *ret = slab_alloc(s, lru, gfpflags, _RET_IP_, s->object_size);
>
> + if (s->flags & SLAB_ACCOUNT)
> + gfpflags |= __GFP_ACCOUNT;
> +
> trace_kmem_cache_alloc(_RET_IP_, ret, s->object_size,
> s->size, gfpflags);
>
> --
> 2.25.1
>
>
To me it sounds like it would confuse the memory cgroup, because:
1) For now, objects are charged only in the slab memcg hooks.
2) This patch makes the buddy allocator charge the page too.
--
Thanks,
Hyeonggon
On Mon, May 16, 2022 at 11:53 AM Vasily Averin <vvs@openvz.org> wrote:
>
> Slab caches marked with SLAB_ACCOUNT force accounting for every
> allocation from this cache even if __GFP_ACCOUNT flag is not passed.
> Unfortunately, at the moment this flag is not visible in ftrace output,
> and this makes it difficult to analyze the accounted allocations.
>
> This patch adds the __GFP_ACCOUNT flag for allocations from slab caches
> marked with SLAB_ACCOUNT to the ftrace output.
>
> Signed-off-by: Vasily Averin <vvs@openvz.org>
> ---
> mm/slab.c | 3 +++
> mm/slub.c | 3 +++
> 2 files changed, 6 insertions(+)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 0edb474edef1..4c3da8dfcbdb 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
What about kmem_cache_alloc_node()?
> {
> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
>
> + if (cachep->flags & SLAB_ACCOUNT)
Should this 'if' be unlikely() or should we trace cachep->flags
explicitly to avoid this branch altogether?
> + flags |= __GFP_ACCOUNT;
> +
> trace_kmem_cache_alloc(_RET_IP_, ret,
> cachep->object_size, cachep->size, flags);
>
> diff --git a/mm/slub.c b/mm/slub.c
> index ed5c2c03a47a..670bbfef9e49 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3231,6 +3231,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
> {
> void *ret = slab_alloc(s, lru, gfpflags, _RET_IP_, s->object_size);
>
> + if (s->flags & SLAB_ACCOUNT)
> + gfpflags |= __GFP_ACCOUNT;
> +
> trace_kmem_cache_alloc(_RET_IP_, ret, s->object_size,
> s->size, gfpflags);
>
> --
> 2.25.1
>
On 5/16/22 22:10, Shakeel Butt wrote:
> On Mon, May 16, 2022 at 11:53 AM Vasily Averin <vvs@openvz.org> wrote:
>> +++ b/mm/slab.c
>> @@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
>
> What about kmem_cache_alloc_node()?
Thank you for the hint; I was inaccurate and missed the *_node variants.
>> {
>> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
>>
>> + if (cachep->flags & SLAB_ACCOUNT)
>
> Should this 'if' be unlikely() or should we trace cachep->flags
> explicitly to avoid this branch altogether?
In general, output of cachep->flags can be useful, but at the moment
I am only interested in the SLAB_ACCOUNT flag, and in any case I would
prefer to translate it to __GFP_ACCOUNT.
So I'm going to use unlikely() in the v2 patch.
>> + flags |= __GFP_ACCOUNT;
>> +
>> trace_kmem_cache_alloc(_RET_IP_, ret,
>> cachep->object_size, cachep->size, flags);
>>
Thank you,
Vasily Averin
On Tue, May 17, 2022 at 06:32:28AM +0300, Vasily Averin wrote:
> > Should this 'if' be unlikely() or should we trace cachep->flags
> > explicitly to avoid this branch altogether?
>
> In general output of cachep->flags can be useful, but at the moment
> I am only interested in the SLAB_ACCOUNT flag and in any case I would
> prefer to translate it to __GFP_ACCOUNT.
> So I'm going to use unlikely() in v2 patch version.

It's still going to be an extra test, and networking is extremely
sensitive to extra instructions if tracing is compiled out. Passing in
'cachep' to the trace call looked like the best suggestion to me.
On 5/16/22 21:10, Shakeel Butt wrote:
> On Mon, May 16, 2022 at 11:53 AM Vasily Averin <vvs@openvz.org> wrote:
>>
>> Slab caches marked with SLAB_ACCOUNT force accounting for every
>> allocation from this cache even if __GFP_ACCOUNT flag is not passed.
>> Unfortunately, at the moment this flag is not visible in ftrace output,
>> and this makes it difficult to analyze the accounted allocations.
>>
>> This patch adds the __GFP_ACCOUNT flag for allocations from slab caches
>> marked with SLAB_ACCOUNT to the ftrace output.
>>
>> Signed-off-by: Vasily Averin <vvs@openvz.org>
>> ---
>> mm/slab.c | 3 +++
>> mm/slub.c | 3 +++
>> 2 files changed, 6 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 0edb474edef1..4c3da8dfcbdb 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
>
> What about kmem_cache_alloc_node()?
>
>> {
>> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
>>
>> + if (cachep->flags & SLAB_ACCOUNT)
>
> Should this 'if' be unlikely() or should we trace cachep->flags
> explicitly to avoid this branch altogether?
Hm, I think ideally the tracepoint would accept cachep instead of the
current cachep->*size parameters and do the necessary extraction and
modification in its fast_assign.
>
>> + flags |= __GFP_ACCOUNT;
>> +
>> trace_kmem_cache_alloc(_RET_IP_, ret,
>> cachep->object_size, cachep->size, flags);
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index ed5c2c03a47a..670bbfef9e49 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3231,6 +3231,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
>> {
>> void *ret = slab_alloc(s, lru, gfpflags, _RET_IP_, s->object_size);
>>
>> + if (s->flags & SLAB_ACCOUNT)
>> + gfpflags |= __GFP_ACCOUNT;
>> +
>> trace_kmem_cache_alloc(_RET_IP_, ret, s->object_size,
>> s->size, gfpflags);
>>
>> --
>> 2.25.1
>>
On Mon, May 16, 2022 at 11:41:27PM +0200, Vlastimil Babka wrote:
> On 5/16/22 21:10, Shakeel Butt wrote:
> > On Mon, May 16, 2022 at 11:53 AM Vasily Averin <vvs@openvz.org> wrote:
> >>
> >> Slab caches marked with SLAB_ACCOUNT force accounting for every
> >> allocation from this cache even if __GFP_ACCOUNT flag is not passed.
> >> Unfortunately, at the moment this flag is not visible in ftrace output,
> >> and this makes it difficult to analyze the accounted allocations.
> >>
> >> This patch adds the __GFP_ACCOUNT flag for allocations from slab caches
> >> marked with SLAB_ACCOUNT to the ftrace output.
> >>
> >> Signed-off-by: Vasily Averin <vvs@openvz.org>
> >> ---
> >> mm/slab.c | 3 +++
> >> mm/slub.c | 3 +++
> >> 2 files changed, 6 insertions(+)
> >>
> >> diff --git a/mm/slab.c b/mm/slab.c
> >> index 0edb474edef1..4c3da8dfcbdb 100644
> >> --- a/mm/slab.c
> >> +++ b/mm/slab.c
> >> @@ -3492,6 +3492,9 @@ void *__kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
> >
> > What about kmem_cache_alloc_node()?
> >
> >> {
> >> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
> >>
> >> + if (cachep->flags & SLAB_ACCOUNT)
> >
> > Should this 'if' be unlikely() or should we trace cachep->flags
> > explicitly to avoid this branch altogether?
>
> Hm I think ideally the tracepoint accepts cachep instead of current
> cachep->*size parameters and does the necessary extraction and
> modification in its fast_assign.
+1 for fast_assign
Changing flags just for tracing looks a bit excessive.
On 5/17/22 01:08, Roman Gushchin wrote:
> On Mon, May 16, 2022 at 11:41:27PM +0200, Vlastimil Babka wrote:
>> On 5/16/22 21:10, Shakeel Butt wrote:
>>> On Mon, May 16, 2022 at 11:53 AM Vasily Averin <vvs@openvz.org> wrote:
>>>> {
>>>> void *ret = slab_alloc(cachep, lru, flags, cachep->object_size, _RET_IP_);
>>>>
>>>> + if (cachep->flags & SLAB_ACCOUNT)
>>>
>>> Should this 'if' be unlikely() or should we trace cachep->flags
>>> explicitly to avoid this branch altogether?
>>
>> Hm I think ideally the tracepoint accepts cachep instead of current
>> cachep->*size parameters and does the necessary extraction and
>> modification in its fast_assign.
>
> +1 for fast_assign
>
> Changing flags just for tracing looks a bit excessive.
At the moment kmem_cache_alloc and kmem_alloc use the same tracing template.
Ok, I'll try to redesign this.