[v1] SLUB debugfs improvements based on stackdepot

[PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Vlastimil Babka 4 years, 3 months ago

Hi,

this series combines and revives patches from Oliver's last year
bachelor thesis (where I was the advisor) that make SLUB's debugfs
files alloc_traces and free_traces more useful.
The resubmission was blocked on stackdepot changes that are now merged,
as explained in patch 2.

Patch 1 is a new preparatory cleanup.

Patch 2 originally submitted here [1], was merged to mainline but
reverted for stackdepot related issues as explained in the patch.

Patches 3-5 originally submitted as RFC here [2]. In this submission I
have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
be considered too intrusive so I will postpone it for later. The docs
patch is adjusted accordingly.

Also available in git, based on v5.17-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1

I'd like to ask for some review before I add this to the slab tree.

[1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
[2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/

Oliver Glitta (4):
  mm/slub: use stackdepot to save stack trace in objects
  mm/slub: aggregate and print stack traces in debugfs files
  mm/slub: sort debugfs output by frequency of stack traces
  slab, documentation: add description of debugfs files for SLUB caches

Vlastimil Babka (1):
  mm/slub: move struct track init out of set_track()

 Documentation/vm/slub.rst |  61 +++++++++++++++
 init/Kconfig              |   1 +
 mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
 3 files changed, 162 insertions(+), 52 deletions(-)

-- 
2.35.1

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Hyeonggon Yoo 4 years, 3 months ago

On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> Hi,
> 
> this series combines and revives patches from Oliver's last year
> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> files alloc_traces and free_traces more useful.
> The resubmission was blocked on stackdepot changes that are now merged,
> as explained in patch 2.
> 

Hello. I just started review/testing this series.

it crashed on my system (arm64)

I ran with boot parameter slub_debug=U, and without KASAN.
So CONFIG_STACKDEPOT_ALWAYS_INIT=n.

void * __init memblock_alloc_try_nid(
                        phys_addr_t size, phys_addr_t align,
                        phys_addr_t min_addr, phys_addr_t max_addr,
                        int nid)
{
        void *ptr;

        memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
                     __func__, (u64)size, (u64)align, nid, &min_addr,
                     &max_addr, (void *)_RET_IP_);
        ptr = memblock_alloc_internal(size, align,
                                           min_addr, max_addr, nid, false);
        if (ptr)
                memset(ptr, 0, size); <--- Crash Here

        return ptr;
}

It crashed during create_boot_cache() -> stack_depot_init() ->
memblock_alloc().

I think That's because, in kmem_cache_init(), both slab and memblock is not
available. (AFAIU memblock is not available after mem_init() because of
memblock_free_all(), right?)

Thanks!

/*
 * Set up kernel memory allocators
 */
static void __init mm_init(void)
{
        /*
         * page_ext requires contiguous pages,
         * bigger than MAX_ORDER unless SPARSEMEM.
         */
        page_ext_init_flatmem();
        init_mem_debugging_and_hardening();
        kfence_alloc_pool();
        report_meminit();
        stack_depot_early_init();
        mem_init();
        mem_init_print_info();
        kmem_cache_init();
        /*
         * page_owner must be initialized after buddy is ready, and also after
         * slab is ready so that stack_depot_init() works properly
         */)

> Patch 1 is a new preparatory cleanup.
> 
> Patch 2 originally submitted here [1], was merged to mainline but
> reverted for stackdepot related issues as explained in the patch.
> 
> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> be considered too intrusive so I will postpone it for later. The docs
> patch is adjusted accordingly.
> 
> Also available in git, based on v5.17-rc1:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> 
> I'd like to ask for some review before I add this to the slab tree.
> 
> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
> 
> Oliver Glitta (4):
>   mm/slub: use stackdepot to save stack trace in objects
>   mm/slub: aggregate and print stack traces in debugfs files
>   mm/slub: sort debugfs output by frequency of stack traces
>   slab, documentation: add description of debugfs files for SLUB caches
> 
> Vlastimil Babka (1):
>   mm/slub: move struct track init out of set_track()
> 
>  Documentation/vm/slub.rst |  61 +++++++++++++++
>  init/Kconfig              |   1 +
>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
>  3 files changed, 162 insertions(+), 52 deletions(-)
> 
> -- 
> 2.35.1
> 
> 

-- 
Thank you, You are awesome!
Hyeonggon :-)

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Vlastimil Babka 4 years, 3 months ago

On 2/26/22 08:19, Hyeonggon Yoo wrote:
> On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> Hi,
>> 
>> this series combines and revives patches from Oliver's last year
>> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> files alloc_traces and free_traces more useful.
>> The resubmission was blocked on stackdepot changes that are now merged,
>> as explained in patch 2.
>> 
> 
> Hello. I just started review/testing this series.
> 
> it crashed on my system (arm64)

Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
from memblock. arm64 must have memblock freeing happen earlier or something.
(CCing memblock experts)

> I ran with boot parameter slub_debug=U, and without KASAN.
> So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> 
> void * __init memblock_alloc_try_nid(
>                         phys_addr_t size, phys_addr_t align,
>                         phys_addr_t min_addr, phys_addr_t max_addr,
>                         int nid)
> {
>         void *ptr;
> 
>         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>                      __func__, (u64)size, (u64)align, nid, &min_addr,
>                      &max_addr, (void *)_RET_IP_);
>         ptr = memblock_alloc_internal(size, align,
>                                            min_addr, max_addr, nid, false);
>         if (ptr)
>                 memset(ptr, 0, size); <--- Crash Here
> 
>         return ptr;
> }
> 
> It crashed during create_boot_cache() -> stack_depot_init() ->
> memblock_alloc().
> 
> I think That's because, in kmem_cache_init(), both slab and memblock is not
> available. (AFAIU memblock is not available after mem_init() because of
> memblock_free_all(), right?)

Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
But then, I would expect stack_depot_init() to detect that memblock_alloc()
returns NULL, we print ""Stack Depot hash table allocation failed,
disabling" and disable it. Instead it seems memblock_alloc() returns
something that's already potentially used by somebody else? Sounds like a bug?

> Thanks!
> 
> /*
>  * Set up kernel memory allocators
>  */
> static void __init mm_init(void)
> {
>         /*
>          * page_ext requires contiguous pages,
>          * bigger than MAX_ORDER unless SPARSEMEM.
>          */
>         page_ext_init_flatmem();
>         init_mem_debugging_and_hardening();
>         kfence_alloc_pool();
>         report_meminit();
>         stack_depot_early_init();
>         mem_init();
>         mem_init_print_info();
>         kmem_cache_init();
>         /*
>          * page_owner must be initialized after buddy is ready, and also after
>          * slab is ready so that stack_depot_init() works properly
>          */)
> 
>> Patch 1 is a new preparatory cleanup.
>> 
>> Patch 2 originally submitted here [1], was merged to mainline but
>> reverted for stackdepot related issues as explained in the patch.
>> 
>> Patches 3-5 originally submitted as RFC here [2]. In this submission I
>> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
>> be considered too intrusive so I will postpone it for later. The docs
>> patch is adjusted accordingly.
>> 
>> Also available in git, based on v5.17-rc1:
>> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>> 
>> I'd like to ask for some review before I add this to the slab tree.
>> 
>> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
>> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
>> 
>> Oliver Glitta (4):
>>   mm/slub: use stackdepot to save stack trace in objects
>>   mm/slub: aggregate and print stack traces in debugfs files
>>   mm/slub: sort debugfs output by frequency of stack traces
>>   slab, documentation: add description of debugfs files for SLUB caches
>> 
>> Vlastimil Babka (1):
>>   mm/slub: move struct track init out of set_track()
>> 
>>  Documentation/vm/slub.rst |  61 +++++++++++++++
>>  init/Kconfig              |   1 +
>>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
>>  3 files changed, 162 insertions(+), 52 deletions(-)
>> 
>> -- 
>> 2.35.1
>> 
>> 
>

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> Hi,
> >> 
> >> this series combines and revives patches from Oliver's last year
> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> files alloc_traces and free_traces more useful.
> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> as explained in patch 2.
> >> 
> > 
> > Hello. I just started review/testing this series.
> > 
> > it crashed on my system (arm64)
> 
> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> from memblock. arm64 must have memblock freeing happen earlier or something.
> (CCing memblock experts)
> 
> > I ran with boot parameter slub_debug=U, and without KASAN.
> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > 
> > void * __init memblock_alloc_try_nid(
> >                         phys_addr_t size, phys_addr_t align,
> >                         phys_addr_t min_addr, phys_addr_t max_addr,
> >                         int nid)
> > {
> >         void *ptr;
> > 
> >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> >                      &max_addr, (void *)_RET_IP_);
> >         ptr = memblock_alloc_internal(size, align,
> >                                            min_addr, max_addr, nid, false);
> >         if (ptr)
> >                 memset(ptr, 0, size); <--- Crash Here
> > 
> >         return ptr;
> > }
> > 
> > It crashed during create_boot_cache() -> stack_depot_init() ->
> > memblock_alloc().
> > 
> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > available. (AFAIU memblock is not available after mem_init() because of
> > memblock_free_all(), right?)
> 
> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> returns NULL, we print ""Stack Depot hash table allocation failed,
> disabling" and disable it. Instead it seems memblock_alloc() returns
> something that's already potentially used by somebody else? Sounds like a bug?

If stack_depot_init() is called from kmem_cache_init(), there will be a
confusion what allocator should be used because we use slab_is_available()
to stop using memblock and start using kmalloc() instead in both
stack_depot_init() and in memblock.

Hyeonggon, did you run your tests with panic on warn at any chance?
 
> > Thanks!
> > 
> > /*
> >  * Set up kernel memory allocators
> >  */
> > static void __init mm_init(void)
> > {
> >         /*
> >          * page_ext requires contiguous pages,
> >          * bigger than MAX_ORDER unless SPARSEMEM.
> >          */
> >         page_ext_init_flatmem();
> >         init_mem_debugging_and_hardening();
> >         kfence_alloc_pool();
> >         report_meminit();
> >         stack_depot_early_init();
> >         mem_init();
> >         mem_init_print_info();
> >         kmem_cache_init();
> >         /*
> >          * page_owner must be initialized after buddy is ready, and also after
> >          * slab is ready so that stack_depot_init() works properly
> >          */)
> > 
> >> Patch 1 is a new preparatory cleanup.
> >> 
> >> Patch 2 originally submitted here [1], was merged to mainline but
> >> reverted for stackdepot related issues as explained in the patch.
> >> 
> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> >> be considered too intrusive so I will postpone it for later. The docs
> >> patch is adjusted accordingly.
> >> 
> >> Also available in git, based on v5.17-rc1:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> >> 
> >> I'd like to ask for some review before I add this to the slab tree.
> >> 
> >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
> >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
> >> 
> >> Oliver Glitta (4):
> >>   mm/slub: use stackdepot to save stack trace in objects
> >>   mm/slub: aggregate and print stack traces in debugfs files
> >>   mm/slub: sort debugfs output by frequency of stack traces
> >>   slab, documentation: add description of debugfs files for SLUB caches
> >> 
> >> Vlastimil Babka (1):
> >>   mm/slub: move struct track init out of set_track()
> >> 
> >>  Documentation/vm/slub.rst |  61 +++++++++++++++
> >>  init/Kconfig              |   1 +
> >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
> >>  3 files changed, 162 insertions(+), 52 deletions(-)
> >> 
> >> -- 
> >> 2.35.1
> >> 
> >> 
> > 
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Vlastimil Babka 4 years, 3 months ago

On 2/28/22 21:01, Mike Rapoport wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> >> Hi,
>> >> 
>> >> this series combines and revives patches from Oliver's last year
>> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> >> files alloc_traces and free_traces more useful.
>> >> The resubmission was blocked on stackdepot changes that are now merged,
>> >> as explained in patch 2.
>> >> 
>> > 
>> > Hello. I just started review/testing this series.
>> > 
>> > it crashed on my system (arm64)
>> 
>> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> from memblock. arm64 must have memblock freeing happen earlier or something.
>> (CCing memblock experts)
>> 
>> > I ran with boot parameter slub_debug=U, and without KASAN.
>> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> > 
>> > void * __init memblock_alloc_try_nid(
>> >                         phys_addr_t size, phys_addr_t align,
>> >                         phys_addr_t min_addr, phys_addr_t max_addr,
>> >                         int nid)
>> > {
>> >         void *ptr;
>> > 
>> >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> >                      __func__, (u64)size, (u64)align, nid, &min_addr,
>> >                      &max_addr, (void *)_RET_IP_);
>> >         ptr = memblock_alloc_internal(size, align,
>> >                                            min_addr, max_addr, nid, false);
>> >         if (ptr)
>> >                 memset(ptr, 0, size); <--- Crash Here
>> > 
>> >         return ptr;
>> > }
>> > 
>> > It crashed during create_boot_cache() -> stack_depot_init() ->
>> > memblock_alloc().
>> > 
>> > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> > available. (AFAIU memblock is not available after mem_init() because of
>> > memblock_free_all(), right?)
>> 
>> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> returns NULL, we print ""Stack Depot hash table allocation failed,
>> disabling" and disable it. Instead it seems memblock_alloc() returns
>> something that's already potentially used by somebody else? Sounds like a bug?
> 
> If stack_depot_init() is called from kmem_cache_init(), there will be a
> confusion what allocator should be used because we use slab_is_available()
> to stop using memblock and start using kmalloc() instead in both
> stack_depot_init() and in memblock.

I did check that stack_depot_init() is called from kmem_cache_init()
*before* we make slab_is_available() true, hence assumed that memblock would
be still available at that point and expected no confusion. But seems if
memblock is already beyond memblock_free_all() then it being still available
is just an illusion?

> Hyeonggon, did you run your tests with panic on warn at any chance?
>  
>> > Thanks!
>> > 
>> > /*
>> >  * Set up kernel memory allocators
>> >  */
>> > static void __init mm_init(void)
>> > {
>> >         /*
>> >          * page_ext requires contiguous pages,
>> >          * bigger than MAX_ORDER unless SPARSEMEM.
>> >          */
>> >         page_ext_init_flatmem();
>> >         init_mem_debugging_and_hardening();
>> >         kfence_alloc_pool();
>> >         report_meminit();
>> >         stack_depot_early_init();
>> >         mem_init();
>> >         mem_init_print_info();
>> >         kmem_cache_init();
>> >         /*
>> >          * page_owner must be initialized after buddy is ready, and also after
>> >          * slab is ready so that stack_depot_init() works properly
>> >          */)
>> > 
>> >> Patch 1 is a new preparatory cleanup.
>> >> 
>> >> Patch 2 originally submitted here [1], was merged to mainline but
>> >> reverted for stackdepot related issues as explained in the patch.
>> >> 
>> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
>> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
>> >> be considered too intrusive so I will postpone it for later. The docs
>> >> patch is adjusted accordingly.
>> >> 
>> >> Also available in git, based on v5.17-rc1:
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>> >> 
>> >> I'd like to ask for some review before I add this to the slab tree.
>> >> 
>> >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
>> >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
>> >> 
>> >> Oliver Glitta (4):
>> >>   mm/slub: use stackdepot to save stack trace in objects
>> >>   mm/slub: aggregate and print stack traces in debugfs files
>> >>   mm/slub: sort debugfs output by frequency of stack traces
>> >>   slab, documentation: add description of debugfs files for SLUB caches
>> >> 
>> >> Vlastimil Babka (1):
>> >>   mm/slub: move struct track init out of set_track()
>> >> 
>> >>  Documentation/vm/slub.rst |  61 +++++++++++++++
>> >>  init/Kconfig              |   1 +
>> >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
>> >>  3 files changed, 162 insertions(+), 52 deletions(-)
>> >> 
>> >> -- 
>> >> 2.35.1
>> >> 
>> >> 
>> > 
>> 
>

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
> On 2/28/22 21:01, Mike Rapoport wrote:
> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> >> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> >> Hi,
> >> >> 
> >> >> this series combines and revives patches from Oliver's last year
> >> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> >> files alloc_traces and free_traces more useful.
> >> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> >> as explained in patch 2.
> >> >> 
> >> > 
> >> > Hello. I just started review/testing this series.
> >> > 
> >> > it crashed on my system (arm64)
> >> 
> >> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> >> from memblock. arm64 must have memblock freeing happen earlier or something.
> >> (CCing memblock experts)
> >> 
> >> > I ran with boot parameter slub_debug=U, and without KASAN.
> >> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >> > 
> >> > void * __init memblock_alloc_try_nid(
> >> >                         phys_addr_t size, phys_addr_t align,
> >> >                         phys_addr_t min_addr, phys_addr_t max_addr,
> >> >                         int nid)
> >> > {
> >> >         void *ptr;
> >> > 
> >> >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >> >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> >> >                      &max_addr, (void *)_RET_IP_);
> >> >         ptr = memblock_alloc_internal(size, align,
> >> >                                            min_addr, max_addr, nid, false);
> >> >         if (ptr)
> >> >                 memset(ptr, 0, size); <--- Crash Here
> >> > 
> >> >         return ptr;
> >> > }
> >> > 
> >> > It crashed during create_boot_cache() -> stack_depot_init() ->
> >> > memblock_alloc().
> >> > 
> >> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> >> > available. (AFAIU memblock is not available after mem_init() because of
> >> > memblock_free_all(), right?)
> >> 
> >> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> >> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> >> returns NULL, we print ""Stack Depot hash table allocation failed,
> >> disabling" and disable it. Instead it seems memblock_alloc() returns
> >> something that's already potentially used by somebody else? Sounds like a bug?
> > 
> > If stack_depot_init() is called from kmem_cache_init(), there will be a
> > confusion what allocator should be used because we use slab_is_available()
> > to stop using memblock and start using kmalloc() instead in both
> > stack_depot_init() and in memblock.
> 
> I did check that stack_depot_init() is called from kmem_cache_init()
> *before* we make slab_is_available() true, hence assumed that memblock would
> be still available at that point and expected no confusion. But seems if
> memblock is already beyond memblock_free_all() then it being still available
> is just an illusion?

Yeah, it appears it is an illusion :)

I think we have to deal with allocations that happen between
memblock_free_all() and slab_is_available() at the memblock level and then
figure out the where to put stack_depot_init() and how to allocate memory
there.

I believe something like this (untested) patch below addresses the first
issue. As for stack_depot_init() I'm still trying to figure out the
possible call paths, but it seems we can use stack_depot_early_init() for
SLUB debugging case. I'll try to come up with something Really Soon (tm).

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 50ad19662a32..4ea89d44d22a 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -90,6 +90,7 @@ struct memblock_type {
  */
 struct memblock {
 	bool bottom_up;  /* is bottom up direction? */
+	bool mem_freed;
 	phys_addr_t current_limit;
 	struct memblock_type memory;
 	struct memblock_type reserved;
diff --git a/mm/memblock.c b/mm/memblock.c
index b12a364f2766..60196dc4980e 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -120,6 +120,7 @@ struct memblock memblock __initdata_memblock = {
 	.reserved.name		= "reserved",
 
 	.bottom_up		= false,
+	.mem_freed		= false,
 	.current_limit		= MEMBLOCK_ALLOC_ANYWHERE,
 };
 
@@ -1487,6 +1488,13 @@ static void * __init memblock_alloc_internal(
 	if (WARN_ON_ONCE(slab_is_available()))
 		return kzalloc_node(size, GFP_NOWAIT, nid);
 
+	if (memblock.mem_freed) {
+		unsigned int order = get_order(size);
+
+		pr_warn("memblock: allocating from buddy\n");
+		return __alloc_pages_node(nid, order, GFP_KERNEL);
+	}
+
 	if (max_addr > memblock.current_limit)
 		max_addr = memblock.current_limit;
 
@@ -2116,6 +2124,7 @@ void __init memblock_free_all(void)
 
 	pages = free_low_memory_core_early();
 	totalram_pages_add(pages);
+	memblock.mem_freed = true;
 }
 
 #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)
 
> > Hyeonggon, did you run your tests with panic on warn at any chance?
> >  
> >> > Thanks!
> >> > 
> >> > /*
> >> >  * Set up kernel memory allocators
> >> >  */
> >> > static void __init mm_init(void)
> >> > {
> >> >         /*
> >> >          * page_ext requires contiguous pages,
> >> >          * bigger than MAX_ORDER unless SPARSEMEM.
> >> >          */
> >> >         page_ext_init_flatmem();
> >> >         init_mem_debugging_and_hardening();
> >> >         kfence_alloc_pool();
> >> >         report_meminit();
> >> >         stack_depot_early_init();
> >> >         mem_init();
> >> >         mem_init_print_info();
> >> >         kmem_cache_init();
> >> >         /*
> >> >          * page_owner must be initialized after buddy is ready, and also after
> >> >          * slab is ready so that stack_depot_init() works properly
> >> >          */)
> >> > 
> >> >> Patch 1 is a new preparatory cleanup.
> >> >> 
> >> >> Patch 2 originally submitted here [1], was merged to mainline but
> >> >> reverted for stackdepot related issues as explained in the patch.
> >> >> 
> >> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> >> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> >> >> be considered too intrusive so I will postpone it for later. The docs
> >> >> patch is adjusted accordingly.
> >> >> 
> >> >> Also available in git, based on v5.17-rc1:
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> >> >> 
> >> >> I'd like to ask for some review before I add this to the slab tree.
> >> >> 
> >> >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
> >> >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
> >> >> 
> >> >> Oliver Glitta (4):
> >> >>   mm/slub: use stackdepot to save stack trace in objects
> >> >>   mm/slub: aggregate and print stack traces in debugfs files
> >> >>   mm/slub: sort debugfs output by frequency of stack traces
> >> >>   slab, documentation: add description of debugfs files for SLUB caches
> >> >> 
> >> >> Vlastimil Babka (1):
> >> >>   mm/slub: move struct track init out of set_track()
> >> >> 
> >> >>  Documentation/vm/slub.rst |  61 +++++++++++++++
> >> >>  init/Kconfig              |   1 +
> >> >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
> >> >>  3 files changed, 162 insertions(+), 52 deletions(-)
> >> >> 
> >> >> -- 
> >> >> 2.35.1
> >> >> 
> >> >> 
> >> > 
> >> 
> > 
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Vlastimil Babka 4 years, 3 months ago

On 3/1/22 10:21, Mike Rapoport wrote:
> On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
>> On 2/28/22 21:01, Mike Rapoport wrote:
>> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> >> On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> >> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> >> >> Hi,
>> >> >> 
>> >> >> this series combines and revives patches from Oliver's last year
>> >> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> >> >> files alloc_traces and free_traces more useful.
>> >> >> The resubmission was blocked on stackdepot changes that are now merged,
>> >> >> as explained in patch 2.
>> >> >> 
>> >> > 
>> >> > Hello. I just started review/testing this series.
>> >> > 
>> >> > it crashed on my system (arm64)
>> >> 
>> >> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> >> from memblock. arm64 must have memblock freeing happen earlier or something.
>> >> (CCing memblock experts)
>> >> 
>> >> > I ran with boot parameter slub_debug=U, and without KASAN.
>> >> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> >> > 
>> >> > void * __init memblock_alloc_try_nid(
>> >> >                         phys_addr_t size, phys_addr_t align,
>> >> >                         phys_addr_t min_addr, phys_addr_t max_addr,
>> >> >                         int nid)
>> >> > {
>> >> >         void *ptr;
>> >> > 
>> >> >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> >> >                      __func__, (u64)size, (u64)align, nid, &min_addr,
>> >> >                      &max_addr, (void *)_RET_IP_);
>> >> >         ptr = memblock_alloc_internal(size, align,
>> >> >                                            min_addr, max_addr, nid, false);
>> >> >         if (ptr)
>> >> >                 memset(ptr, 0, size); <--- Crash Here
>> >> > 
>> >> >         return ptr;
>> >> > }
>> >> > 
>> >> > It crashed during create_boot_cache() -> stack_depot_init() ->
>> >> > memblock_alloc().
>> >> > 
>> >> > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> >> > available. (AFAIU memblock is not available after mem_init() because of
>> >> > memblock_free_all(), right?)
>> >> 
>> >> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> >> But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> >> returns NULL, we print ""Stack Depot hash table allocation failed,
>> >> disabling" and disable it. Instead it seems memblock_alloc() returns
>> >> something that's already potentially used by somebody else? Sounds like a bug?
>> > 
>> > If stack_depot_init() is called from kmem_cache_init(), there will be a
>> > confusion what allocator should be used because we use slab_is_available()
>> > to stop using memblock and start using kmalloc() instead in both
>> > stack_depot_init() and in memblock.
>> 
>> I did check that stack_depot_init() is called from kmem_cache_init()
>> *before* we make slab_is_available() true, hence assumed that memblock would
>> be still available at that point and expected no confusion. But seems if
>> memblock is already beyond memblock_free_all() then it being still available
>> is just an illusion?
> 
> Yeah, it appears it is an illusion :)
> 
> I think we have to deal with allocations that happen between
> memblock_free_all() and slab_is_available() at the memblock level and then
> figure out the where to put stack_depot_init() and how to allocate memory
> there.
> 
> I believe something like this (untested) patch below addresses the first
> issue. As for stack_depot_init() I'm still trying to figure out the
> possible call paths, but it seems we can use stack_depot_early_init() for
> SLUB debugging case. I'll try to come up with something Really Soon (tm).

Yeah as you already noticed, we are pursuing an approach to decide on
calling stack_depot_early_init(), which should be a good way to solve this
given how special slab is in this case. For memblock I just wanted to point
out that it could be more robust, your patch below seems to be on the right
patch. Maybe it just doesn't have to fallback to buddy, which could be
considered a layering violation, but just return NULL that can be
immediately recognized as an error?

> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 50ad19662a32..4ea89d44d22a 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -90,6 +90,7 @@ struct memblock_type {
>   */
>  struct memblock {
>  	bool bottom_up;  /* is bottom up direction? */
> +	bool mem_freed;
>  	phys_addr_t current_limit;
>  	struct memblock_type memory;
>  	struct memblock_type reserved;
> diff --git a/mm/memblock.c b/mm/memblock.c
> index b12a364f2766..60196dc4980e 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -120,6 +120,7 @@ struct memblock memblock __initdata_memblock = {
>  	.reserved.name		= "reserved",
>  
>  	.bottom_up		= false,
> +	.mem_freed		= false,
>  	.current_limit		= MEMBLOCK_ALLOC_ANYWHERE,
>  };
>  
> @@ -1487,6 +1488,13 @@ static void * __init memblock_alloc_internal(
>  	if (WARN_ON_ONCE(slab_is_available()))
>  		return kzalloc_node(size, GFP_NOWAIT, nid);
>  
> +	if (memblock.mem_freed) {
> +		unsigned int order = get_order(size);
> +
> +		pr_warn("memblock: allocating from buddy\n");
> +		return __alloc_pages_node(nid, order, GFP_KERNEL);
> +	}
> +
>  	if (max_addr > memblock.current_limit)
>  		max_addr = memblock.current_limit;
>  
> @@ -2116,6 +2124,7 @@ void __init memblock_free_all(void)
>  
>  	pages = free_low_memory_core_early();
>  	totalram_pages_add(pages);
> +	memblock.mem_freed = true;
>  }
>  
>  #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)
>  
>> > Hyeonggon, did you run your tests with panic on warn at any chance?
>> >  
>> >> > Thanks!
>> >> > 
>> >> > /*
>> >> >  * Set up kernel memory allocators
>> >> >  */
>> >> > static void __init mm_init(void)
>> >> > {
>> >> >         /*
>> >> >          * page_ext requires contiguous pages,
>> >> >          * bigger than MAX_ORDER unless SPARSEMEM.
>> >> >          */
>> >> >         page_ext_init_flatmem();
>> >> >         init_mem_debugging_and_hardening();
>> >> >         kfence_alloc_pool();
>> >> >         report_meminit();
>> >> >         stack_depot_early_init();
>> >> >         mem_init();
>> >> >         mem_init_print_info();
>> >> >         kmem_cache_init();
>> >> >         /*
>> >> >          * page_owner must be initialized after buddy is ready, and also after
>> >> >          * slab is ready so that stack_depot_init() works properly
>> >> >          */)
>> >> > 
>> >> >> Patch 1 is a new preparatory cleanup.
>> >> >> 
>> >> >> Patch 2 originally submitted here [1], was merged to mainline but
>> >> >> reverted for stackdepot related issues as explained in the patch.
>> >> >> 
>> >> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
>> >> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
>> >> >> be considered too intrusive so I will postpone it for later. The docs
>> >> >> patch is adjusted accordingly.
>> >> >> 
>> >> >> Also available in git, based on v5.17-rc1:
>> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
>> >> >> 
>> >> >> I'd like to ask for some review before I add this to the slab tree.
>> >> >> 
>> >> >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
>> >> >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
>> >> >> 
>> >> >> Oliver Glitta (4):
>> >> >>   mm/slub: use stackdepot to save stack trace in objects
>> >> >>   mm/slub: aggregate and print stack traces in debugfs files
>> >> >>   mm/slub: sort debugfs output by frequency of stack traces
>> >> >>   slab, documentation: add description of debugfs files for SLUB caches
>> >> >> 
>> >> >> Vlastimil Babka (1):
>> >> >>   mm/slub: move struct track init out of set_track()
>> >> >> 
>> >> >>  Documentation/vm/slub.rst |  61 +++++++++++++++
>> >> >>  init/Kconfig              |   1 +
>> >> >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
>> >> >>  3 files changed, 162 insertions(+), 52 deletions(-)
>> >> >> 
>> >> >> -- 
>> >> >> 2.35.1
>> >> >> 
>> >> >> 
>> >> > 
>> >> 
>> > 
>> 
>

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

On Tue, Mar 01, 2022 at 10:41:10AM +0100, Vlastimil Babka wrote:
> On 3/1/22 10:21, Mike Rapoport wrote:
> > On Tue, Mar 01, 2022 at 12:38:11AM +0100, Vlastimil Babka wrote:
> >> On 2/28/22 21:01, Mike Rapoport wrote:
> >> > On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> > 
> >> > If stack_depot_init() is called from kmem_cache_init(), there will be a
> >> > confusion what allocator should be used because we use slab_is_available()
> >> > to stop using memblock and start using kmalloc() instead in both
> >> > stack_depot_init() and in memblock.
> >> 
> >> I did check that stack_depot_init() is called from kmem_cache_init()
> >> *before* we make slab_is_available() true, hence assumed that memblock would
> >> be still available at that point and expected no confusion. But seems if
> >> memblock is already beyond memblock_free_all() then it being still available
> >> is just an illusion?
> > 
> > Yeah, it appears it is an illusion :)
> > 
> > I think we have to deal with allocations that happen between
> > memblock_free_all() and slab_is_available() at the memblock level and then
> > figure out the where to put stack_depot_init() and how to allocate memory
> > there.
> > 
> > I believe something like this (untested) patch below addresses the first
> > issue. As for stack_depot_init() I'm still trying to figure out the
> > possible call paths, but it seems we can use stack_depot_early_init() for
> > SLUB debugging case. I'll try to come up with something Really Soon (tm).
> 
> Yeah as you already noticed, we are pursuing an approach to decide on
> calling stack_depot_early_init(), which should be a good way to solve this
> given how special slab is in this case. For memblock I just wanted to point
> out that it could be more robust, your patch below seems to be on the right
> patch. Maybe it just doesn't have to fallback to buddy, which could be
> considered a layering violation, but just return NULL that can be
> immediately recognized as an error?

The layering violation is anyway there for slab_is_available() case, so
adding a __alloc_pages() there will be only consistent.
 
> > diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> > index 50ad19662a32..4ea89d44d22a 100644
> > --- a/include/linux/memblock.h
> > +++ b/include/linux/memblock.h
> > @@ -90,6 +90,7 @@ struct memblock_type {
> >   */
> >  struct memblock {
> >  	bool bottom_up;  /* is bottom up direction? */
> > +	bool mem_freed;
> >  	phys_addr_t current_limit;
> >  	struct memblock_type memory;
> >  	struct memblock_type reserved;
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index b12a364f2766..60196dc4980e 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -120,6 +120,7 @@ struct memblock memblock __initdata_memblock = {
> >  	.reserved.name		= "reserved",
> >  
> >  	.bottom_up		= false,
> > +	.mem_freed		= false,
> >  	.current_limit		= MEMBLOCK_ALLOC_ANYWHERE,
> >  };
> >  
> > @@ -1487,6 +1488,13 @@ static void * __init memblock_alloc_internal(
> >  	if (WARN_ON_ONCE(slab_is_available()))
> >  		return kzalloc_node(size, GFP_NOWAIT, nid);
> >  
> > +	if (memblock.mem_freed) {
> > +		unsigned int order = get_order(size);
> > +
> > +		pr_warn("memblock: allocating from buddy\n");
> > +		return __alloc_pages_node(nid, order, GFP_KERNEL);
> > +	}
> > +
> >  	if (max_addr > memblock.current_limit)
> >  		max_addr = memblock.current_limit;
> >  
> > @@ -2116,6 +2124,7 @@ void __init memblock_free_all(void)
> >  
> >  	pages = free_low_memory_core_early();
> >  	totalram_pages_add(pages);
> > +	memblock.mem_freed = true;
> >  }
> >  
> >  #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)
> >  

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Hyeonggon Yoo 4 years, 3 months ago

On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> Hi,
> >> 
> >> this series combines and revives patches from Oliver's last year
> >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> files alloc_traces and free_traces more useful.
> >> The resubmission was blocked on stackdepot changes that are now merged,
> >> as explained in patch 2.
> >> 
> > 
> > Hello. I just started review/testing this series.
> > 
> > it crashed on my system (arm64)
> 
> Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> from memblock. arm64 must have memblock freeing happen earlier or something.
> (CCing memblock experts)
> 
> > I ran with boot parameter slub_debug=U, and without KASAN.
> > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > 
> > void * __init memblock_alloc_try_nid(
> >                         phys_addr_t size, phys_addr_t align,
> >                         phys_addr_t min_addr, phys_addr_t max_addr,
> >                         int nid)
> > {
> >         void *ptr;
> > 
> >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> >                      &max_addr, (void *)_RET_IP_);
> >         ptr = memblock_alloc_internal(size, align,
> >                                            min_addr, max_addr, nid, false);
> >         if (ptr)
> >                 memset(ptr, 0, size); <--- Crash Here
> > 
> >         return ptr;
> > }
> > 
> > It crashed during create_boot_cache() -> stack_depot_init() ->
> > memblock_alloc().
> > 
> > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > available. (AFAIU memblock is not available after mem_init() because of
> > memblock_free_all(), right?)
> 
> Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> But then, I would expect stack_depot_init() to detect that memblock_alloc()
> returns NULL, we print ""Stack Depot hash table allocation failed,
> disabling" and disable it. Instead it seems memblock_alloc() returns
> something that's already potentially used by somebody else? Sounds like a bug?


By the way, I fixed this by allowing stack_depot_init() to be called in
kmem_cache_init() too [1] and Marco suggested that calling
stack_depot_init() depending on slub_debug parameter for simplicity. [2]

I would prefer [2], Would you take a look?

[1] https://lkml.org/lkml/2022/2/27/31

[2] https://lkml.org/lkml/2022/2/28/717

> > Thanks!
> > 
> > /*
> >  * Set up kernel memory allocators
> >  */
> > static void __init mm_init(void)
> > {
> >         /*
> >          * page_ext requires contiguous pages,
> >          * bigger than MAX_ORDER unless SPARSEMEM.
> >          */
> >         page_ext_init_flatmem();
> >         init_mem_debugging_and_hardening();
> >         kfence_alloc_pool();
> >         report_meminit();
> >         stack_depot_early_init();
> >         mem_init();
> >         mem_init_print_info();
> >         kmem_cache_init();
> >         /*
> >          * page_owner must be initialized after buddy is ready, and also after
> >          * slab is ready so that stack_depot_init() works properly
> >          */)
> > 
> >> Patch 1 is a new preparatory cleanup.
> >> 
> >> Patch 2 originally submitted here [1], was merged to mainline but
> >> reverted for stackdepot related issues as explained in the patch.
> >> 
> >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> >> be considered too intrusive so I will postpone it for later. The docs
> >> patch is adjusted accordingly.
> >> 
> >> Also available in git, based on v5.17-rc1:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> >> 
> >> I'd like to ask for some review before I add this to the slab tree.
> >> 
> >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
> >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
> >> 
> >> Oliver Glitta (4):
> >>   mm/slub: use stackdepot to save stack trace in objects
> >>   mm/slub: aggregate and print stack traces in debugfs files
> >>   mm/slub: sort debugfs output by frequency of stack traces
> >>   slab, documentation: add description of debugfs files for SLUB caches
> >> 
> >> Vlastimil Babka (1):
> >>   mm/slub: move struct track init out of set_track()
> >> 
> >>  Documentation/vm/slub.rst |  61 +++++++++++++++
> >>  init/Kconfig              |   1 +
> >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
> >>  3 files changed, 162 insertions(+), 52 deletions(-)
> >> 
> >> -- 
> >> 2.35.1
> >> 
> >> 
> > 
> 

-- 
Thank you, You are awesome!
Hyeonggon :-)

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

Hi,

On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> Hi,
> > >> 
> > >> this series combines and revives patches from Oliver's last year
> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> files alloc_traces and free_traces more useful.
> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> as explained in patch 2.
> > >> 
> > > 
> > > Hello. I just started review/testing this series.
> > > 
> > > it crashed on my system (arm64)
> > 
> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > (CCing memblock experts)
> > 
> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > > 
> > > void * __init memblock_alloc_try_nid(
> > >                         phys_addr_t size, phys_addr_t align,
> > >                         phys_addr_t min_addr, phys_addr_t max_addr,
> > >                         int nid)
> > > {
> > >         void *ptr;
> > > 
> > >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> > >                      &max_addr, (void *)_RET_IP_);
> > >         ptr = memblock_alloc_internal(size, align,
> > >                                            min_addr, max_addr, nid, false);
> > >         if (ptr)
> > >                 memset(ptr, 0, size); <--- Crash Here
> > > 
> > >         return ptr;
> > > }
> > > 
> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > > memblock_alloc().
> > > 
> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > > available. (AFAIU memblock is not available after mem_init() because of
> > > memblock_free_all(), right?)
> > 
> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > something that's already potentially used by somebody else? Sounds like a bug?
> 
> 
> By the way, I fixed this by allowing stack_depot_init() to be called in
> kmem_cache_init() too [1] and Marco suggested that calling
> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
> 
> I would prefer [2], Would you take a look?
> 
> [1] https://lkml.org/lkml/2022/2/27/31
> 
> [2] https://lkml.org/lkml/2022/2/28/717

I'm still looking at stack_depot_init() callers, but I think it's possible
to make stack_depot_early_init() a proper function that calls
memblock_alloc() and only use stack_depot_init() when slab is actually
available.
 
> > > Thanks!
> > > 
> > > /*
> > >  * Set up kernel memory allocators
> > >  */
> > > static void __init mm_init(void)
> > > {
> > >         /*
> > >          * page_ext requires contiguous pages,
> > >          * bigger than MAX_ORDER unless SPARSEMEM.
> > >          */
> > >         page_ext_init_flatmem();
> > >         init_mem_debugging_and_hardening();
> > >         kfence_alloc_pool();
> > >         report_meminit();
> > >         stack_depot_early_init();
> > >         mem_init();
> > >         mem_init_print_info();
> > >         kmem_cache_init();
> > >         /*
> > >          * page_owner must be initialized after buddy is ready, and also after
> > >          * slab is ready so that stack_depot_init() works properly
> > >          */)
> > > 
> > >> Patch 1 is a new preparatory cleanup.
> > >> 
> > >> Patch 2 originally submitted here [1], was merged to mainline but
> > >> reverted for stackdepot related issues as explained in the patch.
> > >> 
> > >> Patches 3-5 originally submitted as RFC here [2]. In this submission I
> > >> have omitted the new file 'all_objects' (patch 3/3 in [2]) as it might
> > >> be considered too intrusive so I will postpone it for later. The docs
> > >> patch is adjusted accordingly.
> > >> 
> > >> Also available in git, based on v5.17-rc1:
> > >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-stackdepot-v1
> > >> 
> > >> I'd like to ask for some review before I add this to the slab tree.
> > >> 
> > >> [1] https://lore.kernel.org/all/20210414163434.4376-1-glittao@gmail.com/
> > >> [2] https://lore.kernel.org/all/20210521121127.24653-1-glittao@gmail.com/
> > >> 
> > >> Oliver Glitta (4):
> > >>   mm/slub: use stackdepot to save stack trace in objects
> > >>   mm/slub: aggregate and print stack traces in debugfs files
> > >>   mm/slub: sort debugfs output by frequency of stack traces
> > >>   slab, documentation: add description of debugfs files for SLUB caches
> > >> 
> > >> Vlastimil Babka (1):
> > >>   mm/slub: move struct track init out of set_track()
> > >> 
> > >>  Documentation/vm/slub.rst |  61 +++++++++++++++
> > >>  init/Kconfig              |   1 +
> > >>  mm/slub.c                 | 152 +++++++++++++++++++++++++-------------
> > >>  3 files changed, 162 insertions(+), 52 deletions(-)
> > >> 
> > >> -- 
> > >> 2.35.1
> > >> 
> > >> 
> > > 
> > 
> 
> -- 
> Thank you, You are awesome!
> Hyeonggon :-)

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> Hi,
> > >> 
> > >> this series combines and revives patches from Oliver's last year
> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> files alloc_traces and free_traces more useful.
> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> as explained in patch 2.
> > >> 
> > > 
> > > Hello. I just started review/testing this series.
> > > 
> > > it crashed on my system (arm64)
> > 
> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > (CCing memblock experts)
> > 
> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > > 
> > > void * __init memblock_alloc_try_nid(
> > >                         phys_addr_t size, phys_addr_t align,
> > >                         phys_addr_t min_addr, phys_addr_t max_addr,
> > >                         int nid)
> > > {
> > >         void *ptr;
> > > 
> > >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> > >                      &max_addr, (void *)_RET_IP_);
> > >         ptr = memblock_alloc_internal(size, align,
> > >                                            min_addr, max_addr, nid, false);
> > >         if (ptr)
> > >                 memset(ptr, 0, size); <--- Crash Here
> > > 
> > >         return ptr;
> > > }
> > > 
> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > > memblock_alloc().
> > > 
> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > > available. (AFAIU memblock is not available after mem_init() because of
> > > memblock_free_all(), right?)
> > 
> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > something that's already potentially used by somebody else? Sounds like a bug?
> 
> 
> By the way, I fixed this by allowing stack_depot_init() to be called in
> kmem_cache_init() too [1] and Marco suggested that calling
> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
> 
> I would prefer [2], Would you take a look?
> 
> [1] https://lkml.org/lkml/2022/2/27/31
> 
> [2] https://lkml.org/lkml/2022/2/28/717

I have the third version :)

diff --git a/mm/slub.c b/mm/slub.c
index a74afe59a403..0c3ab2335b46 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
 	}
 out:
 	slub_debug = global_flags;
+
+	if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
+		stack_depot_early_init();
+
 	if (slub_debug != 0 || slub_debug_string)
 		static_branch_enable(&slub_debug_enabled);
 	else
@@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
 	s->remote_node_defrag_ratio = 1000;
 #endif
 
-	if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
-		stack_depot_init();
-
 	/* Initialize the pre-computed randomized freelist if slab is up */
 	if (slab_state >= UP) {
 		if (init_cache_random_seq(s))
 
> -- 
> Thank you, You are awesome!
> Hyeonggon :-)

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Vlastimil Babka 4 years, 3 months ago

On 3/2/22 09:37, Mike Rapoport wrote:
> On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
>> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
>> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
>> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
>> > >> Hi,
>> > >> 
>> > >> this series combines and revives patches from Oliver's last year
>> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
>> > >> files alloc_traces and free_traces more useful.
>> > >> The resubmission was blocked on stackdepot changes that are now merged,
>> > >> as explained in patch 2.
>> > >> 
>> > > 
>> > > Hello. I just started review/testing this series.
>> > > 
>> > > it crashed on my system (arm64)
>> > 
>> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
>> > from memblock. arm64 must have memblock freeing happen earlier or something.
>> > (CCing memblock experts)
>> > 
>> > > I ran with boot parameter slub_debug=U, and without KASAN.
>> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
>> > > 
>> > > void * __init memblock_alloc_try_nid(
>> > >                         phys_addr_t size, phys_addr_t align,
>> > >                         phys_addr_t min_addr, phys_addr_t max_addr,
>> > >                         int nid)
>> > > {
>> > >         void *ptr;
>> > > 
>> > >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
>> > >                      __func__, (u64)size, (u64)align, nid, &min_addr,
>> > >                      &max_addr, (void *)_RET_IP_);
>> > >         ptr = memblock_alloc_internal(size, align,
>> > >                                            min_addr, max_addr, nid, false);
>> > >         if (ptr)
>> > >                 memset(ptr, 0, size); <--- Crash Here
>> > > 
>> > >         return ptr;
>> > > }
>> > > 
>> > > It crashed during create_boot_cache() -> stack_depot_init() ->
>> > > memblock_alloc().
>> > > 
>> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
>> > > available. (AFAIU memblock is not available after mem_init() because of
>> > > memblock_free_all(), right?)
>> > 
>> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
>> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
>> > returns NULL, we print ""Stack Depot hash table allocation failed,
>> > disabling" and disable it. Instead it seems memblock_alloc() returns
>> > something that's already potentially used by somebody else? Sounds like a bug?
>> 
>> 
>> By the way, I fixed this by allowing stack_depot_init() to be called in
>> kmem_cache_init() too [1] and Marco suggested that calling
>> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
>> 
>> I would prefer [2], Would you take a look?
>> 
>> [1] https://lkml.org/lkml/2022/2/27/31
>> 
>> [2] https://lkml.org/lkml/2022/2/28/717
> 
> I have the third version :)

While simple, it changes the timing of stack_depot_early_init() that was
supposed to be at a single callsite - now it's less predictable and depends
on e.g. kernel parameter ordering. Some arch/config combo could break,
dunno. Setting a variable that stack_depot_early_init() checks should be
more robust.

> 
> diff --git a/mm/slub.c b/mm/slub.c
> index a74afe59a403..0c3ab2335b46 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
>  	}
>  out:
>  	slub_debug = global_flags;
> +
> +	if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> +		stack_depot_early_init();
> +
>  	if (slub_debug != 0 || slub_debug_string)
>  		static_branch_enable(&slub_debug_enabled);
>  	else
> @@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
>  	s->remote_node_defrag_ratio = 1000;
>  #endif
>  
> -	if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> -		stack_depot_init();
> -
>  	/* Initialize the pre-computed randomized freelist if slab is up */
>  	if (slab_state >= UP) {
>  		if (init_cache_random_seq(s))
>  
>> -- 
>> Thank you, You are awesome!
>> Hyeonggon :-)
>

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Mike Rapoport 4 years, 3 months ago

On Wed, Mar 02, 2022 at 10:09:37AM +0100, Vlastimil Babka wrote:
> On 3/2/22 09:37, Mike Rapoport wrote:
> > On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> >> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> >> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> >> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> >> > >> Hi,
> >> > >> 
> >> > >> this series combines and revives patches from Oliver's last year
> >> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> >> > >> files alloc_traces and free_traces more useful.
> >> > >> The resubmission was blocked on stackdepot changes that are now merged,
> >> > >> as explained in patch 2.
> >> > >> 
> >> > > 
> >> > > Hello. I just started review/testing this series.
> >> > > 
> >> > > it crashed on my system (arm64)
> >> > 
> >> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> >> > from memblock. arm64 must have memblock freeing happen earlier or something.
> >> > (CCing memblock experts)
> >> > 
> >> > > I ran with boot parameter slub_debug=U, and without KASAN.
> >> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> >> > > 
> >> > > void * __init memblock_alloc_try_nid(
> >> > >                         phys_addr_t size, phys_addr_t align,
> >> > >                         phys_addr_t min_addr, phys_addr_t max_addr,
> >> > >                         int nid)
> >> > > {
> >> > >         void *ptr;
> >> > > 
> >> > >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> >> > >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> >> > >                      &max_addr, (void *)_RET_IP_);
> >> > >         ptr = memblock_alloc_internal(size, align,
> >> > >                                            min_addr, max_addr, nid, false);
> >> > >         if (ptr)
> >> > >                 memset(ptr, 0, size); <--- Crash Here
> >> > > 
> >> > >         return ptr;
> >> > > }
> >> > > 
> >> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> >> > > memblock_alloc().
> >> > > 
> >> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> >> > > available. (AFAIU memblock is not available after mem_init() because of
> >> > > memblock_free_all(), right?)
> >> > 
> >> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> >> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> >> > returns NULL, we print ""Stack Depot hash table allocation failed,
> >> > disabling" and disable it. Instead it seems memblock_alloc() returns
> >> > something that's already potentially used by somebody else? Sounds like a bug?
> >> 
> >> 
> >> By the way, I fixed this by allowing stack_depot_init() to be called in
> >> kmem_cache_init() too [1] and Marco suggested that calling
> >> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
> >> 
> >> I would prefer [2], Would you take a look?
> >> 
> >> [1] https://lkml.org/lkml/2022/2/27/31
> >> 
> >> [2] https://lkml.org/lkml/2022/2/28/717
> > 
> > I have the third version :)
> 
> While simple, it changes the timing of stack_depot_early_init() that was
> supposed to be at a single callsite - now it's less predictable and depends
> on e.g. kernel parameter ordering. Some arch/config combo could break,
> dunno. Setting a variable that stack_depot_early_init() checks should be
> more robust.

Not sure I follow.
stack_depot_early_init() is a wrapper for stack_depot_init() which already
checks 

	if (!stack_depot_disable && !stack_table)

So largely it can be at multiple call sites just like stack_depot_init...

Still, I understand your concern of having multiple call sites for
stack_depot_early_init().

The most robust way I can think of will be to make stack_depot_early_init()
a proper function, move memblock_alloc() there and add a variable, say
stack_depot_needed_early that will be set to 1 if
CONFIG_STACKDEPOT_ALWAYS_INIT=y or by the callers that need to allocate the
stack_table before kmalloc is up.
 
E.g

__init int stack_depot_early_init(void)
{

	if (stack_depot_needed_early && !stack_table) {
		size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
		int i;

		pr_info("Stack Depot allocating hash table with memblock_alloc\n");
		stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
	
		if (!stack_table) {
			pr_err("Stack Depot hash table allocation failed, disabling\n");
			stack_depot_disable = true;
			return -ENOMEM;
		}
	}

	return 0;
}

The mutex is not needed here because mm_init() -> stack_depot_early_init()
happens before SMP and setting stack_table[i] to NULL is redundant with
memblock_alloc(). (btw, kvmalloc case could use __GFP_ZERO as well).

I'm not sure if the stack depot should be disabled for good if the early
allocation failed, but that's another story.

> > diff --git a/mm/slub.c b/mm/slub.c
> > index a74afe59a403..0c3ab2335b46 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
> >  	}
> >  out:
> >  	slub_debug = global_flags;
> > +
> > +	if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > +		stack_depot_early_init();
> > +
> >  	if (slub_debug != 0 || slub_debug_string)
> >  		static_branch_enable(&slub_debug_enabled);
> >  	else
> > @@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> >  	s->remote_node_defrag_ratio = 1000;
> >  #endif
> >  
> > -	if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > -		stack_depot_init();
> > -
> >  	/* Initialize the pre-computed randomized freelist if slab is up */
> >  	if (slab_state >= UP) {
> >  		if (init_cache_random_seq(s))
> >  
> >> -- 
> >> Thank you, You are awesome!
> >> Hyeonggon :-)
> > 
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Hyeonggon Yoo 4 years, 3 months ago

On Wed, Mar 02, 2022 at 02:30:56PM +0200, Mike Rapoport wrote:
> On Wed, Mar 02, 2022 at 10:09:37AM +0100, Vlastimil Babka wrote:
> > On 3/2/22 09:37, Mike Rapoport wrote:
> > > On Mon, Feb 28, 2022 at 09:27:02PM +0000, Hyeonggon Yoo wrote:
> > >> On Mon, Feb 28, 2022 at 08:10:18PM +0100, Vlastimil Babka wrote:
> > >> > On 2/26/22 08:19, Hyeonggon Yoo wrote:
> > >> > > On Fri, Feb 25, 2022 at 07:03:13PM +0100, Vlastimil Babka wrote:
> > >> > >> Hi,
> > >> > >> 
> > >> > >> this series combines and revives patches from Oliver's last year
> > >> > >> bachelor thesis (where I was the advisor) that make SLUB's debugfs
> > >> > >> files alloc_traces and free_traces more useful.
> > >> > >> The resubmission was blocked on stackdepot changes that are now merged,
> > >> > >> as explained in patch 2.
> > >> > >> 
> > >> > > 
> > >> > > Hello. I just started review/testing this series.
> > >> > > 
> > >> > > it crashed on my system (arm64)
> > >> > 
> > >> > Hmm, interesting. On x86_64 this works for me and stackdepot is allocated
> > >> > from memblock. arm64 must have memblock freeing happen earlier or something.
> > >> > (CCing memblock experts)
> > >> > 
> > >> > > I ran with boot parameter slub_debug=U, and without KASAN.
> > >> > > So CONFIG_STACKDEPOT_ALWAYS_INIT=n.
> > >> > > 
> > >> > > void * __init memblock_alloc_try_nid(
> > >> > >                         phys_addr_t size, phys_addr_t align,
> > >> > >                         phys_addr_t min_addr, phys_addr_t max_addr,
> > >> > >                         int nid)
> > >> > > {
> > >> > >         void *ptr;
> > >> > > 
> > >> > >         memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
> > >> > >                      __func__, (u64)size, (u64)align, nid, &min_addr,
> > >> > >                      &max_addr, (void *)_RET_IP_);
> > >> > >         ptr = memblock_alloc_internal(size, align,
> > >> > >                                            min_addr, max_addr, nid, false);
> > >> > >         if (ptr)
> > >> > >                 memset(ptr, 0, size); <--- Crash Here
> > >> > > 
> > >> > >         return ptr;
> > >> > > }
> > >> > > 
> > >> > > It crashed during create_boot_cache() -> stack_depot_init() ->
> > >> > > memblock_alloc().
> > >> > > 
> > >> > > I think That's because, in kmem_cache_init(), both slab and memblock is not
> > >> > > available. (AFAIU memblock is not available after mem_init() because of
> > >> > > memblock_free_all(), right?)
> > >> > 
> > >> > Hm yes I see, even in x86_64 version mem_init() calls memblock_free_all().
> > >> > But then, I would expect stack_depot_init() to detect that memblock_alloc()
> > >> > returns NULL, we print ""Stack Depot hash table allocation failed,
> > >> > disabling" and disable it. Instead it seems memblock_alloc() returns
> > >> > something that's already potentially used by somebody else? Sounds like a bug?
> > >> 
> > >> 
> > >> By the way, I fixed this by allowing stack_depot_init() to be called in
> > >> kmem_cache_init() too [1] and Marco suggested that calling
> > >> stack_depot_init() depending on slub_debug parameter for simplicity. [2]
> > >> 
> > >> I would prefer [2], Would you take a look?
> > >> 
> > >> [1] https://lkml.org/lkml/2022/2/27/31
> > >> 
> > >> [2] https://lkml.org/lkml/2022/2/28/717
> > > 
> > > I have the third version :)
> > 
> > While simple, it changes the timing of stack_depot_early_init() that was
> > supposed to be at a single callsite - now it's less predictable and depends
> > on e.g. kernel parameter ordering. Some arch/config combo could break,
> > dunno. Setting a variable that stack_depot_early_init() checks should be
> > more robust.
> 
> Not sure I follow.
> stack_depot_early_init() is a wrapper for stack_depot_init() which already
> checks 
> 
> 	if (!stack_depot_disable && !stack_table)
> 
> So largely it can be at multiple call sites just like stack_depot_init...

In my opinion, allowing to call stack_depot_init() in various context is not a good
idea. For another simple example, slub_debug=U,vmap_area can fool the
current code because it's called in context where slab is available,
but vmalloc is unavailable. and stack_depot_init() will try to allocate
using kvmalloc(). Late initialization adds too much complexity.

So IMO we have two solutions.

First solution is only allowing early init and avoiding late init.
(setting a global variable that is visible to stack depot would do this)

And second solution is to make caller allocate and manage its own hash
table. All of this complexity is because we're trying to make stack_table
global.

First solution looks ok if we have few users of stack depot.
But I think we should use second approach if stack depot is growing
more and more callers?

> Still, I understand your concern of having multiple call sites for
> stack_depot_early_init().
> 
> The most robust way I can think of will be to make stack_depot_early_init()
> a proper function, move memblock_alloc() there and add a variable, say
> stack_depot_needed_early that will be set to 1 if
> CONFIG_STACKDEPOT_ALWAYS_INIT=y or by the callers that need to allocate the
> stack_table before kmalloc is up.
>  
> E.g
>
> __init int stack_depot_early_init(void)
> {
> 
> 	if (stack_depot_needed_early && !stack_table) {
> 		size_t size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
> 		int i;
> 
> 		pr_info("Stack Depot allocating hash table with memblock_alloc\n");
> 		stack_table = memblock_alloc(size, SMP_CACHE_BYTES);
> 	
> 		if (!stack_table) {
> 			pr_err("Stack Depot hash table allocation failed, disabling\n");
> 			stack_depot_disable = true;
> 			return -ENOMEM;
> 		}
> 	}
> 
> 	return 0;
> }
>
> The mutex is not needed here because mm_init() -> stack_depot_early_init()
> happens before SMP and setting stack_table[i] to NULL is redundant with
> memblock_alloc(). (btw, kvmalloc case could use __GFP_ZERO as well).
> 
> I'm not sure if the stack depot should be disabled for good if the early
> allocation failed, but that's another story.
> 
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index a74afe59a403..0c3ab2335b46 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1548,6 +1548,10 @@ static int __init setup_slub_debug(char *str)
> > >  	}
> > >  out:
> > >  	slub_debug = global_flags;
> > > +
> > > +	if (slub_flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > > +		stack_depot_early_init();
> > > +
> > >  	if (slub_debug != 0 || slub_debug_string)
> > >  		static_branch_enable(&slub_debug_enabled);
> > >  	else
> > > @@ -4221,9 +4225,6 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
> > >  	s->remote_node_defrag_ratio = 1000;
> > >  #endif
> > >  
> > > -	if (s->flags & SLAB_STORE_USER && IS_ENABLED(CONFIG_STACKDEPOT))
> > > -		stack_depot_init();
> > > -
> > >  	/* Initialize the pre-computed randomized freelist if slab is up */
> > >  	if (slab_state >= UP) {
> > >  		if (init_cache_random_seq(s))
> > >  
> > >> -- 
> > >> Thank you, You are awesome!
> > >> Hyeonggon :-)
> > > 
> > 
> 
> -- 
> Sincerely yours,
> Mike.

-- 
Thank you, You are awesome!
Hyeonggon :-)

Re: [PATCH 0/5] SLUB debugfs improvements based on stackdepot

Posted by Marco Elver 4 years, 3 months ago

On Wed, 2 Mar 2022 at 18:02, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
[...]
> So IMO we have two solutions.
>
> First solution is only allowing early init and avoiding late init.
> (setting a global variable that is visible to stack depot would do this)
>
> And second solution is to make caller allocate and manage its own hash
> table. All of this complexity is because we're trying to make stack_table
> global.

I think this would be a mistake, because then we have to continuously
audit all users of stackdepot and make sure that allocation stack
traces don't end up in duplicate hash tables. It's global for a
reason.

> First solution looks ok if we have few users of stack depot.
> But I think we should use second approach if stack depot is growing
> more and more callers?

The problem here really is just that initialization of stackdepot and
slabs can have a cyclic dependency with the changes you're making. I
very much doubt there'll be other cases (beyond the allocator itself
used by stackdepot) which can introduce such a cyclic dependency.

The easiest way to break the cyclic dependency is to initialize
stackdepot earlier, assuming it can be determined it is required (in
this case it can because the command line is parsed before slab
creation). The suggestion with the stack_depot_needed_early variable
(like Mike's suggested code) would solve all that.

I don't understand the concern about multiple contexts. The problem is
just about a cyclic dependency during early init, and I doubt we'll
have more of that.

Thanks,
-- Marco