[v1] locking/lockdep: Disable KASAN instrumentation of lockdep.c

[PATCH] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Posted by Waiman Long 1 year ago

Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
Each of them can significantly slow down the speed of a debug kernel.
Enabling KASAN instrumentation of the LOCKDEP code will further slow
things down.

Since LOCKDEP is a high-overhead debugging tool, it will never get
enabled in a production kernel. The LOCKDEP code is also pretty mature
and is unlikely to get major changes. There is also a possibility of
recursion similar to KCSAN. As the small advantage of enabling KASAN
instrumentation to catch potential memory access errors is probably
not worth the drawback of further slowing down a debug kernel, disable
KASAN instrumentation of lockdep.c to let a debug kernel gain a little
bit of speed back.

With a debug kernel with both LOCKDEP and KASAN enabled running on a
2-socket 144-thread system, the time to do a "make -j144" kernel build
was 18m40.641s. After applying this patch, the parallel kernel build
time was reduced to 17m35.136s. This is a reduction of about 66s (5.8%).

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 0db4093d17b8..8a588b0227b1 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -6,6 +6,7 @@ KCOV_INSTRUMENT		:= n
 obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
 
 # Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
+KASAN_SANITIZE_lockdep.o := n
 KCSAN_SANITIZE_lockdep.o := n
 
 ifdef CONFIG_FUNCTION_TRACER
-- 
2.48.1

Re: [PATCH] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Posted by Waiman Long 1 year ago

On 1/31/25 11:50 AM, Waiman Long wrote:
> Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
> Each of them can significantly slow down the speed of a debug kernel.
> Enabling KASAN instrumentation of the LOCKDEP code will further slow
> thing down.
>
> Since LOCKDEP is a high overhead debugging tool, it will never get
> enabled in a production kernel. The LOCKDEP code is also pretty mature
> and is unlikely to get major changes. There is also a possibility of
> recursion similar to KCSAN. As the small advantage of enabling KASAN
> instrumentation to catch potential memory access error is probably
> not worth the drawback of further slowing down a debug kernel, disable
> KASAN instrumentation to enable a debug kernel to gain a little bit of
> speed back.
>
> With a debug kernel with both LOCKDEP and KASAN enabled running on a
> 2-socket 144-thread system, the time to do a "make -j144" kernel build
> was 18m40.641s. After applying this patch, the parallel kernel build
> time was reduced to 17m35.136s. This is a reduction of about 66s (5.8%).
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>   kernel/locking/Makefile | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index 0db4093d17b8..8a588b0227b1 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -6,6 +6,7 @@ KCOV_INSTRUMENT		:= n
>   obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>   
>   # Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
> +KASAN_SANITIZE_lockdep.o := n
>   KCSAN_SANITIZE_lockdep.o := n
>   
>   ifdef CONFIG_FUNCTION_TRACER

The rationale behind this patch is that a similarly configured
PREEMPT_RT debug kernel was found to be about 3 times slower than the
non-RT debug kernel. On the same test system, the parallel build
runtime was 59m56.722s. After applying this patch, it was reduced to
38m3.348s. This reduction of more than 1/3 is larger than I would have
expected. So the lockdep code is much more heavily used in a PREEMPT_RT
debug kernel.

Cheers,
Longman
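
The timing figures quoted in this thread can be cross-checked with a
few lines of arithmetic. The sketch below is only an illustration: the
helper names to_seconds() and reduction() are made up for this example
and are not part of the patch or of any kernel tooling. It assumes the
timings are in the "XmY.Zs" form printed by time(1).

```python
def to_seconds(stamp):
    """Convert a time(1)-style 'XmY.Zs' string to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def reduction(before, after):
    """Return (seconds saved, percentage saved) between two timings."""
    b, a = to_seconds(before), to_seconds(after)
    saved = b - a
    return saved, 100.0 * saved / b


# Build times reported in the thread (before, after applying the patch).
runs = [
    ("non-RT", "18m40.641s", "17m35.136s"),
    ("PREEMPT_RT", "59m56.722s", "38m3.348s"),
]

for label, before, after in runs:
    saved, pct = reduction(before, after)
    print(f"{label}: {saved:.1f}s saved ({pct:.1f}%)")

# Slowdown of the unpatched PREEMPT_RT debug kernel vs. the non-RT one.
ratio = to_seconds("59m56.722s") / to_seconds("18m40.641s")
print(f"RT/non-RT slowdown: {ratio:.2f}x")
```

This gives roughly 5.8% for the non-RT debug kernel and about 36.5%
(more than 1/3) for the PREEMPT_RT debug kernel, with the unpatched RT
build taking a bit over 3 times as long as the non-RT one, which
matches the numbers stated above.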

Re: [PATCH] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Posted by Peter Zijlstra 1 year ago

On Fri, Jan 31, 2025 at 04:47:06PM -0500, Waiman Long wrote:
> On 1/31/25 11:50 AM, Waiman Long wrote:
> > Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
> > Each of them can significantly slow down the speed of a debug kernel.
> > Enabling KASAN instrumentation of the LOCKDEP code will further slow
> > thing down.
> > 
> > Since LOCKDEP is a high overhead debugging tool, it will never get
> > enabled in a production kernel. The LOCKDEP code is also pretty mature
> > and is unlikely to get major changes. There is also a possibility of
> > recursion similar to KCSAN. As the small advantage of enabling KASAN
> > instrumentation to catch potential memory access error is probably
> > not worth the drawback of further slowing down a debug kernel, disable
> > KASAN instrumentation to enable a debug kernel to gain a little bit of
> > speed back.
> > 
> > With a debug kernel with both LOCKDEP and KASAN enabled running on a
> > 2-socket 144-thread system, the time to do a "make -j144" kernel build
> > was 18m40.641s. After applying this patch, the parallel kernel build
> > time was reduced to 17m35.136s. This is a reduction of about 66s (5.8%).
> > 
> > Signed-off-by: Waiman Long <longman@redhat.com>
> > ---
> >   kernel/locking/Makefile | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> > index 0db4093d17b8..8a588b0227b1 100644
> > --- a/kernel/locking/Makefile
> > +++ b/kernel/locking/Makefile
> > @@ -6,6 +6,7 @@ KCOV_INSTRUMENT		:= n
> >   obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
> >   # Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
> > +KASAN_SANITIZE_lockdep.o := n
> >   KCSAN_SANITIZE_lockdep.o := n
> >   ifdef CONFIG_FUNCTION_TRACER
> 
> The rationale behind this patch is due to the fact that a similar configured
> PREEMPT_RT debug kernel is found to be about 3 times slower than the non-RT
> debug kernel. For the test same system, the parallel build runtime is
> 59m56.722s. After applying this patch, it is reduced to 38m3.348s. Its more
> than 1/3 reduction is more than I would have expected. So the lockdep code
> is much more heavily used in a PREEMPT_RT debug kernel.

Perhaps put that in the changelog instead?

It's not like RT is this secret out-of-tree project :-)

Also, any quick clues as to what causes the extra lockdep overhead?
Initially I thought perhaps local-lock, but that should also cause
lockdep on !RT builds.

Re: [PATCH] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Posted by Waiman Long 1 year ago

On 2/3/25 6:24 AM, Peter Zijlstra wrote:
> On Fri, Jan 31, 2025 at 04:47:06PM -0500, Waiman Long wrote:
>> On 1/31/25 11:50 AM, Waiman Long wrote:
>>> Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
>>> Each of them can significantly slow down the speed of a debug kernel.
>>> Enabling KASAN instrumentation of the LOCKDEP code will further slow
>>> thing down.
>>>
>>> Since LOCKDEP is a high overhead debugging tool, it will never get
>>> enabled in a production kernel. The LOCKDEP code is also pretty mature
>>> and is unlikely to get major changes. There is also a possibility of
>>> recursion similar to KCSAN. As the small advantage of enabling KASAN
>>> instrumentation to catch potential memory access error is probably
>>> not worth the drawback of further slowing down a debug kernel, disable
>>> KASAN instrumentation to enable a debug kernel to gain a little bit of
>>> speed back.
>>>
>>> With a debug kernel with both LOCKDEP and KASAN enabled running on a
>>> 2-socket 144-thread system, the time to do a "make -j144" kernel build
>>> was 18m40.641s. After applying this patch, the parallel kernel build
>>> time was reduced to 17m35.136s. This is a reduction of about 66s (5.8%).
>>>
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> ---
>>>    kernel/locking/Makefile | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
>>> index 0db4093d17b8..8a588b0227b1 100644
>>> --- a/kernel/locking/Makefile
>>> +++ b/kernel/locking/Makefile
>>> @@ -6,6 +6,7 @@ KCOV_INSTRUMENT		:= n
>>>    obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>>>    # Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
>>> +KASAN_SANITIZE_lockdep.o := n
>>>    KCSAN_SANITIZE_lockdep.o := n
>>>    ifdef CONFIG_FUNCTION_TRACER
>> The rationale behind this patch is due to the fact that a similar configured
>> PREEMPT_RT debug kernel is found to be about 3 times slower than the non-RT
>> debug kernel. For the test same system, the parallel build runtime is
>> 59m56.722s. After applying this patch, it is reduced to 38m3.348s. Its more
>> than 1/3 reduction is more than I would have expected. So the lockdep code
>> is much more heavily used in a PREEMPT_RT debug kernel.
> Perhaps put that in the changelog instead?
>
> Its not like RT is this secret out of tree project :-)
>
> Also, any quick clues as to what causes the extra lockdep overhead?
> Initially I thought perhaps local-lock, but that should also cause
> lockdep on !RT builds.

Yes, I am planning to update the patch with more RT debug kernel 
performance data.

As to why, my guess is that the average nesting depth will be higher
because spin_lock_irq* no longer disables IRQs and there is an extra
wait lock underneath the rt-mutex. The increase in the number of
sleep-wake cycles due to the sleeping-lock nature of rt-spinlock may
also be a contributing factor.

Cheers,
Longman
