[PATCH] prinkt/nbcon: Add a scheduling point to nbcon_kthread_func().

Sebastian Andrzej Siewior posted 1 patch 1 year, 5 months ago
Posted by Sebastian Andrzej Siewior 1 year, 5 months ago
Constant printing can lead to a CPU hog in nbcon_kthread_func(). The
context is preemptible but on !PREEMPT kernels there is no explicit
preemption point, which leads to softlockup warnings.

Add an explicit preemption point in nbcon_kthread_func().

Reported-by: Derek Barbosa <debarbos@redhat.com>
Link: https://lore.kernel.org/ZnHF5j1DUDjN1kkq@debarbos-thinkpadt14sgen2i.remote.csb
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/printk/nbcon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index bb9689f94d302..0813ce88a49c5 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1119,6 +1119,7 @@ static int nbcon_kthread_func(void *__console)
 		}
 
 		console_srcu_read_unlock(cookie);
+		cond_resched();
 
 	} while (backlog);
 
-- 
2.45.2
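
For readers outside the kernel tree, the effect of this one-line change can be sketched in userspace C. This is only an analogy under stated assumptions, not the kernel implementation: struct backlog, emit_batch() and drain_backlog() are hypothetical names invented for illustration, and sched_yield() stands in for cond_resched(). The two differ: sched_yield() always offers the CPU, while cond_resched() reschedules only when a reschedule is pending.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the console backlog: a count of pending records. */
struct backlog {
	size_t pending;	/* records still to be emitted */
	size_t emitted;	/* records emitted so far */
	size_t yields;	/* how often the loop offered the CPU */
};

/* Emit up to `batch` records; return true if records are still pending. */
static bool emit_batch(struct backlog *b, size_t batch)
{
	size_t n = b->pending < batch ? b->pending : batch;

	b->pending -= n;
	b->emitted += n;
	return b->pending > 0;
}

/*
 * Drain the backlog, offering the CPU after every pass.
 * Without the yield, a backlog that never empties would keep this
 * loop on the CPU for as long as new records keep arriving.
 */
static void drain_backlog(struct backlog *b, size_t batch)
{
	bool more;

	assert(b != NULL && batch > 0);
	do {
		more = emit_batch(b, batch);
		sched_yield();	/* userspace analogue of cond_resched() */
		b->yields++;
	} while (more);
}
```

Calling drain_backlog() on a backlog of 10 records with a batch size of 4 takes three passes and offers the CPU once per pass, which mirrors the placement of cond_resched() at the end of each loop iteration in the patch.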
Re: [PATCH] prinkt/nbcon: Add a scheduling point to nbcon_kthread_func().
Posted by John Ogness 1 year, 5 months ago
On 2024-06-20, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> Constant printing can lead to a CPU hog in nbcon_kthread_func(). The
> context is preemptible but on !PREEMPT kernels there is no explicit
> preemption point, which leads to softlockup warnings.
>
> Add an explicit preemption point in nbcon_kthread_func().
>
> Reported-by: Derek Barbosa <debarbos@redhat.com>
> Link: https://lore.kernel.org/ZnHF5j1DUDjN1kkq@debarbos-thinkpadt14sgen2i.remote.csb
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Reviewed-by: John Ogness <john.ogness@linutronix.de>
Re: [PATCH] prinkt/nbcon: Add a scheduling point to nbcon_kthread_func().
Posted by Andrew Halaney 1 year, 5 months ago
nit: s/prinkt/printk

I make that typo so often :P

On Thu, Jun 20, 2024 at 11:43:00AM GMT, Sebastian Andrzej Siewior wrote:
> Constant printing can lead to a CPU hog in nbcon_kthread_func(). The
> context is preemptible but on !PREEMPT kernels there is no explicit
> preemption point, which leads to softlockup warnings.
> 
> Add an explicit preemption point in nbcon_kthread_func().
> 
> Reported-by: Derek Barbosa <debarbos@redhat.com>

Acked-by: Andrew Halaney <ahalaney@redhat.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com>

This survived a bunch of tests that would normally cause lockups on
PREEMPT_VOLUNTARY systems. I can see the nbcon thread successfully
migrating between NUMA nodes while the console backlog is being
overwhelmed, which did not work without this patch.

Thanks!

> Link: https://lore.kernel.org/ZnHF5j1DUDjN1kkq@debarbos-thinkpadt14sgen2i.remote.csb
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  kernel/printk/nbcon.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index bb9689f94d302..0813ce88a49c5 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1119,6 +1119,7 @@ static int nbcon_kthread_func(void *__console)
>  		}
>  
>  		console_srcu_read_unlock(cookie);
> +		cond_resched();
>  
>  	} while (backlog);
>  
> -- 
> 2.45.2
>
Re: [PATCH] prinkt/nbcon: Add a scheduling point to nbcon_kthread_func().
Posted by Derek Barbosa 1 year, 5 months ago
On Thu, Jun 20, 2024 at 12:18:37PM -0500, Andrew Halaney wrote:
> Acked-by: Andrew Halaney <ahalaney@redhat.com>
> Tested-by: Andrew Halaney <ahalaney@redhat.com>
> 
> This survived a bunch of tests that would normally cause lockups on
> PREEMPT_VOLUNTARY systems. I can see the nbcon thread successfully
> migrating between NUMA nodes while the console backlog is being
> overwhelmed, which did not work without this patch.
> 
> Thanks!

I'm going to second Andrew's observed results here. With the original
reproducer of calling LTP pty03 && pty06 in a while loop, plus invoking
stress-ng with --timeout 60000s && --numa 64, there were no problems with the
nbcon thread migrating NUMA nodes and no panic(s) with
kernel.softlockup_panic = 1.

This was observed on an nproc == 128 machine.

Thanks! :-) 

Acked-by: Derek Barbosa <debarbos@redhat.com>
Tested-by: Derek Barbosa <debarbos@redhat.com>