Using the PowerPC P2040 (e500mc) CPU, soft lockups can occasionally be
seen in smp_call_function_many_cond(). The conclusion is that this CPU
does not process the doorbell interrupt while in a data-storage (MMU)
exception. If more than one CPU in a multi core environment is calling
this function at the same time, it is possible for a deadlock to occur.
The fix for this is to call flush_smp_call_function_queue() before
waiting for responses from other CPUs. If there is something in the
queue, this is a good time to process it before busy-waiting on other
CPUs. On other architectures this call will quickly do nothing, as the
queue will be empty.
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
---
kernel/smp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/smp.c b/kernel/smp.c
index a0bb56bd8dda..3c4467654ab0 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -884,6 +884,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		local_irq_restore(flags);
 	}
 
+	flush_smp_call_function_queue();
+
 	if (run_remote && wait) {
 		for_each_cpu(cpu, cfd->cpumask) {
 			call_single_data_t *csd;
--
2.54.0
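The scenario in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: every model_* name, the fixed-size queue, and the spin limit standing in for the soft-lockup watchdog are hypothetical, and the "CPUs" are stepped by a single thread rather than running concurrently. It models two CPUs that each queue a request on the other and then busy-wait, with and without draining their own queue first.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS    2
#define MODEL_QUEUE_LEN  4
#define MODEL_SPIN_LIMIT 1000	/* stand-in for the soft-lockup watchdog */

/* Hypothetical per-CPU queue of pending cross-calls (not the kernel's). */
struct model_cpu {
	bool *pending[MODEL_QUEUE_LEN];	/* completion flags of queued requests */
	int nr_pending;
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];

/* Queue a request on @target; *@done is set once @target has run it. */
static void model_enqueue(int target, bool *done)
{
	assert(target >= 0 && target < MODEL_NR_CPUS);
	assert(model_cpus[target].nr_pending < MODEL_QUEUE_LEN);
	model_cpus[target].pending[model_cpus[target].nr_pending++] = done;
}

/* Model analogue of flush_smp_call_function_queue(): run all queued requests. */
static void model_flush(int cpu)
{
	struct model_cpu *c = &model_cpus[cpu];
	int i;

	for (i = 0; i < c->nr_pending; i++)
		*c->pending[i] = true;
	c->nr_pending = 0;
}

/*
 * Both CPUs send a request to each other, then wait for completion.
 * @takes_doorbell: whether a waiting CPU still services the interrupt.
 * @flush_first:    whether each CPU drains its own queue before waiting.
 * Returns true if both waits finish, false if the spin limit is reached.
 */
bool model_cross_call(bool takes_doorbell, bool flush_first)
{
	bool done[MODEL_NR_CPUS] = { false, false };
	int cpu, spin;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_cpus[cpu].nr_pending = 0;

	/* Each CPU queues work on the other one. */
	model_enqueue(1, &done[0]);
	model_enqueue(0, &done[1]);

	if (flush_first)
		for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			model_flush(cpu);

	/* Busy-wait until both requests have completed. */
	for (spin = 0; spin < MODEL_SPIN_LIMIT; spin++) {
		if (done[0] && done[1])
			return true;
		if (takes_doorbell)
			for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
				model_flush(cpu);
	}
	return false;
}
```

In this model, model_cross_call(false, false) hits the spin limit, while model_cross_call(true, false) and model_cross_call(false, true) both complete. The model fixes the order of enqueue and flush, so it says nothing about requests that arrive after a CPU has already flushed.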
On Wed, 2026-05-27 at 15:16 +1200, Mark Tomlinson wrote:
> Using the PowerPC P2040 (e500mc) CPU, soft lockups can occasionally be
> seen in smp_call_function_many_cond(). The conclusion is that this CPU
> does not process the doorbell interrupt while in a data-storage (MMU)
> exception. If more than one CPU in a multi core environment is calling
> this function at the same time, it is possible for a deadlock to occur.

Does that mean if the CPU in question does not call
smp_call_function_many_cond() while in a data-storage exception,
the system might still hang?

Not that there's anything wrong with reducing the frequency of
what is (presumably) an already pretty rare hang.

> The fix for this is to call flush_smp_call_function_queue() before
> waiting for responses from other CPUs. If there is something in the
> queue, this is a good time to process it before busy-waiting on other
> CPUs. On other architectures this call will quickly do nothing, as the
> queue will be empty.

Agreed, this does look completely harmless at worst, and does
look like it would at the very least improve that e500mc issue.

> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

Reviewed-by: Rik van Riel <riel@surriel.com>

-- 
All Rights Reversed.
(adding others suggested by get_maintainer.pl)
On 27/05/2026 15:16, Mark Tomlinson wrote:
> Using the PowerPC P2040 (e500mc) CPU, soft lockups can occasionally be
> seen in smp_call_function_many_cond(). The conclusion is that this CPU
> does not process the doorbell interrupt while in a data-storage (MMU)
> exception. If more than one CPU in a multi core environment is calling
> this function at the same time, it is possible for a deadlock to occur.
>
> The fix for this is to call flush_smp_call_function_queue() before
> waiting for responses from other CPUs. If there is something in the
> queue, this is a good time to process it before busy-waiting on other
> CPUs. On other architectures this call will quickly do nothing, as the
> queue will be empty.
>
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> ---
> kernel/smp.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index a0bb56bd8dda..3c4467654ab0 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -884,6 +884,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> local_irq_restore(flags);
> }
>
> + flush_smp_call_function_queue();
> +
> if (run_remote && wait) {
> for_each_cpu(cpu, cfd->cpumask) {
> call_single_data_t *csd;
On Tue, Jun 02, 2026 at 09:18:39PM +0000, Chris Packham wrote:
> (adding others suggested by get_maintainer.pl)
>
> On 27/05/2026 15:16, Mark Tomlinson wrote:
> > Using the PowerPC P2040 (e500mc) CPU, soft lockups can occasionally be
> > seen in smp_call_function_many_cond(). The conclusion is that this CPU
> > does not process the doorbell interrupt while in a data-storage (MMU)
> > exception. If more than one CPU in a multi core environment is calling
> > this function at the same time, it is possible for a deadlock to occur.
> >
> > The fix for this is to call flush_smp_call_function_queue() before
> > waiting for responses from other CPUs. If there is something in the
> > queue, this is a good time to process it before busy-waiting on other
> > CPUs. On other architectures this call will quickly do nothing, as the
> > queue will be empty.

OK, I will bite...

How do we know that another entry will not get added by some other CPU
just before this new call to flush_smp_call_function_queue()?

Thanx, Paul

> > Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> > ---
> > kernel/smp.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index a0bb56bd8dda..3c4467654ab0 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -884,6 +884,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> > local_irq_restore(flags);
> > }
> >
> > + flush_smp_call_function_queue();
> > +
> > if (run_remote && wait) {
> > for_each_cpu(cpu, cfd->cpumask) {
> > call_single_data_t *csd;