[PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure

Roger Pau Monne posted 4 patches 5 months, 1 week ago
There is a newer version of this series
[PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Roger Pau Monne 5 months, 1 week ago
Print the CPU and APIC ID that fails to respond to the init sequence, or
that didn't manage to reach the "callin" state.  Expand a bit the printed
error messages.  Otherwise the "Not responding." message is not easy to
understand by users.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Also print APIC ID.
---
 xen/arch/x86/smpboot.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 0189d6c332a4..dbc2f2f1d411 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
             smp_mb();
             if ( bootsym(trampoline_cpu_started) == 0xA5 )
                 /* trampoline started but...? */
-                printk("Stuck ??\n");
+                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
+                       cpu, apicid);
             else
                 /* trampoline code not run */
-                printk("Not responding.\n");
+                printk("CPU%u/APICID%u: Not responding to startup\n",
+                       cpu, apicid);
         }
     }
 
-- 
2.49.0


Re: [PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Jan Beulich 5 months, 1 week ago
On 22.05.2025 09:54, Roger Pau Monne wrote:
> Print the CPU and APIC ID that fails to respond to the init sequence, or
> that didn't manage to reach the "callin" state.  Expand a bit the printed
> error messages.  Otherwise the "Not responding." message is not easy to
> understand by users.
> 
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Also print APIC ID.
> ---
>  xen/arch/x86/smpboot.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 0189d6c332a4..dbc2f2f1d411 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
>              smp_mb();
>              if ( bootsym(trampoline_cpu_started) == 0xA5 )
>                  /* trampoline started but...? */
> -                printk("Stuck ??\n");
> +                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
> +                       cpu, apicid);
>              else
>                  /* trampoline code not run */
> -                printk("Not responding.\n");
> +                printk("CPU%u/APICID%u: Not responding to startup\n",
> +                       cpu, apicid);
>          }
>      }
>  

Elsewhere I think we print AIC IDs in hex; may be better to do so here, too.
That may then want some text re-arrangement though, e.g.

"CPU%u: APICID %#x not responding to startup\n"

Thoughts?

Jan

Re: [PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Andrew Cooper 5 months, 1 week ago
On 22/05/2025 10:10 am, Jan Beulich wrote:
> On 22.05.2025 09:54, Roger Pau Monne wrote:
>> Print the CPU and APIC ID that fails to respond to the init sequence, or
>> that didn't manage to reach the "callin" state.  Expand a bit the printed
>> error messages.  Otherwise the "Not responding." message is not easy to
>> understand by users.
>>
>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>> Changes since v1:
>>  - Also print APIC ID.
>> ---
>>  xen/arch/x86/smpboot.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 0189d6c332a4..dbc2f2f1d411 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
>>              smp_mb();
>>              if ( bootsym(trampoline_cpu_started) == 0xA5 )
>>                  /* trampoline started but...? */
>> -                printk("Stuck ??\n");
>> +                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
>> +                       cpu, apicid);
>>              else
>>                  /* trampoline code not run */
>> -                printk("Not responding.\n");
>> +                printk("CPU%u/APICID%u: Not responding to startup\n",
>> +                       cpu, apicid);
>>          }
>>      }
>>  
> Elsewhere I think we print AIC IDs in hex; may be better to do so here, too.
> That may then want some text re-arrangement though, e.g.
>
> "CPU%u: APICID %#x not responding to startup\n"
>
> Thoughts?

Definitely hex.  Elsewhere APIC_ID always has an underscore.

I'd suggest:

"APIC_ID %#x (CPU%u) didn't respond to SIPI\n"

The APIC_ID is the critical detail, and the CPU number is fairly incidental.

Also as we're changing things, lets not retain the STARTUP name seeing
as it only exists on pre-Pentium APICs.  SIPI is what we use these days.

~Andrew

Re: [PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Roger Pau Monné 5 months, 1 week ago
On Thu, May 22, 2025 at 03:39:57PM +0100, Andrew Cooper wrote:
> On 22/05/2025 10:10 am, Jan Beulich wrote:
> > On 22.05.2025 09:54, Roger Pau Monne wrote:
> >> Print the CPU and APIC ID that fails to respond to the init sequence, or
> >> that didn't manage to reach the "callin" state.  Expand a bit the printed
> >> error messages.  Otherwise the "Not responding." message is not easy to
> >> understand by users.
> >>
> >> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >> ---
> >> Changes since v1:
> >>  - Also print APIC ID.
> >> ---
> >>  xen/arch/x86/smpboot.c | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> >> index 0189d6c332a4..dbc2f2f1d411 100644
> >> --- a/xen/arch/x86/smpboot.c
> >> +++ b/xen/arch/x86/smpboot.c
> >> @@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
> >>              smp_mb();
> >>              if ( bootsym(trampoline_cpu_started) == 0xA5 )
> >>                  /* trampoline started but...? */
> >> -                printk("Stuck ??\n");
> >> +                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
> >> +                       cpu, apicid);
> >>              else
> >>                  /* trampoline code not run */
> >> -                printk("Not responding.\n");
> >> +                printk("CPU%u/APICID%u: Not responding to startup\n",
> >> +                       cpu, apicid);
> >>          }
> >>      }
> >>  
> > Elsewhere I think we print AIC IDs in hex; may be better to do so here, too.
> > That may then want some text re-arrangement though, e.g.
> >
> > "CPU%u: APICID %#x not responding to startup\n"
> >
> > Thoughts?
> 
> Definitely hex.  Elsewhere APIC_ID always has an underscore.

Maybe I'm confused, but I don't think Xen uses an underscore, it's
always 'APIC ID' when printed.  I don't mind adding it here, I assume
what you mean with elsewhere is other projects like Linux?

Thanks, Roger.

Re: [PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Andrew Cooper 5 months, 1 week ago
On 22/05/2025 5:50 pm, Roger Pau Monné wrote:
> On Thu, May 22, 2025 at 03:39:57PM +0100, Andrew Cooper wrote:
>> On 22/05/2025 10:10 am, Jan Beulich wrote:
>>> On 22.05.2025 09:54, Roger Pau Monne wrote:
>>>> Print the CPU and APIC ID that fails to respond to the init sequence, or
>>>> that didn't manage to reach the "callin" state.  Expand a bit the printed
>>>> error messages.  Otherwise the "Not responding." message is not easy to
>>>> understand by users.
>>>>
>>>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>> ---
>>>> Changes since v1:
>>>>  - Also print APIC ID.
>>>> ---
>>>>  xen/arch/x86/smpboot.c | 6 ++++--
>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>>> index 0189d6c332a4..dbc2f2f1d411 100644
>>>> --- a/xen/arch/x86/smpboot.c
>>>> +++ b/xen/arch/x86/smpboot.c
>>>> @@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
>>>>              smp_mb();
>>>>              if ( bootsym(trampoline_cpu_started) == 0xA5 )
>>>>                  /* trampoline started but...? */
>>>> -                printk("Stuck ??\n");
>>>> +                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
>>>> +                       cpu, apicid);
>>>>              else
>>>>                  /* trampoline code not run */
>>>> -                printk("Not responding.\n");
>>>> +                printk("CPU%u/APICID%u: Not responding to startup\n",
>>>> +                       cpu, apicid);
>>>>          }
>>>>      }
>>>>  
>>> Elsewhere I think we print AIC IDs in hex; may be better to do so here, too.
>>> That may then want some text re-arrangement though, e.g.
>>>
>>> "CPU%u: APICID %#x not responding to startup\n"
>>>
>>> Thoughts?
>> Definitely hex.  Elsewhere APIC_ID always has an underscore.
> Maybe I'm confused, but I don't think Xen uses an underscore, it's
> always 'APIC ID' when printed.  I don't mind adding it here, I assume
> what you mean with elsewhere is other projects like Linux?

It's apic_id in plenty of smpboot.c, but fine - lets do it with a space.

We do need to reduce from $N ways of rendering this down to 1.

~Andrew

Re: [PATCH v2 1/4] x86/boot: print CPU and APIC ID in bring up failure
Posted by Roger Pau Monné 5 months, 1 week ago
On Thu, May 22, 2025 at 06:22:19PM +0100, Andrew Cooper wrote:
> On 22/05/2025 5:50 pm, Roger Pau Monné wrote:
> > On Thu, May 22, 2025 at 03:39:57PM +0100, Andrew Cooper wrote:
> >> On 22/05/2025 10:10 am, Jan Beulich wrote:
> >>> On 22.05.2025 09:54, Roger Pau Monne wrote:
> >>>> Print the CPU and APIC ID that fails to respond to the init sequence, or
> >>>> that didn't manage to reach the "callin" state.  Expand a bit the printed
> >>>> error messages.  Otherwise the "Not responding." message is not easy to
> >>>> understand by users.
> >>>>
> >>>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>> ---
> >>>> Changes since v1:
> >>>>  - Also print APIC ID.
> >>>> ---
> >>>>  xen/arch/x86/smpboot.c | 6 ++++--
> >>>>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> >>>> index 0189d6c332a4..dbc2f2f1d411 100644
> >>>> --- a/xen/arch/x86/smpboot.c
> >>>> +++ b/xen/arch/x86/smpboot.c
> >>>> @@ -618,10 +618,12 @@ static int do_boot_cpu(int apicid, int cpu)
> >>>>              smp_mb();
> >>>>              if ( bootsym(trampoline_cpu_started) == 0xA5 )
> >>>>                  /* trampoline started but...? */
> >>>> -                printk("Stuck ??\n");
> >>>> +                printk("CPU%u/APICID%u: Didn't finish startup sequence\n",
> >>>> +                       cpu, apicid);
> >>>>              else
> >>>>                  /* trampoline code not run */
> >>>> -                printk("Not responding.\n");
> >>>> +                printk("CPU%u/APICID%u: Not responding to startup\n",
> >>>> +                       cpu, apicid);
> >>>>          }
> >>>>      }
> >>>>  
> >>> Elsewhere I think we print AIC IDs in hex; may be better to do so here, too.
> >>> That may then want some text re-arrangement though, e.g.
> >>>
> >>> "CPU%u: APICID %#x not responding to startup\n"
> >>>
> >>> Thoughts?
> >> Definitely hex.  Elsewhere APIC_ID always has an underscore.
> > Maybe I'm confused, but I don't think Xen uses an underscore, it's
> > always 'APIC ID' when printed.  I don't mind adding it here, I assume
> > what you mean with elsewhere is other projects like Linux?
> 
> It's apic_id in plenty of smpboot.c, but fine - lets do it with a space.
> 
> We do need to reduce from $N ways of rendering this down to 1.

Oh, so it's lowercase with underscore.  Anyway, will use uppercase and
space as agreed.

Thanks, Roger.