It used to be called from smp_callin(), however BUG_ON() was invoked on
multiple occasions before that. It may end up calling machine_restart()
which tries to get APIC ID for CPU running this code. If BSP detected
that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
Enabling x2APIC on secondary CPUs earlier protects against an endless
loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
MSR while x2APIC is disabled in IA32_APIC_BASE.
Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
---
xen/arch/x86/smpboot.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8ae65ab1769f..a3895dafa267 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -184,7 +184,6 @@ static void smp_callin(void)
* update until we finish. We are free to set up this CPU: first the APIC.
*/
Dprintk("CALLIN, before setup_local_APIC().\n");
- x2apic_ap_setup();
setup_local_APIC(false);
/* Save our processor parameters. */
@@ -351,6 +350,14 @@ void start_secondary(void *unused)
get_cpu_info()->xen_cr3 = 0;
get_cpu_info()->pv_cr3 = 0;
+ /*
+ * BUG_ON() used in load_system_tables() and later code may end up calling
+ * machine_restart() which tries to get APIC ID for CPU running this code.
+ * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
+ * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
+ * with endless #GP loop.
+ */
+ x2apic_ap_setup();
load_system_tables();
/* Full exception support from here on in. */
--
2.41.0
On 14.11.2023 18:50, Krystian Hebel wrote:
> It used to be called from smp_callin(), however BUG_ON() was invoked on
> multiple occasions before that. It may end up calling machine_restart()
> which tries to get APIC ID for CPU running this code. If BSP detected
> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
> Enabling x2APIC on secondary CPUs earlier protects against an endless
> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
> MSR while x2APIC is disabled in IA32_APIC_BASE.
>
> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
> ---
> xen/arch/x86/smpboot.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 8ae65ab1769f..a3895dafa267 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -184,7 +184,6 @@ static void smp_callin(void)
> * update until we finish. We are free to set up this CPU: first the APIC.
> */
> Dprintk("CALLIN, before setup_local_APIC().\n");
> - x2apic_ap_setup();
> setup_local_APIC(false);
>
> /* Save our processor parameters. */
> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
> get_cpu_info()->xen_cr3 = 0;
> get_cpu_info()->pv_cr3 = 0;
>
> + /*
> + * BUG_ON() used in load_system_tables() and later code may end up calling
> + * machine_restart() which tries to get APIC ID for CPU running this code.
> + * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
> + * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
> + * with endless #GP loop.
> + */
> + x2apic_ap_setup();
> load_system_tables();
While I find the argument convincing, I seem to recall that there was a
firm plan to have load_system_tables() as early as possible. Andrew?
Jan
On 7.02.2024 18:02, Jan Beulich wrote:
> On 14.11.2023 18:50, Krystian Hebel wrote:
>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>> multiple occasions before that. It may end up calling machine_restart()
>> which tries to get APIC ID for CPU running this code. If BSP detected
>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>
>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>> ---
>> xen/arch/x86/smpboot.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 8ae65ab1769f..a3895dafa267 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>> * update until we finish. We are free to set up this CPU: first the APIC.
>> */
>> Dprintk("CALLIN, before setup_local_APIC().\n");
>> - x2apic_ap_setup();
>> setup_local_APIC(false);
>>
>> /* Save our processor parameters. */
>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>> get_cpu_info()->xen_cr3 = 0;
>> get_cpu_info()->pv_cr3 = 0;
>>
>> + /*
>> + * BUG_ON() used in load_system_tables() and later code may end up calling
>> + * machine_restart() which tries to get APIC ID for CPU running this code.
>> + * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>> + * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>> + * with endless #GP loop.
>> + */
>> + x2apic_ap_setup();
>> load_system_tables();
> While I find the argument convincing, I seem to recall that there was a
> firm plan to have load_system_tables() as early as possible. Andrew?
This is where the code failed for me during testing. How about moving
x2apic_ap_setup() into load_system_tables(), just before the BUG_ON()? Or maybe
move those BUG_ON()s one level higher, after load_system_tables() returns?
Either way some code will end up in a place where it doesn't belong, but I'd
argue that BUG_ON() is only useful if it doesn't itself crash.
>
> Jan
--
Krystian Hebel
Firmware Engineer
https://3mdeb.com | @3mdeb_com
On 12.03.2024 17:02, Krystian Hebel wrote:
>
> On 7.02.2024 18:02, Jan Beulich wrote:
>> On 14.11.2023 18:50, Krystian Hebel wrote:
>>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>>> multiple occasions before that. It may end up calling machine_restart()
>>> which tries to get APIC ID for CPU running this code. If BSP detected
>>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>>
>>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>>> ---
>>> xen/arch/x86/smpboot.c | 9 ++++++++-
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>> index 8ae65ab1769f..a3895dafa267 100644
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>>> * update until we finish. We are free to set up this CPU: first the APIC.
>>> */
>>> Dprintk("CALLIN, before setup_local_APIC().\n");
>>> - x2apic_ap_setup();
>>> setup_local_APIC(false);
>>>
>>> /* Save our processor parameters. */
>>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>>> get_cpu_info()->xen_cr3 = 0;
>>> get_cpu_info()->pv_cr3 = 0;
>>>
>>> + /*
>>> + * BUG_ON() used in load_system_tables() and later code may end up calling
>>> + * machine_restart() which tries to get APIC ID for CPU running this code.
>>> + * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>>> + * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>>> + * with endless #GP loop.
>>> + */
>>> + x2apic_ap_setup();
>>> load_system_tables();
>> While I find the argument convincing, I seem to recall that there was a
>> firm plan to have load_system_tables() as early as possible. Andrew?
> This is where the code failed for me during testing. How about moving
> x2apic_ap_setup() into load_system_tables(),
How does a call to x2apic_ap_setup() fit in a function named
load_system_tables()?
> just before BUG_ON? Or maybe
> move those BUG_ON one level higher, after load_system_tables() returns?
But they're there for a reason.
> Either way some code will end up in place it doesn't belong, but I'd
> argue that BUG_ON is only useful if it itself doesn't crash.
I guess I don't understand this: That BUG_ON() is already guarded by a
system_state check, to prevent it uselessly hanging the system.
In any event - besides you still wanting to get input from Andrew, it
ought to be clear that anything unusual / unexpected will require extra
justification in the description.
Jan
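
The failure mode discussed in this thread can be modelled with a small
stand-alone sketch. It is a toy model, not Xen code: every toy_* name is
hypothetical, the #GP is simulated by a return value, and the retry cap stands
in for what the patch description calls an endless loop. Only the EXTD bit
position in IA32_APIC_BASE is taken from the architecture. The sketch assumes,
as the patch description states, that a single BSP-set flag decides which APIC
access method is used on every CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_APIC_BASE.EXTD (bit 10): x2APIC mode enabled on this CPU. */
#define APIC_BASE_EXTD (1u << 10)

/* Toy per-CPU state; these are not Xen's real data structures. */
struct toy_cpu {
    uint32_t apic_base;     /* low bits of this CPU's IA32_APIC_BASE */
    uint32_t apic_id;       /* value the APIC ID read would return */
    unsigned int gp_faults; /* number of simulated #GP exceptions */
};

/* Set once by the (simulated) BSP and consulted for every CPU. */
static bool toy_x2apic_enabled;

/*
 * Simulated read of IA32_X2APIC_APICID. Returns false, standing in for
 * #GP, when x2APIC is not enabled on the CPU doing the read.
 */
static bool toy_rdmsr_apicid(struct toy_cpu *cpu, uint32_t *val)
{
    assert(cpu && val);

    if ( !(cpu->apic_base & APIC_BASE_EXTD) )
    {
        cpu->gp_faults++;
        return false;
    }

    *val = cpu->apic_id;
    return true;
}

/* Stand-in for x2apic_ap_setup(): follow the BSP's mode on this CPU. */
static void toy_x2apic_ap_setup(struct toy_cpu *cpu)
{
    if ( toy_x2apic_enabled )
        cpu->apic_base |= APIC_BASE_EXTD;
}

/*
 * Stand-in for the APIC ID lookup on the BUG_ON() -> machine_restart()
 * path. Each failed read models a #GP whose handling attempts the same
 * read again; max_tries caps what would otherwise never terminate.
 */
static bool toy_get_apic_id(struct toy_cpu *cpu, uint32_t *id,
                            unsigned int max_tries)
{
    unsigned int i;

    if ( !toy_x2apic_enabled )
    {
        /* xAPIC mode: the MSR is not used (MMIO read in real hardware). */
        *id = cpu->apic_id;
        return true;
    }

    for ( i = 0; i < max_tries; i++ )
        if ( toy_rdmsr_apicid(cpu, id) )
            return true;

    return false;
}
```

With toy_x2apic_enabled set and a freshly started AP (EXTD clear),
toy_get_apic_id() faults on every attempt; once toy_x2apic_ap_setup() has run
on that AP, the same call succeeds. This is the ordering dependency the patch
addresses by calling x2apic_ap_setup() before load_system_tables().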