The main code change over v11 is the build error fix by Brian Gerst and
acquiring tr_lock in trampoline_64.S whenever the stack is set up.
The git history is also rewritten to move the commits that removed
initial_stack, early_gdt_descr and initial_gs earlier in the patchset.
Thanks,
Usama
Changes across versions:
v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
in preparation for more parallelisation.
v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
avoid scribbling on initial_gs in common_cpu_up(), and to allow all
24 bits of the physical X2APIC ID to be used. That patch still needs
a Signed-off-by from its original author, who once claimed not to
remember writing it at all. But now we've fixed it, hopefully he'll
admit it now :)
v5: rebase to v6.1 and remeasure performance, disable parallel bringup
for AMD CPUs.
v6: rebase to v6.2-rc6, disabled parallel boot on amd as a cpu bug and
reused timer calibration for secondary CPUs.
v7: [David Woodhouse] iterate over all possible CPUs to find any existing
cluster mask in alloc_clustermask. (patch 1/9)
Keep parallel AMD support enabled in AMD, using APIC ID in CPUID leaf
0x0B (for x2APIC mode) or CPUID leaf 0x01 where 8 bits are sufficient.
Included sanity checks for APIC id from 0x0B. (patch 6/9)
Removed patch for reusing timer calibration for secondary CPUs.
commit message and code improvements.
v8: Fix CPU0 hotplug by setting up the initial_gs, initial_stack and
early_gdt_descr.
Drop trampoline lock and bail if APIC ID not found in find_cpunr.
Code comments improved and debug prints added.
v9: Drop patch to avoid repeated saves of MTRR at boot time.
rebased and retested at v6.2-rc8.
added kernel doc for no_parallel_bringup and made do_parallel_bringup
__ro_after_init.
v10: Fixed suspend/resume not working with parallel smpboot.
rebased and retested to 6.2.
fixed checkpatch errors.
v11: Added patches from Brian Gerst to remove the global variables initial_gs,
initial_stack, and early_gdt_descr from the 64-bit boot code
(https://lore.kernel.org/all/20230222221301.245890-1-brgerst@gmail.com/).
v12: Fixed compilation errors, acquire tr_lock for every stack setup in
trampoline_64.S.
Rearranged commits for a cleaner git history.
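
For readers unfamiliar with the two CPUID sources mentioned in the v7 entry, here is a minimal user-space sketch in C. It is not the assembly from this series; the helper names (apic_id_from_leaf01, apic_id_from_leaf0b, read_apic_id) are hypothetical, and it assumes a GCC or Clang toolchain that provides <cpuid.h>. Leaf 0x01 carries an 8-bit initial APIC ID in EBX[31:24], while leaf 0x0B returns the 32-bit x2APIC ID in EDX.

```c
#include <stdint.h>

/* CPUID leaf 0x01: the initial APIC ID is in EBX bits 31:24 (8 bits only). */
static inline uint32_t apic_id_from_leaf01(uint32_t ebx)
{
	return ebx >> 24;
}

/* CPUID leaf 0x0B: the full 32-bit x2APIC ID is returned in EDX. */
static inline uint32_t apic_id_from_leaf0b(uint32_t edx)
{
	return edx;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/*
 * Hypothetical helper: read the APIC ID of the CPU this code runs on,
 * preferring leaf 0x0B when it is present. Returns 0 on success.
 */
static inline int read_apic_id(uint32_t *id)
{
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 0x0B is only usable if subleaf 0 reports a non-zero EBX. */
	if (__get_cpuid_count(0x0b, 0, &eax, &ebx, &ecx, &edx) && ebx != 0) {
		*id = apic_id_from_leaf0b(edx);
		return 0;
	}
	/* Fall back to the 8-bit ID from leaf 0x01. */
	if (__get_cpuid(0x01, &eax, &ebx, &ecx, &edx)) {
		*id = apic_id_from_leaf01(ebx);
		return 0;
	}
	return -1;
}
#endif
```

The 8-bit field is why leaf 0x01 is only sufficient on systems whose APIC IDs fit in 8 bits.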
Brian Gerst (3):
x86/smpboot: Remove initial_stack on 64-bit
x86/smpboot: Remove early_gdt_descr on 64-bit
x86/smpboot: Remove initial_gs
David Woodhouse (8):
x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
cpu/hotplug: Add dynamic parallel bringup states before
CPUHP_BRINGUP_CPU
x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
x86/smpboot: Split up native_cpu_up into separate phases and document
them
x86/smpboot: Support parallel startup of secondary CPUs
x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel
x86/smpboot: Serialize topology updates for secondary bringup
.../admin-guide/kernel-parameters.txt | 3 +
arch/x86/include/asm/processor.h | 6 +-
arch/x86/include/asm/realmode.h | 4 +-
arch/x86/include/asm/smp.h | 15 +-
arch/x86/include/asm/topology.h | 2 -
arch/x86/kernel/acpi/sleep.c | 15 +-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/x2apic_cluster.c | 126 ++++---
arch/x86/kernel/asm-offsets.c | 1 +
arch/x86/kernel/cpu/common.c | 6 +-
arch/x86/kernel/head_64.S | 129 +++++--
arch/x86/kernel/smpboot.c | 350 +++++++++++++-----
arch/x86/realmode/init.c | 3 +
arch/x86/realmode/rm/trampoline_64.S | 27 +-
arch/x86/xen/smp_pv.c | 4 +-
arch/x86/xen/xen-head.S | 2 +-
include/linux/cpuhotplug.h | 2 +
include/linux/smpboot.h | 7 +
kernel/cpu.c | 31 +-
kernel/smpboot.h | 2 -
20 files changed, 537 insertions(+), 200 deletions(-)
--
2.25.1
Hi Usama and David, thanks for the great series!

I've tested it on Steam Deck (with and without the "no_parallel_bringup"
parameter), it works fine - also tested S3/deep sleep-resume cycle.

Feel free to add (to the series):
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

Also, just taking the opportunity since I'm already replying here: on
patch 09, found two typos:

s/correect/correct (commit message)
s/brinugp/bring-up (kernel-parameters.txt)

Cheers,

Guilherme
Dear Guilherme,

On 27.02.23 at 22:39, Guilherme G. Piccoli wrote:

> I've tested it on Steam Deck (with and without the "no_parallel_bringup"
> parameter), it works fine - also tested S3/deep sleep-resume cycle.
>
> Feel free to add (to the series):
> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

Thank you for testing the series. It’d be great if you could share the
timing differences.

[…]

Kind regards,

Paul
On 28/02/2023 07:13, Paul Menzel wrote:
> Dear Guilherme,
>
> On 27.02.23 at 22:39, Guilherme G. Piccoli wrote:
>
>> I've tested it on Steam Deck (with and without the "no_parallel_bringup"
>> parameter), it works fine - also tested S3/deep sleep-resume cycle.
>>
>> Feel free to add (to the series):
>> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
>
> Thank you for testing the series. It’d be great if you could share the
> timing differences.
>
> […]
>
> Kind regards,
>
> Paul

Hi Paul! The results... weren't so great, I felt no difference, heh.
Which is also not bad: it seems the series favors big SMP systems, and
the Deck has only 8 CPUs.

But maybe the way I measured is not ideal? I just compared timestamps in
dmesg from the first SMP message up to the one that says the boot of
secondary CPUs is complete.

Do you have a better suggestion? I can try things here.

Cheers,

Guilherme
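
The measurement described above (the difference between two dmesg timestamps) can be sketched in C as follows. This is only an illustration: the function names are hypothetical, and it assumes dmesg lines carry the usual "[ seconds.micros]" prefix. The marker strings are passed in by the caller, since the exact wording of the SMP messages may differ between kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* Parse the "[   12.345678]" prefix of a dmesg line; returns 0 on success. */
static inline int dmesg_timestamp(const char *line, double *secs)
{
	return sscanf(line, " [ %lf", secs) == 1 ? 0 : -1;
}

/*
 * Return the time between the first line containing `start` and the first
 * later line containing `end`, or -1.0 if either marker is not found.
 */
static inline double bringup_seconds(const char *const *lines, size_t n,
				     const char *start, const char *end)
{
	double t0 = 0.0, t1 = 0.0;
	int have_start = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!have_start) {
			if (strstr(lines[i], start) &&
			    dmesg_timestamp(lines[i], &t0) == 0)
				have_start = 1;
		} else if (strstr(lines[i], end) &&
			   dmesg_timestamp(lines[i], &t1) == 0) {
			return t1 - t0;
		}
	}
	return -1.0;
}
```

A caller would pass the captured log lines together with markers such as "Bringing up secondary CPUs" and "Brought up" (assumed message text; check it against the actual dmesg output). Comparing the result with and without "no_parallel_bringup" over several boots gives a rough picture; on small systems the difference may be within noise.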
On Mon, 2023-02-27 at 18:39 -0300, Guilherme G. Piccoli wrote:
> Hi Usama and David, thanks for the great series!
>
> I've tested it on Steam Deck (with and without the
> "no_parallel_bringup"
> parameter), it works fine - also tested S3/deep sleep-resume cycle.
>
> Feel free to add (to the series):
> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
>
> Also, just taking the opportunity since I'm already replying here: on
> patch 09, found two typos:
>
> s/correect/correct (commit message)
> s/brinugp/bring-up (kernel-parameters.txt)
>
> Cheers,

Thanks. I've done that and pushed it out to
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/parallel-6.2-rc8-v12bis
ready for the next round.
Hello.

On Sunday, 26 February 2023 12:07:51 CET Usama Arif wrote:
> The main code change over v11 is the build error fix by Brian Gerst and
> acquiring tr_lock in trampoline_64.S whenever the stack is setup.
>
> [...]

With `CONFIG_FORCE_NR_CPUS=y` this results in:

```
ld: vmlinux.o: in function `secondary_startup_64_no_verify':
(.head.text+0x10c): undefined reference to `nr_cpu_ids'
```

That's because in `arch/x86/kernel/head_64.S` `secondary_startup_64_no_verify()` refers to `nr_cpu_ids` under `#ifdef CONFIG_SMP`, but this symbol is available under the following conditions:

```
/* include/linux/cpumask.h */
#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
extern unsigned int nr_cpu_ids;
#endif

/* kernel/smp.c */
#if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
/* Setup number of possible processor ids */
unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
EXPORT_SYMBOL(nr_cpu_ids);
#endif
```

So having `CONFIG_SMP=y` and, for instance, `CONFIG_NR_CPUS=320`, it is possible to compile out `EXPORT_SYMBOL(nr_cpu_ids);` if `CONFIG_FORCE_NR_CPUS=y` is set.

-- 
Oleksandr Natalenko (post-factum)
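
The failure mode can be reproduced in miniature with plain C. The sketch below is an illustration only: the two `#define` lines at the top are hypothetical stand-ins for the Kconfig values, and the helper functions do not exist in the kernel. With the forced configuration, `nr_cpu_ids` is just a preprocessor constant, so no object of that name is ever emitted, and anything that references the symbol by name at link time (as assembly does) cannot be resolved.

```c
/* Hypothetical stand-ins for the Kconfig values discussed above. */
#define NR_CPUS 320
#define CONFIG_FORCE_NR_CPUS 1

#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
/* C code sees a compile-time constant; no object named nr_cpu_ids exists. */
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#define NR_CPU_IDS_IS_SYMBOL 0
#else
/* Only in this branch is nr_cpu_ids a real variable a linker can resolve. */
unsigned int nr_cpu_ids = NR_CPUS;
#define NR_CPU_IDS_IS_SYMBOL 1
#endif

/* Works in both configurations, because C goes through the macro or variable. */
static inline unsigned int cpu_id_limit(void)
{
	return nr_cpu_ids;
}

/* Reports whether a linkable nr_cpu_ids symbol exists in this build. */
static inline int nr_cpu_ids_is_symbol(void)
{
	return NR_CPU_IDS_IS_SYMBOL;
}
```

Removing the `CONFIG_FORCE_NR_CPUS` define flips the build to the variable form, which is the only case in which a symbol reference from assembly would link.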
On 26/02/2023 18:31, Oleksandr Natalenko wrote:
> Hello.
>
> On Sunday, 26 February 2023 12:07:51 CET Usama Arif wrote:
>> [...]
>
> With `CONFIG_FORCE_NR_CPUS=y` this results in:
>
> ```
> ld: vmlinux.o: in function `secondary_startup_64_no_verify':
> (.head.text+0x10c): undefined reference to `nr_cpu_ids'
> ```
>
> That's because in `arch/x86/kernel/head_64.S` `secondary_startup_64_no_verify()` refers to `nr_cpu_ids` under `#ifdef CONFIG_SMP`, but this symbol is available under the following conditions:
>
> ```
> /* include/linux/cpumask.h */
> #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
> #define nr_cpu_ids ((unsigned int)NR_CPUS)
> #else
> extern unsigned int nr_cpu_ids;
> #endif
>
> /* kernel/smp.c */
> #if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
> /* Setup number of possible processor ids */
> unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
> EXPORT_SYMBOL(nr_cpu_ids);
> #endif
> ```
>
> So having `CONFIG_SMP=y` and, for instance, `CONFIG_NR_CPUS=320`, it is possible to compile out `EXPORT_SYMBOL(nr_cpu_ids);` if `CONFIG_FORCE_NR_CPUS=y` is set.
>
I think something like the diff below should work in all scenarios?

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index c2aa0aa26b45..e3727dab9cab 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -35,7 +35,7 @@ typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
  */
 #define cpumask_pr_args(maskp) nr_cpu_ids, cpumask_bits(maskp)
-#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+#if ((NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)) && !defined(CONFIG_SMP)
 #define nr_cpu_ids ((unsigned int)NR_CPUS)
 #else
 extern unsigned int nr_cpu_ids;
@@ -43,7 +43,7 @@ extern unsigned int nr_cpu_ids;
 static inline void set_nr_cpu_ids(unsigned int nr)
 {
-#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+#if ((NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)) && !defined(CONFIG_SMP)
 	WARN_ON(nr != nr_cpu_ids);
 #else
 	nr_cpu_ids = nr;
diff --git a/kernel/smp.c b/kernel/smp.c
index 06a413987a14..a051b16d4a24 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1087,11 +1087,9 @@ static int __init maxcpus(char *str)
 early_param("maxcpus", maxcpus);
-#if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
 /* Setup number of possible processor ids */
 unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
-#endif
 /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
 void __init setup_nr_cpu_ids(void)
On 26 February 2023 20:59:17 GMT, Usama Arif <usama.arif@bytedance.com> wrote:
>On 26/02/2023 18:31, Oleksandr Natalenko wrote:
>> [...]
>> So having `CONFIG_SMP=y` and, for instance, `CONFIG_NR_CPUS=320`, it is possible to compile out `EXPORT_SYMBOL(nr_cpu_ids);` if `CONFIG_FORCE_NR_CPUS=y` is set.
>>
>
>I think something like below diff should work in all scenarios?

I'd've changed the asm side to use the constant limit.
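
As a rough C analogue of that suggestion (the real code in question is assembly, and this sketch is not taken from the series), the consumer of the limit picks the compile-time constant when the CPU count is forced and falls back to the runtime variable otherwise. The table argument, the `CPU_ID_LIMIT` macro and `find_cpu_number()` are hypothetical names, and `NR_CPUS` is set to 8 purely for illustration.

```c
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical value for illustration */

#ifdef CONFIG_FORCE_NR_CPUS
/* Forced configuration: the limit is a constant, no variable is needed. */
#define CPU_ID_LIMIT ((unsigned int)NR_CPUS)
#else
/* Default configuration: the limit lives in a variable set during boot. */
static unsigned int nr_cpu_ids = NR_CPUS;
#define CPU_ID_LIMIT nr_cpu_ids
#endif

/*
 * Look up the logical CPU number for an APIC ID in a CPU-number-indexed
 * table. Returns -1 if the APIC ID is not found below the limit, so the
 * caller can bail out instead of continuing with a bogus CPU number.
 */
static inline int find_cpu_number(const uint32_t *apicid_table,
				  uint32_t apic_id)
{
	unsigned int cpu;

	for (cpu = 0; cpu < CPU_ID_LIMIT; cpu++) {
		if (apicid_table[cpu] == apic_id)
			return (int)cpu;
	}
	return -1;
}
```

Keeping the choice at the point of use leaves the generic `nr_cpu_ids` definitions untouched, which is presumably why it is preferable to changing include/linux/cpumask.h and kernel/smp.c.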
On 27/02/2023 06:13, David Woodhouse wrote:
> On 26 February 2023 20:59:17 GMT, Usama Arif <usama.arif@bytedance.com> wrote:
>> [...]
>> I think something like below diff should work in all scenarios?
>
> I'd've changed the asm side to use the constant limit.

Yup, just needed the morning coffee :) Had sent the proper fix in
https://lore.kernel.org/all/5e8ad90a-1dc6-95c2-e020-5e95da6f9eda@bytedance.com/#t

I guess the diff over v12 (including the cosmetic changes) is still too
small to send out a new version so soon; probably better to wait a couple
of days in case something else comes up as well?

Thanks,
Usama
On 26/02/2023 20:59, Usama Arif wrote:
>
>
> On 26/02/2023 18:31, Oleksandr Natalenko wrote:
>> Hello.
>>
>> On Sunday, 26 February 2023 12:07:51 CET Usama Arif wrote:
>>> The main code change over v11 is the build error fix by Brian Gerst and
>>> acquiring tr_lock in trampoline_64.S whenever the stack is setup.
>>>
>>> The git history is also rewritten to move the commits that removed
>>> initial_stack, early_gdt_descr and initial_gs earlier in the patchset.
>>>
>>> Thanks,
>>> Usama
>>>
>>> Changes across versions:
>>> v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
>>> v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
>>> in preparation for more parallelisation.
>>> v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
>>> avoid scribbling on initial_gs in common_cpu_up(), and to allow all
>>> 24 bits of the physical X2APIC ID to be used. That patch still
>>> needs
>>> a Signed-off-by from its original author, who once claimed not to
>>> remember writing it at all. But now we've fixed it, hopefully he'll
>>> admit it now :)
>>> v5: rebase to v6.1 and remeasure performance, disable parallel bringup
>>> for AMD CPUs.
>>> v6: rebase to v6.2-rc6, disabled parallel boot on amd as a cpu bug and
>>> reused timer calibration for secondary CPUs.
>>> v7: [David Woodhouse] iterate over all possible CPUs to find any
>>> existing
>>> cluster mask in alloc_clustermask. (patch 1/9)
>>> Keep parallel AMD support enabled in AMD, using APIC ID in CPUID
>>> leaf
>>> 0x0B (for x2APIC mode) or CPUID leaf 0x01 where 8 bits are
>>> sufficient.
>>> Included sanity checks for APIC id from 0x0B. (patch 6/9)
>>> Removed patch for reusing timer calibration for secondary CPUs.
>>> commit message and code improvements.
>>> v8: Fix CPU0 hotplug by setting up the initial_gs, initial_stack and
>>> early_gdt_descr.
>>> Drop trampoline lock and bail if APIC ID not found in find_cpunr.
>>> Code comments improved and debug prints added.
>>> v9: Drop patch to avoid repeated saves of MTRR at boot time.
>>> rebased and retested at v6.2-rc8.
>>> added kernel doc for no_parallel_bringup and made do_parallel_bringup
>>> __ro_after_init.
>>> v10: Fixed suspend/resume not working with parallel smpboot.
>>> rebased and retested to 6.2.
>>> fixed checkpatch errors.
>>> v11: Added patches from Brian Gerst to remove the global variables initial_gs,
>>> initial_stack, and early_gdt_descr from the 64-bit boot code
>>> (https://lore.kernel.org/all/20230222221301.245890-1-brgerst@gmail.com/).
>>> v12: Fixed compilation errors, acquire tr_lock for every stack setup in
>>> trampoline_64.S.
>>> Rearranged commits for a cleaner git history.
>>>
>>> Brian Gerst (3):
>>> x86/smpboot: Remove initial_stack on 64-bit
>>> x86/smpboot: Remove early_gdt_descr on 64-bit
>>> x86/smpboot: Remove initial_gs
>>>
>>> David Woodhouse (8):
>>> x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
>>> cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
>>> cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU
>>> x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
>>> x86/smpboot: Split up native_cpu_up into separate phases and document them
>>> x86/smpboot: Support parallel startup of secondary CPUs
>>> x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel
>>> x86/smpboot: Serialize topology updates for secondary bringup
>>>
>>> .../admin-guide/kernel-parameters.txt | 3 +
>>> arch/x86/include/asm/processor.h | 6 +-
>>> arch/x86/include/asm/realmode.h | 4 +-
>>> arch/x86/include/asm/smp.h | 15 +-
>>> arch/x86/include/asm/topology.h | 2 -
>>> arch/x86/kernel/acpi/sleep.c | 15 +-
>>> arch/x86/kernel/apic/apic.c | 2 +-
>>> arch/x86/kernel/apic/x2apic_cluster.c | 126 ++++---
>>> arch/x86/kernel/asm-offsets.c | 1 +
>>> arch/x86/kernel/cpu/common.c | 6 +-
>>> arch/x86/kernel/head_64.S | 129 +++++--
>>> arch/x86/kernel/smpboot.c | 350 +++++++++++++-----
>>> arch/x86/realmode/init.c | 3 +
>>> arch/x86/realmode/rm/trampoline_64.S | 27 +-
>>> arch/x86/xen/smp_pv.c | 4 +-
>>> arch/x86/xen/xen-head.S | 2 +-
>>> include/linux/cpuhotplug.h | 2 +
>>> include/linux/smpboot.h | 7 +
>>> kernel/cpu.c | 31 +-
>>> kernel/smpboot.h | 2 -
>>> 20 files changed, 537 insertions(+), 200 deletions(-)
>>
>> With `CONFIG_FORCE_NR_CPUS=y` this results in:
>>
>> ```
>> ld: vmlinux.o: in function `secondary_startup_64_no_verify':
>> (.head.text+0x10c): undefined reference to `nr_cpu_ids'
>> ```
>>
>> That's because in `arch/x86/kernel/head_64.S`
>> `secondary_startup_64_no_verify()` refers to `nr_cpu_ids` under
>> `#ifdef CONFIG_SMP`, but this symbol is available under the following
>> conditions:
>>
>> ```
>> /* include/linux/cpumask.h */
>> #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
>> #define nr_cpu_ids ((unsigned int)NR_CPUS)
>> #else
>> extern unsigned int nr_cpu_ids;
>> #endif
>>
>> /* kernel/smp.c */
>> #if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
>> /* Setup number of possible processor ids */
>> unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
>> EXPORT_SYMBOL(nr_cpu_ids);
>> #endif
>> ```
>>
>> So with `CONFIG_SMP=y` and, for instance, `CONFIG_NR_CPUS=320`, setting
>> `CONFIG_FORCE_NR_CPUS=y` compiles out both the definition of `nr_cpu_ids`
>> and `EXPORT_SYMBOL(nr_cpu_ids);`, which leaves the reference in
>> `head_64.S` unresolved at link time.
>>
>
> I think something like below diff should work in all scenarios?
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index c2aa0aa26b45..e3727dab9cab 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -35,7 +35,7 @@ typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
> */
> #define cpumask_pr_args(maskp) nr_cpu_ids, cpumask_bits(maskp)
>
> -#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
> +#if ((NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)) && !defined(CONFIG_SMP)
> #define nr_cpu_ids ((unsigned int)NR_CPUS)
> #else
> extern unsigned int nr_cpu_ids;
> @@ -43,7 +43,7 @@ extern unsigned int nr_cpu_ids;
>
> static inline void set_nr_cpu_ids(unsigned int nr)
> {
> -#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
> +#if ((NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)) && !defined(CONFIG_SMP)
> WARN_ON(nr != nr_cpu_ids);
> #else
> nr_cpu_ids = nr;
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 06a413987a14..a051b16d4a24 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -1087,11 +1087,9 @@ static int __init maxcpus(char *str)
>
> early_param("maxcpus", maxcpus);
>
> -#if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
> /* Setup number of possible processor ids */
> unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
> EXPORT_SYMBOL(nr_cpu_ids);
> -#endif
>
> /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
> void __init setup_nr_cpu_ids(void)
Or better, just do the below?
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 17bdd6122dca..5d709aa67df4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -273,7 +273,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
cmpl (%rbx,%rcx,4), %edx
jz .Lsetup_cpu
inc %ecx
+#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+ cmpl $NR_CPUS, %ecx
+#else
cmpl nr_cpu_ids(%rip), %ecx
+#endif
jb .Lfind_cpunr
/* APIC ID not found in the table. Drop the trampoline lock and bail. */
On Mon, 2023-02-27 at 06:14 +0000, Usama Arif wrote:
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 17bdd6122dca..5d709aa67df4 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -273,7 +273,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> cmpl (%rbx,%rcx,4), %edx
> jz .Lsetup_cpu
> inc %ecx
> +#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
> + cmpl $NR_CPUS, %ecx
> +#else
> cmpl nr_cpu_ids(%rip), %ecx
> +#endif
> jb .Lfind_cpunr
>
> /* APIC ID not found in the table. Drop the trampoline lock and bail. */

The whitespace looks dodgy there but maybe that's just your mail client?

Given this code is already in #ifdef CONFIG_SMP, can NR_CPUS be 1?
On 27/02/2023 15:29, David Woodhouse wrote:
> On Mon, 2023-02-27 at 06:14 +0000, Usama Arif wrote:
>>
>> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
>> index 17bdd6122dca..5d709aa67df4 100644
>> --- a/arch/x86/kernel/head_64.S
>> +++ b/arch/x86/kernel/head_64.S
>> @@ -273,7 +273,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>> cmpl (%rbx,%rcx,4), %edx
>> jz .Lsetup_cpu
>> inc %ecx
>> +#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
>> + cmpl $NR_CPUS, %ecx
>> +#else
>> cmpl nr_cpu_ids(%rip), %ecx
>> +#endif
>> jb .Lfind_cpunr
>>
>> /* APIC ID not found in the table. Drop the trampoline lock and bail. */
>
> The whitespace looks dodgy there but maybe that's just your mail client?
>
> Given this code is already in #ifdef CONFIG_SMP, can NR_CPUS be 1?
Ah yes, we have
config NR_CPUS_RANGE_BEGIN
	int
	default NR_CPUS_RANGE_END if MAXSMP
	default 1 if !SMP
	default 2
in arch/x86/Kconfig, which doesn't let us select 1 for NR_CPUS if SMP is
enabled, so this should be enough:
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 17bdd6122dca..c79ae67492e1 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -273,7 +273,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
cmpl (%rbx,%rcx,4), %edx
jz .Lsetup_cpu
inc %ecx
+#if defined(CONFIG_FORCE_NR_CPUS)
+ cmpl $NR_CPUS, %ecx
+#else
cmpl nr_cpu_ids(%rip), %ecx
+#endif
jb .Lfind_cpunr
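
For anyone reading along without the assembly to hand, here is a rough
userspace C model of the .Lfind_cpunr loop and of why the loop bound has to
be picked at compile time. It is only a sketch: the NR_CPUS value, the static
nr_cpu_ids variable and the find_cpunr() helper are assumptions made for
illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel configuration (not real Kconfig output). */
#define NR_CPUS 8
/* #define CONFIG_FORCE_NR_CPUS 1 */	/* uncomment to model CONFIG_FORCE_NR_CPUS=y */

#ifdef CONFIG_FORCE_NR_CPUS
/* Compile-time constant: no nr_cpu_ids object exists for the linker to resolve. */
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
/* Run-time variable: a real symbol, which is what nr_cpu_ids(%rip) refers to. */
static unsigned int nr_cpu_ids = NR_CPUS;
#endif

/*
 * Model of the .Lfind_cpunr loop: scan the table for apic_id and return
 * the matching CPU number, or -1 if the ID is not in the table ("bail").
 * The table is assumed to hold at least nr_cpu_ids entries.
 */
static int find_cpunr(const uint32_t *cpuid_to_apicid, uint32_t apic_id)
{
	unsigned int cpu = 0;				/* %ecx */

	assert(cpuid_to_apicid != NULL);
	do {
		if (cpuid_to_apicid[cpu] == apic_id)	/* cmpl (%rbx,%rcx,4), %edx; jz */
			return (int)cpu;
		cpu++;					/* inc %ecx */
	} while (cpu < nr_cpu_ids);			/* cmpl <bound>, %ecx; jb */

	return -1;
}
```

With CONFIG_FORCE_NR_CPUS defined, nr_cpu_ids expands to a constant, so the
comparison needs no symbol at all; that is the case the `cmpl $NR_CPUS` branch
in the diff covers.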