Message-ID: <20230807135028.328142041@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org,
 Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui, Juergen Gross,
 Dimitri Sivanich, Michael Kelley, Sohil Mehta, K Prateek Nayak, Kan Liang,
 Zhang Rui, "Paul E. McKenney", Feng Tang, Andy Shevchenko
Subject: [patch 36/53] x86/cpu/topology: Rework possible CPU management
References: <20230807130108.853357011@linutronix.de>
Date: Mon, 7 Aug 2023 15:53:30 +0200 (CEST)

Managing possible CPUs is an unreadable and incomprehensible maze. Aside
from that, it is backwards because it applies command line limits after
registering all APICs.

Rewrite it so that it:

  - Applies the command line limits upfront so that only the allowed number
    of APIC IDs can be registered.

  - Applies any late restrictions in an understandable way.

  - Uses simple min_t() calculations which are trivial to follow.

  - Provides a separate function for resetting to UP mode late in the
    bringup process.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/apic.h     |    5 +
 arch/x86/include/asm/topology.h |    4 
 arch/x86/kernel/cpu/topology.c  |  176 ++++++++++++++++++++++++----------------
 arch/x86/kernel/setup.c         |    9 --
 arch/x86/kernel/smpboot.c       |    6 -
 5 files changed, 120 insertions(+), 80 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -175,6 +175,9 @@ extern void topology_register_apic(u32 a
 extern void topology_register_boot_apic(u32 apic_id);
 extern int topology_hotplug_apic(u32 apic_id, u32 acpi_id);
 extern void topology_hotunplug_apic(unsigned int cpu);
+extern void topology_apply_cmdline_limits_early(void);
+extern void topology_init_possible_cpus(void);
+extern void topology_reset_possible_cpus_up(void);
 
 #else /* !CONFIG_X86_LOCAL_APIC */
 static inline void lapic_shutdown(void) { }
@@ -190,6 +193,8 @@ static inline void apic_intr_mode_init(v
 static inline void lapic_assign_system_vectors(void) { }
 static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
 static inline bool apic_needs_pit(void) { return true; }
+static inline void topology_apply_cmdline_limits_early(void) { }
+static inline void topology_init_possible_cpus(void) { }
 #endif /* !CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_X86_X2APIC
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -190,6 +190,9 @@ static inline bool topology_is_primary_t
 {
 	return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
 }
+
+void topology_apply_cmdline_limits_early(void);
+
 #else /* CONFIG_SMP */
 #define topology_max_packages()			(1)
 static inline int
@@ -202,6 +205,7 @@ static inline int topology_max_smt_threa
 static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
 static inline bool topology_smt_supported(void) { return false; }
 static inline unsigned int topology_amd_nodes_per_pkg(void) { return 0; };
+static inline void topology_apply_cmdline_limits_early(void) { }
 #endif /* !CONFIG_SMP */
 
 static inline void arch_fix_phys_package_id(int num, u32 slot)
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -5,6 +5,7 @@
 #include
 
 #include
+#include
 #include
 #include
 
@@ -85,73 +86,6 @@ early_initcall(smp_init_primary_thread_m
 static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
 #endif
 
-static int __initdata setup_possible_cpus = -1;
-
-/*
- * cpu_possible_mask should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and don't expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_mask on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * - Ashok Raj
- *
- * Three ways to find out the number of additional hotplug CPUs:
- * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
- * - The user can overwrite it with possible_cpus=NUM
- * - Otherwise don't reserve additional CPUs.
- * We do this because additional CPUs waste a lot of memory.
- * -AK
- */
-__init void prefill_possible_map(void)
-{
-	unsigned int num_processors = topo_info.nr_assigned_cpus;
-	unsigned int disabled_cpus = topo_info.nr_disabled_cpus;
-	int i, possible;
-
-	i = setup_max_cpus ?: 1;
-	if (setup_possible_cpus == -1) {
-		possible = topo_info.nr_assigned_cpus;
-#ifdef CONFIG_HOTPLUG_CPU
-		if (setup_max_cpus)
-			possible += num_processors;
-#else
-		if (possible > i)
-			possible = i;
-#endif
-	} else
-		possible = setup_possible_cpus;
-
-	total_cpus = max_t(int, possible, num_processors + disabled_cpus);
-
-	/* nr_cpu_ids could be reduced via nr_cpus= */
-	if (possible > nr_cpu_ids) {
-		pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
-			possible, nr_cpu_ids);
-		possible = nr_cpu_ids;
-	}
-
-#ifdef CONFIG_HOTPLUG_CPU
-	if (!setup_max_cpus)
-#endif
-	if (possible > i) {
-		pr_warn("%d Processors exceeds max_cpus limit of %u\n",
-			possible, setup_max_cpus);
-		possible = i;
-	}
-
-	set_nr_cpu_ids(possible);
-
-	pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
-		possible, max_t(int, possible - num_processors, 0));
-
-	reset_cpu_possible_mask();
-
-	for (i = 0; i < possible; i++)
-		set_cpu_possible(i, true);
-}
-
 static int topo_lookup_cpuid(u32 apic_id)
 {
 	int i;
@@ -294,12 +228,114 @@ void topology_hotunplug_apic(unsigned in
 }
 #endif
 
-static int __init _setup_possible_cpus(char *str)
+#ifdef CONFIG_SMP
+static unsigned int max_possible_cpus __initdata = NR_CPUS;
+
+/**
+ * topology_apply_cmdline_limits_early - Apply topology command line limits early
+ *
+ * Ensure that command line limits are in effect before firmware parsing
+ * takes place.
+ */
+void __init topology_apply_cmdline_limits_early(void)
+{
+	unsigned int possible = nr_cpu_ids;
+
+	/* 'maxcpus=0' 'nosmp' 'nolapic' 'disableapic' 'noapic' */
+	if (!setup_max_cpus || ioapic_is_disabled || apic_is_disabled)
+		possible = 1;
+
+	/* 'possible_cpus=N' */
+	possible = min_t(unsigned int, max_possible_cpus, possible);
+
+	if (possible < nr_cpu_ids) {
+		pr_info("Limiting to %u possible CPUs\n", possible);
+		set_nr_cpu_ids(possible);
+	}
+}
+
+static __init bool restrict_to_up(void)
+{
+	if (!smp_found_config || ioapic_is_disabled)
+		return true;
+	/*
+	 * XEN PV is special as it does not advertise the local APIC
+	 * properly, but provides a fake topology for it so that the
+	 * infrastructure works. So don't apply the restrictions vs. APIC
+	 * here.
+	 */
+	if (xen_pv_domain())
+		return false;
+
+	return apic_is_disabled;
+}
+
+void __init topology_init_possible_cpus(void)
+{
+	unsigned int assigned = topo_info.nr_assigned_cpus;
+	unsigned int disabled = topo_info.nr_disabled_cpus;
+	unsigned int total = assigned + disabled;
+	unsigned int cpu, allowed = 1;
+
+	if (!restrict_to_up()) {
+		if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
+			disabled += assigned - nr_cpu_ids;
+			assigned = nr_cpu_ids;
+		}
+		allowed = min_t(unsigned int, total, nr_cpu_ids);
+	}
+
+	if (total > allowed)
+		pr_warn("%u possible CPUs exceed the limit of %u\n", total, allowed);
+
+	assigned = min_t(unsigned int, allowed, assigned);
+	disabled = allowed - assigned;
+
+	topo_info.nr_assigned_cpus = assigned;
+	topo_info.nr_disabled_cpus = disabled;
+
+	total_cpus = allowed;
+	set_nr_cpu_ids(allowed);
+
+	pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
+	if (topo_info.nr_rejected_cpus)
+		pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
+
+	init_cpu_present(cpumask_of(0));
+	init_cpu_possible(cpumask_of(0));
+
+	for (cpu = 0; cpu < allowed; cpu++) {
+		u32 apicid = cpuid_to_apicid[cpu];
+
+		set_cpu_possible(cpu, true);
+
+		if (apicid == BAD_APICID)
+			continue;
+
+		set_cpu_present(cpu, test_bit(apicid, phys_cpu_present_map));
+	}
+}
+
+/*
+ * Late SMP disable after sizing CPU masks when APIC/IOAPIC setup failed.
+ */
+void __init topology_reset_possible_cpus_up(void)
 {
-	get_option(&str, &setup_possible_cpus);
+	init_cpu_present(cpumask_of(0));
+	init_cpu_possible(cpumask_of(0));
+
+	bitmap_zero(phys_cpu_present_map, MAX_LOCAL_APIC);
+	if (topo_info.boot_cpu_apic_id != BAD_APICID)
+		set_bit(topo_info.boot_cpu_apic_id, phys_cpu_present_map);
+}
+
+static int __init setup_possible_cpus(char *str)
+{
+	get_option(&str, &max_possible_cpus);
 	return 0;
 }
-early_param("possible_cpus", _setup_possible_cpus);
+early_param("possible_cpus", setup_possible_cpus);
+#endif
 
 static int __init apic_set_disabled_cpu_apicid(char *arg)
 {
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1258,6 +1258,8 @@ void __init setup_arch(char **cmdline_p)
 
 	early_quirks();
 
+	topology_apply_cmdline_limits_early();
+
 	/*
 	 * Parse SMP configuration. Try ACPI first and then the platform
 	 * specific parser.
@@ -1265,13 +1267,10 @@ void __init setup_arch(char **cmdline_p)
 	acpi_boot_init();
 	x86_init.mpparse.parse_smp_cfg();
 
-	/*
-	 * Systems w/o ACPI and mptables might not have it mapped the local
-	 * APIC yet, but prefill_possible_map() might need to access it.
-	 */
+	/* Last opportunity to detect and map the local APIC */
 	init_apic_mappings();
 
-	prefill_possible_map();
+	topology_init_possible_cpus();
 
 	init_cpu_to_node();
 	init_gi_nodes();
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1156,11 +1156,7 @@ static __init void disable_smp(void)
 	pr_info("SMP disabled\n");
 
 	disable_ioapic_support();
-
-	init_cpu_present(cpumask_of(0));
-	init_cpu_possible(cpumask_of(0));
-
-	reset_phys_cpu_present_map(smp_found_config ? boot_cpu_physical_apicid : 0);
+	topology_reset_possible_cpus_up();
 
 	cpumask_set_cpu(0, topology_sibling_cpumask(0));
 	cpumask_set_cpu(0, topology_core_cpumask(0));
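
Illustration only, not part of the patch: the min_t() based sizing that the
changelog describes can be modelled in plain userspace C. This is a rough
sketch under stated assumptions. The names min_uint(), struct cpu_counts,
early_limit() and size_possible() are hypothetical stand-ins invented for
this example; they are not kernel interfaces. The smp_off and up_only flags
stand for the conditions the patch checks via setup_max_cpus,
ioapic_is_disabled, apic_is_disabled and restrict_to_up().

```c
#include <assert.h>

/* Userspace stand-in for the kernel's min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Hypothetical container for the counters the patch keeps in topo_info. */
struct cpu_counts {
	unsigned int assigned;	/* CPUs that end up counted as present */
	unsigned int disabled;	/* CPUs left over for hotplug */
	unsigned int allowed;	/* resulting number of possible CPUs */
};

/*
 * Model of the early command line clamp. Returns the value that would
 * become the new nr_cpu_ids. 'smp_off' models 'maxcpus=0', 'nosmp' and
 * the APIC disable options; 'max_possible' models 'possible_cpus=N'.
 */
static unsigned int early_limit(unsigned int nr_cpu_ids,
				unsigned int max_possible, int smp_off)
{
	unsigned int possible = smp_off ? 1 : nr_cpu_ids;

	return min_uint(max_possible, possible);
}

/*
 * Model of the late sizing step. The WARN_ON_ONCE() branch of the patch
 * is left out because it appears not to change the final numbers: the
 * values it adjusts are recomputed from 'allowed' right afterwards.
 */
static struct cpu_counts size_possible(unsigned int assigned,
				       unsigned int disabled,
				       unsigned int nr_cpu_ids, int up_only)
{
	struct cpu_counts c;
	unsigned int total = assigned + disabled;
	unsigned int allowed = 1;

	/* Assumption of this sketch: at least one CPU ID is available. */
	assert(nr_cpu_ids >= 1);

	if (!up_only)
		allowed = min_uint(total, nr_cpu_ids);

	c.assigned = min_uint(allowed, assigned);
	c.disabled = allowed - c.assigned;
	c.allowed = allowed;
	return c;
}
```

With four registered and four disabled CPUs and a limit of six CPU IDs, the
sketch allows six possible CPUs: four present plus two for hotplug. In the
UP-only case it always collapses to a single present CPU.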