From: Jinjie Ruan
Subject: [PATCH v2] ACPI: CPPC: Fix related_cpus inconsistency during CPU hotplug
Date: Fri, 17 Apr 2026 12:01:12 +0800
Message-ID: <20260417040112.3727756-1-ruanjinjie@huawei.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

When the two SMT threads of a physical core are brought up and down
concurrently, many warning call traces like the one below are emitted.

The issue timeline is as follows:

1. When the system starts:
   cpufreq: cpu: 220, policy->related_cpus: 220-221, policy->cpus: 220-221

2. Offline cpu 220 and cpu 221.

3. Online cpu 220.
   - cpu 221 is still offline and acpi_get_psd_map() uses
     for_each_online_cpu(), so cpu_data->shared_cpu_map, policy->cpus
     and policy->related_cpus contain only cpu 220:
     cpufreq: cpu: 220, policy->related_cpus: 220, policy->cpus: 220

4. Offline cpu 220.

5. Online cpu 221; the call trace below occurs.
   - cpu 220 and cpu 221 share one policy, but policy->related_cpus = 220
     after step 3, so cpu 221 is not in policy->related_cpus even though
     per_cpu(cpufreq_cpu_data, cpu221) is not NULL.

After reverting commit 56eb0c0ed345 ("ACPI: CPPC: Fix remaining
for_each_possible_cpu() to use online CPUs"), the issue disappears.

The _PSD (P-State Dependency) object defines the hardware-level
dependency of frequency control across CPU cores. Since this
relationship is a physical attribute of the hardware topology, it
remains constant regardless of the online or offline status of the CPUs.

Using for_each_online_cpu() in acpi_get_psd_map() is problematic: a CPU
that is offline is excluded from the shared_cpu_map. Consequently, if
that CPU is brought online later, the kernel fails to recognize it as
part of any shared frequency domain.

Switch back to for_each_possible_cpu() so that all cores defined in the
ACPI tables are mapped into their respective performance domains from
the start. This aligns with the logic of policy->related_cpus, which
must encompass all potentially available cores in the domain to prevent
logic gaps during CPU hotplug operations.

The original issue with the "nosmt" or "nosmt=force" boot parameter
remains resolved. send_pcc_cmd() already does "if (!desc) continue", so
reverting that loop to for_each_possible_cpu() is safe. In
acpi_get_psd_map(), only the NULL match_cpc_ptr case needs to change,
from "goto err_fault" to "continue", as Sean suggested.

How to reproduce, on an arm64 machine with SMT support that uses the
ACPI CPPC cpufreq driver:

	bash test.sh 220 & bash test.sh 221 &

test.sh is as follows:

	while true
	do
		echo 0 > /sys/devices/system/cpu/cpu${1}/online
		sleep 0.5
		cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
		echo 1 > /sys/devices/system/cpu/cpu${1}/online
		cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
	done

	CPU: 221 PID: 1119 Comm: cpuhp/221 Kdump: loaded Not tainted 6.6.0debug+ #5
	Hardware name: To be filled by O.E.M. S920X20/BC83AMDA01-7270Z, BIOS 20.39 09/04/2024
	pstate: a1400009 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
	pc : cpufreq_online+0x8ac/0xa90
	lr : cpuhp_cpufreq_online+0x18/0x30
	sp : ffff80008739bce0
	x29: ffff80008739bce0 x28: 0000000000000000 x27: ffff28400ca32200
	x26: 0000000000000000 x25: 0000000000000003 x24: ffffd483503ff000
	x23: ffffd483504051a0 x22: ffffd48350024a00 x21: 00000000000000dd
	x20: 000000000000001d x19: ffff28400ca32000 x18: 0000000000000000
	x17: 0000000000000020 x16: ffffd4834e6a3fc8 x15: 0000000000000020
	x14: 0000000000000008 x13: 0000000000000001 x12: 00000000ffffffff
	x11: 0000000000000040 x10: ffffd48350430728 x9 : ffffd4834f087c78
	x8 : 0000000000000001 x7 : ffff2840092bdf00 x6 : ffffd483504264f0
	x5 : ffffd48350405000 x4 : ffff283f7f95cc60 x3 : 0000000000000000
	x2 : ffff53bc2f94b000 x1 : 00000000000000dd x0 : 0000000000000000
	Call trace:
	 cpufreq_online+0x8ac/0xa90
	 cpuhp_cpufreq_online+0x18/0x30
	 cpuhp_invoke_callback+0x128/0x580
	 cpuhp_thread_fun+0x110/0x1b0
	 smpboot_thread_fn+0x140/0x190
	 kthread+0xec/0x100
	 ret_from_fork+0x10/0x20
	---[ end trace 0000000000000000 ]---

Cc: stable@vger.kernel.org
Fixes: 56eb0c0ed345 ("ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs")
Co-developed-by: Sean Kelley
Signed-off-by: Sean Kelley
Signed-off-by: Jinjie Ruan
---
v2:
- Fix the original issue by continuing if per_cpu(cpc_desc_ptr, i) is NULL.
- Update the commit message.
---
 drivers/acpi/cppc_acpi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index f0e513e9ed5d..bcfe2e6b8445 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -362,7 +362,7 @@ static int send_pcc_cmd(int pcc_ss_id, u16 cmd)
 end:
 	if (cmd == CMD_WRITE) {
 		if (unlikely(ret)) {
-			for_each_online_cpu(i) {
+			for_each_possible_cpu(i) {
 				struct cpc_desc *desc = per_cpu(cpc_desc_ptr, i);
 
 				if (!desc)
@@ -524,13 +524,13 @@ int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data)
 	else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
 		cpu_data->shared_type = CPUFREQ_SHARED_TYPE_ANY;
 
-	for_each_online_cpu(i) {
+	for_each_possible_cpu(i) {
 		if (i == cpu)
 			continue;
 
 		match_cpc_ptr = per_cpu(cpc_desc_ptr, i);
 		if (!match_cpc_ptr)
-			goto err_fault;
+			continue;
 
 		match_pdomain = &(match_cpc_ptr->domain_info);
 		if (match_pdomain->domain != pdomain->domain)
-- 
2.34.1