[PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform

Chengwen Feng posted 2 patches 1 month ago
There is a newer version of this series
[PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform
Posted by Chengwen Feng 1 month ago
pcie_tph_get_cpu_st() is broken on ARM64:
1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
   cpu_uid should be the ACPI Processor UID [1].
2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
   cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
   generated and managed by kernel (e.g., [0,255] for a system  with 256
   logical CPU cores).
3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
   MADT table, and this UID may not match the kernel's logical CPU ID.
   When this occurs, the mismatch results in the wrong CPU steer-tag.
4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
   so the mismatch is not seen.

Resolution:
1. Implement acpi_get_cpu_acpi_id() for x86, which replaces
   cpu_acpi_id(). All ACPI platforms now have an implementation.
2. Use acpi_get_cpu_acpi_id() in pcie_tph_get_cpu_st() to translate from
   logical CPU ID to ACPI Processor UID needed for the DSM call.
3. Rename pcie_tpu_get_cpu_st() parameter from cpu_uid to cpu to
   reflect that it is a logical CPU_ID.

[1] According to ECN_TPH-ST_Revision_20200924
    (https://members.pcisig.com/wg/PCI-SIG/document/15470), the input
    is defined as: "If the target is a processor, then this field
    represents the ACPI Processor UID of the processor as specified in
    the MADT. If the target is a processor container, then this field
    represents the ACPI Processor UID of the processor container as
    specified in the PPTT."

Fixes: d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
Cc: stable@vger.kernel.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
---
 Documentation/PCI/tph.rst    |  4 ++--
 arch/x86/include/asm/acpi.h  |  2 ++
 arch/x86/include/asm/cpu.h   |  1 -
 arch/x86/include/asm/smp.h   |  1 -
 arch/x86/kernel/cpu/common.c | 12 ++++++++++++
 arch/x86/xen/enlighten_hvm.c |  4 ++--
 drivers/pci/tph.c            | 11 ++++++-----
 include/linux/pci-tph.h      |  4 ++--
 8 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
index e8993be64fd6..b6cf22b9bd90 100644
--- a/Documentation/PCI/tph.rst
+++ b/Documentation/PCI/tph.rst
@@ -79,10 +79,10 @@ To retrieve a Steering Tag for a target memory associated with a specific
 CPU, use the following function::
 
   int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
-                          unsigned int cpu_uid, u16 *tag);
+                          unsigned int cpu, u16 *tag);
 
 The `type` argument is used to specify the memory type, either volatile
-or persistent, of the target memory. The `cpu_uid` argument specifies the
+or persistent, of the target memory. The `cpu` argument specifies the
 CPU where the memory is associated to.
 
 After the ST value is retrieved, the device driver can use the following
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index a03aa6f999d1..b968369715c1 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -157,6 +157,8 @@ static inline bool acpi_has_cpu_in_madt(void)
 	return !!acpi_lapic;
 }
 
+u32 acpi_get_cpu_acpi_id(unsigned int cpu);
+
 #define ACPI_HAVE_ARCH_SET_ROOT_POINTER
 static __always_inline void acpi_arch_set_root_pointer(u64 addr)
 {
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index ad235dda1ded..57a0786dfd75 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -11,7 +11,6 @@
 
 #ifndef CONFIG_SMP
 #define cpu_physical_id(cpu)			boot_cpu_physical_apicid
-#define cpu_acpi_id(cpu)			0
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 84951572ab81..05d1d479b4cf 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -130,7 +130,6 @@ __visible void smp_call_function_interrupt(struct pt_regs *regs);
 __visible void smp_call_function_single_interrupt(struct pt_regs *r);
 
 #define cpu_physical_id(cpu)	per_cpu(x86_cpu_to_apicid, cpu)
-#define cpu_acpi_id(cpu)	per_cpu(x86_cpu_to_acpiid, cpu)
 
 /*
  * This function is needed by all SMP systems. It must _always_ be valid
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1c3261cae40c..93f4f3283c81 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -29,6 +29,7 @@
 #include <linux/utsname.h>
 #include <linux/efi.h>
 
+#include <asm/acpi.h>
 #include <asm/alternative.h>
 #include <asm/cmdline.h>
 #include <asm/cpuid/api.h>
@@ -57,6 +58,7 @@
 #include <asm/asm.h>
 #include <asm/bugs.h>
 #include <asm/cpu.h>
+#include <asm/smp.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
 #include <asm/cacheinfo.h>
@@ -2643,3 +2645,13 @@ void __init arch_cpu_finalize_init(void)
 	 */
 	mem_encrypt_init();
 }
+
+u32 acpi_get_cpu_acpi_id(unsigned int cpu)
+{
+#ifndef CONFIG_SMP
+	return 0;
+#else
+	return per_cpu(x86_cpu_to_acpiid, cpu);
+#endif
+}
+EXPORT_SYMBOL_GPL(acpi_get_cpu_acpi_id);
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index fe57ff85d004..0a5cde7865b2 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -161,8 +161,8 @@ static int xen_cpu_up_prepare_hvm(unsigned int cpu)
 	 */
 	xen_uninit_lock_cpu(cpu);
 
-	if (cpu_acpi_id(cpu) != CPU_ACPIID_INVALID)
-		per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
+	if (acpi_get_cpu_acpi_id(cpu) != CPU_ACPIID_INVALID)
+		per_cpu(xen_vcpu_id, cpu) = acpi_get_cpu_acpi_id(cpu);
 	else
 		per_cpu(xen_vcpu_id, cpu) = cpu;
 	xen_vcpu_setup(cpu);
diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
index ca4f97be7538..c1bd60637b5a 100644
--- a/drivers/pci/tph.c
+++ b/drivers/pci/tph.c
@@ -236,18 +236,19 @@ static int write_tag_to_st_table(struct pci_dev *pdev, int index, u16 tag)
  * with a specific CPU
  * @pdev: PCI device
  * @mem_type: target memory type (volatile or persistent RAM)
- * @cpu_uid: associated CPU id
+ * @cpu: associated CPU id
  * @tag: Steering Tag to be returned
  *
  * Return the Steering Tag for a target memory that is associated with a
- * specific CPU as indicated by cpu_uid.
+ * specific CPU as indicated by cpu.
  *
  * Return: 0 if success, otherwise negative value (-errno)
  */
 int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
-			unsigned int cpu_uid, u16 *tag)
+			unsigned int cpu, u16 *tag)
 {
 #ifdef CONFIG_ACPI
+	u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);
 	struct pci_dev *rp;
 	acpi_handle rp_acpi_handle;
 	union st_info info;
@@ -265,9 +266,9 @@ int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
 
 	*tag = tph_extract_tag(mem_type, pdev->tph_req_type, &info);
 
-	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu_uid=%d, tag=%#04x\n",
+	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu=%d, tag=%#04x\n",
 		(mem_type == TPH_MEM_TYPE_VM) ? "volatile" : "persistent",
-		cpu_uid, *tag);
+		cpu, *tag);
 
 	return 0;
 #else
diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
index ba28140ce670..be68cd17f2f8 100644
--- a/include/linux/pci-tph.h
+++ b/include/linux/pci-tph.h
@@ -25,7 +25,7 @@ int pcie_tph_set_st_entry(struct pci_dev *pdev,
 			  unsigned int index, u16 tag);
 int pcie_tph_get_cpu_st(struct pci_dev *dev,
 			enum tph_mem_type mem_type,
-			unsigned int cpu_uid, u16 *tag);
+			unsigned int cpu, u16 *tag);
 void pcie_disable_tph(struct pci_dev *pdev);
 int pcie_enable_tph(struct pci_dev *pdev, int mode);
 u16 pcie_tph_get_st_table_size(struct pci_dev *pdev);
@@ -36,7 +36,7 @@ static inline int pcie_tph_set_st_entry(struct pci_dev *pdev,
 { return -EINVAL; }
 static inline int pcie_tph_get_cpu_st(struct pci_dev *dev,
 				      enum tph_mem_type mem_type,
-				      unsigned int cpu_uid, u16 *tag)
+				      unsigned int cpu, u16 *tag)
 { return -EINVAL; }
 static inline void pcie_disable_tph(struct pci_dev *pdev) { }
 static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
-- 
2.17.1
Re: [PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform
Posted by Jeremy Linton 4 weeks, 1 day ago
Hi,

+Wathsala who has been looking at TPH too.

On 3/9/26 10:20 PM, Chengwen Feng wrote:
> pcie_tph_get_cpu_st() is broken on ARM64:
> 1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
>     cpu_uid should be the ACPI Processor UID [1].
> 2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
>     cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
>     generated and managed by kernel (e.g., [0,255] for a system  with 256
>     logical CPU cores).
> 3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
>     MADT table, and this UID may not match the kernel's logical CPU ID.
>     When this occurs, the mismatch results in the wrong CPU steer-tag.
> 4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
>     so the mismatch is not seen.
> 
> Resolution:
> 1. Implement acpi_get_cpu_acpi_id() for x86, which replaces
>     cpu_acpi_id(). All ACPI platforms now have an implementation.
> 2. Use acpi_get_cpu_acpi_id() in pcie_tph_get_cpu_st() to translate from
>     logical CPU ID to ACPI Processor UID needed for the DSM call.
> 3. Rename pcie_tpu_get_cpu_st() parameter from cpu_uid to cpu to
>     reflect that it is a logical CPU_ID.
> 
> [1] According to ECN_TPH-ST_Revision_20200924
>      (https://members.pcisig.com/wg/PCI-SIG/document/15470), the input
>      is defined as: "If the target is a processor, then this field
>      represents the ACPI Processor UID of the processor as specified in
>      the MADT. If the target is a processor container, then this field
>      represents the ACPI Processor UID of the processor container as
>      specified in the PPTT."

The bit about "processor containers" is not supported by linux yet, and 
is potentially a problem worth considering.

The original rename comments from the previous patch versions are on 
point, but since they have grown large, i'm going to suggest the x86 
rename/shuffle here is also in its own patch seperate from the TPH 
specific changes like the arm ones now are.

Thanks for looking after this.

> 
> Fixes: d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
>   Documentation/PCI/tph.rst    |  4 ++--
>   arch/x86/include/asm/acpi.h  |  2 ++
>   arch/x86/include/asm/cpu.h   |  1 -
>   arch/x86/include/asm/smp.h   |  1 -
>   arch/x86/kernel/cpu/common.c | 12 ++++++++++++
>   arch/x86/xen/enlighten_hvm.c |  4 ++--
>   drivers/pci/tph.c            | 11 ++++++-----
>   include/linux/pci-tph.h      |  4 ++--
>   8 files changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/PCI/tph.rst b/Documentation/PCI/tph.rst
> index e8993be64fd6..b6cf22b9bd90 100644
> --- a/Documentation/PCI/tph.rst
> +++ b/Documentation/PCI/tph.rst
> @@ -79,10 +79,10 @@ To retrieve a Steering Tag for a target memory associated with a specific
>   CPU, use the following function::
>   
>     int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type type,
> -                          unsigned int cpu_uid, u16 *tag);
> +                          unsigned int cpu, u16 *tag);
>   
>   The `type` argument is used to specify the memory type, either volatile
> -or persistent, of the target memory. The `cpu_uid` argument specifies the
> +or persistent, of the target memory. The `cpu` argument specifies the
>   CPU where the memory is associated to.
>   
>   After the ST value is retrieved, the device driver can use the following
> diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
> index a03aa6f999d1..b968369715c1 100644
> --- a/arch/x86/include/asm/acpi.h
> +++ b/arch/x86/include/asm/acpi.h
> @@ -157,6 +157,8 @@ static inline bool acpi_has_cpu_in_madt(void)
>   	return !!acpi_lapic;
>   }
>   
> +u32 acpi_get_cpu_acpi_id(unsigned int cpu);
> +
>   #define ACPI_HAVE_ARCH_SET_ROOT_POINTER
>   static __always_inline void acpi_arch_set_root_pointer(u64 addr)
>   {
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index ad235dda1ded..57a0786dfd75 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -11,7 +11,6 @@
>   
>   #ifndef CONFIG_SMP
>   #define cpu_physical_id(cpu)			boot_cpu_physical_apicid
> -#define cpu_acpi_id(cpu)			0
>   #endif /* CONFIG_SMP */
>   
>   #ifdef CONFIG_HOTPLUG_CPU
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 84951572ab81..05d1d479b4cf 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -130,7 +130,6 @@ __visible void smp_call_function_interrupt(struct pt_regs *regs);
>   __visible void smp_call_function_single_interrupt(struct pt_regs *r);
>   
>   #define cpu_physical_id(cpu)	per_cpu(x86_cpu_to_apicid, cpu)
> -#define cpu_acpi_id(cpu)	per_cpu(x86_cpu_to_acpiid, cpu)
>   
>   /*
>    * This function is needed by all SMP systems. It must _always_ be valid
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 1c3261cae40c..93f4f3283c81 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -29,6 +29,7 @@
>   #include <linux/utsname.h>
>   #include <linux/efi.h>
>   
> +#include <asm/acpi.h>
>   #include <asm/alternative.h>
>   #include <asm/cmdline.h>
>   #include <asm/cpuid/api.h>
> @@ -57,6 +58,7 @@
>   #include <asm/asm.h>
>   #include <asm/bugs.h>
>   #include <asm/cpu.h>
> +#include <asm/smp.h>
>   #include <asm/mce.h>
>   #include <asm/msr.h>
>   #include <asm/cacheinfo.h>
> @@ -2643,3 +2645,13 @@ void __init arch_cpu_finalize_init(void)
>   	 */
>   	mem_encrypt_init();
>   }
> +
> +u32 acpi_get_cpu_acpi_id(unsigned int cpu)
> +{
> +#ifndef CONFIG_SMP
> +	return 0;
> +#else
> +	return per_cpu(x86_cpu_to_acpiid, cpu);
> +#endif
> +}
> +EXPORT_SYMBOL_GPL(acpi_get_cpu_acpi_id);
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index fe57ff85d004..0a5cde7865b2 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -161,8 +161,8 @@ static int xen_cpu_up_prepare_hvm(unsigned int cpu)
>   	 */
>   	xen_uninit_lock_cpu(cpu);
>   
> -	if (cpu_acpi_id(cpu) != CPU_ACPIID_INVALID)
> -		per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
> +	if (acpi_get_cpu_acpi_id(cpu) != CPU_ACPIID_INVALID)
> +		per_cpu(xen_vcpu_id, cpu) = acpi_get_cpu_acpi_id(cpu);
>   	else
>   		per_cpu(xen_vcpu_id, cpu) = cpu;
>   	xen_vcpu_setup(cpu);
> diff --git a/drivers/pci/tph.c b/drivers/pci/tph.c
> index ca4f97be7538..c1bd60637b5a 100644
> --- a/drivers/pci/tph.c
> +++ b/drivers/pci/tph.c
> @@ -236,18 +236,19 @@ static int write_tag_to_st_table(struct pci_dev *pdev, int index, u16 tag)
>    * with a specific CPU
>    * @pdev: PCI device
>    * @mem_type: target memory type (volatile or persistent RAM)
> - * @cpu_uid: associated CPU id
> + * @cpu: associated CPU id
>    * @tag: Steering Tag to be returned
>    *
>    * Return the Steering Tag for a target memory that is associated with a
> - * specific CPU as indicated by cpu_uid.
> + * specific CPU as indicated by cpu.
>    *
>    * Return: 0 if success, otherwise negative value (-errno)
>    */
>   int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
> -			unsigned int cpu_uid, u16 *tag)
> +			unsigned int cpu, u16 *tag)
>   {
>   #ifdef CONFIG_ACPI
> +	u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);
>   	struct pci_dev *rp;
>   	acpi_handle rp_acpi_handle;
>   	union st_info info;
> @@ -265,9 +266,9 @@ int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
>   
>   	*tag = tph_extract_tag(mem_type, pdev->tph_req_type, &info);
>   
> -	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu_uid=%d, tag=%#04x\n",
> +	pci_dbg(pdev, "get steering tag: mem_type=%s, cpu=%d, tag=%#04x\n",
>   		(mem_type == TPH_MEM_TYPE_VM) ? "volatile" : "persistent",
> -		cpu_uid, *tag);
> +		cpu, *tag);
>   
>   	return 0;
>   #else
> diff --git a/include/linux/pci-tph.h b/include/linux/pci-tph.h
> index ba28140ce670..be68cd17f2f8 100644
> --- a/include/linux/pci-tph.h
> +++ b/include/linux/pci-tph.h
> @@ -25,7 +25,7 @@ int pcie_tph_set_st_entry(struct pci_dev *pdev,
>   			  unsigned int index, u16 tag);
>   int pcie_tph_get_cpu_st(struct pci_dev *dev,
>   			enum tph_mem_type mem_type,
> -			unsigned int cpu_uid, u16 *tag);
> +			unsigned int cpu, u16 *tag);
>   void pcie_disable_tph(struct pci_dev *pdev);
>   int pcie_enable_tph(struct pci_dev *pdev, int mode);
>   u16 pcie_tph_get_st_table_size(struct pci_dev *pdev);
> @@ -36,7 +36,7 @@ static inline int pcie_tph_set_st_entry(struct pci_dev *pdev,
>   { return -EINVAL; }
>   static inline int pcie_tph_get_cpu_st(struct pci_dev *dev,
>   				      enum tph_mem_type mem_type,
> -				      unsigned int cpu_uid, u16 *tag)
> +				      unsigned int cpu, u16 *tag)
>   { return -EINVAL; }
>   static inline void pcie_disable_tph(struct pci_dev *pdev) { }
>   static inline int pcie_enable_tph(struct pci_dev *pdev, int mode)
Re: [PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform
Posted by Bjorn Helgaas 4 weeks, 1 day ago
On Tue, Mar 10, 2026 at 10:58:49AM -0500, Jeremy Linton wrote:
> On 3/9/26 10:20 PM, Chengwen Feng wrote:
> > pcie_tph_get_cpu_st() is broken on ARM64:
> > 1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
> >     cpu_uid should be the ACPI Processor UID [1].
> > 2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
> >     cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
> >     generated and managed by kernel (e.g., [0,255] for a system  with 256
> >     logical CPU cores).
> > 3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
> >     MADT table, and this UID may not match the kernel's logical CPU ID.
> >     When this occurs, the mismatch results in the wrong CPU steer-tag.
> > 4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
> >     so the mismatch is not seen.

> >   int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
> > -			unsigned int cpu_uid, u16 *tag)
> > +			unsigned int cpu, u16 *tag)
> >   {
> >   #ifdef CONFIG_ACPI
> > +	u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);

From AI review (gemini/gemini-3.1-pro-preview):

  Does this code need to validate that `cpu` is within bounds before
  using it?  Before this change, the `cpu_uid` parameter was passed
  opaquely to the ACPI firmware via `tph_invoke_dsm()`, which would
  gracefully handle invalid values.

  Now, `cpu` is treated as a logical CPU index and passed to
  `acpi_get_cpu_acpi_id(cpu)`. On architectures like arm64 and riscv,
  `acpi_get_cpu_acpi_id()` uses `cpu` directly as an array index
  (`&cpu_madt_gicc[cpu]` and `&cpu_madt_rintc[cpu]`). On x86, it uses
  `per_cpu(x86_cpu_to_acpiid, cpu)`.

  If a caller passes an out-of-bounds `cpu` index (for example, if an
  IRQ affinity mask is empty and `cpumask_first()` returns
  `nr_cpu_ids`, or if userspace passes an arbitrary ID via
  `mlx5_st_alloc_index()`), this will result in an out-of-bounds
  memory read.

  Consider adding a bounds check:

    if (cpu >= nr_cpu_ids)
      return -EINVAL;

I agree that this is an issue, and I think implementations of
acpi_get_cpu_acpi_id() should validate their inputs.

I don't know if there's a value that can never be a valid ACPI CPU UID
and could be used as an error value from acpi_get_cpu_acpi_id().  I do
see a few mentions of a ~0 value meaning "all processors" (ACPI r6.6,
sec 5.2.12.13).
Re: [PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform
Posted by fengchengwen 4 weeks, 1 day ago
On 3/11/2026 6:09 AM, Bjorn Helgaas wrote:
> On Tue, Mar 10, 2026 at 10:58:49AM -0500, Jeremy Linton wrote:
>> On 3/9/26 10:20 PM, Chengwen Feng wrote:
>>> pcie_tph_get_cpu_st() is broken on ARM64:
>>> 1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
>>>     cpu_uid should be the ACPI Processor UID [1].
>>> 2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
>>>     cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
>>>     generated and managed by kernel (e.g., [0,255] for a system  with 256
>>>     logical CPU cores).
>>> 3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
>>>     MADT table, and this UID may not match the kernel's logical CPU ID.
>>>     When this occurs, the mismatch results in the wrong CPU steer-tag.
>>> 4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
>>>     so the mismatch is not seen.
> 
>>>   int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
>>> -			unsigned int cpu_uid, u16 *tag)
>>> +			unsigned int cpu, u16 *tag)
>>>   {
>>>   #ifdef CONFIG_ACPI
>>> +	u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);
> 
> From AI review (gemini/gemini-3.1-pro-preview):
> 
>   Does this code need to validate that `cpu` is within bounds before
>   using it?  Before this change, the `cpu_uid` parameter was passed
>   opaquely to the ACPI firmware via `tph_invoke_dsm()`, which would
>   gracefully handle invalid values.
> 
>   Now, `cpu` is treated as a logical CPU index and passed to
>   `acpi_get_cpu_acpi_id(cpu)`. On architectures like arm64 and riscv,
>   `acpi_get_cpu_acpi_id()` uses `cpu` directly as an array index
>   (`&cpu_madt_gicc[cpu]` and `&cpu_madt_rintc[cpu]`). On x86, it uses
>   `per_cpu(x86_cpu_to_acpiid, cpu)`.
> 
>   If a caller passes an out-of-bounds `cpu` index (for example, if an
>   IRQ affinity mask is empty and `cpumask_first()` returns
>   `nr_cpu_ids`, or if userspace passes an arbitrary ID via
>   `mlx5_st_alloc_index()`), this will result in an out-of-bounds
>   memory read.
> 
>   Consider adding a bounds check:
> 
>     if (cpu >= nr_cpu_ids)
>       return -EINVAL;
> 
> I agree that this is an issue, and I think implementations of
> acpi_get_cpu_acpi_id() should validate their inputs.
> 
> I don't know if there's a value that can never be a valid ACPI CPU UID
> and could be used as an error value from acpi_get_cpu_acpi_id().  I do
> see a few mentions of a ~0 value meaning "all processors" (ACPI r6.6,
> sec 5.2.12.13).  

I only have the ACPI Specification Version 6.5, so I will use v6.5 as an example.

The ACPI specification does not define invalid value ranges for the ACPI UID.
For the arm64 platform (Section 5.2.12.14):
  ACPI Processor UID: The OS associates this GICC Structure with a processor device
                      object in the namespace when the _UID child object of the
                      processor device evaluates to a numeric value that matches
                      the numeric value in this field.

I am concerned that we cannot implement it like this:
	int acpi_get_cpu_uid(unsigned int cpu) {
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		...
	}
or:
	u32 acpi_get_cpu_uid(unsigned int cpu) {
		if (cpu >= nr_cpu_ids)
			return U32_MAX;
		...
	}

How about implementing it as follows:
	s64 acpi_get_cpu_uid(unsigned int cpu) {
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		...
	}
or
	int acpi_get_cpu_uid(unsigned int cpu, u32 *uid) {
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		*uid = xxx;
		return 0;
	}



Another issue: This commit also provides an implementation for the x86 platform.
However, further code analysis revealed a potential problem in the implementation:

The acpi_get_cpu_acpi_id() retrieves uid from x86_cpu_to_acpiid in SMP, and
x86_cpu_to_acpiid is set through the call chain: acpi_parse_lapic() ->
topology_register_apic() -> topo_register_apic() -> topo_set_cpuids() ->
x86_cpu_to_acpiid. It appears to retrieve the "ACPI Processor UID" from
ACPI Section 5.2.12.2, but the problem is that this field is only one byte in length,
which may cause issues in huge-core systems.

Therefore, I suggest re-implementing the acpi_get_cpu_uid function for the x86
platform. Either I provide a default implementation (shown below), or x86 guys
contribute to the implementation:

	s64 acpi_get_cpu_uid(unsigned int cpu) {
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		return cpu;
	}

Thanks

>
Re: [PATCH v5 2/2] PCI/TPH: Fix get cpu steer-tag fail on ARM64 platform
Posted by Bjorn Helgaas 4 weeks ago
On Wed, Mar 11, 2026 at 11:00:02AM +0800, fengchengwen wrote:
> On 3/11/2026 6:09 AM, Bjorn Helgaas wrote:
> > On Tue, Mar 10, 2026 at 10:58:49AM -0500, Jeremy Linton wrote:
> >> On 3/9/26 10:20 PM, Chengwen Feng wrote:
> >>> pcie_tph_get_cpu_st() is broken on ARM64:
> >>> 1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
> >>>     cpu_uid should be the ACPI Processor UID [1].
> >>> 2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
> >>>     cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
> >>>     generated and managed by kernel (e.g., [0,255] for a system  with 256
> >>>     logical CPU cores).
> >>> 3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
> >>>     MADT table, and this UID may not match the kernel's logical CPU ID.
> >>>     When this occurs, the mismatch results in the wrong CPU steer-tag.
> >>> 4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
> >>>     so the mismatch is not seen.
> > 
> >>>   int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
> >>> -			unsigned int cpu_uid, u16 *tag)
> >>> +			unsigned int cpu, u16 *tag)
> >>>   {
> >>>   #ifdef CONFIG_ACPI
> >>> +	u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);
> > 
> > From AI review (gemini/gemini-3.1-pro-preview):
> > 
> >   Does this code need to validate that `cpu` is within bounds before
> >   using it?  Before this change, the `cpu_uid` parameter was passed
> >   opaquely to the ACPI firmware via `tph_invoke_dsm()`, which would
> >   gracefully handle invalid values.
> > 
> >   Now, `cpu` is treated as a logical CPU index and passed to
> >   `acpi_get_cpu_acpi_id(cpu)`. On architectures like arm64 and riscv,
> >   `acpi_get_cpu_acpi_id()` uses `cpu` directly as an array index
> >   (`&cpu_madt_gicc[cpu]` and `&cpu_madt_rintc[cpu]`). On x86, it uses
> >   `per_cpu(x86_cpu_to_acpiid, cpu)`.
> > 
> >   If a caller passes an out-of-bounds `cpu` index (for example, if an
> >   IRQ affinity mask is empty and `cpumask_first()` returns
> >   `nr_cpu_ids`, or if userspace passes an arbitrary ID via
> >   `mlx5_st_alloc_index()`), this will result in an out-of-bounds
> >   memory read.
> > 
> >   Consider adding a bounds check:
> > 
> >     if (cpu >= nr_cpu_ids)
> >       return -EINVAL;
> > 
> > I agree that this is an issue, and I think implementations of
> > acpi_get_cpu_acpi_id() should validate their inputs.
> > 
> > I don't know if there's a value that can never be a valid ACPI CPU UID
> > and could be used as an error value from acpi_get_cpu_acpi_id().  I do
> > see a few mentions of a ~0 value meaning "all processors" (ACPI r6.6,
> > sec 5.2.12.13).  
> 
> I only have the ACPI Specification Version 6.5, so I will use v6.5
> as an example.

https://uefi.org/sites/default/files/resources/ACPI_Spec_6.6.pdf

> 	int acpi_get_cpu_uid(unsigned int cpu, u32 *uid) {
> 		if (cpu >= nr_cpu_ids)
> 			return -EINVAL;
> 		*uid = xxx;
> 		return 0;
> 	}

This looks good to me.

> Another issue: This commit also provides an implementation for the
> x86 platform.  However, further code analysis revealed a potential
> problem in the implementation:
> 
> The acpi_get_cpu_acpi_id() retrieves uid from x86_cpu_to_acpiid in
> SMP, and x86_cpu_to_acpiid is set through the call chain:
> acpi_parse_lapic() -> topology_register_apic() ->
> topo_register_apic() -> topo_set_cpuids() -> x86_cpu_to_acpiid. It
> appears to retrieve the "ACPI Processor UID" from ACPI Section
> 5.2.12.2, but the problem is that this field is only one byte in
> length, which may cause issues in huge-core systems.
> 
> Therefore, I suggest re-implementing the acpi_get_cpu_uid function
> for the x86 platform. Either I provide a default implementation
> (shown below), or x86 guys contribute to the implementation:
> 
> 	s64 acpi_get_cpu_uid(unsigned int cpu) {
> 		if (cpu >= nr_cpu_ids)
> 			return -EINVAL;
> 		return cpu;
> 	}

I don't think this is your problem to solve, so don't worry about it.
If you implement acpi_get_cpu_uid() to return the same values as
cpu_acpi_id(), it's up to the x86 folks to deal with any one-byte ID
issues.

Bjorn