[PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]

Zhao Liu posted 7 patches 5 months ago
[PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Zhao Liu 5 months ago
From: Chuang Xu <xuchuangxclwt@bytedance.com>

When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2

On Intel platform:
CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
logical processors in the physical package".

When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.

As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise there would be unexpected results in guest with
older kernel.

For example, when QEMU is started with CLI above and xtopology is disabled,
guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
as the result, even though threads-per-core should actually be 2.

And on AMD platform:
CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
result meets our expectation.

So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
for Intel platform to solve the unexpected result.

Use the "x-vendor-cpuid-only-v2" compat option to fix this issue.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes Since New v1 [**]:
 * Drop Igor's Acked-by since this version uses the newly added
   x-vendor-cpuid-only-v2.
 * Add Zhaoxin since this is the behavior defined in SDM.

Changes Since original v6 [*] :
 * Rebase on the b69801dd6b1e ("Merge tag 'for_upstream' of
   https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging").
 * Polish the comment in code.
 * Explain the change doesn't need extra compat property.

[*] original v6: https://lore.kernel.org/qemu-devel/20241009035638.59330-1-xuchuangxclwt@bytedance.com/
[**] new v1: https://lore.kernel.org/qemu-devel/20250227062523.124601-2-zhao1.liu@intel.com/
---
 target/i386/cpu.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9e110e49ab8a..7fcb6c144d94 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7869,7 +7869,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /*
+             * For CPUID.01H.EBX[Bits 23-16], AMD requires logical processor
+             * count, but Intel needs maximum number of addressable IDs for
+             * logical processors per package.
+             */
+            if (cpu->vendor_cpuid_only_v2 &&
+                (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) {
+                *ebx |= 1 << apicid_pkg_offset(topo_info) << 16;
+            } else {
+                *ebx |= threads_per_pkg << 16;
+            }
         }
         break;
     case 2: { /* cache info: needed for Pentium Pro compatibility */
-- 
2.34.1
Re: [PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Michael Tokarev 5 months ago
On 14.07.2025 11:08, Zhao Liu wrote:
> From: Chuang Xu <xuchuangxclwt@bytedance.com>
> 
> When QEMU is started with:
> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
> -smp 180,sockets=2,dies=1,cores=45,threads=2
> 
> On Intel platform:
> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
> logical processors in the physical package".
> 
> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
> CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
> CPUID.04H.EAX[31:26], which matches the expected result.
> 
> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
> it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
> integer too. Otherwise there would be unexpected results in guest with
> older kernel.
> 
> For example, when QEMU is started with CLI above and xtopology is disabled,
> guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
> calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
> as the result, even though threads-per-core should actually be 2.
> 
> And on AMD platform:
> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
> result meets our expectation.
> 
> So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
> for Intel platform to solve the unexpected result.
> 
> Use the "x-vendor-cpuid-only-v2" compat option to fix this issue.
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes Since New v1 [**]:
>   * Drop Igor's Acked-by since this version uses the newly added
>     x-vendor-cpuid-only-v2.
>   * Add Zhaoxin since this is the behavior defined in SDM.
> 
> Changes Since original v6 [*] :
>   * Rebase on the b69801dd6b1e ("Merge tag 'for_upstream' of
>     https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging").
>   * Polish the comment in code.
>   * Explain the change doesn't need extra compat property.
> 
> [*] original v6: https://lore.kernel.org/qemu-devel/20241009035638.59330-1-xuchuangxclwt@bytedance.com/
> [**] new v1: https://lore.kernel.org/qemu-devel/20250227062523.124601-2-zhao1.liu@intel.com/
> ---
>   target/i386/cpu.c | 12 +++++++++++-
>   1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 9e110e49ab8a..7fcb6c144d94 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7869,7 +7869,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>           }
>           *edx = env->features[FEAT_1_EDX];
>           if (threads_per_pkg > 1) {
> -            *ebx |= threads_per_pkg << 16;
> +            /*
> +             * For CPUID.01H.EBX[Bits 23-16], AMD requires logical processor
> +             * count, but Intel needs maximum number of addressable IDs for
> +             * logical processors per package.
> +             */
> +            if (cpu->vendor_cpuid_only_v2 &&
> +                (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) {

Hi!

Previous incarnation of this patch were Cc'd qemu-stable@, as it were
supposed to be picked up for the stable qemu series.  However, this
incarnation is not Cc'd to stable, and, most importantly, it relies
on a feature which was introduced after all released qemu versions.
Namely, vendor_cpuid_only_v2 is past v10.0.0, which is commit
216d9bb6d7716 "i386/cpu: Add x-vendor-cpuid-only-v2 option for
compatibility".

Should I omit this change for stable-10.0 series, or should it be
modified to work in 10.0?

Thanks,

/mjt
Re: [PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Zhao Liu 4 months, 4 weeks ago
> Hi!
> 
> Previous incarnation of this patch were Cc'd qemu-stable@, as it were
> supposed to be picked up for the stable qemu series.  However, this
> incarnation is not Cc'd to stable, and, most importantly, it relies
> on a feature which was introduced after all released qemu versions.
> Namely, vendor_cpuid_only_v2 is past v10.0.0, which is commit
> 216d9bb6d7716 "i386/cpu: Add x-vendor-cpuid-only-v2 option for
> compatibility".
> 
> Should I omit this change for stable-10.0 series, or should it be
> modified to work in 10.0?

Hi Michael, considerring current fix is covered by compat option,
it's not fit to be backported to the previous version.

This issue has existed for a very long time, and the feedback received
on this is currently one case and it based on a specific topology
configuration, so the impact is limited. Therefore, in this patch
version, I fixed it with the newly added compat option, which also
avoids the controversy about the impact of migration.

So I think you could omit this change for stable-10.0 series, if it's
also okay with Paolo.

Thanks,
Zhao
Re: [PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Michael Tokarev 4 months, 4 weeks ago
On 17.07.2025 06:06, Zhao Liu wrote:
>> Hi!
>>
>> Previous incarnation of this patch were Cc'd qemu-stable@, as it were
>> supposed to be picked up for the stable qemu series.  However, this
>> incarnation is not Cc'd to stable, and, most importantly, it relies
>> on a feature which was introduced after all released qemu versions.
>> Namely, vendor_cpuid_only_v2 is past v10.0.0, which is commit
>> 216d9bb6d7716 "i386/cpu: Add x-vendor-cpuid-only-v2 option for
>> compatibility".
>>
>> Should I omit this change for stable-10.0 series, or should it be
>> modified to work in 10.0?
> 
> Hi Michael, considerring current fix is covered by compat option,
> it's not fit to be backported to the previous version.
> 
> This issue has existed for a very long time, and the feedback received
> on this is currently one case and it based on a specific topology
> configuration, so the impact is limited. Therefore, in this patch
> version, I fixed it with the newly added compat option, which also
> avoids the controversy about the impact of migration.
> 
> So I think you could omit this change for stable-10.0 series, if it's
> also okay with Paolo.

This makes sense.

In this case though, the next patch, a62fef5829956 "i386/cpu: Fix cpu
number overflow in CPUID.01H.EBX[23:16]", becomes a one-liner:

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2c9517f56d..5e55dd9ee5 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6828,7 +6828,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t 
index, uint32_t count,
          }
          *edx = env->features[FEAT_1_EDX];
          if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /* Fixup overflow: max value for bits 23-16 is 255. */
+            *ebx |= MIN(threads_per_pkg, 255) << 16;
          }
          if (!cpu->enable_pmu) {
              *ecx &= ~CPUID_EXT_PDCM;

Is it okay with you, or maybe should I drop this change for 10.0.x
too?

Thanks,

/mjt
Re: [PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Zhao Liu 4 months, 4 weeks ago
> This makes sense.
> 
> In this case though, the next patch, a62fef5829956 "i386/cpu: Fix cpu
> number overflow in CPUID.01H.EBX[23:16]", becomes a one-liner:
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 2c9517f56d..5e55dd9ee5 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6828,7 +6828,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index,
> uint32_t count,
>          }
>          *edx = env->features[FEAT_1_EDX];
>          if (threads_per_pkg > 1) {
> -            *ebx |= threads_per_pkg << 16;
> +            /* Fixup overflow: max value for bits 23-16 is 255. */
> +            *ebx |= MIN(threads_per_pkg, 255) << 16;
>          }
>          if (!cpu->enable_pmu) {
>              *ecx &= ~CPUID_EXT_PDCM;
> 
> Is it okay with you, or maybe should I drop this change for 10.0.x
> too?

Yes, it's okay. This commit I should have cc'd stable, as it is indeed a
reported bug.

And after this commit, there're another 2 commits to avoid overflow:

* commit 3e86124e7cb9 ("i386/cpu: Fix overflow of cache topology fields
  in CPUID.04H")
* commit 5d21ee453ad8 ("i386/cpu: Honor maximum value for CPUID.
  8000001DH.EAX[25:14]")

I think these two could also be stable v10.0.x materials.

Thanks,
Zhao
Re: [PATCH v2 4/7] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
Posted by Xiaoyao Li 5 months ago
On 7/14/2025 4:08 PM, Zhao Liu wrote:
> From: Chuang Xu <xuchuangxclwt@bytedance.com>
> 
> When QEMU is started with:
> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
> -smp 180,sockets=2,dies=1,cores=45,threads=2
> 
> On Intel platform:
> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
> logical processors in the physical package".
> 
> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
> CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
> CPUID.04H.EAX[31:26], which matches the expected result.
> 
> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
> it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
> integer too. Otherwise there would be unexpected results in guest with
> older kernel.
> 
> For example, when QEMU is started with CLI above and xtopology is disabled,
> guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
> calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
> as the result, even though threads-per-core should actually be 2.
> 
> And on AMD platform:
> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
> result meets our expectation.
> 
> So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
> for Intel platform to solve the unexpected result.
> 
> Use the "x-vendor-cpuid-only-v2" compat option to fix this issue.
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>