[PATCH v2 7/7] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]

Posted by Zhao Liu 5 months ago
CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
logical processors sharing this cache is the value of this field
incremented by 1. Because of its width limitation, the maximum value
currently supported is 4095.

Though at present Q35 supports up to 4096 CPUs, by constructing a
specific topology, the width of the APIC ID can be extended beyond 12
bits. For example, using `-smp threads=33,cores=9,modules=9` results in
a die level offset of 6 + 4 + 4 = 14 bits, which can also cause
overflow. Check and honor the maximum value as CPUID.04H did.

Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes Since RFC v1 [*]:
 * Correct the RFC's description: an overflow case does exist now. Provide
   an overflow example.

RFC:
 * Although there are currently no overflow cases, to avoid any
   potential issue, add the overflow check, just as I did for Intel.

[*]: https://lore.kernel.org/qemu-devel/20250227062523.124601-5-zhao1.liu@intel.com/
---
 target/i386/cpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fedeeea151ee..eceda9865b8f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -558,7 +558,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
                (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-    *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
+    /* Bits 25:14 - NumSharingCache: maximum 4095. */
+    *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1
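
As a rough illustration of the overflow example in the commit message, the
sketch below recomputes the offset for `-smp threads=33,cores=9,modules=9`.
The helper names (`bits_for_count`, `die_offset`, `max_thread_ids`) are
hypothetical and are not QEMU functions. The assumption that the unclamped
field value is `(1 << offset) - 1` is an assumption of this sketch, not
something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to number `count` IDs at one topology level: ceil(log2(count)). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count > 0);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Hypothetical helper: the die-level offset is the sum of the thread,
 * core and module ID widths below it.
 */
static unsigned die_offset(unsigned threads, unsigned cores, unsigned modules)
{
    return bits_for_count(threads) + bits_for_count(cores) +
           bits_for_count(modules);
}

/* Assumption: the highest thread ID below a level is (1 << offset) - 1. */
static uint32_t max_thread_ids(unsigned offset)
{
    return (1u << offset) - 1;
}
```

Under these assumptions, `die_offset(33, 9, 9)` is 6 + 4 + 4 = 14 and
`max_thread_ids(14)` is 16383, which no longer fits in the 12-bit field at
bits 25:14.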
Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]
Posted by Moger, Babu 5 months ago
Hi Zhao,

On 7/14/25 03:08, Zhao Liu wrote:
> CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
> logical processors sharing this cache is the value of this field
> incremented by 1. Because of its width limitation, the maximum value
> currently supported is 4095.
> 
> Though at present Q35 supports up to 4096 CPUs, by constructing a
> specific topology, the width of the APIC ID can be extended beyond 12
> bits. For example, using `-smp threads=33,cores=9,modules=9` results in
> a die level offset of 6 + 4 + 4 = 14 bits, which can also cause
> overflow. Check and honor the maximum value as CPUID.04H did.
> 
> Cc: Babu Moger <babu.moger@amd.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes Since RFC v1 [*]:
>  * Correct the RFC's description: an overflow case does exist now. Provide
>    an overflow example.
> 
> RFC:
>  * Although there are currently no overflow cases, to avoid any
>    potential issue, add the overflow check, just as I did for Intel.
> 
> [*]: https://lore.kernel.org/qemu-devel/20250227062523.124601-5-zhao1.liu@intel.com/
> ---
>  target/i386/cpu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index fedeeea151ee..eceda9865b8f 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -558,7 +558,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> -    *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
> +    /* Bits 25:14 - NumSharingCache: maximum 4095. */
> +    *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14;

Would this be more meaningful?

*eax |=
 (max_thread_ids_for_cache(topo_info, cache->share_level) & 0xFFF) << 14;

>  
>      assert(cache->line_size > 0);
>      assert(cache->partitions > 0);

-- 
Thanks
Babu Moger
Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]
Posted by Zhao Liu 5 months ago
On Mon, Jul 14, 2025 at 09:51:25AM -0500, Moger, Babu wrote:
> Date: Mon, 14 Jul 2025 09:51:25 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for
>  CPUID.8000001DH.EAX[25:14]
> 
> Hi Zhao,
> 
> On 7/14/25 03:08, Zhao Liu wrote:
> > CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
> > logical processors sharing this cache is the value of this field
> > incremented by 1. Because of its width limitation, the maximum value
> > currently supported is 4095.
> > 
> > Though at present Q35 supports up to 4096 CPUs, by constructing a
> > specific topology, the width of the APIC ID can be extended beyond 12
> > bits. For example, using `-smp threads=33,cores=9,modules=9` results in
> > a die level offset of 6 + 4 + 4 = 14 bits, which can also cause
> > overflow. Check and honor the maximum value as CPUID.04H did.
> > 
> > Cc: Babu Moger <babu.moger@amd.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes Since RFC v1 [*]:
> >  * Correct the RFC's description: an overflow case does exist now. Provide
> >    an overflow example.
> > 
> > RFC:
> >  * Although there are currently no overflow cases, to avoid any
> >    potential issue, add the overflow check, just as I did for Intel.
> > 
> > [*]: https://lore.kernel.org/qemu-devel/20250227062523.124601-5-zhao1.liu@intel.com/
> > ---
> >  target/i386/cpu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index fedeeea151ee..eceda9865b8f 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -558,7 +558,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >  
> >      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
> >                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> > -    *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
> > +    /* Bits 25:14 - NumSharingCache: maximum 4095. */
> > +    *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14;
> 
> Would this be more meaningful?
> 
> *eax |=
>  (max_thread_ids_for_cache(topo_info, cache->share_level) & 0xFFF) << 14;

Hi Babu, thank you for your feedback! That approach relies on truncation,
which could lead to more misleading results. Such cases should not exist
on real hardware at present; only QEMU supports this many CPUs and custom
topologies.

When Intel faced similar cases (where the topology space wasn't large
enough), it encoded the maximum value rather than truncating, which is
what I'm doing here (see the description of leaf 0x1 in patch 5 and the
similar fix for Intel's 0x4 leaf in patch 6). If real hardware ever
reaches such CPU counts and defines special behavior, we can update
accordingly. For now, at least, this avoids the overflow caused by
special topologies in QEMU emulation.
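
To make the difference between the two encodings concrete, here is a
minimal sketch that compares clamping with masking. All names in it
(`encode_clamped`, `encode_masked`, `decode_num_sharing` and the two
macros) are hypothetical and exist only for this illustration; only the
field position (bits 25:14) and the 4095 limit come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SHARING_SHIFT 14
#define NUM_SHARING_MAX   0xFFFu  /* 12-bit field, bits 25:14 */

/* Saturate: any value above 4095 is reported as 4095. */
static uint32_t encode_clamped(uint32_t max_ids)
{
    uint32_t v = max_ids < NUM_SHARING_MAX ? max_ids : NUM_SHARING_MAX;

    assert(v <= NUM_SHARING_MAX);
    return v << NUM_SHARING_SHIFT;
}

/*
 * Truncate: keep only the low 12 bits. The parentheses matter because
 * << binds tighter than & in C.
 */
static uint32_t encode_masked(uint32_t max_ids)
{
    return (max_ids & NUM_SHARING_MAX) << NUM_SHARING_SHIFT;
}

/* Read the 12-bit field back out of an EAX value. */
static uint32_t decode_num_sharing(uint32_t eax)
{
    return (eax >> NUM_SHARING_SHIFT) & NUM_SHARING_MAX;
}
```

For an input of 4096, the clamped encoding reads back as 4095 while the
masked encoding reads back as 0. For inputs of the form 2^n - 1, such as
16383, both read back as 4095, so the two encodings only diverge for
other values.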

Thanks,
Zhao
Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]
Posted by Moger, Babu 5 months ago

On 7/14/25 10:41, Zhao Liu wrote:
> On Mon, Jul 14, 2025 at 09:51:25AM -0500, Moger, Babu wrote:
>> Date: Mon, 14 Jul 2025 09:51:25 -0500
>> From: "Moger, Babu" <babu.moger@amd.com>
>> Subject: Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for
>>  CPUID.8000001DH.EAX[25:14]
>>
>> Hi Zhao,
>>
>> On 7/14/25 03:08, Zhao Liu wrote:
>>> CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
>>> logical processors sharing this cache is the value of this field
>>> incremented by 1. Because of its width limitation, the maximum value
>>> currently supported is 4095.
>>>
>>> Though at present Q35 supports up to 4096 CPUs, by constructing a
>>> specific topology, the width of the APIC ID can be extended beyond 12
>>> bits. For example, using `-smp threads=33,cores=9,modules=9` results in
>>> a die level offset of 6 + 4 + 4 = 14 bits, which can also cause
>>> overflow. Check and honor the maximum value as CPUID.04H did.
>>>
>>> Cc: Babu Moger <babu.moger@amd.com>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Babu Moger <babu.moger@amd.com>

>>> ---
>>> Changes Since RFC v1 [*]:
>>>  * Correct the RFC's description: an overflow case does exist now. Provide
>>>    an overflow example.
>>>
>>> RFC:
>>>  * Although there are currently no overflow cases, to avoid any
>>>    potential issue, add the overflow check, just as I did for Intel.
>>>
>>> [*]: https://lore.kernel.org/qemu-devel/20250227062523.124601-5-zhao1.liu@intel.com/
>>> ---
>>>  target/i386/cpu.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index fedeeea151ee..eceda9865b8f 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -558,7 +558,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>>  
>>>      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>>>                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
>>> -    *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
>>> +    /* Bits 25:14 - NumSharingCache: maximum 4095. */
>>> +    *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14;
>>
>> Would this be more meaningful?
>>
>> *eax |=
>>  (max_thread_ids_for_cache(topo_info, cache->share_level) & 0xFFF) << 14;
> 
> Hi Babu, thank you for your feedback! That approach relies on truncation,
> which could lead to more misleading results. Such cases should not exist
> on real hardware at present; only QEMU supports this many CPUs and custom
> topologies.
> 
> When Intel faced similar cases (where the topology space wasn't large
> enough), it encoded the maximum value rather than truncating, which is
> what I'm doing here (see the description of leaf 0x1 in patch 5 and the
> similar fix for Intel's 0x4 leaf in patch 6). If real hardware ever
> reaches such CPU counts and defines special behavior, we can update
> accordingly. For now, at least, this avoids the overflow caused by
> special topologies in QEMU emulation.
> 

Sure. Sounds good to me.

-- 
Thanks
Babu Moger