[PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy

Posted by Alejandro Vallejo 10 months, 1 week ago
Implements the helper for mapping vcpu_id to x2apic_id given a valid
topology in a policy. The algorithm is written with the intention of
extending it to leaves 0x1f and e26 in the future.

Toolstack doesn't set leaf 0xb and the HVM default policy has it cleared,
so the leaf is not implemented. In that case, the new helper just returns
the legacy mapping.
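
For example (mirroring one of the new unit tests): with 3 threads/core
(id_shift=2) and 8 cores/socket (id_shift=5), vCPU 35 maps to

    x2apic_id = (35 % 3)               /* thread within core */
              | (((35 / 3) % 8) << 2)  /* core within socket */
              | ((35 / 24) << 5)       /* socket             */
              = 2 | (3 << 2) | (1 << 5) = 46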

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 tools/tests/cpu-policy/test-cpu-policy.c | 128 +++++++++++++++++++++++
 xen/include/xen/lib/x86/cpu-policy.h     |   2 +
 xen/lib/x86/policy.c                     |  75 +++++++++++--
 3 files changed, 199 insertions(+), 6 deletions(-)

diff --git a/tools/tests/cpu-policy/test-cpu-policy.c b/tools/tests/cpu-policy/test-cpu-policy.c
index 301df2c002..6ff5c1dd3d 100644
--- a/tools/tests/cpu-policy/test-cpu-policy.c
+++ b/tools/tests/cpu-policy/test-cpu-policy.c
@@ -650,6 +650,132 @@ static void test_is_compatible_failure(void)
     }
 }
 
+static void test_x2apic_id_from_vcpu_id_success(void)
+{
+    static struct test {
+        const char *name;
+        uint32_t vcpu_id;
+        uint32_t x2apic_id;
+        struct cpu_policy policy;
+    } tests[] = {
+        {
+            .name = "3v: 3 t/c, 8 c/s", .vcpu_id = 3, .x2apic_id = 1 << 2,
+            .policy = {
+                .x86_vendor = X86_VENDOR_AMD,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "6v: 3 t/c, 8 c/s", .vcpu_id = 6, .x2apic_id = 2 << 2,
+            .policy = {
+                .x86_vendor = X86_VENDOR_AMD,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "24v: 3 t/c, 8 c/s", .vcpu_id = 24, .x2apic_id = 1 << 5,
+            .policy = {
+                .x86_vendor = X86_VENDOR_AMD,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "35v: 3 t/c, 8 c/s", .vcpu_id = 35,
+            .x2apic_id = (35 % 3) | (((35 / 3) % 8)  << 2) | ((35 / 24) << 5),
+            .policy = {
+                .x86_vendor = X86_VENDOR_AMD,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "96v: 7 t/c, 3 c/s", .vcpu_id = 96,
+            .x2apic_id = (96 % 7) | (((96 / 7) % 3)  << 3) | ((96 / 21) << 5),
+            .policy = {
+                .x86_vendor = X86_VENDOR_AMD,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 7, .level = 0, .type = 1, .id_shift = 3, },
+                    [1] = { .nr_logical = 3, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "3v: 3 t/c, 8 c/s", .vcpu_id = 3, .x2apic_id = 1 << 2,
+            .policy = {
+                .x86_vendor = X86_VENDOR_INTEL,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "6v: 3 t/c, 8 c/s", .vcpu_id = 6, .x2apic_id = 2 << 2,
+            .policy = {
+                .x86_vendor = X86_VENDOR_INTEL,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "24v: 3 t/c, 8 c/s", .vcpu_id = 24, .x2apic_id = 1 << 5,
+            .policy = {
+                .x86_vendor = X86_VENDOR_INTEL,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "35v: 3 t/c, 8 c/s", .vcpu_id = 35,
+            .x2apic_id = (35 % 3) | (((35 / 3) % 8)  << 2) | ((35 / 24) << 5),
+            .policy = {
+                .x86_vendor = X86_VENDOR_INTEL,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
+                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+        {
+            .name = "96v: 7 t/c, 3 c/s", .vcpu_id = 96,
+            .x2apic_id = (96 % 7) | (((96 / 7) % 3)  << 3) | ((96 / 21) << 5),
+            .policy = {
+                .x86_vendor = X86_VENDOR_INTEL,
+                .topo.subleaf = {
+                    [0] = { .nr_logical = 7,   .level = 0, .type = 1, .id_shift = 3, },
+                    [1] = { .nr_logical = 21,  .level = 1, .type = 2, .id_shift = 5, },
+                },
+            },
+        },
+    };
+
+    printf("Testing x2apic id from vcpu id success:\n");
+
+    for ( size_t i = 0; i < ARRAY_SIZE(tests); ++i )
+    {
+        struct test *t = &tests[i];
+        uint32_t x2apic_id = x86_x2apic_id_from_vcpu_id(&t->policy, t->vcpu_id);
+        if ( x2apic_id != t->x2apic_id )
+            fail("FAIL - '%s'. bad x2apic_id: expected=%u actual=%u\n",
+                 t->name, t->x2apic_id, x2apic_id);
+    }
+}
+
 int main(int argc, char **argv)
 {
     printf("CPU Policy unit tests\n");
@@ -667,6 +793,8 @@ int main(int argc, char **argv)
     test_is_compatible_success();
     test_is_compatible_failure();
 
+    test_x2apic_id_from_vcpu_id_success();
+
     if ( nr_failures )
         printf("Done: %u failures\n", nr_failures);
     else
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index 65f6335b32..d81ae2f47c 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -550,6 +550,8 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
 /**
  * Calculates the x2APIC ID of a vCPU given a CPU policy
  *
+ * If the policy lacks leaf 0xb falls back to legacy mapping of apic_id=cpu*2
+ *
  * @param p          CPU policy of the domain.
  * @param vcpu_id    vCPU ID of the vCPU.
  * @returns x2APIC ID of the vCPU.
diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
index a3b24e6879..2a50bc076a 100644
--- a/xen/lib/x86/policy.c
+++ b/xen/lib/x86/policy.c
@@ -2,15 +2,78 @@
 
 #include <xen/lib/x86/cpu-policy.h>
 
-uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
+static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
 {
     /*
-     * TODO: Derive x2APIC ID from the topology information inside `p`
-     *       rather than from vCPU ID. This bodge is a temporary measure
-     *       until all infra is in place to retrieve or derive the initial
-     *       x2APIC ID from migrated domains.
+     * `nr_logical` reported by Intel is the number of THREADS contained in
+     * the next topological scope. For example, assuming a system with 2
+     * threads/core and 3 cores/module in a fully symmetric topology,
+     * `nr_logical` at the core level will report 6. Because it's reporting
+     * the number of threads in a module.
+     *
+     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
+     * level (cores/complex, etc) so we can return it as-is.
      */
-    return vcpu_id * 2;
+    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
+        return p->topo.subleaf[lvl].nr_logical;
+
+    return p->topo.subleaf[lvl].nr_logical / p->topo.subleaf[lvl - 1].nr_logical;
+}
+
+uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t id)
+{
+    uint32_t shift = 0, x2apic_id = 0;
+
+    /* In the absence of topology leaves, fallback to traditional mapping */
+    if ( !p->topo.subleaf[0].type )
+        return id * 2;
+
+    /*
+     * `id` means different things at different points of the algo
+     *
+     * At lvl=0: global thread_id (same as vcpu_id)
+     * At lvl=1: global core_id
+     * At lvl=2: global socket_id (actually complex_id in AMD, module_id
+     *                             in Intel, but the name is inconsequential)
+     *
+     *                 +--+
+     *            ____ |#0| ______           <= 1 socket
+     *           /     +--+       \+--+
+     *       __#0__              __|#1|__    <= 2 cores/socket
+     *      /   |  \        +--+/  +-|+  \
+     *    #0   #1   #2      |#3|    #4    #5 <= 3 threads/core
+     *                      +--+
+     *
+     * ... and so on. Global in this context means that it's a unique
+     * identifier for the whole topology, and not relative to the level
+     * it's in. For example, in the diagram shown above, we're looking at
+     * thread #3 in the global sense, though it's #0 within its core.
+     *
+     * Note that dividing a global thread_id by the number of threads per
+     * core returns the global core id that contains it. e.g: 0, 1 or 2
+     * divided by 3 returns core_id=0. 3, 4 or 5 divided by 3 returns core
+     * 1, and so on. An analogous argument holds for higher levels. This is
+     * the property we exploit to derive x2apic_id from vcpu_id.
+     *
+     * NOTE: `topo` is currently derived from leaf 0xb, which is bound to
+     * two levels, but once we track leaves 0x1f (or e26) there will be a
+     * few more. The algorithm is written to cope with that case.
+     */
+    for ( uint32_t i = 0; i < ARRAY_SIZE(p->topo.raw); i++ )
+    {
+        uint32_t nr_parts;
+
+        if ( !p->topo.subleaf[i].type )
+            /* sentinel subleaf */
+            break;
+
+        nr_parts = parts_per_higher_scoped_level(p, i);
+        x2apic_id |= (id % nr_parts) << shift;
+        id /= nr_parts;
+        shift = p->topo.subleaf[i].id_shift;
+    }
+
+    return (id << shift) | x2apic_id;
 }
 
 int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
-- 
2.34.1
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Jan Beulich 7 months, 3 weeks ago
On 09.01.2024 16:38, Alejandro Vallejo wrote:
> --- a/xen/lib/x86/policy.c
> +++ b/xen/lib/x86/policy.c
> @@ -2,15 +2,78 @@
>  
>  #include <xen/lib/x86/cpu-policy.h>
>  
> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>  {
>      /*
> -     * TODO: Derive x2APIC ID from the topology information inside `p`
> -     *       rather than from vCPU ID. This bodge is a temporary measure
> -     *       until all infra is in place to retrieve or derive the initial
> -     *       x2APIC ID from migrated domains.
> +     * `nr_logical` reported by Intel is the number of THREADS contained in
> +     * the next topological scope. For example, assuming a system with 2
> +     * threads/core and 3 cores/module in a fully symmetric topology,
> +     * `nr_logical` at the core level will report 6. Because it's reporting
> +     * the number of threads in a module.
> +     *
> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
> +     * level (cores/complex, etc) so we can return it as-is.
>       */
> -    return vcpu_id * 2;
> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
> +        return p->topo.subleaf[lvl].nr_logical;

Is "!= Intel" really appropriate here? I'd rather see this being "AMD || Hygon".

Jan

> +    return p->topo.subleaf[lvl].nr_logical / p->topo.subleaf[lvl - 1].nr_logical;
> +}
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Alejandro Vallejo 6 months, 2 weeks ago
Hi,

On 26/03/2024 16:41, Jan Beulich wrote:
> On 09.01.2024 16:38, Alejandro Vallejo wrote:
>> --- a/xen/lib/x86/policy.c
>> +++ b/xen/lib/x86/policy.c
>> @@ -2,15 +2,78 @@
>>  
>>  #include <xen/lib/x86/cpu-policy.h>
>>  
>> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
>> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>>  {
>>      /*
>> -     * TODO: Derive x2APIC ID from the topology information inside `p`
>> -     *       rather than from vCPU ID. This bodge is a temporary measure
>> -     *       until all infra is in place to retrieve or derive the initial
>> -     *       x2APIC ID from migrated domains.
>> +     * `nr_logical` reported by Intel is the number of THREADS contained in
>> +     * the next topological scope. For example, assuming a system with 2
>> +     * threads/core and 3 cores/module in a fully symmetric topology,
>> +     * `nr_logical` at the core level will report 6. Because it's reporting
>> +     * the number of threads in a module.
>> +     *
>> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
>> +     * level (cores/complex, etc) so we can return it as-is.
>>       */
>> -    return vcpu_id * 2;
>> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
>> +        return p->topo.subleaf[lvl].nr_logical;
> 
> Is "!= Intel" really appropriate here? I'd rather see this being "AMD || Hygon".

Sure, I don't particularly mind, but why? As far as we know only Intel
has this interpretation for the part counts. I definitely haven't seen
any non-Intel CPUID dump in which the part count is the total number of
threads (Centaur/Zhaoxin are not multithreaded, and don't expose leaves
1f or e26, as far as I could see).

Cheers,
Alejandro
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Jan Beulich 6 months, 2 weeks ago
On 01.05.2024 18:35, Alejandro Vallejo wrote:
> Hi,
> 
> On 26/03/2024 16:41, Jan Beulich wrote:
>> On 09.01.2024 16:38, Alejandro Vallejo wrote:
>>> --- a/xen/lib/x86/policy.c
>>> +++ b/xen/lib/x86/policy.c
>>> @@ -2,15 +2,78 @@
>>>  
>>>  #include <xen/lib/x86/cpu-policy.h>
>>>  
>>> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
>>> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>>>  {
>>>      /*
>>> -     * TODO: Derive x2APIC ID from the topology information inside `p`
>>> -     *       rather than from vCPU ID. This bodge is a temporary measure
>>> -     *       until all infra is in place to retrieve or derive the initial
>>> -     *       x2APIC ID from migrated domains.
>>> +     * `nr_logical` reported by Intel is the number of THREADS contained in
>>> +     * the next topological scope. For example, assuming a system with 2
>>> +     * threads/core and 3 cores/module in a fully symmetric topology,
>>> +     * `nr_logical` at the core level will report 6. Because it's reporting
>>> +     * the number of threads in a module.
>>> +     *
>>> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
>>> +     * level (cores/complex, etc) so we can return it as-is.
>>>       */
>>> -    return vcpu_id * 2;
>>> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
>>> +        return p->topo.subleaf[lvl].nr_logical;
>>
>> Is "!= Intel" really appropriate here? I'd rather see this being "AMD || Hygon".
> 
> Sure, I don't particularly mind, but why? As far as we know only Intel
> has this interpretation for the part counts. I definitely haven't seen
> any non-Intel CPUID dump in which the part count is the total number of
> threads (Centaur/Zhaoxin are not multithreaded, and don't expose leaves
> 1f or e26, as far as I could see).

Because of x86'es origin and perhaps other historical aspects, cloning
Intel behavior is far more likely. The fact that Hygon matches AMD is
simply because they took AMD's design wholesale.

Jan
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Jan Beulich 6 months, 2 weeks ago
On 02.05.2024 08:55, Jan Beulich wrote:
> On 01.05.2024 18:35, Alejandro Vallejo wrote:
>> Hi,
>>
>> On 26/03/2024 16:41, Jan Beulich wrote:
>>> On 09.01.2024 16:38, Alejandro Vallejo wrote:
>>>> --- a/xen/lib/x86/policy.c
>>>> +++ b/xen/lib/x86/policy.c
>>>> @@ -2,15 +2,78 @@
>>>>  
>>>>  #include <xen/lib/x86/cpu-policy.h>
>>>>  
>>>> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
>>>> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>>>>  {
>>>>      /*
>>>> -     * TODO: Derive x2APIC ID from the topology information inside `p`
>>>> -     *       rather than from vCPU ID. This bodge is a temporary measure
>>>> -     *       until all infra is in place to retrieve or derive the initial
>>>> -     *       x2APIC ID from migrated domains.
>>>> +     * `nr_logical` reported by Intel is the number of THREADS contained in
>>>> +     * the next topological scope. For example, assuming a system with 2
>>>> +     * threads/core and 3 cores/module in a fully symmetric topology,
>>>> +     * `nr_logical` at the core level will report 6. Because it's reporting
>>>> +     * the number of threads in a module.
>>>> +     *
>>>> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
>>>> +     * level (cores/complex, etc) so we can return it as-is.
>>>>       */
>>>> -    return vcpu_id * 2;
>>>> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
>>>> +        return p->topo.subleaf[lvl].nr_logical;
>>>
>>> Is "!= Intel" really appropriate here? I'd rather see this being "AMD || Hygon".
>>
>> Sure, I don't particularly mind, but why? As far as we know only Intel
>> has this interpretation for the part counts. I definitely haven't seen
>> any non-Intel CPUID dump in which the part count is the total number of
>> threads (Centaur/Zhaoxin are not multithreaded, and don't expose leaves
>> 1f or e26, as far as I could see).
> 
> Because of x86'es origin and perhaps other historical aspects, cloning
> Intel behavior is far more likely. The fact that Hygon matches AMD is
> simply because they took AMD's design wholesale.

Perhaps: See how many dead ends AMD have created, i.e. stuff they proudly
introduced into the architecture, but then gave up again (presumably for
diverging too far from Intel, and hence lacking long term acceptance):
3DNow!, LWP, and XOP just to name those that come to mind right away.

Jan
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Alejandro Vallejo 6 months, 2 weeks ago
On 02/05/2024 07:57, Jan Beulich wrote:
> On 02.05.2024 08:55, Jan Beulich wrote:
>> On 01.05.2024 18:35, Alejandro Vallejo wrote:
>>> Hi,
>>>
>>> On 26/03/2024 16:41, Jan Beulich wrote:
>>>> On 09.01.2024 16:38, Alejandro Vallejo wrote:
>>>>> --- a/xen/lib/x86/policy.c
>>>>> +++ b/xen/lib/x86/policy.c
>>>>> @@ -2,15 +2,78 @@
>>>>>  
>>>>>  #include <xen/lib/x86/cpu-policy.h>
>>>>>  
>>>>> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
>>>>> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>>>>>  {
>>>>>      /*
>>>>> -     * TODO: Derive x2APIC ID from the topology information inside `p`
>>>>> -     *       rather than from vCPU ID. This bodge is a temporary measure
>>>>> -     *       until all infra is in place to retrieve or derive the initial
>>>>> -     *       x2APIC ID from migrated domains.
>>>>> +     * `nr_logical` reported by Intel is the number of THREADS contained in
>>>>> +     * the next topological scope. For example, assuming a system with 2
>>>>> +     * threads/core and 3 cores/module in a fully symmetric topology,
>>>>> +     * `nr_logical` at the core level will report 6. Because it's reporting
>>>>> +     * the number of threads in a module.
>>>>> +     *
>>>>> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
>>>>> +     * level (cores/complex, etc) so we can return it as-is.
>>>>>       */
>>>>> -    return vcpu_id * 2;
>>>>> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
>>>>> +        return p->topo.subleaf[lvl].nr_logical;
>>>>
>>>> Is "!= Intel" really appropriate here? I'd rather see this being "AMD || Hygon".
>>>
>>> Sure, I don't particularly mind, but why? As far as we know only Intel
>>> has this interpretation for the part counts. I definitely haven't seen
>>> any non-Intel CPUID dump in which the part count is the total number of
>>> threads (Centaur/Zhaoxin are not multithreaded, and don't expose leaves
>>> 1f or e26, as far as I could see).
>>
>> Because of x86'es origin and perhaps other historical aspects, cloning
>> Intel behavior is far more likely.

That claim doesn't hold very well seeing how...

>> The fact that Hygon matches AMD is
>> simply because they took AMD's design wholesale.

... this statement contradicts it. We can't predict which new vendor (if
any) will be cloned/mimicked next, so that's not a very plausible reason
to prioritise a specific vendor in conditionals.

It remains to be seen what a Zhaoxin actually looks like, because I
couldn't get ahold of a complete cpuid dump.

> 
> Perhaps: See how many dead ends AMD have created, i.e. stuff they proudly
> introduced into the architecture, but then gave up again (presumably for
> diverging too far from Intel, and hence lacking long term acceptance):
> 3DNow!, LWP, and XOP just to name those that come to mind right away.
> 
> Jan

I can't say I agree on the cause; regardless, I'd rather not discuss the
relative merits of vendors with regard to backwards compatibility, as
that's beside the point. The point is whether there's a credible
technical reason to prefer this...

  if ( !(a & (B | C)) )
      foo();

... to this...

  if ( a == A )
      foo();

..., as is the case in patch6.

I argue there's not, and in fact legibility-wise the latter is very
clearly superior.

There's also a compelling reason to keep the check coherent on both
generators to avoid bad surprises down the line.

Cheers,
Alejandro
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Roger Pau Monné 7 months, 3 weeks ago
On Tue, Jan 09, 2024 at 03:38:33PM +0000, Alejandro Vallejo wrote:
> Implements the helper for mapping vcpu_id to x2apic_id given a valid
> topology in a policy. The algorithm is written with the intention of
> extending it to leaves 0x1f and e26 in the future.
> 
> Toolstack doesn't set leaf 0xb and the HVM default policy has it cleared,
> so the leaf is not implemented. In that case, the new helper just returns
> the legacy mapping.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
>  tools/tests/cpu-policy/test-cpu-policy.c | 128 +++++++++++++++++++++++
>  xen/include/xen/lib/x86/cpu-policy.h     |   2 +
>  xen/lib/x86/policy.c                     |  75 +++++++++++--
>  3 files changed, 199 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/tests/cpu-policy/test-cpu-policy.c b/tools/tests/cpu-policy/test-cpu-policy.c
> index 301df2c002..6ff5c1dd3d 100644
> --- a/tools/tests/cpu-policy/test-cpu-policy.c
> +++ b/tools/tests/cpu-policy/test-cpu-policy.c
> @@ -650,6 +650,132 @@ static void test_is_compatible_failure(void)
>      }
>  }
>  
> +static void test_x2apic_id_from_vcpu_id_success(void)
> +{
> +    static struct test {

const

> +        const char *name;
> +        uint32_t vcpu_id;
> +        uint32_t x2apic_id;
> +        struct cpu_policy policy;
> +    } tests[] = {
> +        {
> +            .name = "3v: 3 t/c, 8 c/s", .vcpu_id = 3, .x2apic_id = 1 << 2,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
> +                },

Don't we need a helper that gets passed the number of cores per
socket and threads per core and fills this up?

I would introduce this first, add a test for it, and then do this
testing using the helper.

> +            },
> +        },
> +        {
> +            .name = "6v: 3 t/c, 8 c/s", .vcpu_id = 6, .x2apic_id = 2 << 2,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "24v: 3 t/c, 8 c/s", .vcpu_id = 24, .x2apic_id = 1 << 5,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "35v: 3 t/c, 8 c/s", .vcpu_id = 35,
> +            .x2apic_id = (35 % 3) | (((35 / 3) % 8)  << 2) | ((35 / 24) << 5),
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "96v: 7 t/c, 3 c/s", .vcpu_id = 96,
> +            .x2apic_id = (96 % 7) | (((96 / 7) % 3)  << 3) | ((96 / 21) << 5),
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_AMD,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 7, .level = 0, .type = 1, .id_shift = 3, },
> +                    [1] = { .nr_logical = 3, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "3v: 3 t/c, 8 c/s", .vcpu_id = 3, .x2apic_id = 1 << 2,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "6v: 3 t/c, 8 c/s", .vcpu_id = 6, .x2apic_id = 2 << 2,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "24v: 3 t/c, 8 c/s", .vcpu_id = 24, .x2apic_id = 1 << 5,
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "35v: 3 t/c, 8 c/s", .vcpu_id = 35,
> +            .x2apic_id = (35 % 3) | (((35 / 3) % 8)  << 2) | ((35 / 24) << 5),
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 3,  .level = 0, .type = 1, .id_shift = 2, },
> +                    [1] = { .nr_logical = 24, .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +        {
> +            .name = "96v: 7 t/c, 3 c/s", .vcpu_id = 96,
> +            .x2apic_id = (96 % 7) | (((96 / 7) % 3)  << 3) | ((96 / 21) << 5),
> +            .policy = {
> +                .x86_vendor = X86_VENDOR_INTEL,
> +                .topo.subleaf = {
> +                    [0] = { .nr_logical = 7,   .level = 0, .type = 1, .id_shift = 3, },
> +                    [1] = { .nr_logical = 21,  .level = 1, .type = 2, .id_shift = 5, },
> +                },
> +            },
> +        },
> +    };
> +
> +    printf("Testing x2apic id from vcpu id success:\n");
> +
> +    for ( size_t i = 0; i < ARRAY_SIZE(tests); ++i )
> +    {
> +        struct test *t = &tests[i];

const

> +        uint32_t x2apic_id = x86_x2apic_id_from_vcpu_id(&t->policy, t->vcpu_id);

Newline preferably.

> +        if ( x2apic_id != t->x2apic_id )
> +            fail("FAIL - '%s'. bad x2apic_id: expected=%u actual=%u\n",
> +                 t->name, t->x2apic_id, x2apic_id);
> +    }
> +}
> +
>  int main(int argc, char **argv)
>  {
>      printf("CPU Policy unit tests\n");
> @@ -667,6 +793,8 @@ int main(int argc, char **argv)
>      test_is_compatible_success();
>      test_is_compatible_failure();
>  
> +    test_x2apic_id_from_vcpu_id_success();
> +
>      if ( nr_failures )
>          printf("Done: %u failures\n", nr_failures);
>      else
> diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
> index 65f6335b32..d81ae2f47c 100644
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -550,6 +550,8 @@ int x86_cpu_policies_are_compatible(const struct cpu_policy *host,
>  /**
>   * Calculates the x2APIC ID of a vCPU given a CPU policy
>   *
> + * If the policy lacks leaf 0xb falls back to legacy mapping of apic_id=cpu*2
> + *
>   * @param p          CPU policy of the domain.
>   * @param vcpu_id    vCPU ID of the vCPU.
>   * @returns x2APIC ID of the vCPU.
> diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
> index a3b24e6879..2a50bc076a 100644
> --- a/xen/lib/x86/policy.c
> +++ b/xen/lib/x86/policy.c
> @@ -2,15 +2,78 @@
>  
>  #include <xen/lib/x86/cpu-policy.h>
>  
> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>  {
>      /*
> -     * TODO: Derive x2APIC ID from the topology information inside `p`
> -     *       rather than from vCPU ID. This bodge is a temporary measure
> -     *       until all infra is in place to retrieve or derive the initial
> -     *       x2APIC ID from migrated domains.
> +     * `nr_logical` reported by Intel is the number of THREADS contained in
> +     * the next topological scope. For example, assuming a system with 2
> +     * threads/core and 3 cores/module in a fully symmetric topology,
> +     * `nr_logical` at the core level will report 6. Because it's reporting
> +     * the number of threads in a module.
> +     *
> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
> +     * level (cores/complex, etc) so we can return it as-is.
>       */
> -    return vcpu_id * 2;
> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
> +        return p->topo.subleaf[lvl].nr_logical;
> +
> +    return p->topo.subleaf[lvl].nr_logical / p->topo.subleaf[lvl - 1].nr_logical;

I think this is an optimization because we only have 2 levels;
otherwise you would need a loop like:

unsigned int nr = p->topo.subleaf[lvl].nr_logical;
for ( unsigned int i = 0; i < lvl; i++ )
    nr /= p->topo.subleaf[i].nr_logical;

If that's the case I think we need a
BUILD_BUG_ON(ARRAY_SIZE(p->topo.subleaf) > 1);

> +}
> +
> +uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t id)

Can you keep the previous vcpu_id parameter name?  Or alternatively
adjust the prototype to also be id.

Thanks, Roger.
Re: [PATCH 5/6] xen/x86: Derive topologically correct x2APIC IDs from the policy
Posted by Alejandro Vallejo 6 months, 2 weeks ago
Hi,

Ack and sure to everything on types, constness and variable names.

On 20/03/2024 10:15, Roger Pau Monné wrote:
>> +        const char *name;
>> +        uint32_t vcpu_id;
>> +        uint32_t x2apic_id;
>> +        struct cpu_policy policy;
>> +    } tests[] = {
>> +        {
>> +            .name = "3v: 3 t/c, 8 c/s", .vcpu_id = 3, .x2apic_id = 1 << 2,
>> +            .policy = {
>> +                .x86_vendor = X86_VENDOR_AMD,
>> +                .topo.subleaf = {
>> +                    [0] = { .nr_logical = 3, .level = 0, .type = 1, .id_shift = 2, },
>> +                    [1] = { .nr_logical = 8, .level = 1, .type = 2, .id_shift = 5, },
>> +                },
> 
> Don't we need a helper that gets passed the number of cores per
> socket and threads per core and fills this up?
> 
> I would introduce this first, add a test for it, and then do this
> testing using the helper.

Funnily enough that's how I tested it initially. Alas, it felt silly to
put it where it will be linked against the hypervisor. I'm happy to put
it back there.
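
Something along these lines would do (sketch only; the helper name, its
placement and the use of __builtin_clz are illustrative, not necessarily
what v2 will carry), matching the subleaf layout the tests open-code:

  /* Assumes xen/lib/x86/cpu-policy.h for struct cpu_policy. */

  /* Bits needed to hold IDs 0 .. n-1 (i.e. ceil(log2(n))). */
  static unsigned int topo_shift(unsigned int n)
  {
      return n > 1 ? 32 - __builtin_clz(n - 1) : 0;
  }

  static void topo_from_parts(struct cpu_policy *p,
                              unsigned int threads_per_core,
                              unsigned int cores_per_pkg)
  {
      unsigned int core_shift = topo_shift(threads_per_core);

      /* Thread level: threads per core. */
      p->topo.subleaf[0].nr_logical = threads_per_core;
      p->topo.subleaf[0].level = 0;
      p->topo.subleaf[0].type = 1;
      p->topo.subleaf[0].id_shift = core_shift;

      /* Core level: Intel counts threads/socket, AMD/Hygon counts cores. */
      p->topo.subleaf[1].nr_logical =
          p->x86_vendor == X86_VENDOR_INTEL ? threads_per_core * cores_per_pkg
                                            : cores_per_pkg;
      p->topo.subleaf[1].level = 1;
      p->topo.subleaf[1].type = 2;
      p->topo.subleaf[1].id_shift = core_shift + topo_shift(cores_per_pkg);
  }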

>> diff --git a/xen/lib/x86/policy.c b/xen/lib/x86/policy.c
>> index a3b24e6879..2a50bc076a 100644
>> --- a/xen/lib/x86/policy.c
>> +++ b/xen/lib/x86/policy.c
>> @@ -2,15 +2,78 @@
>>  
>>  #include <xen/lib/x86/cpu-policy.h>
>>  
>> -uint32_t x86_x2apic_id_from_vcpu_id(const struct cpu_policy *p, uint32_t vcpu_id)
>> +static uint32_t parts_per_higher_scoped_level(const struct cpu_policy *p, size_t lvl)
>>  {
>>      /*
>> -     * TODO: Derive x2APIC ID from the topology information inside `p`
>> -     *       rather than from vCPU ID. This bodge is a temporary measure
>> -     *       until all infra is in place to retrieve or derive the initial
>> -     *       x2APIC ID from migrated domains.
>> +     * `nr_logical` reported by Intel is the number of THREADS contained in
>> +     * the next topological scope. For example, assuming a system with 2
>> +     * threads/core and 3 cores/module in a fully symmetric topology,
>> +     * `nr_logical` at the core level will report 6. Because it's reporting
>> +     * the number of threads in a module.
>> +     *
>> +     * On AMD/Hygon, nr_logical is already normalized by the higher scoped
>> +     * level (cores/complex, etc) so we can return it as-is.
>>       */
>> -    return vcpu_id * 2;
>> +    if ( p->x86_vendor != X86_VENDOR_INTEL || !lvl )
>> +        return p->topo.subleaf[lvl].nr_logical;
>> +
>> +    return p->topo.subleaf[lvl].nr_logical / p->topo.subleaf[lvl - 1].nr_logical;
> 
> I think this is an optimization because we only have 2 levels,
> otherwise you would need a loop like:
> 
> unsigned int nr = p->topo.subleaf[lvl].nr_logical
> for ( unsigned int i; i < lvl; i++ )
>     nr /= p->topo.subleaf[i].nr_logical;
> 
> If that's the case I think we need a
> BUILD_BUG_ON(ARRAY_SIZE(p->topo.subleaf) > 1);

It's not meant to be. That division should still hold for leaves 0x1f
and e26 where the level count typically exceeds 2. From the SDM.

```
Bits 15-00: The number of logical processors across all instances of
this domain within the next higher-scoped domain. (For example, in a
processor socket/package comprising “M” dies of “N” cores each, where
each core has “L” logical processors, the “die” domain sub-leaf value of
this field would be M*N*L.) This number reflects configuration as
shipped by Intel. Note, software must not use this field to enumerate
processor topology*.
```

So: if level1 has nr_logical=X*Y and level2 has nr_logical=X*Y*Z, then
dividing the latter by the former gives you Z, which is the number of
lvl1-parts for a single instance of lvl2 (i.e. cores/package, or whatever).
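
To make it concrete with made-up numbers (2 threads/core, 3 cores/die,
4 dies/package):

  SMT  subleaf: nr_logical = 2
  core subleaf: nr_logical = 2*3   = 6
  die  subleaf: nr_logical = 2*3*4 = 24

  6/2 = 3 cores/die and 24/6 = 4 dies/package

... which are exactly the per-level divisors the helper needs, however
many levels are enumerated.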

Unless I'm missing something?

Cheers,
Alejandro