[v2] xen/arm: Set correct per-cpu cpu_core_mask

[PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Henry Wang 2 months ago

In the common sysctl command XEN_SYSCTL_physinfo, the value of
cores_per_socket is calculated based on the cpu_core_mask of CPU0.
Currently on Arm this is a fixed value 1 (can be checked via xl info),
which is not correct. This is because during the Arm CPU online
process at boot time, setup_cpu_sibling_map() only sets the per-cpu
cpu_core_mask for itself.

cores_per_socket refers to the number of cores that belong to the same
socket (NUMA node). Currently Xen on Arm does not support physical
CPU hotplug and NUMA, also we assume there is no multithread. Therefore
cores_per_socket means all possible CPUs detected from the device
tree. Setting the per-cpu cpu_core_mask in setup_cpu_sibling_map()
accordingly. Drop the in-code comment which seems to be outdated.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Signed-off-by: Henry Wang <xin.wang2@amd.com>
---
v2:
- Do not do the multithread check.
---
 xen/arch/arm/smpboot.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index a84e706d77..d9ebd55d4a 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -66,7 +66,6 @@ static bool cpu_is_dead;
 
 /* ID of the PCPU we're running on */
 DEFINE_PER_CPU(unsigned int, cpu_id);
-/* XXX these seem awfully x86ish... */
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
 /* representing HT and core siblings of each logical CPU */
@@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
     cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
     cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
 
+    /* Currently we assume there is no multithread. */
+    cpumask_or(per_cpu(cpu_core_mask, cpu),
+               per_cpu(cpu_core_mask, cpu), &cpu_possible_map);
+
     return 0;
 }
 
-- 
2.34.1

Re: [PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Julien Grall 1 month, 2 weeks ago

Hi Henry,

On 28/02/2024 01:58, Henry Wang wrote:
> In the common sysctl command XEN_SYSCTL_physinfo, the value of
> cores_per_socket is calculated based on the cpu_core_mask of CPU0.
> Currently on Arm this is a fixed value 1 (can be checked via xl info),
> which is not correct. This is because during the Arm CPU online
> process at boot time, setup_cpu_sibling_map() only sets the per-cpu
> cpu_core_mask for itself.
> 
> cores_per_socket refers to the number of cores that belong to the same
> socket (NUMA node). Currently Xen on Arm does not support physical
> CPU hotplug and NUMA, also we assume there is no multithread. Therefore
> cores_per_socket means all possible CPUs detected from the device
> tree. Setting the per-cpu cpu_core_mask in setup_cpu_sibling_map()
> accordingly. Drop the in-code comment which seems to be outdated.
> 
> Signed-off-by: Henry Wang <Henry.Wang@arm.com>
> Signed-off-by: Henry Wang <xin.wang2@amd.com>
> ---
> v2:
> - Do not do the multithread check.
> ---
>   xen/arch/arm/smpboot.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index a84e706d77..d9ebd55d4a 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -66,7 +66,6 @@ static bool cpu_is_dead;
>   
>   /* ID of the PCPU we're running on */
>   DEFINE_PER_CPU(unsigned int, cpu_id);
> -/* XXX these seem awfully x86ish... */

:). I guess at the time we didn't realize that MT was supported on Arm. 
It is at least part of the spec, but as Michal pointed out it doesn't 
look like a lot of processors supports it.

>   /* representing HT siblings of each logical CPU */
>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
>   /* representing HT and core siblings of each logical CPU */
> @@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
>       cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>       cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
>   
> +    /* Currently we assume there is no multithread. */

I am not very familiar with the scheduling in Xen. Do you know what's 
the consequence of not properly supporting MT? One thing I can think of 
is security (I know there were plenty of security issues with SMT).

Depending on the answer, I would consider to print a warning and maybe 
add it in SUPPORT.MD in a separate patch.

Also, looking at the v1 discussion, it sounds like even cpu_sibling_mask 
would not be correct. So I think it would make sense to move the comment 
on top of setup_cpu_sibling_map.

> +    cpumask_or(per_cpu(cpu_core_mask, cpu),
> +               per_cpu(cpu_core_mask, cpu), &cpu_possible_map);

AFIACT, per_cpu(cpu_core_mask, ...) will now be equal to 
cpu_possible_map. In that case, wouldn't it be better to simply copy the 
cpumask?

This would also allow to drop cpumask_set_cpu(..., cpu_core_mask) above.

Cheers,

-- 
Julien Grall

Re: [PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Henry Wang 1 month, 2 weeks ago

Hi Julien,

On 3/14/2024 9:27 PM, Julien Grall wrote:
> Hi Henry,
>
> On 28/02/2024 01:58, Henry Wang wrote:
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index a84e706d77..d9ebd55d4a 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -66,7 +66,6 @@ static bool cpu_is_dead;
>>     /* ID of the PCPU we're running on */
>>   DEFINE_PER_CPU(unsigned int, cpu_id);
>> -/* XXX these seem awfully x86ish... */
>
> :). I guess at the time we didn't realize that MT was supported on 
> Arm. It is at least part of the spec, but as Michal pointed out it 
> doesn't look like a lot of processors supports it.

Yep. Do you think changing the content of this line to something like 
"Although multithread is part of the Arm spec, there are not many 
processors support multithread and current Xen on Arm assumes there is 
no multithread" makes sense to you?

>>   /* representing HT siblings of each logical CPU */
>>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
>>   /* representing HT and core siblings of each logical CPU */
>> @@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
>>       cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>>       cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
>>   +    /* Currently we assume there is no multithread. */
>
> I am not very familiar with the scheduling in Xen. Do you know what's 
> the consequence of not properly supporting MT? One thing I can think 
> of is security (I know there were plenty of security issues with SMT).

Unfortunately me neither, so adding George to this thread as I think he 
can bring us some insights on this topic from the scheduler perspective.

> Depending on the answer, I would consider to print a warning and maybe 
> add it in SUPPORT.MD in a separate patch.

To be honest, as discussed in v1. I think I am quite tempted to add an 
ASSERT(system_cpuinfo.mpidr.mt == 0) to make sure we catch the 
multithread support stuff earlier. I don't really know what will happen 
if running current Xen on top of a multithread-implemented processor, 
probably it will be fine, but probably some strange behavior will happen 
after the boot time which I think will be difficult to debug...

> Also, looking at the v1 discussion, it sounds like even 
> cpu_sibling_mask would not be correct. So I think it would make sense 
> to move the comment on top of setup_cpu_sibling_map.

Sounds good. I will move it in v3.

>> +    cpumask_or(per_cpu(cpu_core_mask, cpu),
>> +               per_cpu(cpu_core_mask, cpu), &cpu_possible_map);
>
> AFIACT, per_cpu(cpu_core_mask, ...) will now be equal to 
> cpu_possible_map. In that case, wouldn't it be better to simply copy 
> the cpumask?

You mean cpumask_copy(per_cpu(cpu_core_mask, cpu), &cpu_possible_map)? 
Good question...I forgot if there was a reason behind this back to the 
time I wrote this patch (it is likely just me being silly). But yes 
definitely I can do this way in v3.

> This would also allow to drop cpumask_set_cpu(..., cpu_core_mask) above.

Good point. I will drop in v3.

Kind regards,
Henry

>
> Cheers,
>

Re: [PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Julien Grall 1 month, 1 week ago


On 14/03/2024 14:22, Henry Wang wrote:
> Hi Julien,

Hi,

> 
> On 3/14/2024 9:27 PM, Julien Grall wrote:
>> Hi Henry,
>>
>> On 28/02/2024 01:58, Henry Wang wrote:
>>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>>> index a84e706d77..d9ebd55d4a 100644
>>> --- a/xen/arch/arm/smpboot.c
>>> +++ b/xen/arch/arm/smpboot.c
>>> @@ -66,7 +66,6 @@ static bool cpu_is_dead;
>>>     /* ID of the PCPU we're running on */
>>>   DEFINE_PER_CPU(unsigned int, cpu_id);
>>> -/* XXX these seem awfully x86ish... */
>>
>> :). I guess at the time we didn't realize that MT was supported on 
>> Arm. It is at least part of the spec, but as Michal pointed out it 
>> doesn't look like a lot of processors supports it.
> 
> Yep. Do you think changing the content of this line to something like 
> "Although multithread is part of the Arm spec, there are not many 
> processors support multithread and current Xen on Arm assumes there is 
> no multithread" makes sense to you?
> 
>>>   /* representing HT siblings of each logical CPU */
>>>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
>>>   /* representing HT and core siblings of each logical CPU */
>>> @@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
>>>       cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>>>       cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
>>>   +    /* Currently we assume there is no multithread. */
>>
>> I am not very familiar with the scheduling in Xen. Do you know what's 
>> the consequence of not properly supporting MT? One thing I can think 
>> of is security (I know there were plenty of security issues with SMT).
> 
> Unfortunately me neither, so adding George to this thread as I think he 
> can bring us some insights on this topic from the scheduler perspective.

+Juergen as he worked on co-scheduling.

> 
>> Depending on the answer, I would consider to print a warning and maybe 
>> add it in SUPPORT.MD in a separate patch.
> 
> To be honest, as discussed in v1. I think I am quite tempted to add an 
> ASSERT(system_cpuinfo.mpidr.mt == 0) to make sure we catch the 
> multithread support stuff earlier. 

ASSERT(...) is not the right solution. You may have user using a Xen 
shipped by distros where this would be a NOP.

We could try to hide MT completely from the scheduler. If that's too 
difficult, then we could add use warning_add() to throw a warning (see 
how we dealt with opt_hmp_unsafe).

I don't really know what will happen
> if running current Xen on top of a multithread-implemented processor, 
> probably it will be fine, but probably some strange behavior will happen 
> after the boot time which I think will be difficult to debug...

I am not sure what you mean by "strange behavior". AFAIK, you may see 
different performance characteristics and more importantly this is a 
nest for security issue. But I don't expect any difficult to debug.

> 
>> Also, looking at the v1 discussion, it sounds like even 
>> cpu_sibling_mask would not be correct. So I think it would make sense 
>> to move the comment on top of setup_cpu_sibling_map.
> 
> Sounds good. I will move it in v3.
> 
>>> +    cpumask_or(per_cpu(cpu_core_mask, cpu),
>>> +               per_cpu(cpu_core_mask, cpu), &cpu_possible_map);
>>
>> AFIACT, per_cpu(cpu_core_mask, ...) will now be equal to 
>> cpu_possible_map. In that case, wouldn't it be better to simply copy 
>> the cpumask?
> 
> You mean cpumask_copy(per_cpu(cpu_core_mask, cpu), &cpu_possible_map)? 

Yes.

Cheers,

-- 
Julien Grall

Re: [PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Jürgen Groß 1 month, 1 week ago

On 15.03.24 11:58, Julien Grall wrote:
> 
> 
> On 14/03/2024 14:22, Henry Wang wrote:
>> Hi Julien,
> 
> Hi,
> 
>>
>> On 3/14/2024 9:27 PM, Julien Grall wrote:
>>> Hi Henry,
>>>
>>> On 28/02/2024 01:58, Henry Wang wrote:
>>>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>>>> index a84e706d77..d9ebd55d4a 100644
>>>> --- a/xen/arch/arm/smpboot.c
>>>> +++ b/xen/arch/arm/smpboot.c
>>>> @@ -66,7 +66,6 @@ static bool cpu_is_dead;
>>>>     /* ID of the PCPU we're running on */
>>>>   DEFINE_PER_CPU(unsigned int, cpu_id);
>>>> -/* XXX these seem awfully x86ish... */
>>>
>>> :). I guess at the time we didn't realize that MT was supported on Arm. It is 
>>> at least part of the spec, but as Michal pointed out it doesn't look like a 
>>> lot of processors supports it.
>>
>> Yep. Do you think changing the content of this line to something like 
>> "Although multithread is part of the Arm spec, there are not many processors 
>> support multithread and current Xen on Arm assumes there is no multithread" 
>> makes sense to you?
>>
>>>>   /* representing HT siblings of each logical CPU */
>>>>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
>>>>   /* representing HT and core siblings of each logical CPU */
>>>> @@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
>>>>       cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>>>>       cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
>>>>   +    /* Currently we assume there is no multithread. */
>>>
>>> I am not very familiar with the scheduling in Xen. Do you know what's the 
>>> consequence of not properly supporting MT? One thing I can think of is 
>>> security (I know there were plenty of security issues with SMT).
>>
>> Unfortunately me neither, so adding George to this thread as I think he can 
>> bring us some insights on this topic from the scheduler perspective.
> 
> +Juergen as he worked on co-scheduling.

There are four aspects regarding multithreading:

- security (basically boils down to one thread possibly being able to use
   side channel attacks for accessing other thread's data via resources in
   the core shared between the threads)

- performance (trying to spread running vcpus across as many cores as
   possible will utilize more hardware resources resulting generally in better
   performance - especially credit2 is supporting that)

- power consumption (a completely idle core will consume less power - again
   credit2 is supporting that with the correct setting)

- busy loops (cpu_relax() support)

For credit2 MT specifics see the large comment block in xen/common/sched/credit2
starting at line 636.

Note that core scheduling is currently not supported on Arm, as there are
architecture specific bits missing in the context switching logic.


Juergen

Re: [PATCH v2] xen/arm: Set correct per-cpu cpu_core_mask

Posted by Henry Wang 1 month, 1 week ago

Hi Julien,

On 3/15/2024 6:58 PM, Julien Grall wrote:
> On 14/03/2024 14:22, Henry Wang wrote:
>> Hi Julien,
> Hi,
>>>>   /* representing HT siblings of each logical CPU */
>>>>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
>>>>   /* representing HT and core siblings of each logical CPU */
>>>> @@ -89,6 +88,10 @@ static int setup_cpu_sibling_map(int cpu)
>>>>       cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>>>>       cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
>>>>   +    /* Currently we assume there is no multithread. */
>>>
>>> I am not very familiar with the scheduling in Xen. Do you know 
>>> what's the consequence of not properly supporting MT? One thing I 
>>> can think of is security (I know there were plenty of security 
>>> issues with SMT).
>>
>> Unfortunately me neither, so adding George to this thread as I think 
>> he can bring us some insights on this topic from the scheduler 
>> perspective.
>
> +Juergen as he worked on co-scheduling.

Thanks.

>>
>>> Depending on the answer, I would consider to print a warning and 
>>> maybe add it in SUPPORT.MD in a separate patch.
>>
>> To be honest, as discussed in v1. I think I am quite tempted to add 
>> an ASSERT(system_cpuinfo.mpidr.mt == 0) to make sure we catch the 
>> multithread support stuff earlier. 
>
> ASSERT(...) is not the right solution. You may have user using a Xen 
> shipped by distros where this would be a NOP.
>
> We could try to hide MT completely from the scheduler. If that's too 
> difficult, then we could add use warning_add() to throw a warning (see 
> how we dealt with opt_hmp_unsafe).

Ok. Let's first see what George and Juergen will say, if MT cannot be 
hidden from scheduler for Arm, maybe we can add something like below:

if ( system_cpuinfo.mpidr.mt == 1 )
      warning_add("WARNING: MULTITHREADING HAS BEEN DETECTED ON THE 
PROCESSOR.\n"
                               "It might impact the security and 
stability of the system.\n");

Kind regards,
Henry