[RFC PATCH 0/2] Boot time cpupools

Luca Fancellu posted 2 patches 2 weeks, 4 days ago

Currently Xen creates a default cpupool0 that contains all the CPUs brought up
during boot, and it assumes that the platform has only one kind of CPU.
This assumption does not hold on big.LITTLE platforms, and putting different
types of CPUs in the same cpupool can result in instability and security issues
for the domains running on the pool.

For this reason this series introduces an architecture-specific way to create
different cpupools at boot time. This is particularly useful on Arm big.LITTLE
platforms, where there might be a need for a different cpupool for each type
of core; systems using NUMA can likewise have a different cpupool for each node.

To test this series, the hmp-unsafe Xen boot argument is passed to allow Xen
to start CPUs of a different type from the boot core.

Luca Fancellu (2):
  xen/cpupool: Create different cpupools at boot time
  tools/cpupools: Give a name to unnamed cpupools

 tools/libs/light/libxl_utils.c | 13 ++++--
 xen/arch/arm/Kconfig           | 15 ++++++
 xen/arch/arm/Makefile          |  1 +
 xen/arch/arm/cpupool.c         | 84 ++++++++++++++++++++++++++++++++++
 xen/common/sched/cpupool.c     |  5 +-
 xen/include/xen/cpupool.h      | 30 ++++++++++++
 6 files changed, 142 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/arm/cpupool.c
 create mode 100644 xen/include/xen/cpupool.h

-- 
2.17.1


Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Julien Grall 2 weeks, 4 days ago
Hi Luca,

On 17/11/2021 09:57, Luca Fancellu wrote:
> Currently Xen creates a default cpupool0 that contains all the cpu brought up
> during boot and it assumes that the platform has only one kind of CPU.
> This assumption does not hold on big.LITTLE platform, but putting different
> type of CPU in the same cpupool can result in instability and security issues
> for the domains running on the pool.

I agree that you can't move a LITTLE vCPU to a big pCPU. However...

> 
> For this reason this serie introduces an architecture specific way to create
> different cpupool at boot time, this is particularly useful on ARM big.LITTLE
> platform where there might be the need to have different cpupools for each type
> of core, but also systems using NUMA can have different cpu pool for each node.

... from my understanding, all the vCPUs of a domain have to be in the 
same cpupool. So with this approach it is not possible:
    1) to have a mix of LITTLE and big vCPUs in the domain
    2) to create a domain spanning across two NUMA nodes

So I think we need to make sure that any solution we go for will 
not prevent us from implementing those setups.

I can see two options here:
   1) Allowing a domain vCPUs to be on a different cpupool
   2) Introducing CPU class (see [1])

I can't remember why Dario suggested 2) rather than 1) in the past. 
@Dario, do you remember it?

Cheers,

[1] https://lore.kernel.org/xen-devel/1481135379.3445.142.camel@citrix.com/

> 
> To test this serie, the hmp_unsafe Xen boot argument is passed to allow Xen
> starting different CPUs from the boot core.
> 
> Luca Fancellu (2):
>    xen/cpupool: Create different cpupools at boot time
>    tools/cpupools: Give a name to unnamed cpupools
> 
>   tools/libs/light/libxl_utils.c | 13 ++++--
>   xen/arch/arm/Kconfig           | 15 ++++++
>   xen/arch/arm/Makefile          |  1 +
>   xen/arch/arm/cpupool.c         | 84 ++++++++++++++++++++++++++++++++++
>   xen/common/sched/cpupool.c     |  5 +-
>   xen/include/xen/cpupool.h      | 30 ++++++++++++
>   6 files changed, 142 insertions(+), 6 deletions(-)
>   create mode 100644 xen/arch/arm/cpupool.c
>   create mode 100644 xen/include/xen/cpupool.h
> 

-- 
Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Bertrand Marquis 2 weeks, 4 days ago
Hi Julien,

> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> On 17/11/2021 09:57, Luca Fancellu wrote:
>> Currently Xen creates a default cpupool0 that contains all the cpu brought up
>> during boot and it assumes that the platform has only one kind of CPU.
>> This assumption does not hold on big.LITTLE platform, but putting different
>> type of CPU in the same cpupool can result in instability and security issues
>> for the domains running on the pool.
> 
> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
> 
>> For this reason this serie introduces an architecture specific way to create
>> different cpupool at boot time, this is particularly useful on ARM big.LITTLE
>> platform where there might be the need to have different cpupools for each type
>> of core, but also systems using NUMA can have different cpu pool for each node.
> 
> ... from my understanding, all the vCPUs of a domain have to be in the same cpupool. So with this approach it is not possible:
>   1) to have a mix of LITTLE and big vCPUs in the domain
>   2) to create a domain spanning across two NUMA nodes
> 
> So I think we need to make sure that any solutions we go through will not prevent us to implement those setups.

The point of this patch is to make all cores available without breaking the current behaviour of existing systems.

Someone not using cpupools will keep running on the same cores as before.
Someone wanting to use the other cores can assign a guest to the other cpupool(s) (big.LITTLE is just an example with 2 types, but there are now platforms with 3 types of cores).
Someone wanting to build something different can now create new cpupools in Dom0 and assign the cores they want to them, to build a guest having access to different types of cores.

The point here is just to make the “other” cores accessible and park them in cpupools so that the current behaviour is not changed.

> 
> I can see two options here:
>  1) Allowing a domain vCPUs to be on a different cpupool
>  2) Introducing CPU class (see [1])
> 
> I can't remember why Dario suggested 2) rather than 1) in the past. @Dario, do you remember it?

I think 1) is definitely interesting and something that could be looked at in the future.

This series just aims at making all cores available without breaking backward compatibility, which is a good improvement and does not contradict the two options you are suggesting.

Cheers
Bertrand

> 
> Cheers,
> 
> [1] https://lore.kernel.org/xen-devel/1481135379.3445.142.camel@citrix.com/
> 
>> To test this serie, the hmp_unsafe Xen boot argument is passed to allow Xen
>> starting different CPUs from the boot core.
>> Luca Fancellu (2):
>>   xen/cpupool: Create different cpupools at boot time
>>   tools/cpupools: Give a name to unnamed cpupools
>>  tools/libs/light/libxl_utils.c | 13 ++++--
>>  xen/arch/arm/Kconfig           | 15 ++++++
>>  xen/arch/arm/Makefile          |  1 +
>>  xen/arch/arm/cpupool.c         | 84 ++++++++++++++++++++++++++++++++++
>>  xen/common/sched/cpupool.c     |  5 +-
>>  xen/include/xen/cpupool.h      | 30 ++++++++++++
>>  6 files changed, 142 insertions(+), 6 deletions(-)
>>  create mode 100644 xen/arch/arm/cpupool.c
>>  create mode 100644 xen/include/xen/cpupool.h
> 
> -- 
> Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Julien Grall 2 weeks, 4 days ago
On 17/11/2021 11:16, Bertrand Marquis wrote:
> Hi Julien,

Hi,

>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>> Currently Xen creates a default cpupool0 that contains all the cpu brought up
>>> during boot and it assumes that the platform has only one kind of CPU.
>>> This assumption does not hold on big.LITTLE platform, but putting different
>>> type of CPU in the same cpupool can result in instability and security issues
>>> for the domains running on the pool.
>>
>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>
>>> For this reason this serie introduces an architecture specific way to create
>>> different cpupool at boot time, this is particularly useful on ARM big.LITTLE
>>> platform where there might be the need to have different cpupools for each type
>>> of core, but also systems using NUMA can have different cpu pool for each node.
>>
>> ... from my understanding, all the vCPUs of a domain have to be in the same cpupool. So with this approach it is not possible:
>>    1) to have a mix of LITTLE and big vCPUs in the domain
>>    2) to create a domain spanning across two NUMA nodes
>>
>> So I think we need to make sure that any solutions we go through will not prevent us to implement those setups.
> 
> The point of this patch is to make all cores available without breaking the current behaviour of existing system.

I might be missing some context here. By breaking current behavior, do 
you mean user that may want to add "hmp-unsafe" on the command line?

Cheers,

-- 
Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Bertrand Marquis 2 weeks, 4 days ago
Hi Julien,

> On 17 Nov 2021, at 11:48, Julien Grall <julien@xen.org> wrote:
> 
> On 17/11/2021 11:16, Bertrand Marquis wrote:
>> Hi Julien,
> 
> Hi,
> 
>>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>> 
>>> Hi Luca,
>>> 
>>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>>> Currently Xen creates a default cpupool0 that contains all the cpu brought up
>>>> during boot and it assumes that the platform has only one kind of CPU.
>>>> This assumption does not hold on big.LITTLE platform, but putting different
>>>> type of CPU in the same cpupool can result in instability and security issues
>>>> for the domains running on the pool.
>>> 
>>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>> 
>>>> For this reason this serie introduces an architecture specific way to create
>>>> different cpupool at boot time, this is particularly useful on ARM big.LITTLE
>>>> platform where there might be the need to have different cpupools for each type
>>>> of core, but also systems using NUMA can have different cpu pool for each node.
>>> 
>>> ... from my understanding, all the vCPUs of a domain have to be in the same cpupool. So with this approach it is not possible:
>>>   1) to have a mix of LITTLE and big vCPUs in the domain
>>>   2) to create a domain spanning across two NUMA nodes
>>> 
>>> So I think we need to make sure that any solutions we go through will not prevent us to implement those setups.
>> The point of this patch is to make all cores available without breaking the current behaviour of existing system.
> 
> I might be missing some context here. By breaking current behavior, do you mean user that may want to add "hmp-unsafe" on the command line?

Right, with hmp-unsafe the behaviour is now the same as before; only the extra cores are parked in other cpupools.

So you have a point: the behaviour does change, if this is activated, for someone who was using hmp-unsafe before.
The command line argument suggested by Juergen definitely makes sense here.

We could instead do the following:
- when this is activated in the configuration, boot all cores and park them in different pools (depending on a command line argument). The current behaviour does not change if the other pools are not used (but more cores will be online in Xen);
- when hmp-unsafe is on, this feature is turned off (even if activated in the configuration) and all cores are added to the same pool.

What do you think?

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall


Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Julien Grall 2 weeks, 4 days ago
(Pruning some CCs to just leave Arm and sched folks)

On 17/11/2021 12:07, Bertrand Marquis wrote:
> Hi Julien,

Hi Bertrand,

>> On 17 Nov 2021, at 11:48, Julien Grall <julien@xen.org> wrote:
>>
>> On 17/11/2021 11:16, Bertrand Marquis wrote:
>>> Hi Julien,
>>
>> Hi,
>>
>>>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>>>
>>>> Hi Luca,
>>>>
>>>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>>>> Currently Xen creates a default cpupool0 that contains all the cpu brought up
>>>>> during boot and it assumes that the platform has only one kind of CPU.
>>>>> This assumption does not hold on big.LITTLE platform, but putting different
>>>>> type of CPU in the same cpupool can result in instability and security issues
>>>>> for the domains running on the pool.
>>>>
>>>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>>>
>>>>> For this reason this serie introduces an architecture specific way to create
>>>>> different cpupool at boot time, this is particularly useful on ARM big.LITTLE
>>>>> platform where there might be the need to have different cpupools for each type
>>>>> of core, but also systems using NUMA can have different cpu pool for each node.
>>>>
>>>> ... from my understanding, all the vCPUs of a domain have to be in the same cpupool. So with this approach it is not possible:
>>>>    1) to have a mix of LITTLE and big vCPUs in the domain
>>>>    2) to create a domain spanning across two NUMA nodes
>>>>
>>>> So I think we need to make sure that any solutions we go through will not prevent us to implement those setups.
>>> The point of this patch is to make all cores available without breaking the current behaviour of existing system.
>>
>> I might be missing some context here. By breaking current behavior, do you mean user that may want to add "hmp-unsafe" on the command line?
> 
> Right, with hmp-unsafe the behaviour is now the same as without, only extra cores are parked in other cpupools.
> 
> So you have a point in fact that behaviour is changed for someone who was using hmp-unsafe before if this is activated.
> The command line argument suggested by Jurgen definitely makes sense here.
> 
> We could instead do the following:
> - when this is activated in the configuration, boot all cores and park them in different pools (depending on command line argument). Current behaviour not change if other pools are not used (but more cores will be on in xen)

From my understanding, it is possible to move a pCPU in/out of a pool 
afterwards. So the security concern with big.LITTLE is still present, 
even though it would be difficult to hit.

I am also concerned that it would be more difficult to detect any 
misconfiguration. So I think this option would still need to be turned 
on only if hmp-unsafe and the new command line option are both on.

If we want to enable it without hmp-unsafe on, we would need to at least 
lock the pools.

> - when hmp-unsafe is on, this feature will be turned of (if activated in configuration) and all cores would be added in the same pool.
> 
> What do you think ?

I am split. On one hand, this makes it easier for someone to try 
big.LITTLE, as you don't have to manually pin vCPUs. On the other hand, 
it handles a single use-case, and you would need to use hmp-unsafe and 
pinning if you want a more exotic setup (e.g. a domain with 
big.LITTLE).

Another possible solution is to pin dom0 vCPUs (AFAIK they are just 
sticky by default) and then create the pools from dom0 userspace. My 
assumption is that for dom0less we would want to use pinning instead.

That said I would like to hear from Xilinx and EPAM as, IIRC, they are 
already using hmp-unsafe in production.

Cheers,

-- 
Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Oleksandr Tyshchenko 2 weeks, 3 days ago
On Wed, Nov 17, 2021 at 9:11 PM Julien Grall <julien@xen.org> wrote:

Hi Julien, all

[Sorry for the possible format issues]

> (Pruning some CCs to just leave Arm and sched folks)
>
> On 17/11/2021 12:07, Bertrand Marquis wrote:
> > Hi Julien,
>
> Hi Bertrand,
>
> >> On 17 Nov 2021, at 11:48, Julien Grall <julien@xen.org> wrote:
> >>
> >> On 17/11/2021 11:16, Bertrand Marquis wrote:
> >>> Hi Julien,
> >>
> >> Hi,
> >>
> >>>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
> >>>>
> >>>> Hi Luca,
> >>>>
> >>>> On 17/11/2021 09:57, Luca Fancellu wrote:
> >>>>> Currently Xen creates a default cpupool0 that contains all the cpu
> brought up
> >>>>> during boot and it assumes that the platform has only one kind of
> CPU.
> >>>>> This assumption does not hold on big.LITTLE platform, but putting
> different
> >>>>> type of CPU in the same cpupool can result in instability and
> security issues
> >>>>> for the domains running on the pool.
> >>>>
> >>>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
> >>>>
> >>>>> For this reason this serie introduces an architecture specific way
> to create
> >>>>> different cpupool at boot time, this is particularly useful on ARM
> big.LITTLE
> >>>>> platform where there might be the need to have different cpupools
> for each type
> >>>>> of core, but also systems using NUMA can have different cpu pool for
> each node.
> >>>>
> >>>> ... from my understanding, all the vCPUs of a domain have to be in
> the same cpupool. So with this approach it is not possible:
> >>>>    1) to have a mix of LITTLE and big vCPUs in the domain
> >>>>    2) to create a domain spanning across two NUMA nodes
> >>>>
> >>>> So I think we need to make sure that any solutions we go through will
> not prevent us to implement those setups.
> >>> The point of this patch is to make all cores available without
> breaking the current behaviour of existing system.
> >>
> >> I might be missing some context here. By breaking current behavior, do
> you mean user that may want to add "hmp-unsafe" on the command line?
> >
> > Right, with hmp-unsafe the behaviour is now the same as without, only
> extra cores are parked in other cpupools.
> >
> > So you have a point in fact that behaviour is changed for someone who
> was using hmp-unsafe before if this is activated.
> > The command line argument suggested by Jurgen definitely makes sense
> here.
> >
> > We could instead do the following:
> > - when this is activated in the configuration, boot all cores and park
> them in different pools (depending on command line argument). Current
> behaviour not change if other pools are not used (but more cores will be on
> in xen)
>
>  From my understanding, it is possible to move a pCPU in/out a pool
> afterwards. So the security concern with big.LITTLE is still present,
> even though it would be difficult to hit it.
>
> I am also concerned that it would be more difficult to detect any
> misconfiguration. So I think this option would still need to be turned
> on only if hmp-unsafe are the new command line one are both on.
>
> If we want to enable it without hmp-unsafe on, we would need to at least
> lock the pools.
>
> > - when hmp-unsafe is on, this feature will be turned of (if activated in
> configuration) and all cores would be added in the same pool.
> >
> > What do you think ?
>
> I am split. On one hand, this is making easier for someone to try
> big.LITTLE as you don't have manually pin vCPUs. On the other hand, this
> is handling a single use-case and you would need to use hmp-unsafe and
> pinning if you want to get more exotic setup (e.g. a domain with
> big.LITTLE).
>
> Another possible solution is to pin dom0 vCPUs (AFAIK they are just
> sticky by default) and then create the pools from dom0 userspace. My
> assumption is for dom0less we would want to use pinning instead.
>
> That said I would like to hear from Xilinx and EPAM as, IIRC, they are
> already using hmp-unsafe in production.
>


We have been using hmp-unsafe since its introduction. Yes, we are aware of
the possible consequences of enabling that option (different CPU types might
have different errata, cache line sizes, PA bits (?), etc.), so we are trying
to deal with it carefully.
In our target system we pin Dom0's vCPUs to pCPUs of the same type from
userspace via the "xl vcpu-pin" command; for other domains more exotic
configurations can be used.

I share Stefano's opinion not to tie the new feature (boot time MIDR-based
cpupools) to the existing hmp-unsafe option, and to create a new command line
option to control the new feature.

The proposed algorithm (copied here for convenience) looks fine to me.
"So in short I think it should be:
- midr-cpupools alone
cpupools created at boot, warning/errors on changing cpupools
- midr-cpupools + hmp-unsafe
cpupools created at boot, changing cpupools is allowed (we could still
warn but no errors)
- hmp-unsafe alone
same as today with hmp-unsafe"

For the default behaviour I also don't have a strong preference. One thing
is clear: the default behaviour should be safe. I think the command line
option is preferable to the config option, as the new feature can then be
enabled/disabled without a need to re-build Xen; moreover, if we follow the
proposed algorithm route, the runtime behaviour of the new feature (whether
changing cpupools is allowed or not) is going to depend on the hmp-unsafe
state, which is currently under command line control. But there is no strong
preference here as well.



>
> Cheers,
>
> --
> Julien Grall
>
>

-- 
Regards,

Oleksandr Tyshchenko

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Stefano Stabellini 2 weeks, 3 days ago
On Wed, 17 Nov 2021, Julien Grall wrote:
> > > > > On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
> > > > > 
> > > > > Hi Luca,
> > > > > 
> > > > > On 17/11/2021 09:57, Luca Fancellu wrote:
> > > > > > Currently Xen creates a default cpupool0 that contains all the cpu
> > > > > > brought up
> > > > > > during boot and it assumes that the platform has only one kind of
> > > > > > CPU.
> > > > > > This assumption does not hold on big.LITTLE platform, but putting
> > > > > > different
> > > > > > type of CPU in the same cpupool can result in instability and
> > > > > > security issues
> > > > > > for the domains running on the pool.
> > > > > 
> > > > > I agree that you can't move a LITTLE vCPU to a big pCPU. However...
> > > > > 
> > > > > > For this reason this serie introduces an architecture specific way
> > > > > > to create
> > > > > > different cpupool at boot time, this is particularly useful on ARM
> > > > > > big.LITTLE
> > > > > > platform where there might be the need to have different cpupools
> > > > > > for each type
> > > > > > of core, but also systems using NUMA can have different cpu pool for
> > > > > > each node.
> > > > > 
> > > > > ... from my understanding, all the vCPUs of a domain have to be in the
> > > > > same cpupool. So with this approach it is not possible:
> > > > >    1) to have a mix of LITTLE and big vCPUs in the domain
> > > > >    2) to create a domain spanning across two NUMA nodes
> > > > > 
> > > > > So I think we need to make sure that any solutions we go through will
> > > > > not prevent us to implement those setups.
> > > > The point of this patch is to make all cores available without breaking
> > > > the current behaviour of existing system.
> > > 
> > > I might be missing some context here. By breaking current behavior, do you
> > > mean user that may want to add "hmp-unsafe" on the command line?
> > 
> > Right, with hmp-unsafe the behaviour is now the same as without, only extra
> > cores are parked in other cpupools.
> > 
> > So you have a point in fact that behaviour is changed for someone who was
> > using hmp-unsafe before if this is activated.
> > The command line argument suggested by Jurgen definitely makes sense here.
> > 
> > We could instead do the following:
> > - when this is activated in the configuration, boot all cores and park them
> > in different pools (depending on command line argument). Current behaviour
> > not change if other pools are not used (but more cores will be on in xen)
> 
> From my understanding, it is possible to move a pCPU in/out a pool afterwards.
> So the security concern with big.LITTLE is still present, even though it would
> be difficult to hit it.

As far as I know moving a pCPU in/out of a pool is something that cannot
happen automatically: it requires manual intervention by the user and it
is uncommon. We could print a warning or simply return an error to prevent
the action from happening. Or something like:

"This action might result in memory corruptions and invalid behavior. Do
you want to continue? [Y/N]"


> I am also concerned that it would be more difficult to detect any
> misconfiguration. So I think this option would still need to be turned on only
> if hmp-unsafe are the new command line one are both on.
> 
> If we want to enable it without hmp-unsafe on, we would need to at least lock
> the pools.

Locking the pools is a good idea.

My preference is not to tie this feature to the hmp-unsafe command line,
more on this below.


> > - when hmp-unsafe is on, this feature will be turned of (if activated in
> > configuration) and all cores would be added in the same pool.
> > 
> > What do you think ?
> 
> I am split. On one hand, this is making easier for someone to try big.LITTLE
> as you don't have manually pin vCPUs. On the other hand, this is handling a
> single use-case and you would need to use hmp-unsafe and pinning if you want
> to get more exotic setup (e.g. a domain with big.LITTLE).
> 
> Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky by
> default) and then create the pools from dom0 userspace. My assumption is for
> dom0less we would want to use pinning instead.
> 
> That said I would like to hear from Xilinx and EPAM as, IIRC, they are already
> using hmp-unsafe in production.

This discussion has been very interesting, it is cool to hear new ideas
like this one. I have a couple of thoughts to share.

First, I think that the ability to create cpupools at boot time is
super important. It goes way beyond big.LITTLE. It would be incredibly
useful to separate real-time (sched=null) and non-real-time
(sched=credit2) workloads. I think it will only become more important
going forward, so I'd love to see an option to configure cpupools that
works for dom0less. It could be based on device tree properties rather
than Kconfig options.
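A hypothetical device-tree fragment for such a dom0less configuration might look like the following. This binding is sketched purely for illustration; no such binding exists at the time of this thread, and all node and property names are invented:

```dts
/* Hypothetical binding: one node per boot-time pool, naming a
 * scheduler and the CPUs it contains. */
chosen {
    cpupool0 {
        compatible = "xen,cpupool";
        cpupool-cpus = <&cpu0 &cpu1>;
        cpupool-sched = "null";      /* real-time pool */
    };

    cpupool1 {
        compatible = "xen,cpupool";
        cpupool-cpus = <&cpu2 &cpu3>;
        cpupool-sched = "credit2";   /* general-purpose pool */
    };
};
```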

It is true that if we had the devicetree-based cpupool configuration I
mentioned, then somebody could use it to create cpupools matching
big.LITTLE. So "in theory" it solves the problem. However, I think that
for big.LITTLE this would be suboptimal: it would be best if Xen
configured the cpupools automatically rather than based on the device
tree configuration. That way, it is going to work automatically, without
extra steps, even in the simplest Xen setups.

So I think that it is a good idea to have a command line option (better
than a kconfig option) to trigger the MIDR-based cpupool creation at
boot time. The option could be called midr-cpupools=on/off or
hw-cpupools=on/off for example.

In terms of whether it should be the default or not, I don't feel
strongly about it. Unfortunately we (Xilinx) rely on a number of
non-default options already, so we are already in the situation where we
have to be extra careful about the options passed. I don't think that
adding one more would make a significant difference either way.


But my preference is *not* to tie the new command line option to
hmp-unsafe, because if you use midr-cpupools and don't mess with the
pools then it is actually safe. We could even lock the cpupools as
Julien suggested, or warn/return an error on changing the cpupools. In
this scenario, it would be detrimental to also pass hmp-unsafe: it would
allow actually unsafe configurations that the user wanted to avoid by
using midr-cpupools, and it would end up disabling checks we could put
in place to make midr-cpupools safer.

So in short I think it should be:

- midr-cpupools alone
cpupools created at boot, warning/errors on changing cpupools

- midr-cpupools + hmp-unsafe
cpupools created at boot, changing cpupools is allowed (we could still
warn but no errors)

- hmp-unsafe alone
same as today with hmp-unsafe


For the default, as I said, I don't have a strong preference. I think
midr-cpupools could be "on" by default.

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Julien Grall 2 weeks, 2 days ago
Hi Stefano,

On 18/11/2021 02:19, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Julien Grall wrote:
>>>>>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>>>>>
>>>>>> Hi Luca,
>>>>>>
>>>>>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>>>>>> Currently Xen creates a default cpupool0 that contains all the cpu
>>>>>>> brought up
>>>>>>> during boot and it assumes that the platform has only one kind of
>>>>>>> CPU.
>>>>>>> This assumption does not hold on big.LITTLE platform, but putting
>>>>>>> different
>>>>>>> type of CPU in the same cpupool can result in instability and
>>>>>>> security issues
>>>>>>> for the domains running on the pool.
>>>>>>
>>>>>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>>>>>
>>>>>>> For this reason this serie introduces an architecture specific way
>>>>>>> to create
>>>>>>> different cpupool at boot time, this is particularly useful on ARM
>>>>>>> big.LITTLE
>>>>>>> platform where there might be the need to have different cpupools
>>>>>>> for each type
>>>>>>> of core, but also systems using NUMA can have different cpu pool for
>>>>>>> each node.
>>>>>>
>>>>>> ... from my understanding, all the vCPUs of a domain have to be in the
>>>>>> same cpupool. So with this approach it is not possible:
>>>>>>     1) to have a mix of LITTLE and big vCPUs in the domain
>>>>>>     2) to create a domain spanning across two NUMA nodes
>>>>>>
>>>>>> So I think we need to make sure that any solutions we go through will
>>>>>> not prevent us to implement those setups.
>>>>> The point of this patch is to make all cores available without breaking
>>>>> the current behaviour of existing system.
>>>>
>>>> I might be missing some context here. By breaking current behavior, do you
>>>> mean user that may want to add "hmp-unsafe" on the command line?
>>>
>>> Right, with hmp-unsafe the behaviour is now the same as without, only extra
>>> cores are parked in other cpupools.
>>>
>>> So you have a point in fact that behaviour is changed for someone who was
>>> using hmp-unsafe before if this is activated.
>>> The command line argument suggested by Jurgen definitely makes sense here.
>>>
>>> We could instead do the following:
>>> - when this is activated in the configuration, boot all cores and park them
>>> in different pools (depending on command line argument). Current behaviour
>>> not change if other pools are not used (but more cores will be on in xen)
>>
>>  From my understanding, it is possible to move a pCPU in/out a pool afterwards.
>> So the security concern with big.LITTLE is still present, even though it would
>> be difficult to hit it.
> 
> As far as I know moving a pCPU in/out of a pool is something that cannot
> happen automatically: it requires manual intervention to the user and it
> is uncommon. 
> We could print a warning or simply return error to prevent
> the action from happening. Or something like:
> 
> "This action might result in memory corruptions and invalid behavior. Do
> you want to continue? [Y/N]
> 
> 
>> I am also concerned that it would be more difficult to detect any
>> misconfiguration. So I think this option would still need to be turned on only
>> if hmp-unsafe are the new command line one are both on.
>>
>> If we want to enable it without hmp-unsafe on, we would need to at least lock
>> the pools.
> 
> Locking the pools is a good idea.
> 
> My preference is not to tie this feature to the hmp-unsafe command line,
> more on this below.

The only reason I suggested tying this to hmp-unsafe is in case you (or 
anyone else) really wanted to use a version where the pools are unlocked.

If we are going to implement the lock, then I think the hmp-unsafe would 
not be necessary for such configuration.

> 
> 
>>> - when hmp-unsafe is on, this feature will be turned off (if activated in
>>> the configuration) and all cores would be added to the same pool.
>>>
>>> What do you think ?
>>
>> I am split. On one hand, this makes it easier for someone to try big.LITTLE
>> as you don't have to manually pin vCPUs. On the other hand, this handles a
>> single use-case and you would need to use hmp-unsafe and pinning if you want
>> a more exotic setup (e.g. a domain with big.LITTLE).
>>
>> Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky by
>> default) and then create the pools from dom0 userspace. My assumption is for
>> dom0less we would want to use pinning instead.
>>
>> That said I would like to hear from Xilinx and EPAM as, IIRC, they are already
>> using hmp-unsafe in production.
> 
> This discussion has been very interesting, it is cool to hear new ideas
> like this one. I have a couple of thoughts to share.
> 
> First I think that the ability of creating cpupools at boot time is
> super important. It goes way beyond big.LITTLE. It would be incredibly
> useful to separate real-time (sched=null) and non-real-time
> (sched=credit2) workloads. I think it will only become more important
> going forward so I'd love to see an option to configure cpupools that
> works for dom0less. It could be based on device tree properties rather
> than kconfig options.
> 
> It is true that if we had the devicetree-based cpupool configuration I
> mentioned, then somebody could use it to create cpupools matching
> big.LITTLE. So "in theory" it solves the problem. However, I think that
> for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> if Xen configured the cpupools automatically rather than based on the
> device tree configuration. 

This brings one question. How does Linux detect and use big.LITTLE? Is 
there a Device-Tree binding?

[...]

> So I think that it is a good idea to have a command line option (better
> than a kconfig option) to trigger the MIDR-based cpupool creation at
> boot time. The option could be called midr-cpupools=on/off or
> hw-cpupools=on/off for example.
> In terms of whether it should be the default or not, I don't feel
> strongly about it.

On by default means this will be security supported and we need to be 
reasonably confident it cannot be broken.

In this case, I am not only referring to moving a pCPU between pools but 
also to Xen doing the right thing (e.g. finding the minimum cache line size).


[...]

> - midr-cpupools alone
> cpupools created at boot, warning/errors on changing cpupools
> - midr-cpupools + hmp-unsafe
> cpupools created at boot, changing cpupools is allowed (we could still
> warn but no errors)
> 
> - hmp-unsafe alone
> same as today with hmp-unsafe

I like Juergen's version better. But before agreeing on the command line, 
I would like to understand how Linux deals with big.LITTLE today 
(see my question above).

> 
> For the default as I said I don't have a strong preference. I think
> midr-cpupools could be "on" by default.

I think this should be off at the beginning until the feature is mature 
enough. We are soon opening the tree again for the next development 
cycle, so I think we have enough time to make sure everything works 
properly and have it turned on by default before the next release.


Cheers,

-- 
Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Stefano Stabellini 2 weeks, 2 days ago
On Fri, 19 Nov 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 18/11/2021 02:19, Stefano Stabellini wrote:
> > On Wed, 17 Nov 2021, Julien Grall wrote:
> > > > > > > On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
> > > > > > > 
> > > > > > > Hi Luca,
> > > > > > > 
> > > > > > > On 17/11/2021 09:57, Luca Fancellu wrote:
> > > > > > > > Currently Xen creates a default cpupool0 that contains all the
> > > > > > > > cpu
> > > > > > > > brought up
> > > > > > > > during boot and it assumes that the platform has only one kind
> > > > > > > > of
> > > > > > > > CPU.
> > > > > > > > This assumption does not hold on big.LITTLE platform, but
> > > > > > > > putting
> > > > > > > > different
> > > > > > > > type of CPU in the same cpupool can result in instability and
> > > > > > > > security issues
> > > > > > > > for the domains running on the pool.
> > > > > > > 
> > > > > > > I agree that you can't move a LITTLE vCPU to a big pCPU.
> > > > > > > However...
> > > > > > > 
> > > > > > > > For this reason this series introduces an architecture specific
> > > > > > > > way
> > > > > > > > to create
> > > > > > > > different cpupool at boot time, this is particularly useful on
> > > > > > > > ARM
> > > > > > > > big.LITTLE
> > > > > > > > platform where there might be the need to have different
> > > > > > > > cpupools
> > > > > > > > for each type
> > > > > > > > of core, but also systems using NUMA can have different cpu pool
> > > > > > > > for
> > > > > > > > each node.
> > > > > > > 
> > > > > > > ... from my understanding, all the vCPUs of a domain have to be in
> > > > > > > the
> > > > > > > same cpupool. So with this approach it is not possible:
> > > > > > >     1) to have a mix of LITTLE and big vCPUs in the domain
> > > > > > >     2) to create a domain spanning across two NUMA nodes
> > > > > > > 
> > > > > > > So I think we need to make sure that any solutions we go through
> > > > > > > will
> > > > > > > not prevent us to implement those setups.
> > > > > > The point of this patch is to make all cores available without
> > > > > > breaking
> > > > > > the current behaviour of existing system.
> > > > > 
> > > > > I might be missing some context here. By breaking current behavior, do
> > > > > you
> > > > > mean user that may want to add "hmp-unsafe" on the command line?
> > > > 
> > > > Right, with hmp-unsafe the behaviour is now the same as without, only
> > > > extra
> > > > cores are parked in other cpupools.
> > > > 
> > > > So you have a point in that the behaviour is changed for someone who
> > > > was
> > > > using hmp-unsafe before, if this is activated.
> > > > The command line argument suggested by Juergen definitely makes sense
> > > > here.
> > > > 
> > > > We could instead do the following:
> > > > - when this is activated in the configuration, boot all cores and park
> > > > them
> > > > in different pools (depending on a command line argument). Current
> > > > behaviour
> > > > does not change if the other pools are not used (but more cores will be
> > > > on in Xen)
> > > 
> > >  From my understanding, it is possible to move a pCPU in/out a pool
> > > afterwards.
> > > So the security concern with big.LITTLE is still present, even though it
> > > would
> > > be difficult to hit it.
> > 
> > As far as I know moving a pCPU in/out of a pool is something that cannot
> > happen automatically: it requires manual intervention from the user and it
> > is uncommon. We could print a warning or simply return an error to prevent
> > the action from happening. Or something like:
> > 
> > "This action might result in memory corruption and invalid behavior. Do
> > you want to continue? [Y/N]"
> > 
> > 
> > > I am also concerned that it would be more difficult to detect any
> > > misconfiguration. So I think this option would still need to be turned on
> > > only
> > > if hmp-unsafe and the new command line option are both on.
> > > 
> > > If we want to enable it without hmp-unsafe on, we would need to at least
> > > lock
> > > the pools.
> > 
> > Locking the pools is a good idea.
> > 
> > My preference is not to tie this feature to the hmp-unsafe command line,
> > more on this below.
> 
> The only reason I suggested tying this to hmp-unsafe is in case you (or anyone
> else) really wanted to use a version where the pools are unlocked.
> 
> If we are going to implement the lock, then I think the hmp-unsafe would not
> be necessary for such configuration.
> 
> > 
> > 
> > > > - when hmp-unsafe is on, this feature will be turned off (if activated in
> > > > the configuration) and all cores would be added to the same pool.
> > > > 
> > > > What do you think ?
> > > 
> > > I am split. On one hand, this makes it easier for someone to try
> > > big.LITTLE
> > > as you don't have to manually pin vCPUs. On the other hand, this handles
> > > a
> > > single use-case and you would need to use hmp-unsafe and pinning if you
> > > want
> > > a more exotic setup (e.g. a domain with big.LITTLE).
> > > 
> > > Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky
> > > by
> > > default) and then create the pools from dom0 userspace. My assumption is
> > > for
> > > dom0less we would want to use pinning instead.
> > > 
> > > That said I would like to hear from Xilinx and EPAM as, IIRC, they are
> > > already
> > > using hmp-unsafe in production.
> > 
> > This discussion has been very interesting, it is cool to hear new ideas
> > like this one. I have a couple of thoughts to share.
> > 
> > First I think that the ability of creating cpupools at boot time is
> > super important. It goes way beyond big.LITTLE. It would be incredibly
> > useful to separate real-time (sched=null) and non-real-time
> > (sched=credit2) workloads. I think it will only become more important
> > going forward so I'd love to see an option to configure cpupools that
> > works for dom0less. It could be based on device tree properties rather
> > than kconfig options.
> > 
> > It is true that if we had the devicetree-based cpupool configuration I
> > mentioned, then somebody could use it to create cpupools matching
> > big.LITTLE. So "in theory" it solves the problem. However, I think that
> > for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> > if Xen configured the cpupools automatically rather than based on the
> > device tree configuration. 
> 
> This brings one question. How does Linux detect and use big.LITTLE? Is there a
> Device-Tree binding?
> 
> [...]
> 
> > So I think that it is a good idea to have a command line option (better
> > than a kconfig option) to trigger the MIDR-based cpupool creation at
> > boot time. The option could be called midr-cpupools=on/off or
> > hw-cpupools=on/off for example.
> > In terms of whether it should be the default or not, I don't feel
> > strongly about it.
> 
> On by default means this will be security supported and we need to be reasonably
> confident it cannot be broken.
> 
> In this case, I am not only referring to moving a pCPU between pools but also
> to Xen doing the right thing (e.g. finding the minimum cache line size).
> 
> 
> [...]
> 
> > - midr-cpupools alone
> > cpupools created at boot, warning/errors on changing cpupools
> > - midr-cpupools + hmp-unsafe
> > cpupools created at boot, changing cpupools is allowed (we could still
> > warn but no errors)
> > 
> > - hmp-unsafe alone
> > same as today with hmp-unsafe
> 
> I like Juergen's version better. But before agreeing on the command line, I
> would like to understand how Linux deals with big.LITTLE today (see my
> question above).

I also like Juergen's version better :-)

The device tree binding that covers big.LITTLE is the same that covers
cpus: Documentation/devicetree/bindings/arm/cpus.yaml

So basically, you can tell it is a big.LITTLE system because the
compatible strings differ between certain cpus, e.g.:

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x0>;
      };

      cpu@1 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x1>;
      };

      cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x100>;
      };

      cpu@101 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x101>;
      };


> > For the default as I said I don't have a strong preference. I think
> > midr-cpupools could be "on" by default.
> 
> I think this should be off at the beginning until the feature is mature
> enough. We are soon opening the tree again for the next development cycle, so
> I think we have enough time to make sure everything works properly and have it
> turned on by default before the next release.

That's fine

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Julien Grall 1 week, 5 days ago
Hi Stefano,

On 19/11/2021 18:55, Stefano Stabellini wrote:
> On Fri, 19 Nov 2021, Julien Grall wrote:
>> I like Juergen's version better. But before agreeing on the command line, I
>> would like to understand how Linux deals with big.LITTLE today (see my
>> question above).
> 
> I also like Juergen's version better :-)
> 
> The device tree binding that covers big.LITTLE is the same that covers
> cpus: Documentation/devicetree/bindings/arm/cpus.yaml

Are you sure? I found 
Documentation/devicetree/bindings/arm/cpu-capacity.txt.

> 
> So basically, you can tell it is a big.LITTLE system because the
> compatible strings differ between certain cpus, e.g.:
> 
>        cpu@0 {
>          device_type = "cpu";
>          compatible = "arm,cortex-a15";
>          reg = <0x0>;
>        };
> 
>        cpu@1 {
>          device_type = "cpu";
>          compatible = "arm,cortex-a15";
>          reg = <0x1>;
>        };
> 
>        cpu@100 {
>          device_type = "cpu";
>          compatible = "arm,cortex-a7";
>          reg = <0x100>;
>        };
> 
>        cpu@101 {
>          device_type = "cpu";
>          compatible = "arm,cortex-a7";
>          reg = <0x101>;
>        };

I have not found any code in Linux using the compatible string. Instead, 
they all seem to use the cpu-map (see drivers/base/arch_topology.c).

Anyway, to me it would be better to parse the Device-Tree rather than 
using the MIDR. The advantage is that we can cover a wider range of cases 
(you may have processors with the same MIDR but different frequencies). 
For now, we could create a cpupool per cluster.

Cheers,

-- 
Julien Grall

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Bertrand Marquis 1 week, 5 days ago
Hi Julien,

> On 23 Nov 2021, at 13:54, Julien Grall <julien@xen.org> wrote:
> 
> Hi Stefano,
> 
> On 19/11/2021 18:55, Stefano Stabellini wrote:
>> On Fri, 19 Nov 2021, Julien Grall wrote:
>>> I like Juergen's version better. But before agreeing on the command line, I
>>> would like to understand how Linux deals with big.LITTLE today (see my
>>> question above).
>> I also like Juergen's version better :-)
>> The device tree binding that covers big.LITTLE is the same that covers
>> cpus: Documentation/devicetree/bindings/arm/cpus.yaml
> 
> Are you sure? I found Documentation/devicetree/bindings/arm/cpu-capacity.txt.
> 
>> So basically, you can tell it is a big.LITTLE system because the
>> compatible strings differ between certain cpus, e.g.:
>>       cpu@0 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a15";
>>         reg = <0x0>;
>>       };
>>       cpu@1 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a15";
>>         reg = <0x1>;
>>       };
>>       cpu@100 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a7";
>>         reg = <0x100>;
>>       };
>>       cpu@101 {
>>         device_type = "cpu";
>>         compatible = "arm,cortex-a7";
>>         reg = <0x101>;
>>       };
> 
> I have not found any code in Linux using the compatible. Instead, they all seem to use the cpu-map (see drivers/base/arch_topology.c).
> 
> Anyway, to me it would be better to parse the Device-Tree rather than using the MIDR. The advantage is that we can cover a wider range of cases (you may have processors with the same MIDR but different frequencies). For now, we could create a cpupool per cluster.

This is a really good idea as this could also work for NUMA systems.

So reg & ~0xff would give the cluster number?

We will probably end up splitting cores into different cpupools even if they are all the same, as there are several CPUs that have several clusters but the same cores (I recall some NXP boards being like that).

Some device tree also have a cpu-map definition:
        cpu-map {
            cluster0 {
                core0 {
                    cpu = <&A57_0>;
                };
                core1 {
                    cpu = <&A57_1>;
                };
            };

            cluster1 {
                core0 {
                    cpu = <&A53_0>;
                };
                core1 {
                    cpu = <&A53_1>;
                };
                core2 {
                    cpu = <&A53_2>;
                };
                core3 {
                    cpu = <&A53_3>;
                };
            };
        };

@stefano: is the cpu-map something standard? Lots of device trees in Linux do have it now, but I do not recall seeing it on older device trees.

Maybe using cpu-map could be a solution: we could say that device trees without a cpu-map are not supported.

Cheers
Bertrand



> 
> Cheers,
> 
> -- 
> Julien Grall


Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Stefano Stabellini 1 week, 5 days ago
On Tue, 23 Nov 2021, Bertrand Marquis wrote:
> Hi Julien,
> 
> > On 23 Nov 2021, at 13:54, Julien Grall <julien@xen.org> wrote:
> > 
> > Hi Stefano,
> > 
> > On 19/11/2021 18:55, Stefano Stabellini wrote:
> >> On Fri, 19 Nov 2021, Julien Grall wrote:
> >>> I like Juergen's version better. But before agreeing on the command line, I
> >>> would like to understand how Linux deals with big.LITTLE today (see my
> >>> question above).
> >> I also like Juergen's version better :-)
> >> The device tree binding that covers big.LITTLE is the same that covers
> >> cpus: Documentation/devicetree/bindings/arm/cpus.yaml
> > 
> > Are you sure? I found Documentation/devicetree/bindings/arm/cpu-capacity.txt.
> > 
> >> So basically, you can tell it is a big.LITTLE system because the
> >> compatible strings differ between certain cpus, e.g.:
> >>       cpu@0 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a15";
> >>         reg = <0x0>;
> >>       };
> >>       cpu@1 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a15";
> >>         reg = <0x1>;
> >>       };
> >>       cpu@100 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a7";
> >>         reg = <0x100>;
> >>       };
> >>       cpu@101 {
> >>         device_type = "cpu";
> >>         compatible = "arm,cortex-a7";
> >>         reg = <0x101>;
> >>       };
> > 
> > I have not found any code in Linux using the compatible. Instead, they all seem to use the cpu-map (see drivers/base/arch_topology.c).
> > 
> > Anyway, to me it would be better to parse the Device-Tree rather than using the MIDR. The advantage is that we can cover a wider range of cases (you may have processors with the same MIDR but different frequencies). For now, we could create a cpupool per cluster.
> 
> This is a really good idea as this could also work for NUMA systems.
> 
> So reg & ~0xff would give the cluster number?
> 
> We will probably end up splitting cores into different cpupools even if they are all the same, as there are several CPUs that have several clusters but the same cores (I recall some NXP boards being like that).
> 
> Some device tree also have a cpu-map definition:
>         cpu-map {
>             cluster0 {
>                 core0 {
>                     cpu = <&A57_0>;
>                 };
>                 core1 {
>                     cpu = <&A57_1>;
>                 };
>             };
> 
>             cluster1 {
>                 core0 {
>                     cpu = <&A53_0>;
>                 };
>                 core1 {
>                     cpu = <&A53_1>;
>                 };
>                 core2 {
>                     cpu = <&A53_2>;
>                 };
>                 core3 {
>                     cpu = <&A53_3>;
>                 };
>             };
>         };
> 
> @stefano: is the cpu-map something standard? Lots of device trees in Linux do have it now, but I do not recall seeing it on older device trees.
> Maybe using cpu-map could be a solution: we could say that device trees without a cpu-map are not supported.


cpu-map is newer than big.LITTLE support in Linux. See for instance
commit 4ab328f06. Before cpu-map was introduced, Linux used mostly the
MPIDR or sometimes the *machine* compatible string. I did find one
example of a board where the cpu compatible string is the same for both
big and LITTLE CPUs: arch/arm64/boot/dts/rockchip/rk3368.dtsi. That
could be the reason why the cpu compatible string is not used for
detecting big.LITTLE. Sorry about that, my earlier suggestion was wrong.

Yes I think using cpu-map would be fine. It seems to be widely used by
different vendors. (Even if something as generic as cpu-map should
really be under the devicetree.org spec, not under Linux, but anyway.)

Only one note: you mentioned NUMA. As you can see from
Documentation/devicetree/bindings/numa.txt, NUMA doesn't use cpu-map.
Instead, it uses numa-node-id and distance-map.

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Juergen Gross 2 weeks, 3 days ago
On 18.11.21 03:19, Stefano Stabellini wrote:
> On Wed, 17 Nov 2021, Julien Grall wrote:
>>>>>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>>>>>
>>>>>> Hi Luca,
>>>>>>
>>>>>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>>>>>> Currently Xen creates a default cpupool0 that contains all the cpu
>>>>>>> brought up
>>>>>>> during boot and it assumes that the platform has only one kind of
>>>>>>> CPU.
>>>>>>> This assumption does not hold on big.LITTLE platform, but putting
>>>>>>> different
>>>>>>> type of CPU in the same cpupool can result in instability and
>>>>>>> security issues
>>>>>>> for the domains running on the pool.
>>>>>>
>>>>>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>>>>>
>>>>>>> For this reason this series introduces an architecture specific way
>>>>>>> to create
>>>>>>> different cpupool at boot time, this is particularly useful on ARM
>>>>>>> big.LITTLE
>>>>>>> platform where there might be the need to have different cpupools
>>>>>>> for each type
>>>>>>> of core, but also systems using NUMA can have different cpu pool for
>>>>>>> each node.
>>>>>>
>>>>>> ... from my understanding, all the vCPUs of a domain have to be in the
>>>>>> same cpupool. So with this approach it is not possible:
>>>>>>     1) to have a mix of LITTLE and big vCPUs in the domain
>>>>>>     2) to create a domain spanning across two NUMA nodes
>>>>>>
>>>>>> So I think we need to make sure that any solutions we go through will
>>>>>> not prevent us to implement those setups.
>>>>> The point of this patch is to make all cores available without breaking
>>>>> the current behaviour of existing system.
>>>>
>>>> I might be missing some context here. By breaking current behavior, do you
>>>> mean user that may want to add "hmp-unsafe" on the command line?
>>>
>>> Right, with hmp-unsafe the behaviour is now the same as without, only extra
>>> cores are parked in other cpupools.
>>>
>>> So you have a point in that the behaviour is changed for someone who was
>>> using hmp-unsafe before, if this is activated.
>>> The command line argument suggested by Juergen definitely makes sense here.
>>>
>>> We could instead do the following:
>>> - when this is activated in the configuration, boot all cores and park them
>>> in different pools (depending on a command line argument). Current behaviour
>>> does not change if the other pools are not used (but more cores will be on in Xen)
>>
>>  From my understanding, it is possible to move a pCPU in/out a pool afterwards.
>> So the security concern with big.LITTLE is still present, even though it would
>> be difficult to hit it.
> 
> As far as I know moving a pCPU in/out of a pool is something that cannot
> happen automatically: it requires manual intervention from the user and it
> is uncommon. We could print a warning or simply return an error to prevent
> the action from happening. Or something like:
> 
> "This action might result in memory corruption and invalid behavior. Do
> you want to continue? [Y/N]"

This should only be rejected if the source and target pools are not
compatible. So a cpupool could carry an attribute allowing only specific
cpus (and maybe domains?) in it.

Otherwise it would be impossible to create new cpupools after boot on
such a system and populate them with cpus.

>> I am also concerned that it would be more difficult to detect any
>> misconfiguration. So I think this option would still need to be turned on only
>> if hmp-unsafe and the new command line option are both on.
>>
>> If we want to enable it without hmp-unsafe on, we would need to at least lock
>> the pools.
> 
> Locking the pools is a good idea.

This would be another option, yes.

> My preference is not to tie this feature to the hmp-unsafe command line,
> more on this below.

I agree.

>>> - when hmp-unsafe is on, this feature will be turned off (if activated in
>>> the configuration) and all cores would be added to the same pool.
>>>
>>> What do you think ?
>>
>> I am split. On one hand, this makes it easier for someone to try big.LITTLE
>> as you don't have to manually pin vCPUs. On the other hand, this handles a
>> single use-case and you would need to use hmp-unsafe and pinning if you want
>> a more exotic setup (e.g. a domain with big.LITTLE).
>>
>> Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky by
>> default) and then create the pools from dom0 userspace. My assumption is for
>> dom0less we would want to use pinning instead.
>>
>> That said I would like to hear from Xilinx and EPAM as, IIRC, they are already
>> using hmp-unsafe in production.
> 
> This discussion has been very interesting, it is cool to hear new ideas
> like this one. I have a couple of thoughts to share.
> 
> First I think that the ability of creating cpupools at boot time is
> super important. It goes way beyond big.LITTLE. It would be incredibly
> useful to separate real-time (sched=null) and non-real-time
> (sched=credit2) workloads. I think it will only become more important
> going forward so I'd love to see an option to configure cpupools that
> works for dom0less. It could be based on device tree properties rather
> than kconfig options.

I think device tree AND command line option should be possible (think of
x86 here).

> It is true that if we had the devicetree-based cpupool configuration I
> mentioned, then somebody could use it to create cpupools matching
> big.LITTLE. So "in theory" it solves the problem. However, I think that
> for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> if Xen configured the cpupools automatically rather than based on the
> device tree configuration. That way, it is going to work automatically
> without extra steps even in the simplest Xen setups.
> 
> So I think that it is a good idea to have a command line option (better
> than a kconfig option) to trigger the MIDR-based cpupool creation at
> boot time. The option could be called midr-cpupools=on/off or
> hw-cpupools=on/off for example.

I'd rather go for:

cpupools=<options>

With e.g. <options>:

- "auto-midr": split system into cpupools based on MIDR
- "auto-numa": split system into cpupools based on NUMA nodes
- "cpus=<list of cpus>[,sched=<scheduler>]"

This would be rather flexible without adding more and more options
doing similar things. Other sub-options could be added rather easily.

> In terms of whether it should be the default or not, I don't feel
> strongly about it. Unfortunately we (Xilinx) rely on a number of
> non-default options already so we are already in the situation where we
> have to be extra-careful at the options passed. I don't think that
> adding one more would make a significant difference either way.
> 
> 
> But my preference is *not* to tie the new command line option with
> hmp-unsafe because if you use midr-cpupools and don't mess with the
> pools then it is actually safe. We could even lock the cpupools like
> Julien suggested or warn/return error on changing the cpupools. In this
> scenario, it would be detrimental to also pass hmp-unsafe: it would
> allow actually unsafe configurations that the user wanted to avoid by
> using midr-cpupools. It would end up disabling checks we could put in
> place to make midr-cpupools safer.
> 
> So in short I think it should be:
> 
> - midr-cpupools alone
> cpupools created at boot, warning/errors on changing cpupools
> 
> - midr-cpupools + hmp-unsafe
> cpupools created at boot, changing cpupools is allowed (we could still
> warn but no errors)

I'd rather add an explicit ",locked" option to above cpupools parameter.

> 
> - hmp-unsafe alone
> same as today with hmp-unsafe
> 
> 
> For the default as I said I don't have a strong preference. I think
> midr-cpupools could be "on" by default.
> 

What about making this a Kconfig option?


Juergen

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Stefano Stabellini 2 weeks, 3 days ago
I like all your suggestions below!


On Thu, 18 Nov 2021, Juergen Gross wrote:
> On 18.11.21 03:19, Stefano Stabellini wrote:
> > On Wed, 17 Nov 2021, Julien Grall wrote:
> > > > > > > On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
> > > > > > > 
> > > > > > > Hi Luca,
> > > > > > > 
> > > > > > > On 17/11/2021 09:57, Luca Fancellu wrote:
> > > > > > > > Currently Xen creates a default cpupool0 that contains all the
> > > > > > > > cpu
> > > > > > > > brought up
> > > > > > > > during boot and it assumes that the platform has only one kind
> > > > > > > > of
> > > > > > > > CPU.
> > > > > > > > This assumption does not hold on big.LITTLE platform, but
> > > > > > > > putting
> > > > > > > > different
> > > > > > > > type of CPU in the same cpupool can result in instability and
> > > > > > > > security issues
> > > > > > > > for the domains running on the pool.
> > > > > > > 
> > > > > > > I agree that you can't move a LITTLE vCPU to a big pCPU.
> > > > > > > However...
> > > > > > > 
> > > > > > > > For this reason this series introduces an architecture specific
> > > > > > > > way
> > > > > > > > to create
> > > > > > > > different cpupool at boot time, this is particularly useful on
> > > > > > > > ARM
> > > > > > > > big.LITTLE
> > > > > > > > platform where there might be the need to have different
> > > > > > > > cpupools
> > > > > > > > for each type
> > > > > > > > of core, but also systems using NUMA can have different cpu pool
> > > > > > > > for
> > > > > > > > each node.
> > > > > > > 
> > > > > > > ... from my understanding, all the vCPUs of a domain have to be in
> > > > > > > the
> > > > > > > same cpupool. So with this approach it is not possible:
> > > > > > >     1) to have a mix of LITTLE and big vCPUs in the domain
> > > > > > >     2) to create a domain spanning across two NUMA nodes
> > > > > > > 
> > > > > > > So I think we need to make sure that any solutions we go through
> > > > > > > will
> > > > > > > not prevent us to implement those setups.
> > > > > > The point of this patch is to make all cores available without
> > > > > > breaking
> > > > > > the current behaviour of existing system.
> > > > > 
> > > > > I might be missing some context here. By breaking current behavior, do
> > > > > you
> > > > > mean user that may want to add "hmp-unsafe" on the command line?
> > > > 
> > > > Right, with hmp-unsafe the behaviour is now the same as without, only
> > > > extra
> > > > cores are parked in other cpupools.
> > > > 
> > > > So you have a point in that the behaviour is changed for someone who
> > > > was
> > > > using hmp-unsafe before, if this is activated.
> > > > The command line argument suggested by Juergen definitely makes sense
> > > > here.
> > > > 
> > > > We could instead do the following:
> > > > - when this is activated in the configuration, boot all cores
> > > > and park them in different pools (depending on a command line
> > > > argument). Current behaviour does not change if the other pools
> > > > are not used (but more cores will be on in Xen)
> > > 
> > > From my understanding, it is possible to move a pCPU in/out of a
> > > pool afterwards. So the security concern with big.LITTLE is still
> > > present, even though it would be difficult to hit it.
> > 
> > As far as I know, moving a pCPU in/out of a pool is something that
> > cannot happen automatically: it requires manual intervention from
> > the user and it is uncommon. We could print a warning or simply
> > return an error to prevent the action from happening. Or something
> > like:
> > 
> > "This action might result in memory corruptions and invalid behavior. Do
> > you want to continue? [Y/N]
> 
> This should only be rejected if the source and target pool are not
> compatible. So a cpupool could carry an attribute allowing only
> specific cpus (and maybe domains?) in it.

Yes you are right.
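
Just to make the idea concrete, a compatibility check along those lines
could look like the sketch below. This is purely illustrative: the
`struct cpupool` fields and `pool_accepts_cpu` are hypothetical names
for this discussion, not Xen's actual data structures or API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a cpupool, for illustration only. */
struct cpupool {
    uint32_t cpu_class;   /* e.g. a MIDR-derived class of the pool's pCPUs */
    bool     locked;      /* pool refuses membership changes when set */
};

/*
 * Accept a pCPU of class 'cpu_class' into 'pool' only if the pool is
 * not locked and the classes match; otherwise the move is rejected.
 */
static bool pool_accepts_cpu(const struct cpupool *pool, uint32_t cpu_class)
{
    if (pool->locked)
        return false;
    return pool->cpu_class == cpu_class;
}
```

With such a check in place, creating a new pool after boot and
populating it with cpus of the matching class would still work.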


> Otherwise it would be impossible to create new cpupools after boot on
> such a system and populating them with cpus.
>
> > > I am also concerned that it would be more difficult to detect any
> > > misconfiguration. So I think this option would still need to be
> > > turned on only if hmp-unsafe and the new command line option are
> > > both on.
> > > 
> > > If we want to enable it without hmp-unsafe on, we would need to at
> > > least lock the pools.
> > 
> > Locking the pools is a good idea.
> 
> This would be another option, yes.
> 
> > My preference is not to tie this feature to the hmp-unsafe command line,
> > more on this below.
> 
> I agree.
> 
> > > > - when hmp-unsafe is on, this feature will be turned off (even
> > > > if activated in the configuration) and all cores will be added
> > > > to the same pool.
> > > > 
> > > > What do you think?
> > > 
> > > I am split. On one hand, this makes it easier for someone to try
> > > big.LITTLE as you don't have to manually pin vCPUs. On the other
> > > hand, this handles a single use-case, and you would need to use
> > > hmp-unsafe and pinning if you want a more exotic setup (e.g. a
> > > domain with both big and LITTLE vCPUs).
> > > 
> > > Another possible solution is to pin dom0 vCPUs (AFAIK they are
> > > just sticky by default) and then create the pools from dom0
> > > userspace. My assumption is that for dom0less we would want to
> > > use pinning instead.
> > > 
> > > That said I would like to hear from Xilinx and EPAM as, IIRC, they are
> > > already
> > > using hmp-unsafe in production.
> > 
> > This discussion has been very interesting, it is cool to hear new ideas
> > like this one. I have a couple of thoughts to share.
> > 
> > First, I think that the ability to create cpupools at boot time is
> > super important. It goes way beyond big.LITTLE. It would be
> > incredibly useful to separate real-time (sched=null) and
> > non-real-time (sched=credit2) workloads. I think it will only become
> > more important going forward, so I'd love to see an option to
> > configure cpupools that works for dom0less. It could be based on
> > device tree properties rather than Kconfig options.
> 
> I think device tree AND command line option should be possible (think of
> x86 here).

Sure


> > It is true that if we had the devicetree-based cpupool configuration I
> > mentioned, then somebody could use it to create cpupools matching
> > big.LITTLE. So "in theory" it solves the problem. However, I think that
> > for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> > if Xen configured the cpupools automatically rather than based on the
> > device tree configuration. That way, it is going to work automatically
> > without extra steps even in the simplest Xen setups.
> > 
> > So I think that it is a good idea to have a command line option (better
> > than a kconfig option) to trigger the MIDR-based cpupool creation at
> > boot time. The option could be called midr-cpupools=on/off or
> > hw-cpupools=on/off for example.
> 
> I'd rather go for:
> 
> cpupools=<options>
> 
> With e.g. <options>:
> 
> - "auto-midr": split system into cpupools based on MIDR
> - "auto-numa": split system into cpupools based on NUMA nodes
> - "cpus=<list of cpus>[,sched=<scheduler>]
> 
> This would be rather flexible without adding more and more options
> doing similar things. Other sub-options could be added rather easily.

I like this


> > In terms of whether it should be the default or not, I don't feel
> > strongly about it. Unfortunately we (Xilinx) rely on a number of
> > non-default options already, so we are already in a situation where
> > we have to be extra careful about the options passed. I don't think
> > adding one more would make a significant difference either way.
> > 
> > 
> > But my preference is *not* to tie the new command line option to
> > hmp-unsafe, because if you use midr-cpupools and don't mess with the
> > pools then it is actually safe. We could even lock the cpupools like
> > Julien suggested, or warn/return an error on changing the cpupools.
> > In this scenario, it would be detrimental to also pass hmp-unsafe:
> > it would allow actually unsafe configurations that the user wanted
> > to avoid by using midr-cpupools. It would end up disabling checks we
> > could put in place to make midr-cpupools safer.
> > 
> > So in short I think it should be:
> > 
> > - midr-cpupools alone
> > cpupools created at boot, warning/errors on changing cpupools
> > 
> > - midr-cpupools + hmp-unsafe
> > cpupools created at boot, changing cpupools is allowed (we could still
> > warn but no errors)
> 
> I'd rather add an explicit ",locked" option to the above cpupools parameter.

yeah that's better

 
> > 
> > - hmp-unsafe alone
> > same as today with hmp-unsafe
> > 
> > 
> > For the default, as I said, I don't have a strong preference. I
> > think midr-cpupools could be "on" by default.
> > 
> 
> What about making this a Kconfig option?

Could also be a good idea
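
As a sketch of the MIDR-based splitting discussed above: the helper
below groups CPUs with equal MIDR values into the same pool, keeping
the boot CPU's class in pool 0 so that cpupool0 behaviour is unchanged.
It is hypothetical code operating on a plain array of MIDR values, not
on real hardware registers, and `assign_pools_by_midr` is not part of
the actual patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assign each CPU a pool id so that CPUs sharing the same MIDR land in
 * the same pool. pool_of[c] receives the pool id of CPU c; the return
 * value is the number of pools created (0 on failure).
 */
static unsigned int assign_pools_by_midr(const uint32_t *midr, size_t ncpus,
                                         unsigned int *pool_of)
{
    uint32_t classes[8];          /* sketch: at most 8 distinct classes */
    unsigned int npools = 0;

    for (size_t c = 0; c < ncpus; c++) {
        unsigned int p;

        /* Look for an existing pool with this CPU's class. */
        for (p = 0; p < npools; p++)
            if (classes[p] == midr[c])
                break;

        /* None found: open a new pool for this class. */
        if (p == npools) {
            if (npools == sizeof(classes) / sizeof(classes[0]))
                return 0; /* too many distinct classes for this sketch */
            classes[npools++] = midr[c];
        }
        pool_of[c] = p;
    }
    return npools;
}
```

On a classic big.LITTLE system with two MIDR values this yields exactly
two pools, with the boot cluster in pool 0.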

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Juergen Gross 2 weeks, 4 days ago
On 17.11.21 12:16, Bertrand Marquis wrote:
> Hi Julien,
> 
>> On 17 Nov 2021, at 10:26, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> On 17/11/2021 09:57, Luca Fancellu wrote:
>>> Currently Xen creates a default cpupool0 that contains all the CPUs brought up
>>> during boot and it assumes that the platform has only one kind of CPU.
>>> This assumption does not hold on big.LITTLE platforms, but putting different
>>> types of CPU in the same cpupool can result in instability and security issues
>>> for the domains running on the pool.
>>
>> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
>>
>>> For this reason this series introduces an architecture specific way to create
>>> different cpupools at boot time, this is particularly useful on ARM big.LITTLE
>>> platforms where there might be the need to have different cpupools for each type
>>> of core, but also systems using NUMA can have a different cpupool for each node.
>>
>> ... from my understanding, all the vCPUs of a domain have to be in the same cpupool. So with this approach it is not possible:
>>    1) to have a mix of LITTLE and big vCPUs in the domain
>>    2) to create a domain spanning across two NUMA nodes
>>
>> So I think we need to make sure that any solution we go with will not prevent us from implementing those setups.
> 
> The point of this patch is to make all cores available without breaking the current behaviour of existing systems.

May I suggest adding a boot parameter, so that this behaviour can be
controlled by means other than compile-time configuration?

> 
> Someone not using cpupools will keep running on the same cores as before.
> Someone wanting to use the other cores could assign a guest to the other cpupool(s) (big.LITTLE is just an example with 2, but there are now platforms with 3 types of cores).
> Someone wanting to build something different can now create new cpupools in Dom0 and assign the cores they want to them, to build a guest having access to different types of cores.
> 
> The point here is just to make the “other” cores accessible and park them in cpupools so that current behaviour is not changed.
> 
>>
>> I can see two options here:
>>   1) Allowing a domain's vCPUs to be in different cpupools
>>   2) Introducing CPU class (see [1])
>>
>> I can't remember why Dario suggested 2) rather than 1) in the past. @Dario, do you remember it?
> 
> I think 1) is definitely interesting and something that could be looked at in the future.

From a scheduler point of view this is IMO a nightmare.


Juergen

Re: [RFC PATCH 0/2] Boot time cpupools

Posted by Juergen Gross 2 weeks, 4 days ago
On 17.11.21 11:26, Julien Grall wrote:
> Hi Luca,
> 
> On 17/11/2021 09:57, Luca Fancellu wrote:
>> Currently Xen creates a default cpupool0 that contains all the CPUs
>> brought up during boot and it assumes that the platform has only one
>> kind of CPU. This assumption does not hold on big.LITTLE platforms,
>> but putting different types of CPU in the same cpupool can result in
>> instability and security issues for the domains running on the pool.
> 
> I agree that you can't move a LITTLE vCPU to a big pCPU. However...
> 
>>
>> For this reason this series introduces an architecture specific way
>> to create different cpupools at boot time, this is particularly
>> useful on ARM big.LITTLE platforms where there might be the need to
>> have different cpupools for each type of core, but also systems
>> using NUMA can have a different cpupool for each node.
> 
> ... from my understanding, all the vCPUs of a domain have to be in the 
> same cpupool. So with this approach it is not possible:
>     1) to have a mix of LITTLE and big vCPUs in the domain
>     2) to create a domain spanning across two NUMA nodes
> 
> So I think we need to make sure that any solution we go with will
> not prevent us from implementing those setups.
> 
> I can see two options here:
>    1) Allowing a domain's vCPUs to be in different cpupools
>    2) Introducing CPU class (see [1])
> 
> I can't remember why Dario suggested 2) rather than 1) in the past. 
> @Dario, do you remember it?

Having vcpus of a domain in different cpupools would probably require a
major scheduling rework, due to several pieces of accounting information
being per cpupool today, with some of that data additionally being per
domain. Not to mention the per-domain scheduling parameters, which would
suddenly need to cope with different schedulers (imagine one cpupool
using credit2 and the other rt).

I'd rather use implicit pinning e.g. via a cpu class.


Juergen