With lock handling now allowing up to 16384 cpus (spinlocks can handle
65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
the number of cpus to be configured to 16383.
The new limit is imposed by IOMMU_CMD_BUFFER_MAX_ENTRIES and
QINVAL_MAX_ENTRY_NR required to be larger than 2 * CONFIG_NR_CPUS.
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V5:
- new patch (Jan Beulich)
---
xen/arch/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index 67ba38f32f..308ce129a8 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -6,7 +6,7 @@ config PHYS_ADDR_T_32
config NR_CPUS
int "Maximum number of CPUs"
- range 1 4095
+ range 1 16383
default "256" if X86
default "8" if ARM && RCAR3
default "4" if ARM && QEMU
--
2.35.3
On 27.03.2024 16:22, Juergen Gross wrote:
> With lock handling now allowing up to 16384 cpus (spinlocks can handle
> 65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
> the number of cpus to be configured to 16383.
>
> The new limit is imposed by IOMMU_CMD_BUFFER_MAX_ENTRIES and
> QINVAL_MAX_ENTRY_NR required to be larger than 2 * CONFIG_NR_CPUS.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

I'd prefer this to also gain an Arm ack, though.

Jan

> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -6,7 +6,7 @@ config PHYS_ADDR_T_32
>
>  config NR_CPUS
>  	int "Maximum number of CPUs"
> -	range 1 4095
> +	range 1 16383
>  	default "256" if X86
>  	default "8" if ARM && RCAR3
>  	default "4" if ARM && QEMU
On 08.04.24 09:10, Jan Beulich wrote:
> On 27.03.2024 16:22, Juergen Gross wrote:
>> With lock handling now allowing up to 16384 cpus (spinlocks can handle
>> 65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
>> the number of cpus to be configured to 16383.
[...]
>
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> I'd prefer this to also gain an Arm ack, though.

Any comment from Arm side?

Juergen
Hi Juergen,

Sorry for the late reply.

On 29/04/2024 11:33, Juergen Gross wrote:
> On 08.04.24 09:10, Jan Beulich wrote:
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>
>> I'd prefer this to also gain an Arm ack, though.
>
> Any comment from Arm side?

Can you clarify what the new limits mean in terms of (security) support? Are
we now claiming that Xen will work perfectly fine on platforms with up to
16383 CPUs?

If so, I can't comment for x86, but for Arm, I am doubtful that it would work
without any (at least performance) issues. AFAIK, this is also an untested
configuration. In fact I would be surprised if Xen on Arm was tested with
more than a couple of hundred cores (AFAICT the Ampere CPUs have 192 cores).

Cheers,

--
Julien Grall
On 29.04.24 13:04, Julien Grall wrote:
> Hi Juergen,
>
> Sorry for the late reply.
>
[...]
>
> Can you clarify what the new limits mean in terms of (security) support?
> Are we now claiming that Xen will work perfectly fine on platforms with up
> to 16383 CPUs?
>
> If so, I can't comment for x86, but for Arm, I am doubtful that it would
> work without any (at least performance) issues. AFAIK, this is also an
> untested configuration. In fact I would be surprised if Xen on Arm was
> tested with more than a couple of hundred cores (AFAICT the Ampere CPUs
> have 192 cores).

I think we should add a security support limit for the number of physical
cpus similar to the memory support limit we already have in place.

For x86 I'd suggest 4096 cpus for security support (basically the limit we
have with this patch), but I'm open for other suggestions, too.

I have no idea about any sensible limits for Arm32/Arm64.

Juergen
Hi Juergen,

On 29/04/2024 12:28, Jürgen Groß wrote:
[...]
> I think we should add a security support limit for the number of physical
> cpus similar to the memory support limit we already have in place.
>
> For x86 I'd suggest 4096 cpus for security support (basically the limit we
> have with this patch), but I'm open for other suggestions, too.
>
> I have no idea about any sensible limits for Arm32/Arm64.

I am not entirely sure. Bertrand, Michal, Stefano, should we use 192 (the
number of CPUs from Ampere)?

Cheers,

--
Julien Grall
On Mon, 29 Apr 2024, Julien Grall wrote:
> Hi Juergen,
>
[...]
>
> I am not entirely sure. Bertrand, Michal, Stefano, should we use 192 (the
> number of CPUs from Ampere)?

I am OK with that. If we want to be a bit more future proof we could say
256 or 512. I leave this to you, as all the boards I have have a much
smaller CPU count.
Hi Stefano,

On 02/05/2024 19:13, Stefano Stabellini wrote:
[...]
> I am OK with that. If we want to be a bit more future proof we could say
> 256 or 512.

Sorry, I don't follow your argument. A limit can be raised at any point in
the future. The question is more whether we are confident that Xen on Arm
will run well if a user has a platform with 256/512 pCPUs.

So are you saying that, from Xen's point of view, you expect no difference
between 256 and 512? And therefore that you would be happy to backport
patches if someone finds differences (or even security issues) when using
> 256 pCPUs?

Cheers,

--
Julien Grall
On Fri, 3 May 2024, Julien Grall wrote:
> Hi Stefano,
>
[...]
>
> So are you saying that, from Xen's point of view, you expect no difference
> between 256 and 512? And therefore that you would be happy to backport
> patches if someone finds differences (or even security issues) when using
> > 256 pCPUs?

It is difficult to be sure about anything that is not regularly tested. I
am pretty sure someone in the community got Xen running on an Ampere, so
like you said 192 is a good number. However, that is not regularly tested,
so we don't have any regression checks in gitlab-ci or OSSTest for it.

One approach would be to only support things regularly tested either by
OSSTest, Gitlab-ci, or also Xen community members. I am not sure what would
be the highest number with this way of thinking, but likely no more than
192, probably less. I don't know the CPU core count of the biggest ARM
machine in OSSTest.

Another approach is to support a "sensible" number: not something tested,
but something we believe should work. No regular testing. (In safety, they
only believe in things that are actually tested, so this would not be OK.
But this is security, not safety, just FYI.) With this approach, we could
round up the number to a limit we think won't break. If 192 works, 256/512
should work? I don't know, but I couldn't think of something that would
break going from 192 to 256.

It depends on how strict we want to be on testing requirements. I am not
sure what approach was taken by x86 so far. I am OK either way.
Hi Stefano,

On 03/05/2024 20:07, Stefano Stabellini wrote:
> On Fri, 3 May 2024, Julien Grall wrote:

[...]

> One approach would be to only support things regularly tested either by
> OSSTest, Gitlab-ci, or also Xen community members. I am not sure what
> would be the highest number with this way of thinking, but likely no more
> than 192, probably less. I don't know the CPU core count of the biggest
> ARM machine in OSSTest.

This would be rochester* (Cavium Thunder-X). They have 96 pCPUs which,
IIRC, are split across two numa nodes.

> Another approach is to support a "sensible" number: not something tested,
> but something we believe should work. No regular testing. [...] With this
> approach, we could round up the number to a limit we think won't break.
> If 192 works, 256/512 should work? I don't know, but I couldn't think of
> something that would break going from 192 to 256.

It depends what you mean by work/break. Strictly speaking, Xen should run
(i.e. not crash). However, it is unclear how well, because increasing the
number of physical CPUs increases contention and may expose some
bottleneck.

I haven't done any performance testing with that many CPUs and I haven't
seen any so far with Xen. But I have some areas of concern:

* Xenstored: At least the C version is single-threaded. Technically the
limit here is not based on the number of pCPUs, but as you increase it,
you indirectly increase the number of domains that can run. I doubt it
will behave well if you have 4096 domains running (I am thinking about the
x86 limit...).

* Locking
  * How Xen uses the locks: I don't think we have many places where we
have global locks (one is the memory subsystem). If a lock is already
taken, the others will spin. It is unclear whether they could become
highly contended.
  * How Xen implements the locks: At the moment, we are using LL/SC. My
take from XSA-295 is that there is a lack of fairness with them. I am not
sure what would happen if they get contended (as we support more pCPUs).
It is also probably time to finally implement LSE atomics.

* TLB flush: The TLB flushes are broadcast. There are some suggestions on
the Linux ML [1] that they don't perform well on some processors. The
discussion seems to have gone nowhere in Linux. But I think it is probably
worth taking into account when we decide to update the limit we (security)
support.

> It depends on how strict we want to be on testing requirements.

From the above, I am rather worried about claiming that Xen can support up
to 256 (and TBH even 192) CPUs without any proper testing. This could
backfire, as we may need to do (in a rush) and backport some rather large
work (unless we decide to remove support after the fact).

I think I would prefer to have a low number until someone can do some
testing (including with potentially malicious guests). If we want a power
of two, I would go with 128 because this is closer to the HW we have in
testing. If in the future someone can show some data on other platforms
(e.g. Ampere), then we can raise the limit.

> I am not sure what approach was taken by x86 so far.

It is unclear to me. I don't see how we can claim to support up to 4096
CPUs. But that's for the x86 folks to decide.

Cheers,

[1] https://lore.kernel.org/linux-arm-kernel/20190617143255.10462-1-indou.takao@jp.fujitsu.com/

--
Julien Grall
On Tue, 7 May 2024, Julien Grall wrote: > Hi Stefano, > > On 03/05/2024 20:07, Stefano Stabellini wrote: > > On Fri, 3 May 2024, Julien Grall wrote: > > [...] > > > > So are you saying that from Xen point of view, you are expecting no > > > difference > > > between 256 and 512. And therefore you would be happy if to backport > > > patches > > > if someone find differences (or even security issues) when using > 256 > > > pCPUs? > > > > It is difficult to be sure about anything that it is not regularly > > tested. I am pretty sure someone in the community got Xen running on an > > Ampere, so like you said 192 is a good number. However, that is not > > regularly tested, so we don't have any regression checks in gitlab-ci or > > OSSTest for it. > > > > One approach would be to only support things regularly tested either by > > OSSTest, Gitlab-ci, or also Xen community members. I am not sure what > > would be the highest number with this way of thinking but likely no > > more than 192, probably less. I don't know the CPU core count of the > > biggest ARM machine in OSSTest. > > This would be rochester* (Cavium Thunder-X). They have 96 pCPUs which, IIRC, > are split across two numa nodes. > > > > > Another approach is to support a "sensible" number: not something tested > > but something we believe it should work. No regular testing. (In safety, > > they only believe in things that are actually tested, so this would not > > be OK. But this is security, not safety, just FYI.) With this approach, > > we could round up the number to a limit we think it won't break. If 192 > > works, 256/512 should work? I don't know but couldn't think of something > > that would break going from 192 to 256. > > It depends what you mean by work/break. Strictly speaking, Xen should run > (i.e. not crash). However, it is unclear how well as if you increase the > number of physical CPUs, you will increase contention and may find some > bottleneck. 
> > I haven't done any performance testing with that many CPUs and I haven't seen > any so far with Xen. But I have some areas of concerns. > > * Xenstored: At least the C version is single-threaded. Technically the limit > here is not based on the number of pCPUs, but as you increase it, you > indirectly increase the number of domains that can run. I doubt it will behave > well if you have 4096 domains running (I am thinking about the x86 limit...). > > * Locking > * How Xen use the locks: I don't think we have many places where we have > global locks (one is the memory subsystem). If a lock is already taken, the > others will spin. It is unclear if we could high contending. > * How Xen implements the locks: At the moment, we are using LL/SC. My take > of XSA-295 is there is a lack of fairness with them. I am not sure what would > happen if they get contented (as we support more pCPUs). It is also probably > time to finally implement LSE atomics. > > * TLB flush: The TLB flush are broadcasted. There are some suggestions on the > Linux ML [1] that they don't perform well on some processors. The discussion > seems to have gone nowhere in Linux. But I think it is propably worth to take > into account when we decide to update the limit we (security) support. > > > > > It depends on how strict we want to be on testing requirements. > From above, I am rather worry about claiming that Xen can supports up to 256 > (and TBH even 192) without any proper testing. This could end up to backfire > as we may need to do (in a rush) and backport some rather large work (unless > we decide to remove support after the fact). I agree with everything you said and I would also add that is not just about backports: if we "support" something it is supposed to mean that we strongly believe it is working. I think we should only make that claim if we test regularly that configuration/feature. 
> I think I would prefer if we have a low number until someone can do some > testing (including potentially malicious guest). If we want for a > power-of-two, I would go with 128 because this is closer to the HW we have in > testing. If in the future someone can show some data on other platforms (e.g. > Ampere), then we can up the limit. I am OK with that. I wonder if we could use QEMU to add a test for this. > > I am not > > sure what approach was taken by x86 so far. > > It is unclear to me. I don't see how we can claim to support up to 4096 CPUs. > But that's for the x86 folks to decide. Until not long ago, many things were "supported" in many Open Source projects (including Linux, QEMU, etc.) without any automated tests at all. Maybe it is time to revisit this practice.
On 03.05.2024 21:07, Stefano Stabellini wrote:
> On Fri, 3 May 2024, Julien Grall wrote:
>
[...]
>
> With this approach, we could round up the number to a limit we think
> won't break. If 192 works, 256/512 should work? I don't know, but I
> couldn't think of something that would break going from 192 to 256.

I would suggest aiming to stick to power-of-2 values. There are still some
calculations in Xen which can be translated to more efficient code that
way (mainly: using shifts rather than multiplications, or a combination of
shifts and adds). Of course those calculations depend on what people
choose as actual values, but giving an upper bound that is a power of 2
may at least serve as a hint to them.

> It depends on how strict we want to be on testing requirements. I am not
> sure what approach was taken by x86 so far. I am OK either way.

The bumping of the limit here is clearly forward-looking for x86, i.e. it
is unlikely even to be possible to test right now (except maybe when
running Xen itself virtualized). I actually think there need to be two
separate considerations: one is how many CPUs Xen can be built for (and
such a build can be validated on a much smaller system), while another is
to limit what is supported (in ./SUPPORT.md).

Jan
Hi Jan,

On 06/05/2024 07:42, Jan Beulich wrote:
> Of course those calculations depend on what people choose as actual
> values, but giving an upper bound that is a power of 2 may at least
> serve as a hint to them.

This is rather a weak hint. If you want to encourage users to choose a
power-of-2 value, then let's spell it out in the description.

Cheers,

--
Julien Grall
On 06.05.24 08:42, Jan Beulich wrote:
> On 03.05.2024 21:07, Stefano Stabellini wrote:
>
[...]
>
> I would suggest aiming to stick to power-of-2 values. There are still
> some calculations in Xen which can be translated to more efficient code
> that way (mainly: using shifts rather than multiplications, or a
> combination of shifts and adds). Of course those calculations depend on
> what people choose as actual values, but giving an upper bound that is a
> power of 2 may at least serve as a hint to them.
>
[...]
>
> The bumping of the limit here is clearly forward-looking for x86, i.e. it
> is unlikely even to be possible to test right now (except maybe when
> running Xen itself virtualized). I actually think there need to be two
> separate considerations: one is how many CPUs Xen can be built for (and
> such a build can be validated on a much smaller system), while another is
> to limit what is supported (in ./SUPPORT.md).

My suggestion would be to add the following to my patch:

- introducing the number of security-supported physical cpus to SUPPORT.md
  (4096 for x86, 256 for Arm64 and Arm32)
- adding the new upper bound to CHANGELOG.md

In case I don't hear any objections, I'll send it out tomorrow.

Juergen