Ever since Xen 4.14, there has been a latent bug with migration.
While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped. This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.
As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf. One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
Extend the max policies' feat.max_subleaf to the highest number Xen knows
about, but leave the default policies matching the host. This will allow VMs
with a higher feat.max_subleaf than strictly necessary to migrate in.
Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
* Adjust max policies rather than the host policy.
---
xen/arch/x86/cpu-policy.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 4b6d96276399..f7e2910c01b5 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -590,6 +590,13 @@ static void __init calculate_pv_max_policy(void)
unsigned int i;
*p = host_cpu_policy;
+
+ /*
+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them to
+ * migrate in.
+ */
+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
for ( i = 0; i < ARRAY_SIZE(fs); ++i )
@@ -630,6 +637,10 @@ static void __init calculate_pv_def_policy(void)
unsigned int i;
*p = pv_max_cpu_policy;
+
+ /* Default to the same max_subleaf as the host. */
+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf;
+
x86_cpu_policy_to_featureset(p, fs);
for ( i = 0; i < ARRAY_SIZE(fs); ++i )
@@ -666,6 +677,13 @@ static void __init calculate_hvm_max_policy(void)
const uint32_t *mask;
*p = host_cpu_policy;
+
+ /*
+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them to
+ * migrate in.
+ */
+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
mask = hvm_hap_supported() ?
@@ -783,6 +801,10 @@ static void __init calculate_hvm_def_policy(void)
const uint32_t *mask;
*p = hvm_max_cpu_policy;
+
+ /* Default to the same max_subleaf as the host. */
+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf;
+
x86_cpu_policy_to_featureset(p, fs);
mask = hvm_hap_supported() ?
base-commit: ebab808eb1bb8f24c7d0dd41b956e48cb1824b81
--
2.30.2
On 07.05.2024 15:45, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
[...]
> Eventually we'll manage to teach the toolstack how to avoid creating such VMs
> in the first place, but there's still more work to do there.

Can you explain to me in how far "x86/CPUID: shrink max_{,sub}leaf fields
according to actual leaf contents" would not already have taken care of this
(and not just for sub-leaves of leaf 7), if only it (at least its more recent
versions) was ever seriously looked at? I realize there was one todo item
left there (addressing of which I could probably have used some help with),
but that shouldn't have entirely prevented any progress. (If I'm not mistaken
an earlier version had once gone in, but then needed to be reverted.)

Jan
On Tue, May 07, 2024 at 02:45:40PM +0100, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
[...]
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Even if we have just found one glitch with PSFD and Ice Lake vs
Cascade Lake, wouldn't it be safer to always extend the max policies'
max leafs and subleafs to match the known array sizes?

Thanks, Roger.
On 07/05/2024 3:24 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 02:45:40PM +0100, Andrew Cooper wrote:
[...]
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> Even if we have just found one glitch with PSFD and Ice Lake vs
> Cascade Lake, wouldn't it be safer to always extend the max policies'
> max leafs and subleafs to match the known array sizes?

This is the final max leaf (containing feature information) to gain
custom handling, I think?

~Andrew
On Tue, May 07, 2024 at 03:31:19PM +0100, Andrew Cooper wrote:
> On 07/05/2024 3:24 pm, Roger Pau Monné wrote:
[...]
>> Even if we have just found one glitch with PSFD and Ice Lake vs
>> Cascade Lake, wouldn't it be safer to always extend the max policies'
>> max leafs and subleafs to match the known array sizes?
>
> This is the final max leaf (containing feature information) to gain
> custom handling, I think?

Couldn't the same happen with extended leaves? Some of the extended
leaves contain features, and hence for policy levelling the toolstack
might decide to zero them, yet extd.max_leaf won't be adjusted.

Thanks, Roger.
On 07/05/2024 3:45 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 03:31:19PM +0100, Andrew Cooper wrote:
[...]
> Couldn't the same happen with extended leaves? Some of the extended
> leaves contain features, and hence for policy levelling the toolstack
> might decide to zero them, yet extd.max_leaf won't be adjusted.

Hmm. Right now, extd max leaf is also the one with the bit that we
unconditionally advertise, and it's inherited all the way from the host
policy.

So yes, in principle, but anything that bumps this limit is going to have
other implications too, and I'd prefer not to second-guess them at this
point. I hope we can get the toolstack side fixes before this becomes a
real problem...

~Andrew