Ever since Xen 4.14, there has been a latent bug with migration.
While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped. This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.
As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf. One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
Extend the max policies' feat.max_subleaf to the highest number Xen knows
about, but leave the default policies matching the host. This will allow VMs
with a higher feat.max_subleaf than strictly necessary to migrate in.
Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
* Adjust max policies rather than the host policy.
---
xen/arch/x86/cpu-policy.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 4b6d96276399..f7e2910c01b5 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -590,6 +590,13 @@ static void __init calculate_pv_max_policy(void)
unsigned int i;
*p = host_cpu_policy;
+
+ /*
+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them to
+ * migrate in.
+ */
+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
for ( i = 0; i < ARRAY_SIZE(fs); ++i )
@@ -630,6 +637,10 @@ static void __init calculate_pv_def_policy(void)
unsigned int i;
*p = pv_max_cpu_policy;
+
+ /* Default to the same max_subleaf as the host. */
+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf;
+
x86_cpu_policy_to_featureset(p, fs);
for ( i = 0; i < ARRAY_SIZE(fs); ++i )
@@ -666,6 +677,13 @@ static void __init calculate_hvm_max_policy(void)
const uint32_t *mask;
*p = host_cpu_policy;
+
+ /*
+ * Some VMs may have a larger-than-necessary feat max_subleaf. Allow them to
+ * migrate in.
+ */
+ p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1;
+
x86_cpu_policy_to_featureset(p, fs);
mask = hvm_hap_supported() ?
@@ -783,6 +801,10 @@ static void __init calculate_hvm_def_policy(void)
const uint32_t *mask;
*p = hvm_max_cpu_policy;
+
+ /* Default to the same max_subleaf as the host. */
+ p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf;
+
x86_cpu_policy_to_featureset(p, fs);
mask = hvm_hap_supported() ?
base-commit: ebab808eb1bb8f24c7d0dd41b956e48cb1824b81
--
2.30.2
On 07.05.2024 15:45, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
[...]
> Eventually we'll manage to teach the toolstack how to avoid creating such VMs
> in the first place, but there's still more work to do there.

Can you explain to me in how far "x86/CPUID: shrink max_{,sub}leaf fields
according to actual leaf contents" would not already have taken care of this
(and not just for sub-leaves of leaf 7), if only it (at least its more recent
versions) was ever seriously looked at? I realize there was one todo item
left there (addressing of which I could probably have used some help with),
but that shouldn't have entirely prevented any progress. (If I'm not mistaken
an earlier version had once gone in, but then needed to be reverted.)

Jan
On Tue, May 07, 2024 at 02:45:40PM +0100, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
[...]
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Even if we have just found one glitch with PSFD and Ice Lake vs
Cascade Lake, wouldn't it be safer to always extend the max policies'
max leafs and subleafs to match the known array sizes?

Thanks, Roger.
On 07/05/2024 3:24 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 02:45:40PM +0100, Andrew Cooper wrote:
[...]
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> Even if we have just found one glitch with PSFD and Ice Lake vs
> Cascade Lake, wouldn't it be safer to always extend the max policies'
> max leafs and subleafs to match the known array sizes?

This is the final max leaf (containing feature information) to gain
custom handling, I think?

~Andrew
On Tue, May 07, 2024 at 03:31:19PM +0100, Andrew Cooper wrote:
> On 07/05/2024 3:24 pm, Roger Pau Monné wrote:
[...]
>> Even if we have just found one glitch with PSFD and Ice Lake vs
>> Cascade Lake, wouldn't it be safer to always extend the max policies'
>> max leafs and subleafs to match the known array sizes?
>
> This is the final max leaf (containing feature information) to gain
> custom handling, I think?

Couldn't the same happen with extended leaves? Some of the extended
leaves contain features, and hence for policy levelling the toolstack
might decide to zero them, yet extd.max_leaf won't be adjusted.

Thanks, Roger.
On 07/05/2024 3:45 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 03:31:19PM +0100, Andrew Cooper wrote:
[...]
> Couldn't the same happen with extended leaves? Some of the extended
> leaves contain features, and hence for policy levelling the toolstack
> might decide to zero them, yet extd.max_leaf won't be adjusted.

Hmm. Right now, extd max leaf is also the one with the bit that we
unconditionally advertise, and it's inherited all the way from the host
policy.

So yes, in principle, but anything that bumps this limit is going to have
other implications too, and I'd prefer not to second-guess them at this
point. I hope we can get the toolstack side fixes before this becomes a
real problem...

~Andrew