[PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies

Roger Pau Monne posted 1 patch 2 years, 5 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20211124161649.83189-1-roger.pau@citrix.com
xen/arch/x86/cpuid.c | 6 ------
1 file changed, 6 deletions(-)
[PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Roger Pau Monne 2 years, 5 months ago
Shrinking max policies can lead to failures in migration as previous
versions of Xen didn't shrink the number of leaves in any case, so
it's possible for a guest created on previous versions of Xen that
pass CPUID data on the migration stream to contain a max leaf number
greatest than the one present on the max policies in versions of Xen
containing 540d911c28.

Such failure was seen by osstest when doing a migration from Xen
4.15 to Xen 4.16-rc on a pair of equal boxes, the noceras.

Fix this by preventing any shrinking of the max CPUID policies, so
that previously built guest CPUID policies are compatible.

Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Ian Jackson <iwj@xenproject.org>

This is a regression from Xen 4.15, so should be considered for Xen
4.16. The main risks would be to mess up with the CPUID policy in a
different way, that would also lead to brokenness. Strictly speaking
the change here removes the shrinking of max leaves and restores the
previous behavior, but it's obviously not completely risk free.

It has proven to fix the regression seen on the noceras.
---
 xen/arch/x86/cpuid.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 8ac55f0806..173c7b71ac 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -353,8 +353,6 @@ static void __init calculate_host_policy(void)
         p->extd.raw[0xa].d |= ((1u << SVM_FEATURE_VMCBCLEAN) |
                                (1u << SVM_FEATURE_TSCRATEMSR));
     }
-
-    x86_cpuid_policy_shrink_max_leaves(p);
 }
 
 static void __init guest_common_default_feature_adjustments(uint32_t *fs)
@@ -434,8 +432,6 @@ static void __init calculate_pv_max_policy(void)
     recalculate_xstate(p);
 
     p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
-
-    x86_cpuid_policy_shrink_max_leaves(p);
 }
 
 static void __init calculate_pv_def_policy(void)
@@ -523,8 +519,6 @@ static void __init calculate_hvm_max_policy(void)
     sanitise_featureset(hvm_featureset);
     cpuid_featureset_to_policy(hvm_featureset, p);
     recalculate_xstate(p);
-
-    x86_cpuid_policy_shrink_max_leaves(p);
 }
 
 static void __init calculate_hvm_def_policy(void)
-- 
2.33.0


Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Ian Jackson 2 years, 5 months ago
Roger Pau Monne writes ("[PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
> Shrinking max policies can lead to failures in migration as previous
> versions of Xen didn't shrink the number of leaves in any case, so
> it's possible for a guest created on previous versions of Xen that
> pass CPUID data on the migration stream to contain a max leaf number
> greatest than the one present on the max policies in versions of Xen
> containing 540d911c28.
> 
> Such failure was seen by osstest when doing a migration from Xen
> 4.15 to Xen 4.16-rc on a pair of equal boxes, the noceras.
> 
> Fix this by preventing any shrinking of the max CPUID policies, so
> that previously built guest CPUID policies are compatible.
> 
> Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
...
> This is a regression from Xen 4.15, so should be considered for Xen
> 4.16. The main risks would be to mess up with the CPUID policy in a
> different way, that would also lead to brokenness. Strictly speaking
> the change here removes the shrinking of max leaves and restores the
> previous behavior, but it's obviously not completely risk free.
> 
> It has proven to fix the regression seen on the noceras.

Ouch.

Questions from my RM hat:

Is there a workaround ?
What proportion of machines do we think this might affect ?

Jan, Andy, do you have an opinion ?

Ian.

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Andrew Cooper 2 years, 5 months ago
On 24/11/2021 16:24, Ian Jackson wrote:
> Roger Pau Monne writes ("[PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
>> Shrinking max policies can lead to failures in migration as previous
>> versions of Xen didn't shrink the number of leaves in any case, so
>> it's possible for a guest created on previous versions of Xen that
>> pass CPUID data on the migration stream to contain a max leaf number
>> greatest than the one present on the max policies in versions of Xen
>> containing 540d911c28.
>>
>> Such failure was seen by osstest when doing a migration from Xen
>> 4.15 to Xen 4.16-rc on a pair of equal boxes, the noceras.
>>
>> Fix this by preventing any shrinking of the max CPUID policies, so
>> that previously built guest CPUID policies are compatible.
>>
>> Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ...
>> This is a regression from Xen 4.15, so should be considered for Xen
>> 4.16. The main risks would be to mess up with the CPUID policy in a
>> different way, that would also lead to brokenness. Strictly speaking
>> the change here removes the shrinking of max leaves and restores the
>> previous behavior, but it's obviously not completely risk free.
>>
>> It has proven to fix the regression seen on the noceras.
> Ouch.
>
> Questions from my RM hat:
>
> Is there a workaround ?

No.

The safety check being tripped is intended to prevent the VM crashing on
resume, and is functioning correctly.

> What proportion of machines do we think this might affect ?

Any pre-xsave machines (~2012 and older), and any newer machines booted
with no-xsave.

All AMD machines are actually broken by this, except that failure is
being masked by other changes in 4.16.  Future AMD machines will break
in the same way.

> Jan, Andy, do you have an opinion ?

The reversion doesn't go far enough.

While the shrinking of the max policies manifests as a concrete breakage
here, there is further breakage caused by shrinking the default
policies, because it renders some cpuid= settings in VM config files broken.

There is still no feedback or error checking from individual cpuid=
settings, so this will manifest as the VM admin settings silently no
longer taking effect.


I recommend a full and complete reversion of 540d911c28.  The
justification for it in the first place is especially weak because it is
explicitly contrary to how real hardware behaves, and this is the 3rd
ABI breakage it has caused, with more expected in the future based on
the analysis of what has gone wrong so far.

~Andrew

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Ian Jackson 2 years, 5 months ago
(Hoisting Roger and Jan to the To:)

Andrew Cooper writes ("Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
> On 24/11/2021 16:24, Ian Jackson wrote:
> > Questions from my RM hat:
> >
> > Is there a workaround ?
> 
> No.
> 
> The safety check being tripped is intended to prevent the VM crashing on
> resume, and is functioning correctly.
> 
> > What proportion of machines do we think this might affect ?
> 
> Any pre-xsave machines (~2012 and older), and any newer machines booted
> with no-xsave.
> 
> All AMD machines are actually broken by this, except that failure is
> being masked by other changes in 4.16.  Future AMD machines will break
> in the same way.

This is quite bad, then, I think.  I'm inclined to treat this as a
blocker for the release.

> > Jan, Andy, do you have an opinion ?
> 
> The reversion doesn't go far enough.
> 
> While the shrinking of the max policies manifests as a concrete breakage
> here, there is further breakage caused by shrinking the default
> policies, because it renders some cpuid= settings in VM config files broken.
> 
> There is still no feedback or error checking from individual cpuid=
> settings, so this will manifest as the VM admin settings silently no
> longer taking effect.
> 
> 
> I recommend a full and complete reversion of 540d911c28.  The
> justification for it in the first place is especially weak because it is
> explicitly contrary to how real hardware behaves, and this is the 3rd
> ABI breakage it has caused, with more expected in the future based on
> the analysis of what has gone wrong so far.

I would like to collect as many opinions as possible.  Do we have
other options besides (a) reverting 540d911c28, or (b) releasing with
this bug ?

What bad consequences follow, for users of Xen, from reverting
540d911c28 ?  Presumably it had some purpose which will be undermined
by reverting it.  The commit message speaks of details but doesn't
explain the ultimate impact, at least not to someone like me who only
dimly perceives the underlying technical aspects.

I did an experimental git-revert.  It seemed to go cleanly.
If we go for the revert, we would need a commit message.

I would like to make a decision tomorrow if at all possible.

Thanks,
Ian.

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Andrew Cooper 2 years, 5 months ago
On 24/11/2021 18:07, Ian Jackson wrote:
> (Hoisting Roger and Jan to the To:)
>
> Andrew Cooper writes ("Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
>> On 24/11/2021 16:24, Ian Jackson wrote:
>>> Questions from my RM hat:
>>>
>>> Is there a workaround ?
>> No.
>>
>> The safety check being tripped is intended to prevent the VM crashing on
>> resume, and is functioning correctly.
>>
>>> What proportion of machines do we think this might affect ?
>> Any pre-xsave machines (~2012 and older), and any newer machines booted
>> with no-xsave.
>>
>> All AMD machines are actually broken by this, except that failure is
>> being masked by other changes in 4.16.  Future AMD machines will break
>> in the same way.
> This is quite bad, then, I think.  I'm inclined to treat this as a
> blocker for the release.

I would also classify it as a blocker.

>
>>> Jan, Andy, do you have an opinion ?
>> The reversion doesn't go far enough.
>>
>> While the shrinking of the max policies manifests as a concrete breakage
>> here, there is further breakage caused by shrinking the default
>> policies, because it renders some cpuid= settings in VM config files broken.
>>
>> There is still no feedback or error checking from individual cpuid=
>> settings, so this will manifest as the VM admin settings silently no
>> longer taking effect.
>>
>>
>> I recommend a full and complete reversion of 540d911c28.  The
>> justification for it in the first place is especially weak because it is
>> explicitly contrary to how real hardware behaves, and this is the 3rd
>> ABI breakage it has caused, with more expected in the future based on
>> the analysis of what has gone wrong so far.
> I would like to collect as many opinions as possible.  Do we have
> other options besides (a) reverting 540d911c28, or (b) releasing with
> this bug ?

There is a 3rd option of taking this patch as-is, which is half way
between (a) and (b), but anything other than (a) leaves us with known
breakages that have no workaround.

Shutting the VM down on the old host, copying it's disks and config file
manually, then booting it clean would avoid this specific breakage on
migrate, but you'd still be subject to the silent breakage from certain
cpuid= settings not taking effect.

> What bad consequences follow, for users of Xen, from reverting
> 540d911c28 ?

Nothing.  It will take everything back to the same behaviour as 4.15 and
older.

>   Presumably it had some purpose which will be undermined
> by reverting it.  The commit message speaks of details but doesn't
> explain the ultimate impact, at least not to someone like me who only
> dimly perceives the underlying technical aspects.

540d911c28 "fixes" an issue which is theoretical at best.

Real hardware behaviour does not trim max leaf when certain features are
turned off, and will report blocks of trailing zeros.

None of the software manuals permit any inference based on max leaf,
which is why the 4.15 behaviour has been fine for the lifetime of Xen so
far.

> I did an experimental git-revert.  It seemed to go cleanly.
> If we go for the revert, we would need a commit message.

It may revert cleanly, but it won't build because of the first hunk in
81da2b544cbb00.  That hunk needs reverting too, because it too breaks
some cpuid= settings in VM config files.

In principle, the *final* thing the toolstack should do, *for brand new
VMs only*, is a shrink of that form, but this depends on whole load more
toolstack work before it can be done safely.  There is a plan to fix
CPUID handling, in a safe way, and it is ongoing (subject to all the
security interruptions), but has a long way to go yet.

~Andrew

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Ian Jackson 2 years, 5 months ago
Andrew Cooper writes ("Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
> It may revert cleanly, but it won't build because of the first hunk in
> 81da2b544cbb00.  That hunk needs reverting too, because it too breaks
> some cpuid= settings in VM config files.

Would you be able to prepare the two-patch series (?) that does the
necessary reverts ?

> In principle, the *final* thing the toolstack should do, *for brand new
> VMs only*, is a shrink of that form, but this depends on whole load more
> toolstack work before it can be done safely.  There is a plan to fix
> CPUID handling, in a safe way, and it is ongoing (subject to all the
> security interruptions), but has a long way to go yet.

For clarity: I think you are discussing future aspirations, which do
not concern us for 4.16.

Thanks,
Ian.

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Andrew Cooper 2 years, 5 months ago
On 24/11/2021 19:01, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies"):
>> It may revert cleanly, but it won't build because of the first hunk in
>> 81da2b544cbb00.  That hunk needs reverting too, because it too breaks
>> some cpuid= settings in VM config files.
> Would you be able to prepare the two-patch series (?) that does the
> necessary reverts ?

I'd do it as a single patch, but I can try to prepare one.

>> In principle, the *final* thing the toolstack should do, *for brand new
>> VMs only*, is a shrink of that form, but this depends on whole load more
>> toolstack work before it can be done safely.  There is a plan to fix
>> CPUID handling, in a safe way, and it is ongoing (subject to all the
>> security interruptions), but has a long way to go yet.
> For clarity: I think you are discussing future aspirations, which do
> not concern us for 4.16.

Correct.

~Andrew

Re: [PATCH for-4.16] x86/cpuid: do not shrink number of leaves in max policies
Posted by Jan Beulich 2 years, 5 months ago
On 24.11.2021 19:47, Andrew Cooper wrote:
> Real hardware behaviour does not trim max leaf when certain features are
> turned off, and will report blocks of trailing zeros.

I question that, but I'm also unaware of a specific case where feature
disabling (presumably in the BIOS) would lead to a trailing unpopulated
leaf.

> None of the software manuals permit any inference based on max leaf,
> which is why the 4.15 behaviour has been fine for the lifetime of Xen so
> far.

Yet the behavior with AMX (or KeyLocker) becomes quite odd: VMs would
see lots of trailing empty leaves by default on hardware supporting
these, as soon as we have (opt-in) support for them. That's because of
the large gap of leaves we're not making use of just yet.

The manual not permitting inference doesn't mean people can't infer
things. Whether they can do any bad from this (most likely just to
themselves) is unclear in the general case.

To be clear - this isn't an objection to the proposed revert. But the
aspect wants addressing imo, and not only in many years time, nor by
simply continuing to ignore the AMX work which I did submit over half
a year ago. (I didn't even dare to submit the partial KeyLocker work
I've done, both for this reason and for it being just partial work.)

Jan