[Xen-devel] [PATCH 0/2] x86emul: vendor specific treatment adjustments

Jan Beulich posted 2 patches 4 years, 7 months ago
Only 0 patches received!
[Xen-devel] [PATCH 0/2] x86emul: vendor specific treatment adjustments
Posted by Jan Beulich 4 years, 7 months ago
I've noticed the issue the 1st patch means to address only while
putting together the 2nd. Considering the other Hygon enablement
in this release cycle I think we want patch 1 for 4.13. Patch 2
should be considered too, but we've been effectively mis-emulating
MOVSXD on modern Intel hardware for quite some time, so it getting
delayed until after 4.13 (and then possible be backported) wouldn't
be overly bad.

1: treat Hygon guests like AMD ones
2: adjust MOVSXD source operand handling

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 0/2] x86emul: vendor specific treatment adjustments
Posted by Juergen Gross 4 years, 7 months ago
On 16.09.19 11:44, Jan Beulich wrote:
> I've noticed the issue the 1st patch means to address only while
> putting together the 2nd. Considering the other Hygon enablement
> in this release cycle I think we want patch 1 for 4.13. Patch 2
> should be considered too, but we've been effectively mis-emulating
> MOVSXD on modern Intel hardware for quite some time, so it getting
> delayed until after 4.13 (and then possible be backported) wouldn't
> be overly bad.
> 
> 1: treat Hygon guests like AMD ones
> 2: adjust MOVSXD source operand handling

For the series:

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 1/2] x86emul: treat Hygon guests like AMD ones
Posted by Jan Beulich 4 years, 7 months ago
For some reason the Hygon enabling series left out the insn emulator.
Make appropriate adjustments wherever we've been special casing AMD.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1995,7 +1995,8 @@ protmode_load_seg(
         case x86_seg_tr:
             goto raise_exn;
         }
-        if ( cp->x86_vendor != X86_VENDOR_AMD || !ops->read_segment ||
+        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
+             !ops->read_segment ||
              ops->read_segment(seg, sreg, ctxt) != X86EMUL_OKAY )
             memset(sreg, 0, sizeof(*sreg));
         else
@@ -2122,7 +2123,8 @@ protmode_load_seg(
          */
         bool wide = desc.b & 0x1000
                     ? false : (desc.b & 0xf00) != 0xc00 &&
-                               cp->x86_vendor != X86_VENDOR_AMD
+                               !(cp->x86_vendor &
+                                 (X86_VENDOR_AMD | X86_VENDOR_HYGON))
                                ? mode_64bit() : ctxt->lma;
 
         if ( wide )
@@ -2140,7 +2142,8 @@ protmode_load_seg(
             default:
                 return rc;
             }
-            if ( !mode_64bit() && cp->x86_vendor == X86_VENDOR_AMD &&
+            if ( !mode_64bit() &&
+                 (cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) &&
                  (desc.b & 0xf00) != 0xc00 )
                 desc_hi.b = desc_hi.a = 0;
             if ( (desc_hi.b & 0x00001f00) ||


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86emul: treat Hygon guests like AMD ones
Posted by Wei Liu 4 years, 7 months ago
On Mon, 16 Sep 2019 at 10:47, Jan Beulich <jbeulich@suse.com> wrote:
>
> For some reason the Hygon enabling series left out the insn emulator.
> Make appropriate adjustments wherever we've been special casing AMD.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wl@xen.org>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86emul: treat Hygon guests like AMD ones
Posted by Andrew Cooper 4 years, 7 months ago
On 16/09/2019 10:48, Jan Beulich wrote:
> For some reason the Hygon enabling series left out the insn emulator.
> Make appropriate adjustments wherever we've been special casing AMD.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Jan Beulich 4 years, 7 months ago
XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
memory (or register) when used without REX.W and with operand size
override. Since the upper 16 bits of the value read won't be used
anyway in this case, make the emulation uniformly follow this more
compatible behavior when not emulating an AMD-like CPU, at the risk
of missing an exception when emulating on/for older hardware (the
boundary at SandyBridge noted in said commit looks questionable - I've
observed the "new" behavior also on Westmere).

While touching this code I also noticed that #UD outside of protected
mode gets raised for ARPL only after having read the memory operand -
correct this atthe same time by moving up the respective construct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -4048,8 +4048,12 @@ x86_emulate(
             /* movsxd */
             if ( ea.type == OP_REG )
                 src.val = *ea.reg;
-            else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
-                                       &src.val, 4, ctxt, ops)) )
+            else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off, &src.val,
+                                       (op_bytes == 2 &&
+                                        !(ctxt->cpuid->x86_vendor &
+                                          (X86_VENDOR_AMD | X86_VENDOR_HYGON))
+                                        ? 2 : 4),
+                                       ctxt, ops)) )
                 goto done;
             dst.val = (int32_t)src.val;
         }
@@ -4058,6 +4062,8 @@ x86_emulate(
             /* arpl */
             unsigned int src_rpl = dst.val & 3;
 
+            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
+
             dst = ea;
             dst.bytes = 2;
             if ( dst.type == OP_REG )
@@ -4075,7 +4081,6 @@ x86_emulate(
                 _regs.eflags &= ~X86_EFLAGS_ZF;
                 dst.type = OP_NONE;
             }
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
         }
         break;
 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Andrew Cooper 4 years, 7 months ago
On 16/09/2019 10:48, Jan Beulich wrote:
> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
> memory (or register) when used without REX.W and with operand size
> override. Since the upper 16 bits of the value read won't be used
> anyway in this case, make the emulation uniformly follow this more
> compatible behavior when not emulating an AMD-like CPU, at the risk
> of missing an exception when emulating on/for older hardware (the
> boundary at SandyBridge noted in said commit looks questionable - I've
> observed the "new" behavior also on Westmere).

AMD documents this instruction has always using an 8 or 16bit source
operand.

There are corner cases which we can't possibly reasonably cope with. 
e.g. It is model specific as to whether UD0 takes a ModRM byte or not,
and I'll note that the latest revision (3.31) of APM Vol2 clarifies in
Table 8-8:

"This reflects the relative priority for faults encountered when
fetching the first byte of an instruction. In the fetching and decoding
of subsequent bytes of an instruction, if those bytes span the segment
limit or cross into a non-executable or not-present page, the fetch will
result in a #GP(0) fault or #PF as appropriate, preventing those bytes
from being accessed. However, if the instruction can be determined to be
invalid based just on the bytes preceding that boundary, a #UD fault may
take priority. This behavior is model-dependent."

so we have no hope of getting model-accurate fault behaviour.

> While touching this code I also noticed that #UD outside of protected
> mode gets raised for ARPL only after having read the memory operand -
> correct this atthe same time by moving up the respective construct.

at the.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Jan Beulich 4 years, 7 months ago
On 17.09.2019 19:17, Andrew Cooper wrote:
> On 16/09/2019 10:48, Jan Beulich wrote:
>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
>> memory (or register) when used without REX.W and with operand size
>> override. Since the upper 16 bits of the value read won't be used
>> anyway in this case, make the emulation uniformly follow this more
>> compatible behavior when not emulating an AMD-like CPU, at the risk
>> of missing an exception when emulating on/for older hardware (the
>> boundary at SandyBridge noted in said commit looks questionable - I've
>> observed the "new" behavior also on Westmere).
> 
> AMD documents this instruction has always using an 8 or 16bit source
> operand.

Have you mixed up MOVSX with MOVSXD? Both have separate pages in
AMD's doc, but a common page in Intel's.

> There are corner cases which we can't possibly reasonably cope with. 
> e.g. It is model specific as to whether UD0 takes a ModRM byte or not,
> and I'll note that the latest revision (3.31) of APM Vol2 clarifies in
> Table 8-8:
> 
> "This reflects the relative priority for faults encountered when
> fetching the first byte of an instruction. In the fetching and decoding
> of subsequent bytes of an instruction, if those bytes span the segment
> limit or cross into a non-executable or not-present page, the fetch will
> result in a #GP(0) fault or #PF as appropriate, preventing those bytes
> from being accessed. However, if the instruction can be determined to be
> invalid based just on the bytes preceding that boundary, a #UD fault may
> take priority. This behavior is model-dependent."
> 
> so we have no hope of getting model-accurate fault behaviour.

How is UD0 relevant here? And was the remainder of the above perhaps
meant to be in response to the ARPL adjustment, described below? If
so, I still wouldn't know what to take from it as far as this patch
goes.

>> While touching this code I also noticed that #UD outside of protected
>> mode gets raised for ARPL only after having read the memory operand -
>> correct this atthe same time by moving up the respective construct.
> 
> at the.

Fixed.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Andrew Cooper 4 years, 7 months ago
On 18/09/2019 07:34, Jan Beulich wrote:
> On 17.09.2019 19:17, Andrew Cooper wrote:
>> On 16/09/2019 10:48, Jan Beulich wrote:
>>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
>>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
>>> memory (or register) when used without REX.W and with operand size
>>> override. Since the upper 16 bits of the value read won't be used
>>> anyway in this case, make the emulation uniformly follow this more
>>> compatible behavior when not emulating an AMD-like CPU, at the risk
>>> of missing an exception when emulating on/for older hardware (the
>>> boundary at SandyBridge noted in said commit looks questionable - I've
>>> observed the "new" behavior also on Westmere).
>> AMD documents this instruction has always using an 8 or 16bit source
>> operand.
> Have you mixed up MOVSX with MOVSXD? Both have separate pages in
> AMD's doc, but a common page in Intel's.

I had confused the two, yes.

I constructed an experiment using 66 6e 08, i.e.

movslq (%rax),%cx

according to objdump, and iterating backwards over the boundary to the
unmapped page at 0.

On a Rome system:

(d24) Ptr: 0000000000001000
(d24)  => c2c2
(d24) Ptr: 0000000000000fff
(d24) ******************************
(d24) PANIC: Unhandled exception at 0008:00000000001047a5
(d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
(d24) ******************************

Which also confirms the description which states that in the case of a
16 bit operand, no sign extension occurs.

I then tried the same test on an Intel Haswell system:

(d91) Ptr: 0000000000001000
(d91)  => c2c2
(d91) Ptr: 0000000000000fff
(d91) ******************************
(d91) PANIC: Unhandled exception at 0008:00000000001047a5
(d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
(d91) ******************************

So from this experimentation, I disbelieve the claim in XED, and and it
looks as if the Intel behaviour matches the AMD documentation.

Either way, I think further clarification from Intel is needed.

>
>> There are corner cases which we can't possibly reasonably cope with. 
>> e.g. It is model specific as to whether UD0 takes a ModRM byte or not,
>> and I'll note that the latest revision (3.31) of APM Vol2 clarifies in
>> Table 8-8:
>>
>> "This reflects the relative priority for faults encountered when
>> fetching the first byte of an instruction. In the fetching and decoding
>> of subsequent bytes of an instruction, if those bytes span the segment
>> limit or cross into a non-executable or not-present page, the fetch will
>> result in a #GP(0) fault or #PF as appropriate, preventing those bytes
>> from being accessed. However, if the instruction can be determined to be
>> invalid based just on the bytes preceding that boundary, a #UD fault may
>> take priority. This behavior is model-dependent."
>>
>> so we have no hope of getting model-accurate fault behaviour.
> How is UD0 relevant here?

to "there are model-specific corner cases which we can't possibly
reasonably cope with."

> And was the remainder of the above perhaps
> meant to be in response to the ARPL adjustment, described below? If
> so, I still wouldn't know what to take from it as far as this patch
> goes.

The ARPL bit is fine in isolation, and probably wants submitting in
isolation, given the conflicting evidence about MOVSXD.

If you do want to submit it individually, consider it R-by me to avoid
further latency.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Jan Beulich 4 years, 7 months ago
On 18.09.2019 21:22, Andrew Cooper wrote:
> On 18/09/2019 07:34, Jan Beulich wrote:
>> On 17.09.2019 19:17, Andrew Cooper wrote:
>>> On 16/09/2019 10:48, Jan Beulich wrote:
>>>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
>>>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
>>>> memory (or register) when used without REX.W and with operand size
>>>> override. Since the upper 16 bits of the value read won't be used
>>>> anyway in this case, make the emulation uniformly follow this more
>>>> compatible behavior when not emulating an AMD-like CPU, at the risk
>>>> of missing an exception when emulating on/for older hardware (the
>>>> boundary at SandyBridge noted in said commit looks questionable - I've
>>>> observed the "new" behavior also on Westmere).
>>> AMD documents this instruction has always using an 8 or 16bit source
>>> operand.
>> Have you mixed up MOVSX with MOVSXD? Both have separate pages in
>> AMD's doc, but a common page in Intel's.
> 
> I had confused the two, yes.
> 
> I constructed an experiment using 66 6e 08, i.e.
> 
> movslq (%rax),%cx
> 
> according to objdump, and iterating backwards over the boundary to the
> unmapped page at 0.
> 
> On a Rome system:
> 
> (d24) Ptr: 0000000000001000
> (d24)  => c2c2
> (d24) Ptr: 0000000000000fff
> (d24) ******************************
> (d24) PANIC: Unhandled exception at 0008:00000000001047a5
> (d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
> (d24) ******************************
> 
> Which also confirms the description which states that in the case of a
> 16 bit operand, no sign extension occurs.
> 
> I then tried the same test on an Intel Haswell system:
> 
> (d91) Ptr: 0000000000001000
> (d91)  => c2c2
> (d91) Ptr: 0000000000000fff
> (d91) ******************************
> (d91) PANIC: Unhandled exception at 0008:00000000001047a5
> (d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
> (d91) ******************************

But judging from the "Ptr: 0000000000000fff" in the log I take
it you tried to access a byte rather than a word (which would
need an address of 0000000000000ffe to distinguish whether it's
a 2- or 4-byte read that the CPU issues). Trust me, I did try
this out, which is also why I noticed that Mark's claim of
the behavior having changed with SandyBridge is likely wrong.
He has meanwhile confirmed that Merom also already behaved this
way.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Andrew Cooper 4 years, 6 months ago
On 19/09/2019 10:31, Jan Beulich wrote:
> On 18.09.2019 21:22, Andrew Cooper wrote:
>> On 18/09/2019 07:34, Jan Beulich wrote:
>>> On 17.09.2019 19:17, Andrew Cooper wrote:
>>>> On 16/09/2019 10:48, Jan Beulich wrote:
>>>>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out
>>>>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from
>>>>> memory (or register) when used without REX.W and with operand size
>>>>> override. Since the upper 16 bits of the value read won't be used
>>>>> anyway in this case, make the emulation uniformly follow this more
>>>>> compatible behavior when not emulating an AMD-like CPU, at the risk
>>>>> of missing an exception when emulating on/for older hardware (the
>>>>> boundary at SandyBridge noted in said commit looks questionable - I've
>>>>> observed the "new" behavior also on Westmere).
>>>> AMD documents this instruction has always using an 8 or 16bit source
>>>> operand.
>>> Have you mixed up MOVSX with MOVSXD? Both have separate pages in
>>> AMD's doc, but a common page in Intel's.
>> I had confused the two, yes.
>>
>> I constructed an experiment using 66 6e 08, i.e.
>>
>> movslq (%rax),%cx
>>
>> according to objdump, and iterating backwards over the boundary to the
>> unmapped page at 0.
>>
>> On a Rome system:
>>
>> (d24) Ptr: 0000000000001000
>> (d24)  => c2c2
>> (d24) Ptr: 0000000000000fff
>> (d24) ******************************
>> (d24) PANIC: Unhandled exception at 0008:00000000001047a5
>> (d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
>> (d24) ******************************
>>
>> Which also confirms the description which states that in the case of a
>> 16 bit operand, no sign extension occurs.
>>
>> I then tried the same test on an Intel Haswell system:
>>
>> (d91) Ptr: 0000000000001000
>> (d91)  => c2c2
>> (d91) Ptr: 0000000000000fff
>> (d91) ******************************
>> (d91) PANIC: Unhandled exception at 0008:00000000001047a5
>> (d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff
>> (d91) ******************************
> But judging from the "Ptr: 0000000000000fff" in the log I take
> it you tried to access a byte rather than a word (which would
> need an address of 0000000000000ffe to distinguish whether it's
> a 2- or 4-byte read that the CPU issues).

No - it was a word access in all cases.  The bug was walking backwards
into an unmapped page, rather than forward.

When walking forward from 0x1ffc to 0x2000, I do observe that AMD does
4-byte accesses while Intel does 2.

> Trust me, I did try
> this out, which is also why I noticed that Mark's claim of
> the behavior having changed with SandyBridge is likely wrong.
> He has meanwhile confirmed that Merom also already behaved this
> way.

Sadly, we also have a second bug here, and it is rather more complicated
to reason about.

The shadow livelock bug with implicit supervisor accesses to user
mappings happens when the pipeline generates #PF, and the emulator
believes that the instruction completes correctly.  The SMAP case
specifically occurs because information is discarded on VMExit which
prevents Xen's pagewalk from behaving identically to the pipeline. 
However, any case where hardware and the emulator disagree is a recipe
for livelock.

In this example, hardware can the emulator can disagree by using a
different access width.

I've been experimenting with my Rome system, and an emulator hardcoded
to use 2-byte accesses.  After some investigation, the livelock only
occurs for access-rights faults.  Translation faults get identified as
not a shadow fault, and bounced back to the guest.

Shadow guests can use PKRU, so can generate an access fault by marking
the page at 0x2000 as no-access, so I think that in principle, this
change will result in a new latent livelock case, but I can't actually
confirm it.

For now, I don't think this will interact problematically with
mem-access reduced p2m types, because they are ignored by the emulator.

However, I think we need to tread very carefully here.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Jan Beulich 4 years, 6 months ago
On 01.10.2019 21:44, Andrew Cooper wrote:
> In this example, hardware can the emulator can disagree by using a
> different access width.
> 
> I've been experimenting with my Rome system, and an emulator hardcoded
> to use 2-byte accesses.  After some investigation, the livelock only
> occurs for access-rights faults.  Translation faults get identified as
> not a shadow fault, and bounced back to the guest.
> 
> Shadow guests can use PKRU, so can generate an access fault by marking
> the page at 0x2000 as no-access, so I think that in principle, this
> change will result in a new latent livelock case, but I can't actually
> confirm it.

I think I see what you mean, but then I don't see how this is an
argument against the patch in its current shape: It actually
reduces the cases of disagreement between hardware and emulator.
One possibility to make a further step in that direction would
be to make behavior dependent upon the underlying hardware's
vendor, rather than the one the guest sees.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Andrew Cooper 4 years, 6 months ago
On 02/10/2019 08:07, Jan Beulich wrote:
> On 01.10.2019 21:44, Andrew Cooper wrote:
>> In this example, hardware can the emulator can disagree by using a
>> different access width.
>>
>> I've been experimenting with my Rome system, and an emulator hardcoded
>> to use 2-byte accesses.  After some investigation, the livelock only
>> occurs for access-rights faults.  Translation faults get identified as
>> not a shadow fault, and bounced back to the guest.
>>
>> Shadow guests can use PKRU, so can generate an access fault by marking
>> the page at 0x2000 as no-access, so I think that in principle, this
>> change will result in a new latent livelock case, but I can't actually
>> confirm it.
> I think I see what you mean, but then I don't see how this is an
> argument against the patch in its current shape: It actually
> reduces the cases of disagreement between hardware and emulator.

At the moment, the emulator is strictly 4 bytes, and hardware may be 4
or 2.  Therefore, there is no chance of the pipeline yielding #PF while
the emulator yielding OK.

With the change in place, older Intel parts which do use a 4 byte access
now come with a risk of livelock.  Whichever parts these are, they
predate PKRU so in this specific case, the problem is only theoretical
AFAICT.

Also, in this specific case, Intel's warning of "Don't use this
instruction without a REX prefix" means that we shouldn't see it in
anything but test scenarios.

> One possibility to make a further step in that direction would
> be to make behavior dependent upon the underlying hardware's
> vendor, rather than the one the guest sees.

I considered this.  It would work on native (at the expense of
complicating the emulator), but won't work properly if Xen is
virtualisied and migrated.  I can't see a way around this.

Furthermore, we can't execute the instruction against a mapping of the
guest, because the problem here is determining the width of the access,
which is information needed to construct the mapping in the first place.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Jan Beulich 4 years, 6 months ago
On 02.10.2019 10:51, Andrew Cooper wrote:
> On 02/10/2019 08:07, Jan Beulich wrote:
>> On 01.10.2019 21:44, Andrew Cooper wrote:
>>> In this example, hardware can the emulator can disagree by using a
>>> different access width.
>>>
>>> I've been experimenting with my Rome system, and an emulator hardcoded
>>> to use 2-byte accesses.  After some investigation, the livelock only
>>> occurs for access-rights faults.  Translation faults get identified as
>>> not a shadow fault, and bounced back to the guest.
>>>
>>> Shadow guests can use PKRU, so can generate an access fault by marking
>>> the page at 0x2000 as no-access, so I think that in principle, this
>>> change will result in a new latent livelock case, but I can't actually
>>> confirm it.
>> I think I see what you mean, but then I don't see how this is an
>> argument against the patch in its current shape: It actually
>> reduces the cases of disagreement between hardware and emulator.
> 
> At the moment, the emulator is strictly 4 bytes, and hardware may be 4
> or 2.  Therefore, there is no chance of the pipeline yielding #PF while
> the emulator yielding OK.

At the expense of possibly yielding #PF when the pipeline wouldn't.

> With the change in place, older Intel parts which do use a 4 byte access
> now come with a risk of livelock.  Whichever parts these are, they
> predate PKRU so in this specific case, the problem is only theoretical
> AFAICT.

Plus at this point we don't even know whether there are any such
parts.

> Also, in this specific case, Intel's warning of "Don't use this
> instruction without a REX prefix" means that we shouldn't see it in
> anything but test scenarios.

It's extremely unlikely at least.

>> One possibility to make a further step in that direction would
>> be to make behavior dependent upon the underlying hardware's
>> vendor, rather than the one the guest sees.
> 
> I considered this.  It would work on native (at the expense of
> complicating the emulator), but won't work properly if Xen is
> virtualisied and migrated.  I can't see a way around this.

Are you concerned about Xen getting cross-vendor migrated? If
you'd accept things to not be 100% right in such a case, I could
simply probe hardware while booting.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86emul: adjust MOVSXD source operand handling
Posted by Andrew Cooper 4 years, 6 months ago
On 02/10/2019 10:17, Jan Beulich wrote:
> On 02.10.2019 10:51, Andrew Cooper wrote:
>> On 02/10/2019 08:07, Jan Beulich wrote:
>>> On 01.10.2019 21:44, Andrew Cooper wrote:
>>>> In this example, hardware can the emulator can disagree by using a
>>>> different access width.
>>>>
>>>> I've been experimenting with my Rome system, and an emulator hardcoded
>>>> to use 2-byte accesses.  After some investigation, the livelock only
>>>> occurs for access-rights faults.  Translation faults get identified as
>>>> not a shadow fault, and bounced back to the guest.
>>>>
>>>> Shadow guests can use PKRU, so can generate an access fault by marking
>>>> the page at 0x2000 as no-access, so I think that in principle, this
>>>> change will result in a new latent livelock case, but I can't actually
>>>> confirm it.
>>> I think I see what you mean, but then I don't see how this is an
>>> argument against the patch in its current shape: It actually
>>> reduces the cases of disagreement between hardware and emulator.
>> At the moment, the emulator is strictly 4 bytes, and hardware may be 4
>> or 2.  Therefore, there is no chance of the pipeline yielding #PF while
>> the emulator yielding OK.
> At the expense of possibly yielding #PF when the pipeline wouldn't.

Right, but given the differing behaviour, no code can reasonably expect
not to get the second #PF.

>> With the change in place, older Intel parts which do use a 4 byte access
>> now come with a risk of livelock.  Whichever parts these are, they
>> predate PKRU so in this specific case, the problem is only theoretical
>> AFAICT.
> Plus at this point we don't even know whether there are any such
> parts.

I'll see if I can find out.  Given this is a 64-bit only instruction, it
is possible that Intel has always had the described behaviour, and it
was just the documentation which was incorrect.

>> Also, in this specific case, Intel's warning of "Don't use this
>> instruction without a REX prefix" means that we shouldn't see it in
>> anything but test scenarios.
> It's extremely unlikely at least.
>
>>> One possibility to make a further step in that direction would
>>> be to make behavior dependent upon the underlying hardware's
>>> vendor, rather than the one the guest sees.
>> I considered this.  It would work on native (at the expense of
>> complicating the emulator), but won't work properly if Xen is
>> virtualisied and migrated.  I can't see a way around this.
> Are you concerned about Xen getting cross-vendor migrated? If
> you'd accept things to not be 100% right in such a case, I could
> simply probe hardware while booting.

To be honest, when Xen isn't L0, all of this is liable to break under
our feet.  I don't think probing at boot is going to provide a
meaningful improvement on that.

For this specific movxsd change, lets call it Acked-by me.  Whatever the
behaviour on ancient processors, the livelock case is safe because they
don't support PKRU.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel