I've noticed the issue the 1st patch means to address only while putting together the 2nd. Considering the other Hygon enablement in this release cycle I think we want patch 1 for 4.13. Patch 2 should be considered too, but we've been effectively mis-emulating MOVSXD on modern Intel hardware for quite some time, so it getting delayed until after 4.13 (and then possible be backported) wouldn't be overly bad. 1: treat Hygon guests like AMD ones 2: adjust MOVSXD source operand handling Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 16.09.19 11:44, Jan Beulich wrote: > I've noticed the issue the 1st patch means to address only while > putting together the 2nd. Considering the other Hygon enablement > in this release cycle I think we want patch 1 for 4.13. Patch 2 > should be considered too, but we've been effectively mis-emulating > MOVSXD on modern Intel hardware for quite some time, so it getting > delayed until after 4.13 (and then possible be backported) wouldn't > be overly bad. > > 1: treat Hygon guests like AMD ones > 2: adjust MOVSXD source operand handling For the series: Release-acked-by: Juergen Gross <jgross@suse.com> Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
For some reason the Hygon enabling series left out the insn emulator. Make appropriate adjustments wherever we've been special casing AMD. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1995,7 +1995,8 @@ protmode_load_seg( case x86_seg_tr: goto raise_exn; } - if ( cp->x86_vendor != X86_VENDOR_AMD || !ops->read_segment || + if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) || + !ops->read_segment || ops->read_segment(seg, sreg, ctxt) != X86EMUL_OKAY ) memset(sreg, 0, sizeof(*sreg)); else @@ -2122,7 +2123,8 @@ protmode_load_seg( */ bool wide = desc.b & 0x1000 ? false : (desc.b & 0xf00) != 0xc00 && - cp->x86_vendor != X86_VENDOR_AMD + !(cp->x86_vendor & + (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ? mode_64bit() : ctxt->lma; if ( wide ) @@ -2140,7 +2142,8 @@ protmode_load_seg( default: return rc; } - if ( !mode_64bit() && cp->x86_vendor == X86_VENDOR_AMD && + if ( !mode_64bit() && + (cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) && (desc.b & 0xf00) != 0xc00 ) desc_hi.b = desc_hi.a = 0; if ( (desc_hi.b & 0x00001f00) || _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On Mon, 16 Sep 2019 at 10:47, Jan Beulich <jbeulich@suse.com> wrote: > > For some reason the Hygon enabling series left out the insn emulator. > Make appropriate adjustments wherever we've been special casing AMD. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wl@xen.org> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 16/09/2019 10:48, Jan Beulich wrote: > For some reason the Hygon enabling series left out the insn emulator. > Make appropriate adjustments wherever we've been special casing AMD. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out that as of SDM rev 064 MOVSXD is specified to read only 16 bits from memory (or register) when used without REX.W and with operand size override. Since the upper 16 bits of the value read won't be used anyway in this case, make the emulation uniformly follow this more compatible behavior when not emulating an AMD-like CPU, at the risk of missing an exception when emulating on/for older hardware (the boundary at SandyBridge noted in said commit looks questionable - I've observed the "new" behavior also on Westmere). While touching this code I also noticed that #UD outside of protected mode gets raised for ARPL only after having read the memory operand - correct this atthe same time by moving up the respective construct. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -4048,8 +4048,12 @@ x86_emulate( /* movsxd */ if ( ea.type == OP_REG ) src.val = *ea.reg; - else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off, - &src.val, 4, ctxt, ops)) ) + else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off, &src.val, + (op_bytes == 2 && + !(ctxt->cpuid->x86_vendor & + (X86_VENDOR_AMD | X86_VENDOR_HYGON)) + ? 2 : 4), + ctxt, ops)) ) goto done; dst.val = (int32_t)src.val; } @@ -4058,6 +4062,8 @@ x86_emulate( /* arpl */ unsigned int src_rpl = dst.val & 3; + generate_exception_if(!in_protmode(ctxt, ops), EXC_UD); + dst = ea; dst.bytes = 2; if ( dst.type == OP_REG ) @@ -4075,7 +4081,6 @@ x86_emulate( _regs.eflags &= ~X86_EFLAGS_ZF; dst.type = OP_NONE; } - generate_exception_if(!in_protmode(ctxt, ops), EXC_UD); } break; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 16/09/2019 10:48, Jan Beulich wrote: > XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out > that as of SDM rev 064 MOVSXD is specified to read only 16 bits from > memory (or register) when used without REX.W and with operand size > override. Since the upper 16 bits of the value read won't be used > anyway in this case, make the emulation uniformly follow this more > compatible behavior when not emulating an AMD-like CPU, at the risk > of missing an exception when emulating on/for older hardware (the > boundary at SandyBridge noted in said commit looks questionable - I've > observed the "new" behavior also on Westmere). AMD documents this instruction has always using an 8 or 16bit source operand. There are corner cases which we can't possibly reasonably cope with. e.g. It is model specific as to whether UD0 takes a ModRM byte or not, and I'll note that the latest revision (3.31) of APM Vol2 clarifies in Table 8-8: "This reflects the relative priority for faults encountered when fetching the first byte of an instruction. In the fetching and decoding of subsequent bytes of an instruction, if those bytes span the segment limit or cross into a non-executable or not-present page, the fetch will result in a #GP(0) fault or #PF as appropriate, preventing those bytes from being accessed. However, if the instruction can be determined to be invalid based just on the bytes preceding that boundary, a #UD fault may take priority. This behavior is model-dependent." so we have no hope of getting model-accurate fault behaviour. > While touching this code I also noticed that #UD outside of protected > mode gets raised for ARPL only after having read the memory operand - > correct this atthe same time by moving up the respective construct. at the. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 17.09.2019 19:17, Andrew Cooper wrote: > On 16/09/2019 10:48, Jan Beulich wrote: >> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out >> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from >> memory (or register) when used without REX.W and with operand size >> override. Since the upper 16 bits of the value read won't be used >> anyway in this case, make the emulation uniformly follow this more >> compatible behavior when not emulating an AMD-like CPU, at the risk >> of missing an exception when emulating on/for older hardware (the >> boundary at SandyBridge noted in said commit looks questionable - I've >> observed the "new" behavior also on Westmere). > > AMD documents this instruction has always using an 8 or 16bit source > operand. Have you mixed up MOVSX with MOVSXD? Both have separate pages in AMD's doc, but a common page in Intel's. > There are corner cases which we can't possibly reasonably cope with. > e.g. It is model specific as to whether UD0 takes a ModRM byte or not, > and I'll note that the latest revision (3.31) of APM Vol2 clarifies in > Table 8-8: > > "This reflects the relative priority for faults encountered when > fetching the first byte of an instruction. In the fetching and decoding > of subsequent bytes of an instruction, if those bytes span the segment > limit or cross into a non-executable or not-present page, the fetch will > result in a #GP(0) fault or #PF as appropriate, preventing those bytes > from being accessed. However, if the instruction can be determined to be > invalid based just on the bytes preceding that boundary, a #UD fault may > take priority. This behavior is model-dependent." > > so we have no hope of getting model-accurate fault behaviour. How is UD0 relevant here? And was the remainder of the above perhaps meant to be in response to the ARPL adjustment, described below? If so, I still wouldn't know what to take from it as far as this patch goes. >> While touching this code I also noticed that #UD outside of protected >> mode gets raised for ARPL only after having read the memory operand - >> correct this atthe same time by moving up the respective construct. > > at the. Fixed. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 18/09/2019 07:34, Jan Beulich wrote: > On 17.09.2019 19:17, Andrew Cooper wrote: >> On 16/09/2019 10:48, Jan Beulich wrote: >>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out >>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from >>> memory (or register) when used without REX.W and with operand size >>> override. Since the upper 16 bits of the value read won't be used >>> anyway in this case, make the emulation uniformly follow this more >>> compatible behavior when not emulating an AMD-like CPU, at the risk >>> of missing an exception when emulating on/for older hardware (the >>> boundary at SandyBridge noted in said commit looks questionable - I've >>> observed the "new" behavior also on Westmere). >> AMD documents this instruction has always using an 8 or 16bit source >> operand. > Have you mixed up MOVSX with MOVSXD? Both have separate pages in > AMD's doc, but a common page in Intel's. I had confused the two, yes. I constructed an experiment using 66 6e 08, i.e. movslq (%rax),%cx according to objdump, and iterating backwards over the boundary to the unmapped page at 0. On a Rome system: (d24) Ptr: 0000000000001000 (d24) => c2c2 (d24) Ptr: 0000000000000fff (d24) ****************************** (d24) PANIC: Unhandled exception at 0008:00000000001047a5 (d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff (d24) ****************************** Which also confirms the description which states that in the case of a 16 bit operand, no sign extension occurs. I then tried the same test on an Intel Haswell system: (d91) Ptr: 0000000000001000 (d91) => c2c2 (d91) Ptr: 0000000000000fff (d91) ****************************** (d91) PANIC: Unhandled exception at 0008:00000000001047a5 (d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff (d91) ****************************** So from this experimentation, I disbelieve the claim in XED, and and it looks as if the Intel behaviour matches the AMD documentation. Either way, I think further clarification from Intel is needed. > >> There are corner cases which we can't possibly reasonably cope with. >> e.g. It is model specific as to whether UD0 takes a ModRM byte or not, >> and I'll note that the latest revision (3.31) of APM Vol2 clarifies in >> Table 8-8: >> >> "This reflects the relative priority for faults encountered when >> fetching the first byte of an instruction. In the fetching and decoding >> of subsequent bytes of an instruction, if those bytes span the segment >> limit or cross into a non-executable or not-present page, the fetch will >> result in a #GP(0) fault or #PF as appropriate, preventing those bytes >> from being accessed. However, if the instruction can be determined to be >> invalid based just on the bytes preceding that boundary, a #UD fault may >> take priority. This behavior is model-dependent." >> >> so we have no hope of getting model-accurate fault behaviour. > How is UD0 relevant here? to "there are model-specific corner cases which we can't possibly reasonably cope with." > And was the remainder of the above perhaps > meant to be in response to the ARPL adjustment, described below? If > so, I still wouldn't know what to take from it as far as this patch > goes. The ARPL bit is fine in isolation, and probably wants submitting in isolation, given the conflicting evidence about MOVSXD. If you do want to submit it individually, consider it R-by me to avoid further latency. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 18.09.2019 21:22, Andrew Cooper wrote: > On 18/09/2019 07:34, Jan Beulich wrote: >> On 17.09.2019 19:17, Andrew Cooper wrote: >>> On 16/09/2019 10:48, Jan Beulich wrote: >>>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out >>>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from >>>> memory (or register) when used without REX.W and with operand size >>>> override. Since the upper 16 bits of the value read won't be used >>>> anyway in this case, make the emulation uniformly follow this more >>>> compatible behavior when not emulating an AMD-like CPU, at the risk >>>> of missing an exception when emulating on/for older hardware (the >>>> boundary at SandyBridge noted in said commit looks questionable - I've >>>> observed the "new" behavior also on Westmere). >>> AMD documents this instruction has always using an 8 or 16bit source >>> operand. >> Have you mixed up MOVSX with MOVSXD? Both have separate pages in >> AMD's doc, but a common page in Intel's. > > I had confused the two, yes. > > I constructed an experiment using 66 6e 08, i.e. > > movslq (%rax),%cx > > according to objdump, and iterating backwards over the boundary to the > unmapped page at 0. > > On a Rome system: > > (d24) Ptr: 0000000000001000 > (d24) => c2c2 > (d24) Ptr: 0000000000000fff > (d24) ****************************** > (d24) PANIC: Unhandled exception at 0008:00000000001047a5 > (d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff > (d24) ****************************** > > Which also confirms the description which states that in the case of a > 16 bit operand, no sign extension occurs. > > I then tried the same test on an Intel Haswell system: > > (d91) Ptr: 0000000000001000 > (d91) => c2c2 > (d91) Ptr: 0000000000000fff > (d91) ****************************** > (d91) PANIC: Unhandled exception at 0008:00000000001047a5 > (d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff > (d91) ****************************** But judging from the "Ptr: 0000000000000fff" in the log I take it you tried to access a byte rather than a word (which would need an address of 0000000000000ffe to distinguish whether it's a 2- or 4-byte read that the CPU issues). Trust me, I did try this out, which is also why I noticed that Mark's claim of the behavior having changed with SandyBridge is likely wrong. He has meanwhile confirmed that Merom also already behaved this way. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 19/09/2019 10:31, Jan Beulich wrote: > On 18.09.2019 21:22, Andrew Cooper wrote: >> On 18/09/2019 07:34, Jan Beulich wrote: >>> On 17.09.2019 19:17, Andrew Cooper wrote: >>>> On 16/09/2019 10:48, Jan Beulich wrote: >>>>> XED commit 1b2fd94425 ("Update MOVSXD to modern behavior") points out >>>>> that as of SDM rev 064 MOVSXD is specified to read only 16 bits from >>>>> memory (or register) when used without REX.W and with operand size >>>>> override. Since the upper 16 bits of the value read won't be used >>>>> anyway in this case, make the emulation uniformly follow this more >>>>> compatible behavior when not emulating an AMD-like CPU, at the risk >>>>> of missing an exception when emulating on/for older hardware (the >>>>> boundary at SandyBridge noted in said commit looks questionable - I've >>>>> observed the "new" behavior also on Westmere). >>>> AMD documents this instruction has always using an 8 or 16bit source >>>> operand. >>> Have you mixed up MOVSX with MOVSXD? Both have separate pages in >>> AMD's doc, but a common page in Intel's. >> I had confused the two, yes. >> >> I constructed an experiment using 66 6e 08, i.e. >> >> movslq (%rax),%cx >> >> according to objdump, and iterating backwards over the boundary to the >> unmapped page at 0. >> >> On a Rome system: >> >> (d24) Ptr: 0000000000001000 >> (d24) => c2c2 >> (d24) Ptr: 0000000000000fff >> (d24) ****************************** >> (d24) PANIC: Unhandled exception at 0008:00000000001047a5 >> (d24) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff >> (d24) ****************************** >> >> Which also confirms the description which states that in the case of a >> 16 bit operand, no sign extension occurs. >> >> I then tried the same test on an Intel Haswell system: >> >> (d91) Ptr: 0000000000001000 >> (d91) => c2c2 >> (d91) Ptr: 0000000000000fff >> (d91) ****************************** >> (d91) PANIC: Unhandled exception at 0008:00000000001047a5 >> (d91) Vec 14 #PF[-d-sr-] %cr2 0000000000000fff >> (d91) ****************************** > But judging from the "Ptr: 0000000000000fff" in the log I take > it you tried to access a byte rather than a word (which would > need an address of 0000000000000ffe to distinguish whether it's > a 2- or 4-byte read that the CPU issues). No - it was a word access in all cases. The bug was walking backwards into an unmapped page, rather than forward. When walking forward from 0x1ffc to 0x2000, I do observe that AMD does 4-byte accesses while Intel does 2. > Trust me, I did try > this out, which is also why I noticed that Mark's claim of > the behavior having changed with SandyBridge is likely wrong. > He has meanwhile confirmed that Merom also already behaved this > way. Sadly, we also have a second bug here, and it is rather more complicated to reason about. The shadow livelock bug with implicit supervisor accesses to user mappings happens when the pipeline generates #PF, and the emulator believes that the instruction completes correctly. The SMAP case specifically occurs because information is discarded on VMExit which prevents Xen's pagewalk from behaving identically to the pipeline. However, any case where hardware and the emulator disagree is a recipe for livelock. In this example, hardware can the emulator can disagree by using a different access width. I've been experimenting with my Rome system, and an emulator hardcoded to use 2-byte accesses. After some investigation, the livelock only occurs for access-rights faults. Translation faults get identified as not a shadow fault, and bounced back to the guest. Shadow guests can use PKRU, so can generate an access fault by marking the page at 0x2000 as no-access, so I think that in principle, this change will result in a new latent livelock case, but I can't actually confirm it. For now, I don't think this will interact problematically with mem-access reduced p2m types, because they are ignored by the emulator. However, I think we need to tread very carefully here. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 01.10.2019 21:44, Andrew Cooper wrote: > In this example, hardware can the emulator can disagree by using a > different access width. > > I've been experimenting with my Rome system, and an emulator hardcoded > to use 2-byte accesses. After some investigation, the livelock only > occurs for access-rights faults. Translation faults get identified as > not a shadow fault, and bounced back to the guest. > > Shadow guests can use PKRU, so can generate an access fault by marking > the page at 0x2000 as no-access, so I think that in principle, this > change will result in a new latent livelock case, but I can't actually > confirm it. I think I see what you mean, but then I don't see how this is an argument against the patch in its current shape: It actually reduces the cases of disagreement between hardware and emulator. One possibility to make a further step in that direction would be to make behavior dependent upon the underlying hardware's vendor, rather than the one the guest sees. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 02/10/2019 08:07, Jan Beulich wrote: > On 01.10.2019 21:44, Andrew Cooper wrote: >> In this example, hardware can the emulator can disagree by using a >> different access width. >> >> I've been experimenting with my Rome system, and an emulator hardcoded >> to use 2-byte accesses. After some investigation, the livelock only >> occurs for access-rights faults. Translation faults get identified as >> not a shadow fault, and bounced back to the guest. >> >> Shadow guests can use PKRU, so can generate an access fault by marking >> the page at 0x2000 as no-access, so I think that in principle, this >> change will result in a new latent livelock case, but I can't actually >> confirm it. > I think I see what you mean, but then I don't see how this is an > argument against the patch in its current shape: It actually > reduces the cases of disagreement between hardware and emulator. At the moment, the emulator is strictly 4 bytes, and hardware may be 4 or 2. Therefore, there is no chance of the pipeline yielding #PF while the emulator yielding OK. With the change in place, older Intel parts which do use a 4 byte access now come with a risk of livelock. Whichever parts these are, they predate PKRU so in this specific case, the problem is only theoretical AFAICT. Also, in this specific case, Intel's warning of "Don't use this instruction without a REX prefix" means that we shouldn't see it in anything but test scenarios. > One possibility to make a further step in that direction would > be to make behavior dependent upon the underlying hardware's > vendor, rather than the one the guest sees. I considered this. It would work on native (at the expense of complicating the emulator), but won't work properly if Xen is virtualisied and migrated. I can't see a way around this. Furthermore, we can't execute the instruction against a mapping of the guest, because the problem here is determining the width of the access, which is information needed to construct the mapping in the first place. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 02.10.2019 10:51, Andrew Cooper wrote: > On 02/10/2019 08:07, Jan Beulich wrote: >> On 01.10.2019 21:44, Andrew Cooper wrote: >>> In this example, hardware can the emulator can disagree by using a >>> different access width. >>> >>> I've been experimenting with my Rome system, and an emulator hardcoded >>> to use 2-byte accesses. After some investigation, the livelock only >>> occurs for access-rights faults. Translation faults get identified as >>> not a shadow fault, and bounced back to the guest. >>> >>> Shadow guests can use PKRU, so can generate an access fault by marking >>> the page at 0x2000 as no-access, so I think that in principle, this >>> change will result in a new latent livelock case, but I can't actually >>> confirm it. >> I think I see what you mean, but then I don't see how this is an >> argument against the patch in its current shape: It actually >> reduces the cases of disagreement between hardware and emulator. > > At the moment, the emulator is strictly 4 bytes, and hardware may be 4 > or 2. Therefore, there is no chance of the pipeline yielding #PF while > the emulator yielding OK. At the expense of possibly yielding #PF when the pipeline wouldn't. > With the change in place, older Intel parts which do use a 4 byte access > now come with a risk of livelock. Whichever parts these are, they > predate PKRU so in this specific case, the problem is only theoretical > AFAICT. Plus at this point we don't even know whether there are any such parts. > Also, in this specific case, Intel's warning of "Don't use this > instruction without a REX prefix" means that we shouldn't see it in > anything but test scenarios. It's extremely unlikely at least. >> One possibility to make a further step in that direction would >> be to make behavior dependent upon the underlying hardware's >> vendor, rather than the one the guest sees. > > I considered this. It would work on native (at the expense of > complicating the emulator), but won't work properly if Xen is > virtualisied and migrated. I can't see a way around this. Are you concerned about Xen getting cross-vendor migrated? If you'd accept things to not be 100% right in such a case, I could simply probe hardware while booting. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 02/10/2019 10:17, Jan Beulich wrote: > On 02.10.2019 10:51, Andrew Cooper wrote: >> On 02/10/2019 08:07, Jan Beulich wrote: >>> On 01.10.2019 21:44, Andrew Cooper wrote: >>>> In this example, hardware can the emulator can disagree by using a >>>> different access width. >>>> >>>> I've been experimenting with my Rome system, and an emulator hardcoded >>>> to use 2-byte accesses. After some investigation, the livelock only >>>> occurs for access-rights faults. Translation faults get identified as >>>> not a shadow fault, and bounced back to the guest. >>>> >>>> Shadow guests can use PKRU, so can generate an access fault by marking >>>> the page at 0x2000 as no-access, so I think that in principle, this >>>> change will result in a new latent livelock case, but I can't actually >>>> confirm it. >>> I think I see what you mean, but then I don't see how this is an >>> argument against the patch in its current shape: It actually >>> reduces the cases of disagreement between hardware and emulator. >> At the moment, the emulator is strictly 4 bytes, and hardware may be 4 >> or 2. Therefore, there is no chance of the pipeline yielding #PF while >> the emulator yielding OK. > At the expense of possibly yielding #PF when the pipeline wouldn't. Right, but given the differing behaviour, no code can reasonably expect not to get the second #PF. >> With the change in place, older Intel parts which do use a 4 byte access >> now come with a risk of livelock. Whichever parts these are, they >> predate PKRU so in this specific case, the problem is only theoretical >> AFAICT. > Plus at this point we don't even know whether there are any such > parts. I'll see if I can find out. Given this is a 64-bit only instruction, it is possible that Intel has always had the described behaviour, and it was just the documentation which was incorrect. >> Also, in this specific case, Intel's warning of "Don't use this >> instruction without a REX prefix" means that we shouldn't see it in >> anything but test scenarios. > It's extremely unlikely at least. > >>> One possibility to make a further step in that direction would >>> be to make behavior dependent upon the underlying hardware's >>> vendor, rather than the one the guest sees. >> I considered this. It would work on native (at the expense of >> complicating the emulator), but won't work properly if Xen is >> virtualisied and migrated. I can't see a way around this. > Are you concerned about Xen getting cross-vendor migrated? If > you'd accept things to not be 100% right in such a case, I could > simply probe hardware while booting. To be honest, when Xen isn't L0, all of this is liable to break under our feet. I don't think probing at boot is going to provide a meaningful improvement on that. For this specific movxsd change, lets call it Acked-by me. Whatever the behaviour on ancient processors, the livelock case is safe because they don't support PKRU. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
© 2016 - 2024 Red Hat, Inc.