To represent the USER-MSR bitmap access, a new segment type needs
introducing, behaving like x86_seg_none in terms of address treatment,
but behaving like a system segment for page walk purposes (implicit
supervisor-mode access).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
This feels a little fragile: Of course I did look through uses of the
enumerators, and I didn't find further places which would need
adjustment, but I'm not really sure I didn't miss any place.
---
v3: New.
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -593,6 +593,7 @@ static int read(
default:
if ( !is_x86_user_segment(seg) )
return X86EMUL_UNHANDLEABLE;
+ case x86_seg_sys:
case x86_seg_none:
bytes_read += bytes;
break;
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -840,7 +840,7 @@ static int hvmemul_virtual_to_linear(
int okay;
unsigned long reps = 1;
- if ( seg == x86_seg_none )
+ if ( seg == x86_seg_none || seg == x86_seg_sys )
{
*linear = offset;
return X86EMUL_OKAY;
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2601,7 +2601,7 @@ bool hvm_vcpu_virtual_to_linear(
* It is expected that the access rights of reg are suitable for seg (and
* that this is enforced at the point that seg is loaded).
*/
- ASSERT(seg < x86_seg_none);
+ ASSERT(seg < x86_seg_sys);
/* However, check that insn fetches only ever specify CS. */
ASSERT(access_type != hvm_access_insn_fetch || seg == x86_seg_cs);
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -749,6 +749,7 @@ static void cf_check svm_set_segment_reg
vmcb->ldtr = *reg;
break;
+ case x86_seg_sys:
case x86_seg_none:
ASSERT_UNREACHABLE();
break;
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -43,7 +43,8 @@ enum x86_segment {
x86_seg_ldtr,
x86_seg_gdtr,
x86_seg_idtr,
- /* No Segment: For accesses which are already linear. */
+ /* No Segment: For (system/normal) accesses which are already linear. */
+ x86_seg_sys,
x86_seg_none
};
On 04/09/2024 1:29 pm, Jan Beulich wrote: > To represent the USER-MSR bitmap access, a new segment type needs > introducing, behaving like x86_seg_none in terms of address treatment, > but behaving like a system segment for page walk purposes (implicit > supervisor-mode access). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> After discussing this extensively in the maintainers call, and with others, it's probably the best way forwards given the structure of Xen right now. ~Andrew
On 04/09/2024 1:29 pm, Jan Beulich wrote: > To represent the USER-MSR bitmap access, a new segment type needs > introducing, behaving like x86_seg_none in terms of address treatment, > but behaving like a system segment for page walk purposes (implicit > supervisor-mode access). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > This feels a little fragile: Of course I did look through uses of the > enumerators, and I didn't find further places which would need > adjustment, but I'm not really sure I didn't miss any place. It does feel a bit fragile, but it may help to consider the other related cases. Here, we need a linear access with implicit-supervisor paging properties. From what I can tell, it needs to be exactly like other implicit supervisor accesses. For CET, we get two new cases. The legacy bitmap has a pointer out of MSR_[U,S]_CET, but otherwise obeys CPL rules, so wants to be x86_seg_none. However, WRUSS is both a CPL0 instruction, and generates implicit-user accesses. It's the first instruction of it's like, that I'm aware of. If we're going down this x86_seg_sys route, we'd need x86_seg_user too. Really, this is a consequence of the memory APIs we've got. It's the intermediate layers which generate PFEC_* for the pagewalk, and we're (ab)using segment at the top level to encode "skip segmentation but I still want certain properties". But, there's actually a 3rd case we get from CET, and it breaks everything. Shstk accesses are a new type, architecturally expressed as a new input (and output) to the pagewalk, but are also regular user-segment relative. We either do the same trick of expressing fetch() in terms of read(PFEC_insn) and implement new shstk_{read,write}() accessors which wrap {read,write}(PFEC_shstk), or we need to plumb the PFEC parameters higher in the call tree. It's worth noting that alignment restrictions make things even more complicated. Generally, shstk accesses should be 8 or 4 byte aligned (based on osize), and the pseudocode for WR{U}SS calls this out; after all they're converting from arbitrary memory operands. However, there's a fun corner case where a 64bit code segment can use INCSSPD to misalign SSP, then CALL to generate a misaligned store. This combines with an erratum in Zen3 and possibly Zen4 where there's a missing #GP check on LRET and you can forge a return address formed of two misaligned addresses. So misaligned stores are definitely possible (I checked this on both vendors at the time), so it wouldn't be appropriate to have in a general shstk_*() helper. In turn, this means that the implementation of WR{U}SS would need a way to linearise it's operand manually to insert the additional check before then making a regular memory access. And I can't see a way of doing this without exposing PFEC inputs at the top level. Thoughts? ~Andrew
On 04.09.2024 18:54, Andrew Cooper wrote: > On 04/09/2024 1:29 pm, Jan Beulich wrote: >> To represent the USER-MSR bitmap access, a new segment type needs >> introducing, behaving like x86_seg_none in terms of address treatment, >> but behaving like a system segment for page walk purposes (implicit >> supervisor-mode access). >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> --- >> This feels a little fragile: Of course I did look through uses of the >> enumerators, and I didn't find further places which would need >> adjustment, but I'm not really sure I didn't miss any place. > > It does feel a bit fragile, but it may help to consider the other > related cases. > > Here, we need a linear access with implicit-supervisor paging > properties. From what I can tell, it needs to be exactly like other > implicit supervisor accesses. Well, not exactly. There's no segment (and hence no segment base) involved here. Hence, as said in the description, it's a mix of two things we've got so far. > For CET, we get two new cases. > > The legacy bitmap has a pointer out of MSR_[U,S]_CET, but otherwise > obeys CPL rules, so wants to be x86_seg_none. > > However, WRUSS is both a CPL0 instruction, and generates implicit-user > accesses. It's the first instruction of it's like, that I'm aware of. With MOVU having got ripped back out of the 386, yes. (Whether to call such "implicit" user is a separate question.) > If we're going down this x86_seg_sys route, we'd need x86_seg_user too. That won't work, as we need to express the real x86_seg_[cdefgs]s associated with the insn's memory operand. Whereas x86_seg_sys doesn't need combining with anything. > Really, this is a consequence of the memory APIs we've got. It's the > intermediate layers which generate PFEC_* for the pagewalk, and we're > (ab)using segment at the top level to encode "skip segmentation but I > still want certain properties". Right, for USER-MSR. For WRUSS it's "do segmentation and I want two extra properties" (just one for WRSS). > But, there's actually a 3rd case we get from CET, and it breaks everything. > > Shstk accesses are a new type, architecturally expressed as a new input > (and output) to the pagewalk, but are also regular user-segment relative. WR{,U}SS are part of that, aren't they? > We either do the same trick of expressing fetch() in terms of > read(PFEC_insn) and implement new shstk_{read,write}() accessors which > wrap {read,write}(PFEC_shstk), or we need to plumb the PFEC parameters > higher in the call tree. > > It's worth noting that alignment restrictions make things even more > complicated. Generally, shstk accesses should be 8 or 4 byte aligned > (based on osize), and the pseudocode for WR{U}SS calls this out; after > all they're converting from arbitrary memory operands. > > However, there's a fun corner case where a 64bit code segment can use > INCSSPD to misalign SSP, then CALL to generate a misaligned store. This > combines with an erratum in Zen3 and possibly Zen4 where there's a > missing #GP check on LRET and you can forge a return address formed of > two misaligned addresses. Well, we certainly don't need to emulate errata, I'd say. > So misaligned stores are definitely possible (I checked this on both > vendors at the time), so it wouldn't be appropriate to have in a general > shstk_*() helper. In turn, this means that the implementation of > WR{U}SS would need a way to linearise it's operand manually to insert > the additional check before then making a regular memory access. We do such for SSE alignment checking already; see the emulator's is_aligned(). I don't see why we couldn't re-use that for WR{,U}SS. > And I can't see a way of doing this without exposing PFEC inputs at the > top level. Certainly we'll need a qualifier alongside x86_seg_[cdefgs]s, which of course could then also be allowed to be combined with x86_seg_none. Moving PFEC inputs to the top level, while certainly possible, would involve a lot of churn. Plus I'm also hesitant to further grow the hooks' numbers of parameters. IOW introducing new shstk_{read,write}() hooks would look somewhat preferable to me, at least for the moment, if we don't want to have a x86_seg_{shstk,user} flags that can be OR-ed into the other x86_seg_*. Jan
© 2016 - 2024 Red Hat, Inc.