It turns out that using the higher level helpers adjacent like this leads to
terrible code generation. Due to -fno-strict-alising, the store into state->
invalidates the read_cr4() address calculation (which is really cpu_info->cr4
under the hood), meaning that it can't be hoisted.
As a result we get "locate the top of stack block, get cr4, and see if
FSGSBASE is set" repeated 3 times, and an unreasoanble number of basic blocks.
Hoist the calculation manually, which results in two basic blocks.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Side-by-side disassembly: https://termbin.com/9xfq
---
xen/arch/x86/traps.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0c5393cb2166..8c261d219c07 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -118,9 +118,18 @@ static void read_registers(struct extra_state *state)
state->cr3 = read_cr3();
state->cr4 = read_cr4();
- state->fsb = read_fs_base();
- state->gsb = read_gs_base();
- state->gss = read_gs_shadow();
+ if ( state->cr4 & X86_CR4_FSGSBASE )
+ {
+ state->fsb = __rdfsbase();
+ state->gsb = __rdgsbase();
+ state->gss = __rdgskern();
+ }
+ else
+ {
+ state->fsb = rdmsr(MSR_FS_BASE);
+ state->gsb = rdmsr(MSR_GS_BASE);
+ state->gss = rdmsr(MSR_SHADOW_GS_BASE);
+ }
asm ( "mov %%ds, %0" : "=m" (state->ds) );
asm ( "mov %%es, %0" : "=m" (state->es) );
--
2.39.5