[Xen-devel] [PATCH] x86/boot: Simplify %fs setup in trampoline_setup

Andrew Cooper posted 1 patch 4 years, 8 months ago
git fetch https://github.com/patchew-project/xen tags/patchew/20190812151032.9353-1-andrew.cooper3@citrix.com
Posted by Andrew Cooper 4 years, 8 months ago
mov/shr is easier to follow than shld, and doesn't have a merge dependency on
the previous value of %edx.  Shorten the rest of the code by streamlining the
comments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>

In addition to being clearer to follow, mov/shr is faster than shld to decode
and execute.  See https://godbolt.org/z/A5kvuC for the latency/throughput/port
analysis, the Intel Optimisation guide which classifies them as "Slow Int"
instructions, or the AMD Optimisation guide which specifically has a section
entitled "Alternatives to SHLD Instruction".
---
 xen/arch/x86/boot/head.S | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 782deac0d4..26b680521d 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -556,24 +556,17 @@ trampoline_setup:
         /*
          * Called on legacy BIOS and EFI platforms.
          *
-         * Initialize bits 0-15 of BOOT_FS segment descriptor base address.
+         * Set the BOOT_FS descriptor base address to %esi.
          */
-        mov     %si,BOOT_FS+2+sym_esi(trampoline_gdt)
-
-        /* Initialize bits 16-23 of BOOT_FS segment descriptor base address. */
-        shld    $16,%esi,%edx
-        mov     %dl,BOOT_FS+4+sym_esi(trampoline_gdt)
-
-        /* Initialize bits 24-31 of BOOT_FS segment descriptor base address. */
-        mov     %dh,BOOT_FS+7+sym_esi(trampoline_gdt)
-
-        /*
-         * Initialize %fs and later use it to access Xen data where possible.
-         * According to Intel 64 and IA-32 Architectures Software Developer's
-         * Manual it is safe to do that without reloading GDTR before.
-         */
-        mov     $BOOT_FS,%edx
-        mov     %edx,%fs
+        mov     %esi, %edx
+        shr     $16, %edx
+        mov     %si, BOOT_FS + 2 + sym_esi(trampoline_gdt) /* Bits  0-15 */
+        mov     %dl, BOOT_FS + 4 + sym_esi(trampoline_gdt) /* Bits 16-23 */
+        mov     %dh, BOOT_FS + 7 + sym_esi(trampoline_gdt) /* Bits 24-31 */
+
+        /* Load %fs to allow for access to Xen data. */
+        mov     $BOOT_FS, %edx
+        mov     %edx, %fs
 
         /* Save Xen image load base address for later use. */
         mov     %esi,sym_fs(xen_phys_start)
-- 
2.11.0
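
The byte offsets used in the hunk above (+2, +4, +7) follow the standard 8-byte x86 segment descriptor layout, where the 32-bit base is scattered across three fields. The following is a minimal C sketch of that scatter, not code from the Xen tree: the names `set_desc_base`, `get_desc_base` and `shld16` are hypothetical and exist only for illustration. It also models `shld $16,%esi,%edx` to show that its low 16 bits match the mov/shr result, while the upper bits depend on the old value of %edx (the merge dependency mentioned in the commit message).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a 32-bit base address into an 8-byte
 * segment descriptor, mirroring the new instruction sequence.
 */
static inline void set_desc_base(uint8_t desc[8], uint32_t base)
{
    uint32_t hi = base >> 16;               /* mov %esi,%edx ; shr $16,%edx */

    desc[2] = (uint8_t)(base & 0xff);       /* mov %si: bits  0-7  (byte 2) */
    desc[3] = (uint8_t)((base >> 8) & 0xff);/* mov %si: bits  8-15 (byte 3) */
    desc[4] = (uint8_t)(hi & 0xff);         /* mov %dl: bits 16-23 (byte 4) */
    desc[7] = (uint8_t)((hi >> 8) & 0xff);  /* mov %dh: bits 24-31 (byte 7) */
}

/* Hypothetical helper: reassemble the base from the descriptor bytes. */
static inline uint32_t get_desc_base(const uint8_t desc[8])
{
    return (uint32_t)desc[2] |
           ((uint32_t)desc[3] << 8) |
           ((uint32_t)desc[4] << 16) |
           ((uint32_t)desc[7] << 24);
}

/*
 * Model of "shld $16,%esi,%edx": shift dst left by 16 and fill the
 * vacated low bits with the top 16 bits of src.  The result depends
 * on the old dst, but its low 16 bits (%dl and %dh) do not.
 */
static inline uint32_t shld16(uint32_t dst, uint32_t src)
{
    return (dst << 16) | (src >> 16);
}
```

For a base of 0x12345678, this writes 0x78, 0x56, 0x34 and 0x12 to bytes 2, 3, 4 and 7 respectively and leaves the limit, access and flags bytes untouched.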


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/boot: Simplify %fs setup in trampoline_setup
Posted by Wei Liu 4 years, 8 months ago
On Mon, Aug 12, 2019 at 04:10:32PM +0100, Andrew Cooper wrote:
> mov/shr is easier to follow than shld, and doesn't have a merge dependency on
> the previous value of %edx.  Shorten the rest of the code by streamlining the
> comments.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Wei Liu <wl@xen.org>

Re: [Xen-devel] [PATCH] x86/boot: Simplify %fs setup in trampoline_setup
Posted by Jan Beulich 4 years, 7 months ago
On 12.08.2019 17:10, Andrew Cooper wrote:
> mov/shr is easier to follow than shld, and doesn't have a merge dependency on
> the previous value of %edx.  Shorten the rest of the code by streamlining the
> comments.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Wei Liu <wl@xen.org>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> 
> In addition to being clearer to follow, mov/shr is faster than shld to decode
> and execute.  See https://godbolt.org/z/A5kvuC for the latency/throughput/port
> analysis, the Intel Optimisation guide which classifies them as "Slow Int"
> instructions, or the AMD Optimisation guide which specifically has a section
> entitled "Alternatives to SHLD Instruction".

I don't really mind the change, but I don't think performance is a
concern here. Instead I think we want to size-optimize the trampoline
as much as possible, which is why (iirc) I had asked for the use of
SHLD here. Considering David's work to split boot and permanent
trampoline I'm fine with the minimal 1 byte increase though:

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
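
The 1 byte increase mentioned above can be checked by comparing instruction encodings. The sketch below is an illustration under stated assumptions, not assembler output from the Xen build: it lists one valid 32-bit encoding for each sequence (an assembler may pick an equivalent same-length form for the register-to-register mov), and the array names `old_seq` and `new_seq` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32-bit protected-mode encodings; AT&T syntax in comments. */
static const uint8_t old_seq[] = {
    0x0f, 0xa4, 0xf2, 0x10,     /* shld $16,%esi,%edx  (0F A4 /r ib) */
};

static const uint8_t new_seq[] = {
    0x89, 0xf2,                 /* mov  %esi,%edx      (89 /r)       */
    0xc1, 0xea, 0x10,           /* shr  $16,%edx       (C1 /5 ib)    */
};

/* Compile-time check (C11): the replacement is exactly one byte longer. */
_Static_assert(sizeof(new_seq) == sizeof(old_seq) + 1,
               "mov/shr should be one byte longer than shld");
```

The stores to the descriptor and the %fs load are unchanged by the patch, so they do not affect the size difference.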
