[Xen-devel] [PATCH v2 0/2] x86/boot: cleanup

Jan Beulich posted 2 patches 4 years, 8 months ago
Only 1 patches received!
[Xen-devel] [PATCH v2 0/2] x86/boot: cleanup
Posted by Jan Beulich 4 years, 8 months ago
This is my slight adjustment of the original patch 4 in Andrew's
series, plus a (new) prereq adjustment. I think the result is
cleaner overall.

1: x86: define a few selector values
2: x86/desc: Build boot_{,compat_}gdt[] in C

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 0/2] x86/boot: cleanup
Posted by Jan Beulich 4 years, 8 months ago
On 09.08.2019 12:34, Jan Beulich wrote:
> This is my slight adjustment of the original patch 4 in Andrew's
> series, plus a (new) prereq adjustment. I think the result is
> cleaner overall.
> 
> 1: x86: define a few selector values
> 2: x86/desc: Build boot_{,compat_}gdt[] in C

And I realize I should have attached the patches here once again.

Jan
x86: define a few selector values

TSS, LDT, and per-CPU entries all can benefit a little from also having
their selector values defined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -761,7 +761,7 @@ void load_system_tables(void)
 	per_cpu(full_gdt_loaded, cpu) = false;
 	lgdt(&gdtr);
 	lidt(&idtr);
-	ltr(TSS_ENTRY << 3);
+	ltr(TSS_SELECTOR);
 	lldt(0);
 
 	enable_each_ist(idt_tables[cpu]);
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1572,7 +1572,7 @@ bool svm_load_segs(unsigned int ldt_ents
 
         _set_tssldt_desc(desc, ldt_base, ldt_ents * 8 - 1, SYS_DESC_ldt);
 
-        vmcb->ldtr.sel = LDT_ENTRY << 3;
+        vmcb->ldtr.sel = LDT_SELECTOR;
         vmcb->ldtr.attr = SYS_DESC_ldt | (_SEGMENT_P >> 8);
         vmcb->ldtr.limit = ldt_ents * 8 - 1;
         vmcb->ldtr.base = ldt_base;
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1128,7 +1128,7 @@ static int construct_vmcs(struct vcpu *v
     __vmwrite(HOST_GS_SELECTOR, 0);
     __vmwrite(HOST_FS_BASE, 0);
     __vmwrite(HOST_GS_BASE, 0);
-    __vmwrite(HOST_TR_SELECTOR, TSS_ENTRY << 3);
+    __vmwrite(HOST_TR_SELECTOR, TSS_SELECTOR);
 
     /* Host control registers. */
     v->arch.hvm.vmx.host_cr0 = read_cr0() & ~X86_CR0_TS;
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1917,7 +1917,7 @@ void load_TR(void)
     /* Switch to non-compat GDT (which has B bit clear) to execute LTR. */
     asm volatile (
         "sgdt %0; lgdt %2; ltr %w1; lgdt %0"
-        : "=m" (old_gdt) : "rm" (TSS_ENTRY << 3), "m" (tss_gdt) : "memory" );
+        : "=m" (old_gdt) : "rm" (TSS_SELECTOR), "m" (tss_gdt) : "memory" );
 }
 
 static unsigned int calc_ler_msr(void)
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -251,7 +251,7 @@ void do_double_fault(struct cpu_user_reg
 
     console_force_unlock();
 
-    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_GDT_ENTRY << 3) );
+    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_GDT_SELECTOR) );
 
     /* Find information saved during fault and dump it to the console. */
     printk("*** DOUBLE FAULT ***\n");
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -36,6 +36,10 @@
 #define LDT_ENTRY (TSS_ENTRY + 2)
 #define PER_CPU_GDT_ENTRY (LDT_ENTRY + 2)
 
+#define TSS_SELECTOR         (TSS_ENTRY << 3)
+#define LDT_SELECTOR         (LDT_ENTRY << 3)
+#define PER_CPU_GDT_SELECTOR (PER_CPU_GDT_ENTRY << 3)
+
 #ifndef __ASSEMBLY__
 
 #define GUEST_KERNEL_RPL(d) (is_pv_32bit_domain(d) ? 1 : 3)
--- a/xen/include/asm-x86/ldt.h
+++ b/xen/include/asm-x86/ldt.h
@@ -16,7 +16,7 @@ static inline void load_LDT(struct vcpu
         desc = (!is_pv_32bit_vcpu(v) ? this_cpu(gdt) : this_cpu(compat_gdt))
                + LDT_ENTRY - FIRST_RESERVED_GDT_ENTRY;
         _set_tssldt_desc(desc, LDT_VIRT_START(v), ents*8-1, SYS_DESC_ldt);
-        lldt(LDT_ENTRY << 3);
+        lldt(LDT_SELECTOR);
     }
 }
 
x86/desc: Build boot_{,compat_}gdt[] in C

... where we can at least get the compiler to fill in the surrounding space
without having to do it manually.  This also results in the symbols having
proper type/size information in the debug symbols.

Reorder 'raw' in the seg_desc_t union to allow for easier initialisation.

Leave a comment explaining the various restrictions we have on altering the
GDT layout.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Introduce SEL2GDT(). Correct GDT indices in public header.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Address my own v1 review comments. Re-base.
---
There is plenty more cleanup which can be done in the future.  As we are
64-bit, there is no need for load_TR() to keep the TSS in sync between the two
GDTs, which means it can drop all sgdt/lgdt instructions.  Also,
__HYPERVISOR_CS32 is unused and could be dropped.

As is demonstrated by the required includes, asm/desc.h is a tangle in need of
some cleanup.

The SYSEXIT GDT requirements are:
  R0 long code, R0 data, R3 compat code, R3 data, R3 long code, R3 data.

We could make Xen compatible by shifting the R0 code/data down by one slot,
moving R0 compat code elsewhere and replacing it with another R3 data.

However, this seems like a fairly pointless exercise, as the only thing it
will do is change the magic numbers which developers have become accustom to
seeing in backtraces.

--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_PV) += compat.o x86_64/comp
 obj-$(CONFIG_KEXEC) += crash.o
 obj-y += debug.o
 obj-y += delay.o
+obj-y += desc.o
 obj-bin-y += dmi_scan.init.o
 obj-y += domctl.o
 obj-y += domain.o
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -2,7 +2,6 @@
 #include <xen/multiboot2.h>
 #include <public/xen.h>
 #include <asm/asm_defns.h>
-#include <asm/desc.h>
 #include <asm/fixmap.h>
 #include <asm/page.h>
 #include <asm/processor.h>
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -43,44 +43,11 @@ ENTRY(__high_start)
 multiboot_ptr:
         .long   0
 
-        .word   0
-GLOBAL(boot_gdtr)
-        .word   LAST_RESERVED_GDT_BYTE
-        .quad   boot_gdt - FIRST_RESERVED_GDT_BYTE
-
 GLOBAL(stack_start)
         .quad   cpu0_stack
 
         .section .data.page_aligned, "aw", @progbits
         .align PAGE_SIZE, 0
-GLOBAL(boot_gdt)
-        .quad 0x0000000000000000     /* unused */
-        .quad 0x00af9b000000ffff     /* 0xe008 ring 0 code, 64-bit mode   */
-        .quad 0x00cf93000000ffff     /* 0xe010 ring 0 data                */
-        .quad 0x0000000000000000     /* reserved                          */
-        .quad 0x00cffb000000ffff     /* 0xe023 ring 3 code, compatibility */
-        .quad 0x00cff3000000ffff     /* 0xe02b ring 3 data                */
-        .quad 0x00affb000000ffff     /* 0xe033 ring 3 code, 64-bit mode   */
-        .quad 0x00cf9b000000ffff     /* 0xe038 ring 0 code, compatibility */
-        .fill (PER_CPU_GDT_ENTRY - __HYPERVISOR_CS32 / 8 - 1), 8, 0
-        .quad 0x0000910000000000     /* per-CPU entry (limit == cpu)      */
-
-        .align PAGE_SIZE, 0
-/* NB. Even rings != 0 get access to the full 4Gb, as only the            */
-/*     (compatibility) machine->physical mapping table lives there.       */
-GLOBAL(boot_compat_gdt)
-        .quad 0x0000000000000000     /* unused */
-        .quad 0x00af9b000000ffff     /* 0xe008 ring 0 code, 64-bit mode   */
-        .quad 0x00cf93000000ffff     /* 0xe010 ring 0 data                */
-        .quad 0x00cfbb000000ffff     /* 0xe019 ring 1 code, compatibility */
-        .quad 0x00cfb3000000ffff     /* 0xe021 ring 1 data                */
-        .quad 0x00cffb000000ffff     /* 0xe02b ring 3 code, compatibility */
-        .quad 0x00cff3000000ffff     /* 0xe033 ring 3 data                */
-        .quad 0x00cf9b000000ffff     /* 0xe038 ring 0 code, compatibility */
-        .fill (PER_CPU_GDT_ENTRY - __HYPERVISOR_CS32 / 8 - 1), 8, 0
-        .quad 0x0000910000000000     /* per-CPU entry (limit == cpu)      */
-        .align PAGE_SIZE, 0
-
 /*
  * Mapping of first 2 megabytes of memory. This is mapped with 4kB mappings
  * to avoid type conflicts with fixed-range MTRRs covering the lowest megabyte
--- /dev/null
+++ b/xen/arch/x86/desc.c
@@ -0,0 +1,109 @@
+#include <xen/types.h>
+#include <xen/lib.h>
+#include <xen/percpu.h>
+#include <xen/mm.h>
+
+#include <asm/desc.h>
+
+/*
+ * Native and Compat GDTs used by Xen.
+ *
+ * The R1 and R3 descriptors are fixed in Xen's ABI for PV guests.  All other
+ * descriptors are in principle variable, with the following restrictions.
+ *
+ * All R0 descriptors must line up in both GDTs to allow for correct
+ * interrupt/exception handling.
+ *
+ * The SYSCALL/SYSRET GDT layout requires:
+ *  - R0 long mode code followed by R0 data.
+ *  - R3 compat code, followed by R3 data, followed by R3 long mode code.
+ *
+ * The SYSENTER GDT layout requirements are compatible with SYSCALL.  Xen does
+ * not use the SYSEXIT instruction, and does not provide a compatible GDT.
+ *
+ * These tables are used directly by CPU0, and used as the template for the
+ * GDTs of other CPUs.  Everything from the TSS onwards is unique per CPU.
+ */
+
+#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
+
+__section(".data.page_aligned") __aligned(PAGE_SIZE)
+seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
+{
+    /* 0xe008 - Ring 0 code, 64bit mode */
+    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
+
+    /* 0xe010 - Ring 0 data */
+    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
+
+    /* 0xe018 - reserved */
+
+    /* 0xe023 - Ring 3 code, compatibility */
+    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
+
+    /* 0xe02b - Ring 3 data */
+    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
+
+    /* 0xe033 - Ring 3 code, 64-bit mode */
+    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
+
+    /* 0xe038 - Ring 0 code, compatibility */
+    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
+
+    /* 0xe040 - TSS */
+    /* 0xe050 - LDT */
+
+    /* 0xe060 - per-CPU entry (limit == cpu) */
+    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
+};
+
+__section(".data.page_aligned") __aligned(PAGE_SIZE)
+seg_desc_t boot_compat_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
+{
+    /* 0xe008 - Ring 0 code, 64bit mode */
+    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
+
+    /* 0xe010 - Ring 0 data */
+    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
+
+    /* 0xe019 - Ring 1 code, compatibility */
+    [SEL2GDT(FLAT_COMPAT_RING1_CS)] = { 0x00cfbb000000ffff },
+
+    /* 0xe021 - Ring 1 data */
+    [SEL2GDT(FLAT_COMPAT_RING1_DS)] = { 0x00cfb3000000ffff },
+
+    /* 0xe02b - Ring 3 code, compatibility */
+    [SEL2GDT(FLAT_COMPAT_RING3_CS)] = { 0x00cffb000000ffff },
+
+    /* 0xe033 - Ring 3 data */
+    [SEL2GDT(FLAT_COMPAT_RING3_DS)] = { 0x00cff3000000ffff },
+
+    /* 0xe038 - Ring 0 code, compatibility */
+    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
+
+    /* 0xe040 - TSS */
+    /* 0xe050 - LDT */
+
+    /* 0xe060 - per-CPU entry (limit == cpu) */
+    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
+};
+
+/*
+ * Used by each CPU as it starts up, to enter C with a suitable %cs.
+ * References boot_cpu_gdt_table for a short period, until the CPUs switch
+ * onto their per-CPU GDTs.
+ */
+struct desc_ptr boot_gdtr = {
+    .limit = LAST_RESERVED_GDT_BYTE,
+    .base = (unsigned long)(boot_gdt - FIRST_RESERVED_GDT_ENTRY),
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -107,10 +107,10 @@
 #define SYS_DESC_trap_gate    15
 
 typedef union {
+    uint64_t raw;
     struct {
         uint32_t a, b;
     };
-    uint64_t raw;
 } seg_desc_t;
 
 typedef union {
--- a/xen/include/public/arch-x86/xen-x86_64.h
+++ b/xen/include/public/arch-x86/xen-x86_64.h
@@ -44,11 +44,11 @@
  */
 
 #define FLAT_RING3_CS32 0xe023  /* GDT index 260 */
-#define FLAT_RING3_CS64 0xe033  /* GDT index 261 */
-#define FLAT_RING3_DS32 0xe02b  /* GDT index 262 */
+#define FLAT_RING3_DS32 0xe02b  /* GDT index 261 */
+#define FLAT_RING3_CS64 0xe033  /* GDT index 262 */
 #define FLAT_RING3_DS64 0x0000  /* NULL selector */
-#define FLAT_RING3_SS32 0xe02b  /* GDT index 262 */
-#define FLAT_RING3_SS64 0xe02b  /* GDT index 262 */
+#define FLAT_RING3_SS32 0xe02b  /* GDT index 261 */
+#define FLAT_RING3_SS64 0xe02b  /* GDT index 261 */
 
 #define FLAT_KERNEL_DS64 FLAT_RING3_DS64
 #define FLAT_KERNEL_DS32 FLAT_RING3_DS32
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 1/2] x86: define a few selector values
Posted by Jan Beulich 4 years, 8 months ago
TSS, LDT, and per-CPU entries all can benefit a little from also having
their selector values defined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -761,7 +761,7 @@ void load_system_tables(void)
  	per_cpu(full_gdt_loaded, cpu) = false;
  	lgdt(&gdtr);
  	lidt(&idtr);
-	ltr(TSS_ENTRY << 3);
+	ltr(TSS_SELECTOR);
  	lldt(0);
  
  	enable_each_ist(idt_tables[cpu]);
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1572,7 +1572,7 @@ bool svm_load_segs(unsigned int ldt_ents
  
          _set_tssldt_desc(desc, ldt_base, ldt_ents * 8 - 1, SYS_DESC_ldt);
  
-        vmcb->ldtr.sel = LDT_ENTRY << 3;
+        vmcb->ldtr.sel = LDT_SELECTOR;
          vmcb->ldtr.attr = SYS_DESC_ldt | (_SEGMENT_P >> 8);
          vmcb->ldtr.limit = ldt_ents * 8 - 1;
          vmcb->ldtr.base = ldt_base;
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1128,7 +1128,7 @@ static int construct_vmcs(struct vcpu *v
      __vmwrite(HOST_GS_SELECTOR, 0);
      __vmwrite(HOST_FS_BASE, 0);
      __vmwrite(HOST_GS_BASE, 0);
-    __vmwrite(HOST_TR_SELECTOR, TSS_ENTRY << 3);
+    __vmwrite(HOST_TR_SELECTOR, TSS_SELECTOR);
  
      /* Host control registers. */
      v->arch.hvm.vmx.host_cr0 = read_cr0() & ~X86_CR0_TS;
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1917,7 +1917,7 @@ void load_TR(void)
      /* Switch to non-compat GDT (which has B bit clear) to execute LTR. */
      asm volatile (
          "sgdt %0; lgdt %2; ltr %w1; lgdt %0"
-        : "=m" (old_gdt) : "rm" (TSS_ENTRY << 3), "m" (tss_gdt) : "memory" );
+        : "=m" (old_gdt) : "rm" (TSS_SELECTOR), "m" (tss_gdt) : "memory" );
  }
  
  static unsigned int calc_ler_msr(void)
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -251,7 +251,7 @@ void do_double_fault(struct cpu_user_reg
  
      console_force_unlock();
  
-    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_GDT_ENTRY << 3) );
+    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_GDT_SELECTOR) );
  
      /* Find information saved during fault and dump it to the console. */
      printk("*** DOUBLE FAULT ***\n");
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -36,6 +36,10 @@
  #define LDT_ENTRY (TSS_ENTRY + 2)
  #define PER_CPU_GDT_ENTRY (LDT_ENTRY + 2)
  
+#define TSS_SELECTOR         (TSS_ENTRY << 3)
+#define LDT_SELECTOR         (LDT_ENTRY << 3)
+#define PER_CPU_GDT_SELECTOR (PER_CPU_GDT_ENTRY << 3)
+
  #ifndef __ASSEMBLY__
  
  #define GUEST_KERNEL_RPL(d) (is_pv_32bit_domain(d) ? 1 : 3)
--- a/xen/include/asm-x86/ldt.h
+++ b/xen/include/asm-x86/ldt.h
@@ -16,7 +16,7 @@ static inline void load_LDT(struct vcpu
          desc = (!is_pv_32bit_vcpu(v) ? this_cpu(gdt) : this_cpu(compat_gdt))
                 + LDT_ENTRY - FIRST_RESERVED_GDT_ENTRY;
          _set_tssldt_desc(desc, LDT_VIRT_START(v), ents*8-1, SYS_DESC_ldt);
-        lldt(LDT_ENTRY << 3);
+        lldt(LDT_SELECTOR);
      }
  }
  


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] x86: define a few selector values
Posted by Andrew Cooper 4 years, 8 months ago
On 09/08/2019 11:38, Jan Beulich wrote:
> --- a/xen/include/asm-x86/desc.h
> +++ b/xen/include/asm-x86/desc.h
> @@ -36,6 +36,10 @@
>  #define LDT_ENTRY (TSS_ENTRY + 2)
>  #define PER_CPU_GDT_ENTRY (LDT_ENTRY + 2)
>  
> +#define TSS_SELECTOR         (TSS_ENTRY << 3)
> +#define LDT_SELECTOR         (LDT_ENTRY << 3)
> +#define PER_CPU_GDT_SELECTOR (PER_CPU_GDT_ENTRY << 3)

Thinking about it, now would be an excellent time to remove the GDT
infix from PER_CPU_GDT_{ENTRY,SELECTOR}.

Looking at the resulting diff, there are only 3 extra hunks on top of
this patch to perform the rename.

Preferably with this done, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] x86: define a few selector values
Posted by Jan Beulich 4 years, 8 months ago
On 09.08.2019 13:50, Andrew Cooper wrote:
> On 09/08/2019 11:38, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/desc.h
>> +++ b/xen/include/asm-x86/desc.h
>> @@ -36,6 +36,10 @@
>>   #define LDT_ENTRY (TSS_ENTRY + 2)
>>   #define PER_CPU_GDT_ENTRY (LDT_ENTRY + 2)
>>   
>> +#define TSS_SELECTOR         (TSS_ENTRY << 3)
>> +#define LDT_SELECTOR         (LDT_ENTRY << 3)
>> +#define PER_CPU_GDT_SELECTOR (PER_CPU_GDT_ENTRY << 3)
> 
> Thinking about it, now would be an excellent time to remove the GDT
> infix from PER_CPU_GDT_{ENTRY,SELECTOR}.
> 
> Looking at the resulting diff, there are only 3 extra hunks on top of
> this patch to perform the rename.
> 
> Preferably with this done, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>

I'm okay with dropping it from the new selector constant,
since "selector" can't really be mistaken. For "entry" though
I think this isn't clear enough without "GDT". (This is less
for a problem with TSS_ENTRY and LDT_ENTRY, as their prefixes
make clear these are GDT entities.)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] x86: define a few selector values
Posted by Tian, Kevin 4 years, 8 months ago
> From: Jan Beulich [mailto:jbeulich@suse.com]
> Sent: Friday, August 9, 2019 6:39 PM
> 
> TSS, LDT, and per-CPU entries all can benefit a little from also having
> their selector values defined.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Jan Beulich 4 years, 8 months ago
From: Andrew Cooper <andrew.cooper3@citrix.com>

... where we can at least get the compiler to fill in the surrounding space
without having to do it manually.  This also results in the symbols having
proper type/size information in the debug symbols.

Reorder 'raw' in the seg_desc_t union to allow for easier initialisation.

Leave a comment explaining the various restrictions we have on altering the
GDT layout.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Introduce SEL2GDT(). Correct GDT indices in public header.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Address my own v1 review comments. Re-base.
---
There is plenty more cleanup which can be done in the future.  As we are
64-bit, there is no need for load_TR() to keep the TSS in sync between the two
GDTs, which means it can drop all sgdt/lgdt instructions.  Also,
__HYPERVISOR_CS32 is unused and could be dropped.

As is demonstrated by the required includes, asm/desc.h is a tangle in need of
some cleanup.

The SYSEXIT GDT requirements are:
   R0 long code, R0 data, R3 compat code, R3 data, R3 long code, R3 data.

We could make Xen compatible by shifting the R0 code/data down by one slot,
moving R0 compat code elsewhere and replacing it with another R3 data.

However, this seems like a fairly pointless exercise, as the only thing it
will do is change the magic numbers which developers have become accustom to
seeing in backtraces.

--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_PV) += compat.o x86_64/comp
  obj-$(CONFIG_KEXEC) += crash.o
  obj-y += debug.o
  obj-y += delay.o
+obj-y += desc.o
  obj-bin-y += dmi_scan.init.o
  obj-y += domctl.o
  obj-y += domain.o
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -2,7 +2,6 @@
  #include <xen/multiboot2.h>
  #include <public/xen.h>
  #include <asm/asm_defns.h>
-#include <asm/desc.h>
  #include <asm/fixmap.h>
  #include <asm/page.h>
  #include <asm/processor.h>
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -43,44 +43,11 @@ ENTRY(__high_start)
  multiboot_ptr:
          .long   0
  
-        .word   0
-GLOBAL(boot_gdtr)
-        .word   LAST_RESERVED_GDT_BYTE
-        .quad   boot_gdt - FIRST_RESERVED_GDT_BYTE
-
  GLOBAL(stack_start)
          .quad   cpu0_stack
  
          .section .data.page_aligned, "aw", @progbits
          .align PAGE_SIZE, 0
-GLOBAL(boot_gdt)
-        .quad 0x0000000000000000     /* unused */
-        .quad 0x00af9b000000ffff     /* 0xe008 ring 0 code, 64-bit mode   */
-        .quad 0x00cf93000000ffff     /* 0xe010 ring 0 data                */
-        .quad 0x0000000000000000     /* reserved                          */
-        .quad 0x00cffb000000ffff     /* 0xe023 ring 3 code, compatibility */
-        .quad 0x00cff3000000ffff     /* 0xe02b ring 3 data                */
-        .quad 0x00affb000000ffff     /* 0xe033 ring 3 code, 64-bit mode   */
-        .quad 0x00cf9b000000ffff     /* 0xe038 ring 0 code, compatibility */
-        .fill (PER_CPU_GDT_ENTRY - __HYPERVISOR_CS32 / 8 - 1), 8, 0
-        .quad 0x0000910000000000     /* per-CPU entry (limit == cpu)      */
-
-        .align PAGE_SIZE, 0
-/* NB. Even rings != 0 get access to the full 4Gb, as only the            */
-/*     (compatibility) machine->physical mapping table lives there.       */
-GLOBAL(boot_compat_gdt)
-        .quad 0x0000000000000000     /* unused */
-        .quad 0x00af9b000000ffff     /* 0xe008 ring 0 code, 64-bit mode   */
-        .quad 0x00cf93000000ffff     /* 0xe010 ring 0 data                */
-        .quad 0x00cfbb000000ffff     /* 0xe019 ring 1 code, compatibility */
-        .quad 0x00cfb3000000ffff     /* 0xe021 ring 1 data                */
-        .quad 0x00cffb000000ffff     /* 0xe02b ring 3 code, compatibility */
-        .quad 0x00cff3000000ffff     /* 0xe033 ring 3 data                */
-        .quad 0x00cf9b000000ffff     /* 0xe038 ring 0 code, compatibility */
-        .fill (PER_CPU_GDT_ENTRY - __HYPERVISOR_CS32 / 8 - 1), 8, 0
-        .quad 0x0000910000000000     /* per-CPU entry (limit == cpu)      */
-        .align PAGE_SIZE, 0
-
  /*
   * Mapping of first 2 megabytes of memory. This is mapped with 4kB mappings
   * to avoid type conflicts with fixed-range MTRRs covering the lowest megabyte
--- /dev/null
+++ b/xen/arch/x86/desc.c
@@ -0,0 +1,109 @@
+#include <xen/types.h>
+#include <xen/lib.h>
+#include <xen/percpu.h>
+#include <xen/mm.h>
+
+#include <asm/desc.h>
+
+/*
+ * Native and Compat GDTs used by Xen.
+ *
+ * The R1 and R3 descriptors are fixed in Xen's ABI for PV guests.  All other
+ * descriptors are in principle variable, with the following restrictions.
+ *
+ * All R0 descriptors must line up in both GDTs to allow for correct
+ * interrupt/exception handling.
+ *
+ * The SYSCALL/SYSRET GDT layout requires:
+ *  - R0 long mode code followed by R0 data.
+ *  - R3 compat code, followed by R3 data, followed by R3 long mode code.
+ *
+ * The SYSENTER GDT layout requirements are compatible with SYSCALL.  Xen does
+ * not use the SYSEXIT instruction, and does not provide a compatible GDT.
+ *
+ * These tables are used directly by CPU0, and used as the template for the
+ * GDTs of other CPUs.  Everything from the TSS onwards is unique per CPU.
+ */
+
+#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
+
+__section(".data.page_aligned") __aligned(PAGE_SIZE)
+seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
+{
+    /* 0xe008 - Ring 0 code, 64bit mode */
+    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
+
+    /* 0xe010 - Ring 0 data */
+    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
+
+    /* 0xe018 - reserved */
+
+    /* 0xe023 - Ring 3 code, compatibility */
+    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
+
+    /* 0xe02b - Ring 3 data */
+    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
+
+    /* 0xe033 - Ring 3 code, 64-bit mode */
+    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
+
+    /* 0xe038 - Ring 0 code, compatibility */
+    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
+
+    /* 0xe040 - TSS */
+    /* 0xe050 - LDT */
+
+    /* 0xe060 - per-CPU entry (limit == cpu) */
+    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
+};
+
+__section(".data.page_aligned") __aligned(PAGE_SIZE)
+seg_desc_t boot_compat_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
+{
+    /* 0xe008 - Ring 0 code, 64bit mode */
+    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
+
+    /* 0xe010 - Ring 0 data */
+    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
+
+    /* 0xe019 - Ring 1 code, compatibility */
+    [SEL2GDT(FLAT_COMPAT_RING1_CS)] = { 0x00cfbb000000ffff },
+
+    /* 0xe021 - Ring 1 data */
+    [SEL2GDT(FLAT_COMPAT_RING1_DS)] = { 0x00cfb3000000ffff },
+
+    /* 0xe02b - Ring 3 code, compatibility */
+    [SEL2GDT(FLAT_COMPAT_RING3_CS)] = { 0x00cffb000000ffff },
+
+    /* 0xe033 - Ring 3 data */
+    [SEL2GDT(FLAT_COMPAT_RING3_DS)] = { 0x00cff3000000ffff },
+
+    /* 0xe038 - Ring 0 code, compatibility */
+    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
+
+    /* 0xe040 - TSS */
+    /* 0xe050 - LDT */
+
+    /* 0xe060 - per-CPU entry (limit == cpu) */
+    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
+};
+
+/*
+ * Used by each CPU as it starts up, to enter C with a suitable %cs.
+ * References boot_cpu_gdt_table for a short period, until the CPUs switch
+ * onto their per-CPU GDTs.
+ */
+struct desc_ptr boot_gdtr = {
+    .limit = LAST_RESERVED_GDT_BYTE,
+    .base = (unsigned long)(boot_gdt - FIRST_RESERVED_GDT_ENTRY),
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -107,10 +107,10 @@
  #define SYS_DESC_trap_gate    15
  
  typedef union {
+    uint64_t raw;
      struct {
          uint32_t a, b;
      };
-    uint64_t raw;
  } seg_desc_t;
  
  typedef union {
--- a/xen/include/public/arch-x86/xen-x86_64.h
+++ b/xen/include/public/arch-x86/xen-x86_64.h
@@ -44,11 +44,11 @@
   */
  
  #define FLAT_RING3_CS32 0xe023  /* GDT index 260 */
-#define FLAT_RING3_CS64 0xe033  /* GDT index 261 */
-#define FLAT_RING3_DS32 0xe02b  /* GDT index 262 */
+#define FLAT_RING3_DS32 0xe02b  /* GDT index 261 */
+#define FLAT_RING3_CS64 0xe033  /* GDT index 262 */
  #define FLAT_RING3_DS64 0x0000  /* NULL selector */
-#define FLAT_RING3_SS32 0xe02b  /* GDT index 262 */
-#define FLAT_RING3_SS64 0xe02b  /* GDT index 262 */
+#define FLAT_RING3_SS32 0xe02b  /* GDT index 261 */
+#define FLAT_RING3_SS64 0xe02b  /* GDT index 261 */
  
  #define FLAT_KERNEL_DS64 FLAT_RING3_DS64
  #define FLAT_KERNEL_DS32 FLAT_RING3_DS32


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Andrew Cooper 4 years, 8 months ago
On 09/08/2019 11:40, Jan Beulich wrote:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Introduce SEL2GDT(). Correct GDT indices in public header.

"Correct" here is ambiguous because it implies there is a breakage.

You appear to have reversed FLAT_RING3_CS64 and DS32 (retaining the
original comments) and changed the comments for FLAT_RING3_SS{32,64}.

Except that now they are out of their logical order (CS 32 then 64, DS
32 then 64, SS 32 then 64).

What is the reasoning for the new order?  It isn't sorted by index.

>
> --- /dev/null
> +++ b/xen/arch/x86/desc.c
> @@ -0,0 +1,109 @@
> +
> +#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
> +
> +__section(".data.page_aligned") __aligned(PAGE_SIZE)
> +seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
> +{
> +    /* 0xe008 - Ring 0 code, 64bit mode */
> +    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
> +
> +    /* 0xe010 - Ring 0 data */
> +    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
> +
> +    /* 0xe018 - reserved */
> +
> +    /* 0xe023 - Ring 3 code, compatibility */
> +    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
> +
> +    /* 0xe02b - Ring 3 data */
> +    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
> +
> +    /* 0xe033 - Ring 3 code, 64-bit mode */
> +    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
> +
> +    /* 0xe038 - Ring 0 code, compatibility */
> +    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
> +
> +    /* 0xe040 - TSS */
> +    /* 0xe050 - LDT */
> +
> +    /* 0xe060 - per-CPU entry (limit == cpu) */
> +    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },

It would be better if the = { } were vertically aligned.  It makes
reading them easier.

Also, now that SEL2GDT() is present, we need a BUILD_BUG_ON() to check
that the size doesn't vary from one page.  At the moment, changing a
selector to use 0xfxxx will cause this to grow beyond 1 page without any
compiler diagnostic.  I'd go with

static void __init __maybe_unused
build_assertions(void)                                                                                                                                                      

{
    BUILD_BUG_ON(sizeof(boot_gdt) != PAGE_SIZE);
    BUILD_BUG_ON(sizeof(boot_compat_gdt) != PAGE_SIZE);
}

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Jan Beulich 4 years, 8 months ago
On 09.08.2019 14:19, Andrew Cooper wrote:
> On 09/08/2019 11:40, Jan Beulich wrote:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> Introduce SEL2GDT(). Correct GDT indices in public header.
> 
> "Correct" here is ambiguous because it implies there is a breakage.
> 
> You appear to have reversed FLAT_RING3_CS64 and DS32 (retaining the
> original comments) and changed the comments for FLAT_RING3_SS{32,64}.

Well - the comments were what was wrong after all.

> Except that now they are out of their logical order (CS 32 then 64, DS
> 32 then 64, SS 32 then 64).
> 
> What is the reasoning for the new order?  It isn't sorted by index.

It is, as much as it looks to have been the original author's
intent. I.e. I didn't re-order things across the nul selector
between the two groups.I can pull FLAT_RING3_SS* up if that's
what you want. I'm also happy with any other order that you
may prefer - just let me know which one. All I care about is
for the comments to be in sync with the actual values.

>> --- /dev/null
>> +++ b/xen/arch/x86/desc.c
>> @@ -0,0 +1,109 @@
>> +
>> +#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
>> +
>> +__section(".data.page_aligned") __aligned(PAGE_SIZE)
>> +seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
>> +{
>> +    /* 0xe008 - Ring 0 code, 64bit mode */
>> +    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
>> +
>> +    /* 0xe010 - Ring 0 data */
>> +    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
>> +
>> +    /* 0xe018 - reserved */
>> +
>> +    /* 0xe023 - Ring 3 code, compatibility */
>> +    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
>> +
>> +    /* 0xe02b - Ring 3 data */
>> +    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
>> +
>> +    /* 0xe033 - Ring 3 code, 64-bit mode */
>> +    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
>> +
>> +    /* 0xe038 - Ring 0 code, compatibility */
>> +    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
>> +
>> +    /* 0xe040 - TSS */
>> +    /* 0xe050 - LDT */
>> +
>> +    /* 0xe060 - per-CPU entry (limit == cpu) */
>> +    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
> 
> It would be better if the = { } were vertically aligned.  It makes
> reading them easier.
> 
> Also, now that SEL2GDT() is present, we need a BUILD_BUG_ON() to check
> that the size doesn't vary from one page.  At the moment, changing a
> selector to use 0xfxxx will cause this to grow beyond 1 page without any
> compiler diagnostic.  I'd go with
> 
> static void __init __maybe_unused
> build_assertions(void)
> 
> {
>      BUILD_BUG_ON(sizeof(boot_gdt) != PAGE_SIZE);
>      BUILD_BUG_ON(sizeof(boot_compat_gdt) != PAGE_SIZE);
> }

Will do, albeit for the build assertions this isn't really the
right place imo, because this isn't the place where we depend
on them being just single pages. I'll put it there nevertheless,
but I'll add a comment for why they're there.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Andrew Cooper 4 years, 8 months ago
On 09/08/2019 13:43, Jan Beulich wrote:
> On 09.08.2019 14:19, Andrew Cooper wrote:
>> On 09/08/2019 11:40, Jan Beulich wrote:
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> Introduce SEL2GDT(). Correct GDT indices in public header.
>>
>> "Correct" here is ambiguous because it implies there is a breakage.
>>
>> You appear to have reversed FLAT_RING3_CS64 and DS32 (retaining the
>> original comments) and changed the comments for FLAT_RING3_SS{32,64}.
>
> Well - the comments were what was wrong after all.
>
>> Except that now they are out of their logical order (CS 32 then 64, DS
>> 32 then 64, SS 32 then 64).
>>
>> What is the reasoning for the new order?  It isn't sorted by index.
>
> It is, as much as it looks to have been the original author's
> intent. I.e. I didn't re-order things across the nul selector
> between the two groups.I can pull FLAT_RING3_SS* up if that's
> what you want. I'm also happy with any other order that you
> may prefer - just let me know which one. All I care about is
> for the comments to be in sync with the actual values.

I think it would be clearest to leave the order as is, and just fix the
comments.

>
>>> --- /dev/null
>>> +++ b/xen/arch/x86/desc.c
>>> @@ -0,0 +1,109 @@
>>> +
>>> +#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
>>> +
>>> +__section(".data.page_aligned") __aligned(PAGE_SIZE)
>>> +seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
>>> +{
>>> +    /* 0xe008 - Ring 0 code, 64bit mode */
>>> +    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
>>> +
>>> +    /* 0xe010 - Ring 0 data */
>>> +    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
>>> +
>>> +    /* 0xe018 - reserved */
>>> +
>>> +    /* 0xe023 - Ring 3 code, compatibility */
>>> +    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
>>> +
>>> +    /* 0xe02b - Ring 3 data */
>>> +    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
>>> +
>>> +    /* 0xe033 - Ring 3 code, 64-bit mode */
>>> +    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
>>> +
>>> +    /* 0xe038 - Ring 0 code, compatibility */
>>> +    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
>>> +
>>> +    /* 0xe040 - TSS */
>>> +    /* 0xe050 - LDT */
>>> +
>>> +    /* 0xe060 - per-CPU entry (limit == cpu) */
>>> +    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
>>
>> It would be better if the = { } were vertically aligned.  It makes
>> reading them easier.
>>
>> Also, now that SEL2GDT() is present, we need a BUILD_BUG_ON() to check
>> that the size doesn't vary from one page.  At the moment, changing a
>> selector to use 0xfxxx will cause this to grow beyond 1 page without any
>> compiler diagnostic.  I'd go with
>>
>> static void __init __maybe_unused
>> build_assertions(void)
>>
>> {
>>      BUILD_BUG_ON(sizeof(boot_gdt) != PAGE_SIZE);
>>      BUILD_BUG_ON(sizeof(boot_compat_gdt) != PAGE_SIZE);
>> }
>
> Will do, albeit for the build assertions this isn't really the
> right place imo, because this isn't the place where we depend
> on them being just single pages. I'll put it there nevertheless,
> but I'll add a comment for why they're there.

IMO this is the right place, because it is right beside where the array
is specifically defined to be [PAGE_SIZE / sizeof()].

What it is doing is working around what is arguably a compiler bug by
allowing foo[x] = { [x + 1] = ... } to work.

Anyway, with these assertions and the tweaked constant clenaup,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Jan Beulich 4 years, 8 months ago
On 09.08.2019 15:07, Andrew Cooper wrote:
> On 09/08/2019 13:43, Jan Beulich wrote:
>> On 09.08.2019 14:19, Andrew Cooper wrote:
>>> On 09/08/2019 11:40, Jan Beulich wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/x86/desc.c
>>>> @@ -0,0 +1,109 @@
>>>> +
>>>> +#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
>>>> +
>>>> +__section(".data.page_aligned") __aligned(PAGE_SIZE)
>>>> +seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
>>>> +{
>>>> +    /* 0xe008 - Ring 0 code, 64bit mode */
>>>> +    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
>>>> +
>>>> +    /* 0xe010 - Ring 0 data */
>>>> +    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
>>>> +
>>>> +    /* 0xe018 - reserved */
>>>> +
>>>> +    /* 0xe023 - Ring 3 code, compatibility */
>>>> +    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
>>>> +
>>>> +    /* 0xe02b - Ring 3 data */
>>>> +    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
>>>> +
>>>> +    /* 0xe033 - Ring 3 code, 64-bit mode */
>>>> +    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
>>>> +
>>>> +    /* 0xe038 - Ring 0 code, compatibility */
>>>> +    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
>>>> +
>>>> +    /* 0xe040 - TSS */
>>>> +    /* 0xe050 - LDT */
>>>> +
>>>> +    /* 0xe060 - per-CPU entry (limit == cpu) */
>>>> +    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
>>>
>>> It would be better if the = { } were vertically aligned.  It makes
>>> reading them easier.
>>>
>>> Also, now that SEL2GDT() is present, we need a BUILD_BUG_ON() to check
>>> that the size doesn't vary from one page.  At the moment, changing a
>>> selector to use 0xfxxx will cause this to grow beyond 1 page without any
>>> compiler diagnostic.  I'd go with
>>>
>>> static void __init __maybe_unused
>>> build_assertions(void)
>>>
>>> {
>>>       BUILD_BUG_ON(sizeof(boot_gdt) != PAGE_SIZE);
>>>       BUILD_BUG_ON(sizeof(boot_compat_gdt) != PAGE_SIZE);
>>> }
>>
>> Will do, albeit for the build assertions this isn't really the
>> right place imo, because this isn't the place where we depend
>> on them being just single pages. I'll put it there nevertheless,
>> but I'll add a comment for why they're there.
> 
> IMO this is the right place, because it is right beside where the array
> is specifically defined to be [PAGE_SIZE / sizeof()].

I was about to ask why we then need build_assertions() at all,
until I also saw ...

> What it is doing is working around what is arguably a compiler bug by
> allowing foo[x] = { [x + 1] = ... } to work.

... this. Which made me go check, and both gcc 4.3 and gcc 9.1
correctly complain "array index in initializer exceeds array
bounds".

> Anyway, with these assertions and the tweaked constant clenaup,
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks, but as per above I'm now irritated: With the explicit
specification of the array size, build_assertions() should either
be dropped again, or its comment be extended to cover why it's
needed _despite_ the specified array size (i.e. in which case
your example above would not cause a build failure).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Andrew Cooper 4 years, 8 months ago
On 09/08/2019 14:18, Jan Beulich wrote:
> On 09.08.2019 15:07, Andrew Cooper wrote:
>> On 09/08/2019 13:43, Jan Beulich wrote:
>>> On 09.08.2019 14:19, Andrew Cooper wrote:
>>>> On 09/08/2019 11:40, Jan Beulich wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/arch/x86/desc.c
>>>>> @@ -0,0 +1,109 @@
>>>>> +
>>>>> +#define SEL2GDT(sel) (((sel) >> 3) - FIRST_RESERVED_GDT_ENTRY)
>>>>> +
>>>>> +__section(".data.page_aligned") __aligned(PAGE_SIZE)
>>>>> +seg_desc_t boot_gdt[PAGE_SIZE / sizeof(seg_desc_t)] =
>>>>> +{
>>>>> +    /* 0xe008 - Ring 0 code, 64bit mode */
>>>>> +    [SEL2GDT(__HYPERVISOR_CS64)] = { 0x00af9b000000ffff },
>>>>> +
>>>>> +    /* 0xe010 - Ring 0 data */
>>>>> +    [SEL2GDT(__HYPERVISOR_DS32)] = { 0x00cf93000000ffff },
>>>>> +
>>>>> +    /* 0xe018 - reserved */
>>>>> +
>>>>> +    /* 0xe023 - Ring 3 code, compatibility */
>>>>> +    [SEL2GDT(FLAT_RING3_CS32)] = { 0x00cffb000000ffff },
>>>>> +
>>>>> +    /* 0xe02b - Ring 3 data */
>>>>> +    [SEL2GDT(FLAT_RING3_DS32)] = { 0x00cff3000000ffff },
>>>>> +
>>>>> +    /* 0xe033 - Ring 3 code, 64-bit mode */
>>>>> +    [SEL2GDT(FLAT_RING3_CS64)] = { 0x00affb000000ffff },
>>>>> +
>>>>> +    /* 0xe038 - Ring 0 code, compatibility */
>>>>> +    [SEL2GDT(__HYPERVISOR_CS32)] = { 0x00cf9b000000ffff },
>>>>> +
>>>>> +    /* 0xe040 - TSS */
>>>>> +    /* 0xe050 - LDT */
>>>>> +
>>>>> +    /* 0xe060 - per-CPU entry (limit == cpu) */
>>>>> +    [SEL2GDT(PER_CPU_GDT_SELECTOR)] = { 0x0000910000000000 },
>>>>
>>>> It would be better if the = { } were vertically aligned.  It makes
>>>> reading them easier.
>>>>
>>>> Also, now that SEL2GDT() is present, we need a BUILD_BUG_ON() to check
>>>> that the size doesn't vary from one page.  At the moment, changing a
>>>> selector to use 0xfxxx will cause this to grow beyond 1 page
>>>> without any
>>>> compiler diagnostic.  I'd go with
>>>>
>>>> static void __init __maybe_unused
>>>> build_assertions(void)
>>>>
>>>> {
>>>>       BUILD_BUG_ON(sizeof(boot_gdt) != PAGE_SIZE);
>>>>       BUILD_BUG_ON(sizeof(boot_compat_gdt) != PAGE_SIZE);
>>>> }
>>>
>>> Will do, albeit for the build assertions this isn't really the
>>> right place imo, because this isn't the place where we depend
>>> on them being just single pages. I'll put it there nevertheless,
>>> but I'll add a comment for why they're there.
>>
>> IMO this is the right place, because it is right beside where the array
>> is specifically defined to be [PAGE_SIZE / sizeof()].
>
> I was about to ask why we then need build_assertions() at all,
> until I also saw ...
>
>> What it is doing is working around what is arguably a compiler bug by
>> allowing foo[x] = { [x + 1] = ... } to work.
>
> ... this. Which made me go check, and both gcc 4.3 and gcc 9.1
> correctly complain "array index in initializer exceeds array
> bounds".
>
>> Anyway, with these assertions and the tweaked constant clenaup,
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Thanks, but as per above I'm now irritated: With the explicit
> specification of the array size, build_assertions() should either
> be dropped again, or its comment be extended to cover why it's
> needed _despite_ the specified array size (i.e. in which case
> your example above would not cause a build failure).

This comes down to a bug I encountered while doing the CPUID work, which
is specifically why we have cpuid.c's build assertions.

Given that we get failures from at least one compiler in CI, we can drop
the extra build assertions.

At some point which isn't now, I'll try to work out exactly what went
wrong in the CPUID case, but I can't reproduce it from memory.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Jan Beulich 4 years, 8 months ago
On 09.08.2019 12:40, Jan Beulich wrote:
> There is plenty more cleanup which can be done in the future.  As we are
> 64-bit, there is no need for load_TR() to keep the TSS in sync between the two
> GDTs, which means it can drop all sgdt/lgdt instructions.

I'm trying to figure what exactly you mean here. Are you suggesting
we run with a TSS selector loaded whose descriptor's busy bit is
clear? I agree this shouldn't cause issues in the 64-bit world, but
it would still not feel right. Question is why they've retained the
avail/busy distinction in the first place.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] x86/desc: Build boot_{, compat_}gdt[] in C
Posted by Andrew Cooper 4 years, 8 months ago
On 12/08/2019 08:32, Jan Beulich wrote:
> On 09.08.2019 12:40, Jan Beulich wrote:
>> There is plenty more cleanup which can be done in the future.  As we are
>> 64-bit, there is no need for load_TR() to keep the TSS in sync
>> between the two
>> GDTs, which means it can drop all sgdt/lgdt instructions.
>
> I'm trying to figure what exactly you mean here. Are you suggesting
> we run with a TSS selector loaded whose descriptor's busy bit is
> clear? I agree this shouldn't cause issues in the 64-bit world, but
> it would still not feel right.

At a minimum, all the sgdt/lgdt can disappear because we're (AFAICT)
always on the native per-cpu GDT at this point.  (If not, I'm sure we
can arrange to be.)

As for running without a valid GDT reference, the CPU will function
fine, and it is a defence-in-depth strategy against Meltdown, seeing as
an attacker can no longer do sgdt; str to locate the TSS and find RSP0.

> Question is why they've retained the avail/busy distinction in the
> first place.

Easier than making any changes.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel