From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Xen-devel <xen-devel@lists.xenproject.org>
Cc: Juergen Gross, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monné
Date: Thu, 4 Jul 2019 18:57:32 +0100
Message-ID: <20190704175732.5943-1-andrew.cooper3@citrix.com>
Subject: [Xen-devel] [PATCH] x86/ctxt-switch: Document and improve GDT handling

write_full_gdt_ptes() has a latent bug.  Using virt_to_mfn() and iterating
with (mfn + i) is wrong, because of PDX compression.  The context switch
path only functions correctly because NR_RESERVED_GDT_PAGES is 1.

As this is exceedingly unlikely to change moving forward, drop the loop
rather than inserting a BUILD_BUG_ON(NR_RESERVED_GDT_PAGES != 1).

With the loop dropped, write_full_gdt_ptes() becomes more obviously a poor
name, so rename it to update_xen_slot_in_full_gdt().

Furthermore, calling virt_to_mfn() in the context switch path is a lot of
wasted cycles for a result which is constant after boot.

Begin by documenting how Xen handles the GDTs across context switch.  From
this, we observe that load_full_gdt() is completely independent of the
current CPU, and load_default_gdt() only gets passed the current CPU's
regular GDT.

Add two extra per-cpu variables which cache the L1e for the regular and
compat GDT, calculated in cpu_smpboot_alloc()/trap_init() as appropriate,
so update_xen_slot_in_full_gdt() doesn't need to waste time performing the
same calculation on every context switch.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross
Tested-by: Juergen Gross
---
CC: Jan Beulich
CC: Wei Liu
CC: Roger Pau Monné
CC: Juergen Gross

Slightly RFC.  I'm fairly confident this is better, but Juergen says that
some of his scheduling perf tests notice large differences from subtle
changes in __context_switch(), so it would be useful to get some numbers
from this change.

The delta from this change is:

  add/remove: 2/0 grow/shrink: 1/1 up/down: 320/-127 (193)
  Function                           old     new   delta
  cpu_smpboot_callback              1152    1456    +304
  per_cpu__gdt_table_l1e               -       8      +8
  per_cpu__compat_gdt_table_l1e        -       8      +8
  __context_switch                  1238    1111    -127
  Total: Before=3339227, After=3339420, chg +0.01%

I'm not overly happy about the special case in trap_init(), but I can't
think of a better place to put this.

Also, it should now be very obvious to people that Xen's current GDT
handling for non-PV vcpus is a recipe for subtle bugs, if we ever manage to
execute a stray mov/pop %sreg instruction.
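As an aside, and purely for illustration (this is not part of the patch):
if the multi-page loop ever did need to stay, a PDX-safe variant would have
to translate each page individually rather than iterate (mfn + i).  A
hypothetical sketch, using only helpers already visible in the diff below:

  static void write_full_gdt_ptes(seg_desc_t *gdt, const struct vcpu *v)
  {
      l1_pgentry_t *pl1e = pv_gdt_ptes(v);
      unsigned int i;

      /*
       * Look up each frame's MFN separately: with PDX compression, the
       * MFNs backing virtually contiguous xenheap pages are not
       * guaranteed to be (mfn + i).
       */
      for ( i = 0; i < NR_RESERVED_GDT_PAGES; i++ )
          l1e_write(pl1e + FIRST_RESERVED_GDT_PAGE + i,
                    l1e_from_pfn(virt_to_mfn((void *)gdt + i * PAGE_SIZE),
                                 __PAGE_HYPERVISOR_RW));
  }

The patch instead drops the loop entirely, given NR_RESERVED_GDT_PAGES is 1.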
We really ought to have Xen's regular GDT in an area where slots 0-13 are
either mapped to the zero page, or not present, so we don't risk loading a
non-faulting garbage selector.
---
 xen/arch/x86/domain.c      | 52 ++++++++++++++++++++++++++++++--------------
 xen/arch/x86/smpboot.c     |  4 ++++
 xen/arch/x86/traps.c       | 10 +++++++++
 xen/include/asm-x86/desc.h |  2 ++
 4 files changed, 50 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 84cafbe558..147f96a09e 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1635,23 +1635,42 @@ static void _update_runstate_area(struct vcpu *v)
         v->arch.pv.need_update_runstate_area = 1;
 }
 
+/*
+ * Overview of Xen's GDTs.
+ *
+ * Xen maintains per-CPU compat and regular GDTs which are both a single page
+ * in size.  Some content is specific to each CPU (the TSS, the per-CPU marker
+ * for #DF handling, and optionally the LDT).  The compat and regular GDTs
+ * differ by the layout and content of the guest accessible selectors.
+ *
+ * The Xen selectors live from 0xe000 (slot 14 of 16), and need to always
+ * appear in this position for interrupt/exception handling to work.
+ *
+ * A PV guest may specify GDT frames of their own (slots 0 to 13).  Room for a
+ * full GDT exists in the per-domain mappings.
+ *
+ * To schedule a PV vcpu, we point slot 14 of the guest's full GDT at the
+ * current CPU's compat or regular (as appropriate) GDT frame.  This is so
+ * that the per-CPU parts still work correctly after switching pagetables and
+ * loading the guest's full GDT into GDTR.
+ *
+ * To schedule Idle or HVM vcpus, we load a GDT base address which causes the
+ * regular per-CPU GDT frame to appear with selectors at the appropriate
+ * offset.
+ */
 static inline bool need_full_gdt(const struct domain *d)
 {
     return is_pv_domain(d) && !is_idle_domain(d);
 }
 
-static void write_full_gdt_ptes(seg_desc_t *gdt, const struct vcpu *v)
+static void update_xen_slot_in_full_gdt(const struct vcpu *v, unsigned int cpu)
 {
-    unsigned long mfn = virt_to_mfn(gdt);
-    l1_pgentry_t *pl1e = pv_gdt_ptes(v);
-    unsigned int i;
-
-    for ( i = 0; i < NR_RESERVED_GDT_PAGES; i++ )
-        l1e_write(pl1e + FIRST_RESERVED_GDT_PAGE + i,
-                  l1e_from_pfn(mfn + i, __PAGE_HYPERVISOR_RW));
+    l1e_write(pv_gdt_ptes(v) + FIRST_RESERVED_GDT_PAGE,
+              !is_pv_32bit_vcpu(v) ? per_cpu(gdt_table_l1e, cpu)
+                                   : per_cpu(compat_gdt_table_l1e, cpu));
 }
 
-static void load_full_gdt(const struct vcpu *v, unsigned int cpu)
+static void load_full_gdt(const struct vcpu *v)
 {
     struct desc_ptr gdt_desc = {
         .limit = LAST_RESERVED_GDT_BYTE,
@@ -1661,11 +1680,12 @@ static void load_full_gdt(const struct vcpu *v, unsigned int cpu)
     lgdt(&gdt_desc);
 }
 
-static void load_default_gdt(const seg_desc_t *gdt, unsigned int cpu)
+static void load_default_gdt(unsigned int cpu)
 {
     struct desc_ptr gdt_desc = {
         .limit = LAST_RESERVED_GDT_BYTE,
-        .base  = (unsigned long)(gdt - FIRST_RESERVED_GDT_ENTRY),
+        .base  = (unsigned long)(per_cpu(gdt_table, cpu) -
+                                 FIRST_RESERVED_GDT_ENTRY),
     };
 
     lgdt(&gdt_desc);
@@ -1678,7 +1698,6 @@ static void __context_switch(void)
     struct vcpu          *p = per_cpu(curr_vcpu, cpu);
     struct vcpu          *n = current;
     struct domain        *pd = p->domain, *nd = n->domain;
-    seg_desc_t           *gdt;
 
     ASSERT(p != n);
     ASSERT(!vcpu_cpu_dirty(n));
@@ -1718,15 +1737,12 @@ static void __context_switch(void)
 
     psr_ctxt_switch_to(nd);
 
-    gdt = !is_pv_32bit_domain(nd) ? per_cpu(gdt_table, cpu) :
-                                    per_cpu(compat_gdt_table, cpu);
-
     if ( need_full_gdt(nd) )
-        write_full_gdt_ptes(gdt, n);
+        update_xen_slot_in_full_gdt(n, cpu);
 
     if ( need_full_gdt(pd) &&
          ((p->vcpu_id != n->vcpu_id) || !need_full_gdt(nd)) )
-        load_default_gdt(gdt, cpu);
+        load_default_gdt(cpu);
 
     write_ptbase(n);
 
@@ -1739,7 +1755,7 @@ static void __context_switch(void)
 
     if ( need_full_gdt(nd) &&
          ((p->vcpu_id != n->vcpu_id) || !need_full_gdt(pd)) )
-        load_full_gdt(n, cpu);
+        load_full_gdt(n);
 
     if ( pd != nd )
         cpumask_clear_cpu(cpu, pd->dirty_cpumask);
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 730fe141fa..004285d14c 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -985,6 +985,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     if ( gdt == NULL )
         goto out;
     per_cpu(gdt_table, cpu) = gdt;
+    per_cpu(gdt_table_l1e, cpu) =
+        l1e_from_pfn(virt_to_mfn(gdt), __PAGE_HYPERVISOR_RW);
     memcpy(gdt, boot_cpu_gdt_table, NR_RESERVED_GDT_PAGES * PAGE_SIZE);
     BUILD_BUG_ON(NR_CPUS > 0x10000);
     gdt[PER_CPU_GDT_ENTRY - FIRST_RESERVED_GDT_ENTRY].a = cpu;
@@ -992,6 +994,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     per_cpu(compat_gdt_table, cpu) = gdt = alloc_xenheap_pages(order, memflags);
     if ( gdt == NULL )
         goto out;
+    per_cpu(compat_gdt_table_l1e, cpu) =
+        l1e_from_pfn(virt_to_mfn(gdt), __PAGE_HYPERVISOR_RW);
     memcpy(gdt, boot_cpu_compat_gdt_table, NR_RESERVED_GDT_PAGES * PAGE_SIZE);
     gdt[PER_CPU_GDT_ENTRY - FIRST_RESERVED_GDT_ENTRY].a = cpu;
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 8097ef3bf5..25b4b47e5e 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -97,7 +97,9 @@ DEFINE_PER_CPU(uint64_t, efer);
 static DEFINE_PER_CPU(unsigned long, last_extable_addr);
 
 DEFINE_PER_CPU_READ_MOSTLY(seg_desc_t *, gdt_table);
+DEFINE_PER_CPU_READ_MOSTLY(l1_pgentry_t, gdt_table_l1e);
 DEFINE_PER_CPU_READ_MOSTLY(seg_desc_t *, compat_gdt_table);
+DEFINE_PER_CPU_READ_MOSTLY(l1_pgentry_t, compat_gdt_table_l1e);
 
 /* Master table, used by CPU0. */
 idt_entry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
@@ -2059,6 +2061,14 @@ void __init trap_init(void)
         }
     }
 
+    /* Cache {,compat_}gdt_table_l1e now that physical relocation is done. */
+    this_cpu(gdt_table_l1e) =
+        l1e_from_pfn(virt_to_mfn(boot_cpu_gdt_table),
+                     __PAGE_HYPERVISOR_RW);
+    this_cpu(compat_gdt_table_l1e) =
+        l1e_from_pfn(virt_to_mfn(boot_cpu_compat_gdt_table),
+                     __PAGE_HYPERVISOR_RW);
+
     percpu_traps_init();
 
     cpu_init();
diff --git a/xen/include/asm-x86/desc.h b/xen/include/asm-x86/desc.h
index 85e83bcefb..e565727dc0 100644
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -206,8 +206,10 @@ struct __packed desc_ptr {
 
 extern seg_desc_t boot_cpu_gdt_table[];
 DECLARE_PER_CPU(seg_desc_t *, gdt_table);
+DECLARE_PER_CPU(l1_pgentry_t, gdt_table_l1e);
 extern seg_desc_t boot_cpu_compat_gdt_table[];
 DECLARE_PER_CPU(seg_desc_t *, compat_gdt_table);
+DECLARE_PER_CPU(l1_pgentry_t, compat_gdt_table_l1e);
 
 extern void load_TR(void);
 
-- 
2.11.0