From nobody Thu Apr 25 10:41:59 2024
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Cc: Tamas K Lengyel, Wei Liu, George Dunlap, Andrew Cooper, Jan Beulich,
 Roger Pau Monné
Date: Mon, 10 Feb 2020 11:21:25 -0800
Subject: [Xen-devel] [PATCH v8 1/5] x86/p2m: Allow p2m_get_page_from_gfn to return shared entries

The owner domain of shared pages is dom_cow; use that for get_page,
otherwise the function fails to return the correct page in some situations.
The check for whether dom_cow should be used was only performed in a subset
of use-cases. Fix the error and simplify the existing check, since we can't
have any shared entries while dom_cow is NULL.

Signed-off-by: Tamas K Lengyel
---
 xen/arch/x86/mm/p2m.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fd9f09536d..2c0bb7e869 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -574,11 +574,12 @@ struct page_info *p2m_get_page_from_gfn(
             if ( fdom == NULL )
                 page = NULL;
         }
-        else if ( !get_page(page, p2m->domain) &&
-                  /* Page could be shared */
-                  (!dom_cow || !p2m_is_shared(*t) ||
-                   !get_page(page, dom_cow)) )
-            page = NULL;
+        else
+        {
+            struct domain *d = !p2m_is_shared(*t) ? p2m->domain : dom_cow;
+            if ( !get_page(page, d) )
+                page = NULL;
+        }
     }
     p2m_read_unlock(p2m);
 
@@ -594,8 +595,9 @@ struct page_info *p2m_get_page_from_gfn(
     mfn = get_gfn_type_access(p2m, gfn_x(gfn), t, a, q, NULL);
     if ( p2m_is_ram(*t) && mfn_valid(mfn) )
     {
+        struct domain *d = !p2m_is_shared(*t) ? p2m->domain : dom_cow;
         page = mfn_to_page(mfn);
-        if ( !get_page(page, p2m->domain) )
+        if ( !get_page(page, d) )
             page = NULL;
     }
     put_gfn(p2m->domain, gfn_x(gfn));
-- 
2.20.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

From nobody Thu Apr 25 10:41:59 2024
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Cc: Tamas K Lengyel, Wei Liu, George Dunlap, Andrew Cooper, Jan Beulich,
 Roger Pau Monné
Date: Mon, 10 Feb 2020 11:21:26 -0800
Message-Id: <2bbfcca0a830da7648a1d0133ea3a4c2f73e17ea.1581362050.git.tamas.lengyel@intel.com>
Subject: [Xen-devel] [PATCH v8 2/5] xen/x86: Make hap_get_allocation accessible

During VM forking we'll copy the parent domain's parameters to the client,
including the HAP shadow memory setting that is used for storing the domain's
EPT.
We'll copy this in the hypervisor, instead of doing it during toolstack
launch, to allow the domain to start executing and unsharing memory before
(or even completely without) the toolstack.

Signed-off-by: Tamas K Lengyel
---
 xen/arch/x86/mm/hap/hap.c | 3 +--
 xen/include/asm-x86/hap.h | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3d93f3451c..c7c7ff6e99 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -321,8 +321,7 @@ static void hap_free_p2m_page(struct domain *d, struct page_info *pg)
 }
 
 /* Return the size of the pool, rounded up to the nearest MB */
-static unsigned int
-hap_get_allocation(struct domain *d)
+unsigned int hap_get_allocation(struct domain *d)
 {
     unsigned int pg = d->arch.paging.hap.total_pages
         + d->arch.paging.hap.p2m_pages;
diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h
index b94bfb4ed0..1bf07e49fe 100644
--- a/xen/include/asm-x86/hap.h
+++ b/xen/include/asm-x86/hap.h
@@ -45,6 +45,7 @@ int hap_track_dirty_vram(struct domain *d,
 
 extern const struct paging_mode *hap_paging_get_mode(struct vcpu *);
 int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted);
+unsigned int hap_get_allocation(struct domain *d);
 
 #endif /* XEN_HAP_H */
 
-- 
2.20.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

From nobody Thu Apr 25 10:41:59 2024
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Date: Mon, 10 Feb 2020 11:21:27 -0800
Message-Id:
<0f7d636910c45e9ca32fda4ef864a9b7d6e32745.1581362050.git.tamas.lengyel@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: References: MIME-Version: 1.0 Subject: [Xen-devel] [PATCH v8 3/5] xen/mem_sharing: VM forking X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Stefano Stabellini , Tamas K Lengyel , Wei Liu , Konrad Rzeszutek Wilk , George Dunlap , Andrew Cooper , Ian Jackson , Tamas K Lengyel , Jan Beulich , Julien Grall , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" VM forking is the process of creating a domain with an empty memory space a= nd a parent domain specified from which to populate the memory when necessary. F= or the new domain to be functional the VM state is copied over as part of the = fork operation (HVM params, hap allocation, etc). Signed-off-by: Tamas K Lengyel --- xen/arch/x86/domain.c | 11 ++ xen/arch/x86/hvm/hvm.c | 2 +- xen/arch/x86/mm/mem_sharing.c | 221 ++++++++++++++++++++++++++++++ xen/arch/x86/mm/p2m.c | 11 +- xen/include/asm-x86/mem_sharing.h | 17 +++ xen/include/public/memory.h | 5 + xen/include/xen/sched.h | 2 + 7 files changed, 267 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index f53ae5ff86..a98e2e0479 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2189,6 +2189,17 @@ int domain_relinquish_resources(struct domain *d) ret =3D relinquish_shared_pages(d); if ( ret ) return ret; + + /* + * If the domain is forked, decrement the parent's pause count + * and release the domain. + */ + if ( d->parent ) + { + domain_unpause(d->parent); + put_domain(d->parent); + d->parent =3D NULL; + } } #endif =20 diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 00a9e70b7c..55520bbd23 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1915,7 +1915,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned l= ong gla, } #endif =20 - /* Spurious fault? PoD and log-dirty also take this path. */ + /* Spurious fault? PoD, log-dirty and VM forking also take this path. = */ if ( p2m_is_ram(p2mt) ) { rc =3D 1; diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c index 3835bc928f..ccf338918d 100644 --- a/xen/arch/x86/mm/mem_sharing.c +++ b/xen/arch/x86/mm/mem_sharing.c @@ -22,6 +22,7 @@ =20 #include #include +#include #include #include #include @@ -36,6 +37,9 @@ #include #include #include +#include +#include +#include #include =20 #include "mm-locks.h" @@ -1444,6 +1448,193 @@ static inline int mem_sharing_control(struct domain= *d, bool enable) return 0; } =20 +/* + * Forking a page only gets called when the VM faults due to no entry being + * in the EPT for the access. Depending on the type of access we either + * populate the physmap with a shared entry for read-only access or + * fork the page if its a write access. + * + * The client p2m is already locked so we only need to lock + * the parent's here. 
+ */ +int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool unsharing) +{ + int rc =3D -ENOENT; + shr_handle_t handle; + struct domain *parent; + struct p2m_domain *p2m; + unsigned long gfn_l =3D gfn_x(gfn); + mfn_t mfn, new_mfn; + p2m_type_t p2mt; + struct page_info *page; + + if ( !mem_sharing_is_fork(d) ) + return -ENOENT; + + parent =3D d->parent; + + if ( !unsharing ) + { + /* For read-only accesses we just add a shared entry to the physma= p */ + while ( parent ) + { + if ( !(rc =3D nominate_page(parent, gfn, 0, &handle)) ) + break; + + parent =3D parent->parent; + } + + if ( !rc ) + { + /* The client's p2m is already locked */ + struct p2m_domain *pp2m =3D p2m_get_hostp2m(parent); + + p2m_lock(pp2m); + rc =3D add_to_physmap(parent, gfn_l, handle, d, gfn_l, false); + p2m_unlock(pp2m); + + if ( !rc ) + return 0; + } + } + + /* + * If it's a write access (ie. unsharing) or if adding a shared entry = to + * the physmap failed we'll fork the page directly. + */ + p2m =3D p2m_get_hostp2m(d); + parent =3D d->parent; + + while ( parent ) + { + mfn =3D get_gfn_query(parent, gfn_l, &p2mt); + + if ( mfn_valid(mfn) && p2m_is_any_ram(p2mt) ) + break; + + put_gfn(parent, gfn_l); + parent =3D parent->parent; + } + + if ( !parent ) + return -ENOENT; + + if ( !(page =3D alloc_domheap_page(d, 0)) ) + { + put_gfn(parent, gfn_l); + return -ENOMEM; + } + + new_mfn =3D page_to_mfn(page); + copy_domain_page(new_mfn, mfn); + set_gpfn_from_mfn(mfn_x(new_mfn), gfn_l); + + put_gfn(parent, gfn_l); + + return p2m->set_entry(p2m, gfn, new_mfn, PAGE_ORDER_4K, p2m_ram_rw, + p2m->default_access, -1); +} + +static int bring_up_vcpus(struct domain *cd, struct cpupool *cpupool) +{ + int ret; + unsigned int i; + + if ( (ret =3D cpupool_move_domain(cd, cpupool)) ) + return ret; + + for ( i =3D 0; i < cd->max_vcpus; i++ ) + { + if ( cd->vcpu[i] ) + continue; + + if ( !vcpu_create(cd, i) ) + return -EINVAL; + } + + domain_update_node_affinity(cd); + return 0; +} + +static int fork_hap_allocation(struct domain *cd, struct domain *d) +{ + int rc; + bool preempted; + unsigned long mb =3D hap_get_allocation(d); + + if ( mb =3D=3D hap_get_allocation(cd) ) + return 0; + + paging_lock(cd); + rc =3D hap_set_allocation(cd, mb << (20 - PAGE_SHIFT), &preempted); + paging_unlock(cd); + + if ( rc ) + return rc; + + if ( preempted ) + return -ERESTART; + + return 0; +} + +static void fork_tsc(struct domain *cd, struct domain *d) +{ + uint32_t tsc_mode; + uint32_t gtsc_khz; + uint32_t incarnation; + uint64_t elapsed_nsec; + + tsc_get_info(d, &tsc_mode, &elapsed_nsec, >sc_khz, &incarnation); + tsc_set_info(cd, tsc_mode, elapsed_nsec, gtsc_khz, incarnation); +} + +static int mem_sharing_fork(struct domain *d, struct domain *cd) +{ + int rc =3D -EINVAL; + + if ( !cd->controller_pause_count ) + return rc; + + /* + * We only want to get and pause the parent once, not each time this + * operation is restarted due to preemption. 
+ */ + if ( !cd->parent_paused ) + { + ASSERT(get_domain(d)); + domain_pause(d); + + cd->parent_paused =3D true; + cd->max_pages =3D d->max_pages; + cd->max_vcpus =3D d->max_vcpus; + } + + /* this is preemptible so it's the first to get done */ + if ( (rc =3D fork_hap_allocation(cd, d)) ) + goto done; + + if ( (rc =3D bring_up_vcpus(cd, d->cpupool)) ) + goto done; + + if ( (rc =3D hvm_copy_context_and_params(cd, d)) ) + goto done; + + fork_tsc(cd, d); + + cd->parent =3D d; + + done: + if ( rc && rc !=3D -ERESTART ) + { + domain_unpause(d); + put_domain(d); + cd->parent_paused =3D false; + } + + return rc; +} + int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) { int rc; @@ -1698,6 +1889,36 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem= _sharing_op_t) arg) rc =3D debug_gref(d, mso.u.debug.u.gref); break; =20 + case XENMEM_sharing_op_fork: + { + struct domain *pd; + + rc =3D -EINVAL; + if ( mso.u.fork._pad[0] || mso.u.fork._pad[1] || + mso.u.fork._pad[2] ) + goto out; + + rc =3D rcu_lock_live_remote_domain_by_id(mso.u.fork.parent_domain, + &pd); + if ( rc ) + goto out; + + if ( !mem_sharing_enabled(pd) ) + { + if ( (rc =3D mem_sharing_control(pd, true)) ) + goto out; + } + + rc =3D mem_sharing_fork(pd, d); + + if ( rc =3D=3D -ERESTART ) + rc =3D hypercall_create_continuation(__HYPERVISOR_memory_op, + "lh", XENMEM_sharing_op, + arg); + rcu_unlock_domain(pd); + break; + } + default: rc =3D -ENOSYS; break; diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index 2c0bb7e869..72b4485970 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -509,6 +509,14 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, un= signed long gfn_l, =20 mfn =3D p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); =20 + /* Check if we need to fork the page */ + if ( (q & P2M_ALLOC) && p2m_is_hole(*t) && + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) ) + { + mfn =3D p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); + } + + /* Check if we need to unshare the page */ if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) ) { ASSERT(p2m_is_hostp2m(p2m)); @@ -587,7 +595,8 @@ struct page_info *p2m_get_page_from_gfn( return page; =20 /* Error path: not a suitable GFN at all */ - if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) + if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && + !mem_sharing_is_fork(p2m->domain) ) return NULL; } =20 diff --git a/xen/include/asm-x86/mem_sharing.h b/xen/include/asm-x86/mem_sh= aring.h index 53760a2896..ac968fae3f 100644 --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -39,6 +39,9 @@ struct mem_sharing_domain =20 #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) =20 +#define mem_sharing_is_fork(d) \ + (mem_sharing_enabled(d) && !!((d)->parent)) + /* Auditing of memory sharing code? 
*/ #ifndef NDEBUG #define MEM_SHARING_AUDIT 1 @@ -88,6 +91,9 @@ static inline int mem_sharing_unshare_page(struct domain = *d, return rc; } =20 +int mem_sharing_fork_page(struct domain *d, gfn_t gfn, + bool unsharing); + /* * If called by a foreign domain, possible errors are * -EBUSY -> ring full @@ -117,6 +123,7 @@ int relinquish_shared_pages(struct domain *d); #else =20 #define mem_sharing_enabled(d) false +#define mem_sharing_is_fork(p2m) false =20 static inline unsigned int mem_sharing_get_nr_saved_mfns(void) { @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct dom= ain *d, unsigned long gfn, return -EOPNOTSUPP; } =20 +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, bo= ol vcpu) +{ + return -EOPNOTSUPP; +} + +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool = lock) +{ + return -EOPNOTSUPP; +} + #endif =20 #endif /* __MEM_SHARING_H__ */ diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h index cfdda6e2a8..90a3f4498e 100644 --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -482,6 +482,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t); #define XENMEM_sharing_op_add_physmap 6 #define XENMEM_sharing_op_audit 7 #define XENMEM_sharing_op_range_share 8 +#define XENMEM_sharing_op_fork 9 =20 #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10) #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9) @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { uint32_t gref; /* IN: gref to debug */ } u; } debug; + struct mem_sharing_op_fork { + domid_t parent_domain; + uint16_t _pad[3]; /* Must be set to 0 */ + } fork; } u; }; typedef struct xen_mem_sharing_op xen_mem_sharing_op_t; diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 7c5c437247..8ed727e10c 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -507,6 +507,8 @@ struct domain /* Memory sharing support */ #ifdef CONFIG_MEM_SHARING struct vm_event_domain *vm_event_share; + struct domain *parent; /* VM fork parent */ + bool parent_paused; #endif /* Memory paging support */ #ifdef CONFIG_HAS_MEM_PAGING --=20 2.20.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel From nobody Thu Apr 25 10:41:59 2024 Delivered-To: importer@patchew.org Received-SPF: none (zohomail.com: 192.237.175.120 is neither permitted nor denied by domain of lists.xenproject.org) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; spf=none (zohomail.com: 192.237.175.120 is neither permitted nor denied by domain of lists.xenproject.org) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=fail(p=none dis=none) header.from=intel.com Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1581362560513372.17992998730665; Mon, 10 Feb 2020 11:22:40 -0800 (PST) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1j1EdF-0000uH-Mw; Mon, 10 Feb 2020 19:22:09 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1j1EdE-0000to-GH for xen-devel@lists.xenproject.org; Mon, 10 Feb 2020 19:22:08 +0000 Received: from mga06.intel.com (unknown [134.134.136.31]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id 
9c62fb2e-4c3a-11ea-b4f5-12813bfff9fa; Mon, 10 Feb 2020 19:22:00 +0000 (UTC)
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Cc: Tamas K Lengyel, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
 Andrew Cooper, Ian Jackson, Stefano Stabellini, Jan Beulich, Julien Grall,
 Roger Pau Monné
Date: Mon, 10 Feb 2020 11:21:28 -0800
Message-Id: <00e429194b01ac469280a05cfffe3cd64fcce4e9.1581362050.git.tamas.lengyel@intel.com>
Subject: [Xen-devel] [PATCH v8 4/5] x86/mem_sharing: reset a fork

Implement a hypercall that allows a fork to shed all memory that got
allocated for it during its execution and reload its vCPU context from the
parent VM. This allows the forked VM to reset into the same state the parent
VM is in faster than creating a new fork would be. Measurements show about a
2x speedup during normal fuzzing operations. Performance may vary depending
on how much memory got allocated for the forked VM. If it has been completely
deduplicated from the parent VM then creating a new fork would likely be more
performant.

Signed-off-by: Tamas K Lengyel
---
 xen/arch/x86/mm/mem_sharing.c | 76 +++++++++++++++++++++++++++++++++++
 xen/include/public/memory.h   |  1 +
 2 files changed, 77 insertions(+)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index ccf338918d..9d61592efa 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1635,6 +1635,59 @@ static int mem_sharing_fork(struct domain *d, struct domain *cd)
     return rc;
 }
 
+/*
+ * The fork reset operation is intended to be used on short-lived forks only.
+ * There is no hypercall continuation operation implemented for this reason.
+ * For forks that obtain a larger memory footprint it is likely going to be
+ * more performant to create a new fork instead of resetting an existing one.
+ *
+ * TODO: In case this hypercall would become useful on forks with larger memory
+ * footprints the hypercall continuation should be implemented.
+ */ +static int mem_sharing_fork_reset(struct domain *d, struct domain *cd) +{ + int rc; + struct p2m_domain* p2m =3D p2m_get_hostp2m(cd); + struct page_info *page, *tmp; + + domain_pause(cd); + + page_list_for_each_safe(page, tmp, &cd->page_list) + { + p2m_type_t p2mt; + p2m_access_t p2ma; + gfn_t gfn; + mfn_t mfn =3D page_to_mfn(page); + + if ( !mfn_valid(mfn) ) + continue; + + gfn =3D mfn_to_gfn(cd, mfn); + mfn =3D __get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, + 0, NULL, false); + + if ( !p2m_is_ram(p2mt) || p2m_is_shared(p2mt) ) + continue; + + /* take an extra reference */ + if ( !get_page(page, cd) ) + continue; + + rc =3D p2m->set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_4K, + p2m_invalid, p2m_access_rwx, -1); + ASSERT(!rc); + + put_page_alloc_ref(page); + put_page(page); + } + + if ( !(rc =3D hvm_copy_context_and_params(cd, d)) ) + fork_tsc(cd, d); + + domain_unpause(cd); + return rc; +} + int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) { int rc; @@ -1919,6 +1972,29 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem= _sharing_op_t) arg) break; } =20 + case XENMEM_sharing_op_fork_reset: + { + struct domain *pd; + + rc =3D -EINVAL; + if ( mso.u.fork._pad[0] || mso.u.fork._pad[1] || + mso.u.fork._pad[2] ) + goto out; + + rc =3D -ENOSYS; + if ( !d->parent ) + goto out; + + rc =3D rcu_lock_live_remote_domain_by_id(d->parent->domain_id, &pd= ); + if ( rc ) + goto out; + + rc =3D mem_sharing_fork_reset(pd, d); + + rcu_unlock_domain(pd); + break; + } + default: rc =3D -ENOSYS; break; diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h index 90a3f4498e..e3d063e22e 100644 --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -483,6 +483,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t); #define XENMEM_sharing_op_audit 7 #define XENMEM_sharing_op_range_share 8 #define XENMEM_sharing_op_fork 9 +#define XENMEM_sharing_op_fork_reset 10 =20 #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10) #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9) --=20 2.20.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel From nobody Thu Apr 25 10:41:59 2024 Delivered-To: importer@patchew.org Received-SPF: none (zohomail.com: 192.237.175.120 is neither permitted nor denied by domain of lists.xenproject.org) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; spf=none (zohomail.com: 192.237.175.120 is neither permitted nor denied by domain of lists.xenproject.org) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=fail(p=none dis=none) header.from=intel.com Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1581362575897973.2977351066771; Mon, 10 Feb 2020 11:22:55 -0800 (PST) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1j1EdL-0000xS-02; Mon, 10 Feb 2020 19:22:15 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1j1EdJ-0000wh-GV for xen-devel@lists.xenproject.org; Mon, 10 Feb 2020 19:22:13 +0000 Received: from mga06.intel.com (unknown [134.134.136.31]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id 9ceaf894-4c3a-11ea-b4f5-12813bfff9fa; Mon, 10 Feb 2020 19:22:01 +0000 (UTC) 
Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Feb 2020 11:22:00 -0800 Received: from jcguru1x-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.254.67.221]) by orsmga004.jf.intel.com with ESMTP; 10 Feb 2020 11:22:00 -0800 X-Inumbo-ID: 9ceaf894-4c3a-11ea-b4f5-12813bfff9fa X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,426,1574150400"; d="scan'208";a="380199857" From: Tamas K Lengyel To: xen-devel@lists.xenproject.org Date: Mon, 10 Feb 2020 11:21:29 -0800 Message-Id: <9d0df182d6140f64928cff859184c02bd55c377e.1581362050.git.tamas.lengyel@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: References: MIME-Version: 1.0 Subject: [Xen-devel] [PATCH v8 5/5] xen/tools: VM forking toolstack side X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Anthony PERARD , Ian Jackson , Tamas K Lengyel , Wei Liu Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" Add necessary bits to implement "xl fork-vm" commands. The command allows t= he user to specify how to launch the device model allowing for a late-launch m= odel in which the user can execute the fork without the device model and decide = to only later launch it. Signed-off-by: Tamas K Lengyel --- v8: don't try to unpause twice when launching dm --- docs/man/xl.1.pod.in | 36 +++++ tools/libxc/include/xenctrl.h | 13 ++ tools/libxc/xc_memshr.c | 22 +++ tools/libxl/libxl.h | 7 + tools/libxl/libxl_create.c | 256 ++++++++++++++++++++++------------ tools/libxl/libxl_dm.c | 2 +- tools/libxl/libxl_dom.c | 43 +++++- tools/libxl/libxl_internal.h | 1 + tools/libxl/libxl_types.idl | 1 + tools/xl/xl.h | 5 + tools/xl/xl_cmdtable.c | 12 ++ tools/xl/xl_saverestore.c | 97 +++++++++++++ tools/xl/xl_vmcontrol.c | 8 ++ 13 files changed, 409 insertions(+), 94 deletions(-) diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in index 33ad2ebd71..c4012939f5 100644 --- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -694,6 +694,42 @@ Leave the domain paused after creating the snapshot. =20 =3Dback =20 +=3Ditem B [I] I + +Create a fork of a running VM. The domain will be paused after the operati= on +and needs to remain paused while forks of it exist. + +B + +=3Dover 4 + +=3Ditem B<-p> + +Leave the fork paused after creating it. + +=3Ditem B<--launch-dm> + +Specify whether the device model (QEMU) should be launched for the fork. L= ate +launch allows to start the device model for an already running fork. + +=3Ditem B<-C> + +The config file to use when launching the device model. Currently required= when +launching the device model. + +=3Ditem B<-Q> + +The qemu save file to use when launching the device model. Currently requ= ired +when launching the device model. + +=3Ditem B<--fork-reset> + +Perform a reset operation of an already running fork. Note that resetting = may +be less performant then creating a new fork depending on how much memory t= he +fork has deduplicated during its runtime. + +=3Dback + =3Ditem B [I] =20 Display the number of shared pages for a specified domain. 
If no domain is diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index cc4eb1e3d3..6f65888dd0 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2225,6 +2225,19 @@ int xc_memshr_range_share(xc_interface *xch, uint64_t first_gfn, uint64_t last_gfn); =20 +int xc_memshr_fork(xc_interface *xch, + uint32_t source_domain, + uint32_t client_domain); + +/* + * Note: this function is only intended to be used on short-lived forks th= at + * haven't yet aquired a lot of memory. In case the fork has a lot of memo= ry + * it is likely more performant to create a new fork with xc_memshr_fork. + * + * With VMs that have a lot of memory this call may block for a long time. + */ +int xc_memshr_fork_reset(xc_interface *xch, uint32_t forked_domain); + /* Debug calls: return the number of pages referencing the shared frame ba= cking * the input argument. Should be one or greater. * diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c index 97e2e6a8d9..d0e4ee225b 100644 --- a/tools/libxc/xc_memshr.c +++ b/tools/libxc/xc_memshr.c @@ -239,6 +239,28 @@ int xc_memshr_debug_gref(xc_interface *xch, return xc_memshr_memop(xch, domid, &mso); } =20 +int xc_memshr_fork(xc_interface *xch, uint32_t pdomid, uint32_t domid) +{ + xen_mem_sharing_op_t mso; + + memset(&mso, 0, sizeof(mso)); + + mso.op =3D XENMEM_sharing_op_fork; + mso.u.fork.parent_domain =3D pdomid; + + return xc_memshr_memop(xch, domid, &mso); +} + +int xc_memshr_fork_reset(xc_interface *xch, uint32_t domid) +{ + xen_mem_sharing_op_t mso; + + memset(&mso, 0, sizeof(mso)); + mso.op =3D XENMEM_sharing_op_fork_reset; + + return xc_memshr_memop(xch, domid, &mso); +} + int xc_memshr_audit(xc_interface *xch) { xen_mem_sharing_op_t mso; diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 18c1a2d6bf..094ab0d205 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1538,6 +1538,13 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_do= main_config *d_config, const libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) LIBXL_EXTERNAL_CALLERS_ONLY; +int libxl_domain_fork_vm(libxl_ctx *ctx, uint32_t pdomid, uint32_t *domid) + LIBXL_EXTERNAL_CALLERS_ONLY; +int libxl_domain_fork_launch_dm(libxl_ctx *ctx, libxl_domain_config *d_con= fig, + uint32_t domid, + const libxl_asyncprogress_how *aop_console= _how) + LIBXL_EXTERNAL_CALLERS_ONLY; +int libxl_domain_fork_reset(libxl_ctx *ctx, uint32_t domid); int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_con= fig, uint32_t *domid, int restore_fd, int send_back_fd, diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 3a7364e2ac..9dd9802fc7 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -536,12 +536,12 @@ out: return ret; } =20 -int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, - libxl__domain_build_state *state, - uint32_t *domid, bool soft_reset) +static int libxl__domain_make_xs_entries(libxl__gc *gc, libxl_domain_confi= g *d_config, + libxl__domain_build_state *state, + uint32_t domid) { libxl_ctx *ctx =3D libxl__gc_owner(gc); - int ret, rc, nb_vm; + int rc, nb_vm; const char *dom_type; char *uuid_string; char *dom_path, *vm_path, *libxl_path; @@ -553,9 +553,6 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_conf= ig *d_config, =20 /* convenience aliases */ libxl_domain_create_info *info =3D &d_config->c_info; - libxl_domain_build_info *b_info =3D &d_config->b_info; - - assert(soft_reset || *domid =3D=3D INVALID_DOMID); =20 
uuid_string =3D libxl__uuid2string(gc, info->uuid); if (!uuid_string) { @@ -563,71 +560,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_con= fig *d_config, goto out; } =20 - if (!soft_reset) { - struct xen_domctl_createdomain create =3D { - .ssidref =3D info->ssidref, - .max_vcpus =3D b_info->max_vcpus, - .max_evtchn_port =3D b_info->event_channels, - .max_grant_frames =3D b_info->max_grant_frames, - .max_maptrack_frames =3D b_info->max_maptrack_frames, - }; - - if (info->type !=3D LIBXL_DOMAIN_TYPE_PV) { - create.flags |=3D XEN_DOMCTL_CDF_hvm; - create.flags |=3D - libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0; - create.flags |=3D - libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off; - } - - assert(info->passthrough !=3D LIBXL_PASSTHROUGH_DEFAULT); - LOG(DETAIL, "passthrough: %s", - libxl_passthrough_to_string(info->passthrough)); - - if (info->passthrough !=3D LIBXL_PASSTHROUGH_DISABLED) - create.flags |=3D XEN_DOMCTL_CDF_iommu; - - if (info->passthrough =3D=3D LIBXL_PASSTHROUGH_SYNC_PT) - create.iommu_opts |=3D XEN_DOMCTL_IOMMU_no_sharept; - - /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ - libxl_uuid_copy(ctx, (libxl_uuid *)&create.handle, &info->uuid); - - ret =3D libxl__arch_domain_prepare_config(gc, d_config, &create); - if (ret < 0) { - LOGED(ERROR, *domid, "fail to get domain config"); - rc =3D ERROR_FAIL; - goto out; - } - - ret =3D xc_domain_create(ctx->xch, domid, &create); - if (ret < 0) { - LOGED(ERROR, *domid, "domain creation fail"); - rc =3D ERROR_FAIL; - goto out; - } - - rc =3D libxl__arch_domain_save_config(gc, d_config, state, &create= ); - if (rc < 0) - goto out; - } - - /* - * If soft_reset is set the the domid will have been valid on entry. - * If it was not set then xc_domain_create() should have assigned a - * valid value. Either way, if we reach this point, domid should be - * valid. 
- */ - assert(libxl_domid_valid_guest(*domid)); - - ret =3D xc_cpupool_movedomain(ctx->xch, info->poolid, *domid); - if (ret < 0) { - LOGED(ERROR, *domid, "domain move fail"); - rc =3D ERROR_FAIL; - goto out; - } - - dom_path =3D libxl__xs_get_dompath(gc, *domid); + dom_path =3D libxl__xs_get_dompath(gc, domid); if (!dom_path) { rc =3D ERROR_FAIL; goto out; @@ -635,12 +568,12 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_co= nfig *d_config, =20 vm_path =3D GCSPRINTF("/vm/%s", uuid_string); if (!vm_path) { - LOGD(ERROR, *domid, "cannot allocate create paths"); + LOGD(ERROR, domid, "cannot allocate create paths"); rc =3D ERROR_FAIL; goto out; } =20 - libxl_path =3D libxl__xs_libxl_path(gc, *domid); + libxl_path =3D libxl__xs_libxl_path(gc, domid); if (!libxl_path) { rc =3D ERROR_FAIL; goto out; @@ -651,10 +584,10 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_co= nfig *d_config, =20 roperm[0].id =3D 0; roperm[0].perms =3D XS_PERM_NONE; - roperm[1].id =3D *domid; + roperm[1].id =3D domid; roperm[1].perms =3D XS_PERM_READ; =20 - rwperm[0].id =3D *domid; + rwperm[0].id =3D domid; rwperm[0].perms =3D XS_PERM_NONE; =20 retry_transaction: @@ -672,7 +605,7 @@ retry_transaction: noperm, ARRAY_SIZE(noperm)); =20 xs_write(ctx->xsh, t, GCSPRINTF("%s/vm", dom_path), vm_path, strlen(vm= _path)); - rc =3D libxl__domain_rename(gc, *domid, 0, info->name, t); + rc =3D libxl__domain_rename(gc, domid, 0, info->name, t); if (rc) goto out; =20 @@ -749,7 +682,7 @@ retry_transaction: =20 vm_list =3D libxl_list_vm(ctx, &nb_vm); if (!vm_list) { - LOGD(ERROR, *domid, "cannot get number of running guests"); + LOGD(ERROR, domid, "cannot get number of running guests"); rc =3D ERROR_FAIL; goto out; } @@ -773,7 +706,7 @@ retry_transaction: t =3D 0; goto retry_transaction; } - LOGED(ERROR, *domid, "domain creation ""xenstore transaction commi= t failed"); + LOGED(ERROR, domid, "domain creation ""xenstore transaction commit= failed"); rc =3D ERROR_FAIL; goto out; } @@ -785,6 +718,89 @@ retry_transaction: return rc; } =20 +int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, + libxl__domain_build_state *state, + uint32_t *domid, bool soft_reset) +{ + libxl_ctx *ctx =3D libxl__gc_owner(gc); + int ret, rc; + + /* convenience aliases */ + libxl_domain_create_info *info =3D &d_config->c_info; + libxl_domain_build_info *b_info =3D &d_config->b_info; + + assert(soft_reset || *domid =3D=3D INVALID_DOMID); + + if (!soft_reset) { + struct xen_domctl_createdomain create =3D { + .ssidref =3D info->ssidref, + .max_vcpus =3D b_info->max_vcpus, + .max_evtchn_port =3D b_info->event_channels, + .max_grant_frames =3D b_info->max_grant_frames, + .max_maptrack_frames =3D b_info->max_maptrack_frames, + }; + + if (info->type !=3D LIBXL_DOMAIN_TYPE_PV) { + create.flags |=3D XEN_DOMCTL_CDF_hvm; + create.flags |=3D + libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0; + create.flags |=3D + libxl_defbool_val(info->oos) ? 
0 : XEN_DOMCTL_CDF_oos_off; + } + + assert(info->passthrough !=3D LIBXL_PASSTHROUGH_DEFAULT); + LOG(DETAIL, "passthrough: %s", + libxl_passthrough_to_string(info->passthrough)); + + if (info->passthrough !=3D LIBXL_PASSTHROUGH_DISABLED) + create.flags |=3D XEN_DOMCTL_CDF_iommu; + + if (info->passthrough =3D=3D LIBXL_PASSTHROUGH_SYNC_PT) + create.iommu_opts |=3D XEN_DOMCTL_IOMMU_no_sharept; + + /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ + libxl_uuid_copy(ctx, (libxl_uuid *)&create.handle, &info->uuid); + + ret =3D libxl__arch_domain_prepare_config(gc, d_config, &create); + if (ret < 0) { + LOGED(ERROR, *domid, "fail to get domain config"); + rc =3D ERROR_FAIL; + goto out; + } + + ret =3D xc_domain_create(ctx->xch, domid, &create); + if (ret < 0) { + LOGED(ERROR, *domid, "domain creation fail"); + rc =3D ERROR_FAIL; + goto out; + } + + rc =3D libxl__arch_domain_save_config(gc, d_config, state, &create= ); + if (rc < 0) + goto out; + } + + /* + * If soft_reset is set the the domid will have been valid on entry. + * If it was not set then xc_domain_create() should have assigned a + * valid value. Either way, if we reach this point, domid should be + * valid. + */ + assert(libxl_domid_valid_guest(*domid)); + + ret =3D xc_cpupool_movedomain(ctx->xch, info->poolid, *domid); + if (ret < 0) { + LOGED(ERROR, *domid, "domain move fail"); + rc =3D ERROR_FAIL; + goto out; + } + + rc =3D libxl__domain_make_xs_entries(gc, d_config, state, *domid); + +out: + return rc; +} + static int store_libxl_entry(libxl__gc *gc, uint32_t domid, libxl_domain_build_info *b_info) { @@ -1106,16 +1122,32 @@ static void initiate_domain_create(libxl__egc *egc, ret =3D libxl__domain_config_setdefault(gc,d_config,domid); if (ret) goto error_out; =20 - ret =3D libxl__domain_make(gc, d_config, &dcs->build_state, &domid, - dcs->soft_reset); - if (ret) { - LOGD(ERROR, domid, "cannot make domain: %d", ret); + if ( !d_config->dm_restore_file ) + { + ret =3D libxl__domain_make(gc, d_config, &dcs->build_state, &domid, + dcs->soft_reset); dcs->guest_domid =3D domid; + + if (ret) { + LOGD(ERROR, domid, "cannot make domain: %d", ret); + ret =3D ERROR_FAIL; + goto error_out; + } + } else if ( dcs->guest_domid !=3D INVALID_DOMID ) { + domid =3D dcs->guest_domid; + + ret =3D libxl__domain_make_xs_entries(gc, d_config, &dcs->build_st= ate, domid); + if (ret) { + LOGD(ERROR, domid, "cannot make domain: %d", ret); + ret =3D ERROR_FAIL; + goto error_out; + } + } else { + LOGD(ERROR, domid, "cannot make domain"); ret =3D ERROR_FAIL; goto error_out; } =20 - dcs->guest_domid =3D domid; dcs->sdss.dm.guest_domid =3D 0; /* means we haven't spawned */ =20 /* post-4.13 todo: move these next bits of defaulting to @@ -1151,7 +1183,7 @@ static void initiate_domain_create(libxl__egc *egc, if (ret) goto error_out; =20 - if (restore_fd >=3D 0 || dcs->soft_reset) { + if (restore_fd >=3D 0 || dcs->soft_reset || d_config->dm_restore_file)= { LOGD(DEBUG, domid, "restoring, not running bootloader"); domcreate_bootloader_done(egc, &dcs->bl, 0); } else { @@ -1227,7 +1259,16 @@ static void domcreate_bootloader_done(libxl__egc *eg= c, dcs->sdss.dm.callback =3D domcreate_devmodel_started; dcs->sdss.callback =3D domcreate_devmodel_started; =20 - if (restore_fd < 0 && !dcs->soft_reset) { + if (restore_fd < 0 && !dcs->soft_reset && !d_config->dm_restore_file) { + rc =3D libxl__domain_build(gc, d_config, domid, state); + domcreate_rebuild_done(egc, dcs, rc); + return; + } + + if ( d_config->dm_restore_file ) { + dcs->srs.dcs =3D dcs; + dcs->srs.ao 
=3D ao; + state->forked_vm =3D true; rc =3D libxl__domain_build(gc, d_config, domid, state); domcreate_rebuild_done(egc, dcs, rc); return; @@ -1425,6 +1466,7 @@ static void domcreate_rebuild_done(libxl__egc *egc, /* convenience aliases */ const uint32_t domid =3D dcs->guest_domid; libxl_domain_config *const d_config =3D dcs->guest_config; + libxl__domain_build_state *const state =3D &dcs->build_state; =20 if (ret) { LOGD(ERROR, domid, "cannot (re-)build domain: %d", ret); @@ -1432,6 +1474,9 @@ static void domcreate_rebuild_done(libxl__egc *egc, goto error_out; } =20 + if ( d_config->dm_restore_file ) + state->saved_state =3D GCSPRINTF("%s", d_config->dm_restore_file); + store_libxl_entry(gc, domid, &d_config->b_info); =20 libxl__multidev_begin(ao, &dcs->multidev); @@ -1833,6 +1878,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_dom= ain_config *d_config, GCNEW(cdcs); cdcs->dcs.ao =3D ao; cdcs->dcs.guest_config =3D d_config; + cdcs->dcs.guest_domid =3D *domid; + libxl_domain_config_init(&cdcs->dcs.guest_config_saved); libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config); cdcs->dcs.restore_fd =3D cdcs->dcs.libxc_fd =3D restore_fd; @@ -2081,6 +2128,43 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_do= main_config *d_config, ao_how, aop_console_how); } =20 +int libxl_domain_fork_vm(libxl_ctx *ctx, uint32_t pdomid, uint32_t *domid) +{ + int rc; + struct xen_domctl_createdomain create =3D {0}; + create.flags |=3D XEN_DOMCTL_CDF_hvm; + create.flags |=3D XEN_DOMCTL_CDF_hap; + create.flags |=3D XEN_DOMCTL_CDF_oos_off; + create.arch.emulation_flags =3D (XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI); + + create.ssidref =3D SECINITSID_DOMU; + create.max_vcpus =3D 1; // placeholder, will be cloned from pdomid + create.max_evtchn_port =3D 1023; + create.max_grant_frames =3D LIBXL_MAX_GRANT_FRAMES_DEFAULT; + create.max_maptrack_frames =3D LIBXL_MAX_MAPTRACK_FRAMES_DEFAULT; + + if ( (rc =3D xc_domain_create(ctx->xch, domid, &create)) ) + return rc; + + if ( (rc =3D xc_memshr_fork(ctx->xch, pdomid, *domid)) ) + xc_domain_destroy(ctx->xch, *domid); + + return rc; +} + +int libxl_domain_fork_launch_dm(libxl_ctx *ctx, libxl_domain_config *d_con= fig, + uint32_t domid, + const libxl_asyncprogress_how *aop_console= _how) +{ + unset_disk_colo_restore(d_config); + return do_domain_create(ctx, d_config, &domid, -1, -1, 0, 0, aop_conso= le_how); +} + +int libxl_domain_fork_reset(libxl_ctx *ctx, uint32_t domid) +{ + return xc_memshr_fork_reset(ctx->xch, domid); +} + int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_con= fig, uint32_t *domid, int restore_fd, int send_back_fd, diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 3b1da90167..87ae1478cf 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -2787,7 +2787,7 @@ static void device_model_spawn_outcome(libxl__egc *eg= c, =20 libxl__domain_build_state *state =3D dmss->build_state; =20 - if (state->saved_state) { + if (state->saved_state && !state->forked_vm) { ret2 =3D unlink(state->saved_state); if (ret2) { LOGED(ERROR, dmss->guest_domid, "%s: failed to remove device-m= odel state %s", diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index d9ada8a422..e7c54ddf63 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -249,9 +249,12 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, libxl_domain_build_info *const info =3D &d_config->b_info; libxl_ctx *ctx =3D libxl__gc_owner(gc); char *xs_domid, *con_domid; - int rc; + int rc =3D 0; uint64_t size; =20 + if ( 
state->forked_vm ) + goto skip_fork; + if (xc_domain_max_vcpus(ctx->xch, domid, info->max_vcpus) !=3D 0) { LOG(ERROR, "Couldn't set max vcpu count"); return ERROR_FAIL; @@ -362,7 +365,6 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, } } =20 - rc =3D libxl__arch_extra_memory(gc, info, &size); if (rc < 0) { LOGE(ERROR, "Couldn't get arch extra constant memory size"); @@ -374,6 +376,11 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, return ERROR_FAIL; } =20 + rc =3D libxl__arch_domain_create(gc, d_config, domid); + if ( rc ) + goto out; + +skip_fork: xs_domid =3D xs_read(ctx->xsh, XBT_NULL, "/tool/xenstored/domid", NULL= ); state->store_domid =3D xs_domid ? atoi(xs_domid) : 0; free(xs_domid); @@ -385,8 +392,7 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, state->store_port =3D xc_evtchn_alloc_unbound(ctx->xch, domid, state->= store_domid); state->console_port =3D xc_evtchn_alloc_unbound(ctx->xch, domid, state= ->console_domid); =20 - rc =3D libxl__arch_domain_create(gc, d_config, domid); - +out: return rc; } =20 @@ -444,6 +450,9 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid, char **ents; int i, rc; =20 + if ( state->forked_vm ) + goto skip_fork; + if (info->num_vnuma_nodes && !info->num_vcpu_soft_affinity) { rc =3D set_vnuma_affinity(gc, domid, info); if (rc) @@ -468,6 +477,7 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid, } } =20 +skip_fork: ents =3D libxl__calloc(gc, 12 + (info->max_vcpus * 2) + 2, sizeof(char= *)); ents[0] =3D "memory/static-max"; ents[1] =3D GCSPRINTF("%"PRId64, info->max_memkb); @@ -730,14 +740,16 @@ static int hvm_build_set_params(xc_interface *handle,= uint32_t domid, libxl_domain_build_info *info, int store_evtchn, unsigned long *store_mfn, int console_evtchn, unsigned long *console= _mfn, - domid_t store_domid, domid_t console_domid) + domid_t store_domid, domid_t console_domid, + bool forked_vm) { struct hvm_info_table *va_hvm; uint8_t *va_map, sum; uint64_t str_mfn, cons_mfn; int i; =20 - if (info->type =3D=3D LIBXL_DOMAIN_TYPE_HVM) { + if ( info->type =3D=3D LIBXL_DOMAIN_TYPE_HVM && !forked_vm ) + { va_map =3D xc_map_foreign_range(handle, domid, XC_PAGE_SIZE, PROT_READ | PROT_WRITE, HVM_INFO_PFN); @@ -1053,6 +1065,23 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, struct xc_dom_image *dom =3D NULL; bool device_model =3D info->type =3D=3D LIBXL_DOMAIN_TYPE_HVM ? 
true := false; =20 + if ( state->forked_vm ) + { + rc =3D hvm_build_set_params(ctx->xch, domid, info, state->store_po= rt, + &state->store_mfn, state->console_port, + &state->console_mfn, state->store_domid, + state->console_domid, state->forked_vm); + + if ( rc ) + return rc; + + return xc_dom_gnttab_seed(ctx->xch, domid, true, + state->console_mfn, + state->store_mfn, + state->console_domid, + state->store_domid); + } + xc_dom_loginit(ctx->xch); =20 /* @@ -1177,7 +1206,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, rc =3D hvm_build_set_params(ctx->xch, domid, info, state->store_port, &state->store_mfn, state->console_port, &state->console_mfn, state->store_domid, - state->console_domid); + state->console_domid, false); if (rc !=3D 0) { LOG(ERROR, "hvm build set params failed"); goto out; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index dd3c08bc14..f69a8387ed 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1374,6 +1374,7 @@ typedef struct { =20 char *saved_state; int dm_monitor_fd; + bool forked_vm; =20 libxl__file_reference pv_kernel; libxl__file_reference pv_ramdisk; diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 7921950f6a..7c4c4057a9 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -956,6 +956,7 @@ libxl_domain_config =3D Struct("domain_config", [ ("on_watchdog", libxl_action_on_shutdown), ("on_crash", libxl_action_on_shutdown), ("on_soft_reset", libxl_action_on_shutdown), + ("dm_restore_file", string, {'const': True}), ], dir=3DDIR_IN) =20 libxl_diskinfo =3D Struct("diskinfo", [ diff --git a/tools/xl/xl.h b/tools/xl/xl.h index 60bdad8ffb..9bdad6526e 100644 --- a/tools/xl/xl.h +++ b/tools/xl/xl.h @@ -31,6 +31,7 @@ struct cmd_spec { }; =20 struct domain_create { + uint32_t ddomid; /* fork launch dm for this domid */ int debug; int daemonize; int monitor; /* handle guest reboots etc */ @@ -45,6 +46,7 @@ struct domain_create { const char *config_file; char *extra_config; /* extra config string */ const char *restore_file; + const char *dm_restore_file; char *colo_proxy_script; bool userspace_colo_proxy; int migrate_fd; /* -1 means none */ @@ -127,6 +129,9 @@ int main_pciassignable_remove(int argc, char **argv); int main_pciassignable_list(int argc, char **argv); #ifndef LIBXL_HAVE_NO_SUSPEND_RESUME int main_restore(int argc, char **argv); +int main_fork_vm(int argc, char **argv); +int main_fork_launch_dm(int argc, char **argv); +int main_fork_reset(int argc, char **argv); int main_migrate_receive(int argc, char **argv); int main_save(int argc, char **argv); int main_migrate(int argc, char **argv); diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 3b302b2f20..3a5d371057 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -185,6 +185,18 @@ struct cmd_spec cmd_table[] =3D { "Restore a domain from a saved state", "- for internal use only", }, + { "fork-vm", + &main_fork_vm, 0, 1, + "Fork a domain from the running parent domid", + "[options] ", + "-h Print this help.\n" + "-C Use config file for VM fork.\n" + "-Q Use qemu save file for VM fork.\n" + "--launch-dm Launch device model (QEMU) for VM fork= .\n" + "--fork-reset Reset VM fork.\n" + "-p Do not unpause fork VM after operation= .\n" + "-d Enable debug messages.\n" + }, #endif { "dump-core", &main_dump_core, 0, 1, diff --git a/tools/xl/xl_saverestore.c b/tools/xl/xl_saverestore.c index 9be033fe65..d99d3eceb2 100644 --- a/tools/xl/xl_saverestore.c +++ 
b/tools/xl/xl_saverestore.c @@ -229,6 +229,103 @@ int main_restore(int argc, char **argv) return EXIT_SUCCESS; } =20 +int main_fork_vm(int argc, char **argv) +{ + int rc, debug =3D 0; + uint32_t domid_in =3D INVALID_DOMID, domid_out =3D INVALID_DOMID; + int launch_dm =3D 1; + bool reset =3D 0; + bool pause =3D 0; + const char *config_file =3D NULL; + const char *dm_restore_file =3D NULL; + + int opt; + static struct option opts[] =3D { + {"launch-dm", 1, 0, 'l'}, + {"fork-reset", 0, 0, 'r'}, + COMMON_LONG_OPTS + }; + + SWITCH_FOREACH_OPT(opt, "phdC:Q:l:rN:D:B:V:", opts, "fork-vm", 1) { + case 'd': + debug =3D 1; + break; + case 'p': + pause =3D 1; + break; + case 'C': + config_file =3D optarg; + break; + case 'Q': + dm_restore_file =3D optarg; + break; + case 'l': + if ( !strcmp(optarg, "no") ) + launch_dm =3D 0; + if ( !strcmp(optarg, "yes") ) + launch_dm =3D 1; + if ( !strcmp(optarg, "late") ) + launch_dm =3D 2; + break; + case 'r': + reset =3D 1; + break; + case 'N': /* fall-through */ + case 'D': /* fall-through */ + case 'B': /* fall-through */ + case 'V': + fprintf(stderr, "Unimplemented option(s)\n"); + return EXIT_FAILURE; + } + + if (argc-optind =3D=3D 1) { + domid_in =3D atoi(argv[optind]); + } else { + help("fork-vm"); + return EXIT_FAILURE; + } + + if (launch_dm && (!config_file || !dm_restore_file)) { + fprintf(stderr, "Currently you must provide both -C and -Q options= \n"); + return EXIT_FAILURE; + } + + if (reset) { + domid_out =3D domid_in; + if (libxl_domain_fork_reset(ctx, domid_in) =3D=3D EXIT_FAILURE) + return EXIT_FAILURE; + } + + if (launch_dm =3D=3D 2 || reset) { + domid_out =3D domid_in; + rc =3D EXIT_SUCCESS; + } else + rc =3D libxl_domain_fork_vm(ctx, domid_in, &domid_out); + + if (rc =3D=3D EXIT_SUCCESS) { + if ( launch_dm ) { + struct domain_create dom_info; + memset(&dom_info, 0, sizeof(dom_info)); + dom_info.ddomid =3D domid_out; + dom_info.dm_restore_file =3D dm_restore_file; + dom_info.debug =3D debug; + dom_info.paused =3D pause; + dom_info.config_file =3D config_file; + dom_info.migrate_fd =3D -1; + dom_info.send_back_fd =3D -1; + rc =3D create_domain(&dom_info) < 0 ? EXIT_FAILURE : EXIT_SUCC= ESS; + } else if ( !pause ) + rc =3D libxl_domain_unpause(ctx, domid_out, NULL); + } + + if (rc =3D=3D EXIT_SUCCESS) + fprintf(stderr, "fork-vm command successfully returned domid: %u\n= ", domid_out); + else if ( domid_out !=3D INVALID_DOMID ) + libxl_domain_destroy(ctx, domid_out, 0); + + return rc; +} + int main_save(int argc, char **argv) { uint32_t domid; diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c index e520b1da79..d9cb19c599 100644 --- a/tools/xl/xl_vmcontrol.c +++ b/tools/xl/xl_vmcontrol.c @@ -645,6 +645,7 @@ int create_domain(struct domain_create *dom_info) =20 libxl_domain_config d_config; =20 + uint32_t ddomid =3D dom_info->ddomid; // launch dm for this domain iff= set int debug =3D dom_info->debug; int daemonize =3D dom_info->daemonize; int monitor =3D dom_info->monitor; @@ -655,6 +656,7 @@ int create_domain(struct domain_create *dom_info) const char *restore_file =3D dom_info->restore_file; const char *config_source =3D NULL; const char *restore_source =3D NULL; + const char *dm_restore_file =3D dom_info->dm_restore_file; int migrate_fd =3D dom_info->migrate_fd; bool config_in_json; =20 @@ -923,6 +925,12 @@ start: * restore/migrate-receive it again. 
         */
        restoring = 0;
+    } else if ( ddomid ) {
+        d_config.dm_restore_file = dm_restore_file;
+        ret = libxl_domain_fork_launch_dm(ctx, &d_config, ddomid,
+                                          autoconnect_console_how);
+        domid = ddomid;
+        ddomid = INVALID_DOMID;
     } else if (domid_soft_reset != INVALID_DOMID) {
         /* Do soft reset. */
         ret = libxl_domain_soft_reset(ctx, &d_config, domid_soft_reset,
-- 
2.20.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
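
As a rough illustration of how the pieces of this series fit together, the
sketch below drives the new toolstack API (libxl_domain_fork_vm and
libxl_domain_fork_reset from patch 5/5, backed by XENMEM_sharing_op_fork and
XENMEM_sharing_op_fork_reset) from a standalone C program: fork a paused
parent once, then repeatedly reset the fork between fuzzing iterations. This
is not part of the series; the parent domid is hardcoded as an assumption,
run_one_fuzz_iteration() is a hypothetical placeholder, and device-model
launch ("xl fork-vm --launch-dm late") plus most error handling are omitted.

/* fork_fuzz_harness.c - illustrative sketch only; build against headers
 * that already contain the additions from this patch series. */
#include <stdio.h>
#include <stdint.h>
#include <xentoollog.h>
#include <libxl.h>

/* Hypothetical placeholder for injecting and running one fuzz case
 * inside the fork (e.g. via external introspection tooling). */
static void run_one_fuzz_iteration(uint32_t domid)
{
    (void)domid;
}

int main(void)
{
    xentoollog_logger_stdiostream *logger;
    libxl_ctx *ctx = NULL;
    uint32_t parent = 1;       /* assumption: domid of an existing parent VM */
    uint32_t fork_domid = 0;
    int i, rc;

    logger = xtl_createlogger_stdiostream(stderr, XTL_PROGRESS, 0);
    if (!logger)
        return 1;
    if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, (xentoollog_logger *)logger))
        return 1;

    /* Create the fork: an empty domain that populates/unshares its memory
     * from the parent on demand. The parent gets paused by the hypervisor
     * and must stay paused while forks of it exist. */
    rc = libxl_domain_fork_vm(ctx, parent, &fork_domid);
    if (rc)
        goto out;

    /* No device model is attached here; the fork runs until it would
     * need QEMU. */
    rc = libxl_domain_unpause(ctx, fork_domid, NULL);

    for (i = 0; !rc && i < 1000; i++) {
        run_one_fuzz_iteration(fork_domid);

        /* Shed all memory the fork allocated or unshared and reload its
         * vCPU state from the parent. */
        rc = libxl_domain_fork_reset(ctx, fork_domid);
    }

    libxl_domain_destroy(ctx, fork_domid, 0);
out:
    libxl_ctx_free(ctx);
    xtl_logger_destroy((xentoollog_logger *)logger);
    return rc ? 1 : 0;
}

The equivalent xl-level flow, per the man page addition in patch 5/5, would
be "xl fork-vm -C <config> -Q <qemu-save-file> <parent_domid>" to create a
fork with its device model (or "--launch-dm late" to defer that), and
"xl fork-vm --fork-reset <fork_domid>" to reset it.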