From nobody Fri May 17 05:54:43 2024
From: Andrei Semenov <andrei.semenov@vates.fr>
To: andrei.semenov@vates.fr, xen-devel@lists.xenproject.org
Cc: Wei Liu, Anthony PERARD, Juergen Gross
Subject: [PATCH v2 1/2] live migration: do not use deferred bitmap when inappropriate
Date: Tue, 06 Sep 2022 09:54:23 +0000
Message-Id: <1e7862a0d83c61b7550747591275c38e87d4fbd2.1662457291.git.andrei.semenov@vates.fr>

Use the deferred bitmap only in the PV guest context, as it is not used for
HVM guests. This allows reducing memory pressure on domain0 while migrating
very large (memory-wise) HVM guests.

Signed-off-by: Andrei Semenov
---
 tools/libs/guest/xg_sr_common.h       | 26 ++++++++++++++++--
 tools/libs/guest/xg_sr_save.c         | 23 +++++++---------
 tools/libs/guest/xg_sr_save_x86_hvm.c | 21 +++++++++++++++
 tools/libs/guest/xg_sr_save_x86_pv.c  | 39 +++++++++++++++++++++++++++
 4 files changed, 93 insertions(+), 16 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 36d45ef56f..941e24d7b7 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -96,6 +96,24 @@ struct xc_sr_save_ops
      */
     int (*check_vm_state)(struct xc_sr_context *ctx);

+    /**
+     * For some reason the page can't be sent for the moment. Postpone this
+     * send to a later stage, when the domain is suspended.
+     */
+    int (*defer_page)(struct xc_sr_context *ctx, xen_pfn_t pfn);
+
+    /**
+     * Merge all deferred pages with the dirty pages bitmap (in order to be
+     * sent).
+     */
+    int (*merge_deferred)(const struct xc_sr_context *ctx,
+                          unsigned long *bitmap, unsigned long *count);
+
+    /**
+     * Deferred pages were successfully sent. Reset all associated information.
+     */
+    int (*reset_deferred)(struct xc_sr_context *ctx);
+
     /**
      * Clean up the local environment. Will be called exactly once, either
      * after a successful save, or upon encountering an error.
@@ -243,8 +261,6 @@ struct xc_sr_context

         xen_pfn_t *batch_pfns;
         unsigned int nr_batch_pfns;
-        unsigned long *deferred_pages;
-        unsigned long nr_deferred_pages;
         xc_hypercall_buffer_t dirty_bitmap_hbuf;
     } save;

@@ -349,6 +365,12 @@ struct xc_sr_context

             union
             {
+                struct
+                {
+                    unsigned long *deferred_pages;
+                    unsigned long nr_deferred_pages;
+                } save;
+
                 struct
                 {
                     /* State machine for the order of received records. */
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 9853d8d846..602b18488d 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -132,8 +132,7 @@ static int write_batch(struct xc_sr_context *ctx)
             /* Likely a ballooned page. */
             if ( mfns[i] == INVALID_MFN )
             {
-                set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
-                ++ctx->save.nr_deferred_pages;
+                ctx->save.ops.defer_page(ctx, ctx->save.batch_pfns[i]);
             }
         }

@@ -192,8 +191,7 @@ static int write_batch(struct xc_sr_context *ctx)
             {
                 if ( rc == -1 && errno == EAGAIN )
                 {
-                    set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
-                    ++ctx->save.nr_deferred_pages;
+                    ctx->save.ops.defer_page(ctx, ctx->save.batch_pfns[i]);
                     types[i] = XEN_DOMCTL_PFINFO_XTAB;
                     --nr_pages;
                 }
@@ -641,6 +639,7 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
     xc_interface *xch = ctx->xch;
     xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
     char *progress_str = NULL;
+    unsigned long merged;
     int rc;
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     &ctx->save.dirty_bitmap_hbuf);
@@ -669,7 +668,7 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
     else
         xc_set_progress_prefix(xch, "Checkpointed save");

-    bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
+    ctx->save.ops.merge_deferred(ctx, dirty_bitmap, &merged);

     if ( !ctx->save.live && ctx->stream_type == XC_STREAM_COLO )
     {
@@ -681,12 +680,11 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
         }
     }

-    rc = send_dirty_pages(ctx, stats.dirty_count + ctx->save.nr_deferred_pages);
+    rc = send_dirty_pages(ctx, stats.dirty_count + merged);
     if ( rc )
         goto out;

-    bitmap_clear(ctx->save.deferred_pages, ctx->save.p2m_size);
-    ctx->save.nr_deferred_pages = 0;
+    ctx->save.ops.reset_deferred(ctx);

 out:
     xc_set_progress_prefix(xch, NULL);

@@ -805,18 +803,16 @@ static int setup(struct xc_sr_context *ctx)
         xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
     ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
                                   sizeof(*ctx->save.batch_pfns));
-    ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);

-    if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages )
+    if ( !ctx->save.batch_pfns || !dirty_bitmap )
     {
-        ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
-              " deferred pages");
+        ERROR("Unable to allocate memory for dirty bitmaps, batch pfns");
         rc = -1;
         errno = ENOMEM;
         goto err;
     }

-    rc = 0;
+    rc = ctx->save.ops.reset_deferred(ctx);

 err:
     return rc;
@@ -837,7 +833,6 @@ static void cleanup(struct xc_sr_context *ctx)

     xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
                                    NRPAGES(bitmap_size(ctx->save.p2m_size)));
-    free(ctx->save.deferred_pages);
     free(ctx->save.batch_pfns);
 }

diff --git a/tools/libs/guest/xg_sr_save_x86_hvm.c b/tools/libs/guest/xg_sr_save_x86_hvm.c
index 1634a7bc43..3c762a0af0 100644
--- a/tools/libs/guest/xg_sr_save_x86_hvm.c
+++ b/tools/libs/guest/xg_sr_save_x86_hvm.c
@@ -211,6 +211,24 @@ static int x86_hvm_end_of_checkpoint(struct xc_sr_context *ctx)
     return 0;
 }

+static int x86_hvm_defer_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    return 0;
+}
+
+static int x86_hvm_merge_deferred(const struct xc_sr_context *ctx,
+                                  unsigned long *bitmap, unsigned long *count)
+{
+    *count = 0;
+
+    return 0;
+}
+
+static int x86_hvm_reset_deferred(struct xc_sr_context *ctx)
+{
+    return 0;
+}
+
 static int x86_hvm_cleanup(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
@@ -237,6 +255,9 @@ struct xc_sr_save_ops save_ops_x86_hvm =
     .start_of_checkpoint = x86_hvm_start_of_checkpoint,
     .end_of_checkpoint   = x86_hvm_end_of_checkpoint,
     .check_vm_state      = x86_hvm_check_vm_state,
+    .defer_page          = x86_hvm_defer_page,
+    .merge_deferred      = x86_hvm_merge_deferred,
+    .reset_deferred      = x86_hvm_reset_deferred,
     .cleanup             = x86_hvm_cleanup,
 };

diff --git a/tools/libs/guest/xg_sr_save_x86_pv.c b/tools/libs/guest/xg_sr_save_x86_pv.c
index 4964f1f7b8..5fdc7e9590 100644
--- a/tools/libs/guest/xg_sr_save_x86_pv.c
+++ b/tools/libs/guest/xg_sr_save_x86_pv.c
@@ -1031,6 +1031,7 @@ static int x86_pv_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
  */
 static int x86_pv_setup(struct xc_sr_context *ctx)
 {
+    xc_interface *xch = ctx->xch;
     int rc;

     rc = x86_pv_domain_info(ctx);
@@ -1049,6 +1050,15 @@ static int x86_pv_setup(struct xc_sr_context *ctx)
     if ( rc )
         return rc;

+    ctx->x86.pv.save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
+
+    if ( !ctx->x86.pv.save.deferred_pages )
+    {
+        ERROR("Unable to allocate memory for deferred pages");
+        errno = ENOMEM;
+        return -1;
+    }
+
     return 0;
 }

@@ -1116,9 +1126,35 @@ static int x86_pv_check_vm_state(struct xc_sr_context *ctx)
     return x86_pv_check_vm_state_p2m_list(ctx);
 }

+static int x86_pv_defer_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    set_bit(pfn, ctx->x86.pv.save.deferred_pages);
+    ++ctx->x86.pv.save.nr_deferred_pages;
+
+    return 0;
+}
+
+static int x86_pv_merge_deferred(const struct xc_sr_context *ctx,
+                                 unsigned long *bitmap, unsigned long *count)
+{
+    bitmap_or(bitmap, ctx->x86.pv.save.deferred_pages, ctx->save.p2m_size);
+    *count = ctx->x86.pv.save.nr_deferred_pages;
+
+    return 0;
+}
+
+static int x86_pv_reset_deferred(struct xc_sr_context *ctx)
+{
+    bitmap_clear(ctx->x86.pv.save.deferred_pages, ctx->save.p2m_size);
+    ctx->x86.pv.save.nr_deferred_pages = 0;
+
+    return 0;
+}
+
 static int x86_pv_cleanup(struct xc_sr_context *ctx)
 {
     free(ctx->x86.pv.p2m_pfns);
+    free(ctx->x86.pv.save.deferred_pages);

     if ( ctx->x86.pv.p2m )
         munmap(ctx->x86.pv.p2m, ctx->x86.pv.p2m_frames * PAGE_SIZE);
@@ -1142,6 +1178,9 @@ struct xc_sr_save_ops save_ops_x86_pv =
     .start_of_checkpoint = x86_pv_start_of_checkpoint,
     .end_of_checkpoint   = x86_pv_end_of_checkpoint,
     .check_vm_state      = x86_pv_check_vm_state,
+    .defer_page          = x86_pv_defer_page,
+    .merge_deferred      = x86_pv_merge_deferred,
+    .reset_deferred      = x86_pv_reset_deferred,
     .cleanup             = x86_pv_cleanup,
 };

-- 
2.34.1

Andrei Semenov | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
w: vates.fr | xcp-ng.org | xen-orchestra.com

From nobody Fri May 17 05:54:43 2024
From: Andrei Semenov <andrei.semenov@vates.fr>
To: andrei.semenov@vates.fr, xen-devel@lists.xenproject.org
Cc: Wei Liu, Anthony PERARD, Juergen Gross
Subject: [PATCH v2 2/2] live migration: use superpages for physmap population on restore when possible
Date: Tue, 06 Sep 2022 09:54:25 +0000
Message-Id: <657d6dad39f4ab87569470c94afb4cc6d005e829.1662457291.git.andrei.semenov@vates.fr>

Implement a heuristic for x86 HVM guests which tries to use superpages while
populating the guest physmap on live migration. This should improve memory
access performance for these guests.
Signed-off-by: Andrei Semenov
---
 tools/include/xen-tools/libs.h           |  4 ++
 tools/libs/guest/xg_private.h            |  3 +
 tools/libs/guest/xg_sr_common.h          | 18 ++++-
 tools/libs/guest/xg_sr_restore.c         | 60 +++++++---------
 tools/libs/guest/xg_sr_restore_x86_hvm.c | 88 +++++++++++++++++++++++-
 tools/libs/guest/xg_sr_restore_x86_pv.c  | 22 +++++-
 6 files changed, 154 insertions(+), 41 deletions(-)

diff --git a/tools/include/xen-tools/libs.h b/tools/include/xen-tools/libs.h
index a16e0c3807..bdd903eb7b 100644
--- a/tools/include/xen-tools/libs.h
+++ b/tools/include/xen-tools/libs.h
@@ -63,4 +63,8 @@
 #define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
 #endif

+#ifndef ROUNDDOWN
+#define ROUNDDOWN(_x,_w) ((unsigned long)(_x) & (-1UL << (_w)))
+#endif
+
 #endif /* __XEN_TOOLS_LIBS__ */
diff --git a/tools/libs/guest/xg_private.h b/tools/libs/guest/xg_private.h
index 09e24f1227..dcf63b5188 100644
--- a/tools/libs/guest/xg_private.h
+++ b/tools/libs/guest/xg_private.h
@@ -134,6 +134,9 @@ typedef uint64_t x86_pgentry_t;
 #define PAGE_SIZE_X86 (1UL << PAGE_SHIFT_X86)
 #define PAGE_MASK_X86 (~(PAGE_SIZE_X86-1))

+#define S_PAGE_1GB_ORDER 18
+#define S_PAGE_2MB_ORDER 9
+
 #define NRPAGES(x) (ROUNDUP(x, PAGE_SHIFT) >> PAGE_SHIFT)

 static inline xen_pfn_t xc_pfn_to_mfn(xen_pfn_t pfn, xen_pfn_t *p2m,
diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 941e24d7b7..96365e05a8 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -137,7 +137,8 @@ struct xc_sr_restore_ops
     bool (*pfn_is_valid)(const struct xc_sr_context *ctx, xen_pfn_t pfn);

     /* Set the GFN of a PFN. */
-    void (*set_gfn)(struct xc_sr_context *ctx, xen_pfn_t pfn, xen_pfn_t gfn);
+    void (*set_gfn)(struct xc_sr_context *ctx, xen_pfn_t pfn, xen_pfn_t gfn,
+                    unsigned int order);

     /* Set the type of a PFN.
      */
     void (*set_page_type)(struct xc_sr_context *ctx, xen_pfn_t pfn,
@@ -175,6 +176,17 @@ struct xc_sr_restore_ops
 #define BROKEN_CHANNEL 2
     int (*process_record)(struct xc_sr_context *ctx, struct xc_sr_record *rec);

+    /**
+     * Guest physmap population order is based on a heuristic which is family
+     * dependent. The x86 HVM heuristic is interested in observing the whole
+     * record (the first one) in order to guess how the physmap should be
+     * populated.
+     */
+    void (*guess_physmap)(struct xc_sr_context *ctx, unsigned int count,
+                          const xen_pfn_t *pfns, const uint32_t *types);
+
+    /* Get the physmap population order for a given PFN. */
+    int (*get_physmap_order)(const struct xc_sr_context *ctx, xen_pfn_t pfn);
+
     /**
      * Perform any actions required after the static data has arrived.  Called
      * when the STATIC_DATA_COMPLETE record has been received/inferred.
@@ -404,6 +416,10 @@ struct xc_sr_context
                 {
                     /* HVM context blob. */
                     struct xc_sr_blob context;
+
+                    /* Set guest type (based on the first record). */
+                    bool set_guest_type;
+                    bool pvh_guest;
                 } restore;
             };
         } hvm;
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 074b56d263..af864bd5ea 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -86,18 +86,21 @@ static bool pfn_is_populated(const struct xc_sr_context *ctx, xen_pfn_t pfn)
  * avoid realloc()ing too excessively, the size increased to the nearest power
  * of two large enough to contain the required pfn.
  */
-static int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+static int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn,
+                             unsigned int order)
 {
     xc_interface *xch = ctx->xch;
+    xen_pfn_t start_pfn = ROUNDDOWN(pfn, order),
+        end_pfn = (ROUNDUP(pfn + 1, order) - 1);

-    if ( pfn > ctx->restore.max_populated_pfn )
+    if ( end_pfn > ctx->restore.max_populated_pfn )
     {
         xen_pfn_t new_max;
         size_t old_sz, new_sz;
         unsigned long *p;

         /* Round up to the nearest power of two larger than pfn, less 1. */
-        new_max = pfn;
+        new_max = end_pfn;
         new_max |= new_max >> 1;
         new_max |= new_max >> 2;
         new_max |= new_max >> 4;
@@ -123,8 +126,11 @@ static int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
         ctx->restore.max_populated_pfn = new_max;
     }

-    assert(!test_bit(pfn, ctx->restore.populated_pfns));
-    set_bit(pfn, ctx->restore.populated_pfns);
+    for ( pfn = start_pfn; pfn <= end_pfn; ++pfn )
+    {
+        assert(!test_bit(pfn, ctx->restore.populated_pfns));
+        set_bit(pfn, ctx->restore.populated_pfns);
+    }

     return 0;
 }
@@ -138,60 +144,40 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
                   const xen_pfn_t *original_pfns, const uint32_t *types)
 {
     xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
-        *pfns = malloc(count * sizeof(*pfns));
-    unsigned int i, nr_pfns = 0;
+    xen_pfn_t mfn, pfn;
+    unsigned int i, order;
     int rc = -1;

-    if ( !mfns || !pfns )
-    {
-        ERROR("Failed to allocate %zu bytes for populating the physmap",
-              2 * count * sizeof(*mfns));
-        goto err;
-    }
+    /* Feed this record to the family-dependent heuristic guessing the physmap. */
+    ctx->restore.ops.guess_physmap(ctx, count, original_pfns, types);

     for ( i = 0; i < count; ++i )
     {
         if ( (!types || page_type_to_populate(types[i])) &&
              !pfn_is_populated(ctx, original_pfns[i]) )
         {
-            rc = pfn_set_populated(ctx, original_pfns[i]);
+            order = ctx->restore.ops.get_physmap_order(ctx, original_pfns[i]);
+            rc = pfn_set_populated(ctx, original_pfns[i], order);
             if ( rc )
                 goto err;
-            pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
-            ++nr_pfns;
-        }
-    }
-
-    if ( nr_pfns )
-    {
-        rc = xc_domain_populate_physmap_exact(
-            xch, ctx->domid, nr_pfns, 0, 0, mfns);
-        if ( rc )
-        {
-            PERROR("Failed to populate physmap");
-            goto err;
-        }

-        for ( i = 0; i < nr_pfns; ++i )
-        {
-            if ( mfns[i] == INVALID_MFN )
+            pfn = mfn = ROUNDDOWN(original_pfns[i], order);
+            rc = xc_domain_populate_physmap_exact(xch, ctx->domid, 1, order, 0,
+                                                  &mfn);
+            if ( rc || (mfn == INVALID_MFN) )
             {
-                ERROR("Populate physmap failed for pfn %u", i);
+                ERROR("Failed to populate physmap for pfn %lu (%u)", pfn, order);
                 rc = -1;
                 goto err;
             }

-            ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
+            ctx->restore.ops.set_gfn(ctx, pfn, mfn, order);
         }
     }

     rc = 0;

 err:
-    free(pfns);
-    free(mfns);
-
     return rc;
 }
diff --git a/tools/libs/guest/xg_sr_restore_x86_hvm.c b/tools/libs/guest/xg_sr_restore_x86_hvm.c
index d6ea6f3012..2e525443ab 100644
--- a/tools/libs/guest/xg_sr_restore_x86_hvm.c
+++ b/tools/libs/guest/xg_sr_restore_x86_hvm.c
@@ -110,7 +110,7 @@ static xen_pfn_t x86_hvm_pfn_to_gfn(const struct xc_sr_context *ctx,

 /* restore_ops function. */
 static void x86_hvm_set_gfn(struct xc_sr_context *ctx, xen_pfn_t pfn,
-                            xen_pfn_t gfn)
+                            xen_pfn_t gfn, unsigned int order)
 {
     /* no op */
 }
@@ -161,6 +161,8 @@ static int x86_hvm_setup(struct xc_sr_context *ctx)
 }
 #endif

+    ctx->x86.hvm.restore.set_guest_type = true;
+
     return 0;
 }

@@ -192,6 +194,88 @@ static int x86_hvm_process_record(struct xc_sr_context *ctx,
     }
 }

+/*
+ * We consider that a PVH guest physmap starts from 0 and contiguously covers
+ * the physical memory space for the first GB of memory.  An HVM guest will
+ * have I/O holes in the first 2MB of memory space (at least for VGA).
+ * Therefore we should observe the very first record (which comes in physmap
+ * order) to find out how we should map this first GB.
+ * To map the rest of the memory space in both cases (PVH or HVM) we will use
+ * the maximum available order (up to 1GB), except for the fourth GB, which
+ * holds the low MMIO hole (at least for the LAPIC MMIO window and for
+ * potential passed-through or emulated PCI device BARs).
+ */
+static void x86_hvm_guess_physmap(struct xc_sr_context *ctx, unsigned int count,
+                                  const xen_pfn_t *pfns, const uint32_t *types)
+{
+    xen_pfn_t prev;
+    unsigned int i;
+
+    if ( !ctx->x86.hvm.restore.set_guest_type )
+        return;
+
+    for ( i = 0, prev = INVALID_PFN; i < count; ++i )
+    {
+        if ( !types || page_type_to_populate(types[i]) )
+        {
+            if ( prev == INVALID_PFN )
+            {
+                if ( pfns[i] != 0 )
+                    break;
+            }
+            else
+            {
+                if ( pfns[i] != (prev + 1) )
+                    break;
+            }
+            prev = pfns[i];
+        }
+    }
+
+    ctx->x86.hvm.restore.pvh_guest = (i == count);
+    ctx->x86.hvm.restore.set_guest_type = false;
+}
+
+/*
+ * Choose the physmap population order for a given PFN, clamping the order
+ * where a superpage would run past the end of the physmap.
+ */
+static int x86_hvm_get_physmap_order(const struct xc_sr_context *ctx,
+                                     xen_pfn_t pfn)
+{
+    int order;
+
+    if ( pfn >= ctx->restore.p2m_size )
+        return 0;
+
+    switch ( pfn >> S_PAGE_1GB_ORDER )
+    {
+    case 3:
+        /* The fourth GB of memory is mapped with 2MB superpages. */
+        order = S_PAGE_2MB_ORDER;
+        break;
+    case 0:
+        if ( !ctx->x86.hvm.restore.pvh_guest )
+        {
+            /* The first 2MB is mapped as 4K pages for an HVM guest. */
+            order = (pfn > 0x1ff) ? S_PAGE_2MB_ORDER : 0;
+            break;
+        }
+        /* Fallthrough. */
+    default:
+        order = S_PAGE_1GB_ORDER;
+    }
+
+    if ( ((ROUNDUP(pfn + 1, S_PAGE_1GB_ORDER) - 1) >= ctx->restore.p2m_size) &&
+         order == S_PAGE_1GB_ORDER )
+        order = S_PAGE_2MB_ORDER;
+
+    if ( ((ROUNDUP(pfn + 1, S_PAGE_2MB_ORDER) - 1) >= ctx->restore.p2m_size) &&
+         order == S_PAGE_2MB_ORDER )
+        order = 0;
+
+    return order;
+}
+
 /*
  * restore_ops function.  Sets extra hvm parameters and seeds the grant table.
  */
@@ -258,6 +342,8 @@ struct xc_sr_restore_ops restore_ops_x86_hvm =
     .localise_page        = x86_hvm_localise_page,
     .setup                = x86_hvm_setup,
     .process_record       = x86_hvm_process_record,
+    .guess_physmap        = x86_hvm_guess_physmap,
+    .get_physmap_order    = x86_hvm_get_physmap_order,
     .static_data_complete = x86_static_data_complete,
     .stream_complete      = x86_hvm_stream_complete,
     .cleanup              = x86_hvm_cleanup,
diff --git a/tools/libs/guest/xg_sr_restore_x86_pv.c b/tools/libs/guest/xg_sr_restore_x86_pv.c
index dc50b0f5a8..f8545f941a 100644
--- a/tools/libs/guest/xg_sr_restore_x86_pv.c
+++ b/tools/libs/guest/xg_sr_restore_x86_pv.c
@@ -59,7 +59,7 @@ static int expand_p2m(struct xc_sr_context *ctx, unsigned long max_pfn)
     ctx->x86.pv.max_pfn = max_pfn;
     for ( i = (old_max ? old_max + 1 : 0); i <= max_pfn; ++i )
     {
-        ctx->restore.ops.set_gfn(ctx, i, INVALID_MFN);
+        ctx->restore.ops.set_gfn(ctx, i, INVALID_MFN, 0);
         ctx->restore.ops.set_page_type(ctx, i, 0);
     }

@@ -947,9 +947,10 @@ static void x86_pv_set_page_type(struct xc_sr_context *ctx, xen_pfn_t pfn,

 /* restore_ops function. */
 static void x86_pv_set_gfn(struct xc_sr_context *ctx, xen_pfn_t pfn,
-                           xen_pfn_t mfn)
+                           xen_pfn_t mfn, unsigned int order)
 {
     assert(pfn <= ctx->x86.pv.max_pfn);
+    assert(!order);

     if ( ctx->x86.pv.width == sizeof(uint64_t) )
         /* 64 bit guest.  Need to expand INVALID_MFN for 32 bit toolstacks. */
@@ -1113,6 +1114,21 @@ static int x86_pv_process_record(struct xc_sr_context *ctx,
     }
 }

+/*
+ * There's no reliable heuristic which can predict the PV guest physmap.
+ * Therefore order 0 will always be used.
+ */
+static void x86_pv_guess_physmap(struct xc_sr_context *ctx, unsigned int count,
+                                 const xen_pfn_t *pfns, const uint32_t *types)
+{
+}
+
+static int x86_pv_get_physmap_order(const struct xc_sr_context *ctx,
+                                    xen_pfn_t pfn)
+{
+    return 0;
+}
+
 /*
  * restore_ops function.  Update the vcpu context in Xen, pin the pagetables,
  * rewrite the p2m and seed the grant table.
  */
@@ -1194,6 +1210,8 @@ struct xc_sr_restore_ops restore_ops_x86_pv =
     .localise_page        = x86_pv_localise_page,
     .setup                = x86_pv_setup,
     .process_record       = x86_pv_process_record,
+    .guess_physmap        = x86_pv_guess_physmap,
+    .get_physmap_order    = x86_pv_get_physmap_order,
     .static_data_complete = x86_static_data_complete,
     .stream_complete      = x86_pv_stream_complete,
     .cleanup              = x86_pv_cleanup,
 };

-- 
2.34.1

Andrei Semenov | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
w: vates.fr | xcp-ng.org | xen-orchestra.com