From nobody Mon Feb 9 11:32:31 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=reject dis=none) header.from=citrix.com ARC-Seal: i=1; a=rsa-sha256; t=1612222090; cv=none; d=zohomail.com; s=zohoarc; b=I4TCA2lIRoMR8KewHduxUAntZ0WWD3tSMSplVT95pnl/83o9ipjPjEvIPOissRkvl8Ut6YELBXZm2DheLl4AYFzWgnPDiLccgvVPHsC6ndJIAYrzAbqkvSA6oxEkTGNDC5NtbLKCLfqfxpt1gUMMJplu2KgE391qbXrKWMcesIQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1612222090; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=6XHlV8fRq0ZhgLVtu4Ru3v9j/liBZs/SeVcg/XmBqrs=; b=f4MzruLp7MKTeNdBqdkw4qqJYxQ1EFP3F+qHuf9p+tjjtRsBQSIEHKVK7sSTs3XFHY7g8YAiDfxyM3Hl2+adAwBh274Id69bLtLhXyBfX9h1Syvdn2vudaFMEE8TBJcR2KNjmEZeovEzYE+FNHKa+iI45k3hks6VLRe40HWb3Ws= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=reject dis=none) header.from= Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1612222090750925.20591083377; Mon, 1 Feb 2021 15:28:10 -0800 (PST) Received: from list by lists.xenproject.org with outflank-mailman.80196.146578 (Exim 4.92) (envelope-from ) id 1l6ibq-0007tk-98; Mon, 01 Feb 2021 23:27:54 +0000 Received: by outflank-mailman (output) from mailman id 80196.146578; Mon, 01 Feb 2021 23:27:54 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1l6ibq-0007tY-4m; Mon, 01 Feb 2021 23:27:54 +0000 Received: by outflank-mailman (input) for mailman id 80196; Mon, 01 Feb 2021 23:27:52 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1l6ibo-0007P6-RR for xen-devel@lists.xenproject.org; Mon, 01 Feb 2021 23:27:52 +0000 Received: from esa3.hc3370-68.iphmx.com (unknown [216.71.145.155]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id 4974a30f-43f4-47f0-bfa7-ad5966e08e58; Mon, 01 Feb 2021 23:27:35 +0000 (UTC) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 4974a30f-43f4-47f0-bfa7-ad5966e08e58 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=citrix.com; s=securemail; t=1612222055; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ooOeFTFX01JiGDLs2zOj6QNxJzo/EvnmXjy1QpJOiP4=; b=a+NT2DtUex+pSs0j0IktlIAZRKnsKiiEVDrDBiSiBjUQRJkMY1VS7Toz upWRHN2MSqbcsR8p6Z3l67d05V4qo/zQ6wCb2T5YAc7NuxR88ebW0UQQR MwRBLXl/njKInQefWZca0Xf7VE2DKx7/Ci6yDNqwejWJ6znqEGi36vImI s=; Authentication-Results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none IronPort-SDR: GkpqpvXlBS0xQRr0Iymb1U4xQHvVP8qukJrzdmUmf5j2EtFLx/ZR8st79ap0XebBopMenwStB5 YK4Nc2TFUMM7caQwQ9KmRudisXuuPmgAMGDyWHmyj4kR0TI2fncs39KX6OYO15Myz6g4Uukf5D j+IW0IFwoqJ1NWMxWZqsHsCLZJueQJr3k4gV+0BIfQCpahOKXaVIujqLCKUAeSe9xR9eL4g8kO vNQXK/w3Hil2eZd/f0n2wixXZ9LeuawymIziKwkEdS1rUXuUemWdeYzeJvtVaiVWhIP0bH/HY1 Leo= X-SBRS: 5.1 X-MesageID: 36319806 X-Ironport-Server: esa3.hc3370-68.iphmx.com X-Remote-IP: 162.221.158.21 X-Policy: $RELAYED X-IronPort-AV: E=Sophos;i="5.79,393,1602561600"; d="scan'208";a="36319806" From: Andrew Cooper To: Xen-devel CC: Andrew Cooper , George Dunlap , Ian Jackson , Jan Beulich , Stefano Stabellini , Wei Liu , Julien Grall , Paul Durrant , =?UTF-8?q?Micha=C5=82=20Leszczy=C5=84ski?= , Hubert Jasudowicz , Tamas K Lengyel Subject: [PATCH v9 01/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource Date: Mon, 1 Feb 2021 23:26:53 +0000 Message-ID: <20210201232703.29275-2-andrew.cooper3@citrix.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20210201232703.29275-1-andrew.cooper3@citrix.com> References: <20210201232703.29275-1-andrew.cooper3@citrix.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @citrix.com) A guest's default number of grant frames is 64, and XENMEM_acquire_resource will reject an attempt to map more than 32 frames. This limit is caused by the size of mfn_list[] on the stack. Fix mapping of arbitrary size requests by looping over batches of 32 in acquire_resource(), and using hypercall continuations when necessary. To start with, break _acquire_resource() out of acquire_resource() to cope with type-specific dispatching, and update the return semantics to indicate the number of mfns returned. Update gnttab_acquire_resource() and x86's arch_acquire_resource() to match these new semantics. Have do_memory_op() pass start_extent into acquire_resource() so it can pick up where it left off after a continuation, and loop over batches of 32 until all the work is done, or a continuation needs to occur. compat_memory_op() is a bit more complicated, because it also has to marshal frame_list in the XLAT buffer. Have it account for continuation information itself and hide details from the upper layer, so it can marshal the buffer = in chunks if necessary. With these fixes in place, it is now possible to map the whole grant table = for a guest. Signed-off-by: Andrew Cooper Reviewed-by: Roger Pau Monn=C3=A9 --- CC: George Dunlap CC: Ian Jackson CC: Jan Beulich CC: Stefano Stabellini CC: Wei Liu CC: Julien Grall CC: Paul Durrant CC: Micha=C5=82 Leszczy=C5=84ski CC: Hubert Jasudowicz CC: Tamas K Lengyel v9: * Crash domain rather than returning late with -ERANGE/-EFAULT. v8: * nat =3D> cmp change in the start_extent check. * Rebase over 'frame' and ARM/IOREQ series. v3: * Spelling fixes --- xen/common/compat/memory.c | 114 +++++++++++++++++++++++++++++++++-------- xen/common/grant_table.c | 3 ++ xen/common/memory.c | 124 +++++++++++++++++++++++++++++++++--------= ---- 3 files changed, 187 insertions(+), 54 deletions(-) diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c index 834c5e19d1..c43fa97cf1 100644 --- a/xen/common/compat/memory.c +++ b/xen/common/compat/memory.c @@ -402,23 +402,10 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HAND= LE_PARAM(void) compat) case XENMEM_acquire_resource: { xen_pfn_t *xen_frame_list =3D NULL; - unsigned int max_nr_frames; =20 if ( copy_from_guest(&cmp.mar, compat, 1) ) return -EFAULT; =20 - /* - * The number of frames handled is currently limited to a - * small number by the underlying implementation, so the - * scratch space should be sufficient for bouncing the - * frame addresses. - */ - max_nr_frames =3D (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) / - sizeof(*xen_frame_list); - - if ( cmp.mar.nr_frames > max_nr_frames ) - return -E2BIG; - /* Marshal the frame list in the remainder of the xlat space. = */ if ( !compat_handle_is_null(cmp.mar.frame_list) ) xen_frame_list =3D (xen_pfn_t *)(nat.mar + 1); @@ -432,6 +419,28 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDL= E_PARAM(void) compat) =20 if ( xen_frame_list && cmp.mar.nr_frames ) { + unsigned int xlat_max_frames =3D + (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) / + sizeof(*xen_frame_list); + + if ( start_extent >=3D cmp.mar.nr_frames ) + return -EINVAL; + + /* + * Adjust nat to account for work done on previous + * continuations, leaving cmp pristine. Hide the continau= tion + * from the native code to prevent double accounting. + */ + nat.mar->nr_frames -=3D start_extent; + nat.mar->frame +=3D start_extent; + cmd &=3D MEMOP_CMD_MASK; + + /* + * If there are two many frames to fit within the xlat buf= fer, + * we'll need to loop to marshal them all. + */ + nat.mar->nr_frames =3D min(nat.mar->nr_frames, xlat_max_fr= ames); + /* * frame_list is an input for translated guests, and an ou= tput * for untranslated guests. Only copy in for translated g= uests. @@ -444,14 +453,14 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HAND= LE_PARAM(void) compat) cmp.mar.nr_frames) || __copy_from_compat_offset( compat_frame_list, cmp.mar.frame_list, - 0, cmp.mar.nr_frames) ) + start_extent, nat.mar->nr_frames) ) return -EFAULT; =20 /* * Iterate backwards over compat_frame_list[] expanding * compat_pfn_t to xen_pfn_t in place. */ - for ( int x =3D cmp.mar.nr_frames - 1; x >=3D 0; --x ) + for ( int x =3D nat.mar->nr_frames - 1; x >=3D 0; --x ) xen_frame_list[x] =3D compat_frame_list[x]; } } @@ -600,9 +609,11 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDL= E_PARAM(void) compat) case XENMEM_acquire_resource: { DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t); + unsigned int done; =20 if ( compat_handle_is_null(cmp.mar.frame_list) ) { + ASSERT(split =3D=3D 0 && rc =3D=3D 0); if ( __copy_field_to_guest( guest_handle_cast(compat, compat_mem_acquire_resource_t), @@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDL= E_PARAM(void) compat) break; } =20 + if ( split < 0 ) + { + /* Continuation occurred. */ + ASSERT(rc !=3D XENMEM_acquire_resource); + done =3D cmd >> MEMOP_EXTENT_SHIFT; + } + else + { + /* No continuation. */ + ASSERT(rc =3D=3D 0); + done =3D nat.mar->nr_frames; + } + + ASSERT(done <=3D nat.mar->nr_frames); + /* * frame_list is an input for translated guests, and an output= for * untranslated guests. Only copy out for untranslated guests. @@ -626,21 +652,67 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HAND= LE_PARAM(void) compat) */ BUILD_BUG_ON(sizeof(compat_pfn_t) > sizeof(xen_pfn_t)); =20 - for ( i =3D 0; i < cmp.mar.nr_frames; i++ ) + rc =3D 0; + for ( i =3D 0; i < done; i++ ) { compat_pfn_t frame =3D xen_frame_list[i]; =20 if ( frame !=3D xen_frame_list[i] ) - return -ERANGE; + { + rc =3D -ERANGE; + break; + } =20 compat_frame_list[i] =3D frame; } =20 - if ( __copy_to_compat_offset(cmp.mar.frame_list, 0, - compat_frame_list, - cmp.mar.nr_frames) ) - return -EFAULT; + if ( !rc && __copy_to_compat_offset( + cmp.mar.frame_list, start_extent, + compat_frame_list, done) ) + rc =3D -EFAULT; + + if ( rc ) + { + if ( split < 0 ) + { + gdprintk(XENLOG_ERR, + "Cannot cancel continuation: %ld\n", rc); + domain_crash(current->domain); + } + return rc; + } + } + + start_extent +=3D done; + + /* Completely done. */ + if ( start_extent =3D=3D cmp.mar.nr_frames ) + break; + + /* + * Done a "full" batch, but we were limited by space in the xl= at + * area. Go around the loop again without necesserily returni= ng + * to guest context. + */ + if ( done =3D=3D nat.mar->nr_frames ) + { + split =3D 1; + break; } + + /* Explicit continuation request from a higher level. */ + if ( done < nat.mar->nr_frames ) + return hypercall_create_continuation( + __HYPERVISOR_memory_op, "ih", + op | (start_extent << MEMOP_EXTENT_SHIFT), compat); + + /* + * Well... Somethings gone wrong with the two levels of chunki= ng. + * My condolences to whomever next has to debug this mess. + */ + ASSERT_UNREACHABLE(); + domain_crash(current->domain); + split =3D 0; break; } =20 diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index 280b7969b6..b95403695f 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -4053,6 +4053,9 @@ int gnttab_acquire_resource( for ( i =3D 0; i < nr_frames; ++i ) mfn_list[i] =3D virt_to_mfn(vaddrs[frame + i]); =20 + /* Success. Passed nr_frames back to the caller. */ + rc =3D nr_frames; + out: grant_write_unlock(gt); =20 diff --git a/xen/common/memory.c b/xen/common/memory.c index 01cab7e493..128718b31c 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -1118,23 +1118,41 @@ static int acquire_ioreq_server(struct domain *d, mfn_list[i] =3D mfn_x(mfn); } =20 - return 0; + /* Success. Passed nr_frames back to the caller. */ + return nr_frames; #else return -EOPNOTSUPP; #endif } =20 +/* + * Returns -errno on error, or positive in the range [1, nr_frames] on + * success. Returning less than nr_frames contitutes a request for a + * continuation. Callers can depend on frame + nr_frames not overflowing. + */ +static int _acquire_resource( + struct domain *d, unsigned int type, unsigned int id, unsigned int fra= me, + unsigned int nr_frames, xen_pfn_t mfn_list[]) +{ + switch ( type ) + { + case XENMEM_resource_grant_table: + return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list); + + case XENMEM_resource_ioreq_server: + return acquire_ioreq_server(d, id, frame, nr_frames, mfn_list); + + default: + return -EOPNOTSUPP; + } +} + static int acquire_resource( - XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg) + XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg, + unsigned long start_extent) { struct domain *d, *currd =3D current->domain; xen_mem_acquire_resource_t xmar; - /* - * The mfn_list and gfn_list (below) arrays are ok on stack for the - * moment since they are small, but if they need to grow in future - * use-cases then per-CPU arrays or heap allocations may be required. - */ - xen_pfn_t mfn_list[32]; unsigned int max_frames; int rc; =20 @@ -1147,9 +1165,6 @@ static int acquire_resource( if ( xmar.pad !=3D 0 ) return -EINVAL; =20 - if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) ) - return -E2BIG; - /* * The ABI is rather unfortunate. nr_frames (and therefore the total = size * of the resource) is 32bit, while frame (the offset within the resou= rce @@ -1179,7 +1194,7 @@ static int acquire_resource( =20 if ( guest_handle_is_null(xmar.frame_list) ) { - if ( xmar.nr_frames ) + if ( xmar.nr_frames || start_extent ) goto out; =20 xmar.nr_frames =3D max_frames; @@ -1187,30 +1202,47 @@ static int acquire_resource( goto out; } =20 - do { - switch ( xmar.type ) - { - case XENMEM_resource_grant_table: - rc =3D gnttab_acquire_resource(d, xmar.id, xmar.frame, xmar.nr= _frames, - mfn_list); - break; + /* + * Limiting nr_frames at (UINT_MAX >> MEMOP_EXTENT_SHIFT) isn't ideal.= If + * it ever becomes a practical problem, we can switch to mutating + * xmar.{frame,nr_frames,frame_list} in guest memory. + */ + rc =3D -EINVAL; + if ( start_extent >=3D xmar.nr_frames || + xmar.nr_frames > (UINT_MAX >> MEMOP_EXTENT_SHIFT) ) + goto out; =20 - case XENMEM_resource_ioreq_server: - rc =3D acquire_ioreq_server(d, xmar.id, xmar.frame, xmar.nr_fr= ames, - mfn_list); - break; + /* Adjust for work done on previous continuations. */ + xmar.nr_frames -=3D start_extent; + xmar.frame +=3D start_extent; + guest_handle_add_offset(xmar.frame_list, start_extent); =20 - default: - rc =3D -EOPNOTSUPP; - break; - } + do { + /* + * Arbitrary size. Not too much stack space, and a reasonable str= ide + * for continuation checks. + */ + xen_pfn_t mfn_list[32]; + unsigned int todo =3D MIN(ARRAY_SIZE(mfn_list), xmar.nr_frames), d= one; =20 - if ( rc ) + rc =3D _acquire_resource(d, xmar.type, xmar.id, xmar.frame, + todo, mfn_list); + if ( rc < 0 ) + goto out; + + done =3D rc; + rc =3D 0; + if ( done =3D=3D 0 || done > todo ) + { + ASSERT_UNREACHABLE(); + rc =3D -EINVAL; goto out; + } =20 + /* Adjust guest frame_list appropriately. */ if ( !paging_mode_translate(currd) ) { - if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) ) + if ( copy_to_guest(xmar.frame_list, mfn_list, done) ) rc =3D -EFAULT; } else @@ -1218,10 +1250,10 @@ static int acquire_resource( xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)]; unsigned int i; =20 - if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames= ) ) + if ( copy_from_guest(gfn_list, xmar.frame_list, done) ) rc =3D -EFAULT; =20 - for ( i =3D 0; !rc && i < xmar.nr_frames; i++ ) + for ( i =3D 0; !rc && i < done; i++ ) { rc =3D set_foreign_p2m_entry(currd, d, gfn_list[i], _mfn(mfn_list[i])); @@ -1230,7 +1262,32 @@ static int acquire_resource( rc =3D -EIO; } } - } while ( 0 ); + + if ( rc ) + goto out; + + xmar.nr_frames -=3D done; + xmar.frame +=3D done; + guest_handle_add_offset(xmar.frame_list, done); + start_extent +=3D done; + + /* + * Explicit continuation request from _acquire_resource(), or we've + * still got more work to do. + */ + if ( done < todo || + (xmar.nr_frames && hypercall_preempt_check()) ) + { + rc =3D hypercall_create_continuation( + __HYPERVISOR_memory_op, "lh", + XENMEM_acquire_resource | (start_extent << MEMOP_EXTENT_SH= IFT), + arg); + goto out; + } + + } while ( xmar.nr_frames ); + + rc =3D 0; =20 out: rcu_unlock_domain(d); @@ -1697,7 +1754,8 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE= _PARAM(void) arg) =20 case XENMEM_acquire_resource: rc =3D acquire_resource( - guest_handle_cast(arg, xen_mem_acquire_resource_t)); + guest_handle_cast(arg, xen_mem_acquire_resource_t), + start_extent); break; =20 default: --=20 2.11.0