From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Cc: Tamas K Lengyel, Jan Beulich, Andrew Cooper, George Dunlap, Roger Pau Monné, Wei Liu
Subject: [PATCH v2] x86/mem_sharing:
Resolve mm-lock order violations when forking VMs with nested p2m
Date: Thu, 7 Jan 2021 05:23:03 -0800
Message-Id: <253be1190a5cdc452611b3741d852d1c7d2bc8d4.1610025394.git.tamas.lengyel@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Several lock-order violations have been encountered while attempting to fork
VMs with nestedhvm=1 set. This patch resolves the issues.

The order violations stem from a call to p2m_flush_nestedp2m being performed
whenever the hostp2m changes. This function always takes the p2m lock for the
nested_p2m. However, with sharing the p2m locks always have to be taken before
the sharing lock. To resolve this issue we avoid taking the sharing lock where
possible (it was actually unnecessary to begin with), but we also make
p2m_flush_nestedp2m aware that the p2m lock may have already been taken, and
preemptively take all nested_p2m locks before unsharing a page where taking
the sharing lock is necessary.

Signed-off-by: Tamas K Lengyel
Acked-by: Jan Beulich
---
v2: cosmetic fixes
---
 xen/arch/x86/mm/mem_sharing.c | 44 ++++++++++++++++++++++++-----------
 xen/arch/x86/mm/p2m.c         | 12 ++++++++--
 2 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index fc7b2a4102..ebce11e0e0 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include <asm/hvm/nestedhvm.h>
 #include
 
 #include
@@ -893,13 +894,11 @@ static int nominate_page(struct domain *d, gfn_t gfn,
         goto out;
 
     /*
-     * Now that the page is validated, we can lock it. There is no
-     * race because we're holding the p2m entry, so no one else
-     * could be nominating this gfn.
+     * Now that the page is validated, we can make it shared. There is no race
+     * because we're holding the p2m entry, so no one else could be nominating
+     * this gfn, and it is evidently not yet shared with any other VM, thus we
+     * don't need to take the mem_sharing_page_lock here.
      */
-    ret = -ENOENT;
-    if ( !mem_sharing_page_lock(page) )
-        goto out;
 
     /* Initialize the shared state */
     ret = -ENOMEM;
@@ -935,7 +934,6 @@ static int nominate_page(struct domain *d, gfn_t gfn,
 
     *phandle = page->sharing->handle;
     audit_add_list(page);
-    mem_sharing_page_unlock(page);
     ret = 0;
 
  out:
@@ -1214,7 +1212,8 @@ int __mem_sharing_unshare_page(struct domain *d,
     p2m_type_t p2mt;
     mfn_t mfn;
     struct page_info *page, *old_page;
-    int last_gfn;
+    bool last_gfn;
+    int rc = 0;
     gfn_info_t *gfn_info = NULL;
 
     mfn = get_gfn(d, gfn, &p2mt);
@@ -1226,6 +1225,15 @@ int __mem_sharing_unshare_page(struct domain *d,
         return 0;
     }
 
+    /* lock nested p2ms to avoid lock-order violation with sharing lock */
+    if ( unlikely(nestedhvm_enabled(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_NESTEDP2M; i++ )
+            p2m_lock(d->arch.nested_p2m[i]);
+    }
+
     page = __grab_shared_page(mfn);
     if ( page == NULL )
     {
@@ -1276,9 +1284,7 @@ int __mem_sharing_unshare_page(struct domain *d,
         put_page_alloc_ref(page);
 
         put_page_and_type(page);
-        put_gfn(d, gfn);
-
-        return 0;
+        goto out;
     }
 
     if ( last_gfn )
@@ -1295,12 +1301,12 @@ int __mem_sharing_unshare_page(struct domain *d,
             /* Undo dec of nr_saved_mfns, as the retry will decrease again. */
             atomic_inc(&nr_saved_mfns);
             mem_sharing_page_unlock(old_page);
-            put_gfn(d, gfn);
             /*
              * Caller is responsible for placing an event
              * in the ring.
              */
-            return -ENOMEM;
+            rc = -ENOMEM;
+            goto out;
         }
 
         copy_domain_page(page_to_mfn(page), page_to_mfn(old_page));
@@ -1327,8 +1333,18 @@ int __mem_sharing_unshare_page(struct domain *d,
      */
     paging_mark_dirty(d, page_to_mfn(page));
     /* We do not need to unlock a private page */
+
+ out:
+    if ( unlikely(nestedhvm_enabled(d)) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < MAX_NESTEDP2M; i++ )
+            p2m_unlock(d->arch.nested_p2m[i]);
+    }
+
     put_gfn(d, gfn);
-    return 0;
+    return rc;
 }
 
 int relinquish_shared_pages(struct domain *d)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ad4bb94514..a32301c343 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1597,9 +1597,17 @@ p2m_flush(struct vcpu *v, struct p2m_domain *p2m)
 
 void p2m_flush_nestedp2m(struct domain *d)
 {
-    int i;
+    unsigned int i;
+
     for ( i = 0; i < MAX_NESTEDP2M; i++ )
-        p2m_flush_table(d->arch.nested_p2m[i]);
+    {
+        struct p2m_domain *p2m = d->arch.nested_p2m[i];
+
+        if ( p2m_locked_by_me(p2m) )
+            p2m_flush_table_locked(p2m);
+        else
+            p2m_flush_table(p2m);
+    }
 }
 
 void np2m_flush_base(struct vcpu *v, unsigned long np2m_base)
-- 
2.25.1