From nobody Mon Sep 16 19:21:12 2024
From: Marek Marczykowski-Górecki
To: xen-devel@lists.xenproject.org
Cc: Marek Marczykowski-Górecki, Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Subject: [PATCH v5 1/3] xen/list: add LIST_HEAD_RO_AFTER_INIT
Date: Fri, 19 Jul 2024 04:33:36 +0200
Message-ID: <1994087de901c7520db559724ae95b2b0e1b1f5d.1721356393.git-series.marmarek@invisiblethingslab.com>

Similar to LIST_HEAD_READ_MOSTLY.

Signed-off-by: Marek Marczykowski-Górecki
Acked-by: Jan Beulich
---
New in v5
---
 xen/include/xen/list.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/include/xen/list.h b/xen/include/xen/list.h
index 6506ac40893b..62169f46742e 100644
--- a/xen/include/xen/list.h
+++ b/xen/include/xen/list.h
@@ -42,6 +42,9 @@ struct list_head {
 #define LIST_HEAD_READ_MOSTLY(name) \
     struct list_head __read_mostly name = LIST_HEAD_INIT(name)
 
+#define LIST_HEAD_RO_AFTER_INIT(name) \
+    struct list_head __ro_after_init name = LIST_HEAD_INIT(name)
+
 static inline void INIT_LIST_HEAD(struct list_head *list)
 {
     list->next = list;
-- 
git-series 0.9.1
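To illustrate the intended use (a sketch only; every name below is made up and
not part of this patch): a list head declared with the new macro may still be
modified while Xen initialises, and is expected to become read-only together
with the rest of the __ro_after_init data once boot completes.

/* Hypothetical usage sketch of LIST_HEAD_RO_AFTER_INIT. */
struct example_entry {
    struct list_head list;
    unsigned int id;
};

static LIST_HEAD_RO_AFTER_INIT(example_entries);
static struct example_entry entry_one = { .id = 1 };

static void __init example_setup(void)
{
    /* All insertions must happen before the end of init; only the head's
     * storage is write-protected afterwards, and there is no later
     * opportunity to update it. */
    list_add(&entry_one.list, &example_entries);
}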
From nobody Mon Sep 16 19:21:12 2024
From: Marek Marczykowski-Górecki
To: xen-devel@lists.xenproject.org
Cc: Marek Marczykowski-Górecki, Jan Beulich, Andrew Cooper, Roger Pau Monné
Subject: [PATCH v5 2/3] x86/mm: add API for marking only part of a MMIO page read only
Date: Fri, 19 Jul 2024 04:33:37 +0200

In some cases, only a few registers on a page need to be write-protected.
Examples include the USB3 console (64 bytes worth of registers) or MSI-X's
PBA table (which doesn't need to span the whole table either), although in
the latter case the spec forbids placing other registers on the same page.
The current API allows only marking whole pages read-only, which may cover
other registers that the guest may need to write to.

Currently, when a guest tries to write to an MMIO page in mmio_ro_ranges, it
is either crashed immediately on the EPT violation (HVM), or it gets a #PF
(PV). In the Linux PV case, if the access was from userspace (e.g. via
/dev/mem), the kernel will try to fix it up by updating the page tables
(which Xen will again force to be read-only) and will hit the #PF again,
looping endlessly. Both behaviors are undesirable if the guest could
actually be allowed to perform the write.

Introduce an API that allows marking part of a page read-only. Since
sub-page permissions are not a thing in page tables (they are in EPT, but
not granular enough), do this via emulation (or simply the page fault
handler for PV) that handles the writes that are supposed to be allowed.
The new subpage_mmio_ro_add() takes a start physical address and the region
size in bytes. Both the start address and the size need to be 8-byte
aligned, as a practical simplification (it allows using a smaller bitmask,
and a smaller granularity isn't really necessary right now). It will
internally add the relevant pages to mmio_ro_ranges, but if either the
start or end address is not page-aligned, it additionally adds that page to
a list for sub-page R/O handling. The list holds a bitmask of which qwords
are supposed to be read-only and an address where the page is mapped for
write emulation - this mapping is done only on the first access. A plain
list is used instead of a more efficient structure, because there aren't
supposed to be many pages needing this precise r/o control.

The mechanism this API is plugged into is slightly different for PV and
HVM. For both paths, it's plugged into mmio_ro_emulated_write(). For PV,
that is already called for a #PF on a read-only MMIO page. For HVM however,
an EPT violation on a p2m_mmio_direct page results in a direct
domain_crash() for non-hardware domains. To reach mmio_ro_emulated_write(),
change how write violations for p2m_mmio_direct are handled - specifically,
check whether they relate to such a partially protected page via
subpage_mmio_write_accept() and, if so, call hvm_emulate_one_mmio() for
them too. This decodes what the guest is trying to write and finally calls
mmio_ro_emulated_write(). The EPT write violation is detected as
npfec.write_access and npfec.present both being true (similar to other
places), which may cover some other (future?) cases - if that happens, the
emulator might get involved unnecessarily, but since it's limited to pages
marked with subpage_mmio_ro_add() only, the impact is minimal.

Both of those paths need the MFN to which the guest tried to write (to
check which part of the page is supposed to be read-only, and where the
page is mapped for writes).
This information currently isn't available directly in
mmio_ro_emulated_write(), but in both cases it is already resolved somewhere
higher in the call tree. Pass it down to mmio_ro_emulated_write() via the
new mmio_ro_emulate_ctxt.mfn field.

This may give HVM guests a bit more access to the instruction emulator (the
change in hvm_hap_nested_page_fault()), but only for pages explicitly
marked with subpage_mmio_ro_add() - so, only if the guest has a
passed-through device partially used by Xen. As of the next patch, this
applies only to a configuration explicitly documented as not security
supported.

The subpage_mmio_ro_add() function cannot be called with overlapping
ranges, nor on pages already added to mmio_ro_ranges separately. Successful
calls would result in correct handling, but error paths may result in
incorrect state (like pages removed from mmio_ro_ranges too early). The
debug build has asserts for the relevant cases.

Signed-off-by: Marek Marczykowski-Górecki
---
Shadow mode is not tested, but I don't expect it to work differently than
HAP in areas related to this patch.

Changes in v5:
- use the subpage_mmio_find_page() helper, simplifying several functions
- use LIST_HEAD_RO_AFTER_INIT
- don't use subpage_ro_lock in __init
- drop #ifdef in mm.h
- return an error on unaligned size in subpage_mmio_ro_add() instead of
  extending the size (in release builds)
Changes in v4:
- rename SUBPAGE_MMIO_RO_ALIGN to MMIO_RO_SUBPAGE_GRAN
- guard subpage_mmio_write_accept with CONFIG_HVM, as it's used only there
- rename ro_qwords to ro_elems
- use unsigned arguments for subpage_mmio_ro_remove_page()
- use volatile for __iomem
- do not set mmio_ro_ctxt.mfn for the mmcfg case
- comment where fields of mmio_ro_ctxt are used
- use bool for the result of __test_and_set_bit
- do not open-code mfn_to_maddr()
- remove leftover RCU
- mention hvm_hap_nested_page_fault() explicitly in the commit message
Changes in v3:
- use unsigned int for loop iterators
- use __set_bit/__clear_bit when under spinlock
- avoid ioremap() under spinlock
- do not cast away const
- handle unaligned parameters in release builds
- comment fixes
- remove RCU - the add functions are __init and actual usage is only much
  later after domains are running
- add checks for overlapping ranges in debug builds and document the
  limitations
- change subpage_mmio_ro_add() so the error path doesn't potentially remove
  pages from mmio_ro_ranges
- move printing the message to avoid one goto in subpage_mmio_write_emulate()
Changes in v2:
- Simplify subpage_mmio_ro_add() parameters
- add to mmio_ro_ranges from within subpage_mmio_ro_add()
- use ioremap() instead of caller-provided fixmap
- use 8-byte granularity (largest supported single write) and a bitmap
  instead of a rangeset
- clarify commit message
- change how it's plugged in for HVM domains, to not change the behavior for
  read-only parts (keep it hitting domain_crash(), instead of ignoring the
  write)
- remove unused subpage_mmio_ro_remove()
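To illustrate the intended call pattern of the new API, a minimal sketch (the
function name, device and offsets below are hypothetical and not part of this
series):

/* Hypothetical caller: protect a 64-byte register block that shares a
 * page with registers a guest still needs to write to. */
static int __init example_protect_regs(paddr_t bar_base)
{
    /* Made-up offsets; both the address and the size must be aligned to
     * MMIO_RO_SUBPAGE_GRAN. */
    paddr_t regs = bar_base + 0x3000 + 0x40;
    size_t size = 64;
    int rc;

    rc = subpage_mmio_ro_add(regs, size);
    if ( rc )
        printk(XENLOG_WARNING
               "Cannot mark registers at %"PRIpaddr" as R/O: %d\n",
               regs, rc);

    return rc;
}

Such a caller has to run from __init context, and since there is no remove
counterpart, it is only suitable for devices that will not be hot-unplugged.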
---
 xen/arch/x86/hvm/emulate.c      |   2 +-
 xen/arch/x86/hvm/hvm.c          |   4 +-
 xen/arch/x86/include/asm/mm.h   |  23 +++-
 xen/arch/x86/mm.c               | 262 +++++++++++++++++++++++++++++++++-
 xen/arch/x86/pv/ro-page-fault.c |   6 +-
 5 files changed, 292 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 02e378365b40..7253a87032dd 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2734,7 +2734,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
         .write = mmio_ro_emulated_write,
         .validate = hvmemul_validate,
     };
-    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = gla };
+    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = gla, .mfn = _mfn(mfn) };
     struct hvm_emulate_ctxt ctxt;
     const struct x86_emulate_ops *ops;
     unsigned int seg, bdf;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7f4b627b1f5f..a108870558bf 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2016,8 +2016,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out_put_gfn;
     }
 
-    if ( (p2mt == p2m_mmio_direct) && is_hardware_domain(currd) &&
-         npfec.write_access && npfec.present &&
+    if ( (p2mt == p2m_mmio_direct) && npfec.write_access && npfec.present &&
+         (is_hardware_domain(currd) || subpage_mmio_write_accept(mfn, gla)) &&
          (hvm_emulate_one_mmio(mfn_x(mfn), gla) == X86EMUL_OKAY) )
     {
         rc = 1;
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 98b66edaca5e..a457f0d2b1b3 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -522,9 +522,32 @@ extern struct rangeset *mmio_ro_ranges;
 void memguard_guard_stack(void *p);
 void memguard_unguard_stack(void *p);
 
+/*
+ * Add more precise r/o marking for a MMIO page. Range specified here
+ * will still be R/O, but the rest of the page (not marked as R/O via another
+ * call) will have writes passed through.
+ * The start address and the size must be aligned to MMIO_RO_SUBPAGE_GRAN.
+ *
+ * This API cannot be used for overlapping ranges, nor for pages already added
+ * to mmio_ro_ranges separately.
+ *
+ * Since there is currently no subpage_mmio_ro_remove(), relevant device should
+ * not be hot-unplugged.
+ *
+ * Return values:
+ *  - negative: error
+ *  - 0: success
+ */
+#define MMIO_RO_SUBPAGE_GRAN 8
+int subpage_mmio_ro_add(paddr_t start, size_t size);
+bool subpage_mmio_write_accept(mfn_t mfn, unsigned long gla);
+
 struct mmio_ro_emulate_ctxt {
     unsigned long cr2;
+    /* Used only for mmcfg case */
     unsigned int seg, bdf;
+    /* Used only for non-mmcfg case */
+    mfn_t mfn;
 };
 
 int cf_check mmio_ro_emulated_write(
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 648d6dd475ba..7f0ac537e86c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -150,6 +150,17 @@ bool __read_mostly machine_to_phys_mapping_valid;
 
 struct rangeset *__read_mostly mmio_ro_ranges;
 
+/* Handling sub-page read-only MMIO regions */
+struct subpage_ro_range {
+    struct list_head list;
+    mfn_t mfn;
+    void __iomem *mapped;
+    DECLARE_BITMAP(ro_elems, PAGE_SIZE / MMIO_RO_SUBPAGE_GRAN);
+};
+
+static LIST_HEAD_RO_AFTER_INIT(subpage_ro_ranges);
+static DEFINE_SPINLOCK(subpage_ro_lock);
+
 static uint32_t base_disallow_mask;
 /* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
 #define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
@@ -4910,6 +4921,254 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     return rc;
 }
 
+static struct subpage_ro_range *subpage_mmio_find_page(mfn_t mfn)
+{
+    struct subpage_ro_range *entry;
+
+    list_for_each_entry(entry, &subpage_ro_ranges, list)
+        if ( mfn_eq(entry->mfn, mfn) )
+            return entry;
+
+    return NULL;
+}
+
+/*
+ * Mark part of the page as R/O.
+ * Returns:
+ *  - 0 on success - first range in the page
+ *  - 1 on success - subsequent range in the page
+ *  - <0 on error
+ */
+static int __init subpage_mmio_ro_add_page(
+    mfn_t mfn,
+    unsigned int offset_s,
+    unsigned int offset_e)
+{
+    struct subpage_ro_range *entry = NULL, *iter;
+    unsigned int i;
+
+    entry = subpage_mmio_find_page(mfn);
+    if ( !entry )
+    {
+        /* iter == NULL marks it was a newly allocated entry */
+        iter = NULL;
+        entry = xzalloc(struct subpage_ro_range);
+        if ( !entry )
+            return -ENOMEM;
+        entry->mfn = mfn;
+    }
+
+    for ( i = offset_s; i <= offset_e; i += MMIO_RO_SUBPAGE_GRAN )
+    {
+        bool oldbit = __test_and_set_bit(i / MMIO_RO_SUBPAGE_GRAN,
+                                         entry->ro_elems);
+        ASSERT(!oldbit);
+    }
+
+    if ( !iter )
+        list_add(&entry->list, &subpage_ro_ranges);
+
+    return iter ? 1 : 0;
+}
+
+static void __init subpage_mmio_ro_remove_page(
+    mfn_t mfn,
+    unsigned int offset_s,
+    unsigned int offset_e)
+{
+    struct subpage_ro_range *entry = NULL;
+    unsigned int i;
+
+    entry = subpage_mmio_find_page(mfn);
+    if ( !entry )
+        return;
+
+    for ( i = offset_s; i <= offset_e; i += MMIO_RO_SUBPAGE_GRAN )
+        __clear_bit(i / MMIO_RO_SUBPAGE_GRAN, entry->ro_elems);
+
+    if ( !bitmap_empty(entry->ro_elems, PAGE_SIZE / MMIO_RO_SUBPAGE_GRAN) )
+        return;
+
+    list_del(&entry->list);
+    if ( entry->mapped )
+        iounmap(entry->mapped);
+    xfree(entry);
+}
+
+int __init subpage_mmio_ro_add(
+    paddr_t start,
+    size_t size)
+{
+    mfn_t mfn_start = maddr_to_mfn(start);
+    paddr_t end = start + size - 1;
+    mfn_t mfn_end = maddr_to_mfn(end);
+    unsigned int offset_end = 0;
+    int rc;
+    bool subpage_start, subpage_end;
+
+    ASSERT(IS_ALIGNED(start, MMIO_RO_SUBPAGE_GRAN));
+    ASSERT(IS_ALIGNED(size, MMIO_RO_SUBPAGE_GRAN));
+    if ( !IS_ALIGNED(size, MMIO_RO_SUBPAGE_GRAN) )
+        return -EINVAL;
+
+    if ( !size )
+        return 0;
+
+    if ( mfn_eq(mfn_start, mfn_end) )
+    {
+        /* Both starting and ending parts handled at once */
+        subpage_start = PAGE_OFFSET(start) || PAGE_OFFSET(end) != PAGE_SIZE - 1;
+        subpage_end = false;
+    }
+    else
+    {
+        subpage_start = PAGE_OFFSET(start);
+        subpage_end = PAGE_OFFSET(end) != PAGE_SIZE - 1;
+    }
+
+    if ( subpage_start )
+    {
+        offset_end = mfn_eq(mfn_start, mfn_end) ?
+                     PAGE_OFFSET(end) :
+                     (PAGE_SIZE - 1);
+        rc = subpage_mmio_ro_add_page(mfn_start,
+                                      PAGE_OFFSET(start),
+                                      offset_end);
+        if ( rc < 0 )
+            goto err_unlock;
+        /* Check if not marking R/W part of a page intended to be fully R/O */
+        ASSERT(rc || !rangeset_contains_singleton(mmio_ro_ranges,
+                                                  mfn_x(mfn_start)));
+    }
+
+    if ( subpage_end )
+    {
+        rc = subpage_mmio_ro_add_page(mfn_end, 0, PAGE_OFFSET(end));
+        if ( rc < 0 )
+            goto err_unlock_remove;
+        /* Check if not marking R/W part of a page intended to be fully R/O */
+        ASSERT(rc || !rangeset_contains_singleton(mmio_ro_ranges,
+                                                  mfn_x(mfn_end)));
+    }
+
+    rc = rangeset_add_range(mmio_ro_ranges, mfn_x(mfn_start), mfn_x(mfn_end));
+    if ( rc )
+        goto err_remove;
+
+    return 0;
+
+ err_remove:
+    if ( subpage_end )
+        subpage_mmio_ro_remove_page(mfn_end, 0, PAGE_OFFSET(end));
+ err_unlock_remove:
+    if ( subpage_start )
+        subpage_mmio_ro_remove_page(mfn_start, PAGE_OFFSET(start), offset_end);
+ err_unlock:
+    return rc;
+}
+
+static void __iomem *subpage_mmio_map_page(
+    struct subpage_ro_range *entry)
+{
+    void __iomem *mapped_page;
+
+    if ( entry->mapped )
+        return entry->mapped;
+
+    mapped_page = ioremap(mfn_to_maddr(entry->mfn), PAGE_SIZE);
+
+    spin_lock(&subpage_ro_lock);
+    /* Re-check under the lock */
+    if ( entry->mapped )
+    {
+        spin_unlock(&subpage_ro_lock);
+        if ( mapped_page )
+            iounmap(mapped_page);
+        return entry->mapped;
+    }
+
+    entry->mapped = mapped_page;
+    spin_unlock(&subpage_ro_lock);
+    return entry->mapped;
+}
+
+static void subpage_mmio_write_emulate(
+    mfn_t mfn,
+    unsigned int offset,
+    const void *data,
+    unsigned int len)
+{
+    struct subpage_ro_range *entry;
+    volatile void __iomem *addr;
+
+    entry = subpage_mmio_find_page(mfn);
+    if ( !entry )
+        /* Do not print message for pages without any writable parts. */
+        return;
+
+    if ( test_bit(offset / MMIO_RO_SUBPAGE_GRAN, entry->ro_elems) )
+    {
+ write_ignored:
+        gprintk(XENLOG_WARNING,
+                "ignoring write to R/O MMIO 0x%"PRI_mfn"%03x len %u\n",
+                mfn_x(mfn), offset, len);
+        return;
+    }
+
+    addr = subpage_mmio_map_page(entry);
+    if ( !addr )
+    {
+        gprintk(XENLOG_ERR,
+                "Failed to map page for MMIO write at 0x%"PRI_mfn"%03x\n",
+                mfn_x(mfn), offset);
+        return;
+    }
+
+    switch ( len )
+    {
+    case 1:
+        writeb(*(const uint8_t*)data, addr);
+        break;
+    case 2:
+        writew(*(const uint16_t*)data, addr);
+        break;
+    case 4:
+        writel(*(const uint32_t*)data, addr);
+        break;
+    case 8:
+        writeq(*(const uint64_t*)data, addr);
+        break;
+    default:
+        /* mmio_ro_emulated_write() already validated the size */
+        ASSERT_UNREACHABLE();
+        goto write_ignored;
+    }
+}
+
+#ifdef CONFIG_HVM
+bool subpage_mmio_write_accept(mfn_t mfn, unsigned long gla)
+{
+    unsigned int offset = PAGE_OFFSET(gla);
+    const struct subpage_ro_range *entry;
+
+    entry = subpage_mmio_find_page(mfn);
+    if ( !entry )
+        return false;
+
+    if ( !test_bit(offset / MMIO_RO_SUBPAGE_GRAN, entry->ro_elems) )
+    {
+        /*
+         * We don't know the write size at this point yet, so it could be
+         * an unaligned write, but accept it here anyway and deal with it
+         * later.
+         */
+        return true;
+    }
+
+    return false;
+}
+#endif
+
 int cf_check mmio_ro_emulated_write(
     enum x86_segment seg,
     unsigned long offset,
@@ -4928,6 +5187,9 @@ int cf_check mmio_ro_emulated_write(
         return X86EMUL_UNHANDLEABLE;
     }
 
+    subpage_mmio_write_emulate(mmio_ro_ctxt->mfn, PAGE_OFFSET(offset),
+                               p_data, bytes);
+
     return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c
index cad28ef928ad..2ea1a6ad489c 100644
--- a/xen/arch/x86/pv/ro-page-fault.c
+++ b/xen/arch/x86/pv/ro-page-fault.c
@@ -333,8 +333,10 @@ static int mmio_ro_do_page_fault(struct x86_emulate_ctxt *ctxt,
     ctxt->data = &mmio_ro_ctxt;
     if ( pci_ro_mmcfg_decode(mfn_x(mfn), &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) )
         return x86_emulate(ctxt, &mmcfg_intercept_ops);
-    else
-        return x86_emulate(ctxt, &mmio_ro_emulate_ops);
+
+    mmio_ro_ctxt.mfn = mfn;
+
+    return x86_emulate(ctxt, &mmio_ro_emulate_ops);
 }
 
 int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs)
-- 
git-series 0.9.1
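As a further illustration (again a hypothetical sketch, not part of this
series): a caller starting from an arbitrarily aligned register window could
widen it to the required 8-byte granularity before calling the API, at the
cost of making a few extra bytes read-only.

/* Hypothetical wrapper: expand an unaligned window to MMIO_RO_SUBPAGE_GRAN
 * boundaries; the extra edge bytes become read-only too, which is the
 * conservative direction. */
static int __init example_ro_add_rounded(paddr_t start, size_t size)
{
    paddr_t s = start & ~(paddr_t)(MMIO_RO_SUBPAGE_GRAN - 1);
    paddr_t e = ROUNDUP(start + size, MMIO_RO_SUBPAGE_GRAN);

    return subpage_mmio_ro_add(s, e - s);
}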
From nobody Mon Sep 16 19:21:12 2024
From: Marek Marczykowski-Górecki
To: xen-devel@lists.xenproject.org
Cc: Marek Marczykowski-Górecki, Jan Beulich, Andrew Cooper, Julien Grall, Stefano Stabellini
Subject: [PATCH v5 3/3] drivers/char: Use sub-page ro API to make just xhci dbc cap RO
Date: Fri, 19 Jul 2024 04:33:38 +0200
Message-ID: <62267309025cd78dd4c901d6c1d0f9880cdd0c73.1721356393.git-series.marmarek@invisiblethingslab.com>

Not the whole page, which may contain other registers too. The XHCI
specification describes DbC as designed to be controlled by a different
driver, but does not mandate placing its registers on a separate page. In
fact, on Tiger Lake and newer (at least), this page does contain other
registers that Linux tries to use. And with share=yes, a domU would use
them too.

Without this patch, PV dom0 would fail to initialize the controller, while
an HVM domain would be killed on the EPT violation.

With `share=yes`, this patch gives the domU more access to the emulator
(although an HVM with any emulated device already has plenty of it). This
configuration is already documented as unsafe with untrusted guests and not
security supported.
Signed-off-by: Marek Marczykowski-Górecki
Reviewed-by: Jan Beulich
---
Changes in v4:
- restore mmio_ro_ranges in the fallback case
- set XHCI_SHARE_NONE in the fallback case
Changes in v3:
- indentation fix
- remove stale comment
- fallback to pci_ro_device() if subpage_mmio_ro_add() fails
- extend commit message
Changes in v2:
- adjust for simplified subpage_mmio_ro_add() API
---
 xen/drivers/char/xhci-dbc.c | 36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/xen/drivers/char/xhci-dbc.c b/xen/drivers/char/xhci-dbc.c
index 8e2037f1a5f7..c45e4b6825cc 100644
--- a/xen/drivers/char/xhci-dbc.c
+++ b/xen/drivers/char/xhci-dbc.c
@@ -1216,20 +1216,28 @@ static void __init cf_check dbc_uart_init_postirq(struct serial_port *port)
         break;
     }
 #ifdef CONFIG_X86
-    /*
-     * This marks the whole page as R/O, which may include other registers
-     * unrelated to DbC. Xen needs only DbC area protected, but it seems
-     * Linux's XHCI driver (as of 5.18) works without writting to the whole
-     * page, so keep it simple.
-     */
-    if ( rangeset_add_range(mmio_ro_ranges,
-                PFN_DOWN((uart->dbc.bar_val & PCI_BASE_ADDRESS_MEM_MASK) +
-                         uart->dbc.xhc_dbc_offset),
-                PFN_UP((uart->dbc.bar_val & PCI_BASE_ADDRESS_MEM_MASK) +
-                       uart->dbc.xhc_dbc_offset +
-                       sizeof(*uart->dbc.dbc_reg)) - 1) )
-        printk(XENLOG_INFO
-               "Error while adding MMIO range of device to mmio_ro_ranges\n");
+    if ( subpage_mmio_ro_add(
+             (uart->dbc.bar_val & PCI_BASE_ADDRESS_MEM_MASK) +
+             uart->dbc.xhc_dbc_offset,
+             sizeof(*uart->dbc.dbc_reg)) )
+    {
+        printk(XENLOG_WARNING
+               "Error while marking MMIO range of XHCI console as R/O, "
+               "making the whole device R/O (share=no)\n");
+        uart->dbc.share = XHCI_SHARE_NONE;
+        if ( pci_ro_device(0, uart->dbc.sbdf.bus, uart->dbc.sbdf.devfn) )
+            printk(XENLOG_WARNING
+                   "Failed to mark read-only %pp used for XHCI console\n",
+                   &uart->dbc.sbdf);
+        if ( rangeset_add_range(mmio_ro_ranges,
+                 PFN_DOWN((uart->dbc.bar_val & PCI_BASE_ADDRESS_MEM_MASK) +
+                          uart->dbc.xhc_dbc_offset),
+                 PFN_UP((uart->dbc.bar_val & PCI_BASE_ADDRESS_MEM_MASK) +
+                        uart->dbc.xhc_dbc_offset +
+                        sizeof(*uart->dbc.dbc_reg)) - 1) )
+            printk(XENLOG_INFO
+                   "Error while adding MMIO range of device to mmio_ro_ranges\n");
+    }
 #endif
 }
 
-- 
git-series 0.9.1