From: Jon Derrick
To: Bjorn Helgaas, qemu-devel@nongnu.org
Cc: linux-pci@vger.kernel.org, Lorenzo Pieralisi, Andrzej Jakowski, Jon Derrick, virtualization@lists.linux-foundation.org
Subject: [PATCH for QEMU] hw/vfio: Add VMD Passthrough Quirk
Date: Wed, 22 Apr 2020 13:13:04 -0400
Message-Id: <20200422171305.10923-1-jonathan.derrick@intel.com>

The VMD endpoint provides a real PCIe domain to the guest, including
bridges and endpoints. The IOMMU performs Host Physical Address to Guest
Physical Address translation when assigning downstream endpoint BARs and
when translating MMIO addresses. This translation is not desired when
assigning bridge windows. When MMIO goes to an endpoint after being
translated to HPA, the bridge will reject the HPA transaction because the
bridge window has been programmed with translated GPAs.

VMD device 28C0 natively supports passthrough by providing the Host
Physical Address in shadow registers accessible to the guest for bridge
window assignment. The shadow registers are valid if bit 1 is set in VMD
VMLOCK config register 0x70.

This quirk emulates the VMLOCK and HPA shadow registers for all VMD device
ids which don't natively offer this feature. The Linux VMD driver is
updated to match the QEMU subsystem id to enable this feature.

Signed-off-by: Jon Derrick
Reviewed-by: Andrzej Jakowski
---
 hw/vfio/pci-quirks.c | 119 +++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c        |   7 +++
 hw/vfio/pci.h        |   2 +
 hw/vfio/trace-events |   4 ++
 4 files changed, 132 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 2d348f8237..2fd27cc8f6 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1709,3 +1709,122 @@ free_exit:
 
     return ret;
 }
+
+/*
+ * The VMD endpoint provides a real PCIe domain to the guest. The IOMMU
+ * performs Host Physical Address to Guest Physical Address translation when
+ * assigning downstream endpoint BARs and when translating MMIO addresses.
+ * However this translation is not desired when assigning bridge windows. When
+ * MMIO goes to an endpoint after being translated to HPA, the bridge rejects
+ * the transaction because the window has been programmed with translated GPAs.
+ *
+ * VMD uses the Host Physical Address in order to correctly program the bridge
+ * windows in its PCIe domain. VMD device 28C0 has HPA shadow registers located
+ * at offset 0x2000 in MEMBAR2 (BAR 4). The shadow registers are valid if bit 1
+ * is set in the VMD VMLOCK config register 0x70.
+ *
+ * This quirk emulates the VMLOCK and HPA shadow registers for all VMD device
+ * ids which don't natively offer this feature. The subsystem vendor/device
+ * id is set to the QEMU subsystem vendor/device id, where the driver matches
+ * the id to enable this feature.
+ */
+typedef struct VFIOVMDQuirk {
+    VFIOPCIDevice *vdev;
+    uint64_t membar_phys[2];
+} VFIOVMDQuirk;
+
+static uint64_t vfio_vmd_quirk_read(void *opaque, hwaddr addr, unsigned size)
+{
+    VFIOVMDQuirk *data = opaque;
+    uint64_t val = 0;
+
+    memcpy(&val, (void *)data->membar_phys + addr, size);
+    return val;
+}
+
+static const MemoryRegionOps vfio_vmd_quirk = {
+    .read = vfio_vmd_quirk_read,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+#define VMD_VMLOCK 0x70
+#define VMD_SHADOW 0x2000
+#define VMD_MEMBAR2 4
+
+static int vfio_vmd_emulate_shadow_registers(VFIOPCIDevice *vdev)
+{
+    VFIOQuirk *quirk;
+    VFIOVMDQuirk *data;
+    PCIDevice *pdev = &vdev->pdev;
+    int ret;
+
+    data = g_malloc0(sizeof(*data));
+    ret = pread(vdev->vbasedev.fd, data->membar_phys, 16,
+                vdev->config_offset + PCI_BASE_ADDRESS_2);
+    if (ret != 16) {
+        error_report("VMD %s cannot read MEMBARs (%d)",
+                     vdev->vbasedev.name, ret);
+        g_free(data);
+        return -EFAULT;
+    }
+
+    quirk = vfio_quirk_alloc(1);
+    quirk->data = data;
+    data->vdev = vdev;
+
+    /* Emulate Shadow Registers */
+    memory_region_init_io(quirk->mem, OBJECT(vdev), &vfio_vmd_quirk, data,
+                          "vfio-vmd-quirk", sizeof(data->membar_phys));
+    memory_region_add_subregion_overlap(vdev->bars[VMD_MEMBAR2].region.mem,
+                                        VMD_SHADOW, quirk->mem, 1);
+    memory_region_set_readonly(quirk->mem, true);
+    memory_region_set_enabled(quirk->mem, true);
+
+    QLIST_INSERT_HEAD(&vdev->bars[VMD_MEMBAR2].quirks, quirk, next);
+
+    trace_vfio_pci_vmd_quirk_shadow_regs(vdev->vbasedev.name,
+                                         data->membar_phys[0],
+                                         data->membar_phys[1]);
+
+    /* Advertise Shadow Register support */
+    pci_byte_test_and_set_mask(pdev->config + VMD_VMLOCK, 0x2);
+    pci_set_byte(pdev->wmask + VMD_VMLOCK, 0);
+    pci_set_byte(vdev->emulated_config_bits + VMD_VMLOCK, 0x2);
+
+    trace_vfio_pci_vmd_quirk_vmlock(vdev->vbasedev.name,
+                                    pci_get_byte(pdev->config + VMD_VMLOCK));
+
+    /* Drivers can match the subsystem vendor/device id */
+    pci_set_word(pdev->config + PCI_SUBSYSTEM_VENDOR_ID,
+                 PCI_SUBVENDOR_ID_REDHAT_QUMRANET);
+    pci_set_word(vdev->emulated_config_bits + PCI_SUBSYSTEM_VENDOR_ID, ~0);
+
+    pci_set_word(pdev->config + PCI_SUBSYSTEM_ID, PCI_SUBDEVICE_ID_QEMU);
+    pci_set_word(vdev->emulated_config_bits + PCI_SUBSYSTEM_ID, ~0);
+
+    trace_vfio_pci_vmd_quirk_subsystem(vdev->vbasedev.name,
+                           vdev->sub_vendor_id, vdev->sub_device_id,
+                           pci_get_word(pdev->config + PCI_SUBSYSTEM_VENDOR_ID),
+                           pci_get_word(pdev->config + PCI_SUBSYSTEM_ID));
+
+    return 0;
+}
+
+int vfio_pci_vmd_init(VFIOPCIDevice *vdev)
+{
+    int ret = 0;
+
+    switch (vdev->device_id) {
+    case 0x28C0: /* Native passthrough support */
+        break;
+    /* Emulates Native passthrough support */
+    case 0x201D:
+    case 0x467F:
+    case 0x4C3D:
+    case 0x9A0B:
+        ret = vfio_vmd_emulate_shadow_registers(vdev);
+        break;
+    }
+
+    return ret;
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5e75a95129..85425a1a6f 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3024,6 +3024,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         }
     }
 
+    if (vdev->vendor_id == PCI_VENDOR_ID_INTEL) {
+        ret = vfio_pci_vmd_init(vdev);
+        if (ret) {
+            error_report("Failed to setup VMD");
+        }
+    }
+
     vfio_register_err_notifier(vdev);
     vfio_register_req_notifier(vdev);
     vfio_setup_resetfn_quirk(vdev);
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 0da7a20a7e..e8632d806b 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -217,6 +217,8 @@ int vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,
 int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp);
 int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp);
 
+int vfio_pci_vmd_init(VFIOPCIDevice *vdev);
+
 void vfio_display_reset(VFIOPCIDevice *vdev);
 int vfio_display_probe(VFIOPCIDevice *vdev, Error **errp);
 void vfio_display_finalize(VFIOPCIDevice *vdev);
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index b1ef55a33f..aabbd2693a 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -90,6 +90,10 @@ vfio_pci_nvidia_gpu_setup_quirk(const char *name, uint64_t tgt, uint64_t size) "
 vfio_pci_nvlink2_setup_quirk_ssatgt(const char *name, uint64_t tgt, uint64_t size) "%s tgt=0x%"PRIx64" size=0x%"PRIx64
 vfio_pci_nvlink2_setup_quirk_lnkspd(const char *name, uint32_t link_speed) "%s link_speed=0x%x"
 
+vfio_pci_vmd_quirk_shadow_regs(const char *name, uint64_t mb1, uint64_t mb2) "%s membar1_phys=0x%"PRIx64" membar2_phys=0x%"PRIx64
+vfio_pci_vmd_quirk_vmlock(const char *name, uint8_t vmlock) "%s vmlock=0x%x"
+vfio_pci_vmd_quirk_subsystem(const char *name, uint16_t old_svid, uint16_t old_sdid, uint16_t new_svid, uint16_t new_sdid) "%s subsystem id 0x%04x:0x%04x -> 0x%04x:0x%04x"
+
 # common.c
 vfio_region_write(const char *name, int index, uint64_t addr, uint64_t data, unsigned size) " (%s:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
 vfio_region_read(char *name, int index, uint64_t addr, unsigned size, uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
-- 
2.18.1
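
For reviewers who want to see the guest-visible contract in isolation, here is a minimal sketch of how a consumer might use what the quirk exposes: check bit 1 of VMLOCK (config register 0x70), then read two 64-bit little-endian values at offset 0x2000 in MEMBAR2. The helper names (read_le64, vmd_read_shadow), the VMD_VMLOCK_SHADOW macro and the plain byte buffers standing in for config space and the BAR mapping are assumptions made for illustration. This is not the Linux VMD driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMD_VMLOCK        0x70    /* config register named in the patch */
#define VMD_VMLOCK_SHADOW 0x2     /* bit 1: shadow registers are valid */
#define VMD_SHADOW        0x2000  /* shadow register offset in MEMBAR2 */

/* Read a little-endian 64-bit value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Hypothetical guest-side helper. If VMLOCK bit 1 is set, copy the two
 * shadow register values (MEMBAR1 and MEMBAR2 host physical addresses,
 * as raw BAR contents) into hpa[]. Returns false when the shadow
 * registers are not advertised.
 */
static bool vmd_read_shadow(const uint8_t *config, const uint8_t *membar2,
                            size_t membar2_len, uint64_t hpa[2])
{
    if (!(config[VMD_VMLOCK] & VMD_VMLOCK_SHADOW)) {
        return false;
    }

    /* The emulated region is 16 bytes long, matching membar_phys[2]. */
    assert(membar2_len >= VMD_SHADOW + 16);
    hpa[0] = read_le64(membar2 + VMD_SHADOW);
    hpa[1] = read_le64(membar2 + VMD_SHADOW + 8);
    return true;
}
```

The values come back exactly as the quirk read them from the host config space, so a caller would still have to decide how to treat the low BAR flag bits.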