From: Akihiko Odaki
Date: Mon, 15 Jul 2024 14:19:11 +0900
Subject: [PATCH v5 5/8] pcie_sriov: Allow user to create SR-IOV device
Message-Id: <20240715-sriov-v5-5-3f5539093ffc@daynix.com>
References: <20240715-sriov-v5-0-3f5539093ffc@daynix.com>
In-Reply-To: <20240715-sriov-v5-0-3f5539093ffc@daynix.com>
To: "Michael S. Tsirkin", Marcel Apfelbaum, Alex Williamson,
 Cédric Le Goater, Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
 Jason Wang, Sriram Yagnaraman, Keith Busch, Klaus Jensen
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, Yui Washizu,
 Akihiko Odaki

A user can create an SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.

A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.

A PF to which user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.

Signed-off-by: Akihiko Odaki
---
 include/hw/pci/pci_device.h |   6 +-
 include/hw/pci/pcie_sriov.h |  18 +++
 hw/pci/pci.c                |  62 ++++++----
 hw/pci/pcie_sriov.c         | 288 +++++++++++++++++++++++++++++++++++---------
 4 files changed, 291 insertions(+), 83 deletions(-)

diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index 49b341ce2e27..14d391333aff 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -37,6 +37,8 @@ struct PCIDeviceClass {
     uint16_t subsystem_id; /* only for header type = 0 */
 
     const char *romfile; /* rom bar */
+
+    bool sriov_vf_user_creatable;
 };
 
 enum PCIReqIDType {
@@ -160,6 +162,8 @@ struct PCIDevice {
     /* ID of standby device in net_failover pair */
     char *failover_pair_id;
     uint32_t acpi_index;
+
+    char *sriov_pf;
 };
 
 static inline int pci_intx(PCIDevice *pci_dev)
@@ -192,7 +196,7 @@ static inline int pci_is_express_downstream_port(const PCIDevice *d)
 
 static inline int pci_is_vf(const PCIDevice *d)
 {
-    return d->exp.sriov_vf.pf != NULL;
+    return d->sriov_pf || d->exp.sriov_vf.pf != NULL;
 }
 
 static inline uint32_t pci_config_size(const PCIDevice *d)
diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h
index c5d2d318d330..f75b8f22ee92 100644
--- a/include/hw/pci/pcie_sriov.h
+++ b/include/hw/pci/pcie_sriov.h
@@ -18,6 +18,7 @@
 typedef struct PCIESriovPF {
     uint8_t vf_bar_type[PCI_NUM_REGIONS]; /* Store type for each VF bar */
     PCIDevice **vf; /* Pointer to an array of num_vfs VF devices */
+    bool vf_user_created; /* If VFs are created by user */
 } PCIESriovPF;
 
 typedef struct PCIESriovVF {
@@ -40,6 +41,23 @@ void pcie_sriov_pf_init_vf_bar(PCIDevice *dev, int region_num,
 void pcie_sriov_vf_register_bar(PCIDevice *dev, int region_num,
                                 MemoryRegion *memory);
 
+/**
+ * pcie_sriov_pf_init_from_user_created_vfs() - Initialize PF with user-created
+ *                                              VFs.
+ * @dev: A PCIe device being realized.
+ * @offset: The offset of the SR-IOV capability.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: The size of added capability. 0 if the user did not create VFs.
+ *         -1 if failed.
+ */
+int16_t pcie_sriov_pf_init_from_user_created_vfs(PCIDevice *dev,
+                                                 uint16_t offset,
+                                                 Error **errp);
+
+bool pcie_sriov_register_device(PCIDevice *dev, Error **errp);
+void pcie_sriov_unregister_device(PCIDevice *dev);
+
 /*
  * Default (minimal) page size support values
  * as required by the SR/IOV standard:
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index ae7137d70579..0368e448992a 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -138,6 +138,7 @@ static Property pci_props[] = {
                     QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
     DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present,
                     QEMU_PCIE_ARI_NEXTFN_1_BITNR, false),
+    DEFINE_PROP_STRING("sriov-pf", PCIDevice, sriov_pf),
     DEFINE_PROP_END_OF_LIST()
 };
 
@@ -1012,13 +1013,8 @@ static void pci_init_multifunction(PCIBus *bus, PCIDevice *dev, Error **errp)
         dev->config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION;
     }
 
-    /*
-     * With SR/IOV and ARI, a device at function 0 need not be a multifunction
-     * device, as it may just be a VF that ended up with function 0 in
-     * the legacy PCI interpretation. Avoid failing in such cases:
-     */
-    if (pci_is_vf(dev) &&
-        dev->exp.sriov_vf.pf->cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
+    /* SR/IOV is not handled here. */
+    if (pci_is_vf(dev)) {
         return;
     }
 
@@ -1051,7 +1047,8 @@ static void pci_init_multifunction(PCIBus *bus, PCIDevice *dev, Error **errp)
     }
     /* function 0 indicates single function, so function > 0 must be NULL */
     for (func = 1; func < PCI_FUNC_MAX; ++func) {
-        if (bus->devices[PCI_DEVFN(slot, func)]) {
+        PCIDevice *device = bus->devices[PCI_DEVFN(slot, func)];
+        if (device && !pci_is_vf(device)) {
             error_setg(errp, "PCI: %x.0 indicates single function, "
                        "but %x.%x is already populated.",
                        slot, slot, func);
@@ -1336,6 +1333,7 @@ static void pci_qdev_unrealize(DeviceState *dev)
 
     pci_unregister_io_regions(pci_dev);
     pci_del_option_rom(pci_dev);
+    pcie_sriov_unregister_device(pci_dev);
 
     if (pc->exit) {
         pc->exit(pci_dev);
@@ -1367,7 +1365,6 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
     pcibus_t size = memory_region_size(memory);
     uint8_t hdr_type;
 
-    assert(!pci_is_vf(pci_dev)); /* VFs must use pcie_sriov_vf_register_bar */
     assert(region_num >= 0);
     assert(region_num < PCI_NUM_REGIONS);
     assert(is_power_of_2(size));
@@ -1378,7 +1375,6 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
     assert(hdr_type != PCI_HEADER_TYPE_BRIDGE || region_num < 2);
 
     r = &pci_dev->io_regions[region_num];
-    r->addr = PCI_BAR_UNMAPPED;
     r->size = size;
     r->type = type;
     r->memory = memory;
@@ -1386,22 +1382,35 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
                         ? pci_get_bus(pci_dev)->address_space_io
                         : pci_get_bus(pci_dev)->address_space_mem;
 
-    wmask = ~(size - 1);
-    if (region_num == PCI_ROM_SLOT) {
-        /* ROM enable bit is writable */
-        wmask |= PCI_ROM_ADDRESS_ENABLE;
-    }
-
-    addr = pci_bar(pci_dev, region_num);
-    pci_set_long(pci_dev->config + addr, type);
+    if (pci_is_vf(pci_dev)) {
+        PCIDevice *pf = pci_dev->exp.sriov_vf.pf;
+        assert(!pf || type == pf->exp.sriov_pf.vf_bar_type[region_num]);
 
-    if (!(r->type & PCI_BASE_ADDRESS_SPACE_IO) &&
-        r->type & PCI_BASE_ADDRESS_MEM_TYPE_64) {
-        pci_set_quad(pci_dev->wmask + addr, wmask);
-        pci_set_quad(pci_dev->cmask + addr, ~0ULL);
+        r->addr = pci_bar_address(pci_dev, region_num, r->type, r->size);
+        if (r->addr != PCI_BAR_UNMAPPED) {
+            memory_region_add_subregion_overlap(r->address_space,
+                                                r->addr, r->memory, 1);
+        }
     } else {
-        pci_set_long(pci_dev->wmask + addr, wmask & 0xffffffff);
-        pci_set_long(pci_dev->cmask + addr, 0xffffffff);
+        r->addr = PCI_BAR_UNMAPPED;
+
+        wmask = ~(size - 1);
+        if (region_num == PCI_ROM_SLOT) {
+            /* ROM enable bit is writable */
+            wmask |= PCI_ROM_ADDRESS_ENABLE;
+        }
+
+        addr = pci_bar(pci_dev, region_num);
+        pci_set_long(pci_dev->config + addr, type);
+
+        if (!(r->type & PCI_BASE_ADDRESS_SPACE_IO) &&
+            r->type & PCI_BASE_ADDRESS_MEM_TYPE_64) {
+            pci_set_quad(pci_dev->wmask + addr, wmask);
+            pci_set_quad(pci_dev->cmask + addr, ~0ULL);
+        } else {
+            pci_set_long(pci_dev->wmask + addr, wmask & 0xffffffff);
+            pci_set_long(pci_dev->cmask + addr, 0xffffffff);
+        }
     }
 }
 
@@ -2162,6 +2171,11 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
         }
     }
 
+    if (!pcie_sriov_register_device(pci_dev, errp)) {
+        pci_qdev_unrealize(DEVICE(pci_dev));
+        return;
+    }
+
     /*
      * A PCIe Downstream Port that do not have ARI Forwarding enabled must
      * associate only Device 0 with the device attached to the bus
diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index 3af0cc7d560a..0c875e61fe96 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -20,6 +20,8 @@
 #include "qapi/error.h"
 #include "trace.h"
 
+static GHashTable *pfs;
+
 static void unparent_vfs(PCIDevice *dev, uint16_t total_vfs)
 {
     for (uint16_t i = 0; i < total_vfs; i++) {
@@ -31,14 +33,49 @@ static void unparent_vfs(PCIDevice *dev, uint16_t total_vfs)
     dev->exp.sriov_pf.vf = NULL;
 }
 
-bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
-                        const char *vfname, uint16_t vf_dev_id,
-                        uint16_t init_vfs, uint16_t total_vfs,
-                        uint16_t vf_offset, uint16_t vf_stride,
-                        Error **errp)
+static void clear_ctrl_vfe(PCIDevice *dev)
+{
+    uint8_t *ctrl = dev->config + dev->exp.sriov_cap + PCI_SRIOV_CTRL;
+    pci_set_word(ctrl, pci_get_word(ctrl) & ~PCI_SRIOV_CTRL_VFE);
+}
+
+static void register_vfs(PCIDevice *dev)
+{
+    uint16_t num_vfs;
+    uint16_t i;
+    uint16_t sriov_cap = dev->exp.sriov_cap;
+
+    assert(sriov_cap > 0);
+    num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
+    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
+        clear_ctrl_vfe(dev);
+        return;
+    }
+
+    trace_sriov_register_vfs(dev->name, PCI_SLOT(dev->devfn),
+                             PCI_FUNC(dev->devfn), num_vfs);
+    for (i = 0; i < num_vfs; i++) {
+        pci_set_enabled(dev->exp.sriov_pf.vf[i], true);
+    }
+}
+
+static void unregister_vfs(PCIDevice *dev)
+{
+    uint16_t i;
+    uint8_t *cfg = dev->config + dev->exp.sriov_cap;
+
+    trace_sriov_unregister_vfs(dev->name, PCI_SLOT(dev->devfn),
+                               PCI_FUNC(dev->devfn));
+    for (i = 0; i < pci_get_word(cfg + PCI_SRIOV_TOTAL_VF); i++) {
+        pci_set_enabled(dev->exp.sriov_pf.vf[i], false);
+    }
+}
+
+static bool pcie_sriov_pf_init_common(PCIDevice *dev, uint16_t offset,
+                                      uint16_t vf_dev_id, uint16_t init_vfs,
+                                      uint16_t total_vfs, uint16_t vf_offset,
+                                      uint16_t vf_stride, Error **errp)
 {
-    BusState *bus = qdev_get_parent_bus(&dev->qdev);
-    int32_t devfn = dev->devfn + vf_offset;
     uint8_t *cfg = dev->config + offset;
     uint8_t *wmask;
 
@@ -100,6 +137,28 @@ bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
 
     qdev_prop_set_bit(&dev->qdev, "multifunction", true);
 
+    return true;
+}
+
+bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset,
+                        const char *vfname, uint16_t vf_dev_id,
+                        uint16_t init_vfs, uint16_t total_vfs,
+                        uint16_t vf_offset, uint16_t vf_stride,
+                        Error **errp)
+{
+    BusState *bus = qdev_get_parent_bus(&dev->qdev);
+    int32_t devfn = dev->devfn + vf_offset;
+
+    if (pfs && g_hash_table_contains(pfs, dev->qdev.id)) {
+        error_setg(errp, "attaching user-created SR-IOV VF unsupported");
+        return false;
+    }
+
+    if (!pcie_sriov_pf_init_common(dev, offset, vf_dev_id, init_vfs,
+                                   total_vfs, vf_offset, vf_stride, errp)) {
+        return false;
+    }
+
     dev->exp.sriov_pf.vf = g_new(PCIDevice *, total_vfs);
 
     for (uint16_t i = 0; i < total_vfs; i++) {
@@ -129,7 +188,22 @@ void pcie_sriov_pf_exit(PCIDevice *dev)
 {
     uint8_t *cfg = dev->config + dev->exp.sriov_cap;
 
-    unparent_vfs(dev, pci_get_word(cfg + PCI_SRIOV_TOTAL_VF));
+    if (dev->exp.sriov_pf.vf_user_created) {
+        uint16_t ven_id = pci_get_word(dev->config + PCI_VENDOR_ID);
+        uint16_t total_vfs = pci_get_word(dev->config + PCI_SRIOV_TOTAL_VF);
+        uint16_t vf_dev_id = pci_get_word(dev->config + PCI_SRIOV_VF_DID);
+
+        unregister_vfs(dev);
+
+        for (uint16_t i = 0; i < total_vfs; i++) {
+            dev->exp.sriov_pf.vf[i]->exp.sriov_vf.pf = NULL;
+
+            pci_config_set_vendor_id(dev->exp.sriov_pf.vf[i]->config, ven_id);
+            pci_config_set_device_id(dev->exp.sriov_pf.vf[i]->config, vf_dev_id);
+        }
+    } else {
+        unparent_vfs(dev, pci_get_word(cfg + PCI_SRIOV_TOTAL_VF));
+    }
 }
 
 void pcie_sriov_pf_init_vf_bar(PCIDevice *dev, int region_num,
@@ -162,74 +236,172 @@ void pcie_sriov_pf_init_vf_bar(PCIDevice *dev, int region_num,
 void pcie_sriov_vf_register_bar(PCIDevice *dev, int region_num,
                                 MemoryRegion *memory)
 {
-    PCIIORegion *r;
-    PCIBus *bus = pci_get_bus(dev);
     uint8_t type;
-    pcibus_t size = memory_region_size(memory);
 
-    assert(pci_is_vf(dev)); /* PFs must use pci_register_bar */
-    assert(region_num >= 0);
-    assert(region_num < PCI_NUM_REGIONS);
+    assert(dev->exp.sriov_vf.pf);
     type = dev->exp.sriov_vf.pf->exp.sriov_pf.vf_bar_type[region_num];
 
-    if (!is_power_of_2(size)) {
-        error_report("%s: PCI region size must be a power"
-                     " of two - type=0x%x, size=0x%"FMT_PCIBUS,
-                     __func__, type, size);
-        exit(1);
-    }
-
-    r = &dev->io_regions[region_num];
-    r->memory = memory;
-    r->address_space =
-        type & PCI_BASE_ADDRESS_SPACE_IO
-        ? bus->address_space_io
-        : bus->address_space_mem;
-    r->size = size;
-    r->type = type;
-
-    r->addr = pci_bar_address(dev, region_num, r->type, r->size);
-    if (r->addr != PCI_BAR_UNMAPPED) {
-        memory_region_add_subregion_overlap(r->address_space,
-                                            r->addr, r->memory, 1);
-    }
+    return pci_register_bar(dev, region_num, type, memory);
 }
 
-static void clear_ctrl_vfe(PCIDevice *dev)
+static gint compare_vf_devfns(gconstpointer a, gconstpointer b)
 {
-    uint8_t *ctrl = dev->config + dev->exp.sriov_cap + PCI_SRIOV_CTRL;
-    pci_set_word(ctrl, pci_get_word(ctrl) & ~PCI_SRIOV_CTRL_VFE);
+    return (*(PCIDevice **)a)->devfn - (*(PCIDevice **)b)->devfn;
 }
 
-static void register_vfs(PCIDevice *dev)
+int16_t pcie_sriov_pf_init_from_user_created_vfs(PCIDevice *dev,
+                                                 uint16_t offset,
+                                                 Error **errp)
 {
-    uint16_t num_vfs;
+    GPtrArray *pf;
+    PCIDevice **vfs;
+    BusState *bus = qdev_get_parent_bus(DEVICE(dev));
+    uint16_t ven_id = pci_get_word(dev->config + PCI_VENDOR_ID);
+    uint16_t vf_dev_id;
+    uint16_t vf_offset;
+    uint16_t vf_stride;
     uint16_t i;
-    uint16_t sriov_cap = dev->exp.sriov_cap;
 
-    assert(sriov_cap > 0);
-    num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
-    if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) {
-        clear_ctrl_vfe(dev);
-        return;
+    if (!pfs || !dev->qdev.id) {
+        return 0;
     }
 
-    trace_sriov_register_vfs(dev->name, PCI_SLOT(dev->devfn),
-                             PCI_FUNC(dev->devfn), num_vfs);
-    for (i = 0; i < num_vfs; i++) {
-        pci_set_enabled(dev->exp.sriov_pf.vf[i], true);
+    pf = g_hash_table_lookup(pfs, dev->qdev.id);
+    if (!pf) {
+        return 0;
+    }
+
+    if (pf->len > UINT16_MAX) {
+        error_setg(errp, "too many VFs");
+        return -1;
+    }
+
+    g_ptr_array_sort(pf, compare_vf_devfns);
+    vfs = (void *)pf->pdata;
+
+    if (vfs[0]->devfn <= dev->devfn) {
+        error_setg(errp, "a VF function number is less than the PF function number");
+        return -1;
+    }
+
+    vf_dev_id = pci_get_word(vfs[0]->config + PCI_DEVICE_ID);
+    vf_offset = vfs[0]->devfn - dev->devfn;
+    vf_stride = pf->len < 2 ? 0 : vfs[1]->devfn - vfs[0]->devfn;
+
+    for (i = 0; i < pf->len; i++) {
+        if (bus != qdev_get_parent_bus(&vfs[i]->qdev)) {
+            error_setg(errp, "SR-IOV VF parent bus mismatches with PF");
+            return -1;
+        }
+
+        if (ven_id != pci_get_word(vfs[i]->config + PCI_VENDOR_ID)) {
+            error_setg(errp, "SR-IOV VF vendor ID mismatches with PF");
+            return -1;
+        }
+
+        if (vf_dev_id != pci_get_word(vfs[i]->config + PCI_DEVICE_ID)) {
+            error_setg(errp, "inconsistent SR-IOV VF device IDs");
+            return -1;
+        }
+
+        for (size_t j = 0; j < PCI_NUM_REGIONS; j++) {
+            if (vfs[i]->io_regions[j].size != vfs[0]->io_regions[j].size ||
+                vfs[i]->io_regions[j].type != vfs[0]->io_regions[j].type) {
+                error_setg(errp, "inconsistent SR-IOV BARs");
+                return -1;
+            }
+        }
+
+        if (vfs[i]->devfn - vfs[0]->devfn != vf_stride * i) {
+            error_setg(errp, "inconsistent SR-IOV stride");
+            return -1;
+        }
+    }
+
+    if (!pcie_sriov_pf_init_common(dev, offset, vf_dev_id, pf->len,
+                                   pf->len, vf_offset, vf_stride, errp)) {
+        return -1;
+    }
+
+    for (i = 0; i < pf->len; i++) {
+        vfs[i]->exp.sriov_vf.pf = dev;
+        vfs[i]->exp.sriov_vf.vf_number = i;
+
+        /* set vid/did according to sr/iov spec - they are not used */
+        pci_config_set_vendor_id(vfs[i]->config, 0xffff);
+        pci_config_set_device_id(vfs[i]->config, 0xffff);
     }
+
+    dev->exp.sriov_pf.vf = vfs;
+    dev->exp.sriov_pf.vf_user_created = true;
+
+    for (i = 0; i < PCI_NUM_REGIONS; i++) {
+        PCIIORegion *region = &vfs[0]->io_regions[i];
+
+        if (region->size) {
+            pcie_sriov_pf_init_vf_bar(dev, i, region->type, region->size);
+        }
+    }
+
+    return PCI_EXT_CAP_SRIOV_SIZEOF;
 }
 
-static void unregister_vfs(PCIDevice *dev)
+bool pcie_sriov_register_device(PCIDevice *dev, Error **errp)
 {
-    uint16_t i;
-    uint8_t *cfg = dev->config + dev->exp.sriov_cap;
+    if (!dev->exp.sriov_pf.vf && dev->qdev.id &&
+        pfs && g_hash_table_contains(pfs, dev->qdev.id)) {
+        error_setg(errp, "attaching user-created SR-IOV VF unsupported");
+        return false;
+    }
 
-    trace_sriov_unregister_vfs(dev->name, PCI_SLOT(dev->devfn),
-                               PCI_FUNC(dev->devfn));
-    for (i = 0; i < pci_get_word(cfg + PCI_SRIOV_TOTAL_VF); i++) {
-        pci_set_enabled(dev->exp.sriov_pf.vf[i], false);
+    if (dev->sriov_pf) {
+        PCIDevice *pci_pf;
+        GPtrArray *pf;
+
+        if (!PCI_DEVICE_GET_CLASS(dev)->sriov_vf_user_creatable) {
+            error_setg(errp, "user cannot create SR-IOV VF with this device type");
+            return false;
+        }
+
+        if (!pci_is_express(dev)) {
+            error_setg(errp, "PCI Express is required for SR-IOV VF");
+            return false;
+        }
+
+        if (!pci_qdev_find_device(dev->sriov_pf, &pci_pf)) {
+            error_setg(errp, "PCI device specified as SR-IOV PF already exists");
+            return false;
+        }
+
+        if (!pfs) {
+            pfs = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL);
+        }
+
+        pf = g_hash_table_lookup(pfs, dev->sriov_pf);
+        if (!pf) {
+            pf = g_ptr_array_new();
+            g_hash_table_insert(pfs, g_strdup(dev->sriov_pf), pf);
+        }
+
+        g_ptr_array_add(pf, dev);
+    }
+
+    return true;
+}
+
+void pcie_sriov_unregister_device(PCIDevice *dev)
+{
+    if (dev->sriov_pf && pfs) {
+        GPtrArray *pf = g_hash_table_lookup(pfs, dev->sriov_pf);
+
+        if (pf) {
+            g_ptr_array_remove_fast(pf, dev);
+
+            if (!pf->len) {
+                g_hash_table_remove(pfs, dev->sriov_pf);
+                g_ptr_array_free(pf, FALSE);
+            }
+        }
     }
 }
 
@@ -316,7 +488,7 @@ void pcie_sriov_pf_add_sup_pgsize(PCIDevice *dev, uint16_t opt_sup_pgsize)
 
 uint16_t pcie_sriov_vf_number(PCIDevice *dev)
 {
-    assert(pci_is_vf(dev));
+    assert(dev->exp.sriov_vf.pf);
     return dev->exp.sriov_vf.vf_number;
 }
 
-- 
2.45.2
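
Illustration only, not part of the patch: for readers who want to see the
VF placement rule in isolation, the sketch below mirrors, outside of QEMU,
how pcie_sriov_pf_init_from_user_created_vfs() sorts the VFs by devfn,
derives the VF offset and stride, and rejects layouts where a VF sits at or
below the PF or where the spacing is uneven. The names struct vf_layout,
compare_devfns() and derive_vf_layout() are hypothetical, and narrowing
devfn to uint8_t is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch; not a QEMU structure. */
struct vf_layout {
    uint16_t offset; /* first VF devfn minus PF devfn */
    uint16_t stride; /* gap between consecutive VF devfns, 0 for a single VF */
};

/* qsort() comparator that orders devfn values ascending. */
static int compare_devfns(const void *a, const void *b)
{
    return (int)*(const uint8_t *)a - (int)*(const uint8_t *)b;
}

/*
 * Sort vf_devfns in place and derive offset/stride the same way the patch
 * does. Returns 0 and fills *out on success; returns -1 if there are no
 * VFs, if the lowest VF devfn is not above the PF devfn, or if the VFs are
 * not evenly spaced. *out is left untouched on failure.
 */
int derive_vf_layout(uint8_t pf_devfn, uint8_t *vf_devfns, size_t n,
                     struct vf_layout *out)
{
    uint16_t offset;
    uint16_t stride;

    assert(vf_devfns != NULL && out != NULL);

    if (n == 0) {
        return -1;
    }

    qsort(vf_devfns, n, sizeof(vf_devfns[0]), compare_devfns);

    /* Every VF must come after the PF in devfn order. */
    if (vf_devfns[0] <= pf_devfn) {
        return -1;
    }

    offset = vf_devfns[0] - pf_devfn;
    stride = n < 2 ? 0 : vf_devfns[1] - vf_devfns[0];

    /* The i-th VF must sit exactly i strides after the first one. */
    for (size_t i = 0; i < n; i++) {
        if ((size_t)(vf_devfns[i] - vf_devfns[0]) != (size_t)stride * i) {
            return -1;
        }
    }

    out->offset = offset;
    out->stride = stride;
    return 0;
}
```

With a PF at devfn 0x08 and VFs at 0x0a, 0x0c and 0x0e (given in any
order), this yields offset 2 and stride 2; VFs at 0x0a, 0x0c and 0x0f are
rejected because the spacing is uneven. The patch itself performs further
checks (parent bus, vendor and device IDs, BAR sizes and types) that this
sketch leaves out.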