From nobody Tue May 14 12:06:46 2024
From: Lukasz Maniak
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, ani@anisinha.ca, armbru@redhat.com,
    f4bug@amsat.org, fam@euphon.net, hreitz@redhat.com, imammedo@redhat.com,
    its@irrelevant.dk, kbusch@kernel.org, k.jensen@samsung.com,
    kwolf@redhat.com, lukasz.gieryk@linux.intel.com,
    lukasz.maniak@linux.intel.com, marcel.apfelbaum@gmail.com,
    mst@redhat.com, stefanha@redhat.com, xypron.glpk@gmx.de
Subject: [PATCH v8 01/12] hw/nvme: Add support for SR-IOV
Date: Mon, 9 May 2022 16:16:09 +0200
Message-Id: <20220509141620.3868733-2-lukasz.maniak@linux.intel.com>
In-Reply-To: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
References: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
Content-Type: text/plain; charset="utf-8"

This patch implements initial support for Single Root I/O Virtualization
(SR-IOV) on an NVMe device. Essentially, it allows defining the maximum
number of virtual functions supported by the NVMe controller via the
sriov_max_vfs parameter.

Passing a non-zero value to sriov_max_vfs triggers reporting of the SR-IOV
capability by the physical controller and of the ARI capability by both the
physical and virtual function devices.

NVMe controllers created via virtual functions mirror the physical
controller functionally, which may not always be desirable, so some
consideration is needed of how to limit the capabilities of a VF.

An NVMe subsystem is required for the use of SR-IOV.

Signed-off-by: Lukasz Maniak
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
 hw/nvme/ctrl.c           | 85 ++++++++++++++++++++++++++++++++++++++--
 hw/nvme/nvme.h           |  3 +-
 include/hw/pci/pci_ids.h |  1 +
 3 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 03760ddeae8..0e1d8d03c87 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -35,6 +35,7 @@
  *              mdts=<N[optional]>,vsl=<N[optional]>, \
  *              zoned.zasl=<N[optional]>, \
  *              zoned.auto_transition=<on|off[optional]>, \
+ *              sriov_max_vfs=<N[optional]> \
  *              subsys=<subsys_id>
  *      -device nvme-ns,drive=<drive_id>,bus=<bus_name>,nsid=<nsid>,\
  *              zoned=<true|false[optional]>, \
@@ -106,6 +107,12 @@
  *   transitioned to zone state closed for resource management purposes.
  *   Defaults to 'on'.
  *
+ * - `sriov_max_vfs`
+ *   Indicates the maximum number of PCIe virtual functions supported
+ *   by the controller. The default value is 0. Specifying a non-zero value
+ *   enables reporting of both SR-IOV and ARI capabilities by the NVMe device.
+ *   Virtual function controllers will not report SR-IOV capability.
+ *
  * nvme namespace device parameters
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  * - `shared`
@@ -160,6 +167,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/hostmem.h"
 #include "hw/pci/msix.h"
+#include "hw/pci/pcie_sriov.h"
 #include "migration/vmstate.h"
 
 #include "nvme.h"
@@ -176,6 +184,9 @@
 #define NVME_TEMPERATURE_CRITICAL 0x175
 #define NVME_NUM_FW_SLOTS 1
 #define NVME_DEFAULT_MAX_ZA_SIZE (128 * KiB)
+#define NVME_MAX_VFS 127
+#define NVME_VF_OFFSET 0x1
+#define NVME_VF_STRIDE 1
 
 #define NVME_GUEST_ERR(trace, fmt, ...) \
     do { \
@@ -5886,6 +5897,10 @@ static void nvme_ctrl_reset(NvmeCtrl *n)
         g_free(event);
     }
 
+    if (!pci_is_vf(&n->parent_obj) && n->params.sriov_max_vfs) {
+        pcie_sriov_pf_disable_vfs(&n->parent_obj);
+    }
+
     n->aer_queued = 0;
     n->outstanding_aers = 0;
     n->qs_created = false;
@@ -6567,6 +6582,29 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
         error_setg(errp, "vsl must be non-zero");
         return;
     }
+
+    if (params->sriov_max_vfs) {
+        if (!n->subsys) {
+            error_setg(errp, "subsystem is required for the use of SR-IOV");
+            return;
+        }
+
+        if (params->sriov_max_vfs > NVME_MAX_VFS) {
+            error_setg(errp, "sriov_max_vfs must be between 0 and %d",
+                       NVME_MAX_VFS);
+            return;
+        }
+
+        if (params->cmb_size_mb) {
+            error_setg(errp, "CMB is not supported with SR-IOV");
+            return;
+        }
+
+        if (n->pmr.dev) {
+            error_setg(errp, "PMR is not supported with SR-IOV");
+            return;
+        }
+    }
 }
 
 static void nvme_init_state(NvmeCtrl *n)
@@ -6624,6 +6662,20 @@ static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev)
     memory_region_set_enabled(&n->pmr.dev->mr, false);
 }
 
+static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset,
+                            uint64_t bar_size)
+{
+    uint16_t vf_dev_id = n->params.use_intel_id ?
+                         PCI_DEVICE_ID_INTEL_NVME : PCI_DEVICE_ID_REDHAT_NVME;
+
+    pcie_sriov_pf_init(pci_dev, offset, "nvme", vf_dev_id,
+                       n->params.sriov_max_vfs, n->params.sriov_max_vfs,
+                       NVME_VF_OFFSET, NVME_VF_STRIDE);
+
+    pcie_sriov_pf_init_vf_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
+                              PCI_BASE_ADDRESS_MEM_TYPE_64, bar_size);
+}
+
 static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 {
     uint8_t *pci_conf = pci_dev->config;
@@ -6638,7 +6690,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 
     if (n->params.use_intel_id) {
         pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
-        pci_config_set_device_id(pci_conf, 0x5845);
+        pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_NVME);
     } else {
         pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT);
         pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_REDHAT_NVME);
@@ -6646,6 +6698,9 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 
     pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS);
     pcie_endpoint_cap_init(pci_dev, 0x80);
+    if (n->params.sriov_max_vfs) {
+        pcie_ari_init(pci_dev, 0x100, 1);
+    }
 
     bar_size = QEMU_ALIGN_UP(n->reg_size, 4 * KiB);
     msix_table_offset = bar_size;
@@ -6664,8 +6719,12 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
                           n->reg_size);
     memory_region_add_subregion(&n->bar0, 0, &n->iomem);
 
-    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
-                     PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0);
+    if (pci_is_vf(pci_dev)) {
+        pcie_sriov_vf_register_bar(pci_dev, 0, &n->bar0);
+    } else {
+        pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0);
+    }
     ret = msix_init(pci_dev, n->params.msix_qsize,
                     &n->bar0, 0, msix_table_offset,
                     &n->bar0, 0, msix_pba_offset, 0, &err);
@@ -6686,6 +6745,10 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
         nvme_init_pmr(n, pci_dev);
     }
 
+    if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) {
+        nvme_init_sriov(n, pci_dev, 0x120, bar_size);
+    }
+
     return 0;
 }
 
@@ -6836,6 +6899,16 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     NvmeCtrl *n = NVME(pci_dev);
     NvmeNamespace *ns;
     Error *local_err = NULL;
+    NvmeCtrl *pn = NVME(pcie_sriov_get_pf(pci_dev));
+
+    if (pci_is_vf(pci_dev)) {
+        /*
+         * VFs derive settings from the parent. PF's lifespan exceeds
+         * that of VF's, so it's safe to share params.serial.
+         */
+        memcpy(&n->params, &pn->params, sizeof(NvmeParams));
+        n->subsys = pn->subsys;
+    }
 
     nvme_check_constraints(n, &local_err);
     if (local_err) {
@@ -6900,6 +6973,11 @@ static void nvme_exit(PCIDevice *pci_dev)
     if (n->pmr.dev) {
         host_memory_backend_set_mapped(n->pmr.dev, false);
     }
+
+    if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) {
+        pcie_sriov_pf_exit(pci_dev);
+    }
+
     msix_uninit(pci_dev, &n->bar0, &n->bar0);
     memory_region_del_subregion(&n->bar0, &n->iomem);
 }
@@ -6924,6 +7002,7 @@ static Property nvme_props[] = {
     DEFINE_PROP_UINT8("zoned.zasl", NvmeCtrl, params.zasl, 0),
     DEFINE_PROP_BOOL("zoned.auto_transition", NvmeCtrl,
                      params.auto_transition_zones, true),
+    DEFINE_PROP_UINT8("sriov_max_vfs", NvmeCtrl, params.sriov_max_vfs, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 739c8b8f796..89ca6e96401 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -24,7 +24,7 @@
 
 #include "block/nvme.h"
 
-#define NVME_MAX_CONTROLLERS 32
+#define NVME_MAX_CONTROLLERS 256
 #define NVME_MAX_NAMESPACES  256
 #define NVME_EUI64_DEFAULT ((uint64_t)0x5254000000000000)
 
@@ -405,6 +405,7 @@ typedef struct NvmeParams {
     uint8_t  zasl;
     bool     auto_transition_zones;
     bool     legacy_cmb;
+    uint8_t  sriov_max_vfs;
 } NvmeParams;
 
 typedef struct NvmeCtrl {
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 11abe22d460..992426768e6 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -237,6 +237,7 @@
 #define PCI_DEVICE_ID_INTEL_82801BA_11   0x244e
 #define PCI_DEVICE_ID_INTEL_82801D       0x24CD
 #define PCI_DEVICE_ID_INTEL_ESB_9        0x25ab
+#define PCI_DEVICE_ID_INTEL_NVME         0x5845
 #define PCI_DEVICE_ID_INTEL_82371SB_0    0x7000
 #define PCI_DEVICE_ID_INTEL_82371SB_1    0x7010
 #define PCI_DEVICE_ID_INTEL_82371SB_2    0x7020
-- 
2.25.1

From nobody Tue May 14 12:06:46 2024
From: Lukasz Maniak
To: qemu-devel@nongnu.org
Subject: [PATCH v8 02/12] hw/nvme: Add support for Primary Controller
 Capabilities
Date: Mon, 9 May 2022 16:16:10 +0200
Message-Id: <20220509141620.3868733-3-lukasz.maniak@linux.intel.com>
In-Reply-To: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
References: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
Content-Type: text/plain; charset="utf-8"

Implementation of the Primary Controller Capabilities data structure
(Identify command with CNS value of 14h).

Currently, the command returns only the ID of the primary controller.
Handling of the remaining fields is added in subsequent patches
implementing virtualization enhancements.

Signed-off-by: Lukasz Maniak
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
 hw/nvme/ctrl.c       | 23 ++++++++++++++++++-----
 hw/nvme/nvme.h       |  2 ++
 hw/nvme/trace-events |  1 +
 include/block/nvme.h | 23 +++++++++++++++++++++
 4 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 0e1d8d03c87..ea9d5af3545 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4799,6 +4799,14 @@ static uint16_t nvme_identify_ctrl_list(NvmeCtrl *n, NvmeRequest *req,
     return nvme_c2h(n, (uint8_t *)list, sizeof(list), req);
 }
 
+static uint16_t nvme_identify_pri_ctrl_cap(NvmeCtrl *n, NvmeRequest *req)
+{
+    trace_pci_nvme_identify_pri_ctrl_cap(le16_to_cpu(n->pri_ctrl_cap.cntlid));
+
+    return nvme_c2h(n, (uint8_t *)&n->pri_ctrl_cap,
+                    sizeof(NvmePriCtrlCap), req);
+}
+
 static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
                                      bool active)
 {
@@ -5018,6 +5026,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
         return nvme_identify_ctrl_list(n, req, true);
     case NVME_ID_CNS_CTRL_LIST:
         return nvme_identify_ctrl_list(n, req, false);
+    case NVME_ID_CNS_PRIMARY_CTRL_CAP:
+        return nvme_identify_pri_ctrl_cap(n, req);
     case NVME_ID_CNS_CS_NS:
         return nvme_identify_ns_csi(n, req, true);
     case NVME_ID_CNS_CS_NS_PRESENT:
@@ -6609,6 +6619,8 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
 
 static void nvme_init_state(NvmeCtrl *n)
 {
+    NvmePriCtrlCap *cap = &n->pri_ctrl_cap;
+
     /* add one to max_ioqpairs to account for the admin queue pair */
     n->reg_size = pow2ceil(sizeof(NvmeBar) +
                            2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
@@ -6618,6 +6630,8 @@ static void nvme_init_state(NvmeCtrl *n)
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
+
+    cap->cntlid = cpu_to_le16(n->cntlid);
 }
 
 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
@@ -6919,15 +6933,14 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     qbus_init(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS, &pci_dev->qdev,
               n->parent_obj.qdev.id);
 
-    nvme_init_state(n);
-    if (nvme_init_pci(n, pci_dev, errp)) {
-        return;
-    }
-
     if (nvme_init_subsys(n, errp)) {
         error_propagate(errp, local_err);
         return;
     }
+    nvme_init_state(n);
+    if (nvme_init_pci(n, pci_dev, errp)) {
+        return;
+    }
     nvme_init_ctrl(n, pci_dev);
 
     /* setup a namespace if the controller drive property was given */
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 89ca6e96401..e58bab841e2 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -477,6 +477,8 @@ typedef struct NvmeCtrl {
         uint32_t                async_config;
         NvmeHostBehaviorSupport hbs;
     } features;
+
+    NvmePriCtrlCap  pri_ctrl_cap;
 } NvmeCtrl;
 
 static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid)
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index ff1b4589692..1834b17cf21 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -56,6 +56,7 @@ pci_nvme_identify_ctrl(void) "identify controller"
 pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8""
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_ctrl_list(uint8_t cns, uint16_t cntid) "cns 0x%"PRIx8" cntid %"PRIu16""
+pci_nvme_identify_pri_ctrl_cap(uint16_t cntlid) "identify primary controller capabilities cntlid=%"PRIu16""
 pci_nvme_identify_ns_csi(uint32_t ns, uint8_t csi) "nsid=%"PRIu32", csi=0x%"PRIx8""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 3737351cc81..524a04fb94e 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1033,6 +1033,7 @@ enum NvmeIdCns {
     NVME_ID_CNS_NS_PRESENT            = 0x11,
     NVME_ID_CNS_NS_ATTACHED_CTRL_LIST = 0x12,
     NVME_ID_CNS_CTRL_LIST             = 0x13,
+    NVME_ID_CNS_PRIMARY_CTRL_CAP      = 0x14,
     NVME_ID_CNS_CS_NS_PRESENT_LIST    = 0x1a,
     NVME_ID_CNS_CS_NS_PRESENT         = 0x1b,
     NVME_ID_CNS_IO_COMMAND_SET        = 0x1c,
@@ -1553,6 +1554,27 @@ typedef enum NvmeZoneState {
     NVME_ZONE_STATE_OFFLINE          = 0x0f,
 } NvmeZoneState;
 
+typedef struct QEMU_PACKED NvmePriCtrlCap {
+    uint16_t    cntlid;
+    uint16_t    portid;
+    uint8_t     crt;
+    uint8_t     rsvd5[27];
+    uint32_t    vqfrt;
+    uint32_t    vqrfa;
+    uint16_t    vqrfap;
+    uint16_t    vqprt;
+    uint16_t    vqfrsm;
+    uint16_t    vqgran;
+    uint8_t     rsvd48[16];
+    uint32_t    vifrt;
+    uint32_t    virfa;
+    uint16_t    virfap;
+    uint16_t    viprt;
+    uint16_t    vifrsm;
+    uint16_t    vigran;
+    uint8_t     rsvd80[4016];
+} NvmePriCtrlCap;
+
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
@@ -1588,5 +1610,6 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
     QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescr) != 64);
     QEMU_BUILD_BUG_ON(sizeof(NvmeDifTuple) != 16);
+    QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) != 4096);
 }
 #endif
-- 
2.25.1

From nobody Tue May 14 12:06:46 2024
From: Lukasz Maniak
To: qemu-devel@nongnu.org
Subject: [PATCH v8 03/12] hw/nvme: Add support for Secondary Controller List
Date: Mon, 9 May 2022 16:16:11 +0200
Message-Id: <20220509141620.3868733-4-lukasz.maniak@linux.intel.com>
In-Reply-To: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
References: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
Content-Type: text/plain; charset="utf-8"

Introduce handling for the Secondary Controller List (Identify command
with CNS value of 15h).

Secondary controller IDs are unique within the subsystem, so they are
reserved by the subsystem upon initialization of the primary controller,
up to the number of sriov_max_vfs. ID reservation requires an
intermediate controller slot state, so a reserved slot holds the
sentinel value 0xFFFF. A secondary controller is in the reserved state
when it has no virtual function assigned but its primary controller is
realized. Secondary controller reservations are released (set back to
NULL) when the primary controller is unregistered.

Signed-off-by: Lukasz Maniak
Acked-by: Michael S. Tsirkin
Reviewed-by: Klaus Jensen
---
 hw/nvme/ctrl.c       | 35 +++++++++++++++++++++
 hw/nvme/ns.c         |  2 +-
 hw/nvme/nvme.h       | 18 +++++++++++
 hw/nvme/subsys.c     | 75 ++++++++++++++++++++++++++++++++++++++------
 hw/nvme/trace-events |  1 +
 include/block/nvme.h | 20 ++++++++++++
 6 files changed, 141 insertions(+), 10 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index ea9d5af3545..b1b1bebbaf2 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4807,6 +4807,29 @@ static uint16_t nvme_identify_pri_ctrl_cap(NvmeCtrl *n, NvmeRequest *req)
                     sizeof(NvmePriCtrlCap), req);
 }
 
+static uint16_t nvme_identify_sec_ctrl_list(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    uint16_t pri_ctrl_id = le16_to_cpu(n->pri_ctrl_cap.cntlid);
+    uint16_t min_id = le16_to_cpu(c->ctrlid);
+    uint8_t num_sec_ctrl = n->sec_ctrl_list.numcntl;
+    NvmeSecCtrlList list = {0};
+    uint8_t i;
+
+    for (i = 0; i < num_sec_ctrl; i++) {
+        if (n->sec_ctrl_list.sec[i].scid >= min_id) {
+            list.numcntl = num_sec_ctrl - i;
+            memcpy(&list.sec, n->sec_ctrl_list.sec + i,
+                   list.numcntl * sizeof(NvmeSecCtrlEntry));
+            break;
+        }
+    }
+
+    trace_pci_nvme_identify_sec_ctrl_list(pri_ctrl_id, list.numcntl);
+
+    return nvme_c2h(n, (uint8_t *)&list, sizeof(list), req);
+}
+
 static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
                                      bool active)
 {
@@ -5028,6 +5051,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
         return nvme_identify_ctrl_list(n, req, false);
     case NVME_ID_CNS_PRIMARY_CTRL_CAP:
         return nvme_identify_pri_ctrl_cap(n, req);
+    case NVME_ID_CNS_SECONDARY_CTRL_LIST:
+        return nvme_identify_sec_ctrl_list(n, req);
     case NVME_ID_CNS_CS_NS:
         return nvme_identify_ns_csi(n, req, true);
     case NVME_ID_CNS_CS_NS_PRESENT:
@@ -6620,6 +6645,9 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
 static void nvme_init_state(NvmeCtrl *n)
 {
     NvmePriCtrlCap *cap = &n->pri_ctrl_cap;
+    NvmeSecCtrlList *list = &n->sec_ctrl_list;
+    NvmeSecCtrlEntry *sctrl;
+    int i;
 
     /* add one to max_ioqpairs to account for the admin queue pair */
     n->reg_size = pow2ceil(sizeof(NvmeBar) +
@@ -6631,6 +6659,13 @@ static void nvme_init_state(NvmeCtrl *n)
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
 
+    list->numcntl = cpu_to_le16(n->params.sriov_max_vfs);
+    for (i = 0; i < n->params.sriov_max_vfs; i++) {
+        sctrl = &list->sec[i];
+        sctrl->pcid = cpu_to_le16(n->cntlid);
+        sctrl->vfn = cpu_to_le16(i + 1);
+    }
+
     cap->cntlid = cpu_to_le16(n->cntlid);
 }
 
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 324f53ea0cd..3b227de0065 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -596,7 +596,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
     for (i = 0; i < ARRAY_SIZE(subsys->ctrls); i++) {
         NvmeCtrl *ctrl = subsys->ctrls[i];
 
-        if (ctrl) {
+        if (ctrl && ctrl != SUBSYS_SLOT_RSVD) {
             nvme_attach_ns(ctrl, ns);
         }
     }
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index e58bab841e2..7581ef26fdb 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -43,6 +43,7 @@ typedef struct NvmeBus {
 #define TYPE_NVME_SUBSYS "nvme-subsys"
 #define NVME_SUBSYS(obj) \
     OBJECT_CHECK(NvmeSubsystem, (obj), TYPE_NVME_SUBSYS)
+#define SUBSYS_SLOT_RSVD (void *)0xFFFF
 
 typedef struct NvmeSubsystem {
     DeviceState parent_obj;
@@ -67,6 +68,10 @@ static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys,
         return NULL;
     }
 
+    if (subsys->ctrls[cntlid] == SUBSYS_SLOT_RSVD) {
+        return NULL;
+    }
+
     return subsys->ctrls[cntlid];
 }
 
@@ -479,6 +484,7 @@ typedef struct NvmeCtrl {
     } features;
 
     NvmePriCtrlCap  pri_ctrl_cap;
+    NvmeSecCtrlList sec_ctrl_list;
 } NvmeCtrl;
 
 static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid)
@@ -513,6 +519,18 @@ static inline uint16_t nvme_cid(NvmeRequest *req)
     return le16_to_cpu(req->cqe.cid);
 }
 
+static inline NvmeSecCtrlEntry *nvme_sctrl(NvmeCtrl *n)
+{
+    PCIDevice *pci_dev = &n->parent_obj;
+    NvmeCtrl *pf = NVME(pcie_sriov_get_pf(pci_dev));
+
+    if (pci_is_vf(pci_dev)) {
+        return &pf->sec_ctrl_list.sec[pcie_sriov_vf_number(pci_dev)];
+    }
+
+    return NULL;
+}
+
 void nvme_attach_ns(NvmeCtrl *n, NvmeNamespace *ns);
 uint16_t nvme_bounce_data(NvmeCtrl *n, void *ptr, uint32_t len,
                           NvmeTxDirection dir, NvmeRequest *req);
diff --git a/hw/nvme/subsys.c b/hw/nvme/subsys.c
index fb58d639504..f445ca86c25 100644
--- a/hw/nvme/subsys.c
+++ b/hw/nvme/subsys.c
@@ -11,20 +11,71 @@
 
 #include "nvme.h"
 
-int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp)
+static int nvme_subsys_reserve_cntlids(NvmeCtrl *n, int start, int num)
 {
     NvmeSubsystem *subsys = n->subsys;
-    int cntlid, nsid;
+    NvmeSecCtrlList *list = &n->sec_ctrl_list;
+    NvmeSecCtrlEntry *sctrl;
+    int i, cnt = 0;
+
+    for (i = start; i < ARRAY_SIZE(subsys->ctrls) && cnt < num; i++) {
+        if (!subsys->ctrls[i]) {
+            sctrl = &list->sec[cnt];
+            sctrl->scid = cpu_to_le16(i);
+            subsys->ctrls[i] = SUBSYS_SLOT_RSVD;
+            cnt++;
+        }
+    }
+
+    return cnt;
+}
 
-    for (cntlid = 
0; cntlid < ARRAY_SIZE(subsys->ctrls); cntlid++) { - if (!subsys->ctrls[cntlid]) { - break; +static void nvme_subsys_unreserve_cntlids(NvmeCtrl *n) +{ + NvmeSubsystem *subsys =3D n->subsys; + NvmeSecCtrlList *list =3D &n->sec_ctrl_list; + NvmeSecCtrlEntry *sctrl; + int i, cntlid; + + for (i =3D 0; i < n->params.sriov_max_vfs; i++) { + sctrl =3D &list->sec[i]; + cntlid =3D le16_to_cpu(sctrl->scid); + + if (cntlid) { + assert(subsys->ctrls[cntlid] =3D=3D SUBSYS_SLOT_RSVD); + subsys->ctrls[cntlid] =3D NULL; + sctrl->scid =3D 0; } } +} + +int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp) +{ + NvmeSubsystem *subsys =3D n->subsys; + NvmeSecCtrlEntry *sctrl =3D nvme_sctrl(n); + int cntlid, nsid, num_rsvd, num_vfs =3D n->params.sriov_max_vfs; + + if (pci_is_vf(&n->parent_obj)) { + cntlid =3D le16_to_cpu(sctrl->scid); + } else { + for (cntlid =3D 0; cntlid < ARRAY_SIZE(subsys->ctrls); cntlid++) { + if (!subsys->ctrls[cntlid]) { + break; + } + } =20 - if (cntlid =3D=3D ARRAY_SIZE(subsys->ctrls)) { - error_setg(errp, "no more free controller id"); - return -1; + if (cntlid =3D=3D ARRAY_SIZE(subsys->ctrls)) { + error_setg(errp, "no more free controller id"); + return -1; + } + + num_rsvd =3D nvme_subsys_reserve_cntlids(n, cntlid + 1, num_vfs); + if (num_rsvd !=3D num_vfs) { + nvme_subsys_unreserve_cntlids(n); + error_setg(errp, + "no more free controller ids for secondary controll= ers"); + return -1; + } } =20 subsys->ctrls[cntlid] =3D n; @@ -41,7 +92,13 @@ int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp) =20 void nvme_subsys_unregister_ctrl(NvmeSubsystem *subsys, NvmeCtrl *n) { - subsys->ctrls[n->cntlid] =3D NULL; + if (pci_is_vf(&n->parent_obj)) { + subsys->ctrls[n->cntlid] =3D SUBSYS_SLOT_RSVD; + } else { + subsys->ctrls[n->cntlid] =3D NULL; + nvme_subsys_unreserve_cntlids(n); + } + n->cntlid =3D -1; } =20 diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events index 1834b17cf21..889bbb3101e 100644 --- a/hw/nvme/trace-events +++ b/hw/nvme/trace-events @@ 
-57,6 +57,7 @@ pci_nvme_identify_ctrl_csi(uint8_t csi) "identify control= ler, csi=3D0x%"PRIx8"" pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" pci_nvme_identify_ctrl_list(uint8_t cns, uint16_t cntid) "cns 0x%"PRIx8" c= ntid %"PRIu16"" pci_nvme_identify_pri_ctrl_cap(uint16_t cntlid) "identify primary controll= er capabilities cntlid=3D%"PRIu16"" +pci_nvme_identify_sec_ctrl_list(uint16_t cntlid, uint8_t numcntl) "identif= y secondary controller list cntlid=3D%"PRIu16" numcntl=3D%"PRIu8"" pci_nvme_identify_ns_csi(uint32_t ns, uint8_t csi) "nsid=3D%"PRIu32", csi= =3D0x%"PRIx8"" pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32"" pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=3D%"PRIu16", = csi=3D0x%"PRIx8"" diff --git a/include/block/nvme.h b/include/block/nvme.h index 524a04fb94e..94efd32578c 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -1034,6 +1034,7 @@ enum NvmeIdCns { NVME_ID_CNS_NS_ATTACHED_CTRL_LIST =3D 0x12, NVME_ID_CNS_CTRL_LIST =3D 0x13, NVME_ID_CNS_PRIMARY_CTRL_CAP =3D 0x14, + NVME_ID_CNS_SECONDARY_CTRL_LIST =3D 0x15, NVME_ID_CNS_CS_NS_PRESENT_LIST =3D 0x1a, NVME_ID_CNS_CS_NS_PRESENT =3D 0x1b, NVME_ID_CNS_IO_COMMAND_SET =3D 0x1c, @@ -1575,6 +1576,23 @@ typedef struct QEMU_PACKED NvmePriCtrlCap { uint8_t rsvd80[4016]; } NvmePriCtrlCap; =20 +typedef struct QEMU_PACKED NvmeSecCtrlEntry { + uint16_t scid; + uint16_t pcid; + uint8_t scs; + uint8_t rsvd5[3]; + uint16_t vfn; + uint16_t nvq; + uint16_t nvi; + uint8_t rsvd14[18]; +} NvmeSecCtrlEntry; + +typedef struct QEMU_PACKED NvmeSecCtrlList { + uint8_t numcntl; + uint8_t rsvd1[31]; + NvmeSecCtrlEntry sec[127]; +} NvmeSecCtrlList; + static inline void _nvme_check_size(void) { QEMU_BUILD_BUG_ON(sizeof(NvmeBar) !=3D 4096); @@ -1611,5 +1629,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescr) !=3D 64); QEMU_BUILD_BUG_ON(sizeof(NvmeDifTuple) !=3D 16); QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) !=3D 4096); + 
QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlEntry) !=3D 32); + QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlList) !=3D 4096); } #endif --=20 2.25.1

From: Lukasz Maniak
Subject: [PATCH v8 04/12] hw/nvme: Implement the Function Level Reset
Date: Mon, 9 May 2022 16:16:12 +0200
Message-Id: <20220509141620.3868733-5-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

This patch implements the Function Level Reset (FLR), a feature currently not implemented by the NVMe device, even though it is listed as mandatory ("shall") in revision 1.4 of the NVMe specification. The implementation reuses the FLR-related building blocks defined for the pci-bridge module and follows the same logic:
- the FLR capability is advertised in the PCIe config space,
- a custom pci_write_config callback detects a write to the trigger register and performs the PCI reset,
- which, eventually, calls the custom dc->reset handler.

Depending on the reset type, parts of the state should (or should not) be cleared.
To distinguish the type of reset, an additional parameter is passed to the reset function.

This patch also enables advertisement of the Power Management PCI capability. The main reason behind it is to announce the no_soft_reset=1 bit, signaling SR-IOV support where each VF can be reset individually. The implementation deliberately ignores writes to the PMCS.PS register, as even such naïve behavior is enough to correctly handle the D3->D0 transition. It is worth noting that the power-state transition back to D3, with all the corresponding side effects, wasn't and still isn't handled properly.

Signed-off-by: Łukasz Gieryk
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
hw/nvme/ctrl.c | 52 ++++++++++++++++++++++++++++++++++++++++---- hw/nvme/nvme.h | 5 +++++ hw/nvme/trace-events | 1 + 3 files changed, 54 insertions(+), 4 deletions(-) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index b1b1bebbaf2..e6d6e5840af 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -5901,7 +5901,7 @@ static void nvme_process_sq(void *opaque) } } =20 -static void nvme_ctrl_reset(NvmeCtrl *n) +static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst) { NvmeNamespace *ns; int i; @@ -5933,7 +5933,9 @@ static void nvme_ctrl_reset(NvmeCtrl *n) } =20 if (!pci_is_vf(&n->parent_obj) && n->params.sriov_max_vfs) { - pcie_sriov_pf_disable_vfs(&n->parent_obj); + if (rst !=3D NVME_RESET_CONTROLLER) { + pcie_sriov_pf_disable_vfs(&n->parent_obj); + } } =20 n->aer_queued =3D 0; @@ -6167,7 +6169,7 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset= , uint64_t data, } } else if (!NVME_CC_EN(data) && NVME_CC_EN(cc)) { trace_pci_nvme_mmio_stopped(); - nvme_ctrl_reset(n); + nvme_ctrl_reset(n, NVME_RESET_CONTROLLER); cc =3D 0; csts &=3D ~NVME_CSTS_READY; } @@ -6725,6 +6727,28 @@ static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *= pci_dev, uint16_t offset, PCI_BASE_ADDRESS_MEM_TYPE_64, bar_size); } =20 +static int nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t
offset) +{ + Error *err =3D NULL; + int ret; + + ret =3D pci_add_capability(pci_dev, PCI_CAP_ID_PM, offset, + PCI_PM_SIZEOF, &err); + if (err) { + error_report_err(err); + return ret; + } + + pci_set_word(pci_dev->config + offset + PCI_PM_PMC, + PCI_PM_CAP_VER_1_2); + pci_set_word(pci_dev->config + offset + PCI_PM_CTRL, + PCI_PM_CTRL_NO_SOFT_RESET); + pci_set_word(pci_dev->wmask + offset + PCI_PM_CTRL, + PCI_PM_CTRL_STATE_MASK); + + return 0; +} + static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) { uint8_t *pci_conf =3D pci_dev->config; @@ -6746,7 +6770,9 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_= dev, Error **errp) } =20 pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS); + nvme_add_pm_capability(pci_dev, 0x60); pcie_endpoint_cap_init(pci_dev, 0x80); + pcie_cap_flr_init(pci_dev); if (n->params.sriov_max_vfs) { pcie_ari_init(pci_dev, 0x100, 1); } @@ -6997,7 +7023,7 @@ static void nvme_exit(PCIDevice *pci_dev) NvmeNamespace *ns; int i; =20 - nvme_ctrl_reset(n); + nvme_ctrl_reset(n, NVME_RESET_FUNCTION); =20 if (n->subsys) { for (i =3D 1; i <=3D NVME_MAX_NAMESPACES; i++) { @@ -7096,6 +7122,22 @@ static void nvme_set_smart_warning(Object *obj, Visi= tor *v, const char *name, } } =20 +static void nvme_pci_reset(DeviceState *qdev) +{ + PCIDevice *pci_dev =3D PCI_DEVICE(qdev); + NvmeCtrl *n =3D NVME(pci_dev); + + trace_pci_nvme_pci_reset(); + nvme_ctrl_reset(n, NVME_RESET_FUNCTION); +} + +static void nvme_pci_write_config(PCIDevice *dev, uint32_t address, + uint32_t val, int len) +{ + pci_default_write_config(dev, address, val, len); + pcie_cap_flr_write_config(dev, address, val, len); +} + static const VMStateDescription nvme_vmstate =3D { .name =3D "nvme", .unmigratable =3D 1, @@ -7107,6 +7149,7 @@ static void nvme_class_init(ObjectClass *oc, void *da= ta) PCIDeviceClass *pc =3D PCI_DEVICE_CLASS(oc); =20 pc->realize =3D nvme_realize; + pc->config_write =3D nvme_pci_write_config; pc->exit =3D nvme_exit; pc->class_id =3D 
PCI_CLASS_STORAGE_EXPRESS; pc->revision =3D 2; @@ -7115,6 +7158,7 @@ static void nvme_class_init(ObjectClass *oc, void *da= ta) dc->desc =3D "Non-Volatile Memory Express"; device_class_set_props(dc, nvme_props); dc->vmsd =3D &nvme_vmstate; + dc->reset =3D nvme_pci_reset; } =20 static void nvme_instance_init(Object *obj) diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h index 7581ef26fdb..4ca6cb2bac1 100644 --- a/hw/nvme/nvme.h +++ b/hw/nvme/nvme.h @@ -487,6 +487,11 @@ typedef struct NvmeCtrl { NvmeSecCtrlList sec_ctrl_list; } NvmeCtrl; =20 +typedef enum NvmeResetType { + NVME_RESET_FUNCTION =3D 0, + NVME_RESET_CONTROLLER =3D 1, +} NvmeResetType; + static inline NvmeNamespace *nvme_ns(NvmeCtrl *n, uint32_t nsid) { if (!nsid || nsid > NVME_MAX_NAMESPACES) { diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events index 889bbb3101e..b07864c5732 100644 --- a/hw/nvme/trace-events +++ b/hw/nvme/trace-events @@ -110,6 +110,7 @@ pci_nvme_zd_extension_set(uint32_t zone_idx) "set descr= iptor extension for zone_ pci_nvme_clear_ns_close(uint32_t state, uint64_t slba) "zone state=3D%"PRI= u32", slba=3D%"PRIu64" transitioned to Closed state" pci_nvme_clear_ns_reset(uint32_t state, uint64_t slba) "zone state=3D%"PRI= u32", slba=3D%"PRIu64" transitioned to Empty state" pci_nvme_zoned_zrwa_implicit_flush(uint64_t zslba, uint32_t nlb) "zslba 0x= %"PRIx64" nlb %"PRIu32"" +pci_nvme_pci_reset(void) "PCI Function Level Reset" =20 # error conditions pci_nvme_err_mdts(size_t len) "len %zu" --=20 2.25.1

From: Lukasz Maniak
Subject: [PATCH v8 05/12] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime
Date: Mon, 9 May 2022 16:16:13 +0200
Message-Id: <20220509141620.3868733-6-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

The NVMe device defines two properties: max_ioqpairs and msix_qsize. Having them as constants is problematic for SR-IOV support. SR-IOV introduces virtual resources (queues, interrupts) that can be assigned to the PF and its dependent VFs. Each device, following a reset, should work with its configured number of queues, so a single constant is no longer sufficient to hold the whole state. This patch solves the problem by introducing additional variables in NvmeCtrl's state.
The variables for, e.g., managing queues are therefore organized as follows:
- n->params.max_ioqpairs – unchanged; a constant set by the user
- n->(mutable_state) – (not part of this patch) user-configurable; specifies the number of queues available _after_ reset
- n->conf_ioqpairs – (new) used everywhere in place of the 'old' n->params.max_ioqpairs; initialized in realize() and updated during reset() to reflect the user's changes to the mutable state

Since the number of available I/O queues and interrupts can change at runtime, the buffers for SQs/CQs and the MSI-X-related structures are allocated large enough to handle the limits, completely avoiding complicated reallocation. A helper function (nvme_update_msixcap_ts) updates the corresponding capability register to signal configuration changes.

Signed-off-by: Łukasz Gieryk
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
hw/nvme/ctrl.c | 52 ++++++++++++++++++++++++++++++++++---------------- hw/nvme/nvme.h | 2 ++ 2 files changed, 38 insertions(+), 16 deletions(-) diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index e6d6e5840af..12372038075 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -448,12 +448,12 @@ static bool nvme_nsid_valid(NvmeCtrl *n, uint32_t nsi= d) =20 static int nvme_check_sqid(NvmeCtrl *n, uint16_t sqid) { - return sqid < n->params.max_ioqpairs + 1 && n->sq[sqid] !=3D NULL ? 0 = : -1; + return sqid < n->conf_ioqpairs + 1 && n->sq[sqid] !=3D NULL ? 0 : -1; } =20 static int nvme_check_cqid(NvmeCtrl *n, uint16_t cqid) { - return cqid < n->params.max_ioqpairs + 1 && n->cq[cqid] !=3D NULL ?
0 : -1; } =20 static void nvme_inc_cq_tail(NvmeCQueue *cq) @@ -4290,8 +4290,7 @@ static uint16_t nvme_create_sq(NvmeCtrl *n, NvmeReque= st *req) trace_pci_nvme_err_invalid_create_sq_cqid(cqid); return NVME_INVALID_CQID | NVME_DNR; } - if (unlikely(!sqid || sqid > n->params.max_ioqpairs || - n->sq[sqid] !=3D NULL)) { + if (unlikely(!sqid || sqid > n->conf_ioqpairs || n->sq[sqid] !=3D NULL= )) { trace_pci_nvme_err_invalid_create_sq_sqid(sqid); return NVME_INVALID_QID | NVME_DNR; } @@ -4643,8 +4642,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeReque= st *req) trace_pci_nvme_create_cq(prp1, cqid, vector, qsize, qflags, NVME_CQ_FLAGS_IEN(qflags) !=3D 0); =20 - if (unlikely(!cqid || cqid > n->params.max_ioqpairs || - n->cq[cqid] !=3D NULL)) { + if (unlikely(!cqid || cqid > n->conf_ioqpairs || n->cq[cqid] !=3D NULL= )) { trace_pci_nvme_err_invalid_create_cq_cqid(cqid); return NVME_INVALID_QID | NVME_DNR; } @@ -4660,7 +4658,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeReque= st *req) trace_pci_nvme_err_invalid_create_cq_vector(vector); return NVME_INVALID_IRQ_VECTOR | NVME_DNR; } - if (unlikely(vector >=3D n->params.msix_qsize)) { + if (unlikely(vector >=3D n->conf_msix_qsize)) { trace_pci_nvme_err_invalid_create_cq_vector(vector); return NVME_INVALID_IRQ_VECTOR | NVME_DNR; } @@ -5261,13 +5259,12 @@ defaults: =20 break; case NVME_NUMBER_OF_QUEUES: - result =3D (n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16); + result =3D (n->conf_ioqpairs - 1) | ((n->conf_ioqpairs - 1) << 16); trace_pci_nvme_getfeat_numq(result); break; case NVME_INTERRUPT_VECTOR_CONF: iv =3D dw11 & 0xffff; - if (iv >=3D n->params.max_ioqpairs + 1) { + if (iv >=3D n->conf_ioqpairs + 1) { return NVME_INVALID_FIELD | NVME_DNR; } =20 @@ -5423,10 +5420,10 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeR= equest *req) =20 trace_pci_nvme_setfeat_numq((dw11 & 0xffff) + 1, ((dw11 >> 16) & 0xffff) + 1, - n->params.max_ioqpairs, - n->params.max_ioqpairs); - req->cqe.result 
=3D cpu_to_le32((n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16)= ); + n->conf_ioqpairs, + n->conf_ioqpairs); + req->cqe.result =3D cpu_to_le32((n->conf_ioqpairs - 1) | + ((n->conf_ioqpairs - 1) << 16)); break; case NVME_ASYNCHRONOUS_EVENT_CONF: n->features.async_config =3D dw11; @@ -5901,8 +5898,24 @@ static void nvme_process_sq(void *opaque) } } =20 +static void nvme_update_msixcap_ts(PCIDevice *pci_dev, uint32_t table_size) +{ + uint8_t *config; + + if (!msix_present(pci_dev)) { + return; + } + + assert(table_size > 0 && table_size <=3D pci_dev->msix_entries_nr); + + config =3D pci_dev->config + pci_dev->msix_cap; + pci_set_word_by_mask(config + PCI_MSIX_FLAGS, PCI_MSIX_FLAGS_QSIZE, + table_size - 1); +} + static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst) { + PCIDevice *pci_dev =3D &n->parent_obj; NvmeNamespace *ns; int i; =20 @@ -5932,15 +5945,17 @@ static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetT= ype rst) g_free(event); } =20 - if (!pci_is_vf(&n->parent_obj) && n->params.sriov_max_vfs) { + if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) { if (rst !=3D NVME_RESET_CONTROLLER) { - pcie_sriov_pf_disable_vfs(&n->parent_obj); + pcie_sriov_pf_disable_vfs(pci_dev); } } =20 n->aer_queued =3D 0; n->outstanding_aers =3D 0; n->qs_created =3D false; + + nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize); } =20 static void nvme_ctrl_shutdown(NvmeCtrl *n) @@ -6651,6 +6666,9 @@ static void nvme_init_state(NvmeCtrl *n) NvmeSecCtrlEntry *sctrl; int i; =20 + n->conf_ioqpairs =3D n->params.max_ioqpairs; + n->conf_msix_qsize =3D n->params.msix_qsize; + /* add one to max_ioqpairs to account for the admin queue pair */ n->reg_size =3D pow2ceil(sizeof(NvmeBar) + 2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE= ); @@ -6812,6 +6830,8 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_= dev, Error **errp) } } =20 + nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize); + if (n->params.cmb_size_mb) { nvme_init_cmb(n, pci_dev); } diff --git 
a/hw/nvme/nvme.h b/hw/nvme/nvme.h index 4ca6cb2bac1..5bd6ac698bc 100644 --- a/hw/nvme/nvme.h +++ b/hw/nvme/nvme.h @@ -438,6 +438,8 @@ typedef struct NvmeCtrl { uint64_t starttime_ms; uint16_t temperature; uint8_t smart_critical_warning; + uint32_t conf_msix_qsize; + uint32_t conf_ioqpairs; =20 struct { MemoryRegion mem; --=20 2.25.1 From nobody Tue May 14 12:06:46 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.intel.com ARC-Seal: i=1; a=rsa-sha256; t=1652106887; cv=none; d=zohomail.com; s=zohoarc; b=iNlvtr2UKMewg7lkJNcP1cRnWLU+A4loMnwdfrO7LWu/M9lc+ItG8V3Ln+V2PoXKlTgJseo/xUGsOraR3QIZKwlbBHA0gMip9SIpHhod4VxVQUAHiy1UBxw3SfA1/kWYPEd3wSx51ye7htx10MFxS0eynfQZrMxcUmAkYvuRCSc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1652106887; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=dhrQ3YVovTVPjUxDvc7V1yXX/TFeERa4X4j/NB/5LE0=; b=Ifr00whY+HGMgU/qTj2BT1vhGPloItzFw4edjGqimM/sa4ukFzBR+oe+SleWz7kF9xKiUS3qgvIc17HIAUp9CeWlG48edTfUhpoLMx/NxhtqLRMygswJJBA996L9rWrq61EatOK113b1aj/0h1IFDsRpqqh08rwGebthojQWJPE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1652106887949334.2047867096501; Mon, 9 May 2022 07:34:47 -0700 (PDT) Received: from localhost ([::1]:45624 helo=lists1p.gnu.org) by lists.gnu.org with 
esmtp (Exim 4.90_1) (envelope-from ) id 1no4TG-0003Bw-JP for importer@patchew.org; Mon, 09 May 2022 10:34:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38450) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1no4E2-00020w-TZ; Mon, 09 May 2022 10:19:02 -0400 Received: from mga05.intel.com ([192.55.52.43]:24288) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1no4E1-0001AQ-4F; Mon, 09 May 2022 10:19:02 -0400 Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 May 2022 07:19:00 -0700 Received: from lmaniak-dev.elements.local ([10.55.249.72]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 May 2022 07:18:56 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1652105941; x=1683641941; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DKmc6oHVK3NuZ+zXFAtKW5DjHA67+ELc73SRywXs/OI=; b=gZ6avGFByMrdJChpZ07YB3tbxT35mMN9ckCFQdeSqiorb/tzH7Obwppv ZceQcf0oNEDYvbmIX0ZT0lfX0l/OIIo2GErBPRTOUKFFlI+QK2jaN3GEg +/PNcpluzgN9CWqMJAsC7zEDoHB2iIdyz0NbJ1kaQq3OSoPiWByo11iZN lvjT1WLU54JKARBa0s1wPj7fC0jG215kNM87kZ2kONsVjGZL9tp2rhyWg W7OwrjOY6lbP7g7QpnQJXbXVZnR/InZOfVZVp7PmJEBzqllB15IHJNhfo g4+ByTNSlfOhw5Zddrt1frw+K67YGdvLLVM4pTrzKlYjGJh/jQFrQQiep Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10342"; a="355490846" X-IronPort-AV: E=Sophos;i="5.91,211,1647327600"; d="scan'208";a="355490846" X-IronPort-AV: E=Sophos;i="5.91,211,1647327600"; d="scan'208";a="622987309" From: Lukasz Maniak To: qemu-devel@nongnu.org Cc: qemu-block@nongnu.org, ani@anisinha.ca, armbru@redhat.com, f4bug@amsat.org, fam@euphon.net, hreitz@redhat.com, imammedo@redhat.com, its@irrelevant.dk, kbusch@kernel.org, k.jensen@samsung.com, kwolf@redhat.com, lukasz.gieryk@linux.intel.com, 

From: Lukasz Maniak
Subject: [PATCH v8 06/12] hw/nvme: Remove reg_size variable and update BAR0 size calculation
Date: Mon, 9 May 2022 16:16:14 +0200
Message-Id: <20220509141620.3868733-7-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

The n->reg_size parameter unnecessarily splits the BAR0 size calculation
into two phases; remove it to simplify the code.

With all the calculations done in one place, the pow2ceil originally
applied to reg_size turns out to be unnecessary. The rounding should
happen as the last step, once the BAR size includes the NVMe registers,
the queue registers, and the MSI-X-related space.
Finally, the size of the MMIO memory region is extended to cover the first
4 KiB of padding (see the map below). Accesses to this range are handled as
interactions with a non-existent queue and generate an error trace, so
effectively nothing changes, while the reg_size variable is no longer
needed.

    --------------------
    |       BAR0       |
    --------------------
    [Nvme Registers    ]
    [Queues            ]
    [power-of-2 padding] - removed in this patch
    [4KiB padding (1)  ]
    [MSIX TABLE        ]
    [4KiB padding (2)  ]
    [MSIX PBA          ]
    [power-of-2 padding]

Signed-off-by: Łukasz Gieryk
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
 hw/nvme/ctrl.c | 10 +++++-----
 hw/nvme/nvme.h |  1 -
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 12372038075..f34d73a00c8 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -6669,9 +6669,6 @@ static void nvme_init_state(NvmeCtrl *n)
     n->conf_ioqpairs = n->params.max_ioqpairs;
     n->conf_msix_qsize = n->params.msix_qsize;
 
-    /* add one to max_ioqpairs to account for the admin queue pair */
-    n->reg_size = pow2ceil(sizeof(NvmeBar) +
-                           2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE);
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
     n->temperature = NVME_TEMPERATURE;
@@ -6795,7 +6792,10 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
         pcie_ari_init(pci_dev, 0x100, 1);
     }
 
-    bar_size = QEMU_ALIGN_UP(n->reg_size, 4 * KiB);
+    /* add one to max_ioqpairs to account for the admin queue pair */
+    bar_size = sizeof(NvmeBar) +
+               2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE;
+    bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
     msix_table_offset = bar_size;
     msix_table_size = PCI_MSIX_ENTRY_SIZE * n->params.msix_qsize;
 
@@ -6809,7 +6809,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 
     memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size);
     memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme",
-                          n->reg_size);
+                          msix_table_offset);
     memory_region_add_subregion(&n->bar0, 0, &n->iomem);
 
     if (pci_is_vf(pci_dev)) {
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 5bd6ac698bc..adde718105b 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -428,7 +428,6 @@ typedef struct NvmeCtrl {
     uint16_t    max_prp_ents;
     uint16_t    cqe_size;
     uint16_t    sqe_size;
-    uint32_t    reg_size;
     uint32_t    max_q_ents;
     uint8_t     outstanding_aers;
     uint32_t    irq_status;
-- 
2.25.1
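The rounding-order argument in the commit message above can be checked with standalone arithmetic. This is an illustrative sketch, not QEMU code: it assumes the register block is 4 KiB, a doorbell is 4 bytes, and an MSI-X table entry is 16 bytes (the PBA is one bit per vector, rounded up to 64-bit chunks).

```c
#include <assert.h>
#include <stdint.h>

static uint64_t align_up(uint64_t v, uint64_t a) { return (v + a - 1) / a * a; }

static uint64_t pow2ceil(uint64_t v)
{
    uint64_t p = 1;
    while (p < v) { p <<= 1; }
    return p;
}

/* Old scheme: the register+doorbell area was rounded to a power of two on
 * its own (the removed reg_size), then the MSI-X space appended, then
 * pow2ceil applied again at the end. */
static uint64_t bar_size_old(unsigned total_queues, unsigned total_irqs)
{
    uint64_t size = pow2ceil(4096 + 2 * total_queues * 4);
    size = align_up(size, 4096);
    size += 16 * total_irqs;              /* MSI-X table */
    size = align_up(size, 4096);
    size += align_up(total_irqs, 64) / 8; /* MSI-X PBA */
    return pow2ceil(size);
}

/* New scheme: only 4 KiB alignment in the middle; the power-of-two
 * rounding happens once, as the very last step. */
static uint64_t bar_size_new(unsigned total_queues, unsigned total_irqs)
{
    uint64_t size = align_up(4096 + 2 * total_queues * 4, 4096);
    size += 16 * total_irqs;
    size = align_up(size, 4096);
    size += align_up(total_irqs, 64) / 8;
    return pow2ceil(size);
}
```

For small configurations both orderings agree; once the register/doorbell area itself crosses a power-of-two boundary, the early pow2ceil can needlessly double the final BAR size.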

From: Lukasz Maniak
Subject: [PATCH v8 07/12] hw/nvme: Calculate BAR attributes in a function
Date: Mon, 9 May 2022 16:16:15 +0200
Message-Id: <20220509141620.3868733-8-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

An NVMe device with SR-IOV capability calculates the BAR size differently
for PF and VF, so it makes sense to extract the common code to a separate
function.

Signed-off-by: Łukasz Gieryk
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
 hw/nvme/ctrl.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f34d73a00c8..f0554a07c40 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -6728,6 +6728,34 @@ static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev)
     memory_region_set_enabled(&n->pmr.dev->mr, false);
 }
 
+static uint64_t nvme_bar_size(unsigned total_queues, unsigned total_irqs,
+                              unsigned *msix_table_offset,
+                              unsigned *msix_pba_offset)
+{
+    uint64_t bar_size, msix_table_size, msix_pba_size;
+
+    bar_size = sizeof(NvmeBar) + 2 * total_queues * NVME_DB_SIZE;
+    bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
+
+    if (msix_table_offset) {
+        *msix_table_offset = bar_size;
+    }
+
+    msix_table_size = PCI_MSIX_ENTRY_SIZE * total_irqs;
+    bar_size += msix_table_size;
+    bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
+
+    if (msix_pba_offset) {
+        *msix_pba_offset = bar_size;
+    }
+
+    msix_pba_size = QEMU_ALIGN_UP(total_irqs, 64) / 8;
+    bar_size += msix_pba_size;
+
+    bar_size = pow2ceil(bar_size);
+    return bar_size;
+}
+
 static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset,
                             uint64_t bar_size)
 {
@@ -6767,7 +6795,7 @@ static int nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t offset)
 static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 {
     uint8_t *pci_conf = pci_dev->config;
-    uint64_t bar_size, msix_table_size, msix_pba_size;
+    uint64_t bar_size;
     unsigned msix_table_offset, msix_pba_offset;
     int ret;
 
@@ -6793,19 +6821,8 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
     }
 
     /* add one to max_ioqpairs to account for the admin queue pair */
-    bar_size = sizeof(NvmeBar) +
-               2 * (n->params.max_ioqpairs + 1) * NVME_DB_SIZE;
-    bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
-    msix_table_offset = bar_size;
-    msix_table_size = PCI_MSIX_ENTRY_SIZE * n->params.msix_qsize;
-
-    bar_size += msix_table_size;
-    bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB);
-    msix_pba_offset = bar_size;
-    msix_pba_size = QEMU_ALIGN_UP(n->params.msix_qsize, 64) / 8;
-
-    bar_size += msix_pba_size;
-    bar_size = pow2ceil(bar_size);
+    bar_size = nvme_bar_size(n->params.max_ioqpairs + 1, n->params.msix_qsize,
+                             &msix_table_offset, &msix_pba_offset);
 
     memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size);
     memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme",
-- 
2.25.1
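The shape of the helper extracted in this patch — one size computation with nullable out-parameters for the MSI-X offsets — can be exercised standalone. The constants below are illustrative stand-ins for `sizeof(NvmeBar)`, `NVME_DB_SIZE` and `PCI_MSIX_ENTRY_SIZE`, not QEMU's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t align_4k(uint64_t v) { return (v + 4095) & ~(uint64_t)4095; }

static uint64_t pow2_ceil(uint64_t v)
{
    uint64_t p = 1;
    while (p < v) { p <<= 1; }
    return p;
}

/* Mirror of the refactored nvme_bar_size(): computes the BAR size for any
 * queue/interrupt count and optionally reports the MSI-X table/PBA offsets.
 * Callers that only need the size pass NULL for the offsets. */
static uint64_t bar_size(unsigned total_queues, unsigned total_irqs,
                         unsigned *msix_table_offset, unsigned *msix_pba_offset)
{
    uint64_t size = 4096 + 2 * total_queues * 4;  /* registers + doorbells */

    size = align_4k(size);
    if (msix_table_offset) {
        *msix_table_offset = (unsigned)size;
    }

    size += 16 * total_irqs;                      /* MSI-X table entries */
    size = align_4k(size);
    if (msix_pba_offset) {
        *msix_pba_offset = (unsigned)size;
    }

    size += (total_irqs + 63) / 64 * 8;           /* MSI-X PBA, 1 bit/vector */
    return pow2_ceil(size);
}
```

The PF path needs both offsets to create the MSI-X regions; a caller interested only in the size (as the SR-IOV path added later in this series) passes NULL for both.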

From: Lukasz Maniak
Subject: [PATCH v8 08/12] hw/nvme: Initialize capability structures for primary/secondary controllers
Date: Mon, 9 May 2022 16:16:16 +0200
Message-Id: <20220509141620.3868733-9-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

With four new properties:
 - sriov_v{i,q}_flexible,
 - sriov_max_v{i,q}_per_vf,
one can configure the number of available flexible resources, as well as
the limits. The primary and secondary controller capability structures are
initialized accordingly.
Since the number of available queues (interrupts) now differs between the
PF and the VFs, the BAR size calculation is adjusted as well.

Signed-off-by: Łukasz Gieryk
Acked-by: Michael S. Tsirkin
Reviewed-by: Klaus Jensen
---
 hw/nvme/ctrl.c       | 141 ++++++++++++++++++++++++++++++++++++++++---
 hw/nvme/nvme.h       |   4 ++
 include/block/nvme.h |   5 ++
 3 files changed, 143 insertions(+), 7 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f0554a07c40..011231ab5a6 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -36,6 +36,10 @@
  *              zoned.zasl=, \
  *              zoned.auto_transition=, \
  *              sriov_max_vfs= \
+ *              sriov_vq_flexible= \
+ *              sriov_vi_flexible= \
+ *              sriov_max_vi_per_vf= \
+ *              sriov_max_vq_per_vf= \
  *              subsys=
  * -device nvme-ns,drive=,bus=,nsid=,\
  *              zoned=, \
@@ -113,6 +117,29 @@
  *   enables reporting of both SR-IOV and ARI capabilities by the NVMe device.
  *   Virtual function controllers will not report SR-IOV capability.
  *
+ * NOTE: Single Root I/O Virtualization support is experimental.
+ * All the related parameters may be subject to change.
+ *
+ * - `sriov_vq_flexible`
+ *   Indicates the total number of flexible queue resources assignable to all
+ *   the secondary controllers. Implicitly sets the number of primary
+ *   controller's private resources to `(max_ioqpairs - sriov_vq_flexible)`.
+ *
+ * - `sriov_vi_flexible`
+ *   Indicates the total number of flexible interrupt resources assignable to
+ *   all the secondary controllers. Implicitly sets the number of primary
+ *   controller's private resources to `(msix_qsize - sriov_vi_flexible)`.
+ *
+ * - `sriov_max_vi_per_vf`
+ *   Indicates the maximum number of virtual interrupt resources assignable
+ *   to a secondary controller. The default 0 resolves to
+ *   `(sriov_vi_flexible / sriov_max_vfs)`.
+ *
+ * - `sriov_max_vq_per_vf`
+ *   Indicates the maximum number of virtual queue resources assignable to
+ *   a secondary controller. The default 0 resolves to
+ *   `(sriov_vq_flexible / sriov_max_vfs)`.
+ *
  * nvme namespace device parameters
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  * - `shared`
@@ -185,6 +212,7 @@
 #define NVME_NUM_FW_SLOTS 1
 #define NVME_DEFAULT_MAX_ZA_SIZE (128 * KiB)
 #define NVME_MAX_VFS 127
+#define NVME_VF_RES_GRANULARITY 1
 #define NVME_VF_OFFSET 0x1
 #define NVME_VF_STRIDE 1
 
@@ -6656,6 +6684,53 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
             error_setg(errp, "PMR is not supported with SR-IOV");
             return;
         }
+
+        if (!params->sriov_vq_flexible || !params->sriov_vi_flexible) {
+            error_setg(errp, "both sriov_vq_flexible and sriov_vi_flexible"
+                       " must be set for the use of SR-IOV");
+            return;
+        }
+
+        if (params->sriov_vq_flexible < params->sriov_max_vfs * 2) {
+            error_setg(errp, "sriov_vq_flexible must be greater than or equal"
+                       " to %d (sriov_max_vfs * 2)", params->sriov_max_vfs * 2);
+            return;
+        }
+
+        if (params->max_ioqpairs < params->sriov_vq_flexible + 2) {
+            error_setg(errp, "(max_ioqpairs - sriov_vq_flexible) must be"
+                       " greater than or equal to 2");
+            return;
+        }
+
+        if (params->sriov_vi_flexible < params->sriov_max_vfs) {
+            error_setg(errp, "sriov_vi_flexible must be greater than or equal"
+                       " to %d (sriov_max_vfs)", params->sriov_max_vfs);
+            return;
+        }
+
+        if (params->msix_qsize < params->sriov_vi_flexible + 1) {
+            error_setg(errp, "(msix_qsize - sriov_vi_flexible) must be"
+                       " greater than or equal to 1");
+            return;
+        }
+
+        if (params->sriov_max_vi_per_vf &&
+            (params->sriov_max_vi_per_vf - 1) % NVME_VF_RES_GRANULARITY) {
+            error_setg(errp, "sriov_max_vi_per_vf must meet:"
+                       " (sriov_max_vi_per_vf - 1) %% %d == 0 and"
+                       " sriov_max_vi_per_vf >= 1", NVME_VF_RES_GRANULARITY);
+            return;
+        }
+
+        if (params->sriov_max_vq_per_vf &&
+            (params->sriov_max_vq_per_vf < 2 ||
+             (params->sriov_max_vq_per_vf - 1) % NVME_VF_RES_GRANULARITY)) {
+            error_setg(errp, "sriov_max_vq_per_vf must meet:"
+                       " (sriov_max_vq_per_vf - 1) %% %d == 0 and"
+                       " sriov_max_vq_per_vf >= 2", NVME_VF_RES_GRANULARITY);
+            return;
+        }
     }
 }
 
@@ -6664,10 +6739,19 @@ static void nvme_init_state(NvmeCtrl *n)
     NvmePriCtrlCap *cap = &n->pri_ctrl_cap;
     NvmeSecCtrlList *list = &n->sec_ctrl_list;
     NvmeSecCtrlEntry *sctrl;
+    uint8_t max_vfs;
     int i;
 
-    n->conf_ioqpairs = n->params.max_ioqpairs;
-    n->conf_msix_qsize = n->params.msix_qsize;
+    if (pci_is_vf(&n->parent_obj)) {
+        sctrl = nvme_sctrl(n);
+        max_vfs = 0;
+        n->conf_ioqpairs = sctrl->nvq ? le16_to_cpu(sctrl->nvq) - 1 : 0;
+        n->conf_msix_qsize = sctrl->nvi ? le16_to_cpu(sctrl->nvi) : 1;
+    } else {
+        max_vfs = n->params.sriov_max_vfs;
+        n->conf_ioqpairs = n->params.max_ioqpairs;
+        n->conf_msix_qsize = n->params.msix_qsize;
+    }
 
     n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
     n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
@@ -6676,14 +6760,41 @@ static void nvme_init_state(NvmeCtrl *n)
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
 
-    list->numcntl = cpu_to_le16(n->params.sriov_max_vfs);
-    for (i = 0; i < n->params.sriov_max_vfs; i++) {
+    list->numcntl = cpu_to_le16(max_vfs);
+    for (i = 0; i < max_vfs; i++) {
         sctrl = &list->sec[i];
         sctrl->pcid = cpu_to_le16(n->cntlid);
         sctrl->vfn = cpu_to_le16(i + 1);
     }
 
     cap->cntlid = cpu_to_le16(n->cntlid);
+    cap->crt = NVME_CRT_VQ | NVME_CRT_VI;
+
+    if (pci_is_vf(&n->parent_obj)) {
+        cap->vqprt = cpu_to_le16(1 + n->conf_ioqpairs);
+    } else {
+        cap->vqprt = cpu_to_le16(1 + n->params.max_ioqpairs -
+                                 n->params.sriov_vq_flexible);
+        cap->vqfrt = cpu_to_le32(n->params.sriov_vq_flexible);
+        cap->vqrfap = cap->vqfrt;
+        cap->vqgran = cpu_to_le16(NVME_VF_RES_GRANULARITY);
+        cap->vqfrsm = n->params.sriov_max_vq_per_vf ?
+                      cpu_to_le16(n->params.sriov_max_vq_per_vf) :
+                      cap->vqfrt / MAX(max_vfs, 1);
+    }
+
+    if (pci_is_vf(&n->parent_obj)) {
+        cap->viprt = cpu_to_le16(n->conf_msix_qsize);
+    } else {
+        cap->viprt = cpu_to_le16(n->params.msix_qsize -
+                                 n->params.sriov_vi_flexible);
+        cap->vifrt = cpu_to_le32(n->params.sriov_vi_flexible);
+        cap->virfap = cap->vifrt;
+        cap->vigran = cpu_to_le16(NVME_VF_RES_GRANULARITY);
+        cap->vifrsm = n->params.sriov_max_vi_per_vf ?
+                      cpu_to_le16(n->params.sriov_max_vi_per_vf) :
+                      cap->vifrt / MAX(max_vfs, 1);
+    }
 }
 
 static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev)
@@ -6756,11 +6867,14 @@ static uint64_t nvme_bar_size(unsigned total_queues, unsigned total_irqs,
     return bar_size;
 }
 
-static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset,
-                            uint64_t bar_size)
+static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset)
 {
     uint16_t vf_dev_id = n->params.use_intel_id ?
                          PCI_DEVICE_ID_INTEL_NVME : PCI_DEVICE_ID_REDHAT_NVME;
+    NvmePriCtrlCap *cap = &n->pri_ctrl_cap;
+    uint64_t bar_size = nvme_bar_size(le16_to_cpu(cap->vqfrsm),
+                                      le16_to_cpu(cap->vifrsm),
+                                      NULL, NULL);
 
     pcie_sriov_pf_init(pci_dev, offset, "nvme", vf_dev_id,
                        n->params.sriov_max_vfs, n->params.sriov_max_vfs,
@@ -6858,7 +6972,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
     }
 
     if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) {
-        nvme_init_sriov(n, pci_dev, 0x120, bar_size);
+        nvme_init_sriov(n, pci_dev, 0x120);
     }
 
     return 0;
@@ -6882,6 +6996,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NvmeIdCtrl *id = &n->id_ctrl;
     uint8_t *pci_conf = pci_dev->config;
     uint64_t cap = ldq_le_p(&n->bar.cap);
+    NvmeSecCtrlEntry *sctrl = nvme_sctrl(n);
 
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
@@ -6974,6 +7089,10 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 
     stl_le_p(&n->bar.vs, NVME_SPEC_VER);
     n->bar.intmc = n->bar.intms = 0;
+
+    if (pci_is_vf(&n->parent_obj) && !sctrl->scs) {
+        stl_le_p(&n->bar.csts, NVME_CSTS_FAILED);
+    }
 }
 
 static int nvme_init_subsys(NvmeCtrl *n, Error **errp)
@@ -7114,6 +7233,14 @@ static Property nvme_props[] = {
     DEFINE_PROP_BOOL("zoned.auto_transition", NvmeCtrl,
                      params.auto_transition_zones, true),
     DEFINE_PROP_UINT8("sriov_max_vfs", NvmeCtrl, params.sriov_max_vfs, 0),
+    DEFINE_PROP_UINT16("sriov_vq_flexible", NvmeCtrl,
+                       params.sriov_vq_flexible, 0),
+    DEFINE_PROP_UINT16("sriov_vi_flexible", NvmeCtrl,
+                       params.sriov_vi_flexible, 0),
+    DEFINE_PROP_UINT8("sriov_max_vi_per_vf", NvmeCtrl,
+                      params.sriov_max_vi_per_vf, 0),
+    DEFINE_PROP_UINT8("sriov_max_vq_per_vf", NvmeCtrl,
+                      params.sriov_max_vq_per_vf, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index adde718105b..fb18d1dc667 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -411,6 +411,10 @@ typedef struct NvmeParams {
     bool     auto_transition_zones;
     bool     legacy_cmb;
     uint8_t  sriov_max_vfs;
+    uint16_t sriov_vq_flexible;
+    uint16_t sriov_vi_flexible;
+    uint8_t  sriov_max_vq_per_vf;
+    uint8_t  sriov_max_vi_per_vf;
 } NvmeParams;
 
 typedef struct NvmeCtrl {
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 94efd32578c..58d08d5c2aa 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1576,6 +1576,11 @@ typedef struct QEMU_PACKED NvmePriCtrlCap {
     uint8_t  rsvd80[4016];
 } NvmePriCtrlCap;
 
+typedef enum NvmePriCtrlCapCrt {
+    NVME_CRT_VQ = 1 << 0,
+    NVME_CRT_VI = 1 << 1,
+} NvmePriCtrlCapCrt;
+
 typedef struct QEMU_PACKED NvmeSecCtrlEntry {
     uint16_t scid;
     uint16_t pcid;
-- 
2.25.1
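The SR-IOV parameter constraints enforced in nvme_check_constraints() and the per-VF defaults described in the documentation comment above can be restated as a standalone sketch. The struct and helper names are hypothetical (the fields merely mirror NvmeParams), and the granularity checks on sriov_max_v{q,i}_per_vf are omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mirror of the NvmeParams fields involved in the checks. */
struct sriov_params {
    uint32_t max_ioqpairs;
    uint32_t msix_qsize;
    uint8_t  sriov_max_vfs;
    uint16_t sriov_vq_flexible;
    uint16_t sriov_vi_flexible;
};

/* Constraints from nvme_check_constraints(), applied when SR-IOV is on:
 * both flexible counts set, enough flexible queues/interrupts for the VFs,
 * and enough private resources left for the PF. */
static bool sriov_params_valid(const struct sriov_params *p)
{
    if (p->sriov_max_vfs == 0) {
        return true; /* SR-IOV disabled: nothing to validate */
    }
    return p->sriov_vq_flexible && p->sriov_vi_flexible &&
           p->sriov_vq_flexible >= 2u * p->sriov_max_vfs &&
           p->max_ioqpairs >= p->sriov_vq_flexible + 2u &&
           p->sriov_vi_flexible >= p->sriov_max_vfs &&
           p->msix_qsize >= p->sriov_vi_flexible + 1u;
}

/* Default per-VF maximum: sriov_max_v{q,i}_per_vf == 0 resolves to the
 * flexible resources divided evenly among the VFs. */
static unsigned per_vf_limit(unsigned max_per_vf, unsigned flexible,
                             unsigned max_vfs)
{
    return max_per_vf ? max_per_vf : flexible / (max_vfs ? max_vfs : 1);
}
```

For example, with sriov_max_vfs=2 the smallest valid configuration needs sriov_vq_flexible=4 (two queue resources per VF), max_ioqpairs=6, sriov_vi_flexible=2 and msix_qsize=3.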

From: Lukasz Maniak
Subject: [PATCH v8 09/12] hw/nvme: Add support for the Virtualization Management command
Date: Mon, 9 May 2022 16:16:17 +0200
Message-Id: <20220509141620.3868733-10-lukasz.maniak@linux.intel.com>
Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: none client-ip=192.55.52.43; envelope-from=lukasz.maniak@linux.intel.com; helo=mga05.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @intel.com) X-ZM-MESSAGEID: 1652106272561100001 From: =C5=81ukasz Gieryk With the new command one can: - assign flexible resources (queues, interrupts) to primary and secondary controllers, - toggle the online/offline state of given controller. Signed-off-by: =C5=81ukasz Gieryk Acked-by: Michael S. 
Tsirkin
Reviewed-by: Klaus Jensen
---
 hw/nvme/ctrl.c       | 257 ++++++++++++++++++++++++++++++++++++++++++-
 hw/nvme/nvme.h       |  20 ++++
 hw/nvme/trace-events |   3 +
 include/block/nvme.h |  17 +++
 4 files changed, 295 insertions(+), 2 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 011231ab5a6..247c09882dd 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -188,6 +188,7 @@
 #include "qemu/error-report.h"
 #include "qemu/log.h"
 #include "qemu/units.h"
+#include "qemu/range.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "sysemu/sysemu.h"
@@ -262,6 +263,7 @@ static const uint32_t nvme_cse_acs[256] = {
     [NVME_ADM_CMD_GET_FEATURES]     = NVME_CMD_EFF_CSUPP,
     [NVME_ADM_CMD_ASYNC_EV_REQ]     = NVME_CMD_EFF_CSUPP,
     [NVME_ADM_CMD_NS_ATTACHMENT]    = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_NIC,
+    [NVME_ADM_CMD_VIRT_MNGMT]       = NVME_CMD_EFF_CSUPP,
     [NVME_ADM_CMD_FORMAT_NVM]       = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
 };
 
@@ -293,6 +295,7 @@ static const uint32_t nvme_cse_iocs_zoned[256] = {
 };
 
 static void nvme_process_sq(void *opaque);
+static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst);
 
 static uint16_t nvme_sqid(NvmeRequest *req)
 {
@@ -5838,6 +5841,167 @@ out:
     return status;
 }
 
+static void nvme_get_virt_res_num(NvmeCtrl *n, uint8_t rt, int *num_total,
+                                  int *num_prim, int *num_sec)
+{
+    *num_total = le32_to_cpu(rt ?
+                             n->pri_ctrl_cap.vifrt : n->pri_ctrl_cap.vqfrt);
+    *num_prim = le16_to_cpu(rt ?
+                            n->pri_ctrl_cap.virfap : n->pri_ctrl_cap.vqrfap);
+    *num_sec = le16_to_cpu(rt ? n->pri_ctrl_cap.virfa : n->pri_ctrl_cap.vqrfa);
+}
+
+static uint16_t nvme_assign_virt_res_to_prim(NvmeCtrl *n, NvmeRequest *req,
+                                             uint16_t cntlid, uint8_t rt,
+                                             int nr)
+{
+    int num_total, num_prim, num_sec;
+
+    if (cntlid != n->cntlid) {
+        return NVME_INVALID_CTRL_ID | NVME_DNR;
+    }
+
+    nvme_get_virt_res_num(n, rt, &num_total, &num_prim, &num_sec);
+
+    if (nr > num_total) {
+        return NVME_INVALID_NUM_RESOURCES | NVME_DNR;
+    }
+
+    if (nr > num_total - num_sec) {
+        return NVME_INVALID_RESOURCE_ID | NVME_DNR;
+    }
+
+    if (rt) {
+        n->next_pri_ctrl_cap.virfap = cpu_to_le16(nr);
+    } else {
+        n->next_pri_ctrl_cap.vqrfap = cpu_to_le16(nr);
+    }
+
+    req->cqe.result = cpu_to_le32(nr);
+    return req->status;
+}
+
+static void nvme_update_virt_res(NvmeCtrl *n, NvmeSecCtrlEntry *sctrl,
+                                 uint8_t rt, int nr)
+{
+    int prev_nr, prev_total;
+
+    if (rt) {
+        prev_nr = le16_to_cpu(sctrl->nvi);
+        prev_total = le32_to_cpu(n->pri_ctrl_cap.virfa);
+        sctrl->nvi = cpu_to_le16(nr);
+        n->pri_ctrl_cap.virfa = cpu_to_le32(prev_total + nr - prev_nr);
+    } else {
+        prev_nr = le16_to_cpu(sctrl->nvq);
+        prev_total = le32_to_cpu(n->pri_ctrl_cap.vqrfa);
+        sctrl->nvq = cpu_to_le16(nr);
+        n->pri_ctrl_cap.vqrfa = cpu_to_le32(prev_total + nr - prev_nr);
+    }
+}
+
+static uint16_t nvme_assign_virt_res_to_sec(NvmeCtrl *n, NvmeRequest *req,
+                                            uint16_t cntlid, uint8_t rt, int nr)
+{
+    int num_total, num_prim, num_sec, num_free, diff, limit;
+    NvmeSecCtrlEntry *sctrl;
+
+    sctrl = nvme_sctrl_for_cntlid(n, cntlid);
+    if (!sctrl) {
+        return NVME_INVALID_CTRL_ID | NVME_DNR;
+    }
+
+    if (sctrl->scs) {
+        return NVME_INVALID_SEC_CTRL_STATE | NVME_DNR;
+    }
+
+    limit = le16_to_cpu(rt ? n->pri_ctrl_cap.vifrsm : n->pri_ctrl_cap.vqfrsm);
+    if (nr > limit) {
+        return NVME_INVALID_NUM_RESOURCES | NVME_DNR;
+    }
+
+    nvme_get_virt_res_num(n, rt, &num_total, &num_prim, &num_sec);
+    num_free = num_total - num_prim - num_sec;
+    diff = nr - le16_to_cpu(rt ? sctrl->nvi : sctrl->nvq);
+
+    if (diff > num_free) {
+        return NVME_INVALID_RESOURCE_ID | NVME_DNR;
+    }
+
+    nvme_update_virt_res(n, sctrl, rt, nr);
+    req->cqe.result = cpu_to_le32(nr);
+
+    return req->status;
+}
+
+static uint16_t nvme_virt_set_state(NvmeCtrl *n, uint16_t cntlid, bool online)
+{
+    NvmeCtrl *sn = NULL;
+    NvmeSecCtrlEntry *sctrl;
+    int vf_index;
+
+    sctrl = nvme_sctrl_for_cntlid(n, cntlid);
+    if (!sctrl) {
+        return NVME_INVALID_CTRL_ID | NVME_DNR;
+    }
+
+    if (!pci_is_vf(&n->parent_obj)) {
+        vf_index = le16_to_cpu(sctrl->vfn) - 1;
+        sn = NVME(pcie_sriov_get_vf_at_index(&n->parent_obj, vf_index));
+    }
+
+    if (online) {
+        if (!sctrl->nvi || (le16_to_cpu(sctrl->nvq) < 2) || !sn) {
+            return NVME_INVALID_SEC_CTRL_STATE | NVME_DNR;
+        }
+
+        if (!sctrl->scs) {
+            sctrl->scs = 0x1;
+            nvme_ctrl_reset(sn, NVME_RESET_FUNCTION);
+        }
+    } else {
+        nvme_update_virt_res(n, sctrl, NVME_VIRT_RES_INTERRUPT, 0);
+        nvme_update_virt_res(n, sctrl, NVME_VIRT_RES_QUEUE, 0);
+
+        if (sctrl->scs) {
+            sctrl->scs = 0x0;
+            if (sn) {
+                nvme_ctrl_reset(sn, NVME_RESET_FUNCTION);
+            }
+        }
+    }
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_virt_mngmt(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint32_t dw10 = le32_to_cpu(req->cmd.cdw10);
+    uint32_t dw11 = le32_to_cpu(req->cmd.cdw11);
+    uint8_t act = dw10 & 0xf;
+    uint8_t rt = (dw10 >> 8) & 0x7;
+    uint16_t cntlid = (dw10 >> 16) & 0xffff;
+    int nr = dw11 & 0xffff;
+
+    trace_pci_nvme_virt_mngmt(nvme_cid(req), act, cntlid, rt ? "VI" : "VQ", nr);
+
+    if (rt != NVME_VIRT_RES_QUEUE && rt != NVME_VIRT_RES_INTERRUPT) {
+        return NVME_INVALID_RESOURCE_ID | NVME_DNR;
+    }
+
+    switch (act) {
+    case NVME_VIRT_MNGMT_ACTION_SEC_ASSIGN:
+        return nvme_assign_virt_res_to_sec(n, req, cntlid, rt, nr);
+    case NVME_VIRT_MNGMT_ACTION_PRM_ALLOC:
+        return nvme_assign_virt_res_to_prim(n, req, cntlid, rt, nr);
+    case NVME_VIRT_MNGMT_ACTION_SEC_ONLINE:
+        return nvme_virt_set_state(n, cntlid, true);
+    case NVME_VIRT_MNGMT_ACTION_SEC_OFFLINE:
+        return nvme_virt_set_state(n, cntlid, false);
+    default:
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+}
+
 static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode,
@@ -5880,6 +6044,8 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
         return nvme_aer(n, req);
     case NVME_ADM_CMD_NS_ATTACHMENT:
         return nvme_ns_attachment(n, req);
+    case NVME_ADM_CMD_VIRT_MNGMT:
+        return nvme_virt_mngmt(n, req);
     case NVME_ADM_CMD_FORMAT_NVM:
         return nvme_format(n, req);
     default:
@@ -5941,9 +6107,33 @@ static void nvme_update_msixcap_ts(PCIDevice *pci_dev, uint32_t table_size)
                               table_size - 1);
 }
 
+static void nvme_activate_virt_res(NvmeCtrl *n)
+{
+    PCIDevice *pci_dev = &n->parent_obj;
+    NvmePriCtrlCap *cap = &n->pri_ctrl_cap;
+    NvmeSecCtrlEntry *sctrl;
+
+    /* -1 to account for the admin queue */
+    if (pci_is_vf(pci_dev)) {
+        sctrl = nvme_sctrl(n);
+        cap->vqprt = sctrl->nvq;
+        cap->viprt = sctrl->nvi;
+        n->conf_ioqpairs = sctrl->nvq ? le16_to_cpu(sctrl->nvq) - 1 : 0;
+        n->conf_msix_qsize = sctrl->nvi ? le16_to_cpu(sctrl->nvi) : 1;
+    } else {
+        cap->vqrfap = n->next_pri_ctrl_cap.vqrfap;
+        cap->virfap = n->next_pri_ctrl_cap.virfap;
+        n->conf_ioqpairs = le16_to_cpu(cap->vqprt) +
+                           le16_to_cpu(cap->vqrfap) - 1;
+        n->conf_msix_qsize = le16_to_cpu(cap->viprt) +
+                             le16_to_cpu(cap->virfap);
+    }
+}
+
 static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
 {
     PCIDevice *pci_dev = &n->parent_obj;
+    NvmeSecCtrlEntry *sctrl;
     NvmeNamespace *ns;
     int i;
 
@@ -5973,9 +6163,20 @@ static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
         g_free(event);
     }
 
-    if (!pci_is_vf(pci_dev) && n->params.sriov_max_vfs) {
+    if (n->params.sriov_max_vfs) {
+        if (!pci_is_vf(pci_dev)) {
+            for (i = 0; i < n->sec_ctrl_list.numcntl; i++) {
+                sctrl = &n->sec_ctrl_list.sec[i];
+                nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
+            }
+
+            if (rst != NVME_RESET_CONTROLLER) {
+                pcie_sriov_pf_disable_vfs(pci_dev);
+            }
+        }
+
         if (rst != NVME_RESET_CONTROLLER) {
-        pcie_sriov_pf_disable_vfs(pci_dev);
+            nvme_activate_virt_res(n);
         }
     }
 
@@ -5984,6 +6185,13 @@ static void nvme_ctrl_reset(NvmeCtrl *n, NvmeResetType rst)
     n->qs_created = false;
 
     nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize);
+
+    if (pci_is_vf(pci_dev)) {
+        sctrl = nvme_sctrl(n);
+        stl_le_p(&n->bar.csts, sctrl->scs ? 0 : NVME_CSTS_FAILED);
+    } else {
+        stl_le_p(&n->bar.csts, 0);
+    }
 }
 
 static void nvme_ctrl_shutdown(NvmeCtrl *n)
@@ -6029,7 +6237,15 @@ static int nvme_start_ctrl(NvmeCtrl *n)
     uint64_t acq = ldq_le_p(&n->bar.acq);
     uint32_t page_bits = NVME_CC_MPS(cc) + 12;
     uint32_t page_size = 1 << page_bits;
+    NvmeSecCtrlEntry *sctrl = nvme_sctrl(n);
 
+    if (pci_is_vf(&n->parent_obj) && !sctrl->scs) {
+        trace_pci_nvme_err_startfail_virt_state(le16_to_cpu(sctrl->nvi),
+                                                le16_to_cpu(sctrl->nvq),
+                                                sctrl->scs ? "ONLINE" :
+                                                             "OFFLINE");
+        return -1;
+    }
     if (unlikely(n->cq[0])) {
         trace_pci_nvme_err_startfail_cq();
         return -1;
@@ -6412,6 +6628,12 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
         return 0;
     }
 
+    if (pci_is_vf(&n->parent_obj) && !nvme_sctrl(n)->scs &&
+        addr != NVME_REG_CSTS) {
+        trace_pci_nvme_err_ignored_mmio_vf_offline(addr, size);
+        return 0;
+    }
+
     /*
      * When PMRWBM bit 1 is set then read from
      * from PMRSTS should ensure prior writes
@@ -6561,6 +6783,12 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, uint64_t data,
 
     trace_pci_nvme_mmio_write(addr, data, size);
 
+    if (pci_is_vf(&n->parent_obj) && !nvme_sctrl(n)->scs &&
+        addr != NVME_REG_CSTS) {
+        trace_pci_nvme_err_ignored_mmio_vf_offline(addr, size);
+        return;
+    }
+
     if (addr < sizeof(n->bar)) {
         nvme_write_bar(n, addr, data, size);
     } else {
@@ -7295,9 +7523,34 @@ static void nvme_pci_reset(DeviceState *qdev)
     nvme_ctrl_reset(n, NVME_RESET_FUNCTION);
 }
 
+static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address,
+                                      uint32_t val, int len)
+{
+    NvmeCtrl *n = NVME(dev);
+    NvmeSecCtrlEntry *sctrl;
+    uint16_t sriov_cap = dev->exp.sriov_cap;
+    uint32_t off = address - sriov_cap;
+    int i, num_vfs;
+
+    if (!sriov_cap) {
+        return;
+    }
+
+    if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
+        if (!(val & PCI_SRIOV_CTRL_VFE)) {
+            num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
+            for (i = 0; i < num_vfs; i++) {
+                sctrl = &n->sec_ctrl_list.sec[i];
+                nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
+            }
+        }
+    }
+}
+
 static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
                                   uint32_t val, int len)
 {
+    nvme_sriov_pre_write_ctrl(dev, address, val, len);
     pci_default_write_config(dev, address, val, len);
     pcie_cap_flr_write_config(dev, address, val, len);
 }

diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index fb18d1dc667..38f365da84a 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -339,6 +339,7 @@ static inline const char *nvme_adm_opc_str(uint8_t opc)
     case NVME_ADM_CMD_GET_FEATURES:   return "NVME_ADM_CMD_GET_FEATURES";
     case NVME_ADM_CMD_ASYNC_EV_REQ:   return "NVME_ADM_CMD_ASYNC_EV_REQ";
     case NVME_ADM_CMD_NS_ATTACHMENT:  return "NVME_ADM_CMD_NS_ATTACHMENT";
+    case NVME_ADM_CMD_VIRT_MNGMT:     return "NVME_ADM_CMD_VIRT_MNGMT";
     case NVME_ADM_CMD_FORMAT_NVM:     return "NVME_ADM_CMD_FORMAT_NVM";
     default:                          return "NVME_ADM_CMD_UNKNOWN";
     }
@@ -490,6 +491,10 @@ typedef struct NvmeCtrl {
 
     NvmePriCtrlCap  pri_ctrl_cap;
     NvmeSecCtrlList sec_ctrl_list;
+    struct {
+        uint16_t    vqrfap;
+        uint16_t    virfap;
+    } next_pri_ctrl_cap;    /* These override pri_ctrl_cap after reset */
 } NvmeCtrl;
 
 typedef enum NvmeResetType {
@@ -541,6 +546,21 @@ static inline NvmeSecCtrlEntry *nvme_sctrl(NvmeCtrl *n)
     return NULL;
 }
 
+static inline NvmeSecCtrlEntry *nvme_sctrl_for_cntlid(NvmeCtrl *n,
+                                                      uint16_t cntlid)
+{
+    NvmeSecCtrlList *list = &n->sec_ctrl_list;
+    uint8_t i;
+
+    for (i = 0; i < list->numcntl; i++) {
+        if (le16_to_cpu(list->sec[i].scid) == cntlid) {
+            return &list->sec[i];
+        }
+    }
+
+    return NULL;
+}
+
 void nvme_attach_ns(NvmeCtrl *n, NvmeNamespace *ns);
 uint16_t nvme_bounce_data(NvmeCtrl *n, void *ptr, uint32_t len,
                           NvmeTxDirection dir, NvmeRequest *req);

diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index b07864c5732..065e1c891df 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -111,6 +111,7 @@ pci_nvme_clear_ns_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", sl
 pci_nvme_clear_ns_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state"
 pci_nvme_zoned_zrwa_implicit_flush(uint64_t zslba, uint32_t nlb) "zslba 0x%"PRIx64" nlb %"PRIu32""
 pci_nvme_pci_reset(void) "PCI Function Level Reset"
+pci_nvme_virt_mngmt(uint16_t cid, uint16_t act, uint16_t cntlid, const char* rt, uint16_t nr) "cid %"PRIu16", act=0x%"PRIx16", ctrlid=%"PRIu16" %s nr=%"PRIu16""
 
 # error conditions
 pci_nvme_err_mdts(size_t len) "len %zu"
@@ -180,7 +181,9 @@ pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the
 pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
 pci_nvme_err_startfail_zasl_too_small(uint32_t zasl, uint32_t pagesz) "nvme_start_ctrl failed because zone append size limit %"PRIu32" is too small, needs to be >= %"PRIu32""
 pci_nvme_err_startfail(void) "setting controller enable bit failed"
+pci_nvme_err_startfail_virt_state(uint16_t vq, uint16_t vi, const char *state) "nvme_start_ctrl failed due to ctrl state: vi=%u vq=%u %s"
 pci_nvme_err_invalid_mgmt_action(uint8_t action) "action=0x%"PRIx8""
+pci_nvme_err_ignored_mmio_vf_offline(uint64_t addr, unsigned size) "addr 0x%"PRIx64" size %d"
 
 # undefined behavior
 pci_nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64""

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 58d08d5c2aa..373c70b5ca7 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -595,6 +595,7 @@ enum NvmeAdminCommands {
     NVME_ADM_CMD_ACTIVATE_FW    = 0x10,
     NVME_ADM_CMD_DOWNLOAD_FW    = 0x11,
     NVME_ADM_CMD_NS_ATTACHMENT  = 0x15,
+    NVME_ADM_CMD_VIRT_MNGMT     = 0x1c,
     NVME_ADM_CMD_FORMAT_NVM     = 0x80,
     NVME_ADM_CMD_SECURITY_SEND  = 0x81,
     NVME_ADM_CMD_SECURITY_RECV  = 0x82,
@@ -899,6 +900,10 @@ enum NvmeStatusCodes {
     NVME_NS_PRIVATE             = 0x0119,
     NVME_NS_NOT_ATTACHED        = 0x011a,
     NVME_NS_CTRL_LIST_INVALID   = 0x011c,
+    NVME_INVALID_CTRL_ID        = 0x011f,
+    NVME_INVALID_SEC_CTRL_STATE = 0x0120,
+    NVME_INVALID_NUM_RESOURCES  = 0x0121,
+    NVME_INVALID_RESOURCE_ID    = 0x0122,
     NVME_CONFLICTING_ATTRS      = 0x0180,
     NVME_INVALID_PROT_INFO      = 0x0181,
     NVME_WRITE_TO_RO            = 0x0182,
@@ -1598,6 +1603,18 @@ typedef struct QEMU_PACKED NvmeSecCtrlList {
     NvmeSecCtrlEntry    sec[127];
 } NvmeSecCtrlList;
 
+typedef enum NvmeVirtMngmtAction {
+    NVME_VIRT_MNGMT_ACTION_PRM_ALLOC     = 0x01,
+    NVME_VIRT_MNGMT_ACTION_SEC_OFFLINE   = 0x07,
+    NVME_VIRT_MNGMT_ACTION_SEC_ASSIGN    = 0x08,
+    NVME_VIRT_MNGMT_ACTION_SEC_ONLINE    = 0x09,
+} NvmeVirtMngmtAction;
+
+typedef enum NvmeVirtualResourceType {
+    NVME_VIRT_RES_QUEUE         = 0x00,
+    NVME_VIRT_RES_INTERRUPT     = 0x01,
+} NvmeVirtualResourceType;
+
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
--
2.25.1
From: Lukasz Maniak
To: qemu-devel@nongnu.org
Subject: [PATCH v8 10/12] docs: Add documentation for SR-IOV and Virtualization Enhancements
Date: Mon, 9 May 2022 16:16:18 +0200
Message-Id: <20220509141620.3868733-11-lukasz.maniak@linux.intel.com>

The documentation describes the five new SR-IOV-related parameters:
- sriov_max_vfs
- sriov_vq_flexible
- sriov_vi_flexible
- sriov_max_vi_per_vf
- sriov_max_vq_per_vf

The description also includes the simplest possible QEMU invocation and
the series of NVMe commands required to enable SR-IOV support.

Signed-off-by: Lukasz Maniak
Acked-by: Michael S.
Tsirkin
Reviewed-by: Klaus Jensen
---
 docs/system/devices/nvme.rst | 82 ++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst
index b5acb2a9c19..aba253304e4 100644
--- a/docs/system/devices/nvme.rst
+++ b/docs/system/devices/nvme.rst
@@ -239,3 +239,85 @@ The virtual namespace device supports DIF- and DIX-based protection information
 to ``1`` to transfer protection information as the first eight bytes of
 metadata. Otherwise, the protection information is transferred as the last
 eight bytes.
+
+Virtualization Enhancements and SR-IOV (Experimental Support)
+-------------------------------------------------------------
+
+The ``nvme`` device supports Single Root I/O Virtualization and Sharing
+along with Virtualization Enhancements. The controller has to be linked to
+an NVM Subsystem device (``nvme-subsys``) for use with SR-IOV.
+
+A number of parameters are present (**please note that they may be
+subject to change**):
+
+``sriov_max_vfs`` (default: ``0``)
+  Indicates the maximum number of PCIe virtual functions supported
+  by the controller. Specifying a non-zero value enables reporting of both
+  SR-IOV and ARI (Alternative Routing-ID Interpretation) capabilities
+  by the NVMe device. Virtual function controllers will not report SR-IOV.
+
+``sriov_vq_flexible``
+  Indicates the total number of flexible queue resources assignable to all
+  the secondary controllers. Implicitly sets the number of the primary
+  controller's private resources to ``(max_ioqpairs - sriov_vq_flexible)``.
+
+``sriov_vi_flexible``
+  Indicates the total number of flexible interrupt resources assignable to
+  all the secondary controllers. Implicitly sets the number of the primary
+  controller's private resources to ``(msix_qsize - sriov_vi_flexible)``.
+
+``sriov_max_vi_per_vf`` (default: ``0``)
+  Indicates the maximum number of virtual interrupt resources assignable
+  to a secondary controller. The default ``0`` resolves to
+  ``(sriov_vi_flexible / sriov_max_vfs)``.
+
+``sriov_max_vq_per_vf`` (default: ``0``)
+  Indicates the maximum number of virtual queue resources assignable to
+  a secondary controller. The default ``0`` resolves to
+  ``(sriov_vq_flexible / sriov_max_vfs)``.
+
+The simplest possible invocation enables the capability to set up one VF
+controller and assign an admin queue, an IO queue, and an MSI-X interrupt.
+
+.. code-block:: console
+
+   -device nvme-subsys,id=subsys0
+   -device nvme,serial=deadbeef,subsys=subsys0,sriov_max_vfs=1,
+    sriov_vq_flexible=2,sriov_vi_flexible=1
+
+The minimum steps required to configure a functional NVMe secondary
+controller are:
+
+  * unbind flexible resources from the primary controller
+
+.. code-block:: console
+
+   nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0
+   nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0
+
+  * perform a Function Level Reset on the primary controller to actually
+    release the resources
+
+.. code-block:: console
+
+   echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
+
+  * enable VF
+
+.. code-block:: console
+
+   echo 1 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
+
+  * assign the flexible resources to the VF and set it ONLINE
+
+.. code-block:: console
+
+   nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 1
+   nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 2
+   nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0
+
+  * bind the NVMe driver to the VF
+
+.. code-block:: console
+
+   echo 0000:01:00.1 > /sys/bus/pci/drivers/nvme/bind
\ No newline at end of file
--
2.25.1

From: Lukasz Maniak
To: qemu-devel@nongnu.org
Subject: [PATCH v8 11/12] hw/nvme: Update the initialization place for the AER queue
Date: Mon, 9 May 2022 16:16:19 +0200
Message-Id: <20220509141620.3868733-12-lukasz.maniak@linux.intel.com>

From: Łukasz Gieryk

This patch updates the initialization place for the AER queue, so it's
initialized once, at controller initialization, and not every time the
controller is enabled.

While the original version works for a non-SR-IOV device, as it's hard
to interact with the controller if it's not enabled, the multiple
reinitialization is not necessarily correct. With the SR-IOV feature
enabled a segfault can happen: a VF can have its controller disabled,
while a namespace can still be attached to the controller through the
parent PF. An event generated in such a case ends up on an uninitialized
queue.
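Aside from QEMU specifics, the failure mode fixed here can be reduced to a minimal standalone C sketch (all names below are invented for illustration; this is not QEMU code, and a plain singly linked list stands in for QTAILQ): re-initializing an event queue's head on every controller enable silently orphans events queued while the controller was disabled, whereas initializing it once at device-creation time keeps them reachable.

```c
#include <stddef.h>

/* Minimal event queue standing in for QEMU's QTAILQ in this sketch. */
struct event { int id; struct event *next; };
struct dev { struct event *head; };

static void queue_init(struct dev *d)
{
    d->head = NULL;
}

static void queue_push(struct dev *d, struct event *e)
{
    e->next = d->head;
    d->head = e;
}

static size_t queue_len(const struct dev *d)
{
    size_t n = 0;
    for (const struct event *e = d->head; e; e = e->next) {
        n++;
    }
    return n;
}

/* Buggy pattern: enabling the controller re-initializes the queue,
 * dropping any event that was queued while it was disabled. */
static void enable_with_reinit(struct dev *d)
{
    queue_init(d);
}

/* Fixed pattern (what the patch does): the queue was initialized once
 * at device-init time, so enable leaves it alone. */
static void enable_init_once(struct dev *d)
{
    (void)d;
}
```

With this toy model, an event pushed before "enable" survives only under the init-once pattern, which mirrors why moving `QTAILQ_INIT` to controller initialization is the safer placement.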
While it's an interesting question whether a VF should support AER in
the first place, I don't think it must be answered today.

Signed-off-by: Łukasz Gieryk
Reviewed-by: Klaus Jensen
Acked-by: Michael S. Tsirkin
---
 hw/nvme/ctrl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 247c09882dd..b0862b1d96c 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -6326,8 +6326,6 @@ static int nvme_start_ctrl(NvmeCtrl *n)
 
     nvme_set_timestamp(n, 0ULL);
 
-    QTAILQ_INIT(&n->aer_queue);
-
     nvme_select_iocs(n);
 
     return 0;
@@ -6987,6 +6985,7 @@ static void nvme_init_state(NvmeCtrl *n)
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
+    QTAILQ_INIT(&n->aer_queue);
 
     list->numcntl = cpu_to_le16(max_vfs);
     for (i = 0; i < max_vfs; i++) {
--
2.25.1
b=PceGRRbDeqmv1GqTzn/WOl5absC6RJuH8l4XRMvxxIG+g5NBoB9MLWJtNUe/3Bt4lHz00rwgUeLIh78EBK6fC28dY1q1XvQ/r8kmq1U8BFqXEQvx2aFu34RuS0htSoxhCRc+hAydLfSuS8zKhYHirNYStK0H1zDaTl0EexnC8/I= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1652106746654353.74171023011274; Mon, 9 May 2022 07:32:26 -0700 (PDT) Received: from localhost ([::1]:37064 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1no4Qz-0005hq-Lo for importer@patchew.org; Mon, 09 May 2022 10:32:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38598) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1no4EQ-0002Mp-SZ; Mon, 09 May 2022 10:19:26 -0400 Received: from mga05.intel.com ([192.55.52.43]:24353) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1no4EP-0001Ef-4p; Mon, 09 May 2022 10:19:26 -0400 Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 May 2022 07:19:23 -0700 Received: from lmaniak-dev.elements.local ([10.55.249.72]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 May 2022 07:19:19 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1652105965; x=1683641965; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T3j6nnZa5/IGQxypAhVXl35cX7LvpTIlvzv6DpCG0+E=; b=VIMSV9xkiMKjU6wGiJzIOjxjkYjxogVy8jgaQyNgTLqtbgdrmApo69Jp sPIJEoGOpnrGjyBLvyggW+XjekpBNDIiv0uFNs54Vwh7x84IZBdvRmY4Y 
From: Lukasz Maniak
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, ani@anisinha.ca, armbru@redhat.com,
    f4bug@amsat.org, fam@euphon.net, hreitz@redhat.com,
    imammedo@redhat.com, its@irrelevant.dk, kbusch@kernel.org,
    k.jensen@samsung.com, kwolf@redhat.com, lukasz.gieryk@linux.intel.com,
    lukasz.maniak@linux.intel.com, marcel.apfelbaum@gmail.com,
    mst@redhat.com, stefanha@redhat.com, xypron.glpk@gmx.de
Subject: [PATCH v8 12/12] hw/acpi: Make the PCI hot-plug aware of SR-IOV
Date: Mon, 9 May 2022 16:16:20 +0200
Message-Id: <20220509141620.3868733-13-lukasz.maniak@linux.intel.com>
In-Reply-To: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
References: <20220509141620.3868733-1-lukasz.maniak@linux.intel.com>
From: Łukasz Gieryk

A PCI device capable of SR-IOV support is a new, still-experimental
feature, with only a single working example: the Nvme device. This patch
is an attempt to fix a double-free problem when an SR-IOV-capable Nvme
device is hot-unplugged in the following scenario:

Qemu CLI:
---------
-device pcie-root-port,slot=0,id=rp0
-device nvme-subsys,id=subsys0
-device nvme,id=nvme0,bus=rp0,serial=deadbeef,subsys=subsys0,sriov_max_vfs=1,sriov_vq_flexible=2,sriov_vi_flexible=1

Guest OS:
---------
sudo nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0
sudo nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0
echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset
sleep 1
echo 1 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 1
nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 2
nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0
sleep 2
echo 01:00.1 > /sys/bus/pci/drivers/nvme/bind

Qemu monitor:
-------------
device_del nvme0

Explanation of the problem and the proposed solution:
1) The current SR-IOV implementation assumes it's the Physical Function
   that creates and deletes Virtual Functions.
2) It's a design decision (for the Nvme device at least) for the VFs to
   be of the same class as the PF. Effectively, they share the
   dc->hotpluggable value.
3) When a VF is created, it's added as a child node to the PF's PCI bus
   slot.
4) Monitor/device_del triggers the ACPI mechanism. The implementation is
   not aware of SR-IOV and ejects the PF's PCI slot, directly unrealizing
   all hot-pluggable (!acpi_pcihp_pc_no_hotplug) children nodes.
5) VFs are unrealized directly, and it doesn't work well with (1).
   SR-IOV structures are not updated, so when it's the PF's turn to be
   unrealized, it works on stale pointers to already-deleted VFs.

The proposed fix is to make the PCI ACPI code aware of SR-IOV.

Signed-off-by: Łukasz Gieryk
Acked-by: Michael S. Tsirkin
Reviewed-by: Klaus Jensen
Reviewed-by: Michael S. Tsirkin
---
 hw/acpi/pcihp.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index bf65bbea494..84d75e6b846 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -192,8 +192,12 @@ static bool acpi_pcihp_pc_no_hotplug(AcpiPciHpState *s, PCIDevice *dev)
      * ACPI doesn't allow hotplug of bridge devices. Don't allow
      * hot-unplug of bridge devices unless they were added by hotplug
      * (and so, not described by acpi).
+     *
+     * Don't allow hot-unplug of SR-IOV Virtual Functions, as they
+     * will be removed implicitly, when Physical Function is unplugged.
      */
-    return (pc->is_bridge && !dev->qdev.hotplugged) || !dc->hotpluggable;
+    return (pc->is_bridge && !dev->qdev.hotplugged) || !dc->hotpluggable ||
+           pci_is_vf(dev);
 }
 
 static void acpi_pcihp_eject_slot(AcpiPciHpState *s, unsigned bsel, unsigned slots)
-- 
2.25.1