From nobody Sun May 5 22:22:54 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=nvidia.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1550612000375722.5869289045063; Tue, 19 Feb 2019 13:33:20 -0800 (PST) Received: from localhost ([127.0.0.1]:55247 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwD0n-0007UF-Q6 for importer@patchew.org; Tue, 19 Feb 2019 16:33:09 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33466) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwCyP-00068w-1G for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gwCyM-0005ga-6i for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:40 -0500 Received: from hqemgate15.nvidia.com ([216.228.121.64]:7148) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gwCyL-0005b8-BF for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:37 -0500 Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqemgate15.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 19 Feb 2019 13:25:25 -0800 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Tue, 19 Feb 2019 13:25:26 -0800 Received: from HQMAIL110.nvidia.com (172.18.146.15) by HQMAIL103.nvidia.com (172.20.187.11) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:25 +0000 Received: from 
HQMAIL105.nvidia.com (172.20.187.12) by hqmail110.nvidia.com (172.18.146.15) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:24 +0000 Received: from kwankhede-dev.nvidia.com (10.124.1.5) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4 via Frontend Transport; Tue, 19 Feb 2019 21:25:17 +0000 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Tue, 19 Feb 2019 13:25:26 -0800 From: Kirti Wankhede To: , Date: Wed, 20 Feb 2019 02:53:16 +0530 Message-ID: <1550611400-13703-2-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> References: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1550611525; bh=to18RAoY3ilgPnufDy6NrWRf4HS/Vht3Au3qBT8x4dc=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=fXR5CWssHdGhQj/A0kPzUlGZI+rODE3grq5zudZ9DTb0axcoaOSSj4M9ADT9IkPVo ONPUlRbHzXO3A+kcjMG8nNnF4Re0II6arkyj2q/urTWtKe2yup49Cfpfz0V6iQ9K3O cmAQQGeN6fUOKm/AOI+kHAvXpMfAJ3UCdc5/V6BcCWRy8GxBkS344ZbO5AapvZV0Ps iJ+TahBKjMXJTFIU3ppTdHCTon0BWZboIhGYOlyR3scimoM8vlWblTWoUVnrGMWPQV j8bey00Ej1XxJbJt+YBuzFwhgkB6DgRTVv4a/UgoRYJFpeoxGyvrLLtNm709fz1r4m /9RTsDt/PIwcg== X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-Received-From: 216.228.121.64 Subject: [Qemu-devel] [PATCH v3 1/5] VFIO KABI for migration interface X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kirti Wankhede , Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, 
zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, yulei.zhang@intel.com, eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

- Defined MIGRATION region type and sub-type.
- Used 2 bits to define VFIO device states.
    Bit 0 => 0/1 => _STOPPED/_RUNNING
    Bit 1 => 0/1 => _RESUMING/_SAVING
  Combination of these bits defines VFIO device's state during migration
    _RUNNING => Normal VFIO device running state.
    _STOPPED => VFIO device stopped.
    _SAVING | _RUNNING => vCPUs are running, VFIO device is running but start
                          saving state of device, i.e. pre-copy state
    _SAVING | _STOPPED => vCPUs are stopped, VFIO device should be stopped, and
                          save device state, i.e. stop-n-copy state
    _RESUMING => VFIO device resuming state.
- Defined vfio_device_migration_info structure which will be placed at 0th
  offset of migration region to get/set VFIO device related information.
  Defined members of structure and usage on read/write access:
    * device_state: (write only)
        To convey VFIO device state to be transitioned to.
    * pending_bytes: (read only)
        To get pending bytes yet to be migrated for VFIO device
    * data_offset: (read/write)
        To get or set data offset in migration region from where data exists
        during _SAVING and _RESUMING state
    * data_size: (write only)
        To convey size of data copied in migration region during _RESUMING
        state
    * start_pfn, page_size, total_pfns: (write only)
        To get bitmap of dirty pages from vendor driver from given
        start address for total_pfns.
    * copied_pfns: (read only)
        To get number of pfns bitmap copied in migration region.
        Vendor driver should copy the bitmap with bits set only for
        pages to be marked dirty in migration region.
        Vendor driver should return 0 if there are 0 pages dirty in
        requested range.

Migration region looks like:
 ------------------------------------------------------------------
|vfio_device_migration_info|    data section                      |
|                          |     ///////////////////////////////  |
 ------------------------------------------------------------------
 ^                              ^                              ^
 offset 0-trapped part        data.offset                 data.size

Data section always follows the vfio_device_migration_info structure
in the region, so data.offset will always be non-zero. Offset from where data
is copied is decided by kernel driver, data section can be trapped or
mmapped depending on how kernel driver defines data section. If mmapped,
then data.offset should be page aligned, whereas the initial section which
contains the vfio_device_migration_info structure might not end at an offset
which is page aligned.

Signed-off-by: Kirti Wankhede
Reviewed-by: Neo Jia
---
 linux-headers/linux/vfio.h | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 12a7b1dc53c8..1b12a9b95e00 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -368,6 +368,71 @@ struct vfio_region_gfx_edid {
  */
 #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD (1)
 
+/* Migration region type and sub-type */
+#define VFIO_REGION_TYPE_MIGRATION          (2)
+#define VFIO_REGION_SUBTYPE_MIGRATION       (1)
+
+/**
+ * Structure vfio_device_migration_info is placed at 0th offset of
+ * VFIO_REGION_SUBTYPE_MIGRATION region to get/set VFIO device related migration
+ * information. Field accesses from this structure are only supported at their
+ * native width and alignment, otherwise should return error.
+ *
+ * device_state: (write only)
+ *      To indicate to the vendor driver the state the VFIO device should be
+ *      transitioned to. If device state transition fails, write to this field
+ *      returns an error.
+ *      It consists of 2 bits.
+ *    - If bit 0 set, indicates _RUNNING state.
+ *      When it is reset, that indicates _STOPPED state. When device is changed
+ *      to _STOPPED, driver should stop device before write returns.
+ *    - If bit 1 set, indicates _SAVING state. When it is reset, that indicates
+ *      _RESUMING state.
+ *
+ * pending_bytes: (read only)
+ *      Read pending bytes yet to be migrated from vendor driver
+ *
+ * data_offset: (read/write)
+ *      User application should read data_offset in migration region from where
+ *      user application should read data during _SAVING state.
+ *      User application would write data_offset in migration region from where
+ *      user application has written data during _RESUMING state.
+ *
+ * data_size: (write only)
+ *      User application should write size of data copied in migration region
+ *      during _RESUMING state.
+ *
+ * start_pfn: (write only)
+ *      Start address pfn to get bitmap of dirty pages from vendor driver during
+ *      _SAVING state.
+ *
+ * page_size: (write only)
+ *      User application should write the page_size of pfn.
+ *
+ * total_pfns: (write only)
+ *      Total pfn count from start_pfn for which dirty bitmap is requested.
+ *
+ * copied_pfns: (read only)
+ *      pfn count for which dirty bitmap is copied to migration region.
+ *      Vendor driver should copy the bitmap with bits set only for pages to be
+ *      marked dirty in migration region.
+ *      Vendor driver should return 0 if there are 0 pages dirty in requested
+ *      range.
+ */
+
+struct vfio_device_migration_info {
+        __u32 device_state;         /* VFIO device state */
+#define VFIO_DEVICE_STATE_RUNNING   (1 << 0)
+#define VFIO_DEVICE_STATE_SAVING    (1 << 1)
+        __u32 reserved;
+        __u64 pending_bytes;
+        __u64 data_offset;
+        __u64 data_size;
+        __u64 start_pfn;
+        __u64 page_size;
+        __u64 total_pfns;
+        __u64 copied_pfns;
+} __attribute__((packed));
+
 /*
  * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
  * which allows direct access to non-MSIX registers which happened to be within
-- 
2.7.0

From nobody Sun May 5 22:22:54 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=nvidia.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1550611996771689.194328173122; Tue, 19 Feb 2019 13:33:16 -0800 (PST) Received: from localhost ([127.0.0.1]:55245 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwD0o-0007U8-Co for importer@patchew.org; Tue, 19 Feb 2019 16:33:10 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33462) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwCyO-00068t-SQ for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gwCyL-0005fx-Qs for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:40 -0500 Received: from hqemgate15.nvidia.com ([216.228.121.64]:7150) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gwCyL-0005c1-DY for
qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:37 -0500 Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqemgate15.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 19 Feb 2019 13:25:32 -0800 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Tue, 19 Feb 2019 13:25:33 -0800 Received: from HQMAIL105.nvidia.com (172.20.187.12) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:32 +0000 Received: from kwankhede-dev.nvidia.com (10.124.1.5) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4 via Frontend Transport; Tue, 19 Feb 2019 21:25:25 +0000 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Tue, 19 Feb 2019 13:25:33 -0800 From: Kirti Wankhede To: , Date: Wed, 20 Feb 2019 02:53:17 +0530 Message-ID: <1550611400-13703-3-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> References: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1550611532; bh=AWlFUnjcg1e/79ob3xuhxtSslf8pWGPMKMe4r+5gSZM=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=hFwunTDcR7H9yyeTgSHjPB3pebOHUzXGguX0pFeJkln8H2uOU9o3QDZaqS94T2rSL AvGUEJc9/u9ROwrzn3Q4gjtm+AOhSTsGq8A4Xyj8D9sHCzr8p+lZGbZOX9F/AmeKNQ 0iJKkuhWXoU3KQrD8JvuSD62Z3slWe+GBnvVhS0bHw1Bso3xcG9XMAkPdt6bLqDHSJ DSBBDsmjrPadfBHk8nRc8sHcIozeHYjcR1qY8VYvbvaJivALwmYvGTVldfOZjJRLrp QikU62lpVi9ksXy9alRUJuOEJTqKZDcUlYv+Mub5O/sNqo6kSAcL0tnUR/fhe2Auuo 4UN6c3x6uE1RQ== X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-Received-From: 216.228.121.64 Subject: [Qemu-devel] [PATCH v3 2/5] Add save and load functions for VFIO PCI devices X-BeenThere: 
qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kirti Wankhede , Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, yulei.zhang@intel.com, eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

These functions save and restore PCI device specific data - config
space of PCI device.
Tested save and restore with MSI and MSIX type.

Signed-off-by: Kirti Wankhede
Reviewed-by: Neo Jia
---
 hw/vfio/pci.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.h |  29 ++++++++++++++++
 2 files changed, 135 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index dd12f363915d..e87a8a03d3f3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1237,6 +1237,112 @@ void vfio_pci_write_config(PCIDevice *pdev,
     }
 }
 
+void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    PCIDevice *pdev = &vdev->pdev;
+    int i;
+
+    for (i = 0; i < PCI_ROM_SLOT; i++) {
+        uint32_t bar;
+
+        bar = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, 4);
+        qemu_put_be32(f, bar);
+    }
+
+    qemu_put_be32(f, vdev->interrupt);
+    if (vdev->interrupt == VFIO_INT_MSI) {
+        uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
+        bool msi_64bit;
+
+        msi_flags = pci_default_read_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                                            2);
+        msi_64bit =
+            (msi_flags & PCI_MSI_FLAGS_64BIT);
+
+        msi_addr_lo = pci_default_read_config(pdev,
+                                         pdev->msi_cap + PCI_MSI_ADDRESS_LO, 4);
+        qemu_put_be32(f, msi_addr_lo);
+
+        if (msi_64bit) {
+            msi_addr_hi = pci_default_read_config(pdev,
+                                             pdev->msi_cap + PCI_MSI_ADDRESS_HI,
+                                             4);
+        }
+        qemu_put_be32(f, msi_addr_hi);
+
+        msi_data = pci_default_read_config(pdev,
+                pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : PCI_MSI_DATA_32),
+                2);
+        qemu_put_be32(f, msi_data);
+    } else if (vdev->interrupt == VFIO_INT_MSIX) {
+        uint16_t offset;
+
+        /* save enable bit and maskall bit */
+        offset = pci_default_read_config(pdev,
+                                       pdev->msix_cap + PCI_MSIX_FLAGS + 1, 2);
+        qemu_put_be16(f, offset);
+        msix_save(pdev, f);
+    }
+}
+
+void vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    PCIDevice *pdev = &vdev->pdev;
+    uint32_t pci_cmd, interrupt_type;
+    uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
+    bool msi_64bit;
+    int i;
+
+    /* restore pci bar configuration */
+    pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
+    /* clear IO/MEMORY decode while the BARs are rewritten */
+    vfio_pci_write_config(pdev, PCI_COMMAND,
+                        pci_cmd & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY), 2);
+    for (i = 0; i < PCI_ROM_SLOT; i++) {
+        uint32_t bar = qemu_get_be32(f);
+
+        vfio_pci_write_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, bar, 4);
+    }
+    vfio_pci_write_config(pdev, PCI_COMMAND,
+                          pci_cmd | PCI_COMMAND_IO | PCI_COMMAND_MEMORY, 2);
+
+    interrupt_type = qemu_get_be32(f);
+
+    if (interrupt_type == VFIO_INT_MSI) {
+        /* restore msi configuration */
+        msi_flags = pci_default_read_config(pdev,
+                                            pdev->msi_cap + PCI_MSI_FLAGS, 2);
+        msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
+
+        /* clear the MSI enable bit while address and data are rewritten */
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                              msi_flags & ~PCI_MSI_FLAGS_ENABLE, 2);
+
+        msi_addr_lo = qemu_get_be32(f);
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_LO,
+                              msi_addr_lo, 4);
+
+        msi_addr_hi =
+            qemu_get_be32(f);
+        if (msi_64bit) {
+            vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_HI,
+                                  msi_addr_hi, 4);
+        }
+        msi_data = qemu_get_be32(f);
+        vfio_pci_write_config(pdev,
+                pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : PCI_MSI_DATA_32),
+                msi_data, 2);
+
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                              msi_flags | PCI_MSI_FLAGS_ENABLE, 2);
+    } else if (interrupt_type == VFIO_INT_MSIX) {
+        uint16_t offset = qemu_get_be16(f);
+
+        /* load enable bit and maskall bit */
+        vfio_pci_write_config(pdev, pdev->msix_cap + PCI_MSIX_FLAGS + 1,
+                              offset, 2);
+        msix_load(pdev, f);
+    }
+}
+
 /*
  * Interrupt setup
  */
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index b1ae4c07549a..77d3223481b4 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -20,6 +20,7 @@
 #include "qemu/queue.h"
 #include "qemu/timer.h"
 
+#ifdef CONFIG_LINUX
 #define PCI_ANY_ID (~0)
 
 struct VFIOPCIDevice;
@@ -199,4 +200,32 @@ void vfio_display_reset(VFIOPCIDevice *vdev);
 int vfio_display_probe(VFIOPCIDevice *vdev, Error **errp);
 void vfio_display_finalize(VFIOPCIDevice *vdev);
 
+void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f);
+void vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f);
+
+static inline Object *vfio_pci_get_object(VFIODevice *vbasedev)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
+    return OBJECT(vdev);
+}
+
+#else
+static inline void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    g_assert(false);
+}
+
+static inline void vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    g_assert(false);
+}
+
+static inline Object *vfio_pci_get_object(VFIODevice *vbasedev)
+{
+    return NULL;
+}
+
+#endif
+
 #endif /* HW_VFIO_VFIO_PCI_H */
-- 
2.7.0

From nobody Sun May 5 22:22:54 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17;
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=nvidia.com Return-Path: Received: from lists.gnu.org (209.51.188.17 [209.51.188.17]) by mx.zohomail.com with SMTPS id 1550612213792198.06228026179826; Tue, 19 Feb 2019 13:36:53 -0800 (PST) Received: from localhost ([127.0.0.1]:55320 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwD4G-0001yP-CO for importer@patchew.org; Tue, 19 Feb 2019 16:36:44 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33539) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwCyb-0006K4-F9 for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gwCyZ-0005lQ-89 for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:53 -0500 Received: from hqemgate15.nvidia.com ([216.228.121.64]:7158) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gwCyX-0005kx-Nj for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:51 -0500 Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqemgate15.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 19 Feb 2019 13:25:45 -0800 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Tue, 19 Feb 2019 13:25:46 -0800 Received: from HQMAIL111.nvidia.com (172.20.187.18) by HQMAIL104.nvidia.com (172.18.146.11) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:46 +0000 Received: from HQMAIL105.nvidia.com (172.20.187.12) by HQMAIL111.nvidia.com (172.20.187.18) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:45 +0000 Received: from kwankhede-dev.nvidia.com 
(10.124.1.5) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4 via Frontend Transport; Tue, 19 Feb 2019 21:25:33 +0000 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Tue, 19 Feb 2019 13:25:46 -0800 From: Kirti Wankhede To: , Date: Wed, 20 Feb 2019 02:53:18 +0530 Message-ID: <1550611400-13703-4-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> References: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1550611546; bh=RqjEFpoAU9GWZKNTXmg7MrOXRENv7UvWWXWYIZIZ34E=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=QJia6JrOLZH7EheGQfBWx6idQ8OJeHxMpfYOkKLEv2M3J+bx7YqAu7SRnoRycmhj8 jCdwYG6yuwb7guuQCCP+eJ03NEjc/Vn5PcBJSJE0rDVmsHpA0InrIE9RN4AcLdThjO rP+5XY9uaDkC23D6Pvgih4IBQmvc1A2YbmASOGTMi9MApme5DgttcEvA6wgqFyux6F jQW753Rk1D3m5kHojAdMSlOw97D/pVX31nJJx7zJ0Qz2rrvdD4JaGuJ/4xaUzxboGJ 7zLSkfWHT5nHsncMTscqMv5/uTIWZ9qvS6Kvjxccb6a1QAm7CQ4LUNsRL7QHc7iUX3 pYULuZoBQ279w== X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-Received-From: 216.228.121.64 Subject: [Qemu-devel] [PATCH v3 3/5] Add migration functions for VFIO devices X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kirti Wankhede , Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, yulei.zhang@intel.com, eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, 
Ken.Xue@amd.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

- Migration functions are implemented for VFIO_DEVICE_TYPE_PCI device.
- Added SaveVMHandlers and implemented all basic functions required for live
  migration.
- Added VM state change handler to know running or stopped state of VM.
- Added migration state change notifier to get notification on migration state
  change. This state is translated to VFIO device state and conveyed to vendor
  driver.
- Whether a VFIO device supports migration is decided based on the migration
  region query. If migration region query is successful then migration is
  supported else migration is blocked.
- Structure vfio_device_migration_info is mapped at 0th offset of migration
  region and should always be trapped by VFIO device's driver. Added both types
  of access support, trapped or mmapped, for data section of the region.
- To save device state, read pending_bytes and data_offset using structure
  vfio_device_migration_info, accordingly copy data from the region.
- To restore device state, write data_offset and data_size in the structure
  and write data in the region.
- To get dirty page bitmap, write start address and pfn count then read count of
  pfns copied and accordingly read those from the rest of the region or mmapped
  part of the region. This copy is iterated till the page bitmap for all
  requested pfns is copied.
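
The save step described above can be sketched as follows. This is only a
minimal illustration, assuming the migration region is stood in for by an
in-memory buffer instead of a real VFIO device fd. struct mig_info mirrors the
layout proposed in patch 1/5; region_read() and save_one_chunk() are
hypothetical helpers for this sketch, not QEMU or kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the layout proposed in patch 1/5, for illustration only. */
struct mig_info {
    uint32_t device_state;
    uint32_t reserved;
    uint64_t pending_bytes;
    uint64_t data_offset;
    uint64_t data_size;
    uint64_t start_pfn;
    uint64_t page_size;
    uint64_t total_pfns;
    uint64_t copied_pfns;
} __attribute__((packed));

/* Hypothetical stand-in for pread() on the device fd: reads from a buffer. */
static int region_read(const uint8_t *region, size_t region_size,
                       uint64_t off, void *dst, size_t len)
{
    if (off > region_size || len > region_size - off) {
        return -1;
    }
    memcpy(dst, region + off, len);
    return 0;
}

/*
 * Copy one chunk of device state out of the region. The chunk size is the
 * smallest of pending_bytes, the space between data_offset and the end of
 * the region, and the capacity of the caller's buffer.
 * Returns the number of bytes copied, or -1 on a bad read or offset.
 */
static int64_t save_one_chunk(const uint8_t *region, size_t region_size,
                              uint8_t *out, size_t out_cap)
{
    uint64_t pending = 0, data_offset = 0, chunk;

    if (region_read(region, region_size,
                    offsetof(struct mig_info, pending_bytes),
                    &pending, sizeof(pending)) ||
        region_read(region, region_size,
                    offsetof(struct mig_info, data_offset),
                    &data_offset, sizeof(data_offset))) {
        return -1;
    }

    /* The data section must start after the info structure. */
    if (data_offset < sizeof(struct mig_info) || data_offset > region_size) {
        return -1;
    }

    chunk = region_size - data_offset;
    if (pending < chunk) {
        chunk = pending;
    }
    if (chunk > out_cap) {
        chunk = out_cap;
    }

    if (chunk && region_read(region, region_size, data_offset, out, chunk)) {
        return -1;
    }
    return (int64_t)chunk;
}
```

A caller would repeat save_one_chunk() until pending_bytes reads back as 0,
which is the loop shape used by the stop-and-copy phase in this patch.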
Signed-off-by: Kirti Wankhede
Reviewed-by: Neo Jia
---
 hw/vfio/Makefile.objs         |   2 +-
 hw/vfio/migration.c           | 714 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  20 ++
 3 files changed, 735 insertions(+), 1 deletion(-)
 create mode 100644 hw/vfio/migration.c

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index abad8b818c9b..36033d1437c5 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += common.o spapr.o
+obj-y += common.o spapr.o migration.o
 obj-$(CONFIG_VFIO_PCI) += pci.o pci-quirks.o display.o
 obj-$(CONFIG_VFIO_CCW) += ccw.o
 obj-$(CONFIG_VFIO_PLATFORM) += platform.o
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
new file mode 100644
index 000000000000..d7b6d972c043
--- /dev/null
+++ b/hw/vfio/migration.c
@@ -0,0 +1,714 @@
+/*
+ * Migration support for VFIO devices
+ *
+ * Copyright NVIDIA, Inc. 2018
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include
+
+#include "hw/vfio/vfio-common.h"
+#include "cpu.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
+#include "migration/register.h"
+#include "migration/blocker.h"
+#include "migration/misc.h"
+#include "qapi/error.h"
+#include "exec/ramlist.h"
+#include "exec/ram_addr.h"
+#include "pci.h"
+
+/*
+ * Flags used as delimiter:
+ * 0xffffffff => MSB 32-bit all 1s
+ * 0xef10     => emulated (virtual) function IO
+ * 0x0000     => 16-bits reserved for flags
+ */
+#define VFIO_MIG_FLAG_END_OF_STATE      (0xffffffffef100001ULL)
+#define VFIO_MIG_FLAG_DEV_CONFIG_STATE  (0xffffffffef100002ULL)
+#define VFIO_MIG_FLAG_DEV_SETUP_STATE   (0xffffffffef100003ULL)
+#define VFIO_MIG_FLAG_DEV_DATA_STATE    (0xffffffffef100004ULL)
+
+static void vfio_migration_region_exit(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (!migration) {
+        return;
+    }
+
+    if (migration->region.buffer.size) {
+        vfio_region_exit(&migration->region.buffer);
+        vfio_region_finalize(&migration->region.buffer);
+    }
+}
+
+static int vfio_migration_region_init(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    Object *obj = NULL;
+    int ret = -EINVAL;
+
+    if (!migration) {
+        return ret;
+    }
+
+    /* Migration support added for PCI device only */
+    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI) {
+        obj = vfio_pci_get_object(vbasedev);
+    }
+
+    if (!obj) {
+        return ret;
+    }
+
+    ret = vfio_region_setup(obj, vbasedev, &migration->region.buffer,
+                            migration->region.index, "migration");
+    if (ret) {
+        error_report("Failed to setup VFIO migration region %d: %s",
+                     migration->region.index, strerror(-ret));
+        goto err;
+    }
+
+    if (!migration->region.buffer.size) {
+        ret = -EINVAL;
+        error_report("Invalid region size of VFIO migration region %d: %s",
+                     migration->region.index, strerror(-ret));
+        goto err;
+    }
+
+    if (migration->region.buffer.mmaps) {
+        ret =
+            vfio_region_mmap(&migration->region.buffer);
+        if (ret) {
+            error_report("Failed to mmap VFIO migration region %d: %s",
+                         migration->region.index, strerror(-ret));
+            goto err;
+        }
+    }
+
+    return 0;
+
+err:
+    vfio_migration_region_exit(vbasedev);
+    return ret;
+}
+
+static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t state)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    int ret = 0;
+
+    ret = pwrite(vbasedev->fd, &state, sizeof(state),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              device_state));
+    if (ret < 0) {
+        error_report("Failed to set migration state %d %s",
+                     ret, strerror(errno));
+        return ret;
+    }
+
+    vbasedev->device_state = state;
+    return 0;
+}
+
+void vfio_get_dirty_page_list(VFIODevice *vbasedev,
+                              uint64_t start_pfn,
+                              uint64_t pfn_count,
+                              uint64_t page_size)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint64_t count = 0, copied_pfns = 0;
+    int ret;
+
+    ret = pwrite(vbasedev->fd, &start_pfn, sizeof(start_pfn),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              start_pfn));
+    if (ret < 0) {
+        error_report("Failed to set dirty pages start address %d %s",
+                     ret, strerror(errno));
+        return;
+    }
+
+    ret = pwrite(vbasedev->fd, &page_size, sizeof(page_size),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              page_size));
+    if (ret < 0) {
+        error_report("Failed to set dirty page size %d %s",
+                     ret, strerror(errno));
+        return;
+    }
+
+    ret = pwrite(vbasedev->fd, &pfn_count, sizeof(pfn_count),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              total_pfns));
+    if (ret < 0) {
+        error_report("Failed to set dirty page total pfns %d %s",
+                     ret, strerror(errno));
+        return;
+    }
+
+    do {
+        uint64_t bitmap_size;
+        void *buf = NULL;
+        bool buffer_mmaped = false;
+
+        /* Read dirty_pfns.copied */
+        ret = pread(vbasedev->fd,
+                    &copied_pfns, sizeof(copied_pfns),
+                    region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                                 copied_pfns));
+        if (ret < 0) {
+            error_report("Failed to get dirty pages bitmap count %d %s",
+                         ret, strerror(errno));
+            return;
+        }
+
+        if (copied_pfns == 0) {
+            /*
+             * copied_pfns could be 0 if driver doesn't have any page to report
+             * dirty in given range
+             */
+            break;
+        }
+
+        bitmap_size = (BITS_TO_LONGS(copied_pfns) + 1) * sizeof(unsigned long);
+
+        if (region->mmaps) {
+            int i;
+
+            for (i = 0; i < region->nr_mmaps; i++) {
+                if (region->mmaps[i].size >= bitmap_size) {
+                    buf = region->mmaps[i].mmap;
+                    buffer_mmaped = true;
+                    break;
+                }
+            }
+        }
+
+        if (!buffer_mmaped) {
+            buf = g_malloc0(bitmap_size);
+
+            ret = pread(vbasedev->fd, buf, bitmap_size,
+                        region->fd_offset +
+                        sizeof(struct vfio_device_migration_info) + 1);
+            if (ret != bitmap_size) {
+                error_report("Failed to get dirty pages bitmap %d", ret);
+                g_free(buf);
+                return;
+            }
+        }
+
+        cpu_physical_memory_set_dirty_lebitmap((unsigned long *)buf,
+                                               (start_pfn + count) * page_size,
+                                                copied_pfns);
+        count += copied_pfns;
+
+        if (!buffer_mmaped) {
+            g_free(buf);
+        }
+    } while (count < pfn_count);
+}
+
+static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_STATE);
+
+    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI) {
+        vfio_pci_save_config(vbasedev, f);
+    }
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    return qemu_file_get_error(f);
+}
+
+static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+
+    if (vbasedev->type == VFIO_DEVICE_TYPE_PCI) {
+        vfio_pci_load_config(vbasedev, f);
+    }
+
+    if (qemu_get_be64(f) != VFIO_MIG_FLAG_END_OF_STATE) {
+        error_report("Wrong end of block while loading device config space");
+        return -EINVAL;
+    }
+
+    return qemu_file_get_error(f);
+}
+
+/*
+ * ---------------------------------------------------------------------- */
+
+static int vfio_save_setup(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    int ret;
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
+
+    if (vbasedev->vm_running) {
+        ret = vfio_migration_set_state(vbasedev,
+                         VFIO_DEVICE_STATE_RUNNING | VFIO_DEVICE_STATE_SAVING);
+        if (ret) {
+            error_report("Failed to set state RUNNING and SAVING");
+        }
+    } else {
+        ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_SAVING);
+        if (ret) {
+            error_report("Failed to set state STOP and SAVING");
+        }
+    }
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    ret = qemu_file_get_error(f);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
+
+static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint64_t data_offset = 0, data_size = 0;
+    int ret;
+
+    ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             data_offset));
+    if (ret != sizeof(data_offset)) {
+        error_report("Failed to get migration buffer data offset %d",
+                     ret);
+        return -EINVAL;
+    }
+
+    if (migration->pending_bytes) {
+        void *buf = NULL;
+        bool buffer_mmaped = false;
+
+        if (region->mmaps) {
+            int i;
+
+            for (i = 0; i < region->nr_mmaps; i++) {
+                if ((data_offset >= region->mmaps[i].offset) &&
+                    (data_offset < region->mmaps[i].offset +
+                                   region->mmaps[i].size)) {
+                    uint64_t region_data_size = region->mmaps[i].size -
+                                        (data_offset - region->mmaps[i].offset);
+
+                    /* data starts data_offset bytes into the region */
+                    buf = (char *)region->mmaps[i].mmap +
+                          (data_offset - region->mmaps[i].offset);
+                    buffer_mmaped = true;
+
+                    if (migration->pending_bytes > region_data_size) {
+                        data_size = region_data_size;
+                    } else {
+                        data_size = migration->pending_bytes;
+                    }
+                    break;
+                }
+            }
+        }
+
+        if (!buffer_mmaped) {
+            uint64_t region_data_size = region->size - data_offset;
+
+            if (migration->pending_bytes >
region_data_size) { + data_size =3D region_data_size; + } else { + data_size =3D migration->pending_bytes; + } + + buf =3D g_malloc0(data_size); + ret =3D pread(vbasedev->fd, buf, data_size, + region->fd_offset + data_offset); + if (ret !=3D data_size) { + error_report("Failed to get migration data %d", ret); + return -EINVAL; + } + } + + qemu_put_be64(f, data_size); + qemu_put_buffer(f, buf, data_size); + + if (!buffer_mmaped) { + g_free(buf); + } + migration->pending_bytes -=3D data_size; + } else { + qemu_put_be64(f, migration->pending_bytes); + } + + ret =3D qemu_file_get_error(f); + if (ret) { + return ret; + } + + return data_size; +} + +static int vfio_save_iterate(QEMUFile *f, void *opaque) +{ + VFIODevice *vbasedev =3D opaque; + int ret; + + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); + + ret =3D vfio_save_buffer(f, vbasedev); + if (ret < 0) { + error_report("vfio_save_buffer failed %s", + strerror(errno)); + return ret; + } + + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); + + ret =3D qemu_file_get_error(f); + if (ret) { + return ret; + } + + return ret; +} + +static int vfio_update_pending(VFIODevice *vbasedev) +{ + VFIOMigration *migration =3D vbasedev->migration; + VFIORegion *region =3D &migration->region.buffer; + uint64_t pending_bytes =3D 0; + int ret; + + ret =3D pread(vbasedev->fd, &pending_bytes, sizeof(pending_bytes), + region->fd_offset + offsetof(struct vfio_device_migration_= info, + pending_bytes)); + if ((ret < 0) || (ret !=3D sizeof(pending_bytes))) { + error_report("Failed to get pending bytes %d", ret); + migration->pending_bytes =3D 0; + return (ret < 0) ? 
ret : -EINVAL; + } + + migration->pending_bytes =3D pending_bytes; + return 0; +} + +static void vfio_save_pending(QEMUFile *f, void *opaque, + uint64_t threshold_size, + uint64_t *res_precopy_only, + uint64_t *res_compatible, + uint64_t *res_postcopy_only) +{ + VFIODevice *vbasedev =3D opaque; + VFIOMigration *migration =3D vbasedev->migration; + int ret; + + ret =3D vfio_update_pending(vbasedev); + if (ret) { + return; + } + + if (vbasedev->device_state & VFIO_DEVICE_STATE_RUNNING) { + *res_precopy_only +=3D migration->pending_bytes; + } else { + *res_postcopy_only +=3D migration->pending_bytes; + } + *res_compatible +=3D 0; +} + +static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) +{ + VFIODevice *vbasedev =3D opaque; + VFIOMigration *migration =3D vbasedev->migration; + int ret; + + ret =3D vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_SAVING); + if (ret) { + error_report("Failed to set state STOP and SAVING"); + return ret; + } + + ret =3D vfio_save_device_config_state(f, opaque); + if (ret) { + return ret; + } + + ret =3D vfio_update_pending(vbasedev); + if (ret) { + return ret; + } + + while (migration->pending_bytes > 0) { + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); + ret =3D vfio_save_buffer(f, vbasedev); + if (ret < 0) { + error_report("Failed to save buffer"); + return ret; + } else if (ret =3D=3D 0) { + break; + } + + if (migration->pending_bytes =3D=3D 0) { + ret =3D vfio_update_pending(vbasedev); + if (ret) { + return ret; + } + } + } + + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); + + ret =3D qemu_file_get_error(f); + if (ret) { + return ret; + } + + ret =3D vfio_migration_set_state(vbasedev, 0); + if (ret) { + error_report("Failed to set state STOPPED"); + return ret; + } + return ret; +} + +static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) +{ + VFIODevice *vbasedev =3D opaque; + int ret; + uint64_t data, data_size; + + ret =3D vfio_migration_set_state(vbasedev, 0); + if (ret) { + error_report("Failed 
to set state RESUMING"); + return ret; + } + + data =3D qemu_get_be64(f); + while (data !=3D VFIO_MIG_FLAG_END_OF_STATE) { + if (data =3D=3D VFIO_MIG_FLAG_DEV_CONFIG_STATE) { + ret =3D vfio_load_device_config_state(f, opaque); + if (ret) { + return ret; + } + } else if (data =3D=3D VFIO_MIG_FLAG_DEV_SETUP_STATE) { + data =3D qemu_get_be64(f); + if (data =3D=3D VFIO_MIG_FLAG_END_OF_STATE) { + return 0; + } else { + error_report("SETUP STATE: EOS not found 0x%lx", data); + return -EINVAL; + } + } else if (data =3D=3D VFIO_MIG_FLAG_DEV_DATA_STATE) { + VFIOMigration *migration =3D vbasedev->migration; + VFIORegion *region =3D &migration->region.buffer; + void *buf =3D NULL; + bool buffer_mmaped =3D false; + uint64_t data_offset =3D 0; + + data_size =3D qemu_get_be64(f); + if (data_size !=3D 0) { + if (region->mmaps) { + int i; + + for (i =3D 0; i < region->nr_mmaps; i++) { + if (region->mmaps[i].mmap && + (region->mmaps[i].size >=3D data_size)) { + buf =3D region->mmaps[i].mmap; + data_offset =3D region->mmaps[i].offset; + buffer_mmaped =3D true; + break; + } + } + } + + if (!buffer_mmaped) { + buf =3D g_malloc0(data_size); + data_offset =3D sizeof(struct vfio_device_migration_in= fo) + 1; + } + + qemu_get_buffer(f, buf, data_size); + + ret =3D pwrite(vbasedev->fd, &data_offset, sizeof(data_off= set), + region->fd_offset + + offsetof(struct vfio_device_migration_info, data_off= set)); + if (ret !=3D sizeof(data_offset)) { + error_report("Failed to set migration data offset %d", + ret); + return -EINVAL; + } + + ret =3D pwrite(vbasedev->fd, &data_size, sizeof(data_size), + region->fd_offset + + offsetof(struct vfio_device_migration_info, data_s= ize)); + if (ret !=3D sizeof(data_size)) { + error_report("Failed to set migration buffer data size= %d", + ret); + return -EINVAL; + } + + if (!buffer_mmaped) { + ret =3D pwrite(vbasedev->fd, buf, data_size, + region->fd_offset + data_offset); + g_free(buf); + + if (ret !=3D data_size) { + error_report("Failed to set migration 
buffer %d", = ret); + return -EINVAL; + } + } + } + } + + ret =3D qemu_file_get_error(f); + if (ret) { + return ret; + } + data =3D qemu_get_be64(f); + } + + return 0; +} + +static SaveVMHandlers savevm_vfio_handlers =3D { + .save_setup =3D vfio_save_setup, + .save_live_pending =3D vfio_save_pending, + .save_live_iterate =3D vfio_save_iterate, + .save_live_complete_precopy =3D vfio_save_complete_precopy, + .load_state =3D vfio_load_state, +}; + +static void vfio_vmstate_change(void *opaque, int running, RunState state) +{ + VFIODevice *vbasedev =3D opaque; + + if ((vbasedev->vm_running !=3D running) && running) { + int ret; + + ret =3D vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RUNNI= NG); + if (ret) { + error_report("Failed to set state RUNNING"); + } + } + + vbasedev->vm_running =3D running; +} + +static void vfio_migration_state_notifier(Notifier *notifier, void *data) +{ + MigrationState *s =3D data; + VFIODevice *vbasedev =3D container_of(notifier, VFIODevice, migration_= state); + int ret; + + switch (s->state) { + case MIGRATION_STATUS_ACTIVE: + if (vbasedev->device_state & VFIO_DEVICE_STATE_RUNNING) { + if (vbasedev->vm_running) { + ret =3D vfio_migration_set_state(vbasedev, + VFIO_DEVICE_STATE_RUNNING | VFIO_DEVICE_STATE_SA= VING); + if (ret) { + error_report("Failed to set state RUNNING and SAVING"); + } + } else { + ret =3D vfio_migration_set_state(vbasedev, + VFIO_DEVICE_STATE_SAVING); + if (ret) { + error_report("Failed to set state STOP and SAVING"); + } + } + } else { + ret =3D vfio_migration_set_state(vbasedev, 0); + if (ret) { + error_report("Failed to set state RESUMING"); + } + } + return; + + case MIGRATION_STATUS_CANCELLING: + case MIGRATION_STATUS_CANCELLED: + case MIGRATION_STATUS_FAILED: + ret =3D vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RUNNI= NG); + if (ret) { + error_report("Failed to set state RUNNING"); + } + return; + } +} + +static int vfio_migration_init(VFIODevice *vbasedev, + struct vfio_region_info *info) +{ 
+    int ret;
+
+    vbasedev->migration = g_new0(VFIOMigration, 1);
+    vbasedev->migration->region.index = info->index;
+
+    ret = vfio_migration_region_init(vbasedev);
+    if (ret) {
+        error_report("Failed to initialise migration region");
+        return ret;
+    }
+
+    register_savevm_live(NULL, "vfio", -1, 1, &savevm_vfio_handlers, vbasedev);
+    vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
+                                                          vbasedev);
+
+    vbasedev->migration_state.notify = vfio_migration_state_notifier;
+    add_migration_state_change_notifier(&vbasedev->migration_state);
+
+    return 0;
+}
+
+
+/* ---------------------------------------------------------------------- */
+
+int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
+{
+    struct vfio_region_info *info;
+    int ret;
+
+    ret = vfio_get_dev_region_info(vbasedev, VFIO_REGION_TYPE_MIGRATION,
+                                   VFIO_REGION_SUBTYPE_MIGRATION, &info);
+    if (ret) {
+        Error *local_err = NULL;
+
+        error_setg(&vbasedev->migration_blocker,
+                   "VFIO device doesn't support migration");
+        ret = migrate_add_blocker(vbasedev->migration_blocker, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            error_free(vbasedev->migration_blocker);
+            return ret;
+        }
+    } else {
+        return vfio_migration_init(vbasedev, info);
+    }
+
+    return 0;
+}
+
+void vfio_migration_finalize(VFIODevice *vbasedev)
+{
+    if (!vbasedev->migration) {
+        return;
+    }
+
+    if (vbasedev->vm_state) {
+        qemu_del_vm_change_state_handler(vbasedev->vm_state);
+        remove_migration_state_change_notifier(&vbasedev->migration_state);
+    }
+
+    if (vbasedev->migration_blocker) {
+        migrate_del_blocker(vbasedev->migration_blocker);
+        error_free(vbasedev->migration_blocker);
+    }
+
+    vfio_migration_region_exit(vbasedev);
+    g_free(vbasedev->migration);
+}
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 7624c9f511c4..3b8c98b29baf 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -30,6 +30,7 @@
 #ifdef CONFIG_LINUX
 #include
 #endif
+#include "sysemu/sysemu.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -58,6 +59,14 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+typedef struct VFIOMigration {
+    struct {
+        VFIORegion buffer;
+        uint32_t index;
+    } region;
+    uint64_t pending_bytes;
+} VFIOMigration;
+
 typedef struct VFIOAddressSpace {
     AddressSpace *as;
     QLIST_HEAD(, VFIOContainer) containers;
@@ -119,6 +128,12 @@ typedef struct VFIODevice {
     unsigned int num_irqs;
     unsigned int num_regions;
     unsigned int flags;
+    uint32_t device_state;
+    VMChangeStateEntry *vm_state;
+    int vm_running;
+    Notifier migration_state;
+    VFIOMigration *migration;
+    Error *migration_blocker;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -198,4 +213,9 @@ int vfio_spapr_create_window(VFIOContainer *container,
 int vfio_spapr_remove_window(VFIOContainer *container,
                              hwaddr offset_within_address_space);
 
+int vfio_migration_probe(VFIODevice *vbasedev, Error **errp);
+void vfio_migration_finalize(VFIODevice *vbasedev);
+void vfio_get_dirty_page_list(VFIODevice *vbasedev, uint64_t start_pfn,
+                              uint64_t pfn_count, uint64_t page_size);
+
 #endif /* HW_VFIO_VFIO_COMMON_H */
-- 
2.7.0

From nobody Sun May 5 22:22:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=nvidia.com
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 15506120169961014.1399582173847; Tue, 19 Feb 2019 13:33:36 -0800 (PST)
Received: from localhost ([127.0.0.1]:55249 helo=lists.gnu.org) by lists.gnu.org with
esmtp (Exim 4.71) (envelope-from ) id 1gwD1D-0007qW-U5 for importer@patchew.org; Tue, 19 Feb 2019 16:33:35 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33561) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwCyg-0006Oa-Qc for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gwCyg-0005oP-47 for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:58 -0500 Received: from hqemgate14.nvidia.com ([216.228.121.143]:17003) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gwCyf-0005nZ-3A for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:30:58 -0500 Received: from hqpgpgate102.nvidia.com (Not Verified[216.228.121.13]) by hqemgate14.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 19 Feb 2019 13:26:01 -0800 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate102.nvidia.com (PGP Universal service); Tue, 19 Feb 2019 13:25:53 -0800 Received: from HQMAIL110.nvidia.com (172.18.146.15) by HQMAIL101.nvidia.com (172.20.187.10) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:53 +0000 Received: from HQMAIL105.nvidia.com (172.20.187.12) by hqmail110.nvidia.com (172.18.146.15) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:25:52 +0000 Received: from kwankhede-dev.nvidia.com (10.124.1.5) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4 via Frontend Transport; Tue, 19 Feb 2019 21:25:46 +0000 X-PGP-Universal: processed; by hqpgpgate102.nvidia.com on Tue, 19 Feb 2019 13:25:53 -0800 From: Kirti Wankhede To: , Date: Wed, 20 Feb 2019 02:53:19 +0530 Message-ID: <1550611400-13703-5-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> References: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1550611561; bh=uNGPTrUmwHp/z5e06XUP0eQEgUgt2ZZRVZ0MIJ3K4XM=;
 h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type;
 b=Z2dUJDzQDvhgFGJlwzRrxpiwk2aSrAhjRB8vBUdkiRUkm5fxWcGz31QrkXUMthF/Q
 BlKJRFzeQ8QkLduT6PYuh4eV3khN39h3XGboeCv8CyMiQtOgenNNQL8/+/MLFoZ5tB
 GcuZB4R/ELU3SyHVe9I1wWywyYijmmOoulwE8qsvD2LJPnfPodkZVB4So2Qzn69tZG
 v16r1sJEHYw31tVi6jOZ5uY471He9mxnT8hZd4k/k++ez6BtNmf1VHe19+mHiy7beo
 NDcZvMnkO+tTTg0vMaaNTHkHZQB3bxfuYf4RgtROTwgrTiRr1RF6sIXM6HiFn7PjgU
 SQYRbHOsatmKw==
X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8
X-Received-From: 216.228.121.143
Subject: [Qemu-devel] [PATCH v3 4/5] Add vfio_listerner_log_sync to mark dirty pages
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Kirti Wankhede , Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, yulei.zhang@intel.com, eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

vfio_listerner_log_sync gets the list of dirty pages from the vendor driver
and marks those pages dirty.
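
For illustration only, the page-range arithmetic performed before querying
the vendor driver can be sketched as standalone C. This is a minimal sketch
and not part of the patch: SKETCH_PAGE_BITS, sketch_page_align(),
struct sketch_pfn_range and sketch_section_to_pfn_range() are hypothetical
stand-ins for TARGET_PAGE_BITS, TARGET_PAGE_ALIGN() and the start_pfn and
pfn_count arguments passed to vfio_get_dirty_page_list(), and a 4 KiB page
size is assumed.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; QEMU takes this from TARGET_PAGE_BITS. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_SIZE (UINT64_C(1) << SKETCH_PAGE_BITS)

/* Hypothetical holder for the two values handed to the dirty-page query. */
struct sketch_pfn_range {
    uint64_t start_pfn;
    uint64_t pfn_count;
};

/* Round addr up to the next page boundary, as TARGET_PAGE_ALIGN() does. */
static uint64_t sketch_page_align(uint64_t addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Convert a section (byte offset, byte size) into a page-frame range. */
static struct sketch_pfn_range
sketch_section_to_pfn_range(uint64_t offset, uint64_t size)
{
    struct sketch_pfn_range r;

    r.start_pfn = sketch_page_align(offset) >> SKETCH_PAGE_BITS;
    r.pfn_count = size >> SKETCH_PAGE_BITS;
    return r;
}
```

Under these assumptions, a section at offset 0x100000 with size 0x4000 maps
to start_pfn 256 and pfn_count 4. With an unaligned offset such as 0x100800
the start is rounded up to PFN 257 while the count is still derived from the
full size, so the queried range may extend past the end of the section.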
Signed-off-by: Kirti Wankhede
Reviewed-by: Neo Jia
---
 hw/vfio/common.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 4262b80c4450..84ba6808f7d0 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -36,6 +36,7 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "migration/migration.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -698,9 +699,39 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
+static void vfio_listerner_log_sync(MemoryListener *listener,
+                                    MemoryRegionSection *section)
+{
+    uint64_t start_addr, size, pfn_count;
+    VFIOGroup *group;
+    VFIODevice *vbasedev;
+
+    QLIST_FOREACH(group, &vfio_group_list, next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (vbasedev->device_state & VFIO_DEVICE_STATE_SAVING) {
+                continue;
+            } else {
+                return;
+            }
+        }
+    }
+
+    start_addr = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    size = int128_get64(section->size);
+    pfn_count = size >> TARGET_PAGE_BITS;
+
+    QLIST_FOREACH(group, &vfio_group_list, next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            vfio_get_dirty_page_list(vbasedev, start_addr >> TARGET_PAGE_BITS,
+                                     pfn_count, TARGET_PAGE_SIZE);
+        }
+    }
+}
+
 static const MemoryListener vfio_memory_listener = {
     .region_add = vfio_listener_region_add,
     .region_del = vfio_listener_region_del,
+    .log_sync = vfio_listerner_log_sync,
 };
 
 static void vfio_listener_release(VFIOContainer *container)
-- 
2.7.0

From nobody Sun May 5 22:22:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as
permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=nvidia.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 155061227959519.667499399403823; Tue, 19 Feb 2019 13:37:59 -0800 (PST) Received: from localhost ([127.0.0.1]:55326 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwD5Q-0002Yc-JP for importer@patchew.org; Tue, 19 Feb 2019 16:37:56 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33619) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gwCyz-0006ZH-Dh for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:31:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gwCyy-0005rw-Kw for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:31:17 -0500 Received: from hqemgate14.nvidia.com ([216.228.121.143]:17024) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gwCyy-0005ro-Ce for qemu-devel@nongnu.org; Tue, 19 Feb 2019 16:31:16 -0500 Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqemgate14.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 19 Feb 2019 13:26:20 -0800 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Tue, 19 Feb 2019 13:26:12 -0800 Received: from HQMAIL110.nvidia.com (172.18.146.15) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:26:12 +0000 Received: from HQMAIL105.nvidia.com (172.20.187.12) by hqmail110.nvidia.com (172.18.146.15) with Microsoft SMTP Server (TLS) id 15.0.1395.4; Tue, 19 Feb 2019 21:26:00 +0000 Received: from kwankhede-dev.nvidia.com (10.124.1.5) by HQMAIL105.nvidia.com (172.20.187.12) with Microsoft SMTP Server (TLS) id 15.0.1395.4 via Frontend Transport; Tue, 19 Feb 2019 21:25:53 +0000 X-PGP-Universal: processed; by 
hqpgpgate101.nvidia.com on Tue, 19 Feb 2019 13:26:12 -0800 From: Kirti Wankhede To: , Date: Wed, 20 Feb 2019 02:53:20 +0530 Message-ID: <1550611400-13703-6-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> References: <1550611400-13703-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1550611580; bh=VNuGhJj/QxIZ/2OkokKvw8ELtYjsToKTi1qtr32bceE=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=bK0CiDscJUzccZSn2OsbZJ9+VekGWD2SzdS3D4PjORjz1M7k8xgu+ikbKnn8z5VWX tAjTeK+bwiZpdrGKHZQOdo30RPI9/orT+vt+jaC7wWa68E/YRRvjXNozaj5WJjUBKp gciKxUvtj+Bb2fca/5kjyrkorHua/PJ+8VdYZD8oJwtxXok0DKzxrr9tZXN+QGCIyY 0R5PzZtSwx5GdWzdH6DhD8jI6tfXl1uc637iv1IFnHUbKopTcsGKhldQkHsPEXdt+w A07CUeZuq9HbfgfMmtHJIykV/ZBdHG8nkjSAWqzVBEMl1F5J8lJlQgPlEumvZUtWEx cUzyflqK4QUFQ== X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 X-Received-From: 216.228.121.143 Subject: [Qemu-devel] [PATCH v3 5/5] Make vfio-pci device migration capable. 
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Kirti Wankhede , Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, yulei.zhang@intel.com, eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Call vfio_migration_probe() and vfio_migration_finalize() for the vfio-pci
device to enable migration of VFIO PCI devices.
Remove the vfio_pci_vmstate structure.
Signed-off-by: Kirti Wankhede
Reviewed-by: Neo Jia
---
 hw/vfio/pci.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e87a8a03d3f3..0fe42d146006 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2939,6 +2939,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     vdev->vbasedev.ops = &vfio_pci_ops;
     vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
     vdev->vbasedev.dev = &vdev->pdev.qdev;
+    vdev->vbasedev.device_state = 0;
 
     tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
     len = readlink(tmp, group_path, sizeof(group_path));
@@ -3175,6 +3176,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         goto out_teardown;
     }
 
+    ret = vfio_migration_probe(&vdev->vbasedev, errp);
+
     vfio_register_err_notifier(vdev);
     vfio_register_req_notifier(vdev);
     vfio_setup_resetfn_quirk(vdev);
@@ -3213,6 +3216,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 {
     VFIOPCIDevice *vdev = PCI_VFIO(pdev);
 
+    vdev->vbasedev.device_state = 0;
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
@@ -3222,6 +3226,7 @@ static void vfio_exitfn(PCIDevice *pdev)
     }
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
+    vfio_migration_finalize(&vdev->vbasedev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
@@ -3328,11 +3333,6 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
-static const VMStateDescription vfio_pci_vmstate = {
-    .name = "vfio-pci",
-    .unmigratable = 1,
-};
-
 static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -3340,7 +3340,6 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
 
     dc->reset = vfio_pci_reset;
     dc->props = vfio_pci_dev_properties;
-    dc->vmsd = &vfio_pci_vmstate;
     dc->desc = "VFIO-based PCI device assignment";
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     pdc->realize = vfio_realize;
-- 
2.7.0