From nobody Wed Oct 29 06:39:43 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1524655606513202.0095747447715; Wed, 25 Apr 2018 04:26:46 -0700 (PDT) Received: from localhost ([::1]:35827 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIZR-0001hy-N4 for importer@patchew.org; Wed, 25 Apr 2018 07:26:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33142) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIXM-0000B0-Dp for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fBIXK-00014W-SZ for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:36 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:46078 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fBIXK-000147-Mr for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:34 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 42EE2406C751; Wed, 25 Apr 2018 11:24:34 +0000 (UTC) Received: from dhcp201-121.englab.pnq.redhat.com (dhcp193-127.pnq.redhat.com [10.65.193.127]) by smtp.corp.redhat.com (Postfix) with ESMTP id EE1122023239; Wed, 25 Apr 2018 11:24:27 +0000 (UTC) From: Pankaj 
Gupta
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org, linux-mm@kvack.org
Date: Wed, 25 Apr 2018 16:54:13 +0530
Message-Id: <20180425112415.12327-2-pagupta@redhat.com>
In-Reply-To: <20180425112415.12327-1-pagupta@redhat.com>
References: <20180425112415.12327-1-pagupta@redhat.com>
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Wed, 25 Apr 2018 11:24:34 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Wed, 25 Apr 2018 11:24:34 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'pagupta@redhat.com' RCPT:''
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 66.187.233.73
Subject: [Qemu-devel] [RFC v2 1/2] virtio: add pmem driver
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: kwolf@redhat.com, haozhong.zhang@intel.com, jack@suse.cz, xiaoguangrong.eric@gmail.com, riel@surriel.com, pagupta@redhat.com, niteshnarayanlal@hotmail.com, david@redhat.com, ross.zwisler@intel.com, lcapitulino@redhat.com, hch@infradead.org, mst@redhat.com, stefanha@redhat.com, imammedo@redhat.com, marcel@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, nilal@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch adds a virtio-pmem driver for KVM
guests.

The guest reads the persistent memory range
information from QEMU over VIRTIO and registers
it on the nvdimm_bus.
It also creates an nd_region
object with the persistent memory range
information so that the existing 'nvdimm/pmem'
driver can reserve this range in the system memory map.
This way the 'virtio-pmem' driver uses the existing
functionality of the pmem driver to register persistent
memory compatible with DAX-capable filesystems.

This also provides a function to perform a guest flush
over VIRTIO from the 'pmem' driver when userspace
performs a flush on the DAX memory range.

Signed-off-by: Pankaj Gupta
---
 drivers/virtio/Kconfig           |  12 ++++
 drivers/virtio/Makefile          |   1 +
 drivers/virtio/virtio_pmem.c     | 118 +++++++++++++++++++++++++++++++++++++++
 include/linux/libnvdimm.h        |   4 ++
 include/uapi/linux/virtio_ids.h  |   1 +
 include/uapi/linux/virtio_pmem.h |  58 +++++++++++++++++++
 6 files changed, 194 insertions(+)
 create mode 100644 drivers/virtio/virtio_pmem.c
 create mode 100644 include/uapi/linux/virtio_pmem.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 3589764..879335d 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -42,6 +42,18 @@ config VIRTIO_PCI_LEGACY
 
 	  If unsure, say Y.
 
+config VIRTIO_PMEM
+	tristate "Virtio pmem driver"
+	depends on VIRTIO
+	help
+	 This driver adds persistent memory range to nd_region and registers
+	 with nvdimm bus. NVDIMM 'pmem' driver later allocates a persistent
+	 memory range on the memory information added by this driver. In addition
+	 to this, 'virtio-pmem' driver also provides a paravirt flushing interface
+	 from guest to host.
+
+	 If unsure, say M.
+
 config VIRTIO_BALLOON
 	tristate "Virtio balloon driver"
 	depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5..cbe91c6 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o
diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c
new file mode 100644
index 0000000..0906d2d
--- /dev/null
+++ b/drivers/virtio/virtio_pmem.c
@@ -0,0 +1,118 @@
+/* Virtio pmem Driver
+ *
+ * Discovers persistent memory range information
+ * from host and provides a virtio based flushing
+ * interface.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static int init_vq(struct virtio_pmem *vpmem)
+{
+	struct virtqueue *vq;
+
+	/* single vq */
+	vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
+				NULL, "flush_queue");
+
+	if (IS_ERR(vq))
+		return PTR_ERR(vq);
+
+	return 0;
+};
+
+static int virtio_pmem_probe(struct virtio_device *vdev)
+{
+	int err = 0;
+	struct resource res;
+	struct virtio_pmem *vpmem;
+	struct nvdimm_bus *nvdimm_bus;
+	struct nd_region_desc ndr_desc;
+	int nid = dev_to_node(&vdev->dev);
+	static struct nvdimm_bus_descriptor nd_desc;
+
+	if (!vdev->config->get) {
+		dev_err(&vdev->dev, "%s failure: config disabled\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
+			GFP_KERNEL);
+	if (!vpmem) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	vpmem->vdev = vdev;
+	err = init_vq(vpmem);
+	if (err)
+		goto out;
+
+	virtio_cread(vpmem->vdev, struct virtio_pmem_config,
+			start, &vpmem->start);
+	virtio_cread(vpmem->vdev, struct virtio_pmem_config,
+			size, &vpmem->size);
+
+	res.start = vpmem->start;
+	res.end   = vpmem->start + vpmem->size-1;
+
+	memset(&nd_desc, 0, sizeof(nd_desc));
+	nd_desc.provider_name = "virtio-pmem";
+	nd_desc.module = THIS_MODULE;
+	nvdimm_bus = nvdimm_bus_register(&vdev->dev, &nd_desc);
+
+	if (!nvdimm_bus)
+		goto out_nd;
+	dev_set_drvdata(&vdev->dev, nvdimm_bus);
+
+	memset(&ndr_desc, 0, sizeof(ndr_desc));
+	ndr_desc.res = &res;
+	ndr_desc.numa_node = nid;
+	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+	set_bit(ND_REGION_VIRTIO, &ndr_desc.flags);
+
+	if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
+		goto out_nd;
+
+	virtio_device_ready(vdev);
+	return 0;
+
+out_nd:
+	nvdimm_bus_unregister(nvdimm_bus);
+out:
+	dev_err(&vdev->dev, "failed to register virtio pmem memory\n");
+	vdev->config->del_vqs(vdev);
+	return err;
+}
+
+static void virtio_pmem_remove(struct virtio_device *vdev)
+{
+	struct nvdimm_bus *nvdimm_bus = dev_get_drvdata(&vdev->dev);
+
+	nvdimm_bus_unregister(nvdimm_bus);
+	vdev->config->del_vqs(vdev);
+}
+
+static struct virtio_driver virtio_pmem_driver = {
+	.driver.name		= KBUILD_MODNAME,
+	.driver.owner		= THIS_MODULE,
+	.id_table		= id_table,
+	.probe			= virtio_pmem_probe,
+	.remove			= virtio_pmem_remove,
+};
+
+module_virtio_driver(virtio_pmem_driver);
+MODULE_DEVICE_TABLE(virtio, id_table);
+MODULE_DESCRIPTION("Virtio pmem driver");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 097072c..b1b7f14 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -58,6 +58,10 @@ enum {
 	 * (ADR)
 	 */
 	ND_REGION_PERSIST_MEMCTRL = 2,
+	/*
+	 * region flag indicating to use VIRTIO flush interface for pmem
+	 */
+	ND_REGION_VIRTIO = 3,
 
 	/* mark newly adjusted resources as requiring a label update */
 	DPA_RESOURCE_ADJUSTED = 1 << 0,
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 6d5c3b2..5ebd049 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_PMEM         21 /* virtio pmem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/uapi/linux/virtio_pmem.h b/include/uapi/linux/virtio_pmem.h
new file mode 100644
index 0000000..2ec27cb
--- /dev/null
+++ b/include/uapi/linux/virtio_pmem.h
@@ -0,0 +1,58 @@
+/* Virtio pmem Driver
+ *
+ * Discovers persistent memory range information
+ * from host and provides a virtio based flushing
+ * interface.
+ */
+
+#ifndef _LINUX_VIRTIO_PMEM_H
+#define _LINUX_VIRTIO_PMEM_H
+
+#include
+#include
+#include
+#include
+#include
+
+
+struct virtio_pmem_config {
+
+	uint64_t start;
+	uint64_t size;
+};
+
+struct virtio_pmem {
+
+	struct virtio_device *vdev;
+	struct virtqueue *req_vq;
+
+	uint64_t start;
+	uint64_t size;
+} __packed;
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+void virtio_pmem_flush(struct device *dev)
+{
+	struct scatterlist sg;
+	struct virtio_device *vdev = dev_to_virtio(dev->parent->parent);
+	struct virtio_pmem *vpmem = vdev->priv;
+	char *buf = "FLUSH";
+	int err;
+
+	sg_init_one(&sg, buf, sizeof(buf));
+
+	err = virtqueue_add_outbuf(vpmem->req_vq, &sg, 1, buf, GFP_KERNEL);
+
+	if (err) {
+		dev_err(&vdev->dev, "failed to send command to virtio pmem device\n");
+		return;
+	}
+
+	virtqueue_kick(vpmem->req_vq);
+};
+
+#endif
-- 
2.9.3

From nobody Wed Oct 29 06:39:43 2025
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org;
dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1524655949009191.44034096989594; Wed, 25 Apr 2018 04:32:29 -0700 (PDT) Received: from localhost ([::1]:35864 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIey-0006QS-4c for importer@patchew.org; Wed, 25 Apr 2018 07:32:28 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33190) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIXT-0000HJ-87 for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fBIXR-0001Ab-2Z for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:43 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:46092 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fBIXQ-0001AF-Uc for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:41 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 849ED406C757; Wed, 25 Apr 2018 11:24:40 +0000 (UTC) Received: from dhcp201-121.englab.pnq.redhat.com (dhcp193-127.pnq.redhat.com [10.65.193.127]) by smtp.corp.redhat.com (Postfix) with ESMTP id AC3FE2023239; Wed, 25 Apr 2018 11:24:34 +0000 (UTC) From: Pankaj Gupta To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org, linux-mm@kvack.org Date: Wed, 25 Apr 2018 16:54:14 +0530 Message-Id: <20180425112415.12327-3-pagupta@redhat.com> In-Reply-To: <20180425112415.12327-1-pagupta@redhat.com> References: <20180425112415.12327-1-pagupta@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com 
[10.11.55.5]); Wed, 25 Apr 2018 11:24:40 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Wed, 25 Apr 2018 11:24:40 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'pagupta@redhat.com' RCPT:''
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 66.187.233.73
Subject: [Qemu-devel] [RFC v2 2/2] pmem: device flush over VIRTIO
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: kwolf@redhat.com, haozhong.zhang@intel.com, jack@suse.cz, xiaoguangrong.eric@gmail.com, riel@surriel.com, pagupta@redhat.com, niteshnarayanlal@hotmail.com, david@redhat.com, ross.zwisler@intel.com, lcapitulino@redhat.com, hch@infradead.org, mst@redhat.com, stefanha@redhat.com, imammedo@redhat.com, marcel@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, nilal@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch adds functionality to perform a
flush from guest to host over VIRTIO
when the 'ND_REGION_VIRTIO' flag is set on
the nd_region. The flag is set by the 'virtio-pmem' driver.
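
The dispatch described above can be illustrated with a small userspace
sketch. This is not the kernel code: `struct region`, `region_test_bit()`,
`region_set_bit()`, `region_flush()` and the two counters are hypothetical
names invented for the example, and only the flag value 3 is taken from the
patch. The idea shown is simply "test one flag bit, take the paravirtual
path and return early, otherwise fall through to the native path".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical flag bit number; the value mirrors ND_REGION_VIRTIO = 3. */
enum {
    REGION_FLAG_VIRTIO = 3,
};

/* Hypothetical stand-in for a region object carrying a flag word. */
struct region {
    unsigned long flags;   /* one bit per region flag */
    int pv_flushes;        /* how often the paravirtual path ran */
    int native_flushes;    /* how often the native path ran */
};

/* Return true if bit 'nr' is set in the flag word. */
static bool region_test_bit(int nr, const unsigned long *flags)
{
    assert(nr >= 0 && nr < (int)(sizeof(*flags) * CHAR_BIT));
    return (*flags >> nr) & 1UL;
}

/* Set bit 'nr' in the flag word. */
static void region_set_bit(int nr, unsigned long *flags)
{
    assert(nr >= 0 && nr < (int)(sizeof(*flags) * CHAR_BIT));
    *flags |= 1UL << nr;
}

/* Flush a region: paravirtual path if flagged, native path otherwise. */
void region_flush(struct region *r)
{
    if (region_test_bit(REGION_FLAG_VIRTIO, &r->flags)) {
        r->pv_flushes++;       /* stands in for the virtio flush call */
        return;                /* skip the native flush entirely */
    }
    r->native_flushes++;       /* stands in for the flush-hint writes */
}
```

A caller would set the flag once when the region is created, for example
`region_set_bit(REGION_FLAG_VIRTIO, &r.flags)`, and every later
`region_flush(&r)` then takes the paravirtual branch.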
Signed-off-by: Pankaj Gupta
---
 drivers/nvdimm/region_devs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index a612be6..6c6454e 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -20,6 +20,7 @@
 #include
 #include "nd-core.h"
 #include "nd.h"
+#include
 
 /*
  * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
@@ -1074,6 +1075,12 @@ void nvdimm_flush(struct nd_region *nd_region)
 	struct nd_region_data *ndrd = dev_get_drvdata(&nd_region->dev);
 	int i, idx;
 
+	/* call PV device flush */
+	if (test_bit(ND_REGION_VIRTIO, &nd_region->flags)) {
+		virtio_pmem_flush(&nd_region->dev);
+		return;
+	}
+
 	/*
 	 * Try to encourage some diversity in flush hint addresses
 	 * across cpus assuming a limited number of flush hints.
-- 
2.9.3

From nobody Wed Oct 29 06:39:43 2025
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1524655771067333.49994735795997; Wed, 25 Apr 2018 04:29:31 -0700 (PDT)
Received: from localhost ([::1]:35845 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIc6-00040j-3t for importer@patchew.org; Wed, 25 Apr 2018 07:29:30 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33243) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fBIXZ-0000Nf-N7 for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned
(Exim 4.71) (envelope-from ) id 1fBIXX-0001Gj-J0 for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:49 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:33446 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fBIXX-0001GB-Cv for qemu-devel@nongnu.org; Wed, 25 Apr 2018 07:24:47 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id DF8CF738E0; Wed, 25 Apr 2018 11:24:46 +0000 (UTC) Received: from dhcp201-121.englab.pnq.redhat.com (dhcp193-127.pnq.redhat.com [10.65.193.127]) by smtp.corp.redhat.com (Postfix) with ESMTP id EFA7D2023239; Wed, 25 Apr 2018 11:24:40 +0000 (UTC) From: Pankaj Gupta To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org, linux-mm@kvack.org Date: Wed, 25 Apr 2018 16:54:15 +0530 Message-Id: <20180425112415.12327-4-pagupta@redhat.com> In-Reply-To: <20180425112415.12327-1-pagupta@redhat.com> References: <20180425112415.12327-1-pagupta@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Wed, 25 Apr 2018 11:24:46 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Wed, 25 Apr 2018 11:24:46 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'pagupta@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [RFC v2] qemu: Add virtio pmem device X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, 
haozhong.zhang@intel.com, jack@suse.cz, xiaoguangrong.eric@gmail.com, riel@surriel.com, pagupta@redhat.com, niteshnarayanlal@hotmail.com, david@redhat.com, ross.zwisler@intel.com, lcapitulino@redhat.com, hch@infradead.org, mst@redhat.com, stefanha@redhat.com, imammedo@redhat.com, marcel@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, nilal@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch adds the virtio-pmem QEMU device.

This device presents a memory address range
to the guest, backed by a file
backend type. It acts like a persistent memory
device for a KVM guest. The guest can perform read
and persistent write operations on this memory
range with the help of a DAX-capable filesystem.

Persistent guest writes are assured with the
help of a virtio-based flushing interface. When
guest userspace performs fsync on a file
fd on the pmem device, a flush command is sent to
QEMU over VIRTIO and a host-side flush/sync is
done on the backing image file.

This PV device code depends on, and was tested
with, David Hildenbrand's patchset [1] to
map non-PCDIMM devices into the guest address space.
There is still an upstream discussion on whether to use
a PCI BAR or a memory device; this will be updated
as per consensus.
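
The host-side half of the flush path described above can be sketched in
plain POSIX C, outside of QEMU. Everything named here is an assumption made
for illustration: `struct flush_resp`, `handle_flush_request()` and
`demo_flush_temp_file()` are hypothetical, and reporting a status back to
the guest is a choice of this sketch rather than something taken from the
patch. The only real call relied on for the flush itself is `fsync()` on
the backing file descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical response a device could hand back to the guest. */
struct flush_resp {
    int32_t ret;   /* 0 on success, negative errno on failure */
};

/*
 * Handle one flush request: make everything written to the backing
 * file durable with fsync(), retrying if a signal interrupts the call.
 */
void handle_flush_request(int backing_fd, struct flush_resp *resp)
{
    int ret;

    assert(resp != NULL);
    do {
        ret = fsync(backing_fd);
    } while (ret < 0 && errno == EINTR);

    resp->ret = ret < 0 ? -errno : 0;
}

/* Write to a throwaway backing file, then flush it; returns resp.ret. */
int demo_flush_temp_file(void)
{
    char path[] = "/tmp/pmem-demo-XXXXXX";
    struct flush_resp resp;
    int fd = mkstemp(path);

    if (fd < 0)
        return -errno;
    unlink(path);                      /* file disappears once closed */

    if (write(fd, "data", 4) != 4) {   /* stands in for guest stores */
        close(fd);
        return -EIO;
    }

    handle_flush_request(fd, &resp);
    close(fd);
    return resp.ret;
}
```

Returning a status lets the guest tell a completed flush from a failed one,
which matters if guest fsync() is to report host I/O errors.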
[1] https://marc.info/?l=qemu-devel&m=152450249319168&w=2

Signed-off-by: Pankaj Gupta
---
 hw/virtio/Makefile.objs                     |   3 +
 hw/virtio/virtio-pci.c                      |  44 +++++++
 hw/virtio/virtio-pci.h                      |  14 ++
 hw/virtio/virtio-pmem.c                     | 197 ++++++++++++++++++++++++++++
 include/hw/pci/pci.h                        |   1 +
 include/hw/virtio/virtio-pmem.h             |  44 +++++++
 include/standard-headers/linux/virtio_ids.h |   1 +
 qapi/misc.json                              |  26 +++-
 8 files changed, 329 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/virtio-pmem.c
 create mode 100644 include/hw/virtio/virtio-pmem.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 765d363c1f..d329dbb1a1 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -6,6 +6,9 @@ common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+ifeq ($(CONFIG_MEM_HOTPLUG),y)
+obj-$(CONFIG_LINUX) += virtio-pmem.o
+endif
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 obj-y += virtio-crypto.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio-crypto-pci.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 1e8ab7bbc5..e15a3a5a2e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2501,6 +2501,49 @@ static const TypeInfo virtio_rng_pci_info = {
     .class_init    = virtio_rng_pci_class_init,
 };
 
+/* virtio-pmem-pci */
+
+static void virtio_pmem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VirtIOPMEMPCI *vpmem = VIRTIO_PMEM_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&vpmem->vdev);
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void virtio_pmem_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+    k->realize = virtio_pmem_pci_realize;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PMEM;
+    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+    pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_pmem_pci_instance_init(Object *obj)
+{
+    VirtIOPMEMPCI *dev = VIRTIO_PMEM_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VIRTIO_PMEM);
+    object_property_add_alias(obj, "memdev", OBJECT(&dev->vdev), "memdev",
+                              &error_abort);
+}
+
+static const TypeInfo virtio_pmem_pci_info = {
+    .name          = TYPE_VIRTIO_PMEM_PCI,
+    .parent        = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VirtIOPMEMPCI),
+    .instance_init = virtio_pmem_pci_instance_init,
+    .class_init    = virtio_pmem_pci_class_init,
+};
+
+
 /* virtio-input-pci */
 
 static Property virtio_input_pci_properties[] = {
@@ -2693,6 +2736,7 @@ static void virtio_pci_register_types(void)
     type_register_static(&virtio_balloon_pci_info);
     type_register_static(&virtio_serial_pci_info);
     type_register_static(&virtio_net_pci_info);
+    type_register_static(&virtio_pmem_pci_info);
 #ifdef CONFIG_VHOST_SCSI
     type_register_static(&vhost_scsi_pci_info);
 #endif
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..fe74fcad3f 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -19,6 +19,7 @@
 #include "hw/virtio/virtio-blk.h"
 #include "hw/virtio/virtio-net.h"
 #include "hw/virtio/virtio-rng.h"
+#include "hw/virtio/virtio-pmem.h"
 #include "hw/virtio/virtio-serial.h"
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-balloon.h"
@@ -57,6 +58,7 @@ typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
 typedef struct VHostVSockPCI VHostVSockPCI;
 typedef struct VirtIOCryptoPCI VirtIOCryptoPCI;
+typedef struct VirtIOPMEMPCI VirtIOPMEMPCI;
 
 /* virtio-pci-bus */
 
@@ -274,6 +276,18 @@ struct VirtIOBlkPCI {
     VirtIOBlock vdev;
 };
 
+/*
+ * virtio-pmem-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VIRTIO_PMEM_PCI "virtio-pmem-pci"
+#define VIRTIO_PMEM_PCI(obj) \
+        OBJECT_CHECK(VirtIOPMEMPCI, (obj), TYPE_VIRTIO_PMEM_PCI)
+
+struct VirtIOPMEMPCI {
+    VirtIOPCIProxy parent_obj;
+    VirtIOPMEM vdev;
+};
+
 /*
  * virtio-balloon-pci: This extends VirtioPCIProxy.
  */
diff --git a/hw/virtio/virtio-pmem.c b/hw/virtio/virtio-pmem.c
new file mode 100644
index 0000000000..70d3697423
--- /dev/null
+++ b/hw/virtio/virtio-pmem.c
@@ -0,0 +1,197 @@
+/*
+ * Virtio pmem device
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/virtio-pmem.h"
+#include "hw/mem/memory-device.h"
+
+static void virtio_pmem_flush(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtQueueElement *elem;
+    VirtIOPMEM *pmem = VIRTIO_PMEM(vdev);
+    HostMemoryBackend *backend = MEMORY_BACKEND(pmem->memdev);
+    int fd = memory_region_get_fd(&backend->mr);
+
+    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+    if (!elem) {
+        return;
+    }
+    /* flush raw backing image */
+    fsync(fd);
+
+    virtio_notify(vdev, vq);
+    g_free(elem);
+
+}
+
+static void virtio_pmem_get_config(VirtIODevice *vdev, uint8_t *config)
+{
+    VirtIOPMEM *pmem = VIRTIO_PMEM(vdev);
+    struct virtio_pmem_config *pmemcfg = (struct virtio_pmem_config *) config;
+
+    pmemcfg->start = pmem->start;
+    pmemcfg->size  = pmem->size;
+}
+
+static uint64_t virtio_pmem_get_features(VirtIODevice *vdev, uint64_t features,
+                                         Error **errp)
+{
+    virtio_add_feature(&features, VIRTIO_PMEM_PLUG);
+    return features;
+}
+
+static void virtio_pmem_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOPMEM *pmem = VIRTIO_PMEM(dev);
+    MachineState *ms = MACHINE(qdev_get_machine());
+    uint64_t align;
+
+    Error *local_err = NULL;
+    MemoryRegion *mr;
+
+    if (!pmem->memdev) {
+        error_setg(errp, "virtio-pmem memdev not set");
+        return;
+    }
+
+    mr = host_memory_backend_get_memory(pmem->memdev, errp);
+    align = memory_region_get_alignment(mr);
+    pmem->size = QEMU_ALIGN_DOWN(memory_region_size(mr), align);
+    pmem->start = memory_device_get_free_addr(ms, NULL, align, pmem->size,
+                                              &local_err);
+
+    if (local_err) {
+        error_setg(errp, "Can't get free address in mem device");
+        return;
+    }
+
+    memory_region_init_alias(&pmem->mr, OBJECT(pmem),
+                             "virtio_pmem-memory", mr, 0, pmem->size);
+    memory_device_plug_region(ms, &pmem->mr, pmem->start);
+
+    host_memory_backend_set_mapped(pmem->memdev, true);
+    virtio_init(vdev, TYPE_VIRTIO_PMEM, VIRTIO_ID_PMEM,
+                sizeof(struct virtio_pmem_config));
+
+    pmem->rq_vq = virtio_add_queue(vdev, 128, virtio_pmem_flush);
+}
+
+static void virtio_mem_check_memdev(Object *obj, const char *name, Object *val,
+                                    Error **errp)
+{
+    if (host_memory_backend_is_mapped(MEMORY_BACKEND(val))) {
+
+        char *path = object_get_canonical_path_component(val);
+        error_setg(errp, "Can't use already busy memdev: %s", path);
+        g_free(path);
+        return;
+    }
+
+    qdev_prop_allow_set_link_before_realize(obj, name, val, errp);
+}
+
+static const char *virtio_pmem_get_device_id(VirtIOPMEM *vm)
+{
+    Object *obj = OBJECT(vm);
+    DeviceState *parent_dev;
+
+    /* always use the ID of the proxy device */
+    if (obj->parent && object_dynamic_cast(obj->parent, TYPE_DEVICE)) {
+        parent_dev = DEVICE(obj->parent);
+        return parent_dev->id;
+    }
+    return NULL;
+}
+
+
+static void virtio_pmem_md_fill_device_info(const MemoryDeviceState *md,
+                                            MemoryDeviceInfo *info)
+{
+    VirtioPMemDeviceInfo *vi = g_new0(VirtioPMemDeviceInfo, 1);
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    const char *id = virtio_pmem_get_device_id(vm);
+
+    if (id) {
+        vi->has_id = true;
+        vi->id = g_strdup(id);
+    }
+
+    vi->start = vm->start;
+    vi->size = vm->size;
+    vi->memdev = object_get_canonical_path(OBJECT(vm->memdev));
+
+    info->u.virtio_pmem.data = vi;
+    info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM;
+}
+
+static uint64_t virtio_pmem_md_get_addr(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->start;
+}
+
+static uint64_t virtio_pmem_md_get_plugged_size(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->size;
+}
+
+static uint64_t virtio_pmem_md_get_region_size(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->size;
+}
+
+static void virtio_pmem_instance_init(Object *obj)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(obj);
+    object_property_add_link(obj, "memdev", TYPE_MEMORY_BACKEND,
+                             (Object **)&vm->memdev,
+                             (void *) virtio_mem_check_memdev,
+                             OBJ_PROP_LINK_UNREF_ON_RELEASE,
+                             &error_abort);
+}
+
+
+static void virtio_pmem_class_init(ObjectClass *klass, void *data)
+{
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(klass);
+
+    vdc->realize      = virtio_pmem_realize;
+    vdc->get_config   = virtio_pmem_get_config;
+    vdc->get_features = virtio_pmem_get_features;
+
+    mdc->get_addr         = virtio_pmem_md_get_addr;
+    mdc->get_plugged_size = virtio_pmem_md_get_plugged_size;
+    mdc->get_region_size  = virtio_pmem_md_get_region_size;
+    mdc->fill_device_info = virtio_pmem_md_fill_device_info;
+}
+
+static TypeInfo virtio_pmem_info = {
+    .name          = TYPE_VIRTIO_PMEM,
+    .parent        = TYPE_VIRTIO_DEVICE,
+    .class_init    = virtio_pmem_class_init,
+    .instance_size = sizeof(VirtIOPMEM),
+    .instance_init = virtio_pmem_instance_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_MEMORY_DEVICE },
+        { }
+    },
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&virtio_pmem_info);
+}
+
+type_init(virtio_register_types)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a9c3ee5aa2..df26e204ce 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -85,6 +85,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_RNG         0x1005
 #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
+#define PCI_DEVICE_ID_VIRTIO_PMEM        0x1013
 
 #define PCI_VENDOR_ID_REDHAT             0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
diff --git a/include/hw/virtio/virtio-pmem.h b/include/hw/virtio/virtio-pmem.h
new file mode 100644
index 0000000000..a8d017beca
--- /dev/null
+++ b/include/hw/virtio/virtio-pmem.h
@@ -0,0 +1,44 @@
+/*
+ * Virtio pmem Device
+ *
+ * PV device to emulate nvdimm memory.
+ * Provides guest flushing interface based
+ * on VIRTIO.
+ */
+
+#ifndef QEMU_VIRTIO_PMEM_H
+#define QEMU_VIRTIO_PMEM_H
+
+#include "hw/virtio/virtio.h"
+#include "exec/memory.h"
+#include "sysemu/hostmem.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "hw/boards.h"
+#include "hw/i386/pc.h"
+
+#define VIRTIO_PMEM_PLUG 0
+
+#define TYPE_VIRTIO_PMEM "virtio-pmem"
+
+#define VIRTIO_PMEM(obj) \
+        OBJECT_CHECK(VirtIOPMEM, (obj), TYPE_VIRTIO_PMEM)
+
+/* VirtIOPMEM device structure */
+typedef struct VirtIOPMEM {
+
+    VirtIODevice parent_obj;
+    VirtQueue *rq_vq;
+    uint64_t start;
+    uint64_t size;
+
+    MemoryRegion mr;
+    HostMemoryBackend *memdev;
+} VirtIOPMEM;
+
+struct virtio_pmem_config {
+
+    uint64_t start;
+    uint64_t size;
+};
+
+#endif
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 6d5c3b2d4f..5ebd04980d 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_PMEM         21 /* virtio pmem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/qapi/misc.json b/qapi/misc.json
index 5636f4a149..29a00b27d7 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2871,6 +2871,29 @@
   }
 }
 
+##
+# @VirtioPMemDeviceInfo:
+#
+# VirtioPMem state information
+#
+# @id: device's ID
+#
+# @start: physical address, where device is mapped
+#
+# @size: size of memory that the device provides
+#
+# @memdev: memory backend linked with device
+#
+# Since: 2.13
+##
+{ 'struct': 'VirtioPMemDeviceInfo',
+  'data': { '*id': 'str',
+            'start': 'size',
+            'size': 'size',
+            'memdev': 'str'
+          }
+}
+
 ##
 # @MemoryDeviceInfo:
 #
@@ -2880,7 +2903,8 @@
 ##
 { 'union': 'MemoryDeviceInfo',
   'data': { 'dimm': 'PCDIMMDeviceInfo',
-            'nvdimm': 'PCDIMMDeviceInfo'
+            'nvdimm': 'PCDIMMDeviceInfo',
+            'virtio-pmem': 'VirtioPMemDeviceInfo'
           }
 }
 
-- 
2.14.3