From: Pankaj Gupta
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org, linux-mm@kvack.org
Cc: kwolf@redhat.com, haozhong.zhang@intel.com, jack@suse.cz,
 xiaoguangrong.eric@gmail.com, riel@surriel.com, pagupta@redhat.com,
 niteshnarayanlal@hotmail.com, david@redhat.com, ross.zwisler@intel.com,
 lcapitulino@redhat.com, hch@infradead.org, mst@redhat.com,
 stefanha@redhat.com, imammedo@redhat.com, marcel@redhat.com,
 pbonzini@redhat.com, dan.j.williams@intel.com, nilal@redhat.com
Date: Wed, 25 Apr 2018 16:54:15 +0530
Subject: [Qemu-devel] [RFC v2] qemu: Add virtio pmem device
Message-Id: <20180425112415.12327-4-pagupta@redhat.com>
In-Reply-To: <20180425112415.12327-1-pagupta@redhat.com>
References: <20180425112415.12327-1-pagupta@redhat.com>

This patch adds the virtio-pmem QEMU device.

This device presents to the guest a memory address range
that is backed by a file backend. It acts like a persistent
memory device for a KVM guest. The guest can perform reads
and persistent writes on this memory range with the help of
a DAX-capable filesystem.

Persistent guest writes are assured with the help of a
virtio-based flushing interface. When guest userspace
performs fsync on a file fd on the pmem device, a flush
command is sent to QEMU over VIRTIO and a host-side
flush/sync is done on the backing image file.

This PV device code depends on, and was tested with,
David Hildenbrand's patchset [1] to map non-PCDIMM devices
into the guest address space. There is still upstream
discussion on whether to use a PCI BAR or a memory device;
this patch will be updated as per consensus.

[1] https://marc.info/?l=qemu-devel&m=152450249319168&w=2

Signed-off-by: Pankaj Gupta
---
 hw/virtio/Makefile.objs                     |   3 +
 hw/virtio/virtio-pci.c                      |  44 +++++++
 hw/virtio/virtio-pci.h                      |  14 ++
 hw/virtio/virtio-pmem.c                     | 197 ++++++++++++++++++++++++++++
 include/hw/pci/pci.h                        |   1 +
 include/hw/virtio/virtio-pmem.h             |  44 +++++++
 include/standard-headers/linux/virtio_ids.h |   1 +
 qapi/misc.json                              |  26 +++-
 8 files changed, 329 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/virtio-pmem.c
 create mode 100644 include/hw/virtio/virtio-pmem.h

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 765d363c1f..d329dbb1a1 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -6,6 +6,9 @@ common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+ifeq ($(CONFIG_MEM_HOTPLUG),y)
+obj-$(CONFIG_LINUX) += virtio-pmem.o
+endif
 obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
 obj-y += virtio-crypto.o
 obj-$(CONFIG_VIRTIO_PCI) += virtio-crypto-pci.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 1e8ab7bbc5..e15a3a5a2e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2501,6 +2501,49 @@ static const TypeInfo virtio_rng_pci_info = {
     .class_init = virtio_rng_pci_class_init,
 };
 
+/* virtio-pmem-pci */
+
+static void virtio_pmem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VirtIOPMEMPCI *vpmem = VIRTIO_PMEM_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&vpmem->vdev);
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void virtio_pmem_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+    k->realize = virtio_pmem_pci_realize;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PMEM;
+    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+    pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_pmem_pci_instance_init(Object *obj)
+{
+    VirtIOPMEMPCI *dev = VIRTIO_PMEM_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VIRTIO_PMEM);
+    object_property_add_alias(obj, "memdev", OBJECT(&dev->vdev), "memdev",
+                              &error_abort);
+}
+
+static const TypeInfo virtio_pmem_pci_info = {
+    .name = TYPE_VIRTIO_PMEM_PCI,
+    .parent = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VirtIOPMEMPCI),
+    .instance_init = virtio_pmem_pci_instance_init,
+    .class_init = virtio_pmem_pci_class_init,
+};
+
+
 /* virtio-input-pci */
 
 static Property virtio_input_pci_properties[] = {
@@ -2693,6 +2736,7 @@ static void virtio_pci_register_types(void)
     type_register_static(&virtio_balloon_pci_info);
     type_register_static(&virtio_serial_pci_info);
     type_register_static(&virtio_net_pci_info);
+    type_register_static(&virtio_pmem_pci_info);
 #ifdef CONFIG_VHOST_SCSI
     type_register_static(&vhost_scsi_pci_info);
 #endif
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..fe74fcad3f 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -19,6 +19,7 @@
 #include "hw/virtio/virtio-blk.h"
 #include "hw/virtio/virtio-net.h"
 #include "hw/virtio/virtio-rng.h"
+#include "hw/virtio/virtio-pmem.h"
 #include "hw/virtio/virtio-serial.h"
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-balloon.h"
@@ -57,6 +58,7 @@ typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
 typedef struct VHostVSockPCI VHostVSockPCI;
 typedef struct VirtIOCryptoPCI VirtIOCryptoPCI;
+typedef struct VirtIOPMEMPCI VirtIOPMEMPCI;
 
 /* virtio-pci-bus */
 
@@ -274,6 +276,18 @@ struct VirtIOBlkPCI {
     VirtIOBlock vdev;
 };
 
+/*
+ * virtio-pmem-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VIRTIO_PMEM_PCI "virtio-pmem-pci"
+#define VIRTIO_PMEM_PCI(obj) \
+        OBJECT_CHECK(VirtIOPMEMPCI, (obj), TYPE_VIRTIO_PMEM_PCI)
+
+struct VirtIOPMEMPCI {
+    VirtIOPCIProxy parent_obj;
+    VirtIOPMEM vdev;
+};
+
 /*
  * virtio-balloon-pci: This extends VirtioPCIProxy.
  */
diff --git a/hw/virtio/virtio-pmem.c b/hw/virtio/virtio-pmem.c
new file mode 100644
index 0000000000..70d3697423
--- /dev/null
+++ b/hw/virtio/virtio-pmem.c
@@ -0,0 +1,197 @@
+/*
+ * Virtio pmem device
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/virtio-pmem.h"
+#include "hw/mem/memory-device.h"
+
+static void virtio_pmem_flush(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtQueueElement *elem;
+    VirtIOPMEM *pmem = VIRTIO_PMEM(vdev);
+    HostMemoryBackend *backend = MEMORY_BACKEND(pmem->memdev);
+    int fd = memory_region_get_fd(&backend->mr);
+
+    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+    if (!elem) {
+        return;
+    }
+    /* flush raw backing image */
+    fsync(fd);
+
+    virtio_notify(vdev, vq);
+    g_free(elem);
+
+}
+
+static void virtio_pmem_get_config(VirtIODevice *vdev, uint8_t *config)
+{
+    VirtIOPMEM *pmem = VIRTIO_PMEM(vdev);
+    struct virtio_pmem_config *pmemcfg = (struct virtio_pmem_config *) config;
+
+    pmemcfg->start = pmem->start;
+    pmemcfg->size = pmem->size;
+}
+
+static uint64_t virtio_pmem_get_features(VirtIODevice *vdev, uint64_t features,
+                                         Error **errp)
+{
+    virtio_add_feature(&features, VIRTIO_PMEM_PLUG);
+    return features;
+}
+
+static void virtio_pmem_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtIOPMEM *pmem = VIRTIO_PMEM(dev);
+    MachineState *ms = MACHINE(qdev_get_machine());
+    uint64_t align;
+
+    Error *local_err = NULL;
+    MemoryRegion *mr;
+
+    if (!pmem->memdev) {
+        error_setg(errp, "virtio-pmem memdev not set");
+        return;
+    }
+
+    mr = host_memory_backend_get_memory(pmem->memdev, errp);
+    align = memory_region_get_alignment(mr);
+    pmem->size = QEMU_ALIGN_DOWN(memory_region_size(mr), align);
+    pmem->start = memory_device_get_free_addr(ms, NULL, align, pmem->size,
+                                              &local_err);
+
+    if (local_err) {
+        error_setg(errp, "Can't get free address in mem device");
+        return;
+    }
+
+    memory_region_init_alias(&pmem->mr, OBJECT(pmem),
+                             "virtio_pmem-memory", mr, 0, pmem->size);
+    memory_device_plug_region(ms, &pmem->mr, pmem->start);
+
+    host_memory_backend_set_mapped(pmem->memdev, true);
+    virtio_init(vdev, TYPE_VIRTIO_PMEM, VIRTIO_ID_PMEM,
+                sizeof(struct virtio_pmem_config));
+
+    pmem->rq_vq = virtio_add_queue(vdev, 128, virtio_pmem_flush);
+}
+
+static void virtio_mem_check_memdev(Object *obj, const char *name, Object *val,
+                                    Error **errp)
+{
+    if (host_memory_backend_is_mapped(MEMORY_BACKEND(val))) {
+
+        char *path = object_get_canonical_path_component(val);
+        error_setg(errp, "Can't use already busy memdev: %s", path);
+        g_free(path);
+        return;
+    }
+
+    qdev_prop_allow_set_link_before_realize(obj, name, val, errp);
+}
+
+static const char *virtio_pmem_get_device_id(VirtIOPMEM *vm)
+{
+    Object *obj = OBJECT(vm);
+    DeviceState *parent_dev;
+
+    /* always use the ID of the proxy device */
+    if (obj->parent && object_dynamic_cast(obj->parent, TYPE_DEVICE)) {
+        parent_dev = DEVICE(obj->parent);
+        return parent_dev->id;
+    }
+    return NULL;
+}
+
+
+static void virtio_pmem_md_fill_device_info(const MemoryDeviceState *md,
+                                            MemoryDeviceInfo *info)
+{
+    VirtioPMemDeviceInfo *vi = g_new0(VirtioPMemDeviceInfo, 1);
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    const char *id = virtio_pmem_get_device_id(vm);
+
+    if (id) {
+        vi->has_id = true;
+        vi->id = g_strdup(id);
+    }
+
+    vi->start = vm->start;
+    vi->size = vm->size;
+    vi->memdev = object_get_canonical_path(OBJECT(vm->memdev));
+
+    info->u.virtio_pmem.data = vi;
+    info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM;
+}
+
+static uint64_t virtio_pmem_md_get_addr(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->start;
+}
+
+static uint64_t virtio_pmem_md_get_plugged_size(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->size;
+}
+
+static uint64_t virtio_pmem_md_get_region_size(const MemoryDeviceState *md)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(md);
+
+    return vm->size;
+}
+
+static void virtio_pmem_instance_init(Object *obj)
+{
+    VirtIOPMEM *vm = VIRTIO_PMEM(obj);
+    object_property_add_link(obj, "memdev", TYPE_MEMORY_BACKEND,
+                             (Object **)&vm->memdev,
+                             (void *) virtio_mem_check_memdev,
+                             OBJ_PROP_LINK_UNREF_ON_RELEASE,
+                             &error_abort);
+}
+
+
+static void virtio_pmem_class_init(ObjectClass *klass, void *data)
+{
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(klass);
+
+    vdc->realize = virtio_pmem_realize;
+    vdc->get_config = virtio_pmem_get_config;
+    vdc->get_features = virtio_pmem_get_features;
+
+    mdc->get_addr = virtio_pmem_md_get_addr;
+    mdc->get_plugged_size = virtio_pmem_md_get_plugged_size;
+    mdc->get_region_size = virtio_pmem_md_get_region_size;
+    mdc->fill_device_info = virtio_pmem_md_fill_device_info;
+}
+
+static TypeInfo virtio_pmem_info = {
+    .name = TYPE_VIRTIO_PMEM,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .class_init = virtio_pmem_class_init,
+    .instance_size = sizeof(VirtIOPMEM),
+    .instance_init = virtio_pmem_instance_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_MEMORY_DEVICE },
+        { }
+    },
+};
+
+static void virtio_register_types(void)
+{
+    type_register_static(&virtio_pmem_info);
+}
+
+type_init(virtio_register_types)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a9c3ee5aa2..df26e204ce 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -85,6 +85,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_RNG 0x1005
 #define PCI_DEVICE_ID_VIRTIO_9P 0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK 0x1012
+#define PCI_DEVICE_ID_VIRTIO_PMEM 0x1013
 
 #define PCI_VENDOR_ID_REDHAT 0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE 0x0001
diff --git a/include/hw/virtio/virtio-pmem.h b/include/hw/virtio/virtio-pmem.h
new file mode 100644
index 0000000000..a8d017beca
--- /dev/null
+++ b/include/hw/virtio/virtio-pmem.h
@@ -0,0 +1,44 @@
+/*
+ * Virtio pmem Device
+ *
+ * PV device to emulate nvdimm memory.
+ * Provides guest flushing interface based
+ * on VIRTIO.
+ */
+
+#ifndef QEMU_VIRTIO_PMEM_H
+#define QEMU_VIRTIO_PMEM_H
+
+#include "hw/virtio/virtio.h"
+#include "exec/memory.h"
+#include "sysemu/hostmem.h"
+#include "standard-headers/linux/virtio_ids.h"
+#include "hw/boards.h"
+#include "hw/i386/pc.h"
+
+#define VIRTIO_PMEM_PLUG 0
+
+#define TYPE_VIRTIO_PMEM "virtio-pmem"
+
+#define VIRTIO_PMEM(obj) \
+        OBJECT_CHECK(VirtIOPMEM, (obj), TYPE_VIRTIO_PMEM)
+
+/* VirtIOPMEM device structure */
+typedef struct VirtIOPMEM {
+
+    VirtIODevice parent_obj;
+    VirtQueue *rq_vq;
+    uint64_t start;
+    uint64_t size;
+
+    MemoryRegion mr;
+    HostMemoryBackend *memdev;
+} VirtIOPMEM;
+
+struct virtio_pmem_config {
+
+    uint64_t start;
+    uint64_t size;
+};
+
+#endif
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 6d5c3b2d4f..5ebd04980d 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT 18 /* virtio input */
 #define VIRTIO_ID_VSOCK 19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO 20 /* virtio crypto */
+#define VIRTIO_ID_PMEM 21 /* virtio pmem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/qapi/misc.json b/qapi/misc.json
index 5636f4a149..29a00b27d7 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2871,6 +2871,29 @@
           }
 }
 
+##
+# @VirtioPMemDeviceInfo:
+#
+# VirtioPMem state information
+#
+# @id: device's ID
+#
+# @start: physical address, where device is mapped
+#
+# @size: size of memory that the device provides
+#
+# @memdev: memory backend linked with device
+#
+# Since: 2.13
+##
+{ 'struct': 'VirtioPMemDeviceInfo',
+  'data': { '*id': 'str',
+            'start': 'size',
+            'size': 'size',
+            'memdev': 'str'
+  }
+}
+
 ##
 # @MemoryDeviceInfo:
 #
@@ -2880,7 +2903,8 @@
 ##
 { 'union': 'MemoryDeviceInfo',
   'data': { 'dimm': 'PCDIMMDeviceInfo',
-            'nvdimm': 'PCDIMMDeviceInfo'
+            'nvdimm': 'PCDIMMDeviceInfo',
+            'virtio-pmem': 'VirtioPMemDeviceInfo'
           }
 }
 
-- 
2.14.3
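
For readers who want to look at the host-side half of the flush path in
isolation, here is a minimal standalone sketch. It is not part of the
patch and does not use any QEMU API. It assumes a plain POSIX host; the
names pmem_backing_flush, pmem_demo_write_and_flush and the PMEM_FLUSH_*
status codes are hypothetical and appear neither in this patch nor in
QEMU. A MAP_SHARED file mapping stands in for the guest-visible range:
the program stores into it and then calls fsync() on the backing fd,
which is the same call virtio_pmem_flush() issues when the guest kicks
the queue.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes; the patch defines no response format. */
enum { PMEM_FLUSH_OK = 0, PMEM_FLUSH_ERR = 1 };

/*
 * Flush the backing file to stable storage, retrying if fsync() is
 * interrupted by a signal. Returns a status a device could report.
 */
int pmem_backing_flush(int fd)
{
    int ret;

    do {
        ret = fsync(fd);
    } while (ret < 0 && errno == EINTR);

    return ret == 0 ? PMEM_FLUSH_OK : PMEM_FLUSH_ERR;
}

/*
 * Stand-in for "guest writes through DAX, then asks for a flush":
 * store msg into a shared file mapping, then flush the backing fd.
 */
int pmem_demo_write_and_flush(const char *path, const char *msg)
{
    size_t len;
    void *map;
    int fd, status;

    assert(path != NULL && msg != NULL);
    len = strlen(msg) + 1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return PMEM_FLUSH_ERR;
    }
    /* Size the file so the whole mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return PMEM_FLUSH_ERR;
    }

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return PMEM_FLUSH_ERR;
    }

    memcpy(map, msg, len);            /* the "guest" store */
    status = pmem_backing_flush(fd);  /* the host-side flush */

    munmap(map, len);
    close(fd);
    return status;
}
```

Unlike virtio_pmem_flush() in the patch, the sketch returns a status so
that a failed fsync() could be reported to the caller. Whether and how
the device should carry such a status back to the guest is an open
design choice and is not something this RFC defines.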