From: Pankaj Gupta
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org
Cc: kwolf@redhat.com, haozhong.zhang@intel.com, jack@suse.cz,
 xiaoguangrong.eric@gmail.com, riel@surriel.com, pagupta@redhat.com,
 niteshnarayanlal@hotmail.com, david@redhat.com, ross.zwisler@intel.com,
 lcapitulino@redhat.com, hch@infradead.org, mst@redhat.com,
 stefanha@redhat.com, imammedo@redhat.com, pbonzini@redhat.com,
 dan.j.williams@intel.com, nilal@redhat.com
Date: Fri, 13 Jul 2018 13:22:31 +0530
Message-Id: <20180713075232.9575-3-pagupta@redhat.com>
In-Reply-To: <20180713075232.9575-1-pagupta@redhat.com>
References: <20180713075232.9575-1-pagupta@redhat.com>
Subject: [Qemu-devel] [RFC v3 2/2] virtio-pmem: Add virtio pmem driver

This patch adds a virtio-pmem driver for KVM guests.

The guest reads the persistent memory range information from QEMU over
VIRTIO and registers it on nvdimm_bus.
It also creates an nd_region object with the persistent memory range
information so that the existing 'nvdimm/pmem' driver can reserve this
range in the system memory map. This way the 'virtio-pmem' driver reuses
the existing functionality of the pmem driver to register persistent
memory compatible with DAX-capable filesystems.

This also provides a function to perform a guest flush over VIRTIO from
the 'pmem' driver when userspace performs a flush on a DAX memory range.

Signed-off-by: Pankaj Gupta
---
 drivers/virtio/Kconfig           |   9 ++
 drivers/virtio/Makefile          |   1 +
 drivers/virtio/virtio_pmem.c     | 190 +++++++++++++++++++++++++++++++++++++++
 include/linux/virtio_pmem.h      |  44 +++++++++
 include/uapi/linux/virtio_ids.h  |   1 +
 include/uapi/linux/virtio_pmem.h |  40 +++++++++
 6 files changed, 285 insertions(+)
 create mode 100644 drivers/virtio/virtio_pmem.c
 create mode 100644 include/linux/virtio_pmem.h
 create mode 100644 include/uapi/linux/virtio_pmem.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 3589764..a331e23 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY
 
 	  If unsure, say Y.
 
+config VIRTIO_PMEM
+	tristate "Support for virtio pmem driver"
+	depends on VIRTIO
+	help
+	  This driver provides support for virtio based flushing interface
+	  for persistent memory range.
+
+	  If unsure, say M.
+
 config VIRTIO_BALLOON
 	tristate "Virtio balloon driver"
 	depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5..cbe91c6 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o
diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c
new file mode 100644
index 0000000..6200b5e
--- /dev/null
+++ b/drivers/virtio/virtio_pmem.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * virtio_pmem.c: Virtio pmem Driver
+ *
+ * Discovers persistent memory range information
+ * from host and provides a virtio based flushing
+ * interface.
+ */
+#include
+#include
+#include
+
+static struct virtio_device_id id_table[] = {
+	{ VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+
+/* The interrupt handler */
+static void host_ack(struct virtqueue *vq)
+{
+	unsigned int len;
+	unsigned long flags;
+	struct virtio_pmem_request *req;
+	struct virtio_pmem *vpmem = vq->vdev->priv;
+
+	spin_lock_irqsave(&vpmem->pmem_lock, flags);
+	while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
+		req->done = true;
+		wake_up(&req->acked);
+	}
+	spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
+}
+/* Initialize virt queue */
+static int init_vq(struct virtio_pmem *vpmem)
+{
+	struct virtqueue *vq;
+
+	/* single vq */
+	vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
+				host_ack, "flush_queue");
+	if (IS_ERR(vq))
+		return PTR_ERR(vq);
+	spin_lock_init(&vpmem->pmem_lock);
+
+	return 0;
+}
+
+/* The request submission function */
+static int virtio_pmem_flush(struct device *dev)
+{
+	int err;
+	unsigned long flags;
+	struct scatterlist *sgs[2], sg, ret;
+	struct virtio_device *vdev = dev_to_virtio(dev->parent->parent);
+	struct virtio_pmem *vpmem = vdev->priv;
+	struct virtio_pmem_request *req = kmalloc(sizeof(*req), GFP_KERNEL);
+
+	req->done = false;
+	init_waitqueue_head(&req->acked);
+	spin_lock_irqsave(&vpmem->pmem_lock, flags);
+
+	sg_init_one(&sg, req, sizeof(*req));
+	sgs[0] = &sg;
+	sg_init_one(&ret, &req->ret, sizeof(req->ret));
+	sgs[1] = &ret;
+	err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
+	if (err) {
+		dev_err(&vdev->dev, "failed to send command to virtio pmem device\n");
+		spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
+		return -ENOSPC;
+	}
+	virtqueue_kick(vpmem->req_vq);
+	spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
+
+	/* When host has read buffer, this completes via host_ack */
+	wait_event(req->acked, req->done);
+	err = req->ret;
+	kfree(req);
+
+	return err;
+}
+
+static int virtio_pmem_probe(struct virtio_device *vdev)
+{
+	int err = 0;
+	struct resource res;
+	struct virtio_pmem *vpmem;
+	struct nvdimm_bus *nvdimm_bus;
+	struct nd_region_desc ndr_desc;
+	int nid = dev_to_node(&vdev->dev);
+	struct nd_region *nd_region;
+
+	if (!vdev->config->get) {
+		dev_err(&vdev->dev, "%s failure: config disabled\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
+			GFP_KERNEL);
+	if (!vpmem) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	vpmem->vdev = vdev;
+	err = init_vq(vpmem);
+	if (err)
+		goto out_err;
+
+	virtio_cread(vpmem->vdev, struct virtio_pmem_config,
+			start, &vpmem->start);
+	virtio_cread(vpmem->vdev, struct virtio_pmem_config,
+			size, &vpmem->size);
+
+	res.start = vpmem->start;
+	res.end = vpmem->start + vpmem->size - 1;
+	vpmem->nd_desc.provider_name = "virtio-pmem";
+	vpmem->nd_desc.module = THIS_MODULE;
+
+	vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
+						&vpmem->nd_desc);
+	if (!nvdimm_bus)
+		goto out_vq;
+
+	dev_set_drvdata(&vdev->dev, nvdimm_bus);
+	memset(&ndr_desc, 0, sizeof(ndr_desc));
+
+	ndr_desc.res = &res;
+	ndr_desc.numa_node = nid;
+	ndr_desc.flush = virtio_pmem_flush;
+	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+	nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc);
+
+	if (!nd_region)
+		goto out_nd;
+
+	virtio_device_ready(vdev);
+	return 0;
+out_nd:
+	nvdimm_bus_unregister(nvdimm_bus);
+out_vq:
+	err = -ENXIO;
+	vdev->config->del_vqs(vdev);
+out_err:
+	dev_err(&vdev->dev, "failed to register virtio pmem memory\n");
+	return err;
+}
+
+static void virtio_pmem_remove(struct virtio_device *vdev)
+{
+	struct nvdimm_bus *nvdimm_bus = dev_get_drvdata(&vdev->dev);
+
+	/* vpmem comes from devm_kzalloc(); devres frees it, so no kfree() */
+	nvdimm_bus_unregister(nvdimm_bus);
+	vdev->config->del_vqs(vdev);
+	vdev->priv = NULL;
+}
+
+#ifdef CONFIG_PM_SLEEP
+static int virtio_pmem_freeze(struct virtio_device *vdev)
+{
+	/* todo: handle freeze function */
+	return -EPERM;
+}
+
+static int virtio_pmem_restore(struct virtio_device *vdev)
+{
+	/* todo: handle restore function */
+	return -EPERM;
+}
+#endif
+
+
+static struct virtio_driver virtio_pmem_driver = {
+	.driver.name		= KBUILD_MODNAME,
+	.driver.owner		= THIS_MODULE,
+	.id_table		= id_table,
+	.probe			= virtio_pmem_probe,
+	.remove			= virtio_pmem_remove,
+#ifdef CONFIG_PM_SLEEP
+	.freeze			= virtio_pmem_freeze,
+	.restore		= virtio_pmem_restore,
+#endif
+};
+
+module_virtio_driver(virtio_pmem_driver);
+MODULE_DEVICE_TABLE(virtio, id_table);
+MODULE_DESCRIPTION("Virtio pmem driver");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/virtio_pmem.h b/include/linux/virtio_pmem.h
new file mode 100644
index 0000000..0f83d9c
--- /dev/null
+++ b/include/linux/virtio_pmem.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * virtio_pmem.h: virtio pmem Driver
+ *
+ * Discovers persistent memory range information
+ * from host and provides a virtio based flushing
+ * interface.
+ */
+#ifndef _LINUX_VIRTIO_PMEM_H
+#define _LINUX_VIRTIO_PMEM_H
+
+#include
+#include
+#include
+#include
+#include
+
+struct virtio_pmem_request {
+	/* Host return status corresponding to flush request */
+	int ret;
+
+	/* Wait queue to process deferred work after ack from host */
+	wait_queue_head_t acked;
+	bool done;
+};
+
+struct virtio_pmem {
+	struct virtio_device *vdev;
+
+	/* Virtio pmem request queue */
+	struct virtqueue *req_vq;
+
+	/* nvdimm bus registers virtio pmem device */
+	struct nvdimm_bus *nvdimm_bus;
+	struct nvdimm_bus_descriptor nd_desc;
+
+	/* Synchronize virtqueue data */
+	spinlock_t pmem_lock;
+
+	/* Memory region information */
+	uint64_t start;
+	uint64_t size;
+};
+#endif
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
index 6d5c3b2..3463895 100644
--- a/include/uapi/linux/virtio_ids.h
+++ b/include/uapi/linux/virtio_ids.h
@@ -43,5 +43,6 @@
 #define VIRTIO_ID_INPUT        18 /* virtio input */
 #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
 #define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
+#define VIRTIO_ID_PMEM         25 /* virtio pmem */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/uapi/linux/virtio_pmem.h b/include/uapi/linux/virtio_pmem.h
new file mode 100644
index 0000000..c7c22a5
--- /dev/null
+++ b/include/uapi/linux/virtio_pmem.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This header, excluding the #ifdef __KERNEL__ part, is BSD licensed so
+ * anyone can use the definitions to implement compatible drivers/servers:
+ *
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) Red Hat, Inc., 2018-2019
+ * Copyright (C) Pankaj Gupta , 2018
+ */
+#ifndef _UAPI_LINUX_VIRTIO_PMEM_H
+#define _UAPI_LINUX_VIRTIO_PMEM_H
+
+struct virtio_pmem_config {
+	__le64 start;
+	__le64 size;
+};
+#endif
-- 
2.9.3
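
Note on the config layout: struct virtio_pmem_config carries two
little-endian 64-bit fields, and the probe path turns them into an
inclusive resource range (res.end = start + size - 1). The following is
only a user-space sketch of that arithmetic, not driver code. The names
pmem_config_raw, le64_decode() and pmem_range() are hypothetical and do
not appear in the patch, and the empty-range/overflow check is an
assumption about what a defensive guest might want; the driver as posted
performs no such check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space mirror of struct virtio_pmem_config: two
 * little-endian 64-bit fields, kept here as raw bytes.
 */
struct pmem_config_raw {
	uint8_t start[8];
	uint8_t size[8];
};

/* Decode a little-endian 64-bit value regardless of host byte order. */
static uint64_t le64_decode(const uint8_t b[8])
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/*
 * Compute the inclusive range [*start, *end] described by the config.
 * Returns 0 on success, or -1 if the range is empty or would wrap
 * around the 64-bit address space (assumed policy, not in the patch).
 */
static int pmem_range(const struct pmem_config_raw *cfg,
		      uint64_t *start, uint64_t *end)
{
	uint64_t s, n;

	assert(cfg && start && end);

	s = le64_decode(cfg->start);
	n = le64_decode(cfg->size);
	if (n == 0 || s > UINT64_MAX - (n - 1))
		return -1;

	*start = s;
	*end = s + n - 1;
	return 0;
}
```

With a start of 4 GiB and a size of 1 GiB this yields the range
0x100000000..0x13fffffff, which matches what the res.start/res.end
assignment in virtio_pmem_probe() would produce for the same values.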