From nobody Sun Dec 14 06:19:16 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 5CB85C001DE
	for ; Fri, 4 Aug 2023 07:10:18 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234113AbjHDHKP (ORCPT ); Fri, 4 Aug 2023 03:10:15 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55692 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234066AbjHDHKE (ORCPT ); Fri, 4 Aug 2023 03:10:04 -0400
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182])
	by lindbergh.monkeyblade.net (Postfix) with ESMTP id 8C04230E1;
	Fri, 4 Aug 2023 00:10:00 -0700 (PDT)
Received: from linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	(linux.microsoft.com [13.77.154.182])
	by linux.microsoft.com (Postfix) with ESMTPSA id EB8DD207F5BC;
	Fri, 4 Aug 2023 00:09:59 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com EB8DD207F5BC
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com;
	s=default; t=1691133000;
	bh=tK3oD7odpa1w807rL/Yujw4yRVLIILGtNOd6SFw2Qyw=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=Gl9saf4Jcc1BAVdSPN5Z1rQ5GFyPvnLlbwHHJYlbJFFSj9aMjgeOn9BLoXoFdq9zR
	 3f++lEdc5veyqqqfh2IzxxtIhf7rtnTSGpjSHwPEbaU/YYlQZGZmvPwVSZdB11aXXA
	 sL4Gdp5iTqab8v+1/eAqxq9nesF95WiigW7Ermpg=
From: Saurabh Sengar
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, mikelley@microsoft.com, gregkh@linuxfoundation.org,
	corbet@lwn.net, linux-kernel@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH v4 1/3] uio: Add hv_vmbus_client driver
Date: Fri, 4 Aug 2023 00:09:54 -0700
Message-Id: <1691132996-11706-2-git-send-email-ssengar@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To:
 <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
References: <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a new UIO-based driver that generically supports low speed Hyper-V
VMBus devices. This driver can be bound to VMBus devices by user space
drivers that provide device-specific management. The new driver provides
the following core functionality, which is suitable for low speed devices:

* A single VMBus channel for each device
* Ability to specify the VMBus channel ring buffer size for each device
* Host notification via a hypercall instead of monitor bits

Signed-off-by: Saurabh Sengar
Reviewed-by: Michael Kelley
---
[V4]
- Added Reviewed-by

[V3]
- Removed ringbuffer sysfs entry and used uio framework for mmap
- Remove ".id_table = NULL"
- kasprintf -> devm_kasprintf
- Change global variable ring_size to per device
- More checks on value which can be set for ring_size
- Remove driverctl, and used echo command instead for driver documentation
- Remove unnecessary one time use macros
- Change kernel version and date for sysfs documentation
- Update documentation
- Better commit message

[V2]
- Update driver info in Documentation/driver-api/uio-howto.rst
- Update ring_size sysfs info in Documentation/ABI/stable/sysfs-bus-vmbus
- Remove DRIVER_VERSION
- Remove refcnt
- scnprintf -> sysfs_emit
- sysfs_create_file -> ATTRIBUTE_GROUPS + ".driver.groups";
- sysfs_create_bin_file -> device_create_bin_file
- dev_notice -> dev_err
- remove MODULE_VERSION

 Documentation/ABI/stable/sysfs-bus-vmbus |  10 ++
 Documentation/driver-api/uio-howto.rst   |  54 ++++++
 drivers/uio/Kconfig                      |  12 ++
 drivers/uio/Makefile                     |   1 +
 drivers/uio/uio_hv_vmbus_client.c        | 218 +++++++++++++++++++++++
 5 files changed, 295 insertions(+)
 create mode 100644 drivers/uio/uio_hv_vmbus_client.c

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 3066feae1d8d..7e77eda77be3 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -153,6 +153,16 @@ Contact:	Stephen Hemminger
 Description:	Binary file created by uio_hv_generic for ring buffer
 Users:		Userspace drivers
 
+What:		/sys/bus/vmbus/devices//ring_size
+Date:		September 2023
+KernelVersion:	6.6
+Contact:	Saurabh Sengar
+Description:	File created by uio_hv_vmbus_client for setting the device
+		ring buffer size. The value written to the file is the total
+		memory allocated for one complete ring buffer, including the
+		ring buffer header of size PAGE_SIZE.
+Users:		Userspace drivers
+
 What:		/sys/bus/vmbus/devices//channels//intr_in_full
 Date:		February 2019
 KernelVersion:	5.0
diff --git a/Documentation/driver-api/uio-howto.rst b/Documentation/driver-api/uio-howto.rst
index 907ffa3b38f5..625c2bda369f 100644
--- a/Documentation/driver-api/uio-howto.rst
+++ b/Documentation/driver-api/uio-howto.rst
@@ -722,6 +722,60 @@ For example::
 
     /sys/bus/vmbus/devices/3811fe4d-0fa0-4b62-981a-74fc1084c757/channels/21/ring
 
+Generic Hyper-V driver for low speed devices
+============================================
+
+The generic driver is a kernel module named uio_hv_vmbus_client. It
+supports low speed devices on the Hyper-V VMBus, much as uio_hv_generic
+does for faster devices. It also allows a customized ring buffer size
+for each device.
+
+Making the driver recognize the device
+--------------------------------------
+
+Since the driver does not declare any device GUIDs, it will not get
+loaded automatically and will not automatically bind to any devices. You
+must load it and assign an id to the driver yourself. For example, to use
+the fcopy device class GUID::
+
+    modprobe uio_hv_vmbus_client
+    echo "34d14be3-dee4-41c8-9ae7-6b174977c192" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/new_id
+
+If there already is a hardware specific kernel driver for the device,
+the generic driver still won't bind to it. In this case if you want to
+use the generic driver for a userspace library you'll have to manually unbind
+the hardware specific driver and bind the generic driver, using the device
+instance GUID like this::
+
+    echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/hv_utils/unbind
+    echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/bind
+
+You can verify that the device has been bound to the driver by looking
+for it in sysfs, for example like the following::
+
+    ls -l /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/driver
+
+Which if successful should print::
+
+    .../eb765408-105f-49b6-b4aa-c123b64d17d4/driver -> ../../../bus/vmbus/drivers/uio_hv_vmbus_client
+
+Things to know about uio_hv_vmbus_client
+----------------------------------------
+
+The uio_hv_vmbus_client driver maps the Hyper-V device ring buffer to userspace
+and offers an interface to manage it.
+
+The userspace API for mapping and performing read/write operations on the device
+ring buffer is implemented in tools/hv/vmbus_bufring.c. Userspace applications
+should use this file as a library and build their logic on top of it.
+
+Additionally, the uio_hv_vmbus_client driver offers the "ring_size" sysfs entry
+for setting the device ring buffer size before opening the device.
+
+For example::
+
+    /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring_size
+
 Further information
 ===================
 
diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
index 2e16c5338e5b..bd4d27ecfc9a 100644
--- a/drivers/uio/Kconfig
+++ b/drivers/uio/Kconfig
@@ -166,6 +166,18 @@ config UIO_HV_GENERIC
 
 	  If you compile this as a module, it will be called uio_hv_generic.
 
+config UIO_HV_SLOW_DEVICES
+	tristate "Generic driver for low speed VMBus devices"
+	depends on HYPERV
+	help
+	  Generic driver that you can dynamically bind to low speed Hyper-V
+	  VMBus devices to allow a user space driver to manage the device.
+	  The driver provides a single VMBus channel and uses a hypercall
+	  instead of monitor bits to interrupt the host. The driver provides
+	  a configurable per-device ring buffer size.
+
+	  If you compile this as a module, it will be called uio_hv_vmbus_client.
+
 config UIO_DFL
 	tristate "Generic driver for DFL (Device Feature List) bus"
 	depends on FPGA_DFL
diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
index f2f416a14228..44be0f96da34 100644
--- a/drivers/uio/Makefile
+++ b/drivers/uio/Makefile
@@ -11,4 +11,5 @@ obj-$(CONFIG_UIO_PRUSS)	+= uio_pruss.o
 obj-$(CONFIG_UIO_MF624)	+= uio_mf624.o
 obj-$(CONFIG_UIO_FSL_ELBC_GPCM)	+= uio_fsl_elbc_gpcm.o
 obj-$(CONFIG_UIO_HV_GENERIC)	+= uio_hv_generic.o
+obj-$(CONFIG_UIO_HV_SLOW_DEVICES)	+= uio_hv_vmbus_client.o
 obj-$(CONFIG_UIO_DFL)	+= uio_dfl.o
diff --git a/drivers/uio/uio_hv_vmbus_client.c b/drivers/uio/uio_hv_vmbus_client.c
new file mode 100644
index 000000000000..68e5aec92a13
--- /dev/null
+++ b/drivers/uio/uio_hv_vmbus_client.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * uio_hv_vmbus_client - UIO driver for low speed VMBus devices
+ *
+ * Copyright (c) 2023, Microsoft Corporation.
+ *
+ * Authors:
+ *   Saurabh Sengar
+ *
+ * Since the driver does not declare any device ids, userspace code must
+ * allocate an id and bind the device to the driver.
+ *
+ * For example, to associate the fcopy service with this driver:
+ * # echo "34d14be3-dee4-41c8-9ae7-6b174977c192" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/new_id
+ *
+ * If there already is a hardware specific kernel driver for the device,
+ * the generic driver still won't bind to it. In this case if you want to
+ * use the generic driver for a userspace library you'll have to manually unbind
+ * the hardware specific driver and bind the generic driver, using the device
+ * instance GUID like this:
+ * # echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/hv_utils/unbind
+ * # echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/bind
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+struct uio_hv_vmbus_dev {
+	struct uio_info info;
+	struct hv_device *device;
+	int ring_size;
+};
+
+/*
+ * This is the irqcontrol callback to be registered to uio_info.
+ * It can be used to disable/enable interrupt from user space processes.
+ *
+ * @param info
+ *  pointer to uio_info.
+ * @param irq_state
+ *  state value. 1 to enable interrupt.
+ */
+static int uio_hv_vmbus_irqcontrol(struct uio_info *info, s32 irq_state)
+{
+	struct uio_hv_vmbus_dev *pdata = info->priv;
+	struct hv_device *hv_dev = pdata->device;
+
+	/* Issue a full memory barrier before triggering the notification */
+	virt_mb();
+
+	if (irq_state == 1)
+		vmbus_setevent(hv_dev->channel);
+
+	return 0;
+}
+
+/*
+ * Callback from vmbus_event when something is in inbound ring.
+ */
+static void uio_hv_vmbus_channel_cb(void *context)
+{
+	struct uio_hv_vmbus_dev *pdata = context;
+
+	/* Issue a full memory barrier before sending the event to userspace */
+	virt_mb();
+
+	uio_event_notify(&pdata->info);
+}
+
+static int uio_hv_vmbus_open(struct uio_info *info, struct inode *inode)
+{
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+	struct hv_device *hv_dev = pdata->device;
+	struct vmbus_channel *channel = hv_dev->channel;
+	void *ring_buffer;
+	int ret;
+
+	ret = vmbus_open(channel, pdata->ring_size, pdata->ring_size, NULL, 0,
+			 uio_hv_vmbus_channel_cb, pdata);
+	if (ret) {
+		dev_err(&hv_dev->device, "error %d when opening the channel\n", ret);
+		return ret;
+	}
+	channel->inbound.ring_buffer->interrupt_mask = 0;
+	set_channel_read_mode(channel, HV_CALL_ISR);
+
+	/* set the mem pointer */
+	info->mem[0].name = "txrx_rings";
+	ring_buffer = page_address(channel->ringbuffer_page);
+	info->mem[0].addr = (uintptr_t)virt_to_phys(ring_buffer);
+	info->mem[0].size = channel->ringbuffer_pagecount << PAGE_SHIFT;
+	info->mem[0].memtype = UIO_MEM_IOVA;
+
+	return ret;
+}
+
+static int uio_hv_vmbus_release(struct uio_info *info, struct inode *inode)
+{
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+	struct hv_device *hv_dev = pdata->device;
+
+	vmbus_close(hv_dev->channel);
+
+	/* restore the mem pointer to its original state */
+	info->mem[0].name = NULL;
+	info->mem[0].addr = 0;
+	info->mem[0].size = 1;
+	info->mem[0].memtype = UIO_MEM_NONE;
+
+	return 0;
+}
+
+static ssize_t ring_size_show(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	struct uio_info *info = dev_get_drvdata(dev);
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+
+	return sysfs_emit(buf, "%d\n", pdata->ring_size);
+}
+
+static ssize_t ring_size_store(struct device *dev, struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	unsigned int val;
+	struct uio_info *info = dev_get_drvdata(dev);
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+
+	if (kstrtouint(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val < 2 * PAGE_SIZE || val % PAGE_SIZE)
+		return -EINVAL;
+
+	pdata->ring_size = val;
+
+	return count;
+}
+
+static DEVICE_ATTR_RW(ring_size);
+
+static struct attribute *uio_hv_vmbus_client_attrs[] = {
+	&dev_attr_ring_size.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(uio_hv_vmbus_client);
+
+static int uio_hv_vmbus_probe(struct hv_device *dev, const struct hv_vmbus_device_id *dev_id)
+{
+	struct uio_hv_vmbus_dev *pdata;
+	int ret;
+	char *name = NULL;
+
+	pdata = devm_kzalloc(&dev->device, sizeof(*pdata), GFP_KERNEL);
+	if (!pdata)
+		return -ENOMEM;
+
+	name = devm_kasprintf(&dev->device, GFP_KERNEL, "%pUl", &dev->dev_instance);
+
+	/* Fill general uio info */
+	pdata->info.name = name; /* /sys/class/uio/uioX/name */
+	pdata->info.version = "1";
+	pdata->info.irqcontrol = uio_hv_vmbus_irqcontrol;
+	pdata->info.open = uio_hv_vmbus_open;
+	pdata->info.release = uio_hv_vmbus_release;
+	pdata->info.irq = UIO_IRQ_CUSTOM;
+	pdata->info.priv = pdata;
+	pdata->ring_size = VMBUS_RING_SIZE(3 * HV_HYP_PAGE_SIZE); /* Default ringbuffer size */
+	pdata->device = dev;
+
+	/* dummy value to register the mem pointers which will be updated by open */
+	pdata->info.mem[0].size = 1;
+
+	ret = uio_register_device(&dev->device, &pdata->info);
+	if (ret) {
+		dev_err(&dev->device, "uio_hv_vmbus register failed\n");
+		return ret;
+	}
+
+	hv_set_drvdata(dev, pdata);
+
+	return 0;
+}
+
+static void uio_hv_vmbus_remove(struct hv_device *dev)
+{
+	struct uio_hv_vmbus_dev *pdata = hv_get_drvdata(dev);
+
+	if (pdata)
+		uio_unregister_device(&pdata->info);
+}
+
+static struct hv_driver uio_hv_vmbus_drv = {
+	.driver.dev_groups = uio_hv_vmbus_client_groups,
+	.name = "uio_hv_vmbus_client",
+	.probe = uio_hv_vmbus_probe,
+	.remove = uio_hv_vmbus_remove,
+};
+
+static int __init uio_hv_vmbus_init(void)
+{
+	return vmbus_driver_register(&uio_hv_vmbus_drv);
+}
+
+static void __exit uio_hv_vmbus_exit(void)
+{
+	vmbus_driver_unregister(&uio_hv_vmbus_drv);
+}
+
+module_init(uio_hv_vmbus_init);
+module_exit(uio_hv_vmbus_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Saurabh Sengar ");
+MODULE_DESCRIPTION("Generic UIO driver for low speed VMBus devices");
-- 
2.34.1

From nobody Sun Dec 14 06:19:16 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 0F948C001DE
	for ; Fri, 4 Aug 2023 07:10:23 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232416AbjHDHKV (ORCPT ); Fri, 4 Aug 2023 03:10:21 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55702 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234064AbjHDHKG (ORCPT ); Fri, 4 Aug 2023 03:10:06 -0400
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182])
	by lindbergh.monkeyblade.net (Postfix) with ESMTP id 8C15030E2;
	Fri, 4 Aug 2023 00:10:00 -0700 (PDT)
Received: from linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	(linux.microsoft.com [13.77.154.182])
	by linux.microsoft.com (Postfix) with ESMTPSA id 10448207F5BD;
	Fri, 4 Aug 2023 00:10:00 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 10448207F5BD
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com;
	s=default; t=1691133000;
	bh=PsT/PJNCVNatmEaspRFKPQJm8lT11iMc7zxGAUOy8tk=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=kEyng/7+LP6IiNPobb6bU0jptDS38cOBjShKBd8LpcYG+Yk05GWIfBNWEFUM2f6nO
	 8njt60TuTnUDh8vnU/f2G99B8opMqLHSfXcg5c1dXlDSBxnQOX08JDKH8pe/F0UAT3
	 Zg6z4zg+N13/sIs361l9vKZqNjnOmZp8kqXpRR7s=
From: Saurabh Sengar
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, mikelley@microsoft.com, gregkh@linuxfoundation.org,
	corbet@lwn.net, linux-kernel@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH v4 2/3] tools: hv: Add vmbus_bufring
Date: Fri, 4 Aug 2023 00:09:55 -0700
Message-Id: <1691132996-11706-3-git-send-email-ssengar@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
References: <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Provide a userspace interface for userspace drivers or applications to
read/write a VMBus ringbuffer. A significant part of this code is
borrowed from DPDK[1]. The library currently supports only the x86
architecture.

To build this library:
make -C tools/hv libvmbus_bufring.a

Applications using this library can include the vmbus_bufring.h header
file and link libvmbus_bufring.a statically.
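As a rough illustration of the ring bookkeeping an application gets from this library, the sketch below re-declares minimal stand-ins for vmbus_br_setup() and the index helpers from this patch and runs them on an ordinary heap buffer instead of a mapped UIO region. The demo_* names, the four-page ring size, and the layout of one TX ring followed by one RX ring in a single region are assumptions made for illustration only; a real application would include vmbus_bufring.h and link libvmbus_bufring.a rather than copy these helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for the start of struct vmbus_bufring: only the two indices. */
struct demo_bufring {
	volatile uint32_t windex;	/* next write offset into the data area */
	volatile uint32_t rindex;	/* next read offset into the data area */
};

/* Stand-in for struct vmbus_br. */
struct demo_br {
	struct demo_bufring *vbr;
	uint32_t dsize;			/* data bytes after the header page */
	uint32_t windex;
};

/* Mirrors vmbus_br_setup(): the first page is header, the rest is data. */
static void demo_br_setup(struct demo_br *br, void *buf, unsigned int blen)
{
	br->vbr = buf;
	br->windex = br->vbr->windex;
	br->dsize = blen - (unsigned int)sysconf(_SC_PAGESIZE);
}

/* Mirrors vmbus_br_availwrite(). */
static uint32_t demo_br_availwrite(const struct demo_br *br, uint32_t windex)
{
	uint32_t rindex = br->vbr->rindex;

	if (windex >= rindex)
		return br->dsize - (windex - rindex);
	return rindex - windex;
}

/* Mirrors vmbus_br_availread(). */
static uint32_t demo_br_availread(const struct demo_br *br)
{
	return br->dsize - demo_br_availwrite(br, br->vbr->windex);
}

/*
 * Hypothetical helper: split one contiguous region of 2 * ring_size bytes
 * into a TX ring followed by an RX ring (assumed layout).
 */
static void demo_map_rings(void *base, unsigned int ring_size,
			   struct demo_br *tx, struct demo_br *rx)
{
	demo_br_setup(tx, base, ring_size);
	demo_br_setup(rx, (uint8_t *)base + ring_size, ring_size);
}

/* Returns 0 if a freshly zeroed pair of rings looks empty, -1 otherwise. */
static int demo_check_empty_rings(void)
{
	unsigned int page = (unsigned int)sysconf(_SC_PAGESIZE);
	unsigned int ring_size = 4 * page;	/* assumed per-ring size */
	struct demo_br tx, rx;
	void *base = calloc(2, ring_size);
	int ok;

	if (!base)
		return -1;

	demo_map_rings(base, ring_size, &tx, &rx);
	ok = tx.dsize == ring_size - page &&
	     demo_br_availwrite(&tx, tx.windex) == tx.dsize &&
	     demo_br_availread(&rx) == 0;

	free(base);
	return ok ? 0 : -1;
}
```

In a real program the buffer would presumably come from mmap() on the UIO device node rather than calloc(), and the ring size would match the value configured through the ring_size sysfs entry.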
[1] https://github.com/DPDK/dpdk/

Signed-off-by: Mary Hardy
Signed-off-by: Saurabh Sengar
---
[V4]
- Modify comment to remove RingDataStartOffset mention
- Change downward ALIGN macro to upward ALIGN

[V3]
- Made ring buffer data offset depend on page size
- remove rte_smp_rwmb macro and reused rte_compiler_barrier instead
- Added legal counsel sign-off
- Removed "Link:" tag
- Improve commit messages
- new library compilation dependent on x86
- simplify mmap

[V2]
- simpler sysfs path, less parsing

 tools/hv/Build           |   1 +
 tools/hv/Makefile        |  13 +-
 tools/hv/vmbus_bufring.c | 297 +++++++++++++++++++++++++++++++++++++++
 tools/hv/vmbus_bufring.h | 154 ++++++++++++++++++++
 4 files changed, 464 insertions(+), 1 deletion(-)
 create mode 100644 tools/hv/vmbus_bufring.c
 create mode 100644 tools/hv/vmbus_bufring.h

diff --git a/tools/hv/Build b/tools/hv/Build
index 6cf51fa4b306..2a667d3d94cb 100644
--- a/tools/hv/Build
+++ b/tools/hv/Build
@@ -1,3 +1,4 @@
 hv_kvp_daemon-y += hv_kvp_daemon.o
 hv_vss_daemon-y += hv_vss_daemon.o
 hv_fcopy_daemon-y += hv_fcopy_daemon.o
+vmbus_bufring-y += vmbus_bufring.o
diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index fe770e679ae8..33cf488fd20f 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -11,14 +11,19 @@ srctree := $(patsubst %/,%,$(dir $(CURDIR)))
 srctree := $(patsubst %/,%,$(dir $(srctree)))
 endif
 
+include $(srctree)/tools/scripts/Makefile.arch
+
 # Do not use make's built-in rules
 # (this improves performance and avoids hard-to-debug behaviour);
 MAKEFLAGS += -r
 
 override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
 
+ifeq ($(SRCARCH),x86)
+ALL_LIBS := libvmbus_bufring.a
+endif
 ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
-ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
+ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) $(patsubst %,$(OUTPUT)%,$(ALL_LIBS))
 
 ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
 
@@ -27,6 +32,12 @@ all: $(ALL_PROGRAMS)
 export srctree OUTPUT CC LD CFLAGS
 include $(srctree)/tools/build/Makefile.include
 
+HV_VMBUS_BUFRING_IN := $(OUTPUT)vmbus_bufring.o
+$(HV_VMBUS_BUFRING_IN): FORCE
+	$(Q)$(MAKE) $(build)=vmbus_bufring
+$(OUTPUT)libvmbus_bufring.a : vmbus_bufring.o
+	$(AR) rcs $@ $^
+
 HV_KVP_DAEMON_IN := $(OUTPUT)hv_kvp_daemon-in.o
 $(HV_KVP_DAEMON_IN): FORCE
 	$(Q)$(MAKE) $(build)=hv_kvp_daemon
diff --git a/tools/hv/vmbus_bufring.c b/tools/hv/vmbus_bufring.c
new file mode 100644
index 000000000000..0a0686751b79
--- /dev/null
+++ b/tools/hv/vmbus_bufring.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2009-2012,2016,2023 Microsoft Corp.
+ * Copyright (c) 2012 NetApp Inc.
+ * Copyright (c) 2012 Citrix Inc.
+ * All rights reserved.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "vmbus_bufring.h"
+
+#define rte_compiler_barrier()		({ asm volatile ("" : : : "memory"); })
+#define RINGDATA_START_OFFSET		(getpagesize())
+#define VMBUS_RQST_ERROR		0xFFFFFFFFFFFFFFFF
+#define ALIGN(val, align)		(((val) + ((align) - 1)) & ~((align) - 1))
+
+/* Increase bufring index by inc with wraparound */
+static inline uint32_t vmbus_br_idxinc(uint32_t idx, uint32_t inc, uint32_t sz)
+{
+	idx += inc;
+	if (idx >= sz)
+		idx -= sz;
+
+	return idx;
+}
+
+void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen)
+{
+	br->vbr = buf;
+	br->windex = br->vbr->windex;
+	br->dsize = blen - RINGDATA_START_OFFSET;
+}
+
+static inline __always_inline void
+rte_smp_mb(void)
+{
+	asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+}
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile("lock ; "
+		     "cmpxchgl %[src], %[dst];"
+		     "sete %[res];"
+		     : [res] "=a" (res),	/* output */
+		       [dst] "=m" (*dst)
+		     : [src] "r" (src),		/* input */
+		       "a" (exp),
+		       "m" (*dst)
+		     : "memory");		/* clobber list */
+	return res;
+}
+
+static inline uint32_t
+vmbus_txbr_copyto(const struct vmbus_br *tbr, uint32_t windex,
+		  const void *src0, uint32_t cplen)
+{
+	uint8_t *br_data = (uint8_t *)tbr->vbr + RINGDATA_START_OFFSET;
+	uint32_t br_dsize = tbr->dsize;
+	const uint8_t *src = src0;
+
+	if (cplen > br_dsize - windex) {
+		uint32_t fraglen = br_dsize - windex;
+
+		/* Wrap-around detected */
+		memcpy(br_data + windex, src, fraglen);
+		memcpy(br_data, src + fraglen, cplen - fraglen);
+	} else {
+		memcpy(br_data + windex, src, cplen);
+	}
+
+	return vmbus_br_idxinc(windex, cplen, br_dsize);
+}
+
+/*
+ * Write scattered channel packet to TX bufring.
+ *
+ * The offset of this channel packet is written as a 64bits value
+ * immediately after this channel packet.
+ *
+ * The write goes through three stages:
+ *  1. Reserve space in ring buffer for the new data.
+ *     Writer atomically moves priv_write_index.
+ *  2. Copy the new data into the ring.
+ *  3. Update the tail of the ring (visible to host) that indicates
+ *     next read location. Writer updates write_index
+ */
+static int
+vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
+		 bool *need_sig)
+{
+	struct vmbus_bufring *vbr = tbr->vbr;
+	uint32_t ring_size = tbr->dsize;
+	uint32_t old_windex, next_windex, windex, total;
+	uint64_t save_windex;
+	int i;
+
+	total = 0;
+	for (i = 0; i < iovlen; i++)
+		total += iov[i].iov_len;
+	total += sizeof(save_windex);
+
+	/* Reserve space in ring */
+	do {
+		uint32_t avail;
+
+		/* Get current free location */
+		old_windex = tbr->windex;
+
+		/* Prevent compiler reordering this with calculation */
+		rte_compiler_barrier();
+
+		avail = vmbus_br_availwrite(tbr, old_windex);
+
+		/* If not enough space in ring, then tell caller. */
+		if (avail <= total)
+			return -EAGAIN;
+
+		next_windex = vmbus_br_idxinc(old_windex, total, ring_size);
+
+		/* Atomic update of next write_index for other threads */
+	} while (!rte_atomic32_cmpset(&tbr->windex, old_windex, next_windex));
+
+	/* Space from old..new is now reserved */
+	windex = old_windex;
+	for (i = 0; i < iovlen; i++)
+		windex = vmbus_txbr_copyto(tbr, windex, iov[i].iov_base, iov[i].iov_len);
+
+	/* Set the offset of the current channel packet. */
+	save_windex = ((uint64_t)old_windex) << 32;
+	windex = vmbus_txbr_copyto(tbr, windex, &save_windex,
+				   sizeof(save_windex));
+
+	/* The region reserved should match region used */
+	if (windex != next_windex)
+		return -EINVAL;
+
+	/* Ensure that data is available before updating host index */
+	rte_compiler_barrier();
+
+	/* Checkin for our reservation. wait for our turn to update host */
+	while (!rte_atomic32_cmpset(&vbr->windex, old_windex, next_windex))
+		_mm_pause();
+
+	return 0;
+}
+
+int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data,
+			uint32_t dlen, uint32_t flags)
+{
+	struct vmbus_chanpkt pkt;
+	unsigned int pktlen, pad_pktlen;
+	const uint32_t hlen = sizeof(pkt);
+	bool send_evt = false;
+	uint64_t pad = 0;
+	struct iovec iov[3];
+	int error;
+
+	pktlen = hlen + dlen;
+	pad_pktlen = ALIGN(pktlen, sizeof(uint64_t));
+
+	pkt.hdr.type = type;
+	pkt.hdr.flags = flags;
+	pkt.hdr.hlen = hlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+	pkt.hdr.tlen = pad_pktlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+	pkt.hdr.xactid = VMBUS_RQST_ERROR; /* doesn't support multiple requests at same time */
+
+	iov[0].iov_base = &pkt;
+	iov[0].iov_len = hlen;
+	iov[1].iov_base = data;
+	iov[1].iov_len = dlen;
+	iov[2].iov_base = &pad;
+	iov[2].iov_len = pad_pktlen - pktlen;
+
+	error = vmbus_txbr_write(txbr, iov, 3, &send_evt);
+
+	return error;
+}
+
+static inline uint32_t
+vmbus_rxbr_copyfrom(const struct vmbus_br *rbr, uint32_t rindex,
+		    void *dst0, size_t cplen)
+{
+	const uint8_t *br_data = (uint8_t *)rbr->vbr + RINGDATA_START_OFFSET;
+	uint32_t br_dsize = rbr->dsize;
+	uint8_t *dst = dst0;
+
+	if (cplen > br_dsize - rindex) {
+		uint32_t fraglen = br_dsize - rindex;
+
+		/* Wrap-around detected. */
+		memcpy(dst, br_data + rindex, fraglen);
+		memcpy(dst + fraglen, br_data, cplen - fraglen);
+	} else {
+		memcpy(dst, br_data + rindex, cplen);
+	}
+
+	return vmbus_br_idxinc(rindex, cplen, br_dsize);
+}
+
+/* Copy data from receive ring but don't change index */
+static int
+vmbus_rxbr_peek(const struct vmbus_br *rbr, void *data, size_t dlen)
+{
+	uint32_t avail;
+
+	/*
+	 * The requested data and the 64bits channel packet
+	 * offset should be there at least.
+	 */
+	avail = vmbus_br_availread(rbr);
+	if (avail < dlen + sizeof(uint64_t))
+		return -EAGAIN;
+
+	vmbus_rxbr_copyfrom(rbr, rbr->vbr->rindex, data, dlen);
+	return 0;
+}
+
+/*
+ * Copy data from receive ring and change index
+ * NOTE:
+ * We assume (dlen + skip) == sizeof(channel packet).
+ */
+static int
+vmbus_rxbr_read(struct vmbus_br *rbr, void *data, size_t dlen, size_t skip)
+{
+	struct vmbus_bufring *vbr = rbr->vbr;
+	uint32_t br_dsize = rbr->dsize;
+	uint32_t rindex;
+
+	if (vmbus_br_availread(rbr) < dlen + skip + sizeof(uint64_t))
+		return -EAGAIN;
+
+	/* Record where host was when we started read (for debug) */
+	rbr->windex = rbr->vbr->windex;
+
+	/*
+	 * Copy channel packet from RX bufring.
+	 */
+	rindex = vmbus_br_idxinc(rbr->vbr->rindex, skip, br_dsize);
+	rindex = vmbus_rxbr_copyfrom(rbr, rindex, data, dlen);
+
+	/*
+	 * Discard this channel packet's 64bits offset, which is useless to us.
+	 */
+	rindex = vmbus_br_idxinc(rindex, sizeof(uint64_t), br_dsize);
+
+	/* Update the read index _after_ the channel packet is fetched. */
+	rte_compiler_barrier();
+
+	vbr->rindex = rindex;
+
+	return 0;
+}
+
+int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr,
+			    void *data, uint32_t *len)
+{
+	struct vmbus_chanpkt_hdr pkt;
+	uint32_t dlen, bufferlen = *len;
+	int error;
+
+	error = vmbus_rxbr_peek(rxbr, &pkt, sizeof(pkt));
+	if (error)
+		return error;
+
+	if (unlikely(pkt.hlen < VMBUS_CHANPKT_HLEN_MIN))
+		/* XXX this channel is dead actually. */
+		return -EIO;
+
+	if (unlikely(pkt.hlen > pkt.tlen))
+		return -EIO;
+
+	/* Lengths are in quad words */
+	dlen = pkt.tlen << VMBUS_CHANPKT_SIZE_SHIFT;
+	*len = dlen;
+
+	/* If caller buffer is not large enough */
+	if (unlikely(dlen > bufferlen))
+		return -ENOBUFS;
+
+	/* Read data and skip packet header */
+	error = vmbus_rxbr_read(rxbr, data, dlen, 0);
+	if (error)
+		return error;
+
+	/* Return the number of bytes read */
+	return dlen + sizeof(uint64_t);
+}
diff --git a/tools/hv/vmbus_bufring.h b/tools/hv/vmbus_bufring.h
new file mode 100644
index 000000000000..5590df90e971
--- /dev/null
+++ b/tools/hv/vmbus_bufring.h
@@ -0,0 +1,154 @@
+/* SPDX-License-Identifier: BSD-3-Clause */
+
+#ifndef _VMBUS_BUF_H_
+#define _VMBUS_BUF_H_
+
+#include
+#include
+
+#define __packed   __attribute__((__packed__))
+#define unlikely(x)	__builtin_expect(!!(x), 0)
+
+#define ICMSGHDRFLAG_TRANSACTION	1
+#define ICMSGHDRFLAG_REQUEST		2
+#define ICMSGHDRFLAG_RESPONSE		4
+
+#define IC_VERSION_NEGOTIATION_MAX_VER_COUNT 100
+#define ICMSG_HDR (sizeof(struct vmbuspipe_hdr) + sizeof(struct icmsg_hdr))
+#define ICMSG_NEGOTIATE_PKT_SIZE(icframe_vercnt, icmsg_vercnt) \
+	(ICMSG_HDR + sizeof(struct icmsg_negotiate) + \
+	 (((icframe_vercnt) + (icmsg_vercnt)) * sizeof(struct ic_version)))
+
+/*
+ * Channel packets
+ */
+
+/* Channel packet flags */
+#define VMBUS_CHANPKT_TYPE_INBAND	0x0006
+#define VMBUS_CHANPKT_TYPE_RXBUF	0x0007
+#define VMBUS_CHANPKT_TYPE_GPA		0x0009
+#define VMBUS_CHANPKT_TYPE_COMP		0x000b
+
+#define VMBUS_CHANPKT_FLAG_NONE		0
+#define VMBUS_CHANPKT_FLAG_RC		0x0001	/* report completion */
+
+#define VMBUS_CHANPKT_SIZE_SHIFT	3
+#define VMBUS_CHANPKT_SIZE_ALIGN	BIT(VMBUS_CHANPKT_SIZE_SHIFT)
+#define VMBUS_CHANPKT_HLEN_MIN		\
+	(sizeof(struct vmbus_chanpkt_hdr) >> VMBUS_CHANPKT_SIZE_SHIFT)
+
+/*
+ * Buffer ring
+ */
+struct vmbus_bufring {
+	volatile uint32_t windex;
+	volatile uint32_t rindex;
+
+	/*
+	 * Interrupt mask {0,1}
+	 *
+	 * For TX bufring, the host sets this to 1 while it is processing
+	 * the TX bufring, so that we can safely skip the TX event
+	 * notification to host.
+	 *
+	 * For RX bufring, once this is set to 1 by us, the host will not
+	 * dispatch further interrupts to us, even if there is data
+	 * pending on the RX bufring. This effectively disables the
+	 * interrupt of the channel to which this RX bufring is attached.
+	 */
+	volatile uint32_t imask;
+
+	/*
+	 * Win8 uses some of the reserved bits to implement
+	 * interrupt driven flow management. On the send side
+	 * we can request that the receiver interrupt the sender
+	 * when the ring transitions from being full to being able
+	 * to handle a message of size "pending_send_sz".
+	 *
+	 * Add necessary state for this enhancement.
+	 */
+	volatile uint32_t pending_send;
+	uint32_t reserved1[12];
+
+	union {
+		struct {
+			uint32_t feat_pending_send_sz:1;
+		};
+		uint32_t value;
+	} feature_bits;
+
+	/*
+	 * Ring data starts after PAGE_SIZE offset (RINGDATA_START_OFFSET).
+	 * !!! DO NOT place any fields below this !!!
+	 */
+	uint8_t data[];
+} __packed;
+
+struct vmbus_br {
+	struct vmbus_bufring *vbr;
+	uint32_t	dsize;
+	uint32_t	windex; /* next available location */
+};
+
+struct vmbus_chanpkt_hdr {
+	uint16_t	type;	/* VMBUS_CHANPKT_TYPE_ */
+	uint16_t	hlen;	/* header len, in 8 bytes */
+	uint16_t	tlen;	/* total len, in 8 bytes */
+	uint16_t	flags;	/* VMBUS_CHANPKT_FLAG_ */
+	uint64_t	xactid;
+} __packed;
+
+struct vmbus_chanpkt {
+	struct vmbus_chanpkt_hdr hdr;
+} __packed;
+
+struct vmbuspipe_hdr {
+	unsigned int flags;
+	unsigned int msgsize;
+} __packed;
+
+struct ic_version {
+	unsigned short major;
+	unsigned short minor;
+} __packed;
+
+struct icmsg_negotiate {
+	unsigned short icframe_vercnt;
+	unsigned short icmsg_vercnt;
+	unsigned int reserved;
+	struct ic_version icversion_data[]; /* any size array */
+} __packed;
+
+struct icmsg_hdr {
+	struct ic_version icverframe;
+	unsigned short icmsgtype;
+	struct ic_version icvermsg;
+	unsigned short icmsgsize;
+	unsigned int status;
+	unsigned char ictransaction_id;
+	unsigned char icflags;
+	unsigned char reserved[2];
+} __packed;
+
+int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr, void *data, uint32_t *len);
+int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data,
+			uint32_t dlen, uint32_t flags);
+void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen);
+
+/* Amount of space available for write */
+static inline uint32_t vmbus_br_availwrite(const struct vmbus_br *br, uint32_t windex)
+{
+	uint32_t rindex = br->vbr->rindex;
+
+	if (windex >= rindex)
+		return br->dsize - (windex - rindex);
+	else
+		return rindex - windex;
+}
+
+static inline uint32_t vmbus_br_availread(const struct vmbus_br *br)
+{
+	return br->dsize - vmbus_br_availwrite(br, br->vbr->windex);
+}
+
+#endif	/* !_VMBUS_BUF_H_ */
-- 
2.34.1

From nobody Sun Dec 14 06:19:16 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 63FA4C001DE
	for ; Fri, 4 Aug 2023 07:10:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234122AbjHDHK1 (ORCPT ); Fri, 4 Aug 2023 03:10:27 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55714 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234091AbjHDHKH (ORCPT ); Fri, 4 Aug 2023 03:10:07 -0400
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182])
	by lindbergh.monkeyblade.net (Postfix) with ESMTP id 9D52F30C4;
	Fri, 4 Aug 2023 00:10:00 -0700 (PDT)
Received: from linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	(linux.microsoft.com [13.77.154.182])
	by linux.microsoft.com (Postfix) with ESMTPSA id 28C88207F5BE;
	Fri, 4 Aug 2023 00:10:00 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 28C88207F5BE
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com;
	s=default; t=1691133000;
	bh=eXrRvT0/8si3mNUBNc/nZ2OsNSNamh0jMMAVHtD6iWY=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=LJwvTBlLq6A4PbXjtmGTk0qppNQjq1Gv5oZdCWzYcrWWNq72Ql81zGRWEUwUcmRPW
	 ICb4pAbHtnjO5g8LRxA7FAUs7r3xQ/rVyPYJ9kuy3h6jIX846S2Zj9g060pKlvS8fr
	 UaRRIg18rbSu0ei8wBKMY0PrgE2Xij62mPP7tkaM=
From: Saurabh Sengar
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, mikelley@microsoft.com, gregkh@linuxfoundation.org,
	corbet@lwn.net, linux-kernel@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH v4 3/3] tools: hv: Add new fcopy application based on uio driver
Date: Fri, 4 Aug 2023 00:09:56 -0700
Message-Id: <1691132996-11706-4-git-send-email-ssengar@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
References: <1691132996-11706-1-git-send-email-ssengar@linux.microsoft.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Implement the file copy service for Linux guests on Hyper-V. This
permits the host to copy a file (over VMBus) into the guest. This
facility is part of "guest integration services" supported on the
Hyper-V platform.

Here is a link that provides additional details on this functionality:

http://technet.microsoft.com/en-us/library/dn464282.aspx

This new fcopy application uses uio_hv_vmbus_client driver which
makes the earlier hv_util based driver and application obsolete.

Signed-off-by: Saurabh Sengar
---
[V4]
- Add error check for setting ring_size value in sysfs entry
- Add error handling in fcopy_get_instance_id for instance id not found case

[V3]
- Improve cover commit messages
- Improve debug prints
- Instead of hardcoded instance id, query from class id sysfs
- Set the ring_size value from application
- Update the application to mmap /dev/uio instead of sysfs
- new application compilation dependent on x86

[V2]
- simpler sysfs path

 tools/hv/Build                 |   1 +
 tools/hv/Makefile              |  10 +-
 tools/hv/hv_fcopy_uio_daemon.c | 587 +++++++++++++++++++++++++++++++++
 3 files changed, 597 insertions(+), 1 deletion(-)
 create mode 100644 tools/hv/hv_fcopy_uio_daemon.c

diff --git a/tools/hv/Build b/tools/hv/Build
index 2a667d3d94cb..efcbb74a0d23 100644
--- a/tools/hv/Build
+++ b/tools/hv/Build
@@ -2,3 +2,4 @@ hv_kvp_daemon-y += hv_kvp_daemon.o
 hv_vss_daemon-y += hv_vss_daemon.o
 hv_fcopy_daemon-y += hv_fcopy_daemon.o
 vmbus_bufring-y += vmbus_bufring.o
+hv_fcopy_uio_daemon-y += hv_fcopy_uio_daemon.o
diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index 33cf488fd20f..678c6c450a53 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -21,8 +21,10 @@ override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
 
 ifeq ($(SRCARCH),x86)
 ALL_LIBS := libvmbus_bufring.a
-endif
+ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon hv_fcopy_uio_daemon
+else
 ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
+endif
 ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) $(patsubst %,$(OUTPUT)%,$(ALL_LIBS))
 
 ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
@@ -56,6 +58,12 @@ $(HV_FCOPY_DAEMON_IN): FORCE
 $(OUTPUT)hv_fcopy_daemon: $(HV_FCOPY_DAEMON_IN)
 	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
 
+HV_FCOPY_UIO_DAEMON_IN := $(OUTPUT)hv_fcopy_uio_daemon-in.o
+$(HV_FCOPY_UIO_DAEMON_IN): FORCE
+	$(Q)$(MAKE) $(build)=hv_fcopy_uio_daemon
+$(OUTPUT)hv_fcopy_uio_daemon: $(HV_FCOPY_UIO_DAEMON_IN) libvmbus_bufring.a
+	$(QUIET_LINK)$(CC) -lm $< -L. -lvmbus_bufring -o $@
+
 clean:
 	rm -f $(ALL_PROGRAMS)
 	find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.d' -delete
diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c
new file mode 100644
index 000000000000..b35737082c91
--- /dev/null
+++ b/tools/hv/hv_fcopy_uio_daemon.c
@@ -0,0 +1,587 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * An implementation of host to guest copy functionality for Linux.
+ *
+ * Copyright (C) 2023, Microsoft, Inc.
+ *
+ * Author : K. Y. Srinivasan
+ * Author : Saurabh Sengar
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "vmbus_bufring.h"
+
+#define ICMSGTYPE_NEGOTIATE 0
+#define ICMSGTYPE_FCOPY 7
+
+#define WIN8_SRV_MAJOR 1
+#define WIN8_SRV_MINOR 1
+#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
+
+#define MAX_PATH_LEN 300
+#define MAX_LINE_LEN 40
+#define DEVICES_SYSFS "/sys/bus/vmbus/devices"
+#define FCOPY_CLASS_ID "34d14be3-dee4-41c8-9ae7-6b174977c192"
+
+#define FCOPY_VER_COUNT 1
+static const int fcopy_versions[] = {
+	WIN8_SRV_VERSION
+};
+
+#define FW_VER_COUNT 1
+static const int fw_versions[] = {
+	UTIL_FW_VERSION
+};
+
+#define HV_RING_SIZE (4 * 4096)
+
+unsigned char desc[HV_RING_SIZE];
+
+static int target_fd;
+static char target_fname[PATH_MAX];
+static unsigned long long filesize;
+
+static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
+{
+	int error = HV_E_FAIL;
+	char *q, *p;
+
+	filesize = 0;
+	p = (char *)path_name;
+	snprintf(target_fname, sizeof(target_fname), "%s/%s",
+		 (char *)path_name, (char *)file_name);
+
+	/*
+	 * Check to see if the path is already in place; if not,
+	 * create if required.
+	 */
+	while ((q = strchr(p, '/')) != NULL) {
+		if (q == p) {
+			p++;
+			continue;
+		}
+		*q = '\0';
+		if (access(path_name, F_OK)) {
+			if (flags & CREATE_PATH) {
+				if (mkdir(path_name, 0755)) {
+					syslog(LOG_ERR, "Failed to create %s",
+					       path_name);
+					goto done;
+				}
+			} else {
+				syslog(LOG_ERR, "Invalid path: %s", path_name);
+				goto done;
+			}
+		}
+		p = q + 1;
+		*q = '/';
+	}
+
+	if (!access(target_fname, F_OK)) {
+		syslog(LOG_INFO, "File: %s exists", target_fname);
+		if (!(flags & OVER_WRITE)) {
+			error = HV_ERROR_ALREADY_EXISTS;
+			goto done;
+		}
+	}
+
+	target_fd = open(target_fname,
+			 O_RDWR | O_CREAT | O_TRUNC | O_CLOEXEC, 0744);
+	if (target_fd == -1) {
+		syslog(LOG_INFO, "Open Failed: %s", strerror(errno));
+		goto done;
+	}
+
+	error = 0;
+done:
+	if (error)
+		target_fname[0] = '\0';
+	return error;
+}
+
+static int hv_copy_data(struct hv_do_fcopy *cpmsg)
+{
+	ssize_t bytes_written;
+	int ret = 0;
+
+	bytes_written = pwrite(target_fd, cpmsg->data, cpmsg->size,
+			       cpmsg->offset);
+
+	filesize += cpmsg->size;
+	if (bytes_written != cpmsg->size) {
+		switch (errno) {
+		case ENOSPC:
+			ret = HV_ERROR_DISK_FULL;
+			break;
+		default:
+			ret = HV_E_FAIL;
+			break;
+		}
+		syslog(LOG_ERR, "pwrite failed to write %llu bytes: %ld (%s)",
+		       filesize, (long)bytes_written, strerror(errno));
+	}
+
+	return ret;
+}
+
+/*
+ * Reset target_fname to "" in the two below functions for hibernation: if
+ * the fcopy operation is aborted by hibernation, the daemon should remove the
+ * partially-copied file; to achieve this, the hv_utils driver always fakes a
+ * CANCEL_FCOPY message upon suspend, and later when the VM resumes back,
+ * the daemon calls hv_copy_cancel() to remove the file; if a file is copied
+ * successfully before suspend, hv_copy_finished() must reset target_fname to
+ * avoid that the file can be incorrectly removed upon resume, since the faked
+ * CANCEL_FCOPY message is spurious in this case.
+ */
+static int hv_copy_finished(void)
+{
+	close(target_fd);
+	target_fname[0] = '\0';
+	return 0;
+}
+
+static void print_usage(char *argv[])
+{
+	fprintf(stderr, "Usage: %s [options]\n"
+		"Options are:\n"
+		" -n, --no-daemon stay in foreground, don't daemonize\n"
+		" -h, --help print this help\n", argv[0]);
+}
+
+static bool vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp, unsigned char *buf,
+				      unsigned int buflen, const int *fw_version, int fw_vercnt,
+				      const int *srv_version, int srv_vercnt,
+				      int *nego_fw_version, int *nego_srv_version)
+{
+	int icframe_major, icframe_minor;
+	int icmsg_major, icmsg_minor;
+	int fw_major, fw_minor;
+	int srv_major, srv_minor;
+	int i, j;
+	bool found_match = false;
+	struct icmsg_negotiate *negop;
+
+	/* Check that there's enough space for icframe_vercnt, icmsg_vercnt */
+	if (buflen < ICMSG_HDR + offsetof(struct icmsg_negotiate, reserved)) {
+		syslog(LOG_ERR, "Invalid icmsg negotiate");
+		return false;
+	}
+
+	icmsghdrp->icmsgsize = 0x10;
+	negop = (struct icmsg_negotiate *)&buf[ICMSG_HDR];
+
+	icframe_major = negop->icframe_vercnt;
+	icframe_minor = 0;
+
+	icmsg_major = negop->icmsg_vercnt;
+	icmsg_minor = 0;
+
+	/* Validate negop packet */
+	if (icframe_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT ||
+	    icmsg_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT ||
+	    ICMSG_NEGOTIATE_PKT_SIZE(icframe_major, icmsg_major) > buflen) {
+		syslog(LOG_ERR, "Invalid icmsg negotiate - icframe_major: %u, icmsg_major: %u\n",
+		       icframe_major, icmsg_major);
+		goto fw_error;
+	}
+
+	/*
+	 * Select the framework version number we will
+	 * support.
+	 */
+
+	for (i = 0; i < fw_vercnt; i++) {
+		fw_major = (fw_version[i] >> 16);
+		fw_minor = (fw_version[i] & 0xFFFF);
+
+		for (j = 0; j < negop->icframe_vercnt; j++) {
+			if (negop->icversion_data[j].major == fw_major &&
+			    negop->icversion_data[j].minor == fw_minor) {
+				icframe_major = negop->icversion_data[j].major;
+				icframe_minor = negop->icversion_data[j].minor;
+				found_match = true;
+				break;
+			}
+		}
+
+		if (found_match)
+			break;
+	}
+
+	if (!found_match)
+		goto fw_error;
+
+	found_match = false;
+
+	for (i = 0; i < srv_vercnt; i++) {
+		srv_major = (srv_version[i] >> 16);
+		srv_minor = (srv_version[i] & 0xFFFF);
+
+		for (j = negop->icframe_vercnt;
+		     (j < negop->icframe_vercnt + negop->icmsg_vercnt);
+		     j++) {
+			if (negop->icversion_data[j].major == srv_major &&
+			    negop->icversion_data[j].minor == srv_minor) {
+				icmsg_major = negop->icversion_data[j].major;
+				icmsg_minor = negop->icversion_data[j].minor;
+				found_match = true;
+				break;
+			}
+		}
+
+		if (found_match)
+			break;
+	}
+
+	/*
+	 * Respond with the framework and service
+	 * version numbers we can support.
+	 */
+fw_error:
+	if (!found_match) {
+		negop->icframe_vercnt = 0;
+		negop->icmsg_vercnt = 0;
+	} else {
+		negop->icframe_vercnt = 1;
+		negop->icmsg_vercnt = 1;
+	}
+
+	if (nego_fw_version)
+		*nego_fw_version = (icframe_major << 16) | icframe_minor;
+
+	if (nego_srv_version)
+		*nego_srv_version = (icmsg_major << 16) | icmsg_minor;
+
+	negop->icversion_data[0].major = icframe_major;
+	negop->icversion_data[0].minor = icframe_minor;
+	negop->icversion_data[1].major = icmsg_major;
+	negop->icversion_data[1].minor = icmsg_minor;
+
+	return found_match;
+}
+
+static void wcstoutf8(char *dest, const __u16 *src, size_t dest_size)
+{
+	size_t len = 0;
+
+	for (len = 0; len + 1 < dest_size; len++) {
+		if (src[len] < 0x80)
+			dest[len] = (char)src[len];
+		else
+			dest[len] = 'X';
+	}
+
+	dest[len] = '\0';
+}
+
+static int hv_fcopy_start(struct hv_start_fcopy *smsg_in)
+{
+	setlocale(LC_ALL, "en_US.utf8");
+	size_t file_size, path_size;
+	char *file_name, *path_name;
+	char *in_file_name = (char *)smsg_in->file_name;
+	char *in_path_name = (char *)smsg_in->path_name;
+
+	file_size = wcstombs(NULL, (const wchar_t *restrict)in_file_name, 0) + 1;
+	path_size = wcstombs(NULL, (const wchar_t *restrict)in_path_name, 0) + 1;
+
+	file_name = (char *)malloc(file_size * sizeof(char));
+	path_name = (char *)malloc(path_size * sizeof(char));
+
+	wcstoutf8(file_name, (__u16 *)in_file_name, file_size);
+	wcstoutf8(path_name, (__u16 *)in_path_name, path_size);
+
+	return hv_fcopy_create_file(file_name, path_name, smsg_in->copy_flags);
+}
+
+static int hv_fcopy_send_data(struct hv_fcopy_hdr *fcopy_msg, int recvlen)
+{
+	int operation = fcopy_msg->operation;
+
+	/*
+	 * The strings sent from the host are encoded in
+	 * utf16; convert it to utf8 strings.
+	 * The host assures us that the utf16 strings will not exceed
+	 * the max lengths specified.
+	 * We will however, reserve room
+	 * for the string terminating character - in the utf16s_utf8s()
+	 * function we limit the size of the buffer where the converted
+	 * string is placed to W_MAX_PATH -1 to guarantee
+	 * that the strings can be properly terminated!
+	 */
+
+	switch (operation) {
+	case START_FILE_COPY:
+		return hv_fcopy_start((struct hv_start_fcopy *)fcopy_msg);
+	case WRITE_TO_FILE:
+		return hv_copy_data((struct hv_do_fcopy *)fcopy_msg);
+	case COMPLETE_FCOPY:
+		return hv_copy_finished();
+	}
+
+	return HV_E_FAIL;
+}
+
+/* process the packet recv from host */
+static int fcopy_pkt_process(struct vmbus_br *txbr)
+{
+	int ret, offset, pktlen;
+	int fcopy_srv_version;
+	const struct vmbus_chanpkt_hdr *pkt;
+	struct hv_fcopy_hdr *fcopy_msg;
+	struct icmsg_hdr *icmsghdr;
+
+	pkt = (const struct vmbus_chanpkt_hdr *)desc;
+	offset = pkt->hlen << 3;
+	pktlen = (pkt->tlen << 3) - offset;
+	icmsghdr = (struct icmsg_hdr *)&desc[offset + sizeof(struct vmbuspipe_hdr)];
+	icmsghdr->status = HV_E_FAIL;
+
+	if (icmsghdr->icmsgtype == ICMSGTYPE_NEGOTIATE) {
+		if (vmbus_prep_negotiate_resp(icmsghdr, desc + offset, pktlen, fw_versions,
+					      FW_VER_COUNT, fcopy_versions, FCOPY_VER_COUNT,
+					      NULL, &fcopy_srv_version)) {
+			syslog(LOG_INFO, "FCopy IC version %d.%d",
+			       fcopy_srv_version >> 16, fcopy_srv_version & 0xFFFF);
+			icmsghdr->status = 0;
+		}
+	} else if (icmsghdr->icmsgtype == ICMSGTYPE_FCOPY) {
+		/* Ensure recvlen is big enough to contain hv_fcopy_hdr */
+		if (pktlen < ICMSG_HDR + sizeof(struct hv_fcopy_hdr)) {
+			syslog(LOG_ERR, "Invalid Fcopy hdr. Packet length too small: %u",
+			       pktlen);
+			return -ENOBUFS;
+		}
+
+		fcopy_msg = (struct hv_fcopy_hdr *)&desc[offset + ICMSG_HDR];
+		icmsghdr->status = hv_fcopy_send_data(fcopy_msg, pktlen);
+	}
+
+	icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
+	ret = rte_vmbus_chan_send(txbr, 0x6, desc + offset, pktlen, 0);
+	if (ret) {
+		syslog(LOG_ERR, "Write to ringbuffer failed err: %d", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void fcopy_get_first_folder(char *path, char *chan_no)
+{
+	DIR *dir = opendir(path);
+	struct dirent *entry;
+
+	if (!dir) {
+		syslog(LOG_ERR, "Failed to open directory (errno=%s).\n", strerror(errno));
+		return;
+	}
+
+	while ((entry = readdir(dir)) != NULL) {
+		if (entry->d_type == DT_DIR && strcmp(entry->d_name, ".") != 0 &&
+		    strcmp(entry->d_name, "..") != 0) {
+			strcpy(chan_no, entry->d_name);
+			break;
+		}
+	}
+
+	closedir(dir);
+}
+
+static void fcopy_set_ring_size(char *path, char *inst, int size)
+{
+	char ring_size_path[MAX_PATH_LEN] = {0};
+	FILE *fd;
+
+	snprintf(ring_size_path, sizeof(ring_size_path), "%s/%s/%s", path, inst, "ring_size");
+	fd = fopen(ring_size_path, "w");
+	if (!fd) {
+		syslog(LOG_WARNING, "Failed to open ring_size file (errno=%s).\n", strerror(errno));
+		return;
+	}
+
+	setvbuf(fd, NULL, _IONBF, 0); /* don't allow buffering to catch sysfs store error */
+	if (fprintf(fd, "%d", size) < 0)
+		syslog(LOG_WARNING, "Failed to set %d as ring size (errno=%s).\n",
+		       size, strerror(errno));
+
+	fclose(fd);
+}
+
+static char *fcopy_read_sysfs(char *path, char *buf, int len)
+{
+	FILE *fd;
+	char *ret;
+
+	fd = fopen(path, "r");
+	if (!fd)
+		return NULL;
+
+	ret = fgets(buf, len, fd);
+	fclose(fd);
+
+	return ret;
+}
+
+static int fcopy_get_instance_id(char *path, char *class_id, char *inst)
+{
+	DIR *dir = opendir(path);
+	struct dirent *entry;
+	char tmp_path[MAX_PATH_LEN] = {0};
+	char line[MAX_LINE_LEN];
+	int ret = -EINVAL;
+
+	if (!dir) {
+		syslog(LOG_ERR, "Failed to open directory (errno=%s).", strerror(errno));
+		return ret;
+	}
+
+	while ((entry = readdir(dir)) != NULL) {
+		if (entry->d_type == DT_LNK && strcmp(entry->d_name, ".") != 0 &&
+		    strcmp(entry->d_name, "..") != 0) {
+			/* search for the sysfs path with matching class_id */
+			snprintf(tmp_path, sizeof(tmp_path), "%s/%s/%s",
+				 path, entry->d_name, "class_id");
+			if (!fcopy_read_sysfs(tmp_path, line, MAX_LINE_LEN))
+				continue;
+
+			/* class id matches, now fetch the instance id from device_id */
+			if (strstr(line, class_id)) {
+				snprintf(tmp_path, sizeof(tmp_path), "%s/%s/%s",
+					 path, entry->d_name, "device_id");
+				if (!fcopy_read_sysfs(tmp_path, line, MAX_LINE_LEN))
+					continue;
+				/* remove braces */
+				strncpy(inst, line + 1, strlen(line) - 3);
+				ret = 0;
+				goto closedir;
+			}
+		}
+	}
+
+	syslog(LOG_ERR, "Failed to fetch instance id");
+closedir:
+	closedir(dir);
+	return ret;
+}
+
+int main(int argc, char *argv[])
+{
+	int fcopy_fd = -1, tmp = 1;
+	int daemonize = 1, long_index = 0, opt, ret = -EINVAL;
+	struct vmbus_br txbr, rxbr;
+	void *ring;
+	uint32_t len = HV_RING_SIZE;
+	char uio_name[10] = {0};
+	char uio_dev_path[15] = {0};
+	char uio_path[MAX_PATH_LEN] = {0};
+	char inst[MAX_LINE_LEN] = {0};
+
+	static struct option long_options[] = {
+		{"help", no_argument, 0, 'h' },
+		{"no-daemon", no_argument, 0, 'n' },
+		{0, 0, 0, 0 }
+	};
+
+	while ((opt = getopt_long(argc, argv, "hn", long_options,
+				  &long_index)) != -1) {
+		switch (opt) {
+		case 'n':
+			daemonize = 0;
+			break;
+		case 'h':
+		default:
+			print_usage(argv);
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	if (daemonize && daemon(1, 0)) {
+		syslog(LOG_ERR, "daemon() failed; error: %s", strerror(errno));
+		exit(EXIT_FAILURE);
+	}
+
+	openlog("HV_UIO_FCOPY", 0, LOG_USER);
+	syslog(LOG_INFO, "starting; pid is:%d", getpid());
+
+	/* get instance id */
+	if (fcopy_get_instance_id(DEVICES_SYSFS, FCOPY_CLASS_ID, inst))
+		exit(EXIT_FAILURE);
+
+	/* set ring_size value */
+	fcopy_set_ring_size(DEVICES_SYSFS, inst, HV_RING_SIZE);
+
+	/* get /dev/uioX dev path and open it */
+	snprintf(uio_path, sizeof(uio_path), "%s/%s/%s", DEVICES_SYSFS, inst, "uio");
+	fcopy_get_first_folder(uio_path, uio_name);
+	snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name);
+	fcopy_fd = open(uio_dev_path, O_RDWR);
+
+	if (fcopy_fd < 0) {
+		syslog(LOG_ERR, "open %s failed; error: %d %s",
+		       uio_dev_path, errno, strerror(errno));
+		syslog(LOG_ERR, "Please make sure module uio_hv_vmbus_client is loaded and" \
+		       " device is not used by any other application\n");
+		ret = fcopy_fd;
+		exit(EXIT_FAILURE);
+	}
+
+	ring = mmap(NULL, 2 * HV_RING_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fcopy_fd, 0);
+	if (ring == MAP_FAILED) {
+		ret = errno;
+		syslog(LOG_ERR, "mmap ringbuffer failed; error: %d %s", ret, strerror(ret));
+		goto close;
+	}
+	vmbus_br_setup(&txbr, ring, HV_RING_SIZE);
+	vmbus_br_setup(&rxbr, (char *)ring + HV_RING_SIZE, HV_RING_SIZE);
+
+	while (1) {
+		/*
+		 * In this loop we process fcopy messages after the
+		 * handshake is complete.
+		 */
+		ret = pread(fcopy_fd, &tmp, sizeof(int), 0);
+		if (ret < 0) {
+			syslog(LOG_ERR, "pread failed: %s", strerror(errno));
+			continue;
+		}
+
+		len = HV_RING_SIZE;
+		ret = rte_vmbus_chan_recv_raw(&rxbr, desc, &len);
+		if (unlikely(ret <= 0)) {
+			/* This indicates a failure to communicate (or worse) */
+			syslog(LOG_ERR, "VMBus channel recv error: %d", ret);
+		} else {
+			ret = fcopy_pkt_process(&txbr);
+			if (ret < 0)
+				goto close;
+
+			/* Signal host */
+			tmp = 1;
+			if ((write(fcopy_fd, &tmp, sizeof(int))) != sizeof(int)) {
+				ret = errno;
+				syslog(LOG_ERR, "Registration failed: %s\n", strerror(ret));
+				goto close;
+			}
+		}
+	}
+close:
+	close(fcopy_fd);
+	return ret;
+}
-- 
2.34.1
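
Reader's note on the ring-buffer arithmetic: the free-space and pending-data
computations done by vmbus_br_availwrite() and vmbus_br_availread() in
vmbus_bufring.h can be tried out in isolation. The sketch below is only an
illustration under stated assumptions: struct ring_sketch and the
ring_avail_write()/ring_avail_read() names are hypothetical stand-ins that
carry just the fields the math needs, and the shared-memory and volatile
aspects of the real struct vmbus_bufring are left out.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for struct vmbus_br / struct vmbus_bufring.
 * Only the three values used by the arithmetic are kept.
 */
struct ring_sketch {
	uint32_t dsize;		/* size of the ring data area, in bytes */
	uint32_t windex;	/* offset where the producer writes next */
	uint32_t rindex;	/* offset where the consumer reads next */
};

/* Same branch structure as vmbus_br_availwrite() in the patch. */
static uint32_t ring_avail_write(const struct ring_sketch *r)
{
	if (r->windex >= r->rindex)
		return r->dsize - (r->windex - r->rindex);
	return r->rindex - r->windex;
}

/* Bytes pending for the reader: whatever is not free for writing. */
static uint32_t ring_avail_read(const struct ring_sketch *r)
{
	return r->dsize - ring_avail_write(r);
}
```

With dsize = 4096, windex = 50 and rindex = 4000 the writer has wrapped
around: 3950 bytes are free and 146 bytes (96 before the wrap plus 50 after
it) are pending. Note that equal indices are treated as an empty ring, so
a producer presumably has to avoid filling the ring completely.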