From: John Levon
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jag.raman@oracle.com,
    thanos.makatos@nutanix.com, John Levon, John Johnson,
    Elena Ufimtseva, John Levon
Subject: [PATCH 21/26] vfio-user: dma map/unmap operations
Date: Wed, 29 May 2024 17:23:14 +0100
Message-Id: <20240529162319.1476680-22-levon@movementarian.org>
In-Reply-To: <20240529162319.1476680-1-levon@movementarian.org>
References: <20240529162319.1476680-1-levon@movementarian.org>

Implement DMA map/unmap for the vfio-user container. Add the ability to
do async operations during memory transactions.

Originally-by: John Johnson
Signed-off-by: Jagannathan Raman
Signed-off-by: Elena Ufimtseva
Signed-off-by: John Levon
---
 hw/vfio/trace-events     |   4 ++
 hw/vfio/user-container.c | 114 ++++++++++++++++++++++++++++++++++++++-
 hw/vfio/user-protocol.h  |  32 +++++++++++
 hw/vfio/user.c           |  99 ++++++++++++++++++++++++++++++----
 hw/vfio/user.h           |  10 ++++
 5 files changed, 247 insertions(+), 12 deletions(-)

diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 73661db9d9..387751bd7f 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -188,3 +188,7 @@ vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) " index
 vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " region %d offset 0x%"PRIx64" count %d"
 vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " index %d flags 0x%x count %d"
 vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_t flags) " index %d start %d count %d flags 0x%x"
+
+# user-container.c
+vfio_user_dma_map(uint64_t iova, uint64_t size, uint64_t off, uint32_t flags, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" off 0x%"PRIx64" flags 0x%x async_ops %d"
+vfio_user_dma_unmap(uint64_t iova, uint64_t size, uint32_t flags, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" flags 0x%x async_ops %d"
diff --git a/hw/vfio/user-container.c b/hw/vfio/user-container.c
index f0414509d5..213861fb5d 100644
--- a/hw/vfio/user-container.c
+++ b/hw/vfio/user-container.c
@@ -24,18 +24,126 @@
 #include "qapi/error.h"
 #include "pci.h"
 
+/*
+ * When DMA space is the physical address space, the region add/del listeners
+ * will fire during memory update transactions.  These depend on BQL being held,
+ * so do any resulting map/unmap ops async while keeping BQL.
+ */
+static void vfio_user_listener_begin(VFIOContainerBase *bcontainer)
+{
+    VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer,
+                                                bcontainer);
+
+    container->proxy->async_ops = true;
+}
+
+static void vfio_user_listener_commit(VFIOContainerBase *bcontainer)
+{
+    VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer,
+                                                bcontainer);
+
+    /* wait here for any async requests sent during the transaction */
+    container->proxy->async_ops = false;
+    vfio_user_wait_reqs(container->proxy);
+}
+
 static int vfio_user_dma_unmap(const VFIOContainerBase *bcontainer,
                                hwaddr iova, ram_addr_t size,
                                IOMMUTLBEntry *iotlb, int flags)
 {
-    return -ENOTSUP;
+    VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer,
+                                                bcontainer);
+
+    struct {
+        VFIOUserDMAUnmap msg;
+        VFIOUserBitmap bitmap;
+    } *msgp = NULL;
+    int msize, rsize, ret = 0;
+
+    msize = rsize = sizeof(VFIOUserDMAUnmap);
+    msgp = g_malloc0(rsize);
+
+    vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DMA_UNMAP, msize, 0);
+    msgp->msg.argsz = rsize - sizeof(VFIOUserHdr);
+    msgp->msg.flags = flags;
+    msgp->msg.iova = iova;
+    msgp->msg.size = size;
+    trace_vfio_user_dma_unmap(msgp->msg.iova, msgp->msg.size, msgp->msg.flags,
+                              container->proxy->async_ops);
+
+    if (container->proxy->async_ops) {
+        vfio_user_send_nowait(container->proxy, &msgp->msg.hdr, NULL, rsize);
+        return 0;
+    }
+
+    vfio_user_send_wait(container->proxy, &msgp->msg.hdr, NULL, rsize);
+    if (msgp->msg.hdr.flags & VFIO_USER_ERROR) {
+        ret = -msgp->msg.hdr.error_reply;
+    }
+
+    g_free(msgp);
+    return ret;
 }
 
 static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
                              ram_addr_t size, void *vaddr, bool readonly,
                              MemoryRegion *mrp)
 {
-    return -ENOTSUP;
+    VFIOUserContainer *container = container_of(bcontainer, VFIOUserContainer,
+                                                bcontainer);
+
+    VFIOUserProxy *proxy = container->proxy;
+    int fd = memory_region_get_fd(mrp);
+    int ret;
+
+    VFIOUserFDs *fds = NULL;
+    VFIOUserDMAMap *msgp = g_malloc0(sizeof(*msgp));
+
+    vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_MAP, sizeof(*msgp), 0);
+    msgp->argsz = sizeof(struct vfio_iommu_type1_dma_map);
+    msgp->flags = VFIO_DMA_MAP_FLAG_READ;
+    msgp->offset = 0;
+    msgp->iova = iova;
+    msgp->size = size;
+
+    /*
+     * vaddr enters as a QEMU process address; make it either a file offset
+     * for mapped areas or leave as 0.
+     */
+    if (fd != -1) {
+        msgp->offset = qemu_ram_block_host_offset(mrp->ram_block, vaddr);
+    }
+
+    if (!readonly) {
+        msgp->flags |= VFIO_DMA_MAP_FLAG_WRITE;
+    }
+
+    trace_vfio_user_dma_map(msgp->iova, msgp->size, msgp->offset, msgp->flags,
+                            container->proxy->async_ops);
+
+    /*
+     * The async_ops case sends without blocking or dropping BQL.
+     * They're later waited for in vfio_user_wait_reqs().
+     */
+    if (container->proxy->async_ops) {
+        /* can't use auto variable since we don't block */
+        if (fd != -1) {
+            fds = vfio_user_getfds(1);
+            fds->send_fds = 1;
+            fds->fds[0] = fd;
+        }
+        vfio_user_send_nowait(proxy, &msgp->hdr, fds, 0);
+        ret = 0;
+    } else {
+        VFIOUserFDs local_fds = { 1, 0, &fd };
+
+        fds = fd != -1 ? &local_fds : NULL;
+        vfio_user_send_wait(proxy, &msgp->hdr, fds, 0);
+        ret = (msgp->hdr.flags & VFIO_USER_ERROR) ? -msgp->hdr.error_reply : 0;
+        g_free(msgp);
+    }
+
+    return ret;
 }
 
 static int
@@ -230,6 +338,8 @@ static void vfio_iommu_user_class_init(ObjectClass *klass, void *data)
     VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
     vioc->setup = vfio_user_setup;
+    vioc->listener_begin = vfio_user_listener_begin;
+    vioc->listener_commit = vfio_user_listener_commit;
     vioc->dma_map = vfio_user_dma_map;
     vioc->dma_unmap = vfio_user_dma_unmap;
     vioc->attach_device = vfio_user_attach_device;
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 87e43ddc72..9b569156fa 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -115,6 +115,31 @@ typedef struct {
  */
 #define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024)
 
+/*
+ * VFIO_USER_DMA_MAP
+ * imported from struct vfio_iommu_type1_dma_map
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint64_t offset;    /* FD offset */
+    uint64_t iova;
+    uint64_t size;
+} VFIOUserDMAMap;
+
+/*
+ * VFIO_USER_DMA_UNMAP
+ * imported from struct vfio_iommu_type1_dma_unmap
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint64_t iova;
+    uint64_t size;
+} VFIOUserDMAUnmap;
+
 /*
  * VFIO_USER_DEVICE_GET_INFO
  * imported from struct vfio_device_info
@@ -178,4 +203,11 @@ typedef struct {
     char data[];
 } VFIOUserRegionRW;
 
+/* imported from struct vfio_bitmap */
+typedef struct {
+    uint64_t pgsize;
+    uint64_t size;
+    char data[];
+} VFIOUserBitmap;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 224b5febd8..adecd9a68a 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -50,7 +50,6 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy);
 static int vfio_user_send_qio(VFIOUserProxy *proxy, VFIOUserMsg *msg);
 static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
                                      VFIOUserFDs *fds);
-static VFIOUserFDs *vfio_user_getfds(int numfds);
 static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg);
 
 static void vfio_user_recv(void *opaque);
@@ -63,10 +62,6 @@ static void vfio_user_request(void *opaque);
 static int vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg);
 static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
                                  VFIOUserFDs *fds);
-static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
-                                VFIOUserFDs *fds, int rsize);
-static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
-                                  uint32_t size, uint32_t flags);
 
 static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err)
 {
@@ -158,7 +153,7 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg)
     QTAILQ_INSERT_HEAD(&proxy->free, msg, next);
 }
 
-static VFIOUserFDs *vfio_user_getfds(int numfds)
+VFIOUserFDs *vfio_user_getfds(int numfds)
 {
     VFIOUserFDs *fds = g_malloc0(sizeof(*fds) + (numfds * sizeof(int)));
 
@@ -661,8 +656,38 @@ static void vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
     }
 }
 
-static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
-                                VFIOUserFDs *fds, int rsize)
+/*
+ * nowait send - vfio_user_wait_reqs() can wait for it later
+ */
+void vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
+                           VFIOUserFDs *fds, int rsize)
+{
+    VFIOUserMsg *msg;
+    int ret;
+
+    if (hdr->flags & VFIO_USER_NO_REPLY) {
+        error_printf("vfio_user_send_nowait on async message\n");
+        return;
+    }
+
+    QEMU_LOCK_GUARD(&proxy->lock);
+
+    msg = vfio_user_getmsg(proxy, hdr, fds);
+    msg->id = hdr->id;
+    msg->rsize = rsize ? rsize : hdr->size;
+    msg->type = VFIO_MSG_NOWAIT;
+
+    ret = vfio_user_send_queued(proxy, msg);
+    if (ret < 0) {
+        vfio_user_recycle(proxy, msg);
+        return;
+    }
+
+    proxy->last_nowait = msg;
+}
+
+void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
+                         VFIOUserFDs *fds, int rsize)
 {
     VFIOUserMsg *msg;
     bool iolock = false;
@@ -713,6 +738,60 @@ static void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
     }
 }
 
+void vfio_user_wait_reqs(VFIOUserProxy *proxy)
+{
+    VFIOUserMsg *msg;
+    bool iolock = false;
+
+    /*
+     * Any DMA map/unmap requests sent in the middle
+     * of a memory region transaction were sent nowait.
+     * Wait for them here.
+     */
+    qemu_mutex_lock(&proxy->lock);
+    if (proxy->last_nowait != NULL) {
+        iolock = bql_locked();
+        if (iolock) {
+            bql_unlock();
+        }
+
+        /*
+         * Change type to WAIT to wait for reply
+         */
+        msg = proxy->last_nowait;
+        msg->type = VFIO_MSG_WAIT;
+        proxy->last_nowait = NULL;
+        while (!msg->complete) {
+            if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) {
+                VFIOUserMsgQ *list;
+
+                list = msg->pending ? &proxy->pending : &proxy->outgoing;
+                QTAILQ_REMOVE(list, msg, next);
+                error_printf("vfio_user_wait_reqs - timed out\n");
+                break;
+            }
+        }
+
+        if (msg->hdr->flags & VFIO_USER_ERROR) {
+            error_printf("vfio_user_wait_reqs - error reply on async ");
+            error_printf("request: command %x error %s\n", msg->hdr->command,
+                         strerror(msg->hdr->error_reply));
+        }
+
+        /*
+         * Change type back to NOWAIT to free
+         */
+        msg->type = VFIO_MSG_NOWAIT;
+        vfio_user_recycle(proxy, msg);
+    }
+
+    /* lock order is BQL->proxy - don't hold proxy when getting BQL */
+    qemu_mutex_unlock(&proxy->lock);
+    if (iolock) {
+        bql_lock();
+    }
+}
+
 static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =
     QLIST_HEAD_INITIALIZER(vfio_user_sockets);
 
@@ -847,8 +926,8 @@ void vfio_user_disconnect(VFIOUserProxy *proxy)
     g_free(proxy);
 }
 
-static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
-                                  uint32_t size, uint32_t flags)
+void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
+                           uint32_t size, uint32_t flags)
 {
     static uint16_t next_id;
 
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 9039e96069..31d2c5abd9 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -75,6 +75,7 @@ typedef struct VFIOUserProxy {
     QemuCond close_cv;
     AioContext *ctx;
     QEMUBH *req_bh;
+    bool async_ops;
 
     /*
      * above only changed when BQL is held
@@ -106,4 +107,13 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
 bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp);
 int vfio_user_get_info(VFIOUserProxy *proxy, struct vfio_device_info *info);
 
+VFIOUserFDs *vfio_user_getfds(int numfds);
+void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
+                           uint32_t size, uint32_t flags);
+void vfio_user_wait_reqs(VFIOUserProxy *proxy);
+void vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
+                           VFIOUserFDs *fds, int rsize);
+void vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr,
+                         VFIOUserFDs *fds, int rsize);
+
 #endif /* VFIO_USER_H */
-- 
2.34.1