[PATCH v10 0/7] Support message-based DMA in vfio-user server

Mattias Nissler posted 7 patches 1 week, 5 days ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20240507143431.464382-1-mnissler@rivosinc.com
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Elena Ufimtseva <elena.ufimtseva@oracle.com>, Jagannathan Raman <jag.raman@oracle.com>, Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>
[PATCH v10 0/7] Support message-based DMA in vfio-user server
Posted by Mattias Nissler 1 week, 5 days ago
This series adds basic support for message-based DMA in qemu's vfio-user
server. This is useful for cases where the client does not provide file
descriptors for accessing system memory via memory mappings. My motivating use
case is to hook up device models as PCIe endpoints to a hardware design. This
works by bridging the PCIe transaction layer to vfio-user, and the endpoint
does not access memory directly, but sends memory request TLPs to the hardware
design in order to perform DMA.
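
To illustrate the mechanism, the server-side read accessor for such an
indirect MemoryRegion ends up looking roughly like the sketch below
(simplified, not verbatim from the series; the dma_region_read name and the
VFU_OBJECT cast are placeholders, and the write path is symmetric via
vfu_sgl_write):

static MemTxResult dma_region_read(void *opaque, hwaddr addr, uint64_t *val,
                                   unsigned size, MemTxAttrs attrs)
{
    MemoryRegion *region = opaque;
    vfu_ctx_t *vfu_ctx = VFU_OBJECT(region->owner)->vfu_ctx;
    uint8_t buf[sizeof(uint64_t)];
    g_autofree dma_sg_t *sg = g_malloc0(dma_sg_size());

    /* Translate the DMA address into a scatter-gather entry ... */
    if (vfu_addr_to_sgl(vfu_ctx, (vfu_dma_addr_t)(region->addr + addr),
                        size, sg, 1, PROT_READ) < 0 ||
        /* ... and fetch the data with a VFIO_USER_DMA_READ message. */
        vfu_sgl_read(vfu_ctx, sg, 1, buf) != 0) {
        return MEMTX_ERROR;
    }

    *val = ldn_he_p(buf, size);
    return MEMTX_OK;
}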

Note that more work is needed to make message-based DMA work well: qemu
currently breaks down DMA accesses into chunks of at most 8 bytes, each of
which will be handled in a separate vfio-user DMA request message. This is
quite terrible for large DMA accesses, such as when the nvme device model reads
and writes page-sized blocks. Thus, I would like to improve qemu to be able to
perform larger accesses, at least for indirect memory regions. I have something
working locally, but since this will likely result in more involved surgery and
discussion, I am leaving this to be addressed in a separate patch.

Changes from v1:

* Address Stefan's review comments. In particular, enforce an allocation limit
  and don't drop the map client callbacks given that map requests can fail when
  hitting size limits.

* libvfio-user version bump now included in the series.

* Tested as well on big-endian s390x. This uncovered another byte order issue
  in vfio-user server code that I've included a fix for.

Changes from v2:

* Add a preparatory patch to make bounce buffering an AddressSpace-specific
  concept.

* The total buffer size limit parameter is now per AddressSpace and can be
  configured for PCIDevice via a property (see the sketch after this list).

* Store a magic value in the first bytes of the bounce buffer struct as a
  best-effort measure to detect invalid pointers in address_space_unmap.
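
Roughly, this looks as follows (a sketch only; field names, the magic
constant, the property name, and the 4096 default are illustrative, and it
assumes PCIDevice gains a max_bounce_buffer_size field):

/* Per-mapping bounce buffer, allocated on demand in address_space_map(). */
typedef struct BounceBuffer {
    uint64_t magic;      /* lets address_space_unmap() reject bogus pointers */
    MemoryRegion *mr;    /* region the mapping targets */
    hwaddr addr;         /* guest address of the mapped range */
    size_t len;          /* number of bytes bounced */
    uint8_t buffer[];    /* data, copied in on map and out on unmap */
} BounceBuffer;

/* Per-device limit on total bounce buffer usage, exposed as a property: */
static Property pci_props[] = {
    DEFINE_PROP_UINT32("x-max-bounce-buffer-size", PCIDevice,
                       max_bounce_buffer_size, 4096),
    DEFINE_PROP_END_OF_LIST(),
};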

Changes from v3:

* libvfio-user now supports twin-socket mode which uses separate sockets for
  client->server and server->client commands, respectively. This addresses the
  concurrent command bug triggered by server->client DMA access commands. See
  https://github.com/nutanix/libvfio-user/issues/279 for details.

* Add missing teardown code in do_address_space_destroy.

* Fix bounce buffer size bookkeeping race condition.

* Generate unmap notification callbacks unconditionally.

* Some cosmetic fixes.

Changes from v4:

* Fix an accidentally dropped memory_region_unref; control flow is restored to
  match the previous code to simplify review.

* Some cosmetic fixes.

Changes from v5:

* Unregister the indirect memory region in the libvfio-user dma_unregister
  callback.

Changes from v6:

* Rebase, resolving a straightforward merge conflict in system/dma-helpers.c

Changes from v7:

* Rebase (applied cleanly)

* Restore various Reviewed-by and Tested-by tags that I failed to carry
  forward (I double-checked that the patches haven't changed since the reviewed
  version)

Changes from v8:

* Rebase (clean)

* Change bounce buffer size accounting to use uint32_t so it also works on
  hosts that don't support uint64_t atomics, such as mipsel. As a consequence,
  overflows are now a real concern, so switch to a cmpxchg loop for allocating
  bounce buffer space (see the sketch below).
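
The allocation path then becomes roughly the following (a sketch assuming
AddressSpace carries bounce_buffer_size and max_bounce_buffer_size fields;
not the exact series code):

/*
 * Reserve up to @len bytes of bounce buffer space in @as; returns how much
 * was actually granted (possibly 0 once the limit is exhausted).
 */
static hwaddr bounce_buffer_reserve(AddressSpace *as, hwaddr len)
{
    uint32_t used = qatomic_read(&as->bounce_buffer_size);

    for (;;) {
        hwaddr alloc = MIN(as->max_bounce_buffer_size - used, len);
        uint32_t actual =
            qatomic_cmpxchg(&as->bounce_buffer_size, used, used + alloc);

        if (actual == used) {
            return alloc;   /* reservation published atomically */
        }
        used = actual;      /* lost a race; retry against the updated value */
    }
}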

Changes from v9:

* Incorporate patch split and QEMU_LOCK_GUARD change by philmd@linaro.org

* Use size_t instead of uint32_t for bounce buffer size accounting. The qdev
  property remains uint32_t though, so it has a consistent size regardless of
  host.

Mattias Nissler (6):
  system/physmem: Propagate AddressSpace to MapClient helpers
  system/physmem: Per-AddressSpace bounce buffering
  softmmu: Support concurrent bounce buffers
  Update subprojects/libvfio-user
  vfio-user: Message-based DMA support
  vfio-user: Fix config space access byte order

Philippe Mathieu-Daudé (1):
  system/physmem: Replace qemu_mutex_lock() calls with QEMU_LOCK_GUARD

 hw/pci/pci.c                  |   8 ++
 hw/remote/trace-events        |   2 +
 hw/remote/vfio-user-obj.c     | 104 ++++++++++++++++++++-----
 include/exec/cpu-common.h     |   2 -
 include/exec/memory.h         |  41 +++++++++-
 include/hw/pci/pci_device.h   |   3 +
 subprojects/libvfio-user.wrap |   2 +-
 system/dma-helpers.c          |   4 +-
 system/memory.c               |   8 ++
 system/physmem.c              | 140 ++++++++++++++++++----------------
 10 files changed, 225 insertions(+), 89 deletions(-)

-- 
2.43.2


Re: [PATCH v10 0/7] Support message-based DMA in vfio-user server
Posted by Peter Xu 1 week, 4 days ago
On Tue, May 07, 2024 at 07:34:24AM -0700, Mattias Nissler wrote:
> This series adds basic support for message-based DMA in qemu's vfio-user
> server. This is useful for cases where the client does not provide file
> descriptors for accessing system memory via memory mappings. My motivating use
> case is to hook up device models as PCIe endpoints to a hardware design. This
> works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> does not access memory directly, but sends memory request TLPs to the hardware
> design in order to perform DMA.
> 
> Note that more work is needed to make message-based DMA work well: qemu
> currently breaks down DMA accesses into chunks of at most 8 bytes, each of
> which will be handled in a separate vfio-user DMA request message. This is
> quite terrible for large DMA accesses, such as when the nvme device model reads
> and writes page-sized blocks. Thus, I would like to improve qemu to be able to
> perform larger accesses, at least for indirect memory regions. I have something
> working locally, but since this will likely result in more involved surgery and
> discussion, I am leaving this to be addressed in a separate patch.

I assume Jag will pick this up then.

Thanks,

-- 
Peter Xu
Re: [PATCH v10 0/7] Support message-based DMA in vfio-user server
Posted by Philippe Mathieu-Daudé 1 week, 3 days ago
On 7/5/24 16:34, Mattias Nissler wrote:
> This series adds basic support for message-based DMA in qemu's vfio-user
> server. This is useful for cases where the client does not provide file
> descriptors for accessing system memory via memory mappings. My motivating use
> case is to hook up device models as PCIe endpoints to a hardware design. This
> works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> does not access memory directly, but sends memory request TLPs to the hardware
> design in order to perform DMA.

Patches 1-3 & 7 queued to hw-misc tree, thanks.
Re: [PATCH v10 0/7] Support message-based DMA in vfio-user server
Posted by Mattias Nissler 1 week, 3 days ago
On Wed, May 8, 2024 at 11:16 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 7/5/24 16:34, Mattias Nissler wrote:
> > This series adds basic support for message-based DMA in qemu's vfio-user
> > server. This is useful for cases where the client does not provide file
> > descriptors for accessing system memory via memory mappings. My motivating use
> > case is to hook up device models as PCIe endpoints to a hardware design. This
> > works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> > does not access memory directly, but sends memory request TLPs to the hardware
> > design in order to perform DMA.
>
> Patches 1-3 & 7 queued to hw-misc tree, thanks.

Excellent, thanks for picking these up!