[PATCH v3 00/11] 9pfs: readdir optimization

Christian Schoenebeck posted 11 patches 29 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/cover.1578957500.git.qemu_oss@crudebyte.com
Maintainers: Greg Kurz <groug@kaod.org>

As previously mentioned, I was investigating performance issues with 9pfs.
Raw file read/write performance of 9pfs is actually quite good, provided that
the client picked a reasonably high msize (maximum message size). I would
recommend logging a warning on the 9p server side if a client attaches with
an msize small enough to cause performance issues.

However, there are other aspects where 9pfs currently performs suboptimally.
In particular, readdir handling is extremely slow: a simple readdir request
from a guest typically blocks for several hundred milliseconds or even
several seconds, no matter how powerful the underlying hardware is.
The reason for this performance issue: latency.
Currently 9pfs dispatches a T_readdir request back and forth between the
main I/O thread and a background I/O thread numerous times; in fact it hops
between threads multiple times for every single directory entry during
T_readdir request handling, which adds up to a huge latency for a single
T_readdir request.
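
To make the latency argument concrete, here is a rough back-of-the-envelope
model in C. It is not QEMU code: the function names, the hop counts and the
per-hop cost are all hypothetical, and the "single dispatch" variant merely
assumes that all entries can be collected in one background job.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not QEMU code: number of thread hops for one
 * T_readdir request when every directory entry causes its own round
 * trips between the main I/O thread and a background I/O thread.
 */
static uint64_t hops_per_entry_dispatch(uint64_t n_entries,
                                        uint64_t hops_per_entry)
{
    assert(hops_per_entry > 0);
    return n_entries * hops_per_entry;
}

/*
 * Assumed alternative: all entries are gathered in a single background
 * job, so there is one hop to the worker and one hop back, independent
 * of the number of directory entries.
 */
static uint64_t hops_single_dispatch(void)
{
    return 2;
}

/* Total latency if every hop costs the same (assumed) amount of time. */
static uint64_t total_latency_us(uint64_t hops, uint64_t latency_per_hop_us)
{
    return hops * latency_per_hop_us;
}
```

With an assumed cost of 50 us per hop, 100 entries and 2 hops per entry,
this model gives total_latency_us(hops_per_entry_dispatch(100, 2), 50) =
10000 us, versus total_latency_us(hops_single_dispatch(), 50) = 100 us. The
absolute numbers are invented; the point is only that the first variant
scales with the number of directory entries while the second does not.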

This patch series aims to address this severe performance issue of 9pfs
T_readdir request handling. The actual performance fix is patch 10. I also
provided a convenient benchmark for measuring the performance improvement
using the 9pfs "synth" driver (see patch 8 for instructions on how to run
the benchmark), so no guest OS installation is required to perform this
benchmark A/B comparison. With patch 10 I achieved a performance improvement
by a factor of 40 on my test machine.

** NOTE: ** As outlined by patch 7, there seems to be an outstanding issue
(both with the current, unoptimized readdir code and with the new, optimized
readdir code) causing a transport error with split readdir requests. This
issue only occurs if patch 7 is applied. I haven't investigated the cause of
this issue yet; it looks like a memory issue though. I am not sure whether it
is a problem with the actual 9p server or rather "just" with the test
environment. Apart from that issue, split readdir seems to work well with the
new performance-optimized readdir code.

v2->v3:

  * NEW patch: require msize >= 4096 [patch 2].

  * Shortened commit log message [patch 3]
    (since previously mentioned issue now addressed by new patch 2).

  * Merged previous 2 test case patches into one -> [patch 5]
    (since trivial enough for one patch).

  * Fixed code style issue [patch 5].

  * Fixed memory leak in test case [patch 5]
    (missing v9fs_req_free() in v9fs_rreaddir()).

  * NEW patch: added split readdir test [patch 6].

  * NEW patch: failing split readdir test [patch 7]
    (see issue description above).

  * Adjusted commit log message [patch 9]
    (noting that this patch would break the new split readdir test).

  * Fixed comment in code [patch 10].
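
For illustration only, the msize lower bound from patch 2 could be sketched
as follows. This is not the code from the patch: negotiate_msize() and its
server_max parameter are hypothetical names, and returning 0 for "rejected"
is an assumption made to keep the example short. Only the value 4096 is
taken from the changelog above.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound named in the changelog entry for patch 2. */
#define MIN_MSIZE 4096u

/*
 * Hypothetical helper, not the actual patch: pick the msize to use for a
 * session. A client msize below MIN_MSIZE is rejected (signalled here by
 * returning 0); otherwise the result is the smaller of what the client
 * asked for and what the server supports.
 */
static uint32_t negotiate_msize(uint32_t client_msize, uint32_t server_max)
{
    /* The server's own limit must satisfy the lower bound as well. */
    assert(server_max >= MIN_MSIZE);

    if (client_msize < MIN_MSIZE) {
        return 0;
    }
    return client_msize < server_max ? client_msize : server_max;
}
```

In this sketch negotiate_msize(512, 65536) is rejected, while
negotiate_msize(4096, 65536) yields 4096 and a very large client request is
capped at server_max.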

Christian Schoenebeck (11):
  tests/virtio-9p: add terminating null in v9fs_string_read()
  9pfs: require msize >= 4096
  9pfs: validate count sent by client with T_readdir
  hw/9pfs/9p-synth: added directory for readdir test
  tests/virtio-9p: added readdir test
  tests/virtio-9p: added splitted readdir test
  tests/virtio-9p: failing splitted readdir test
  9pfs: readdir benchmark
  hw/9pfs/9p-synth: avoid n-square issue in synth_readdir()
  9pfs: T_readdir latency optimization
  hw/9pfs/9p.c: benchmark time on T_readdir request

 hw/9pfs/9p-synth.c     |  48 ++++++-
 hw/9pfs/9p-synth.h     |   5 +
 hw/9pfs/9p.c           | 163 +++++++++++++----------
 hw/9pfs/9p.h           |  34 +++++
 hw/9pfs/codir.c        | 183 ++++++++++++++++++++++++--
 hw/9pfs/coth.h         |   3 +
 tests/virtio-9p-test.c | 287 ++++++++++++++++++++++++++++++++++++++++-
 7 files changed, 640 insertions(+), 83 deletions(-)

-- 
2.20.1