[PATCH v2 0/3] 9p: Performance improvements for build workloads

Remi Pommarel posted 3 patches 2 weeks, 2 days ago
This patchset introduces several performance optimizations for the 9p
filesystem when used with cache=loose option (exclusive or read only
mounts). These improvements particularly target workloads with frequent
lookups of non-existent paths and repeated symlink resolutions.

The (very state-of-the-art) benchmark used to measure the impact of
these optimizations consists of cloning a fresh hostap repository and
building hostapd and wpa_supplicant for hwsim tests (cd tests/hwsim;
time ./build.sh) in a VM running on a 9pfs rootfs (mounted with the
trans=virtio,cache=loose options).

For reference, the build takes 0m56.492s on my laptop natively while it
completes in 2m18.702s on the VM. This represents a significant
performance penalty considering running the same build on a VM using a
virtiofs rootfs (with "--cache always" virtiofsd option) takes around
1m32.141s. This patchset aims to bring the 9pfs build time close to
that of virtiofs, rather than the native host time, as a realistic
expectation.

The first two patches in this series focus on keeping negative dentries
in the cache, ensuring that subsequent lookups for paths known not to
exist do not require redundant 9P RPC calls. This optimization reduces
the time the compiler spends searching for header files across known
locations. These two patches introduce a new mount option, ndentrytmo,
which specifies the number of milliseconds to keep a negative dentry in
the cache. Using ndentrytmo=-1 (keeping negative dentries indefinitely)
reduced the build time to 1m46.198s.

The third patch extends page cache usage to symlinks by allowing
p9_client_readlink() results to be cached. Resolving symlinks is
apparently done quite frequently during the build process, and avoiding
the cost of a 9P RPC round trip for already known symlinks helps reduce
the build time to 1m26.602s, outperforming the virtiofs setup.

Here is a summary of the different hostapd/wpa_supplicant build times:

  - Baseline (no patch): 2m18.702s
  - negative dentry caching (patches 1-2): 1m46.198s (23% improvement)
  - Above + symlink caching (patches 1-3): 1m26.302s (an additional 18%
    improvement, 37% in total)

With this ~37% performance gain, 9pfs with cache=loose can compete with
virtiofs for (at least) this specific scenario. Although this benchmark
is not the most typical, I do think that these caching optimizations
could benefit a wide range of other workflows as well.

Changes since v1:
  - Rebase on 9p-next (with new mount API conversion)
  - Integrated symlink caching with the network filesystem helper
    library for robustness (a lot of code expects a valid netfs context)
  - Instantiate symlink dentry at creation to avoid keeping a negative
    dentry in cache
  - Moved IO waiting time accounting to a separate patch series

Thanks.

Remi Pommarel (3):
  9p: Cache negative dentries for lookup performance
  9p: Introduce option for negative dentry cache retention time
  9p: Enable symlink caching in page cache

 fs/9p/fid.c             |  11 +++--
 fs/9p/v9fs.c            |  10 +++-
 fs/9p/v9fs.h            |   2 +
 fs/9p/v9fs_vfs.h        |  15 ++++++
 fs/9p/vfs_addr.c        |  24 +++++++--
 fs/9p/vfs_dentry.c      | 105 ++++++++++++++++++++++++++++++++++------
 fs/9p/vfs_inode.c       |  13 +++--
 fs/9p/vfs_inode_dotl.c  |  73 +++++++++++++++++++++++++---
 fs/9p/vfs_super.c       |   1 +
 include/net/9p/client.h |   2 +
 10 files changed, 220 insertions(+), 36 deletions(-)

-- 
2.50.1
Re: [PATCH v2 0/3] 9p: Performance improvements for build workloads
Posted by Christian Schoenebeck 2 days, 15 hours ago
On Wednesday, 21 January 2026 20:56:07 CET Remi Pommarel wrote:
> This patchset introduces several performance optimizations for the 9p
> filesystem when used with cache=loose option (exclusive or read only
> mounts). These improvements particularly target workloads with frequent
> lookups of non-existent paths and repeated symlink resolutions.
[...]
> Here is a summary of the different hostapd/wpa_supplicant build times:
> 
>   - Baseline (no patch): 2m18.702s
>   - negative dentry caching (patches 1-2): 1m46.198s (23% improvement)
>   - Above + symlink caching (patches 1-3): 1m26.302s (an additional 18%
>     improvement, 37% in total)
> 
> With this ~37% performance gain, 9pfs with cache=loose can compete with
> virtiofs for (at least) this specific scenario. Although this benchmark
> is not the most typical, I do think that these caching optimizations
> could benefit a wide range of other workflows as well.

I ran a wide range of tests. On average I'm also seeing a ~40%
improvement when compiling. Some individual sources even showed
improvements of 60% or more, so there is quite a big variance.

I did not encounter misbehaviours in my tests, so feel free to add:

Tested-by: Christian Schoenebeck <linux_oss@crudebyte.com>

I still need to make a proper review though.

/Christian
Re: [PATCH v2 0/3] 9p: Performance improvements for build workloads
Posted by Dominique Martinet 2 weeks, 2 days ago
Remi Pommarel wrote on Wed, Jan 21, 2026 at 08:56:07PM +0100:
> This patchset introduces several performance optimizations for the 9p
> filesystem when used with cache=loose option (exclusive or read only
> mounts). These improvements particularly target workloads with frequent
> lookups of non-existent paths and repeated symlink resolutions.
[...]

Thank you!

We've had a couple of regressions lately so I'll take a week or two to
run some proper tests first, but overall looks good to me, I just wanted
to acknowledge the patches early.
(as such it likely won't make 6.20 but should hopefully go into the next
one)
-- 
Dominique