Commit dabb90391028 ("fuse: increase readdir buffer size") changed
fuse_readdir_uncached() to size its temporary buffer from ctx->count,
clamped to the negotiated FUSE maximum request size.

That is correct for normal userspace getdents callers, where ctx->count is
the userspace dirent buffer size. It is not correct for in-kernel callers
that use the VFS sentinel values documented for struct dir_context.count:
0 means unknown and INT_MAX means unlimited.

Overlayfs uses INT_MAX when reading merged directories. After
dabb90391028, FUSE interprets that sentinel as a real size request and
expands the readdir buffer to fc->max_pages << PAGE_SHIFT.
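
For illustration, the sizing expression from dabb90391028 can be modelled in
userspace. This is only a sketch: MODEL_PAGE_SHIFT, model_clamp() and
model_bufsize_old() are hypothetical names, and the 64K page size is an
assumption chosen to match the failing guest, not something read from kernel
headers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed 64K pages, as on the failing guest; not taken from kernel headers. */
#define MODEL_PAGE_SHIFT 16u
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* Userspace stand-in for the kernel's clamp() on unsigned int values. */
static unsigned int model_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
	assert(lo <= hi);
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Models the buffer sizing after dabb90391028, before this fix. */
static size_t model_bufsize_old(int ctx_count, unsigned int max_pages)
{
	return model_clamp((unsigned int)ctx_count, MODEL_PAGE_SIZE,
			   max_pages << MODEL_PAGE_SHIFT);
}
```

In this model a count of 0 is still clamped up to one page, so only the
INT_MAX sentinel inflates the buffer: model_bufsize_old(INT_MAX, 124) yields
the full 124-page maximum.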

For virtiofs, the output kvec is included in the request bounce buffer
allocated by copy_args_to_argbuf():

	req->argbuf = kmalloc(len, GFP_ATOMIC);

On a 64K-page guest, this can require a multi-megabyte contiguous
GFP_ATOMIC allocation. In the failing setup, a 64K-page guest on a 4K-page
host negotiated max_pages=124, so the computed buffer was about 8MB. The
same guest on a 64K-page host negotiated max_pages=16, limiting the
computed buffer to 1MB and masking the bug.
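
The sizes quoted above follow from max_pages << PAGE_SHIFT. A small
userspace sketch of that arithmetic, where model_max_bufsize() is a
hypothetical helper and the page shifts (16 for 64K pages, 12 for 4K pages)
are passed in as assumptions:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Upper bound on the readdir buffer: max_pages pages of (1 << page_shift) bytes. */
static size_t model_max_bufsize(unsigned int max_pages, unsigned int page_shift)
{
	/* Guard against an undefined over-wide shift in this model. */
	assert(page_shift < sizeof(size_t) * CHAR_BIT);
	return (size_t)max_pages << page_shift;
}
```

With 64K pages, max_pages=124 gives 124 * 65536 = 8126464 bytes (7.75 MiB,
the "about 8MB" above), and max_pages=16 gives 1048576 bytes (1 MiB). For
comparison, the same max_pages=124 with 4K pages would be 507904 bytes.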

One way to reproduce this is a 64K-page guest on a 4K-page host with an
overlayfs mount whose lower directory is on virtiofs. Reading a merged
directory through overlayfs can then fail with:

	ls: reading directory '<path>': Cannot allocate memory

Treat unknown and unlimited counts the same way fuse_readdir_uncached()
did before dabb90391028: use PAGE_SIZE. Keep the larger readdir buffer
for callers that provide a meaningful positive count.
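
The intended behaviour can be checked with a userspace model of the new
sizing rule. This is a sketch only, not the kernel code: SKETCH_PAGE_SHIFT
and sketch_bufsize_fixed() are hypothetical names, and 64K pages are assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed 64K pages for this model; not taken from kernel headers. */
#define SKETCH_PAGE_SHIFT 16u
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Models the fixed sizing: sentinels get one page, real counts are clamped. */
static size_t sketch_bufsize_fixed(int ctx_count, unsigned int max_pages)
{
	unsigned int count = (unsigned int)ctx_count;
	size_t max = (size_t)max_pages << SKETCH_PAGE_SHIFT;

	assert(max >= SKETCH_PAGE_SIZE);

	/* 0 (unknown) and INT_MAX (unlimited) are sentinels, not sizes. */
	if (count == 0 || count == (unsigned int)INT_MAX)
		return SKETCH_PAGE_SIZE;

	/* A meaningful count is clamped to [PAGE_SIZE, max request size]. */
	if (count < SKETCH_PAGE_SIZE)
		return SKETCH_PAGE_SIZE;
	if (count > max)
		return max;
	return count;
}
```

In this model both sentinels map to a single page, while a 1MB getdents
buffer with max_pages=124 still gets a 1MB readdir buffer.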
Fixes: dabb90391028 ("fuse: increase readdir buffer size")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
---
fs/fuse/readdir.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index c2aae2eef086..0e436c563efb 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -341,7 +341,10 @@ static int fuse_readdir_uncached(struct file *file, struct dir_context *ctx)
 	struct fuse_io_args ia = {};
 	struct fuse_args *args = &ia.ap.args;
 	void *buf;
-	size_t bufsize = clamp((unsigned int) ctx->count, PAGE_SIZE, fc->max_pages << PAGE_SHIFT);
+	unsigned int count = (unsigned int)ctx->count;
+	size_t bufsize = (count && count != (unsigned int)INT_MAX) ?
+		clamp(count, (unsigned int)PAGE_SIZE, fc->max_pages << PAGE_SHIFT) :
+		PAGE_SIZE;
 	u64 attr_version = 0, evict_ctr = 0;
 	bool locked;
--
2.50.1
On Tue, 28 Apr 2026 at 04:13, Matthew R. Ochs <mochs@nvidia.com> wrote:

> For virtiofs, the output kvec is included in the request bounce buffer
> allocated by copy_args_to_argbuf():
>
>   req->argbuf = kmalloc(len, GFP_ATOMIC);

Ugh. The real bug here is inappropriate use of the bounce buffer.
fuse_readdir_uncached() should instead supply an array of pages. It's
a little more complicated, but would fix this properly: overlayfs does
want to get as much of the directory as possible in one go to be most
efficient.

I'd go with vmalloc -> alloc_pages_bulk, then vm_map_ram() before
parsing the result.

Thanks,
Miklos