From: Taotao Chen <chentaotao@didiglobal.com>
Initialize fsdata with &iocb->ki_flags to allow filesystems to check
iocb flags like IOCB_DONTCACHE during ->write_begin().
Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
---
mm/filemap.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 7b90cbeb4a1a..9174b6310f0b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4087,7 +4087,11 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
size_t offset; /* Offset into folio */
size_t bytes; /* Bytes to write to folio */
size_t copied; /* Bytes copied from user */
- void *fsdata = NULL;
+ /*
+ * Initialize fsdata with iocb flags pointer so that filesystems
+ * can check ki_flags (like IOCB_DONTCACHE) in write operations.
+ */
+ void *fsdata = &iocb->ki_flags;
bytes = iov_iter_count(i);
retry:
--
2.34.1
On Mon, Apr 21, 2025 at 10:50:30AM +0000, 陈涛涛 Taotao Chen wrote: > diff --git a/mm/filemap.c b/mm/filemap.c > index 7b90cbeb4a1a..9174b6310f0b 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -4087,7 +4087,11 @@ ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i) > size_t offset; /* Offset into folio */ > size_t bytes; /* Bytes to write to folio */ > size_t copied; /* Bytes copied from user */ > - void *fsdata = NULL; > + /* > + * Initialize fsdata with iocb flags pointer so that filesystems > + * can check ki_flags (like IOCB_DONTCACHE) in write operations. > + */ > + void *fsdata = &iocb->ki_flags; Unfortunately, this is't safe. There may very well be code paths which depend on fsdata being initialized to NULL before calling write_begin(). In fact in the patch 2/3 in this series, ext4_write_end() will get confused in the non-delayed allocation case, since ext4_write_begin() doesn't force fsdata to be not be &iocb->ki_flags, and this will cause ext4_write_end() to potentially get confused and do the wrong thing. I understand that it would be a lot more inconvenient change the function signature of write_begin() to pass through iocb->ki_fags via a new parameter. But I think that probably is the best way to go. Cheers, - Ted
On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote: > I understand that it would be a lot more inconvenient change the > function signature of write_begin() to pass through iocb->ki_fags via > a new parameter. But I think that probably is the best way to go. I'd suggest that passing in iocb rather than file is the way to go. Most callers of ->write_begin already pass NULL as the first argument so would not need to change. i915/gem passes a non-NULL file, but it only operates on shmem and shmem does not use the file argument, so they can pass NULL instead. fs/buffer.c simply passes through the file passed to write_begin and can be changed to pass through the iocb passed in. exfat_extend_valid_size() has an iocb in its caller and can pass in the iocb instead. generic_perform_write() has an iocb.
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote: > On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote: > > I understand that it would be a lot more inconvenient change the > > function signature of write_begin() to pass through iocb->ki_fags via > > a new parameter. But I think that probably is the best way to go. > > I'd suggest that passing in iocb rather than file is the way to go. > Most callers of ->write_begin already pass NULL as the first argument so > would not need to change. i915/gem passes a non-NULL file, but it only > operates on shmem and shmem does not use the file argument, so they can > pass NULL instead. i915/gem needs to stop using write_begin/end and just do an iter_write. Someone who has the hardware and/or CI setup needs to figure out if vfs_write and kernel_write is fine, or this is magic enough to skip checks and protection and go straight to callig ->iter_write. > fs/buffer.c simply passes through the file passed > to write_begin and can be changed to pass through the iocb passed in. > exfat_extend_valid_size() has an iocb in its caller and can pass in the > iocb instead. generic_perform_write() has an iocb. And we really need to stop theading this through the address_space ops because it's not a generic method but a callback for the file system..
Hi Christoph, Matthew, Ted Thanks for the suggestions. Replacing file with iocb in write_begin(), updating call sites, and adjusting i915/gem usage makes a lot of sense. I’ll send a v2 to reflect this change. Thanks, Taotao
On Wed, May 14, 2025 at 02:38:05PM +0100, Matthew Wilcox wrote:
> On Tue, May 13, 2025 at 11:51:25PM -0400, Theodore Ts'o wrote:
> > I understand that it would be a lot more inconvenient change the
> > function signature of write_begin() to pass through iocb->ki_fags via
> > a new parameter. But I think that probably is the best way to go.
>
> I'd suggest that passing in iocb rather than file is the way to go.
> Most callers of ->write_begin already pass NULL as the first argument so
> would not need to change. i915/gem passes a non-NULL file, but it only
> operates on shmem and shmem does not use the file argument, so they can
> pass NULL instead. fs/buffer.c simply passes through the file passed
> to write_begin and can be changed to pass through the iocb passed in.
> exfat_extend_valid_size() has an iocb in its caller and can pass in the
> iocb instead. generic_perform_write() has an iocb.
Mmm, nice! I agree, that's probably the way to go.
There might be some callers if write_begin() that might require some
restructing because they don't have an iocb. For example,
shmem_pwrite() in drivers/gpu/i915/gem/i915_gem_shmem.c came up when I
did a quick grep.
- Ted
© 2016 - 2026 Red Hat, Inc.