From: Taotao Chen <chentaotao@didiglobal.com>
Set FOP_DONTCACHE in ext4_file_operations to declare support for
uncached buffered I/O.
To handle this flag, add processing for IOCB_DONTCACHE in
ext4_write_begin() and ext4_da_write_begin() by passing FGP_DONTCACHE
to page cache lookups.
Part of a series refactoring address_space_operations write_begin and
write_end callbacks to use struct kiocb for passing write context and
flags.
Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
---
fs/ext4/file.c | 3 ++-
fs/ext4/inode.c | 6 ++++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 21df81347147..274b41a476c8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -977,7 +977,8 @@ const struct file_operations ext4_file_operations = {
.splice_write = iter_file_splice_write,
.fallocate = ext4_fallocate,
.fop_flags = FOP_MMAP_SYNC | FOP_BUFFER_RASYNC |
- FOP_DIO_PARALLEL_WRITE,
+ FOP_DIO_PARALLEL_WRITE |
+ FOP_DONTCACHE,
};
const struct inode_operations ext4_file_inode_operations = {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 08c10200d6fe..639e2e231c4b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1270,6 +1270,9 @@ static int ext4_write_begin(const struct kiocb *iocb,
if (unlikely(ret))
return ret;
+ if (iocb->ki_flags & IOCB_DONTCACHE)
+ fgp |= FGP_DONTCACHE;
+
trace_ext4_write_begin(inode, pos, len);
/*
* Reserve one block more for addition to orphan list in case
@@ -3068,6 +3071,9 @@ static int ext4_da_write_begin(const struct kiocb *iocb,
return 0;
}
+ if (iocb->ki_flags & IOCB_DONTCACHE)
+ fgp |= FGP_DONTCACHE;
+
retry:
fgp |= fgf_set_order(len);
folio = __filemap_get_folio(mapping, index, fgp,
--
2.34.1
On Fri, Jun 27, 2025 at 11:03:13AM +0000, 陈涛涛 Taotao Chen wrote:
> +++ b/fs/ext4/inode.c
> @@ -1270,6 +1270,9 @@ static int ext4_write_begin(const struct kiocb *iocb,
> if (unlikely(ret))
> return ret;
>
> + if (iocb->ki_flags & IOCB_DONTCACHE)
> + fgp |= FGP_DONTCACHE;
I think this needs to be:
if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
because it's legit to call write_begin with a NULL argument. The
'file' was always an optional argument, and we should preserve that
optionality with this transformation.
I wonder if it's worth abstracting some of this boilerplate. Something
like:
struct folio *write_begin_get_folio(iocb, mapping, index, len)
{
fgf_t fgflags = FGP_WRITEBEGIN;
if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
fgflags |= FGP_DONTCACHE;
fgflags |= fgf_set_order(len);
return __filemap_get_folio(mapping, index, fgflags,
mapping_gfp_mask(mapping));
}
On Fri, Jun 27, 2025 at 06:03:09PM +0100, Matthew Wilcox wrote:
> On Fri, Jun 27, 2025 at 11:03:13AM +0000, 陈涛涛 Taotao Chen wrote:
> > +++ b/fs/ext4/inode.c
> > @@ -1270,6 +1270,9 @@ static int ext4_write_begin(const struct kiocb *iocb,
> > if (unlikely(ret))
> > return ret;
> >
> > + if (iocb->ki_flags & IOCB_DONTCACHE)
> > + fgp |= FGP_DONTCACHE;
>
> I think this needs to be:
>
> if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
>
> because it's legit to call write_begin with a NULL argument. The
> 'file' was always an optional argument, and we should preserve that
> optionality with this transformation.
write_begin and write_end are only callbacks through helpers called
by the file system. So if the file system never passes a NULL
file/kiocb it doesn't need to check for it.
> I wonder if it's worth abstracting some of this boilerplate. Something
> like:
>
> struct folio *write_begin_get_folio(iocb, mapping, index, len)
> {
> fgf_t fgflags = FGP_WRITEBEGIN;
>
> if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
> fgflags |= FGP_DONTCACHE;
> fgflags |= fgf_set_order(len);
>
> return __filemap_get_folio(mapping, index, fgflags,
> mapping_gfp_mask(mapping));
> }
But this helper still seems useful.
On Sun, Jun 29, 2025 at 11:41:12PM -0700, hch@infradead.org wrote: > On Fri, Jun 27, 2025 at 06:03:09PM +0100, Matthew Wilcox wrote: > > On Fri, Jun 27, 2025 at 11:03:13AM +0000, 陈涛涛 Taotao Chen wrote: > > I think this needs to be: > > > > if (iocb && iocb->ki_flags & IOCB_DONTCACHE) > > > > because it's legit to call write_begin with a NULL argument. The > > 'file' was always an optional argument, and we should preserve that > > optionality with this transformation. > > write_begin and write_end are only callbacks through helpers called > by the file system. So if the file system never passes a NULL > file/kiocb it doesn't need to check for it. Sure, but some of those helpers are non-obvious, like page_symlink().
© 2016 - 2026 Red Hat, Inc.