[GIT PULL 00/14 for v6.17] vfs 6.17

Christian Brauner posted 14 patches 2 months, 1 week ago
[GIT PULL 00/14 for v6.17] vfs 6.17
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

This is the batch of pull requests for the v6.17 merge window!

I'm trying something new where I'm attaching a cover letter with a short
summary of all the various pull requests flowing to you during this
cycle.

Lucky for me the v6.17 merge window coincides with me moving. IOW, I'm
currently getting squashed by moving boxes and disassembled furniture.
I'm just happy that I did find my laptop in this mess and I hope there
are no notable effects due to the last couple of weeks.

In any case, this cycle was pretty usual for us given the past years.
We have two new system calls in core vfs, file_getattr() and
file_setattr(), which are extensible successors to the legacy ioctl()s.

There's further work in the form of preparatory changes to the directory
locking scheme we currently have; both on the vfs level and for
overlayfs specifically. I want to stress that no actual locking changes
have happened yet and that there's not yet any commitment by us to
actually land any of this.

We have a new bpf kfunc extension for reading extended attributes from
cgroups. This is the first time we're routing bpf patches but I will do
this for all future vfs bpf extensions so we know exactly how and when
something is happening.

There's another round of extensive coredump work. Not just an extension
to the coredump socket but also a rework of the coredump code to just be
more readable and maintainable. I'm somewhat afraid of what I've gotten
myself into by touching that code but hey, that's part of the deal.

We have some work at the intersection of the block and vfs layer in the
form of the new FS_IOC_GETLBMD_CAP ioctl() which returns information
about the file's integrity profile for userspace applications that need
to understand a file's end-to-end data protection support and configure
the I/O accordingly.

Iomap has been quite active as well with some refactoring and changes to
the infrastructure to extend the abilities of fuse and support large
folios. Hell, if this keeps going on every filesystem will move to fuse
and we'll all be out of a job soon.

There's the usual pile of miscellaneous changes to the vfs layer and
filesystems. No need to cover this in detail here.

We also have some work at the intersection of mm and the vfs by porting
a good chunk of filesystems from f_op->mmap() to the new and better
f_op->mmap_prepare(). I'm going to haunt the relevant developers to
finish this conversion asap because I have no appetite for running around
with yet more duplicated methods than we already have. I mean, we've
just gotten rid of f_op->readdir() last year or so - actually you did.

I'm also routing the usual namespace work. This time in the form of some
minor nsfs extensions by exposing a bunch of uapi symbols that a lot of
userspace already relies on and so we can't change those constants
anyway. That's the root inode number of procfs and the inode numbers of
the initial set of namespaces.

We've also been very active in pidfs which gains a bunch of new features
such as persistent exit and coredump information, extended attributes,
autonomous file handles, and pidfds for reaped tasks from SCM_PIDFD
messages.

A few minor Rust updates are also in there but they're really not that
interesting at all.

And lastly, a new struct super_operations method that allows
multi-device filesystems such as btrfs to be informed when a block
device is removed. Since btrfs can survive surprise device removal this
complements the usual ->shutdown() call nicely.

That's all! Expect some slight delay in responses as I'm going to be
preoccupied with the move over the weekend.

Thanks!
Christian
Re: [GIT PULL 00/14 for v6.17] vfs 6.17
Posted by Christian Brauner 2 months ago
> Lucky for me the v6.17 merge window coincides with me moving. IOW, I'm
> currently getting squashed by moving boxes and disassembled furniture.

Fyi, the move is now mostly over. We're not really done yet setting
everything up and so on but I managed to get back behind a computer for
once. So I'm slowly trying to catch up with everything.
[GIT PULL 01/14 for v6.17] vfs misc
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains the usual selection of misc updates for this cycle.

Features:

- Add ext4 IOCB_DONTCACHE support

  This refactors the address_space_operations write_begin() and
  write_end() callbacks to take const struct kiocb * as their first
  argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate to
  the filesystem's buffered I/O path.

  Ext4 is updated to implement handling of the IOCB_DONTCACHE flag and
  advertises support via the FOP_DONTCACHE file operation flag.

  Additionally, the i915 driver's shmem write paths are updated to
  bypass the legacy write_begin/write_end interface in favor of directly
  calling write_iter() with a constructed synchronous kiocb. Another
  i915 change replaces a manual write loop with kernel_write() during
  GEM shmem object creation.
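
For illustration, a minimal userspace sketch of how a caller might ask
for uncached buffered I/O, assuming the request is made per call via
pwritev2() with RWF_DONTCACHE. The helper name write_dontcache() is
hypothetical, and the fallback value 0x80 for RWF_DONTCACHE is an
assumption used only when the installed headers lack the flag. On
filesystems that do not advertise support the flag is expected to be
rejected, so the sketch retries as a plain buffered write.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumption: only used when the installed headers predate the flag. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write len bytes from buf at offset off.
 * First try an uncached buffered write; if the kernel or filesystem
 * rejects the flag, fall back to an ordinary pwrite().
 * Returns the number of bytes written or -errno. If used_dontcache is
 * non-NULL it is set to 1 (uncached path) or 0 (fallback path).
 */
static ssize_t write_dontcache(int fd, const void *buf, size_t len,
			       off_t off, int *used_dontcache)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);

	ret = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);
	if (ret >= 0) {
		if (used_dontcache)
			*used_dontcache = 1;
		return ret;
	}
	/* Anything other than "not supported here" is a real error. */
	if (errno != EOPNOTSUPP && errno != EINVAL && errno != ENOSYS)
		return -errno;

	ret = pwrite(fd, buf, len, off);
	if (ret < 0)
		return -errno;
	if (used_dontcache)
		*used_dontcache = 0;
	return ret;
}
```

A caller would open the file with open(path, O_RDWR) and pass the
descriptor in; the used_dontcache output makes it visible whether the
filesystem accepted the flag.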

Cleanups:

- don't duplicate vfs_open() in kernel_file_open()

- proc_fd_getattr(): don't bother with S_ISDIR() check

- fs/ecryptfs: replace snprintf with sysfs_emit in show function

- vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()

- filelock: add new locks_wake_up_waiter() helper

- fs: Remove three arguments from block_write_end()

- VFS: change old_dir and new_dir in struct renamedata to dentrys

- netfs: Remove unused declaration netfs_queue_write_request()

Fixes:

- eventpoll: Fix semi-unbounded recursion

- eventpoll: fix sphinx documentation build warning

- fs/read_write: Fix spelling typo

- fs: annotate data race between poll_schedule_timeout() and pollwake()

- fs/pipe: set FMODE_NOWAIT in create_pipe_files()

- docs/vfs: update references to i_mutex to i_rwsem

- fs/buffer: remove comment about hard sectorsize

- fs/buffer: remove the min and max limit checks in __getblk_slow()

- fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable

- fs_context: fix parameter name in infofc() macro

- fs: Prevent file descriptor table allocations exceeding INT_MAX

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.misc

for you to fetch changes up to 4e8fc4f7208b032674ef8a4977b96484c328515c:

  netfs: Remove unused declaration netfs_queue_write_request() (2025-07-23 15:08:36 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.misc tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.misc

----------------------------------------------------------------
Al Viro (2):
      don't duplicate vfs_open() in kernel_file_open()
      proc_fd_getattr(): don't bother with S_ISDIR() check

Andy Shevchenko (1):
      fs/read_write: Fix spelling typo

Ankit Chauhan (1):
      fs/ecryptfs: replace snprintf with sysfs_emit in show function

Christian Brauner (1):
      Merge patch series "fs: refactor write_begin/write_end and add ext4 IOCB_DONTCACHE support"

Dmitry Antipov (1):
      fs: annotate suspected data race between poll_schedule_timeout() and pollwake()

Jan Kara (1):
      vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()

Jann Horn (2):
      eventpoll: Fix semi-unbounded recursion
      eventpoll: fix sphinx documentation build warning

Jeff Layton (1):
      filelock: add new locks_wake_up_waiter() helper

Jens Axboe (1):
      fs/pipe: set FMODE_NOWAIT in create_pipe_files()

Junxuan Liao (1):
      docs/vfs: update references to i_mutex to i_rwsem

Matthew Wilcox (Oracle) (1):
      fs: Remove three arguments from block_write_end()

NeilBrown (1):
      VFS: change old_dir and new_dir in struct renamedata to dentrys

Pankaj Raghav (3):
      fs/buffer: remove comment about hard sectorsize
      fs/buffer: remove the min and max limit checks in __getblk_slow()
      fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable

RubenKelevra (1):
      fs_context: fix parameter name in infofc() macro

Sasha Levin (1):
      fs: Prevent file descriptor table allocations exceeding INT_MAX

Taotao Chen (5):
      drm/i915: Use kernel_write() in shmem object create
      drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter
      fs: change write_begin/write_end interface to take struct kiocb *
      mm/pagemap: add write_begin_get_folio() helper function
      ext4: support uncached buffered I/O

Yue Haibing (1):
      netfs: Remove unused declaration netfs_queue_write_request()

 Documentation/filesystems/locking.rst     |   4 +-
 Documentation/filesystems/vfs.rst         |  11 +--
 block/fops.c                              |  15 ++--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 115 ++++++++----------------------
 fs/adfs/inode.c                           |   9 +--
 fs/affs/file.c                            |  26 ++++---
 fs/attr.c                                 |  10 +--
 fs/bcachefs/fs-io-buffered.c              |   4 +-
 fs/bcachefs/fs-io-buffered.h              |   4 +-
 fs/bfs/file.c                             |   7 +-
 fs/buffer.c                               |  47 ++++++------
 fs/cachefiles/namei.c                     |   4 +-
 fs/ceph/addr.c                            |  10 ++-
 fs/dcache.c                               |  10 +--
 fs/direct-io.c                            |   8 +--
 fs/ecryptfs/inode.c                       |   4 +-
 fs/ecryptfs/main.c                        |   3 +-
 fs/ecryptfs/mmap.c                        |  10 +--
 fs/eventpoll.c                            |  58 +++++++++++----
 fs/exfat/file.c                           |  11 ++-
 fs/exfat/inode.c                          |  16 +++--
 fs/ext2/dir.c                             |   2 +-
 fs/ext2/inode.c                           |  11 +--
 fs/ext4/file.c                            |   3 +-
 fs/ext4/inode.c                           |  35 ++++-----
 fs/f2fs/data.c                            |   8 ++-
 fs/fat/inode.c                            |  18 ++---
 fs/file.c                                 |  15 ++++
 fs/fuse/file.c                            |  14 ++--
 fs/hfs/hfs_fs.h                           |   2 +-
 fs/hfs/inode.c                            |   4 +-
 fs/hfsplus/hfsplus_fs.h                   |   6 +-
 fs/hfsplus/inode.c                        |   8 ++-
 fs/hostfs/hostfs_kern.c                   |   8 ++-
 fs/hpfs/file.c                            |  18 ++---
 fs/hugetlbfs/inode.c                      |   9 +--
 fs/inode.c                                |  13 ++--
 fs/iomap/buffered-io.c                    |   3 +-
 fs/jffs2/file.c                           |  28 ++++----
 fs/jfs/inode.c                            |  16 +++--
 fs/libfs.c                                |  26 ++++---
 fs/locks.c                                |   4 +-
 fs/minix/dir.c                            |   2 +-
 fs/minix/inode.c                          |   7 +-
 fs/namei.c                                |  29 ++++----
 fs/namespace.c                            |   2 +-
 fs/nfs/file.c                             |   8 ++-
 fs/nfsd/vfs.c                             |   7 +-
 fs/nilfs2/dir.c                           |   2 +-
 fs/nilfs2/inode.c                         |   8 ++-
 fs/nilfs2/recovery.c                      |   3 +-
 fs/ntfs3/file.c                           |   4 +-
 fs/ntfs3/inode.c                          |   7 +-
 fs/ntfs3/ntfs_fs.h                        |  10 +--
 fs/ocfs2/aops.c                           |   6 +-
 fs/omfs/file.c                            |   7 +-
 fs/open.c                                 |   5 +-
 fs/orangefs/inode.c                       |  16 +++--
 fs/overlayfs/copy_up.c                    |   6 +-
 fs/overlayfs/dir.c                        |  16 ++---
 fs/overlayfs/overlayfs.h                  |  16 ++---
 fs/overlayfs/readdir.c                    |   2 +-
 fs/overlayfs/super.c                      |   2 +-
 fs/overlayfs/util.c                       |   2 +-
 fs/pipe.c                                 |   8 ++-
 fs/proc/fd.c                              |  11 +--
 fs/read_write.c                           |   2 +-
 fs/select.c                               |   4 +-
 fs/smb/server/vfs.c                       |   4 +-
 fs/stack.c                                |   4 +-
 fs/ubifs/file.c                           |   8 ++-
 fs/udf/inode.c                            |  11 +--
 fs/ufs/dir.c                              |   2 +-
 fs/ufs/inode.c                            |  16 +++--
 fs/vboxsf/file.c                          |   5 +-
 fs/xattr.c                                |   2 +-
 include/linux/buffer_head.h               |   8 +--
 include/linux/exportfs.h                  |   4 +-
 include/linux/filelock.h                  |   7 +-
 include/linux/fs.h                        |  25 +++----
 include/linux/fs_context.h                |   2 +-
 include/linux/fs_stack.h                  |   2 +-
 include/linux/netfs.h                     |   1 -
 include/linux/pagemap.h                   |  27 +++++++
 include/linux/quotaops.h                  |   2 +-
 io_uring/openclose.c                      |   2 -
 mm/filemap.c                              |   4 +-
 mm/shmem.c                                |  12 ++--
 88 files changed, 520 insertions(+), 457 deletions(-)
Re: [GIT PULL 01/14 for v6.17] vfs misc
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:21 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.misc

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7879d7aff0ffd969fcb1a59e3f87ebb353e47b7f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 03/14 for v6.17] overlayfs
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains overlayfs updates for this cycle. Note that some of the
changes depend on parts of the vfs misc pull request this cycle.

They're shown in the diffstat for clarity but will obviously be already
included in the vfs misc pull request that I'm pretty sure you're going
to merge before anyway.

The changes for overlayfs in here are primarily focussed on preparing
for some proposed changes to directory locking.

Overlayfs currently will sometimes lock a directory on the upper
filesystem and do a few different things while holding the lock. This is
incompatible with the new potential scheme.

This series narrows the region of code protected by the directory lock,
taking it multiple times when necessary. This theoretically opens up
the possibility of other changes happening on the upper filesystem
between the unlock and the lock. To some extent the patches guard
against that by checking that the dentries still have the expected
parent after retaking the lock. In general, concurrent changes to the
upper and lower filesystems aren't supported properly anyway.
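
The unlock, relock and revalidate pattern described above can be
sketched as a userspace toy model. This is not overlayfs code: the
toy_dir and toy_entry types, the helper names and the choice of -EINVAL
as the error are all hypothetical, and a pthread mutex stands in for
the directory lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Toy stand-ins for a directory and a child entry; not kernel types. */
struct toy_dir {
	pthread_mutex_t lock;
};

struct toy_entry {
	struct toy_dir *parent;	/* may change while the lock is dropped */
};

/*
 * Take the directory lock and confirm the entry still belongs to it.
 * Returns 0 with the lock held, or drops the lock and returns -EINVAL
 * if the entry was moved elsewhere in the meantime.
 */
static int toy_parent_lock(struct toy_dir *dir, struct toy_entry *child)
{
	assert(dir != NULL && child != NULL);

	pthread_mutex_lock(&dir->lock);
	if (child->parent == dir)
		return 0;
	pthread_mutex_unlock(&dir->lock);
	return -EINVAL;
}

static void toy_parent_unlock(struct toy_dir *dir)
{
	pthread_mutex_unlock(&dir->lock);
}

/*
 * Two short critical sections instead of one long one. A non-NULL
 * move_to simulates a concurrent rename that happens in the window
 * between the unlock and the second lock.
 */
static int toy_two_step(struct toy_dir *dir, struct toy_entry *child,
			struct toy_dir *move_to)
{
	int err;

	err = toy_parent_lock(dir, child);
	if (err)
		return err;
	/* step 1: work that needs the directory lock */
	toy_parent_unlock(dir);

	/* window: lock not held, the entry may be moved */
	if (move_to)
		child->parent = move_to;

	err = toy_parent_lock(dir, child);	/* revalidates the parent */
	if (err)
		return err;
	/* step 2: more work under the lock */
	toy_parent_unlock(dir);
	return 0;
}
```

The point of the design is that the second critical section never runs
against an entry that has left the directory; the caller sees an error
instead and can decide whether to retry or give up.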

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.ovl

for you to fetch changes up to 672820a070ea5e6ae114f6109726a4e18313a527:

  ovl: properly print correct variable (2025-07-25 10:20:36 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.ovl tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.ovl

----------------------------------------------------------------
Al Viro (2):
      don't duplicate vfs_open() in kernel_file_open()
      proc_fd_getattr(): don't bother with S_ISDIR() check

Amir Goldstein (3):
      fs: constify file ptr in backing_file accessor helpers
      ovl: remove unneeded non-const conversion
      ovl: support layers on case-folding capable filesystems

Andy Shevchenko (1):
      fs/read_write: Fix spelling typo

Antonio Quartulli (1):
      ovl: properly print correct variable

Christian Brauner (2):
      Merge patch series "backing_file accessors cleanup"
      Merge patch series "ovl: narrow regions protected by i_rw_sem"

Jeff Layton (1):
      filelock: add new locks_wake_up_waiter() helper

Jens Axboe (1):
      fs/pipe: set FMODE_NOWAIT in create_pipe_files()

NeilBrown (22):
      VFS: change old_dir and new_dir in struct renamedata to dentrys
      ovl: simplify an error path in ovl_copy_up_workdir()
      ovl: change ovl_create_index() to take dir locks
      ovl: Call ovl_create_temp() without lock held.
      ovl: narrow the locked region in ovl_copy_up_workdir()
      ovl: narrow locking in ovl_create_upper()
      ovl: narrow locking in ovl_clear_empty()
      ovl: narrow locking in ovl_create_over_whiteout()
      ovl: simplify gotos in ovl_rename()
      ovl: narrow locking in ovl_rename()
      ovl: narrow locking in ovl_cleanup_whiteouts()
      ovl: narrow locking in ovl_cleanup_index()
      ovl: narrow locking in ovl_workdir_create()
      ovl: narrow locking in ovl_indexdir_cleanup()
      ovl: narrow locking in ovl_workdir_cleanup_recurse()
      ovl: change ovl_workdir_cleanup() to take dir lock as needed.
      ovl: narrow locking on ovl_remove_and_whiteout()
      ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed
      ovl: narrow locking in ovl_whiteout()
      ovl: narrow locking in ovl_check_rename_whiteout()
      ovl: change ovl_create_real() to receive dentry parent
      ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()

 fs/backing-file.c        |   4 +-
 fs/cachefiles/namei.c    |   4 +-
 fs/ecryptfs/inode.c      |   4 +-
 fs/file_table.c          |  13 ++-
 fs/internal.h            |   1 +
 fs/locks.c               |   2 +-
 fs/namei.c               |   7 +-
 fs/nfsd/vfs.c            |   7 +-
 fs/open.c                |   5 +-
 fs/overlayfs/copy_up.c   |  52 +++++-----
 fs/overlayfs/dir.c       | 260 +++++++++++++++++++++++++----------------------
 fs/overlayfs/file.c      |   2 +-
 fs/overlayfs/namei.c     |  31 +++++-
 fs/overlayfs/overlayfs.h |  45 +++++---
 fs/overlayfs/ovl_entry.h |   1 +
 fs/overlayfs/params.c    |  12 +--
 fs/overlayfs/readdir.c   |  44 ++++----
 fs/overlayfs/super.c     |  50 ++++-----
 fs/overlayfs/util.c      |  46 ++++++---
 fs/pipe.c                |   8 +-
 fs/proc/fd.c             |  11 +-
 fs/read_write.c          |   2 +-
 fs/smb/server/vfs.c      |   4 +-
 include/linux/filelock.h |   7 +-
 include/linux/fs.h       |  14 +--
 io_uring/openclose.c     |   2 -
 26 files changed, 353 insertions(+), 285 deletions(-)
Re: [GIT PULL 03/14 for v6.17] overlayfs
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:24 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.ovl

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/934600daa7bcce8ad6d5efe05cce4811c8d2f464

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 04/14 for v6.17] namespace updates
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains namespace updates. This time specifically for nsfs:

- Userspace heavily relies on the root inode numbers for namespaces to
  identify the initial namespaces. That's already a hard dependency. So
  we cannot change that anymore. Move the initial inode numbers to a
  public header and align the only two namespaces that currently don't
  do that with all the other namespaces.

- The root inode of /proc having a fixed inode number has been part of
  the core kernel ABI since its inception, and recently some userspace
  programs (mainly container runtimes) have started to explicitly depend
  on this behaviour.

  The main reason this is useful to userspace is that by checking that a
  suspect /proc handle has fstype PROC_SUPER_MAGIC and is
  PROCFS_ROOT_INO, they can then use
  openat2(RESOLVE_{NO_{XDEV,MAGICLINK},BENEATH}) to ensure that there
  isn't a bind-mount that replaces some procfs file with a different
  one. This kind of attack has led to security issues in container
  runtimes in the past (such as CVE-2019-19921) and libraries like
  libpathrs[1] use this feature of procfs to provide safe procfs
  handling functions.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.nsfs

for you to fetch changes up to 76fdb7eb4e1c91086ce9c3db6972c2ed48c96afb:

  uapi: export PROCFS_ROOT_INO (2025-07-10 09:39:18 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.nsfs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.nsfs

----------------------------------------------------------------
Aleksa Sarai (1):
      uapi: export PROCFS_ROOT_INO

Christian Brauner (4):
      nsfs: move root inode number to uapi
      netns: use stable inode number for initial mount ns
      mntns: use stable inode number for initial mount ns
      Merge patch series "nsfs: expose the stable inode numbers in a public header"

 fs/namespace.c            |  4 +++-
 fs/proc/root.c            | 10 +++++-----
 include/linux/proc_ns.h   | 16 +++++++++-------
 include/uapi/linux/fs.h   | 11 +++++++++++
 include/uapi/linux/nsfs.h | 11 +++++++++++
 net/core/net_namespace.c  |  8 ++++++++
 6 files changed, 47 insertions(+), 13 deletions(-)
Re: [GIT PULL 04/14 for v6.17] namespace updates
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:23 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.nsfs

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f70d24c230bcaa1e95f66252133068a98c895200

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 05/14 for v6.17] vfs async dir
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains preparatory changes for the asynchronous directory locking
scheme. While the locking scheme is still very much controversial and
we're still far away from landing any actual changes in that area the
preparatory work that we've been upstreaming for a while now has been
very useful. This is another set of minor changes and cleanups.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.async.dir

for you to fetch changes up to d4db71038ff592aa4bc954d6bbd10be23954bb98:

  Merge patch series "Minor cleanup preparation for some dir-locking API changes" (2025-06-11 13:44:21 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.async.dir tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.async.dir

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Minor cleanup preparation for some dir-locking API changes"

NeilBrown (4):
      VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl()
      VFS: Minor fixes for porting.rst
      coda: use iterate_dir() in coda_readdir()
      exportfs: use lookup_one_unlocked()

 Documentation/filesystems/porting.rst |  3 ---
 fs/coda/dir.c                         | 12 ++----------
 fs/exportfs/expfs.c                   |  4 +---
 fs/namei.c                            | 37 +++++++++++++----------------------
 4 files changed, 17 insertions(+), 39 deletions(-)
Re: [GIT PULL 05/14 for v6.17] vfs async dir
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:14 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.async.dir

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0c4ec4a339b435381bc998f74862bd7a23d33f79

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 06/14 for v6.17] vfs fallocate
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
fallocate() currently supports creating preallocated files efficiently.
However, on most filesystems fallocate() will preallocate blocks in an
unwritten state even if FALLOC_FL_ZERO_RANGE is specified.

The extent state must later be converted to a written state when the
user writes data into this range, which can trigger numerous metadata
changes and journal I/O. This may lead to significant write
amplification and performance degradation in synchronous write mode.

At the moment, the only method to avoid this is to create an empty file
and write zero data into it (for example, using 'dd' with a large block
size). However, this method is slow and consumes a considerable amount
of disk bandwidth.

Now that more and more flash-based storage devices are available it is
possible to efficiently write zeros to SSDs using the unmap write zeroes
command if the devices do not write physical zeroes to the media.

For example, if SCSI SSDs support the UNMAP bit or NVMe SSDs support the
DEAC bit[1], the write zeroes command does not write actual data to the
device. Instead, NVMe converts the zeroed range to a deallocated state,
which works fast and consumes almost no disk write bandwidth.

This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and
BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for SCSI, NVMe and
device-mapper drivers, and adds the FALLOC_FL_WRITE_ZEROES and
STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw bdev devices.

fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES
flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a way
that subsequent writes to that range do not require further changes to
the file mapping metadata. This flag is beneficial for subsequent pure
overwriting within this range, as it can save on block allocation and,
consequently, significant metadata changes.
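
A minimal userspace sketch of how an application might use the new
flag, assuming it is passed to fallocate() like the existing mode
flags. The helper names are hypothetical, and the fallback value 0x80
for FALLOC_FL_WRITE_ZEROES is an assumption used only when the
installed headers do not define it. When the kernel or filesystem does
not support the flag the sketch falls back to FALLOC_FL_ZERO_RANGE and
finally to writing zeroes by hand, which is the slow method described
above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>

/* Assumption: only used when the installed headers predate the flag. */
#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES 0x80
#endif

/* Errors that mean "this mode is not available here". */
static int mode_unsupported(int err)
{
	return err == EOPNOTSUPP || err == EINVAL || err == ENOSYS;
}

/* Last resort: write explicit zeroes through the normal write path. */
static int write_zero_bytes(int fd, off_t off, off_t len)
{
	static const char zeros[4096];

	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ?
			       (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (n == 0)
			return -EIO;
		off += n;
		len -= n;
	}
	return 0;
}

/*
 * Zero [off, off + len) and report how it was done:
 *   0  FALLOC_FL_WRITE_ZEROES (range ends up in written state)
 *   1  FALLOC_FL_ZERO_RANGE   (range may be left unwritten)
 *   2  explicit zero writes
 * A negative return value is -errno.
 */
static int zero_written_range(int fd, off_t off, off_t len)
{
	assert(off >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, off, len) == 0)
		return 0;
	if (!mode_unsupported(errno))
		return -errno;

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
		return 1;
	if (!mode_unsupported(errno))
		return -errno;

	if (write_zero_bytes(fd, off, len) < 0)
		return -errno;
	return 2;
}
```

Returning the tier lets a caller that cares about overwrite latency
notice when it only got unwritten extents (tier 1) and decide whether
that is acceptable.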

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit e04c78d86a9699d136910cfc0bdcf01087e3267e:

  Linux 6.16-rc2 (2025-06-15 13:49:41 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate

for you to fetch changes up to 4f984fe7b4d9aea332c7ff59827a4e168f0e4e1b:

  Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag" (2025-06-23 12:45:32 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.fallocate tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.fallocate

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag"

Zhang Yi (9):
      block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
      nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit
      nvmet: set WZDS and DRB if device enables unmap write zeroes operation
      scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP
      dm: clear unmap write zeroes limits when disabling write zeroes
      fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate
      block: factor out common part in blkdev_fallocate()
      block: add FALLOC_FL_WRITE_ZEROES support
      ext4: add FALLOC_FL_WRITE_ZEROES support

 Documentation/ABI/stable/sysfs-block | 33 ++++++++++++++++++
 block/blk-settings.c                 | 20 +++++++++--
 block/blk-sysfs.c                    | 26 ++++++++++++++
 block/fops.c                         | 44 +++++++++++++-----------
 drivers/md/dm-table.c                |  4 ++-
 drivers/nvme/host/core.c             | 20 ++++++-----
 drivers/nvme/target/io-cmd-bdev.c    |  4 +++
 drivers/scsi/sd.c                    |  5 +++
 fs/ext4/extents.c                    | 66 ++++++++++++++++++++++++++++++------
 fs/open.c                            |  1 +
 include/linux/blkdev.h               | 10 ++++++
 include/linux/falloc.h               |  3 +-
 include/trace/events/ext4.h          |  3 +-
 include/uapi/linux/falloc.h          | 17 ++++++++++
 14 files changed, 212 insertions(+), 44 deletions(-)
Re: [GIT PULL 06/14 for v6.17] vfs fallocate
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:17 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/278c7d9b5e0ca73a75e5151c22fb05c91cb4495f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 09/14 for v6.17] vfs bpf
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
These changes allow bpf to read extended attributes from cgroupfs.
This is useful in redirecting AF_UNIX socket connections based on cgroup
membership of the socket. One use-case is the ability to implement log
namespaces in systemd so services and containers are redirected to
different journals.

Please note that I plan on merging bpf changes related to the vfs
exclusively via vfs trees.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.bpf

for you to fetch changes up to 70619d40e8307b4b2ce1d08405e7b827c61ba4a8:

  selftests/kernfs: test xattr retrieval (2025-07-02 14:18:22 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.bpf tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.bpf

----------------------------------------------------------------
Christian Brauner (3):
      kernfs: remove iattr_mutex
      Merge patch series "Introduce bpf_cgroup_read_xattr"
      selftests/kernfs: test xattr retrieval

Song Liu (3):
      bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
      bpf: Mark cgroup_subsys_state->cgroup RCU safe
      selftests/bpf: Add tests for bpf_cgroup_read_xattr

 fs/bpf_fs_kfuncs.c                                 |  34 +++++
 fs/kernfs/inode.c                                  |  70 ++++-----
 kernel/bpf/helpers.c                               |   3 +
 kernel/bpf/verifier.c                              |   5 +
 tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
 .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
 .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
 .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++
 tools/testing/selftests/filesystems/.gitignore     |   1 +
 tools/testing/selftests/filesystems/Makefile       |   2 +-
 tools/testing/selftests/filesystems/kernfs_test.c  |  38 +++++
 11 files changed, 486 insertions(+), 33 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_xattr.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_read_xattr.c
 create mode 100644 tools/testing/selftests/bpf/progs/read_cgroupfs_xattr.c
 create mode 100644 tools/testing/selftests/filesystems/kernfs_test.c
Re: [GIT PULL 09/14 for v6.17] vfs bpf
Posted by Alexei Starovoitov 2 months, 1 week ago
On Fri, Jul 25, 2025 at 01:27:15PM +0200, Christian Brauner wrote:
> Hey Linus,
> 
> /* Summary */
> These changes allow bpf to read extended attributes from cgroupfs.
> This is useful in redirecting AF_UNIX socket connections based on cgroup
> membership of the socket. One use-case is the ability to implement log
> namespaces in systemd so services and containers are redirected to
> different journals.
> 
> Please note that I plan on merging bpf changes related to the vfs
> exclusively via vfs trees.

That was not discussed and agreed upon.

> /* Testing */

The selftests/bpf had bugs flagged by BPF CI.

> /* Conflicts */
> 
> Merge conflicts with mainline
> =============================
> 
> No known conflicts.
> 
> Merge conflicts with other trees
> ================================
> 
> No known conflicts.

You were told a month ago that there are conflicts
and you were also told that the branch shouldn't be rebased,
yet you ignored it.

> Christian Brauner (3):
>       kernfs: remove iattr_mutex
>       Merge patch series "Introduce bpf_cgroup_read_xattr"
>       selftests/kernfs: test xattr retrieval
> 
> Song Liu (3):
>       bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
>       bpf: Mark cgroup_subsys_state->cgroup RCU safe
>       selftests/bpf: Add tests for bpf_cgroup_read_xattr
> 
>  fs/bpf_fs_kfuncs.c                                 |  34 +++++
>  fs/kernfs/inode.c                                  |  70 ++++-----
>  kernel/bpf/helpers.c                               |   3 +
>  kernel/bpf/verifier.c                              |   5 +
>  tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
>  .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
>  .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
>  .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++

Now Linus needs to resolve the conflicts again.
More details in bpf-next PR:
https://lore.kernel.org/bpf/20250729180626.35057-1-alexei.starovoitov@gmail.com/
Re: [GIT PULL 09/14 for v6.17] vfs bpf
Posted by Christian Brauner 2 months ago
On Tue, Jul 29, 2025 at 11:15:56AM -0700, Alexei Starovoitov wrote:
> On Fri, Jul 25, 2025 at 01:27:15PM +0200, Christian Brauner wrote:
> > Hey Linus,
> > 
> > /* Summary */
> > These changes allow bpf to read extended attributes from cgroupfs.
> > This is useful in redirecting AF_UNIX socket connections based on cgroup
> > membership of the socket. One use-case is the ability to implement log
> > namespaces in systemd so services and containers are redirected to
> > different journals.
> > 
> > Please note that I plan on merging bpf changes related to the vfs
> > exclusively via vfs trees.
> 
> That was not discussed and agreed upon.
> 
> > /* Testing */
> 
> The selftests/bpf had bugs flagged by BPF CI.
> 
> > /* Conflicts */
> > 
> > Merge conflicts with mainline
> > =============================
> > 
> > No known conflicts.
> > 
> > Merge conflicts with other trees
> > ================================
> > 
> > No known conflicts.
> 
> You were told a month ago that there are conflicts
> and you were also told that the branch shouldn't be rebased,
> yet you ignored it.
> 
> > Christian Brauner (3):
> >       kernfs: remove iattr_mutex
> >       Merge patch series "Introduce bpf_cgroup_read_xattr"
> >       selftests/kernfs: test xattr retrieval
> > 
> > Song Liu (3):
> >       bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
> >       bpf: Mark cgroup_subsys_state->cgroup RCU safe
> >       selftests/bpf: Add tests for bpf_cgroup_read_xattr
> > 
> >  fs/bpf_fs_kfuncs.c                                 |  34 +++++
> >  fs/kernfs/inode.c                                  |  70 ++++-----
> >  kernel/bpf/helpers.c                               |   3 +
> >  kernel/bpf/verifier.c                              |   5 +
> >  tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
> >  .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
> >  .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
> >  .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++
> 
> Now Linus needs to resolve the conflicts again.
> More details in bpf-next PR:
> https://lore.kernel.org/bpf/20250729180626.35057-1-alexei.starovoitov@gmail.com/

As many times before you seem to conveniently misremember the facts.

Every tree that has meaningful VFS changes such as adding new helpers
uses a shared branch, as in this case, which touched kernfs and the
VFS.

The conflict arises from the fact that somehow you manage to maintain
all of the complexities of bpf but you refuse to make shared branches
work due to a simple merge conflict:

  "imo this shared branch experience wasn't good.
  We should have applied the series to bpf-next only.
  It was more bpf material than vfs. I wouldn't do this again."

  https://lore.kernel.org/r/CAADnVQ+pPt7Zt8gS0aW75WGrwjmcUcn3s37Ahd9bnLyzOfB=3g@mail.gmail.com

Something that we successfully manage with all other subsystems. Is it
perfect? Of course not.

But instead of trying to come to a simple solution you just stop
replying. That's not how this works.

The branch had a bug and I informed you and told you how I would resolve
it in:

  https://lore.kernel.org/r/20250702-hochmoderne-abklatsch-af9c605b57b2@brauner

It's been in -next a few days. Instead of slapping some hotfix on top
that leaves the tree in a broken state the fix was squashed. In other
words you would have to reapply the series anyway.

I also explicitly told you as a reply to the very issue in the same thread:

  "Anything that touches VFS will go through VFS. Shared
  branches work just fine. We manage to do this with everyone else in the
  kernel so bpf is able to do this as well. If you'd just asked this would
  not have been an issue. Merge conflicts are a fact of kernel
  development, we all deal with it you can too."

  https://lore.kernel.org/r/20250702-anhaften-postleitzahl-06a4d4771641@brauner

For the record, I don't have a problem with some stuff going through
other trees. For example, if Jens wanted to do that I'd go "hell yeah,
let's try and make this work."

The reason I'm hesitant to do it here is because of continuous mails
like the one you sent here where you aggressively spin a story and then
try to make someone take the blame.

I mean, your mail is very short of "Linus, I'm subtly telling you what
mean Christian did wrong and that he's rebased, which I know you hate
and you have to resolve merge conflicts so please yell at him.". Come
on.

I work hard to effectively cooperate with you but until there is a
good-faith mutual relationship on-list I don't want meaningful VFS work
going through the bpf tree. You can take it or leave it and I would
kindly ask Linus to respect that if he agrees.
Re: [GIT PULL 09/14 for v6.17] vfs bpf
Posted by Alexei Starovoitov 2 months ago
On Thu, Jul 31, 2025 at 1:28 AM Christian Brauner <brauner@kernel.org> wrote:
>
> It's been in -next a few days. Instead of slapping some hotfix on top
> that leaves the tree in a broken state the fix was squashed. In other
> words you would have to reapply the series anyway.

That's not how stable branches work. The whole point of a stable
branch is that sha-s should not change. You don't squash things
after a branch is created.
That extra fix could have been easily added on top.

> I mean, your mail is very short of "Linus, I'm subtly telling you what
> mean Christian did wrong and that he's rebased, which I know you hate
> and you have to resolve merge conflicts so please yell at him.". Come
> on.

Not subtly. You made a mistake and instead of admitting it
you're doubling down on your wrong git process.

> I work hard to effectively cooperate with you but until there is a
> good-faith mutual relationship on-list I don't want meaningful VFS work
> going through the bpf tree. You can take it or leave it and I would
> kindly ask Linus to respect that if he agrees.

Look, you took bpf patches that BPF CI flagged as broken
and bpf maintainers didn't even ack.
Out of 4 patches that you applied one was yours that
touched VFS and 3 were bpf related.
That was a wtf moment, but we didn't complain,
since the feature is useful, so we were happy to see
it land even in this half broken form.
We applied your "stable" branch to bpf-next and added fixes on top.
Then you squashed "hotfix".
That made all of our fixes in bpf-next become conflicts.
We cannot reapply your branch. We don't rebase the trees.
That was the policy for years. Started long ago during
net-next era and now in bpf-next too.
This time we were lucky that conflicts were not that bad
and it was easy enough for Linus to deal with them,
but that must not repeat.

Do not touch bpf patches if you refuse to follow
stable branch process that everyone else does.
And it's not VFS. It's really just you, Christian.
Back in August 2024 Al created a true stable branch
vfs/stable-struct_fd. We pulled it into bpf-next
in commit 50470d3899cd ("Merge remote-tracking branch 'vfs/stable-struct_fd'")
While Al sent a PR for it during the merge window:
https://lore.kernel.org/all/20240923034731.GF3413968@ZenIV/
On the kernel/bpf/* side we added more changes on top of Al's work,
and, surprise, there were no conflicts during the merge window.
That's how stable branches are meant to work.
Re: [GIT PULL 09/14 for v6.17] vfs bpf
Posted by Christian Brauner 2 months ago
On Thu, Jul 31, 2025 at 02:57:52PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 31, 2025 at 1:28 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > It's been in -next a few days. Instead of slapping some hotfix on top
> > that leaves the tree in a broken state the fix was squashed. In other
> > words you would have to reapply the series anyway.
> 
> That's not how stable branches work. The whole point of a stable
> branch is that sha-s should not change. You don't squash things
> after a branch is created.
> That extra fix could have been easily added on top.
> 
> > I mean, your mail is very short of "Linus, I'm subtly telling you what
> > mean Christian did wrong and that he's rebased, which I know you hate
> > and you have to resolve merge conflicts so please yell at him.". Come
> > on.
> 
> Not subtly. You made a mistake and instead of admitting it
> you're doubling down on your wrong git process.
> 
> > I work hard to effectively cooperate with you but until there is a
> > good-faith mutual relationship on-list I don't want meaningful VFS work
> > going through the bpf tree. You can take it or leave it and I would
> > kindly ask Linus to respect that if he agrees.
> 
> Look, you took bpf patches that BPF CI flagged as broken
> and bpf maintainers didn't even ack.
> Out of 4 patches that you applied one was yours that
> touched VFS and 3 were bpf related.
> That was a wtf moment, but we didn't complain,
> since the feature is useful, so we were happy to see
> it land even in this half broken form.
> We applied your "stable" branch to bpf-next and added fixes on top.
> Then you squashed "hotfix".
> That made all of our fixes in bpf-next become conflicts.
> We cannot reapply your branch. We don't rebase the trees.
> That was the policy for years. Started long ago during
> net-next era and now in bpf-next too.
> This time we were lucky that conflicts were not that bad
> and it was easy enough for Linus to deal with them,
> but that must not repeat.

Ah, I see what you're complaining about now. But I'm still not happy
that we didn't manage to resolve this confusion earlier.

It was not clear to me in what way you relied on that branch, or that
you relied on me not folding in the mutex fix, especially because you
didn't reply when I said I would fold it and you said upthread that
putting fixes on top wouldn't work.

If I'm aware that a branch is shared and relied upon then I won't change it.
I would've immediately rolled it back had I known that this causes
issues for you, but to me everything looked fine when I didn't hear back.
Re: [GIT PULL 09/14 for v6.17] vfs bpf
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:15 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.bpf

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7e7bc8335b1486e5b157e844c248925a763baf16

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 10/14 for v6.17] vfs rust
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains vfs rust updates for this cycle:

- Allow poll_table pointers to be NULL.

- Add Rust files to vfs MAINTAINERS entry.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.rust

for you to fetch changes up to 3ccc82e31d6a66600f14f6622a944f580b04da43:

  vfs: add Rust files to MAINTAINERS (2025-07-15 11:50:15 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.rust tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.rust

----------------------------------------------------------------
Alice Ryhl (2):
      poll: rust: allow poll_table ptrs to be null
      vfs: add Rust files to MAINTAINERS

 MAINTAINERS              |  4 +++
 rust/helpers/helpers.c   |  1 +
 rust/helpers/poll.c      | 10 +++++++
 rust/kernel/sync/poll.rs | 68 ++++++++++++++++++------------------------------
 4 files changed, 41 insertions(+), 42 deletions(-)
 create mode 100644 rust/helpers/poll.c
Re: [GIT PULL 10/14 for v6.17] vfs rust
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:26 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.rust

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/add07519ea6b6c2ba2b7842225eb87e0f08f2b0f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 11/14 for v6.17] vfs integrity
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This adds the new FS_IOC_GETLBMD_CAP ioctl() to query metadata and
protection info (PI) capabilities. This ioctl returns information about
the file's integrity profile. This is useful for userspace applications
to understand a file's end-to-end data protection support and configure
the I/O accordingly.

For now this interface is only supported by block devices. However, the
design and placement of this ioctl in generic FS ioctl space allows us
to extend it to work over files as well. This may be useful when
filesystems start supporting PI-aware layouts.

A new structure struct logical_block_metadata_cap is introduced, which
contains the following fields:

- lbmd_flags:
  bitmask of logical block metadata capability flags

- lbmd_interval:
  the amount of data described by each unit of logical block metadata

- lbmd_size:
  size in bytes of the logical block metadata associated with each
  interval

- lbmd_opaque_size:
  size in bytes of the opaque block tag associated with each interval

- lbmd_opaque_offset:
  offset in bytes of the opaque block tag within the logical block
  metadata

- lbmd_pi_size:
  size in bytes of the T10 PI tuple associated with each interval

- lbmd_pi_offset:
  offset in bytes of T10 PI tuple within the logical block metadata

- lbmd_pi_guard_tag_type:
  T10 PI guard tag type

- lbmd_pi_app_tag_size:
  size in bytes of the T10 PI application tag

- lbmd_pi_ref_tag_size:
  size in bytes of the T10 PI reference tag

- lbmd_pi_storage_tag_size:
  size in bytes of the T10 PI storage tag

The internal logic to fetch the capability is encapsulated in a helper
function blk_get_meta_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP if
CONFIG_BLK_DEV_INTEGRITY is not enabled.
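
As a rough illustration of how userspace might use these fields to
"configure the I/O accordingly", here is a minimal sketch. The struct
below is a hypothetical local mirror of the fields listed above: the
field widths and ordering are assumptions, and real code should take
the definition from the kernel's uapi header instead. The two helper
functions are also hypothetical and are not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the capability fields named in the summary.
 * Widths and ordering are assumptions made for this sketch only. */
struct lbmd_cap_sketch {
	uint32_t lbmd_flags;
	uint16_t lbmd_interval;        /* data bytes covered by one metadata unit */
	uint8_t  lbmd_size;            /* metadata bytes per interval */
	uint8_t  lbmd_opaque_size;
	uint8_t  lbmd_opaque_offset;
	uint8_t  lbmd_pi_size;
	uint8_t  lbmd_pi_offset;
	uint8_t  lbmd_pi_guard_tag_type;
	uint8_t  lbmd_pi_app_tag_size;
	uint8_t  lbmd_pi_ref_tag_size;
	uint8_t  lbmd_pi_storage_tag_size;
};

/* Metadata buffer size needed for an I/O of io_len data bytes.
 * Returns 0 if the device reports no metadata, and -1 if io_len is
 * not a whole number of intervals. */
static long lbmd_meta_bytes(const struct lbmd_cap_sketch *cap, uint64_t io_len)
{
	assert(cap != NULL);
	if (cap->lbmd_interval == 0 || cap->lbmd_size == 0)
		return 0;
	if (io_len % cap->lbmd_interval != 0)
		return -1;
	return (long)(io_len / cap->lbmd_interval) * cap->lbmd_size;
}

/* Sanity check: the PI tuple and the opaque tag must both fit inside
 * the per-interval metadata area. */
static int lbmd_layout_ok(const struct lbmd_cap_sketch *cap)
{
	assert(cap != NULL);
	return cap->lbmd_pi_offset + cap->lbmd_pi_size <= cap->lbmd_size &&
	       cap->lbmd_opaque_offset + cap->lbmd_opaque_size <= cap->lbmd_size;
}
```

With lbmd_interval = 4096 and lbmd_size = 64, an eight-interval I/O
would need lbmd_meta_bytes() = 512 bytes of metadata under these
assumptions.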

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity

for you to fetch changes up to bc5b0c8febccbeabfefc9b59083b223ec7c7b53a:

  block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP (2025-07-23 14:55:51 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.integrity tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.integrity

----------------------------------------------------------------
Anuj Gupta (5):
      block: rename tuple_size field in blk_integrity to metadata_size
      block: introduce pi_tuple_size field in blk_integrity
      nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
      fs: add ioctl to query metadata and protection info capabilities
      block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP

Arnd Bergmann (1):
      block: fix FS_IOC_GETLBMD_CAP parsing in blkdev_common_ioctl()

Christian Brauner (1):
      Merge patch series "add ioctl to query metadata and protection info capabilities"

 block/bio-integrity-auto.c        |  4 +--
 block/blk-integrity.c             | 70 ++++++++++++++++++++++++++++++++++++++-
 block/blk-settings.c              | 44 ++++++++++++++++++++++--
 block/ioctl.c                     |  6 ++++
 block/t10-pi.c                    | 16 ++++-----
 drivers/md/dm-crypt.c             |  4 +--
 drivers/md/dm-integrity.c         | 12 +++----
 drivers/nvdimm/btt.c              |  2 +-
 drivers/nvme/host/core.c          |  7 ++--
 drivers/nvme/target/io-cmd-bdev.c |  2 +-
 drivers/scsi/sd_dif.c             |  3 +-
 include/linux/blk-integrity.h     | 11 ++++--
 include/linux/blkdev.h            |  3 +-
 include/uapi/linux/fs.h           | 59 +++++++++++++++++++++++++++++++++
 14 files changed, 213 insertions(+), 30 deletions(-)
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:19 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/cec40a7c80e8b0ef03667708ea2660bc1a99b464

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by Hugh Dickins 2 months, 1 week ago
On Fri, 25 Jul 2025, Christian Brauner wrote:

> Hey Linus,
> 
> /* Summary */
> This adds the new FS_IOC_GETLBMD_CAP ioctl() to query metadata and
> protection info (PI) capabilities. This ioctl returns information about
> the file's integrity profile. This is useful for userspace applications
> to understand a file's end-to-end data protection support and configure
> the I/O accordingly.
> 
> For now this interface is only supported by block devices. However, the
> design and placement of this ioctl in generic FS ioctl space allows us
> to extend it to work over files as well. This may be useful when
> filesystems start supporting PI-aware layouts.
> 
> A new structure struct logical_block_metadata_cap is introduced, which
> contains the following fields:
> 
> - lbmd_flags:
>   bitmask of logical block metadata capability flags
> 
> - lbmd_interval:
>   the amount of data described by each unit of logical block metadata
> 
> - lbmd_size:
>   size in bytes of the logical block metadata associated with each
>   interval
> 
> - lbmd_opaque_size:
>   size in bytes of the opaque block tag associated with each interval
> 
> - lbmd_opaque_offset:
>   offset in bytes of the opaque block tag within the logical block
>   metadata
> 
> - lbmd_pi_size:
>   size in bytes of the T10 PI tuple associated with each interval
> 
> - lbmd_pi_offset:
>   offset in bytes of T10 PI tuple within the logical block metadata
> 
> - lbmd_pi_guard_tag_type:
>   T10 PI guard tag type
> 
> - lbmd_pi_app_tag_size:
>   size in bytes of the T10 PI application tag
> 
> - lbmd_pi_ref_tag_size:
>   size in bytes of the T10 PI reference tag
> 
> - lbmd_pi_storage_tag_size:
>   size in bytes of the T10 PI storage tag
> 
> The internal logic to fetch the capability is encapsulated in a helper
> function blk_get_meta_cap(), which uses the blk_integrity profile
> associated with the device. The ioctl returns -EOPNOTSUPP, if
> CONFIG_BLK_DEV_INTEGRITY is not enabled.
> 
> /* Testing */
> 
> gcc (Debian 14.2.0-19) 14.2.0
> Debian clang version 19.1.7 (3)
> 
> No build failures or warnings were observed.
> 
> /* Conflicts */
> 
> Merge conflicts with mainline
> =============================
> 
> No known conflicts.
> 
> Merge conflicts with other trees
> ================================
> 
> No known conflicts.
> 
> The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:
> 
>   Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)
> 
> are available in the Git repository at:
> 
>   git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity
> 
> for you to fetch changes up to bc5b0c8febccbeabfefc9b59083b223ec7c7b53a:
> 
>   block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP (2025-07-23 14:55:51 +0200)
> 
> Please consider pulling these changes from the signed vfs-6.17-rc1.integrity tag.
> 
> Thanks!
> Christian
> 
> ----------------------------------------------------------------
> vfs-6.17-rc1.integrity
> 
> ----------------------------------------------------------------
> Anuj Gupta (5):
>       block: rename tuple_size field in blk_integrity to metadata_size
>       block: introduce pi_tuple_size field in blk_integrity
>       nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
>       fs: add ioctl to query metadata and protection info capabilities
>       block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP
> 
> Arnd Bergmann (1):
>       block: fix FS_IOC_GETLBMD_CAP parsing in blkdev_common_ioctl()
> 
> Christian Brauner (1):
>       Merge patch series "add ioctl to query metadata and protection info capabilities"
> 
>  block/bio-integrity-auto.c        |  4 +--
>  block/blk-integrity.c             | 70 ++++++++++++++++++++++++++++++++++++++-
>  block/blk-settings.c              | 44 ++++++++++++++++++++++--
>  block/ioctl.c                     |  6 ++++
>  block/t10-pi.c                    | 16 ++++-----
>  drivers/md/dm-crypt.c             |  4 +--
>  drivers/md/dm-integrity.c         | 12 +++----
>  drivers/nvdimm/btt.c              |  2 +-
>  drivers/nvme/host/core.c          |  7 ++--
>  drivers/nvme/target/io-cmd-bdev.c |  2 +-
>  drivers/scsi/sd_dif.c             |  3 +-
>  include/linux/blk-integrity.h     | 11 ++++--
>  include/linux/blkdev.h            |  3 +-
>  include/uapi/linux/fs.h           | 59 +++++++++++++++++++++++++++++++++
>  14 files changed, 213 insertions(+), 30 deletions(-)

It would be great if Klara's patch at
https://lore.kernel.org/lkml/20250725164334.9606-1-klarasmodin@gmail.com/
could follow just after this pull: I had been bisecting -next to find out
why "losetup /dev/loop0 tmpfsfile" was failing, and that patch fixes it -
and presumably other odd failures for anyone without BLK_DEV_INTEGRITY=y.

Thanks,
Hugh
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by Linus Torvalds 2 months, 1 week ago
On Sun, 27 Jul 2025 at 18:29, Hugh Dickins <hughd@google.com> wrote:
>
> It would be great if Klara's patch at
> https://lore.kernel.org/lkml/20250725164334.9606-1-klarasmodin@gmail.com/
> could follow just after this pull: I had been bisecting -next to find out
> why "losetup /dev/loop0 tmpfsfile" was failing, and that patch fixes it -
> and presumably other odd failures for anyone without BLK_DEV_INTEGRITY=y.

Bah. I *hate* this "call blk_get_meta_cap() first" approach. There is
absolutely *NO* way it is valid for that strange specialized ioctl to
override any proper traditional ioctl numbers, so calling that code
first and relying on magic error numbers is simply not acceptable.

I'm going to fix this in my merge by just putting the call to
blk_get_meta_cap() inside the "default:" case for *after* the other
ioctl numbers have been checked.

Please don't introduce new "magic error number" logic in the ioctl
path. The fact that the traditional case of "I don't support this" is
ENOTTY should damn well tell everybody that we have about SIX DECADES
of problems in this area. Don't repeat that mistake.

And don't let new random unimportant ioctls *EVER* override the normal
default ones.

               Linus
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by Christoph Hellwig 2 months, 1 week ago
On Mon, Jul 28, 2025 at 03:21:21PM -0700, Linus Torvalds wrote:
> Bah. I *hate* this "call blk_get_meta_cap() first" approach. There is
> absolutely *NO* way it is valid for that strange specialized ioctl to
> override any proper traditional ioctl numbers, so calling that code
> first and relying on magic error numbers is simply not acceptable.
> 
> I'm going to fix this in my merge by just putting the call to
> blk_get_meta_cap() inside the "default:" case for *after* the other
> ioctl numbers have been checked.
> 
> Please don't introduce new "magic error number" logic in the ioctl
> path. The fact that the traditional case of "I don't support this" is
> ENOTTY should damn well tell everybody that we have about SIX DECADES
> of problems in this area. Don't repeat that mistake.
> 
> And don't let new random unimportant ioctls *EVER* override the normal
> default ones.

I don't think overrides are intentional here.  The problem is that
Christian asked for the flexible size growing decoding here, which
makes it impossible to use the simple and proven ioctl dispatch by
just using another case statement in the switch.
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by Linus Torvalds 2 months, 1 week ago
On Tue, 29 Jul 2025 at 00:49, Christoph Hellwig <hch@infradead.org> wrote:
>
> I don't think overrides are intentional here.  The problem is that
> Christian asked for the flexible size growing decoding here, which
> makes it impossible to use the simple and proven ioctl dispatch by
> just using another case statement in the switch.

Right. Which is why I put it in the default: branch.

IOW, just handle the important real and normal cases first - the ones
that *can* be handled with simple switch statements.

So putting it at the *top*, and then saying "if it returns this
special error code that isn't standardized we do the normal ones" is
wrong.

It's wrong because we literally have over half a century of confusion
about error codes in this area, predating Linux.

And it's also wrong because that new ioctl simply shouldn't be
prioritized over existing ones.

So I'm just saying "don't do that then".

               Linus
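
The ordering problem discussed in this subthread can be modelled in
plain userspace C. This is only a sketch: the command encoding, the
DEMO_* names and both dispatch functions are hypothetical and do not
correspond to the actual block/ioctl.c code. The stub that fails every
command with -EOPNOTSUPP is likewise an assumption about one way a
"handler first" ordering can go wrong when the feature is compiled out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical encoding for this sketch only: the low 16 bits identify
 * the command, the high 16 bits carry the argument size. */
#define DEMO_CMD(id, size)  (((unsigned int)(size) << 16) | (unsigned int)(id))
#define DEMO_CMD_ID(cmd)    ((cmd) & 0xffffu)
#define DEMO_CMD_SIZE(cmd)  ((cmd) >> 16)

#define DEMO_GETSIZE        DEMO_CMD(1, 8)  /* traditional, fixed size */
#define DEMO_FLUSH          DEMO_CMD(2, 0)  /* traditional, fixed size */
#define DEMO_GETCAP_ID      3u              /* extensible: size may grow */
#define DEMO_GETCAP_SIZE_V0 16u

typedef int (*demo_cap_fn)(unsigned int cmd);

/* Extensible handler: matches on the command id and accepts any size
 * at least as large as the first published version. */
static int demo_get_cap(unsigned int cmd)
{
	if (DEMO_CMD_ID(cmd) != DEMO_GETCAP_ID)
		return -ENOTTY;
	if (DEMO_CMD_SIZE(cmd) < DEMO_GETCAP_SIZE_V0)
		return -EINVAL;
	return 0;
}

/* Stub for a build without the feature: fails every command. */
static int demo_get_cap_stub(unsigned int cmd)
{
	(void)cmd;
	return -EOPNOTSUPP;
}

/* Problematic ordering: the new handler runs first and a "magic" error
 * value decides whether the traditional commands are tried at all. */
static int dispatch_cap_first(unsigned int cmd, demo_cap_fn get_cap)
{
	int ret;

	assert(get_cap != NULL);
	ret = get_cap(cmd);
	if (ret != -ENOTTY)
		return ret;
	switch (cmd) {
	case DEMO_GETSIZE: return 1;
	case DEMO_FLUSH:   return 2;
	default:           return -ENOTTY;
	}
}

/* Ordering described above: traditional commands are matched first and
 * the extensible handler is only consulted in the default branch. */
static int dispatch_cap_last(unsigned int cmd, demo_cap_fn get_cap)
{
	assert(get_cap != NULL);
	switch (cmd) {
	case DEMO_GETSIZE: return 1;
	case DEMO_FLUSH:   return 2;
	default:           return get_cap(cmd);
	}
}
```

In this model, dispatch_cap_first() with the stub fails DEMO_GETSIZE
with -EOPNOTSUPP, while dispatch_cap_last() still handles it and only
unknown commands ever reach the stub.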
Re: [GIT PULL 11/14 for v6.17] vfs integrity
Posted by Christian Brauner 2 months ago
> Right. Which is why I put it in the default: branch.

Thanks for fixing that up!
[GIT PULL 12/14 for v6.17] vfs fileattr
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This introduces the new file_getattr() and file_setattr() system calls
after lengthy discussions. Both system calls serve as successors and
extensible companions to the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR
ioctls, which have started to show their age in addition to being
named in a way that makes it easy to conflate them with extended
attribute related operations.

These syscalls allow userspace to set filesystem inode attributes on
special files. One usage example is XFS project quotas.

XFS has project quotas which can be attached to a directory. All new
inodes in such a directory inherit the project ID set on the parent
directory.

The project is created from userspace by opening each inode and calling
FS_IOC_FSSETXATTR on it. This is not possible for special files such as
FIFO, SOCK, BLK etc. Therefore, some inodes are left with an empty
project ID. Those inodes are then not shown in the quota accounting but
still exist in the directory. This is not critical, but when special
files are created in a directory with an already existing project
quota, these new inodes inherit the extended attributes. This creates a
mix of special files with and without attributes. Moreover, special
files with attributes have no way to clear or change the attributes.
This, in turn, prevents userspace from re-creating the quota project on
these existing files.

In addition, these new system calls allow the implementation of
additional attributes that we couldn't or didn't want to fit into the
legacy ioctls anymore.
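
One common way to make such an interface extensible is to pass the
struct size alongside the pointer, as other extensible syscalls like
openat2() do. Whether file_getattr() and file_setattr() follow exactly
this convention is an assumption here. The sketch below is a userspace
model of the usual copy rule, not kernel code: copy_struct_sketch() and
both attr structs are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical first and second revisions of an attribute struct. */
struct attr_v1 {
	uint32_t xflags;
	uint32_t projid;
};

struct attr_v2 {
	uint32_t xflags;
	uint32_t projid;
	uint32_t extra;   /* field added in a later revision */
};

/* Copy a caller-supplied struct of usize bytes into a receiver that
 * knows ksize bytes.
 *  - older caller (usize < ksize): missing fields read as zero
 *  - newer caller (usize > ksize): accepted only if every byte the
 *    receiver does not understand is zero, else -E2BIG */
static int copy_struct_sketch(void *dst, size_t ksize,
			      const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t n = usize < ksize ? usize : ksize;
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}
```

Under this rule an old binary keeps working against a newer receiver,
and a new binary works against an older receiver as long as it leaves
the fields the receiver does not know about zeroed.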

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fileattr

for you to fetch changes up to e85931d1cd699307e6a3f1060cbe4c42748f3fff:

  fs: tighten a sanity check in file_attr_to_fileattr() (2025-07-16 10:22:01 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.fileattr tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.fileattr

----------------------------------------------------------------
Amir Goldstein (1):
      fs: prepare for extending file_get/setattr()

Andrey Albershteyn (5):
      fs: split fileattr related helpers into separate file
      lsm: introduce new hooks for setting/getting inode fsxattr
      selinux: implement inode_file_[g|s]etattr hooks
      fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
      fs: introduce file_getattr and file_setattr syscalls

Christian Brauner (2):
      Merge patch series "fs: introduce file_getattr and file_setattr syscalls"
      tree-wide: s/struct fileattr/struct file_kattr/g

Dan Carpenter (1):
      fs: tighten a sanity check in file_attr_to_fileattr()

 Documentation/filesystems/locking.rst       |   4 +-
 Documentation/filesystems/vfs.rst           |   4 +-
 arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
 arch/arm/tools/syscall.tbl                  |   2 +
 arch/arm64/tools/syscall_32.tbl             |   2 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
 arch/s390/kernel/syscalls/syscall.tbl       |   2 +
 arch/sh/kernel/syscalls/syscall.tbl         |   2 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
 fs/Makefile                                 |   3 +-
 fs/bcachefs/fs.c                            |   4 +-
 fs/btrfs/ioctl.c                            |   4 +-
 fs/btrfs/ioctl.h                            |   6 +-
 fs/ecryptfs/inode.c                         |   4 +-
 fs/efivarfs/inode.c                         |   4 +-
 fs/ext2/ext2.h                              |   4 +-
 fs/ext2/ioctl.c                             |   4 +-
 fs/ext4/ext4.h                              |   4 +-
 fs/ext4/ioctl.c                             |   4 +-
 fs/f2fs/f2fs.h                              |   4 +-
 fs/f2fs/file.c                              |   4 +-
 fs/file_attr.c                              | 498 ++++++++++++++++++++++++++++
 fs/fuse/fuse_i.h                            |   4 +-
 fs/fuse/ioctl.c                             |   8 +-
 fs/gfs2/file.c                              |   4 +-
 fs/gfs2/inode.h                             |   4 +-
 fs/hfsplus/hfsplus_fs.h                     |   4 +-
 fs/hfsplus/inode.c                          |   4 +-
 fs/ioctl.c                                  | 309 -----------------
 fs/jfs/ioctl.c                              |   4 +-
 fs/jfs/jfs_inode.h                          |   4 +-
 fs/nilfs2/ioctl.c                           |   4 +-
 fs/nilfs2/nilfs.h                           |   4 +-
 fs/ocfs2/ioctl.c                            |   4 +-
 fs/ocfs2/ioctl.h                            |   4 +-
 fs/orangefs/inode.c                         |   4 +-
 fs/overlayfs/copy_up.c                      |   6 +-
 fs/overlayfs/inode.c                        |  17 +-
 fs/overlayfs/overlayfs.h                    |  10 +-
 fs/overlayfs/util.c                         |   2 +-
 fs/ubifs/ioctl.c                            |   4 +-
 fs/ubifs/ubifs.h                            |   4 +-
 fs/xfs/xfs_ioctl.c                          |  18 +-
 fs/xfs/xfs_ioctl.h                          |   4 +-
 include/linux/fileattr.h                    |  38 ++-
 include/linux/fs.h                          |   6 +-
 include/linux/lsm_hook_defs.h               |   2 +
 include/linux/security.h                    |  16 +
 include/linux/syscalls.h                    |   7 +
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/fs.h                     |  18 +
 mm/shmem.c                                  |   4 +-
 scripts/syscall.tbl                         |   2 +
 security/security.c                         |  30 ++
 security/selinux/hooks.c                    |  14 +
 64 files changed, 752 insertions(+), 410 deletions(-)
 create mode 100644 fs/file_attr.c
Re: [GIT PULL 12/14 for v6.17] vfs fileattr
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:18 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fileattr

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/57fcb7d930d8f00f383e995aeebdcd2b416a187a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 13/14 for v6.17] vfs super
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
Currently all filesystems which implement super_operations::shutdown()
cannot afford to lose a device.

Thus fs_bdev_mark_dead() will just call the ->shutdown() callback for the
involved filesystem.

This will no longer be the case, as multi-device filesystems like btrfs
can handle certain device losses without shutting down the whole
filesystem.

To allow those multi-device filesystems to use fs_holder_ops:

- Add a new super_operations::remove_bdev() callback

- Try the ->remove_bdev() callback first inside fs_bdev_mark_dead().
  If the callback returns 0, meaning the fs can handle the device
  loss, then exit without doing anything else.

  If there is no such callback or the callback returns a non-zero value,
  continue to shut down the filesystem as usual.

This means the new remove_bdev() should only check whether the
operation can continue and, if so, do the fs-specific handling. The
actual shutdown should still be done by the existing ->shutdown()
callback.

For all existing filesystems with a shutdown callback, there is no
change to code or behavior.

Btrfs is going to implement both the ->remove_bdev() and ->shutdown()
callbacks soon.
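
The dispatch order described above can be sketched as a small userspace
model. This is not the kernel code: the model_* types and functions are
hypothetical stand-ins, and locking, syncing and inode eviction are left
out. It only illustrates "try ->remove_bdev() first, fall back to
->shutdown()".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sb;

/* Hypothetical stand-in for the two relevant super_operations hooks. */
struct model_super_ops {
	int  (*remove_bdev)(struct model_sb *sb, int bdev_id); /* optional */
	void (*shutdown)(struct model_sb *sb);                 /* optional */
};

struct model_sb {
	const struct model_super_ops *ops;
	int  devices_left;
	bool shut_down;
};

/* Returns true if the filesystem kept running after the device loss. */
static bool model_bdev_mark_dead(struct model_sb *sb, int bdev_id)
{
	assert(sb != NULL && sb->ops != NULL);

	/* A return value of 0 means the fs handled the loss itself. */
	if (sb->ops->remove_bdev && sb->ops->remove_bdev(sb, bdev_id) == 0)
		return true;

	/* No callback, or it returned non-zero: shut down as before. */
	if (sb->ops->shutdown)
		sb->ops->shutdown(sb);
	return false;
}

/* Example policy: tolerate a loss while more than one device remains. */
static int model_multi_remove_bdev(struct model_sb *sb, int bdev_id)
{
	(void)bdev_id;
	if (sb->devices_left <= 1)
		return -1;
	sb->devices_left--;
	return 0;
}

static void model_shutdown(struct model_sb *sb)
{
	sb->shut_down = true;
}

static const struct model_super_ops model_multi_ops = {
	.remove_bdev = model_multi_remove_bdev,
	.shutdown    = model_shutdown,
};

static const struct model_super_ops model_single_ops = {
	.shutdown = model_shutdown,
};
```

A filesystem using model_single_ops behaves exactly as it did before the
new callback existed, which matches the "no change to existing
filesystems" statement.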

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.super

for you to fetch changes up to d9c37a4904ec21ef7d45880fe023c11341869c28:

  fs: add a new remove_bdev() callback (2025-07-15 13:36:40 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.super tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.super

----------------------------------------------------------------
Qu Wenruo (1):
      fs: add a new remove_bdev() callback

 fs/super.c         | 11 +++++++++++
 include/linux/fs.h |  9 +++++++++
 2 files changed, 20 insertions(+)
Re: [GIT PULL 13/14 for v6.17] vfs super
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:27 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.super

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0965549d6f5f23e9250cd9c642f4ea5fd682eddb

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[GIT PULL 14/14 for v6.17] vfs iomap
Posted by Christian Brauner 2 months, 1 week ago
Hey Linus,

/* Summary */
This contains the iomap updates for this cycle:

- Refactor the iomap writeback code and split the generic and ioend/bio
  based writeback code. There are two methods that define the split
  between the generic writeback code and its implementation, and all
  knowledge of ioends and bios now sits below that layer.

- This series adds fuse iomap support for buffered writes and dirty
  folio writeback. This is needed so that granular uptodate and dirty
  tracking can be used in fuse when large folios are enabled. This has
  two big advantages. For writes, instead of the entire folio needing to
  be read into the page cache, only the relevant portions need to be.
  For writeback, only the dirty portions need to be written back instead
  of the entire folio.
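
To illustrate why granular dirty tracking helps writeback, here is a
toy userspace model, not the iomap implementation. It assumes a folio
split into 16 blocks with one dirty bit per block; the model_* names
are hypothetical. It walks the bitmap and reports only the contiguous
dirty runs that would need to be written back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry: e.g. a 64 KiB folio tracked in 4 KiB blocks. */
#define MODEL_BLOCKS_PER_FOLIO 16

struct model_range {
	size_t first_block;
	size_t nr_blocks;
};

/*
 * Collect contiguous runs of set bits in the per-folio dirty bitmap.
 * Returns the number of ranges stored in out[] (at most max).
 */
static size_t model_dirty_ranges(uint16_t dirty, struct model_range *out,
				 size_t max)
{
	size_t n = 0;
	size_t i = 0;

	assert(out != NULL);
	while (i < MODEL_BLOCKS_PER_FOLIO && n < max) {
		if (!(dirty & (1u << i))) {
			i++;            /* clean block: nothing to write */
			continue;
		}
		size_t start = i;
		while (i < MODEL_BLOCKS_PER_FOLIO && (dirty & (1u << i)))
			i++;            /* extend the dirty run */
		out[n].first_block = start;
		out[n].nr_blocks = i - start;
		n++;
	}
	return n;
}
```

With only blocks 1, 2 and 4 dirty, the model yields two short ranges
instead of one write covering all 16 blocks.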

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

This contains a merge conflict with mainline that can be resolved as follows:

diff --cc fs/fuse/file.c
index 2ddfb3bb6483,f16426fd2bf5..000000000000
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.iomap

for you to fetch changes up to d5212d819e02313f27c867e6d365e71f1fdaaca4:

  Merge patch series "fuse: use iomap for buffered writes + writeback" (2025-07-17 09:55:23 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.iomap tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.iomap

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "refactor the iomap writeback code v5"
      Merge patch series "fuse: use iomap for buffered writes + writeback"

Christoph Hellwig (11):
      iomap: header diet
      iomap: pass more arguments using the iomap writeback context
      iomap: refactor the writeback interface
      iomap: hide ioends from the generic writeback code
      iomap: move all ioend handling to ioend.c
      iomap: rename iomap_writepage_map to iomap_writeback_folio
      iomap: export iomap_writeback_folio
      iomap: replace iomap_folio_ops with iomap_write_ops
      iomap: improve argument passing to iomap_read_folio_sync
      iomap: add read_folio_range() handler for buffered writes
      iomap: build the writeback code without CONFIG_BLOCK

Joanne Koong (8):
      iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
      iomap: add public helpers for uptodate state manipulation
      iomap: move folio_unlock out of iomap_writeback_folio
      fuse: use iomap for buffered writes
      fuse: use iomap for writeback
      fuse: use iomap for folio laundering
      fuse: hook into iomap for invalidating and checking partial uptodateness
      fuse: refactor writeback to use iomap_writepage_ctx inode

 Documentation/filesystems/iomap/design.rst     |   3 -
 Documentation/filesystems/iomap/operations.rst |  57 ++-
 block/fops.c                                   |  37 +-
 fs/fuse/Kconfig                                |   1 +
 fs/fuse/file.c                                 | 345 +++++++--------
 fs/gfs2/aops.c                                 |   8 +-
 fs/gfs2/bmap.c                                 |  48 ++-
 fs/gfs2/bmap.h                                 |   1 +
 fs/gfs2/file.c                                 |   3 +-
 fs/iomap/Makefile                              |   6 +-
 fs/iomap/buffered-io.c                         | 553 ++++++++-----------------
 fs/iomap/direct-io.c                           |   5 -
 fs/iomap/fiemap.c                              |   3 -
 fs/iomap/internal.h                            |   1 -
 fs/iomap/ioend.c                               | 220 +++++++++-
 fs/iomap/iter.c                                |   1 -
 fs/iomap/seek.c                                |   4 -
 fs/iomap/swapfile.c                            |   3 -
 fs/iomap/trace.c                               |   1 -
 fs/iomap/trace.h                               |   4 +-
 fs/xfs/xfs_aops.c                              | 212 ++++++----
 fs/xfs/xfs_file.c                              |   6 +-
 fs/xfs/xfs_iomap.c                             |  12 +-
 fs/xfs/xfs_iomap.h                             |   1 +
 fs/xfs/xfs_reflink.c                           |   3 +-
 fs/zonefs/file.c                               |  40 +-
 include/linux/iomap.h                          |  82 ++--
 27 files changed, 859 insertions(+), 801 deletions(-)
Re: [GIT PULL 14/14 for v6.17] vfs iomap
Posted by pr-tracker-bot@kernel.org 2 months, 1 week ago
The pull request you sent on Fri, 25 Jul 2025 13:27:20 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.iomap

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL 14/14 for v6.17] vfs iomap
Posted by Sasha Levin 2 months, 1 week ago
Hey Christian,

On Fri, Jul 25, 2025 at 01:27:20PM +0200, Christian Brauner wrote:
>Hey Linus,
>
>/* Summary */
>This contains the iomap updates for this cycle:
>
>- Refactor the iomap writeback code and split the generic and ioend/bio
>  based writeback code. There are two methods that define the split
>  between the generic writeback code, and the implementation of it,
>  and all knowledge of ioends and bios now sits below that layer.
>
>- This series adds fuse iomap support for buffered writes and dirty
>  folio writeback. This is needed so that granular uptodate and dirty
>  tracking can be used in fuse when large folios are enabled. This has
>  two big advantages. For writes, instead of the entire folio needing to
>  be read into the page cache, only the relevant portions need to be.
>  For writeback, only the dirty portions need to be written back instead
>  of the entire folio.

While testing with the linus-next tree, it appears that LKFT can trigger
the following warning, but only on arm64 tests (both on real HW as well
as qemu):

[ 333.129662] WARNING: CPU: 1 PID: 2580 at fs/fuse/file.c:2158 fuse_iomap_writeback_range+0x478/0x558 fuse
[  333.132010] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sha3_ce sha512_ce fuse drm backlight ip_tables x_tables
[  333.133982] CPU: 1 UID: 0 PID: 2580 Comm: msync04 Tainted: G        W           6.16.0-rc7 #1 PREEMPT
[  333.134997] Tainted: [W]=WARN
[  333.135497] Hardware name: linux,dummy-virt (DT)
[  333.136114] pstate: 03402009 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.137090] pc : fuse_iomap_writeback_range+0x478/0x558 fuse
[ 333.138009] lr : iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
[  333.138510] sp : ffff80008be8f8c0
[  333.138653] x29: ffff80008be8f8c0 x28: fff00000c5198c00 x27: 0000000000000000
[  333.138975] x26: fff00000d32b8c00 x25: 0000000000000000 x24: 0000000000000000
[  333.139309] x23: 0000000000000000 x22: fffffc1fc039ba40 x21: 0000000000001000
[  333.139600] x20: ffff80008be8f9f0 x19: 0000000000000000 x18: 0000000000000000
[  333.139917] x17: 0000000000000000 x16: ffffbb40f61c3a48 x15: 0000000000000000
[  333.142199] x14: ffffbb40f6924788 x13: 0000ffff8e8effff x12: 0000000000000000
[  333.142739] x11: 1ffe0000199a9241 x10: fff00000ccd4920c x9 : ffffbb40f50bba18
[  333.143466] x8 : ffff80008be8f778 x7 : ffffbb40ee180b68 x6 : ffffbb40f76c9000
[  333.143718] x5 : 0000000000000000 x4 : 000000000000000a x3 : 0000000000001000
[  333.143957] x2 : fff00000c0b6e600 x1 : 000000000000ffff x0 : 0bfffe000000400b
[  333.144993] Call trace:
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.145466] fuse_iomap_writeback_range+0x478/0x558 fuse (P)
[ 333.146136] iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
[ 333.146444] iomap_writepages (fs/iomap/buffered-io.c:1762)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.146590] fuse_writepages+0xa0/0xe8 fuse
[ 333.146774] do_writepages (mm/page-writeback.c:2636)
[ 333.146915] filemap_fdatawrite_wbc (mm/filemap.c:386 mm/filemap.c:376)
[ 333.147788] __filemap_fdatawrite_range (mm/filemap.c:420)
[ 333.148440] file_write_and_wait_range (mm/filemap.c:794)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.149054] fuse_fsync+0x6c/0x138 fuse
[ 333.149578] vfs_fsync_range (fs/sync.c:188)
[ 333.149892] __arm64_sys_msync (mm/msync.c:96 mm/msync.c:32 mm/msync.c:32)
[ 333.150095] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
[ 333.150330] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
[ 333.150461] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:768 (discriminator 1))
[ 333.150583] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:787)
[ 333.150729] el0t_64_sync (arch/arm64/kernel/entry.S:600)
[  333.150862] ---[ end trace 0000000000000000 ]---

I think that this is because the arm64 tests run on a
CONFIG_PAGE_SIZE_64KB=y build, but I'm not sure why we don't see it with
4KB pages at all.

An example link to a failing test that has the full log and more
information: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-44385-g8a03a07bad83/testrun/29269158/suite/log-parser-test/test/exception-warning-cpu-pid-at-fsfusefile-fuse_iomap_writeback_range/details/

-- 
Thanks,
Sasha
Re: [GIT PULL 14/14 for v6.17] vfs iomap
Posted by Joanne Koong 2 months, 1 week ago
On Sun, Jul 27, 2025 at 6:10 AM Sasha Levin <sashal@kernel.org> wrote:
>
> Hey Christian,
>
> On Fri, Jul 25, 2025 at 01:27:20PM +0200, Christian Brauner wrote:
> >Hey Linus,
> >
> >/* Summary */
> >This contains the iomap updates for this cycle:
> >
> >- Refactor the iomap writeback code and split the generic and ioend/bio
> >  based writeback code. There are two methods that define the split
> >  between the generic writeback code, and the implementation of it,
> >  and all knowledge of ioends and bios now sits below that layer.
> >
> >- This series adds fuse iomap support for buffered writes and dirty
> >  folio writeback. This is needed so that granular uptodate and dirty
> >  tracking can be used in fuse when large folios are enabled. This has
> >  two big advantages. For writes, instead of the entire folio needing to
> >  be read into the page cache, only the relevant portions need to be.
> >  For writeback, only the dirty portions need to be written back instead
> >  of the entire folio.
>
> While testing with the linus-next tree, it appears that LKFT can trigger
> the following warning, but only on arm64 tests (both on real HW as well
> as qemu):
>
> [ 333.129662] WARNING: CPU: 1 PID: 2580 at fs/fuse/file.c:2158 fuse_iomap_writeback_range+0x478/0x558 fuse
> [  333.132010] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sha3_ce sha512_ce fuse drm backlight ip_tables x_tables
> [  333.133982] CPU: 1 UID: 0 PID: 2580 Comm: msync04 Tainted: G        W           6.16.0-rc7 #1 PREEMPT
> [  333.134997] Tainted: [W]=WARN
> [  333.135497] Hardware name: linux,dummy-virt (DT)
> [  333.136114] pstate: 03402009 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.137090] pc : fuse_iomap_writeback_range+0x478/0x558 fuse
> [ 333.138009] lr : iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
> [  333.138510] sp : ffff80008be8f8c0
> [  333.138653] x29: ffff80008be8f8c0 x28: fff00000c5198c00 x27: 0000000000000000
> [  333.138975] x26: fff00000d32b8c00 x25: 0000000000000000 x24: 0000000000000000
> [  333.139309] x23: 0000000000000000 x22: fffffc1fc039ba40 x21: 0000000000001000
> [  333.139600] x20: ffff80008be8f9f0 x19: 0000000000000000 x18: 0000000000000000
> [  333.139917] x17: 0000000000000000 x16: ffffbb40f61c3a48 x15: 0000000000000000
> [  333.142199] x14: ffffbb40f6924788 x13: 0000ffff8e8effff x12: 0000000000000000
> [  333.142739] x11: 1ffe0000199a9241 x10: fff00000ccd4920c x9 : ffffbb40f50bba18
> [  333.143466] x8 : ffff80008be8f778 x7 : ffffbb40ee180b68 x6 : ffffbb40f76c9000
> [  333.143718] x5 : 0000000000000000 x4 : 000000000000000a x3 : 0000000000001000
> [  333.143957] x2 : fff00000c0b6e600 x1 : 000000000000ffff x0 : 0bfffe000000400b
> [  333.144993] Call trace:
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.145466] fuse_iomap_writeback_range+0x478/0x558 fuse (P)
> [ 333.146136] iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
> [ 333.146444] iomap_writepages (fs/iomap/buffered-io.c:1762)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.146590] fuse_writepages+0xa0/0xe8 fuse
> [ 333.146774] do_writepages (mm/page-writeback.c:2636)
> [ 333.146915] filemap_fdatawrite_wbc (mm/filemap.c:386 mm/filemap.c:376)
> [ 333.147788] __filemap_fdatawrite_range (mm/filemap.c:420)
> [ 333.148440] file_write_and_wait_range (mm/filemap.c:794)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.149054] fuse_fsync+0x6c/0x138 fuse
> [ 333.149578] vfs_fsync_range (fs/sync.c:188)
> [ 333.149892] __arm64_sys_msync (mm/msync.c:96 mm/msync.c:32 mm/msync.c:32)
> [ 333.150095] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
> [ 333.150330] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
> [ 333.150461] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:768 (discriminator 1))
> [ 333.150583] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:787)
> [ 333.150729] el0t_64_sync (arch/arm64/kernel/entry.S:600)
> [  333.150862] ---[ end trace 0000000000000000 ]---
>
> I think that this is because the arm64 tests run on
> CONFIG_PAGE_SIZE_64KB=y build, but I'm not sure why we don't see it with
> 4KB pages at all.
>
> An example link to a failing test that has the full log and more
> information: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-44385-g8a03a07bad83/testrun/29269158/suite/log-parser-test/test/exception-warning-cpu-pid-at-fsfusefile-fuse_iomap_writeback_range/details/
>

This was reported last week as well in [1]. The fix for this is in
https://lore.kernel.org/linux-fsdevel/20250723230850.2395561-1-joannelkoong@gmail.com/

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/CA+G9fYs5AdVM-T2Tf3LciNCwLZEHetcnSkHsjZajVwwpM2HmJw@mail.gmail.com/

> --
> Thanks,
> Sasha
>
Re: [GIT PULL 14/14 for v6.17] vfs iomap
Posted by Christian Brauner 2 months ago
> This was reported last week as well in [1]. The fix for this is in
> https://lore.kernel.org/linux-fsdevel/20250723230850.2395561-1-joannelkoong@gmail.com/

Thanks Joanne!