[RFC PATCH v5 0/2] vfs: add O_CREAT|O_DIRECTORY to open*(2)

Jori Koolstra posted 2 patches 2 weeks ago
There is a newer version of this series
Posted by Jori Koolstra 2 weeks ago
This series implements new semantics for the O_CREAT|O_DIRECTORY flag
combination for open*(2): perform a mkdir and open the resulting
directory; return a pinning fd (which mkdir does not).
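
For illustration only, here is a userspace sketch contrasting today's
two-step sequence with the proposed single call. The helper names
(mkdir_then_open, create_and_open_dir, demo_two_step) are hypothetical
and not part of the series. The flags passed alongside
O_CREAT|O_DIRECTORY are an assumption, and kernels without the series
are expected to reject the combination (recent mainline returns EINVAL).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Today's approach: two syscalls, with a window between them. */
int mkdir_then_open(int dirfd, const char *name, mode_t mode)
{
	assert(name != NULL);
	if (mkdirat(dirfd, name, mode) < 0)
		return -1;
	/* "name" may be renamed or replaced before this openat() runs. */
	return openat(dirfd, name,
		      O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
}

/*
 * Proposed approach (assumption: semantics as described in the cover
 * letter). Only meaningful on a kernel carrying this series.
 */
int create_and_open_dir(int dirfd, const char *name, mode_t mode)
{
	assert(name != NULL);
	return openat(dirfd, name,
		      O_RDONLY | O_CREAT | O_DIRECTORY | O_CLOEXEC, mode);
}

/* Exercises only the two-step helper; returns 0 on success. */
int demo_two_step(void)
{
	char tmpl[] = "/tmp/o_creat_o_dir.XXXXXX";
	struct stat st;
	int parent, fd, ret = -1;

	if (!mkdtemp(tmpl))
		return -1;
	parent = open(tmpl, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (parent >= 0) {
		fd = mkdir_then_open(parent, "child", 0755);
		if (fd >= 0) {
			/* The returned fd should refer to a directory. */
			if (fstat(fd, &st) == 0 && S_ISDIR(st.st_mode))
				ret = 0;
			close(fd);
		}
		/* Clean up; fails harmlessly if "child" was never made. */
		unlinkat(parent, "child", AT_REMOVEDIR);
		close(parent);
	}
	rmdir(tmpl);
	return ret;
}
```

A caller would use create_and_open_dir(AT_FDCWD, "newdir", 0755) on a
patched kernel and fall back to mkdir_then_open() otherwise; how an
already existing directory is handled is up to the series and is not
assumed here.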

Three comments from me upfront:

- This patch rejects O_CREAT|O_DIRECTORY with EINVAL in each individual
  atomic_open implementation. An argument for doing it in the generic
  atomic_open() in fs/namei.c instead is that it would give out-of-tree
  filesystems more time to implement (or block) O_CREAT|O_DIRECTORY.

- If we create a regular file with mknod, security_path_mknod() is
  called before creation and security_path_post_mknod() after it. If we
  create a regular file using O_CREAT (this is also the case pre-patch),
  only security_path_mknod() is called. Is this the correct behaviour?

- open_last_lookups() locks the parent inode like this:

		inode_lock(dir->d_inode);

  should this perhaps be

		inode_lock_nested(dir->d_inode, I_MUTEX_PARENT);

  to stay consistent with the start_dirop() path that is used by
  filename_create(), for instance in mknod(2)? I get that we are only
  locking one inode here at most, so it does not really matter, but
  as it stands one regular-file create path sets the lockdep subclass
  and the other does not.

Changes:
v5: fixed Sashiko-reported issues [1]. Moved to rejecting
    O_CREAT|O_DIRECTORY with EINVAL in each individual atomic_open
    implementation instead of in the generic atomic_open() in fs/namei.c.
v3/4: fixed syzbot-reported bugs.
v2: don't introduce a new syscall (mkdirat2) but implement this
    functionality as O_CREAT|O_DIRECTORY in open*(2).

[1]: https://sashiko.dev/#/patchset/20260518165237.2084042-1-jkoolstra%40xs4all.nl

Jori Koolstra (2):
  vfs: add O_CREAT|O_DIRECTORY to open*(2)
  selftest: add tests for open*(O_CREAT|O_DIRECTORY)

 fs/9p/vfs_inode.c                             |   3 +
 fs/9p/vfs_inode_dotl.c                        |   3 +
 fs/ceph/file.c                                |   3 +
 fs/fuse/dir.c                                 |   3 +
 fs/gfs2/inode.c                               |   3 +
 fs/namei.c                                    | 177 ++++++++++------
 fs/nfs/dir.c                                  |   3 +
 fs/nfs/file.c                                 |   3 +
 fs/open.c                                     |  25 ++-
 fs/smb/client/dir.c                           |   3 +
 fs/vboxsf/dir.c                               |   3 +
 include/linux/fcntl.h                         |   2 +
 .../testing/selftests/filesystems/.gitignore  |   1 +
 tools/testing/selftests/filesystems/Makefile  |   4 +-
 tools/testing/selftests/filesystems/fclog.c   |   1 +
 .../filesystems/open_o_creat_o_dir.c          | 197 ++++++++++++++++++
 16 files changed, 362 insertions(+), 72 deletions(-)
 create mode 100644 tools/testing/selftests/filesystems/open_o_creat_o_dir.c

-- 
2.54.0
Re: [RFC PATCH v5 0/2] vfs: add O_CREAT|O_DIRECTORY to open*(2)
Posted by Askar Safin 1 week, 5 days ago
Jori Koolstra <jkoolstra@xs4all.nl>:
> This series implements new semantics for the O_CREAT|O_DIRECTORY flag
> combination for open*(2): perform a mkdir and open the resulting
> directory; return a pinning fd (which mkdir does not).

Al Viro strongly opposed this idea back in 2020:
> For fuck sake, *NO*!
> We don't need any more multiplexors from hell.

https://lore.kernel.org/all/20200313182844.GO23230@ZenIV.linux.org.uk/

So, in my opinion, at the very least an Acked-by from Al Viro is mandatory.

-- 
Askar Safin