fs/fhandle.c | 29 ++++++-- include/linux/syscalls.h | 2 +- include/uapi/linux/fcntl.h | 81 ++++++++++++++------- tools/perf/trace/beauty/include/uapi/linux/fcntl.h | 84 +++++++++++++++------- 4 files changed, 140 insertions(+), 56 deletions(-)
Now that we provide a unique 64-bit mount ID interface in statx(2), we
can now provide a race-free way for name_to_handle_at(2) to provide a
file handle and corresponding mount without needing to worry about
racing with /proc/mountinfo parsing or having to open a file just to do
statx(2).
While this is not necessary if you are using AT_EMPTY_PATH and don't
care about an extra statx(2) call, users that pass full paths into
name_to_handle_at(2) need to know which mount the file handle comes from
(to make sure they don't try to open_by_handle_at a file handle from a
different filesystem) and switching to AT_EMPTY_PATH would require
allocating a file for every name_to_handle_at(2) call, turning
err = name_to_handle_at(-EBADF, "/foo/bar/baz", &handle, &mntid,
AT_HANDLE_MNT_ID_UNIQUE);
into
int fd = openat(-EBADF, "/foo/bar/baz", O_PATH | O_CLOEXEC);
err1 = name_to_handle_at(fd, "", &handle, &unused_mntid, AT_EMPTY_PATH);
err2 = statx(fd, "", AT_EMPTY_PATH, STATX_MNT_ID_UNIQUE, &statxbuf);
mntid = statxbuf.stx_mnt_id;
close(fd);
Also, this series adds a patch to clarify how AT_* flag allocation
should work going forwards.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
Changes in v3:
- Added a patch describing how AT_* flags should be allocated in the
future, based on Amir's suggestions.
- Included AT_* aliases for RENAME_* flags to further indicate that
renameat2(2) is an *at(2) syscall and to indicate that those flags
have been allocated already in the per-syscall range.
- Switched AT_HANDLE_MNT_ID_UNIQUE to use 0x01 (to reuse
(AT_)RENAME_NOREPLACE).
- v2: <https://lore.kernel.org/r/20240523-exportfs-u64-mount-id-v2-1-f9f959f17eb1@cyphar.com>
Changes in v2:
- Fixed a few minor compiler warnings and a buggy copy_to_user() check.
- Rename AT_HANDLE_UNIQUE_MOUNT_ID -> AT_HANDLE_MNT_ID_UNIQUE to match statx.
- Switched to using an AT_* bit from 0xFF and defining that range as
being "per-syscall" for future usage.
- Sync tools/ copy of <linux/fcntl.h> to include changes.
- v1: <https://lore.kernel.org/r/20240520-exportfs-u64-mount-id-v1-1-f55fd9215b8e@cyphar.com>
---
Aleksa Sarai (2):
uapi: explain how per-syscall AT_* flags should be allocated
fhandle: expose u64 mount id to name_to_handle_at(2)
fs/fhandle.c | 29 ++++++--
include/linux/syscalls.h | 2 +-
include/uapi/linux/fcntl.h | 81 ++++++++++++++-------
tools/perf/trace/beauty/include/uapi/linux/fcntl.h | 84 +++++++++++++++-------
4 files changed, 140 insertions(+), 56 deletions(-)
---
base-commit: c7b9563b58a77423d4c6e026ff831a69612b02fc
change-id: 20240515-exportfs-u64-mount-id-9ebb5c58b53c
Best regards,
--
Aleksa Sarai <cyphar@cyphar.com>
On Thu, Aug 01, 2024 at 01:52:39PM +1000, Aleksa Sarai wrote: > Now that we provide a unique 64-bit mount ID interface in statx(2), we > can now provide a race-free way for name_to_handle_at(2) to provide a > file handle and corresponding mount without needing to worry about > racing with /proc/mountinfo parsing or having to open a file just to do > statx(2). > > While this is not necessary if you are using AT_EMPTY_PATH and don't > care about an extra statx(2) call, users that pass full paths into > name_to_handle_at(2) need to know which mount the file handle comes from > (to make sure they don't try to open_by_handle_at a file handle from a > different filesystem) and switching to AT_EMPTY_PATH would require > allocating a file for every name_to_handle_at(2) call, turning > > err = name_to_handle_at(-EBADF, "/foo/bar/baz", &handle, &mntid, > AT_HANDLE_MNT_ID_UNIQUE); > > into > > int fd = openat(-EBADF, "/foo/bar/baz", O_PATH | O_CLOEXEC); > err1 = name_to_handle_at(fd, "", &handle, &unused_mntid, AT_EMPTY_PATH); > err2 = statx(fd, "", AT_EMPTY_PATH, STATX_MNT_ID_UNIQUE, &statxbuf); > mntid = statxbuf.stx_mnt_id; > close(fd); > > Also, this series adds a patch to clarify how AT_* flag allocation > should work going forwards. > > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> > --- > Changes in v3: > - Added a patch describing how AT_* flags should be allocated in the > future, based on Amir's suggestions. > - Included AT_* aliases for RENAME_* flags to further indicate that > renameat2(2) is an *at(2) syscall and to indicate that those flags > have been allocated already in the per-syscall range. > - Switched AT_HANDLE_MNT_ID_UNIQUE to use 0x01 (to reuse > (AT_)RENAME_NOREPLACE). > - v2: <https://lore.kernel.org/r/20240523-exportfs-u64-mount-id-v2-1-f9f959f17eb1@cyphar.com> > Changes in v2: > - Fixed a few minor compiler warnings and a buggy copy_to_user() check. > - Rename AT_HANDLE_UNIQUE_MOUNT_ID -> AT_HANDLE_MNT_ID_UNIQUE to match statx. > - Switched to using an AT_* bit from 0xFF and defining that range as > being "per-syscall" for future usage. > - Sync tools/ copy of <linux/fcntl.h> to include changes. > - v1: <https://lore.kernel.org/r/20240520-exportfs-u64-mount-id-v1-1-f55fd9215b8e@cyphar.com> > > --- > Aleksa Sarai (2): > uapi: explain how per-syscall AT_* flags should be allocated > fhandle: expose u64 mount id to name_to_handle_at(2) > Wasn't the conclusion from this discussion last time that we needed to revisit this API completely? Christoph had some pretty adamant objections. That being said the uapi comments patch looks good to me, you can add Reviewed-by: Josef Bacik <josef@toxicpanda.com> to that one. The other one I'm going to let others who have stronger opinions than me argue about. Thanks, Josef
On 2024-08-01, Josef Bacik <josef@toxicpanda.com> wrote: > On Thu, Aug 01, 2024 at 01:52:39PM +1000, Aleksa Sarai wrote: > > Now that we provide a unique 64-bit mount ID interface in statx(2), we > > can now provide a race-free way for name_to_handle_at(2) to provide a > > file handle and corresponding mount without needing to worry about > > racing with /proc/mountinfo parsing or having to open a file just to do > > statx(2). > > > > While this is not necessary if you are using AT_EMPTY_PATH and don't > > care about an extra statx(2) call, users that pass full paths into > > name_to_handle_at(2) need to know which mount the file handle comes from > > (to make sure they don't try to open_by_handle_at a file handle from a > > different filesystem) and switching to AT_EMPTY_PATH would require > > allocating a file for every name_to_handle_at(2) call, turning > > > > err = name_to_handle_at(-EBADF, "/foo/bar/baz", &handle, &mntid, > > AT_HANDLE_MNT_ID_UNIQUE); > > > > into > > > > int fd = openat(-EBADF, "/foo/bar/baz", O_PATH | O_CLOEXEC); > > err1 = name_to_handle_at(fd, "", &handle, &unused_mntid, AT_EMPTY_PATH); > > err2 = statx(fd, "", AT_EMPTY_PATH, STATX_MNT_ID_UNIQUE, &statxbuf); > > mntid = statxbuf.stx_mnt_id; > > close(fd); > > > > Also, this series adds a patch to clarify how AT_* flag allocation > > should work going forwards. > > > > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> > > --- > > Changes in v3: > > - Added a patch describing how AT_* flags should be allocated in the > > future, based on Amir's suggestions. > > - Included AT_* aliases for RENAME_* flags to further indicate that > > renameat2(2) is an *at(2) syscall and to indicate that those flags > > have been allocated already in the per-syscall range. > > - Switched AT_HANDLE_MNT_ID_UNIQUE to use 0x01 (to reuse > > (AT_)RENAME_NOREPLACE). > > - v2: <https://lore.kernel.org/r/20240523-exportfs-u64-mount-id-v2-1-f9f959f17eb1@cyphar.com> > > Changes in v2: > > - Fixed a few minor compiler warnings and a buggy copy_to_user() check. > > - Rename AT_HANDLE_UNIQUE_MOUNT_ID -> AT_HANDLE_MNT_ID_UNIQUE to match statx. > > - Switched to using an AT_* bit from 0xFF and defining that range as > > being "per-syscall" for future usage. > > - Sync tools/ copy of <linux/fcntl.h> to include changes. > > - v1: <https://lore.kernel.org/r/20240520-exportfs-u64-mount-id-v1-1-f55fd9215b8e@cyphar.com> > > > > --- > > Aleksa Sarai (2): > > uapi: explain how per-syscall AT_* flags should be allocated > > fhandle: expose u64 mount id to name_to_handle_at(2) > > > > Wasn't the conclusion from this discussion last time that we needed to revisit > this API completely? Christoph had some pretty adamant objections. There was a discussion about reworking the API and I agree with most of the issues raised about file handles (I personally don't really like this interface and it's a bit of a shame that it seems this is going to be the interface that replaces inode numbers) so I'm not at all opposed to reworking it. However, I agree with Christian[1] that we can fix this existing issue in the existing API fairly easily and then work on a new API separately. The existing usage of name_to_handle_at() is fundamentally unsafe (as outlined in the man page) and we can fix that fairly easily. [1]: https://lore.kernel.org/all/20240527-hagel-thunfisch-75781b0cf75d@brauner/ > That being said the uapi comments patch looks good to me, you can add > > Reviewed-by: Josef Bacik <josef@toxicpanda.com> > > to that one. The other one I'm going to let others who have stronger opinions > than me argue about. Thanks, > > Josef -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>
© 2016 - 2026 Red Hat, Inc.