Implement mount querying syscalls agreed on at LSF/MM 2023.
Features:
- statx-like want/got mask
- allows returning ascii strings (fs type, root, mount point)
- returned buffer is relocatable (no pointers)
Still missing:
- man pages
- kselftest
Please find the test utility at the end of this mail.
Usage: statmnt [-l|-r] [-u] (mnt_id|path)
Git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#statmount-v4
Changes v3..v4:
- incorporate patch moving list of mounts to an rbtree
- wire up syscalls for all archs
- add LISTMOUNT_RECURSIVE (depth first iteration of mount tree)
- add LSMT_ROOT (list root instead of a specific mount ID)
- list_for_each_entry_del() moved to a separate patchset
Changes v1..v3:
- rename statmnt(2) -> statmount(2)
- rename listmnt(2) -> listmount(2)
- make ABI 32bit compatible by passing 64bit args in a struct (tested on
i386 and x32)
- only accept new 64bit mount IDs
- fix compile on !CONFIG_PROC_FS
- call security_sb_statfs() in both syscalls
- make lookup_mnt_in_ns() static
- add LISTMOUNT_UNREACHABLE flag to listmnt() to explicitly ask for
listing unreachable mounts
- remove .sb_opts
- remove subtype from .fs_type
- return the number of bytes used (including strings) in .size
- rename .mountpoint -> .mnt_point
- point strings by an offset against char[] VLA at the end of the struct.
E.g. printf("fs_type: %s\n", st->str + st->fs_type);
- don't save string lengths
- extend spare space in struct statmnt (complete size is now 512 bytes)
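A minimal sketch of why offset-based strings make the buffer relocatable. The struct below is a hypothetical, cut-down stand-in (demo_statmnt, demo_fill and demo_relocate are illustration names, not part of the patchset), and the way it is filled only mimics the described layout. Since the string fields hold offsets into str[] rather than pointers, a plain byte copy of .size bytes yields a usable copy with no fix-ups:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct statmnt: only the fields needed to
 * show offset-based strings. This is NOT the kernel layout. */
struct demo_statmnt {
	uint32_t size;      /* total bytes used, including strings */
	uint32_t fs_type;   /* offset of the fs type string in str[] */
	uint32_t mnt_point; /* offset of the mount point string in str[] */
	char str[];         /* NUL-terminated strings, stored back to back */
};

/* Fill a buffer in the described style (assumption: strings are packed
 * one after another, each field storing its offset relative to str). */
static struct demo_statmnt *demo_fill(const char *fs_type, const char *mnt_point)
{
	size_t l1 = strlen(fs_type) + 1;
	size_t l2 = strlen(mnt_point) + 1;
	size_t total = sizeof(struct demo_statmnt) + l1 + l2;
	struct demo_statmnt *st = malloc(total);

	if (!st)
		return NULL;
	st->size = (uint32_t)total;
	st->fs_type = 0;
	st->mnt_point = (uint32_t)l1;
	memcpy(st->str, fs_type, l1);
	memcpy(st->str + l1, mnt_point, l2);
	return st;
}

/* Relocate the result: copy st->size bytes, nothing else to adjust. */
static struct demo_statmnt *demo_relocate(const struct demo_statmnt *st)
{
	struct demo_statmnt *copy;

	assert(st->size >= sizeof(*st));
	copy = malloc(st->size);
	if (copy)
		memcpy(copy, st, st->size);
	return copy;
}
```

After demo_relocate() the original buffer can be freed and copy->str + copy->fs_type still resolves to the right string, which is the same pattern the test utility relies on when it trims its result with malloc(mm->size) and memcpy().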
Miklos Szeredi (6):
add unique mount ID
mounts: keep list of mounts in an rbtree
namespace: extract show_path() helper
add statmount(2) syscall
add listmount(2) syscall
wire up syscalls for statmount/listmount
arch/alpha/kernel/syscalls/syscall.tbl | 3 +
arch/arm/tools/syscall.tbl | 3 +
arch/arm64/include/asm/unistd32.h | 4 +
arch/ia64/kernel/syscalls/syscall.tbl | 3 +
arch/m68k/kernel/syscalls/syscall.tbl | 3 +
arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
arch/parisc/kernel/syscalls/syscall.tbl | 3 +
arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
arch/s390/kernel/syscalls/syscall.tbl | 3 +
arch/sh/kernel/syscalls/syscall.tbl | 3 +
arch/sparc/kernel/syscalls/syscall.tbl | 3 +
arch/x86/entry/syscalls/syscall_32.tbl | 3 +
arch/x86/entry/syscalls/syscall_64.tbl | 2 +
arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
fs/internal.h | 2 +
fs/mount.h | 27 +-
fs/namespace.c | 573 ++++++++++++++++----
fs/pnode.c | 2 +-
fs/proc_namespace.c | 13 +-
fs/stat.c | 9 +-
include/linux/mount.h | 5 +-
include/linux/syscalls.h | 8 +
include/uapi/asm-generic/unistd.h | 8 +-
include/uapi/linux/mount.h | 65 +++
include/uapi/linux/stat.h | 1 +
28 files changed, 635 insertions(+), 129 deletions(-)
--
2.41.0
=== statmnt.c ===
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <linux/types.h> /* __u32, __u64 */
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <err.h>
/*
* Structure for getting mount/superblock/filesystem info with statmount(2).
*
* The interface is similar to statx(2): individual fields or groups can be
* selected with the @mask argument of statmount(). Kernel will set the @mask
* field according to the supported fields.
*
* If string fields are selected, then the caller needs to pass a buffer that
* has space after the fixed part of the structure. Nul terminated strings are
* copied there and offsets relative to @str are stored in the relevant fields.
* If the buffer is too small, then EOVERFLOW is returned. The actually used
* size is returned in @size.
*/
struct statmnt {
__u32 size; /* Total size, including strings */
__u32 __spare1;
__u64 mask; /* What results were written */
__u32 sb_dev_major; /* Device ID */
__u32 sb_dev_minor;
__u64 sb_magic; /* ..._SUPER_MAGIC */
__u32 sb_flags; /* MS_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
__u32 fs_type; /* [str] Filesystem type */
__u64 mnt_id; /* Unique ID of mount */
__u64 mnt_parent_id; /* Unique ID of parent (for root == mnt_id) */
__u32 mnt_id_old; /* Reused IDs used in proc/.../mountinfo */
__u32 mnt_parent_id_old;
__u64 mnt_attr; /* MOUNT_ATTR_... */
__u64 mnt_propagation; /* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
__u64 mnt_peer_group; /* ID of shared peer group */
__u64 mnt_master; /* Mount receives propagation from this ID */
__u64 propagate_from; /* Propagation from in current namespace */
__u32 mnt_root; /* [str] Root of mount relative to root of fs */
__u32 mnt_point; /* [str] Mountpoint relative to current root */
__u64 __spare2[50];
char str[]; /* Variable size part containing strings */
};
/*
* To be used on the kernel ABI only for passing 64bit arguments to statmount(2)
*/
struct __mount_arg {
__u64 mnt_id;
__u64 request_mask;
};
/*
* @mask bits for statmount(2)
*/
#define STMT_SB_BASIC 0x00000001U /* Want/got sb_... */
#define STMT_MNT_BASIC 0x00000002U /* Want/got mnt_... */
#define STMT_PROPAGATE_FROM 0x00000004U /* Want/got propagate_from */
#define STMT_MNT_ROOT 0x00000008U /* Want/got mnt_root */
#define STMT_MNT_POINT 0x00000010U /* Want/got mnt_point */
#define STMT_FS_TYPE 0x00000020U /* Want/got fs_type */
/* listmount(2) flags */
#define LISTMOUNT_UNREACHABLE 0x01 /* List unreachable mounts too */
#define LISTMOUNT_RECURSIVE 0x02 /* List a mount tree */
/*
* Special @mnt_id values that can be passed to listmount
*/
#define LSMT_ROOT 0xffffffffffffffff /* root mount */
#ifdef __alpha__
#define __NR_statmount 564
#define __NR_listmount 565
#else
#define __NR_statmount 454
#define __NR_listmount 455
#endif
#define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */
static void free_if_neq(void *p, const void *q)
{
if (p != q)
free(p);
}
static struct statmnt *statmount(uint64_t mnt_id, uint64_t mask, unsigned int flags)
{
struct __mount_arg arg = {
.mnt_id = mnt_id,
.request_mask = mask,
};
union {
struct statmnt m;
char s[4096];
} buf;
struct statmnt *ret, *mm = &buf.m;
size_t bufsize = sizeof(buf);
while (syscall(__NR_statmount, &arg, mm, bufsize, flags) == -1) {
free_if_neq(mm, &buf.m);
if (errno != EOVERFLOW)
return NULL;
bufsize = MAX(1 << 15, bufsize << 1);
mm = malloc(bufsize);
if (!mm)
return NULL;
}
ret = malloc(mm->size);
if (ret)
memcpy(ret, mm, mm->size);
free_if_neq(mm, &buf.m);
return ret;
}
static int listmount(uint64_t mnt_id, uint64_t **listp, unsigned int flags)
{
struct __mount_arg arg = {
.mnt_id = mnt_id,
};
uint64_t buf[512];
size_t bufsize = sizeof(buf);
uint64_t *ret, *ll = buf;
long len;
while ((len = syscall(__NR_listmount, &arg, ll, bufsize / sizeof(buf[0]), flags)) == -1) {
free_if_neq(ll, buf);
if (errno != EOVERFLOW)
return -1;
bufsize = MAX(1 << 15, bufsize << 1);
ll = malloc(bufsize);
if (!ll)
return -1;
}
bufsize = len * sizeof(buf[0]);
ret = malloc(bufsize);
if (ret) {
*listp = ret;
memcpy(ret, ll, bufsize);
}
/* Release the temporary buffer on both the success and the failure path */
free_if_neq(ll, buf);
return ret ? len : -1;
}
int main(int argc, char *argv[])
{
struct statmnt *st;
char *end;
int res;
int list = 0;
int flags = 0;
uint64_t mask = STMT_SB_BASIC | STMT_MNT_BASIC | STMT_PROPAGATE_FROM | STMT_MNT_ROOT | STMT_MNT_POINT | STMT_FS_TYPE;
uint64_t mnt_id;
int opt;
for (;;) {
opt = getopt(argc, argv, "lru");
if (opt == -1)
break;
switch (opt) {
case 'r':
flags |= LISTMOUNT_RECURSIVE;
/* fallthrough */
case 'l':
list = 1;
break;
case 'u':
flags |= LISTMOUNT_UNREACHABLE;
break;
default:
errx(1, "usage: %s [-l|-r] [-u] (mnt_id|path)", argv[0]);
}
}
if (optind >= argc) {
if (!list)
errx(1, "missing mnt_id or path");
else
mnt_id = LSMT_ROOT;
} else {
const char *arg = argv[optind];
mnt_id = strtoull(arg, &end, 0);
if (!mnt_id || *end != '\0') {
struct statx sx;
res = statx(AT_FDCWD, arg, 0, STATX_MNT_ID_UNIQUE, &sx);
if (res == -1)
err(1, "%s", arg);
if (!(sx.stx_mask & (STATX_MNT_ID | STATX_MNT_ID_UNIQUE)))
errx(1, "Sorry, no mount ID");
mnt_id = sx.stx_mnt_id;
}
}
if (list) {
uint64_t *list;
int num, i;
res = listmount(mnt_id, &list, flags);
if (res == -1)
err(1, "listmount(0x%llx)", (unsigned long long) mnt_id);
num = res;
for (i = 0; i < num; i++) {
printf("0x%llx", (unsigned long long) list[i]);
st = statmount(list[i], STMT_MNT_POINT, 0);
if (!st) {
printf("\t[%s]\n", strerror(errno));
} else {
printf("\t%s\n", (st->mask & STMT_MNT_POINT) ? st->str + st->mnt_point : "???");
}
free(st);
}
free(list);
return 0;
}
st = statmount(mnt_id, mask, 0);
if (!st)
err(1, "statmount(0x%llx)", (unsigned long long) mnt_id);
printf("size: %u\n", st->size);
printf("mask: 0x%llx\n", st->mask);
if (st->mask & STMT_SB_BASIC) {
printf("sb_dev_major: %u\n", st->sb_dev_major);
printf("sb_dev_minor: %u\n", st->sb_dev_minor);
printf("sb_magic: 0x%llx\n", st->sb_magic);
printf("sb_flags: 0x%08x\n", st->sb_flags);
}
if (st->mask & STMT_MNT_BASIC) {
printf("mnt_id: 0x%llx\n", st->mnt_id);
printf("mnt_parent_id: 0x%llx\n", st->mnt_parent_id);
printf("mnt_id_old: %u\n", st->mnt_id_old);
printf("mnt_parent_id_old: %u\n", st->mnt_parent_id_old);
printf("mnt_attr: 0x%08llx\n", st->mnt_attr);
printf("mnt_propagation: %s%s%s%s\n",
st->mnt_propagation & MS_SHARED ? "shared," : "",
st->mnt_propagation & MS_SLAVE ? "slave," : "",
st->mnt_propagation & MS_UNBINDABLE ? "unbindable," : "",
st->mnt_propagation & MS_PRIVATE ? "private" : "");
printf("mnt_peer_group: %llu\n", st->mnt_peer_group);
printf("mnt_master: %llu\n", st->mnt_master);
}
if (st->mask & STMT_PROPAGATE_FROM)
printf("propagate_from: %llu\n", st->propagate_from);
if (st->mask & STMT_MNT_ROOT)
printf("mnt_root: %u <%s>\n", st->mnt_root, st->str + st->mnt_root);
if (st->mask & STMT_MNT_POINT)
printf("mnt_point: %u <%s>\n", st->mnt_point, st->str + st->mnt_point);
if (st->mask & STMT_FS_TYPE)
printf("fs_type: %u <%s>\n", st->fs_type, st->str + st->fs_type);
free(st);
return 0;
}
On 25/10/23 22:01, Miklos Szeredi wrote:
> Implement mount querying syscalls agreed on at LSF/MM 2023.
>
> Features:
>
> - statx-like want/got mask
> - allows returning ascii strings (fs type, root, mount point)
> - returned buffer is relocatable (no pointers)
>
> Still missing:
> - man pages
> - kselftest
>
> Please find the test utility at the end of this mail.
>
> Usage: statmnt [-l|-r] [-u] (mnt_id|path)
>
> Git tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#statmount-v4
>
>
> Changes v3..v4:
>
> - incorporate patch moving list of mounts to an rbtree
> - wire up syscalls for all archs
> - add LISTMOUNT_RECURSIVE (depth first iteration of mount tree)
> - add LSMT_ROOT (list root instead of a specific mount ID)
> - list_for_each_entry_del() moved to a separate patchset
>
> Changes v1..v3:
>
> - rename statmnt(2) -> statmount(2)
> - rename listmnt(2) -> listmount(2)
> - make ABI 32bit compatible by passing 64bit args in a struct (tested on
> i386 and x32)
> - only accept new 64bit mount IDs
> - fix compile on !CONFIG_PROC_FS
> - call security_sb_statfs() in both syscalls
> - make lookup_mnt_in_ns() static
> - add LISTMOUNT_UNREACHABLE flag to listmnt() to explicitly ask for
> listing unreachable mounts
> - remove .sb_opts
> - remove subtype from .fs_type
> - return the number of bytes used (including strings) in .size
> - rename .mountpoint -> .mnt_point
> - point strings by an offset against char[] VLA at the end of the struct.
> E.g. printf("fs_type: %s\n", st->str + st->fs_type);
> - don't save string lengths
> - extend spare space in struct statmnt (complete size is now 512 bytes)
>
>
> Miklos Szeredi (6):
> add unique mount ID
> mounts: keep list of mounts in an rbtree
> namespace: extract show_path() helper
> add statmount(2) syscall
> add listmount(2) syscall
> wire up syscalls for statmount/listmount
>
> arch/alpha/kernel/syscalls/syscall.tbl | 3 +
> arch/arm/tools/syscall.tbl | 3 +
> arch/arm64/include/asm/unistd32.h | 4 +
> arch/ia64/kernel/syscalls/syscall.tbl | 3 +
> arch/m68k/kernel/syscalls/syscall.tbl | 3 +
> arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
> arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
> arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
> arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
> arch/parisc/kernel/syscalls/syscall.tbl | 3 +
> arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
> arch/s390/kernel/syscalls/syscall.tbl | 3 +
> arch/sh/kernel/syscalls/syscall.tbl | 3 +
> arch/sparc/kernel/syscalls/syscall.tbl | 3 +
> arch/x86/entry/syscalls/syscall_32.tbl | 3 +
> arch/x86/entry/syscalls/syscall_64.tbl | 2 +
> arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
> fs/internal.h | 2 +
> fs/mount.h | 27 +-
> fs/namespace.c | 573 ++++++++++++++++----
> fs/pnode.c | 2 +-
> fs/proc_namespace.c | 13 +-
> fs/stat.c | 9 +-
> include/linux/mount.h | 5 +-
> include/linux/syscalls.h | 8 +
> include/uapi/asm-generic/unistd.h | 8 +-
> include/uapi/linux/mount.h | 65 +++
> include/uapi/linux/stat.h | 1 +
> 28 files changed, 635 insertions(+), 129 deletions(-)
Looks ok to me, covers the primary cases I needed when I worked
on using fsinfo() in systemd.
Karel, is there anything missing you would need for adding
libmount support?
Reviewed-by: Ian Kent <raven@themaw.net>
>
On Wed, Nov 01, 2023 at 07:52:45PM +0800, Ian Kent wrote:
> On 25/10/23 22:01, Miklos Szeredi wrote:
> Looks ok to me,covers the primary cases I needed when I worked
> on using fsinfo() in systemd.
Our work on systemd was about two areas: get mount info (stat/listmount()
now) from the kernel, and get the mount ID from notification.
There was watch_queue.h with WATCH_TYPE_MOUNT_NOTIFY and struct
mount_notification->auxiliary_mount (aka mount ID) and event subtype
to get the change status (new mount, umount, etc.)
For example David's:
https://patchwork.kernel.org/project/linux-security-module/patch/155991711016.15579.4449417925184028666.stgit@warthog.procyon.org.uk/
Do we have any replacement for this?
> Karel, is there anything missing you would need for adding
> libmount support?
Miklos's statmount() and listmount() API is excellent from my point of
view. It looks pretty straightforward to use, and with the unique
mount ID, it's safe too. It will be ideal for things like umount(8)
(and recursive umount, etc.).
For complex scenarios (systemd), we need to get from the kernel the
unique ID's after any change in the mount table to save resources and
call statmount() only for the affected mount node. Parse mountinfo
sucks, call for(listmount(-1)) { statmount() } sucks too :-)
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
On 6/11/23 20:10, Karel Zak wrote:
> On Wed, Nov 01, 2023 at 07:52:45PM +0800, Ian Kent wrote:
>> On 25/10/23 22:01, Miklos Szeredi wrote:
>> Looks ok to me,covers the primary cases I needed when I worked
>> on using fsinfo() in systemd.
> Our work on systemd was about two areas: get mount info (stat/listmount()
> now) from the kernel, and get the mount ID from notification.
>
> There was watch_queue.h with WATCH_TYPE_MOUNT_NOTIFY and struct
> mount_notification->auxiliary_mount (aka mount ID) and event subtype
> to get the change status (new mount, umount, etc.)
>
> For example David's:
> https://patchwork.kernel.org/project/linux-security-module/patch/155991711016.15579.4449417925184028666.stgit@warthog.procyon.org.uk/
>
> Do we have any replacement for this?
Not yet.
I tried to mention it early on but I don't think my description
conveyed what's actually needed.
>
>> Karel, is there anything missing you would need for adding
>> libmount support?
> Miklos's statmount() and listmount() API is excellent from my point of
> view. It looks pretty straightforward to use, and with the unique
> mount ID, it's safe too. It will be ideal for things like umount(8)
> (and recursive umount, etc.).
Thanks Karel, that's what I was hoping.
>
> For complex scenarios (systemd), we need to get from the kernel the
> unique ID's after any change in the mount table to save resources and
> call statmount() only for the affected mount node. Parse mountinfo
> sucks, call for(listmount(-1)) { statmount() } sucks too :-)
I have been looking at the notifications side of things.
I too need that functionality for the systemd work I was doing on
this. There was a need for event rate management too to get the
most out of the mount query improvements which I really only
realized about the time the work stopped. So for me there's
some new work needed as well.
I'm not sure yet which way to go as the watch queue implementation
that was merged is just the framework and is a bit different from
what we were using so I'm not sure if I can port specific extensions
of David's notifications work to it. I'm only just now getting to a
point where I can spend enough time on it to work this out.
Ian
On Mon, Nov 6, 2023 at 2:11 PM Karel Zak <kzak@redhat.com> wrote:
>
> On Wed, Nov 01, 2023 at 07:52:45PM +0800, Ian Kent wrote:
> > On 25/10/23 22:01, Miklos Szeredi wrote:
> > Looks ok to me,covers the primary cases I needed when I worked
> > on using fsinfo() in systemd.
>
> Our work on systemd was about two areas: get mount info (stat/listmount()
> now) from the kernel, and get the mount ID from notification.
>
> There was watch_queue.h with WATCH_TYPE_MOUNT_NOTIFY and struct
> mount_notification->auxiliary_mount (aka mount ID) and event subtype
> to get the change status (new mount, umount, etc.)
>
> For example David's:
> https://patchwork.kernel.org/project/linux-security-module/patch/155991711016.15579.4449417925184028666.stgit@warthog.procyon.org.uk/
>
> Do we have any replacement for this?
>
The plan is to extend fanotify for mount namespace change notifications.
Here is a simple POC for FAN_UNMOUNT notification:
https://lore.kernel.org/linux-fsdevel/20230414182903.1852019-1-amir73il@gmail.com/
I was waiting for Miklos' patches to land, so that we can report
mnt_id_unique (of mount and its parent mount) in the events.
The plan is to start with setting a mark on a vfsmount to get
FAN_MOUNT/FAN_UNMOUNT notifications for changes to direct
children of that mount.
This part, I was planning to do myself. I cannot say for sure when
I will be able to get to it, but it should be a rather simple patch.
If anybody else would like to volunteer for the task, I will be
happy to assist.
Not sure if we are going to need special notifications for mount
move and mount beneath?
Not sure if we are going to need notifications on mount attribute
changes?
We may later also implement a mark on a mount namespace
to get events on all mount namespace changes.
If you have any feedback about this rough plan, or more items to
the wish list, please feel free to share them.
Thanks,
Amir.
On 6/11/23 21:33, Amir Goldstein wrote:
> On Mon, Nov 6, 2023 at 2:11 PM Karel Zak <kzak@redhat.com> wrote:
>> On Wed, Nov 01, 2023 at 07:52:45PM +0800, Ian Kent wrote:
>>> On 25/10/23 22:01, Miklos Szeredi wrote:
>>> Looks ok to me,covers the primary cases I needed when I worked
>>> on using fsinfo() in systemd.
>> Our work on systemd was about two areas: get mount info (stat/listmount()
>> now) from the kernel, and get the mount ID from notification.
>>
>> There was watch_queue.h with WATCH_TYPE_MOUNT_NOTIFY and struct
>> mount_notification->auxiliary_mount (aka mount ID) and event subtype
>> to get the change status (new mount, umount, etc.)
>>
>> For example David's:
>> https://patchwork.kernel.org/project/linux-security-module/patch/155991711016.15579.4449417925184028666.stgit@warthog.procyon.org.uk/
>>
>> Do we have any replacement for this?
>>
> The plan is to extend fanotify for mount namespace change notifications.
>
> Here is a simple POC for FAN_UNMOUNT notification:
>
> https://lore.kernel.org/linux-fsdevel/20230414182903.1852019-1-amir73il@gmail.com/
>
> I was waiting for Miklos' patches to land, so that we can report
> mnt_id_unique (of mount and its parent mount) in the events.
>
> The plan is to start with setting a mark on a vfsmount to get
> FAN_MOUNT/FAN_UNMOUNT notifications for changes to direct
> children of that mount.
I'll have a look at what I needed when I was working to implement
this in systemd. Without looking at the code I can say I was
handling mount, umount and I think remount events so that's
probably a minimum. As I mentioned earlier I found I also need
event rate management which was a new requirement at the time.
>
> This part, I was planning to do myself. I cannot say for sure when
> I will be able to get to it, but it should be a rather simple patch.
>
> If anybody else would like to volunteer for the task, I will be
> happy to assist.
I would like to help with this but I'm not familiar with fanotify
so I'll need to spend a bit of time on that. I am just about in a
position to do that now. I'll also be looking at the watch queue
framework that did get merged back then, I'm not sure how that
will turn out.
>
> Not sure if we are going to need special notifications for mount
> move and mount beneath?
Yes that will be an interesting question, I have noticed Christian's
work on mount beneath. We need to provide the ability to monitor
mount tables as is done by using the proc mount lists to start with
and I'm pretty sure that includes at least mount, umount and moves
perhaps more but I'll check what I was using.
>
> Not sure if we are going to need notifications on mount attribute
> changes?
Also an interesting question, we will see in time I guess. You would
think that the mount/umount/move events would get what's needed
because (assuming mount move maps to remount) mount, umount and
remount should cover cases where mounted mount attributes change.
>
> We may later also implement a mark on a mount namespace
> to get events on all mount namespace changes.
Monitoring the proc mount tables essentially provides lists of mounts
that are present in a mount namespace (as seen by the given process)
so this is going to be needed sooner rather than later if we hope to
realize improvements from our new system calls.
Ian
On Wed, 25 Oct 2023 16:01:58 +0200, Miklos Szeredi wrote:
> Implement mount querying syscalls agreed on at LSF/MM 2023.
>
> Features:
>
> - statx-like want/got mask
> - allows returning ascii strings (fs type, root, mount point)
> - returned buffer is relocatable (no pointers)
>
> [...]
I think we should start showing clear signs of commitment to this. In
absence of strong objections I don't see a reason to let this rot on
list until we forget about it. Maybe this will entice people to provide
more reviews as well.
It's all pretty close to what we discussed at LSFMM23 and we stated that
we aim to merge something by the end of the year. Let's see if that can
actually happen.
I don't have huge quarrels with this. Yes, there's stuff I'd like to see
done differently but nothing I consider blockers. So let's get this
into -next once rc1 is out so it can get a full cycle of exposure.
I've renamed struct statmnt to struct statmount to align with statx()
and struct statx. I also renamed struct stmt_state to struct kstatmount
as that's how we usually do this. And I renamed struct __mount_arg to
struct mnt_id_req and dropped the comment. Libraries can expose this in
whatever form they want but we'll also have direct consumers. I'd rather
have this struct be underscore free and officially sanctioned.
---
Applied to the vfs.mount branch of the vfs/vfs.git tree.
Patches in the vfs.mount branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.master
[1/6] add unique mount ID
https://git.kernel.org/vfs/vfs/c/ec873c3baa0c
[2/6] mounts: keep list of mounts in an rbtree
https://git.kernel.org/vfs/vfs/c/f15247ad234c
[3/6] namespace: extract show_path() helper
https://git.kernel.org/vfs/vfs/c/6e5f64ac5382
[4/6] add statmount(2) syscall
https://git.kernel.org/vfs/vfs/c/edf3b2ac1bd5
[5/6] add listmount(2) syscall
https://git.kernel.org/vfs/vfs/c/4412ca803757
[6/6] wire up syscalls for statmount/listmount
https://git.kernel.org/vfs/vfs/c/d0a56e829d2c
On Wed, Nov 1, 2023 at 12:13 PM Christian Brauner <brauner@kernel.org> wrote:
> I've renamed struct statmnt to struct statmount to align with statx()
> and struct statx. I also renamed struct stmt_state to struct kstatmount
> as that's how we usually do this. And I renamed struct __mount_arg to
> struct mnt_id_req and dropped the comment. Libraries can expose this in
> whatever form they want but we'll also have direct consumers. I'd rather
> have this struct be underscore free and officially sanctioned.
Thanks.
arch/arm64/include/asm/unistd.h needs this fixup:
-#define __NR_compat_syscalls 457
+#define __NR_compat_syscalls 459
Can you fix inline, or should I send a proper patch?
Thanks,
Miklos
On Wed, Nov 01, 2023 at 02:18:30PM +0100, Miklos Szeredi wrote:
> On Wed, Nov 1, 2023 at 12:13 PM Christian Brauner <brauner@kernel.org> wrote:
>
> > I've renamed struct statmnt to struct statmount to align with statx()
> > and struct statx. I also renamed struct stmt_state to struct kstatmount
> > as that's how we usually do this. And I renamed struct __mount_arg to
> > struct mnt_id_req and dropped the comment. Libraries can expose this in
> > whatever form they want but we'll also have direct consumers. I'd rather
> > have this struct be underscore free and officially sanctioned.
>
> Thanks.
>
> arch/arm64/include/asm/unistd.h needs this fixup:
>
> -#define __NR_compat_syscalls 457
> +#define __NR_compat_syscalls 459
Every time with that file. It's like a tradition that I forget to update
it at least once.
>
> Can you fix inline, or should I send a proper patch?
No need to send. I'll just fix it here.